Calculate Hash of a File using Python
Computing hash values of files is a fundamental aspect of data integrity verification and security in modern computing. Whether you’re verifying downloaded software, implementing file integrity monitoring, or developing blockchain applications, understanding how to calculate file hashes using Python is an invaluable skill for developers and system administrators. In this comprehensive guide, we’ll explore the entire process of calculating file hashes using Python, from understanding the basic concepts to implementing advanced techniques for large-scale applications. We’ll cover everything from hashlib fundamentals to optimized implementations, security considerations, and real-world applications specifically in Linux environments. By the end of this guide, you’ll have the knowledge and practical skills to implement robust file hashing solutions for various scenarios.
Understanding Hash Functions
Hash functions are mathematical algorithms that transform data of arbitrary size into a fixed-size string of characters, typically a hexadecimal number. These functions are designed to be deterministic, meaning the same input will always produce the same output hash value. This property makes hash functions perfect for verifying data integrity, as any change to the input data, no matter how small, will result in a completely different hash value.
Cryptographic hash functions possess several important properties that make them useful for security applications. First, they demonstrate the “avalanche effect,” where a tiny change in the input produces a drastically different output hash. For example, changing a single bit in a file will result in a completely different hash value. Second, they are designed to be one-way functions, meaning it’s computationally infeasible to derive the original input from the hash value alone. Third, strong hash functions aim to be collision-resistant, meaning it’s extremely difficult to find two different inputs that produce the same hash output.
Hash values essentially serve as unique “fingerprints” for files, allowing us to verify whether a file has been modified, corrupted, or tampered with. In Linux environments, command-line utilities like sha256sum and md5sum are commonly used to generate and verify file hashes, but Python provides more flexibility and programmatic control through its built-in libraries.
These cryptographic hash functions differ from non-cryptographic hash functions in their security properties. While non-cryptographic hash functions like those used in hash tables are optimized for speed, cryptographic hash functions prioritize security properties like collision resistance and pre-image resistance, making them suitable for security-sensitive applications.
Python’s Hashlib Module
The hashlib module is Python’s standard library for cryptographic hashing and serves as the foundation for file hash calculation in Python applications. This versatile module provides a common interface to many different secure hash and message digest algorithms, making it straightforward to implement various hashing functions in your Python code.
Before diving into implementation, let's make sure hashlib is available. The module is part of Python's standard library, so it does not need to be installed separately with pip; any reasonably recent Python 3 installation already includes it. On a minimal Linux system, the only package you may need to install is Python itself:
# On Ubuntu or Debian-based systems
sudo apt install python3
You can confirm that hashlib is available by importing it in a Python script or interactive shell:
import hashlib
print(hashlib.algorithms_guaranteed)
This will display the hash algorithms guaranteed to be supported on every platform; hashlib.algorithms_available lists the (usually larger) set exposed by your interpreter and its OpenSSL build. The guaranteed algorithms include MD5, SHA-1, SHA-224, SHA-256, SHA-384, and SHA-512, and since Python 3.6 also the SHA-3 family and the BLAKE2 variants (BLAKE2b and BLAKE2s).
The hashlib module follows an object-oriented design pattern. To use it, you create a hash object for your chosen algorithm, update it with data, and then retrieve the digest (the hash value) in either binary form or as a hexadecimal string. The basic syntax looks like this:
import hashlib
# Create a hash object
hash_object = hashlib.sha256()
# Update with data
hash_object.update(b'data to hash')
# Get the hexadecimal representation
hex_digest = hash_object.hexdigest()
print(hex_digest)
It’s important to note that the update() method accepts only bytes objects, not strings. This is why we either need to use a bytes literal (prefixed with b) or encode strings using the encode() method before hashing them. This distinction is particularly important when working with text files versus binary files.
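To make the encoding requirement and the avalanche effect concrete, here is a small example; the strings are arbitrary, and only their last character differs:
import hashlib
# Strings must be encoded to bytes before hashing
text_a = "hello world"
text_b = "hello worlD"
hash_a = hashlib.sha256(text_a.encode('utf-8')).hexdigest()
hash_b = hashlib.sha256(text_b.encode('utf-8')).hexdigest()
# Despite the one-character difference, the two digests share no recognizable pattern
print(hash_a)
print(hash_b)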
Basic Implementation: Hashing Small Files
Now that we understand the hashlib module, let’s implement a basic solution for calculating the hash of a small file. The key to hashing files in Python is to read the file in binary mode (‘rb’) rather than text mode, as this ensures consistent results across different platforms and avoids issues with line ending conversions.
Here’s a step-by-step implementation to calculate the SHA-256 hash of a file:
import hashlib
def calculate_file_hash(filename):
"""Calculate SHA-256 hash of a file."""
# Create a new hash object
hash_object = hashlib.sha256()
# Open the file in binary mode
with open(filename, 'rb') as file:
# Read the entire file content
file_content = file.read()
# Update the hash object with the file content
hash_object.update(file_content)
# Return the hexadecimal representation of the hash
return hash_object.hexdigest()
# Example usage
file_path = "example.txt"
file_hash = calculate_file_hash(file_path)
print(f"SHA-256 hash of {file_path}: {file_hash}")
This implementation works well for small files that can fit entirely in memory. The code opens the file in binary mode, reads all its contents into memory, and then updates the hash object with this data. Finally, it returns the hexadecimal representation of the hash using the hexdigest() method.
To verify that our implementation is correct, we can compare its output with that of Linux command-line utilities. For example, we can verify an SHA-256 hash using the sha256sum command:
sha256sum example.txt
If both outputs match, our implementation is working correctly. This verification step is crucial when developing hash calculation code, as it ensures that our implementation conforms to established standards.
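You can also automate this comparison from Python. The snippet below is a minimal sketch that shells out to sha256sum (assumed to be installed, as it is on most Linux distributions) and compares its output with the calculate_file_hash() function defined above:
import subprocess
def matches_sha256sum(filename):
    """Compare our Python hash with the output of the sha256sum utility."""
    # sha256sum prints "<hash>  <filename>"; take the first field
    completed = subprocess.run(['sha256sum', filename],
                               capture_output=True, text=True, check=True)
    system_hash = completed.stdout.split()[0]
    return system_hash == calculate_file_hash(filename)
print(matches_sha256sum("example.txt"))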
Advanced Techniques
The basic implementation works well for small files, but it becomes problematic when dealing with large files that exceed the available RAM. Loading an entire multi-gigabyte file into memory would cause performance issues or even crash your program. A more efficient approach is to read the file in chunks and update the hash object incrementally.
Here’s an optimized implementation for hashing large files:
import hashlib
def calculate_large_file_hash(filename, chunk_size=65536):
"""Calculate SHA-256 hash of a large file by reading it in chunks."""
hash_object = hashlib.sha256()
with open(filename, 'rb') as file:
while True:
# Read file in chunks of 64KB
chunk = file.read(chunk_size)
# If end of file, break the loop
if not chunk:
break
# Update hash with the chunk data
hash_object.update(chunk)
return hash_object.hexdigest()
This implementation reads the file in 64KB chunks (65536 bytes), which is a good balance between memory usage and performance for most systems. For each chunk, we update the hash object incrementally. This approach allows us to calculate hashes for files of any size while maintaining a constant memory footprint.
The choice of chunk size can affect performance. Smaller chunks reduce memory usage but may increase I/O operations, while larger chunks might improve performance at the cost of higher memory usage. For most applications, a chunk size between 4KB and 1MB works well, with 64KB being a common default.
When dealing with very large files (several gigabytes or more), you might want to add a progress indicator to inform users about the hashing progress. This can be implemented by tracking the number of chunks processed and calculating the percentage based on the file size:
import hashlib
import os
def calculate_large_file_hash_with_progress(filename, chunk_size=65536):
"""Calculate SHA-256 hash of a large file with progress reporting."""
hash_object = hashlib.sha256()
# Get file size
file_size = os.path.getsize(filename)
bytes_processed = 0
with open(filename, 'rb') as file:
while True:
chunk = file.read(chunk_size)
if not chunk:
break
hash_object.update(chunk)
bytes_processed += len(chunk)
# Calculate and print progress
progress = (bytes_processed / file_size) * 100
print(f"Progress: {progress:.2f}%", end='\r')
print("\nHashing complete!")
return hash_object.hexdigest()
This implementation provides a smoother user experience when processing large files, as it gives feedback about the hashing progress.
Multiple Hash Algorithms
In many applications, you might want to calculate multiple hash values using different algorithms simultaneously. This can be useful for compatibility with different systems or to provide stronger verification by comparing multiple hash values.
Here’s an implementation that calculates hashes using multiple algorithms in a single pass through the file:
import hashlib
def calculate_multiple_hashes(filename, algorithms=None, chunk_size=65536):
"""Calculate multiple hash values for a file in a single pass."""
if algorithms is None:
algorithms = ['md5', 'sha1', 'sha256']
# Initialize hash objects for each algorithm
hash_objects = {}
for algorithm in algorithms:
hash_func = getattr(hashlib, algorithm)
hash_objects[algorithm] = hash_func()
# Process the file once for all hash algorithms
with open(filename, 'rb') as file:
while True:
chunk = file.read(chunk_size)
if not chunk:
break
# Update all hash objects with the same chunk
for hash_obj in hash_objects.values():
hash_obj.update(chunk)
# Collect results
results = {}
for algorithm, hash_obj in hash_objects.items():
results[algorithm] = hash_obj.hexdigest()
return results
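For example, assuming a file named example.txt exists in the current directory, the function can be used like this:
hashes = calculate_multiple_hashes("example.txt")
for algorithm, digest in hashes.items():
    print(f"{algorithm}: {digest}")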
Let’s compare the different hash algorithms available in hashlib in terms of security and performance:
- MD5: The fastest but considered cryptographically broken. It’s not suitable for security-critical applications but might be useful for non-security hash verification or backward compatibility.
- SHA-1: Faster than SHA-2 family but also considered cryptographically weak. Like MD5, it should be avoided for security-critical applications.
- SHA-256: Part of the SHA-2 family, it provides a good balance between security and performance. It’s widely used and recommended for most applications as of 2025.
- SHA-384/SHA-512: These provide stronger security than SHA-256 but are slightly slower. They might be preferable for highly sensitive applications.
- BLAKE2: A newer hash function designed to be faster than MD5 while providing security comparable to SHA-3. Available in Python 3.6 and later as BLAKE2b (optimized for 64-bit platforms) and BLAKE2s (optimized for 32-bit platforms).
When choosing a hash algorithm, consider the balance between security requirements and performance needs. For most applications in 2025, SHA-256 is the recommended default, offering a good compromise between security and speed. However, for highly sensitive applications where security is paramount, consider using SHA-384, SHA-512, or BLAKE2.
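If you opt for BLAKE2, it is used through hashlib like any other algorithm; BLAKE2b additionally accepts a digest_size parameter (and an optional key for keyed hashing). A minimal sketch, reusing the chunked-reading pattern from earlier:
import hashlib
# BLAKE2b with a 32-byte digest, the same length as a SHA-256 digest
hash_object = hashlib.blake2b(digest_size=32)
with open("example.txt", 'rb') as file:
    for chunk in iter(lambda: file.read(65536), b''):
        hash_object.update(chunk)
print(hash_object.hexdigest())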
Building a Versatile File Hashing Tool
Now let’s combine our knowledge to build a versatile command-line tool for file hashing. This tool will support different hash algorithms, handle multiple files, and provide various output formats.
import hashlib
import argparse
import os
import json
import sys
from datetime import datetime
def calculate_file_hash(filename, algorithm='sha256', chunk_size=65536):
"""Calculate hash of a file using specified algorithm."""
hash_func = getattr(hashlib, algorithm)
hash_object = hash_func()
try:
with open(filename, 'rb') as file:
while True:
chunk = file.read(chunk_size)
if not chunk:
break
hash_object.update(chunk)
return hash_object.hexdigest()
except IOError as e:
print(f"Error reading file {filename}: {e}")
return None
def process_directory(directory, algorithm, recursive=False, pattern='*'):
"""Process all files in a directory, optionally recursively."""
import fnmatch
results = {}
if recursive:
for root, _, files in os.walk(directory):
for filename in fnmatch.filter(files, pattern):
filepath = os.path.join(root, filename)
results[filepath] = calculate_file_hash(filepath, algorithm)
else:
for filename in fnmatch.filter(os.listdir(directory), pattern):
filepath = os.path.join(directory, filename)
if os.path.isfile(filepath):
results[filepath] = calculate_file_hash(filepath, algorithm)
return results
def main():
"""Main function for the file hashing tool."""
# Set up command-line argument parser
parser = argparse.ArgumentParser(description='Calculate hash values of files.')
parser.add_argument('paths', nargs='+', help='Files or directories to hash')
parser.add_argument('--algorithm', '-a', default='sha256',
choices=['md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512'],
help='Hash algorithm to use (default: sha256)')
parser.add_argument('--recursive', '-r', action='store_true',
help='Process directories recursively')
parser.add_argument('--output', '-o', choices=['text', 'json'], default='text',
help='Output format (default: text)')
args = parser.parse_args()
results = {}
# Process each specified path
for path in args.paths:
if os.path.isfile(path):
results[path] = calculate_file_hash(path, args.algorithm)
elif os.path.isdir(path):
dir_results = process_directory(path, args.algorithm, args.recursive)
results.update(dir_results)
else:
print(f"Error: {path} is not a valid file or directory")
# Output results in the specified format
if args.output == 'text':
for filepath, hash_value in results.items():
if hash_value:
print(f"{hash_value} {filepath}")
elif args.output == 'json':
json_results = {
'timestamp': datetime.now().isoformat(),
'algorithm': args.algorithm,
'files': {filepath: hash_value for filepath, hash_value in results.items() if hash_value}
}
print(json.dumps(json_results, indent=2))
if __name__ == '__main__':
    main()
This versatile tool supports multiple hash algorithms, can process individual files or entire directories (recursively if needed), and can output results in either text format (compatible with standard Linux utilities) or JSON format (useful for programmatic processing).
To use this tool, save it as a Python script (e.g., `file_hasher.py`) and run it from the command line:
# Calculate SHA-256 hash of a single file
python file_hasher.py myfile.txt
# Calculate MD5 hash of multiple files
python file_hasher.py --algorithm md5 file1.txt file2.txt
# Calculate SHA-512 hashes for all files in a directory, recursively
python file_hasher.py --algorithm sha512 --recursive /path/to/directory
# Output results in JSON format
python file_hasher.py --output json --recursive /path/to/directory
This tool provides a solid foundation that you can extend with additional features like verification against expected hash values, integration with databases for hash storage, or parallel processing for better performance on multi-core systems.
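As a starting point for the first of those extensions, verification against an expected hash value can be a thin wrapper around the calculate_file_hash() function above. The sketch below uses hmac.compare_digest() for the comparison, a constant-time function discussed in more detail later in this guide:
import hmac
def verify_file_hash(filename, expected_hash, algorithm='sha256'):
    """Return True if the file's hash matches the expected value."""
    actual_hash = calculate_file_hash(filename, algorithm)
    if actual_hash is None:
        return False
    return hmac.compare_digest(actual_hash, expected_hash.lower())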
File Integrity Verification Systems
Now that we have a robust way to calculate file hashes, let’s explore how to build a file integrity verification system. Such systems are crucial for detecting unauthorized modifications to files, which could indicate security breaches or data corruption.
The basic idea is to calculate and store hash values for a set of files, then periodically recalculate those hashes and compare them with the stored values to detect changes. Here’s a simplified implementation of a file integrity monitoring system:
import hashlib
import os
import json
import time
from datetime import datetime
class FileIntegrityMonitor:
"""Simple file integrity monitoring system."""
def __init__(self, hash_db_path='hashes.json'):
"""Initialize with path to hash database file."""
self.hash_db_path = hash_db_path
self.hash_db = self._load_hash_db()
def _load_hash_db(self):
"""Load hash database from file or initialize if not exists."""
if os.path.exists(self.hash_db_path):
try:
with open(self.hash_db_path, 'r') as f:
return json.load(f)
except (json.JSONDecodeError, IOError):
print(f"Error loading hash database, initializing new one")
return {}
return {}
def _save_hash_db(self):
"""Save hash database to file."""
with open(self.hash_db_path, 'w') as f:
json.dump(self.hash_db, f, indent=2)
def calculate_file_hash(self, filepath, algorithm='sha256'):
"""Calculate hash for a file."""
hash_func = getattr(hashlib, algorithm)
hash_object = hash_func()
try:
with open(filepath, 'rb') as file:
for chunk in iter(lambda: file.read(65536), b''):
hash_object.update(chunk)
return hash_object.hexdigest()
except IOError as e:
print(f"Error reading file {filepath}: {e}")
return None
def add_files(self, file_paths, algorithm='sha256'):
"""Add files to the monitoring database."""
timestamp = datetime.now().isoformat()
for filepath in file_paths:
if os.path.isfile(filepath):
hash_value = self.calculate_file_hash(filepath, algorithm)
if hash_value:
self.hash_db[filepath] = {
'hash': hash_value,
'algorithm': algorithm,
'last_verified': timestamp,
'added': timestamp
}
print(f"Added {filepath} to monitoring")
else:
print(f"Skipping {filepath} - not a regular file")
self._save_hash_db()
def verify_files(self):
"""Verify all files in the database and report changes."""
timestamp = datetime.now().isoformat()
results = {
'timestamp': timestamp,
'modified': [],
'missing': [],
'unchanged': []
}
for filepath, info in list(self.hash_db.items()):
if not os.path.exists(filepath):
results['missing'].append(filepath)
continue
if not os.path.isfile(filepath):
results['missing'].append(filepath)
continue
current_hash = self.calculate_file_hash(filepath, info['algorithm'])
if current_hash != info['hash']:
results['modified'].append(filepath)
# Update the stored hash to the new value
self.hash_db[filepath]['hash'] = current_hash
self.hash_db[filepath]['last_verified'] = timestamp
else:
results['unchanged'].append(filepath)
self.hash_db[filepath]['last_verified'] = timestamp
self._save_hash_db()
return results
# Example usage
if __name__ == '__main__':
    monitor = FileIntegrityMonitor()
    # Add files to monitor
    monitor.add_files(['/etc/passwd', '/etc/group', '/etc/shadow'])
    # Verify files
    results = monitor.verify_files()
    # Print results
    print(f"Verification completed at {results['timestamp']}")
    print(f"Modified files: {len(results['modified'])}")
    for filepath in results['modified']:
        print(f" - {filepath}")
    print(f"Missing files: {len(results['missing'])}")
    for filepath in results['missing']:
        print(f" - {filepath}")
    print(f"Unchanged files: {len(results['unchanged'])}")
This implementation provides a basic file integrity monitoring system that can detect modified or missing files. In a real-world scenario, you would typically run this system as a cron job or systemd service to periodically check file integrity.
For a more robust solution, you might want to consider these enhancements:
1. Support for monitoring entire directories recursively (a minimal sketch follows this list)
2. Email or SMS notifications for detected changes
3. Integration with logging systems like syslog
4. Protection for the hash database itself (perhaps using encryption or digital signatures)
5. Whitelisting for expected changes (e.g., log files that change regularly)
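As an illustration of the first enhancement, a hypothetical add_directory() method (not part of the class above) could reuse add_files() and os.walk() to register every file under a directory tree. It is shown here as it would appear inside the class body:
    def add_directory(self, directory, algorithm='sha256'):
        """Recursively add every file under a directory to the monitoring database."""
        file_paths = []
        for root, _, files in os.walk(directory):
            for filename in files:
                file_paths.append(os.path.join(root, filename))
        self.add_files(file_paths, algorithm)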
Real-world Applications in Linux Environments
File hashing has numerous applications in Linux environments, particularly in system administration, security, and software development. Let’s explore some of these applications and how to implement them using Python.
System File Integrity Monitoring
In production Linux systems, monitoring the integrity of system files is crucial for detecting unauthorized modifications that could indicate a security breach. Many rootkits, for example, modify system binaries to hide their presence. By regularly checking the hashes of important system files against known good values, you can detect such modifications.
You can implement this by creating a cron job that runs your file integrity monitor regularly:
# Add to /etc/crontab to run integrity check every hour
0 * * * * root /usr/local/bin/integrity_check.py
Software Distribution Verification
When distributing software, providing hash values allows users to verify that they’ve received the exact file you intended to distribute, without any corruption or tampering. This is particularly important for security-sensitive software.
You can implement this by calculating and publishing hash values for your distribution packages:
import hashlib
import os
from datetime import datetime

def generate_distribution_hashes(directory, output_file='CHECKSUMS.txt'):
    """Generate hash values for all distribution files."""
    with open(output_file, 'w') as f:
        f.write("# SHA-256 checksums for distribution files\n")
        f.write("# Generated on: " + datetime.now().isoformat() + "\n\n")
        for filename in os.listdir(directory):
            filepath = os.path.join(directory, filename)
            if os.path.isfile(filepath):
                hash_value = calculate_file_hash(filepath)
                f.write(f"{hash_value}  {filename}\n")

def calculate_file_hash(filepath, algorithm='sha256'):
    """Calculate hash for a file."""
    hash_func = getattr(hashlib, algorithm)
    hash_object = hash_func()
    with open(filepath, 'rb') as file:
        for chunk in iter(lambda: file.read(65536), b''):
            hash_object.update(chunk)
    return hash_object.hexdigest()
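Users (or a release script) can then check downloaded files against the published checksums. The sketch below assumes the CHECKSUMS.txt format written above (hash, whitespace, filename) and reuses the calculate_file_hash() function from the same snippet:
def verify_distribution_hashes(directory, checksum_file='CHECKSUMS.txt'):
    """Verify files in a directory against a checksum manifest; return failures."""
    failures = []
    with open(checksum_file, 'r') as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue  # skip blank lines and comments
            expected_hash, filename = line.split(None, 1)
            filepath = os.path.join(directory, filename)
            if calculate_file_hash(filepath) != expected_hash:
                failures.append(filename)
    return failures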
Backup Verification
When creating backups, it’s essential to verify that the backed-up data is intact and hasn’t been corrupted during the backup process. You can use file hashing to create a manifest of all backed-up files and their hash values, which can later be used to verify the integrity of the backup.
import hashlib
import os
import json
import tarfile
def create_backup_with_manifest(source_dir, backup_file, manifest_file):
"""Create a backup archive with a hash manifest."""
# Calculate hashes for all files
manifest = {'files': {}}
for root, _, files in os.walk(source_dir):
for filename in files:
filepath = os.path.join(root, filename)
rel_path = os.path.relpath(filepath, source_dir)
hash_value = calculate_file_hash(filepath)
manifest['files'][rel_path] = hash_value
# Save manifest
with open(manifest_file, 'w') as f:
json.dump(manifest, f, indent=2)
# Create backup archive
with tarfile.open(backup_file, 'w:gz') as tar:
tar.add(source_dir, arcname=os.path.basename(source_dir))
tar.add(manifest_file, arcname=os.path.basename(manifest_file))
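A matching verification step can point at a restored copy of the backup and compare each file against the manifest. A minimal sketch, assuming the manifest format produced above and the calculate_file_hash() function defined earlier:
def verify_backup_manifest(restored_dir, manifest_file):
    """Compare restored files against the hash manifest; return mismatched paths."""
    with open(manifest_file, 'r') as f:
        manifest = json.load(f)
    mismatches = []
    for rel_path, expected_hash in manifest['files'].items():
        filepath = os.path.join(restored_dir, rel_path)
        if not os.path.isfile(filepath) or calculate_file_hash(filepath) != expected_hash:
            mismatches.append(rel_path)
    return mismatches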
These applications demonstrate the versatility and importance of file hashing in Linux environments. By understanding and implementing file hashing in your Python code, you can build more secure and reliable systems.
Security Considerations and Best Practices
When implementing file hashing, it’s important to be aware of security considerations and follow best practices to ensure your implementation is robust and secure.
Algorithm Selection
Not all hash algorithms are created equal, and some that were once considered secure are now known to be vulnerable:
- MD5: Considered cryptographically broken since 2004. Collisions can be generated in seconds on modern hardware. Avoid using MD5 for security-critical applications.
- SHA-1: Also considered weak against well-funded attackers. The first practical collision was demonstrated in 2017. Like MD5, it should be avoided for security applications.
- SHA-256 and above: Currently considered secure and recommended for most applications. As of 2025, there are no known practical attacks against SHA-256, SHA-384, or SHA-512.
- BLAKE2: A newer algorithm that provides strong security with better performance than SHA-3. It’s a good choice for new applications.
The general recommendation is to use at least SHA-256 for security-critical applications in 2025. For highly sensitive applications, consider SHA-384, SHA-512, or BLAKE2.
Protecting Hash Databases
If you’re building a file integrity monitoring system, the hash database itself becomes a valuable target. If an attacker can modify your hash database, they can hide their changes to monitored files. Consider these measures:
- Store your hash database on read-only media or in a secure location
- Use digital signatures or a keyed MAC (HMAC) to protect the hash database (see the sketch after this list)
- Implement access controls to restrict who can modify the hash database
- Keep backup copies of the hash database in different locations
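As one way to implement the second point, the database file can be accompanied by a keyed MAC computed with Python's hmac module. This is a minimal sketch rather than a full key-management solution; the key itself must be stored somewhere an attacker cannot reach, such as separate read-only media:
import hashlib
import hmac
def sign_hash_db(db_path, key):
    """Write an HMAC-SHA256 signature file next to the hash database."""
    # key must be a bytes object obtained from a protected location
    with open(db_path, 'rb') as f:
        signature = hmac.new(key, f.read(), hashlib.sha256).hexdigest()
    with open(db_path + '.sig', 'w') as f:
        f.write(signature)
def verify_hash_db(db_path, key):
    """Return True if the hash database matches its stored signature."""
    with open(db_path, 'rb') as f:
        expected = hmac.new(key, f.read(), hashlib.sha256).hexdigest()
    with open(db_path + '.sig', 'r') as f:
        stored = f.read().strip()
    return hmac.compare_digest(expected, stored)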
Timing Attacks
When comparing hash values, be aware of timing attacks where an attacker can infer information based on how long a comparison takes. Use constant-time comparison functions when comparing hash values in security-critical applications:
def constant_time_compare(val1, val2):
"""Compare two strings in constant time."""
if len(val1) != len(val2):
return False
result = 0
for x, y in zip(val1, val2):
result |= ord(x) ^ ord(y)
return result == 0
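In practice, prefer the standard library's hmac.compare_digest(), which performs the same constant-time comparison and also accepts bytes. The values below are placeholders (the SHA-256 digest of empty input) purely for illustration:
import hmac
calculated_hash = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
expected_hash = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
if hmac.compare_digest(calculated_hash, expected_hash):
    print("Hashes match")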
Common Pitfalls
Avoid these common mistakes when implementing file hashing:
- Using deprecated algorithms: As mentioned earlier, avoid MD5 and SHA-1 for security applications.
- Not checking error conditions: Always handle file I/O errors gracefully.
- Improper string encoding: When hashing strings, be consistent with your encoding (UTF-8 is recommended).
- Not validating input: Validate file paths and other user inputs to prevent security vulnerabilities.
- Ignoring performance considerations: For large files or high-throughput applications, optimize your implementation for performance.
By following these security considerations and best practices, you can ensure that your file hashing implementation is both secure and robust.
Performance Optimization Techniques
When working with large files or processing many files, performance becomes a critical consideration. Here are some techniques to optimize your file hashing implementation:
Chunked Reading with Optimal Buffer Size
The buffer size used when reading files can significantly impact performance. While the default 64KB (65536 bytes) chunk size works well for many cases, you might want to experiment with different sizes based on your specific hardware and workload. On modern systems with ample memory, larger buffers (e.g., 1MB or 4MB) might improve performance by reducing the number of I/O operations:
def calculate_hash_with_custom_buffer(filename, algorithm='sha256', buffer_size=4*1024*1024):
"""Calculate file hash with a custom buffer size."""
hash_func = getattr(hashlib, algorithm)
hash_object = hash_func()
with open(filename, 'rb') as file:
while True:
chunk = file.read(buffer_size)
if not chunk:
break
hash_object.update(chunk)
return hash_object.hexdigest()
Parallel Processing for Multiple Files
When hashing multiple files, you can leverage multiprocessing to utilize multiple CPU cores:
import hashlib
import os
from multiprocessing import Pool
def calculate_file_hash(filepath, algorithm='sha256'):
"""Calculate hash for a single file."""
hash_func = getattr(hashlib, algorithm)
hash_object = hash_func()
with open(filepath, 'rb') as file:
for chunk in iter(lambda: file.read(65536), b''):
hash_object.update(chunk)
return (filepath, hash_object.hexdigest())
def hash_files_parallel(file_list, algorithm='sha256', num_processes=None):
"""Hash multiple files in parallel using multiprocessing."""
with Pool(processes=num_processes) as pool:
# Create list of arguments for each file
args = [(filepath, algorithm) for filepath in file_list]
# Map the function across all files in parallel
results = pool.starmap(calculate_file_hash, args)
# Convert results to dictionary
return dict(results)
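Usage is straightforward; the file names below are placeholders. The if __name__ == '__main__' guard matters with multiprocessing, since worker processes may re-import the module:
if __name__ == '__main__':
    files = ['file1.iso', 'file2.iso', 'file3.iso']
    for filepath, digest in hash_files_parallel(files).items():
        print(f"{digest}  {filepath}")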
Memory-Mapped Files
For large files on 64-bit systems, memory-mapped files can sometimes provide better performance by leveraging the operating system's virtual memory capabilities. The example below memory-maps files up to a 1GB limit and falls back to chunked reading for empty or larger files:
import hashlib
import mmap
import os
def calculate_hash_mmap(filename, algorithm='sha256'):
"""Calculate file hash using memory-mapped files."""
hash_func = getattr(hashlib, algorithm)
hash_object = hash_func()
with open(filename, 'rb') as file:
# Get file size
file_size = os.path.getsize(filename)
# Only use mmap for files that aren't too large
if file_size > 0 and file_size < 1024*1024*1024: # 1GB limit
with mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as mm:
hash_object.update(mm)
else:
# Fall back to chunked reading for very large files
for chunk in iter(lambda: file.read(65536), b''):
hash_object.update(chunk)
return hash_object.hexdigest()
Profiling and Benchmarking
When optimizing for performance, it’s essential to measure the actual impact of your optimizations. Python’s built-in `timeit` module is useful for quick benchmarks:
import timeit
import hashlib
def benchmark_hash_algorithms(filename, iterations=5):
"""Benchmark different hash algorithms on a file."""
algorithms = ['md5', 'sha1', 'sha256', 'sha512']
results = {}
for algorithm in algorithms:
hash_func = getattr(hashlib, algorithm)
def hash_file():
h = hash_func()
with open(filename, 'rb') as f:
for chunk in iter(lambda: f.read(65536), b''):
h.update(chunk)
return h.hexdigest()
# Time the function
time_taken = timeit.timeit(hash_file, number=iterations)
results[algorithm] = time_taken / iterations
return results
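A small driver can print the results in a readable form (the file name is a placeholder):
if __name__ == '__main__':
    for algorithm, seconds in benchmark_hash_algorithms('large_file.bin').items():
        print(f"{algorithm}: {seconds:.3f} s per run")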
By applying these performance optimization techniques, you can significantly improve the efficiency of your file hashing implementation, particularly when working with large files or processing many files in batch.