I wrote a Python 2.7 script that compares two files byte by byte. filecmp was not suitable for my application because it compares metadata as well. How can I improve this code?
import os


def byte_comp(fname, fname2):
    """
    Compares two files byte by byte
    Returns True if the files match, False otherwise
    """
    read_size = 1048576  # Determined experimentally
    if os.path.getsize(fname) != os.path.getsize(fname2):
        # If two files do not have the same size, they cannot be the same
        return False
    with open(fname, "rb") as f:
        with open(fname2, "rb") as f2:
            count = 0  # Counts how many bytes have been read
            while count <= os.path.getsize(fname):
                # Loops until the whole file has been read
                if(f.read(read_size) != f2.read(read_size)):
                    # If the chunk of the file is not the same, the function returns False
                    return False
                count += read_size
            return True  # If the files are the same, it returns True
I would also appreciate help on how to make the function faster and less CPU-intensive.
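
For comparison, here is a minimal sketch of one way the same chunked approach could be tightened. It is not a benchmarked or definitive version. It checks the file sizes once, then loops until a read returns empty data instead of calling os.path.getsize on every iteration. The name byte_comp_eof and the chunk_size parameter are hypothetical and not part of the original script. The 1 MiB default is carried over from the read_size above.

```python
import os


def byte_comp_eof(fname, fname2, chunk_size=1048576):
    """Return True if both files have identical contents, False otherwise."""
    # Files of different sizes cannot match, so skip reading them entirely.
    if os.path.getsize(fname) != os.path.getsize(fname2):
        return False
    with open(fname, "rb") as f, open(fname2, "rb") as f2:
        while True:
            chunk = f.read(chunk_size)
            if chunk != f2.read(chunk_size):
                # The first differing chunk ends the comparison early.
                return False
            if not chunk:
                # Both reads returned empty data: end of file, no mismatch.
                return True
```

It is called the same way as the original, for example byte_comp_eof("a.bin", "b.bin"). Whether it is measurably faster depends on the files and the disk, since the work is mostly I/O. The main saving is the per-iteration os.path.getsize call.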