Compare Two Lists

The Tale

On a single computer you are copying files from directory A to directory B. Things have gotten a little confused and you want to verify the transfers.

Project #1

You have two directories (folders) containing files. You want to make sure the files are the same. Compare the two directories.

Create a program that:

Project #2

The same as Project #1 except compare file sizes.
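One possible way to approach the size comparison is sketched below. It assumes both directories are flat and that files are matched by name; the helper names list_plain_files and size_mismatches are made up for this illustration and are not part of the original project.

```python
import os

def list_plain_files(directory):
    """Return the set of regular-file names directly inside directory."""
    return {
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    }

def size_mismatches(dir_a, dir_b):
    """Return (name, size_a, size_b) for same-named files whose sizes differ."""
    # only names present in both directories can be compared by size
    common = list_plain_files(dir_a) & list_plain_files(dir_b)
    mismatches = []
    for name in sorted(common):
        size_a = os.path.getsize(os.path.join(dir_a, name))
        size_b = os.path.getsize(os.path.join(dir_b, name))
        if size_a != size_b:
            mismatches.append((name, size_a, size_b))
    return mismatches
```

Equal sizes do not prove that two files are identical, so this check is best treated as a quick first pass.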

Project #3

The same as Project #1 except compare the internal data in files. (Compare bytes, lines, checksums?)

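Note: Checksums are usually the most practical choice, because each file is reduced to one short value that is easy to store and compare.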
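If you want to try the byte-by-byte option, the standard library's filecmp module can do the comparison for you. This is only a sketch; the wrapper name same_contents is an assumption made for this example.

```python
import filecmp

def same_contents(path_a, path_b):
    """Return True if the two files hold identical bytes."""
    # shallow=False makes filecmp read and compare the file contents
    # instead of trusting the os.stat() signature (type, size, mtime).
    return filecmp.cmp(path_a, path_b, shallow=False)
```

Because both files are on the same computer, a direct comparison like this works well. Checksums become more useful when the two copies live on different machines.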

Project #4

The same as Projects #1, #2, and #3 except compare files in two directory trees.

Note: Remember, the same file name may be in two different directories, so compare paths relative to the top of each tree rather than bare file names.

Project #5

The same as Project #1 except compare links and directories.
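For the project that deals with links and directories, you first need to tell the entry types apart. The sketch below is one way to do that with os.lstat() and the stat module. The function name entry_type and the labels it returns are assumptions made for this illustration.

```python
import os
import stat

def entry_type(path):
    """Classify a path as 'link', 'directory', 'file', or 'other'."""
    # os.lstat() does not follow symbolic links, so a link is reported
    # as a link rather than as the thing it points to.
    mode = os.lstat(path).st_mode
    if stat.S_ISLNK(mode):
        return 'link'
    if stat.S_ISDIR(mode):
        return 'directory'
    if stat.S_ISREG(mode):
        return 'file'
    return 'other'
```

Checking for a link first matters: os.path.isfile() and os.path.isdir() follow links, so they cannot distinguish a link from its target.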

Code Hints

# ------------------------------------------------------------
# file MD5 Checksum
# ------------------------------------------------------------

import hashlib

def file_md5_checksum(filepath):
    with open(filepath, 'rb') as bfile:
        bdata = bfile.read()
    return hashlib.md5(bdata).hexdigest()
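The function above reads the whole file into memory before hashing it. For large files, a chunked variant along these lines may be easier on memory. The name file_checksum, the default algorithm, and the chunk size are assumptions chosen for this sketch.

```python
import hashlib

def file_checksum(filepath, algorithm='sha256', chunk_size=65536):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    hasher = hashlib.new(algorithm)      # e.g. 'md5', 'sha1', 'sha256'
    with open(filepath, 'rb') as bfile:
        while True:
            chunk = bfile.read(chunk_size)
            if not chunk:                # empty bytes means end of file
                break
            hasher.update(chunk)
    return hasher.hexdigest()
```

Calling file_checksum(path, 'md5') should give the same result as the whole-file MD5 function, since a hash fed in pieces equals the hash of the concatenated data.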

# ------------------------------------------------------------
# difference between two lists (list1 - list2)
# ------------------------------------------------------------

def difference_in_lists(list1,list2):
    diff = []
    for element in list1:
        if element not in list2:
            diff.append(element)
    return diff
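The loop above scans list2 once for every element of list1, which can get slow for long lists. A set-based alternative is sketched below. It assumes the elements are hashable (file names are) and that duplicates within a list do not matter; the function name compare_name_lists is made up for this example.

```python
def compare_name_lists(list1, list2):
    """Return (only_in_1, only_in_2, in_both) as three sorted lists."""
    set1, set2 = set(list1), set(list2)
    only_in_1 = sorted(set1 - set2)   # names missing from list2
    only_in_2 = sorted(set2 - set1)   # names missing from list1
    in_both = sorted(set1 & set2)     # names present in both lists
    return only_in_1, only_in_2, in_both
```

Returning all three groups in one call is handy here, because verifying a copy usually means asking what is missing, what is extra, and what can be compared further.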

# ------------------------------------------------------------
# list of files in a directory
# ------------------------------------------------------------

import os

# ------------------------------------------------------------
# ---- get a list of the filenames in a directory (skip links)
# ------------------------------------------------------------

def get_list_of_files(directory):

    # --- get a list of entries in the directory

    entries = os.listdir(directory)

    # --- collect all of the files in the directory

    files = []

    for f in entries:
        ff = os.path.join(directory, f)     # path + filename
        # --- keep regular files only; skip symbolic links
        if os.path.isfile(ff) and not os.path.islink(ff):
            files.append(f)

    files.sort()

    return files

# ------------------------------------------------------------
# ---- main
# ------------------------------------------------------------

if __name__ == '__main__':

    dir_old = 'd:/abc'
    dir_new = 'd:/xyz'

    old_files = get_list_of_files(dir_old)
    new_files = get_list_of_files(dir_new)

    print()
    print(f'{len(old_files)} files found in old dir')
    print(f'{len(new_files)} files found in new dir')
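For the directory-tree project, get_list_of_files() could be extended with os.walk() so that it descends into subdirectories. The sketch below returns paths relative to the top of the tree, so that the same file name in two different subdirectories stays distinct. The function name get_tree_files is an assumption made for this illustration.

```python
import os

def get_tree_files(root):
    """Return a sorted list of file paths under root, relative to root."""
    rel_paths = []
    # os.walk() does not follow links to directories by default
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):     # skip links, as in the hint above
                continue
            rel_paths.append(os.path.relpath(full, root))
    rel_paths.sort()
    return rel_paths
```

The relative paths from two trees can then be compared as two plain lists, in the same way as the file names from two single directories.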

Links

Different Types of Files in Linux

inode (Wikipedia)

File-system permissions (Wikipedia)

chmod (Wikipedia)

Python os.stat() method

stat — Interpreting stat() results