The Tale
On a single computer, you are copying files from directory A to directory B.
Things have gotten a little confused,
and you want to verify the transfers.
Project #1
You have two directories (folders) containing files. You want to make sure
the files are the same. Compare the two directories.
Create a program that:
- creates two lists of file names, one for each directory
- includes only regular files in each list
- displays the number of file names in each list
- displays the file names in list1 that are not in list2
- displays the file names in list2 that are not in list1
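One possible way to approach the steps above is with sets rather than lists. This is only a sketch: the function names regular_file_names and compare_names are made up for illustration, and os.scandir is used as an alternative to os.listdir.

```python
import os

def regular_file_names(dirpath):
    """Return the set of names of regular files (not links) in dirpath."""
    names = set()
    with os.scandir(dirpath) as entries:
        for entry in entries:
            # follow_symlinks=False: a symbolic link to a file is not counted
            if entry.is_file(follow_symlinks=False):
                names.add(entry.name)
    return names

def compare_names(dir1, dir2):
    """Return (only_in_dir1, only_in_dir2) as sorted lists of file names."""
    names1 = regular_file_names(dir1)
    names2 = regular_file_names(dir2)
    # set difference keeps the names that appear on one side only
    return sorted(names1 - names2), sorted(names2 - names1)
```

Calling compare_names with the two directory paths returns two lists that can be printed, and len(regular_file_names(...)) gives the count for each directory.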
Project #2
The same as Project #1 except compare file sizes.
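A minimal sketch of a size comparison, assuming you already have the file names that exist in both directories. The function name size_mismatches and its parameters are hypothetical.

```python
import os

def size_mismatches(dir1, dir2, names):
    """Return [(name, size1, size2)] for names whose sizes differ.

    names is assumed to contain only file names present in both directories.
    """
    mismatches = []
    for name in sorted(names):
        size1 = os.path.getsize(os.path.join(dir1, name))
        size2 = os.path.getsize(os.path.join(dir2, name))
        if size1 != size2:
            mismatches.append((name, size1, size2))
    return mismatches
```

Equal sizes do not prove that two files hold the same data, so this check only catches some of the differences.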
Project #3
The same as Project #1 except compare the internal data in files.
(Compare bytes, lines, or checksums?)
Note: Checksums are usually best, because each file is reduced to a short value that is quick to compare.
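The checksum idea can be sketched as follows. This version reads the file in chunks so that a large file does not have to fit in memory. The function names, the default algorithm (SHA-256), and the chunk size are assumptions chosen for illustration.

```python
import hashlib

def file_checksum(filepath, algorithm='sha256', chunk_size=65536):
    """Return the hex digest of a file, reading it chunk_size bytes at a time."""
    hasher = hashlib.new(algorithm)
    with open(filepath, 'rb') as bfile:
        while True:
            chunk = bfile.read(chunk_size)
            if not chunk:          # empty bytes object means end of file
                break
            hasher.update(chunk)
    return hasher.hexdigest()

def same_content(path1, path2):
    """Return True if both files have the same checksum."""
    return file_checksum(path1) == file_checksum(path2)
```

Passing algorithm='md5' produces the same kind of digest as an MD5 checksum computed in one read.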
Project #4
The same as Projects #1, #2, and #3 except compare files in two directory trees.
Note: Remember, the same file name may be in two different directories.
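One way to handle the same file name appearing in different directories is to identify each file by its path relative to the top of the tree. The sketch below assumes that approach. The function name relative_file_paths is hypothetical.

```python
import os

def relative_file_paths(root):
    """Return the set of regular-file paths under root, relative to root."""
    paths = set()
    # os.walk visits root and every subdirectory below it
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # keep regular files only; skip symbolic links
            if os.path.isfile(full) and not os.path.islink(full):
                paths.add(os.path.relpath(full, root))
    return paths
```

Two trees can then be compared with set differences on the returned paths, and a file named x.txt in subdirectory a stays distinct from x.txt in subdirectory b.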
Project #5
The same as Project #1, except compare links and directories.
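A possible starting point for listing links and directories is sketched below. The function name links_and_directories is made up for illustration, and the sketch assumes that "links" means symbolic links.

```python
import os

def links_and_directories(dirpath):
    """Return (links, directories) as sorted lists of entry names."""
    links = []
    directories = []
    with os.scandir(dirpath) as entries:
        for entry in entries:
            if entry.is_symlink():
                # symbolic links are reported as links, whatever they point to
                links.append(entry.name)
            elif entry.is_dir(follow_symlinks=False):
                directories.append(entry.name)
    return sorted(links), sorted(directories)
```

The two returned lists can be compared between directories in the same way as the file name lists in Project #1.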
Code Hints
# ------------------------------------------------------------
# file MD5 Checksum
# ------------------------------------------------------------
import hashlib

def file_md5_checksum(filepath):
    with open(filepath, 'rb') as bfile:
        bdata = bfile.read()
    return hashlib.md5(bdata).hexdigest()
# ------------------------------------------------------------
# difference between two lists (list1 - list2)
# ------------------------------------------------------------
def difference_in_lists(list1, list2):
    diff = []
    for element in list1:
        if element not in list2:
            diff.append(element)
    return diff
# ------------------------------------------------------------
# list of files in a directory
# ------------------------------------------------------------
import os
# ------------------------------------------------------------
# ---- get a list of the filenames in a directory (skip links)
# ------------------------------------------------------------
def get_list_of_files(dir):
    # --- get a list of entries in the directory
    entries = os.listdir(dir)
    # --- collect all of the files in the directory
    files = []
    for f in entries:
        ff = os.path.join(dir, f)    # path + filename
        if os.path.isfile(ff):
            if not os.path.islink(ff):
                files.append(f)
    files.sort()
    return files
# ------------------------------------------------------------
# ---- main
# ------------------------------------------------------------
if __name__ == '__main__':
    dir_old = 'd:/abc'
    dir_new = 'd:/xyz'
    old_files = get_list_of_files(dir_old)
    new_files = get_list_of_files(dir_new)
    print()
    print(f'{len(old_files)} files found in old dir')
    print(f'{len(new_files)} files found in new dir')
Links
Different Types of Files in Linux
inode (Wikipedia)
File-system permissions (Wikipedia)
chmod (Wikipedia)
Python os.stat() method
stat — Interpreting stat() results