2
\$\begingroup\$

Python beginner here. Created this program as a project to help learn python. Just ran this in a centos7vm so far to test. What it does is - zip up files containing certain extensions (excluding those defined in the var ext_list), logs the zip up process as it continues through, and when finished, send an email confirming the backup is finished to the person specified to send to. (For sending & receiving the email I have just used gmail) Also it will cleanup existing backups in the destination folder before creating the zip.

I would appreciate some suggestions for improvements. Things that I may have missed, things that would make this look more professional. That type of thing.

import os, sys, zipfile, fnmatch, glob, sys, time, datetime, smtplib, logging from datetime import date from datetime import time from datetime import datetime import os.path from pathlib import Path from email.mime.text import MIMEText from email.mime.multipart import MIMEMultipart from email.mime.base import MIMEBase from email import encoders # General variables dir_to_backup = '/home/admin/python' backup_dir = '/tmp/backupdir/' backup_log = '/tmp/backuplog.log' admin_email = 'email' # File extensions to exclude from backupzip ext_list = ('.txt','.jpg', '.py','.pyc', '.rpm') # Dates todays_date = date.today() current_time = datetime.time(datetime.now()) # zip file info zip_name = 'backup.zip' zip_file_name_full = ("%s_%s_%s" %(todays_date, current_time, zip_name)) zip_file_name_abs_path = os.path.join('/tmp/backupdir/', zip_file_name_full) # Delete previous backup zips from backup dir clear_backups = 'yes' # Email variables email_user = 'email' email_send = 'email' email_pass = 'emailpasswd' subject = 'Backup Completed' msg = MIMEMultipart() msg['From'] = email_user msg['To'] = email_send msg['Subject'] = subject body = 'See attached log file for details' msg.attach(MIMEText(body, 'plain')) # filename = '/tmp/backupdir/backuplog.log' # attachment = open(filename, 'rb') email_log_file = 'yes' # End email variables def logging_setup(): """ Configure logging """ logging.basicConfig(filename=backup_log, level=logging.DEBUG, filemode = 'w', format="%(asctime)s:%(levelname)s:%(message)s") logging_setup() def checkdir(): """ Check if the backup dir is present, if not create """ if not os.path.exists(backup_dir): os.makedirs(backup_dir) logging.debug("Backup dir has been created %s" , backup_dir) else: logging.debug("Backup dir exists proceed..") checkdir() def countbackups(): """ Clear all existing backups if the variable is set to yes """ if clear_backups == 'yes': for name in sorted(glob.glob('/tmp/backupdir/*backup*.zip')): logging.debug("Existing backup found: %s" , name) logging.debug("Removing the above backup...") else: logging.debug("No existing backups found") countbackups() def logging_setup(): logging.basicConfig(filename=backup_log, level=logging.DEBUG, filemode = 'w', format="%(asctime)s:%(levelname)s:%(message)s") logging_setup() def back_up_zip(): """ Create the zip file, with the date in the front """ logging.debug("\n\t\t\t\t ###### ZIP FILE CONTENTS #######\n") zip_file_name_abs_path = os.path.join(backup_dir, zip_file_name_full) zip_file_name = zipfile.ZipFile(zip_file_name_abs_path, 'w') for root, dirs, files in os.walk(dir_to_backup): zip_file_name.write(root) logging.debug("Adding directories: %s", root) logging.debug("\t Adding subdirs: %s", dirs) for file in files: if not file.endswith(ext_list): logging.debug("\t \tAdding files: %s ", file) zip_file_name.write(os.path.join(root, file), compress_type=zipfile.ZIP_DEFLATED) zip_file_name.close() logging.debug("\n\t\t\t\t ###### END OF ZIP FILE CONTENTS ######\n") logging.debug("Backup file was created: %s" , zip_file_name_abs_path) print("Backup file created") back_up_zip() def send_email(): """ Send an email to the admin with the log file attached """ filename = backup_log attachment = open(filename, 'rb') if email_log_file == 'yes': part = MIMEBase('application', 'octet-stream') part.set_payload((attachment).read()) encoders.encode_base64(part) part.add_header('Content-Disposition', 'attachment', filename= filename) msg.attach(part) text = msg.as_string() server = smtplib.SMTP('smtp.gmail.com: 587') server.starttls() server.login(email_user, email_pass) server.sendmail(email_user, email_send, text) logging.debug("Backup sent to user: %s", email_send) server.quit() print("Email sent") else: logging.debug("Email option not selected") send_email() 
\$\endgroup\$

    1 Answer 1

    3
    \$\begingroup\$
    import os, sys, zipfile, fnmatch, glob, sys, time, datetime, smtplib, logging 

    is a prime example why you should not cram all your imports on a single line. Why? If you have a closer look, you will see that there is a double import of sys.

    It is also good practice to group imports by "topic", i.e. imports that have similar purposes. Something like

    import os import glob import zipfile import logging import datetime from datetime import date, datetime import smtplib from email import encoders from email.mime.text import MIMEText from email.mime.base import MIMEBase from email.mime.multipart import MIMEMultipart 

    As you can see some of your imports also got lost on the way. Basically it was my IDE (Visual Studio Code) telling me that they're not used in your program. At the moment I'm under Windows so I cannot test run your code, so take that with a grain of caution.


    Another common practice is to wrap the parts of the code that are supposed to be run as script, with an if name == "__main__": clause. In order to do that, you should collect all the loose function falls immediately done after the function definition into that block.

    if __name__ == "__main__": logging_setup() checkdir() countbackups() back_up_zip() send_email() 

    The major issue of your code is the excessive amount of global variables that make up 100% of your data flow between the functions. You should definitely think about wrapping most of the global variables into a main function or however you want to call it, and let your subroutines accept parameters.

    Also please don't store user credentials in script files. Everything that is configurable (username, password, email address, smtp server, smtp port) should be stored in an external configuration file and/or environment variables. This also avoids lot of headaches if you ever get tempted to put your script under version control (think GitHub).


    I will not write much about the core code of your script, but just give a few hints towards minor improvements. I'm sure if there are major issues, other members of the community will pick them up.

    The first thing would be string formatting. I think the style from the logging library carried over to the rest of your code. However, the "old" % string formatting can be replaced by the much more convenient f-strings in Python 3. They are as easy to use as zip_file_name_full = f"{todays_date}_{current_time}_{zip_name}".

    Another best practice that is around quite a bith longer is to use the with statement when handling resources that need to be closed afterwards, such as files. With attachment = open(filename, 'rb') the file won't be closed when leaving the function (the garbage collector will eventually take care of it). Using with open(filename, 'rb') as attachment: instead ensures that the file gets closed no matter what, including exceptions of any kind in the code. As a matter of fact, smtplib.STMP may also be used with with.

    \$\endgroup\$
    1
    • \$\begingroup\$Thanks. Ive made some changes as you suggested. Split this out into 2 files, one basically with the configurations, and the other with the functions\$\endgroup\$
      – anfield
      CommentedMay 6, 2019 at 13:42

    Start asking to get answers

    Find the answer to your question by asking.

    Ask question

    Explore related questions

    See similar questions with these tags.