Simple automation executing platform in Python

Question

I'm building a platform like Rundeck/AWX but for server reliability testing.

People could log into a web interface upload scripts, run them on servers and get statistics on them ( failure / success).

Each script is made up of three parts, probes to check if the server is fine, methods to do stuff in the server, and rollback to reverse what we did to the server.

First we run the probes, if they past we run the methods, wait a certain time the user that created the method put, then run the probes again to check if the server self healed, if not then we run the rollbacks and probes again, then send the data to the db.

I have limited experience with programming as a job and am very unsure if what I’m doing is good let alone efficient so I would love to get some really harsh criticism.

This is the micro-service that is in charge of running the scripts of the user’s request, it gets a DNS and the fault name (fault is the whole object of probes/methods/rollbacks).

#injector.py import requests from time import sleep import subprocess import time import script_manipluator as file_manipulator class InjectionSlave(): def __init__(self,db_api_url = "http://chaos.db.openshift:5001"): self.db_api_url = db_api_url def initiate_fault(self,dns,fault): return self._orchestrate_injection(dns,fault) def _orchestrate_injection(self,dns,fault_name): try : # Gets fault full information from db fault_info = self._get_fault_info(fault_name) except Exception as E : return { "exit_code":"1" ,"status": "Injector failed gathering facts" } try : # Runs the probes,methods and rollbacks by order. logs_object = self._run_fault(dns, fault_info) except : return { "exit_code":"1" ,"status": "Injector failed injecting fault" } try : # Sends logs to db to be stored in the "logs" collection db_response = self._send_result(dns,logs_object,"logs") return db_response except Exception as E: return { "exit_code":"1" ,"status": "Injector failed sending logs to db" } def _get_fault_info(self,fault_name): # Get json object for db rest api db_fault_api_url = "{}/{}/{}".format(self.db_api_url, "fault", fault_name) fault_info = requests.get(db_fault_api_url).json() # Get the names of the parts of the fault probes = fault_info["probes"] methods = fault_info["methods"] rollbacks = fault_info["rollbacks"] name = fault_info["name"] fault_structure = {'probes' : probes , 'methods' : methods , 'rollbacks' : rollbacks} # fault_section can be the probes/methods/rollbacks part of the fault for fault_section in fault_structure.keys(): fault_section_parts = [] # section_part refers to a specific part of the probes/methods/rollbacks for section_part in fault_structure[fault_section]: section_part_info = requests.get("{}/{}/{}".format(self.db_api_url,fault_section,section_part)).json() fault_section_parts.append(section_part_info) fault_structure[fault_section] = fault_section_parts fault_structure["name"] = name return fault_structure def _run_fault(self,dns,fault_info): try: # Gets fault parts from fault_info fault_name = fault_info['name'] probes = fault_info['probes'] methods = fault_info['methods'] rollbacks = fault_info['rollbacks'] except Exception as E : logs_object = {'name': "failed_fault" ,'exit_code' : '1' , 'status' : 'expirement failed because parameters in db were missing ', 'error' : E} return logs_object try : method_logs = {} rollback_logs = {} probe_after_method_logs = {} # Run probes and get logs and final probes result probes_result,probe_logs = self._run_probes(probes,dns) # If probes all passed continue if probes_result is True : probe_logs['exit_code'] = "0" probe_logs['status'] = "Probes checked on victim server successfully" # Run methods and get logs and how much time to wait until checking self recovery methods_wait_time, method_logs = self._run_methods(methods, dns) # Wait the expected recovery wait time sleep(methods_wait_time) probes_result, probe_after_method_logs = self._run_probes(probes, dns) # Check if server self healed after injection if probes_result is True: probe_after_method_logs['exit_code'] = "0" probe_after_method_logs['status'] = "victim succsessfully self healed after injection" else: probe_after_method_logs['exit_code'] = "1" probe_after_method_logs['status'] = "victim failed self healing after injection" # If server didnt self heal run rollbacks for rollback in rollbacks: part_name = rollback['name'] part_log = self._run_fault_part(rollback, dns) rollback_logs[part_name] = part_log sleep(methods_wait_time) probes_result, probe_after_method_logs = self._run_probes(probes, dns) # Check if server healed after rollbacks if probes_result is True: rollbacks['exit_code'] = "0" rollbacks['status'] = "victim succsessfully healed after rollbacks" else: rollbacks['exit_code'] = "1" rollbacks['status'] = "victim failed healing after rollbacks" else : probe_logs['exit_code'] = "1" probe_logs['status'] = "Probes check failed on victim server" logs_object = {'name': fault_name ,'exit_code' : '0' , 'status' : 'expirement ran as expected','rollbacks' : rollback_logs , 'probes' : probe_logs , 'method_logs' : method_logs, 'probe_after_method_logs' : probe_after_method_logs} if logs_object["probe_after_method_logs"]["exit_code"] == "0" : logs_object["successful"] = True else: logs_object["successful"] = False except Exception as E: logs_object = {'name': fault_name ,'exit_code' : '1' , 'status' : 'expirement failed because of an unexpected reason', 'error' : E} return logs_object def _inject_script(self,dns,script_path): # Run script proc = subprocess.Popen("python {} -dns {}".format(script_path,dns), stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True) # get output from proc turn it from binary to ascii and then remove /n if there is one output = proc.communicate()[0].decode('ascii').rstrip() return output def _run_fault_part(self,fault_part,dns): script, script_name = file_manipulator._get_script(fault_part) script_file_path = file_manipulator._create_script_file(script, script_name) logs = self._inject_script(dns, script_file_path) file_manipulator._remove_script_file(script_file_path) return logs def _str2bool(self,output): return output.lower() in ("yes", "true", "t", "1") def _run_probes(self,probes,dns): probes_output = {} # Run each probe and get back True/False boolean result for probe in probes : output = self._run_fault_part(probe, dns) result = self._str2bool(output) probes_output[probe['name']] = result probes_result = probes_output.values() # If one of the probes returned False the probes check faild if False in probes_result : return False,probes_output return True,probes_output def _get_method_wait_time(self,method): try: return method['method_wait_time'] except Exception as E : return 0 def _get_current_time(self): current_time = time.strftime('%Y%m%d%H%M%S') return current_time def _run_methods(self,methods,dns): method_logs = {} methods_wait_time = 0 for method in methods: part_name = method['name'] part_log = self._run_fault_part(method, dns) method_wait_time = self._get_method_wait_time(method) method_logs[part_name] = part_log methods_wait_time += method_wait_time return methods_wait_time,method_logs def _send_result(self,dns,logs_object,collection = "logs"): # Get current time to timestamp the object current_time = self._get_current_time() # Creating object we will send to the db db_log_object = {} db_log_object['date'] = current_time db_log_object['name'] = "{}-{}".format(logs_object['name'],current_time) db_log_object['logs'] = logs_object db_log_object['successful'] = logs_object['successful'] db_log_object['target'] = dns # Send POST request to db api in the logs collection db_api_logs_url = "{}/{}".format(self.db_api_url,collection) response = requests.post(db_api_logs_url, json = db_log_object) return response.content.decode('ascii')

#script_manipulator.py import os import requests def _get_script(fault_part): file_share_url = fault_part['path'] script_name = fault_part['name'] script = requests.get(file_share_url).content.decode('ascii') return script, script_name def _create_script_file(script, script_name): injector_home_dir = "/root" script_file_path = '{}/{}'.format(injector_home_dir, script_name) with open(script_file_path, 'w') as script_file: script_file.write(script) return script_file_path def _remove_script_file( script_file_path): os.remove(script_file_path) ```

Erik White · Accepted Answer · 2020-06-15 12:15:09Z

This is a bit much to go through all at once. It would be better if you could separate out the general concept illustrated by examples as a single review, and then specific implementation of components for other reviews.

I'm afraid I can't give much feedback on the overall concept, but I will highlight some areas that stood out to me.

Configuration

You have hardcoded configuration scattered throughout your code. This not only makes it more difficult to update, but also makes it inflexible. There are a range of options, but it will depend on your specific preferences and needs.

def __init__(self,db_api_url = "http://chaos.db.openshift:5001"):

current_time = time.strftime('%Y%m%d%H%M%S')

def _str2bool(self,output): return output.lower() in ("yes", "true", "t", "1")

Path manipulation

Don't do it manually! Trying to use string manipulation to concatenate file paths is full of pitfalls. Instead, you should use the pathlib standard library which removes all the headaches of worrying about getting the correct separator characters etc.

You should also not hard code configuration into your functions, at least provide a means of overriding it. For example your _create_script_file function:

def _create_script_file(script, script_name): injector_home_dir = "/root" script_file_path = '{}/{}'.format(injector_home_dir, script_name) with open(script_file_path, 'w') as script_file: script_file.write(script) return script_file_path

Could be rewritten:

def _create_script_file(script, script_name, injector_home_dir = "/root"): script_file_path = Path(injector_home_dir).joinpath(injector_home_dir, script_name) with open(script_file_path, 'w') as script_file: script_file.write(script) return script_file_path

Even better, load your injector_home_dir from configuration or load as a Path object in an initializer or somewhere.

String literals

This may be more of a personal preference, but I think fstrings are far more readable than string formatting:

db_fault_api_url = "{}/{}/{}".format(self.db_api_url, "fault", fault_name)

vs

db_fault_api_url = f"{self.db_api_url}/fault/{fault_name}")

List/dictionary comprehension

In this section you appear to be essentially filtering a dictionary. This can be greatly simplified since you're reusing the keys:

 # Get the names of the parts of the fault probes = fault_info["probes"] methods = fault_info["methods"] rollbacks = fault_info["rollbacks"] name = fault_info["name"] fault_structure = {'probes' : probes , 'methods' : methods , 'rollbacks' : rollbacks}

 # Get the names of the parts of the fault parts = ["probes", "methods", "rollbacks", "name"] fault_structure = {key: value for key, value in fault_info.items() if key in parts}

The keys used in parts appear to be reused in various places so they are a good candidate for storing in configuration.

Exception handling

I'm not keen on this section. There is a lot of repeated code, I would much prefer to return a value based on the exception. You also have what is essentially a bare exception where you catch any type of exception.

 def _orchestrate_injection(self,dns,fault_name): try : # Gets fault full information from db fault_info = self._get_fault_info(fault_name) except Exception as E : return { "exit_code":"1" ,"status": "Injector failed gathering facts" } try : # Runs the probes,methods and rollbacks by order. logs_object = self._run_fault(dns, fault_info) except : return { "exit_code":"1" ,"status": "Injector failed injecting fault" } try : # Sends logs to db to be stored in the "logs" collection db_response = self._send_result(dns,logs_object,"logs") return db_response except Exception as E: return { "exit_code":"1" ,"status": "Injector failed sending logs to db" }

Use a single try/catch block, store the response and then finally return at the end:

 def _orchestrate_injection(self,dns,fault_name): try : # Gets fault full information from db fault_info = self._get_fault_info(fault_name) # Runs the probes,methods and rollbacks by order. logs_object = self._run_fault(dns, fault_info) # Sends logs to db to be stored in the "logs" collection db_response = self._send_result(dns,logs_object,"logs") except SpecificExceptionType as E: # Examine exception and determine return message if e.args == condition: exception_message = "" else: exception_message = str(E) db_response = { "exit_code":"1" ,"status": exception_message } return db_response

Repetition and encapsulation

Consider where you're repeating code or large functions can be broken down into smaller, reusable parts. Your run_fault method is large, with a lot of branching. An obvious repetition is where you update the exit code:

# Check if server healed after rollbacks if probes_result is True: rollbacks['exit_code'] = "0" rollbacks['status'] = "victim succsessfully healed after rollbacks" else: rollbacks['exit_code'] = "1" rollbacks['status'] = "victim failed healing after rollbacks"

This makes for a nice little function:

def update_exit_status(log, exit_code, status_message = ""): if not status_message: if exit_code: status_message = "victim successfully healed after rollbacks" else: status_message = "victim failed healing after rollbacks" log["exit_code"] = "1" if exit_code else "0" log["status"] = status_message return log

You use a lot a dictionary manipulation throughout, it could be worthwhile to make a small class to contain this information. This would have the benefit of removing the need for so many magic strings where you retrieve information by keys, instead you could use the properties of your class. You could also then contain some of the data handling logic within you class, instead of spread throughout the rest of your methods.

thats exactly what I was looking for ! do you have any more comments on the run fault function ? — amos-baron, CommentedJun 15, 2020 at 19:02

pjz · Accepted Answer · 2020-06-15 17:53:49Z

@erik-white covered a lot of good ground, but a couple of other things jumped out at me:

if <x> is True: should be written as just if <x>:

 if logs_object["probe_after_method_logs"]["exit_code"] == "0" : logs_object["successful"] = True else: logs_object["successful"] = False

could be better written as just:

 logs_object["successful"] = probe_after_method_logs["exit_code"] == "0"

Stack Exchange Network

Simple automation executing platform in Python

2 Answers 2

Hot Network Questions

Simple automation executing platform in Python

2 Answers 2

Related

Hot Network Questions