7
\$\begingroup\$

I am a high school intern, and I am trying to parse my mentor's code so that he can read in an XML file and call simple methods to edit or get information from his XML file.

I was hoping someone could provide me with some (any) feedback - ways to optimize the code perhaps?

The XML file that my mentor uses is: (It's not necessary to look at his code - it's just for reference / a general idea)

<global> <fringe_depth value="3"/> <ignore_direction dir="z"/> <minimize_overlap/> <operating_conditions velocity="0.99756405,0.069756473,0" pressure="0"/> <solver method="BlockImplicit" equations="IncompressibleEuler" num_time_steps="1" steady_state="yes" time_step="0.1" cfl="1280"> <block_implicit num_dual_time_steps="500" max_inviscid_order="5"/> </solver> <output> <grid_file filename="Plot/plot.g" style="p3d" frequency="-1" byte_order="little_endian" format="unformatted"/> <solution_file filename="Plot/plot.func" style="func" frequency="-1" byte_order="little_endian" format="unformatted"/> <grid_file filename="Plot/plot.xyz" style="p3dwib" frequency="-1" format="formatted"/> <solution_file filename="Plot/plot.q" style="p3dq" frequency="-1" format="formatted"/> <domain_connectivity filename="output++.dci" style="dci"/> </output> <body name="root"> <body name="airfoil"> <volume_grid name="left" style="p3d" filename="Grids/block_1.sp3dudl"> <skip_overlap_opt set_dsf_value="1e-20"/> <boundary_surface name="imin"> <region range1="min" range2="all" range3="all"/> <boundary_condition type="overlap"/> <solver_boundary_condition type="overlap"/> </boundary_surface> <boundary_surface name="imax"> <region range1="max" range2="all" range3="all"/> <boundary_condition type="overlap"/> <solver_boundary_condition type="overlap"/> </boundary_surface> <boundary_surface name="jmin"> <region range1="all" range2="min" range3="all"/> <boundary_condition type="solid"/> <solver_boundary_condition type="solid"/> </boundary_surface> <boundary_surface name="jmax"> <region range1="all" range2="max" range3="all"/> <boundary_condition type="overlap"/> <solver_boundary_condition type="overlap"/> </boundary_surface> <boundary_surface name="kmin"> <region range1="all" range2="all" range3="min"/> <boundary_condition type="symmetry"/> <solver_boundary_condition type="zero_gradient"/> </boundary_surface> <boundary_surface name="kmax"> <region range1="all" range2="all" range3="max"/> <boundary_condition type="symmetry"/> <solver_boundary_condition type="zero_gradient"/> </boundary_surface> </volume_grid> <volume_grid name="right" style="p3d" filename="Grids/block_2.sp3dudl"> <skip_overlap_opt set_dsf_value="1e-20"/> <boundary_surface name="imin"> <region range1="min" range2="all" range3="all"/> <boundary_condition type="overlap"/> <solver_boundary_condition type="overlap"/> </boundary_surface> <boundary_surface name="imax"> <region range1="max" range2="all" range3="all"/> <boundary_condition type="overlap"/> <solver_boundary_condition type="overlap"/> </boundary_surface> <boundary_surface name="jmin"> <region range1="all" range2="min" range3="all"/> <boundary_condition type="solid"/> <solver_boundary_condition type="solid"/> </boundary_surface> <boundary_surface name="jmax"> <region range1="all" range2="max" range3="all"/> <boundary_condition type="overlap"/> <solver_boundary_condition type="overlap"/> </boundary_surface> <boundary_surface name="kmin"> <region range1="all" range2="all" range3="min"/> <boundary_condition type="symmetry"/> <solver_boundary_condition type="zero_gradient"/> </boundary_surface> <boundary_surface name="kmax"> <region range1="all" range2="all" range3="max"/> <boundary_condition type="symmetry"/> <solver_boundary_condition type="zero_gradient"/> </boundary_surface> </volume_grid> </body> <body name="box"> <volume_grid name="box" style="p3d" filename="Grids/block_3.sp3dudl" cartesian_grid="yes"> <boundary_surface name="imin"> <region range1="min" range2="all" range3="all"/> <boundary_condition type="farfield"/> <solver_boundary_condition type="farfield"/> </boundary_surface> <boundary_surface name="imax"> <region range1="max" range2="all" range3="all"/> <boundary_condition type="farfield"/> <solver_boundary_condition type="farfield"/> </boundary_surface> <boundary_surface name="jmin"> <region range1="all" range2="min" range3="all"/> <boundary_condition type="farfield"/> <solver_boundary_condition type="farfield"/> </boundary_surface> <boundary_surface name="jmax"> <region range1="all" range2="max" range3="all"/> <boundary_condition type="farfield"/> <solver_boundary_condition type="farfield"/> </boundary_surface> <boundary_surface name="kmin"> <region range1="all" range2="all" range3="min"/> <boundary_condition type="symmetry"/> <solver_boundary_condition type="zero_gradient"/> </boundary_surface> <boundary_surface name="kmax"> <region range1="all" range2="all" range3="max"/> <boundary_condition type="symmetry"/> <solver_boundary_condition type="zero_gradient"/> </boundary_surface> </volume_grid> </body> </body> </global> 

The code that I have written in Python is what I would like feedback on:

import xml.etree.ElementTree as ET import math def parse(fileName): global tree, root tree = ET.parse(fileName) root = tree.getroot() ''' Returns total number of 'volume_grid' elements ''' def getNumGrids(): count = 0 for volume_grid in root.iter('volume_grid'): count+=1 return count ''' Returns all of 'the volume_grid' name attributes as a list ''' def getGridNames(): att = [] for grid in root.iter('volume_grid'): att.append(grid.attrib) names = [] for n in att: names.append(n.get('name')) return names ''' Helper method for getBoundarySurfaceNamesForVolumeGrid(grid) Returns list of attributes for a particular 'volume_grid', which is given as parameter ''' def getGridBoundarySurfaceNameAttributes(volumeGridName): selectedAtt = [] att = [] names = [] for v in root.iter('volume_grid'): att.append(v.attrib) for a in att: names.append(a.get('name')) for n in names: if n == volumeGridName: for surface in v.iter('boundary_surface'): selectedAtt.append(surface.attrib) return selectedAtt ''' Returns a list of the names for a particular volume_grid, which is given as a parameter ''' def getBoundarySurfaceNamesForVolumeGrid(grid): g = getGridNames() n = [] for names in g: if names == grid: n = (getGridBoundarySurfaceNameAttributes(names)) return n ''' Changes 'solver_boundary_condition' type attribute in a specified volume_grid takes 3 parameters: 1. 'volumeGridName' -- name of the 'volume_grid' 2. 'boundarySurfaceName' -- 'boundary_surface' 'name' attribute 3. 'newType' -- the new 'type' for the 'solver_boundary_condition' ''' def setSolverBoundaryConditionType(volumeGridName, boundarySurfaceName, newType): for v in root.iter('volume_grid'): if v.get('name') == volumeGridName: for b in v.iter('boundary_surface'): if b.get('name') == boundarySurfaceName: for s in b.iter('solver_boundary_condition'): s.set('type', newType) ''' returns type of 'solver_boundary_condition' within a volume grid and boundary surface ''' def getSolverBoundaryConditionType(volumeGridName, boundarySurfaceName): for v in root.iter('volume_grid'): if v.get('name') == volumeGridName: for b in v.iter('boundary_surface'): if b.get('name') == boundarySurfaceName: for s in b.iter('solver_boundary_condition'): return s.get('type') ''' Changes all solver_boundary_condition 'type' attributes from the first parameter 'old' to the second 'new'. Only changes if 'type' originally contains 'old' ''' def setAllSameSolverBoundaryConditionTypes(old, new): for v in root.iter('volume_grid'): for b in v.iter('boundary_surface'): for s in b.iter('solver_boundary_condition'): if s.get('type') == old: s.set('type', new) ''' Changes 'boundary_condition' type attribute in a specified volume_grid Takes 3 parameters: 1. 'volumeGridName' -- name of the 'volume_grid' 2. 'boundarySurfaceName' -- 'boundary_surface' 'name' attribute 3. 'newType' -- the new 'type' for the 'solver_boundary_condition' ''' def setBoundaryConditionType(volumeGridName, boundarySurfaceName, newType): for v in root.iter('volume_grid'): if v.get('name') == volumeGridName: for b in v.iter('boundary_surface'): if b.get('name') == boundarySurfaceName: for s in b.iter('boundary_condition'): s.set('type', newType) ''' Changes all 'boundary_condition' 'type' attributes from the first parameter 'old' to the second 'new'. Only changes if 'type' originally contains 'old' ''' def setAllSameBoundaryConditionTypes(old, new): for v in root.iter('volume_grid'): for b in v.iter('boundary_surface'): for s in b.iter('boundary_condition'): if s.get('type') == old: s.set('type', new) ''' Given the volumeGrid name parameter (1st), the user can choose to edit either the style (2nd) or filename (3rd) parameters. If the user wants only to edit one of the two editable parameters, they can put in 'NC' (no change) for the unchanged parameter ''' def setVolumeGrid(name, style, fileName): for grid in root.iter('volume_grid'): if (grid.get('name') == name): if (style != 'NC'): grid.set('style', style) if (fileName != 'NC'): grid.set('filename', fileName) ''' helper method for setOperatingConditionsPolarVelocity() Changes operating conditions, takes in velocity and pressure parameters Velocity parameter is in form "x,y,z" 'NC' (no change) for any parameter that should not be changed ''' def setOperatingConditions(velocity, pressure): for cond in root.iter('operating_conditions'): if (velocity!= 'NC'): cond.set('velocity', velocity) if(pressure != 'NC'): cond.set('pressure', pressure) ''' Given a radius and theta as parameters, calculates the velocity for operating_conditions in xyz. Then sets the operating conditions attributes by calling setOperatingConditions() User can put in 'NC' (no change) for any parameter that should not be changed r = magnitude, theta = inflow angle ''' def setOperatingConditionsPolarVelocity(r, theta, pressure): x =str(r*math.cos(theta)) y = str(r*math.sin(theta)) z = str(0) velocity = ''.join(x + ',' + y + ',' + z) setOperatingConditions(velocity, pressure) ''' Returns the current operating conditions: velocity and pressure as a dictionary ''' def getOperatingConditions(): for cond in root.iter('operating_conditions'): return cond.attrib ''' Changes solver element attributes NC if no change ''' def setSolverType(method, equations, numTimeSteps, steadyState, timeStep, cfl): for sol in root.iter('solver'): if method!='NC': sol.set('method', method) if equations != 'NC': sol.set('equations', equations) if numTimeSteps != 'NC': sol.set('num_time_steps', numTimeSteps) if steadyState != 'NC': sol.set('steady_state', steadyState) if timeStep!='NC': sol.set('time_step', timeStep) if cfl !='NC': sol.set('cfl', cfl) ''' provides the attributes for output grid_file, given the filename The parameter filename is in the form Plot/Plot.name, but to make it easier for the user, he/she must only type in the name part For example, getOutputGridFileInfo(xyz) would work for the filename 'Plot/plot.xyz' ''' def getOutputGridFileInfo(filename): filename = 'Plot/plot.' + filename for file in root.iter('grid_file'): if (file.get('filename') == filename): return (file.attrib) ''' Allow the user to change the style, frequency, and format attributes within <output> <grid_file> given the filename The parameter filename is in the form Plot/Plot.name, but to make it easier for the user, he/she must only type in the name part Note: format is a built-in symbol, so formatting was used in place ''' def setOutputGridFile(filename, style, frequency, formatting): filename = 'Plot/plot.' + filename for g in root.iter('grid_file'): if g.get('filename') == filename: if style !='NC': g.set('style', style) if frequency != 'NC': g.set('frequency', frequency) if formatting != 'NC': g.set('format', formatting) ''' Returns all of the attricutes for a grid_file in output given a filename. The parameter filename is in the form Plot/Plot.name, but to make it easier for the user, he/she must only type in the name part ''' def getOutputSolutionFile(filename): filename = 'Plot/plot.' + filename for file in root.iter('solution_file'): if (file.get('filename') == filename): return (file.attrib) ''' returns the name of each body element in a list ''' def getBodyNames(): att = [] for body in root.iter('body'): att.append(body.attrib) names = [] for n in att: names.append(n.get('name')) return names ''' Changes the old body name (oldName parameter) to a new body name (newName parameter) ''' def setBodyName(oldName, newName): for body in root.iter('body'): if body.get('name') == oldName: body.set('name', newName) ''' Writes the file onto a new XML. Useful when importing this module because etree will not need to be imported/file re-parsed to write the file out from other module ''' def writeFile(name): tree.write(name) 

My mentor would import my module in terminal and use that to call the methods.

\$\endgroup\$

    1 Answer 1

    9
    \$\begingroup\$

    Overall, it’s pretty good. I can follow what you’re doing fairly easily, and the docstrings make it easy to see what each function does (even though I’ve not used this module before). Here are some suggestions for how you could make your code more “Pythonic”, and more compact:

    • Take a look at PEP 8, which is the style guide for Python. Two obvious ways in which your code could be considered “un-Pythonic”:

      • Docstrings should live inside the function definition, not directly before them. This is covered further in PEP 257:

        A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition.

        Single-line docstrings also don’t need separate lines for their opening and closing quotes; it can all go together.

      • The Python convention for function names is lowercase with underscores, not mixed case. There’s also a tendency to use long, descriptive names (e.g. attributes instead of att), which would make your code easier to read.

    • The parse function declares the global variables tree and root, but this isn’t a good way to do it: if you try to parse two files in a row, then what happens? Um… The fact that this is ambiguous or potentially confusing means you should probably do something different.

      I’d suggest turning this into a new class, with tree and root attributes. This allows you to work with multiple files in the same session. Something like this should get you started:

      class MentorTree(object): def __init__(self, filename): self.tree = ET.parse(filename) self.root = self.tree.getroot() 
    • Most of the functions use if and for loops. It all works, but you can use list comprehensions and data structures to make the code more compact, and less nested. (Lots of nested constructions aren’t necessarily a bad thing, but I prefer to avoid them when I don’t need them.)

      For example, in getNumGrids(), you can get the length of the iterator with a single list comprehension:

      def get_grid_count(self): """Return the number of 'volume_grid' elements.""" return sum(1 for _ in self.root.iter('volume_grid')) 

      Similarly for getGridNames():

      def get_grid_names(self): """Returns a list of the 'volume_grid' name attributes.""" attributes = [grid.attrib for grid in root.iter('volume_grid')] return [n.get('name') for n in attributes] 

      Lots of other functions can be modified this way.

      Where you have long strings of if statements of the form X != 'NC', you can clean then up with something like a dictionary or list. For example, here’s setSolverType():

      def set_solver_type(method, equations, num_time_steps, steady_state, time_step, cfl): """ Change the solver element attributes. Attribute is not updated if the argument is 'None'. """ element_attributes = [ 'method', 'equations', 'num_time_steps', 'steady_state', 'time_step', 'cfl' ] for sol in root.iter('solver'): for attribute in element_attributes: if eval(attribute) is not None: sol.set(attribute, eval(attribute)) 

      Horrible abuse of eval(), but there you are. Note that I replaced 'NC' by None.

    I feel like there’s more that could be done, but I need to go to bed and this seems like enough to get you started.

    \$\endgroup\$

      Start asking to get answers

      Find the answer to your question by asking.

      Ask question

      Explore related questions

      See similar questions with these tags.