Insert docstring attributes in a Python file
We are using Napoleon-style docstrings for Python modules. However, two additional attributes, Data Owner and DAL Owner, need to be populated automatically in the docstrings, so that a given function looks like this:
    def func(self, arg1=None, arg2=None):
        """
        Returns the timeseries for the specified arg1 and arg2.

        Args:
            arg1: argument 1
            arg2: argument 2

        Returns:
            DataFrame containing timeseries of arg1 for arg2.

        DAL Owner: Team IT
        Data Owner: Team A
        """
These additional attributes and their values for a given function are provided in a separate CSV file. My idea was to have a script (awk, sed?) that will:
- Extract all the function names in a given Python file. This can easily be done in Python.
- For those function names, check whether an owner exists in the CSV file and, if so, create a mapping of function name to owner. Doable.
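The two steps above could be sketched roughly as follows. This is only an illustration: the CSV layout is not given in the question, so the column names `function`, `dal_owner` and `data_owner` are assumptions, as are the helper names `function_names` and `owner_mapping`.

```python
import ast
import csv
import io


def function_names(source):
    """Return the names of all functions and methods defined in source."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]


def owner_mapping(names, csv_text):
    """Map each function name that appears in the CSV to its owners.

    The column names used here are hypothetical; adjust them to the real file.
    """
    wanted = set(names)
    mapping = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["function"] in wanted:
            mapping[row["function"]] = {
                "DAL Owner": row["dal_owner"],
                "Data Owner": row["data_owner"],
            }
    return mapping


# Small in-memory stand-ins for the Python file and the CSV file.
source = "def func(self, arg1=None, arg2=None):\n    pass\n\ndef other():\n    pass\n"
csv_text = "function,dal_owner,data_owner\nfunc,Team IT,Team A\n"

names = function_names(source)
owners = owner_mapping(names, csv_text)
print(owners)
```

Functions without a row in the CSV (`other` in this example) are simply left out of the mapping.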
Now, here is the part I haven't figured out and where I don't know the best way forward. For a given function name and owner, I need to go back to the Python file and add the owners to the docstring, if one exists. I'm thinking of some sort of awk script, but I'm not quite sure. It would need to:
- Find the function that matches the pattern
- For that match, see whether a docstring exists, i.e. triple quotes after the closing parenthesis
- If a docstring exists, add two extra lines for the owners before the closing triple quotes
- If no docstring exists, insert the two owner lines between triple quotes on the line after the function declaration.
I know this is a lot of steps, but can anyone provide insight on the previous four bullet points, i.e. how to insert the additional attributes into docstrings given the function, the attributes and the Python file? Would Linux utilities such as sed or awk be more useful, or should I go the Python route? Is there some other option that is easier to implement?
The process for assigning a new docstring in AST is:
- Use ast.get_docstring to get the existing docstring
- Create a new AST node with the modified content
- If the existing docstring is None, insert the new node at the start of the parent node's body
- If there is an existing docstring, replace its node with a new one
- Use the unparse* tool from CPython's Tools directory to generate the new source code (you may need to download it from GitHub; make sure you get the version that matches your Python version).
Here is some sample code:
    $ cat fixdocstrings.py
    import ast
    import io
    from unparse import Unparser


    class DocstringWriter(ast.NodeTransformer):
        def visit_FunctionDef(self, node):
            docstring = ast.get_docstring(node)
            new_docstring_node = make_docstring_node(docstring)
            if docstring:
                # Assumes the existing docstring is the first node
                # in the function body.
                node.body[0] = new_docstring_node
            else:
                node.body.insert(0, new_docstring_node)
            return node


    def make_docstring_node(docstring):
        if docstring is None:
            content = "A new docstring"
        else:
            content = docstring + " -- amended"
        # ast.Str is the older spelling; it is deprecated since Python 3.8
        # and removed in 3.12, where ast.Constant(value=content) is used.
        s = ast.Str(content)
        return ast.Expr(value=s)


    if __name__ == "__main__":
        tree = ast.parse(open("docstringtest.py").read())
        transformer = DocstringWriter()
        new_tree = transformer.visit(tree)
        ast.fix_missing_locations(new_tree)
        buf = io.StringIO()
        Unparser(new_tree, buf)
        buf.seek(0)
        print(buf.read())

    $ cat docstringtest.py
    def foo():
        pass

    def bar():
        """A docstring."""

    $ python fixdocstrings.py

    def foo():
        'A new docstring'
        pass

    def bar():
        'A docstring. -- amended'
(I answered a similar question myself for python 2.7, here)
* Starting with Python 3.9, the ast module provides an unparse function that can be used instead of the unparse tool: src = ast.unparse(new_tree)
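For the question's actual use case, a variant of the same approach on Python 3.9 or later might look like the sketch below. It is not a definitive implementation: the `OWNERS` dictionary, the class name `OwnerDocstringWriter` and the helper `add_owners` are hypothetical, and the mapping is assumed to have been built from the CSV file beforehand. It uses ast.Constant and ast.unparse, so it needs no external unparse tool.

```python
import ast

# Hypothetical mapping of function name to owners; in practice this
# would be built from the CSV file.
OWNERS = {"func": {"DAL Owner": "Team IT", "Data Owner": "Team A"}}


class OwnerDocstringWriter(ast.NodeTransformer):
    def __init__(self, owners):
        self.owners = owners

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # also process nested functions
        owners = self.owners.get(node.name)
        if owners is None:
            return node  # no entry for this function: leave it alone
        extra = "\n".join(f"{key}: {value}" for key, value in owners.items())
        docstring = ast.get_docstring(node)
        if docstring is None:
            # No docstring yet: insert one containing only the owner lines.
            node.body.insert(0, ast.Expr(value=ast.Constant(value=extra)))
        else:
            # Existing docstring is the first statement: replace it.
            amended = docstring + "\n\n" + extra
            node.body[0] = ast.Expr(value=ast.Constant(value=amended))
        return node


def add_owners(source, owners):
    """Return source with the owner lines added to matching docstrings."""
    tree = ast.parse(source)
    new_tree = OwnerDocstringWriter(owners).visit(tree)
    ast.fix_missing_locations(new_tree)
    return ast.unparse(new_tree)


source = '''
def func(self, arg1=None, arg2=None):
    """Returns the timeseries."""
    return None

def untouched():
    pass
'''
result = add_owners(source, OWNERS)
print(result)
```

Note that ast.unparse regenerates the whole file from the tree, so comments and the original formatting are not preserved; if that matters, the output is better used as a reference than written back over the source file.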