Python – Preserve line breaks in nestedExpr

Can nestedExpr preserve line breaks?

Here’s a simple example:

import pyparsing as pp

# Parse expressions like: \name{body}
name = pp.Word(pp.alphas)
body = pp.nestedExpr( '{', '}' )
expr = '\\' + name('name') + body('body')

# Example text to parse
txt = r'''
This \works{fine}, but \it{
    does not
    preserve newlines
}
'''

# Show results
for e in expr.searchString(txt):
    print('name: ' + e.name)
    print('body: ' + str(e.body) + '\n')

Output:

name: works
body: [['fine']]

name: it
body: [['does', 'not', 'preserve', 'newlines']]

As you can see, the body of the second expression (\it{ ...) is parsed, but its line breaks are discarded. I would like the result to store each line in a separate sub-list. As it stands, the output makes it impossible to tell single-line body content from multi-line body content.

Solution

I didn’t see your answer until a few minutes ago, and in the meantime I came up with this approach:

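# Match either a line end or a word; limiting the word's whitespace to ' ' stops it from skipping over newlines
body = pp.nestedExpr( '{', '}', content = (pp.LineEnd() | name.setWhitespaceChars(' ')))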

Changing body to this definition results in the following:

name: works
body: [['fine']]

name: it
body: [['\n', 'does', 'not', '\n', 'preserve', 'newlines', '\n']]
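If you want one sub-list per line from this output, the '\n' markers can be used as separators in a post-processing step. The following is only a minimal sketch, not part of pyparsing: `split_lines` is a hypothetical helper name, and it assumes the body has already been converted to a plain flat list of tokens (for instance with `asList()`).

```python
from itertools import groupby


def split_lines(tokens, newline='\n'):
    """Group a flat token list into one sub-list per line.

    Hypothetical helper: the newline markers themselves are dropped,
    and runs of consecutive markers (blank lines) produce no empty sub-list.
    """
    return [
        list(group)
        for is_newline, group in groupby(tokens, key=lambda tok: tok == newline)
        if not is_newline
    ]


# Flat token list as shown in the output above
flat = ['\n', 'does', 'not', '\n', 'preserve', 'newlines', '\n']
print(split_lines(flat))  # [['does', 'not'], ['preserve', 'newlines']]
```

Note that a single-line body such as ['fine'] comes back as [['fine']], so the presence of '\n' tokens in the original list, rather than the number of sub-lists, is what tells you the body spanned several lines.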

Edit:

Wait, if what you want is each line as a separate item, then maybe this is what you’re looking for:

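# One line of words, joined back into a single string
single_line = pp.OneOrMore(name.setWhitespaceChars(' ')).setParseAction(' '.join)
# One or more (possibly empty) lines, each terminated by a suppressed line end
multi_line = pp.OneOrMore(pp.Optional(single_line) + pp.LineEnd().suppress())
body = pp.nestedExpr( '{', '}', content = multi_line | single_line )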

This gives:

name: works
body: [['fine']]

name: it
body: [['does not', 'preserve newlines']]
