Preserve line breaks in nestedExpr
Can nestedExpr preserve line breaks?
Here’s a simple example:
import pyparsing as pp
# Parse expressions like: \name{body}
name = pp.Word(pp.alphas)
body = pp.nestedExpr( '{', '}' )
expr = '\\' + name('name') + body('body')
# Example text to parse (raw string, so the backslashes stay literal)
txt = r'''
This \works{fine}, but \it{
does not
preserve newlines
}
'''
# Show results
for e in expr.searchString(txt):
    print('name: ' + e.name)
    print('body: ' + str(e.body) + '\n')
Output:
name: works
body: [['fine']]
name: it
body: [['does', 'not', 'preserve', 'newlines']]
As you can see, the body of the second expression (\it{ ... }) is parsed even though it contains line breaks, but the line breaks themselves are lost. I would like the result to store each line in a separate sub-list. As it stands, the result makes it impossible to distinguish single-line from multi-line body content.
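For context, the line breaks disappear because pyparsing elements skip leading whitespace, newlines included, before they try to match. The following is a minimal sketch of that default, not part of the original question; the names word and strict_word are made up for illustration.

```python
import pyparsing as pp

# By default an element skips spaces, tabs, and newlines before matching.
word = pp.Word(pp.alphas)
default_tokens = word.parseString("\n\n  hello").asList()
print(default_tokens)

# An element that only skips spaces can no longer step over "\n".
# (setWhitespaceChars changes the element in place and returns it.)
strict_word = pp.Word(pp.alphas).setWhitespaceChars(" ")
try:
    strict_word.parseString("\nhello")
    skipped_newline = True
except pp.ParseException:
    skipped_newline = False
print(skipped_newline)
```

The first parse succeeds despite the leading newlines, while the second raises a ParseException, which is why a newline has to be matched explicitly if it is supposed to show up in the results.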
Solution
I didn’t get to look at your answer until a few minutes ago, and by then I had already come up with this approach:
body = pp.nestedExpr('{', '}', content=(pp.LineEnd() | name.setWhitespaceChars(' ')))
Changing body to this definition results in the following:
name: works
body: [['fine']]
name: it
body: [['\n', 'does', 'not', '\n', 'preserve', 'newlines', '\n']]
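With the '\n' tokens kept in the result, the lines can be recovered after parsing. Here is a small sketch using only the standard library; split_lines is a hypothetical helper name, and the input list is copied from the output above.

```python
from itertools import groupby

def split_lines(tokens):
    # Group a flat token list into one list per line,
    # treating each '\n' token as a separator.
    return [
        list(group)
        for is_newline, group in groupby(tokens, key=lambda t: t == '\n')
        if not is_newline
    ]

flat = ['\n', 'does', 'not', '\n', 'preserve', 'newlines', '\n']
lines = split_lines(flat)
print(lines)  # [['does', 'not'], ['preserve', 'newlines']]

# A body without any '\n' token was written on a single line.
is_multi_line = '\n' in flat
print(is_multi_line)  # True
```

Note that this sketch drops empty lines, since consecutive '\n' tokens fall into one group that is discarded.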
Edit:
Wait, if what you want is each line kept as a separate item, then maybe this is what you’re looking for:
single_line = pp.OneOrMore(name.setWhitespaceChars(' ')).setParseAction(' '.join)
multi_line = pp.OneOrMore(pp.Optional(single_line) + pp.LineEnd().suppress())
body = pp.nestedExpr('{', '}', content=multi_line | single_line)
This gives:
name: works
body: [['fine']]
name: it
body: [['does not', 'preserve newlines']]