Python – How to get XML declaration strings using lxml

How to get XML declaration strings using lxml… here is a solution to the problem.

How to get XML declaration strings using lxml

I use lxml to parse XML documents
How do I get the claim string?

 <?xml version="1.0" encoding="utf-8" ?> 

I want to check if it exists, what encoding it has, and what xml version it has.

Solution

When parsing a document, the resulting ElementTree object should have a DocInfo object that contains information about the parsed XML or HTML document.

For XML, you may be interested in the xml_version and encoding properties of this DocInfo:

>>> from lxml import etree
>>> tree = etree.parse('input.xml')
>>> tree.docinfo
<lxml.etree.DocInfo object at 0x7f8111f9ecc0>
>>> tree.docinfo.xml_version
'1.0'
>>> tree.docinfo.encoding
'UTF-8'

Related Problems and Solutions