Java – Fatal error: Character reference "&#4" is an invalid XML character (org.xml.sax.SAXParseException)

 <BATCHNAME>&#4; Any</BATCHNAME> 

The tag in my XML request contains the “&#4;” character reference in its value. Without this reference my code works perfectly, but some requests do contain it, and then I get the following error:

[Fatal Error] :144:28: Character reference "&#4" is an invalid XML character.
org.xml.sax.SAXParseException; lineNumber: 144; columnNumber: 28; Character reference "&#4" is an invalid XML character.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
    at d.b(AllCommonTasks.java:277)
    at…

I need to keep these characters for verification.

This is the code I’m using:

try {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();

    URLConnection urlConnection = new URL(urlString).openConnection();
    urlConnection.addRequestProperty("Accept", "application/xml");
    urlConnection.addRequestProperty("User-Agent", "Mozilla/5.0 ( compatible ) ");

    // The SAXParseException is thrown here when the response contains &#4;
    Document doc = db.parse(urlConnection.getInputStream());
    doc.getDocumentElement().normalize();

    str = convertDocumentToString(doc);
} catch (Exception e) {
    System.err.println("In exception 1");
    e.printStackTrace();
}

How do I fix this?

Solution

See the Wikipedia page on XML and HTML character entity references: a reference of the form &#nnnn; denotes a Unicode code point written in decimal, which means &#4; is equivalent to Unicode U+0004 (END OF TRANSMISSION). This is a non-printing control character.
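
As a small illustration of that decoding step, here is a minimal sketch. The class and method names (CharRefDecoder, decode, describe) are made up for this example and are not part of any library; the sketch only assumes the reference is a well-formed decimal or hexadecimal numeric reference.

```java
class CharRefDecoder {
    /** Decodes a numeric character reference such as "&#4;" or "&#x4;" into its code point. */
    static int decode(String ref) {
        if (!ref.startsWith("&#") || !ref.endsWith(";")) {
            throw new IllegalArgumentException("Not a numeric character reference: " + ref);
        }
        String body = ref.substring(2, ref.length() - 1);
        // A leading 'x' marks a hexadecimal reference; otherwise the digits are decimal.
        return body.startsWith("x")
                ? Integer.parseInt(body.substring(1), 16)
                : Integer.parseInt(body, 10);
    }

    /** Formats the code point and reports whether Java classifies it as an ISO control character. */
    static String describe(String ref) {
        int cp = decode(ref);
        return String.format("U+%04X control=%b", cp, Character.isISOControl(cp));
    }

    public static void main(String[] args) {
        System.out.println(describe("&#4;"));   // U+0004 control=true
        System.out.println(describe("&#65;"));  // U+0041 control=false
    }
}
```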

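Among the characters below U+0020, XML 1.0 permits only tab (U+0009), line feed (U+000A) and carriage return (U+000D), and that restriction applies to character references as well. So I think the parser is right to fail in this case.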
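
The failure can be reproduced without the network call. The following is a minimal sketch, assuming the default JDK parser; InvalidCharRefRepro and isRejected are hypothetical names chosen for this example, and the XML strings are reduced versions of the element from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class InvalidCharRefRepro {
    /** Returns true if the parser rejects the given XML with a SAXParseException. */
    static boolean isRejected(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            db.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXParseException e) {
            // The document is not well-formed; report where the parser gave up.
            System.err.println("Rejected at line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage());
            return true;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Anything else is an environment problem, not a well-formedness error.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // U+0004 is not a legal XML 1.0 character, even as a reference.
        System.out.println(isRejected("<BATCHNAME>&#4; Any</BATCHNAME>"));
        // U+0009 (tab) is one of the few permitted control characters.
        System.out.println(isRejected("<BATCHNAME>&#9; Any</BATCHNAME>"));
    }
}
```

The first call should report a rejection and the second should parse cleanly, which matches the behavior described in the question.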

In fact, if you look at the source code for com.sun.org.apache.xerces.internal.impl.XMLScanner#scanCharReferenceValue, you’ll find that it checks the decoded value with com.sun.org.apache.xerces.internal.util.XMLChar#isValid, shown here:

/**
 * Returns true if the specified character is valid. This method
 * also checks the surrogate character range from 0x10000 to 0x10FFFF.
 * <p>
 * If the program chooses to apply the mask directly to the
 * <code>CHARS</code> array, then they are responsible for checking
 * the surrogate character range.
 *
 * @param c The character to check.
 */
public static boolean isValid(int c) {
    return (c < 0x10000 && (CHARS[c] & MASK_VALID) != 0) ||
           (0x10000 <= c && c <= 0x10FFFF);
} // isValid(int):boolean
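
If the sender of the XML cannot be changed, one possible workaround (not part of the answer above, and only a sketch) is to rewrite the offending references before handing the text to the parser. The example below spells out the XML 1.0 character ranges explicitly instead of using the internal CHARS table, and replaces invalid references with a readable marker such as [U+0004] so the information is still available for verification. The names CharRefSanitizer, isValidXml10Char and replaceInvalidCharRefs, as well as the marker format, are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharRefSanitizer {
    // Matches decimal (&#4;) and hexadecimal (&#x4;) character references.
    private static final Pattern CHAR_REF = Pattern.compile("&#(x[0-9a-fA-F]+|[0-9]+);");

    /** The XML 1.0 "Char" production, written out as explicit ranges. */
    static boolean isValidXml10Char(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF);
    }

    /** Replaces references to invalid characters with a marker; valid references are kept as-is. */
    static String replaceInvalidCharRefs(String xml) {
        Matcher m = CHAR_REF.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String digits = m.group(1);
            int cp;
            try {
                cp = digits.startsWith("x")
                        ? Integer.parseInt(digits.substring(1), 16)
                        : Integer.parseInt(digits, 10);
            } catch (NumberFormatException e) {
                cp = -1; // too large to be a code point
            }
            String replacement;
            if (isValidXml10Char(cp)) {
                replacement = m.group();
            } else if (cp >= 0) {
                replacement = String.format("[U+%04X]", cp);
            } else {
                replacement = "[invalid]";
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints <BATCHNAME>[U+0004] Any</BATCHNAME>
        System.out.println(replaceInvalidCharRefs("<BATCHNAME>&#4; Any</BATCHNAME>"));
    }
}
```

Note that this works on the raw text, so it would also rewrite a literal &#4; inside a CDATA section or comment, and it requires reading the response into a String with the correct encoding first. Fixing the producer so that it never emits such references remains the cleaner option.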
