Java – Use iText on Android to convert HTML to PDF. How do I set UTF-8 (diacritics)?

I'm using the following libraries and files:

itextg 5.5.3 jar

xmlworker 5.5.3 jar


test.html

<html xmlns="http://www.w3.org/1999/xhtml" lang="cs">
<head>
</head>
<body>
Test: ěščřžýáíé Ň ň ě Ě

<div style="font-family: 'Times New Roman'; font-weight: bold; background-color: blue;">
  Test: ěščřžýáíé Ň ň ě Ě
</div>

</body>
</html>

ConvertHTMLToPDF.java

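import android.os.Environment;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorkerHelper;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class ConvertHTMLToPDF {

    public static final String RESULT = Environment.getExternalStorageDirectory().getAbsolutePath() + "/Notes/test.pdf";
    public static final String RESOURCE = Environment.getExternalStorageDirectory().getAbsolutePath() + "/Notes/html/test.html";

    void convertHTMLToPDF() throws IOException, DocumentException {
        Rectangle pagesize = new Rectangle(415, 1750);
        Document document = new Document(pagesize);

        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(RESULT));

        document.open();

        // No charset is passed to parseXHtml() here; this is the call in question.
        XMLWorkerHelper.getInstance().parseXHtml(writer, document, new FileInputStream(RESOURCE));

        document.close();
        System.out.println("PDF Created!");
    }
}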

Actual output in test.pdf:

  • Test: šžýáíé

  • Test: šžýáíé


Expected output in test.pdf:

 Test: ěščřžýáíé Ň ň ě Ě
 Test: ěščřžýáíé Ň ň ě Ě

Solution

Check out the XML Worker examples: http://itextpdf.com/sandbox/xmlworker/

Example 1: D07_ParseHtmlAsian

Tell the parseXHtml() method that the input is encoded in UTF-8:

XMLWorkerHelper.getInstance().parseXHtml(writer, document,
    new FileInputStream(HTML), Charset.forName("UTF-8"));
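To see why the charset argument matters, here is a small standalone sketch that uses only the Java standard library, not iText. It decodes the same UTF-8 bytes twice: once as UTF-8 and once as ISO-8859-1, which is picked here purely as an example of a wrong charset. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDecodeDemo {

    // Decodes raw bytes into text using the charset the caller names.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The characters from the question's test string (Unicode escapes keep
        // this source file independent of its own encoding).
        String original = "\u011b\u0161\u010d\u0159\u017e\u00fd\u00e1\u00ed\u00e9";
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8Bytes, StandardCharsets.UTF_8);
        String wrong = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals(original)); // true
        System.out.println(wrong.equals(original)); // false
        System.out.println(wrong.length());         // 18, one char per UTF-8 byte
    }
}
```

A parser that is never told the encoding is in the same position as the second call: it has to pick a charset itself, and that pick may not match the file.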

Example 2: D07bis_ParseHtmlAsian

Tell the parse() method that the input is encoded in UTF-8:

XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.parse(new FileInputStream(HTML), Charset.forName("UTF-8"));

Note that you would prefer Example 1 when you want to control the font used to render the PDF. Compare with D07tris_ParseHtmlAsian.
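On the font point: the characters that survived in the question's output (š ž ý á í é) are exactly the ones that exist in Windows-1252, while the missing ones (ě č ř Ň ň Ě) do not. One possible explanation is that the default font is used with a WinAnsi (Cp1252) encoding; that is my assumption, not something the examples above confirm. The standard-library sketch below, with made-up class and method names, checks which characters of the test string Windows-1252 can represent.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

class WinAnsiCoverageDemo {

    // Returns only those characters of text that the given charset can encode.
    static String keepEncodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder kept = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (encoder.canEncode(c)) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // The characters from the question's test line, written as escapes.
        String input = "\u011b\u0161\u010d\u0159\u017e\u00fd\u00e1\u00ed\u00e9";
        // Assumes the JVM ships the windows-1252 charset.
        Charset winAnsi = Charset.forName("windows-1252");
        System.out.println(keepEncodable(input, winAnsi));
    }
}
```

If the result matches what shows up in your PDF, the remaining gaps are more likely a font or encoding-of-the-font issue than a parsing one, and a font that covers Czech would be the thing to try next.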
