Python – PyMuPDF – Read/write text box

I’ve been able to read the contents of a PDF with PyMuPDF using code similar to the following:

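import fitz  # PyMuPDF

myfile = r"C:\users\xxx\desktop\testpdf1.pdf"
doc = fitz.open(myfile)
page = doc[1]  # pages are zero-indexed, so this is the second page
text = page.getText("text")  # named get_text in newer PyMuPDF releases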

This reads the text of the page, but I can’t read the comments in the text boxes. Is there a way to do this?

Solution

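Use firstAnnot on the page object to get the first annotation. Once you have an annotation object, you can read its next property to get the following one, and repeat until next is None. Note the example at the bottom of the Annot page in the PyMuPDF documentation.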

I created a PDF from a Word document and added a text box and a sticky note. The following code prints the contents of each. Look inside the info dictionary for any additional information you may need.

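import fitz  # PyMuPDF

pdf = fitz.open('WordTest.pdf')
page = pdf[0]
annot = page.firstAnnot  # first annotation on the page (first_annot in newer releases)
print(annot.info['content'])  # text of the first annotation
next_annot = annot.next  # following annotation on the same page
print(next_annot.info['content'])
pdf.close()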
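
The snippet above assumes the page has exactly two annotations. For a page with any number of them (including none), the same firstAnnot/next chain can be walked in a loop. The following is only a sketch: the helper names annotation_contents and print_annotations are made up for illustration, and the attribute lookup tries first_annot before firstAnnot on the assumption that your installed PyMuPDF release provides one of the two spellings.

```python
def annotation_contents(page):
    """Return the 'content' text of every annotation on a page, in order."""
    # Newer PyMuPDF releases call this first_annot, older ones firstAnnot;
    # try both so the sketch works with either (an assumption about your version).
    annot = getattr(page, "first_annot", None)
    if annot is None:
        annot = getattr(page, "firstAnnot", None)

    contents = []
    while annot is not None:
        # info is a dictionary; 'content' holds the text box or note text.
        contents.append(annot.info.get("content", ""))
        annot = annot.next  # None once the last annotation is reached
    return contents


def print_annotations(path, page_number=0):
    """Open a PDF and print the annotation text found on one page."""
    import fitz  # PyMuPDF, imported here so the helper above has no dependency

    pdf = fitz.open(path)
    try:
        for text in annotation_contents(pdf[page_number]):
            print(text)
    finally:
        pdf.close()
```

Calling print_annotations('WordTest.pdf') should print the same two strings as the code above for that test file, and prints nothing for a page without annotations.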
