Samathy / dumppdfcomments.py
Created January 5, 2018 18:50
Python script to extract highlighted text from PDFs. Uses python-poppler-qt4. Adapted from the script in [1] and updated to Python 3.
[1] https://stackoverflow.com/questions/21050551/extracting-text-from-higlighted-text-using-poppler-qt4-python-poppler-qt4
import popplerqt4
import sys
import PyQt4
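def main():
    # The path of the PDF to read is taken from the first command-line argument.
    doc = popplerqt4.Poppler.Document.load(sys.argv[1])
    # Running count of annotations seen so far.
    total_annotations = 0
    # Visit each page of the document in turn.
    for i in range(doc.numPages()):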