pdf to txt

How to convert PDFs to text using OCR (results are very poor and will need lots of manual correction).

Dependencies

Search your package manager for 'tesseract english' (or whichever language you need).

Arch: tesseract-data-eng and poppler (pdftoppm is packaged as poppler-utils on Debian/Ubuntu)

Convert each page of the PDF to a PNG image. The output files are named with the prefix test:

pdftoppm -png *file*.pdf test

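Then run tesseract on each image, appending the recognised text to out.txt:

# "-" as the output name makes tesseract write to stdout
for x in test-*.png; do
    tesseract -l eng "$x" - >> out.txt
done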
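
The two steps above can be wrapped into one shell function. This is only a minimal sketch: the function name pdf_to_txt, its arguments, and the page prefix are made up for illustration, and it assumes pdftoppm and tesseract are on the PATH. It renders the pages into a temporary directory so stray PNGs in the current directory are not picked up.

```shell
#!/bin/sh
# Sketch: OCR every page of a PDF into a single text file.
# pdf_to_txt, its arguments and the "page" prefix are illustrative names.
pdf_to_txt() {
    pdf=$1            # input PDF
    out=$2            # output text file (overwritten)
    lang=${3:-eng}    # tesseract language code, defaults to eng

    tmpdir=$(mktemp -d) || return 1

    # Render each page as a PNG inside the temporary directory.
    pdftoppm -png "$pdf" "$tmpdir/page" || { rm -rf "$tmpdir"; return 1; }

    # OCR the pages in glob order and collect the text in one file.
    # Assumption: pdftoppm zero-pads page numbers, so glob order
    # should match page order.
    : > "$out"
    for img in "$tmpdir"/page-*.png; do
        [ -e "$img" ] || continue
        tesseract -l "$lang" "$img" - >> "$out"
    done

    rm -rf "$tmpdir"
}
```

Usage, after sourcing the function: pdf_to_txt file.pdf out.txt eng. The output will still need manual correction.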