= Points of Interest =

The following list shows the common steps that ''pdftotext'' executes to extract text from a PDF file, together with the file in which the corresponding code is located. The stated locations refer to commit [[https://github.com/freedesktop/poppler/tree/065dca3816db3979dfacdc2f8592abed2ff6859a|065dca3]] and may have changed since.

 * '''Opening and reading a PDF file''' <<BR>> [[https://github.com/freedesktop/poppler/blob/065dca3816db3979dfacdc2f8592abed2ff6859a/poppler/PDFDoc.cc#L129|PDFDoc::PDFDoc(), line 144ff]]

= HOWTOs =

== Create a PDF with human-readable objects + content streams ==

Put the following in the preamble of your TeX file (between `\documentclass{}` and `\begin{document}`). Both settings are pdfTeX primitives, so compile with `pdflatex`:

{{{#!highlight tex
\pdfobjcompresslevel=0
\pdfcompresslevel=0
}}}

== Create a PDF with specified crop box ==

Put the following in the preamble of your TeX file (between `\documentclass{}` and `\begin{document}`):

{{{#!highlight tex
\pdfpageattr{ /CropBox [50 50 100 100] }
}}}

== Create a PDF without page numbering ==

Put the following in the preamble of your TeX file. Use `\pagestyle{empty}` to suppress page numbers on every page; `\thispagestyle{empty}` affects only the page on which it is issued:

{{{#!highlight tex
\pagestyle{empty}
}}}
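
== Minimal test document combining the HOWTOs ==

The settings from the HOWTOs above can be combined in a single file, which is handy when a small, inspectable PDF is needed for debugging ''pdftotext''. The following is only a sketch: it assumes compilation with `pdflatex`, and the document class and the sample sentence are illustrative choices, not taken from the original notes. It uses `\pagestyle{empty}` so that the page number is suppressed on all pages.

{{{#!highlight tex
\documentclass{article}

% Write PDF objects and content streams uncompressed (human-readable).
\pdfobjcompresslevel=0
\pdfcompresslevel=0

% Add a crop box to every page: [llx lly urx ury], in PDF points.
\pdfpageattr{ /CropBox [50 50 100 100] }

% Suppress headers and footers, including the page number, on all pages.
\pagestyle{empty}

\begin{document}
% Illustrative sample text (an assumption, not part of the original notes).
Hello, pdftotext!
\end{document}
}}}

Note that the crop box `[50 50 100 100]` covers only a 50 x 50 pt square near the lower-left corner of the page, so with the default `article` layout the sample text lies outside of it. Adjust the coordinates if the text should fall inside the crop box.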