<<TableOfContents(3)>>

= Points of Interest =

The following list names the common steps that ''pdftotext'' executes to extract text from a PDF file, together with the file that contains the corresponding code. The stated locations refer to commit [[https://github.com/freedesktop/poppler/tree/065dca3816db3979dfacdc2f8592abed2ff6859a|065dca3]] and may have changed since.

 * '''Opening and reading a PDF file''' <<BR>> [[https://github.com/freedesktop/poppler/blob/065dca3816db3979dfacdc2f8592abed2ff6859a/poppler/PDFDoc.cc#L129|PDFDoc::PDFDoc(), line 144ff]]

= HOWTOs =

== Create a PDF with human-readable objects + content streams ==

Put the following in the preamble of your TeX file (between `\documentclass{}` and `\begin{document}`). Both commands are pdfTeX primitives, so compile the file with `pdflatex`:

{{{#!highlight tex
\pdfobjcompresslevel=0 % do not pack objects into compressed object streams
\pdfcompresslevel=0    % write content streams uncompressed
}}}

== Create a PDF with specified crop box ==

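Put the following in the preamble of your TeX file (between `\documentclass{}` and `\begin{document}`). The four numbers are the lower-left and upper-right corners of the crop box, `[llx lly urx ury]`, given in PDF units (1/72 inch) and measured from the bottom-left corner of the page: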

{{{#!highlight tex
\pdfpageattr{
  /CropBox [50 50 100 100]
}
}}}
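
As a worked illustration of the coordinate arithmetic, the following minimal document crops one inch from every edge of a US Letter page. This is only a sketch: it assumes pdfLaTeX and a page that really measures 612 x 792 PDF units, and the box values are illustrative, not the ones used above.

{{{#!highlight tex
\documentclass[letterpaper]{article}
% Illustrative values: US Letter is 612 x 792 PDF units (1 unit = 1/72 inch).
% Cropping 72 units (1 inch) from every edge gives
%   lower-left  = (72, 72)
%   upper-right = (612 - 72, 792 - 72) = (540, 720)
\pdfpageattr{
  /CropBox [72 72 540 720]
}
\begin{document}
A viewer that honours the crop box shows only the region inside it.
\end{document}
}}}

With human-readable objects enabled, the `/CropBox` entry should appear in each page object of the resulting PDF, which is a quick way to check that the setting was picked up.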

== Create a PDF without page numbering ==

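Put the following in the preamble of your TeX file (between `\documentclass{}` and `\begin{document}`):

{{{#!highlight tex
\pagestyle{empty} % no headers or footers, hence no page numbers, on any page
}}}

Pages that set their own style, such as the title page produced by `\maketitle`, additionally need `\thispagestyle{empty}` right after that command.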