Points of Interest

The following lists the common steps pdftotext executes to extract text from a PDF file, together with the file in which the corresponding code is located. The stated locations refer to commit 065dca3 (https://github.com/freedesktop/poppler/tree/065dca3816db3979dfacdc2f8592abed2ff6859a) and may have changed since.

Opening and reading a PDF file
PDFDoc::PDFDoc() in poppler/PDFDoc.cc, line 144ff

HOWTOs

Create a PDF with human-readable objects + content streams

Put the following in the preamble of your TeX file (between \documentclass{} and \begin{document}):

   \pdfobjcompresslevel=0
   \pdfcompresslevel=0
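
For orientation, here is a minimal sketch of a complete file using these settings. It assumes the file is compiled with pdflatex, since both commands are pdfTeX primitives; the document class and body text are placeholders.

   \documentclass{article}
   % pdfTeX primitives: this sketch assumes compilation with pdflatex
   \pdfobjcompresslevel=0  % do not pack PDF objects into object streams
   \pdfcompresslevel=0     % do not compress stream contents
   \begin{document}
   Hello, world!
   \end{document}

Opening the resulting PDF in a text editor should then show the objects and page content streams as plain text.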

Create a PDF with specified crop box

Put the following in the preamble of your TeX file (between \documentclass{} and \begin{document}):

   \pdfpageattr{
     /CropBox [50 50 100 100]
   }
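
A minimal sketch of a complete file with this setting follows, again assuming compilation with pdflatex (\pdfpageattr is a pdfTeX primitive). The four CropBox numbers are the lower-left and upper-right corners of the box in PDF default user space units (1/72 inch unless the PDF says otherwise); the values shown are just the example values from above.

   \documentclass{article}
   % Adds a /CropBox entry to the page dictionary of every page shipped out.
   % Order of the values: lower-left x, lower-left y, upper-right x, upper-right y.
   \pdfpageattr{
     /CropBox [50 50 100 100]
   }
   \begin{document}
   Hello, world!
   \end{document}

With these values the visible region is a 50 x 50 unit square near the lower-left corner of the page, so most of the text will be cropped away in a viewer.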

Create a PDF without page numbering

Put the following in the preamble of your TeX file. Note that \thispagestyle affects a single page only (the first one, when used in the preamble):

   \thispagestyle{empty}
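
For documents with more than one page, a possible variant is the standard LaTeX command \pagestyle{empty}, which sets the style for all subsequent pages. The sketch below assumes the article class; note that commands such as \maketitle may switch the page style of their own page back to plain.

   \documentclass{article}
   \pagestyle{empty}  % empty header and footer on every page, hence no page numbers
   \begin{document}
   First page.
   \newpage
   Second page, also without a page number.
   \end{document}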
