Points of Interest
The following list shows common steps that pdftotext executes to extract text from a PDF file, together with the file and function that contain the corresponding code. The stated locations refer to poppler commit 065dca3 (https://github.com/freedesktop/poppler/tree/065dca3816db3979dfacdc2f8592abed2ff6859a) and may have changed since.
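* Opening and reading a PDF file
  PDFDoc::PDFDoc(), poppler/PDFDoc.cc, line 144ff: https://github.com/freedesktop/poppler/blob/065dca3816db3979dfacdc2f8592abed2ff6859a/poppler/PDFDoc.cc#L144
* Parsing the PDF version number from the PDF file header
  PDFDoc::checkHeader(), poppler/PDFDoc.cc, line 350: https://github.com/freedesktop/poppler/blob/065dca3816db3979dfacdc2f8592abed2ff6859a/poppler/PDFDoc.cc#L350
* Reading startxref
  PDFDoc::getStartXRef(), poppler/PDFDoc.cc, line 1999ff: https://github.com/freedesktop/poppler/blob/065dca3816db3979dfacdc2f8592abed2ff6859a/poppler/PDFDoc.cc#L1999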
HOWTOs
Create a PDF with human-readable objects + content streams
Put the following in the preamble of your TeX file (between \documentclass{} and \begin{document}):
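A minimal sketch, assuming the file is compiled with pdfLaTeX (the settings below are an assumption, not necessarily the ones originally meant). The pdfTeX primitives \pdfcompresslevel and \pdfobjcompresslevel control stream compression and object streams. Setting both to 0 leaves objects and page content streams readable in a text editor.

```tex
% Assumption: compiled with pdfLaTeX; both commands are pdfTeX primitives.
\pdfcompresslevel=0     % do not compress streams, so page content stays plain text
\pdfobjcompresslevel=0  % do not pack objects into compressed object streams
```

Embedded fonts and images may still appear as binary data, and other engines such as XeLaTeX or LuaLaTeX need different commands.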
Create a PDF with specified crop box
Put the following in the preamble of your TeX file (between \documentclass{} and \begin{document}):
\pdfpageattr{ /CropBox [50 50 100 100] }
Create a PDF without page numbering
Put the following in the preamble of your TeX file:
\pagestyle{empty} % \thispagestyle{empty} would affect only the current page