How to extract table of contents in PDF #481

Answered by samkit-jain
Ynjxsjmh asked this question in Q&A

Hi @Ynjxsjmh, pdfplumber is built on pdfminer.six, which provides a get_outlines() method of its own. Its output may differ from what PyPDF2 returns. To access it, you can use the following code:

>>> import pdfplumber
>>> pdf = pdfplumber.open("file.pdf")
>>> pdf.doc.get_outlines()

pdf.doc is an instance of pdfminer.six's PDFDocument. An example of how to use the method can be found here.
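For reference, here is a minimal sketch of turning the outline into an indented table of contents. It assumes that get_outlines() yields (level, title, dest, action, se) tuples with top-level entries at level 1, and that pdfminer.six raises PDFNoOutlines when the document has no outline; both are my understanding of pdfminer.six and worth checking against the version you have installed. format_outline and read_outline are hypothetical helper names, not part of either library.

```python
def format_outline(entries):
    """Turn (level, title, dest, action, se) tuples into indented text lines."""
    lines = []
    for level, title, *_ in entries:
        # Assumption: top-level entries have level 1, so they get no indent.
        indent = "  " * max(level - 1, 0)
        lines.append(f"{indent}{title}")
    return lines


def read_outline(path):
    """Return the outline of the PDF at `path` as indented lines, or [] if none."""
    # Imported here so format_outline stays usable without the libraries.
    import pdfplumber
    from pdfminer.pdfdocument import PDFNoOutlines

    with pdfplumber.open(path) as pdf:
        try:
            # Consume the entries while the file is still open.
            return format_outline(list(pdf.doc.get_outlines()))
        except PDFNoOutlines:
            # Assumption: raised when the PDF carries no outline at all.
            return []
```

Calling read_outline("file.pdf") and printing each returned line should give one line per bookmark. Note that the tuples carry destinations rather than page numbers, so mapping an entry to a page takes extra work that this sketch does not attempt.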

Answer selected by Ynjxsjmh