This article shows how to extract images from a PDF file and save them to your local disk, using the PyPDF2 library.
PyPDF2 is a free, open-source, pure-Python PDF library that can split, merge, crop, and otherwise transform the pages of PDF files.
Install PyPDF2
!pip install PyPDF2
Sample code to extract images from PDF
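from PyPDF2 import PdfReader

# Open the PDF and take its first page
pdfreader = PdfReader("freshers_ny.pdf")
first_page = pdfreader.pages[0]

# Write each image on the page to disk; the counter prefix keeps file names unique
count = 0
for image_file in first_page.images:
    with open(str(count) + image_file.name, "wb") as fp:
        fp.write(image_file.data)
    count = count + 1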
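The sample above reads only the first page. If you want the images from every page, one possible approach is sketched below. The function name `extract_all_images`, the `out_dir` parameter, and the `page<N>_<index>_<name>` file naming are hypothetical choices made for this illustration, not part of PyPDF2. The function only assumes that the reader object exposes `.pages`, and that each page exposes `.images` whose items have `.name` and `.data`, exactly as used in the sample above.

```python
from pathlib import Path


def extract_all_images(reader, out_dir="extracted_images"):
    """Save every image on every page of `reader` into `out_dir`.

    `reader` is assumed to behave like a PyPDF2 PdfReader: it has `.pages`,
    and each page has `.images` whose items carry `.name` and `.data`.
    Returns the list of paths that were written.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    saved = []
    for page_number, page in enumerate(reader.pages, start=1):
        for image_number, image_file in enumerate(page.images):
            # Keep only the base name, then prefix it with the page number
            # and image index so images from different pages cannot
            # overwrite each other (hypothetical naming scheme).
            base_name = Path(image_file.name).name
            target = out_path / f"page{page_number}_{image_number}_{base_name}"
            target.write_bytes(image_file.data)
            saved.append(target)
    return saved
```

To try it on the same file as above, create the reader with `PdfReader("freshers_ny.pdf")` and pass it in as `extract_all_images(pdfreader)`. The images should then land in an `extracted_images` folder next to your script.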
PyPDF2 Official page
Get more posts on Python and PySpark