Extract Text from a PDF (OCR) – Free, 7 Easy Methods (2025)

Works for scans, photos, and image-only PDFs

Extract text from a PDF the right way, even when the file is a photo or a flat scan. This step-by-step guide explains when you actually need OCR, the fastest free options (Google Drive, OneNote, mobile OCR), and power tools like Tesseract for tricky pages. You’ll also learn accuracy tips (DPI, deskew, contrast), how to keep layout intact, and what to do with tables and math.

Try PDF to Word — best for text-based PDFs; edit then export via Word to PDF.

Do You Need OCR to Extract Text from PDF?

Try selecting text with your mouse. If the cursor won’t highlight characters, the PDF is an image (scan or photo) and you need OCR (optical character recognition) to extract the text. If the text highlights, you can convert directly using PDF to Word and skip OCR entirely.

  • Image-only (scan): use OCR to extract selectable text.
  • Digital (text-based): convert with PDF to Word, edit, then export via Word to PDF.

Tip: If you only need a few quotes, try the Screenshot OCR trick (Method 7).
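
The selection test above can also be approximated in code. The sketch below is a rough heuristic, not a definitive check: it assumes the third-party pypdf package is installed, and the 20-character threshold and the function names are illustrative choices rather than anything this guide prescribes.

```python
def needs_ocr(page_texts, min_chars_per_page=20):
    """Return True when most pages have almost no extractable text.

    The threshold is an assumed cut-off; tune it for your documents.
    """
    if not page_texts:
        return True
    sparse = sum(
        1 for text in page_texts
        if len((text or "").strip()) < min_chars_per_page
    )
    return sparse > len(page_texts) / 2


def extract_page_texts(pdf_path):
    """Read each page's text layer with pypdf (assumed: pip install pypdf)."""
    from pypdf import PdfReader  # lazy import so needs_ocr works without pypdf

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]
```

Usage: needs_ocr(extract_page_texts("scan.pdf")) returning True suggests an image-only file, so go the OCR route; False suggests PDF to Word is enough.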

Method 1: Google Drive OCR (Free & Easy)

  1. Upload your PDF to Google Drive.
  2. Right-click → Open with → Google Docs. Drive runs OCR automatically.
  3. Review the extracted text in Docs. Fix headings, lists, and spacing.
  4. Export as needed: File → Download → DOCX or PDF.

Official steps: Scan docs with Google Drive. For long files, process sections (50–100 pages) for higher accuracy and fewer timeouts.
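
If you split a long file before uploading, a small helper can work out the page ranges for you. This is a minimal sketch; the function name and the default chunk size of 50 are assumptions taken from the 50–100 page suggestion above.

```python
def page_ranges(total_pages, chunk_size=50):
    """Return inclusive, 1-based (start, end) page ranges covering the file."""
    if total_pages < 0 or chunk_size < 1:
        raise ValueError("total_pages must be >= 0 and chunk_size >= 1")
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]


# Example: a 230-page scan split into 100-page sections
print(page_ranges(230, chunk_size=100))  # [(1, 100), (101, 200), (201, 230)]
```

Feed each range to Split PDF (or any splitter you trust), then run the sections through Drive one at a time.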

Method 2: PDF → Word (When No OCR Is Needed)

If the PDF already contains a real text layer, use our PDF to Word tool. You’ll get editable paragraphs, which is the fastest way to extract text from a PDF while keeping formatting. When done, export a clean file via Word to PDF.

  • Best for: digital PDFs, reports, manuals.
  • Not for: camera scans or photographed pages (use OCR first).

Method 3: Microsoft OneNote OCR (Free)

OneNote’s built-in OCR works well for screenshots and pasted images.

  1. Open OneNote (the desktop app is the safest choice; the option may not appear in the web version) → paste an image or a page snapshot.
  2. Right-click the image → Copy Text from Picture.
  3. Paste into your document and clean up line breaks.

Docs: Microsoft OneNote Support.

Method 4: Tesseract OCR (Open Source, Powerful)

If you regularly extract text from PDF with complex fonts or multiple languages, try Tesseract OCR. It’s free, supports language packs, and can be scripted for batches.

  1. Convert the PDF pages to high-quality images (see PDF to Image).
  2. Run Tesseract on each image: tesseract page1.png out1 -l eng (this writes the text to out1.txt).
  3. Combine the outputs or paste where needed.

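Best results: 300 DPI, deskewed, high-contrast images. For other languages, swap in -l spa, -l deu, etc., or combine several with a plus sign (-l eng+spa). Each language needs its language pack installed.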
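
For batches, the three steps above can be scripted. The following is a minimal sketch that assumes the tesseract command is on your PATH and that the page images are PNGs in one folder; the function names are illustrative.

```python
import subprocess
from pathlib import Path


def tesseract_cmd(image_path, out_base, langs=("eng",)):
    """Build the same command as in step 2; languages are joined with '+'."""
    return ["tesseract", str(image_path), str(out_base), "-l", "+".join(langs)]


def ocr_folder(image_dir, langs=("eng",)):
    """OCR every PNG in image_dir and return the combined text.

    Files are processed in name order, so zero-pad page numbers
    (page-01.png, page-02.png, ...) to keep pages in sequence.
    """
    parts = []
    for image in sorted(Path(image_dir).glob("*.png")):
        out_base = image.with_suffix("")  # tesseract appends ".txt" itself
        subprocess.run(tesseract_cmd(image, out_base, langs), check=True)
        parts.append(Path(f"{out_base}.txt").read_text(encoding="utf-8"))
    return "\n".join(parts)
```

Usage: text = ocr_folder("pages", langs=("eng", "spa")) runs each image through Tesseract with both language packs and joins the results.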

Method 5: Mobile OCR (iOS & Android)

iOS (Live Text)

  1. Open the Camera or Photos app; tap text to select with Live Text.
  2. Copy and paste into Notes/Docs. For PDFs, take a screenshot or photo of the page first.

Apple’s help: Use Live Text.

Android (Photos/Drive)

  1. Open a photo in Google Photos → Lens → select text → copy.
  2. Or use Google Drive’s Scan and open with Docs for OCR.

Guide: Copy text from Photos.

Method 6: PDF → Image → OCR (Flatten & Fix)

Stubborn PDFs (heavy artifacts, unusual fonts) often OCR better after a clean image pass:

  1. Convert pages with PDF to Image (PNG at 300 DPI).
  2. Deskew/crop if needed, then OCR via Drive, OneNote, or Tesseract.
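  3. Rebuild the document: Image to PDF → (optional) Merge PDF. A plain image-to-PDF conversion does not add a text layer by itself, so for a searchable file use an OCR step that writes one (Tesseract can do this with its pdf output mode).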
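
As a command-line alternative to the web tools in these steps, the same pipeline can be sketched with two real utilities: pdftoppm (from Poppler) to rasterize pages and Tesseract's pdf output mode to write a searchable page. Both are assumed to be installed and on your PATH; the helper names are hypothetical.

```python
import subprocess


def rasterize_cmd(pdf_path, prefix, dpi=300):
    """pdftoppm command that renders each page to a numbered PNG at the given DPI."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(prefix)]


def searchable_pdf_cmd(image_path, out_base, lang="eng"):
    """Tesseract command whose trailing 'pdf' config writes out_base.pdf
    containing the page image plus an invisible text layer."""
    return ["tesseract", str(image_path), str(out_base), "-l", lang, "pdf"]


def run(cmd):
    """Run one command and raise if the tool reports an error."""
    subprocess.run(cmd, check=True)
```

Usage: run(rasterize_cmd("scan.pdf", "page")) produces the page images; after any deskewing or cropping, run(searchable_pdf_cmd(image, base)) for each image gives per-page searchable PDFs that Merge PDF can combine.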

Method 7: Screenshot OCR for Snippets

For a couple of quotes or a figure caption, screenshot the area and OCR the image (OneNote/Photos/Drive). It’s the quickest way to extract text from PDF without touching the whole file.

Accuracy Tips to Extract Text from PDF Cleanly

  • DPI: 300 DPI for books/receipts; 200–300 for printouts. Avoid camera blur.
  • Contrast: B&W or grayscale often yields the best OCR for text-only pages.
  • Deskew: straighten pages before OCR. Crop borders and remove shadows.
  • Languages: load the correct language packs (e.g., -l eng+spa).
  • Tables: OCR text first, then reconstruct tables in Word/Sheets.
  • Math: general-purpose OCR does not output LaTeX and tends to garble equations; consider typing equations or using a math OCR tool.
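
To check whether an image meets the DPI guidance above, divide its pixel dimensions by the physical page size. A minimal sketch, assuming a US Letter page (8.5 × 11 inches) by default; pass your own dimensions for A4 or receipts.

```python
def effective_dpi(pixel_width, pixel_height, page_width_in=8.5, page_height_in=11.0):
    """Return the lower of the horizontal and vertical pixels-per-inch values."""
    return min(pixel_width / page_width_in, pixel_height / page_height_in)


def meets_target(pixel_width, pixel_height, target_dpi=300, **page_size):
    """True when the scan is at or above the target DPI."""
    return effective_dpi(pixel_width, pixel_height, **page_size) >= target_dpi


# A Letter page scanned at 2550 x 3300 pixels works out to 300 DPI
print(effective_dpi(2550, 3300))
```

If the result falls well below 200, re-scan at a higher setting rather than upscaling; enlarging a blurry image adds pixels but not detail.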

File size tip: Compress scanned PDFs with Compress PDF (Balanced) before sharing.

Troubleshooting & Quick Fixes

  • Gibberish characters: increase DPI and re-scan; choose the correct language.
  • Weird line breaks: paste into Word and use Find & Replace to swap stray paragraph marks (^p) for spaces, then restore the real paragraph breaks by hand.
  • Columns mixed: OCR page halves separately (left then right) and recombine.
  • Photos inside text: mask/crop photos before OCR to reduce confusion.
  • Only need a chapter: use Split PDF first; OCR that portion.
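
The line-break fix above can also be scripted when you have many pages of OCR output. This is a sketch under simple assumptions: blank lines mark real paragraph breaks, and a hyphen at the end of a line means a word was split. The function name is illustrative.

```python
import re


def unwrap_lines(text):
    """Rejoin hard-wrapped OCR lines while keeping paragraph breaks."""
    # Normalize Windows line endings
    text = text.replace("\r\n", "\n")
    # Rejoin words split across lines: "recog-\nnition" -> "recognition".
    # Caveat: a true compound split at a line end ("well-\nknown") loses its hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Replace single newlines with spaces; blank lines stay as paragraph breaks
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces or tabs left behind
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text.strip()


sample = "Optical character recog-\nnition turns\nscans into text.\n\nNext paragraph."
print(unwrap_lines(sample))
```

Review the result before publishing: headings and list items that were separated by single newlines will be merged into the surrounding paragraph.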

FAQs: Extract Text from PDF

What’s the fastest free method?

Google Drive → Open with Docs. For clean digital PDFs, just use PDF to Word.

Can I keep the exact layout?

OCR focuses on text. For page-perfect layout, convert to DOCX and refine manually, or keep the PDF and extract only the needed quotes.

Is OCR private?

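It depends on the tool. Cloud methods such as Google Drive upload your file to the provider’s servers, so avoid public Wi-Fi and use trusted tools. For sensitive files, prefer OCR that runs on your own machine (OneNote desktop, Tesseract) and store the results locally.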

Extract Text Fast – Try PDF to Word