OCR PDF
Make scanned PDFs searchable with optical character recognition
Language
Language data (about 10 MB) is downloaded on first use and cached locally.
How it works
1. Renders each page as an image
2. Tesseract.js recognizes text
3. Invisible text layer is added
4. Original appearance is preserved
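
Step 3 depends on placing each recognized word where it appears on the page. The sketch below shows one way that conversion could work; it is not Modufile's actual code. It assumes the OCR engine reports word boxes in image pixels with the origin at the top-left corner, and that the PDF page uses the default coordinate system (points, 72 per inch, origin at the bottom-left). The names `PixelBox`, `PdfRect`, and `pixelBoxToPdfRect` are hypothetical.

```typescript
// Hypothetical types for illustration; the field names are assumptions.
interface PixelBox { x0: number; y0: number; x1: number; y1: number; }
interface PdfRect { x: number; y: number; width: number; height: number; }

// PDF default user space uses 72 units (points) per inch.
const POINTS_PER_INCH = 72;

// Convert a word box from image pixels (top-left origin, y grows downward)
// to a PDF rectangle in points (bottom-left origin, y grows upward).
function pixelBoxToPdfRect(box: PixelBox, dpi: number, pageHeightPt: number): PdfRect {
  // Multiply before dividing to keep round values exact where possible.
  const toPt = (px: number): number => (px * POINTS_PER_INCH) / dpi;
  return {
    x: toPt(box.x0),
    // Flip the y axis: the rectangle's lower edge is the box's bottom edge (y1).
    y: pageHeightPt - toPt(box.y1),
    width: toPt(box.x1 - box.x0),
    height: toPt(box.y1 - box.y0),
  };
}
```

For example, a word box from (200, 100) to (600, 150) pixels on a page rendered at 200 DPI, with a page height of 792 points, maps to a rectangle 144 points wide and 18 points tall whose lower-left corner is at (72, 738).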
About OCR PDF
Modufile's OCR PDF tool makes scanned PDFs searchable by adding an invisible text layer to each page. It renders every page as a high-resolution image (200 DPI) using MuPDF, runs Tesseract.js optical character recognition to detect and position the text, and writes the recognized words back into the PDF as an invisible overlay using pdf-lib. The page's visual appearance is unchanged, while the added layer enables full-text search, text selection, and copy-paste. The tool supports over 100 languages and runs entirely in your browser, so your documents are never uploaded to any server. It is well suited to making scanned contracts, archived documents, and image-based PDFs searchable.
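
To give a sense of what rendering at 200 DPI means in practice, the sketch below computes the pixel dimensions of a rendered page from its size in PDF points (72 per inch). The function name `renderSizeAtDpi` and the memory estimate are illustrative assumptions: the estimate assumes an uncompressed 8-bit RGBA bitmap, which may not match the pixel format the tool actually uses.

```typescript
// PDF page sizes are expressed in points, 72 per inch.
const PDF_POINTS_PER_INCH = 72;

interface RenderSize {
  scale: number;     // zoom factor relative to 72 DPI
  widthPx: number;
  heightPx: number;
  rgbaBytes: number; // assumed uncompressed RGBA, 4 bytes per pixel
}

// Hypothetical helper: pixel size of a page rendered at the given DPI.
function renderSizeAtDpi(widthPt: number, heightPt: number, dpi: number): RenderSize {
  const widthPx = Math.round((widthPt * dpi) / PDF_POINTS_PER_INCH);
  const heightPx = Math.round((heightPt * dpi) / PDF_POINTS_PER_INCH);
  return {
    scale: dpi / PDF_POINTS_PER_INCH,
    widthPx,
    heightPx,
    rgbaBytes: widthPx * heightPx * 4,
  };
}

// US Letter is 612 x 792 points (8.5 x 11 inches).
const letterAt200 = renderSizeAtDpi(612, 792, 200);
console.log(letterAt200.widthPx, letterAt200.heightPx); // 1700 2200
```

Under these assumptions a single US Letter page becomes a 1700 x 2200 pixel image of roughly 15 MB in memory, which suggests why a browser-based tool would process pages one at a time.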