Is there any freeware OCR software (for Linux and/or Windows) that can take a PDF scanned document as input and output a Searchable PDF like Adobe Acrobat does?
By searchable PDF I mean that the OCRed text is laid invisibly over the original text, so it can be selected with the mouse and copied.
In Acrobat, open create-searchable.pdf or a photo of one of your own documents (for example in Adobe Acrobat Pro DC). If the file is a scanned PDF, open it directly; if it is an image, start Acrobat first, select "Create PDF" in the wizard, and choose the image file to import it. In the right-hand pane, select the Enhance Scans tool, then go to Tools > Enhance Scans > Recognize Text > In This File.
I know that gscan2pdf on Linux can do something like this, but the text is placed in the top left corner of the page and is way too small, not at all synchronized with the text on the scanned page in the background. This is because gscan2pdf feeds the whole page to an OCR engine. It should instead decompose the image into small images containing single lines of text or small paragraphs and send those to the OCR software.
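To make the alignment problem concrete, here is a minimal Python sketch of how an invisible, positioned text layer can be written into a PDF page. It relies on the PDF text rendering mode 3 (neither fill nor stroke), which draws nothing but keeps the text selectable. The function names, the `/F1` font resource and the idea of using the bounding-box height as font size are assumptions made for illustration, not what any particular tool does.

```python
def escape_pdf_string(text: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_text_ops(lines, page_height, font="/F1"):
    """Build PDF content-stream operators that place each OCR line invisibly.

    `lines` holds (text, left, top, height) tuples in points, measured from
    the top-left corner of the page, as OCR engines usually report boxes.
    `font` is a hypothetical resource name that the page would have to define.
    """
    ops = []
    for text, left, top, height in lines:
        # PDF's origin is the bottom-left corner, so flip the y axis.
        # Using the bottom of the box as baseline is a rough approximation.
        baseline = page_height - top - height
        ops.append(
            f"BT 3 Tr {font} {height:g} Tf {left:g} {baseline:g} Td "
            f"({escape_pdf_string(text)}) Tj ET"
        )
    return "\n".join(ops)
```

Each line gets its own position, which is exactly what is missing when the whole page is dumped as one block in a corner.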
Nicolas Raoul
Cornelius
10 Answers
A tool that lets you do that is PDF-XChange Viewer. The free version will allow you to OCR your document in a variety of languages (you can download additional language packs for free) and add the OCR'd text as an overlay text layer you can copy from and search with CTRL+F.
What it doesn't:
Guido Domenici
Try pdfsandwich. From the man page:
pdfsandwich generates 'sandwich' OCR pdf files, i.e. pdf files which contain only images (no text) will be processed by optical character recognition (OCR) and the text will be added to each page invisibly 'behind' the images.
pdfsandwich is a command-line utility. If you have a scanned PDF file, for instance alice.pdf (the first chapter of a novel you might have heard of), invoke pdfsandwich like this:
pdfsandwich alice.pdf
This will generate a file alice_ocr.pdf, which looks like the original file but has the recognized text placed behind the scanned images. You can now run full-text searches or select text areas.
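If you want to script this, a minimal Python sketch could look like the following. It assumes pdfsandwich is installed and on the PATH, and that the result lands next to the input under the default `<name>_ocr.pdf` naming described above; the helper names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def sandwich_output_path(pdf_path: str) -> Path:
    """Return the assumed default output name: <name>_ocr.pdf next to the input."""
    src = Path(pdf_path)
    return src.with_name(src.stem + "_ocr.pdf")


def run_pdfsandwich(pdf_path: str) -> Path:
    """Run pdfsandwich on one scanned PDF and return the expected result path."""
    if shutil.which("pdfsandwich") is None:
        raise FileNotFoundError("pdfsandwich is not installed or not on PATH")
    # Plain invocation with no options, as in the answer above.
    subprocess.run(["pdfsandwich", pdf_path], check=True)
    return sandwich_output_path(pdf_path)
```

Calling `run_pdfsandwich("alice.pdf")` would then be expected to leave alice_ocr.pdf beside the original.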
Another option might be OCRmyPDF.
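For reference, OCRmyPDF is normally called as `ocrmypdf input.pdf output.pdf`. A minimal Python sketch that assembles such a call is below; the file names, the `eng` language code and the choice to pass `--skip-text` (leave pages that already contain text alone) are assumptions for illustration, and the function name is hypothetical.

```python
import subprocess


def build_ocrmypdf_command(src, dst, language="eng", skip_text=True):
    """Assemble an ocrmypdf call that adds a text layer to a scanned PDF."""
    cmd = ["ocrmypdf", "-l", language]
    if skip_text:
        # Do not OCR pages that already carry a text layer.
        cmd.append("--skip-text")
    cmd += [src, dst]
    return cmd


def run_ocrmypdf(src, dst, **options):
    """Run the assembled command; raises CalledProcessError if OCR fails."""
    subprocess.run(build_ocrmypdf_command(src, dst, **options), check=True)
```

Keeping the command builder separate from the runner makes it easy to print the command first and check it before processing a whole folder.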
student
The newer version of Tesseract (3.03 RC at the time of writing this) can do this:
This is the command:
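As a rough sketch of what such a call looks like, the Python helper below builds a Tesseract invocation that uses the `pdf` output config. The file name `page1.png` and the `eng` language code are placeholders, not taken from this answer. As far as I know, Tesseract expects an image as input, so a scanned PDF would first have to be split into page images.

```python
import subprocess
from pathlib import Path


def build_tesseract_pdf_command(image_path, language="eng"):
    """Return (argv, expected_output) for a Tesseract run that writes a PDF."""
    image = Path(image_path)
    # Tesseract takes an output base name and appends ".pdf" itself.
    output_base = image.with_suffix("")
    argv = ["tesseract", str(image), str(output_base), "-l", language, "pdf"]
    return argv, Path(str(output_base) + ".pdf")


def run_tesseract_pdf(image_path, language="eng"):
    """Run Tesseract and return the path of the searchable PDF it should write."""
    argv, expected = build_tesseract_pdf_command(image_path, language)
    subprocess.run(argv, check=True)
    return expected
```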
Cornelius
pypdfocr is what worked for me. It is a Python script streamlining the whole Tesseract usage. After getting dependencies installed (on Linux it's a much simpler process) it's as simple as typing:
pypdfocr myfile.pdf
Then open myfile_ocr.pdf a while later.
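If you have a whole folder of scans, a small Python sketch like this can pick out the files that still need processing. It only relies on the `_ocr.pdf` naming shown above; the function names are made up, and running pypdfocr once per file is an assumption about how you would want to batch it.

```python
import subprocess
from pathlib import Path


def pending_scans(folder):
    """List PDFs in `folder` that have no matching *_ocr.pdf yet."""
    folder = Path(folder)
    pending = []
    for pdf in sorted(folder.glob("*.pdf")):
        if pdf.stem.endswith("_ocr"):
            continue  # this is already an OCR result
        if not pdf.with_name(pdf.stem + "_ocr.pdf").exists():
            pending.append(pdf)
    return pending


def ocr_folder(folder):
    """Run pypdfocr on every PDF in `folder` that has not been processed yet."""
    for pdf in pending_scans(folder):
        subprocess.run(["pypdfocr", str(pdf)], check=True)
```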
Zaroth
I use Microsoft OneNote as an OCR tool. Right-clicking an image lets you copy all the text in it, and OneNote can also search for text within images. It is free and accurate, runs on Windows, and supports almost all image formats.
It can also search through PDF files, and Images in PDF files.
A bonus is that it supports multiple languages, including English, French and Spanish :)
BarathVutukuri
https://www.microsoft.com/en-us/store/p/leadtools-ocr/9wzdncrdr0d5 is a small simple WinRT app (runs fine on Win10 as well) that does nothing more than take an image or pdf and output a sandwich PDF or text. It's kinda ugly and has absolutely no configuration, but it does this one small task perfectly.
James Polley
You can get searchable text using Google Drive.
First, choose a key setting. Under 'General' in your Google Drive settings, check the box next to 'Convert uploads: Convert uploaded files to Google Docs editor format.'
Now upload the pdf to your Google Drive (click 'new', then 'file upload'). When the upload is complete (might take a minute or two), right click it. (If you have trouble finding it, try hitting 'Recent' in the left-hand sidebar.) As I was saying, right-click the pdf you uploaded, and choose 'Open with... Google Docs'. Now you will have searchable text.
aparente001
While the other answers on this thread focus on desktop software, I've had a lot of success with this webservice: http://www.searchablepdfs.org/
It allows you to upload a PDF of a scanned document, and it generates a 'sandwich PDF' with embedded OCR text that you can copy/paste.
Pros:
Cons:
calvinyoung
Another option is pdf2pdfocr (https://github.com/LeoFCardoso/pdf2pdfocr), which is based on Tesseract-OCR and runs natively on Windows, macOS and Linux.
Disclaimer: I'm the pdf2pdfocr developer.
Leo Cardoso
Two more options:
1) Online: www.sandwichpdf.com
2) Desktop (multiple OSes): NAPS2 - https://www.naps2.com/
kpk
Feb 22, 2019
Searchable text is a lifesaver. A university student writing a paper wants PDF documents with searchable text to save time. An entrepreneur reviewing reports wants them to be searchable so he or she can pick out specific passages quickly. PDF documents with searchable text are an invaluable asset because they save time and make reading easier. Hence, several programs, including Adobe Acrobat, can make PDF text searchable. We are going to explore three options for converting PDF to searchable text, chosen for their variety of features and accessibility. In this article, you will first learn how to make PDF text searchable with PDFelement Pro.
Part 1. The Best PDF Text Editor to Make PDF Text Searchable
How do you make PDF text searchable? The answer is professional PDF editing software. We recommend PDFelement Pro because it is an editor offering plenty of options, including the ability to make PDF documents with searchable text. PDFelement Pro is designed to make PDF documents accessible and manageable, and it lets you make significant alterations to them. It is useful software for anyone, be it a student or a secretary.
Detailed Steps on How to Make PDF Text Searchable
Don't have the time to learn Adobe Acrobat? Install PDFelement Pro and jump right into the system. PDFelement Pro is so accessible and intuitive you will have no problem learning the software quickly. As an example, we will show how to make a PDF document text searchable.
Step 1. Launch PDFelement Pro
Download PDFelement Pro (be sure to download the correct version) and launch the application. You will get a homepage like this:
Step 2. Open a PDF File
You have two options for importing a PDF file: you can click on the 'Open' box in the right-hand corner, or click on 'File'. Either option opens a window to your local drive. Select the PDF file you want and click 'OK' to open it.
Step 3. Make PDF Text Searchable
PDFelement Pro immediately renders the document searchable. To search text in a true PDF directly, simply press 'Command + F' to bring up the search box. Enter the words into the box to find them in the PDF document.
Step 4. Make Scanned PDF Searchable
If you are wondering how to convert a scanned PDF to searchable text, PDFelement Pro has you covered. The first step is to perform OCR on the scanned PDF file. Open the scanned PDF file in PDFelement Pro (repeat Step 2). Click on the 'Edit' menu and select the 'OCR' option. This opens a panel on the right-hand side of the window. Make the modifications you want and click on 'Perform OCR'. Once the process is complete, the new OCR PDF file will be opened in a new tab. The text is now searchable: simply press 'Command + F' to search it.
Why Choose PDFelement Pro to Make PDF Text Searchable
PDFelement Pro is a strong choice for converting a PDF image to searchable text. Only a handful of programs can match the value PDFelement Pro offers. It is more than just a tool to make PDF text searchable; it is comprehensive software that lets you perform several functions on PDF documents. Some of the functions you can perform with PDFelement Pro include:
Whether you are an office worker, an artist or a business executive, PDFelement Pro will be useful. The best part about PDFelement Pro is its accessibility: the software is easy to grasp and learn. If you have any questions about a function, such as turning a PDF into searchable text, you can refer to the guide. The PDFelement Pro website contains a comprehensive guide detailing everything that can be done with the software. If you need comprehensive PDF editing software that is easy to learn and saves time, PDFelement Pro is for you.
Part 2. How to Make PDF Text Searchable for Free
Have you ever opened a PDF file only to find that none of the words are searchable? Ever wondered how to make a PDF file's text searchable? One solution is Sandwich PDF, an online tool for converting scanned PDF files into searchable ones. Sandwich PDF works differently from other PDF editing software: it treats each page of the PDF document as an image and adds an invisible layer of text to each page, positioned so that it corresponds to the 'image'. Once the process is complete, you can search for text courtesy of the invisible layer. To learn more about Sandwich PDF, visit the official website. If you want to make PDF text searchable for free with Sandwich PDF, follow these steps.
Step 1. Load the Sandwich PDF official website. The home page will contain all the elements you need.
Step 2. Add a file from your local drive, and then press 'Start'. You can click on 'Enhance scan quality' for better results.
Step 3. Download the document from the results page. Now open the downloaded file and use the Find option to search the text.
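To check whether a file really gained a text layer, one option is to dump its text and see if anything comes out. The sketch below is a minimal Python example that assumes Poppler's `pdftotext` is installed; the 20-character threshold and both function names are arbitrary choices for illustration.

```python
import subprocess


def extract_text(pdf_path):
    """Dump the text layer with Poppler's pdftotext ('-' sends it to stdout)."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def looks_searchable(text, min_chars=20):
    """Treat a PDF as searchable if it yields a meaningful amount of text."""
    # Ignore whitespace and page breaks, which image-only PDFs still produce.
    return len("".join(text.split())) >= min_chars
```

An image-only scan typically yields nothing but page breaks, so `looks_searchable(extract_text("downloaded.pdf"))` gives a quick yes or no before and after the conversion (the file name here is a placeholder).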
Part 3. How to make PDF Text searchable in Adobe
Adobe Acrobat is an excellent program for making PDF text searchable. The latest version of Adobe Acrobat comes with both an accessible interface and comprehensive editing features. If you want to convert scanned documents into high-quality PDF documents with searchable text, Acrobat is a strong choice. The one challenge is that Acrobat takes more time to learn than other options like PDFelement Pro and Sandwich PDF. There are a number of advantages to using Adobe Acrobat to convert a PDF scan to searchable text; in particular, it can make scans sharper and more detailed.
How to convert scanned PDF to searchable text? Simply follow the steps below to complete the process:
Step 1. Open your scanned PDF file: After scanning a physical copy of the document, load the scan in Acrobat DC. To start converting the scan into searchable text, click on 'Enhance Scans'.
Step 2. Skewing and resizing image: In the 'Enhance Scans' submenu, select 'Camera Image'. Blue dots appear on the scanned image. Drag the blue dots to the parts of the image where there is text, and click 'Enhance Page'.
Step 3. Adjust text contrast: In the image preview, there is a slider tool called 'Adjust enhancement text'. The tool is used to raise or lower the contrast of text in a PDF document. Close the menu to return to the 'Enhance Scans' menu. At this point, the image has better resolution and clearer text. However, you still cannot search for text.
Step 4. Recognize text: Back on the 'Enhance Scans' menu, click on 'Recognize Text' > 'In This File'. This will bring up a different submenu. Click on 'Recognize Text'.
Step 5. Search for text: The 'Recognize Text' tool converts the scanned PDF document into a file with searchable text. You can now search for and select text in the PDF document. Select the 'Find text' tool.