Using AWS Rekognition To Turn Images and PDFs Into Human Readable Text
I wanted to turn a PDF into text I could copy and paste, so I turned to AWS Rekognition. The console only accepted files up to 5 MB. Mine was 13 MB, so I had to call the asynchronous methods from a script instead.
I found an example on the AWS Blog that showed how to do it. You can find the full blog post at AWS Rekognition Blog post. I am going to reuse two snippets from there and explain what they do. The first document I wanted to convert was a screenshot I took of a book page. The following is not text but an image of text, which we are going to convert into text that can be selected, copied, and pasted.
The contents of rekog.py:
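For readers without the listing at hand, here is a minimal sketch of a synchronous text-detection call. It is not the blog post's script. It assumes the boto3 Textract client and its detect_document_text call, which is my guess at the API involved because it handles dense pages of text, even though this post refers to the service as Rekognition. The function names are hypothetical, and AWS credentials and a default region are assumed to be configured already.

```python
def lines_from_response(response):
    """Return the text of every LINE block in a detect_document_text response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def detect_text_in_s3_image(bucket, name):
    """Run synchronous text detection on an image stored in S3.

    Assumption: the image was already uploaded to `bucket` under the key `name`.
    """
    # Imported here so lines_from_response works even without boto3 installed.
    import boto3

    textract = boto3.client("textract")
    response = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": name}}
    )
    return lines_from_response(response)
```

Printing the result of detect_text_in_s3_image("picostat.com", "page.png") one line at a time would send the transcribed text to standard output, where it can be redirected to a file. The file name page.png is a placeholder, not the name used in the original script.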
All you have to do to adapt it for your case is upload your image to an S3 bucket. In my case that was “picostat.com”, which is listed in the code above. You then change the name of the image to your own filename and run the script with this command:
The greater-than sign (>) in that command tells the shell to redirect the script's output into a file called bg.txt, so the transcribed text ends up there instead of on the screen.
If you get an error message that boto is not installed, you can install it on a Mac with this command:
To convert a large PDF (larger than 5 MB) you will need an asynchronous script:
The contents of rekog2.py:
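As a rough illustration of the asynchronous flow, and again not the blog post's listing, the sketch below starts a job, polls until it finishes, and then pages through the results. It assumes the boto3 Textract calls start_document_text_detection and get_document_text_detection, which accept PDFs stored in S3. The helper names and the five-second polling delay are my own choices.

```python
import time


def start_job(textract, bucket, name):
    """Start an asynchronous text-detection job and return its job id."""
    response = textract.start_document_text_detection(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": name}}
    )
    return response["JobId"]


def wait_for_job(textract, job_id, delay=5.0):
    """Poll until the job leaves IN_PROGRESS, then return the final status."""
    while True:
        response = textract.get_document_text_detection(JobId=job_id)
        status = response["JobStatus"]
        if status != "IN_PROGRESS":
            return status
        time.sleep(delay)


def collect_lines(textract, job_id):
    """Gather every LINE block, following NextToken across result pages."""
    lines = []
    token = None
    while True:
        kwargs = {"JobId": job_id}
        if token:
            kwargs["NextToken"] = token
        response = textract.get_document_text_detection(**kwargs)
        for block in response.get("Blocks", []):
            if block.get("BlockType") == "LINE":
                lines.append(block["Text"])
        token = response.get("NextToken")
        if not token:
            return lines


def pdf_to_lines(bucket, name):
    """Convert a PDF already stored in S3 into a list of text lines."""
    # Imported here so the helpers above can be exercised without boto3.
    import boto3

    textract = boto3.client("textract")
    job_id = start_job(textract, bucket, name)
    status = wait_for_job(textract, job_id)
    if status != "SUCCEEDED":
        raise RuntimeError(f"Job {job_id} ended with status {status}")
    return collect_lines(textract, job_id)
```

Polling keeps the sketch short. A job on a large document can take a while, so a longer delay or a completion notification may suit production use better.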
Once again, all you need to do is change the document name and the bucket name, and then run this command:
If you’re not sure how to upload a document to S3 and you have the AWS CLI installed, you can do it with this command:
Or you can go to the AWS Console and use the graphical uploader in the S3 section.
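A third option is to upload from Python with the same boto3 library. The sketch below assumes the S3 client's upload_file method and uses the file's base name as the object key. The function names and the optional s3 parameter are hypothetical conveniences of mine, not part of the original scripts.

```python
import os


def default_key(path):
    """Use the file's base name as the S3 object key."""
    return os.path.basename(path)


def upload_to_s3(path, bucket, key=None, s3=None):
    """Upload a local file to an S3 bucket and return the key it was stored under.

    Pass an existing client as `s3` to reuse it; otherwise one is created.
    """
    if s3 is None:
        # Imported lazily so default_key can be used without boto3 installed.
        import boto3

        s3 = boto3.client("s3")
    key = key or default_key(path)
    s3.upload_file(path, bucket, key)
    return key
```

Calling upload_to_s3 with a local path and your bucket name returns the key, which is the document name the detection scripts expect.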