Monday, November 16, 2015

Off topic: finally a really good online OCR service

Sometimes I need to import printed documents such as contracts, letters and so on, but I must say that until yesterday my attempts to scan this kind of material with online OCR services had gone badly.

In the past, the sheer number of wrong words and of letters converted into digits (typically "l" would become "1", "O" would turn into "0", and "A" into "4") quickly led me to dismiss online OCR as a waste of time, and I always ended up retyping everything from scratch.

fragment of the original PDF document to be scanned


Then, for some reason, yesterday I had to import a four-page legal contract and I decided to make one more attempt, trying out three different (free) providers. Two of them, which I won't name out of pity, couldn't recognize a single line and returned blank pages scattered with stray letters and numbers.

output created by one of the other two OCRs

The third one, to my surprise, succeeded and returned the whole document with astonishing accuracy.
My amazement was even greater because the text was in Italian and full of legal terms.
I did have to manually adjust the font weight, the font style and the kerning between the letters, which was quick because the corrections applied to the whole text, but I had to amend very few words: fewer than ten in a text of almost 1,000, that is, under 1% of the words.

the scanned output as it came out from onlineocr.net
As you can see for yourself, apart from the mixed font weight there are very few problems: in this segment there is just one missing word, the rightmost word ("allegata") on the third line, and an exclamation mark in place of an uppercase "I" near the middle.

In case you are interested, the site is onlineocr.net.
If you sign up for free, they give you an initial allowance of 25 pages, which you can increase by buying blocks of pages at various prices or by completing some promotional actions.

Let's be honest, folks: I wouldn't spend a second of my time writing this little promotional post if the service weren't worth it.

Maybe it was just my lucky day with OCR, but frankly I am impressed: it's the first time I was able to save some time and avoid the annoying work of typing everything in again.

PS: if you know of other good OCR services that get the job done properly, I'd like to hear from you, so please drop a comment.
