In the past, the sheer number of wrong words and of letters converted into digits (typically "l" became "1", "O" became "0" and "A" became "4") quickly led me to dismiss online OCR as a waste of time, and I always ended up retyping everything from scratch.
Fragment of the original PDF document to be scanned
Then yesterday I had to import a four-page legal contract, so I decided to make one more attempt and tried three different free providers. Two of them, which I won't name out of kindness, couldn't recognize a single line and returned blank pages scattered with random letters and numbers.
Output produced by one of those two OCR services
The third one, to my surprise, succeeded and returned the whole document with astonishing accuracy.
I was even more amazed because the text was in Italian and full of legal terms.
I did have to manually adjust the font weight, the font style and the spacing between letters, but that was quick because the corrections applied to the whole text. As for the words themselves, I had to amend very few: fewer than ten in a text of almost 1,000.
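For anyone who wants to put a number on that, fewer than ten corrections in almost 1,000 words works out to a word accuracy of roughly 99%. Below is a minimal sketch of how such a figure could be measured, assuming you have both the raw OCR text and your manually corrected version as plain strings. The function name `word_accuracy` and the sample sentence are my own illustrations, not anything provided by the OCR service.

```python
from difflib import SequenceMatcher


def word_accuracy(ocr_text: str, corrected_text: str) -> float:
    """Return the fraction of corrected words that the OCR got right.

    Hypothetical helper: both texts are split on whitespace and aligned
    word by word, so a single wrong character makes the whole word count
    as an error.
    """
    ocr_words = ocr_text.split()
    ref_words = corrected_text.split()
    if not ref_words:
        # Nothing to compare against, so treat it as a perfect match.
        return 1.0
    matcher = SequenceMatcher(a=ref_words, b=ocr_words, autojunk=False)
    # Sum the lengths of all runs of words that appear in both texts.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


# Made-up example: "il" was misread as "i1" (letter "l" turned into "1").
sample_accuracy = word_accuracy(
    "i1 contratto di locazione",
    "il contratto di locazione",
)
print(f"{sample_accuracy:.0%}")
```

Counting whole words is deliberately strict, and it matches the way I counted my own corrections: a word is either right or it needs fixing.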
The OCR output as it came out of onlineocr.net
In case you are interested, the site is onlineocr.net.
If you sign up for free, you get an initial allowance of 25 pages, which you can increase by buying blocks of pages at various prices or by completing promotional actions.
Let's be honest, folks: I wouldn't spend a second of my time writing this little promotional post if the service weren't worth it.
Maybe it was just my lucky day with OCR, but frankly I am impressed: it's the first time I have been able to save some time and avoid the annoying work of typing everything in again.
PS: if you know of other good OCR services that get the job done properly, I'd like to hear from you, so please drop a comment.