Monday, November 16, 2015

Off topic: finally a really good online OCR service

Sometimes I need to import printed documents such as contracts and letters, but I must say that until yesterday my attempts to scan this kind of material with online OCR services had gone badly.

Most of the time, the sheer number of wrong words and of letters converted into digits (typically "l"s became "1"s, "O"s turned into "0"s and "A"s into "4"s) quickly led me to dismiss online OCR as a waste of time, and I always ended up retyping everything from scratch.

fragment of the original PDF document to be scanned

Then, for some reason, yesterday I had to import a four-page legal contract and I decided to make one more attempt, trying out three different (free) providers. Two of them, which I won't name out of pity, couldn't recognize even a single line and returned blank pages scattered with random letters and numbers.

output created by one of the other two OCRs

The third one, to my surprise, succeeded and returned the whole document with astonishing accuracy.
My amazement was even greater because the text was in Italian and full of legal terms.
I did have to adjust the font weight, font style and letter spacing manually, but that was quick because the corrections applied to the whole text. Beyond that, I had to amend very few words: fewer than ten in a text of almost 1000.

the scanned output as it came out from onlineocr.net
As you can see for yourself, apart from the mixed font weight there are very few problems in this segment: just one missing word, the rightmost word ("allegata") on the third line, and an exclamation mark in place of an uppercase "I" near the middle.

In case you are interested, the site is onlineocr.net.
If you sign up for free, they give you an initial allowance of 25 pages, which you can increase by buying blocks of pages at various prices or by taking part in some promotional activity.

Let's be honest, folks: I wouldn't waste a second of my time writing this little promotional post if the service weren't worth it.

Maybe it was just my lucky day with OCRs, but frankly speaking I am impressed. It's the first time I was able to save some time and avoid the annoying work of typing everything in again.

PS: if you know of other good OCR services that got the job done properly, I'd like to hear from you, so please drop a comment.

Tuesday, November 10, 2015

Error: parsererror - SyntaxError: JSON.parse: unexpected non-whitespace character after JSON data at line 2 column 1 of the JSON data

Always check out the original article at http://www.oraclequirks.com for latest comments, fixes and updates.

Error: parsererror - SyntaxError: JSON.parse: unexpected non-whitespace character after JSON data at line 2 column 1 of the JSON data

You may get this self-explanatory error at run-time if you specified a nonexistent page item in the list of items to be returned after invoking a PL/SQL procedure from within a dynamic action in Oracle Application Express.

This can easily happen if you mistype the page item name, for instance, but in my case I had completely forgotten to create the item P4_ID.
Luckily I had an epiphany before going totally nuts.
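
The message itself comes from the browser's JSON parser, which rejects any text that has something other than whitespace after a complete JSON value. The sketch below reproduces that condition in plain JavaScript. The response text and the tryParse helper are invented for illustration: I am only assuming that the server reply contained extra text after the JSON, which is what "line 2 column 1" suggests. The exact wording of the message also varies between browsers, so the code checks only the error type.

```javascript
// Hypothetical response body: a complete JSON object on line 1, followed by
// unexpected extra text on line 2. The content is made up for illustration.
const responseText =
  '{"item":[{"id":"P4_NAME","value":"x"}]}\n<unexpected trailing text>';

// Hypothetical helper: parse the text and report success or the error raised.
function tryParse(text) {
  try {
    return { ok: true, data: JSON.parse(text) };
  } catch (e) {
    // Non-whitespace characters after the first JSON value raise a SyntaxError.
    return { ok: false, error: e };
  }
}

const bad = tryParse(responseText);      // trailing text after the JSON value
const good = tryParse('{"item":[]}\n '); // trailing whitespace only is accepted

console.log(bad.ok, bad.error.name); // false SyntaxError
console.log(good.ok);                // true
```

Seen this way, the error says nothing about the missing page item: it only tells you that the reply was not clean JSON, so the list of items to be returned is one of the first places to check.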

Monday, November 09, 2015

About displaying images using APEX_UTIL.GET_BLOB_FILE_SRC in non-trivial situations

In a perfect situation, when we need to display an inline image inside an Apex report, we can simply select the BLOB column and apply the special formatting required in these cases. I mean that weird format mask containing a list of attributes separated by colons, where most members are names of columns in the table being queried:

SELECT id,
       dbms_lob.getlength(thumbnail) as image
  FROM image_table...
The format mask specified in the column attributes of the report is something like:

Note also the call to dbms_lob.getlength, which for some reason is still NOT properly documented but is absolutely necessary to make this type of report work. Without it, you will run into this rather obscure error message when you try to run the report:
report error:
ORA-06502: PL/SQL: numeric or value error: character to number conversion error 

But as I said, sometimes we are not in a perfect situation.
We might have some rows with an image and some without images.
How do we display an alternate image in case the BLOB value is null?
SELECT id,
       NVL2(thumbnail,
          '<img src="'||APEX_UTIL.GET_BLOB_FILE_SRC ('P4_X',id)||'" />',
          '<img height="72" src="#IMAGE_PREFIX#1px_trans.gif" width="72" />'
       ) as image
  FROM image_table...

The SQL above displays the BLOB image stored in image_table only if the value is not null (it might still be an empty BLOB, but that's another story and you had better avoid this further annoyance). Otherwise it displays a transparent image that you could replace with anything that better suits your needs, say an icon with a question mark or whatever.

Now, the problem with a report column defined in a SQL statement like this is that you can no longer use Apex's built-in report image formatting, so you need to add a few extra pieces here and there.

The first requirement comes from APEX_UTIL.GET_BLOB_FILE_SRC itself: the first parameter must be the name of a page item (P4_X in the code above) of type FILE BROWSE, containing the format mask that specifies a list of columns, similar to the one used for the report column above, but without the Blob table.

Now, where does Apex take the name of the table if it doesn't ask me to specify one here?
It soon becomes clear when you try to run the page, because it throws the following run-time error.

Error: No corresponding DML process found for page 4
Contact your application administrator
Technical Info (only visible for developers)

    is_internal_error: true
    component.id: 2568302329371214
    component.name: P4_X

This means that Apex expects to retrieve the name of the table from a built-in Row Fetch process, where you specify the table name and its primary key(s).

But wait a minute, I am not using a built-in Row Fetch in this page because I am running a report.
Never mind, you can create a *fake* Row Fetch process by specifying Never as the condition in the process attributes, so that it exists on the page but never actually runs.

After adding the fake Row Fetch process, the report will magically start working.

However there is one last refinement to be done:
the FILE BROWSE page item we added is probably undesired, visually speaking.
If that is the case, you need to add something like style="display:none;" in the HTML Form Element Attributes to hide it from the user.

I wish we could have a more streamlined version of APEX_UTIL.GET_BLOB_FILE_SRC in a future release of Apex, so that we could supply all the necessary parameters without having to resort to this kind of trick.
