Comparing different text recognition (OCR) applications with poor quality material

Background

I recently ran into a need to turn several book pages, photographed with a smartphone, into a text-searchable format. That means running text recognition on the JPG files, which usually results in a PDF file that looks the same but has the capability to be searched. In this case the files are far from optimal quality: the lighting is not uniform and the pages are bent, so they make an excellent “worst case” test. The text is still easily readable by the human eye.

I remembered from the past that Adobe has such a product, but I also know that Adobe’s products aren’t always the best, even though some of them are considered good. I found a random web page with a review of nine OCR applications, updated in 2016, so I began downloading trial versions of them to find a good one. I tested the programs in the same order as the review but drew my own conclusions about them. I also tried some free and open source choices.

Link to the review, which I used only to get a list of the available applications.

1. Omnipage Standard

This was the best in the review. I tried to obtain a trial version using two different e-mail addresses but never got an e-mail with a download link. I also checked that the messages had not been caught as spam.

2. Adobe Acrobat

The download took a long time, but the product basically worked. Some pages had slightly darker areas where OCR didn’t work, even though they were not THAT dark. Most of the text was searchable, which was better than nothing, but I expected more from such a big company name and the highest price tag. Changing the settings did not help and only made the recognition worse, so the defaults were pretty good.

3. ABBYY FineReader

The download was fast, but the installer left a 600 MB temporary folder lying around, which I deleted myself after making sure it was no longer needed. The software makes use of all four CPU cores by recognizing four pages at a time, consuming 100% of CPU time. Adobe did only one page at a time, and CPU usage was close to 25% (one of four cores in use).

ABBYY FineReader was also able to recognize even the darkest areas, so it was “perfect” and you could not expect more from the recognition quality. That is why I consider this to be the winner! It can, however, save only 3 pages at a time in trial mode. It is also on the cheaper side, at less than half the price of Adobe Acrobat.

4. ReadIris Pro

This application has performed the most poorly so far with cell phone photographed documents, producing only gibberish, and the settings were quite limited. It’s one of the cheapest.

Image: a sample of how poor the source material was. I expected it to be recognized well, because at least two applications were able to make out every word in this worst portion.

5. Nuance Power PDF Advanced

So far the largest download, over 850 MB in size. I wondered why a text recognition algorithm with some additional data would need to be this large. It appears to be because it includes a huge collection of integrations with many office suites and business e-mail applications. It also has text-to-speech for a few languages. I do not want those, and it is possible to uncheck them in the installer. The installer is also the first so far that wants you to restart your computer just for installing a new application. The application seems to start up fine anyway. It uses only one core while processing multiple pages. The program performs amazingly well and is a top choice so far.

6. ABBYY PDF Transformer+

Installing was easy. It uses a single thread. This seems to be a cheaper version of ABBYY FineReader, which was very good at OCR. This version is just as good, and it is actually able to save more than 3 pages while in trial!

7. Soda PDF

A nice installer with good design. The application requires a free registration in order to create PDF documents, which takes additional time. This program, again, uses only one core at a time, even though the process consumes only 200 MB of memory. I wonder why it is so difficult to run the recognition on several threads, one per page, to make the total OCR duration significantly shorter. This application is the slowest so far, but at least OCR is automatic in nature, so you can go for an extended coffee break when processing over 50 pages on a modern computer. The OCR operation crashed when it was about 50% done, and I did not try again because at this point I had already found two good applications.
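
To illustrate the point about processing several pages at once, here is a minimal Python sketch. It says nothing about how any of the reviewed products work internally. The `recognize` argument is a hypothetical stand-in for whatever does the OCR of one page. Threads are enough when `recognize` mostly waits for an external OCR process; for recognition done in pure Python, a process pool would be the better fit.

```python
from concurrent.futures import ThreadPoolExecutor


def ocr_pages(pages, recognize, workers=4):
    """Run recognize(page) for every page, up to `workers` pages at a time.

    `recognize` is a hypothetical callable that takes one page and
    returns its recognized text. Results come back in page order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands pages to idle workers but keeps the original order
        return list(pool.map(recognize, pages))
```

With four workers on a four-core machine, this is the same "four pages at a time" pattern that FineReader showed.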

8. Presto PageManager

After installing, trying to launch the application opens up a web registration form right away. The form has multiple pages with a lot of questions, which seems a bit odd, and the layout reminds me of the earlier days of the Internet. I filled out the form to the best of my ability and submitted it. Nothing happened. I tried to restart the application, but simply nothing happened, and it no longer opened any web pages either. It is just totally dead. I guess it might not be Windows 10 compatible, which would not be surprising given the outdated look of the web form. I wish there were an error message instead of just nothing happening. I am not able to review this one.

9. PaperPort Professional

As with Omnipage Standard, I never got an e-mail when I requested a trial, not even in the spam folder. I wonder if these companies have something wrong with their e-mailing system or if they’re actually using manual labor to deliver the trials. In any case, it is taking far too long, so I cannot review this one either.

Free or open source applications

It seems that many of the free or open source OCR applications do the actual recognition with the same libraries, Tesseract and GOCR. Tesseract was originally developed at HP, which released it as open source in 2005; Google then sponsored its development for years. There are many front ends available for Tesseract, some promoted and designed like the commercial applications, others more in the style of a typical open source project.

I tried one of them and compared the results to running the same photo through Google Docs’ OCR on Google’s cloud service. Both turned out rather poorly, producing gibberish with a few recognizable words. With both Google Docs and the local Tesseract front end, the result improved slightly, and by about the same amount, when I manually fine-tuned the photo in an image editor before processing. Because of that, and because Google has backed Tesseract, I think Google Docs is using Tesseract internally as well, even though that is not clearly advertised.

Unfortunately, Tesseract is not a choice for photographed material at the moment, unlike a few of the commercial applications, but it is totally free and open. GOCR seemed to be no better than Tesseract. You might want to seriously give these a go, though, if your source material is finely scanned paper instead of a quickly photographed book page! These are open source solutions, which is always a good thing, especially if they work well enough.
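
For anyone who wants to try Tesseract without a front end, the following is a rough Python sketch, assuming the `tesseract` command line tool is installed and on the PATH. The folder names `pages` and `ocr_out` are made up for the example. Passing `pdf` as the last argument asks Tesseract to write a searchable PDF instead of a plain text file.

```python
import subprocess
from pathlib import Path


def tesseract_pdf_command(image, out_dir="ocr_out", lang="eng"):
    """Build the command line that asks Tesseract for a searchable PDF."""
    image = Path(image)
    # Tesseract appends ".pdf" to the output base name by itself
    output_base = Path(out_dir) / image.stem
    return ["tesseract", str(image), str(output_base), "-l", lang, "pdf"]


def ocr_folder(folder="pages", out_dir="ocr_out"):
    """Run Tesseract on every JPG in `folder` (hypothetical folder name)."""
    Path(out_dir).mkdir(exist_ok=True)
    for image in sorted(Path(folder).glob("*.jpg")):
        subprocess.run(tesseract_pdf_command(image, out_dir), check=True)
```

This only automates the call. As noted above, the recognition quality on photographed pages still depends heavily on cleaning up the image first.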

Conclusion

ABBYY’s FineReader and PDF Transformer are both great choices, as is Nuance Power PDF, which joins them as an equal in the top trio. It’s difficult to say which of the three is best, because they all performed OCR excellently in poor conditions. Instead, the differences lie outside pure OCR ability: Nuance Power PDF is fully functional as a trial version, but FineReader saves only three pages at a time. PDF Transformer from the same company did not have that limitation and is also the cheapest, so I recommend trying out PDF Transformer and then purchasing it if you want to save some money! Technically, all three are good products. Also, if you have problems while trialing any of them, it is good to have two other options.

If your source material is very clean, well scanned and cropped paper in mint quality, you might want to seriously give a go to Google Docs cloud OCR or one of the many free front end applications that use the open source libraries. They are free and might be just good enough!

The commercial programs don’t have Linux versions, not even as closed source. It is also very difficult to find the free trial versions on most of the companies’ or products’ web pages, so I recommend typing the name of the application and the word “trial” into Google search to easily find the right page in the mess.

A common hint for using the applications: they usually don’t open a bunch of JPG photos and perform OCR directly. Instead, you first have to combine the photos into a single PDF file with a function in the application, and only then can you choose to do the OCR. It is sometimes possible to replace parts of the page with the recognized text, but for a more beautiful and consistent look it is better to keep the photos intact for the reader and just add an overlaid searchable text layer.

Rescuing TrueCrypt with only a USB stick

I recently ran into a problem when I tried to set up a dual boot of Linux on my ThinkPad x200 alongside the Windows 7 that I had already been using. The Windows 7 installation is encrypted using TrueCrypt, yes, TrueCrypt, because it is still the best system-partition-wide encryption solution for Windows 7. Newer Windows versions can use BitLocker on system drives on any PC, but there are no good enough graphics drivers available for the x200 on anything newer than Windows 7.

Anyway, it seems that the GRUB 2 loader used in recent Linux distributions insists on storing some data in the same area where TrueCrypt stores some of its code (outside the first 512 bytes, the MBR), so there is a conflict which chainloading alone can’t solve. I decided to install Linux on an internal SD card instead, which can be up to 32 GB and which the x200 can boot from. It makes dual booting so much easier and less risky, and it takes no extra space externally. Each disk can then have the original boot loader of its own OS, and the BIOS boot menu is used to select which OS to boot.

TrueCrypt was at this point broken, and restoring the MBR was not enough to fix it. I ran into a solution on a German web page, where the TrueCrypt rescue CD is set up on a USB stick. The ThinkPad x200 does not have an optical drive, nor did I want to burn a disc. TrueCrypt creates a rescue CD image when it is installed, and luckily I had it stashed on my file server as an ISO file. There is no automated way to just generate a USB stick from the ISO file, because it is different from normal Linux and Windows images. Instead, GRUB is set up on a USB stick in Linux, or by using grub4dos and grubinst in Windows. After that, the “grldr” file and the TrueCrypt rescue ISO are copied onto the stick, and a GRUB “menu.lst” file is created with the following magic. Remember to name the ISO file correctly: the entry below expects it as “tc.iso” in the root of the stick.

title TRUECRYPT RESCUE DISK 
find --set-root /tc.iso 
map --mem /tc.iso (hd32) 
map (hd0) (hd1) 
map (hd1) (hd0) 
map --hook 
root (hd32) 
chainloader (hd32)

Link to grub4dos: http://download.gna.org/grub4dos/grub4dos-0.4.4-2009-06-20.zip
Backup Link: grub4dos-0.4.4-2009-06-20.zip

Link to grubinst: http://download.gna.org/grubutil/grubinst-1.1-bin-w32-2008-01-01.zip
Backup Link: grubinst-1.1-bin-w32-2008-01-01.zip

Link to the German article:

TrueCrypt Rescue Disk unter Windows auf USB Stick

Automatically directed ham satellite antenna

I recently bought a pair of Baofeng UV-5R transceivers, which are excellent VHF/UHF radios for amateur radio use for the price! At my location there is not much chance of setting up a good HF antenna, and with the higher bands it’s possible to get only local connections. Then a friend reminded me of the ham repeater satellites. There seem to be some functional ones at the moment, so I decided to start working on a portable antenna that could be used with them. Someone had published a guide for a dual band directional antenna that cost only $4 to make. Basically it is just a wooden stick with simple Yagis for the 2 m and 70 cm bands, made of metal rods.

These satellites fly in non-geosynchronous orbits, meaning there is only a limited time each day to work one satellite, but it travels around the world so everybody can enjoy it. I decided that building the antenna alone would be too simple for me, and I did not like the idea of aiming it with one hand, possibly in the wrong direction, while trying to use the transceiver with the other. Then I remembered that very cheap digital magnetometers and tilt sensors (accelerometers) are now available, and that it would be possible to use those in combination with GPS to get the exact aim of the antenna and compare it to the calculated position of the satellite based on its orbital data (Keplerian elements, “KEPS”).

I went on to design and 3D print parts that allow connecting two servos to a standard camera stand, and then the antenna to the upper servo. The antenna boom is also 3D printed for the fun of it, and the sensor is attached to the boom. I have already made an Arduino program which receives KEPS over Bluetooth and location and time from GPS, uses the PLAN13 algorithm to calculate the location of the satellite, and calculates the correctly compensated direction and tilt of the antenna. The servos then take the shortest path to point the antenna at the satellite and keep tracking it. There are some safeties and it seems to work, but I still have to give it a real go outdoors. That’s because I ran into a “slight” problem. The Arduino Nano that I use does not have enough timers, so I can’t have everything at once: the serial ports, precise servo PWM control and I2C to the sensors.
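
The “shortest path” part is easy to get wrong around north, where the heading jumps from 359 to 0 degrees. Below is a small Python sketch of the idea. It is not the Arduino code from the project, and it assumes an azimuth axis that can rotate freely through north; a servo with a limited travel range would need extra handling. The function names and the `max_step_deg` limit are made up for the illustration.

```python
def shortest_turn(current_deg, target_deg):
    """Signed rotation in degrees, in the range [-180, 180), that moves
    the heading from current_deg to target_deg the short way around."""
    return (target_deg - current_deg + 180.0) % 360.0 - 180.0


def step_towards(current_deg, target_deg, max_step_deg):
    """Next heading when the rotation is limited to max_step_deg per update."""
    turn = shortest_turn(current_deg, target_deg)
    turn = max(-max_step_deg, min(max_step_deg, turn))
    return (current_deg + turn) % 360.0
```

For example, going from a heading of 350 degrees to 10 degrees gives a turn of +20 degrees rather than -340.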

To overcome the overloading of the Arduino, and given that I would in any case need a smartphone or laptop to send the latest KEPS data to the Arduino, I decided to change the approach so that all the smart processing is done on the smartphone. The phone’s GPS and clock are used as well, so there is no need for a separate GPS module. That way the Arduino only has to report the sensor readings over Bluetooth and receive servo control commands over Bluetooth. Problem solved. The Android application would get the KEPS pasted from a web site, access the location of the device, and then keep calculating the satellite position relative to the sensed antenna position and make adjustments.

I have never made Android apps before, so this is also an adventure into Android Studio 2.0 and brushing up my Java knowledge. I might also first make a simpler command line program for PC, because my x200 laptop also has everything needed: Bluetooth, integrated GPS and an Internet connection.

More posts and photos will follow about the status of the project when more milestones are reached.

ABS printing

I have been printing with PLA plastic for over a year and recently thought of trying ABS, which is better for some purposes (it does not soften in hot sunlight). It is supposedly more difficult to print because it needs higher temperatures and it warps more easily off the surface it is being printed on.

First, the glass surface that is being printed on has to be heated not to about 60 degrees Celsius but up to 110. The default heater of the Velleman K8200 printer could go up to only 70 degrees, so I decided to add a relay box and switch, which allows using a higher voltage, resulting in more heat because the resistance of the heater stays the same. I found an old transformer that can provide 26 volts at 5 amps, which is plenty and a huge improvement over the default 15 volts. It does not matter that it is AC because it is only used for heating, and AC is actually easier for a relay to handle. The maximum temperature was then 117 degrees, which is enough.
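
To put a number on “more voltage, more heat”: for a fixed resistance the heating power is P = V²/R, so the power grows with the square of the voltage. A quick Python check of the ratio follows. It assumes the resistance really stays constant and that the 26 volts is the RMS value. The resistance of the heater itself is not needed for the ratio, and I have not measured it.

```python
def heating_power_ratio(old_volts, new_volts):
    """How many times more heating power a fixed resistance gives
    when the voltage is raised from old_volts to new_volts (P = V^2 / R)."""
    return (new_volts / old_volts) ** 2


def current_ratio(old_volts, new_volts):
    """The current through a fixed resistance rises in direct
    proportion to the voltage (I = V / R)."""
    return new_volts / old_volts
```

Going from 15 to 26 volts gives (26/15)² ≈ 3.0 times the heating power, while the current rises by about 1.7 times, which is worth comparing against the 5 amp rating of the transformer.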

Printing ABS was possible, but the objects kept warping. Dissolving some ABS in acetone creates a substance that, when applied over the glass, fixes most of the warping issues, so it became possible to print most objects as they are. The ratio I used was 1000 mg of ABS in 100 ml of acetone (10 mg per ml). The ABS pieces dissolved in about one hour at room temperature, and the mixture was then ready to be used.

The final problem is that printing ABS produces vapors that smell bad and are slightly toxic. With PLA there is none of that. I solved it by moving the printer to a well ventilated area (a Finnish sauna) and placing an activated carbon filter and a ventilation fan next to it to filter the fumes produced by the printer.

Make a photo frame with a clock from an old mobile phone

I had an old iPhone that can’t be used as a phone anymore because of a very poor battery. Stuck on a charger, there were not many choices for what to use it for. I decided to set it up as a desk clock that always shows the time and date on the screen. One can never have too many clocks around.

The iOS clock in the top bar is very small, and all the “real apps” were either commercial, showed ads or required a newer model. I then realized that even though the phone and OS are old, it has an HTML5-capable browser. It took a couple of hours to write a simple JavaScript web app. It works on a 480×320 canvas, fitted to the screen in full screen mode, and draws an analog clock with hands and the date once every second. HTML5 canvases are very versatile and light, with drawing primitives such as lines, and it was possible to have the background art and the clock face on different transparent div’s behind the canvas. That means the background art can be replaced at any time just by copying any photo file onto the web server.
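
The arithmetic behind the clock hands is the same in any language. The actual app is written in JavaScript, so the following Python version is only a sketch of the calculation, with made-up function names. Angles are measured clockwise from 12 o’clock, and the y axis points down as it does on a canvas.

```python
import math
from datetime import datetime


def hand_angles(now):
    """Return (hour, minute, second) hand angles in degrees,
    measured clockwise from the 12 o'clock position."""
    seconds = now.second
    minutes = now.minute + seconds / 60
    hours = now.hour % 12 + minutes / 60
    return hours * 30, minutes * 6, seconds * 6


def hand_tip(cx, cy, length, angle_deg):
    """End point of a hand drawn from the centre (cx, cy).
    Canvas coordinates: x grows to the right, y grows downwards."""
    rad = math.radians(angle_deg)
    return cx + length * math.sin(rad), cy - length * math.cos(rad)
```

On a 480×320 canvas the centre would be at (240, 160), and each hand is a single line from the centre to its `hand_tip`, redrawn once a second with `hand_angles(datetime.now())`.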

End result can be seen at http://files.jcp.fi/clockdemo.

Faster login on Ubuntu

I’ve had several Linux distributions on my home server and also loved NetBSD for many years. It has now been running the 32-bit Ubuntu Server distro for several years, because it has just kept working and staying up to date. After some dist-upgrade, though, remote logins became very slow. After some Googling, it seems many people have the same problem. There are two “fixes” which both add to the speed and shouldn’t cause any side effects.

Firstly, /etc/nsswitch.conf for some reason has the line:
hosts: files mdns4_minimal [NOTFOUND=return] dns mdns 4 mdns
The following line is enough in normal situations and makes name look-ups faster:
hosts: files dns
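
If you prefer to script the change rather than edit the file by hand, here is a small Python sketch that works on the text of the file. The function name is made up. It keeps the old line as a comment, and it leaves reading and writing /etc/nsswitch.conf (which needs root) to the caller.

```python
def simplify_hosts_line(nsswitch_text, sources="files dns"):
    """Replace the active 'hosts:' line with 'hosts: <sources>',
    keeping the previous line as a comment for reference."""
    out = []
    for line in nsswitch_text.splitlines():
        name, sep, _ = line.partition(":")
        if sep and name.strip() == "hosts":
            out.append("#" + line)          # old setting, commented out
            out.append("hosts: " + sources)  # new, shorter lookup order
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Take a backup copy of the file first, so the original is easy to restore if name look-ups for local machines stop working.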

Secondly, for some reason Ubuntu by default has a heavy PAM plug-in which gathers all kinds of system statistics for the MOTD. It can be disabled by commenting out all lines with “pam_motd.so” in the /etc/pam.d/sshd file.
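
The same edit can be sketched in Python. The function below is only an illustration with a made-up name. It takes the contents of /etc/pam.d/sshd as text and returns the text with every active “pam_motd.so” line commented out, so that reading and writing the file (as root, after taking a backup) stays a separate step.

```python
def disable_pam_motd(sshd_pam_text):
    """Comment out every active line that loads pam_motd.so."""
    out = []
    for line in sshd_pam_text.splitlines():
        already_commented = line.lstrip().startswith("#")
        if "pam_motd.so" in line and not already_commented:
            line = "# " + line
        out.append(line)
    return "\n".join(out) + "\n"
```

Running it twice is harmless, because lines that are already commented out are left as they are.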

Just some simple fun with LEDs

I am making a prop for a costume, where it’s important to have a sphere 8 cm in diameter that emits red light and shines. A local store happened to have a plastic sphere of exactly 8 cm that is transparent and shiny. With some sandpaper the inside turned from glossy to matte, so that it distributes light evenly, while the outside still reflects external light like sunlight or camera flashes.

The next step was soldering together some bright red LEDs. I put one facing each direction of a cube, so that there is always some light at every angle. They worked great with a total of 2.4 V coming from two rechargeable AA batteries, without the need for resistors, because super bright LEDs have a slightly higher forward voltage.

Everything put together, I was happy with the result, because it is so bright that it looks very red even in sunlight, so there is no need to apply any paint. It will also surely be noticed in darker places. My lady liked it too, in a random photo. 🙂

Hansel & Gretel: Witch Hunters

Hansel & Gretel: Witch Hunters is a movie I didn’t see coming, but I went to watch it as a surprise from two good friends of mine. It’s called “Hannu ja Kerttu: Noitajahti” in Finland. Hansel and Gretel were given the Finnish names Hannu and Kerttu when the original children’s tale was translated in the past.

The movie was a very good surprise, as I actually liked it. Recently there have been many movies and series that bring old magical tales up to date with cars, computers, guns and so on, one of my favorite series being Once Upon a Time. Hansel & Gretel, however, is much more kick-ass than that, as it’s a very violent action movie. It’s not “very violent” like Saw, though, but more hilariously fun, with lots of parts and blood flying around when monsters and bad witches get what they deserve. Sometimes it felt more like watching a good computer game (Doom, Carmageddon, etc.) than a movie. A lot of the fighting happened in a forest, about equally during day and night. It reminded me of live action role playing, because those games are often held in forests.

It was also a nice change that the main characters weren’t easily scared children but grown-ups with more attitude. I find that very pleasant and welcome. Not a very deep or serious movie in the end, but the relationship that Hansel began to form during the story touched me in a way anyway.

Ruby on Rails

Today I ported a blog engine I made from PHP to Ruby, as if it wasn’t enough to code it myself in the first place in PHP earlier this year. That was when I had had enough of WordPress. It was my first time coding in Ruby, but it turned out to be a fantastic language, and as my friend Namochan says, many who try it fall in love with it. The language is full of tricks that seem to come naturally, and they enable you to write cleaner looking code than I find possible with languages like PHP. It seems even more logical to me than Python, which is another great, relatively new language. Some of the power of Ruby comes from gems that you can install for your project. They are libraries or packages that bring helpful features; for example, with RMagick one can get information about picture files and manipulate them with just a few lines of code.

code

With the Rails framework, Ruby becomes more powerful for making web sites like this blog. That’s because of the Model-View-Controller pattern. To put it simply, the HTML page, the data structure models and the actual complicated code are in separate files, so each of them stays simple, clean and easy. Rails also defaulted to an SQLite3 database, which I had already used before. It is a simple file-based database that doesn’t require a server. Taking it into use was as simple as copying my existing database file over the default one and generating a model for the tables. Well, the only table, which is used for tagging blog articles.
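
To show what “file-based and no server” means in practice, here is a tiny sketch using the SQLite module from Python’s standard library rather than Rails. The table layout (`tags` with `article_id` and `tag` columns) is invented for the example and is not the actual schema of this blog.

```python
import sqlite3


def open_tag_db(path=":memory:"):
    """Open (or create) the database. A real blog would pass a file name;
    the default keeps everything in memory for experimenting."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS tags ("
        "article_id INTEGER NOT NULL, tag TEXT NOT NULL)"
    )
    return db


def articles_with_tag(db, tag):
    """Return the ids of all articles carrying the given tag."""
    rows = db.execute(
        "SELECT article_id FROM tags WHERE tag = ? ORDER BY article_id",
        (tag,),
    )
    return [row[0] for row in rows]
```

The whole database is one ordinary file, which is why moving it into a new project can be as simple as copying that file.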

Some links:
– Ruby language: http://www.ruby-lang.org/
– Rails framework: http://rubyonrails.org/

EDIT: I have since reverted to WordPress, once it began to seem more mature and had gotten many useful add-ons. Since then I have also used Ruby in professional work for two years, and the language itself has been good. The screenshot is from an editor called Sublime Text, which is my personal favorite.

 

SSH without typing the password

There is a way to avoid typing your password every time you use SSH to connect to some host, which greatly speeds up your life, especially on mobile devices with slow touch screens. It’s not a new or special trick, but not everyone is aware of it.

You can generate an RSA key pair with OpenSSH on the client computer, simply with the command “ssh-keygen -t rsa” (don’t give a passphrase). Then the contents of .ssh/id_rsa.pub can be appended to the remote host’s .ssh/authorized_keys file. After this you’re not asked for a password again.
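
The append step is where small mistakes tend to happen, such as pasting the same key twice or leaving the file readable by others, which OpenSSH may refuse to accept. Below is a minimal Python sketch of that step, meant to be run on the remote host. The function name is made up. OpenSSH also ships an `ssh-copy-id` command that does the same job from the client side.

```python
import os
from pathlib import Path


def authorize_key(public_key_line, ssh_dir):
    """Append one public key line to <ssh_dir>/authorized_keys.

    Returns True if the key was added, False if it was already there.
    """
    ssh_dir = Path(ssh_dir)
    ssh_dir.mkdir(mode=0o700, exist_ok=True)
    auth_file = ssh_dir / "authorized_keys"
    key = public_key_line.strip()

    old_text = auth_file.read_text() if auth_file.exists() else ""
    if key in old_text.splitlines():
        return False  # already authorized, nothing to do

    with auth_file.open("a") as f:
        if old_text and not old_text.endswith("\n"):
            f.write("\n")  # do not glue the key onto the previous line
        f.write(key + "\n")
    os.chmod(auth_file, 0o600)  # keep the file private to its owner
    return True
```

Typical use would be `authorize_key(open("id_rsa.pub").read(), os.path.expanduser("~/.ssh"))` after copying the public key file over to the remote host.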

It’s a bit less secure, but in case someone steals your computer you can always remove the key from the authorized_keys file on the target computer.