First look at OCR

A recent comment on my blog struck a nerve: the commenter said that OCR would basically put a competitor to ReceiptWallet above it. I still don’t believe that OCR is all that useful for receipts. You’re generally only entering 3 small pieces of data, so if there is even one mistake you’ve wasted time, because you have to review each entry carefully. Still, I took a look at an open source OCR package. The code is a bit rusty, but there has been some recent work on it.

My first test was a Rite-Aid receipt, where I was looking to see if it could read 3 pieces of data: the merchant name, date, and total. It failed on the merchant name because it was a graphic; however, it picked up the date and total cleanly enough that I could parse the data and grab what I needed. I then tried 2 other receipts, both from Costco, and the results were so poor that I couldn’t get anything from them. I’ll keep plugging away and testing to see if my results improve.
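To give a rough idea of what parsing the date and total out of OCR output can look like, here is a minimal Python sketch. It is an illustration only, not the code used in ReceiptWallet. The function names, the regular expressions, and the sample text are all assumptions: it supposes the OCR engine returns plain text with a US-style numeric date and a line that starts with the word TOTAL.

```python
import re
from typing import Optional

# Assumed formats: dates like 03/14/2008 or 3-14-08, and a line that
# begins with "TOTAL" followed by an amount with two decimal places.
DATE_RE = re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b")
TOTAL_RE = re.compile(
    r"^\s*TOTAL\b[^\d\n]*(\d+\.\d{2})",
    re.IGNORECASE | re.MULTILINE,
)


def extract_date(ocr_text: str) -> Optional[str]:
    """Return the first date-like string in the OCR text, or None."""
    match = DATE_RE.search(ocr_text)
    return match.group(0) if match else None


def extract_total(ocr_text: str) -> Optional[str]:
    """Return the amount on the last line starting with TOTAL, or None.

    Anchoring at the start of the line keeps SUBTOTAL lines from matching.
    """
    amounts = TOTAL_RE.findall(ocr_text)
    return amounts[-1] if amounts else None


if __name__ == "__main__":
    # Made-up sample text, not output from a real receipt.
    sample = (
        "Store #1234\n"
        "03/14/2008 10:42 AM\n"
        "SUBTOTAL 12.34\n"
        "TAX 1.03\n"
        "TOTAL $13.37\n"
    )
    print(extract_date(sample), extract_total(sample))
```

Both functions return None when nothing matches, which is the case that matters most here: if the OCR output is too garbled to parse, the user should be asked to type the value in rather than be handed a guess.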

In addition, I put in a request for a quote for a commercial OCR engine. However, I suspect that it will be cost-prohibitive. If it costs $5,000-$10,000 upfront plus a per-copy licensing fee, I can’t afford that, as it would completely wipe out any profit unless I significantly increased the price of ReceiptWallet.

If anyone has more information on OCR engines for the Mac (commercial or open source), please let me know.

3 Replies to “First look at OCR”

  1. Well, you do have a point. Same with speech recognition, though: it doesn’t work 100% of the time, but it does save a lot of time. The only thing I’ve had trouble with in OCR is logos and odd fonts.
