A recent comment on my blog struck a nerve: the commenter said that OCR would basically put a competitor to ReceiptWallet above it. I still don’t believe that OCR is all that useful for receipts (you’re generally only entering 3 small pieces of data, so if there is even one mistake you’ve wasted time, because you have to review each entry carefully), but I took a look at an open source OCR package. The code is a bit rusty, though there has been some recent work on it.
My first test was a Rite-Aid receipt, where I was looking to see if it could read 3 pieces of data: the merchant name, date, and total. It failed on the merchant name because it was a graphic; however, it picked up the date and total in such a way that I could parse the data and grab what I needed. I then tried 2 other receipts, both from Costco, and the results were completely miserable; I couldn’t get anything from them. I’ll keep plugging away and testing to see if my results get better.
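For anyone curious what "parse the data and grab what I needed" might look like, here is a rough sketch in Python. It is only an illustration: the sample text is made up rather than real OCR output, and the function name `parse_receipt_text` and both patterns (a US-style date and a line beginning with TOTAL) are assumptions, not code from ReceiptWallet or from any OCR package.

```python
import re
from datetime import datetime
from decimal import Decimal

# Assumed formats (hypothetical): dates like 3/14/08 or 03/14/2008,
# and a total on its own line that starts with the word TOTAL.
DATE_RE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4}|\d{2})\b")
TOTAL_RE = re.compile(
    r"^\s*TOTAL\b[^\d\n]*(\d+\.\d{2})\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def parse_receipt_text(text):
    """Return (date, total) found in OCR text; either may be None."""
    date = None
    match = DATE_RE.search(text)
    if match:
        month, day, year = match.groups()
        fmt = "%m/%d/%y" if len(year) == 2 else "%m/%d/%Y"
        try:
            date = datetime.strptime(f"{month}/{day}/{year}", fmt).date()
        except ValueError:
            # OCR misread something, e.g. month 13; treat as not found.
            date = None

    total = None
    match = TOTAL_RE.search(text)
    if match:
        total = Decimal(match.group(1))

    return date, total


# Made-up sample text, standing in for what an OCR engine might return.
sample = """03/14/2008 10:22 AM
SUBTOTAL   11.62
TAX         0.85
TOTAL      12.47
"""

date, total = parse_receipt_text(sample)
print(date, total)
```

Anchoring the total pattern to the start of a line keeps it from matching SUBTOTAL, and anything that fails to parse comes back as `None`, so the entry can be flagged for manual review instead of being silently guessed.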
In addition, I put in a request for a quote for a commercial OCR engine. However, I suspect that it will be cost-prohibitive. If it costs $5,000-$10,000 upfront plus a per-copy licensing fee, I can’t afford it, as it would completely wipe out any profit unless I significantly increased the price of ReceiptWallet.
If anyone has more information on OCR engines for the Mac (commercial or open source), please let me know.
I do see their point; hence http://www.neatreceipts.com. The only thing is you can’t buy the software alone; you have to buy it with the scanner.
If the OCR doesn’t work 100% of the time, how much time does it really save?
Well, you do have a point. Same with speech recognition, though: it doesn’t work 100% of the time, but it does save a lot of time. The only thing I’ve had trouble with in OCR is logos and odd fonts.