Document Management Systems (DMS) incorporating Optical Character Recognition (OCR) are the bridge between the physical world and your digital one, so having a good understanding of the power and limitations behind this sometimes-mysterious “paperless” technology is the best way to get the most out of whatever DMS you decide to operate.
The concept is simple: your business has paper documents that your staff need instant access to during calls or daily administration tasks.
This includes documents like delivery notes, printed quotations and packing lists. These are rarely managed entirely digitally, as that would require all customers and suppliers to switch to a common platform, and while we are all (slowly) getting there, lots of situations still require the speed and convenience of paper copies. Of course, there’s sometimes just no substitute for a paper copy of a document; in fact, hard copies are often a legal requirement.
That being said, paper is fragile and messy. Working solely from clunky cabinets is slow at best, and unfeasible at worst. Even if you are highly organised and extra careful with your filing systems, your business will eventually reach the size where heading to the filing room for every invoice check brings productivity to a frustrating crawl.
This is where Optical Character Recognition comes into its own. Computer algorithms “read” pages of scanned documents and save the content of the document to a database for intensive searches and archiving. Give the system a document number or tell it what kind of thing you’re looking for and there it is in seconds!
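As a rough illustration of the "save the content to a database, then search it" idea, here is a minimal sketch using Python's built-in SQLite module. It is not how OCRchive is implemented: the table layout, the function names `build_index` and `search`, and the sample page text are all hypothetical, and the OCR step itself is assumed to have already produced the text.

```python
import sqlite3


def build_index(pages):
    """Store (doc_number, page, content) rows in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (doc_number TEXT, page INTEGER, content TEXT)"
    )
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?)", pages)
    conn.commit()
    return conn


def search(conn, term):
    """Return distinct document numbers whose page text contains the term.

    SQLite's LIKE is case-insensitive for ASCII letters by default.
    """
    rows = conn.execute(
        "SELECT DISTINCT doc_number FROM pages "
        "WHERE content LIKE ? ORDER BY doc_number",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


# Hypothetical OCR output; in practice this text comes from the OCR engine.
sample_pages = [
    ("DN-1001", 1, "Delivery note for 12 pallets of waterproof jackets"),
    ("DN-1001", 2, "Signed by warehouse on receipt"),
    ("QU-2044", 1, "Quotation for estate fencing repairs"),
]

conn = build_index(sample_pages)
print(search(conn, "jackets"))
```

A production system would more likely use a dedicated full-text index than a plain substring match, but the flow is the same: text goes in once at scan time, and every later lookup is a query rather than a trip to the filing room.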
We understand that knowing your tools inside-out is the only way to ensure successful deployment of any system. Our years of experience with numerous off-the-shelf solutions left us disappointed with everything we saw. No one solution we came across did everything our customers required, or it did far too much and was prohibitively expensive, bloated and slow... so we did what any self-respecting software company would do and built our own.
OCRchive is our web-based DMS indexing platform, which accepts many common digital file formats (including Microsoft Office documents and JPEG files) as well as PDFs presented by office scanners. Load your documents into the scanner or drag them from your desktop and within minutes these documents start appearing on the system for other users in your organisation to view, download and distribute.
Since 2015, OCRchive has scanned over 2.4 million documents for customers in Scotland and England. From estate-management organisations to survival clothing companies, we have in every case tailored the solution to the exact requirements of the customer. With our built-in API, we can even interface OCRchive with your existing ERP systems if you require!
There’s an often-underestimated factor in scanning, though: things that seem innocuous to us may actually confuse a computer. Smudges on the paper, diagonal scans, coffee stains, poor scan resolution and ham-fisted delivery drivers scribbling on your perfectly pristine print-outs will all break the OCR process to varying degrees. OCR may seem like magic but there are limitations. Poor quality in; poor quality out. That’s why our system introduction will include hands-on training with the software and coaching for the staff handling your documents so you get the absolute best from OCRchive. We want to help you get it right, first time, every time.
Our general recommendations are always:
- Ensure good scanning quality from the scanning device.
- Use correct feeder placement and orientation.
- Avoid handwritten markings on the paper. Write in the margin and in red or green ink if absolutely necessary.
- Perform one scan per document – multi-page documents are fully supported by OCRchive.
- Avoid scanning folded or crumpled papers.
As the industry adage goes, “You can’t manage what you don’t measure”, and this is where OCRchive is very different to other DMSs. Every document loaded into the application is intelligently reviewed and assigned a scan-score which indicates the read quality of the document.
If the score is in the green, your document has an excellent search rating. If the score is in the red, the user is alerted to the low-quality scan and can either assign their own document references manually for searching later or re-scan the document under better conditions. Instead of blindly sending your documents down a black hole with no feedback as other DMSs do, this unique process gives you the transparency and detail you need to monitor the real-world administrative processes at the receiving end to keep record accuracy as high as possible.
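To make the idea of a scan-score concrete, here is a minimal sketch of one way such a score could be computed. It assumes the OCR engine reports a confidence between 0 and 100 for each recognised word and that negative values mean "no reading". The averaging rule, the `GREEN_THRESHOLD` value and the function names are hypothetical illustrations, not OCRchive's actual scoring method.

```python
from statistics import mean

# Hypothetical cut-off between "green" and "red"; not OCRchive's real value.
GREEN_THRESHOLD = 70.0


def scan_score(word_confidences):
    """Average the per-word confidences (0-100), ignoring negative entries.

    A page with no usable words scores 0.
    """
    usable = [c for c in word_confidences if c >= 0]
    return mean(usable) if usable else 0.0


def rating(score):
    """Map a numeric score onto the green/red bands described above."""
    return "green" if score >= GREEN_THRESHOLD else "red"


clean_page = [96, 91, 88, 94]       # crisp, straight scan
stained_page = [40, 60, -1, 35]     # smudged scan with one unreadable box

print(rating(scan_score(clean_page)))
print(rating(scan_score(stained_page)))
```

Whatever the exact formula, the useful property is that a poor scan is flagged at the moment it is loaded, while the paper is still in someone's hand and can be re-scanned.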
Thanks to the powerful open-source Tesseract engine (long sponsored by Google) at the core of OCRchive, our search abilities are exceptional, with OCRchive facilitating “fuzzy” searching for things that just look similar (O’s instead of zeros, 1’s instead of I’s, etc.). Coupled with instant document previews and concise advanced filtering options, searching for important documents will typically only take seconds, even for an inexperienced user.
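One common way to get this kind of look-alike matching is to fold easily confused characters onto a single form before comparing. The sketch below shows that approach for the two confusions mentioned above (O/0 and I/1). It is an illustration under that assumption rather than a description of OCRchive's search internals, and the names `CONFUSABLES`, `normalise` and `fuzzy_find` are hypothetical.

```python
# Fold look-alike characters onto one canonical form (assumed mapping).
CONFUSABLES = str.maketrans({"O": "0", "I": "1"})


def normalise(text):
    """Upper-case the text, then replace O with 0 and I with 1."""
    return text.upper().translate(CONFUSABLES)


def fuzzy_find(query, references):
    """Return the references that contain the query once both are normalised."""
    key = normalise(query)
    return [ref for ref in references if key in normalise(ref)]


# Hypothetical document references as an OCR engine might have read them.
refs = ["INV-2O15-001", "INV-2015-0I7", "PO-7781"]

print(fuzzy_find("inv-2015", refs))
print(fuzzy_find("P0-7781", refs))
```

Because both the query and the stored reference pass through the same normalisation, it does not matter whether the misread happened at scan time or the typo happened at search time; either way the two sides meet in the middle.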