VintaSoft OCR .NET Plug-in - Overview

VintaSoft OCR .NET Plug-in is an add-on for VintaSoft Imaging .NET SDK that recognizes text in images and saves the recognition results to a text file or a searchable PDF document.


General features

  • The plug-in core is a pure .NET library written in C#
  • The plug-in uses code from Google's open-source Tesseract OCR engine, which is written in C++
  • AnyCPU, x86 and x64 mode support
  • Clean up the document image before text recognition (requires VintaSoft Document Cleanup .NET Plug-in)
    • Detect inverted text in document image and invert it
    • Remove halftones from document image
    • Clear border in document image
    • Remove hole punches in document image
    • Despeckle document image
    • Deskew document image
    • Detect text orientation
  • Segment document image before text recognition
    • Detect segments on the document image automatically (requires VintaSoft Document Cleanup .NET Plug-in)
    • Specify segments on the document image in code
    • Select segments on the document image with the mouse in Image Viewer, using the RecognitionRegionEditorTool class from OCR Demo
  • Specify the text recognition parameters
    • Specify the language for the whole document or for each recognition region
    • More than 60 languages are supported, including English, German, French, Spanish, Portuguese, Russian, Italian, Dutch, Chinese and Arabic. The full list of supported languages can be found here.
    • Specify a "white list" of characters to which recognition is restricted
  • Recognize text from image or image collection
    • Recognize text from image collection, the whole image or image region
    • Full Unicode support
    • Indicate progress of text recognition
    • Cancel text recognition if necessary
    • Obtain the recognition result in a hierarchy: document, page, region, paragraph, line, word, character
    • Report confidence, location, orientation, text and writing direction of each recognized paragraph, line, word and character
  • Edit text recognition results
    • Edit text recognition results from the code
    • Edit text recognition results with the mouse in Image Viewer, using the OcrResultEditorTool class from OCR Demo
    • Import text recognition results from hOCR format
  • Save text recognition results
    • Save recognition results as text or formatted text
    • Save recognition results to a searchable PDF document, either as visible text or as hidden text under the rasterized image (requires VintaSoft PDF .NET Plug-in)
    • Export text recognition results to hOCR format
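
The result hierarchy and the hOCR export listed above can be pictured with a small, language-neutral sketch. The Python below is not the plug-in's API: the `Word`, `Line` and `Page` types, the 0 to 100 confidence scale, and the helper functions are all hypothetical names chosen for illustration. The hierarchy is also trimmed to page, line and word, leaving out the document, region, paragraph and character levels. Only the hOCR class names (`ocr_page`, `ocr_line`, `ocrx_word`) and the `bbox` and `x_wconf` properties come from the hOCR format itself.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List, Tuple

# Bounding box as (left, top, right, bottom) in image pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Word:  # hypothetical type, not a VintaSoft class
    text: str
    box: Box
    confidence: float  # assumed scale: 0 (worst) to 100 (best)


@dataclass
class Line:  # hypothetical type
    box: Box
    words: List[Word] = field(default_factory=list)


@dataclass
class Page:  # hypothetical type
    box: Box
    lines: List[Line] = field(default_factory=list)


def page_text(page: Page) -> str:
    """Join words with spaces and lines with newlines."""
    return "\n".join(" ".join(w.text for w in line.words) for line in page.lines)


def mean_confidence(page: Page) -> float:
    """Average word confidence for the page (0.0 if the page has no words)."""
    scores = [w.confidence for line in page.lines for w in line.words]
    return sum(scores) / len(scores) if scores else 0.0


def _bbox(box: Box) -> str:
    """Format a box as an hOCR 'bbox' property."""
    return "bbox {} {} {} {}".format(*box)


def page_to_hocr(page: Page) -> str:
    """Serialize one page as an hOCR fragment (page, line and word elements only)."""
    out = [f"<div class='ocr_page' title='{_bbox(page.box)}'>"]
    for line in page.lines:
        out.append(f" <span class='ocr_line' title='{_bbox(line.box)}'>")
        for w in line.words:
            title = f"{_bbox(w.box)}; x_wconf {round(w.confidence)}"
            out.append(f"  <span class='ocrx_word' title='{title}'>{escape(w.text)}</span>")
        out.append(" </span>")
    out.append("</div>")
    return "\n".join(out)


# Example: one page with a single two-word line.
page = Page(box=(0, 0, 200, 50), lines=[
    Line(box=(10, 10, 150, 30), words=[
        Word("Hello", (10, 10, 60, 30), 96.0),
        Word("world", (70, 10, 150, 30), 88.0),
    ]),
])
print(page_text(page))
print(mean_confidence(page))
print(page_to_hocr(page))
```

Keeping the location and confidence on every node is what makes both outputs possible from the same structure: plain text needs only the `text` fields, while hOCR and a searchable PDF text layer also need each word's position on the page.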

Development and Deployment requirements

  • Development requirements:
    • VintaSoft Imaging .NET SDK
    • Development environments: Microsoft Visual Studio 2005, 2008, 2010, 2012, 2013, 2015, 2017
    • Programming languages: VB.NET, C#, any .NET compatible language
    • Development platforms: .NET, WinForms, WPF, ASP.NET WebForms, ASP.NET MVC
  • Deployment requirements:
    • VintaSoft Imaging .NET SDK
    • Microsoft Windows XP, Vista, 7, 8, 8.1, 10 (32-bit and 64-bit)
    • Microsoft Windows Server 2003, 2008, 2012 (32-bit and 64-bit)
    • Microsoft .NET Framework: 2.0, 3.0, 3.5, 4.0, 4.5, 4.6