How to Build Tesseract OCR Library on Windows

Previously, I shared an article, Making an Android OCR Application with Tesseract. This time, I’d like to share how to build the Tesseract OCR library with Microsoft Visual Studio 2008 on Windows.

Building Tesseract

I tried several ways to set up the build environment and concluded that the most convenient one is to use the installer.

Download

Installation

Follow the installation steps and check the option Tesseract development files:

[Screenshot: install_tesseract_ocr]

Building

After finishing the installation, find the Visual Studio project folder:

[Screenshot: tesseract_ocr_project]

Here are all the relevant libraries that need to be linked when building the OCR library:

[Screenshot: tesseract_ocr_lib]

In Visual Studio 2008, open the project and build it. The Debug and Release builds produce libtesseract302d.dll and libtesseract302.dll, respectively.
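To confirm that both builds produced their DLLs, a small script can search the project folder for the two file names. This is only a minimal sketch: the function name find_missing_outputs and the idea of searching recursively are my own assumptions, since the exact output folder depends on the project settings. Only the two DLL names come from the build described above.

```python
from pathlib import Path

# Output names of the Debug and Release builds, as reported above.
EXPECTED_DLLS = ("libtesseract302d.dll", "libtesseract302.dll")


def find_missing_outputs(build_root, expected=EXPECTED_DLLS):
    """Return the expected file names not found anywhere under build_root.

    Hypothetical helper: the search is recursive because the output
    folder is not fixed by this sketch.
    """
    root = Path(build_root)
    # Compare lower-cased names, since Windows file names are case-insensitive.
    found = {p.name.lower() for p in root.rglob("*") if p.is_file()}
    return [name for name in expected if name.lower() not in found]
```

Calling find_missing_outputs with the path of the Visual Studio project folder returns an empty list when both DLLs are present, and otherwise lists the names that still need to be built.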

If you read the README file, note the following paragraph:

Dependencies and Licenses
=========================

Leptonica is required. (www.leptonica.com). Tesseract no longer compiles
without Leptonica.
Libtiff is no longer required as a direct dependency.

Let’s take a closer look at Leptonica.

Building Leptonica

liblept (Leptonica), written in C, is an open source library for image processing. It supports several file formats, including JPEG, PNG, TIFF, and GIF.

Download

  • Source code, Visual Studio project, header files and relevant libraries: leptonica-1.68.

Building

Unpack all the packages and arrange the build folder as follows:

BuildFolder\
  include\
  leptonica-1.68\
  lib\

BuildFolder\leptonica-1.68 contents:

config\                    Not used for Windows builds
prog\                      Regression tests, examples, utilities
src\                       Source files for liblept
vs2008\                    Visual Studio 2008 specific files
 DLL Debug\                 liblept DLL Debug build output
 DLL Release\               liblept DLL Release build output
 LIB Debug\                 liblept LIB Debug build output
 LIB Release\               liblept LIB Release build output
 prog_projects\             Projects for prog programs
  ioformats_reg\             Sample project for prog\ioformats_reg.exe
   DLL Debug\                 DLL Debug build output for sample project
   DLL Release\               DLL Release build output for sample project
   LIB Debug\                 LIB Debug build output for sample project
   LIB Release\               LIB Release build output for sample project
   ioformats_reg.vcproj       The ioformats_reg project file
 leptonica.sln              The Leptonica solution file
 leptonica.vcproj           The Leptonica project file
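Before opening the solution, it can help to check that the unpacked files ended up in the expected places. The sketch below is an illustration, not part of the Leptonica package: check_build_folder is a hypothetical helper, and it assumes the layout shown above (include, leptonica-1.68 and lib under the build folder, with leptonica.sln inside vs2008).

```python
from pathlib import Path

# Relative paths taken from the folder layout described above.
REQUIRED_DIRS = ("include", "leptonica-1.68", "lib")
SOLUTION_FILE = Path("leptonica-1.68") / "vs2008" / "leptonica.sln"


def check_build_folder(build_folder):
    """Return a list of problems found in build_folder (empty if none)."""
    root = Path(build_folder)
    problems = []
    for name in REQUIRED_DIRS:
        if not (root / name).is_dir():
            problems.append("missing folder: " + name)
    if not (root / SOLUTION_FILE).is_file():
        problems.append("missing solution file: " + SOLUTION_FILE.as_posix())
    return problems
```

An empty result means the folder matches the layout; any other result names what still has to be unpacked or moved.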

In Visual Studio 2008, open the project and build it. The Debug and Release builds produce liblept168d.dll and liblept168.dll, respectively.
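The same build can probably be driven from a command prompt instead of the IDE, using the /Build switch of devenv. The sketch below only assembles and runs that command line. Two things are assumptions on my part: that devenv.com is on the PATH (for example in a Visual Studio 2008 command prompt), and that the solution configurations are named "DLL Debug" and "DLL Release", which I infer from the output folder names listed above.

```python
import subprocess


def devenv_build_command(solution, configuration, devenv="devenv.com"):
    """Return the argument list that builds one configuration of a solution."""
    return [devenv, solution, "/Build", configuration]


def build(solution, configuration):
    """Run the build and return the exit code reported by devenv."""
    return subprocess.call(devenv_build_command(solution, configuration))
```

For example, build("leptonica.sln", "DLL Release") run from the vs2008 folder would start the Release DLL build, provided the configuration name matches the one in the solution.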
