Making an Android OCR Application with Tesseract

Tesseract is a well-known open source OCR engine released under the Apache License 2.0. In this tutorial, I'd like to share how to build the OCR library for Android and how to implement a simple Android OCR application with it.

Tesseract Android Tools

To build the Tesseract OCR library for Android, we can use the tesseract-android-tools provided by Google.

Get the source code:

git clone https://code.google.com/p/tesseract-android-tools/

Open the README and follow these steps:

cd <project-directory>
curl -O https://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.02.tar.gz
curl -O http://leptonica.googlecode.com/files/leptonica-1.69.tar.gz
tar -zxvf tesseract-ocr-3.02.02.tar.gz
tar -zxvf leptonica-1.69.tar.gz
rm -f tesseract-ocr-3.02.02.tar.gz
rm -f leptonica-1.69.tar.gz
mv tesseract-3.02.02 jni/com_googlecode_tesseract_android/src
mv leptonica-1.69 jni/com_googlecode_leptonica_android/src
ndk-build -j8
android update project --target 1 --path .
ant debug    # or "ant release" for a release build

Note: if you are using NDK r9, the build will fail with the following error:

format not a string literal and no format arguments [-Werror=format-security]

To solve it, open Application.mk, and add the following line:

APP_CFLAGS += -Wno-error=format-security

After a successful build, you will find classes.jar in the bin folder and the relevant *.so files in the libs folder.

If the build still fails, download jni.zip and copy all of its source code to your project folder.

Android OCR Application

Create an Android project, and import the relevant libraries.

To do OCR, we can create a class named TessOCR:

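import java.io.File;

import android.graphics.Bitmap;
import android.os.Environment;

import com.googlecode.tesseract.android.TessBaseAPI;

public class TessOCR {
	private TessBaseAPI mTess;

	public TessOCR() {
		mTess = new TessBaseAPI();
		// Tesseract looks for language data in <datapath>/tessdata/
		String datapath = Environment.getExternalStorageDirectory() + "/tesseract/";
		String language = "eng";
		// init() throws if the tessdata subfolder does not exist
		File dir = new File(datapath + "tessdata/");
		if (!dir.exists()) {
			dir.mkdirs();
		}
		mTess.init(datapath, language);
	}

	// Runs OCR on the bitmap and returns the recognized text
	public String getOCRResult(Bitmap bitmap) {
		mTess.setImage(bitmap);
		return mTess.getUTF8Text();
	}

	// Releases the native Tesseract resources
	public void onDestroy() {
		if (mTess != null) {
			mTess.end();
		}
	}
}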

The constructor makes sure that the tessdata directory exists, because init() throws an exception if it is missing. The source code of init() shows why:

public boolean init(String datapath, String language) {
	if (datapath == null) {
		throw new IllegalArgumentException("Data path must not be null!");
	}
	if (!datapath.endsWith(File.separator)) {
		datapath += File.separator;
	}

	File tessdata = new File(datapath + "tessdata");
	if (!tessdata.exists() || !tessdata.isDirectory()) {
		throw new IllegalArgumentException("Data path must contain subfolder tessdata!");
	}

	return nativeInit(datapath, language);
}
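The check in init() can be mirrored before the engine is created. The following is a minimal plain-Java sketch, not code from tesseract-android-tools: the class name TessDataPaths and the method prepare() are hypothetical names chosen for illustration. It creates the tessdata subfolder if needed and returns a data path that ends with a separator.

```java
import java.io.File;

/** Hypothetical helper for illustration; not part of tesseract-android-tools. */
final class TessDataPaths {
    private TessDataPaths() {}

    /**
     * Ensures that baseDir contains a "tessdata" subfolder (the folder that
     * init() checks for) and returns the data path to pass to init().
     */
    static String prepare(File baseDir) {
        File tessdata = new File(baseDir, "tessdata");
        // mkdirs() also returns false when the folder already exists,
        // so the directory is re-checked before giving up.
        if (!tessdata.isDirectory() && !tessdata.mkdirs() && !tessdata.isDirectory()) {
            throw new IllegalStateException("Cannot create " + tessdata);
        }
        String path = baseDir.getAbsolutePath();
        return path.endsWith(File.separator) ? path : path + File.separator;
    }
}
```

In the TessOCR constructor, this could be used as mTess.init(TessDataPaths.prepare(new File(Environment.getExternalStorageDirectory(), "tesseract")), "eng"), assuming the same folder layout as above.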

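Pretty simple! Now we can load images for OCR in three different ways: receiving an image shared by another app, picking an image from the gallery, or taking a picture with the camera.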

Receiving an image shared by another app

In AndroidManifest.xml, add the following intent filter:

<intent-filter>
	<action android:name="android.intent.action.SEND" />

	<category android:name="android.intent.category.DEFAULT" />

	<data android:mimeType="text/plain" />
	<data android:mimeType="image/*" />
</intent-filter>

In the activity, read the image URI from the Intent and decode it:

// Handle an image shared by another app
Intent intent = getIntent();
if (Intent.ACTION_SEND.equals(intent.getAction())) {
	Uri uri = (Uri) intent.getParcelableExtra(Intent.EXTRA_STREAM);
	uriOCR(uri);
}

// Decodes the image behind the URI, shows it, and runs OCR on it
private void uriOCR(Uri uri) {
	if (uri == null) {
		return;
	}
	InputStream is = null;
	try {
		is = getContentResolver().openInputStream(uri);
		Bitmap bitmap = BitmapFactory.decodeStream(is);
		// decodeStream() returns null if the data cannot be decoded as an image
		if (bitmap != null) {
			mImage.setImageBitmap(bitmap);
			doOCR(bitmap);
		}
	} catch (FileNotFoundException e) {
		e.printStackTrace();
	} finally {
		if (is != null) {
			try {
				is.close();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}
	}
}

Picking an image from the gallery

Send the Intent for picking images, and decode the returned URI in onActivityResult:

Intent intent = new Intent(Intent.ACTION_PICK, android.provider.MediaStore.Images.Media.EXTERNAL_CONTENT_URI);
startActivityForResult(intent, REQUEST_PICK_PHOTO);

Taking a picture from camera

To get a full-size image instead of a small thumbnail, pass the URI of an output file to the Intent with MediaStore.EXTRA_OUTPUT:

private void dispatchTakePictureIntent() {
		Intent takePictureIntent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
		// Ensure that there's a camera activity to handle the intent
		if (takePictureIntent.resolveActivity(getPackageManager()) != null) {
			// Create the File where the photo should go
			File photoFile = null;
			try {
				photoFile = createImageFile();
			} catch (IOException ex) {
				// Error occurred while creating the File
				ex.printStackTrace();
			}
			// Continue only if the File was successfully created
			if (photoFile != null) {
				takePictureIntent.putExtra(MediaStore.EXTRA_OUTPUT,
						Uri.fromFile(photoFile));
				startActivityForResult(takePictureIntent, REQUEST_TAKE_PHOTO);
			}
		}
}
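The snippet above calls createImageFile() without showing it. Here is one possible plain-Java sketch of such a helper. The class name ImageFiles and the storageDir parameter are assumptions made for illustration, not the original implementation. On Android, storageDir could be, for example, the folder returned by Environment.getExternalStoragePublicDirectory(Environment.DIRECTORY_PICTURES), so that the camera app can write to it.

```java
import java.io.File;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/** Hypothetical stand-in for the createImageFile() helper; for illustration only. */
final class ImageFiles {
    private ImageFiles() {}

    /** Creates an empty, uniquely named .jpg file inside storageDir. */
    static File createImageFile(File storageDir) throws IOException {
        if (!storageDir.isDirectory() && !storageDir.mkdirs() && !storageDir.isDirectory()) {
            throw new IOException("Cannot create " + storageDir);
        }
        String timeStamp = new SimpleDateFormat("yyyyMMdd_HHmmss", Locale.US).format(new Date());
        // createTempFile() inserts a random component between prefix and
        // suffix, so two photos taken in the same second get distinct names.
        return File.createTempFile("JPEG_" + timeStamp + "_", ".jpg", storageDir);
    }
}
```

Keep the path of the returned file (for example in a field), because it is needed to load the photo for OCR once the camera activity returns.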

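Before running the Android OCR app, do not forget to download the relevant language data packages (for example, eng.traineddata for English) and push them to the tessdata folder on your phone storage. With the code above, that is the tesseract/tessdata/ folder under the external storage directory.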
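A quick check before calling init() can report missing language data. The sketch below is plain Java under stated assumptions: the class LanguageData and its method missing() are hypothetical names, each language is assumed to be stored as <code>.traineddata inside tessdata, and the language string may join several codes with "+" (such as "eng+deu"), as Tesseract accepts.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of tesseract-android-tools. */
final class LanguageData {
    private LanguageData() {}

    /**
     * Returns the language codes from the language string whose
     * <code>.traineddata file is not present in <dataPath>/tessdata.
     */
    static List<String> missing(File dataPath, String language) {
        File tessdata = new File(dataPath, "tessdata");
        List<String> result = new ArrayList<>();
        for (String code : language.split("\\+")) {
            if (!new File(tessdata, code + ".traineddata").isFile()) {
                result.add(code);
            }
        }
        return result;
    }
}
```

If the returned list is not empty, the app can show a message that names the missing files instead of attempting OCR.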

Source Code

https://github.com/yushulx/android-tesseract-ocr