How to Empower JavaScript Barcode Scan with Tesseract.js OCR

In a previous article, I demonstrated how to use Tesseract OCR from Python to recognize the text printed alongside a 1D barcode. In this article, I switch to JavaScript and build a barcode scanning app that integrates Tesseract.js OCR.

How to Use JavaScript OCR to Recognize 1D Barcode Text

When we search for “JavaScript OCR” on Google, the first result is Tesseract.js, a JavaScript wrapper for the Tesseract OCR engine built with Emscripten (a tool that compiles C/C++ to WebAssembly).

Install Tesseract.js:

npm install tesseract.js

To get started with Tesseract.js, we can study the examples at https://github.com/naptha/tesseract.js/tree/master/examples.

However, you may find that every example takes a long time on its first run. The reason is that Tesseract.js downloads the language trained data if no local copy exists. According to tesseract.js-offline, we can download the data file manually and set a local data path so that the app works offline.

Node:

const { createWorker } = require('tesseract.js');
const path = require('path');

// Read the trained data from a local folder instead of downloading it
const worker = createWorker({
  langPath: path.join(__dirname, '..', 'lang-data'),
  logger: m => console.log(m), // print progress messages
});

(async () => {
  await worker.load();
  await worker.loadLanguage('eng');
  await worker.initialize('eng');
  // Replace 'image-path' with the path of the image to recognize
  const { data: { text } } = await worker.recognize('image-path');
  console.log(text);
  await worker.terminate();
})();
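Because the worker now depends on a local file, it can help to check that the file is in place before creating the worker. The following is a minimal sketch, not part of Tesseract.js: `findTrainedData` is a hypothetical helper, and the file names assume the default gzipped layout (`eng.traineddata.gz`) used in this article, with the uncompressed name as a fallback.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not a Tesseract.js API): returns the path of the
// local trained data file for `lang` inside `langDir`, or null if missing.
function findTrainedData(langDir, lang) {
  // Assumption: the gzipped file is the default; the plain file is a fallback.
  const candidates = [`${lang}.traineddata.gz`, `${lang}.traineddata`];
  for (const name of candidates) {
    const file = path.join(langDir, name);
    if (fs.existsSync(file)) {
      return file;
    }
  }
  return null;
}
```

Calling `findTrainedData(path.join(__dirname, '..', 'lang-data'), 'eng')` before `createWorker` and logging a clear message when it returns `null` makes a missing data file easier to diagnose than a failed language load.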

Web browser:

  <script src="../node_modules/tesseract.js/dist/tesseract.min.js"></script>
  <script>
    const { createWorker } = Tesseract;
    const worker = createWorker({
      workerPath: '../node_modules/tesseract.js/dist/worker.min.js',
      langPath: '../lang-data',
      corePath: '../node_modules/tesseract.js-core/tesseract-core.wasm.js',
      logger: m => console.log(m),
    });

    (async () => {
      await worker.load();
      await worker.loadLanguage('eng');
      await worker.initialize('eng');
      const { data: { text } } = await worker.recognize('image-path');
      console.log(text);
      await worker.terminate();
    })();
  </script>

I used the following image for the test:

[Image: codabar]

Here is the result:

(IMTANMUEARRMAD
3 1383 09602 2010

Since I’m using OCR for 1D barcodes, the expected output should contain only digits. We can filter the returned characters by setting a character whitelist:

await worker.setParameters({
  tessedit_char_whitelist: '0123456789',
});

Rerun the app:

101011191 1 1 1111  111  11 1   41 111  1  11  11 111 11 1 11
31383096022010

Although the result now contains only digits, it is still not ideal. Changing the trained data may improve the result. Visit tessdata to get data with higher OCR accuracy. After replacing the eng.traineddata.gz file, I got the expected result:

31383096022010
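As a complement to the whitelist and better trained data, the raw OCR text can also be cleaned up after recognition. The sketch below is an illustration rather than part of the original app: `digitsPerLine` is a hypothetical helper that drops every non-digit character from each line and keeps the lines that still contain digits. It is applied here to the first, unfiltered result shown above.

```javascript
// Hypothetical helper: strips every non-digit character from each OCR line
// and returns the non-empty digit strings, one entry per line.
function digitsPerLine(ocrText) {
  return ocrText
    .split(/\r?\n/)
    .map(line => line.replace(/\D/g, ''))
    .filter(digits => digits.length > 0);
}

// The first line has no digits and is dropped; the spaces in the second
// line are removed.
console.log(digitsPerLine('(IMTANMUEARRMAD\n3 1383 09602 2010'));
// prints [ '31383096022010' ]
```

This kind of filtering cannot repair misread characters, so it works best on output that is already mostly correct.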

How to Implement a JavaScript Barcode Reader

Dynamsoft compiles its C/C++ barcode library to a WebAssembly module, which makes it possible to build high-performance JavaScript barcode apps for Node.js and web browsers. Now, let’s create JavaScript barcode reader apps and use JavaScript OCR to verify the barcode results.

Node

Install the package:

npm install dynamsoft-node-barcode 

Decode barcodes in Node.js:

const path = require('path');
const image_file = path.join(__dirname, '..', 'images', 'codabar-text.png');

const Dynamsoft = require("dynamsoft-node-barcode");
// Get a free trial license from https://www.dynamsoft.com/CustomerPortal/Portal/TrialLicense.aspx
Dynamsoft.BarcodeReader.productKeys = 'LICENSE-KEY';

(async () => {
    const reader = await Dynamsoft.BarcodeReader.createInstance();
    // Decode the image and print the text of every barcode found
    for (const result of await reader.decode(image_file)) {
        console.log('Barcode result: ' + result.barcodeText);
    }
    reader.destroy();
    // Terminate the background worker so that the process can exit
    await Dynamsoft.BarcodeReader._dbrWorker.terminate();
})();

Run the app in the command-line tool:

node index.js

[Screenshot: JavaScript barcode OCR in Node.js]

Web

Install the package:

npm install dynamsoft-javascript-barcode

Read barcodes in web apps:

    <input id="image-file" type="file" accept="image/png,image/jpeg,image/bmp,image/gif">
    <div id="barcode-result"></div>
    <div id="ocr-result"></div>
    <!-- Get a free trial license from https://www.dynamsoft.com/CustomerPortal/Portal/TrialLicense.aspx -->
    <script src="../node_modules/dynamsoft-javascript-barcode/dist/dbr.js"
        data-productKeys="LICENSE-KEY"></script>

    <script>
        let reader = null;

        document.getElementById('image-file').addEventListener('change', async function () {
            try {
                // Use Dynamsoft JavaScript Barcode Reader
                reader = reader || await Dynamsoft.BarcodeReader.createInstance();
                let barcode_results = [];
                let file = this.files[0];
                let results = await reader.decode(file);
                for (let result of results) {
                    barcode_results.push(result.barcodeText);
                }

                document.getElementById("barcode-result").innerText = 'Barcode result: ' + barcode_results.join('\n');

                // Draw the input image to a canvas
                let img = new Image();
                img.onload = function () {
                    let canvas = document.createElement("canvas");
                    let old_canvas = document.querySelector('canvas');
                    if (old_canvas) {
                        old_canvas.parentNode.removeChild(old_canvas);
                    }
                    document.body.appendChild(canvas);
                    canvas.width = img.width;
                    canvas.height = img.height;

                    let ctx = canvas.getContext('2d');
                    ctx.drawImage(img, 0, 0);
                }
                img.src = URL.createObjectURL(file);
            } catch (ex) {
                alert(ex.message);
                throw ex;
            }
        });
    </script>
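The page above reserves an `ocr-result` element, but the snippet stops at drawing the image. One possible way to fill that element is sketched below. It assumes a Tesseract.js worker that has already been created, loaded, and initialized as in the earlier web browser example; `recognizeDigits` is a hypothetical helper name, while `setParameters` and `recognize` are the worker calls used earlier in this article.

```javascript
// Hypothetical helper: runs OCR restricted to digits on an image and returns
// the trimmed text. `worker` is assumed to be an initialized Tesseract.js
// worker; `image` is anything recognize() accepts (for example, a File).
async function recognizeDigits(worker, image) {
  await worker.setParameters({
    tessedit_char_whitelist: '0123456789',
  });
  const { data: { text } } = await worker.recognize(image);
  return text.trim();
}
```

Inside the `change` handler, after the barcode result is shown, the OCR result could then be displayed with `document.getElementById('ocr-result').innerText = 'OCR result: ' + await recognizeDigits(worker, file);`.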

Run the web app:

npx http-server .

[Screenshot: JavaScript barcode OCR in web browser]

It’s good to see that the JavaScript OCR result matches the barcode result. If your barcode SDK sometimes fails to read 1D barcodes, try using OCR to recognize the accompanying text.
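The comparison between the two results can also be automated. The following is a minimal sketch under the assumption that the barcode encodes digits only, as in the Codabar sample used here: `ocrConfirmsBarcode` is a hypothetical helper that reduces both strings to digits and checks whether any OCR line equals the decoded barcode text.

```javascript
// Hypothetical helper: returns true when the digits of the decoded barcode
// text equal the digits found on at least one line of the OCR output.
function ocrConfirmsBarcode(barcodeText, ocrText) {
  const barcodeDigits = barcodeText.replace(/\D/g, '');
  if (barcodeDigits.length === 0) {
    return false; // nothing to compare against
  }
  return ocrText
    .split(/\r?\n/)
    .map(line => line.replace(/\D/g, ''))
    .some(lineDigits => lineDigits === barcodeDigits);
}
```

A strict equality check is used on purpose: a partial match could hide a misread digit, which is exactly the case the verification is meant to catch.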

Source Code

https://github.com/yushulx/javascript-barcode-ocr