Imaging 101

1. Terminology

a. Pixel

The word pixel is a contraction of pix (“pictures”) and el (for “element”). In digital imaging, a pixel is a single point in a raster image, or the smallest addressable element in a display device.

A pixel is generally thought of as the smallest single component of a digital image.

b. Resolution

The term resolution can refer to any of the following:

i. Device resolution

The resolution of an output device such as a monitor or printer is measured by the number of individual dots that can be placed in a line within the span of 1 inch (2.54 cm), or in short, dots per inch (DPI).

ii. Screen resolution

The display resolution of a digital television or display device is the number of distinct pixels in each dimension that can be displayed.

Pixel count of width by height

When pixel counts are referred to as resolution, the convention is to describe the pixel resolution with a pair of positive integers, where the first number is the number of pixel columns (width) and the second is the number of pixel rows (height), for example 640 by 480.

For example, a PC screen resolution can be set to 1280 × 720.

Total number of pixels in one image

Another popular convention is to cite resolution as the total number of pixels in the image, typically given as number of megapixels, which can be calculated by multiplying pixel columns by pixel rows and dividing by one million.
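For example, an image of 3,000 × 2,000 pixels has a resolution of 6 megapixels. Note that megapixels (a pixel count) are not the same as megabytes (MB, a file size).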
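
The megapixel calculation described above can be sketched in a few lines of Python. The function name `megapixels` is only an illustrative choice, not part of any library.

```python
def megapixels(width_px: int, height_px: int) -> float:
    """Return the total pixel count of an image in megapixels.

    Multiplies pixel columns by pixel rows and divides by one million.
    """
    return width_px * height_px / 1_000_000


# The two screen resolutions mentioned in the text.
print(megapixels(640, 480))    # 0.3072
print(megapixels(1280, 720))   # 0.9216
```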

iii. Image resolution

For a digital image, resolution is expressed as pixel density: the number of pixels per unit of length or area, most commonly pixels per inch (PPI) or pixels per square inch.

For a given physical size, a higher PPI means more pixels and therefore finer detail.

iv. Scanning resolution

As mentioned, DPI refers to the physical dot density of an image when it is reproduced as a real physical entity, for example printed onto paper or displayed on a monitor. A digitally stored image has no inherent physical dimensions in inches or centimeters, so it has no DPI. Its density is described with PPI instead.

In the case of scanned images, the digital file format records the DPI value used for the scan, which reflects the size of the original scanned object, as the image's PPI (pixels per inch) value.

The value can be used when printing the image. It lets the printer know the intended size of the image.

For example, a bitmap image may measure 1,000 × 1,000 pixels, a resolution of 1 megapixel. If it is labeled as 250 PPI, that is an instruction to the printer to print it at a size of 4 × 4 inches. Changing the PPI value would not change the size of the image in pixels, which would still be 1,000 × 1,000.
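
The relationship between pixel dimensions, PPI and printed size can be illustrated with a minimal Python sketch. The helper name `print_size_inches` is hypothetical and chosen only for this example.

```python
def print_size_inches(width_px: int, height_px: int, ppi: float) -> tuple:
    """Return the intended print size (width, height) in inches.

    The pixel dimensions stay fixed; only the PPI label decides
    how large the image is meant to be on paper.
    """
    return (width_px / ppi, height_px / ppi)


# The 1,000 x 1,000 pixel bitmap from the text, labeled as 250 PPI.
print(print_size_inches(1000, 1000, 250))  # (4.0, 4.0)

# Relabeling the same pixels as 100 PPI changes only the print size.
print(print_size_inches(1000, 1000, 100))  # (10.0, 10.0)
```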

c. Bit Resolution, Bit Depth & Pixel Type

Color depth or bit depth is the number of bits used to indicate the color of a single pixel in a bitmapped image. This concept is also known as bits per pixel (bpp), particularly when specified along with the number of bits used. It measures how many colors, or shades of gray, an image can contain, and it is determined by how many bits of data are used to store color data for each pixel.

As the number of bits used to store color information increases, the number of colors that can be represented grows exponentially: each additional bit doubles it, so n bits give 2^n colors.

A one-bit image uses a single bit of data for each pixel, creating an image made up of only two colors, usually black and white. An eight-bit (or one-byte) image has 256 (2^8) possible colors, which might be 256 shades of gray in a grayscale image or a limited 256-color palette in an image saved in the GIF file format.

A 24-bit (or three-byte) image has a total of 16,777,216 possible colors. Most 24-bit images are made up of three eight-bit channels as part of the RGB color model: 256 shades of red, 256 shades of green, and 256 shades of blue, for a total of 16,777,216 color combinations (256 × 256 × 256, or 2^24). A bit depth of 24 bits or higher is also known as Truecolor (called Millions on a Macintosh) because it represents a significant portion of the range of colors visible to the human eye.

Higher color depth gives a broader range of distinct colors.

1-bit color (2^1 = 2 colors) – often black and white

2-bit color (2^2 = 4 colors) – gray-scale

24-bit color (2^24 = 16,777,216 colors) – true color; the human eye is capable of discriminating among as many as ten million colors.
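
The color counts listed above follow directly from the bit depth. As a small Python sketch (the function name `color_count` is an illustrative assumption):

```python
def color_count(bits_per_pixel: int) -> int:
    """Return how many distinct colors a given bit depth can represent."""
    return 2 ** bits_per_pixel


for bits in (1, 2, 8, 24):
    print(f"{bits}-bit color: {color_count(bits):,} colors")

# Three 8-bit channels (red, green, blue) combine to the 24-bit total.
assert color_count(8) ** 3 == color_count(24)
```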

Introduction to Dynamsoft OCR SDK

1. Introduction to OCR add-on

a. What is OCR

OCR (optical character recognition) is software that recognizes text in an image and converts it into machine-readable text, which can then be read and edited normally. For example, one might take a picture of a car’s license plate, and OCR software could then read the text from the picture into a Word document. OCR is implemented through a complex system of trained pattern recognition, which can also recognize fonts and formatting. Modern OCR is accurate enough to be practical in a wide variety of areas, and it is constantly being improved through training and artificial intelligence.

b. The power of modern OCR applications

Computer OCR has been in development for over 60 years. In its most primitive form, it was able to recognize most letters of the English alphabet. Today, OCR is very powerful: software is available that supports almost all languages in use with reasonable accuracy, and it keeps improving.

In many cases, the quality of recognition is dependent on the quality of the image. The ideal image is one that has a plain background with a minimal amount of spots and artifacts. However, modern OCR applications are also powerful enough to detect anomalies and ignore them in processing. Wordlist data is also used to reduce mistakes, as processed words can be compared to dictionary words.
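
The wordlist idea can be sketched in Python with the standard library's `difflib` module. This is only an illustration of comparing recognized words against a dictionary, not how any particular OCR engine implements it. The word list, the sample OCR output and the `correct_word` helper are all hypothetical.

```python
import difflib

# Hypothetical dictionary; a real system would load a full word list.
WORDLIST = ["invoice", "number", "total", "amount", "date"]


def correct_word(word: str, wordlist=WORDLIST, cutoff: float = 0.7) -> str:
    """Return the closest dictionary word, or the word unchanged.

    A word is replaced only if a dictionary entry is at least
    `cutoff` similar to it (0.0 to 1.0, per difflib's ratio).
    """
    lowered = word.lower()
    if lowered in wordlist:
        return word
    matches = difflib.get_close_matches(lowered, wordlist, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# Made-up OCR output with typical confusions: i/l, m/rn, l/1.
raw_words = ["lnvoice", "nurnber", "tota1", "xyz"]
print([correct_word(w) for w in raw_words])
# ['invoice', 'number', 'total', 'xyz']
```

Words with no sufficiently similar dictionary entry, such as "xyz" here, are left as they are, which avoids replacing names or codes that are not in the word list.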

The Tesseract OCR engine is an example of a powerful modern OCR engine. It supports over 40 languages and can be trained to improve accuracy and add new languages. Tesseract is a mature engine: it originated at HP Labs in 1985 and is currently developed by Google. As an “engine”, it is the lowest-level component of an OCR system, meaning its job is to perform recognition and recognition only. To take full advantage of OCR technology and implement features such as output to complex formats, text formatting, and graphical interfaces, a more complete software package is required.

c. How can OCR be used

While the past was a world where documents were all physical, and the future is a world where documents may all be digital, the present is in a state of transition. In this transition state, physical and digital documents coexist, and it is important to have technologies like OCR to allow for conversion back and forth.

OCR is useful for a great variety of purposes, including document recovery, data entry, and accessibility. Most applications of OCR are from scanned documents, but in some cases photos are also used. OCR is an essential time saver, as in many cases the only alternative is retyping the document. Some of the ways in which OCR can be used follow:

  • Recovering editable text files from scanned documents including faxes
  • Categorizing forms based on an approximation of their handwritten contents
  • Creating searchable and editable eBooks from book scans
  • Searching and editing text from screenshot images
  • Computerized reading of books for visually impaired individuals through text-to-speech

While these are just some of the ways that OCR can be used, they show the flexibility of OCR technology in a great variety of fields. Employees in most businesses rely heavily on documents every day, so business usage is also an important focus in the development of OCR systems.

d. Business applications of OCR

Business usage of OCR generally falls within the field of data organization and input. Many businesses receive documents in a traditional printed form, such as forms that are mailed or faxed in. In other cases, some documents may only be available in written form, such as manuals or printed documents for which the original file has been long lost. Processing of these documents is much more expensive than for documents in a digital form, as they require a human to read the documents and manually categorize or record data.

Using OCR, most of this manual work is eliminated, and the document only needs to be scanned. After a document has been processed by OCR, the computer can use its data to categorize it automatically, and employees can edit and search the information. OCR is used by post offices, libraries, and offices of any kind.

Better Manage Your Documents with TWAIN

Simplify your document scanning and editing process

With the spread of digital devices (scanners, webcams) and online information sharing, more users choose to manage and store documents such as legal papers, contracts and IDs online. This addresses several problems of traditional paper management:

  • Wasted paper;
  • Time lost in delivery;
  • Difficulty sharing information with coworkers and customers;
  • Difficulty keeping papers safe, so important information may eventually be lost.

As businesses moved toward handling images and documents digitally, the TWAIN Working Group developed the TWAIN standard to regulate communication between software applications and imaging devices such as scanners and digital cameras. Since 1992, the group has continually improved the protocol, and it is now widely supported by scanners and document processing applications.

Figure: TWAIN technology diagram

TWAIN is by now a mature protocol that provides a rich set of application programming interfaces. You can use them to acquire or capture images from any TWAIN-compliant device into your system.

Scanning Customization

TWAIN allows you to fully customize the whole scanning process. You can decide whether to show the user interface of the source and let users adjust the image values by themselves.

Some systems may require standardized images with the same resolution, page size and color mode (for instance, gray). With TWAIN, you can hide the user interface to prevent end users from modifying the image properties, and then set the relevant TWAIN capabilities in code so that every image is acquired with the same properties.

Document Adjustment

TWAIN provides numerous interfaces to help adjust the images.

  • Rotate, mirror and deskew the scanned images (automatically).
  • Discard blank images. Users often insert blank pages, or pages containing barcode information, to separate different batches of papers. You can use TWAIN to detect these special pages before uploading the batches to different categories on your system.
  • Define the image layout if only a part of the image needs to be scanned.
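
To make the idea of blank-page detection concrete, here is a minimal Python sketch that treats a page as blank when almost none of its pixels are dark. It is a conceptual illustration only, not TWAIN's or any scanner driver's actual method. The `is_blank_page` helper, the thresholds and the in-memory grayscale grid are all assumptions made for this example.

```python
def is_blank_page(pixels, dark_threshold=128, max_dark_fraction=0.01):
    """Guess whether a scanned page is blank.

    `pixels` is a list of rows of 8-bit grayscale values
    (0 = black, 255 = white). The page counts as blank when the
    share of dark pixels does not exceed `max_dark_fraction`.
    """
    total = sum(len(row) for row in pixels)
    if total == 0:
        return True
    dark = sum(1 for row in pixels for value in row if value < dark_threshold)
    return dark / total <= max_dark_fraction


# A fully white 100 x 100 page.
blank = [[255] * 100 for _ in range(100)]

# The same page with a dark band standing in for printed text.
printed = [[255] * 100 for _ in range(100)]
for row in printed[40:60]:
    row[10:90] = [0] * 80

print(is_blank_page(blank))    # True
print(is_blank_page(printed))  # False
```

The small tolerance for dark pixels is there so that dust or scanner noise does not make an otherwise empty separator page count as content.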

In addition, you can handle image noise, image review and similar tasks with just a few lines of source code.

Dynamic Web TWAIN – A Document Processing Solution

A TWAIN component can capture images from any TWAIN-compliant device. The component can be an ActiveX control embedded in your desktop scanning solution or in a web application intended to run in IE.

Reports show that Chrome's market share is steadily increasing. Your customers may want to do all document processing tasks (scanning, editing, saving and uploading) in Chrome, as well as in Firefox, Safari and other browsers. That is also achievable.

Dynamic Web TWAIN is a TWAIN SDK specially optimized for web applications. With it, you can scan, edit and upload images within your favorite browser.

If you are interested, you can check out Dynamic Web TWAIN for detailed information.

 

Dynamic Web TWAIN 6.3.1 Released!

I’m glad to announce Dynamic Web TWAIN 6.3.1 was released today!

Dynamic Web TWAIN is a TWAIN scanning SDK specifically optimized for web applications. The TWAIN SDK supports scanning within all the mainstream browsers, including IE, Firefox, Chrome, Safari, Opera and more.

Figure: TWAIN supported browsers

Since the release of V6.3, the 64-bit ActiveX Edition has been widely welcomed by users. Motivated by this and committed to providing better products to our customers, our R&D team further optimized the ActiveX 64-bit edition. In addition, Dynamic Web TWAIN users will be happy to find a very handy property, LogLevel, in the new version. The property logs more exceptions and helps developers find the cause of problems more efficiently during development. Go to What’s New for more details.
