Computers Can Read the Writing on the Wall with Online Handwriting Recognition

Enabling computers to understand natural human input has been the goal of many manufacturers in recent decades. Extensive research has been done on both voice and handwriting recognition technologies in universities and research centers. Until 10 years ago, much of the research on handwriting recognition yielded only theoretical results. The introduction of personal computers and PDAs has expanded the consumer market, which is now ready to accept handwriting-based solutions, both because of the hardware and because adequate online handwriting recognition technology is available.
The root of online handwriting recognition is real-time data collection by way of digital sampling. The most common input devices are digitizing tablets or touch pads, where the writing is digitized and translated into a series of coordinates. Data points are collected either as two-dimensional x,y coordinates or as three-dimensional x,y,z coordinates (z = pressure). The collected raw data is later used for the recognition process. Online handwriting recognition differs from optical character recognition (OCR), in which the collected data is the output of a scanner (a bitmap). In online handwriting recognition, the dynamics of the writing are available to the recognition engine and can therefore be used to reach a more accurate interpretation.
Since natural handwriting recognition is a difficult task, different methods have been tried to achieve accurate recognition, and each approach suits certain tasks better than others. The following is a list of the most popular:
    Unistroke Characters
    This input consists of characters created out of a single ink stroke (unistroke). The shape of each character is chosen to make it as distinct as possible from the other characters. The user must learn a new handwriting style (some letters are similar to standard writing) and also follow rules describing how to use the system, such as how to shift between capital and small letters. This method is highly suitable for extremely small devices on which it is impossible to write naturally.
    Boxed Input
    The user writes in a natural handwriting style, but each character has to be entered in a predefined box layout. The written strokes are segmented into characters according to their location in the boxes. This method is usually suitable for form-filling applications.
    Natural Input
    This writing style is completely natural, as if the user is writing on a piece of paper. The ink strokes can be connected, as in cursive writing, and there are no constraints on the writing style. This method is highly suitable for massive input tasks, such as E-mail, note taking, etc.
    Command and Control
    Handwriting input can also be used as a command and control method (similar to the usage of speech). A specific handwritten symbol can be attached to an action, such as executing a macro, launching an application, performing editing operations and more. This method is highly suitable as an add-on to devices that already incorporate a touchpad (such as laptops) or within any pen-centric device.
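To make the boxed input idea concrete, here is a minimal sketch of how strokes might be assigned to characters by their location. It is an illustration only, not the method of any particular product: the function name segment_by_box, the box width of 40 tablet units, and the rule of using the horizontal centre of each stroke are all assumptions.

```python
from collections import defaultdict

BOX_WIDTH = 40  # assumed width of each input box, in tablet units


def segment_by_box(strokes, box_width=BOX_WIDTH):
    """Group strokes into characters by the box that holds each stroke's centre.

    `strokes` is a list of strokes; each stroke is a list of (x, y) points.
    Returns a list of stroke groups, ordered left to right by box index.
    """
    boxes = defaultdict(list)
    for stroke in strokes:
        xs = [x for x, _ in stroke]
        centre_x = (min(xs) + max(xs)) / 2      # horizontal centre of the stroke
        boxes[int(centre_x // box_width)].append(stroke)
    return [boxes[i] for i in sorted(boxes)]


# Invented sample data: two strokes written in the first box, one in the second.
strokes = [
    [(5, 10), (5, 30)],               # vertical bar, box 0
    [(5, 10), (20, 10)],              # horizontal bar, also box 0
    [(50, 10), (60, 30), (70, 10)],   # "V" shape, box 1
]

characters = segment_by_box(strokes)
print([len(c) for c in characters])   # number of strokes in each character
```

Because the boxes fix where each character begins and ends, the recognizer never has to guess the character boundaries, which is the main reason this style is easier to recognize than unconstrained writing.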
The most sought-after input method is totally natural input, in which the user writes as if writing on paper. Recently, some capable natural handwriting recognition systems were introduced, which can be used without constraints. Most natural online handwriting recognition engines share a common basic design, although the actual implementations vary considerably. The basic design consists of the following stages:
    Data Collection
    The data is collected in real time from the acquisition hardware. The standard data is a stream of {x, y} coordinates, sampled at almost equal time intervals. The data resolution is around 100 DPS (dots per second) and 100 DPI (dots per inch), and an increase in either or both gives a finer sampling of the writing. In addition to the coordinates, there is also an indication of whether the stylus is up or down. Data collection ends after a predefined timeout period during which no additional pen-down events occur. The collected data is usually passed through a filter that eliminates digitization noise and random ink spikes. The data is also checked for the writer's slant and rotation, and if needed, an equalization algorithm is applied.
    Feature Extraction
    The raw data (x, y coordinates) is transformed into features better suited to recognition. These features model the underlying properties of the written data, such as curvature, direction, break points, height and more. They are the groundwork for the higher levels.
    Shape Recognition
    The heart of online handwriting recognition is the ability to compare a written set of strokes (or substrokes in cursive letters) to character templates. The result is a set of candidate characters along with their associated match probabilities. The comparison is based on analyzing the shape features; more sophisticated (high-level) attributes are applied in the later stages to complete the recognition.
    Segmentation
    The segmentation process takes all of the written data and attempts to segment it into words and characters. On top of the shape-based recognition, this process also incorporates global features such as baseline, size, and other helpful statistical features. All of these features are combined using optimization techniques, which output the most probable segmented recognition results in a short time. Context-related information, such as statistical linguistics or a dictionary, is either added to the segmentation as a post filter or totally integrated with it.
    Linguistics and Dictionary
    These are additional sources of information that help to resolve conflicts between similar-looking characters. The information is usually based on statistical modeling of the language or on a language dictionary. The statistical model favors letter sequences that are expected in the language – such as preferring "ing" at the end of a word over "iny". The dictionary approach searches the dictionary for the word that most probably matches the written text.
    Training
    Training enables the user to teach the recognition system his or her individual writing style. Preferably, the training is done in an "on the fly" manner – i.e. any correction of erroneous text is also a training event.
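The interplay between shape recognition and language statistics can be sketched in a few lines. The example below is a simplified illustration, not the design of any actual engine: the candidate letters, their match probabilities, the letter-pair (bigram) probabilities, and the names score and best_reading are all invented for this sketch. It shows how a reading of "ing" can win over "iny" even when the shape of the last character alone slightly favors "y".

```python
import math
from itertools import product

# Hypothetical shape-recognition output: for each written character position,
# candidate letters with match probabilities (values invented for illustration).
shape_candidates = [
    {"i": 0.90, "l": 0.10},
    {"n": 0.80, "u": 0.20},
    {"y": 0.55, "g": 0.45},   # the shape alone slightly prefers "y"
]

# Hypothetical letter-bigram probabilities P(next | previous). A real system
# would estimate these from a large body of text in the target language.
bigram = {
    ("i", "n"): 0.30, ("i", "u"): 0.01,
    ("l", "n"): 0.01, ("l", "u"): 0.05,
    ("n", "g"): 0.20, ("n", "y"): 0.02,
    ("u", "g"): 0.03, ("u", "y"): 0.01,
}
FLOOR = 1e-4  # probability assumed for letter pairs missing from the table


def score(letters):
    """Log-probability that combines shape match and bigram statistics."""
    total = sum(math.log(shape_candidates[i][ch]) for i, ch in enumerate(letters))
    for prev, nxt in zip(letters, letters[1:]):
        total += math.log(bigram.get((prev, nxt), FLOOR))
    return total


def best_reading():
    """Try every combination of candidates and return the highest-scoring one."""
    options = product(*(c.keys() for c in shape_candidates))
    return "".join(max(options, key=score))


print(best_reading())
```

Trying every combination is only practical for very short inputs; the "optimization techniques" mentioned under Segmentation serve to find the best reading without enumerating all of them.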
Personalization and Training
Personalization of the recognition system helps to achieve high accuracy. The most important method is the ability to train the system to recognize an individual writing style. Almost every user writes a few characters in a non-conventional way, or does not distinguish easily confused characters in the way the standard writing style expects. The only way to recognize those characters is to allow some kind of training. The training has to be easy to perform, as in the "on-the-fly" training concept. It is also important that a dictionary-based system lets the user add custom words to the dictionary, thereby extending the dictionary coverage.
Adding handwriting recognition capability enhances the user interface of almost any consumer device, mainly by making the user experience familiar and comfortable. The tiny keyboards used in many devices are hard for human fingers to use. The most compelling features of adding handwriting recognition capability to a device are:
  • Can be used in noisy or quiet places. Writing is silent and is not affected by noise.
  • The device can be used on the move; there is no need to sit at a desk, as a keyboard requires.
  • The devices can be much smaller, because the keyboard is removed and the already existing pen is used for writing directly on the screen.
  • A familiar input method, just like writing with pen and paper.
Online handwriting recognition has advanced substantially over the last few years. Today's hardware delivers enough processing power at a reasonable price. Handwriting recognition technology now allows users to write naturally and still get high accuracy. Implementation possibilities are almost limitless, covering almost any consumer device that requires some kind of input – PDAs, GPS units, cellular phones and more.
Eran Aharonson is Vice President, Business Development at Advanced Recognition Technologies, Inc., and can be reached at 805 581 3999 or at http://www.artcomp.com.