This document was written in April 1998 as the first summary of the Dasher concept. The master dasher website is now at http://www.inference.org.uk/dasher/.

Arithmetic Coding Data Entry Device

Existing devices for communicating information to computers are either bulky, or slow to use, or unreliable.
Here we describe an idea for a new interface.

We call our prototype interface `dasher'. This document explains the ideas and then walks through a demonstration. It's probably best to read the explanation first, then look at the pictures. For the latest on the dasher project, see David Ward's dasher page.

Background: how fast should we be able to communicate with one finger?

Existing keyboards require the human to execute one or two gestures per character. A large (12 inch x 4 inch) physical device is required, and although all ten of the human's digits are employed, only one of them is used per gesture. The keyboard itself has the capacity to read about log2(80) = 6.3 bits per gesture, since each gesture is a selection of one key, and there are about 80 keys. But the entropy of English text or C code is probably about 1 bit per character (Shannon, 1948). So existing keyboards are inefficient by a factor of about 6. This inefficiency manifests itself in the facts that (1) most people can talk much faster than they can type; (2) when people type, they make errors which are clearly not English.
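The arithmetic behind this estimate can be checked with a few lines of Python. This is only a sketch: the figure of 80 keys and the entropy of 1 bit per character are the rough values quoted above, and the function names are made up for illustration.

```python
import math

KEYS = 80               # approximate number of keys, as quoted in the text
ENTROPY_PER_CHAR = 1.0  # bits per character, the rough estimate quoted in the text


def keyboard_bits_per_gesture(keys: int = KEYS) -> float:
    """Capacity of one keypress: selecting one key out of `keys` equally likely keys."""
    return math.log2(keys)


def inefficiency_factor(keys: int = KEYS, entropy: float = ENTROPY_PER_CHAR) -> float:
    """Ratio of what the keyboard can read per gesture to what the text actually carries."""
    return keyboard_bits_per_gesture(keys) / entropy


if __name__ == "__main__":
    print(keyboard_bits_per_gesture())  # roughly 6.3 bits per gesture
    print(inefficiency_factor())        # roughly a factor of 6
```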

If only we had a data entry device which was well matched to the human and which made use of an adaptive probabilistic model for the text stream, we ought to be able to have much faster and/or more compact data entry.

Consider the gestures you can make with just one finger moving on a surface. The two-dimensional location of the finger can be controlled to, say, a precision of about 3 bits in each of the x and y directions, and we can make accurate gestures at a rate of, say, 4 per second. This means we can convey about 2 x 3 x 4 = 24 bits per second with one finger.

Thus in principle, at about 1 bit per character, it might be possible to type at a rate of 24 characters per second, if only we could interface just one finger to a good probabilistic model of the English language (or whatever language we wish to type).
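The same back-of-envelope calculation, written out as a small Python sketch. The default values are the guesses made above (3 bits per axis, 2 axes, 4 gestures per second); the function names are hypothetical.

```python
def finger_bits_per_second(bits_per_axis: float = 3,
                           axes: int = 2,
                           gestures_per_second: float = 4) -> float:
    """Information rate of one finger: bits per gesture times gestures per second."""
    return bits_per_axis * axes * gestures_per_second


def chars_per_second(bits_per_second: float, bits_per_char: float = 1.0) -> float:
    """Typing rate achievable if the language model needs `bits_per_char` bits per character."""
    return bits_per_second / bits_per_char


if __name__ == "__main__":
    rate = finger_bits_per_second()
    print(rate)                         # 24 bits per second
    print(chars_per_second(rate))       # 24 characters per second with a 1 bit/char model
    print(chars_per_second(rate, 3.5))  # fewer characters per second with a weaker model
```

The last line illustrates that the achievable typing rate falls in proportion to the number of bits per character the language model needs.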

We have in mind the following data entry devices: (1) a 2-d trackpad touched by one finger; (2) an eye-ball tracker. The latter is a lot like a 2-d trackpad, since one eyeball conveys two real numbers.

Our interface makes use of the concept of arithmetic coding. You don't need to understand arithmetic coding to understand the interface, however. If you do want to read about arithmetic coding you can learn about it in my textbook on information theory.

Here is how dasher works.

We display the possible characters which can be selected as a vertical row of squares. We use a probabilistic model to control the sizes of these squares, so that characters that are most probable in the given context have the largest squares. The characters are arranged in alphabetical order, but we could use any order, and we could have the order change depending on the probabilities if we wanted.
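As a rough sketch of this layout rule, the following Python function divides a column of unit height among the characters in alphabetical order, giving each a share proportional to its probability. The function name and the example weights are invented for illustration; they are not taken from the prototype.

```python
def layout_boxes(probs):
    """Partition the unit interval [0, 1) among symbols in alphabetical order.

    `probs` maps each symbol to a non-negative weight; weights are normalised,
    so counts work as well as probabilities. Returns {symbol: (top, bottom)}.
    """
    total = sum(probs.values())
    boxes = {}
    top = 0.0
    for symbol in sorted(probs):         # alphabetical order, as in the prototype
        height = probs[symbol] / total   # more probable symbols get taller boxes
        boxes[symbol] = (top, top + height)
        top += height
    return boxes


if __name__ == "__main__":
    # Hypothetical weights, for illustration only.
    for symbol, (top, bottom) in layout_boxes({"a": 2, "b": 1, "t": 7}).items():
        print(symbol, round(top, 3), round(bottom, 3))
```

Any other ordering could be substituted for `sorted` without changing the sizes of the boxes.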

The prototype's probabilistic model

Ideally, we would use a really good model for English, such as those that are used in top-of-the-line text compression systems. These compress down to about one bit per character. For the first prototype, we have simply used a `bigram' model for English, which compresses English down to about 3.5 bits per character. So the model we are using is roughly 3.5 times worse, measured in bits per character, than a model which incorporates knowledge of dictionaries, etc. The present model only knows the probabilities of adjacent letters.
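To make the idea of a bigram model concrete, here is a minimal Python sketch that estimates the probability of each letter given the previous letter, and measures how many bits per character the model needs on a piece of text. It is not the prototype's code: the add-one smoothing, the function names and the tiny training string are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_bigram(text, alphabet, smoothing=1.0):
    """Estimate P(next | previous) from adjacent-letter counts.

    `smoothing` is added to every count so that no pair has zero probability
    (an assumption of this sketch, not something stated in the text).
    """
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    model = {}
    for prev in alphabet:
        total = sum(counts[prev].values()) + smoothing * len(alphabet)
        model[prev] = {c: (counts[prev][c] + smoothing) / total for c in alphabet}
    return model


def bits_per_char(model, text):
    """Average number of bits the model needs per character of `text`.

    Every character of `text` must belong to the alphabet used for training.
    """
    bits = -sum(math.log2(model[prev][nxt]) for prev, nxt in zip(text, text[1:]))
    return bits / (len(text) - 1)


if __name__ == "__main__":
    sample = "the_demo_the_demo_the_demo_"   # toy training text
    model = train_bigram(sample, sorted(set(sample)))
    print(bits_per_char(model, sample))
```

On a real corpus the result would depend on the training text and the smoothing; the figure of about 3.5 bits per character is the one reported above, not something this sketch reproduces.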

How to write

The user selects a letter by moving the point of view into that letter's square. They indicate where they want to go by holding the two-dimensional pointing device in that direction. With the mouse interface, we move the mouse in the direction of the desired square; with an eye-tracking interface, we simply look at the desired location.

What then happens is that as the point of view moves to the right, the squares get larger; possible extensions of the typed string appear within the square that we are moving into. So if we are moving into the "t" square, for example, squares corresponding to "ta", "tb", ..., "th", ... will appear in a vertical column, just like the first column. The sizes correspond to the probabilities of these strings, given the probabilistic model. "ta" is quite probable; "tb" is less so; "th" is very probable. So it is easy to gesture our point of view into "th", and from there into "the" and then into "the_". (The symbol "_" represents a space.) The attached image sequence shows what happens as a user selects the string "the_demo_". The sequence looks jerky, but please imagine it as having a much higher frame rate and forming a continuous sequence so that it feels as if you are flying towards the desired string.
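The nesting of squares within squares is the arithmetic coding idea: each string corresponds to a sub-interval of the unit interval whose size is the probability of that string. The Python sketch below computes that interval for a string. The conditional probabilities in `toy_model` are invented for illustration (chosen so that the arithmetic is exact), and the function names are hypothetical.

```python
def toy_model(context):
    """Hypothetical P(next character | last character); not the prototype's model."""
    last = context[-1:]
    if last == "t":
        return {"_": 0.125, "a": 0.25, "e": 0.0625, "h": 0.5, "t": 0.0625}
    if last == "h":
        return {"_": 0.125, "a": 0.125, "e": 0.5, "h": 0.125, "t": 0.125}
    return {"_": 0.25, "a": 0.25, "e": 0.125, "h": 0.125, "t": 0.25}


def string_interval(text, next_probs):
    """Return the (low, high) sub-interval of [0, 1) that represents `text`."""
    low, width = 0.0, 1.0
    context = ""
    for ch in text:
        probs = next_probs(context)
        offset = 0.0
        for symbol in sorted(probs):     # same alphabetical order as the display
            if symbol == ch:
                break
            offset += probs[symbol]      # space taken by the squares before `ch`
        else:
            raise ValueError(f"character {ch!r} is not in the alphabet")
        low += width * offset            # move to the start of the chosen square
        width *= probs[ch]               # the square's size is the string's probability
        context += ch
    return low, low + width


if __name__ == "__main__":
    print(string_interval("t", toy_model))    # (0.75, 1.0)
    print(string_interval("th", toy_model))   # (0.859375, 0.984375)
    print(string_interval("the", toy_model))  # (0.890625, 0.953125)
```

Each interval lies inside the previous one, which is why steering into "th" and then "the" amounts to zooming further into the same region of the display.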

As you watch this demonstration, please imagine that you are choosing the string of characters "the_demo" by using your direction of gaze, that is, by looking at the string as it appears. You may find it feels a little like sky-diving onto a patchwork of fields.

Imagine having a spectacle-mounted eye-tracker, and using this device to talk silently to your computer. Is this the future replacement for the keyboard?

Discussion

Eye-tracker resolution requirements: Superb resolution is not essential. One degree resolution will be fine, because the system responds continuously and gradually to the user's input; it is not essential to know exactly where they are looking, because no irreversible decisions are being made at any point; the user can always correct for any bad steering by subsequent manoeuvres. Moreover, these corrective gestures could easily be used as feedback signals to fine-tune the calibration of the eye-tracker on the fly.
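One way to picture why small steering errors are recoverable is a toy model of the zooming, sketched here in Python. This is an assumption about how such an interface might update its view, not a description of the prototype: the visible region is an interval, and on each frame both of its edges move a small fraction `rate` of the way towards the point the user is indicating.

```python
def zoom_step(view, pointer, rate=0.1):
    """Shrink the visible interval a little towards the indicated point.

    `view` is (low, high); `pointer` is the indicated position as a fraction
    of the view, from 0.0 (top) to 1.0 (bottom); `rate` is a hypothetical
    per-frame zoom speed between 0 and 1.
    """
    low, high = view
    target = low + pointer * (high - low)
    return (low + rate * (target - low),
            high + rate * (target - high))


if __name__ == "__main__":
    view = (0.0, 1.0)
    view = zoom_step(view, 0.9)         # a badly steered frame
    wanted = 0.5                        # the point the user actually wants
    print(view[0] <= wanted <= view[1]) # the wanted point is still visible
    # Steer back towards it on the following frames.
    for _ in range(20):
        pointer = (wanted - view[0]) / (view[1] - view[0])
        view = zoom_step(view, pointer)
    print(view)
```

Because each frame moves the view only slightly, a single misdirected frame leaves the intended region on screen, and later frames can steer back to it.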

A final thing to emphasize: this demonstration uses a very weak language model; the stronger the language model - the more it knows about dictionaries, grammar, etc. - the fewer gestures will be needed to write.

If you have tcl in your web browser or on your unix computer, you can download dasher from here: (select the Arithmetic coding demonstration).

Authors

Dasher was conceived by David MacKay and Mike Lewicki and written in tcl by David MacKay.

Research Proposal

David MacKay, Reader, Cambridge University, seeks support for this research.

Wanted:


David MacKay <mackay@mrao.cam.ac.uk>
Last modified: Sat Jan 19 16:22:04 2002