The Quietly Revolutionary History of Gaze Control

© 2026 keitsi.fi

Your eyes are liars. Every second you are awake, they dart, flicker, drift and snap back again in movements so fast your conscious mind never registers them. Between those tiny bursts of motion, your gaze pauses on things for roughly a quarter of a second at a time, each pause called a fixation, each jump between pauses called a saccade. Saccades are the fastest movements the human body produces. Faster than a sprinter’s legs, faster than a boxer’s jab. Your eyeballs are, biomechanically speaking, the most athletic part of you.

For most people this is trivia. For someone who has lost the ability to move their arms, legs, fingers, and tongue, those tiny muscular fireworks behind the cornea may be the only voluntary movement left. And that makes them the last usable interface between a human mind and the outside world.

This is the story of how engineers, therapists, and a remarkable number of stubborn researchers turned that interface into a voice.

A Contact Lens, a Mirror, and a Lot of Patience

Eye tracking as a scientific pursuit is older than you might expect. In 1879, the French ophthalmologist Louis Émile Javal noticed something that reading teachers had missed for centuries. Readers’ eyes do not glide smoothly along a line of text. They hop. Short fixations separated by rapid jumps. The discovery was made through nothing more sophisticated than staring very carefully at someone else’s face while they read.

The equipment got worse before it got better. By the early 1900s, researchers were attaching mirrors to contact lenses, resting them on anesthetized eyeballs, and tracing the reflections onto rotating drums. Edmund Huey built one such contraption in 1908. It worked. It also required placing a plaster cup directly onto the eye, which is about as comfortable as it sounds. If you ever complain about dry eyes during a Zoom call, consider that the founding fathers of eye tracking research literally glued optics to their corneas in the name of science.

For decades, the field remained a niche corner of experimental psychology. In the 1940s researchers filmed pilots’ eyes during flight to understand instrument scanning patterns. In the 1950s and 1960s, the Soviet psychologist Alfred Yarbus conducted elegant experiments showing that where you look depends heavily on what you are thinking about. Give someone a painting and ask them to judge the subjects’ ages, and their gaze concentrates on faces. Ask them to estimate wealth, and the eyes roam toward clothing and furniture. The eye, Yarbus demonstrated, is not a passive camera. It is a searchlight directed by intention.

None of this helped anyone communicate, of course. The equipment still required the subject’s head to be bolted in place, the labs were cramped, and the data analysis was done by hand, frame by frame, from celluloid film. Eye tracking was a tool for studying vision, not for enabling it.

Infrared Light Enters the Chat

The real technological pivot came in the 1970s with a technique called Pupil Center Corneal Reflection, or PCCR. The concept is deceptively simple. Shine an infrared light at someone’s eye. The IR beam creates a bright glint on the surface of the cornea, the glassy outer dome, and makes the pupil stand out clearly to the camera. Because the corneal glint stays relatively fixed while the pupil moves with the direction of gaze, a camera that tracks the geometric relationship between these two points, the glint and the centre of the pupil, can calculate where the eye is pointing.

This is, to this day, how the vast majority of commercial eye trackers work. The intervening fifty years have brought better cameras, faster processors, and smarter algorithms, but the fundamental trick remains the same. Two tiny dots of light, and some math.
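
The two-dot trick can be sketched in a few lines. This is a minimal illustration, not any vendor’s implementation: the pixel coordinates are invented, and the names Point and pupil_glint_vector are hypothetical. It shows the quantity a PCCR tracker watches, the offset between pupil centre and corneal glint, and why a small sideways shift of the head, which moves both points together in the camera image, leaves that offset roughly unchanged.

```python
from dataclasses import dataclass


@dataclass
class Point:
    """A position in the camera image, in pixels (hypothetical values below)."""
    x: float
    y: float


def pupil_glint_vector(pupil_centre: Point, glint: Point) -> tuple[float, float]:
    """Offset of the pupil centre from the corneal glint, in camera pixels."""
    return (pupil_centre.x - glint.x, pupil_centre.y - glint.y)


# Eye rotated away from the light source: the pupil centre sits off the glint.
rotated = pupil_glint_vector(Point(310, 240), Point(318, 240))

# Same gaze, but the head has drifted 15 px right and 4 px up in the image.
# Both points move together, so the offset is the same (to a first approximation).
shifted = pupil_glint_vector(Point(325, 236), Point(333, 236))
```

The offset alone is not yet a screen position. Turning it into one is the job of calibration.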

The 1980s made things faster. Computers became powerful enough to process the camera feed in real time rather than in post-production. Suddenly eye tracking could happen live. And if it could happen live, someone was going to try to use it as an input device.

From Measurement to Mouse

In 1982, Friedman, Kiliany, Dzmura and Anderson did something that, in hindsight, looks remarkably prescient. They built a system called EyeTracker, designed specifically for children with disabilities in a classroom setting. A child could look at targets on a screen, the system would register the selections, and the output could either be printed on paper or spoken aloud through a speech synthesizer. It was clunky. It was slow. It was also, conceptually, almost identical to the communication devices being sold today for thousands of dollars.

The problem was that the technology of 1982 was not equal to the ambition. Cameras were low-resolution, processors were feeble, and head movement tolerance was essentially zero. Move your head a centimetre and the calibration collapsed. For a population whose defining characteristic was often involuntary movement, this was a rather significant limitation.

Through the late 1980s and 1990s, things improved steadily. Cameras got sharper. Software got smarter. The tracking “head box,” that invisible volume of space within which a user’s eyes could be reliably followed, grew from a postage stamp to something approaching a shoebox. Each generation of device tolerated slightly more head movement, required slightly less perfect calibration, coped slightly better with glasses and odd lighting. Progress was incremental, undramatic, and vital.

The Original Eye Tracker Was a Person

But here is the thing that the technology narrative tends to skip over. Long before infrared cameras and neural networks, people were already communicating with their eyes. They just needed another person on the receiving end instead of a computer.

In the 1970s, an American engineer named Jack Eichler had a friend with ALS. As the disease took away one motor function after another, Eichler did what engineers do when someone they care about has a problem. He built something. What he built was a sheet of clear plexiglass with letters arranged in groups around its edges and a rectangular hole cut in the centre. The device was called the E-Tran, short for Eye Transfer. It looked like almost nothing. It cost almost nothing. And it worked.

The principle is this. The board sits between two people, held upright like a transparent window. The person who cannot speak sits on one side. The communication partner sits on the other, face to face, eyes meeting through the hole in the centre. When the speaker wants to say something, they look at one of the letter groups on the board. The partner, watching through the transparent surface, follows the direction of their gaze. The letters within each group are colour-coded, so after looking at the correct group, the speaker looks at a colour marker that identifies the specific letter. Two glances per character. The partner says the letter aloud for confirmation, and the next one begins.
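
The two-glance scheme is, in effect, a small lookup code, and it can be written down as one. The sketch below is only an illustration: the board layout, the colour order, and the names GROUPS, COLOURS, decode_letter and decode_message are assumptions made for the example, not Eichler’s actual arrangement.

```python
# Hypothetical board: each position holds a group of letters, and the
# letter's place within its group corresponds to a colour.
COLOURS = ["red", "blue", "green", "yellow"]
GROUPS = {
    "top-left": "ABCD",
    "top": "EFGH",
    "top-right": "IJKL",
    "bottom-left": "MNOP",
    "bottom": "QRST",
    "bottom-right": "UVWX",
    "left": "YZ",
}


def decode_letter(group: str, colour: str) -> str:
    """First glance picks the group, second glance picks the colour."""
    letters = GROUPS[group]
    index = COLOURS.index(colour)
    if index >= len(letters):
        raise ValueError(f"no {colour} letter in group {group!r}")
    return letters[index]


def decode_message(glances: list[tuple[str, str]]) -> str:
    """Turn a sequence of (group, colour) glance pairs into text."""
    return "".join(decode_letter(group, colour) for group, colour in glances)


word = decode_message([("top", "yellow"), ("top-right", "red")])  # spells "HI"
```

A human partner does the same lookup by eye, and adds something the table cannot: a guess at the rest of the word.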

It is slow. A sentence can take minutes. The letter E might take two seconds. The word “everything” might take twenty. But it works in the dark, in a power outage, in a moving vehicle, in a country where nobody sells assistive technology, and in any language you can fit onto a piece of plastic. It requires no calibration, no software updates, no USB port, and no internet connection. It requires one thing that no technology has yet managed to replicate well, which is a human being who has learned to read another human being’s eyes.

And this is where the story gets quietly remarkable. Experienced E-Tran partners, the family members, nurses, and personal assistants who spend months or years communicating this way, develop an almost uncanny speed. They begin predicting words after two or three letters. They learn the speaker’s vocabulary, their habitual phrases, the topics they gravitate toward. They start completing sentences the way a close friend finishes your joke before you reach the punchline. What was painfully slow at first becomes, between a practised pair, something approaching fluid conversation. Not fast by the standards of spoken language, but fast enough to argue, to make requests, to tell someone you love them, to say something funny at exactly the right moment.

In Finland, where communication aids for people with severe disabilities have a long tradition of institutional support, alphabet boards and auditory scanning are part of the standard toolkit. The scanning technique is known in Finnish as auditiivinen askeltaminen. The communication partner points to rows of letters, reads each aloud, and the speaker confirms the right one with whatever signal they can produce reliably, a blink, a small nod, a sound. The partner becomes, in effect, a human cursor and a human speech synthesiser rolled into one, running on coffee and patience rather than batteries.

The E-Tran board matters to the story of gaze-controlled communication for a reason that goes beyond nostalgia. It proves that the core idea, using the direction of someone’s gaze as an output channel for language, does not depend on technology at all. The technology just automates what a patient partner already does. Every infrared camera, every corneal reflection algorithm, every dwell-time setting is an attempt to replicate, at machine speed and without fatigue, the thing that Jack Eichler’s plexiglass rectangle and a pair of attentive human eyes already accomplished in 1970-something. The silicon got faster. The principle never changed.

And the low-tech version has not gone away. E-Tran boards and their variants are still manufactured, still recommended by speech therapists, still kept as backups for the days when the powered device breaks or the battery dies or the sun is at exactly the wrong angle for infrared tracking. In a field obsessed with the next sensor and the next algorithm, a transparent piece of plastic with some letters on it remains stubbornly indispensable. Which tells you something about the nature of communication. The channel matters less than the willingness to listen.

How Your Eye Becomes a Mouse

Here is how a modern assistive eye tracking system works, stripped to its essentials.

A small bar, roughly the size of a chunky pen, attaches to the bottom edge of a computer screen. Inside it sit an infrared light source and one or more cameras. The bar floods the user’s face with IR light invisible to the human eye but bright to the camera. The camera captures the corneal reflection and pupil position, typically at 30 to 60 frames per second.

Before first use, the system must be calibrated. Dots appear on the screen, one at a time, and the user stares at each in sequence. The software maps the physical geometry of that particular user’s eyes, accounting for the shape of their corneas, the distance between pupil and corneal reflection, and the angle of the screen. Good calibration is the foundation of everything that follows. Bad calibration makes every subsequent interaction an exercise in frustration.
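
What calibration computes can be sketched as a curve-fitting problem. The example below is a deliberate simplification: it assumes the pupil-to-reflection offset maps to screen pixels through a plain linear (affine) relationship and fits it by least squares, whereas commercial trackers use richer polynomial or 3D eye models. The five sample points and the names solve3, fit_axis and calibrate are invented for illustration.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("calibration points are degenerate")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_axis(vectors, targets):
    """Least-squares fit of target = w0*vx + w1*vy + w2 (normal equations)."""
    rows = [(vx, vy, 1.0) for vx, vy in vectors]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    return solve3(xtx, xty)


def calibrate(vectors, screen_points):
    """Return a function mapping an eye offset to a screen position."""
    wx = fit_axis(vectors, [p[0] for p in screen_points])
    wy = fit_axis(vectors, [p[1] for p in screen_points])

    def to_screen(vector):
        vx, vy = vector
        return (wx[0] * vx + wx[1] * vy + wx[2],
                wy[0] * vx + wy[1] * vy + wy[2])

    return to_screen


# Hypothetical session: eye offsets (camera pixels) recorded while the user
# stared at five dots on a 1920 x 1080 screen.
vectors = [(-20, -10), (20, -10), (0, 0), (-20, 10), (20, 10)]
dots = [(160, 90), (1760, 90), (960, 540), (160, 990), (1760, 990)]
to_screen = calibrate(vectors, dots)
gaze_x, gaze_y = to_screen((10, 5))  # lands near (1360, 765) for this data
```

With clean data like this the fit is exact. With a real user the samples are noisy, which is why more calibration dots and steadier fixations give a better map.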

Once calibrated, the user’s point of gaze maps to a cursor position on screen. Look at a button, and the cursor moves to it. But looking is not clicking. You look at hundreds of things per minute without meaning to select them. So a selection mechanism is needed.

The most common method is the dwell. Stare at a target for a preset duration, typically somewhere between 500 and 900 milliseconds, and the system registers a click. Too short and you will accidentally select things every time your gaze passes over them. Too long and the sustained staring exhausts your eyes and your patience. Finding the sweet spot is personal, and getting it wrong in either direction can make the whole system feel unusable.
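
The dwell logic itself is small enough to sketch. The version below is a minimal illustration rather than a production design: the class name DwellSelector, the 700 millisecond default, and the idea that each gaze sample arrives already resolved to a named on-screen target are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DwellSelector:
    """Fires one 'click' when gaze rests on the same target long enough."""
    dwell_ms: float = 700.0            # assumed default, tuned per user
    _target: Optional[str] = None      # target currently under the gaze
    _since_ms: float = 0.0             # when the gaze arrived on it
    _fired: bool = False               # already clicked during this visit?

    def update(self, target: Optional[str], now_ms: float) -> Optional[str]:
        """Feed one gaze sample; returns the target name if a click fires."""
        if target != self._target:
            # Gaze moved to something else (or to nothing): restart the timer.
            self._target, self._since_ms, self._fired = target, now_ms, False
            return None
        held_ms = now_ms - self._since_ms
        if target is not None and not self._fired and held_ms >= self.dwell_ms:
            self._fired = True         # one click per visit, no auto-repeat
            return target
        return None


# Hypothetical samples: a glance at "A" that is too short, then a stare at "B".
selector = DwellSelector(dwell_ms=700)
samples = [("A", 0), ("A", 300), ("B", 400), ("B", 800), ("B", 1100), ("B", 1200)]
clicks = [hit for name, t in samples if (hit := selector.update(name, t))]
```

Here the brief look at "A" selects nothing, while the 700 milliseconds spent on "B" produces exactly one click. Shortening dwell_ms makes the interface faster and more accident-prone, which is the trade-off described above.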

Alternatives exist. External switches triggered by a puff of breath, a muscle twitch, or a deliberate blink can substitute for the dwell. Some systems even allow combining gaze pointing with a physical switch, separating the “where” from the “when” of each selection. Stephen Hawking’s famous communication system, for instance, evolved over decades from a handheld clicker to a cheek-muscle sensor, with eye tracking tried along the way, each adaptation matching his gradually diminishing motor control.

What Makes It Hard

If eye tracking sounds straightforward in principle, the practice is haunted by a constellation of small problems that together can make or break the experience.

Light. Sunlight contains infrared radiation. So do many artificial lights. If strong IR light reaches the user’s eyes or the screen, it can swamp the tracker’s own infrared signal. A sunny window behind the user or a halogen desk lamp aimed at the screen can degrade tracking accuracy or knock it out entirely. The recommended setup reads like instructions for developing photographs in a darkroom. Indirect lighting. Curtains closed. No reflective surfaces.

Glasses. Most eye trackers can cope with glasses, but the lenses introduce extra reflections. A scratched lens, an anti-glare coating that ironically creates its own glare, or thick frames that cast shadows across the pupils can all cause problems. Bifocals with a visible dividing line between the near and far segments are particularly troublesome because the line creates a hard optical boundary that the tracker may misinterpret.

Eyelids. Drooping eyelids can obscure the pupil. This is an underappreciated problem because it can be intermittent. A user might calibrate perfectly while concentrating hard with eyes wide open, then relax afterward and find the system losing track of their gaze as the eyelids settle into their natural, slightly lower position.

Head movement. Modern trackers tolerate a reasonable range of head movement, typically a box of roughly 30 by 25 centimetres at 60 centimetres from the screen. But conditions like cerebral palsy or multiple sclerosis, and involuntary spasms of any cause, can produce movements that exceed those tolerances unpredictably. The tracker loses the eyes, takes a moment to reacquire, and during that gap the user is voiceless.

Fatigue. Controlling a computer with your eyes requires sustained, deliberate visual focus in a way that natural looking does not. Your eyes evolved to roam freely, skimming surfaces, darting to movement in the periphery, resting on nothing in particular. Forcing them to hold precise fixations on small targets for extended periods is tiring. Users, especially children, may need frequent breaks. Ironically, the faster and more efficient the interface becomes, the more cognitively demanding it is to use.

The Fovea Problem

There is an elegant paradox at the heart of gaze-based interaction. The part of your retina responsible for sharp vision, the fovea, covers only about one to two degrees of your visual field. Hold your arm out and look at your thumbnail. That is roughly the area of sharp focus. Everything beyond it is peripheral vision, good for detecting motion and contrast but useless for reading or precise targeting.

This means that to see anything clearly, you must point your fovea directly at it. But to select something with a gaze interface, you must also point your fovea directly at it. Seeing and selecting collapse into the same action. It would be like designing a mouse where every object the cursor touched was immediately clicked. The dwell timeout exists precisely to pry these two functions apart, to create a tiny temporal gap between looking and choosing. But the fundamental tension between perception and action never fully resolves.
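
The one-to-two-degree figure becomes more concrete when converted into centimetres on a screen. The sketch below uses standard visual-angle geometry. The 60 centimetre viewing distance is borrowed from the head box figure quoted earlier, and the function name visual_angle_to_cm is an invention for this example.

```python
import math


def visual_angle_to_cm(angle_deg: float, distance_cm: float) -> float:
    """Width covered by a visual angle at a given viewing distance."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


# Assumed viewing distance: 60 cm from eye to screen.
narrow = visual_angle_to_cm(1, 60)   # roughly 1 cm
wide = visual_angle_to_cm(2, 60)     # roughly 2 cm
```

In other words, the patch of screen a user can see sharply at any moment is about one to two centimetres across, which is one reason gaze interfaces favour large, well-spaced buttons.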

Who Uses This

The primary users of assistive eye tracking fall into a few overlapping groups. People with ALS, also called motor neuron disease, form perhaps the most prominent user population. ALS progressively destroys motor neurons while typically leaving cognition, vision, and language comprehension intact. This makes it both terrifying and, perversely, well suited to gaze-based communication. The mind is sharp, the eyes work, and every other output channel is gradually shutting down.

Cerebral palsy, spinal cord injuries, traumatic brain injuries, multiple sclerosis, and locked-in syndrome are other common contexts. The common thread is a gap between cognitive capability and motor output. The person has things to say and no conventional way to say them.

For these users, a gaze-controlled communication device is not a gadget or a productivity tool. It is a voice. It is the difference between participating in a conversation and being talked about in the third person while sitting in the same room. It is the ability to say “I’m in pain,” “I love you,” or “no, I wanted the other channel.” The stakes are not about words per minute. They are about personhood.

The State of the Art in 2026

If you had last looked at assistive eye tracking around 2016, you would have found a market dominated by a handful of dedicated hardware companies. Tobii Dynavox in Sweden. EyeTech Digital Systems in the United States. A scattering of smaller players like Alea Technologies and Visual Interaction. The devices were USB bars that clipped to a Windows laptop, calibration required nine points and some patience, and the head tracking box was generous but finite.

A decade later, the landscape has shifted in several significant directions.

Hardware has matured. Tobii Dynavox’s current flagship, the PCEye 5, and their integrated I-Series devices use a fifth-generation sensor (IS5) with substantially improved head movement tolerance, better performance across diverse eye colours and lighting conditions, and simplified one-time calibration. The devices are thinner, lighter, IP54 rated against dust and water, and include features like a rear-facing “partner window” that mirrors the user’s typed text so that the conversation partner across the table can maintain face-to-face contact rather than peering over the user’s shoulder at a screen.

Apple entered the arena. In late 2024, Apple added native eye tracking to iOS 18, allowing iPhone and iPad users to control their devices using only the front-facing camera. No external hardware required. The feature uses on-device machine learning to estimate gaze direction from the standard selfie camera, and while early feedback from the assistive technology community suggests it is not yet as reliable as dedicated hardware, the trajectory is clear. When the company that builds the world’s most popular mobile devices decides eye tracking is a standard accessibility feature, the cost and availability equation changes overnight.

Tobii Dynavox was already there. Its TD Pilot, introduced in 2021, is an eye tracking device certified for iPad that brings dedicated tracking hardware into the Apple ecosystem. Users get the precision of purpose-built infrared tracking combined with the app ecosystem and social familiarity of an iPad rather than a specialised medical device that advertises its difference.

AI rewrote calibration. The nine-point calibration stare, long the awkward first date of every gaze interaction session, is being streamlined and in some cases eliminated entirely. Transformer-based neural networks and semi-supervised learning architectures now enable more robust gaze estimation from less initial data. Pupil Labs, a Berlin-based company, ships wearable eye tracking glasses that use deep learning for gaze estimation and require no user calibration at all. The software builds a model of the user’s eyes on the fly, continuously adapting.

The market exploded, mostly sideways. Eye tracking hardware revenue is projected to grow from about 2.3 billion dollars in 2025 to something approaching 27 billion by 2035. But the growth is driven overwhelmingly by gaming, virtual reality headsets, automotive driver monitoring, and market research. Assistive communication remains a small fraction of the total market. The good news is that volume drives component costs down and R&D investment up. Every VR headset with built-in eye tracking is, in a sense, subsidising the assistive technology sector by creating economies of scale that a niche medical market alone could never sustain.

And then there are the wildcards. In early 2026, researchers at Qingdao University in China published a self-powered eye tracking system that harvests energy from the friction of blinking, using triboelectric nanogenerators embedded in a contact lens. The device requires no external power, works in complete darkness, detects eye movements as small as two degrees, and is reportedly as comfortable as ordinary glasses. It is a lab prototype, nowhere near commercialisation, but it represents a philosophically different approach to the problem. Instead of a bar on a desk pointing cameras at your face, the tracker moves with your body. If wearable eye tracking eventually becomes as unobtrusive as a pair of contact lenses, the entire paradigm of gaze interaction shifts from “sitting at a screen” to “moving through the world.”

Meanwhile, Apple has signalled work on brain-computer interface (BCI) support within iOS Switch Control. BCIs read electrical signals directly from the brain, bypassing the eyes entirely. For users who have lost even ocular motor control, this is the next frontier. For the eye tracking field, it is both a complement and, eventually, perhaps a successor.

What Hasn’t Changed

Amid all the new hardware and algorithms, certain truths about gaze interaction remain stubbornly constant.

Calibration still matters. Even AI-assisted systems perform better with good calibration, and users with unusual eye geometries, strong prescriptions, or involuntary movements still challenge every tracker on the market.

The fovea problem persists. No amount of silicon can resolve the inherent conflict between looking at something to see it and looking at something to select it. Dwell-based selection remains dominant, with all its trade-offs.

Individual variation is enormous. The same device that works flawlessly for one user can be unusable for another, depending on their particular combination of motor control, visual acuity, eyelid geometry, and head stability. Finding the right device for a given person still requires hands-on trial. There is no algorithm for that yet.

And the stakes remain as high as they ever were. Technology reporters like to write about eye tracking as a cool interface for gaming or a clever way to skip video ads. For the communities that depend on it, gaze control is not cool or clever. It is oxygen. The difference between having a voice and not having one is not a matter of user experience optimisation. It is a matter of human dignity.

The eyes may be liars, darting and drifting in ways their owner never intended. But for a growing number of people, they are also the most honest communicators they have left. And the technology that listens to them keeps getting better at paying attention.


Further Reading and Sources

History of eye tracking

A History of Eye Gaze Tracking (Morimoto & Mimica, ResearchGate) and The First Hundred Years: A History of Eye Tracking as a Research Method (ResearchGate) provide thorough academic overviews. A more accessible timeline is offered by History of Eye Tracking (Innodem Neurosciences) and Eye Tracking Through History (EyeSee, Medium).

Eye tracking technology and science

A Comprehensive Framework for Eye Tracking: Methods, Tools, Applications, and Cross-Platform Evaluation (PMC/PubMed Central, 2025) is a recent and very thorough review of methods, performance parameters, and applications. Head-mounted eye gaze tracking devices: An overview of modern devices and recent advances (Cognolato, Atzori & Müller, 2018, PMC) covers wearable device evolution. Eye tracking (Wikipedia) is, as usual, a solid starting point with extensive references.

Assistive use and AAC

Eye-Tracking Assistive Technologies for Individuals With Amyotrophic Lateral Sclerosis (IEEE Xplore) reviews hardware and software in the ALS context. New Perspectives on Eye-Tracking: Theory, Methods, and Applications (MDPI Applied Sciences, 2025) covers recent deep learning advances in gaze estimation.

The E-Tran board and low-tech gaze communication

E-TRAN Eye Pointing (Future of Interface) provides a concise history of Eichler’s invention. Eye Gaze Communication Board (COGAIN wiki) includes printable DIY templates and instructions. How Do You Make An E-Tran Board? (The Adult Speech Therapy Workbook) is a practical clinical guide with a free PDF. For Finnish-language resources, Papunet: Kirjain- ja numerotaulut offers printable alphabet boards and usage instructions, and Tikoteekkiverkosto: Kommunikointitaulut ja -kansiot describes the broader Finnish AAC framework.

Current devices and products

Tobii Dynavox PCEye is the current flagship standalone eye tracker. Tobii Dynavox TD I-Series is the integrated communication device with IS5 sensor. TD Pilot brings eye tracking to iPadOS. Apple’s built-in eye tracking is documented at Control iPad with the movement of your eyes (Apple Support).

Emerging research

Self-powered eye tracker harnesses energy from blinking (TechXplore, January 2026) covers the Qingdao University triboelectric nanogenerator prototype. Apple’s 2025 accessibility announcements, including BCI support for Switch Control, are detailed at Apple unveils powerful accessibility features (Apple Newsroom).

Stephen Hawking’s communication system

Why Intel made Stephen Hawking’s speech system open source (Opensource.com) explains the ACAT toolkit. How Does Stephen Hawking Talk? (Singularity Hub) provides a detailed walkthrough of the system’s evolution.