New research shows that readers process texts written by artificial intelligence (AI) differently from human-authored ones. Eye tracking reveals that AI text is easier to skim but may not capture attention or engagement in the same way.
AI may help you draft an email or polish a report, but new research shows that it does not land with readers the same way as human writing. Eye-tracking experiments reveal that AI text is easier to skim but struggles to hold attention – a caution for anyone relying on it to persuade, impress or inspire.
Scientists asked Microsoft’s generative AI tool Copilot to mimic the writing style of three famous Danish authors and used eye tracking to compare how readers process the AI-generated text versus the real deal. They found significant differences in how the readers focused on human-authored versus AI-generated text, suggesting that AI-generated texts require less mental effort from readers. The team presented their findings at the ACM Symposium on Eye Tracking Research & Applications in Tokyo, Japan in May.
The differences could come down to how AI models work, says co-author Per Bækgaard, an Associate Professor at the Technical University of Denmark (DTU) in Kongens Lyngby who studies interactions between people and computers. Generative AI models string together sentences by predicting the next most likely word – which could make text more skimmable and require less brain power to follow.
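As a rough illustration of that "next most likely word" mechanism – a toy sketch, not how Copilot actually works – consider a greedy bigram model that always emits the single most probable continuation. Because every choice is the highest-probability one, the output drifts toward common, predictable phrasing:

```python
# Toy sketch (illustrative only): a bigram model that always picks the
# single most likely next word, the greedy version of what the article
# describes. All names here are hypothetical.
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count which word follows which across a list of sentences."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for a, b in zip(words, words[1:]):
            counts[a][b] += 1
    return counts

def generate(counts, start, length=5):
    """Greedily extend `start` with the most likely next word each step."""
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break
        out.append(followers.most_common(1)[0][0])  # always the top choice
    return " ".join(out)
```

Real models sample from a probability distribution over a huge vocabulary rather than following a bigram table, but the bias is the same: likelier words get chosen more often, which may be what makes the resulting text smoother to skim.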
“We need to reflect on how much we in our everyday lives use AI to generate the text that we are writing now,” explains co-author Sofie Beier, Full Professor of Design Research at the Royal Danish Academy in Copenhagen. “When we write text, many of us run it through large language models to edit it or to improve the language. What I think we find here is that this actually may affect the reader.”
Adaptable displays – and the need for lots of text
In 2024, a team from DTU and the Royal Danish Academy was preparing a project called Reading the Reader – an initiative to develop adaptable displays for people with impaired vision.
“Imagine a screen that, by tracking your eyes as you read, notices that you are struggling and can adjust the text – make the letters slightly wider, increasing the distance between lines of text – to make it easier to read,” says Beier, a graphic designer by training who heads the Centre for Visibility Design at the Royal Danish Academy.
To train these adjustable displays, the researchers need big banks of text samples for participants to read while their eyes are being tracked, Bækgaard says. “But if we expose them to text they already know” – such as from a popular book – “there is a potential for bias,” he explains. They wondered whether generative AI could solve their problems of novelty in bulk.
But first, “we needed to determine whether people differ in how they process AI-generated text versus standard text,” Bækgaard says.
The eyes have it: tracking how we read AI-generated versus human text
Chaudhary Muhammad Aqdus Ilyas, a computer scientist at DTU who studies interactions between robots and people, asked Microsoft Copilot to generate 600-word passages in the styles of three Danish authors: Hans Christian Andersen, the 19th-century writer of whimsical and haunting children’s tales, including The Little Mermaid and The Ugly Duckling; Karen Blixen, a 20th-century writer of sweeping stories of love and adventure, including the memoir Out of Africa; and Søren Kierkegaard, a 19th-century existentialist philosopher.
The researchers say that they chose Andersen, Blixen and Kierkegaard as benchmarks for easy, medium and difficult reading material. “Andersen’s audience was kids, Blixen’s was adults and Kierkegaard’s was academics,” Ilyas notes.
Ilyas went through about 60 iterations of the prompt before settling on a final version, he says. “We sent some of the AI-generated passages to an English literature professor from Cambridge University, and she gave us feedback as to which prompts got us closest to the original authors,” Ilyas explains.
Twelve adults each read six short texts on a large screen – some passages by the original authors and others generated by AI – and answered reading-comprehension questions, while a professional-grade eye camera mounted on top of the screen tracked their eye movements 90 times per second.
The tiny eye movements we make while reading offer a window into how the brain processes text, Bækgaard explains. The team focused on four signals: how long the eyes paused on a word, how often they paused, how far they jumped between pauses and how much the pupils widened. Longer pauses usually mean that the reader needs more effort to understand the words, whereas shorter pauses suggest that the text is easier to process.
A saccade is simply the short jump your eyes make as they move from one word to the next, explains co-author Ashkan Tashk, a data scientist at DTU. Shorter saccades indicate that a reader might be struggling to process text – homing in on individual words more often – whereas longer saccades are associated with faster reading and better comprehension.
When your pupils widen slightly, it is a sign that your brain is paying closer attention or working harder. “The more invested you are in something, typically your pupil becomes a bit larger,” he adds.
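The four signals described above can be derived from raw gaze data. The snippet below is a hypothetical sketch, assuming a simple stream of (x, y, pupil) samples at 90 Hz and a basic dispersion-threshold fixation detector; real eye trackers ship more sophisticated event-detection algorithms, and the thresholds here are illustrative assumptions:

```python
# Hypothetical sketch: deriving fixation durations, pupil size and
# saccade amplitudes from (x, y, pupil) samples at 90 Hz, using a
# simple dispersion threshold. Thresholds are assumptions, not the
# study's actual parameters.
SAMPLE_RATE_HZ = 90
DISPERSION_PX = 25     # assumed: gaze spread within this range = one fixation
MIN_FIX_SAMPLES = 9    # ~100 ms at 90 Hz

def summarise(window):
    """Average a run of samples into one fixation record."""
    n = len(window)
    return {
        "x": sum(p[0] for p in window) / n,
        "y": sum(p[1] for p in window) / n,
        "duration_ms": n * 1000 / SAMPLE_RATE_HZ,   # how long the eyes paused
        "pupil": sum(p[2] for p in window) / n,     # mean pupil size
    }

def detect_fixations(samples):
    """Group consecutive samples into fixations by spatial dispersion."""
    fixations, window = [], []
    for sample in samples:
        window.append(sample)
        xs = [p[0] for p in window]
        ys = [p[1] for p in window]
        if (max(xs) - min(xs)) + (max(ys) - min(ys)) > DISPERSION_PX:
            if len(window) - 1 >= MIN_FIX_SAMPLES:
                fixations.append(summarise(window[:-1]))
            window = [window[-1]]   # the outlier starts the next candidate
    if len(window) >= MIN_FIX_SAMPLES:
        fixations.append(summarise(window))
    return fixations

def saccade_amplitudes(fixations):
    """Distance jumped between consecutive fixations (the saccades)."""
    return [((b["x"] - a["x"]) ** 2 + (b["y"] - a["y"]) ** 2) ** 0.5
            for a, b in zip(fixations, fixations[1:])]
```

From these records, the counts and averages the team compared – fixation duration, fixation rate, saccade length and pupil size – fall out directly.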
Literature, simplified by AI
The researchers found that asking Microsoft Copilot to mimic the style of various writers effectively acted as a dimmer switch on the complexity of the text. The eye movements confirmed that pseudo-Kierkegaard was a tougher read than pseudo-Andersen.
However, “there seems to be a statistically significant difference between the AI-generated version and the human-written text,” Bækgaard says.
Overall, the participants seemed to find the AI-generated texts less cognitively demanding than the originals: their eye movements showed shorter fixation durations and higher fixation rates when reading the AI-generated texts. That is more consistent with a skimming style of reading. “Maybe they spend a little less time comprehending it,” Bækgaard says. Indeed, the AI-generated texts scored lower on a readability scale, meaning that they used shorter sentences and fewer long words than the original authors did.
“But we also found some curiosities that make us think that maybe there is also a different strategy for processing the information” for the AI-generated texts, he adds. The researchers observed small but statistically significant differences in pupil dilation: pupil sizes were larger for the AI-generated Andersen and Blixen texts than for the original texts, which would be consistent with a higher cognitive load.
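The article does not name the readability scale used. As one plausible candidate, LIX – a readability index widely used for Danish and other Scandinavian-language text – scores a passage from exactly the two ingredients mentioned above, sentence length and the share of long words; the sketch below assumes that formula:

```python
# Hedged sketch of the LIX readability index (one plausible candidate
# for the unnamed scale; the study may have used a different one).
# LIX = (words / sentences) + 100 * (long words / words),
# where a "long word" has more than six letters. Lower = easier.
import re

def lix(text):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-zÆØÅæøå]+", text)
    long_words = [w for w in words if len(w) > 6]
    return len(words) / len(sentences) + 100 * len(long_words) / len(words)
```

Roughly, LIX scores below about 30 are considered easy and above about 50 difficult; shorter sentences and fewer long words push the score down, which is exactly the pattern reported for the AI-generated passages.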
No assumptions for AI
Generative AI moves so quickly that researchers can barely keep up, Beier says. “We test one version today, and tomorrow it may already be replaced by something new,” she explains. “This means that our findings can be outdated almost instantly, which is exactly why recording what is happening right now is so important.”
“But that does not mean we should not do it,” Beier says.
The team hopes that their findings with Microsoft Copilot will serve as a cautionary tale to other researchers: scientists should not assume that AI-generated text is interchangeable with human writing. Before incorporating AI-generated text into studies such as Reading the Reader, researchers need to understand how people process and interpret it – whether measured by reading speed, comprehension or other metrics of engagement.
The researchers say that there is no guarantee that the patterns they identified in eye movements would be the same for text generated by different AI models – or even subsequent versions of Microsoft Copilot.
But the overall pattern may hold because of commonalities in how large language models function. “There may be some generalisable things in terms of how the AI models all come up with the next most likely word – the predictability of the words might be different from a normal text,” Bækgaard explains.
What do these differences mean for the use of generative AI outside of the laboratory, in our daily lives? Depending on the intended use, text with a lower cognitive load could be a selling point or a negative feature, the researchers say.
In situations in which readability and ease of comprehension are the priority – perhaps for public service announcements or assembly instructions – lower cognitive load could be desirable. But when the brief is writing that will capture and hold the reader’s attention – a cover letter for a job application, eye-catching marketing copy or stories that inspire real emotion – the originality of a human writer may still win out.
