Here is an excerpt from an article by the editors at Harvard Business Review.
Illustration Credit: Calvin Sprague
* * *
In fields ranging from copywriting to software development, leaders are betting that gen AI can help employees take on more-advanced responsibilities. Research from MIT professor David Autor and others has shown that gen AI shortens the time it takes novices to gain competence at new tasks. But there’s still much we don’t know about the technology’s potential to upskill workers, including one key question: Can it help them perform tasks as well as experts do?
To try to answer that, researchers from Stanford University and Harvard Business School’s Digital Data Design Institute ran a controlled experiment involving 78 employees at IG Group, a United Kingdom–based fintech firm. They began by dividing the employees into three groups: experts, adjacent outsiders, and distant outsiders. The experts were writers who regularly drafted articles for IG’s website. The adjacent outsiders were marketing specialists from the writers’ department who had no article-writing experience but had a general understanding of what the writers did. The distant outsiders were developers and data scientists who had no marketing or writing background at all. Each group was asked to complete two tasks: conceptualizing and writing an article like those found on the company’s website. The researchers randomly assigned gen AI to help some participants but not others. IG executives then rated the results of each assignment on a scale from 1 (lowest grade) to 5 (highest).
When conceptualizing an article without help from gen AI, the writers got the highest average score (3.82), followed by the marketing specialists (3.04) and the technologists (3.02). Those results revealed a significant skill gap between the experts and the others. When the subjects were given gen AI assistance, however, the gap narrowed: Concepts developed by writers scored 4.12, on average, while those developed by marketing and technology specialists scored 4.18 and 4.05, respectively. In other words, marketers using AI slightly outperformed writers using AI—and all three groups that used AI outperformed writers who didn’t.
However, when it came to writing the articles, the results differed. Without gen AI, the writers performed the best of all the groups. Yet even using AI couldn’t help nonexperts produce the same quality of work as the experts. Writers, predictably, performed the best of those using the technology (3.96, on average). Marketing specialists aided by AI were close behind (3.92). But the technology specialists aided by AI didn’t do as well; in fact, their scores with and without gen AI were essentially the same (3.38 and 3.42, respectively).
The Gen AI Wall
Why did gen AI boost performance for one task more than for the other—and help the technology specialists so little at writing?
After conducting interviews with participants, the researchers concluded that the further removed workers were from the knowledge needed for a task, the less likely they were to perform as well as colleagues with relevant expertise—even with gen AI assistance. Nonexperts using AI did better at conceptualization because it required less expertise than writing did; people just had to understand whether a proposed topic was good enough. Writing an article, however, involved knowing how to convey the desired message in the right language. One participant offered a metaphor to illustrate this distinction: Conceptualizing is like imagining running a marathon, but writing is like actually running it, which calls for a completely different level of expertise.
And expertise, the researchers found, is what allowed humans to partner more effectively with the AI tools. The marketing specialists understood the general language the writers used and had enough domain knowledge to refine the gen AI–produced content. But the technology specialists (whose work had nothing to do with writing) could not effectively use or improve the AI’s suggestions. They lacked the intuition and knowledge needed to make good decisions about what language to keep and what to discard. The researchers termed this phenomenon “the AI wall,” the limit to how much gen AI can help people perform tasks outside their area of expertise.
This finding has implications for how organizations deploy gen AI tools. It challenges the assumption that the technology can flatten skill hierarchies and enable what academics call “universal task fluidity.” Instead, the researchers contend, gen AI’s effectiveness depends on the expertise distance between the user and the task domain—and they argue that the AI wall is relevant beyond the context of writers and technology specialists.