When large language models (LLMs) exploded onto the scene in 2022, their ability to generate fluent text on demand seemed to herald a productivity revolution. But although these powerful AI systems can generate fluent text in human and computer languages, LLMs are far from infallible. They can hallucinate information, exhibit logical inconsistencies, and produce irrelevant or harmful outputs.
While the technology has been widely disseminated, many managers are struggling to identify LLM use cases where productivity improvements outweigh the costs and risks of the tools. What’s needed is a more systematic way to use LLMs to increase the efficiency of a business process while mitigating their shortcomings. I recommend an approach that involves three steps. First, disaggregate the process into discrete tasks. Second, assess whether each task satisfies the generative AI cost equation, which I’ll explain in this article. Third, when a task meets that requirement, launch a pilot project, iteratively evaluate the results, and make changes to improve the outputs when necessary.
The core of this approach rests on developing a clear-eyed understanding of three things: how the strengths and weaknesses of LLMs map to the nature of the task in question, the techniques by which LLMs are adapted to improve their performance on a task, and how all of this shapes the cost-benefit analysis (and the risk-reward picture) for using LLMs to increase the efficiency of the task.
LLMs: Remarkable Strengths, Surprising Weaknesses
When we experience LLMs responding with humanlike fluency to a prompt, it’s easy to forget that they can get simple questions wrong. If you ask even an advanced, large-scale model like GPT-4 the question “What is the fifth word of this sentence?” the answer will often be incorrect, as in, “The fifth word of the sentence ‘What is the fifth word of this sentence?’ is ‘fifth.’”¹
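For contrast, the same question is trivial for a few lines of conventional code. The sketch below is only an illustration, not part of the approach described in this article; the helper name `nth_word` and its punctuation handling are assumptions made for the example.

```python
def nth_word(sentence: str, n: int) -> str:
    """Return the n-th whitespace-separated word (1-indexed).

    Hypothetical helper for illustration: surrounding punctuation
    is stripped so that "sentence?" counts as "sentence".
    """
    words = [w.strip(".,;:!?\"'") for w in sentence.split()]
    return words[n - 1]


sentence = "What is the fifth word of this sentence?"
print(nth_word(sentence, 5))  # prints "word"
```

The deterministic answer is “word,” not “fifth.” The gap between fluent output and correct output is what makes case-by-case evaluation of LLM tasks necessary.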
This story is from the Winter 2025 edition of MIT Sloan Management Review.