
Artificial intelligence systems such as ChatGPT and other large language models (LLMs) can now produce very convincing answers to a wide range of questions. However, these systems typically do not tell you when they lack confidence or have gone beyond the knowledge they were trained on.
This failure to identify and communicate uncertainty is becoming increasingly problematic as AI technology enters industries built on high-stakes decisions, such as health care, self-driving vehicles, and scientific research.
Tackling this issue is the primary purpose of Themis AI, an MIT spinout founded in 2021 that develops ways to characterize an AI model’s uncertainty, with the aim of making decision-making more trustworthy and transparent across industries.
A Platform to Quantify and Correct AI Errors
At the heart of Themis AI’s innovation is Capsa, a software platform designed to detect and correct unreliable outputs generated by machine learning models. Capsa works by modifying existing AI systems so that they recognize patterns that suggest ambiguity, data bias, or incompleteness. According to the company, the platform can be applied to any machine learning model and can correct potential failure points in real time.
Themis AI co-founder and MIT Professor Daniela Rus, who also serves as Director of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), explained that the technology offers an additional layer of reliability.
Founders Rooted in MIT’s AI Research Ecosystem
Themis AI was co-founded by Professor Daniela Rus along with MIT alumni Alexander Amini (PhD ’22) and Elaheh Ahmadi (MEng ’21), all of whom previously collaborated on AI research projects at CSAIL. Their early work included detecting bias in facial recognition systems and improving the reliability of AI models used in autonomous vehicles and drug discovery.
Amini, now CEO of Themis AI, highlighted the urgent need for AI systems to recognize their limitations.
Real-World Applications Across Critical Industries
Since its founding, Themis AI has engaged with several major industries to test and deploy its Capsa platform. In the telecommunications sector, Capsa has been used for network planning and automation. In the oil and gas industry, it supports the analysis of complex seismic imagery. Perhaps most promising is its application in pharmaceutical development, where AI is increasingly used to predict the behavior of new drug candidates in clinical trials.
Themis AI’s tools allow researchers to distinguish between model outputs grounded in reliable data and those that lack sufficient evidence, streamlining the research process and potentially saving millions in development costs.
Enhancing Trust in LLMs and Edge AI
As more companies look to develop private large language models based on proprietary data, Themis AI offers a way to enhance reliability and trust. According to Stewart Jamieson, Themis AI’s Head of Technology, many enterprises worry about the unpredictable nature of LLM responses.
Beyond large-scale systems, Capsa is also being optimized for edge computing environments, where smaller AI models operate on devices like smartphones, sensors, and autonomous machinery. While these models often sacrifice accuracy for speed and efficiency, Capsa helps bridge that gap, enabling smart delegation of complex tasks to cloud-based systems when uncertainty is detected.
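To make the idea of uncertainty-triggered delegation concrete, here is a minimal sketch in Python. It is not Capsa's API or Themis AI's method, which the article does not describe. It assumes the edge model returns class probabilities, uses the entropy of those probabilities as a simple stand-in for an uncertainty score, and hands the input to a larger cloud model when that score crosses a threshold. The names `predictive_entropy`, `route_prediction`, and `max_entropy` are hypothetical and chosen for illustration only.

```python
import math
from typing import Callable, Sequence, Tuple

# A model is assumed to map an input to a list of class probabilities.
Model = Callable[[object], Sequence[float]]


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a categorical distribution.

    Higher entropy means the probability mass is spread over more
    classes, which is used here as a rough proxy for uncertainty.
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def route_prediction(
    x: object,
    edge_model: Model,
    cloud_model: Model,
    max_entropy: float = 0.5,  # illustrative threshold, not a recommended value
) -> Tuple[int, str]:
    """Answer on the device when the edge model looks confident,
    otherwise delegate the input to the cloud model.

    Returns the predicted class index and which model produced it.
    """
    edge_probs = edge_model(x)
    if predictive_entropy(edge_probs) <= max_entropy:
        return argmax(edge_probs), "edge"
    return argmax(cloud_model(x)), "cloud"
```

In practice the threshold would be tuned on held-out data, trading off the cost of a cloud call against the cost of a wrong answer on the device. Entropy of the model's own output is only one possible uncertainty signal, and a poorly calibrated model can be confidently wrong.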
Unlocking Chain-of-Thought Reasoning for Safer AI
Themis AI is now exploring how Capsa can improve chain-of-thought reasoning, a technique used by LLMs to explain their decision-making process. By identifying which logical sequences carry the highest confidence, Capsa can help models reduce errors, lower computational demand, and enhance overall user experience.
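As a rough illustration of ranking reasoning sequences by confidence, the sketch below scores several candidate chains of thought and keeps the most confident one. It is an assumption-laden example rather than a description of how Capsa works: it supposes each candidate comes with per-token log-probabilities from the language model, and it uses the average log-probability as a crude confidence score. The `Chain` class and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chain:
    """One candidate reasoning sequence and its per-token log-probabilities."""
    text: str
    token_logprobs: List[float]


def mean_logprob(chain: Chain) -> float:
    """Length-normalized log-probability, so longer chains are not
    penalized simply for containing more tokens."""
    if not chain.token_logprobs:
        return float("-inf")  # an empty chain carries no evidence
    return sum(chain.token_logprobs) / len(chain.token_logprobs)


def most_confident(chains: List[Chain]) -> Chain:
    """Return the candidate with the highest average token log-probability."""
    if not chains:
        raise ValueError("at least one candidate chain is required")
    return max(chains, key=mean_logprob)
```

Keeping only the top-scoring chain, or stopping early once one candidate clears a confidence bar, is one way such a score could reduce the number of sequences a model has to generate. Token log-probability is a weak proxy for correctness, so a production system would likely rely on a better calibrated uncertainty estimate.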
Amini emphasized the significance of this capability, calling it an “extremely high-impact opportunity” for scaling responsible AI.
Building a Future of Responsible and Transparent AI
For Professor Rus, the motivation behind Themis AI is not just technological advancement but social impact. By bringing AI reliability research from the lab into the real world, the company aims to help industries adopt safer, more transparent AI systems.
As AI continues to evolve, Themis AI’s mission to teach models what they don’t know — and fix it — could be the key to unlocking safe and meaningful progress across the AI landscape.