
In a significant step toward building an inclusive and self-reliant AI ecosystem, IIT Bombay has released 16 culturally rich AI datasets on AIKOSH, India’s official AI repository.
Unveiled in March 2025 by the Ministry of Electronics and Information Technology, AIKOSH serves as a national platform for inclusive and responsible AI development. The newly added IIT Bombay datasets were contributed through BharatGen, which aims to advance indigenous innovation responsibly while supporting startups, researchers, and developers across the country.
India-Centric Datasets for Responsible and Sovereign AI
The 16 India-focused datasets offer open access to high-quality, culturally relevant data. Highlights include a Sanskrit OCR dataset with 218,000 annotated sentences, more than 78 hours of Sanskrit speech data, table detection tools for 14 Indian languages, and a comprehensive wiki on Indian Knowledge Systems. Together, they support multilingual, visual, and spoken input analysis for Indian environments.
Fostering Innovation with Indian Values
Prof. Ganesh Ramakrishnan of the Department of Computer Science and Engineering at IIT Bombay stressed the importance of using Indian data to drive the nation’s AI development.
“Our aim isn’t just to develop AI models,” he said. “It’s about building impactful resources that empower Indian startups and system integrators. By doing so, we can nurture a robust, self-reliant AI ecosystem that truly reflects India’s diverse linguistic and cultural landscape.”
A Big Step Toward India’s AI Goals
With this release, IIT Bombay contributes significantly to the growing pool of 21 AI models and datasets currently hosted on AIKOSH. The effort aligns with India’s larger goal of becoming an AI powerhouse by making data, especially India-specific datasets, widely accessible.
The open availability of these resources is expected to spur research, startup innovation, and localized AI solutions, helping to bridge the gap between global AI technologies and regional Indian needs.