
KAYTUS, a global provider of AI and liquid cooling solutions, has launched an upgraded version of its KSManage platform. The release, called KSManage V2.3, is designed to help AI data centers manage power-hungry systems like NVIDIA’s GB200 and B200 with greater precision. These next-generation chips are central to large-scale AI model training but require advanced monitoring and cooling. The update adds intelligent automation and enhanced fault detection for smoother, safer operations.
One KAYTUS partner in Central Asia cut per-cabinet power use by 20% and device fault response time by 80% using the platform. With AI data centers under pressure to handle rising workloads without exceeding their energy budgets, KAYTUS hopes KSManage can ease that strain. “We built this for the new generation of AI hardware,” a KAYTUS engineer said. “The smarter your management, the longer your systems last.”
KSManage Offers Precision Monitoring for New AI Hardware
KSManage V2.3 builds on KAYTUS’s experience managing over 5,000 types of devices. It now includes advanced controls for NVIDIA’s high-performance chips, the GB200 and B200. These chips support fast training of AI models but demand more energy and generate more heat than earlier hardware.
To address this, KSManage offers three levels of monitoring: component, machine, and cluster. It uses smart algorithms to predict faults, issue early alerts, and automatically reroute resources during large model training. It can detect memory errors in GPUs, read system logs, and pinpoint issues with 92% accuracy.
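The three monitoring levels can be pictured as a simple roll-up, where each level reports the worst state found beneath it. The sketch below is illustrative only and is not KSManage code: the class names, the temperature and memory-error thresholds, and the "worst component wins" rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Health(IntEnum):
    OK = 0
    WARNING = 1
    CRITICAL = 2


@dataclass
class Component:
    name: str
    temperature_c: float
    ecc_errors: int  # correctable memory errors seen in the last window


@dataclass
class Machine:
    name: str
    components: list[Component] = field(default_factory=list)


# Hypothetical thresholds, chosen for illustration only.
TEMP_WARN_C = 80.0
TEMP_CRIT_C = 90.0
ECC_WARN = 10


def component_health(c: Component) -> Health:
    """Component level: classify a single GPU or other part."""
    if c.temperature_c >= TEMP_CRIT_C:
        return Health.CRITICAL
    if c.temperature_c >= TEMP_WARN_C or c.ecc_errors >= ECC_WARN:
        return Health.WARNING
    return Health.OK


def machine_health(m: Machine) -> Health:
    """Machine level: a machine is as unhealthy as its worst component."""
    return max((component_health(c) for c in m.components), default=Health.OK)


def cluster_health(machines: list[Machine]) -> Health:
    """Cluster level: a cluster is as unhealthy as its worst machine."""
    return max((machine_health(m) for m in machines), default=Health.OK)
```

A production system would likely weight components differently and track trends over time rather than single readings, but the roll-up shows why a fault on one GPU can surface as an alert at the cluster view.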
KSManage also maps data center resources in 3D, showing how workloads are spread across CPUs and GPUs. Based on that real-time data, it can shift computing power automatically. According to KAYTUS, the system’s network management tools improve internal traffic flow by up to 90%. Together, these features reduce slowdowns and boost training speeds.
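As a rough illustration of shifting work based on live load data, the sketch below places each new job on whichever GPU is currently least loaded. This is a generic greedy heuristic, not the scheduling method KSManage uses; the function name and the idea of expressing load as a fraction of capacity are assumptions for the example.

```python
def place_jobs(gpu_load: dict[str, float], jobs: list[float]) -> dict[str, float]:
    """Return the projected load per GPU after greedily placing each job.

    gpu_load maps a GPU name to its current load (fraction of capacity).
    jobs lists the extra load each new job would add.
    """
    load = dict(gpu_load)  # copy so the caller's snapshot is not modified
    # Place the largest jobs first; this usually gives a more even spread.
    for job in sorted(jobs, reverse=True):
        target = min(load, key=load.get)  # least-loaded GPU right now
        load[target] += job
    return load
```

For example, `place_jobs({"gpu0": 0.5, "gpu1": 0.25}, [0.25, 0.25])` sends the first job to `gpu1` and the second to `gpu0`, since the two GPUs are tied once the first job lands.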
AI Centers See Faster Repairs, Safer Cooling, and Lower Costs
Real-world use shows the platform’s impact. A data center in Central Asia improved operational efficiency fourfold using KSManage’s automated fault detection and smart cooling controls. Engineers reported cutting device fault response times by 80% and lowering power usage per cabinet by 20%.
Cooling, often the most costly part of data center operations, has also improved. KSManage now monitors liquid cooling systems in real time, adjusting coolant flow based on GPU power and temperature. The update boosts coolant use efficiency by 50% and trims total energy use by 10%.
The platform also reacts quickly in emergencies. If a coolant leak occurs, it shuts down flow and powers off key systems in milliseconds to avoid hardware damage.
Still, some challenges remain. Adopting advanced tools like KSManage requires trained staff and upfront investment, which smaller operations may struggle to afford. KAYTUS plans to keep expanding the platform’s capabilities. Future updates may support more AI hardware types and further reduce energy consumption.
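The cooling behavior described in this section, flow that tracks GPU power and temperature plus an immediate shutdown on a leak, can be sketched as a single control step. This is a minimal illustration under stated assumptions, not KAYTUS's control logic: every constant, the linear flow formula, and the `LoopReading` and `LoopCommand` types are hypothetical.

```python
from dataclasses import dataclass

# All constants below are illustrative assumptions, not vendor values.
MIN_FLOW_LPM = 2.0     # lowest flow the pump is allowed to run at
MAX_FLOW_LPM = 12.0    # highest flow the pump can deliver
TARGET_TEMP_C = 65.0   # GPU temperature the loop tries to hold
FLOW_PER_KW = 1.5      # baseline litres/minute per kW of GPU power
FLOW_PER_DEG_C = 0.4   # extra litres/minute per degree above target


@dataclass
class LoopReading:
    gpu_power_kw: float
    gpu_temp_c: float
    leak_detected: bool


@dataclass
class LoopCommand:
    flow_lpm: float
    power_off: bool


def control_step(r: LoopReading) -> LoopCommand:
    """Compute the next pump setting from one sensor reading."""
    # Safety first: a leak overrides everything else.
    if r.leak_detected:
        return LoopCommand(flow_lpm=0.0, power_off=True)

    # Baseline flow scales with power; add more if the GPU runs hot.
    over_target = max(0.0, r.gpu_temp_c - TARGET_TEMP_C)
    flow = FLOW_PER_KW * r.gpu_power_kw + FLOW_PER_DEG_C * over_target

    # Keep the request within what the pump can safely do.
    flow = min(MAX_FLOW_LPM, max(MIN_FLOW_LPM, flow))
    return LoopCommand(flow_lpm=flow, power_off=False)
```

Matching flow to actual heat load, rather than running pumps at a fixed rate, is the general idea behind the efficiency gains cited above. A real controller would also need sensor filtering and hardware-level interlocks to reach millisecond response times.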
Smarter Tools Signal a Shift in Data Center Management
KAYTUS’s KSManage update arrives at a crucial time for AI infrastructure. As workloads grow and chips evolve, data centers need smarter tools to keep pace. The upgrade helps operators cut energy use, detect issues early, and extend equipment life.
The platform’s support for GB200 and B200 chips positions it well as more AI labs and cloud providers shift to these high-power systems. With its ability to manage heat, optimize workloads, and react to faults, KSManage offers a glimpse of the future for AI data center operations. In an increasingly power-hungry field, tools like this could help keep performance up and costs down.