Google gives enterprises new controls to manage AI inference costs and reliability

Google has added two new service tiers to the Gemini API that enable enterprise developers to control the cost and reliability of AI inference depending on how time-sensitive a given workload is. While the cost of training large language models for artificial intelligence has been a concern in the past, the focus of attention is…


The ‘toggle-away’ efficiencies: Cutting AI costs inside the training loop

“A single training run can emit as much CO₂ as five cars do over their lifetimes.” That finding from the University of Massachusetts, Amherst, has become the defining statistic of the generative AI era. But for the engineers and data scientists staring at a terminal, the problem isn’t just carbon; it’s the cloud bill. The…


Harness teams of agentic coders with Squad

At KubeCon Europe recently, Linux kernel maintainer Greg Kroah-Hartman said something that surprised me. After more than a year of worthless AI-generated pull requests and security reports, which lived up to their nickname of “slop,” Kroah-Hartman found that in the last month or so those reports had suddenly become useful. At the time he…
