Google is rolling out a new feature in its Gemini API that it says could significantly reduce the cost of using its advanced AI models.
The feature, called implicit caching, promises savings of up to 75 per cent on repeated prompt content for developers using the Gemini 2.5 Pro and 2.5 Flash models.
Announced on Wednesday, the change is a significant update for developers grappling with the rising costs of frontier AI models. Unlike Google's earlier explicit caching system, where developers had to manually define and manage cached prompts, implicit caching works automatically, without requiring user intervention.
According to Google, when a request sent to the Gemini API shares a common beginning with a previously submitted one, the system detects the overlap and serves the repeated portion from cache rather than reprocessing it, cutting computing load and cost. The mechanism is automatic and now enabled by default.
The cost-saving mechanism is triggered when a request includes a minimum of 1,024 tokens for Gemini 2.5 Flash or 2,048 tokens for 2.5 Pro. Tokens are the basic units of data that AI models process—roughly equivalent to three-quarters of a word each.
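For developers who want to confirm that a prompt clears that threshold, the API's token-counting endpoint can help. The sketch below is illustrative rather than official guidance; it assumes the google-genai Python SDK and an API key available in the environment.

from google import genai

# Minimal sketch: check whether a prompt is long enough to qualify for
# implicit caching on Gemini 2.5 Flash (1,024-token minimum; 2,048 on
# 2.5 Pro). Assumes GEMINI_API_KEY is set in the environment.
client = genai.Client()

prompt = "A long block of reusable context followed by the actual query..."

count = client.models.count_tokens(
    model="gemini-2.5-flash",
    contents=prompt,
)

if count.total_tokens >= 1024:
    print("Prompt meets the minimum size for implicit caching.")
else:
    print(f"Only {count.total_tokens} tokens; below the caching minimum.")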
The launch follows recent developer backlash over unexpectedly high charges tied to Google’s earlier caching system. Complaints surged over the past week, prompting a public apology from the Gemini team and a pledge to improve billing transparency.
However, while the update is positioned as a win for developers, Google has not offered third-party validation of the savings, raising questions about real-world performance. The company recommends that developers place recurring context at the beginning of their prompts to increase cache hits, while more variable data should appear later.
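In practice, that guidance amounts to keeping an identical prefix across calls and appending the variable query at the end. A minimal sketch of that structure follows, again assuming the google-genai Python SDK; the variable names and example context are illustrative, and the cached token count is read back from the response's usage metadata.

from google import genai

client = genai.Client()  # reads GEMINI_API_KEY from the environment

# Stable, recurring context goes first so successive requests share the
# longest possible prefix. (Shortened here; in real use it must exceed
# the model's minimum token count for caching to apply.)
stable_context = "System instructions, product documentation, examples...\n"

for user_query in ["How do I export my data?", "Is there a rate limit?"]:
    # Variable data is appended after the shared prefix, per Google's advice.
    response = client.models.generate_content(
        model="gemini-2.5-pro",
        contents=stable_context + user_query,
    )
    # usage_metadata reports input tokens served from cache (may be None
    # when no cache hit occurred).
    print(response.usage_metadata.cached_content_token_count)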
As AI APIs become more embedded in apps and services, the pressure on companies like Google to manage cost and performance will only intensify. Developers and businesses alike will be watching closely to see if Google delivers on its latest promise.

