Articles tagged “OpenAI inference efficiency 2026”
News
OpenAI Cuts Inference Costs in Half on Key Models
OpenAI engineers quietly developed an optimization that cuts inference costs by more than half. Logged-out ChatGPT traffic now runs on just a few hundred GPUs.