Articles tagged “AI model inference efficiency”

1 article

News

OpenAI Cuts Inference Costs in Half on Key Models

OpenAI engineers quietly developed an optimization that cuts inference costs by more than half. Logged-out ChatGPT traffic now runs on just a few hundred GPUs.