News

The New Stack: “How NetEase Games cut LLM cold starts from 42 minutes to 30 seconds”

NetEase Games cut LLM cold-start times from 42 minutes to 30 seconds using the CNCF Fluid project, enabling serverless GPU inference on Kubernetes.