NetEase Games cut LLM cold-start times from 42 minutes to 30 seconds with the CNCF Fluid project, enabling serverless GPU inference on Kubernetes.