Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference

Article URL: https://cerebras.ai/blog/llama-405b-inference

Comments URL: https://news.ycombinator.com/item?id=42178761

Points: 235

Comments: 77