News

Most recently, Cerebras claimed it had achieved an inference milestone, generating 969 tokens/sec on Meta's 405-billion-parameter behemoth ... to a technique called speculative decoding.