Next-level AI engine comes top in LLM speed showdown
Responses to AI chat prompts not snappy enough? California-based generative AI company Groq has a speedy solution in its LPU Inference Engine, which recently outperformed all contenders in public benchmarks.

Groq has developed a new type of chip to overcome compute density and memory bandwidth issues and boost the processing speed of intensive computing applications such as large language models (LLMs), reducing "the amount of time per word calculated, allowing sequences of text to be generated much faster."

This Language Processing Unit is an integral part of the company's inference engine, which processes information and provides answers to end-user queries, serving up as many tokens (or words) as possible for super quick responses.
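Those benchmark results ultimately come down to tokens per second. As a minimal, illustrative sketch only, the snippet below shows one way to time that figure against a streaming chat completion, assuming Groq's OpenAI-style Python client (the `groq` package); the model name and the one-chunk-per-token counting are assumptions for illustration, not Groq's documented benchmarking method.

```python
# Rough sketch: timing token throughput of a streamed chat completion.
# Assumes Groq's OpenAI-style Python client (`pip install groq`); the model
# name and per-chunk token counting are illustrative assumptions.
import os
import time

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

prompt = "Explain why memory bandwidth limits LLM inference speed."
start = time.perf_counter()
token_count = 0

# Stream the response so each generated chunk can be counted as it arrives.
stream = client.chat.completions.create(
    model="llama3-8b-8192",  # assumed model name, for illustration only
    messages=[{"role": "user", "content": prompt}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        token_count += 1  # rough count: treat each streamed chunk as one token

elapsed = time.perf_counter() - start
print(f"~{token_count / elapsed:.0f} tokens/second over {elapsed:.2f} s")
```

Higher tokens-per-second figures translate directly into the snappier responses the article describes, since the end user sees text appear only as fast as the engine can generate it.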