OpenAI and Broadcom announce chip designed for LLM inference at scale

OpenAI, the company behind ChatGPT, Codex, and the models that power those tools, and Broadcom, an established silicon provider, have announced a new chip called Jalapeño, designed specifically for large language model inference in data centers.

The chip is intended to be deployed in large data centers. Both companies say this is just the first generation of a long-term project in which the chips will be refined over time.

Broadcom says this ASIC (application-specific integrated circuit) was designed from the ground up for LLM inference, drawing on “detailed information” from the company’s conversations with OpenAI researchers, and that the chip’s development was guided by OpenAI’s own roadmap for future models and products. The design and production of the chip took nine months.

The promise is that this chip is more specialized for the current needs of LLMs than the hardware that inference systems run on in existing data centers.

OpenAI claims that “early tests show that Jalapeño will deliver substantially better performance per watt than the current state of the art,” but notes that it has not finished measuring performance and that a “detailed technical report” will be presented in the coming months.
