Liquid AI, a startup co-founded by former MIT researchers from the Computer Science and Artificial Intelligence Laboratory (CSAIL), has introduced its first multimodal AI models, the “Liquid Foundation Models” (LFMs). These models represent a bold departure from the transformer architecture that has dominated AI development since the 2017 paper “Attention Is All You Need.”

Unlike the current wave of generative AI models built on the transformer architecture, Liquid AI aims to develop foundation models from “first principles,” taking an engineering approach akin to building engines, cars, or airplanes. The company says this fundamental shift has produced models that outperform transformer-based alternatives of similar size, such as Meta’s Llama 3.1-8B and Microsoft’s Phi-3.5-mini (3.8B).

Liquid AI’s LFMs come in three variants:

  • LFM 1.3B (smallest)
  • LFM 3B
  • LFM 40B MoE (largest, a “Mixture-of-Experts” model similar to Mistral’s Mixtral)
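Liquid AI has not published the internals of its 40B MoE variant, but the general Mixture-of-Experts idea (also used by Mistral’s Mixtral) can be sketched in a few lines: a small gating network scores a set of expert sub-networks and only the top-k experts run per input, so total parameter count can grow large while per-token compute stays small. The tiny scalar “experts” below are purely illustrative, not any real model’s design.

```python
# Illustrative Mixture-of-Experts routing sketch -- a generic example of
# the technique, NOT Liquid AI's or Mistral's actual implementation.
import math
import random


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


class TinyMoE:
    def __init__(self, num_experts=4, top_k=2, seed=0):
        rng = random.Random(seed)
        # Each "expert" is reduced to a single scalar weight for clarity.
        self.experts = [rng.uniform(-1, 1) for _ in range(num_experts)]
        # Gating weights: one score per expert.
        self.gate = [rng.uniform(-1, 1) for _ in range(num_experts)]
        self.top_k = top_k

    def forward(self, x):
        # Score every expert, but execute only the top-k of them.
        scores = softmax([g * x for g in self.gate])
        top = sorted(range(len(scores)), key=lambda i: scores[i],
                     reverse=True)[: self.top_k]
        # Renormalize the selected gates and mix the chosen experts' outputs.
        chosen = sum(scores[i] for i in top)
        return sum((scores[i] / chosen) * (self.experts[i] * x) for i in top)


moe = TinyMoE()
out = moe.forward(0.5)  # only 2 of the 4 experts contribute to this output
```

The point of the pattern is the compute/capacity split: a 40B-parameter MoE can activate only a fraction of its weights on any given token.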

The “B” in these models’ names refers to billions of parameters, with higher parameter counts generally resulting in broader capabilities across tasks. Liquid AI reports that its smallest model, the LFM 1.3B, already surpasses Meta’s Llama 3.2-1.2B and Microsoft’s Phi-1.5 on benchmarks like Massive Multitask Language Understanding (MMLU), which covers 57 subjects spanning STEM and other fields. According to the company, this is the first time a non-transformer architecture has outperformed transformer-based models at this scale.

In addition to excelling in benchmarks, Liquid AI’s LFMs are highly memory efficient. For example, the LFM-3B model requires only 16 GB of memory, compared to the 48 GB needed by Meta’s Llama-3.2-3B model. This efficiency makes the LFMs ideal for a wide range of applications, from enterprise-level tasks in financial services and biotechnology to deployment on edge devices.

Maxime Labonne, Liquid AI’s Head of Post-Training, celebrated the release of the LFMs on social media, calling them the “proudest release of my career.” Labonne emphasized that the key advantage of LFMs is their ability to deliver superior performance while consuming significantly less memory than their transformer-based counterparts.

Liquid AI’s models are designed to process various types of sequential data, including audio, video, text, time series, and signals. This multimodal capability enables them to address complex challenges across diverse industries such as biotechnology, financial services, and consumer electronics. Built on computational principles rooted in dynamical systems, signal processing, and numerical linear algebra, the LFMs can efficiently handle up to 1 million tokens while minimizing memory usage, even as token length increases.
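Liquid AI has not disclosed the LFM architecture in detail, but the memory behavior described above is characteristic of recurrent and state-space designs: the model carries a fixed-size state forward, whereas a transformer’s key/value cache grows with every past token. The back-of-envelope comparison below (with made-up but typical dimensions) illustrates why one curve stays flat and the other grows linearly; it is a sketch of the general idea, not Liquid AI’s actual design.

```python
# Why a fixed-size recurrent state keeps memory flat as context grows,
# versus a transformer's key/value cache. Dimensions are illustrative
# assumptions, not measurements of any real model.

def recurrent_state_floats(seq_len, state_dim=512):
    # A recurrent/state-space model holds one fixed-size state,
    # no matter how many tokens it has processed.
    return state_dim


def kv_cache_floats(seq_len, num_layers=32, num_heads=32, head_dim=128):
    # A transformer caches a key and a value vector per past token,
    # per layer and per head -- memory scales linearly with seq_len.
    return seq_len * num_layers * num_heads * head_dim * 2


for n in (1_000, 100_000, 1_000_000):
    print(f"{n:>9} tokens: recurrent={recurrent_state_floats(n)} floats, "
          f"kv_cache={kv_cache_floats(n):,} floats")
```

Under this kind of design, processing a 1-million-token input costs no more state memory than processing a short prompt, which is consistent with the long-context efficiency Liquid AI claims.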

The LFM-3B model, for instance, maintains a smaller memory footprint than models like Google’s Gemma-2 and Microsoft’s Phi-3, making it particularly effective for long-context processing tasks such as document analysis or chatbot applications.

Liquid AI’s models aren’t open source but are accessible through the company’s inference playground, Lambda Chat, and Perplexity AI. Liquid AI has optimized these models for deployment on various hardware platforms, including NVIDIA, AMD, Apple, Qualcomm, and Cerebras, allowing for broad compatibility across industries.

Although the models are still in a preview phase, Liquid AI invites early adopters and developers to test them and provide feedback. Labonne acknowledged that while the models are not perfect, the feedback will help the team refine the models ahead of their full launch on October 23, 2024, at MIT’s Kresge Auditorium in Cambridge, MA. The event will include technical discussions, and Liquid AI plans to release a series of blog posts detailing the technology behind its models.

As part of its commitment to transparency, Liquid AI is encouraging red-teaming efforts to identify weaknesses and improve future iterations. With the introduction of the Liquid Foundation Models, Liquid AI is positioning itself as a key player in the foundation model landscape, offering a compelling alternative to the transformer-based architectures that currently dominate the space.

By combining state-of-the-art performance with remarkable memory efficiency, Liquid AI is setting the stage for a new era of AI development.

By Impact Lab