Meta Introduces Llama 4 Scout and Maverick AI Models

Meta has unveiled the Llama 4 family of AI models, introducing Llama 4 Scout and Llama 4 Maverick. The models represent a significant step forward in open-weight generative AI, combining native multimodality with a Mixture of Experts (MoE) architecture for improved performance and efficiency.

Llama 4 Scout is a multimodal model with 17 billion active parameters and 16 experts. It is designed for efficiency, fitting on a single NVIDIA H100 GPU with Int4 quantization, and boasts an industry-leading context window of 10 million tokens. This large context window allows Scout to perform tasks such as multi-document summarization, parsing user activity for personalization, and reasoning over vast codebases. Meta claims that Llama 4 Scout outperforms models like Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across a broad range of benchmarks. Scout was pre-trained and post-trained with a 256K context length, giving it strong length generalization. A key architectural innovation, which Meta calls iRoPE, is the use of interleaved attention layers without positional embeddings, combined with inference-time temperature scaling of attention to further improve length generalization. The model is also available on Cloudflare's Workers AI.
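Meta has not published the exact scaling schedule in this announcement, but the intuition behind inference-time attention temperature scaling can be sketched in a few lines: attention logits are sharpened once the sequence grows past the trained context length, counteracting the softmax spreading itself thin over millions of positions. The schedule below is a hypothetical stand-in for illustration, not Meta's formula.

```python
import math
import torch

def attention_with_temp_scaling(q, k, v, train_ctx=256_000, beta=0.1):
    # Toy single-head attention with inference-time temperature scaling.
    # The (1 + beta * log(seq_len / train_ctx)) schedule is an assumption:
    # it simply sharpens attention scores once the context exceeds the
    # 256K length seen in training.
    seq_len, d_head = q.shape[-2], q.shape[-1]
    temp = 1.0
    if seq_len > train_ctx:
        temp += beta * math.log(seq_len / train_ctx)  # grows slowly with length
    scores = (q @ k.transpose(-2, -1)) * temp / math.sqrt(d_head)
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(8, 64)   # 8 tokens, 64-dim head: temp stays at 1.0
print(attention_with_temp_scaling(q, k, v).shape)  # torch.Size([8, 64])
```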

Llama 4 Maverick, also featuring 17 billion active parameters, uses a larger pool of 128 experts in its MoE architecture. Meta says it delivers a best-in-class performance-to-cost ratio and excels at image and text understanding across 12 languages, positioning it as the workhorse for general assistant and chat applications, with particular strengths in precise image understanding and creative writing. According to Meta's benchmarks, Maverick beats GPT-4o and Gemini 2.0 Flash across several benchmarks and achieves results comparable to DeepSeek v3 on reasoning and coding with fewer than half the active parameters. An experimental chat version of Llama 4 Maverick scores an Elo of 1417 on LMArena.

Llama 4 Scout and Llama 4 Maverick are Meta's first open-weight, natively multimodal models and the company's first built on a Mixture of Experts (MoE) architecture. In MoE models, only a fraction of the total parameters are activated for each token, making them more compute-efficient for both training and inference. Llama 4 Maverick, for example, has 17 billion active parameters out of 400 billion total, while Scout's 17 billion active parameters sit within 109 billion total. Maverick's MoE layers use 128 routed experts and a shared expert; every token is sent to the shared expert and exactly one routed expert, as the sketch below illustrates.
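Here is a minimal PyTorch sketch of such a layer. The dimensions, activation, and top-1 routing details are illustrative assumptions; the real layers are far larger and interleaved with dense layers.

```python
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    # Minimal MoE feed-forward layer with a shared expert: every token goes
    # through the shared expert, and a router sends it to exactly one routed
    # expert (top-1). Sizes are toy values; Maverick uses 128 routed experts.
    def __init__(self, d_model=256, d_ff=512, n_experts=8):
        super().__init__()
        ffn = lambda: nn.Sequential(
            nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model)
        )
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(ffn() for _ in range(n_experts))
        self.shared = ffn()

    def forward(self, x):                                # x: (n_tokens, d_model)
        weight, idx = self.router(x).softmax(-1).max(-1)  # top-1 routing per token
        out = self.shared(x)                             # shared expert sees all tokens
        for e in idx.unique():                           # each routed expert only
            m = idx == e                                 # processes its own tokens
            out[m] = out[m] + weight[m, None] * self.experts[e](x[m])
        return out

layer = MoELayer()
print(layer(torch.randn(10, 256)).shape)  # torch.Size([10, 256])
```

Because each token activates only the shared expert and one routed expert, per-token compute tracks the 17 billion active parameters rather than the 400 billion total.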

Meta has also previewed Llama 4 Behemoth, a larger model with 288 billion active parameters (nearly two trillion in total) that is still in training. The company claims that Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks such as MATH-500 and GPQA Diamond. Behemoth is intended to serve as a teacher model for distilling the smaller Llama 4 models. CEO Mark Zuckerberg has also said a Llama 4 Reasoning model is coming in the next month.

The Llama 4 models are available on various platforms, including Meta AI (in WhatsApp, Messenger, and Instagram Direct), the Llama website, and Hugging Face. Amazon Web Services (AWS) has announced the availability of Llama 4 Scout and Llama 4 Maverick on Amazon SageMaker JumpStart, with fully managed, serverless availability in Amazon Bedrock coming soon. NVIDIA has optimized both models for TensorRT-LLM and will package them as NIM microservices for easy deployment on any GPU-accelerated infrastructure.
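For developers pulling the weights from Hugging Face, usage follows the standard transformers flow. A minimal sketch, assuming the gated meta-llama/Llama-4-Scout-17B-16E-Instruct repository, an accepted Llama 4 license, a transformers release recent enough to include Llama 4 support, and sufficient GPU memory:

```python
from transformers import pipeline

# Gated repo: requires accepting Meta's Llama 4 license on Hugging Face.
chat = pipeline(
    "text-generation",
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    device_map="auto",
    torch_dtype="bfloat16",
)

messages = [{"role": "user", "content": "Summarize the Llama 4 launch in one sentence."}]
result = chat(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])  # assistant reply
```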

The models were trained on a diverse mix of text, image, and video data, using techniques such as MetaP, Meta's method for reliably setting per-layer hyperparameters like learning rates and initialization scales, and FP8 precision to boost quality and training efficiency. Pre-training spanned more than 200 languages.
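FP8 halves the memory and bandwidth of 16-bit weights at the cost of a much narrower dynamic range, which is why values are scaled before quantization. A toy PyTorch round trip (the per-tensor scale and e4m3 format are illustrative choices; production FP8 training, e.g. via NVIDIA Transformer Engine, is considerably more involved):

```python
import torch

def fp8_roundtrip(x: torch.Tensor) -> torch.Tensor:
    # Quantize to FP8 (e4m3) with a per-tensor scale, then dequantize back
    # to bfloat16 for compute. 448 is the largest representable e4m3 value,
    # so scaling keeps the tensor inside FP8's narrow dynamic range.
    scale = x.abs().max().clamp(min=1e-12) / 448.0
    x_fp8 = (x / scale).to(torch.float8_e4m3fn)   # 1 byte per value
    return x_fp8.to(torch.bfloat16) * scale

w = torch.randn(1024, 1024)
err = (w - fp8_roundtrip(w).float()).abs().max()
print(f"max quantization error: {err:.4f}")
```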


Rohan Sharma is a seasoned tech news writer with a knack for identifying and analyzing emerging technologies. He possesses a unique ability to distill complex technical information into concise and engaging narratives, making him a highly sought-after contributor in the tech journalism landscape.
