Google is reportedly planning to significantly reduce its reliance on Scale AI, its primary data-labeling provider, following Meta's acquisition of a 49% stake in the company. This move signals a strategic shift in how Google approaches data services, prioritizing data security and competitive advantage in the rapidly evolving AI landscape.
The decision stems from concerns that Meta's investment, which values Scale AI at $29 billion (up from $14 billion), could expose Google's proprietary information and AI development strategies to a direct competitor. Scale AI supplies large volumes of human-labeled training data used to train and refine AI models such as Gemini, Google's competitor to ChatGPT. In 2024, Google spent around $150 million on Scale AI's services, roughly 17% of Scale AI's $870 million revenue, and it had budgeted approximately $200 million for 2025, which underscores the significance of the partnership.
The Meta deal has triggered a wider industry response, with other major AI firms, including Microsoft, OpenAI, and xAI, also re-evaluating their relationships with Scale AI. These companies share similar concerns about data security and the potential for sensitive information to be accessed by Meta. OpenAI, while still working with Scale AI, has reportedly reduced its reliance on the company.
The situation presents AI companies with a build-versus-buy dilemma for data operations: outsource data labeling to a neutral third party, or bring the capability in-house. Google had already been working to diversify its data service providers for over a year before the Meta deal, indicating earlier concerns about dependency on a single vendor. The move away from Scale AI is accelerating as AI labs seek alternatives that won't expose their research priorities to competitors. The CEO of Labelbox told Reuters he expects to generate "hundreds of millions of new revenue" from customers leaving Scale AI, while Handshake, another competitor, saw its demand from top AI labs triple overnight.
Scale AI's core service involves human annotators, including experts with advanced degrees, labeling complex datasets to refine AI models. These annotations, which can cost up to $100 each, are vital for generative AI developers. The company also serves self-driving car companies and the U.S. government, which are expected to remain clients.
Meta's stake in Scale AI marks a pivotal shift in how the industry views data labeling, transforming it from a commodity service into a strategic asset. The jump in Scale AI's valuation from $14 billion to $29 billion, slightly more than double, reflects the escalating value of specialized data annotation in the AI race. Specialized annotations now command a premium because of their direct impact on model performance.
Google is now exploring alternative data service providers and may develop in-house capabilities to replace Scale AI's services. The shift requires Google to choose partners that keep its data under its own control and that are not tied to a competitor. Its next moves will indicate how much weight AI developers place on data autonomy and interoperability as protection against acquisitions by competitors. The upheaval also creates a large opportunity for Scale AI's smaller, independent rivals.