OpenAI's GPT-5.2, released in early December 2025, is a significant advance in AI capabilities, particularly in reasoning and coding. However, recent reports have drawn attention to a potentially controversial aspect of its training: the incorporation of data from Grokipedia, an AI-generated online encyclopedia launched by Elon Musk's xAI in late October 2025. This has sparked debate about the quality, bias, and reliability of the information GPT-5.2 uses.
GPT-5.2 was launched as an upgrade to the GPT-5 series, introducing variants such as GPT-5.2 Instant and GPT-5.2 Thinking, with improvements in general intelligence, long-context understanding, agentic tool-calling, and vision capabilities. A further variant, GPT-5.2 Pro, is designed for professional knowledge work, with stronger reasoning and benchmark performance. The model initially rolled out to paid ChatGPT users and was also made available via the API.
Grokipedia was created as an alternative to Wikipedia, aiming to "purge out the propaganda" that Musk believes Wikipedia promotes. Unlike Wikipedia, Grokipedia relies primarily on AI to generate its content, with human intervention limited to suggestions. This approach has raised concerns about the encyclopedia's accuracy and potential biases, with some critics noting the promotion of right-wing perspectives and debunked conspiracy theories.
The integration of Grokipedia into GPT-5.2's knowledge sources has led to instances where the AI model cites the AI-generated encyclopedia when answering user prompts. For example, ChatGPT, powered by GPT-5.2, has been found to cite Grokipedia on topics related to Iran and the Holocaust. This has raised concerns about the potential for GPT-5.2 to disseminate misinformation or biased information.
One major concern is the potential for "LLM grooming," in which AI models are manipulated into sharing disinformation through the strategic flooding of specific content. The worry is that if an AI measures truth by frequency and semantic relevance, a large, rapidly updating encyclopedia like Grokipedia could disproportionately influence the AI's output. Some experts have also cautioned against training AI on AI-generated data, arguing that it could degrade quality and lead to "model collapse."
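The frequency concern can be illustrated with a toy sketch. The Python below is purely hypothetical: the source names, the claims "A" and "B", and both voting functions are assumptions made for illustration, and nothing here describes how GPT-5.2 or ChatGPT's web search actually weighs sources. It only shows that a naive count of supporting documents can be flipped by one high-volume publisher, while counting each source once is less sensitive to flooding.

```python
from collections import Counter


def frequency_vote(docs):
    """Pick the claim asserted by the most documents (naive frequency heuristic)."""
    counts = Counter(claim for _source, claim in docs)
    return counts.most_common(1)[0][0]


def per_source_vote(docs):
    """Pick the claim asserted by the most distinct sources (one vote per source)."""
    distinct = {(source, claim) for source, claim in docs}
    counts = Counter(claim for _source, claim in distinct)
    return counts.most_common(1)[0][0]


# Hypothetical corpus: three independent sources each publish one page for claim "A".
docs = [("source_1", "A"), ("source_2", "A"), ("source_3", "A")]

# One high-volume source (hypothetical name) publishes ten pages for claim "B".
flooded = docs + [("flooder", "B")] * 10

baseline = frequency_vote(docs)        # only claim "A" is present
after_flood = frequency_vote(flooded)  # ten pages for "B" outnumber three for "A"
deduped = per_source_vote(flooded)     # three sources for "A" versus one for "B"

print(baseline, after_flood, deduped)
```

In this toy setting the raw document count switches from "A" to "B" once the flooding source is added, whereas the per-source count stays with "A". Real retrieval and ranking systems are far more complex, so this is an intuition aid rather than a model of any production system.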
OpenAI has stated that ChatGPT's web search aims to draw from a broad range of publicly available sources and viewpoints, and that it applies safety filters to reduce the risk of surfacing links associated with high-severity harms. However, the incident has prompted discussion about the need for more rigorous evaluation and filtering of the sources and training data such models rely on, to prevent the unintentional amplification of biased or inaccurate information.
The use of Grokipedia as a source also raises broader questions about AI training data. AI models rely on vast amounts of data to learn and make predictions, and the quality and diversity of that data are crucial to a model's accuracy and reliability. Training data can come from various sources, including internal data, third-party vendors, and open datasets. However, using unlicensed or unauthorized data carries significant risks, including copyright infringement and potential litigation.
As AI models like GPT-5.2 become increasingly integrated into various aspects of society, understanding the sources and potential biases in their training data is essential. The controversy surrounding Grokipedia's role in GPT-5.2's training highlights the ongoing challenges and ethical considerations in the development of advanced AI systems.