Yarden Porat described a serious vulnerability in the Lang. Chain library[1] (langchain-core module), identified as CVE-2025-68664 with a CVSS score of 9.3. The flaw involves unsafe deserialization: attackers can inject trusted objects via the `lc` key in dictionaries[2], enabling theft of secrets, API tokens, and cloud credentials, and in some configurations leading to remote code execution. Research and reports from Cyata as well as community alerts indicate an urgent need to patch[3] the libraries and update to versions 0.3.81 or 1.2.5.
Critical LangGrinch Vulnerability
OpenAI released the GPT-5.2 model on Dec 11, 2025[5], available in three variants: Instant, Thinking, and Pro. According to an internal memo covered by the industry under the name “Code Red,” this launch was a response to November’s competitor advances[4]—notably the announcement of Gemini 3—and to the words of Sam Altman, who declared, “we will emerge from Code Red by January”[4]. GPT-5.2 targets professional applications; the company made it available to ChatGPT Pro users and via API without disclosing detailed pricing.
Model Race and Benchmarks
Anthropic announced in November 2025 the release of Claude Opus 4.5[7], promoted as an optimal model for agents and programming. In practice, the LLM market became more diverse: xAI offers Grok 4 and Grok 4 Heavy under paid plans ($30/month and $300/month), while Deep. Seek is developing V3.1 with hybrid reasoning[16]. This fragmentation shifts enterprise criteria: it is no longer enough to be the “best overall,” fitting tasks takes precedence.
Agent evaluation challenges intensify. Researchers published by Clément Schneider and D. Kang’s team noted[14] many agent benchmarks suffer from methodological flaws and shortcuts in evaluation. Concurrently, Carnegie Mellon University released analysis revealing[15] agents fail about 70% of end-to-end scenarios; humans fail in 30%, and hybrid human+AI systems fail in approximately 15%. These data undermine trust in benchmarks and increase the value of governance tools.
Genesis Mission and Its Implications
Google Deep. Mind announced on Dec 17, 2025 a partnership with the U. S. Department of Energy[11], as part of the Genesis Mission program: all 17 national DoE laboratories gained access to scientific agents[10] based on Gemini and to the Gemini for Government platform. The White House administration views this project as a way to shorten research cycles from years to days; planned modules include Alpha. Evolve (coding), Alpha. Genome (DNA), and Weather. Next (weather forecasting). This government rollout of frontier AI is unprecedented and may accelerate research but raises questions about oversight and risk.
Market and regulatory impacts are clear. On Nov 19, 2025, the European Commission postponed obligations for high-risk systems[20], easing anonymization requirements and simplifying cookie rules to enhance EU competitiveness. Technical standards are emerging: Linux Foundation established the Agentic AI Foundation[6] and promotes Anthropic Model Context Protocol (MCP) as a means to interoperability among agents. Meanwhile, Service. Now acquired Veza (deal valued at $2.7B)[22], and security firms like Rubrik report increased demand for AI auditing tools. The takeaway is evident: 2026 will be a year of selection—systems with strong governance, auditing, and security will survive.
Related posts:
- Breakthrough Week in AI: China’s Offensive, Agent Platforms, and Security Crises
- OpenAI Breaks Funding Record, U.S. Clashes with Anthropic, and AI Agents Enter ‘Heavy Industry’
- The Era of AI Agents Accelerates: Chinese Models, GLM-5, EU AI Grid, and Initial Security Incidents
- Weekly AI Review: Gemini 3 Pro Dominance, $20B Megafunding for xAI, and New California Regulations
