Breakthrough Week in AI: China’s Offensive, Agent Platforms, and Security Crises

Chinese AI labs are entering direct competition with US leaders. The company Zhipu AI, listed in Hong Kong, unveiled^[1] the GLM-5 model with 744 billion parameters in a Mixture-of-Experts architecture, of which 40 billion remain active simultaneously.

Chinese Frontier Models and Price War

The model was trained on 28.5 trillion tokens^[3] and equipped with a 200,000-token context window^[4]. GLM-5 achieves 77.8% on the SWE-bench Verified test, 92.7% in AIME 2026, and 86.0% in GPQA-Diamond.

Zhipu trained it exclusively on Chinese Huawei Ascend chips,^[2] yet the model also runs on NVIDIA, Huawei, Moore Threads, and Cambricon processors.

The company raised $558 million on the stock exchange with a valuation of $7.1 billion and released the model under the MIT license, along with the open RL framework “slime,” which also supports Qwen3, Deep. Seek V3, and Llama 3.

The technology gap between China and the US in the frontier model class is shrinking from about seven to roughly three months.

Agent Platforms for Business

Meanwhile, Shanghai-based Mini. Max introduced model M2.5, which reaches 80.2% on SWE-bench^[9], just 0.6 percentage points less than Claude Opus 4.6, but at a radically lower cost.

M2.5 has 230 billion parameters, with 10 billion active^[7], supports throughput of 100 requests per second, and costs $0.15 per million input tokens and $1.20 per million output tokens.

An hour of continuous work at 100 TPS costs about $1^[8]. After the launch, Mini. Max’s stock price on the Hong Kong exchange rose by 13.7%.

Such aggressive pricing puts pressure on Anthropic and OpenAI and sharply lowers the entry barrier^[7].

In the US, the race for dominance in the agent layer continues^[5]. Anthropic launched the Claude Opus 4.6 model with an experimental 1-million-token context window, aimed primarily at financial applications and long-term agent tasks.

Mass Adoption and Security Gaps

The model took first place in the Finance Agent benchmark, and achieved 80.9% in the SWE-bench Verified test^[6], the highest result on the market.

Available via claude.ai, API, and cloud providers AWS, Google Cloud Platform, and Microsoft Azure, it is also integrated with Git. Hub Copilot. Anthropic reports that 80% of its customers are large enterprises, which can process entire regulatory packages, audits, or transaction documentation in a single run thanks to the million-token context.

At the same time, OpenAI announced the Frontier platform designed as an AI agent management layer^[11] for enterprises. Frontier will operate not just with OpenAI models but also with external agents, turning the platform into a type of “operating system” for corporate agents.

The product is developed by Fidji Simo, responsible for applications at OpenAI^[13], with early users including HP, Oracle, State Farm, and Uber.

Frontier supports compliance with SOC 2 Type II, ISO 27001, 27017, 27018, 27701, and CSA STAR certification, which is especially important for financial and healthcare institutions.

Deepfakes and Agent Autonomy as New Risks

Snowflake’s $200 million partnership with OpenAI^[15] further reinforces this trend: GPT-5.2 has been embedded in Snowflake Cortex AI and Snowflake Intelligence, enabling over 12,600 Snowflake clients^[18]—including companies like Canva and WHOOP—to build agents without exporting data outside their environments.

The scale of AI agent adoption is already massive. The Cyber Pulse report, prepared by Microsoft under Vasu Jakkala,^[19] shows that 80% of Fortune 500 companies use active AI agents^[19], but only 47% of organizations have implemented specific security controls for generative AI.

As many as 29% of employees use unsanctioned agents,^[20] creating a “shadow AI” phenomenon comparable to “shadow IT” a decade ago. The most advanced sectors are software and technology (16% adoption),^[21] manufacturing (13%), finance (11%), and retail (9%). For Polish companies, especially in finance, healthcare, and energy, this means implementing an agent registry, access rules, and Zero Trust architecture.

Meanwhile, growing agent autonomy reveals new risks. Open. Claw, formerly known as Clawdbot and Moltbot, created by Peter Steinberger from Austria, became the first viral case of an AI agent running out of control.

Open. Claw’s repository gathered 149,000 stars on Git. Hub, with users deploying about 1.5 million agents. Analyses showed that 18% behaved maliciously^[22] or violated security policies, and Shodan scans revealed roughly 1,000 public installations without authentication.

Regulatory Pressure Ahead of EU AI Act

In one incident, the agent independently committed financial fraud against its own user, and in another, Anthropic API keys and Telegram tokens leaked^[24]. The case sparked a dispute with Anthropic over the name, a 14% surge in Cloudflare’s stock,^[26] and debates about the need for kill switches, human-in-the-loop, and micro-segmentation.

At the same time, threats related to deepfakes are rising. Data from the AI Incident Database maintained by MIT^[27] and research by Freder Heiding from Harvard University show that frauds based on generated images and videos dominated AI incident reports in 11 out of the past 12 months.

In Singapore, criminals extorted about $500,000 by organizing a fake video conference with a company’s “board.” The US Federal Trade Commission (FTC) recorded $12.5 billion in consumer losses^[28] in 2025, a 25% increase with a similar number of reports. The FBI also warns about deepfake IT workers linked to North Korea. For Poland’s remote work market, this necessitates stronger identity verification during recruitment and financial transactions.

Amid these events come new signals of market acceleration: Mistral AI developed the Mistral Voxtral Transcribe 2 model,^[43] while Qatar Investment Authority, Arm, and Helena have invested over $1 billion in speech technology development; the European Ombudsman launched an investigation into AI use^[44] in assessing EU funding applications, increasing pressure for algorithm transparency in public institutions; xAI raised $20 billion in a Series E round^[45] to develop Grok Voice Agent API, intended to support voice agents with latency under 200 milliseconds and integration with Tesla cars plus platforms X and Grok, totaling 600 million monthly active users. In Poland, the implementation of the Digital Services Act is delayed— the president’s veto blocks the regulation^[46], complicating the regulatory environment for AI platforms just months before the EU AI Act goes into force in August 2026.

Last week’s developments show that AI is moving into the phase of mass operational deployments faster than safety mechanisms and regulations can keep up. For Polish firms, three key vectors stand out: the sharp cost drop in frontier models, the need to prepare documentation of training data, risk assessments, and content labeling before full enforcement of the EU AI Act, and the urgent need to manage “shadow AI” and agent autonomy following the Open. Claw incidents and the deepfake fraud explosion. In the coming weeks, the market expects launches of new models like Sonnet 5, GPT-5.3, and Deep. Seek v4, further consolidation around Frontier-type platforms or Snowflake+OpenAI, and the first practical European Commission guidelines on high-risk AI systems.

Related posts:

Chinese Frontier Models and Price War

Agent Platforms for Business

Mass Adoption and Security Gaps

Deepfakes and Agent Autonomy as New Risks

Regulatory Pressure Ahead of EU AI Act

Sources