How China overtook the US in AI token usage—and why it matters
By Yang Wei
Tokens are the small units of text that AI models read and process: the building blocks of every request, question, or command we give to AI. In 2026, these tokens have become the invisible infrastructure of the AI economy. Whether the system is an autonomous agent like OpenClaw or a large language model (LLM) like ChatGPT, tokens are what AI ultimately runs on. For developers, they’re not just a pricing unit but a proxy for capability: how well a model understands, remembers, and performs.
According to the latest data from OpenRouter, the world’s largest AI model API aggregation platform, Chinese large models surpassed their US counterparts in weekly token usage during the first two weeks of March. Chinese models also occupied the top three spots globally in total usage. This marks a significant moment in the global development and distribution of AI adoption.
Amid the widespread discussion this milestone has generated, CEIBS Associate Professor of Strategy Yang Wei identifies four major trends in China’s AI development through the prism of token consumption.
At the China Development Forum in March, Liu Liehong, head of China’s National Data Administration, announced that China’s daily token consumption had surpassed 140 trillion—more than a thousandfold increase from early 2024. A little over three years after ChatGPT sparked the AI boom, we may be witnessing the emergence of a new economic metric.
“Tokenomics”, a concept originally used in the Web3 era to describe incentive mechanisms in decentralised networks, is now becoming a carrier of value measurement in the AI era.
But what does this number reveal about China’s AI trajectory?
Chinese models are gaining global ground
Before focusing on the 140 trillion figure, it’s worth looking at how Chinese models are performing globally. Data from OpenRouter shows that from March 30 to April 5, global token usage reached 27 trillion, with Chinese models accounting for 12.96 trillion, up 31.48% week-on-week. US models, meanwhile, generated 3.03 trillion. This marks the fifth consecutive week that Chinese models have surpassed their US counterparts.
Given that OpenRouter aggregates usage from over five million developers across more than 400 models worldwide, this isn’t a local phenomenon. Chinese models are competing, perhaps even winning, on the global stage.
The driving force behind this comes down to that most basic of economic factors: price. Chinese models are highly cost-competitive. DeepSeek’s V3.2, for example, charges just $0.42 per million output tokens, compared with $75 for Anthropic’s Claude Opus—making the Chinese model more than 170 times cheaper per output token.
Market sensitivity to pricing applies equally to AI model usage. OpenRouter COO Chris Clark noted that Chinese models are widely used because they are “disproportionately represented in agent workflows run by US companies.” In short, this isn’t just about technology leadership. It’s about cost-performance—and that’s where Chinese models are carving out an edge.
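The gap compounds quickly at scale. The sketch below uses the per-million-token output prices quoted above; the monthly token volume and the helper function are hypothetical, for illustration only, and prices change frequently.

```python
# Back-of-envelope comparison of output-token costs at the prices quoted
# in the article (a snapshot, not current list prices).
PRICES_USD_PER_MILLION_OUTPUT_TOKENS = {
    "deepseek-v3.2": 0.42,
    "claude-opus": 75.00,
}

def monthly_cost(model: str, output_tokens_per_month: int) -> float:
    """Estimated monthly spend on output tokens for a given model."""
    price = PRICES_USD_PER_MILLION_OUTPUT_TOKENS[model]
    return output_tokens_per_month / 1_000_000 * price

# A hypothetical agent workflow emitting 2 billion output tokens a month:
tokens = 2_000_000_000
cheap = monthly_cost("deepseek-v3.2", tokens)    # $840
frontier = monthly_cost("claude-opus", tokens)   # $150,000
print(f"DeepSeek: ${cheap:,.0f}  Claude Opus: ${frontier:,.0f}  "
      f"ratio: {frontier / cheap:.0f}x")
```

At this volume the difference is roughly $840 versus $150,000 a month—the kind of spread that makes agent-workflow operators highly price-sensitive.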
The user profile of Chinese models
The more interesting question is not how many tokens are being used, but in what way they’re being used. A joint study by OpenRouter and American venture capital firm Andreessen Horowitz (a16z) shows that programming-related tasks have risen from 11% of total usage in early 2024 to over 50% today, with agent-driven automated workflows generating more than half of all output tokens.
This suggests that heavy users of AI aren’t ordinary end consumers, but experienced developers and industry experts embedding AI into routine workflows.
This aligns closely with areas where many Chinese models excel. Compared with frontier models, the main limitations of Chinese models lie in contextual understanding, complex reasoning, and output quality. However, they perform well in structured, repetitive tasks of moderate complexity, where limitations can be mitigated through precise prompt engineering, context management, and human oversight. In other words, with clear instructions and well-defined workflows, these models can still deliver high-quality outcomes—and the users capable of providing that clarity are typically skilled professionals.
This leads to an important implication: AI adoption today is not levelling the playing field, but amplifying existing advantages. High adoption is currently driven more by productivity gains among highly capable individuals than by the empowerment of less experienced workers. The early gains from AI are going to those who know how to use it: AI has not yet replaced entry-level workers, but it is making top performers significantly more powerful. In this sense, AI may be accelerating inequality in the workplace.
A boom in applications, and a quiet enterprise shift
So where are China’s 140 trillion tokens actually coming from? OpenRouter processes about three trillion tokens daily, with Chinese models contributing roughly one trillion. Since many users of these models are overseas developers, the domestic share is even smaller: just a fraction of the total 140 trillion. Chinese cloud platforms (Baidu AI Cloud, Alibaba Tongyi, ByteDance Doubao, Tencent Hunyuan) account for another segment, but even combined, they fall far short of the total.
In comparison, Google’s Gemini processed about 14 trillion tokens daily via APIs in Q4 2025; OpenAI handled about 8.6 trillion tokens daily in October 2025; and Microsoft processed over 500 trillion tokens in the first half of 2025. Clearly, publicly trackable API usage alone cannot explain the total of 140 trillion.
The missing piece became clear in April: ByteDance revealed that its Doubao model alone consumes over 120 trillion tokens per day, accounting for the majority of China’s total.
The main driver? AI-generated video. With the rise of multimodal models, video generation and agent-based workflows have become extraordinarily token-intensive. A single AI-generated video can consume tens of millions of tokens, orders of magnitude more than a text interaction. When such use cases are combined with platforms that have massive distribution capabilities like ByteDance, token consumption growth becomes exponential.
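A back-of-envelope sketch conveys the scale, using the figures above (Doubao’s roughly 120 trillion tokens per day, and “tens of millions” of tokens per generated video). The per-video and per-chat token counts below are illustrative assumptions, not measured values.

```python
# Rough scale of video-driven token consumption, from the article's figures.
DOUBAO_DAILY_TOKENS = 120e12   # ~120 trillion tokens/day (ByteDance disclosure)
TOKENS_PER_VIDEO = 30e6        # assumed: midpoint of "tens of millions"
TOKENS_PER_TEXT_CHAT = 2_000   # assumed: a typical text exchange

# If video generation alone accounted for Doubao's daily total:
videos_per_day = DOUBAO_DAILY_TOKENS / TOKENS_PER_VIDEO
chats_per_video = TOKENS_PER_VIDEO / TOKENS_PER_TEXT_CHAT

print(f"~{videos_per_day:,.0f} videos/day at that rate")
print(f"one video ≈ {chats_per_video:,.0f} text chats")
```

Under these assumptions, a single video consumes as many tokens as roughly 15,000 text conversations, and a few million videos a day would account for the entire disclosed total—a sense of why one use case on one platform can dominate a national metric.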
This challenges the common narrative of AI being evenly adopted across industries. Instead, what we see is a highly concentrated pattern: one dominant player, one breakout use case, and one rapidly scaling content category.
But that doesn’t make it any less significant. ByteDance’s rise among the global top players in token consumption demonstrates both the scalability of China’s AI infrastructure and the explosive potential of its application layer.
In fact, it highlights a distinctly Chinese path toward AI scale: rather than a linear progression from lab to industry, consumer applications—especially content—drive adoption first, which in turn pushes infrastructure and model capabilities forward.
Whether growth driven by a single content category can be sustained remains uncertain. Is AI-generated content the beginning of a new industry, or simply a new form of repetitive content production? Time will tell.
Beyond video-driven growth, another transformation is the rapid rise of enterprise-level private deployment.
Across industries, companies in China are deploying AI internally at a remarkable speed—often on private infrastructure. From insurance and manufacturing to government systems and smartphones, AI is being embedded into core operations. A combination of open-source licenses, model distillation lowering hardware barriers, and domestic chips like Huawei Ascend improving compatibility is leading enterprise-level AI deployment to surge.
These deployments don’t show up in public rankings of token usage, yet they may hold far greater strategic value. Companies are training models on their own data, workflows, and expertise, building capabilities that are difficult to replicate. The real story of AI adoption may not be visible in usage charts, but within organisations.
When tokens become a KPI
At a macro level, China’s use of 140 trillion tokens is the result of countless micro-decisions related to how companies allocate attention, resources, and effort. The fact that the figure of 140 trillion has drawn so much attention precisely reflects that token consumption is increasingly becoming a key KPI for measuring organisational AI transformation, business intelligence, and employee productivity.
At NVIDIA’s GTC (GPU Technology Conference) 2026, CEO Jensen Huang suggested that engineers should have annual token budgets equivalent to 50% of their salaries, or around $250,000 worth of tokens for a $500,000-a-year engineer. He even quipped that if such an engineer used only $5,000 in tokens annually, he would be anything but pleased.
The message is clear: token budgets may now be joining salary, equity, and bonuses as part of how companies evaluate talent.
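Huang’s arithmetic is simple to sketch. The helper functions below are illustrative only, assuming a budget set at 50% of salary as he proposed; they are not a real HR metric or API.

```python
# Sketch of the "token budget as KPI" idea from Huang's GTC remarks.
def token_budget(salary_usd: float, budget_ratio: float = 0.5) -> float:
    """Annual token budget as a fraction of salary (assumed 50%)."""
    return salary_usd * budget_ratio

def utilisation(actual_spend_usd: float, salary_usd: float) -> float:
    """Share of the token budget actually consumed."""
    return actual_spend_usd / token_budget(salary_usd)

# Huang's example: a $500,000 engineer spending only $5,000 on tokens
# would be using just 2% of a $250,000 budget.
print(f"budget: ${token_budget(500_000):,.0f}")
print(f"utilisation: {utilisation(5_000, 500_000):.0%}")
```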
However, token-based metrics come with risks. More tokens used do not necessarily translate into innovation or effective AI transformation. In some cases, high usage reflects inefficiency from over-complex workflows or unnecessary model calls.
In extreme cases, AI negotiation bots on e-commerce platforms interact with each other, consuming large volumes of tokens without creating real business outcomes.
This reflects Goodhart’s Law in economics: when a measure becomes a target, it ceases to be a good measure. Token consumption is a valuable technical metric but, when treated as a proxy for AI capability, it can incentivise inefficient usage rather than effective application.
Ultimately, the true value of AI lies in making the execution of strategic intent and business processes more efficient and controllable.
In an era of exponential technological growth, organisations need to clearly define what they want, and communicate it effectively to AI.
In the end, what matters most is not how many tokens are used, but how effectively they are used. The real competitive advantage lies in using the fewest tokens to solve the most critical business problems.
Conclusion
China’s surge to 140 trillion daily tokens—and its overtaking of the United States in usage—signals a shift in how AI leadership is defined: not solely by model sophistication, but by the ability to scale, deploy, and embed AI into real-world workflows at speed. This lead has been driven by cost efficiency, developer-led adoption, and breakthrough application scenarios such as AI-generated video, alongside a quieter but more strategic wave of enterprise deployment. Yet the significance of this moment lies beyond the numbers. Token consumption is an imperfect proxy—one that captures intensity of use, but not necessarily quality or impact. As tokens increasingly become a metric of progress, the real competitive advantage will belong not to those who consume the most, but to those who translate usage into durable capabilities, productivity gains, and defensible value.
───────────────────────────────────────
References:
[1] Liu Liehong, head of the National Data Administration, speech at the China Development Forum, March 23, 2026. People's Daily / National Data Administration website: https://www.nda.gov.cn/sjj/swdt/mtsy/0325/20260325113219152129329_pc.html
[2] Xinhuanet / People's Posts and Telecommunications News, March 26, 2026, "Daily Token Consumption Reaches 140 Trillion", citing OpenRouter data. http://www.news.cn/tech/20260326/8026bd54df3f489a8c79d4438cac96b3/c.html
[3] CGTN, March 23, 2026, "China's AI models top US with 4.69 trillion weekly tokens." https://news.cgtn.com/news/2026-03-23/China-s-AI-models-top-US-with-4-69-trillion-weekly-tokens-1LKFFIvDooo/p.html
[4] OpenRouter platform overview and scale data, VoxFor report. https://www.voxfor.com/openrouter-guide-universal-ai-gateway/
[5] DeepSeek official API pricing page: DeepSeek V3.2 (deepseek-chat), $0.42 per million output tokens. https://api-docs.deepseek.com/quick_start/pricing
[6] Anthropic Claude Opus 4.6 pricing: $15 per million input tokens, $75 per million output tokens. Cited in CloudZero pricing analysis. https://www.cloudzero.com/blog/deepseek-pricing/
[7] Dataconomy, February 25, 2026, "Chinese AI Models Hit 61% Market Share On OpenRouter." Chris Clark quoted. https://dataconomy.com/2026/02/25/chinese-ai-models-hit-61-market-share-on-openrouter/
[8] OpenRouter & a16z, "State of AI: An Empirical 100 Trillion Token Study", December 2025. https://a16z.com/state-of-ai/ ; raw data: https://openrouter.ai/state-of-ai
[9] Alphabet Q4 2025 earnings, remarks by CEO Sundar Pichai, February 4, 2026. https://blog.google/company-news/inside-google/message-ceo/alphabet-earnings-q4-2025/
[10] a16z / OpenRouter State of AI report, citing OpenAI data. https://a16z.com/state-of-ai/
[11] Alger "On the Money" research note, citing Microsoft's July 2025 earnings call. https://www.alger.com/Pages/OnTheMoney.aspx?pageLabel=AOM-Mapping-AI-Momentum
[12] 21jingji.com, March 24, 2025, "Everyone Is Integrating DeepSeek—Is a Self-Developed Large Model Still Necessary?" https://www.21jingji.com/article/20250324/herald/2d3569a1290f6effe55501f9d1ea4f4a.html
[13] CSDN industry report, "DeepSeek Large-Model All-in-One Appliance: A Complete Guide to Local Private Deployment." https://blog.csdn.net/YoungOne2333/article/details/149832978
[14] Tom's Hardware, March 2026, quoting Jensen Huang's remarks on the All-In Podcast at GTC 2026. https://www.cnbc.com/2026/03/20/nvidia-ai-agents-tokens-human-workers-engineer-jobs-unemployment-jensen-huang.html
[15] CNBC, March 20, 2026, "Nvidia's Huang pitches AI tokens on top of salary as agents reshape how humans work." https://www.cnbc.com/2026/03/20/nvidia-ai-agents-tokens-human-workers-engineer-jobs-unemployment-jensen-huang.html
[16] Sina Weibo / Sina News, industry observers on e-commerce AI bargaining bots conversing with each other. https://www.sina.cn/news/detail/5280628612007020.html
[17] Jiemian News, "Doubao's Daily Token Consumption Up 1,000-fold in Two Years as Volcano Engine Accelerates Toward Its 100-Billion-Yuan Revenue Target." https://m.jiemian.com/article/14203212.html
Yang Wei is an Associate Professor of Management at CEIBS, specialising in AI strategy, corporate AI transformation, and innovation ecosystems.