The Pulse: ‘Tokenmaxxing’ as a weird new trend

At Meta, Microsoft, Salesforce, and other large companies, devs are purposefully burning tokens (and money!) to inflate their AI usage and hit AI usage metrics that they treat as targets.
Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from last week’s The Pulse issue. Full subscribers received the article below seven days ago. If you’ve been forwarded this email, you can subscribe here.

Inside Meta, an engineer created a “token leaderboard” that ranks employees by token usage. Last week, The Information reported:
I spoke with a few engineers at Meta about what’s happening, and this is what they said:
As per The Information, Meta employees used a total of 60.2 trillion AI tokens (!!) in 30 days. If this was charged at Anthropic’s API prices, it would cost $900M. Of course, Meta is likely purchasing tokens at a discount, but the bill could still come in at $100M+ – in large part from senseless “tokenmaxxing”.

After backlash on social media, Meta abolished the internal leaderboard last week. One day after The Information revealed details about the incredible tokenmaxxing numbers, I confirmed that Meta had taken down its leaderboard; perhaps it realized that the incentive created enormous and unnecessary waste. If so, it’s a bit surprising that it took media coverage for the social media giant to reach that conclusion.

One engineer at Meta told me they think Meta had a different goal with the token leaderboard. However, a long-tenured engineer suspects that increasing AI usage actually was the real goal. They said:
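The $900M figure works out as back-of-the-envelope arithmetic if we assume a blended rate of roughly $15 per million tokens – about Anthropic’s list price for Claude output tokens at the time. The exact rate is my assumption; real bills mix cheaper input tokens with pricier output tokens:

```python
# Back-of-the-envelope check of the reported $900M figure.
# The $15/M-token blended rate is an assumption, not a confirmed number.
tokens = 60.2e12          # 60.2 trillion tokens in 30 days (The Information)
rate_per_million = 15.00  # USD per million tokens, assumed blended rate
cost = tokens / 1e6 * rate_per_million
print(f"${cost / 1e9:.2f}B")  # prints $0.90B, i.e. ~$900M
```

At that scale, even a steep volume discount leaves a nine-figure bill, which is why the $100M+ estimate above is plausible.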
Microsoft: full-force tokenmaxxing

Similarly, Microsoft has had an internal token leaderboard like Meta’s since January, and it started pretty well, as I reported back at the time: there’s an internal token dashboard that displays the individuals who use the most tokens, in order to promote the use of tokens and experimentation with LLMs. At the Windows maker, this leaderboard is interesting:
However, what starts as a metric for performance reviews or promotions can quickly become a target for devs. I talked with a software engineer at the Windows maker who admitted they’re full-on “tokenmaxxing” – not to get on the leaderboard, but rather because they don’t want to be seen as using too few tokens:
This engineer is relatively new at the company, so they are concerned about job security, and they are playing this game – burning far more tokens than necessary – to avoid being tagged as insufficiently “AI-native”.

Salesforce: burning tokens to hit “minimum” & “ideal” targets

Elsewhere, Salesforce has created “tokenmaxxing” incentives as well. Talking with an engineer there, I learned that the company built two tools that effectively incentivize excessive spending on tokens:
The message Salesforce sends to staff is clear: “use a minimum of $170/month in tokens, or be flagged.” Who wants to get flagged for using too few tokens? The outcome is somewhat wasteful token spend:
Shopify: an example of how to avoid tokenmaxxing

The first-ever token leaderboard that I’m aware of was built by Shopify in 2025. And it worked well! Last June, the Head of Engineering at Shopify, Farhan Thawar, told me on The Pragmatic Engineer Podcast:
I asked Farhan for details on how it’s gone since. Here’s what he told me:
Shopify’s approach seems to have worked for a few reasons:
One more interesting learning Farhan shared with me: it’s more revealing not to ask “who spent the most in overall token cost?” but instead, “whose tokens cost the most?” Devs who generate the most expensive tokens have turned out to be doing in-depth work that was interesting to learn about!

Tokenmaxxing: great for AI vendors, bad for everyone else

I see very few rational reasons why incentivizing tokenmaxxing makes sense for any company. It results in increased AI spend – by a lot! – in return for little to no value. Heck, in some cases it actually incentivizes slower work – as shown by devs using AI to answer questions when documentation is readily available – and encourages ‘busywork’, where devs prompt projects that they don’t even want to ship. Tokenmaxxing seems to push devs to focus on stuff that makes no difference to a business.

It feels to me that a good part of the industry is using token count numbers similarly to how the lines-of-code-produced metric was used years ago. There was a time when the number of lines written daily or monthly was an important metric of programmer productivity, until it became clear that it’s a terrible thing to focus on. A lines-of-code metric can easily be gamed by writing boilerplate or throwaway code. Also, the best developers are not necessarily those who write the most code; they’re the ones who solve hard problems for the business quickly and reliably with – or without – code!

Similarly, the number of tokens a dev generates can easily be gamed, and if this metric is measured, then devs will indeed game it. But doing so generates a massive accompanying AI bill!

Read the full issue of last week’s The Pulse, or check out this week’s The Pulse. This week’s issue covers:
You’re on the free list for The Pragmatic Engineer. For the full experience, become a paying subscriber. Many readers expense this newsletter within their company’s training/learning/development budget. If you have such a budget, here’s an email you could send to your manager. This post is public, so feel free to share and forward it.

If you enjoyed this post, you might enjoy my book, The Software Engineer’s Guidebook: navigating senior, tech lead, staff and principal positions at tech companies and startups.
