A rush to deploy artificial intelligence in software development is creating a long-term technical debt crisis, as a focus on short-term productivity gains floods corporate systems with low-quality, buggy, and potentially dangerous code. While Alphabet reports that AI now generates 75% of all new code at Google, some of the very engineers who built today’s most popular AI agents warn a reckoning is coming for what they call “vibe slop.”
“You have infrastructure that’s falling apart, and you have software that’s now very, very buggy compared to before,” Mario Zechner, a key creator of the popular OpenClaw AI agent, said in a recent interview. “We can play this game for a couple more months, or maybe even years, but eventually it will catch up to us.”
This tension is visible across the industry. Anthropic’s Claude Code, an AI coding tool, saw median use skyrocket from 20 minutes to 20 hours a week in the past year, a sign of rapid adoption. Yet Zechner calls the tool “one of the most broken pieces of software I’ve ever used,” citing issues he attributes to its own AI-led development process. The push for AI-generated code comes as both OpenAI and Anthropic, two of the largest players in the space, are reportedly preparing for initial public offerings.
The conflict between speed and quality presents a hidden, off-balance-sheet risk for investors. The pressure to ship AI features is pushing companies to buy near-term productivity at the cost of long-term woes, including service outages, security vulnerabilities, and mounting technical debt that will require costly and time-consuming fixes. The bill for today’s AI-fueled velocity will eventually come due.
The Agentic Wedge Creates a Debt Trap
The strategic playbook for many AI firms is the “agentic wedge,” where a product lands in one workflow, proves its value, and expands across an organization. Palantir’s AIP platform, for example, reduced a 200-hour manufacturing approval process to just 15 seconds. The risk is that this wedge, when applied to software development itself, becomes a debt trap. The same systems automating work are accelerating the creation of the next generation of products, but often without sufficient quality control.
This creates a paradox. While companies like Shopify report that AI writes over 50% of their code, and Google’s Sundar Pichai touts a 75% figure, the creators of these tools are sounding the alarm. The problem, according to Zechner and his partner Armin Ronacher, is that AI agents are good at generating new code but poor at assessing and upgrading the vast, complex legacy systems that power most large enterprises. Startups built on “vibe coding” may scale rapidly at first, but they eventually hit a wall of complexity and fragility that AI tools struggle to navigate.
Evaluation and 'Taste' as the Last Mile
The root of the “vibe slop” problem may be that the hardest part of enterprise AI is not intelligence but evaluation: the structured human judgment that decides whether a system is good enough. Ali Ansari, CEO of Micro1, argues that beyond correctness, there is a layer of “taste,” or the unwritten rules a system must honor. An AI can generate code that technically works but is poorly judged, ill-fitting for the brand, or unmaintainable. Taste is a skill learned through experience, the very experience being denied to a generation of junior engineers now being replaced by AI.
This gap in judgment is where systemic risk accumulates. Without the “tacit knowledge” of seasoned programmers, AI models can “very easily go the wrong direction,” as computer scientist Timothy B. Lee noted. This is not a problem that can be easily benchmarked. It is a qualitative deficit that manifests as bugs, security holes, and fragile architecture. The recent acquisition of Stainless by Anthropic for a reported $300 million underscores the importance of the underlying tooling that turns code into reliable products, a layer often overlooked in the rush to generate code.
A Reckoning for Software Quality
The push for AI-driven development is forcing a confrontation with two decades of software practice. While some, like Palantir’s Alex Karp, see AI as the “death of legacy software,” the “vibe slop” phenomenon suggests that replacing complex systems is far harder than it appears. The risk for investors is that the productivity gains reported by major tech firms are a mirage, masking a rapid accumulation of technical debt that will eventually slow growth and inflate costs.
The challenge is that this debt is largely invisible until it triggers a major outage, data breach, or product failure. GitHub has already been forced to institute new policies to combat the wave of low-quality, AI-generated contributions. Zechner believes a reckoning is coming, one that will force companies to recognize that their overemphasis on AI-produced code is driving up costs and producing subpar software. For investors, the question is not whether AI can write code, but whether the companies relying on it are building on a solid foundation or a mountain of slop.
This article is for informational purposes only and does not constitute investment advice.