The industrialization of software development
Everyone is debating whether AI will kill software companies. But they’re missing the point. The interesting story isn’t whether people will replace their hardened Salesforce instance with a few well-articulated prompts (they won’t). Or whether decades of code complexity and integration know-how will form an impenetrable moat (it won’t). The interesting story is what happens when code becomes free.
Last week, a company shipped production code that no human wrote and no human reviewed. The AI agents that generated the code also wrote the tests, ran those tests in mock environments of Slack and Okta, and deployed the changes. The humans, for their part, got an update when the feature went live.
This isn’t a demo or a research project. This is how the team at StrongDM builds software.
Every wave of technological innovation has been catalyzed by the cost of something expensive trending towards zero. That’s what we’re witnessing right now. Code is (basically) free. The question is: when code is free, what becomes expensive?
“Dark factory” software development
I’m not sure if what StrongDM has invented is the future, but, per Simon Willison, it’s the most vivid version I’ve come across.
A few months ago, the team at StrongDM set two rules for themselves: no code can be written by humans and no code can be reviewed by humans. Those constraints led them to a system they call a software development “dark factory,” inspired by the fully automated manufacturing facilities that can operate without lights because no humans need to see.
This only recently became possible, according to the team. The catalysts were Opus 4.5 and GPT 5.2. These models were finally good enough to “compound correctness.” In other words, the more the agent “worked” on the system, the more stable it became, rather than more fragile. This is the unlock that allowed the StrongDM team to stop human review entirely and start letting the factory run “dark.”
The hardest problem with letting agents run wild is testing. Having agents write tests only helps if they don’t cheat, but agents, as we all know, really like to cheat.
The solution that StrongDM came up with is twofold. First, instead of unit testing, they do scenario testing. Think of “scenarios” as end-to-end user stories stored outside of the codebase (almost like a “holdout” set in machine learning). Success is measured by user satisfaction: an LLM evaluates whether the path the software took would likely satisfy a real person.
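To make the idea concrete, here is a minimal sketch of what a holdout scenario check could look like. This is my own illustration, not StrongDM's implementation: the names Scenario, stub_judge, evaluate and toy_system are all hypothetical, and the judge is a simple text check standing in for the LLM evaluation so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Scenario:
    """An end-to-end user story kept outside the codebase (the 'holdout')."""
    name: str
    story: str        # what the user is trying to do
    expectation: str  # what a satisfied user would expect to see


# A judge returns a satisfaction score in [0, 1]. In a real factory this would
# be an LLM call; here it is any callable so the sketch stays self-contained.
Judge = Callable[[Scenario, str], float]


def stub_judge(scenario: Scenario, transcript: str) -> float:
    """Placeholder for an LLM evaluation (assumption): a plain substring check."""
    return 1.0 if scenario.expectation.lower() in transcript.lower() else 0.0


def evaluate(system: Callable[[str], str], scenarios: List[Scenario],
             judge: Judge, threshold: float = 0.9) -> Dict[str, object]:
    """Run every holdout scenario through the system and aggregate the scores."""
    scores = {s.name: judge(s, system(s.story)) for s in scenarios}
    mean = sum(scores.values()) / len(scores)
    return {"scores": scores, "mean": mean, "passed": mean >= threshold}


def toy_system(story: str) -> str:
    """Hypothetical system under test: it only knows how to revoke access."""
    if "revoke" in story:
        return "Access revoked for user."
    return "Unknown request."


scenarios = [
    Scenario("revoke", "Admin wants to revoke access for a departed employee",
             "access revoked"),
    Scenario("invite", "Admin wants to invite a new teammate",
             "invitation sent"),
]

result = evaluate(toy_system, scenarios, stub_judge)
print(result)
```

The design choice that matters is the separation: the scenarios and the judge live outside whatever the coding agents can edit, so passing them says something about the software rather than about the tests.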
Second, they built what they call a “Digital Twin Universe,” which is a set of high-fidelity clones of third-party APIs (Okta, Slack, Jira, etc.). This lets their agents run thousands of integration tests against "fake" versions of the internet without hitting rate limits or incurring real-world costs. More importantly, they can test failure modes that would be impossible to test in production. What happens if Okta goes down for 10 minutes? Now they can find out.
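A toy version of the Okta question can be sketched in a few lines. Everything here is hypothetical and mine, not StrongDM's: FakeIdentityProvider is a stand-in that does not mirror Okta's real API, FakeClock replaces real time so a 10-minute outage runs instantly, and login_with_retry is an invented client policy whose behavior we want to observe.

```python
class FakeClock:
    """Simulated time, so a 10-minute outage takes no real time to test."""

    def __init__(self) -> None:
        self.now = 0.0

    def sleep(self, seconds: float) -> None:
        self.now += seconds


class ServiceUnavailable(Exception):
    """Raised by the fake provider while an injected outage is active."""


class FakeIdentityProvider:
    """Hypothetical stand-in for a third-party identity API (not Okta's real API)."""

    def __init__(self, clock: FakeClock) -> None:
        self.clock = clock
        self.users = {"alice"}
        self.outages = []  # list of (start, end) times in simulated seconds

    def schedule_outage(self, start: float, duration: float) -> None:
        self.outages.append((start, start + duration))

    def authenticate(self, user: str) -> bool:
        if any(start <= self.clock.now < end for start, end in self.outages):
            raise ServiceUnavailable("503")
        return user in self.users


def login_with_retry(idp, clock, user, max_wait=900.0, delay=30.0, max_delay=120.0):
    """Retry with capped exponential backoff until success or the wait budget runs out."""
    deadline = clock.now + max_wait
    attempts = 0
    while True:
        attempts += 1
        try:
            return idp.authenticate(user), attempts
        except ServiceUnavailable:
            if clock.now + delay > deadline:
                raise  # the next retry would exceed the budget, so give up
            clock.sleep(delay)
            delay = min(delay * 2, max_delay)


clock = FakeClock()
idp = FakeIdentityProvider(clock)
idp.schedule_outage(start=0.0, duration=600.0)  # "down for 10 minutes"

ok, attempts = login_with_retry(idp, clock, "alice")
print(ok, attempts, clock.now)
```

Because both the outage and the clock are simulated, an agent can run this kind of experiment thousands of times and vary the outage length, the retry budget, or the failure type, none of which is practical against a live third-party service.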
This is a fascinating example of what software development looks like when code is nearly free: high-fidelity mocks of the real world were always desirable but economically impossible, and now they're not. It also hints at the challenge (and expense) of handling validation at scale.
From products to processes
Taking the StrongDM team’s approach one step further, I recently spoke with a team that has no fixed product at all. Instead they have specs.
They interview the company and gather all the tribal knowledge they can. They ingest the context from existing software systems. And they use their “factory” to spit out a product that fits the company like a glove.
It looks like a high-end consulting engagement, but the deliverable is a proprietary product with the deployment speed of a SaaS implementation. If your factory can generate, test and war-game a bespoke product in the time it takes to have a discovery call, what’s the difference between a consulting shop and a software company?
Where value accrues
When code itself stops being a moat, where does value accrue? A few ideas:
The factory. Who has the most rigorous testing? The most realistic digital twins? StrongDM's Okta clone encodes years of "how Okta actually behaves when misconfigured." That operational knowledge compounds over time. The teams running agents longest will probably have the best factories. And this isn't something you can catch up to quickly. It's accumulated knowledge. (Though I do wonder if this just gets open-sourced. Why wouldn’t it? Surely the wisdom of the crowd will be better eventually, even if there is some proprietary edge today.)
State monopolies. You can clone Salesforce's code in a month, but you can't replicate the fact that every sales team is already using it, with years of workflow dependencies and integrations. AI can't hallucinate a user’s historical state or the network of people already using the platform. The software is reproducible, but the adoption, and all the associated history, isn't.
Trust and certification. When no human reviews the code running your payroll system, who's liable when it breaks? Who gets audited? Who carries insurance? Established companies can point to years of compliance work, security certifications, enterprise relationships. In regulated industries, this becomes the primary barrier to entry. Trust is expensive to build. It requires time, capital, and proving yourself in lower-stakes environments before enterprises let you touch production systems.
Everyone assumes the commoditization of code will lead to the democratization of software. But what if it does the exact opposite?
When everyone can generate code for free, competitive advantage shifts to things more expensive than code ever was. Building “digital twins” that simulate the internet and then running thousands of experiments knowing 99% will fail. Earning enterprise trust through years of certification and compliance work. These are not inexpensive endeavors.
Shipping code may be free. But the cost of shipping code that will win is getting way higher.
It may seem that I’m going to conclude that incumbents are the inevitable beneficiaries. But I think startups are actually best positioned to build the first software factories. My contrarian take, though, is that startups will need more capital in this new era, not less. The lean startup era is giving way to capital-intensive competition at factory scale. And the companies that win won't be the ones writing the most elegant code. They'll be the ones that can afford to run the most sophisticated factories.
This isn’t the end of software. But it is certainly the end of artisanal software, coded meticulously by hand.
The debate about whether AI will "kill" software companies misses the transformation entirely. We're not watching software die. We're watching it industrialize. And the companies that figure out how to build factories, not just products, are the ones that will matter.