
gstack: The YC CEO's Open-Source Software Factory That Split the Developer World in Half

Garry Tan open-sourced his personal coding setup — 28 slash commands that turn Claude Code into a virtual engineering team. 43,000 stars and a heated debate later, here's what it means.

Augmi Team
gstack · garry-tan · yc · claude-code · open-source · ai-coding · developer-tools

One tool, 43,000 stars, and a fight about whether AI needs an org chart

On March 12, 2026, Y Combinator CEO Garry Tan did something unusual for a person running a $600B startup accelerator: he open-sourced his personal coding setup.

Not a framework. Not a library. A collection of Markdown files that turn Claude Code into what he calls “a virtual engineering team.” Twenty-eight slash commands. Each one a different specialist: CEO, engineering manager, designer, QA lead, security officer, release engineer.

Within 48 hours, the GitHub repo had 10,000 stars. Within a week, 33,000. As of this writing, it’s at 43,400. Hacker News threads split roughly 50/50 between “this is genius” and “this is cargo culting with extra steps.”

So which is it?

We went through 18 sources, from TechCrunch to hands-on developer reviews to Garry Tan’s own tweets, to figure out what gstack actually is, why it divided the tech world, and what it tells us about where AI-assisted development is heading.

What gstack actually does

The core idea is simple: instead of using AI as one generic assistant, assign it specific roles with specific responsibilities.

You say /office-hours. Claude becomes a YC partner who pushes back on your product framing before you write a single line of code. Six forcing questions. Challenges your premises. Generates a design doc.

You say /plan-eng-review. Claude becomes an engineering manager who locks architecture, draws ASCII diagrams for data flow, identifies edge cases, and creates test matrices.

You say /review. Claude becomes a paranoid staff engineer looking for the bugs that pass CI but blow up in production.

You say /qa. Claude opens a real Chromium browser, clicks through your app, finds bugs, fixes them, and generates regression tests.

The sprint flow goes: Think, Plan, Build, Review, Test, Ship, Reflect. Each skill feeds into the next. /office-hours writes a design doc that /plan-ceo-review reads. /plan-eng-review writes a test plan that /qa picks up. Nothing falls through the cracks because every step knows what came before.

Eight commands, end to end. From “I want to build a daily briefing app” to a shipped PR with tests.

The numbers that made it go viral

Garry Tan claims he shipped 600,000+ lines of production code in 60 days using gstack. That works out to roughly 10,000 lines per day, part-time, while running YC full-time. His last weekly retro across three projects: 140,751 lines added, 362 commits, ~115K net lines of code in seven days.

“I don’t think I’ve typed like a line of code probably since December,” Andrej Karpathy said on the No Priors podcast. Tan quoted it in the README. The message: the best AI practitioners aren’t coding anymore. They’re orchestrating.

A CTO who tested gstack texted Tan: “This is like god mode. Your eng review discovered a subtle cross-site scripting attack that I don’t even think my team is aware of.”

That tweet got 849,000 views. It also got him into trouble.

The backlash

Three distinct criticism vectors showed up:

“It’s just prompts in a text file.” Vlogger Mo Bitar made a video titled “AI is making CEOs delusional,” arguing gstack contains no novel technology. Technically correct: it is Markdown files. But as one reviewer noted, “gstack is more useful as a development process than as a prompt pack.” The process is the innovation.

“He wouldn’t get 43K stars if he weren’t the YC CEO.” Sherveen Mashayekhi, founder of Free Agency, commented on Product Hunt: “If you weren’t the CEO of YC, this wouldn’t be on PH.” Fair point. Celebrity amplification is real. But would Linux be Linux without Linus’s reputation? Distribution advantage and underlying quality aren’t mutually exclusive.

“Lines of code is a vanity metric.” Critics argued AI can generate 10,000 lines of boilerplate trivially. The 35% test coverage could itself be AI-generated. One anonymous developer responded to the CTO testimonial: “If it’s true, that CTO should be fired immediately.” Harsh, but the point about LOC being a misleading metric is well-taken.

Then came SXSW. Tan appeared on stage with Bill Gurley and described sleeping four hours a night: “I don’t need modafinil with this revolution.” He called it “cyber psychosis” and said about a third of CEOs he knows have the same condition. His assistant later confirmed it was said in jest. But the headlines wrote themselves, and the story jumped from developer Twitter to mainstream tech media.

What the independent evidence says

Strip away the hype and the backlash, and a consistent signal emerges from independent sources.

Convergent evolution. Nicolas Fry, CEO of TurboDocx, revealed his team independently built the same Director/Manager/Engineer organizational model for AI development. “Role-based AI development works because it mirrors how real engineering organizations already operate.” TurboDocx was building their version before gstack launched. Two teams, no coordination, same architecture.

Hands-on reviews are positive. Hobokai’s developer review tested the actual skills and concluded: “more useful as a development process than as a prompt pack.” FunBlocks rated it “highly recommended for individual power users.” The people who actually used it, as opposed to those critiquing the concept from the outside, consistently found value.

Community extensions signal genuine utility. Gstack++ adapted the framework for C++ development. GStacks.org emerged as a third-party documentation and comparison site. Multiple Medium and Substack deep-dives dissected the architecture. These aren’t celebrity followers. They’re developers building on top of the work.

Even AI models weighed in. In a meta-twist, Claude called gstack “a mature, opinionated system built by someone who actually uses it heavily.” ChatGPT said “AI coding works best when you simulate an engineering org structure.” Gemini focused on making code “correct rather than easier.”

The technical bit nobody’s talking about

While the role-based workflow gets the headlines, the /browse skill’s technical implementation deserves separate attention. It runs a persistent Chromium daemon that achieves ~100ms per command after startup, roughly 20x faster than Chrome MCP tools. Three ring buffers (50,000 entries each) handle console, network, and dialog logs with async disk flush. Element references use Playwright’s accessibility tree to avoid DOM mutation issues.
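A fixed-capacity ring buffer of the kind described, where the newest log entries silently evict the oldest once the cap is hit, is a standard pattern. Here is a minimal sketch using Python's `collections.deque`; the class name and 50,000-entry default mirror the article's description, but nothing else about gstack's actual implementation is claimed.

```python
from collections import deque

class RingBufferLog:
    """Fixed-capacity log: once full, appending evicts the oldest entry."""

    def __init__(self, capacity=50_000):
        # deque with maxlen gives O(1) append plus automatic eviction.
        self.entries = deque(maxlen=capacity)

    def append(self, entry):
        self.entries.append(entry)

    def tail(self, n=100):
        """Return the n most recent entries, oldest first."""
        return list(self.entries)[-n:]

# Three independent buffers, one per stream, as the article describes.
logs = {name: RingBufferLog() for name in ("console", "network", "dialog")}
logs["console"].append({"level": "error", "text": "unhandled rejection"})
```

Bounded buffers like this are why a long-running daemon can log aggressively without unbounded memory growth; the async disk flush the article mentions would drain these buffers off the hot path.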

This is real systems engineering, not prompt wrapping. The browser automation alone would be a noteworthy open-source contribution if it came from anyone else.

What gstack tells us about the future

“Just ask AI to write code” is hitting its ceiling. We’re already seeing what comes next: structured workflows with role separation that look a lot like the way good engineering teams already operate. gstack is one implementation of that idea. It won’t be the last.

The SKILL.md standard could end up mattering more than gstack itself. Plain Markdown files that work across Claude Code, Codex, Gemini CLI, and Cursor. No vendor lock-in. No proprietary format. If this convention gains adoption, gstack becomes the reference implementation of a bigger movement.
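For readers who haven't seen the format: a skill is just a Markdown file with a small frontmatter header, which is what makes it portable across tools. The example below is hypothetical, loosely modeled on the `/review` behavior described earlier; it is not a file from the gstack repo.

```markdown
---
name: review
description: Adversarial code review focused on bugs that pass CI but fail in production.
---

# Review

Act as a paranoid staff engineer reviewing the current diff. For each changed file:

1. Look for race conditions, unchecked errors, and injection risks.
2. Flag anything that passes tests only by coincidence.
3. Output findings as a checklist, highest severity first.
```

Because the whole skill is plain text, versioning it, diffing it, and moving it between Claude Code, Codex, Gemini CLI, or Cursor is trivial.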

The parallel execution angle is interesting too. Conductor (gstack’s parallel session tool) runs up to 10 simultaneous Claude Code sessions, each a different specialist working on a different task. The sprint structure is what makes parallelism possible. Without role boundaries, 10 parallel sessions would just be chaos.
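Fanning out independent specialist tasks with a concurrency cap is straightforward to sketch. This is an assumption-laden illustration, not Conductor's implementation: the task list and `run_session` stand-in are invented, and only the "up to 10 simultaneous sessions" cap comes from the article.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical specialist tasks: (role, target). Invented for illustration.
TASKS = [
    ("security-review", "auth module"),
    ("qa", "checkout flow"),
    ("docs", "public API"),
]

def run_session(role, target):
    # Stand-in for launching one AI session with a role-specific prompt.
    return f"{role}: done with {target}"

def run_parallel(tasks, max_sessions=10):
    """Run each task in its own session, at most max_sessions at once."""
    with ThreadPoolExecutor(max_workers=max_sessions) as pool:
        return list(pool.map(lambda t: run_session(*t), tasks))
```

The role boundaries are what make this safe: each session owns a distinct task, so capped parallelism adds throughput instead of merge conflicts.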

And the adversarial review piece might be the most underrated part. When /review plays the paranoid staff engineer on code that /office-hours designed, you get genuine quality control. The AI didn’t write AND review its own work. That separation matters more than most people realize.

Where I land on this

I keep going back and forth. The star count is inflated by celebrity, there’s no getting around that. The “cyber psychosis” narrative and LOC claims pulled attention away from the actual substance. If an unknown developer published the same tool, it might have 4,000 stars instead of 43,000.

But the core insight, that AI works better with organizational structure than without it, holds up. The convergent evolution from TurboDocx, the positive hands-on reviews, the community extensions: they all point to a tool that delivers real value independent of the hype.

The most lasting contribution probably won’t be gstack itself, but the patterns it popularized: role-based AI task assignment, sprint-structured workflows, separating the AI that writes code from the AI that reviews it, and portable skill definitions in plain Markdown.

Structure your AI workflows like you’d structure an engineering team. That’s the takeaway. Whether you use gstack, build your own, or adopt whatever emerges next, the days of unstructured AI prompting are numbered.


Based on 18 sources including TechCrunch, MarkTechPost, DEV Community, hands-on developer reviews, social media discussions, and the gstack repository itself. Research conducted March 24, 2026.
