The traditional integrated development environment is slowly becoming an artifact. We are watching the role of the software engineer fracture and evolve—shifting away from the manual assembly of syntax and toward the high-level orchestration of artificial intelligence.
A masterclass in this new paradigm recently surfaced in the form of an open-source project called TwoFac. Built by developer Arnav Gupta, TwoFac is a sovereign, cross-platform two-factor authentication tool designed to replace legacy giants like Authy. It operates seamlessly across Android, iOS, macOS, Windows, Linux, WearOS, the web, and even a command-line interface.
But the final product, impressive as it is, is the least interesting part of this story. The real revelation lies in the architecture of its creation. Gupta generated roughly 46,000 lines of production-grade code in a matter of weeks, largely bypassing the IDE. He didn’t type this application into existence; he managed the AI agents that did.
Here is a breakdown of the agentic workflow that makes this scale of independent development possible.
The Post-Authy Vacuum and the Push for Sovereignty
The market premise for TwoFac is straightforward. When Authy abruptly killed its desktop application and maintained strict lock-in over user secrets, it created a vacuum. Power users require data sovereignty—the ability to own their encryption keys and back up their 2FA tokens to their own personal Google Drive or iCloud, bypassing centralized third-party servers entirely.
To solve this, Gupta designed a multi-platform suite using a Kotlin Multiplatform (KMP) core to handle the heavy cryptographic logic, wrapped in native Compose and SwiftUI interfaces for the various operating systems. Shipping to that many deployment targets usually requires a dedicated engineering team. Gupta did it alone, leveraging asynchronous AI workers.
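To make "heavy cryptographic logic" concrete, here is a minimal sketch of the calculation most authenticator apps perform, assuming the tokens are standard TOTP codes as defined in RFC 6238 (the article does not say which algorithms TwoFac supports). It is written in Python for brevity rather than Kotlin, and the function names are illustrative, not taken from the TwoFac codebase.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over an 8-byte big-endian counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks a 4-byte window.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP with the counter derived from the clock."""
    return hotp(secret, unix_time // step, digits)
```

With the RFC 6238 test secret `b"12345678901234567890"` at `unix_time=59` and `digits=8`, this produces `94287082`, matching the published SHA-1 test vector. A shared core like this is the kind of logic that only needs to be written once and can then sit behind every platform's UI.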
The Architecture of an Agentic Workflow
You cannot simply ask an LLM to “build a 2FA app” and expect a functional repository. The AI will hallucinate, lose context, and inevitably collapse under the weight of the project’s complexity.
Gupta solved this context window degradation by treating his AI models (Claude 3 Opus, Sonnet, and GPT-4) not as omniscient oracles, but as junior developers operating within a strictly confined, locally hosted scaffolding.
His repository contains dedicated files and directories for the AI itself:
- The Blueprint (plans/): Before a single line of code is generated, the AI is forced to write a highly granular, step-by-step Markdown plan. This document acts as the source of truth. The AI must define the architecture, list the necessary dependencies, and outline a multi-phase implementation roadmap.
- The Context Boundaries (agents.md): The repository features a strict map of the codebase, telling the AI exactly which modules control which logic. The AI is forbidden from making blind structural changes. It must read the map first.
- Constrained Execution (skills/): Instead of giving the AI open-ended access, Gupta defined a set of narrowly scoped “skills.” For example, the agent has a script to boot up an Android emulator, take a screenshot, and verify whether the UI rendered correctly.
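The guardrails above can be sketched as a small gatekeeper: the agent may act only through registered skills, and only once a plan exists for the feature. This is a hypothetical illustration in Python, not Gupta's actual tooling. The `Scaffold` class, the one-Markdown-file-per-feature layout under `plans/`, and the skill names are all assumptions.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict


@dataclass
class Scaffold:
    """Hypothetical guardrail around an agent working in a repository."""
    repo: Path
    skills: Dict[str, Callable[[], str]] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[[], str]) -> None:
        # Only explicitly registered skills are ever callable.
        self.skills[name] = fn

    def has_plan(self, feature: str) -> bool:
        # Assumed layout: one Markdown plan per feature under plans/.
        return (self.repo / "plans" / f"{feature}.md").is_file()

    def run(self, feature: str, skill: str) -> str:
        if not self.has_plan(feature):
            raise PermissionError(f"no plan for {feature!r}; write plans/{feature}.md first")
        if skill not in self.skills:
            raise PermissionError(f"unknown skill {skill!r}")
        return self.skills[skill]()
```

The design choice worth noting is that both checks fail closed. A missing plan or an unregistered skill raises an error instead of letting the agent improvise.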
The Automated Testing Loop
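The fatal flaw of AI-generated code is silent failure. The code looks right on the page, but it fails to compile or misbehaves the moment it runs, and nothing flags the problem until someone actually builds and exercises it.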
To create a closed-loop system, Gupta integrated Maestro, an open-source UI testing framework. When the AI is assigned a feature—like building a biometric unlock screen—it writes the Kotlin code, compiles it, boots the emulator, and runs a Maestro test script. If the test fails, the AI reads the error log, analyzes a screenshot of the broken UI, rewrites the code, and tests it again.
The developer only steps in when the AI gets stuck, repeating the same failed fix without making progress.
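The loop described above, including the point where a human takes over, might look roughly like this. It is a sketch under assumptions. `run_tests` stands in for "compile, boot the emulator, run the Maestro flow", and `propose_fix` stands in for the agent rewriting code from the error log. Both are hypothetical callables, as are the retry budget and the "same error twice" heuristic for detecting a stuck agent.

```python
from typing import Callable, Optional, Tuple

# (passed, error_log) as returned by a build-and-test step
TestResult = Tuple[bool, str]


def fix_loop(
    run_tests: Callable[[], TestResult],
    propose_fix: Callable[[str], None],
    max_attempts: int = 5,
) -> Tuple[str, int]:
    """Run tests, feed failures back to the agent, and escalate when stuck.

    Returns ("passed" or "escalate", number of test runs used).
    """
    last_error: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        passed, error_log = run_tests()
        if passed:
            return "passed", attempt
        if error_log == last_error:
            # Same failure twice in a row: the agent is going in circles.
            return "escalate", attempt
        last_error = error_log
        propose_fix(error_log)
    # Retry budget exhausted without a passing run.
    return "escalate", max_attempts
```

An "escalate" result is the signal for the developer to step in. Everything before that point runs unattended.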
The Asymmetrical Leverage of the Modern Developer
What Gupta demonstrates is asymmetrical leverage. He spent his time on the macro-problems: defining the KMP architecture, researching WebAuthn protocols, and writing the testing parameters. The AI handled the micro-problems: typing out boilerplate Compose UI layouts, structuring standard Git commits, and parsing repetitive API documentation.
If there is a physical hardware constraint—like verifying that a fingerprint scanner correctly decrypts a local SQLite database—the human intervenes. Everything else is delegated.
We have crossed the threshold where coding velocity is no longer constrained by typing speed or syntax recall. It is constrained only by a developer’s ability to clearly articulate a system’s architecture and ruthlessly manage the AI agents deployed to build it.


