The Complete Engineering Guide to Gemini CLI

The Terminal as an Intelligent Agent

Google has released Gemini CLI, an open-source AI agent designed to live directly in the developer’s terminal. Unlike standard wrappers that simply pipe text to an LLM, Gemini CLI functions as a fully capable agent with file system access, shell execution capabilities, and a modular extension system based on the Model Context Protocol (MCP).

This is not just a chatbot; it is a programmable workspace assistant. The following guide serves as a comprehensive technical manual for installing, configuring, and mastering Gemini CLI to automate complex engineering workflows.


1. Installation and Authentication

Installation is streamlined via npm or Homebrew. The package is lightweight and typically installs in under a minute on most systems.

Prerequisites: Node.js version 20+ or Homebrew.
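Before installing via npm, it can help to confirm the Node.js prerequisite. The following is a minimal sketch, not part of Gemini CLI; the helper name meets_min_node is hypothetical, and it only inspects the major version reported by node --version.

```shell
#!/bin/sh
# Hypothetical helper (not part of Gemini CLI): succeeds when the given
# Node.js version string, e.g. "v20.11.0", has a major version of 20 or higher.
meets_min_node() {
  node_version="${1#v}"             # drop the leading "v" if present
  node_major="${node_version%%.*}"  # keep everything before the first dot
  [ "$node_major" -ge 20 ] 2>/dev/null
}

if command -v node >/dev/null 2>&1; then
  if meets_min_node "$(node --version)"; then
    echo "Node.js $(node --version) meets the 20+ prerequisite"
  else
    echo "Node.js $(node --version) is older than 20; upgrade before installing"
  fi
else
  echo "node not found; install Node.js 20+ or use the Homebrew route"
fi
```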

Installation Command:

# via npm
npm install -g @google/gemini-cli

# via Homebrew (macOS)
brew install gemini-cli

Verification:
Verify the installation and check the version (e.g., 0.10.0) by running:

gemini --version

Authentication Strategies:
Upon typing gemini to start the interface, the user is prompted to authenticate. There are two primary methods, each with distinct trade-offs:

  • Login with Google: Offers a higher free tier (60 req/min), but often defaults to cost-efficient models.
  • API Key (Recommended): Provides granular control, cost optimization, and access to token caching.

Setting up the API Key:

  1. Navigate to ai.studio/api-keys.
  2. Create a new project (e.g., “gemini-cli-project”).
  3. Generate the key and copy it.
  4. Export it as an environment variable in your shell configuration (.zshrc or .bashrc):
export GEMINI_API_KEY=Your_Key_Here

Launch the CLI, type /auth, and select “Use Gemini API Key.”
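A quick pre-flight check can confirm that the export from step 4 actually reached the current shell before launching the CLI. This is a sketch under the assumption that the key lives in GEMINI_API_KEY as shown above; api_key_ready is a hypothetical helper name, and it prints only the key length, never the key.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeeds only when GEMINI_API_KEY is
# exported and non-empty. Prints the key length, never the key itself.
api_key_ready() {
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set; add the export line to .zshrc or .bashrc" >&2
    return 1
  fi
  echo "GEMINI_API_KEY is set (${#GEMINI_API_KEY} characters)"
}

api_key_ready || echo "Skipping launch until the key is configured"
```

If the check fails in a new terminal, the shell configuration file has probably not been reloaded yet.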

2. Operation Modes: Headless vs. Interactive

Gemini CLI operates in distinct modes depending on the user’s intent.

Headless Mode (-p)

Ideal for CI/CD pipelines or scripting where no user interaction is desired. The CLI accepts a prompt, executes it, and terminates.

gemini -p "Suggest 3 variable names for a user authentication array"
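In a script, headless mode is usually wrapped in a small function. The sketch below assumes that content piped on standard input is combined with the -p prompt, which is worth verifying against your installed version; summarize_file is a hypothetical name chosen for illustration.

```shell
#!/bin/sh
# Sketch: ask Gemini CLI for a one-paragraph summary of a file, headlessly.
# summarize_file is a hypothetical name; -p is the headless flag.
summarize_file() {
  if [ ! -r "$1" ]; then
    echo "cannot read $1" >&2
    return 1
  fi
  # Assumption: the file content arrives on stdin and the instruction via -p.
  gemini -p "Summarize the following file in one paragraph." < "$1"
}

# Run only when the CLI is installed and a file argument was given.
if command -v gemini >/dev/null 2>&1 && [ -n "${1:-}" ]; then
  summarize_file "$1"
fi
```

Because the function returns a non-zero status when the file is unreadable, a CI job can fail fast instead of sending an empty prompt.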

Interactive Mode

The default mode. It launches a persistent session with context awareness.

Context Awareness

The tool uses the @ symbol to inject specific context into the model’s window. Grounding the model in actual file data reduces hallucination.

  • @script.md: Reads a specific file.
  • @src/: Reads an entire directory.

3. Critical Flags and Configuration

Power users can manipulate the agent’s behavior using runtime flags.

  • --include-directories: By default, the CLI sees only the current folder. This flag expands the scope (e.g., gemini --include-directories ../).
  • --model (-m): Selects a specific model variant (e.g., gemini-2.5-flash for speed).
  • --checkpointing: A crucial safety feature. It snapshots the project state before file edits, allowing a rollback via the /restore command.
  • --yolo: Disables all safety confirmations. The agent will execute shell commands and file writes without asking. Use with extreme caution.
  • --approval-mode:
    • default: Asks for confirmation before every action.
    • auto_edit: Auto-approves file changes but still asks before shell commands.
  • --sandbox: Runs execution in an isolated environment (Docker or macOS Seatbelt) to limit the damage a bad command can do.
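These flags combine naturally into a reusable launcher. The following is a sketch that assumes the flag names exactly as listed above (check gemini --help for your version); gemini_cautious is a hypothetical wrapper name, not a built-in command.

```shell
#!/bin/sh
# Sketch: launch Gemini CLI with a cautious set of the flags listed above.
# gemini_cautious is a hypothetical wrapper name, not a built-in command.
gemini_cautious() {
  # Extra arguments (for example --include-directories ../) pass through.
  gemini \
    --model gemini-2.5-flash \
    --checkpointing \
    --approval-mode auto_edit \
    "$@"
}
```

Calling gemini_cautious --include-directories ../ would start a session that snapshots before edits and still asks before running shell commands. Adding --sandbox to the list is an option when Docker or Seatbelt is available.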

4. The Built-In Command Suite

Inside the interactive session, the user controls the environment via slash commands.

  • /init: Scans the current directory and generates a GEMINI.md file (persistent project instructions) tailored to the project structure.
  • /chat: Manages conversation history.
    • /chat save <name>: Snapshots the current context.
    • /chat resume <name>: Loads a previous branch of thought.
  • /settings: Opens an interactive UI to modify the settings.json file.
  • /theme: Changes the UI aesthetics (e.g., Dracula, Monokai).
  • ! (Bang): Toggles generic shell mode to run commands like ls or git status directly inside the interface.

5. Custom Commands (Automation)

Custom commands are arguably the most powerful feature of Gemini CLI. Developers can create reusable, context-aware prompt templates as .toml files in ~/.gemini/commands/. The file path determines the command name: a file at commit/message.toml becomes /commit:message.

Example: Automated Git Commit Messages
Create the file ~/.gemini/commands/commit/message.toml:

# ~/.gemini/commands/commit/message.toml
description = "Conventional commit from staged changes."
prompt = """
Please generate a Conventional Commit message based on the following git diff:
!{git diff --staged}
"""

Injection Syntax:

  • {{args}}: Injects text typed by the user after the command.
  • !{command}: Runs a shell command and injects the output (e.g., git diff).
  • @{file}: Injects the content of a file.

Usage:

/commit:message

The agent executes git diff --staged, reads the output, and generates a formatted commit message.
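Creating the command file can itself be scripted, which is handy when sharing commands across machines. This is a minimal sketch; install_commit_command is a hypothetical helper name, and the TOML body is the commit-message example from this section.

```shell
#!/bin/sh
# Sketch: write the commit-message command from this section into a
# commands directory. install_commit_command is a hypothetical helper name.
install_commit_command() {
  commands_dir="$1"   # e.g. "$HOME/.gemini/commands"
  target="$commands_dir/commit/message.toml"
  mkdir -p "$commands_dir/commit" || return 1
  # Quoted heredoc delimiter: the shell leaves !{...} untouched.
  cat > "$target" <<'EOF'
description = "Conventional commit from staged changes."
prompt = """
Please generate a Conventional Commit message based on the following git diff:
!{git diff --staged}
"""
EOF
  echo "wrote $target"
}
```

Running install_commit_command "$HOME/.gemini/commands" places the file where the /commit:message command expects it. The quoted heredoc matters: without it, the shell could try to interpret parts of the template before it reaches the file.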


6. Context Hierarchy and GEMINI.md

To avoid repeating instructions, Gemini CLI looks for a GEMINI.md file to establish persistent context. It loads these files hierarchically:

  1. Global: ~/.gemini/GEMINI.md (User specific preferences).
  2. Project: ./GEMINI.md (Repository specific rules).
  3. Sub-directory: ./src/GEMINI.md (Module specific context).
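To see which files this hierarchy would pick up for a given working directory, the lookup can be approximated in a few lines of shell. This is only an illustrative sketch of the ordering described above, not the CLI's actual discovery logic; the function names and the explicit project-root argument are assumptions.

```shell
#!/bin/sh
# Sketch only: approximates the hierarchy described above. This is NOT the
# CLI's real lookup code; both function names are hypothetical.

# Print GEMINI.md files from the stop directory ($2) down to the start
# directory ($1), outermost first.
print_ancestors_first() {
  if [ "$1" != "$2" ] && [ "$1" != "$(dirname "$1")" ]; then
    print_ancestors_first "$(dirname "$1")" "$2"
  fi
  if [ -f "$1/GEMINI.md" ]; then
    echo "$1/GEMINI.md"
  fi
}

# $1: directory the session starts in
# $2: project root where the upward walk stops
# $3: global file (defaults to ~/.gemini/GEMINI.md)
list_context_files() {
  global_file="${3:-$HOME/.gemini/GEMINI.md}"
  if [ -f "$global_file" ]; then
    echo "$global_file"
  fi
  print_ancestors_first "$1" "$2"
}
```

For example, list_context_files "$PWD/src" "$PWD" lists the global file first, then the project file, then the module file, for each one that exists.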

The “Dual-Brain” Hack:
One strategy uses sub-directories to force the agent into different “modes”:

  • Create a folder gemini-plan/ with a GEMINI.md containing strict “Architect/Planner” instructions.
  • Create a folder gemini-code/ with a GEMINI.md containing “Junior Dev/Implementer” instructions.

Running gemini inside either folder creates a specialized agent whose behavior follows that directory's context.
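The two folders can be scaffolded with a short script. This is a sketch; scaffold_modes is a hypothetical helper name, and the instruction text inside each GEMINI.md is placeholder wording to be replaced with your own rules.

```shell
#!/bin/sh
# Sketch: scaffold the two mode folders described above under a base
# directory. scaffold_modes and the instruction text are illustrative.
scaffold_modes() {
  base_dir="$1"
  mkdir -p "$base_dir/gemini-plan" "$base_dir/gemini-code" || return 1
  cat > "$base_dir/gemini-plan/GEMINI.md" <<'EOF'
# Role: Architect/Planner
Produce step-by-step plans only. Do not edit files or run shell commands.
EOF
  cat > "$base_dir/gemini-code/GEMINI.md" <<'EOF'
# Role: Junior Dev/Implementer
Implement the plan you are given, one small step at a time.
EOF
  echo "created $base_dir/gemini-plan and $base_dir/gemini-code"
}
```

After scaffold_modes ., changing into gemini-plan/ or gemini-code/ before starting gemini selects the corresponding mode.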

7. Tools and Extensions (MCP)

Gemini CLI supports the Model Context Protocol (MCP), allowing it to interface with external services.

Installing Nanobanana (Image Generation):

gemini extensions install https://github.com/gemini-cli-extensions/nanobanana

Requires a separate API key in environment variables.

Usage:

/generate a picture of a futuristic server room

Installing GitHub Integration:
Allows the agent to manage issues and PRs.

  1. Install the GitHub MCP extension.
  2. Generate a GitHub Personal Access Token (Classic).
  3. Configure settings.json:
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "YOUR_TOKEN_HERE"
      }
    }
  }
}
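Pasting a token directly into settings.json makes it easy to commit by accident. The sketch below writes a project-level .gemini/settings.json that refers to the token by environment variable instead. It assumes the CLI expands $VAR references in settings values at load time, which should be verified for your version; write_github_mcp_settings is a hypothetical helper name. The entry uses the command key, the field that names the executable to launch.

```shell
#!/bin/sh
# Sketch: create a project-level .gemini/settings.json containing the GitHub
# MCP server entry without hardcoding the token. Hypothetical helper name.
write_github_mcp_settings() {
  project_dir="$1"
  settings_file="$project_dir/.gemini/settings.json"
  if [ -e "$settings_file" ]; then
    echo "$settings_file already exists; merge the entry by hand" >&2
    return 1
  fi
  mkdir -p "$project_dir/.gemini" || return 1
  # Quoted heredoc: "$GITHUB_PERSONAL_ACCESS_TOKEN" is written literally.
  # Assumption: the CLI expands $VAR references in settings.json when loading.
  cat > "$settings_file" <<'EOF'
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "$GITHUB_PERSONAL_ACCESS_TOKEN"
      }
    }
  }
}
EOF
  echo "wrote $settings_file"
}
```

The helper refuses to overwrite an existing file, since settings.json often holds other preferences. With the server configured either way, the integration is driven from the interactive session.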

Usage:

> Create an issue in the repo regarding the API timeout bug.

8. IDE Integration (Cursor & VS Code)

For a seamless workflow, Gemini CLI connects directly to the editor via the Gemini CLI Companion extension.

  1. Install “Gemini CLI Companion” from the VS Code/Cursor marketplace.
  2. Open the terminal inside the IDE.
  3. Start gemini.
  4. Accept the connection prompt.

Benefits:

  • Active Context: The CLI automatically knows which file is open and where your cursor is positioned.
  • Bi-directional Edits: Changes proposed by Gemini in the terminal are reflected in the editor as “Diffs” that can be accepted or rejected.

Conclusion

Gemini CLI represents a shift from “chatting with AI” to “engineering with AI.” By leveraging custom commands, hierarchical context via Markdown, and MCP extensions, developers can construct a highly personalized, automated environment that operates safely within their existing terminal workflows.
