Gemini CLI Complete Guide: MCP Extensions & Memory Files in Practice

Google's newly released Gemini CLI has officially entered the AI coding tool arena, going head-to-head with Anthropic's Claude Code and OpenAI's Codex CLI. With a 1-million-token ultra-long context window, MCP Server extension support, and project memory file functionality, Gemini CLI offers developers a full-featured command-line AI coding solution. This article covers everything from installation and configuration to hands-on demonstrations, providing a comprehensive breakdown of this tool's core capabilities.

1 Million Token Context: Why It Matters

Gemini CLI is powered by the Gemini 2.5 Pro model, inheriting its 1-million-token-plus context window capability. What does this number actually mean? It's roughly equivalent to 2–3 complete Flask-scale projects, or the entire codebase of a dozen common Python packages.

To understand the significance of this number, you first need to grasp the concepts of tokens and context windows. A token is the basic unit that large language models use to process text. As a rule of thumb, one token corresponds to roughly 3/4 of an English word, or about 1–2 Chinese characters, though the exact ratio depends on the tokenizer. The context window is the maximum number of tokens a model can "see" and process at once in a single conversation. The original GPT-3.5 supported only a 4K-token context; GPT-4 launched with 8K and 32K variants, and GPT-4 Turbo later expanded this to 128K. Gemini 2.5 Pro's 1-million-token context means the model can process approximately 750,000 English words, or tens of thousands of lines of code, in a single interaction. The engineering challenge lies in the attention mechanism: in a standard Transformer, the computational cost of attention scales quadratically with sequence length. Google has not published the full details of how Gemini handles this, but techniques such as Ring Attention and sparse attention are among the known approaches for reducing the overhead of long-sequence processing.
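The arithmetic behind these figures can be sketched in a few lines of Python. The 0.75 words-per-token ratio is the rule of thumb quoted above rather than a tokenizer measurement, and the function names are illustrative only.

```python
# Rule of thumb from the text: one token is about 3/4 of an English word.
WORDS_PER_TOKEN = 0.75


def tokens_to_words(tokens: int) -> int:
    """Approximate number of English words that fit in a token budget."""
    return int(tokens * WORDS_PER_TOKEN)


def attention_cost_ratio(long_ctx: int, short_ctx: int) -> float:
    """Relative cost of full self-attention at two context lengths,
    assuming cost grows with the square of the sequence length."""
    return (long_ctx / short_ctx) ** 2


if __name__ == "__main__":
    print(tokens_to_words(1_000_000))                # 750000
    print(attention_cost_ratio(1_000_000, 128_000))  # roughly 61
```

Under the quadratic assumption, going from a 128K window to a 1-million-token window multiplies the naive attention cost by about 61, which is why long-context models depend on more efficient attention schemes.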

For developers, the practical value of ultra-long context manifests in three key areas:

Holistic architecture analysis — You can feed an entire project to the model at once for global understanding
Cross-file code refactoring — No more explaining context relationships file by file
Complex dependency mapping — The model can see all inter-module call chains simultaneously

In actual testing, after importing the complete codebase of smolagents, Hugging Face's open-source AI agent framework, into Gemini CLI, it accurately analyzed the project's main module responsibilities, data flow patterns, and design pattern usage. It even identified potential architectural issues and provided refactoring suggestions, including function complexity optimization and dependency relationship improvements.
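Before feeding a whole project to the model, it can help to check roughly how large it is. The sketch below estimates a source tree's token count from its character count. The 4-characters-per-token figure is an assumed heuristic, not a measurement from Gemini's tokenizer, and the helper names are hypothetical.

```python
import os

CHARS_PER_TOKEN = 4        # assumed heuristic, not a tokenizer measurement
CONTEXT_LIMIT = 1_000_000  # context window size quoted in this article


def estimate_tokens(root: str, extensions: tuple = (".py",)) -> int:
    """Walk a source tree and estimate its token count from character counts."""
    total_chars = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    total_chars += len(handle.read())
    return total_chars // CHARS_PER_TOKEN


def fits_in_context(root: str, limit: int = CONTEXT_LIMIT) -> bool:
    """True if the estimated token count is within the context window."""
    return estimate_tokens(root) <= limit
```

A real run would also need headroom for the prompt, tool output, and the model's reply, so treat the result as a rough upper bound on what fits.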

Installation & Basic Configuration

Environment Setup

Before installing Gemini CLI, make sure Node.js is installed on your system (version 20 or later is recommended). Download the appropriate installer for your operating system from the official Node.js site.

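The installation command is straightforward: copy the official installation command from the Gemini CLI repository (at the time of writing, npm install -g @google/gemini-cli) and run it in your terminal, then start the tool by typing gemini: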

Mac/Linux: Open Terminal and execute directly
Windows: Open CMD and execute

During installation, you'll be prompted to choose a theme (the default dark theme works fine), then you'll need to log in with your Google account to complete authentication. After successful login, the terminal will display that it's using the Gemini 2.5 Pro model by default.

Essential Commands at a Glance

You can view all available operations with the /help command. Here are several key commands worth remembering:

/mcp: List configured MCP Servers and the tools they expose
/memory: Set up and manage memory files
/tools: View all available tools
! prefix: Execute shell commands, e.g., !pwd to display the current path

In practice, it's best to launch Gemini CLI directly within the built-in terminal of VS Code or PyCharm, so you can combine the IDE's file management and code editing capabilities with the CLI.

MCP Server Extensions: Supercharging Gemini CLI

MCP Protocol: The "USB Port" for AI Tools

MCP (Model Context Protocol) is a standardized protocol open-sourced by Anthropic in late 2024, designed to provide large language models with a unified interface for calling external tools. MCP uses a client-server architecture: AI tools (such as Gemini CLI, Claude Code) act as MCP clients, while various external services (such as document retrieval, database queries, project management tools) run as MCP Servers. The two communicate via the JSON-RPC 2.0 protocol — MCP Servers expose a list of callable tools and parameter definitions to the client, and the client automatically selects the appropriate tool based on user intent and initiates the call. This design is analogous to what the USB protocol is to hardware devices — once the standard is established, any developer can write an MCP Server that conforms to the protocol, giving AI tools new capabilities without modifying the AI tool's own code.
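To make the client-server exchange concrete, here is a minimal sketch of the two JSON-RPC 2.0 requests a client sends: one to discover tools and one to invoke a tool. The method names tools/list and tools/call come from the MCP specification. The tool name search_docs and its query argument are hypothetical placeholders, not a real server's interface.

```python
import itertools
import json
from typing import Optional

_ids = itertools.count(1)  # each JSON-RPC request carries its own id


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request object."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def list_tools_request() -> dict:
    """Ask the server which tools it exposes."""
    return jsonrpc_request("tools/list")


def call_tool_request(name: str, arguments: dict) -> dict:
    """Ask the server to run one tool with the given arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    print(json.dumps(list_tools_request(), indent=2))
    # "search_docs" and "query" are hypothetical, for illustration only.
    print(json.dumps(call_tool_request("search_docs", {"query": "AutoGen"}), indent=2))
```

A real client also performs an initialization handshake and handles responses and errors. This sketch only shows the shape of the request messages.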

How to Configure MCP Servers

MCP Server configuration is one of Gemini CLI's most differentiating features. By editing the configuration file, you can connect various external tools to Gemini CLI.

Configuration steps:

Navigate to the Gemini CLI configuration directory in your terminal (by default, the .gemini folder in your home directory)
Open the settings.json file there with an editor such as nano
Add the MCP Server JSON configuration to the file under the mcpServers key

In our hands-on testing, we configured two commonly used MCP Servers:

Context7: Capable of fetching the latest documentation for the vast majority of open-source projects and libraries, effectively solving the problem of LLM training data lag. Since large language models have a knowledge cutoff date, they may still reference deprecated APIs for rapidly evolving open-source projects. Context7 ensures generated code is based on current API versions by retrieving the latest documentation in real time.
Taskmaster: Capable of generating Product Requirements Documents (PRDs) and breaking them down into actionable subtasks, helping developers transform vague product ideas into structured development plans.
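As an alternative to editing the file by hand, the two servers above could be registered with a short script. This is a sketch under several assumptions. It assumes the settings file is settings.json with server entries under an mcpServers key. The npx launch commands shown are the ones these projects have published, but they may have changed. The server keys context7 and taskmaster-ai are arbitrary labels. Check each server's README before relying on any of this.

```python
import json
from pathlib import Path

# Assumed launch commands; verify against each server's own documentation.
MCP_SERVERS = {
    "context7": {"command": "npx", "args": ["-y", "@upstash/context7-mcp"]},
    "taskmaster-ai": {
        "command": "npx",
        "args": ["-y", "--package=task-master-ai", "task-master-ai"],
    },
}


def add_mcp_servers(settings_path: Path, servers: dict) -> dict:
    """Merge server entries into "mcpServers" while keeping other settings."""
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    settings.setdefault("mcpServers", {}).update(servers)
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return settings
```

Calling add_mcp_servers(Path.home() / ".gemini" / "settings.json", MCP_SERVERS) would update the user-level settings under the assumed default path. Some servers, Taskmaster among them, may also expect API keys to be supplied through environment variables.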

After configuration, type /mcp in Gemini CLI to view all configured MCP Servers and their supported tools.

Hands-On: Building an AI Agent Workflow with AutoGen

To validate the practical effectiveness of MCP Servers, we tested a complete development scenario — building an AI agent workflow using Microsoft's AutoGen framework.

AutoGen is a multi-agent conversation framework open-sourced by Microsoft Research. Its core philosophy is to accomplish complex tasks through collaborative dialogue between multiple AI agents. Unlike single-agent approaches, AutoGen allows developers to define multiple agents with different roles and capabilities. These agents can send messages to each other, review each other's outputs, and iteratively optimize results. AutoGen version 0.4 underwent a major architectural overhaul, introducing event-driven asynchronous communication mechanisms and more flexible agent orchestration patterns. This multi-agent collaboration model simulates the Code Review process in software engineering, improving code quality by introducing a "second pair of eyes."

After entering the prompt, Gemini CLI first used Context7 to search for AutoGen's latest documentation and new features, then wrote a workflow containing three agents based on the latest API:

Code Generation Agent: Writes initial code based on requirements
Code Review Agent: Reviews the generated code and provides improvement suggestions
Code Integration Agent: Synthesizes the outputs of the first two agents to produce the final optimized code

During the test run, the three agents worked collaboratively: the first agent generated a Python function to find the Nth prime number, the second agent reviewed the code and proposed optimizations, and the third agent integrated all the information to output a more complete final version. The entire process required no manual intervention and ran successfully on the first attempt.
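The generated AutoGen code is not reproduced here, but the shape of the generate, review, integrate pipeline can be sketched without any framework. In the sketch below each "agent" is a plain Python function with a hard-coded output standing in for an LLM call. It does not use AutoGen's API, and all names are illustrative.

```python
def generate(state: dict) -> dict:
    """Code Generation Agent: produce a first draft (hard-coded here)."""
    draft = (
        "def nth_prime(n):\n"
        "    count, candidate = 0, 1\n"
        "    while count < n:\n"
        "        candidate += 1\n"
        "        if all(candidate % d for d in range(2, candidate)):\n"
        "            count += 1\n"
        "    return candidate\n"
    )
    return {**state, "draft": draft}


def review(state: dict) -> dict:
    """Code Review Agent: attach improvement notes to the draft."""
    notes = [
        "Only test divisors up to the square root of the candidate.",
        "Reject n < 1 with a clear error.",
    ]
    return {**state, "review": notes}


def integrate(state: dict) -> dict:
    """Code Integration Agent: apply the review notes to the draft."""
    final = (
        "import math\n\n"
        "def nth_prime(n):\n"
        "    if n < 1:\n"
        "        raise ValueError('n must be a positive integer')\n"
        "    count, candidate = 0, 1\n"
        "    while count < n:\n"
        "        candidate += 1\n"
        "        limit = math.isqrt(candidate)\n"
        "        if all(candidate % d for d in range(2, limit + 1)):\n"
        "            count += 1\n"
        "    return candidate\n"
    )
    return {**state, "final": final}


def run_pipeline(task: str, agents: list) -> dict:
    """Pass a shared state dict through each agent in order."""
    state = {"task": task}
    for agent in agents:
        state = agent(state)
    return state
```

Running run_pipeline("Find the Nth prime", [generate, review, integrate]) returns a state dict holding the draft, the review notes, and the integrated version. In a real AutoGen workflow the agents exchange messages through the framework instead of a shared dict.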

Memory Files: Making AI Follow Your Development Standards

Creating Project-Level Memory Files

Memory Files are another core feature of Gemini CLI, allowing developers to set persistent rules for a project that the AI follows across all subsequent interactions.

From a technical perspective, memory files are an implementation of System Prompt Engineering. In traditional LLM interactions, developers need to repeat their tech stack preferences, coding conventions, and other constraints at the beginning of every conversation. This not only wastes token quota but also leads to inconsistent outputs due to omissions. Memory files persist these constraints as project-level configuration, similar to what .editorconfig or .eslintrc is to code editors — they define the project's "meta-rules" that are automatically injected into the model's context with each interaction. Notably, this pattern is becoming the standard configuration paradigm for AI coding tools: the equivalent feature in Claude Code is the CLAUDE.md file, while Cursor uses .cursorrules files.
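Conceptually, the injection step amounts to prepending the file's contents to every request. The sketch below illustrates that idea only. It is not Gemini CLI's actual loading logic, and the function name and separator are assumptions made for illustration.

```python
from pathlib import Path


def build_context(project_root: Path, user_prompt: str,
                  memory_filename: str = "GEMINI.md") -> str:
    """Prepend the project's memory file, if present, to the user's prompt."""
    memory_path = project_root / memory_filename
    if not memory_path.is_file():
        return user_prompt  # no project rules to inject
    rules = memory_path.read_text(encoding="utf-8").strip()
    return f"{rules}\n\n---\n\n{user_prompt}"
```

Because the rules travel with every request, the developer states them once in the file instead of repeating them in each conversation.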

To create one, simply add a GEMINI.md file in your project root directory and define your development standards. A complete memory file typically includes:

Tech stack constraints: e.g., Python 3.11, AutoGen 0.4, using venv virtual environments
Environment configuration notes: Virtual environment creation, activation methods, dependency installation commands
Coding standards & style: Naming conventions, commenting requirements, etc.
Project structure definition: Directory organization
Tool usage strategies: e.g., "Always use Context7 to search for the latest documentation," "All code examples should use Chinese comments"
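As an illustration, the script below scaffolds a GEMINI.md covering the five areas above. The tech stack entries and the Context7 rule come from this article. The naming conventions, directory layout, and setup commands are placeholder assumptions to replace with your own project's standards.

```python
from pathlib import Path

# Placeholder rules: adapt every section to your own project.
MEMORY_TEMPLATE = """\
# Project Rules

## Tech Stack
- Python 3.11
- AutoGen 0.4
- Use a venv virtual environment

## Environment Setup
- Create: python -m venv .venv
- Activate: source .venv/bin/activate
- Install dependencies: pip install -r requirements.txt

## Coding Standards
- snake_case for functions and variables, PascalCase for classes
- Every public function has a docstring

## Project Structure
- src/ for application code, tests/ for tests

## Tool Usage
- Always use Context7 to search for the latest documentation
"""


def write_memory_file(project_root: Path, overwrite: bool = False) -> Path:
    """Create GEMINI.md in the project root without clobbering an existing one."""
    path = project_root / "GEMINI.md"
    if path.exists() and not overwrite:
        raise FileExistsError(f"{path} already exists")
    path.write_text(MEMORY_TEMPLATE, encoding="utf-8")
    return path
```

The file is ordinary Markdown, so it can just as easily be written by hand. The script form is mainly useful when the same baseline rules are shared across several projects.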

Once set up, use /memory refresh to reload the memory file and /memory show to confirm it loaded successfully.

Validating the Results in Practice

With the memory file configured, entering a simple prompt like "Build me an AI agent that can create travel itineraries" caused Gemini CLI to automatically follow all the rules defined in the memory file: building with Python 3.11, adhering to project conventions, and organizing code according to the specified directory structure.

It first output a step-by-step development plan. After confirmation, it began creating project files and writing code. When runtime errors occurred, simply pasting the error messages back to Gemini CLI allowed it to quickly locate and fix the issues. After successful execution, entering "Create a 3-day travel plan for Nepal" produced a comprehensive travel itinerary from the agent, complete with daily schedules, budget estimates, transportation options, and attraction recommendations.

Taskmaster Integration: From Requirements to Task Breakdown

Beyond code development, Gemini CLI paired with the Taskmaster MCP Server can also handle project management tasks. In testing, entering "Develop a TodoList App for iOS, generate a PRD and break it down into 10 subtasks" prompted Gemini CLI to invoke Taskmaster and automatically:

Generate a complete Product Requirements Document and save it to a file
Break the PRD down into 10 specific development subtasks and save them

Developers can then use these broken-down subtasks to continue building the entire project step by step within Gemini CLI, creating a complete loop from requirements analysis to code implementation.

Summary & Outlook

The release of Gemini CLI marks a new phase for AI coding tools. The 1-million-token context window addresses the pain point of large-scale project analysis, the MCP Server extension mechanism provides an open-ended way to add new capabilities, and the memory file feature moves AI coding from inconsistent, ad hoc responses toward development that follows a project's own standards.

The AI command-line coding tool market has now formed a three-way competition: Anthropic's Claude Code is renowned for its depth of code understanding and agentic coding capabilities, excelling in complex refactoring tasks; OpenAI's Codex CLI leverages the broad user base of the GPT model family, emphasizing seamless integration with the ChatGPT ecosystem; and Google's Gemini CLI differentiates itself with ultra-long context and free usage quotas. Beyond command-line tools, IDE-integrated tools like GitHub Copilot, Cursor, and Windsurf are also competing for the developer market, pushing the entire AI coding tool space into fierce competition in 2025.

For developers already using Claude Code or Codex CLI, Gemini CLI's biggest differentiator lies in its ultra-long context and flexible MCP ecosystem. For developers new to AI coding tools, Gemini CLI's free tier and relatively simple configuration process also lower the barrier to entry. As the MCP ecosystem continues to grow, Gemini CLI's practical development capabilities will only continue to strengthen.