GitHub Copilot SDK Deep Dive: Core Features and Practical Guide

At the Microsoft Build //localhost:Shanghai event, a Microsoft Foundry MVP delivered an in-depth session on the core concepts, key features, and live demos of the GitHub Copilot SDK. With the Copilot SDK officially reaching GA during the Build conference, developers can now embed the AI engine behind Copilot into custom applications through a programmatic interface. This article summarizes the key takeaways from that session.

What Is the Copilot SDK?

Microsoft has released several products around Copilot. Coding Agent runs in GitHub's cloud-hosted environment and handles tasks such as checking out code, analyzing the task, and modifying code. Copilot CLI provides AI assistance through command-line interaction. Copilot SDK lets developers embed the same AI engine that powers Copilot CLI into custom applications through a programmatic interface.

Currently, the Copilot SDK supports six programming languages including Python, .NET, Go, and TypeScript. From its initial release in January to its official GA during the Build conference, it went through roughly six months of iteration.

Bundled CLI Integration Approach

The core architecture of the Copilot SDK adopts a Bundled CLI approach: the Copilot CLI is bundled directly into your application. Developers don't need to deal with the complexity of integrating the underlying AI services, and users can start using the application immediately after installation. Here's how it works: the application launches a CLI process through the SDK Client, the two communicate via standard input/output, and the CLI then acts as a proxy to the cloud-based Copilot service.
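The stdio link between an application and a child process can be sketched as follows. This is a minimal illustration, not the SDK's actual implementation: the child script stands in for the bundled CLI, and the `StdioClient` name and the one-JSON-object-per-line framing are assumptions made for the example.

```python
import json
import subprocess
import sys

# Stand-in for the bundled CLI (hypothetical): reads one JSON request per
# line on stdin and writes one JSON reply per line on stdout.
FAKE_CLI_SOURCE = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"id": request["id"], "result": "echo: " + request["prompt"]}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


class StdioClient:
    """Owns the child process and exchanges JSON lines with it."""

    def __init__(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", FAKE_CLI_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        self._next_id = 0

    def send(self, prompt):
        self._next_id += 1
        request = {"id": self._next_id, "prompt": prompt}
        self._proc.stdin.write(json.dumps(request) + "\n")
        self._proc.stdin.flush()
        return json.loads(self._proc.stdout.readline())

    def close(self):
        self._proc.stdin.close()  # EOF ends the child's read loop
        self._proc.wait()
        self._proc.stdout.close()
```

Because the application owns the child process, it also controls which CLI version runs and when the process is shut down.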

This approach delivers four key advantages:

All-in-one delivery: The CLI is bundled with the application — no extra installation needed
SDK version management: Unified dependency version control
Flexible authentication strategies: Support for multiple authentication methods
User-level session management: Independent session context for each user

The entire integration flow is remarkably concise — create a client, create a session, send a request, handle the response — all in just a few lines of code.

Agent Loop: The Decision Engine of Intelligent Agents

The Agent Loop is the core mechanism of Copilot CLI, defining how an agent thinks and acts. You simply tell it the goal, and it autonomously formulates plans, invokes tools, reflects on results, and continues looping until the task is complete.

[Figure: Agent Loop architecture and workflow]

System Architecture and Tool-Use Loop

The Agent Loop consists of four components:

App: The application entry point that initiates requests
SDK: The messenger responsible for passing messages
Copilot CLI: The orchestrator that coordinates all activities
Large Language Model: The intelligent brain that makes key decisions

At its core is the tool-use loop. Each iteration represents a complete LLM API call, where the model decides based on the current context whether to continue calling tools for more information or to provide a final answer directly.

This introduces an important concept — Turns: one turn equals one complete LLM API call plus its subsequent tool executions. For example, when you ask a complex question about a codebase, Copilot might need multiple turns: the first turn searches for files, the second reads core content, the third reads dependency files, and the fourth finally delivers the answer.
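The tool-use loop and the notion of a turn can be illustrated with a small simulation. This is a sketch under stated assumptions: `scripted_model` is a hypothetical stand-in for the LLM that follows a fixed plan, and the tool names and return strings are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelReply:
    text: str = ""
    tool_name: Optional[str] = None  # None means "final answer"
    tool_arg: str = ""


# Hypothetical tools; real tools would touch the file system.
TOOLS = {
    "search_files": lambda query: f"found {query}.py",
    "read_file": lambda path: f"contents of {path}",
}

PLAN = [("search_files", "auth"), ("read_file", "auth.py"), ("read_file", "tokens.py")]


def scripted_model(history):
    """Stand-in for the LLM: request the next planned tool, then answer."""
    tool_results = [entry for entry in history if entry[0] == "tool"]
    if len(tool_results) < len(PLAN):
        name, arg = PLAN[len(tool_results)]
        return ModelReply(tool_name=name, tool_arg=arg)
    return ModelReply(text=f"Answer based on {len(tool_results)} tool results")


def run_agent_loop(prompt, model=scripted_model, tools=TOOLS, max_turns=10):
    """Return (answer, turns_used). One model call per turn."""
    history = [("user", prompt)]
    for turn in range(1, max_turns + 1):
        reply = model(history)
        if reply.tool_name is None:
            return reply.text, turn  # the model chose to answer directly
        result = tools[reply.tool_name](reply.tool_arg)
        history.append(("tool", result))  # tool output feeds the next turn
    raise RuntimeError("max_turns exceeded without a final answer")
```

With the scripted plan above, `run_agent_loop("How does auth work?")` uses four turns: three that call tools and a fourth that delivers the answer, mirroring the example in the text.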

Event Stream and Completion Mechanisms

Each turn starts with turn_start and ends with turn_end. In between come internal events such as assistant_message (the LLM response) and tool_execution_start/tool_execution_complete (tool execution tracking). After all turns are complete, a session.idle event is emitted.

Two completion signals need to be distinguished:

session.idle: A mechanical signal meaning "I'm idle now" — triggered whenever the loop ends, regardless of whether the task is actually complete
session.taskComplete: A semantic signal meaning "I believe the task has been fully completed" — only triggered when the LLM proactively calls a specific tool
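The difference between the two signals can be shown with a small event consumer. This is a sketch, not SDK code: events are modeled as plain dictionaries, the event type strings are taken from the text above, and `summarize_session` is a hypothetical helper.

```python
def summarize_session(events):
    """Count turns and record which completion signals were seen."""
    summary = {"turns": 0, "idle": False, "task_complete": False}
    for event in events:
        event_type = event["type"]
        if event_type == "turn_start":
            summary["turns"] += 1
        elif event_type == "session.taskComplete":
            summary["task_complete"] = True  # semantic: the model says it is done
        elif event_type == "session.idle":
            summary["idle"] = True  # mechanical: the loop has stopped
    return summary


# A session that stops without the model declaring the task complete.
stopped_early = [
    {"type": "turn_start"},
    {"type": "assistant_message"},
    {"type": "turn_end"},
    {"type": "session.idle"},
]

# A session in which the model explicitly signals completion before going idle.
finished = stopped_early[:-1] + [
    {"type": "session.taskComplete"},
    {"type": "session.idle"},
]
```

Both streams end with session.idle, so an application that needs to know whether the work is actually finished should check for the semantic signal as well.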

Hooks: Fine-Grained Control Over Session Lifecycle

Hooks are callback functions triggered at specific points during the session lifecycle. From session start to end, every key step has a corresponding Hook, giving developers fine-grained control over the process flow.

[Figure: Hooks feature overview]

Four Practical Use Cases

Permission control: Use onPreToolUse to create read-only agents that only allow safe read tools
Audit compliance: Combine multiple Hooks to log every action from session start to end, generating structured audit logs
Real-time notifications: Monitor agent execution status and push notifications
Error handling: Catch exceptions and provide graceful degradation strategies

Best practices for using Hooks include: keeping Hook execution fast, making explicit return decisions, and managing state properly.
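A read-only permission hook combined with audit logging might look like the sketch below. The hook signature, the shape of the returned decision, and the tool names are all assumptions for illustration; the SDK's actual onPreToolUse contract should be taken from its documentation.

```python
# Hypothetical names of tools considered safe for a read-only agent.
READ_ONLY_TOOLS = frozenset({"read_file", "search_files", "list_directory"})


def make_pre_tool_use_hook(audit_log):
    """Build a hook that allows read-only tools and records every decision."""

    def on_pre_tool_use(tool_name, tool_args):
        allowed = tool_name in READ_ONLY_TOOLS  # set lookup keeps the hook fast
        decision = {
            "decision": "allow" if allowed else "deny",
            "reason": "read-only tool" if allowed else "tool may modify state",
        }
        # Structured audit entry: which tool, with what arguments, and the outcome.
        audit_log.append({"tool": tool_name, "args": tool_args, **decision})
        return decision  # always return an explicit decision

    return on_pre_tool_use
```

Passing the log in through a closure keeps the hook's state explicit, which makes it straightforward to give each session its own audit trail.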

Remote Sessions: Cross-Device Agent Access

Remote Sessions give Copilot sessions a capability similar to a remote desktop. The SDK connects to GitHub's Mission Control service and, after authentication, generates a unique URL through which you can access and control a locally running Copilot session from a browser or mobile device.

In the SDK, simply set the remote option to true when creating a client, and all sessions will automatically enable remote access. The SDK also recommends converting the remote URL into a QR code for convenient mobile device scanning.

Custom Agents: Specialized Agent Orchestration

Custom Agents are a critically important feature of the Copilot SDK. Each Agent can be thought of as a specialist with a specific role, tools, and knowledge.

[Figure: Custom Agent sub-agent event handling]

Four Orchestration Patterns Explained

Pipeline pattern: Sequential processing like a factory assembly line — ideal for tasks with clear sequential dependencies
Parallel orchestration: Multiple Agents work simultaneously, significantly improving processing efficiency
Supervisor pattern: A central Agent coordinates everything — suitable for complex scenarios requiring global coordination
Handoff pattern: Agents dynamically decide who handles the next step — offering maximum flexibility
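The control flow of the four patterns can be sketched with plain functions standing in for agents. This is an illustration of the patterns only, not the SDK's orchestration API; every name below is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def pipeline(agents, task):
    """Pipeline: each agent's output becomes the next agent's input."""
    for agent in agents:
        task = agent(task)
    return task


def parallel(agents, task):
    """Parallel: every agent works on the same task at the same time."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda agent: agent(task), agents))


def supervisor(route, agents, task):
    """Supervisor: a central router picks which agent handles the task."""
    return agents[route(task)](task)


def handoff(agents, start, task, max_hops=5):
    """Handoff: each agent returns its result plus the name of the next agent."""
    current = start
    for _ in range(max_hops):
        task, current = agents[current](task)
        if current is None:  # no further handoff requested
            return task
    raise RuntimeError("handoff chain did not terminate")


# Toy agents that just annotate the task string.
def draft(task):
    return task + " -> drafted"


def review(task):
    return task + " -> reviewed"
```

For example, `pipeline([draft, review], "spec")` returns `"spec -> drafted -> reviewed"`, while `parallel([draft, review], "spec")` returns both results independently in agent order.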

There are two ways to define an Agent: programmatically through the SDK, or declaratively through Markdown files. Key configuration parameters include name, description, tools, and MCP Server settings.

Agent Design Best Practices

Follow the Single Responsibility Principle — let each Agent focus on doing one thing well
Write precise description fields — these are the key basis for agent routing decisions
Strictly follow the Principle of Least Privilege — only grant access to necessary tools
Design tools to be model-friendly — keep interfaces simple and parameters standardized
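Some of these practices can be checked mechanically. The sketch below assumes a hypothetical `AgentConfig` structure holding the name, description, and tools mentioned above; the thresholds in `lint_agent` are arbitrary illustrative values, not SDK rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    name: str
    description: str
    tools: tuple = ()


def lint_agent(agent, min_description_words=5, max_tools=5):
    """Return warnings for configurations that ignore the practices above."""
    warnings = []
    if len(agent.description.split()) < min_description_words:
        warnings.append("description too short to guide routing")
    if len(agent.tools) > max_tools:
        warnings.append("too many tools for a single-responsibility agent")
    return warnings


def tools_for(agent, registry):
    """Least privilege: expose only the tools the agent explicitly lists."""
    return {name: registry[name] for name in agent.tools if name in registry}


reviewer = AgentConfig(
    name="reviewer",
    description="Reviews Python diffs for bugs and style issues",
    tools=("read_file",),
)
```

Here `tools_for(reviewer, registry)` would hand the reviewer only `read_file`, even if the registry also contains write-capable tools.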

Skills: Reusable Prompt Modules

Skills are essentially Markdown files containing specific instructions — think of them as intelligent plugins. Their core value lies in:

Encapsulating expert tacit knowledge into executable instructions
Cross-project sharing for improved reusability
Organizing complex AI configurations
Flexible enabling or disabling

Building Skills follows a "convention over configuration" principle: create a skills directory, create a subdirectory for each skill, and place a SKILL.md file inside. The file starts with YAML front matter defining the name and description, and the body contains the instruction set written in Markdown.
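The file layout described above (front matter followed by a Markdown body) can be read with a few lines of code. This is a simplified sketch: `parse_skill` is a hypothetical helper that handles only flat `key: value` front matter, whereas a real loader would use a full YAML parser, and the sample skill content is invented.

```python
def parse_skill(text):
    """Split a skill file into (metadata dict, Markdown body)."""
    lines = [line.rstrip() for line in text.splitlines()]
    if not lines or lines[0] != "---":
        raise ValueError("skill file must start with front matter")
    try:
        end = lines.index("---", 1)  # closing delimiter of the front matter
    except ValueError:
        raise ValueError("front matter is not terminated") from None
    metadata = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, separator, value = line.partition(":")
        if not separator:
            raise ValueError(f"expected 'key: value', got {line!r}")
        metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


SAMPLE_SKILL = """---
name: code-review
description: Review Python diffs for bugs and style issues
---
# Instructions

Check error handling first.
"""
```

Calling `parse_skill(SAMPLE_SKILL)` yields the name and description as metadata and the Markdown instructions as the body.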

Skills can be combined with Custom Agents, preloading specific domain expertise when an Agent starts up. They can also complement MCP servers, enabling AI to operate external tools.

Hands-On Demos: From Basics to Advanced

The presenter demonstrated multiple practical scenarios in a VS Code environment:

[Figure: Copilot SDK development environment in VS Code]

Basic Sessions and Streaming Output

The most basic usage requires just a few steps: import the Copilot Client, create and start a client instance, create a session (specifying permissions and model), and send prompts via send_and_wait to get responses. Streaming output is achieved by listening to SessionEventType's SystemMessageData events for real-time content display.
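The shape of this flow, including streaming through an event listener, can be sketched with stand-in classes. Nothing below is the real SDK: `FakeClient`, `FakeSession`, the `on` method, and the `message_delta` event name are hypothetical, and only `create_session` and `send_and_wait` are names mentioned in the session.

```python
import asyncio


class FakeSession:
    """Stand-in session that streams a canned reply word by word."""

    def __init__(self, model):
        self.model = model
        self._listeners = []

    def on(self, listener):
        self._listeners.append(listener)

    def _emit(self, event):
        for listener in self._listeners:
            listener(event)

    async def send_and_wait(self, prompt):
        words = f"[{self.model}] you said: {prompt}".split(" ")
        chunks = []
        for index, word in enumerate(words):
            chunk = word if index == 0 else " " + word
            chunks.append(chunk)
            self._emit({"type": "message_delta", "content": chunk})
            await asyncio.sleep(0)  # yield control, as real network I/O would
        self._emit({"type": "session.idle"})
        return "".join(chunks)


class FakeClient:
    """Stand-in client: must be started before sessions can be created."""

    def __init__(self):
        self.started = False

    async def start(self):
        self.started = True

    async def create_session(self, model):
        if not self.started:
            raise RuntimeError("client not started")
        return FakeSession(model)

    async def stop(self):
        self.started = False


async def demo(prompt):
    client = FakeClient()
    await client.start()
    session = await client.create_session(model="demo-model")
    streamed = []
    session.on(
        lambda event: streamed.append(event["content"])
        if event["type"] == "message_delta"
        else None
    )
    answer = await session.send_and_wait(prompt)
    await client.stop()
    return answer, "".join(streamed)
```

The listener sees the reply incrementally, while `send_and_wait` returns only once the full message is available; both views carry the same content.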

Custom Tools and Image Input

Custom tools are registered via DefineTool — the demo implemented a weather query tool (with simulated data). Image input supports both file paths and Base64 encoding, with the SDK automatically handling file reading, encoding, and resizing.
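A weather tool backed by simulated data, plus Base64 encoding of image bytes, might look like the sketch below. The `register_tool` decorator, `call_tool` dispatcher, and `image_to_base64` helper are hypothetical stand-ins and not the SDK's DefineTool; the weather values are made up.

```python
import base64
import json

TOOL_REGISTRY = {}


def register_tool(name, description):
    """Hypothetical decorator that records a handler under a tool name."""

    def decorator(func):
        TOOL_REGISTRY[name] = {"description": description, "handler": func}
        return func

    return decorator


# Simulated data, as in the demo; no real weather service is called.
SIMULATED_WEATHER = {
    "Shanghai": {"temp_c": 26, "condition": "cloudy"},
    "Seattle": {"temp_c": 14, "condition": "rain"},
}


@register_tool("get_weather", "Return the current weather for a city (simulated).")
def get_weather(city):
    report = SIMULATED_WEATHER.get(city)
    if report is None:
        return {"city": city, "error": "no data"}
    return {"city": city, **report}


def call_tool(name, arguments_json):
    """Dispatch a tool call whose arguments arrive as a JSON string."""
    arguments = json.loads(arguments_json)
    return TOOL_REGISTRY[name]["handler"](**arguments)


def image_to_base64(raw_bytes):
    """Encode image bytes as ASCII text suitable for a JSON payload."""
    return base64.b64encode(raw_bytes).decode("ascii")
```

Keeping the description next to the handler matters because the model chooses tools based on that text, not on the function body.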

Local LLM Integration

A noteworthy highlight is the ability to switch the backend from a cloud-hosted GPT model to a model served locally by Ollama. Specify the local model name in create_session, set the Provider to OpenAI (Ollama exposes an OpenAI-compatible API), and point the BaseURL to the Ollama service address. Because the backend is just a URL, you can also use models deployed on other PCs within your local network, which helps meet data privacy and offline usage requirements.
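The three settings can be collected in a small configuration helper. This is a sketch: the dictionary keys (`provider`, `type`, `base_url`) are assumed names rather than the SDK's confirmed schema, and `ollama_session_config` is hypothetical. The default port 11434 and the `/v1` path are Ollama's defaults for its OpenAI-compatible endpoint.

```python
def ollama_session_config(model, host="localhost", port=11434):
    """Build session settings that point at an Ollama server."""
    return {
        "model": model,  # name of a model already pulled into Ollama
        "provider": {
            "type": "openai",  # Ollama speaks the OpenAI wire format
            "base_url": f"http://{host}:{port}/v1",
        },
    }
```

Pointing at another machine on the local network only requires changing `host`, for example `ollama_session_config("llama3", host="192.168.1.20")`.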

FastAPI Web Integration

The demo also showcased wrapping the Copilot SDK as a FastAPI web application, providing a more user-friendly interaction experience through a web interface, including features like model selection and image upload analysis.

Conclusion

The GA release of the GitHub Copilot SDK marks a milestone: developers can now embed the AI engine behind Copilot into custom applications with much greater flexibility. From the Agent Loop's autonomous decision cycle, to the fine-grained control of Hooks, to the specialized orchestration of Custom Agents and the knowledge reuse of Skills, the SDK provides a complete toolchain. Combined with the cross-device access of Remote Sessions and the flexibility of local model integration, it lets developers build truly "ubiquitous" intelligent agent applications.