We are excited to announce that Semantic Kernel (SK) now has first-class support for the Model Context Protocol (MCP) — a standard created by Anthropic to enable models, tools, and agents to share context and capabilities seamlessly.
With this release, SK can act as both an MCP host (client) and an MCP server, and you can leverage these capabilities directly in your agents. This unlocks powerful new scenarios for tool interoperability, prompt sharing, and agent orchestration across local and remote boundaries. This requires Semantic Kernel Python version 1.28.1 or higher.
What is MCP?
MCP is a protocol that standardizes how models, tools, and agents communicate and share context. It supports multiple transport types (stdio, sse, websocket) and allows for dynamic discovery and invocation of tools and prompts. Learn more in the official documentation.
SK as an MCP host: consuming MCP servers
SK can now connect to any MCP server, whether it’s running locally (via stdio), remotely (via sse), or even in a container. This means you can:
- Call tools and prompts exposed by any MCP server as if they were native SK plugins.
- Perform sampling (e.g., text generation) by using all of Semantic Kernel’s service connectors.
- Chain together multiple MCP servers in a single agent.
Example: connecting to a local MCP server via stdio
```python
from semantic_kernel.connectors.mcp import MCPStdioPlugin

async with MCPStdioPlugin(
    name="ReleaseNotes",
    description="SK Release Notes Plugin",
    command="uv",
    args=[
        "--directory=python/samples/demos/mcp_server",
        "run",
        "mcp_server_with_sampling.py",
    ],
) as plugin:
    # Use plugin as a tool in your agent or kernel
    ...
```
Example: connecting to a remote MCP server via sse
```python
from semantic_kernel.connectors.mcp import MCPSsePlugin

async with MCPSsePlugin(
    name="RemoteTools",
    url="http://localhost:8000/sse",
) as plugin:
    ...
```
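Once connected, an MCP plugin behaves like any other SK plugin. Here is a minimal usage sketch, assuming a kernel with an OpenAI chat completion service already set up; the prompt text and service configuration are illustrative, not part of the samples:

```python
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import (
    OpenAIChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.functions import KernelArguments

kernel = Kernel()
kernel.add_service(OpenAIChatCompletion(service_id="default"))

# Inside the async with block above: register the MCP plugin's tools
kernel.add_plugin(plugin)

# Let the model call the MCP tools automatically via function calling
settings = OpenAIChatPromptExecutionSettings(
    function_choice_behavior=FunctionChoiceBehavior.Auto()
)
result = await kernel.invoke_prompt(
    "Summarize the latest release notes.",
    arguments=KernelArguments(settings=settings),
)
print(result)
```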
Sampling: handling sampling requests
By default, whenever an MCP plugin is added to a kernel, either directly or through an agent, it can use all of the chat completion services registered in that kernel to perform sampling. This happens automatically and requires no setup beyond making sure the kernel you are using has services registered; see the sampling section below for how to issue such a request from a server.
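Concretely, a minimal sketch of the host side (the service choice is illustrative): as long as the kernel holding the plugin has a chat completion service registered, sampling requests from the server are fulfilled with it.

```python
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
from semantic_kernel.connectors.mcp import MCPStdioPlugin

kernel = Kernel()
# Any registered chat completion service can answer the server's sampling requests
kernel.add_service(OpenAIChatCompletion(service_id="default"))

async with MCPStdioPlugin(
    name="ReleaseNotes",
    command="uv",
    args=["--directory=python/samples/demos/mcp_server", "run", "mcp_server_with_sampling.py"],
) as plugin:
    kernel.add_plugin(plugin)  # sampling now flows through the kernel's services
    ...
```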
SK as an MCP server: exposing your functions and prompts
You can now expose your SK functions and prompts as an MCP server, making them available to any MCP-compatible client or agent. This is perfect for sharing custom tools, chaining agents, or integrating with other ecosystems.
Example: exposing SK as an MCP server
```python
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion

kernel = Kernel()
kernel.add_service(OpenAIChatCompletion(service_id="default"))
# Add functions and prompts as usual

server = kernel.as_mcp_server(server_name="sk")

# Run as stdio server
import anyio
from mcp.server.stdio import stdio_server

async def handle_stdin():
    async with stdio_server() as (read_stream, write_stream):
        await server.run(read_stream, write_stream, server.create_initialization_options())

anyio.run(handle_stdin)
```
You can also run as an sse server for remote access:
```bash
uv --directory=python/samples/demos/mcp_server \
   run sk_mcp_server.py --transport sse --port 8000
```
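The sample wires this up for you via the --transport flag. For reference, a rough sketch of the underlying wiring, assuming the MCP SDK's Starlette-based SSE transport and reusing the server object created above (route paths, host, and port here are assumptions, not the sample's exact code):

```python
import uvicorn
from mcp.server.sse import SseServerTransport
from starlette.applications import Starlette
from starlette.routing import Mount, Route

sse = SseServerTransport("/messages/")

async def handle_sse(request):
    # Bridge the incoming SSE connection to the MCP server created above
    async with sse.connect_sse(request.scope, request.receive, request._send) as (
        read_stream,
        write_stream,
    ):
        await server.run(read_stream, write_stream, server.create_initialization_options())

app = Starlette(
    routes=[
        Route("/sse", endpoint=handle_sse),
        Mount("/messages/", app=sse.handle_post_message),
    ]
)

uvicorn.run(app, host="0.0.0.0", port=8000)
```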
Example: Claude Desktop configuration to connect to your SK MCP server
{ "mcpServers": { "sk": { "command": "uv", "args": [ "--directory=/path/to/your/semantic-kernel/python/samples/demos/mcp_server", "run", "sk_mcp_server.py" ], "env": { "OPENAI_API_KEY": "", "OPENAI_CHAT_MODEL_ID": "gpt-4o-mini" } } } }
Sampling support: create functions that request a generation from their host
With MCP, you can expose functions that delegate sampling (e.g., text generation) to the host. Since Semantic Kernel can connect to many models directly, this may seem redundant, but it is useful when you run SK as a server in an environment that is not allowed to call out to those models directly, while the host can.
To use this, you can create a KernelFunction that receives a server object (excluded from function choice) and uses it to request sampling from the MCP host:
```python
from typing import Annotated

from mcp import types
from mcp.server.lowlevel import Server
from semantic_kernel.functions import kernel_function

@kernel_function(
    name="run_prompt",
    description="Run the prompts for a full set of release notes based on the PR messages given.",
)
async def sampling_function(
    messages: Annotated[str, "The list of PR messages, as a string with newlines"],
    temperature: float = 0.0,
    max_tokens: int = 1000,
    server: Annotated[Server | None, "The server session", {"include_in_function_choices": False}] = None,
) -> str:
    if not server:
        raise ValueError("Request context is required for sampling function.")
    # Ask the MCP host to perform the generation on our behalf
    sampling_response = await server.request_context.session.create_message(
        messages=[
            types.SamplingMessage(role="user", content=types.TextContent(type="text", text=messages)),
        ],
        max_tokens=max_tokens,
        temperature=temperature,
        model_preferences=types.ModelPreferences(
            hints=[types.ModelHint(name="gpt-4o-mini")],
        ),
    )
    return sampling_response.content.text
```
This function can be exposed as a tool on your MCP server, and when called, it will use the host's sampling endpoint to generate the output. The server argument is automatically injected by the MCP infrastructure and is not shown to users or in function choice UIs.
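Registering the function follows the same pattern as any other kernel function; a minimal sketch (the plugin name here is an assumption):

```python
from semantic_kernel import Kernel

kernel = Kernel()
# Register the sampling function like any other kernel function
kernel.add_function(plugin_name="release_notes", function=sampling_function)

# Expose it over MCP; run via stdio or sse as shown above
server = kernel.as_mcp_server(server_name="sk")
```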
See the mcp_server_with_sampling.py sample for a full walkthrough.
Using MCP with agents: tool calls and more
You can use MCP plugins directly in your SK agents, allowing your agent to:
- Call tools and prompts from any MCP server
- Chain together multiple MCP servers (e.g., GitHub + Release Notes)
- Use sampling and tool calls as part of the agent’s reasoning
Example: agent with multiple MCP plugins
```python
import os

from semantic_kernel.agents import ChatCompletionAgent
from semantic_kernel.connectors.ai.ollama import OllamaChatCompletion
from semantic_kernel.connectors.mcp import MCPStdioPlugin

async with (
    MCPStdioPlugin(
        name="Github",
        command="docker",
        args=["run", "-i", "--rm", "-e", "GITHUB_PERSONAL_ACCESS_TOKEN", "ghcr.io/github/github-mcp-server"],
        env={"GITHUB_PERSONAL_ACCESS_TOKEN": os.getenv("GITHUB_PERSONAL_ACCESS_TOKEN")},
    ) as github_plugin,
    MCPStdioPlugin(
        name="ReleaseNotes",
        command="uv",
        args=["--directory=python/samples/demos/mcp_server", "run", "mcp_server_with_prompts.py"],
    ) as release_notes_plugin,
):
    agent = ChatCompletionAgent(
        service=OllamaChatCompletion(),
        name="GithubAgent",
        plugins=[github_plugin, release_notes_plugin],
    )
    ...
```
See the local_agent_with_local_server.py sample for a full example.
Exposing agents as MCP servers
You can also expose an entire SK agent as an MCP server, making its reasoning and tool orchestration available to other clients and agents. This enables powerful agent-to-agent collaboration and chaining.
Example: exposing an agent as an MCP server
```python
from semantic_kernel.agents import ChatCompletionAgent
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion

agent = ChatCompletionAgent(
    service=OpenAIChatCompletion(),
    name="ReleaseNotesAgent",
    instructions="You are a release notes generator agent. Use the run_prompt function to generate release notes.",
    plugins=[release_notes_plugin],  # e.g., the ReleaseNotes MCP plugin from the earlier example
)

server = agent.as_mcp_server(server_name="release_notes_agent")
# Now you can run this server using stdio or sse as shown above
```
You can even go a step further: create multiple agents as servers, then consume those servers as plugins in another agent to build a simple multi-agent setup, as sketched below. See the agent_with_mcp_agent.py sample for how this works.
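A rough sketch of that pattern; the command, script name, and agent names below are illustrative, not the sample's exact code:

```python
from semantic_kernel.agents import ChatCompletionAgent
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
from semantic_kernel.connectors.mcp import MCPStdioPlugin

# Consume the agent-backed MCP server from the previous example as a plugin
async with MCPStdioPlugin(
    name="ReleaseNotesAgent",
    command="uv",
    args=["--directory=python/samples/demos/mcp_server", "run", "agent_mcp_server.py"],
) as agent_plugin:
    orchestrator = ChatCompletionAgent(
        service=OpenAIChatCompletion(),
        name="Orchestrator",
        instructions="Delegate release notes generation to the ReleaseNotesAgent tool.",
        plugins=[agent_plugin],
    )
    ...
```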
Try it out and share your feedback!
We invite you to try out the new MCP features in Semantic Kernel:
- Connect to local or remote MCP servers
- Expose your own tools and prompts as MCP servers
- Use MCP plugins in your agents
- Chain agents and tools across process and network boundaries
Check out the samples and demos folders for more scenarios and code.
Let us know what you build and what you think! Open an issue or discussion on GitHub — we can’t wait to see what you create!