Semantic Kernel for .NET — End‑to‑End Guide

Dung Nguyen

1. Overview & Architecture

Semantic Kernel (SK) is Microsoft’s open-source AI orchestration framework. It lets you:

Call LLMs (Azure OpenAI/OpenAI/local) with prompt functions
Build plugins/tools (C# methods) callable by the LLM
Add memory (vector search) and RAG
Orchestrate multi‑step agent workflows (planning, tool‑calling)
Integrate with enterprise services (SharePoint, DB, ERP, etc.)

High‑level Architecture (ASCII)

2. Prerequisites

.NET 8 SDK (recommended)
Visual Studio 2022 or VS Code (C# Dev Kit)
Azure OpenAI (preferred for enterprise) or OpenAI API key

3. Project Setup

Create a console app:

dotnet new console -n SkDotnetDemo
cd SkDotnetDemo

Add NuGet packages (versions evolve; use latest stable):

dotnet add package Microsoft.SemanticKernel
dotnet add package Microsoft.SemanticKernel.Connectors.OpenAI
dotnet add package Azure.AI.OpenAI   # for advanced Azure OpenAI scenarios (optional)
dotnet add package Microsoft.Extensions.Configuration
dotnet add package Microsoft.Extensions.Configuration.Json
dotnet add package Microsoft.Extensions.Hosting
dotnet add package Microsoft.Extensions.Logging.Console

appsettings.json (add to project root):

{
  "AI": {
    "UseAzureOpenAI": true,
    "AzureOpenAI": {
      "Endpoint": "https://YOUR-RESOURCE.openai.azure.com/",
      "ApiKey": "YOUR-AZURE-OPENAI-KEY",
      "Deployment": "gpt-4o-mini"
    },
    "OpenAI": {
      "ApiKey": "YOUR-OPENAI-KEY",
      "Model": "gpt-4o-mini"
    }
  },
  "VectorMemory": {
    "Provider": "None", // or "AzureAISearch", "Pinecone", etc.
    "AzureAISearch": {
      "Endpoint": "https://YOUR-SEARCH.search.windows.net",
      "ApiKey": "YOUR-SEARCH-ADMIN-KEY",
      "Index": "sk-mem-index"
    }
  }
}

4. Quick Start (Hello, Kernel)

Program.cs

using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var host = Host.CreateDefaultBuilder(args)
    .ConfigureAppConfiguration(cfg =>
    {
        cfg.AddJsonFile("appsettings.json", optional: false, reloadOnChange: true);
    })
    .ConfigureLogging(lb => lb.AddConsole())
    .Build();

var config = host.Services.GetRequiredService<IConfiguration>();
var useAzure = config.GetValue<bool>("AI:UseAzureOpenAI");

var builder = Kernel.CreateBuilder();

if (useAzure)
{
    builder.AddAzureOpenAIChatCompletion(
        deploymentName: config["AI:AzureOpenAI:Deployment"]!,
        endpoint: config["AI:AzureOpenAI:Endpoint"]!,
        apiKey: config["AI:AzureOpenAI:ApiKey"]!
    );
}
else
{
    builder.AddOpenAIChatCompletion(
        modelId: config["AI:OpenAI:Model"]!,
        apiKey: config["AI:OpenAI:ApiKey"]!
    );
}

var kernel = builder.Build();

// Create a prompt function inline
var summarizeFn = kernel.CreateFunctionFromPrompt("""
You are a concise assistant. Summarize the following in 1-2 sentences:
---
{{$input}}
""");

var text = """
Semantic Kernel helps orchestrate LLMs, plugins (tools), memory, and plans
to build production AI apps in .NET.
""";

var result = await kernel.InvokeAsync(summarizeFn, new()
{
    ["input"] = text
});

Console.WriteLine("Summary:");
Console.WriteLine(result);

Run:

dotnet run

5. Prompt Functions & Streaming

Prompt with Parameters & System Role

var analyzer = kernel.CreateFunctionFromPrompt("""
SYSTEM:
You are a precise technical reviewer. Respond in markdown, with bullets.
USER:
Analyze this code for clarity and correctness. Highlight risks and improvements.
CODE:
{{$code}}
""");

var analysis = await kernel.InvokeAsync(analyzer, new() {
    ["code"] = "public int Add(int a, int b) => a + b;"
});
Console.WriteLine(analysis);
``

Token Streaming (real‑time output)

using Microsoft.SemanticKernel.ChatCompletion;

var chat = kernel.GetRequiredService<IChatCompletionService>();

var stream = chat.GetStreamingChatMessageContentsAsync(
    new ChatHistory("You are a helpful assistant."),
    new ChatRequestSettings
    {
        Temperature = 0.4,
        MaxTokens = 500
    });

await foreach (var chunk in stream)
{
    Console.Write(chunk.Content);
}

6. Building Plugins/Tools (C# Methods callable by LLM)

Create a plugin class:

using Microsoft.SemanticKernel;

public sealed class TimePlugin
{
    [KernelFunction, Description("Get current time in ISO 8601")]
    public string Now() => DateTimeOffset.UtcNow.ToString("O");

    [KernelFunction, Description("Add minutes to current time")]
    public string AddMinutes([Description("Minutes to add")] int minutes)
        => DateTimeOffset.UtcNow.AddMinutes(minutes).ToString("O");
}
``

var time = new TimePlugin();
kernel.ImportPluginFromObject(time, "time");

var plannerPrompt = kernel.CreateFunctionFromPrompt("""
Given the user's need, call tools if helpful.
User asks: "What time is it now and in 90 minutes?"
""");

var output = await kernel.InvokeAsync(plannerPrompt);
Console.WriteLine(output);

The LLM can reference tools by name when guided via system prompts or planner capabilities. In more advanced scenarios, use function/tool calling interfaces for structured calls (see §8).

7. Memory & RAG (Retrieval Augmented Generation)

7.1 Embeddings + Vector Store

To implement RAG:

Split content into chunks
Embed each chunk
Store embeddings + metadata in a vector DB (Azure AI Search, Pinecone, Qdrant, Redis, etc.)
At query time: embed the question → similarity search → feed top chunks to LLM as grounding context.

Example: Simple Retrieval Flow (Pseudo/RAG-lite)

var embedService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();

// 1) Prepare document chunks
var docs = new[]
{
    ("doc1#p1", "Semantic Kernel enables plugins and planning."),
    ("doc1#p2", "RAG combines retrieval with LLM generation for grounded answers.")
};

// 2) Create embeddings and store (in-memory or your vector DB)
var store = new List<(string Id, float[] Embedding, string Text)>();

foreach (var (id, text) in docs)
{
    var emb = await embedService.GenerateEmbeddingAsync(text);
    store.Add((id, emb.ToArray(), text));
}

// 3) Query
var question = "How does SK support building agents with tools and RAG?";
var qEmb = await embedService.GenerateEmbeddingAsync(question);

// 4) Simple cosine similarity (example only)
float CosSim(float[] a, float[] b) {
    var dot = 0f; var na = 0f; var nb = 0f;
    for (int i = 0; i < a.Length; i++){ dot += a[i]*b[i]; na += a[i]*a[i]; nb += b[i]*b[i]; }
    return dot / (float)(Math.Sqrt(na)*Math.Sqrt(nb));
}

var top = store
    .Select(s => new { s.Id, s.Text, Score = CosSim(qEmb.ToArray(), s.Embedding) })
    .OrderByDescending(x => x.Score)
    .Take(3)
    .ToList();

// 5) Grounded prompt
var ragFn = kernel.CreateFunctionFromPrompt("""
Answer the question using ONLY this context. If unsure, say "I don't know".
Context:
{{#each ctx}}- {{this}}
{{/each}}

Question: {{$q}}
""");

var args = new KernelArguments { ["q"] = question };
args["ctx"] = string.Join("\n", top.Select(t => t.Text));

var ragAnswer = await kernel.InvokeAsync(ragFn, args);
Console.WriteLine(ragAnswer);

7.2 Azure AI Search Connector (Typical)

Create an index with vector fields (contentVector) and metadata
Embed documents (offline ingestion)
Query vectors at runtime and pass results as context to LLM
Use chunking and a context window budget (e.g., 6–12 short chunks)

Recommendations

Chunk size ~500–1,200 tokens with overlap (50–150)
Maintain source links and page numbers
Apply citations in the final answer
Cache frequent queries (app layer)

8. Agentic Patterns: Planner & Tool‑Calling

8.1 Planner (High‑level)

Planner lets the LLM break tasks into steps and choose tools.

// Conceptual: Use a planning prompt or SK planner extension
var planPrompt = kernel.CreateFunctionFromPrompt("""
You are a planner. Decide which tools to use and in what order to fulfill the goal.
List steps, then execute them.
Goal: "Summarize the latest policy PDF and add a meeting for tomorrow 9am."
Tools available: time, calendar.AddEvent, files.GetLatestPolicy
Follow: Think → Plan → Act → Observe → Answer.
""");

var planResult = await kernel.InvokeAsync(planPrompt);
Console.WriteLine(planResult);

8.2 Function/Tool Calling (Structured)

Many model providers support “function calling” where the LLM emits a structured call (name + JSON args). SK surfaces this through chat completion services. Pattern:

Register functions as callable tools with JSON schemas
Run chat with tool choice enabled
When the model requests a tool call, execute C# method → append result → continue the chat

This yields deterministic integration and safer execution.

9. Quality: Prompting, Grounding, Evaluation & Tests

System prompts: define role, style, constraints
Grounding: require the model to cite retrieved context; instruct it to say “I don’t know” if not grounded
Parameters: tune temperature, max_tokens per use case (creative vs. deterministic)
Hallucination controls:
- Constrain to context
- Use tool results and schema validation
- Add content filters for sensitive domains
Unit tests:
- Mock chat and embedding services (DI in SK)
- Snapshot test prompts and outputs for regressions
Offline eval:
- Golden sets: questions → expected nuggets
- Metrics: groundedness, faithfulness, toxicity, latency, cost
Canary
- Shadow traffic before full rollout

10. Observability & Performance

Logging: Use Microsoft.Extensions.Logging. Log:
- Request IDs, token usage, duration, cache hits
- Tool calls (name + args size)
Metrics: latency per turn, cost per request, hit rate on retrieval
Tracing: adopt OpenTelemetry where possible; correlate user request → LLM calls → tool invocations
Performance:
- Prefer streaming for better UX
- Use smaller/cheaper models for simple steps, escalate only when needed
- Cache embedding & retrieval results
- Batch embedding jobs during ingestion

Solutions

Industry

Our thinking

Semantic Kernel for .NET — End‑to‑End Guide

Dung Nguyen

Table of Contents

1. Overview & Architecture

2. Prerequisites

3. Project Setup

4. Quick Start (Hello, Kernel)

5. Prompt Functions & Streaming

6. Building Plugins/Tools (C# Methods callable by LLM)

7. Memory & RAG (Retrieval Augmented Generation)

8. Agentic Patterns: Planner & Tool‑Calling

9. Quality: Prompting, Grounding, Evaluation & Tests

10. Observability & Performance

Dung Nguyen

Leave a Comment Cancel Reply

Suggested Article

NashTech

Solutions

Useful links

Connect with us

Our achievements