April 25th, 2025

Guest Blog: SemantiClip: A Practical Guide to Building Your Own AI Agent with Semantic Kernel

Today we’re excited to welcome Vic Perdana as a guest author on the Semantic Kernel blog to cover his work on SemantiClip: A Practical Guide to Building Your Own AI Agent with Semantic Kernel. We’ll turn it over to Vic to dive in further.

SemantiClip Overview image

Everywhere you look lately, the buzz is about AI agents. But cutting through the noise—what does agentic AI really mean for developers and builders? How can we move from hype to building real, practical solutions that solve business problems, automate workflows, and, simply put, make our lives easier?

I’m excited to share my journey and, as a little Easter egg, I’ve released a working demo: SemantiClip, a personal AI-powered agent that takes a video and transforms it into a well-structured blog post. It leverages the powerful open-source Semantic Kernel framework, alongside state-of-the-art local and cloud AI models via Azure AI Foundry.

Why Do AI Agents Matter for Devs & Businesses?

Before diving into the how, let’s set the context—why should you care about agentic AI?

  • Productivity Unlocked: AI agents can automate complex, multi-step business processes, handling tasks from researching the web to orchestrating entire workflows.
  • Next-Gen SaaS: Y Combinator predicts the next billion-dollar startups will likely be those who apply vertical AI agents to deeply understand and automate industry-specific, “boring” workflows.
  • Transforming CX: Gartner predicts that by 2029, agents will handle 80% of common customer-service tasks without human intervention.
  • Evolving Collaboration: We’re not just talking about chatbots answering questions (no more simple ABC = Anything but Chatbot), but true digital companions and co-pilots that reason, remember, and execute.

Agentic AI can:

  • Understand and reason over processes
  • Adapt with memory and contextual learning
  • Execute dynamic tasks with minimal oversight (should you let it)

💡But with great power comes… great complexity.

The Challenges: Why Building Agents Isn’t a Walk in the Park

As you scale, managing hundreds of agents, keeping control over processes, and ensuring reliability and predictability quickly becomes a challenge:

  • How do you coordinate several agents that have different tools?
  • How do you maintain state across long business workflows (e.g., employee onboarding)?
  • What if you need to blend local, edge-hosted models with enterprise-grade cloud models?

This is where frameworks like Semantic Kernel (SK) come to the rescue.

Semantic Kernel’s Process & Agent Framework


Process Framework: This lets you explicitly define business processes as orchestrated sequences of steps (e.g., “Prepare Video” → “Transcribe Audio” → “Generate Blog Post” → “Evaluate Output”).

Agent Framework: This enables building robust “agents”—modular, intelligent actors that can carry out tasks using different patterns, tools, or models (local or cloud-based).

Key Benefits:

  • Modular, maintainable code
  • Integration with both cloud (Azure OpenAI, GPT-4) and local (such as phi4-mini via Ollama) AI models
  • Control and flexibility—with processes, you get higher predictability; with agents, you get intelligence

Introducing SemantiClip: Personal Agent for Automated Blog Creation

SemantiClip app image

To ground this in something real, I created SemantiClip—a simple app powered with agents that turns a recorded video into a blog post. Built with .NET 9, Semantic Kernel, and Blazor WebAssembly.

SemantiClipUsage image

Let’s examine this in a bit more detail. The process goes like this:

  1. Upload a Video via the SemantiClip UI
  2. Process Video
  3. Draft Blog Post (Local Model)
    • Use a small, fast LLM (like phi4-mini) for the first pass
  4. Polish Draft (AI Agent Service with LLM)
    • Refine with a more advanced model (e.g., GPT-4o via Azure AI Agent Service)
  5. Review & Publish
    • Output a well-structured markdown blog post, ready for fine-tuning and sharing
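The flow above can be pictured as a chain of stages in which each stage consumes the previous stage's output. The following is a minimal sketch in plain C#, not SemantiClip's actual code and not the Semantic Kernel API: the stage functions (`PrepareVideo`, `Transcribe`, `DraftPost`, `PolishPost`) are hypothetical stubs standing in for real audio extraction, transcription, and model calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stage stubs: each stands in for real work
// (audio extraction, speech-to-text, a local model call, a cloud model call).
string PrepareVideo(string videoPath) => videoPath + ".wav";
string Transcribe(string audioPath) => $"transcript of {audioPath}";
string DraftPost(string transcript) => $"# Draft\n\n{transcript}";
string PolishPost(string draft) => draft.Replace("# Draft", "# Polished");

// The ordered stages; each stage's output becomes the next stage's input.
var stages = new List<Func<string, string>> { PrepareVideo, Transcribe, DraftPost, PolishPost };

// Fold the input through every stage in order.
string RunPipeline(string input) =>
    stages.Aggregate(input, (current, stage) => stage(current));

Console.WriteLine(RunPipeline("demo.mp4"));
```

Keeping every stage behind the same `string -> string` shape is what makes it easy to swap the drafting model without touching the rest of the chain.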

Each step (prepare, transcribe, draft, review) is handled by a specific agent, powered by Semantic Kernel (SK)’s Process and Agent frameworks. For example:

Process Builder setup

    ProcessBuilder processBuilder = new("VideoProcessingWorkflow");

    // Add the processing steps
    var prepareVideoStep = processBuilder.AddStepFromType<PrepareVideoStep>();
    var transcribeVideoStep = processBuilder.AddStepFromType<TranscribeVideoStep>();
    var generateBlogPostStep = processBuilder.AddStepFromType<GenerateBlogPostStep>();
    var evaluateBlogPostStep = processBuilder.AddStepFromType<EvaluateBlogPostStep>();

We then orchestrate the steps as desired, passing state as input/output between them. The process is set up as a chain of sequential steps: it starts when triggered, then prepares the video, transcribes it, generates a blog post, and finally evaluates the result. Each step passes its output as the input to the next step, which is how state is maintained throughout the process.

    // Orchestrate the workflow
    processBuilder
        .OnInputEvent("Start")
        .SendEventTo(new(prepareVideoStep,
            functionName: PrepareVideoStep.Functions.PrepareVideo,
            parameterName: "request"));

    prepareVideoStep
        .OnFunctionResult()
        .SendEventTo(new ProcessFunctionTargetBuilder(transcribeVideoStep,
            functionName: TranscribeVideoStep.Functions.TranscribeVideo,
            parameterName: "videoPath"));

    transcribeVideoStep
        .OnFunctionResult()
        .SendEventTo(new ProcessFunctionTargetBuilder(generateBlogPostStep,
            functionName: GenerateBlogPostStep.Functions.GenerateBlogPost,
            parameterName: "transcript"));

    generateBlogPostStep
        .OnFunctionResult()
        .SendEventTo(new ProcessFunctionTargetBuilder(evaluateBlogPostStep,
            functionName: EvaluateBlogPostStep.Functions.EvaluateBlogPost,
            parameterName: "blogstate"));
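To build intuition for the wiring above, here is a rough, dependency-free sketch of event-driven routing: an event name selects a handler, and the handler's result is raised as the next event. This is an illustration only, not how Semantic Kernel implements processes; the event names other than "Start" and the `Run` helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Maps an event name to a handler. A handler consumes the event payload and
// returns the next event to raise, or null when the workflow is finished.
var routes = new Dictionary<string, Func<string, (string Event, string Data)?>>
{
    ["Start"] = request => ("VideoPrepared", $"audio({request})"),
    ["VideoPrepared"] = audio => ("Transcribed", $"transcript({audio})"),
    ["Transcribed"] = transcript => ("BlogPostGenerated", $"post({transcript})"),
    ["BlogPostGenerated"] = post => null, // evaluation would happen here
};

// Dispatches events until a handler returns null or no route matches.
// Returns the ordered list of handled events and the last payload seen.
(List<string> Trace, string Result) Run(string startEvent, string payload)
{
    var trace = new List<string>();
    (string Event, string Data)? next = (startEvent, payload);
    string result = payload;

    while (next.HasValue && routes.TryGetValue(next.Value.Event, out var handler))
    {
        var (name, data) = next.Value;
        trace.Add(name);
        result = data;
        next = handler(data);
    }

    return (trace, result);
}

var (trace, result) = Run("Start", "demo.mp4");
Console.WriteLine(string.Join(" -> ", trace));
Console.WriteLine(result);
```

Because the routing table is data rather than control flow, inserting or reordering a step means changing one entry instead of rewriting the callers.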

Each step can use an agent. For example, GenerateBlogPostStep uses one, and with the Semantic Kernel Agent Framework you can “plug & play”, toggling per task between a local Small Language Model (SLM) and a state-of-the-art cloud-hosted Large Language Model (LLM) such as gpt-4o on Azure AI Foundry.

    private ChatCompletionAgent UseTemplateForChatCompletionAgent(
        Kernel kernel, string transcript)
    {
        string generateBlogPostYaml = EmbeddedResource.Read("GenerateBlogPost.yaml");
        PromptTemplateConfig templateConfig = KernelFunctionYaml.ToPromptTemplateConfig(generateBlogPostYaml);
        KernelPromptTemplateFactory templateFactory = new();

        ChatCompletionAgent agent = new(templateConfig, templateFactory)
        {
            Kernel = kernel,
            Arguments = new()
            {
                { "transcript", transcript }
            }
        };

        return agent;
    }

    [KernelFunction(Functions.GenerateBlogPost)]
    public async Task<BlogPostProcessingResponse> GenerateBlogPostAsync(string transcript, Kernel kernel, KernelProcessStepContext context)
    {
        _logger.LogInformation("Starting blog post generation process");

        var agent = UseTemplateForChatCompletionAgent(
            kernel: kernel,
            transcript: transcript);

        // Create the chat history thread
        var thread = new ChatHistoryAgentThread();

        // Invoke the agent with the transcript
        var result = await InvokeAgentAsync(agent, thread, transcript);

        // Update the state with the generated blog post
        VideoProcessingResponse videoProcessingResponse = new VideoProcessingResponse
        {
            Transcript = transcript,
            BlogPost = result
        };

        _state!.BlogPosts.Add(result);
        _state!.VideoProcessingResponse = videoProcessingResponse;

        return _state;
    }

Equipping the agent with a prompt template or a plugin can “supercharge” it with tools and knowledge, or simply supply an instruction or system prompt.

    name: GenerateBlogPost
    template: |
      Generate a blog post from the following video transcript formatted in markdown, make sure the markdown is valid and well-structured:
      {{$transcript}}
    template_format: semantic-kernel
    description: A function that generates a blog post from a video transcript.
    input_variables:
      - name: transcript
        description: The video transcript to generate a blog post from.
        is_required: true
    output_variable:
      description: The generated blog post.
    execution_settings:
      default:
        function_choice_behavior:
          type: auto
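Conceptually, the `{{$transcript}}` placeholder in the template above is replaced with the argument value before the prompt reaches the model. The sketch below mimics that substitution with a plain string replacement. It is a simplified stand-in, not Semantic Kernel's template engine, and the `Render` helper is hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Replaces every {{$name}} placeholder with the matching argument value.
// Placeholders with no matching argument are left untouched.
string Render(string template, IReadOnlyDictionary<string, string> arguments)
{
    string rendered = template;
    foreach (var (name, value) in arguments)
    {
        rendered = rendered.Replace("{{$" + name + "}}", value);
    }
    return rendered;
}

string prompt = Render(
    "Generate a blog post from the following video transcript:\n{{$transcript}}",
    new Dictionary<string, string> { ["transcript"] = "Welcome to the session." });

Console.WriteLine(prompt);
```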

SLM, LLM and AI Agent Service: Finding the balance

I experimented with both the local phi4-mini and the gpt-4o models. Specifically, in the EvaluateBlogPostStep I used the Azure AI Agent Service to light up the process. Let’s go through it in a bit more detail.

With the Semantic Kernel (SK) Agent Framework you can use different types of agents: one that originated in SK, ChatCompletionAgent, or other supported agents such as AzureAIAgent. Each comes with the ability to use different models, as documented in the Semantic Kernel and Azure AI Agent Service pages.

Small language models (like phi4-mini via Ollama) offer privacy, speed, and edge processing, whilst large language models (like GPT-4o on Azure) deliver deeper reasoning, richer output, and easier scaling.
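One way to think about this trade-off is as a per-step model assignment with a local fallback. The sketch below is a plain C# illustration under stated assumptions, not SemantiClip's configuration: the `modelByStep` table, the `offline` flag, and the `SelectModel` helper are all hypothetical, and only the model names and step names come from this post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-step assignment: draft locally, evaluate in the cloud.
var modelByStep = new Dictionary<string, (string Model, bool IsLocal)>
{
    ["GenerateBlogPost"] = ("phi4-mini", true),
    ["EvaluateBlogPost"] = ("gpt-4o", false),
};

// Used for unknown steps and whenever a cloud model is unavailable.
(string Model, bool IsLocal) localFallback = ("phi4-mini", true);

// Picks the configured model for a step, falling back to the local model
// when the step is unknown or when running offline with a cloud assignment.
(string Model, bool IsLocal) SelectModel(string step, bool offline)
{
    if (!modelByStep.TryGetValue(step, out var choice))
    {
        return localFallback;
    }
    return offline && !choice.IsLocal ? localFallback : choice;
}

Console.WriteLine(SelectModel("EvaluateBlogPost", offline: false).Model);
Console.WriteLine(SelectModel("EvaluateBlogPost", offline: true).Model);
```

Centralizing the choice in one lookup keeps the privacy and cost decision out of the individual steps.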

For posterity, here are both outputs so you can judge for yourself how they perform!

phi4-mini

# Prioritizing Security: A Peek into Microsoft's Commitment

Thank you for joining us as our CEO Satya Nadella introduces today's session on cybersecurity initiatives spearheaded by none other than **Microsoft**. In an unprecedented move, Microsoft has put security at its very core—making it the top priority above all else. This commitment is so significant that we have mobilized approximately 34,000 engineers to work exclusively in this area; effectively making us responsible for what stands as one of history's largest cybersecurity projects ever undertaken within digital technology.

## Three Pillars Guiding Microsoft's Security Efforts

The heart and soul behind these massive efforts are three guiding principles:

1. **Secure by Design**: This principle mandates that security should be a primary consideration during the planning phase, shaping any product or service to incorporate robust defenses from its inception.
2. **Secure by Default**: Reflecting an unwavering commitment towards protection without burden on users; Microsoft has ensured all our systems and services are equipped with strong protective measures enabled right out of the box—no extra effort required.
3. **Secure Operations**: Recognizing that threats evolve, Microsoft's security operations continually advance to address current as well as emerging risks through enhanced control mechanisms and continuous monitoring processes.

## A Sneak Peek into Microsoft’s Security Management

Today we are excited to give you a glimpse of how these principles translate in our day-to-day management at **Microsoft**. We will delve deeper:

- How exactly the Secure Future initiative is managed within organizations
- Practical ways for your own organization or team can adopt and implement similar security measures.

We believe that by sharing Microsoft's strategic insights, we hope to empower businesses worldwide with tools necessary not just to protect but also proactively respond in our ever-evolving digital landscape. Stay tuned as **Microsoft** continues its mission of building a more secure future together!

---

*Thank you for joining us on this important journey towards prioritizing security.*

Stay connected and follow Microsoft for ongoing insights, updates, and resources focused around the Secure Future initiative.

#Security #Cybersecurity #DigitalTechnology

***Note: The above post is formatted in Markdown. You can use various tools like Jupyter Notebook or online editors to preview how it renders with different styling options.*

gpt-4o

## Prioritizing Security: Exploring Microsoft's Unwavering Commitment

Welcome! Today, as our CEO Satya Nadella introduces a pivotal session on cybersecurity, we invite you to dive into **Microsoft's** bold initiatives to safeguard the digital world. In a remarkable display of leadership, Microsoft has made security its paramount priority, embedding it into every facet of its operations. This dedication is exemplified by the mobilization of approximately 34,000 engineers solely focused on advancing cybersecurity. This initiative represents one of the largest and most comprehensive cybersecurity undertakings in the history of digital technology.

---

## The Three Core Pillars of Microsoft's Security Strategy

At the foundation of Microsoft's approach to cybersecurity are three guiding principles. These pillars drive our mission to create an ecosystem where trust and security serve as bedrocks for innovation.

### 1. **Secure by Design**

Microsoft’s journey starts from the ground up. **Secure by Design** means embedding security from the very beginning of a product or service’s development life cycle. By prioritizing robust defenses during the planning phase, we create systems and software that are resilient against modern threats, ensuring data is protected at its core.

### 2. **Secure by Default**

Ease of use and strong protection shouldn’t be mutually exclusive. Through **Secure by Default**, Microsoft eliminates the need for user intervention by providing security-enabled systems straight "out of the box." Every service and product incorporates strong, built-in security measures, relieving users of the burden of configuration while enhancing protection seamlessly.

### 3. **Secure Operations**

Cyber threats are ever-evolving. To stay ahead, Microsoft prioritizes **Secure Operations**, leveraging state-of-the-art monitoring, adaptive security controls, and a proactive response framework. This ensures that our systems not only address current threats but also adapt and evolve to counter emerging challenges in the cybersecurity landscape.

---

## Behind the Scenes: How Microsoft Puts These Principles to Work

You might be wondering—how does Microsoft bring these principles to life every day? Here’s a glimpse into our security management practices and how businesses can learn from our approach:

- **The Secure Future Initiative**: We’ll explore how this program functions within Microsoft's ecosystem, driving consistent alignment with security priorities.
- **Practical Implementation**: Discover actionable strategies to introduce these security measures into your own organization. From small teams to global enterprises, the principles of "Secure by Design," "Secure by Default," and "Secure Operations" can serve as a blueprint for creating a safer digital environment.

By sharing our journey, Microsoft aims to empower businesses and organizations worldwide with the tools and strategies needed not just to safeguard data but also to actively respond to evolving digital risks. Together, we can create a more secure, resilient digital landscape.

---

## Join Us in Building a Safer Future

Microsoft is unwavering in its mission to protect users and organizations across the globe, and we invite you to be part of this mission. By taking proactive steps and embracing security at all levels, we can collectively strengthen our defense against online threats.

*Thank you for joining us on this journey to prioritize security and build trust in the digital age.*

Stay connected with Microsoft for more insights, updates, and resources as we continue to advance our **Secure Future Initiative**.

---

### Keywords:

#Microsoft #Cybersecurity #DigitalSafety #SecureByDesign #SecureFuture #DigitalTechnology

*Note: This content is written in Markdown. You can utilize tools such as Jupyter Notebook or Markdown editors to preview this post with varying styles.*

Key Takeaways

Process & Agent frameworks bring much-needed rigor and control to the AI agent world:

  • Tame “agent sprawl” by orchestrating steps, maintaining state, and enforcing predictability
  • Unlock productivity by automating multi-step, previously manual processes
  • Combine the flexibility of LLM “intelligence” with the reliability of defined business workflows

💡Start small, pick a piece of your workflow (no matter how boring!), and build a prototype agent. You’ll be surprised at how much it can improve your productivity!

Where to go from here

Go check out SemantiClip, and connect with and follow me on LinkedIn and GitHub. If you missed the Microsoft Reactor session, go ahead and check out the recording.

Finally, don’t forget to join the APAC Semantic Kernel Office Hours to discuss anything agentic.

Go and “agentify” your process! 🤩
