MLOps.community | Lyssna podcast online gratis

533 avsnitt

The Dark Side of MCP Servers
2026-06-23 | 1 h 9 min.
Sam Partee (CTO & co-founder of Arcade.dev) and Nate Barbettini (Founding Engineer at Arcade.dev) sit down at the MCP Dev Summit to unpack what nobody wants to admit about the Model Context Protocol: the security model is still full of sharp edges. From tool poisoning and prompt injection to why OAuth got bolted onto the spec, this is a builder 's-eye view of where MCP breaks — and how to ship agents safely anyway.
What we get into:🔓 OAuth on MCP — Why the spec adopted OAuth as its authorization standard, and the class of spoofing attacks it shuts down.☠️ Tool poisoning — How a malicious server hides instructions in tool descriptions, and why your agent trusts them by default.🧪 MCP Debugger & ToolBench — Shining a light on the rough edges by grading servers from S-tier to F-tier.🖥️ Sandboxing agents — Giving an agent a shell and a file system without handing over the keys to your machine.📜 Allow lists — Why MCP has client-level allow lists but skills mostly don't — and why that worries them.🔄 The auto-update problem — How skills and servers that silently update become a supply-chain risk ("rug pulls").✅ SOC 2, honestly — Why the controls are voluntary, misunderstood, and actually about best practices.🤖 AI-generated PRs — The new behaviors to watch for as agents start writing and merging code.
If you build agents, ship MCP servers, or are responsible for AI security at your company, this one's for you.
🔗 Links & ResourcesArcade.dev: https://www.arcade.devArcade MCP framework (GitHub): https://github.com/ArcadeAI/arcade-mcpSam Partee (GitHub): https://github.com/sparteeNate Barbettini (LinkedIn): https://www.linkedin.com/in/nbarbettiniMLOps.community: https://mlops.community
⏱️ Timestamps[00:00] Skills, agents, and local context
[08:36] MCP Debugger grades your server
[10:34] Why AI clients are still buggy
[20:54] Why agents shouldn’t always have shell access
[22:44] “I have a spicy take.”
[26:27] “Do not build your own auth.”
[31:14] The “checking someone else’s email” problem
[35:40] “OAuth is the best worst option.”
[43:50] The future of AI entertainment
[46:19] Tool poisoning explained
[50:49] “Trust me, bro,” is not a security solution
[52:45] MCP registries as the App Store model
[1:00:28] AI-generated PRs and speed vs quality
[1:02:37] Why behavior-driven development is coming back
[1:08:11] Have we already reached AGI?

#MCP #AIAgentSecurity #ToolPoisoning
Sandboxing, Agent Harnesses, and Agent Teamwork
2026-06-19 | 1 h 19 min.
Shahram Anver is the Co-Founder and CEO of Cleric, the autonomous AI SRE that investigates and root-causes production issues like an experienced teammate — often in under two minutes. Before Cleric, Shahram led MLOps, DevOps, and FinOps platform engineering at Gojek, Southeast Asia's super-app. In this conversation, he breaks down why production operations never kept pace with AI-accelerated development, and why the real unlock for an AI SRE isn't faster triage — it's an agent that *learns* and compounds operational memory across your whole org.

In this episode:
🔧 The on-call problem — Why one broken service still drags ten engineers onto a call, and how AI changes that
🤖 What an AI SRE actually is — How Cleric investigates across your existing observability stack instead of adding another tool
🧠 Learning over MTTR — Why Shahram argues the value isn't alert triage, it's an agent that gets better every investigation
🪜 Ramping like a new engineer — Explore the environment, learn from the work, talk to the team
🔁 The investigate–measure–learn loop — Turning what worked on one incident into context for the next
🕸️ Knowledge graphs & operational memory — Mapping teams, clusters, and dependencies so insight from one team helps another
⚡ Under two minutes to root cause — What "fast" really requires in a live production environment
🚀 The road to autonomy — From assisted investigation toward self-healing infrastructure
If you're an SRE, platform engineer, DevOps lead, or anyone building or buying AI agents for production, this one's for you.

🔗 Links & Resources
Cleric: https://cleric.ai
Shahram on LinkedIn: https://www.linkedin.com/in/shahramanver/
Willem Pienaar (Co-Founder/CTO): https://www.linkedin.com/in/willempienaar/
Cleric launches the first self-learning AI SRE: https://cleric.ai/blog/cleric-launches-the-first-self-learning-ai-sre
MLOps Community: https://mlops.community
Join the community: https://go.mlops.community/slack

⏱️ Timestamps
[00:00] Tech Jargon Confusion
[00:27] Harness vs Model
[08:48] Model Evolution in Cleric
[13:36] Sandboxing and Simulated Environments
[20:40] Shifting AI Perceptions
[24:10] Managing Humans vs Agents
[31:32] Steering Parallel Agents
[34:16] Human Decision Integration in Models
[43:28] 80/20 Data Split
[49:40] Becoming a Skill
[53:35] 2027 Agent Autonomy
[59:14] Agent Learning in Production
[1:04:31] Software as Personal Capabilities
[1:08:31] Vibe Coding vs Durability
[1:18:23] Wrap up

#AISRE #SiteReliabilityEngineering #AIAgents
Zipline Roundtable episode: Building Real-Time ML Systems with Zipline + Chronon
2026-06-17 | 51 min.
Zipline Roundtable episode: Building Real-Time ML Systems with Zipline + ChrononJoin the Community: https://go.mlops.community/YTJoinInGet the newsletter: https://go.mlops.community/YTNewsletterMLOps GPU Guide: https://go.mlops.community/gpuguideBig shout-out to ZiplineAI for the collaboration!// AbstractReal-time ML use cases like personalization and risk decisioning come with a unique set of challenges: serving fresh feature values at low latency for inference, generating temporally consistent backfills for training, and building complex chains of on-demand, batch, and streaming transformations. In this roundtable, practitioners from Intuit, CreditKarma, Depop, and OpenAI share how they use Zipline and the OSS Chronon project to solve these challenges and deploy real-time ML use cases in production.// BioGerman KrikorianGerman is a Software Engineer on the Feature Platform team at Credit Karma. Since joining the company during the early development of its recommendation system, they have played a key role in building and scaling the platform over the years. Their work focuses on feature pipelines and the feature store, which serves as critical infrastructure supporting numerous teams and business verticals across the organization.Ben MagyarBen is an engineer at Depop working on ML and data systems. Before Depop, he worked on Search at Etsy. Most of his work is around the infrastructure and operational problems that come with running ML systems at scale.Raj KatakamRaj architects ML Infrastructure at Credit Karma (Intuit). He holds a Master's in Software Engineering from Carnegie Mellon and a B.Tech in EECE from IIT Kharagpur. His interests include ML Infrastructure, Distributed Systems, Real-Time Data Processing, and Generative AI. His current focus is on providing feature engineering platforms, production GenAI infrastructure, vector databases, ML model serving, and MLOps pipelines for fraud detection, personalized recommendations, financial insights, and model explainability.Mick JermsurawongLed Flyte ML training/experimentation at Stripe, and now led Chronon for ML features at OpenAIHosted by Demetrios// Related LinksWebsite: https://zipline.ai/https://chronon.ai/~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExploreJoin our Slack community [https://go.mlops.community/slack]Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)] Sign up for the next meetup: [https://go.mlops.community/register]MLOps Swag/Merch: [https://shop.mlops.community/]Connect with Demetrios on LinkedIn: /dpbrinkmConnect with German on LinkedIn: /e2zdkwh8cxghydg/Connect with Raj on LinkedIn: /rajkiran2190Connect with Mick on LinkedIn:/mick-jermsurawong/
MCP Servers Are Becoming the UI for AI Agents
2026-06-16 | 47 min.
Naseem Al-Naji is the co-founder of MCPcat.io and the creator of Opal — a builder with deep roots in privacy-first developer tooling. In this conversation, he breaks down why MCP servers have become a black box in production, and how MCPcat gives teams X-ray vision into how agents and users actually behave.

What we get into:
🐱 What MCPcat Is — Open-source analytics and live debugging built specifically for MCP servers
🎬 Session Replay — Watch an agent's full journey through your server, tool call by tool call
🎯 Agent Intent & Goals — Understand "why" a tool was called, not just that it was
🔍 Trace Debugging — Find exactly where agents and users get stuck or confused
🚨 Catching Hallucinations — How issue tracking surfaces when an LLM goes off the rails
🔒 Privacy-First by Design — Client-side redaction so sensitive data never leaves your environment
⚡ One-Line Integration — Python, TypeScript, and Go SDKs that drop into existing stacks
📊 Works With Your Stack — Native support for OpenTelemetry, Datadog, and Sentry
🚀 The Future of MCP — Where agent observability and the MCP ecosystem are heading

If you build, ship, or maintain MCP servers — or you're trying to figure out why your AI agents misbehave in production — this one's for you.

🔔 Subscribe, like, and share for more conversations on agentic AI:
▶️ YouTube: https://www.youtube.com/@AAIFAgenticConversations🎧 Spotify: https://open.spotify.com/show/033rZZJrQOVSSmhcStFhZA?si=rUNjFuNqRvGvAEWwqms7TA

Links & Resources:
🐱 MCPcat: https://mcpcat.io
💻 MCPcat on GitHub: https://github.com/mcpcat
👤 Naseem on LinkedIn: https://www.linkedin.com/in/naseem-al-naji
🐙 Naseem on GitHub: https://github.com/naji247

Timestamps:
[00:00] Intro
[01:41] MCP Needs Gatekeepers
[06:32] Measuring MCP Success
[13:57] MCPAT Feature Rollouts
[18:50] MCP Server Query Optimization
[26:48] UI Design Shift
[29:14] MCP Server Design Choices
[33:51] User Journey Traceability
[40:40] Agent Experience Evaluation
[45:23] AI Model Improvement Strategies

#MCP #AIAgents #Observability
Agents & the $40M Bet on Multiplayer AI
2026-06-12 | 1 h 20 min.
Stanislas Polu is Co-Founder & CTO of Dust — the enterprise AI agent platform used by 51,000 workers at 3,000+ companies. Before Dust, he spent three years on OpenAI's research team under Ilya Sutskever, working on mathematical reasoning in language models, and prior to that was an engineer at Stripe. He brings a rare combination of frontier AI research and product-building experience to the enterprise agent space.

Agents & the $40M Bet on Multiplayer AI // MLOps Podcast #384 with Stanislas Polu, Co-Founder & CTO of Dust

🤖 What is Dust? — How Dust enables teams to build and deploy AI agents powered by internal company data, and why the "multiplayer AI" model is winning in enterprise.
🧠 From OpenAI Research to Startup Founder — Stanislas's journey from studying mathematical reasoning in LLMs under Ilya Sutskever to co-founding an enterprise AI company in Paris with Gabriel Hubert.
🚀 The $40M Series B — What Dust is building with fresh funding, the bet on human-agent collaboration as the future of work, and what "multiplayer AI" actually means in practice.
🔄 The Outer-Loop Era — Stanislas's framework for thinking about where AI agents create the most value: not just automating tasks, but rewiring how work gets done across entire organizations.
⚠️ What Most Enterprise AI Gets Wrong — The biggest mistakes companies make when deploying AI agents, why adoption fails, and how Dust achieves 70%+ weekly adoption rates.
📊 Building Reliable Agent Infrastructure — Lessons from scaling to thousands of companies: observability, governance, data security, and why enterprise AI is harder than it looks.
🛠️ Horizontal vs. Vertical AI Platforms — Why Dust chose to build a horizontal enterprise agent platform and how that decision shapes product, go-to-market, and technical architecture.

This episode is essential for AI/ML engineers, enterprise AI leads, and anyone building or deploying AI agents at scale inside organizations.

🔗 Links & Resources:
• Dust: https://dust.tt
• Stanislas Polu on X/Twitter: https://x.com/spolu
• Dust on LinkedIn: https://www.linkedin.com/company/dust-tt
• Dust $40M Series B announcement: https://dust.tt/blog
• "The Outer-Loop Era" talk by Stanislas (dotconferences): https://www.youtube.com/watch?v=_outer_loop
• Dust + Stripe MCP integration: https://stripe.com/customers/dust
• Dust + Datadog observability case study: https://datadoghq.com/case-studies/dust

⏱️ Timestamps
[00:00] Future of Work
[00:19] Dust Scaling Lessons
[04:44] Human-Agent Collaboration
[14:24] Pod as Workspace
[22:30] Work Flow Optimization
[29:37] Multiplayer Collaboration Vision
[39:55] Token Economics and Inference
[47:20] AI Pricing Challenges
[52:36] Dust vs Co-work
[57:06] Agentic Work Infrastructure
[1:04:23] Stateful Sandbox Challenges
[1:09:58] Product Use Case Discussion
[1:14:05] Agent Data Interaction Needs
[1:20:09] Wrap up

#EnterpriseAI #AIAgents #Dust