MCP: When Your AI Assistant Learns to See, Create, and Act
The Model Context Protocol turns AI assistants into agents that can act on the real world. A demonstration with an MCP server for Pruna AI, built in one session with Kiro: image generation at $0.005 per image, directly from chat.
Updated on 6 May 2026
It Started With a Bowl of Dal
Last Sunday, I was building a school lunch menu app for my kids. Nothing serious — a weekend project, the kind of thing you do between coffees. The menu showed “Indian Dal with Rice” and I wanted a nice image to go with it. Natural reflex: I ask Claude. “Generate an image of Indian dal with basmati rice, food photography style.”
Claude politely tells me it can’t do that.
Of course. Claude is a language model. It manipulates text, not pixels. So I did what we all do: left my editor, opened a browser, found an image generator, retyped my prompt, waited, downloaded the result, moved the file to the right folder. Five minutes for one image. Five minutes of pure friction, lost context, broken flow.
And that’s when it hit me. Not the image problem — the architecture problem. Our AI assistants are brains without bodies. They can think, analyze, write with remarkable precision. But they can’t act. No eyes to see an image, no hands to call an API, no legs to go fetch data from an external system.
This image? Generated 1.5 seconds later, right in the chat, for $0.005. Once the MCP server was in place.
The Protocol That Gives AI Hands
MCP — Model Context Protocol — is the answer to this problem. Created by Anthropic in November 2024, it’s an open standard that defines how an AI assistant can call external tools. Think of it as USB-C for AI: a universal connector that lets you plug any tool into any assistant.
Before MCP, if you wanted your AI to query Salesforce, you needed a custom integration. Generate images? Another custom build. Drive your CI/CD? Yet another. Each integration was a silo — fragile, expensive to maintain, impossible to reuse.
With MCP, you write a server once. It exposes standardized tools. And any compatible client — Claude Desktop, Kiro, Cursor, VS Code — can use them immediately. The ecosystem already has hundreds of servers: databases, CRM, e-commerce, DevOps, productivity. An AI assistant with MCP doesn’t just answer your questions anymore. It executes your requests.
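To make the "universal connector" concrete: under the hood, MCP messages are JSON-RPC 2.0. Here is a minimal sketch of a tool invocation as a client might send it over stdio; the tool name and arguments are illustrative, not any specific server's contract.

```python
import json

# A JSON-RPC 2.0 "tools/call" request: the client asks the server to run a
# tool by name with structured arguments (names here are illustrative).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "generate_image",
        "arguments": {"prompt": "Indian dal with basmati rice"},
    },
}

# The server replies with a content array: text blocks and image blocks
# can travel side by side in the same result.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [
            {"type": "text", "text": "Generated in 1.5s with p-image"}
        ]
    },
}

wire = json.dumps(request)  # what actually crosses the stdio pipe
print(json.loads(wire)["method"])  # → tools/call
```

Because the envelope is standardized, any client that speaks this protocol can drive any server that implements it; that is the whole trick.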
Pruna AI: The Franco-German Startup Nobody Knows About (And That’s a Shame)
Back to my image problem. I needed a generation API that was fast, cheap, with a real REST API — not a Discord bot like Midjourney, not something at $0.08 per image like DALL-E 3.
I discovered Pruna AI somewhat by accident. It’s a Franco-German company, founded by machine learning researchers from TU Munich and the Parisian tech ecosystem. Two research hubs — Munich and Paris — over 300 scientific publications, and 9,000 open-source models optimized to date. Their unofficial tagline: “Built with Pretzels & Croissants.” You can’t make this stuff up.
The name comes from “pruning” — neural network pruning. Their approach is elegant: they don’t create models, they take the best open-source models (FLUX, SDXL, Stable Diffusion) and optimize them through quantization and pruning so they run faster on less hardware. This is high-level applied research, not marketing. The concrete result: images in 1.5 seconds for half a cent. Not a typo. $0.005 per image.
| | Pruna AI | DALL-E 3 | Midjourney | Replicate |
|---|---|---|---|---|
| Price/image | $0.005 | $0.04-0.08 | ~$0.05 (sub) | $0.01-0.05 |
| Latency | ~1.5s | ~5s | ~30s | ~3-10s |
| Models | 18 | 1 | 1 | 100+ |
| REST API | ✅ | ✅ | ❌ | ✅ |
| Video | ✅ | ❌ | ❌ | ✅ |
Replicate has more models, but Pruna is 2-10x cheaper and faster on the models they do offer. And unlike the American giants, it’s a European team, with data flowing through European infrastructure. For an MCP server where latency matters (the user is waiting in chat), it was the obvious choice.
Building the Server With Kiro: One Session, Not a Sprint
I could have coded this by hand. HTTP client, error handling, retry logic, validation, tests — count 2-3 days for an experienced developer. Instead, I opened Kiro.
Kiro is an agentic IDE developed by AWS. The principle: you describe what you want, Kiro generates specs, you validate, Kiro implements, you review. It’s not “vibe coding” where you let the AI do whatever and hope it works. It’s a structured workflow where every architecture decision remains yours, but you’re no longer typing every character.
For the Pruna server, it went like this: I described the 6 tools I wanted to expose (generation, editing, upscale, video, catalog, upload). I specified constraints — async fallback for video, transparent local file handling, inline images in chat. Kiro proposed a 5-module architecture. I adjusted two things (separate HTTP client for multipart uploads, fail-fast config validation). And in one session, I had 400 lines of clean code, 100 tests, 93% coverage, ready to publish on PyPI.
What surprised me most: the quality of the error handling code. Exponential backoff with jitter, configurable timeouts, error messages that tell you exactly what went wrong and how to fix it. The kind of code you write well when you have time — and cut corners on when you’re rushed. Kiro doesn’t cut corners.
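For the curious, the backoff pattern is small but easy to get wrong. This is a generic sketch of exponential backoff with full jitter, not the server's actual code; the function names and limits are made up:

```python
import random
import time

def call_with_backoff(fn, max_attempts=4, base_delay=0.5):
    """Retry fn() on transient errors, with exponential backoff plus full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: sleep a random amount up to the exponential cap,
            # so simultaneous clients don't retry in lockstep.
            delay = random.uniform(0, base_delay * 2 ** attempt)
            time.sleep(delay)
```

The jitter is the part people skip when rushed: without it, every client that failed at the same moment retries at the same moment, and the thundering herd hits the API again.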
The 6 Tools in Detail
generate_image → text to image, 10 models (~1.5s, $0.005)
edit_image → modify an image with text instructions (~2s, $0.01)
upscale_image → AI enhancement up to 8 megapixels (~4.5s, $0.005)
generate_video → text or image to video (~43s, $0.02-0.04/s)
list_models → browse catalog with pricing and capabilities (free)
upload_file → send a local file for editing (free)
The design that makes the daily difference: when you say “edit this image” pointing to a file on your disk, the server detects it’s a local path, auto-uploads it to Pruna, runs the edit, and returns the result inline. You never see the upload. You don’t even know it happened. That’s the right abstraction.
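That abstraction is cheap to implement. A hypothetical sketch of the dispatch at the top of the edit tool; the `upload` callable stands in for whatever endpoint actually receives the file:

```python
from pathlib import Path

def resolve_image_ref(ref: str, upload) -> str:
    """Return a URL for `ref`: pass remote URLs through, upload local files."""
    if ref.startswith(("http://", "https://")):
        return ref  # already remote, nothing to do
    path = Path(ref).expanduser()
    if path.is_file():
        return upload(path)  # transparently push the file, get a URL back
    raise FileNotFoundError(f"Not a URL and no file at {ref}")
```

The user-facing tool only ever deals in URLs after this point; the local-vs-remote distinction disappears at the boundary, which is exactly where it should disappear.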
For video, same logic: generation takes ~45 seconds. Too long for a synchronous call. The server tries sync first, and if the timeout is exceeded, it switches to async with polling. The user waits a bit longer, but doesn’t have to do anything different.
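The fallback can be sketched as a generic pattern. The three callables below are stand-ins, not Pruna's real API; the point is the shape of the control flow:

```python
import time

def generate_video(start_sync, start_async, poll,
                   sync_timeout=30.0, poll_interval=2.0):
    """Try a synchronous call first; on timeout, fall back to async + polling."""
    try:
        return start_sync(timeout=sync_timeout)
    except TimeoutError:
        job_id = start_async()  # submit a job, get an id back immediately
        while True:
            result = poll(job_id)  # None while the job is still running
            if result is not None:
                return result
            time.sleep(poll_interval)
```

From the user's side both paths look identical: the chat just takes a little longer before the video appears. A production version would also cap the polling loop with a deadline.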
What I Learned About the MCP Ecosystem (The Hard Way)
The ecosystem is 6 months old. It’s young. And you can feel it in the rough edges.
Claude Desktop launches MCP servers with PATH=/usr/bin:/bin. That’s it. If your server needs uv or any tool installed in your home directory, it won’t find it. No error. No log. The server simply doesn’t start, and Claude tells you “no tools available” with no further explanation. I lost 20 minutes on this before figuring it out. The fix: absolute path in the config. /Users/you/.local/bin/uvx, not uvx.
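For reference, this is the shape of a claude_desktop_config.json entry using the absolute path. The server name and the environment variable are illustrative; check the server's README for the real key name:

```json
{
  "mcpServers": {
    "pruna": {
      "command": "/Users/you/.local/bin/uvx",
      "args": ["pruna-mcp-server"],
      "env": { "PRUNA_API_KEY": "sk-..." }
    }
  }
}
```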
Kiro requires a triple declaration for each tool. The server in mcpServers, the tools in tools, and the tools again in allowedTools. It’s verbose but it’s a security choice — nothing executes without explicit authorization. The trap: forget one of the three entries and everything fails silently. Always test with --log-level debug.
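Schematically, the same tool name has to appear in all three places. The key names below follow the description above; Kiro's exact file layout may differ, so treat this as a shape, not a template:

```json
{
  "mcpServers": {
    "pruna": { "command": "uvx", "args": ["pruna-mcp-server"] }
  },
  "tools": ["generate_image", "edit_image"],
  "allowedTools": ["generate_image", "edit_image"]
}
```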
FastMCP’s Image() helper is a trap. It works perfectly when you return a single image. But as soon as you want to return text AND an image (which is 100% of the time — “Here’s your image, generated with model X in 1.4s”), it crashes. The fix: forget the helpers, use ImageContent and TextContent from the MCP SDK directly.
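The shape that does work is a plain content list mixing a text block and an image block. Here is a sketch building that JSON with only the standard library; the SDK's TextContent and ImageContent classes serialize to this same shape:

```python
import base64

def image_with_caption(png_bytes: bytes, caption: str) -> list:
    """Build an MCP-style content list: one text block plus one image block."""
    return [
        {"type": "text", "text": caption},
        {
            "type": "image",
            # Image payloads travel as base64 text inside the JSON envelope.
            "data": base64.b64encode(png_bytes).decode("ascii"),
            "mimeType": "image/png",
        },
    ]

content = image_with_caption(b"\x89PNG...", "Generated with p-image in 1.4s")
```

Returning this list from the tool gives the client both the caption and the inline image in one response, which is exactly the "text AND image" case the helper chokes on.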
What’s Next
The server is published, it works, it’s being used. But what interests me is what this represents for what comes next.
MCP is doing to AI assistants what REST APIs did to the web 20 years ago: turning isolated systems into a connected ecosystem. Today, MCP servers exist for PostgreSQL, Shopify, GitHub, Slack, Google Drive, AWS. New ones appear every week. In 12 months, an AI assistant without MCP access will be as limited as a smartphone without an internet connection.
And agentic development with Kiro makes building these servers accessible to any developer. No need to be an expert in network protocols or binary serialization. You describe what you want, validate the architecture, review the code. One session is enough.
If you have a business API — your ERP, your CRM, your internal management tool — you can make it accessible to your teams through their AI assistant. In an afternoon. That’s exactly the kind of project we help with at LCMH. Let’s talk.
🚀 Install pruna-mcp-server
One command, and your AI assistant generates images:

```shell
uvx pruna-mcp-server
```

Or if you prefer pip:

```shell
pip install pruna-mcp-server
```
Then:
- Create your API key on the Pruna developer portal
- Add the server to your MCP client (configuration guide)
- Type “generate an image of…” in your chat
Links:
- 📦 GitHub — source code, issues, contributions
- 🐍 PyPI — pip install pruna-mcp-server
- 🔌 MCP Registry — io.github.charlesrapp/pruna
- 🌐 Pruna AI — the service behind the API
Frequently asked questions
- What is MCP and why does it matter?
- MCP (Model Context Protocol) is an open standard created by Anthropic that lets AI assistants call external tools. Without MCP, an AI assistant can only talk. With MCP, it can generate images, query databases, drive APIs — in short, act on the real world. It's the equivalent of what plugins were for web browsers.
- How much does image generation cost with Pruna AI?
- Starting at $0.005 per image with the p-image model (generation in ~1.5 seconds). Video costs $0.02-0.04/second. That's 8-16x cheaper than DALL-E 3, with comparable latency.
- How do I install pruna-mcp-server?
- Zero-install with uvx: 'uvx pruna-mcp-server'. Or via pip: 'pip install pruna-mcp-server'. Then configure your Pruna API key and add the server to your MCP client config (Claude Desktop, Kiro, Cursor).
- What is Kiro and how does it speed up development?
- Kiro is an agentic IDE developed by AWS. Instead of writing code line by line, you describe what you want to build and Kiro generates the code, tests, and documentation. The Pruna MCP server (400 lines, 93% test coverage) was built in a single session with Kiro.
Related Articles
Coding 10x faster with AI: the new calculus of agentic development
When a team produces code 10 times faster with AI, everything else must keep up: testing, deployment, coordination. Lessons from an Amazon Bedrock team.
AI-DLC: how AI is transforming the software development lifecycle
The AI-Driven Development Life Cycle (AI-DLC) redefines software development by integrating AI at every stage. Concrete case studies and measurable results.
Shopify Winter '26 RenAIssance: what 150+ updates mean for your store
Shopify launches its Winter '26 edition with over 150 AI-centric updates. Agentic Storefronts in ChatGPT, Sidekick Pulse, SimGym: a breakdown of the features that matter.
Google Workspace Studio: automate your business without coding
Google launches Workspace Studio, a no-code automation tool powered by Gemini that connects Gmail, Drive, Sheets and third-party apps. What it changes for SMBs.
Google Workspace now includes Gemini in all plans: what changes and what it costs
Google now includes Gemini in all Google Workspace plans with a 17-22% price increase. Analysis of the impact for SMBs and the real added value.
Renaissance Developer: Werner Vogels' Framework for Thriving in the AI Era
Werner Vogels presented the 5 qualities of the Renaissance Developer in his final re:Invent keynote. Analysis and practical implications for tech teams.