Tiny Systems is now an agent runtime

For most of this year I've been describing Tiny Systems as a visual workflow engine that happens to have an MCP server. That framing is wrong, or at least it stopped matching what people were actually building with it.

The interesting flows everyone's been writing in the last few months are agents. Not "scheduled jobs" or "data pipelines." Agents. LLM picks a tool, calls it, reads the result, decides what to do next. Conversation history persisted to a PVC. Embeddings into pgvector. A Slack command kicking the whole thing off.

So that's what the website says now. The pitch is: self-hosted AI agents on Kubernetes. From a prompt.

What actually changed

Nothing in the engine. Same operators, same CRDs, same module ecosystem, same Helm chart. The pivot is in the framing, not the code. You can keep building everything you were building before: the cluster cost saver, the TLS expiry watcher, the CrashLoopBackOff Slack alerter. They all still work the same way.

What's different is how we talk about it. The homepage doesn't say "build Kubernetes workflows" anymore. It says "build an agent." The Solutions page is now "Starter agents." The Modules page calls itself "the agent toolkit," grouped by what the module does for the agent: Brain (llm_chat, llm_complete, llm_tools), Memory (document_store, kv), Knowledge (embed_text, pgvector), Tools (HTTP, K8s, Slack, Postgres).

This isn't a rebrand. It's an acknowledgement that the thing we built lines up with a market that didn't have a clean name when we started.

Why agents specifically

Two reasons.

First, the workflow framing put us next to n8n, Temporal, and Inngest. We're not those. Temporal is durable execution for engineers who already know what they want to build. n8n is a SaaS-first low-code tool for marketing ops. Both of those buyers walk past our K8s-native pitch. Temporal users want the durability primitive, not a flow editor. n8n users want hosted, not self-hosted. Wrong rooms.

The agent framing puts us next to Dify, LangGraph, Flowise, CrewAI. That's a different room with a different buyer: someone who wants to ship an AI feature without sending customer data to OpenAI. Mid-market with compliance requirements. Vertical-AI startups whose enterprise customers won't accept cloud LLM SaaS. Internal-tools teams. That buyer exists and is looking, and they care about the things we're good at: cluster-native deployment, namespace isolation, real RBAC, modules as Helm charts they can audit.

Second, the work we shipped this quarter only makes sense as agent infrastructure. The bundles work (TEI for embeddings, pgvector for vector storage as conditional subcharts of the operator chart) is only useful for RAG agents. The pkg/secret SDK helper for resolving {{secret:name/key}} placeholders exists because agents need Anthropic API keys and you can't paste them into flow JSON. The namespace enforcement, a pre-install hook requiring the tinysystems.io/managed=true label, is a security boundary you only care about when your modules read secrets and call paid APIs.
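To make the placeholder idea concrete, here is a minimal sketch of resolving {{secret:name/key}} markers in a string. It is not the pkg/secret implementation: ResolvePlaceholders, SecretLookup, ErrSecretNotFound, and the exact placeholder grammar are all assumptions made for illustration, and the lookup is an in-memory map standing in for a real K8s Secret read.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// ErrSecretNotFound is a hypothetical sentinel error for this sketch.
var ErrSecretNotFound = errors.New("secret not found")

// placeholderRe matches {{secret:name/key}}. The grammar the real helper
// accepts is an assumption here.
var placeholderRe = regexp.MustCompile(`\{\{secret:([^/}\s]+)/([^}\s]+)\}\}`)

// SecretLookup is a hypothetical stand-in for reading one key of a K8s Secret.
type SecretLookup func(name, key string) (string, error)

// ResolvePlaceholders replaces every placeholder in input with the value
// returned by lookup. It fails on the first placeholder that cannot be
// resolved instead of passing the raw marker downstream.
func ResolvePlaceholders(input string, lookup SecretLookup) (string, error) {
	var firstErr error
	out := placeholderRe.ReplaceAllStringFunc(input, func(m string) string {
		parts := placeholderRe.FindStringSubmatch(m)
		val, err := lookup(parts[1], parts[2])
		if err != nil {
			if firstErr == nil {
				firstErr = fmt.Errorf("resolve %s: %w", m, err)
			}
			return m
		}
		return val
	})
	if firstErr != nil {
		return "", firstErr
	}
	return out, nil
}

func main() {
	// In-memory stand-in for Secrets in the flow's namespace.
	secrets := map[string]map[string]string{
		"anthropic": {"api-key": "sk-example"},
	}
	lookup := func(name, key string) (string, error) {
		v, ok := secrets[name][key]
		if !ok {
			return "", ErrSecretNotFound
		}
		return v, nil
	}
	resolved, err := ResolvePlaceholders("x-api-key: {{secret:anthropic/api-key}}", lookup)
	fmt.Println(resolved, err)
}
```

The point of the shape is that the flow JSON only ever carries the marker; the value is substituted at the last moment, inside the namespace that owns the Secret.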

Each of those features individually felt incremental. Lined up under the agent pitch they're obviously the same project.

What's different from Dify

People will ask. Let me answer honestly so I don't have to write a comparison post later.

Dify has more polish. Their drag-drop UI is years ahead of anything we have. They have a knowledge base ingestion flow (upload docs, configure chunking, retrieval settings) that's a real feature we don't have packaged. Their plugin marketplace has more entries than our module catalog. They have 800+ contributors.

Dify runs as Docker Compose. Their Kubernetes story is community-grade, not first-class. They're built as one big container set with a UI, not as a runtime where every primitive is a real K8s object with proper RBAC and namespace isolation. If you're a developer building a chatbot for a non-K8s shop, Dify is the right tool. If you're an SRE who has to run this in production alongside everything else in your cluster, Dify is going to feel weird the moment you try to scale a worker, debug a stuck deployment, or apply your existing K8s security policies to it.

We're built the other way around. The engine is K8s. The UI is on top. Adding observability means using your existing OTel collector, not learning Dify's analytics tab. Adding secrets management means using K8s Secrets with RBAC, not a config file. This was always our shape; the agent pitch just makes it obvious why that shape matters.

We're behind Dify on breadth: multi-provider LLM support, knowledge ingestion UI, templates marketplace, observability dashboards. Those are real gaps and the next round of work closes most of them. But we're not trying to build a Dify clone with K8s underneath. The wedge is the production shape, not feature parity.

What this changes for the SDK

Nothing. Modules keep the same shape. If you wrote a module last month, it still works. The module.Bundle declaration, the pkg/secret.Resolve helper, and bundle.URL() service discovery were all added under the agent framing, but they don't break anything older.

If you're writing a new module, the question to ask is whether it's a Brain, Memory, Knowledge, or Tool component. That's the framing you should reach for. A new database client isn't "a database module"; it's "a tool agents can use to query data." A new file storage thing isn't "storage"; it's "agent memory." The shift is in what shape you reach for first.

What you can actually build today

Three working agent shapes that the existing modules cover:

Chat with memory. http_server receives a question. document_store.get loads the conversation history. llm_chat continues the conversation. document_store.put saves the updated history. http_server.response returns the answer. Five nodes, persists across pod restarts via the bbolt PVC.
K8s incident triage. pod_watch notices something looks off. llm_complete (with the pod status as context) writes a one-line diagnosis. slack_send posts to your on-call channel. Three nodes. Replaces an alerting bash script with an agent that explains what it sees.
RAG over your docs. http_server takes a question. embed_text (against in-cluster TEI from the bundle) generates the query vector. Vector search runs against pgvector (also a bundle); right now that step is a postgres_query call, until the dedicated vector_search component lands in the next release. Retrieved chunks plus the question go to llm_chat. Answer comes back. Six nodes, zero data leaving the namespace.
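For readers who think in code rather than nodes, the chat-with-memory shape can be sketched as a single function: load history, append the question, call the model, save, return. This is an illustration of the data flow only, not how the modules are implemented. Message, HistoryStore, ChatFunc, memoryStore, and countingChat are hypothetical names; the map stands in for document_store and the fake chat function stands in for llm_chat.

```go
package main

import "fmt"

// Message is one turn of the conversation.
type Message struct {
	Role    string
	Content string
}

// HistoryStore is a hypothetical stand-in for document_store get/put.
type HistoryStore interface {
	Get(sessionID string) []Message
	Put(sessionID string, history []Message)
}

// ChatFunc is a hypothetical stand-in for llm_chat.
type ChatFunc func(history []Message) (string, error)

// memoryStore keeps history in a map; the real flow persists to a PVC.
type memoryStore map[string][]Message

func (m memoryStore) Get(id string) []Message    { return m[id] }
func (m memoryStore) Put(id string, h []Message) { m[id] = h }

// countingChat is a fake model that reports how much history it was given.
func countingChat(history []Message) (string, error) {
	return fmt.Sprintf("seen %d messages", len(history)), nil
}

// HandleQuestion runs one turn: load, append, chat, save, answer.
// History is only saved when the model call succeeds.
func HandleQuestion(store HistoryStore, chat ChatFunc, sessionID, question string) (string, error) {
	// Copy so the stored slice is never modified before Put.
	history := append([]Message(nil), store.Get(sessionID)...)
	history = append(history, Message{Role: "user", Content: question})

	answer, err := chat(history)
	if err != nil {
		return "", fmt.Errorf("chat: %w", err)
	}

	history = append(history, Message{Role: "assistant", Content: answer})
	store.Put(sessionID, history)
	return answer, nil
}

func main() {
	store := memoryStore{}
	first, _ := HandleQuestion(store, countingChat, "session-1", "hello")
	second, _ := HandleQuestion(store, countingChat, "session-1", "still there?")
	fmt.Println(first)
	fmt.Println(second)
}
```

The second call sees the first exchange because the store, not the handler, owns the state. That is the same reason the real flow survives a pod restart.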

All three install via Claude Code talking to the MCP server. You don't write the flow JSON; you describe what you want and the MCP server fills it in.

What's next

Three short tracks of work for the next month, in priority order:

Credibility floor: multi-provider LLM support in llm-module (Anthropic-only is too narrow), vector_search / vector_upsert components in database-module (closes the RAG slice), 5-6 starter agents in the Solutions catalog so the page isn't sparse.

Polish sprint: signal single-shot mode (right now it retries forever on failure, burning API credits), proper error codes when an edge expression resolves to empty (the LLM API saying "x-api-key required" when the field is missing is misleading), session refresh that doesn't bounce you to the workspace picker, multi-tab presence detection that doesn't lock you out of your own flows.
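Two of those polish items are easy to picture in code: refuse to send when a required field resolved to empty, and cap the number of attempts so a failing call can't loop forever. The sketch below shows the behavior I mean, not the engine's actual code. RequireNonEmpty, Deliver, and ErrEmptyField are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrEmptyField is a hypothetical sentinel for a required field that
// resolved to an empty value.
var ErrEmptyField = errors.New("required field resolved to empty")

// RequireNonEmpty fails early, with the field name in the error, instead of
// letting the upstream API report a confusing auth failure.
func RequireNonEmpty(field, value string) error {
	if strings.TrimSpace(value) == "" {
		return fmt.Errorf("%s: %w", field, ErrEmptyField)
	}
	return nil
}

// Deliver calls send at most maxAttempts times and reports how many calls
// were made. A maxAttempts of 1 is single-shot: one call, no retry.
func Deliver(maxAttempts int, send func() error) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = send(); err == nil {
			return attempt, nil
		}
	}
	return maxAttempts, fmt.Errorf("gave up after %d attempt(s): %w", maxAttempts, err)
}

func main() {
	apiKey := "" // e.g. an edge expression that resolved to nothing
	if err := RequireNonEmpty("x-api-key", apiKey); err != nil {
		fmt.Println("not sending:", err)
		return
	}
	attempts, err := Deliver(1, func() error { return errors.New("upstream 500") })
	fmt.Println(attempts, err)
}
```

Checking before sending matters more than it looks: a validation failure is never worth a retry, so it shouldn't reach the retry loop at all.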

Platform team layer: the things that justify a paid tier on top of the free open-source runtime. Private module registry, secrets UX (point-and-click instead of kubectl), cross-cluster observability dashboards, audit log. None of these change what the runtime does; they're what an engineering org with prod/staging/dev wants on top.

The runtime stays free and open source forever. The platform is the upgrade path when a team forms around it. Same shape as Mattermost, Supabase, Cal.com, GitLab: open core, paid team layer. Not a new model, just the right one.

Catching up

If you've been following along, this is the same project. If you're new: install the MCP server, point it at a cluster you control, ask Claude Code to build you an agent. The pitch is "your cluster, your keys, your namespace", and unlike everyone else saying similar words, we mean every word of it literally because the runtime is K8s and we don't operate it.