ENGINEERING

Why we built Tablize as one Rust binary instead of microservices

2026-05-23 · Tablize Team

Tablize ships as one Rust binary. Inside that binary: the agent runtime, the HTTP server, five product domains (Data, IoT, App, Media, Platform), 38 third-party integrations, the LLM provider client, the tool registry, the auth layer, the billing ledger. Deploy it with docker run. One process. One image.

This is, in 2026, an unfashionable choice. The dominant pattern for a product like ours is: 5+ microservices, separate deployments per domain, a service mesh, a separate database per service, an event bus to glue them together. We did not do that. This post is why.

(There’s a tradition in engineering blog posts where someone defends a contrarian architectural choice by listing the upsides and waving away the downsides. This isn’t going to be that post. The choice has real costs. We’ll cover them.)

The setup

The constraints we were optimizing for:

1. Self-hostable. A user should be able to docker compose up and have a fully functional Tablize instance. Not “Tablize Lite.” The real thing.

2. Small-team operable. Both for our customers (a DTC operator who doesn’t run Kubernetes) and for us (we don’t have a platform team to maintain 12 services).

3. Domain isolation in code. No tangled cross-imports between Data and IoT logic. Each domain should be independently understandable.

4. Hot iteration. We’re moving fast. Adding a feature shouldn’t require touching 4 services and coordinating 4 deploys.

5. Low cold-start cost. Our managed cloud spins up a workspace per user. Cold-start matters; spinning up 5 containers per workspace is too slow.

Four of those constraints push toward a monolith. Constraints 1, 2, and 5 push hard. Constraint 4 does too, because coordinating cross-service changes is the largest tax on microservice teams. Constraint 3 is the one you’d expect a microservice split to solve.

So we asked: can we get the code-organization benefits of microservices (constraint 3) without the operational complexity?

What we built instead

The answer is a single Rust workspace with 17 crates:

server                ← HTTP entry point + route registration
rusty-claude-cli      ← CLI / interactive shell
console-server        ← admin / billing / workspace mgmt
  └── domain-data         ← analytics, ingest, CSV/JSON/Parquet
  └── domain-iot          ← MQTT, devices, spatial indexing
  └── domain-app          ← app generation, contracts
  └── domain-media        ← S3 storage
  └── domain-platform     ← auth, RBAC, billing, jobs
  └── domain-integrations ← 38 connectors
  └── tools               ← aggregates tool_specs() from all domains
      └── runtime → plugins → commands
          └── api         ← LLM provider client

The rule: domains have zero runtime cross-dependencies. Data doesn’t import IoT. IoT doesn’t import App. They each expose:

A {Domain}State struct that owns the domain’s resources (PG pool, MQTT client, etc.)
A tool_specs() function returning the tools this domain provides to the agent
An execute_tool_async() function called by the runtime to actually run a tool

The tools crate is the only thing that imports all five domains, and it does so to aggregate them into one global tool registry. The agent runtime calls execute_tool_async() on whichever domain owns the requested tool.
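To make the contract concrete, here is a minimal sketch of the shape described above. It is not Tablize’s actual code. Two modules stand in for two domain crates, `ToolSpec` and `ToolRegistry` are hypothetical types, the tool names are invented, and `execute_tool` is a synchronous simplification of `execute_tool_async()` so the example runs with the standard library alone.

```rust
use std::collections::HashMap;

/// Hypothetical description of one tool a domain offers to the agent.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolSpec {
    pub name: &'static str,
    pub description: &'static str,
}

// Stand-ins for two domain crates. Neither module references the other.
pub mod domain_data {
    use super::ToolSpec;

    /// Owns the domain's resources (a real one would hold a PG pool).
    pub struct DataState {
        pub order_count: u64,
    }

    pub fn tool_specs() -> Vec<ToolSpec> {
        vec![ToolSpec { name: "data.count_orders", description: "Count orders" }]
    }

    pub fn execute_tool(state: &DataState, name: &str, _input: &str) -> Result<String, String> {
        match name {
            "data.count_orders" => Ok(state.order_count.to_string()),
            other => Err(format!("data: unknown tool {other}")),
        }
    }
}

pub mod domain_iot {
    use super::ToolSpec;

    /// Owns the domain's resources (a real one would hold an MQTT client).
    pub struct IotState {
        pub device_ids: Vec<String>,
    }

    pub fn tool_specs() -> Vec<ToolSpec> {
        vec![ToolSpec { name: "iot.list_devices", description: "List device ids" }]
    }

    pub fn execute_tool(state: &IotState, name: &str, _input: &str) -> Result<String, String> {
        match name {
            "iot.list_devices" => Ok(state.device_ids.join(",")),
            other => Err(format!("iot: unknown tool {other}")),
        }
    }
}

/// Which domain owns a given tool name.
#[derive(Clone, Copy)]
enum Owner {
    Data,
    Iot,
}

/// Stand-in for the `tools` crate: the only place that sees every domain.
pub struct ToolRegistry {
    data: domain_data::DataState,
    iot: domain_iot::IotState,
    owners: HashMap<&'static str, Owner>,
}

impl ToolRegistry {
    /// Build the global registry by aggregating each domain's tool_specs().
    pub fn new(data: domain_data::DataState, iot: domain_iot::IotState) -> Self {
        let mut owners = HashMap::new();
        for spec in domain_data::tool_specs() {
            owners.insert(spec.name, Owner::Data);
        }
        for spec in domain_iot::tool_specs() {
            owners.insert(spec.name, Owner::Iot);
        }
        Self { data, iot, owners }
    }

    /// Route a call to whichever domain registered the tool.
    pub fn execute(&self, name: &str, input: &str) -> Result<String, String> {
        match self.owners.get(name) {
            Some(Owner::Data) => domain_data::execute_tool(&self.data, name, input),
            Some(Owner::Iot) => domain_iot::execute_tool(&self.iot, name, input),
            None => Err(format!("no domain owns tool {name}")),
        }
    }
}
```

The point of the shape is that only the registry names both domains. Adding a domain touches the new module and the registry, and nothing else.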

This is microservice-like code isolation with monolith deployment.

Why Rust

Three reasons.

Memory safety without GC pauses. We run a lot of concurrent I/O: WebSocket connections from agents, MQTT subscriptions, HTTP requests to LLM providers, Postgres queries, S3 streaming. A pause-the-world GC at the wrong moment would drop messages and time out queries. Rust’s compile-time memory safety + zero-cost async lets us run thousands of concurrent connections per process.

One binary, no runtime. Compared to Node or Python, the deployment story is dramatically simpler. There’s no “make sure libffi version X is installed” or “use Python 3.11 not 3.12.” The binary statically links everything it needs. Docker image is 55 MB.

The compiler catches the things we’d catch in code review. This sounds glib. It’s not. About 30% of the “would have shipped a bug” cases we’ve had were caught at compile time — usually around lifetime correctness in shared state. The compiler-as-pair-programmer is real.

The cost: Rust is slower to write than Python, and our team had to ramp up on async Rust patterns. We chose this trade because Tablize is fundamentally an infrastructure product, and we’re going to be living with this code for years.

Why one PG instance

Each domain has its own schema inside one Postgres instance (iot.*, data.*, app.*, etc.) — not its own database, not its own server. Five schemas, one PG.

The reasoning:

Cross-domain joins are common. “Show me orders whose customer also has an active IoT alert” — that’s a join across data.orders and iot.alerts. With one PG, this is one query. With separate databases per domain, you’d be doing in-memory joins after fetching from two services.

One backup is much simpler than five. Self-hosted users do pg_dump and have everything. Managed users get one logical backup per workspace.

TimescaleDB extension applies everywhere. Time-series acceleration on iot.sensor_readings and on data.event_logs uses the same extension, same hypertable mechanics, same retention policies.

Connection pooling is simpler. We started with one pool, sized once and shared by all domains. (We later added per-domain sizing; see the cost below.)
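The cross-domain join mentioned above can be sketched both ways. The column names (`id`, `customer_id`, `status`) and the Rust types are assumptions for illustration; only the table names `data.orders` and `iot.alerts` come from the text. The constant holds the single-query version, and the function shows roughly what a database-per-domain design would do after fetching both sides.

```rust
use std::collections::HashSet;

/// With one Postgres instance, the cross-domain question is a single query.
/// Column names here are assumed, not taken from the real schema.
pub const ORDERS_WITH_ACTIVE_ALERTS: &str = "\
SELECT DISTINCT o.id, o.customer_id
FROM data.orders AS o
JOIN iot.alerts AS a ON a.customer_id = o.customer_id
WHERE a.status = 'active'";

#[derive(Debug, Clone, PartialEq)]
pub struct Order {
    pub id: u64,
    pub customer_id: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Alert {
    pub customer_id: u64,
    pub active: bool,
}

/// With a database per domain, the caller would fetch both sides over the
/// network and then join them in memory, roughly like this.
pub fn orders_with_active_alerts(orders: &[Order], alerts: &[Alert]) -> Vec<Order> {
    // Collect the customers that have at least one active alert.
    let alerted: HashSet<u64> = alerts
        .iter()
        .filter(|a| a.active)
        .map(|a| a.customer_id)
        .collect();

    // Keep only the orders placed by those customers.
    orders
        .iter()
        .filter(|o| alerted.contains(&o.customer_id))
        .cloned()
        .collect()
}
```

The in-memory version is short, but both inputs have to cross a service boundary first, and the database never gets the chance to use an index or push the filter down.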

The cost: a runaway query in one domain can affect the others. We mitigate with statement timeouts and connection limits per domain. Has it been a problem? Once, when a user’s IoT subscription started receiving 50K msg/sec and the inserts backed up the connection pool. We added per-domain pool sizing and the issue hasn’t recurred. Cheap fix.
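A minimal sketch of what per-domain limits could look like. The `DomainLimits` type, the function names, and every number used with them are hypothetical, not our production configuration. The one real piece is `statement_timeout`, a standard Postgres setting that reads a bare integer as milliseconds.

```rust
/// Hypothetical per-domain database limits. Values passed in are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct DomainLimits {
    pub schema: &'static str,
    pub max_connections: u32,
    pub statement_timeout_ms: u64,
}

/// Statement to run on every new connection opened for a domain, so a
/// runaway query in that domain is cancelled instead of holding a connection.
pub fn session_setup_sql(limits: &DomainLimits) -> String {
    format!("SET statement_timeout = {}", limits.statement_timeout_ms)
}

/// Sum the per-domain connection budgets and check that they fit under the
/// server's max_connections, minus a reserve for admin sessions and backups.
pub fn total_budget(
    limits: &[DomainLimits],
    server_max: u32,
    reserved: u32,
) -> Result<u32, String> {
    let total: u32 = limits.iter().map(|l| l.max_connections).sum();
    let available = server_max.saturating_sub(reserved);
    if total <= available {
        Ok(total)
    } else {
        let schemas: Vec<&str> = limits.iter().map(|l| l.schema).collect();
        Err(format!(
            "domains {schemas:?} request {total} connections, only {available} available"
        ))
    }
}
```

Checking the budget at startup turns "the IoT inserts starved everyone else" from a runtime incident into a configuration error.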

Why no event bus between domains

Most microservice architectures use Kafka / NATS / Redis pub-sub to communicate between services. We don’t. When the IoT domain needs to trigger an analysis (e.g., sensor anomaly → run a data query), it does so via a tool call routed through the agent runtime.

This sounds slower than a direct event bus. It’s not, because the agent runtime runs in the same process as every domain. Tool calls dispatch in microseconds. There’s no network hop, no serialization, no broker.

The downside: there’s no audit log of “domain A sent event X to domain B” outside the agent’s tool-call trace. We accept this. The agent trace is the cross-domain log.
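The sensor-anomaly example can be sketched as follows, under stated assumptions. `ToolCall`, `TraceEntry`, `Runtime`, the tool name `data.run_query`, and the threshold check are all hypothetical. The sketch also assumes the IoT side hands a request back to the runtime rather than calling it directly, which is one way to keep the IoT code free of any reference to the Data domain.

```rust
/// A request for the runtime to run a tool (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub caller: &'static str,
    pub tool: &'static str,
    pub input: String,
}

/// One line of the agent's tool-call trace (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct TraceEntry {
    pub caller: &'static str,
    pub tool: &'static str,
    pub ok: bool,
}

/// Minimal stand-in for the agent runtime: routes calls and records them.
pub struct Runtime {
    pub trace: Vec<TraceEntry>,
}

impl Runtime {
    pub fn new() -> Self {
        Self { trace: Vec::new() }
    }

    /// Dispatch is a plain function call in the same process:
    /// no broker, no network hop. Every call lands in the trace.
    pub fn dispatch(&mut self, call: ToolCall) -> Result<String, String> {
        let result = match call.tool {
            // A real runtime would look the tool up in the registry.
            "data.run_query" => Ok(format!("ran query: {}", call.input)),
            other => Err(format!("unknown tool {other}")),
        };
        self.trace.push(TraceEntry {
            caller: call.caller,
            tool: call.tool,
            ok: result.is_ok(),
        });
        result
    }
}

/// IoT-side handler: on an anomalous reading, ask for an analysis by tool name.
/// It names a tool; it never imports the Data domain.
pub fn on_sensor_reading(sensor: &str, value: f64, threshold: f64) -> Option<ToolCall> {
    if value > threshold {
        Some(ToolCall {
            caller: "iot",
            tool: "data.run_query",
            input: format!("recent readings for {sensor}"),
        })
    } else {
        None
    }
}
```

Because every cross-domain request passes through `dispatch`, the trace is complete by construction, which is why it can double as the cross-domain log.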

What domain-isolation buys us

In practice, the rule “domains can’t import each other” is the single most valuable architectural constraint we have.

It means: when a developer is working on the IoT domain, they don’t have to know how the Data domain works. The IoT crate has its own Cargo.toml, its own tests, its own integration tests against PG. They cannot accidentally take a dependency on data::orders because the import won’t compile.

It also means: when we onboard a contractor for one domain, they ramp up on one domain’s code, not the whole codebase. The mental model fits in a head.

The enforced isolation is, in our experience, more important than the deployment isolation. You can run a monolith well if the code is well-organized; you cannot run microservices well if the code is tangled. We chose the easier of the two problems.

What we’d do differently

Honest list:

We over-isolated some things. Early on, we made domain-media (S3 storage) its own crate. In retrospect, S3 storage is such a thin abstraction that it could have lived in a util crate that any domain imports. We over-applied the domain pattern to things that aren’t really domains.

The agent runtime crate is too big. runtime → plugins → commands is one logical unit but it grew to 8 sub-crates with confusing boundaries. We’ve been gradually consolidating; this is ongoing.

WebSocket session management is in the wrong place. Right now it lives in server. It probably should be in runtime since it’s tied to the agent lifecycle, not the HTTP layer. Refactor on the list.

One PG isn’t going to scale forever. For our current scale (workspaces under 100M rows in a single PG), this is fine. For workspaces that grow into the billions, we’ll probably need to split the high-volume IoT data into a separate TimescaleDB or shard it. We’re watching for the breakpoint.

What it cost us not to do microservices

Some things microservices give you that we gave up:

Independent deploy cadence. When we ship a new IoT feature, the entire binary redeploys. With microservices, we could deploy just IoT. In practice, we deploy 2-3 times a week and the full rollout is fast (Fly.io rolling restart, <2 min), so this hasn’t hurt.

Per-domain scaling. With microservices, the IoT consumer can scale independently of the analytics layer. In our world, we scale the whole binary per workspace. For our cost model (per-workspace Fly.io machines), this is fine — the workspace is the scaling unit, not the domain.

Language flexibility. Microservices let you write the IoT consumer in Rust and the LLM router in Python. We’re all-Rust. For us this is a win (one language, one toolchain), but for teams with strong existing Python/Go investments it would be a downside.

Blast radius isolation. If the agent runtime crashes, all of Tablize crashes for that workspace. With microservices, only the agent runtime container would crash. We mitigate with extensive panic handling in Rust and per-workspace processes (so one workspace’s crash doesn’t affect another). It’s a real tradeoff.
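As one illustration of the panic handling mentioned above, here is a sketch that wraps a tool invocation in `std::panic::catch_unwind` and turns a panic into an ordinary error. The function name `run_tool_guarded` and the error format are hypothetical. Note the limits of the technique: it only works when the binary is built with the default `panic = "unwind"` strategy, and it does not catch aborts or stack overflows.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Run a tool body and convert a panic into an Err instead of letting it
/// take down the process. `tool` is only used to label the error message.
pub fn run_tool_guarded<F>(tool: &str, f: F) -> Result<String, String>
where
    F: FnOnce() -> String,
{
    // AssertUnwindSafe is a promise that we will not observe state the
    // closure left half-updated; callers should drop such state on error.
    panic::catch_unwind(AssertUnwindSafe(f)).map_err(|payload| {
        // Panic payloads are usually a &'static str or a String.
        let msg = if let Some(s) = payload.downcast_ref::<&str>() {
            (*s).to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "non-string panic payload".to_string()
        };
        format!("tool `{tool}` panicked: {msg}")
    })
}
```

A guard like this narrows the blast radius from "the workspace process dies" to "this tool call failed". The per-workspace process boundary remains the backstop for anything it cannot catch.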

When you should do what we did

Not all products should ship as a single Rust binary. Some signals that this shape fits:

You need self-hosting to be trivial. “Run this Docker image” is dramatically simpler than “run this Kubernetes manifest.”
You’re a small team (under 20 engineers) for the foreseeable future. Microservice operational tax doesn’t pay for itself until you have multiple teams owning multiple services.
Your domains are read-mostly and share persistent state. Ours all read from and write to the same PG. If your domains have genuinely separate state, splitting them makes more sense.
You’re optimizing for cold-start latency. Starting one process on one machine per workspace is much faster than starting a set of containers per workspace.

Signals it doesn’t fit:

You have multiple teams owning independent services with independent release cadences. Then microservices are how you scale organizational complexity.
Your domains have wildly different traffic profiles. If one domain handles 100× more traffic than the rest, splitting it lets you scale just that.
You need polyglot. Different languages for different services means microservices.

What to take from this

Architectural choices aren’t permanent. We may end up splitting Tablize into multiple binaries someday if scale demands it. The current shape is the right shape for now and for the next 12-24 months of growth. The flexibility we kept by structuring the code as well-isolated domain crates means a future split, if it happens, is a refactor — not a rewrite.

If you’re starting a similar product today, we’d encourage you to ask: “what’s the code-organization problem I’m solving by splitting into services, and could I solve that with disciplined crates / packages / modules instead?” Most of the time, the answer is yes. The deployment complexity of microservices is mostly orthogonal to whether your code is well-organized.

