A fractal cell language where the agent's lifecycle is a state machine the compiler proves terminates. State machines are proofs. Infrastructure is a constraint. Pipelines are pipes.
cell agent + state machine + set_budget — protocol verified at compile time, LLM loop bounded at runtime
curl -fsSL https://soma-lang.dev/install.sh | sh
cell Fund {
  memory {
    positions: Map<String, Position> [persistent, consistent] // → SQLite. Automatically.
    prices: Map<String, Float> [ephemeral, ttl(30s)] // → in-memory, auto-expire
  }
  state regime {
    initial: neutral
    neutral → risk_on
    neutral → risk_off
    * → crisis
    crisis → neutral
  }
  every 1h { rebalance() }
  on rebalance() {
    let portfolio = universe()
      |> map(s => score(s))
      |> filter(s => s.composite > 50)
      |> sort_by("composite", "desc")
      |> top(15)
    for stock in portfolio { execute(stock) }
  }
  on request(method: String, path: String, body: String) {
    match path {
      "/" → html(dashboard())
      "/api/risk" → risk_report()
      _ → response(404, map("error", "not found"))
    }
  }
}
Every agent framework has the same problem: agents get stuck in loops, call tools in the wrong order, and never terminate. Soma is the only language where the agent's lifecycle is a state machine the compiler can prove terminates — combined with a hard set_budget token cap, every think() call is bounded by construction. Verification covers the protocol; set_budget covers the LLM loop.
cell agent Researcher {
  face {
    signal research(topic: String) -> Map
    tool search(q: String) -> String
    tool summarize(text: String) -> String
  }
  state workflow {
    initial: idle
    idle -> researching
    researching -> analyzing
    analyzing -> done
    * -> failed
  }
  on research(topic: String) {
    set_budget(5000) // hard token cap
    transition("task", "researching")
    let data = think("Research: {topic}")
    transition("task", "analyzing")
    let summary = think("Synthesize: {data}")
    transition("task", "done")
    map("summary", summary, "tokens", tokens_used())
  }
}
$ soma verify agent.cell
State machine 'workflow': 5 states, initial 'idle'
States: [analyzing, done, failed, idle, researching]
✓ 5 states, 7 transitions
✓ all states reachable from 'idle'
✓ terminal states: [failed]
✓ no deadlocks
✓ liveness: every state can eventually reach a terminal state
✓ wildcard transitions: * -> [failed]
Temporal properties for 'workflow':
✓ deadlock_free
✓ eventually(state in [done, failed])
✓ after('researching', state in [analyzing, failed])
✓ after('analyzing', state in [done, failed])
Temporal: 4 passed, 0 failed
// The compiler PROVED this lifecycle
// always terminates. set_budget(N) caps
// the LLM loop inside it. Together: a
// bounded, terminating agent.
cell agent — an AI agent is a cell. think() — calls any OpenAI-compatible LLM and runs the tool-calling loop with retries. tool — declares what the LLM is allowed to call (handler-backed). state — the lifecycle protocol, model-checked at compile time. The proof is over the lifecycle, not over the LLM's tokens; set_budget bounds the latter.
cell agent Pipeline {
  state pipeline {
    initial: idle
    idle -> researching -> writing -> reviewing -> done
    * -> failed
  }
  on run(topic: String) {
    transition("job", "researching")
    let facts = think("Research '{topic}'. List 5 key facts.")
    transition("job", "writing")
    let article = think("Write an article from: {facts}")
    transition("job", "reviewing")
    let review = think("Review for accuracy: {article}")
    transition("job", "done")
    map("article", article, "review", review)
  }
}
// soma verify PROVES, on the state machine:
// - every reachable state has a path to done or failed (eventually)
// - no deadlocks
// What it does NOT prove: that the handler body calls transition() in
// the right order. That's still on you (and on `soma check`, which
// catches missing handlers, contradictory properties, signal mismatches).
| | LangChain | CrewAI | AutoGen | Soma |
|---|---|---|---|---|
| Lifecycle proven to terminate | No | No | No | Yes (CTL on the state machine) |
| Illegal tool order rejected at compile time | No | No | No | State machine + face contract |
| Token budget enforced as a hard cap | No | No | No | Yes (set_budget) |
| Verified handoffs | No | No | No | Signal types + state machines |
| Agent memory persistence | Plugin | Plugin | Plugin | Built-in: [persistent] |
| Distribution | No | No | No | soma serve --join |
A quantitative fund today needs Python + Pandas + Redis + PostgreSQL + Kafka + Flask + React + Celery + Airflow + Terraform. Ten tools, five languages, three teams. Fifty thousand lines of glue code. The specification lives in Confluence. The code lives somewhere else. They drift apart.
An AI agent can write a function. Maybe a file. But it cannot write a system — services that coordinate, maintain state, handle failures, and evolve. The gap is not intelligence. The gap is that programming languages separate intent from infrastructure, specification from implementation, contract from code.
Soma closes that gap.
In Soma, you don't write specs then implement them. The face contract is the API. The state machine is the protocol. The memory properties are the infrastructure requirements. The compiler checks all three. The same artifact that describes what the system should do is the artifact that runs.
Everyone thinks they mean the same thing by "specification" until they try to make it precise.
— Leslie Lamport, Turing Lecture 2014
[persistent, consistent] resolves to SQLite. [ephemeral] resolves to in-memory. The program never names a database. It declares what it needs. The runtime resolves how. The provider protocol is extensible — custom backends can be added as .cell files.
memory {
  orders: Map<String, Order> [persistent, consistent] // → SQLite
  cache: Map<String, Price> [ephemeral, local] // → in-memory HashMap
  audit: Log<Event> [persistent] // → SQLite (WAL mode)
}
The future of data management lies in the declarative specification of what is wanted, not the procedural specification of how to get it.
— Joseph Hellerstein, The Declarative Imperative, 2010
Soma's cell is fractal. The same structure — face, memory, state, handlers — works at every scale. A function is a cell. A service is a cell. An AI agent doesn't need to learn "how to deploy a distributed system." It needs to learn one thing: the cell. Independent cells communicate via the TCP signal bus, configured in soma.toml.
// Two independent programs, connected via soma.toml
// exchange/app.cell
cell Exchange {
  on order(data: Map) { place_order(data) }
  every 500ms { signal trade(match_orders()) }
}
// trader/app.cell
cell Trader {
  every 3s { signal order(quote) }
  on trade(fill: Map) { record_fill(fill) }
}
// trader/soma.toml: [peers] exchange = "localhost:8082"
The best way to predict the future is to invent it.
— Alan Kay, Turing Award 2003
The face section is not documentation. It is a machine-checked contract. The compiler verifies it. Break the contract — the program does not compile.
cell API {
  face {
    signal create(name: String) -> Map
    signal delete(id: String) // ← declared but no handler
    promise all_persistent
  }
  on create(name: String) { return map("name", name) }
}
error: face contract: signal 'delete' declared in cell 'API' has no handler
→ Every signal in the face MUST have a matching on handler.
→ Param counts are verified. Contradictory properties are rejected.
→ Descriptive promises generate warnings. Structural promises are enforced.
This is what separates Soma from every other dynamic language. The face is a structural contract: it declares what signals the cell handles, and the compiler verifies the cell implements them. Signal existence, parameter counts, property contradictions, and structural promises are all checked at compile time. Return types and runtime invariants are not yet verified.
soma verify is a model checker. It exhaustively explores every reachable state in your state machine and proves temporal properties. Not testing — proving.
state order {
  initial: pending
  pending → validated
  validated → sent
  sent → filled
  sent → rejected
  filled → settled
  * → cancelled
}
[verify]
deadlock_free = true
eventually = ["settled", "cancelled"]
never = ["invalid"]
[verify.after.sent]
eventually = ["filled", "rejected"]
[verify.after.pending]
eventually = ["validated", "cancelled"]
$ soma verify app.cell
State machine 'order': 7 states, initial 'pending'
States: [cancelled, filled, pending, rejected, sent, settled, validated]
✓ 7 states, 11 transitions
✓ all states reachable from 'pending'
✓ terminal states: [cancelled]
✓ no deadlocks
✓ liveness: every state can eventually reach a terminal state
⚠ validated -> sent has no guard (consider adding a guard condition)
✓ wildcard transitions: * -> [cancelled]
Temporal properties for 'order':
✓ deadlock_free — no deadlocks in any reachable state
✓ eventually(state in [settled, cancelled])
✓ never(state == 'invalid')
✓ after('pending', state in [validated, cancelled])
✗ after('sent', state in [filled, rejected]) — after reaching 'sent', predicate can be avoided
counter-example: pending → validated → sent → cancelled
→ The wildcard * → cancelled bypasses the expected filled/rejected
The model checker found a real bug: after sent, the spec requires the order to reach filled or rejected. But the wildcard * → cancelled allows skipping directly to cancelled — violating the spec. The counter-example shows the exact execution path.
This is an explicit-state CTL model checker over the state-machine graph. It exhaustively enumerates reachable states — not data values, not guard expressions. For state machines with a few dozen states, verification runs in milliseconds and is complete over that abstraction. The spec lives in soma.toml, the implementation lives in .cell, and the proof is one command: soma verify. What it cannot prove: anything that depends on guard predicates over runtime data, or anything inside a handler body that doesn't show up as a transition.
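For example, a guard predicate over runtime data sits outside the abstraction. A minimal sketch (the state names and guard here are illustrative, not from a real program):

```soma
// Illustrative only: the checker explores the transition graph, not the data.
state gate {
  initial: closed
  closed → open { guard { balance > 0 } } // guard truth is a runtime question
  open → closed
}
// soma verify treats closed → open as always possible; whether
// balance > 0 can ever hold is invisible to the CTL abstraction.
```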
| Property | Meaning | CTL equivalent |
|---|---|---|
| deadlock_free | No reachable state has zero exits | AG(EX true) |
| eventually = ["X"] | All paths reach X | AF(X) |
| never = ["X"] | X is never reached | AG(¬X) |
| after.S.eventually = ["X"] | After S, all paths reach X | AG(S → AF(X)) |
Soma is not invented from nothing. It stands on decades of research in programming language theory, distributed systems, and formal methods.
Hewitt's Actor model (1973) introduced the idea that computation is message passing between autonomous agents. Soma cells are actors: they have private state (memory), respond to messages (signals), and can spawn children (interior). What Soma adds: typed contracts on the messages (face), and declarative infrastructure on the state (properties).
One Actor can send messages, create new Actors, and determine how to handle the next message it receives.
— Carl Hewitt, Peter Bishop, Richard Steiger. A Universal Modular ACTOR Formalism. 1973
Lamport showed that distributed systems need formal specifications. But specifications lived in separate documents, drifting from code. In Soma, the state machine IS the protocol. The face contract IS the API. The promise IS the invariant. There is no drift because there is no separation.
If you're thinking without writing, you only think you're thinking.
— Leslie Lamport, Turing Award 2013
Codd's relational model (1970) proved that declaring what you want beats specifying how to get it. SQL replaced procedural data access. Soma applies the same principle to infrastructure: [persistent, consistent] declares the intent. The runtime resolves the backend — today, that means SQLite for [persistent] and an in-memory HashMap for [ephemeral]. The provider protocol is extensible: new backends are cell backend declarations under stdlib/, so a Redis or Postgres backend is added the same way SQLite was.
Future users of large data banks must be protected from having to know how the data is organized in the machine.
— E.F. Codd, A Relational Model of Data for Large Shared Data Banks. 1970
Kay's vision for Smalltalk was a system where everything — including the language itself — could be modified from within. In Soma, properties, checkers, and backends are defined as .cell files, not compiler code. Add a new property? Write a cell. Add a new storage backend? Write a cell. The compiler reads them and enforces them.
The best way to predict the future is to invent it.
— Alan Kay, Turing Award 2003
Thompson and McIlroy's Unix pipes (1973) showed that complex programs are built by composing simple ones. Soma's |> operator is the same idea applied to data: filter |> map |> sort |> top. Each stage is a pure transform. The pipeline reads like a sentence.
Write programs that do one thing and do it well. Write programs to work together.
— Doug McIlroy, Unix Philosophy. 1978
Inspired by Elm and Rust. Every error shows the source line, points to the exact expression, and suggests a fix.
error: expected '=', found number '10'
--> app.cell:3:15
|
3 | let x 10
| ^
error: cannot add String and Int: hello + 5
--> app.cell:7:17
|
7 | let r = "hello" + 5
| ^
error: undefined function: lenght (did you mean 'len'?)
--> app.cell:4:9
error: face contract: signal 'delete' declared in cell 'API' has no handler
→ The face is a contract. Break it, and the compiler tells you.
soma fix. The compiler doesn't just report errors — it fixes them. soma fix reads the errors, computes repairs, and writes them back to the source file.
$ soma fix app.cell
✓ removed contradictory property: ephemeral on slot 'items'
✓ added handler: on delete(id: String) { ... }
✓ added handler: on update(id: String, data: Map) { ... }
✓ 3 fix(es) applied, re-checking...
✓ All checks passed.
$ soma lint app.cell
⚠ line 7: redundant to_json() — storage auto-serializes maps
⚠ line 11: unchecked .get() — may return ()
ℹ line 15: consider using match instead of if-chain (3 branches)
Soma is designed around the generate → fix → check → verify → serve loop. Every tool outputs structured JSON (--json). soma fix repairs trivial errors agents commonly produce: missing handlers referenced from face, contradictory property combinations, common builtin typos. Storage auto-serializes maps and lists. The language is intentionally regular so the compiler can give specific repairs.
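Strung together, the loop is four commands. A sketch using only the commands and flags documented on this page:

```console
$ soma fix app.cell           # auto-repair trivial errors
$ soma check app.cell --json  # machine-readable contract check
$ soma verify app.cell --json # model-check the state machines
$ soma serve app.cell         # run what survived
```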
Auto-repairs missing handlers, contradictory properties. One command.
Catches redundant to_json, unchecked .get(), if-chains that should be match.
Rich JSON: handler signatures, memory schema, state machines, face contracts.
Map destructuring, string prefix, guard clauses, range patterns, or-patterns — all composable. Agents generate correct HTTP routing on the first try.
on request(method: String, path: String, body: String) {
  let req = map("method", method, "path", path)
  match req {
    {method: "GET", path: "/"} -> home()
    {method: "GET", path: "/api/" + resource} -> list(resource)
    {method: "POST", path: "/api/" + resource} -> create(resource, body)
    {method: "DELETE", path: "/api/" + resource} -> delete(resource)
    {method} if method == "OPTIONS" -> cors()
    _ -> response(404, map("error", "not found"))
  }
}
No more to_json / from_json. Store maps directly. Read them back as maps.
users.set("alice", map("name", "Alice", "score", 95))
let user = users.get("alice")
print(user.name) // "Alice" — auto-deserialized
on withdraw(balance: Int, amount: Int) {
  let result = balance - amount
  ensure result >= 0 // postcondition — fails if violated
  result // implicit return
}
memory {
  accounts: Map<String, String> [persistent]
  invariant _slot_len <= 10000 // checked on every .set()
}
| Tool | Role | In Soma |
|---|---|---|
| Airflow / Cron | Scheduling | every 30s { ... } |
| Pandas (in-memory) | Data pipelines | \|> filter() \|> map() \|> sort_by() \|> agg() — in-memory only; no out-of-core / cluster query engine |
| SQLite / DB config | Persistence | [persistent, consistent] -> SQLite auto-resolved |
| In-memory cache | Fast storage | [ephemeral, local] -> HashMap |
| Express / Flask | API server | on request() { match path { ... } } |
| React | Dashboard | html(""" ... """) |
| Celery | Background jobs | every 1h { rebalance() } |
| TLA+ | Verification | soma verify + soma.toml [verify] — a CTL fragment: deadlock_free, eventually, never, and conditional after.S.eventually; not full TLA+ |
| ESLint / Clippy | Linting | soma lint — catches anti-patterns |
| Copilot fixes | Auto-repair | soma fix — writes missing handlers |
10 tools
5 languages
50,000 lines of glue
Specification: Confluence
Implementation: somewhere else
1 language
1 file (or a few)
1 command: soma serve
~300 lines
The spec IS the program
Soma has one structure: the cell. Face (contract), memory (state), state machines (protocol), signal handlers (behavior). Everything is a cell, from a function to a datacenter.
on request(method: String, path: String, body: String) {
  let req = map("method", method, "path", path)
  match req {
    {method: "GET", path: "/"} → html(dashboard())
    {method: "GET", path: "/api/" + resource} → list(resource)
    {method: "POST", path: "/api/" + resource} → create(resource, body)
    _ → response(404, map("error", "not found"))
  }
}
// Also: guard clauses, range patterns, or-patterns, if-expressions
let grade = match score {
  90..100 → "A"
  80..89 → "B"
  n if n >= 70 → "C"
  _ → "F"
}
let x = if cond { a } else { b } // if/match are expressions
let portfolio = stocks
  |> filter(s => s.market_cap > 1e9)
  |> map(s => score(s))
  |> sort_by("alpha", "desc")
  |> top(15)
let names = users |> map(u => u.name) |> join(", ")
let active = users |> filter(u => u.status == "active")
let found = users |> find(u => u.id == target)
let valid = orders |> all(o => o.total > 0)
let html = """
<div class="card">
  <h2>{stock.ticker}</h2>
  <p class="price">${stock.price}</p>
  <span class="change {cls}">{stock.momentum}%</span>
</div>
"""
state order {
  initial: pending
  pending → validated { guard { risk_check_passed } }
  validated → sent
  sent → filled
  sent → rejected
  filled → settled
  * → cancelled
}
// Runtime enforces: you cannot reach 'sent' without passing 'validated'.
// The auditor reads the .cell file and sees the policy.
// The policy IS the code.
memory {
  data: Map<String, String> [persistent, consistent] // → SQLite
  cache: Map<String, String> [ephemeral, local] // → in-memory
  sessions: Map<String, String> [ephemeral, ttl(30min)] // → auto-expire
}
// The compiler verifies: [persistent, ephemeral] → contradiction.
// Properties are defined in .cell files. The language grows itself.
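A minimal sketch of the contradiction check (the error wording below is illustrative; only the property names come from this page):

```soma
memory {
  bad: Map<String, String> [persistent, ephemeral] // cannot be both
}
// $ soma check app.cell
// error: contradictory properties on slot 'bad': 'persistent' vs 'ephemeral'
```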
let x = 42 // Int (64-bit)
// BigInt — declare in face/handler signatures, e.g. signal compute(n: BigInt)
// Promotion is automatic when an Int operation overflows.
let price = 3.14 // Float (64-bit)
let sci = 1.5e3 // Scientific: 1500.0
let name = "world" // String with {interpolation}
let user = User { name: "Alice", age: 30 } // Typed record
let dur = 5s // Duration → 5000ms
let nothing = () // Unit — null equivalent
for i in range(0, 10) { /* ... */ } // range, break, continue
while running { if done { break } } // loops
// Conversion: to_int("abc") returns () (NOT 0). Always null-check user input.
// Integer division promotes to Float when non-exact: 7 / 2 == 3.5
// Use floor(7 / 2) for truncating division.
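Because to_int returns () on failure, user input needs an explicit null check before arithmetic. A sketch (the handler name and error shape are illustrative):

```soma
on submit(body: String) {
  let n = to_int(body)
  if n == () {
    return response(400, map("error", "not a number"))
  }
  n * 2 // implicit return once the input is known to be an Int
}
```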
The cell keyword takes a kind modifier. Each kind is the same five-section structure (face, memory, state, scale, handlers); the kind tells the compiler what role the cell plays.
| Kind | Purpose |
|---|---|
| cell Foo { } | Regular cell — functions, services, web apps |
| cell agent Foo { } | Agent cell — unlocks think, set_budget, tool declarations, and the agent verification story |
| cell property Foo { } | Define a new memory property (see stdlib/durability.cell) |
| cell backend Foo { } | Define a storage backend implementation |
| cell type Foo<T> { } | Define a custom type |
| cell checker Foo { } | A custom validation rule the checker will run |
| cell builtin Foo { } | FFI bridge to a Rust builtin (stdlib only) |
| cell test Foo { } | Test cell — rules { assert … }, run with soma test |
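The test kind, for example, pairs assertions with the cell under test. A sketch assuming the rules { assert … } form named in the table (the cell name and assertions are illustrative):

```soma
cell test FundTests {
  rules {
    assert clamp(120, 0, 15) == 15 // builtin from the pipeline examples
    assert to_int("abc") == ()     // conversion failure yields (), not 0
  }
}
// $ soma test app.cell
```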
// Properties, checkers, and backends are .cell files, not compiler code.
cell property geo_replicated {
  rules {
    implies [persistent, consistent]
    contradicts [ephemeral, local]
  }
}
cell backend redis {
  rules {
    matches [ephemeral, ttl]
    native "redis"
  }
}
// Add geo_replicated to your memory — the compiler now checks it.
// Define a redis backend — the runtime can resolve [ephemeral, ttl] to it.
// No compiler changes needed. The language grew.
interior and runtime. Cells compose by nesting. A parent cell can declare interior children and wire their signals together explicitly.
cell System {
  interior {
    cell Worker { /* … */ }
    cell Cache { /* … */ }
  }
  runtime {
    start Worker // bring up the child
    connect Worker.done -> Cache // wire signals
    emit initialize() // fire startup signal
  }
}
This is how a system grows from one cell. soma describe dumps the full interior graph as JSON.
| Backend | How to invoke | When to use |
|---|---|---|
| Tree-walking interpreter | soma run file.cell (default) | Development, the reference semantics. All other backends must agree with it. |
| Bytecode VM | soma run file.cell --jit | Compiled-once-per-process speedup; better for hot loops without committing to native. |
| Native (Rust cdylib) | [native] handler annotation, or soma build for a Rust skeleton | Tight numeric loops; whole-loop BigInt dispatches to GMP via rug. Single BigInt ops cross FFI and are slower — the speedup is shaped, not uniform. |
Soma replaces Pandas for in-memory analytics. Same operations, fraction of the code. Every pipeline is composable via |>.
| Operation | Syntax |
|---|---|
| map(list, lambda) | data \|> map(s => s.name) |
| filter(list, lambda) | data \|> filter(s => s.score > 50) |
| find(list, lambda) | data \|> find(s => s.id == target) |
| any / all / count | data \|> any(s => s.active) |
| filter_by(field, op, val) | data \|> filter_by("price", ">", 100) |
| sort_by(field, dir) | data \|> sort_by("score", "desc") |
| top(n) / bottom(n) | data \|> top(10) |
| agg(group, "col:func"...) | data \|> agg("sector", "vol:sum", "price:avg") |
| group_by(field) | data \|> group_by("region") |
| join(left, right, key) | orders \|> join(prices, "ticker") |
| with(key, val) | record \|> with("score", 95) |
| describe(field) | data \|> describe("price") → {sum, avg, min, max, count} |
| flatten / zip / reverse | nested \|> flatten() |
let portfolio = universe()
  |> filter(s => s.market_cap > 1e9)
  |> map(s => score(s))
  |> sort_by("composite", "desc")
  |> top(15)
  |> map(s => {
    let weight = clamp(s.composite * exposure / total, 0, 15)
    s |> with("weight", weight)
  })
You write Soma for the architecture: cells, state machines, persistence, web server. Then one handler does a million iterations and takes seconds. In every other language, you rewrite it in C++ or Rust. In Soma, you add one word: [native]. The compiler emits a Rust cdylib, caches it under .soma_cache/native/, and the handler runs as native code. Same file, same syntax. On a tight integer loop, the speedup over the tree-walking interpreter is in the 100–300× range; numerically heavy workloads with whole BigInt loops dispatch to GMP via rug and run within a small constant factor of hand-written Rust.
cell P {
  on hot(n: Int) [native] {
    let s = 0
    let i = 0
    while i < n {
      s = s + i
      i += 1
    }
    return s
  }
  on slow(n: Int) {
    let s = 0
    let i = 0
    while i < n {
      s = s + i
      i += 1
    }
    return s
  }
}
$ soma run hot.cell --signal slow 10000000
# interpreted: ~1,250 ms
$ soma run hot.cell --signal hot 10000000
[native] compiling 1 handler(s) for cell 'P'...
[native] compiled → .soma_cache/native/...
# native: ~3–5 ms
# Speedup on this loop: ~300×.
# Reproduce: bench/compare.sh — Soma vs Python
# side-by-side on 90+ math/CS challenges.
| Mode | Sum 0..10⁷ (single run) | vs interpreted |
|---|---|---|
| Tree-walking interpreter | ~1,250 ms | 1× |
| [native] (Rust cdylib, sequential) | ~3–5 ms | ~300× |
Measured on an Apple M-series with the bundled [native] codegen. Numbers vary with workload — this is a tight integer loop, the best case for the native path. The reproducible benchmark suite under bench/compare.sh runs 90+ math/CS challenges Soma-vs-Python with both wall-time and inner timings; [compute.parallel] for cross-handler thread fan-out is wired up, but the headline parallel numbers are workload-specific — check the suite for your shape.
The code is the same in every row. [native] compiles to machine code. Adding [compute.parallel] in soma.toml splits the work across cores automatically. No threads, no mutexes, no async/await, no rewrite. One word in the code, two lines in the config.
This is the fourth axis of the Soma property system:
| Axis | Code | Configuration | Resolved to |
|---|---|---|---|
| Storage | [persistent] | soma.toml [storage] | SQLite |
| Transport | signal / on | soma.toml [peers] | TCP bus |
| Verification | state { } | soma.toml [verify] | Model checker |
| Compute | [native] | soma.toml [compute] | LLVM + threads |
Four axes. Same pattern. The code declares what. The configuration declares how. The compiler resolves.
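Side by side, the configuration half of the four axes might look like this in soma.toml. A sketch: the [peers] and [verify] keys appear verbatim elsewhere on this page, while the [storage] and [compute.parallel] keys are illustrative.

```toml
[storage]                         # Storage axis (illustrative keys)
backend = "sqlite"

[peers]                           # Transport axis: signal-bus topology
exchange = "localhost:8082"

[verify]                          # Verification axis: temporal properties
deadlock_free = true
eventually = ["settled", "cancelled"]

[compute.parallel]                # Compute axis (illustrative key)
enabled = true
```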
let files = read_files("data", 10000)
let counts = files |> word_count()
read_files and word_count are Rust builtins exposed through the interpreter, so the work happens at native speed even without [native]. The same shape applies to other heavy collection operations: the interpreter dispatches to a Rust implementation under the hood, and the pipe just composes them. Numeric hot paths still use [native]; everything else stays in the interpreter where readability and the rest of the cell model live.
on bad(name: String) [native] {
  print("hello {name}")
}
error: handler 'bad' is marked [native] but uses unsupported
parameter type 'String' for 'name'
→ [native] handlers can only use Int, Float, Bool
The compiler catches it before any code runs. [native] is for numeric hot paths — the interpreter handles everything else. Same file, same language: the interpreter orchestrates, [native] computes.
soma serve app.cell — threaded HTTP server, SQLite, CORS, auto-routing, HTML templates. Zero framework. Storage auto-serializes — no to_json/from_json needed.
cell App {
  memory { tasks: Map<String, String> [persistent, consistent] }
  on request(method: String, path: String, body: String) {
    let req = map("method", method, "path", path)
    match req {
      {method: "GET", path: "/"} → html(dashboard())
      {method: "GET", path: "/api"} → tasks.values // auto-deserialized
      {method: "POST", path: "/api"} → {
        let data = from_json(body)
        let id = to_string(next_id())
        tasks.set(id, data) // auto-serialized
        map("id", id)
      }
      _ → response(404, map("error", "not found"))
    }
  }
  on dashboard() {
    let rows = tasks.values
      |> map(t => """<tr><td>{t.name}</td><td>{t.status}</td></tr>""")
      |> join("")
    """<html><body><h1>Tasks</h1><table>{rows}</table></body></html>"""
  }
}
// $ soma serve app.cell
// listening on http://localhost:8080
Independent Soma programs communicate via signals. No WebSocket code. No HTTP polling. No serialization. signal to send, on to receive. Peers declared in soma.toml. Zero transport code in your program.
cell Exchange {
  on order(data: Map) {
    place_order(data)
  }
  every 500ms {
    let fill = match_orders()
    signal trade(fill)
  }
}
cell Trader {
  every 3s {
    signal order(data)
  }
  on trade(fill: Map) {
    record_fill(fill)
  }
}
// soma.toml
// [peers]
// exchange = "localhost:8082"
The exchange emits signal trade(fill). The trader's on trade(fill) fires automatically. Two independent processes. No shared code. No ws_connect, no http_get, no to_json. The signal is the transport.
Peers are declared in soma.toml, not in code. Change the address — zero code changes. Replace TCP with Kafka — zero code changes. The program contains logic. The configuration contains topology.
// soma.toml — topology is configuration, not code
[package]
name = "trader"
[peers]
exchange = "localhost:8082" // auto-connects on soma serve startup
risk = "localhost:7082" // add more peers — no code changes
signal trade(fill) delivers to all listeners: browsers via SSE, programs via TCP bus, WebSocket clients. Same keyword, runtime resolves the transport.
signal trade(fill) // one emit → three transports:
// → SSE: browser EventSource('/stream')
// → TCP bus: connected peers (soma.toml)
// → WS: WebSocket clients on :8081
Same code. One machine or twenty thousand. The only difference: --join.
A cell declares what it needs. The runtime figures out how.
cell PricingEngine {
  memory {
    trades: Map<String, String> [persistent, consistent]
    cache: Map<String, String> [ephemeral, local]
  }
  scale {
    replicas: 50
    shard: trades
    consistency: strong
    tolerance: 2
    cpu: 4
    memory: "8Gi"
  }
  on book_trade(data: Map) {
    trades.set(data.id, data) // auto-serialized, replicated to all nodes
  }
  on price(data: Map) {
    return cache.get(data.symbol) // node-local, microseconds
  }
}
Run it alone:
$ soma serve app.cell -p 8080
Scale it to a cluster:
$ soma serve app.cell -p 8080 # node 1 (leader)
$ soma serve app.cell -p 8081 --join localhost:8082 # node 2
$ soma serve app.cell -p 8083 --join localhost:8082 # node 3
The code doesn't change. Not one line. The scale { } section declares the distribution contract. The runtime honors it:
| Declaration | What the runtime does |
|---|---|
| shard: trades | Slot is the routing key for the consistent hash ring (FNV, 128 vnodes/node). The compiler checks the slot is non-[ephemeral]; the current runtime distributes via signal-bus broadcast rather than true partitioning. |
| [ephemeral, local] | Memory stays node-local. Fast path. Never touches the network. |
| consistency: strong | Compile-time contract: linearizable reads/writes, CP under partition. Quorum is computed and CAP contradictions are rejected. Runtime status: best-effort — the current cluster broadcasts writes over the signal bus and does not yet run a consensus protocol (Raft is on the roadmap). |
| consistency: eventual | Write locally, propagate in background over the signal bus. Reads may be stale. This is what the runtime actually delivers today, regardless of the declared mode. |
| tolerance: 2 | Declared failure budget. The compiler rejects values larger than N − quorum; the runtime detects peer loss via a 15s heartbeat timeout. |
| cpu: 4, memory: "8Gi" | Resource hints per instance, surfaced to deployment targets (Dockerfile, fly.toml). |
Honesty note. The scale block is a verified contract, not yet a hardened consensus runtime. soma verify proves your declared distribution is internally consistent (no strong on an [ephemeral] slot, no impossible tolerance, quorum math that checks out). What the runtime currently enforces is replication-via-broadcast on the signal bus. Treat strong as a specification the compiler checks you meant, not as a Raft-level guarantee.
There is no special replication protocol. When trades.set(key, val) executes on node 1, it broadcasts an EVENT on the signal bus. Every connected node receives it and applies it locally. The same bus that carries inter-cell signals carries data replication. The infrastructure is the language — with the trade-off that today this gives you eventual replication regardless of the declared consistency: mode. A real Raft/Paxos backend will plug into the same protocol.
soma verify proves distributed properties at compile time, before deployment:
$ soma verify app.cell
scale 'PricingEngine':
✓ replicas: 50 instances declared
✓ shard 'trades' is [persistent] — eligible for distribution
✓ consistency: strong — declared linearizable
✓ CAP: CP mode — consistent + partition-tolerant (declared)
✓ quorum: 26/50 nodes needed
✓ tolerance: 2 ≤ (50 − 26) = 24, accepted
✓ memory 'cache' is [ephemeral, local] — not distributed (fast path)
⚠ runtime: strong is currently delivered via signal-bus broadcast
(no consensus protocol yet) — treat as a contract, not a guarantee
Contradictions are caught at compile time:
memory { data: Map [ephemeral, local] }
scale { shard: data, consistency: strong }
// → error: shard 'data' uses [ephemeral] but scale declares
// consistency: strong — contradictory
| | Kubernetes | Erlang/OTP | Akka | Soma |
|---|---|---|---|---|
| Distribution model | External YAML | In the VM | In the library | In the language |
| Verified before deploy | No | No | No | Yes (CTL + CAP) |
| Consistency declared | No | No | No | Per memory slot |
| Same code local/cluster | No | Almost | Almost | Yes, zero changes |
Read the paper: Scale as a Type: Verified Distribution in a Fractal Cell Language
50-node trade booking, option pricing, greek reconciliation. Strong consistency, verified quorum. --join to scale.
A mini Celery. Submit jobs, work-stealing scheduler, retry logic. Verified: every job completes or expires. 120 lines.
Multi-room messaging across nodes. Eventual consistency. Messages replicated via signal bus. 90 lines.
Cells are pods. State machines are lifecycles. The scheduler is a controller loop. Verified deadlock-free. 200 lines.
Order book, matching engine, market maker bot. Two programs communicating via signal bus. Real-time fills.
From hello world to HTMX apps to Monte Carlo pricing. Every example parses, checks, and runs.
| Command | Description |
|---|---|
| soma run file.cell [args] | Execute the entry handler in the tree-walking interpreter |
| soma run file.cell --signal name [args] | Execute a specific named handler |
| soma run file.cell --jit [args] | Execute via the bytecode VM (faster startup for hot loops) |
| soma build file.cell [-o out.rs] | Generate Rust skeleton from a cell (native codegen frontend) |
| soma serve file.cell | HTTP :8080, WS :8081, Bus :8082 |
| soma serve file.cell --join host:port | Join a cluster |
| soma serve file.cell --watch | Hot reload on file change |
| soma check file.cell [--json] | Verify contracts & properties |
| soma fix file.cell | Auto-repair: adds missing handlers, fixes properties |
| soma lint file.cell [--json] | Anti-patterns: redundant to_json, unchecked .get(), if-chains |
| soma verify file.cell [--json] | State machine + distribution proofs |
| soma describe file.cell | Rich JSON: handlers, memory, state machines, face, scheduled |
| soma test file.cell | Run test cells |
| soma init [name] | Create project |
| soma add pkg --git url | Add dependency |
| soma install | Install deps |
| soma props | List registered properties + backends |
| soma repl | Interactive evaluator |
| soma ast file.cell | Dump the parsed AST as JSON |
| soma tokens file.cell | Dump the lexer token stream |
| soma env | Show stdlib path, cache dir, and resolved config |
| soma deploy file.cell --target fly\|cloudflare\|aws | Generate deployment scaffolding (Dockerfile / fly.toml / etc.) and shell out to the cloud CLI — you still need flyctl/wrangler auth set up |
Source (.cell)
→ Lexer → Parser → AST
→ Checker (contracts, properties, scale coherence)
→ Fixer (auto-repair: missing handlers, bad properties)
→ Linter (anti-patterns: redundant to_json, if-chains)
→ Verifier (state machines, temporal logic, CAP analysis)
→ Describe (rich JSON: handlers, memory, state, face)
→ Interpreter (soma run)
→ Auto-serialize storage (maps → JSON → maps)
→ Memory invariants (checked on every .set())
→ Ensure postconditions (checked on handler exit)
→ Native codegen ([native] → Rust → .dylib via cached cdylib build)
→ Registry (stdlib/*.cell — properties, backends, builtins)
→ Runtime
→ Storage (SQLite | Memory | auto-serialize)
→ HTTP server (soma serve)
→ Cluster (--join → hash ring → signal replication)
→ SSE + WebSocket + TCP signal bus
Everything an AI agent needs to write Soma. Copy-paste ready. See llms.txt for the full machine-readable reference.
// app.cell
cell App {
on run() { print("hello soma") }
}
// $ soma run app.cell
cell App {
memory { items: Map<String, String> [persistent] }
on request(method: String, path: String, body: String) {
let req = map("method", method, "path", path)
match req {
{method: "GET", path: "/"} -> items.values
{method: "POST", path: "/api/" + resource} -> items.set(resource, from_json(body))
_ -> response(404, map("error", "not found"))
}
}
}
// $ soma serve app.cell
cell agent Bot {
face { tool search(q: String) -> String "Search the web" }
state w { initial: idle idle -> thinking -> done * -> failed }
on search(q: String) { "Results for: {q}" }
on run(topic: String) {
set_budget(5000)
transition("t", "thinking")
let answer = think("Research: {topic}")
transition("t", "done")
answer
}
}
// $ soma verify bot.cell ← PROVES it terminates
// $ SOMA_LLM_MOCK=echo soma run bot.cell "AI" ← works offline
// $ SOMA_LLM_KEY=sk-... soma run bot.cell "AI" ← real LLM
//
// Ollama (local, free):
// $ export SOMA_LLM_KEY=ollama
// $ export SOMA_LLM_URL=http://localhost:11434/v1/chat/completions
// $ export SOMA_LLM_MODEL=gemma3:12b
// $ soma run bot.cell "AI"
cell agent Researcher {
state w { initial: idle idle -> done * -> failed }
on research(topic: String) {
transition("t", "done")
let r = think("Research: {topic}")
emit findings(map("topic", topic, "data", r)) // -> Writer receives
}
}
cell agent Writer {
state w { initial: idle idle -> done * -> failed }
on findings(data: Map) {
transition("t", "done")
print(think("Write about: {data.topic}"))
}
}
// emit dispatches to all cells with matching handler
// delegate("Writer", "findings", data) for direct calls
// gather(items, "Worker", "process") for fan-out
// broadcast("alert", data) for all agents
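A hedged sketch combining these dispatch builtins in one place — the cell names, handler names, and the final alert payload are illustrative, not part of the documented API:

```
cell agent Worker {
  state w { initial: idle idle -> done * -> failed }
  on process(topic: String) {
    transition("t", "done")
    think("Summarize: {topic}")
  }
}

cell agent Coordinator {
  state w { initial: idle idle -> done * -> failed }
  on run(topics: List) {
    set_budget(5000)
    // fan-out: each topic is dispatched to a Worker's process handler
    let results = gather(topics, "Worker", "process")
    transition("t", "done")
    broadcast("alert", map("status", "complete"))  // notify all agents
    results
  }
}
```

The design choice to note: `gather` is a fan-out/fan-in call, while `emit` is fire-and-forget to every matching handler — pick `delegate` when exactly one named cell should answer.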
// Variables
let x = 42 // Int
x += 10 // compound: += -= *= /=
let s = "hi {x}" // String interpolation
let m = map("a", 1) // Map (not {a: 1})
let l = list(1, 2, 3) // List (not [1,2,3])
let n = () // null (not null/nil)
// Control flow (if/match are expressions)
let y = if x > 0 { "pos" } else { "neg" }
let z = match x { 0..10 -> "small" _ -> "big" }
// Pipes
data |> filter(x => x.score > 50) |> sort_by("score") |> top(10)
// Error handling
let v = try { risky() }? // ? propagates error
ensure balance >= 0 // postcondition
let val = x ?? 0 // null coalesce
// Storage (auto-serializes)
data.set("k", map("a", 1)) // stores map directly
let v = data.get("k") // returns map, not string
// Agent builtins
think("prompt") // LLM + tool loop
think_json("prompt") // returns Map
delegate("Cell", "sig", args) // cross-agent call
set_budget(5000) // token cap
trace() // execution log
approve("action") // human gate
match value {
"literal" -> expr
"a" || "b" -> expr // or-pattern
name -> use(name) // variable binding
"/api/" + rest -> api(rest) // string prefix
{method: "GET", path} -> get(path) // map destructure
0..100 -> "small" // range
n if n > 0 -> "positive" // guard
() -> "null" // unit
_ -> "default" // wildcard
}
| Wrong | Right |
|---|---|
function foo() {} | on foo() {} |
null | () |
[1, 2, 3] | list(1, 2, 3) |
{key: val} | map("key", val) |
items.set(k, to_json(m)) | items.set(k, m) |
from_json(items.get(k)) | items.get(k) |
import x | use lib::x |
console.log(x) | print(x) |