I Spent Weeks Debugging LangGraph. Arya Fixed It Out of the Box.

Shared state, half-written crashes, free-text agent outputs. The problems that took weeks to fix in LangGraph ship as defaults in Sarvam Arya.

LangGraph is great for building multi-agent systems. Arya is built for shipping them to production.

i've used LangGraph extensively in production. a few weeks back i built a multi-agent system to review startups and run research workflows. it worked. but it broke in ways that took days to figure out.

i'm Mohd Mursaleen, an AI engineer based in Bengaluru. this isn't a benchmark post. not a takedown either. just what changed when i stopped patching LangGraph and tried Arya.

The System That Broke

the workflow was simple on paper. ingest a startup deck, pull market data, run financial checks, summarize. five agents. one shared state. LangGraph holding it together.

then production hit and three failure modes showed up on repeat:

agents overwriting each other's state mid-run
context windows bloated with irrelevant data from previous steps
one agent crashing and corrupting state for everything downstream

we fixed all of it eventually. took multiple iteration cycles, custom guardrails, and a lot of painful debugging. then i came across Sarvam Arya. and it solves most of these problems out of the box.

State: One Drawer vs Four

simplest way to understand the difference.

in LangGraph, all your agents share one big state dictionary. every agent reads everything. every agent writes to the same place. works for small workflows. gets messy fast at scale.

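# LangGraph: one shared dict, everyone touches it
from typing import TypedDict

class AgentState(TypedDict):
    documents: list[str]   # raw source material
    facts: dict            # structured facts extracted so far
    code_results: list     # outputs from code/tool runs
    chat_history: list     # full conversation log
    # every agent sees all of it. every agent can overwrite any of it.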

Arya splits the state into four separate drawers.

one for raw documents and source material
one for clean structured facts
one for code execution results
one for conversation history

each agent only opens the drawer it needs. the research agent never sees the chat history. the code agent never reads the raw documents. no noise. no wasted tokens. no accidental overwrites.
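to make the drawer idea concrete, here's a minimal sketch in plain Python. this is not Arya's API. `PartitionedState`, `view`, and the drawer names are all hypothetical, made up to mirror the four drawers listed above. the point is only that an agent gets handed a scoped, read-only view instead of the whole dict.

```python
from dataclasses import dataclass, field
from types import MappingProxyType

# hypothetical drawer names mirroring the four listed above; not Arya's real API
DRAWERS = ("documents", "facts", "code_results", "chat_history")


@dataclass
class PartitionedState:
    documents: list = field(default_factory=list)
    facts: dict = field(default_factory=dict)
    code_results: list = field(default_factory=list)
    chat_history: list = field(default_factory=list)

    def view(self, *allowed: str) -> MappingProxyType:
        """Return a read-only mapping holding only the drawers an agent asked for."""
        unknown = set(allowed) - set(DRAWERS)
        if unknown:
            raise KeyError(f"unknown drawers: {sorted(unknown)}")
        # the mapping itself is read-only; the values inside are not deep-frozen
        return MappingProxyType({name: getattr(self, name) for name in allowed})


state = PartitionedState(
    documents=["deck.pdf text"],
    chat_history=["user: review this startup"],
)

# the research agent is only handed the documents drawer
research_view = state.view("documents")
```

`research_view` has no `chat_history` key at all, and assigning a new key to it raises `TypeError`. a real runtime would presumably go further and freeze the values too.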

Crash Recovery

in LangGraph, a crash mid-write leaves your state half-updated. you retry on top of debris. next agent reads partial data and either fails louder or, worse, silently makes a decision off corrupted input.

in Arya, state is immutable. if a node crashes, nothing is written. you retry from a clean checkpoint. every time.

this single property removed the entire category of "why is the state weird" debugging from my workflow.
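here's roughly what that property looks like, sketched in plain Python. again, not Arya's internals. `State`, `run_node`, and both nodes are hypothetical names, and the fact string is made up. the idea: a node returns a new state instead of mutating the old one, so a crash before the return commits nothing.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    facts: tuple = ()
    step: int = 0


def run_node(state, node):
    """Run a node against a frozen state. Commit only if it returns cleanly."""
    try:
        return node(state), None
    except Exception as exc:
        # nothing was written: the caller still holds the last good checkpoint
        return state, exc


def good_node(state):
    return replace(state, facts=state.facts + ("arr=12Cr",), step=state.step + 1)


def crashing_node(state):
    # builds a partial update, then dies before returning it
    partial = replace(state, facts=state.facts + ("half-written",))
    raise RuntimeError("tool timeout")


checkpoint = State()
checkpoint, err = run_node(checkpoint, good_node)
after_crash, err = run_node(checkpoint, crashing_node)
```

`after_crash` is the same object as `checkpoint`. the "half-written" fact never lands anywhere, so the retry starts from exactly what the last good node produced.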

Schema Enforcement at the Node Level

Arya enforces input-output schemas on every single node natively.

an agent can't return "revenue looks strong". it has to return:

{ "value": 175000, "unit": "Cr" }

if the output doesn't match the schema, it fails at compile time, not in production.

in LangGraph this is on you. pydantic wrappers, retry loops, output parsers, structured output prompts. all of it custom. all of it brittle.

with Arya the contract is the runtime. agents can't drift because off-schema output never makes it past the node.
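for a feel of what a node-level contract means, here's a small sketch using only the standard library. it's the kind of check you end up hand-rolling yourself, not how Arya implements it. `Revenue` and `parse_revenue` are hypothetical names, and the allowed unit set is my assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revenue:
    value: float
    unit: str

    def __post_init__(self):
        # bool is a subclass of int, so reject it explicitly
        if isinstance(self.value, bool) or not isinstance(self.value, (int, float)):
            raise TypeError("value must be a number")
        # assumed unit set, purely for illustration
        if self.unit not in {"Cr", "Lakh"}:
            raise ValueError(f"unsupported unit: {self.unit!r}")


def parse_revenue(raw):
    """Accept only a dict with exactly the keys the schema names."""
    if not isinstance(raw, dict) or set(raw) != {"value", "unit"}:
        raise TypeError(f"expected {{'value', 'unit'}} object, got {raw!r}")
    return Revenue(**raw)


ok = parse_revenue({"value": 175000, "unit": "Cr"})

try:
    parse_revenue("revenue looks strong")
    rejected = False
except TypeError:
    rejected = True
```

the structured payload parses. the free-text one gets rejected at the boundary, before any downstream agent can read it.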

Native Code Execution

one thing LangGraph still doesn't do natively: run code. Arya has a built-in Python interpreter. one agent can run code, and the results persist in their own drawer for other agents to use.

in the startup review system this was the cleanest win. a financial-analysis agent could compute runway, burn, unit economics in actual Python. results landed in the code-results drawer. summarizer pulled from there directly. no string parsing of LLM output into floats. no "approximate the math in the prompt."
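the math itself is nothing fancy. a sketch of the kind of thing the financial-analysis agent would run, with made-up figures and hypothetical function names (`monthly_burn`, `runway_months`), and a plain dict standing in for the code-results drawer:

```python
def monthly_burn(expenses: float, revenue: float) -> float:
    """Net burn: what leaves the bank each month after revenue."""
    return expenses - revenue


def runway_months(cash: float, burn: float) -> float:
    """Months of cash left. Infinite if the company is not burning."""
    if burn <= 0:
        return float("inf")
    return cash / burn


# made-up figures in Cr, purely for illustration
burn = monthly_burn(expenses=2.5, revenue=1.0)
code_results = {
    "burn": burn,
    "runway_months": runway_months(cash=18.0, burn=burn),
}
```

with these numbers, burn comes out to 1.5 Cr a month and runway to 12 months. the summarizer reads real floats, not a sentence it has to parse back into numbers.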

Side by Side

| Concern            | LangGraph                         | Arya                             |
| ------------------ | --------------------------------- | -------------------------------- |
| State model        | one shared dict                   | four typed drawers               |
| Crash mid-write    | partial state, retry on debris    | immutable, retry from checkpoint |
| Schema enforcement | custom (pydantic, parsers, retry) | native, compile-time             |
| Code execution     | external sandbox or tool          | built-in Python interpreter      |
| Best for           | flexible experimentation          | production guarantees            |

When to Pick What

both have their place. LangGraph gives you flexibility. Arya gives you guarantees.

if you're prototyping a new agent shape, LangGraph's flat state model lets you move faster. no schemas to refactor every time you add a field.

if you're shipping agents to users, the problems i spent weeks solving in LangGraph already ship as defaults in Arya. that's the trade.

three things i'd tell anyone debating the switch: treat shared state as the bug it usually is. let the runtime enforce schemas, not your prompts. keep code execution close to the data it produces.

if you want to see what production multi-agent looks like end to end, check out my five-agent orchestration platform that shipped to 200 users on launch day. that one ran on LangGraph. the next one is shipping on Arya.

stay building.