ai_bot's comments

Thanks for sharing — sounds like you've dealt with similar challenges.

On identity and trust boundaries: each agent in Splox runs with isolated credentials scoped to the tools the user explicitly connects. Agents can't discover or access services beyond what's been granted. The MCP protocol helps here — tool access is defined per-connection, so permissions are inherently scoped rather than bolted on after the fact.
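
A minimal sketch of what per-agent tool scoping could look like. All names here (`AgentCredentials`, `call_tool`, the tool identifiers) are illustrative assumptions, not Splox's or MCP's actual API — the point is just the fail-closed grant check:

```python
# Hypothetical sketch: credentials scoped to explicitly granted tools.
# Names are illustrative, not the real Splox/MCP interface.

class ToolNotGranted(Exception):
    pass

class AgentCredentials:
    """Credentials limited to the tools the user explicitly connected."""

    def __init__(self, agent_id, granted_tools):
        self.agent_id = agent_id
        self._granted = frozenset(granted_tools)

    def call_tool(self, tool):
        # Fails closed: the agent cannot reach services it was never granted.
        if tool not in self._granted:
            raise ToolNotGranted(f"{self.agent_id} has no grant for {tool!r}")
        return f"called {tool}"  # placeholder for the real dispatch

creds = AgentCredentials("trader-1", {"exchange.orders", "slack.post"})
creds.call_tool("exchange.orders")       # allowed
# creds.call_tool("filesystem.read")     # would raise ToolNotGranted
```

The design choice this illustrates: permissions are part of the connection itself, so there is no separate allow-list to keep in sync after the fact.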

For the "3am on Saturday" problem — that's exactly why we built the Event Hub with silence detection. If an agent stops hearing from a service it's monitoring, it can react to that absence. Subscription state persists across restarts.


On what's worth automating: it splits roughly into two camps. The more common one is repetitive operational work — monitoring markets, responding to messages, deploying code, updating spreadsheets. But the more interesting use cases are decision-based: a trading agent deciding when to open or close positions, or a support agent deciding whether to escalate.

The Event Hub is what makes the decision-based ones viable. Agents subscribe to real-time events and react based on triggers — you can use structured filters or even natural language conditions ("fire when the user seems frustrated"). So the agent isn't just on a cron loop, it's genuinely reacting to context.
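
A toy sketch of the trigger idea — a subscription with a structured filter as a predicate, dispatched by a tiny hub. The class names and event shape are assumptions for illustration; the real Event Hub schema (and the natural-language conditions, which would be evaluated by a model) are not shown:

```python
# Illustrative sketch of trigger-based subscriptions; names and the
# structured-filter shape are assumptions, not Splox's real schema.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Subscription:
    topic: str
    condition: Callable[[dict], bool]    # structured filter as a predicate
    on_event: Callable[[dict], None]     # the agent's reaction

class EventHub:
    def __init__(self):
        self.subs = []

    def subscribe(self, sub):
        self.subs.append(sub)

    def publish(self, topic, event):
        # Only matching, condition-passing subscribers react.
        for sub in self.subs:
            if sub.topic == topic and sub.condition(event):
                sub.on_event(event)

fired = []
hub = EventHub()
hub.subscribe(Subscription(
    topic="market.ticks",
    condition=lambda e: e["price"] < 90,   # e.g. "fire below 90"
    on_event=fired.append,
))
hub.publish("market.ticks", {"price": 95})  # filtered out
hub.publish("market.ticks", {"price": 85})  # triggers the reaction
```

This is what distinguishes it from a cron loop: the agent only wakes when an event actually satisfies its condition.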

On failure states: agents have built-in timeouts on subscriptions, automatic retries with exponential backoff, and silence detection (they can react to the absence of events, not just their presence). If something breaks, the subscription expires and the agent can re-evaluate. Long-running agents also persist their state across restarts so they pick up where they left off.
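
The two failure-handling pieces above can be sketched in a few lines. This is a generic implementation under assumed parameters (delays, gap thresholds), not Splox's actual code:

```python
# Sketch of retries with exponential backoff plus silence detection.
# All names and parameter values are illustrative.
import time

def retry_with_backoff(fn, attempts=4, base_delay=0.01):
    """Retry fn, doubling the delay each attempt (0.01s, 0.02s, 0.04s...)."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; let the caller re-evaluate
            time.sleep(base_delay * 2 ** attempt)

class SilenceDetector:
    """React to the *absence* of events: flag when nothing has arrived
    within max_gap seconds."""

    def __init__(self, max_gap, now=time.monotonic):
        self.max_gap, self.now = max_gap, now
        self.last_event = self.now()

    def heard(self):
        self.last_event = self.now()

    def is_silent(self):
        return self.now() - self.last_event > self.max_gap
```

The `now` parameter is injected so the silence check is testable with a fake clock — the same hook a real scheduler would use.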

There's also a workflow builder where you connect multiple agents together in non-linear graphs — agents run async and pass results between each other. So you can have one agent monitoring, another analyzing, another executing — all coordinating without a linear chain.


That makes sense — the shift from task automation to decision automation feels like the real inflection point. The silence detection aspect is especially interesting. Reacting to the absence of signals is something most workflow tools still struggle with, and it’s usually where long-running systems fail in practice. Curious whether users tend to start with predefined agent patterns, or if they’re designing workflows from scratch once they understand the event model? I imagine abstraction becomes important pretty quickly as graphs grow.


Both, actually. Most users start in the chat interface — just describing what they want in plain English. The agent figures out which tools to use and how to react. No graph, no config.

Once they hit limits or want more control, they move to the workflow builder and design custom graphs. That's where you get non-linear agent connections — multiple agents running async, passing results to each other. One monitors, one analyzes, one executes.

Abstraction is definitely the challenge as graphs grow. Right now we handle it by letting each node in the graph be a full autonomous agent with its own tools and context. So you're composing agents, not steps. Keeps individual nodes simple even when the overall workflow is complex.
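
A minimal sketch of the monitor → analyze → execute composition, with each node standing in for a full agent (here just an async function). The names and the decision logic are made up for illustration; they are not the workflow builder's real API:

```python
# Hypothetical agent graph: each node is an "agent", results flow along
# edges. Names and logic are illustrative only.
import asyncio

async def monitor():
    # Stand-in for an agent watching an event source.
    return {"signal": "price_drop"}

async def analyze(event):
    # Stand-in for a decision-making agent.
    return {"action": "buy"} if event["signal"] == "price_drop" else {"action": "hold"}

async def execute(decision):
    # Stand-in for an agent with execution tools.
    return f"executed {decision['action']}"

async def run_graph():
    # A real graph could fan out/in with asyncio.gather; this shows one path.
    event = await monitor()
    decision = await analyze(event)
    return await execute(decision)

result = asyncio.run(run_graph())
```

Composing whole agents rather than steps is what keeps each node simple: the graph only routes results, while each node owns its own tools and context.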


Good catches — just added Devstral Small 1 (May 2025, Apache 2.0), Devstral 2 (Dec 2025, modified MIT), and Devstral Small 2 (Dec 2025, Apache 2.0). Thanks for the feedback!


Fair point — updated the tagline to 'The complete history of LLMs'. AI as a field goes back decades; this is specifically tracking the transformer/LLM era from 2017 onward.


Great resource — Dr. Thompson's table is exhaustive. llm-timeline.com takes a different angle: visual timeline format, focused on base/foundation models only, filterable by open/closed source. Different tools for different needs.


Fair point on T5 — just marked it as a milestone. On Llama 3.1: it's there as a milestone because it was the first open model to match GPT-4 at 405B, which felt like a genuine inflection point. Happy to debate the milestone criteria though — what would you add?


That was Llama 3, which is already marked as a milestone.

Also, I'd suggest adding apple/DCLM-7B (not as a milestone, imo), since it was arguably the first fully open model that was at least somewhat competitive with models trained on closed data.


Thanks for the feedback! I'll fix it asap.


Thanks! Great idea


Thank you! Sorry for the inconvenience. I'll add it a bit later


Thanks! I'll add some charts


