Hacker News | new | past | comments | ask | show | jobs | submit | timini's comments

clearly this is a play to own all the agentic traces for further training


This paper evaluates three control strategies for untrusted agents: deferral to trusted models, resampling, and critical action deferral. Initial testing showed resampling and critical action deferral achieving 96% safety. However, adversarial testing revealed resampling crashes to 17% safety when attackers can detect resampling or simulate monitors, while critical action deferral remained robust against all attack strategies.
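The three strategies can be sketched as a toy control loop. Everything below (the model stubs, the monitor score, the 0.5 threshold, the action names) is illustrative and not the paper's actual setup:

```python
# Toy sketch of the three control strategies summarized above.
# The models, the monitor, and the threshold are illustrative stubs.

CRITICAL_ACTIONS = {"delete_file", "send_funds"}

def monitor_suspicion(action: str) -> float:
    """Stub monitor: rate how suspicious a proposed action looks."""
    return 0.9 if action in CRITICAL_ACTIONS else 0.1

def trusted_model(task: str) -> str:
    """Weaker but trusted fallback policy."""
    return "safe_noop"

def untrusted_model(task: str) -> str:
    """Stronger untrusted agent; here it always proposes a risky action."""
    return "delete_file"

def control_step(task: str, strategy: str) -> str:
    action = untrusted_model(task)
    if strategy == "defer_to_trusted":
        # Hand any suspicious step over to the trusted model.
        return trusted_model(task) if monitor_suspicion(action) > 0.5 else action
    if strategy == "resample":
        # Ask the untrusted model again; fall back only if still suspicious.
        # An attacker who can detect the resampling defeats this branch.
        retry = untrusted_model(task)
        return retry if monitor_suspicion(retry) <= 0.5 else trusted_model(task)
    if strategy == "critical_action_deferral":
        # Only actions on the critical list are ever deferred.
        return trusted_model(task) if action in CRITICAL_ACTIONS else action
    return action

print(control_step("cleanup tmp dir", "critical_action_deferral"))
```

The robustness gap falls out of the structure: resampling trusts a second draw from the same untrusted policy, while critical action deferral never lets a listed action through without the trusted model.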


HaluMem introduces the first benchmark for evaluating hallucinations in agent memory systems at the operation level. Through three evaluation tasks (memory extraction, updating, and question answering), it reveals that existing memory systems generate and accumulate hallucinations during early stages, which then propagate errors downstream. The benchmark uses two datasets spanning different context scales to systematically reveal these failure modes.


OpenHands SDK provides a complete architectural redesign for building production software development agents. It balances simplicity (few lines of code for basic agents) with extensibility (custom tools, memory management) while delivering seamless local-to-remote execution, integrated security, and connections to various interfaces (VS Code, command line, APIs).


not sure there are any agents yet

but there is some research, like:

https://arxiv.org/abs/2510.15103


judging by this article, no


TL;DR

Problem: "Tool overload" is a critical bottleneck for AI agents. Providing an LLM with a large, static list of tools bloats the context window, degrading performance, increasing costs, and reducing accuracy.

Solution: Implement a "select, then execute" architectural pattern. Use a lightweight "router" agent to first retrieve a small, relevant subset of tools for a specific task. Then, a more capable "specialist" agent uses that curated set to execute the request.

Benefits: Lower latency and cost (fewer tokens), higher tool-selection precision, a scalable architecture for large tool catalogs, and improved reliability.

Pattern: This is a form of Retrieval-Augmented Generation (RAG) applied to tools, often called Retrieval-Augmented Tool Selection (RATS). It can be combined with State-Based Gating for even greater precision.

How: The post provides a complete, production-aware implementation using Google's Agent Development Kit (ADK).
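The select-then-execute idea is roughly the following. This is not ADK code: the tool catalog is made up, and the keyword-overlap scoring is a deliberately naive stand-in for the embedding retrieval a real router would use.

```python
# Minimal sketch of "select, then execute" tool routing.
# All names are illustrative; a production router would score with
# embeddings, and a specialist LLM would run the selected tools.

TOOL_CATALOG = {
    "get_weather": "fetch current weather forecast for a city",
    "send_email": "compose and send an email message",
    "query_db": "run a sql query against the sales database",
    "create_ticket": "open a support ticket in the issue tracker",
}

def route_tools(task: str, k: int = 2) -> list[str]:
    """Router step: score every tool description against the task
    and keep only the top-k, instead of passing the whole catalog."""
    task_words = set(task.lower().split())
    scored = [
        (len(task_words & set(desc.split())), name)
        for name, desc in TOOL_CATALOG.items()
    ]
    scored.sort(reverse=True)
    return [name for score, name in scored[:k] if score > 0]

def execute(task: str) -> list[str]:
    """Specialist step: only these tool schemas would be placed in
    the specialist LLM's context window."""
    return route_tools(task)

print(execute("run a sql query against the sales database"))
```

The point is that the specialist only ever sees `k` tool schemas in its context, so the token cost stays flat as the catalog grows.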


I think it's fairly simple: it needs a certain level of proof, e.g. references to authoritative sources, and if it doesn't have that, it should say "I don't know".
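That gate could be sketched as a wrapper around the model's answer. The allowlist, the example URL, and the substring-based citation check below are all hypothetical and deliberately naive:

```python
# Hedged sketch of "require proof or abstain".
# The allowlisted domains and the example DOI are made-up placeholders,
# and a real check would verify the citation, not just spot a domain.

AUTHORITATIVE = ("arxiv.org", "doi.org", "rfc-editor.org")

def has_citation(answer: str) -> bool:
    """Accept only answers that reference an allowlisted source."""
    return any(domain in answer for domain in AUTHORITATIVE)

def answer_or_abstain(answer: str) -> str:
    """Gate the model's output: no authoritative reference, no claim."""
    return answer if has_citation(answer) else "I don't know."

print(answer_or_abstain("Everest is 8,849 m (https://doi.org/10.0000/example)"))
print(answer_or_abstain("Everest is probably around 9,000 m"))  # → I don't know.
```

As the replies below point out, the hard part isn't this if statement; it's that the model can fabricate the citation itself, so the check has to happen outside the model.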


LLMs are token completion engines. The correspondence of the text to the truth or to authoritative sources is a function of being trained on text like that; with the additional wrinkle that generalization from training (a desired property, or else it's just a memorization engine) will produce text that is only plausibly truthful: it merely resembles the training data.

Getting beyond this is a tricky dark art. There isn't anything simple there. There's nowhere to put an if statement.


LLMs don't have a concept of sources for their statements.

Ask them to give you some literature recommendations on something they have explained to you. You'll get plenty of plausible-sounding papers that don't exist.

Humans know, to some extent, why they know something (they read it in a textbook, a colleague mentioned it). LLMs don't seem to.


Ask a human to provide accurate citations for any random thing they know and they won't be able to do a good job either. They'd probably have to search to find it, even if they know they got it from a document originally and have some clear memory of what it said.


But they could, if they needed to. Most people don't need to, so they don't keep that information in their brains.

I can't tell you the date of every time I clip my toenails, but if I had to, I could remember it.


LLMs can remember their sources. It's just additional knowledge, there's nothing special about it.

When you ask an LLM to tell you the height of Mount Everest, it clearly has a map of mountains to heights, in some format. Using exactly the same mapping structure, it can remember a source document for the height.


No, it “knows” that one of the token sequences that commonly follows that question is “29,031 ft”.


The fact that a human chooses not to remember their citations does not mean they lack the ability.

This argument comes up many times: “people don't do this”. But that is a question of frequency, not of whether people are capable.


LLMs are capable as well if you give them access to the internet, though.


Humans did research and remembered sources before the Internet was a thing.

But also, can you give an example where an LLM with access to the Internet can find a primary source?

I don't think learning to refer to sources is something inherently impossible for LLMs, but it is very different to the kind of implicit knowledge they seem to excel at.


They just paste in the first link, or follow some other programmed heuristic; they aren't like a human who puts in effort to find something relevant. An LLM with internet access isn't smarter than just asking Google Search.


Yes, humans won't lie to you about it; they will do the research and come up with sources. Current LLMs don't do that when asked for sources (unless they invoke a tool); they come back with hallucinated links that look like the links they were trained on.


Unfortunately it's not an uncommon experience when reading academic papers in some fields to find citations that, when checked, don't actually support the cited claim or sometimes don't even contain it. The papers will exist but beyond that they might as well be "hallucinations".


Humans can speak bullshit when they don't want to put in the effort; these LLMs always do it. That is the difference. We need to recreate the part humans do when they put in the deliberate work to properly find those sources; that kind of thinking isn't captured in the text, so LLMs don't learn it.


They read it in a non-existent average interpolation of the books that actual humans read similar things in.


LLMs don't have any concepts period.


Then it is nothing more than a summarizer for search engine results.


A lot of people have said ChatGPT/Copilot is a lot like having a robotic junior dev around.

I think perhaps your description is more succinct.


I'm really curious how one would implement that. By weighting certain sources?


Did you get paid to make this?


Yes. I did most of this as a postdoc at FAIR, Meta's AI research group.

