Hacker News | new | past | comments | ask | show | jobs | submit | mycall's comments | login

Overriding drivers can be mitigated as it is just one layer.

I tend to disagree. When guiding AI through many rounds of code review, it can self-correct if shown where general issues exist. It takes practice to learn the language of the model, e.g. "drift" instead of "issues". Human-in-the-loop is good enough to produce useful and accurate code today.

If you can actually do this, please sell your services. You would become a multi-millionaire overnight: if you can provide a workflow that doesn't result in mass hallucinations or incorrect suggestions, you're able to do something no other LLM company can.

The more common use case is that these tools struggle immensely on anything outside the happy path.


Ukraine is different and did the reverse, giving up their nukes. They said it was too expensive to keep them, which is only partially true. Ukraine could have dismantled them and created new Permissive Action Links (PALs) in Dnipro, although this process would have taken years and carried a high risk of accidental detonation or radioactive mishaps during the reengineering phase.

And there’s also a small detail of Russia threatening invasion if Ukraine tried to decode those.

Well, they always do, so threats from the bad boys are just noise.

Sorry, but if Cuba started launching drones at Florida, especially Mar-a-Lago, Cuba would be carpet-bombed. Americans would simply not put up with that. It would be a sad day in history, imho.

The problem with regex is multi-language support and how much the regex will bloat if you try to support even 10 languages.

Supporting 10 different languages in regex is a drop in the ocean. The regex can be generated programmatically and you can compress regexes easily. We used to have a compressed regex that could match any placename or street name in the UK in a few MB of RAM. It was silly quick.

I think it will depend on the language. There are a few non-Latin languages where a simple word search likely won't be enough for a regex to apply properly.

Exactly this. Unicode is a big beast to consider in regex concats.

Woah. This is a regex use I've never heard of. I'd absolutely love to see a writeup on this approach: how it's done and when it's useful.

You can literally | together every street address or other string you want to match in a giant disjunction, and then run a DFA/NFA minimization over that to get it down to a reasonable size. Maybe there are some fast regex simplification algorithms as well, but working directly with the finite automata has decades of research and probably can be more fully optimized.

This was many moons ago, written in perl. From memory we used Regexp::Trie - https://metacpan.org/release/DANKOGAI/Regexp-Trie-0.02/view/...

We used it to tokenize search input and combined it with a Solr backend. Worked remarkably well.
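For the curious, here's a minimal Python analogue of what Regexp::Trie does (the function and word list here are my own illustration, not the module's actual code): merge the strings into a character trie, then emit one compact alternation instead of a giant `a|b|c|...` disjunction, so shared prefixes are factored out.

```python
import re

def trie_regex(words):
    """Merge words into a character trie, then emit one compact regex."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[''] = {}  # end-of-word marker

    def emit(node):
        alts, optional = [], False
        for ch, child in sorted(node.items()):
            if ch == '':
                optional = True  # a word ends at this node
            else:
                alts.append(re.escape(ch) + emit(child))
        if not alts:
            return ''
        body = alts[0] if len(alts) == 1 else '(?:' + '|'.join(alts) + ')'
        return '(?:' + body + ')?' if optional else body

    return emit(root)

# Shared prefixes ("Hi", "High ") collapse into one branch each.
pattern = trie_regex(['High Street', 'High Road', 'Hill Lane'])
print(pattern)
```

Compile the result once and it matches any of the input strings; for serious volumes you'd still want the DFA minimization mentioned above, but this alone shrinks the pattern dramatically.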


We're talking about Claude Code. If you're coding and not writing or thinking in English, the agents and people reading that code will have bigger problems than a regexp missing a swear word :).

I talk to it in a language other than English, but I have rules that everything in code and documentation must be in English; only conversation with me uses my native language. Why would that be a problem?

Because 90% of the training data was in English, and therefore the model performs best in that language.

In my experience these models work fine using another language, if it’s a widely spoken one. For example, sometimes I prompt in Spanish, just to practice. It doesn’t seem to affect the quality of code generation.

They literally just have to subtract the vector for the source language and add the vector for the target.

It’s the original use case for LLMs.
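The "subtract one language vector, add another" quip echoes the classic word2vec analogy. A toy sketch of that arithmetic (the vectors below are invented illustrative values, not real embeddings, and real LLMs don't literally work this way):

```python
import math

# Toy 3-d "embeddings"; values invented purely for illustration.
emb = {
    "king":  (0.9, 0.8, 0.1),
    "man":   (0.5, 0.2, 0.1),
    "woman": (0.5, 0.2, 0.9),
    "queen": (0.9, 0.8, 0.9),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def nearest(v, vocab):
    # Vocabulary word whose vector is closest in cosine similarity.
    return max(vocab, key=lambda w: cosine(v, vocab[w]))

# king - man + woman lands on queen in this toy space.
target = tuple(k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"]))
print(nearest(target, emb))  # queen
```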


Thank you. +1. There are obviously differences and things getting lost or slightly misaligned in the latent space, and these do cause degradation in reasoning quality, but the decline is very small in high resource languages.

It’s just a subjective observation.

It just can't be the case, simply because of how ML works. In short, the more diverse, high-quality texts with reasoning-rich examples there were in the training set, the better the model performs in a given language.

So unless the Spanish subset had much more quality-dense examples to make up for the lower volume, there is no way the quality of reasoning in Spanish is on par with English.

I apologise for the rambling explanation; I'm sure someone with ML expertise here can explain it better.


I saw a curious post recently that explored this idea, and showed that it isn’t really the case. The internal layers of the model aren’t really reasoning in English, or in any human language.

Translation in/out of human languages only happens at the edges of the model.

Internal layer activations for the same concept are similar regardless of language, while activations at the top/bottom layers diverge. Meanwhile the pattern is reversed for same language different content.


So we do at least agree on the fact that the quality of the human language <-> embeddings transition depends on how well the target language is represented in the training dataset?

Even if it happens at the edges, on every conversation turn, I would assume that small uncaptured subtleties of meaning can accumulate over time into significant error.


The agent may be translating to English behind the scenes. There are all kinds of hacks layered on top of the underlying models. But however exactly it works, it’s clearly not the case that these things only make use of Spanish training data when you prompt them in Spanish.

Let's take a language with orders of magnitude less training data, like Nahuatl. The following is what happens if I ask Claude Code in Nahuatl to give me an example of a simple program that I can execute inside a web browser (avoiding any direct keyword prompts like 'Javascript' or 'Hello World'). Interestingly, it responds in Spanish. I doubt that it has any relevant training data in Nahuatl. (Apologies if the Nahuatl is wrong – I just used Google Translate.)

    ¿Uelis techmakas se neskayotl tlen se programa tlen amo ouij tlen ueliskia nijchiuas ipan se navegador web?

    -----

    Entiendo tu pregunta en náhuatl. Aquí un ejemplo sencillo — una calculadora en HTML puro que puedes abrir directo en el navegador, sin servidor:

    <!DOCTYPE html>
    <html>
    <head>
        <meta charset="UTF-8">
        <title>Calculadora</title>
        <style>
        body { font-family: sans-serif; display: flex; justify-content: center; padding: 40px; }
        .calc { border: 1px solid #ccc; padding: 20px; border-radius: 8px; width: 200px; }
        input { width: 100%; margin-bottom: 10px; font-size: 1.2em; text-align: right; }
        button { width: 22%; margin: 2px; padding: 8px; font-size: 1em; cursor: pointer; }
        .wide { width: 47%; }
        </style>
    </head>
    <body>
    <div class="calc">
        <input type="text" id="pantalla" readonly>
        <div>
        <button onclick="tecla('7')">7</button>
        <button onclick="tecla('8')">8</button>
        <button onclick="tecla('9')">9</button>
        <button onclick="tecla('/')">÷</button>
    
        <button onclick="tecla('4')">4</button>
        <button onclick="tecla('5')">5</button>
        <button onclick="tecla('6')">6</button>
        <button onclick="tecla('*')">×</button>
    
        <button onclick="tecla('1')">1</button>
        <button onclick="tecla('2')">2</button>
        <button onclick="tecla('3')">3</button>
        <button onclick="tecla('-')">−</button>
    
        <button onclick="tecla('0')">0</button>
        <button onclick="tecla('.')">.</button>
        <button onclick="tecla('+')">+</button>
        <button onclick="limpiar()">C</button>
    
        <button class="wide" onclick="calcular()">=</button>
        <button class="wide" onclick="borrar()">⌫</button>
        </div>
    </div>
    
    <script>
        const pantalla = document.getElementById('pantalla');
    
        function tecla(val) {
        pantalla.value += val;
        }
    
        function calcular() {
        try {
            pantalla.value = eval(pantalla.value);
        } catch {
            pantalla.value = 'Error';
        }
        }
    
        function limpiar() { pantalla.value = ''; }
        function borrar() { pantalla.value = pantalla.value.slice(0, -1); }
    </script>
    </body>
    </html>
    
    Guarda esto como calculadora.html y ábrelo en cualquier navegador — no necesita servidor ni dependencias. Es un buen punto de partida para aprender HTML,
    CSS y JavaScript.

> it’s clearly not the case that these things only make use of Spanish training data when you prompt them in Spanish.

It’s not! And I’ve never said that.

Anyways, I'm not even sure what we are arguing about, as it's 100% a fact that SOTA models perform better in English; the only interesting question is how much better, and whether it's negligible or actually makes a difference in real-world use cases.


It’s negligible as far as I can tell. If the LLM can “speak” the language well then you can prompt it in that language and get more or less the same results as in English.

In my experience agents tend to (counterintuitively) perform better when the business language is not English / does not match the code's language. I'm assuming the increased attention mitigates the higher "cognitive" load.

Claude handles human languages other than English just fine.

They only need to look at one language to get a statistically meaningful picture of common flaws in their model(s) or application.

If they want to drill down to flaws that only affect a particular language, then they could add a regex for that as well/instead.


Did you just complain about bloat in anything using npm?

Everyone will need something customized to their own lifestyle, whether a digital factory kernel or traded workflow recipes, which they then use to write their own clean-roomed AI edition (which of course is just glue to hosted tools you buy).

Needs a "Back in Meow minutes" on the screen when they go off-camera.

I noticed Codex has a sandbox; I'm wondering if it has a comparable config section.

Codex uses and ships with bubblewrap on Linux, and will attempt to use the version installed on the PATH before falling back to the shipped version with a warning message.

You should be able to configure the sandbox using https://developers.openai.com/codex/agent-approvals-security if you are a person who prefers the convenience of codex being able to open the sandbox over an externally enforced sandbox like jai.
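The lookup-then-fallback behaviour described above amounts to a pattern like this (a generic shell sketch; the function name and bundled path are illustrative, not Codex's actual implementation):

```shell
# pick_tool NAME FALLBACK: prefer NAME from $PATH, else warn and use FALLBACK.
pick_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    command -v "$1"
  else
    echo "warning: $1 not found on PATH, using bundled copy" >&2
    echo "$2"
  fi
}

# Hypothetical bundled location, purely for illustration.
pick_tool bwrap /opt/vendor/bwrap
```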


Have you seen what the Unitree G1 can already do? I see the writing on the wall for going onsite and pulling wires.


AI is most excellent at reading and understanding large codebases and, with some guidance docs, can easily reproduce accurate technical documentation. Divide and conquer.


Reading a large codebase... perhaps, if it is not too large. Understanding why a tool exists, what the motivation for its design is, or what the external human-system requirements are for successful use of the internal-facing tools... especially when that knowledge exists only in the memories of a few developers and PMs... not so much. Deep domain expertise is a long way from AI capability for effective replacement.


Again, nobody would spend a dime to create the technical documentation, even if it could be done somewhat faster with AI support. Also, in my experience, AI is not so great at explaining the consequences for business processes when documenting code.


Accuracy/faithfulness to the code as written isn't necessarily what you care about though, it's an understanding of the underlying problem. Just translating code doesn't actually help you do that.

