It's 'basically a made-up language'. It's just tongue-in-cheek, because when we started this it was a ridiculous proposition to try to make a DSL.
Boundary (YC W23) | Software engineer (compilers) | Seattle, USA (in person) | Full-time
We are building a new programming language (BAML) to build AI agents -- the "TypeScript" for LLMs. We are open source: https://github.com/BoundaryML/baml
A big part of this language is all the tooling around visualizing non-deterministic code and getting great observability (e.g. our language has type information at runtime, unlike TS).
We are looking for engineers with experience with Rust, programming languages, and/or compilers. Any amount of experience is fine.
To apply: send an email to aaron@boundaryml.com with your resume and mention you came from HN
You may also want to check out BAML https://github.com/BoundaryML/baml - a DSL for prompt templates that are literally treated like functions.
The prompt.yaml format (which this project uses) suffers from the fact that it doesn't address the structured-outputs problem. Writing schemas in YAML/XML is insanely painful. But BAML just feels like writing TypeScript types.
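For illustration, a minimal sketch of what a BAML schema looks like (the field names and the client name here are made up for the example, not taken from any real project):

```baml
// Hypothetical output schema -- fields are illustrative.
// Declaring types reads much like TypeScript: string[], int?, etc.
class Resume {
  name string
  skills string[]
  years_experience int?
}

// An LLM call declared as a typed function; the client name is an assumption.
function ExtractResume(resume_text: string) -> Resume {
  client GPT4
  prompt #"
    Extract the resume fields from: {{ resume_text }}
    {{ ctx.output_format }}
  "#
}
```

The point of the comparison: the schema is a type declaration, not an inline YAML/XML blob embedded in a prompt file.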
- Created by an AWS team, but the AWS logo is barely visible at the bottom.
- Actually cute logo and branding.
- Puts the lead devs front and center (which HN loves). Makes it seem less like a corporation and more like two devs working on their own project, or an actual startup.
- The comment tone of "hey, I've been working on this for a year" also makes it seem as if there weren't ten 6-pagers written to make it happen (maybe there weren't?).
- Flashy landing page.
Props to the team. I wish there were more projects like this branching out of AWS. E.g. Lightsail should've been launched like this.
We're making a prompting DSL (BAML https://github.com/BoundaryML/baml), and what we've found is that all the syntax rules can easily be encoded into a Cursor Rules file, which LLMs follow nicely. DSLs are simple by nature, so there aren't too many rules to define.
Here's the cursor rules file we give folks: gist.github.com/aaronvg/b4f590f59b13dcfd79721239128ec208
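To give a flavor of the approach (this is an illustrative sketch, not the contents of that gist), such a rules file is mostly a short list of syntax reminders:

```text
# BAML syntax rules (illustrative sketch)
- Output schemas are `class` blocks; each field is `name type`, e.g. `name string`.
- Optional fields use `?` (e.g. `int?`); lists use `[]` (e.g. `string[]`).
- LLM calls are declared as `function Name(arg: type) -> ReturnType { ... }`
  with a `client` and a `prompt #"..."#` block inside.
- Never invent fields outside the declared class when writing prompts.
```

Because the whole grammar fits in a handful of bullets like these, the model rarely drifts from valid syntax.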
It's kind of wild -- none of the multimillion-dollar VSCode forks (Cursor, Windsurf) are working properly at the moment. It seems Open VSX is quite a vulnerable single point of failure: searching extensions gives a 503.
It's kind of insane going from 76% to 3% on the new version of a benchmark. We clearly need more rapid progress on the creation of benchmarks.
Then again, I wonder: if a benchmark is way too hard from the beginning, does that make it much harder for people to test new solutions that actually have real-world impact, even if those solutions only raise the score on the hard benchmark by 1%?
I'll add it in!