A MOS6502 assembler implemented as a Rust macro (rust-lang.org)
111 points by mmastrac on Feb 11, 2016 | hide | past | favorite | 55 comments



That's the exact one I thought of. On top of it, there's typed assembler:

https://www.cs.cornell.edu/talc/

And Vx86 used in Microsoft hypervisor:

http://swt.informatik.uni-freiburg.de/staff/maus/dissemain.p...

Note: Microsoft also used a TAL, though I can't recall if it was the same as Cornell's, in their VerveOS, which was verified down to the assembly level. Combined with SLAM, it's safe to say their researchers are kicking serious behind on practical, cutting-edge verification. I wouldn't have believed it 10+ years ago if someone told me, haha.


Very cool. As a Rust newbie I'd like to understand what's going on here. Is this a "procedural" macro? Does Rust specifically target writing DSLs using its macro system? Can we expect macros like this to continue working for a long time? There's a blog post that suggests possible changes: https://internals.rust-lang.org/t/the-future-of-syntax-exten...


This isn't a procedural macro. Procedural macros run Rust code to transform your program, and interact with the guts of the compiler (the AST, etc). That's hopefully going to change so they're no longer compiler-version specific, etc.

This uses a system known as "macro_rules!", which works by pattern-based expansion -- in some ways it's a fancier version of the C preprocessor, but it also resembles the Lisp/Racket macro systems to some degree.

Let's take a random little piece of that code from the post and talk about it:

    macro_rules! codelen {
        () => { 0 };
        ( $($c:expr),+ ) => { [$($c),+].len() };
    }
This declares a new macro substitution rule called `codelen`. Without going into too much depth, that's defining rules (on the left-hand side of the `=>`) and saying what code to expand them into on the other side. The first rule expands `()` into `0`, just a simple substitution. In the second rule, the `$($c:expr),+` is the essential construct here, letting you operate on sequences of input.

It's a little hard to read -- writing your own macros in Rust is a bit complicated, and I don't think it's something that should be done often. You can certainly write little DSLs with it. This variety is going to stick around and will keep working. It's called macro_rules! rather than just macro!, though, to leave that name free for a future, enhanced way of writing macros.


  >  Is this a "procedural" macro? 
This is not a procedural macro, it's a regular-old macro.

  > Does Rust specifically target writing DSLs using its macro system?
I would say that "specifically" is a bit strong, but they are one application of them, yes.

  > Can we expect macros like this to continue working for a long time?
This functionality runs on Rust stable, and therefore should continue working forever.

The post you're referring to is talking about procedural macros/syntax extensions/compiler plugins, which are not stable and will see lots of change.

We also may end up with a new macro system as well, but it will not conflict with the existing macro system if we do.


It doesn't support explicit accumulator addressing mode (ASL A, LSR A, ROL A, ROR A), but otherwise quite neat. (from a quick look at it)

(also, 0x syntax is quite unusual, but I assume $ couldn't have been implemented for some reason)


The accumulator is the only register those instructions work with. It was very common not to specify it at all. I actually never have, in all the 6502 assembly code I've ever written. It wasn't required.


I don't think that's true.

http://www.downloads.reactivemicro.com/Public/Users/David_Cr...

Code is straight from the Red Book listings. Search for ROR - accumulator (shown explicitly), indexed, you name it. 4 modes each.


Well, I have written a lot of code on 6502.

What you name is the value to be operated on, and that can be indexed, zero-page, immediate, etc. The accumulator is not named. There are two or three bytes: one for the instruction, one or two for the value or address.

Same for things like ADC, the accumulator is where it will happen, the things you name are values and addresses.

ASL, LSR can just be stated: one-byte instructions that operate on the accumulator, which does not need to be specified.

6502 is really basic. The accumulator is the working register for many things. The index registers are for indexing. One does not do LDX $c000, A for example.


I believe you. But it was quite typically specified, as you can see in the Woz code above. Or in '6502 Assembly Language Subroutines', 1982 vintage.

http://i.imgur.com/jreIWA5.png


Yeah, it can be. No worries. Guess I never assembled on something that required it.

Fun old times in any case. That chip provided just enough, and no more.


A memory operand is not a register operand. Unless you're one of those who consider the 6502 zero page to be 254 registers (IIRC, you couldn't use addresses 0 and 1, at least on the C64, because they were used for ROM visibility and I/O control).


The C64 used the 6510, not the 6502. The zero page definitely has 256 bytes in it normally!


Ok, 6510. Regardless, I was talking about the C64. You can't safely use addresses 0 or 1, because they're memory-mapped I/O.


This seems to be somewhat common - I've used a few assemblers that make you write ASL A as just ASL (and so on).

Benefit (?): you can now have a label called "A".


Isn't A the only register those ops can use anyways? Although it's been a while, 25 years...


You can't ASL X or Y, but you can ASL $0 or ASL $0,X (using either a zero-page address or an absolute one, optionally indexed by X).


You mean as syntax? Because the opcodes seem to be there.


No ORG, no deal. I can't use this in production unless it supports ORG.


That's a new twist on the Macro Assembler.


Lisp user here.

This reminded me of https://github.com/kingcons/cl-6502 where the opcodes are defined using a defasm macro https://github.com/kingcons/cl-6502/blob/master/src/cpu.lisp... and implemented here https://github.com/kingcons/cl-6502/blob/master/src/opcodes....

Are Lisp macros pretty much on par with Rust macros?


I was under the impression Lisp macros are more powerful/flexible than Rust's macro_rules! macros, although I don't know enough about Lisp macros to be sure.


That ... is ... f*ing ... ugly to look at.


Well, a creative abuse of the language. Programmer humor.


Being a developer myself, I'm well aware of this. Still, it's ugly.


When I see this, I understand why the Go crowd didn't put generics or macros in Go. It's so easy to write useful but awful things with them, and they're really hard to fix and debug.

This is getting to be a big problem with Rust. The language, or maybe its community, encourages cutesy stuff like this. Remember "Diesel", the Rust compiler for SQL, from a few days ago? Just because you can doesn't mean you should. Too much fancy template stuff is being baked into the low level libraries. This may not end well.

C++ went down this road. The template system turned out to be more powerful than intended. Then people started writing cool stuff as fancy templates. Then the C++ committee added features to support the fancy templates better. Overly elaborate templates damaged C++; nobody could figure out what was going on. Use of C++ for new work declined.

All of this comes from smart people with good intentions. But when there's too much of this stuff, the mental load required to keep up with all of it becomes excessive. Especially if it changes a lot.


This is silly. Rust could not achieve its performance and safety goals without macros and generics. Generics may have been optional for Go (I don't agree, but let's go with it), but there is no way to achieve zero-cost memory safety without generics (while supporting first-class references and dynamic allocation).

Rust generics are deliberately less expressive than C++ templates, in that they're strongly typed. That's why we get so many complaints from C++ and D developers that you can't do certain things. But Rust sticks with its strongly-typed generics, because Rust has always tried hard to strike a balance between code expressivity and maintainability.

This is a terrible example if you want to prove that point, anyway. This 6502 assembler is obviously a total abuse of the system, and nobody will deploy it to production. It's a neat hack, and that's all. Nobody would use an obfuscated quine written in Go to argue that Go is unreadable. This is exactly analogous.

Finally, Diesel is just an ORM. It uses exactly enough features to be an ORM; the only "problem" with it is that it's an ORM.


Animats had a very important point. While magic is so attractive and perhaps even useful in moderation, it causes so much damage in a large code base.

In C++, you get this damage from for example operator overload abuse (ok for math, not many other defensible uses), template abuse (ok in moderation), ninja exceptions and inheritance (especially multiple inheritance, ugh).

Anyways, if Rust proves to be a feasible workhorse building medium/large systems with bigger teams, it does sound interesting to me. Especially catching concurrency issues at compile time.

I just hope it won't ever become another C++.


And my point is that nobody would ever deploy this to production. Would you say you're worried about Go becoming C++ if someone posted a quine written in Go to HN?

By the way, Rust has no "template abuse", "ninja exceptions", or "inheritance".


I do think Rust is currently the most likely language to become the C++ killer. And attract significant number of developers from other major commercial languages/platforms as well.

Rust might eventually become the new industry standard. That'd be a clear improvement for sure.

But we'll see what happens. Hopefully there'll be no Rust committees at least.


You end up with it in production because some lower level library uses it.


No Rust production program will ever transitively depend on a library with a 6502 assembler written in the macro language.


You are missing the point.

It is not the ORM library or the 6502 Assembler.

It is the library written by the clever guy on the third floor from blue team, for the new product being developed at Corpo X.

However I also have seen lots of convoluted code, exactly because such expressiveness is missing from the language.

For example, JVM bytecode manipulation or use of external code generators.


> However I also have seen lots of convoluted code, exactly because such expressiveness is missing from the language.

That's the operative point here. Leave out expressivity, and tons of programmers write convoluted code. Leave it in, and a couple of clever people use it in an over-clever way.


> That's why we get so many complaints from C++ and D developers that you can't do certain things. But Rust sticks with its strongly-typed generics, because Rust has always tried hard to strike a balance between code expressivity and maintainability.

In what sense are Rust macros "strongly-typed"? D and C++ templates are also strongly typed, just like the rest of the language.


Rust generics are strongly typed. C++ and D templates, by contrast, are not checked until they are used: you can get errors at template instantiation time in C++ and D, because the compiler makes no attempt to ensure that the types your template operates on actually support the operations you're performing on them at the time you declare the template. In Rust (and basically every other language with generics), the compiler typechecks your generics at the time you write them, so instantiation can never result in errors inside the generic code itself.


Ah, I see. I'm assuming then that Rust generics are not generative templates like C++/D, but closer to Java/C#.


They're like Java/C# syntactically. Implementation-wise, they're monomorphized (code is generated for each specialization at compile time) like C++.


Maybe there should be a compiler option to automatically evaluate the degree of abstraction complexity of code, much like cyclomatic complexity, to prevent abuses?


Not that I agree with your point, but cyclomatic complexity checks are actually implemented as a compiler plugin in the rust-clippy project[1]. You'll get a compile-time warning if your function becomes too complex.

[1]: https://github.com/Manishearth/rust-clippy/wiki#cyclomatic_c...


> It's so easy to write useful but awful things with them, and they're really hard to fix and debug.

And it is even easier to write useful, beautiful things with macros, which are really easy to fix and debug.


Back when I did a 4GL with them, my amateur trick was to write and test the code as normal programs first, then in templated form as a normal program, and only then as macros. On the first two, I can use any technique known to benefit software quality. For programming-in-the-large, interface checks that enforce correct usage are a straightforward solution.

Curious, with your LISP and ML background, what method or methods do you use to ensure your macros are correct and easy to debug?


A couple of very simple tricks:

Macros must be staged - similar to the good practice of not having overly big functions, macros must also be small and must transform code in small steps, with further macros breaking the result down. Ideally, each macro should be a trivial rewrite doing only one small thing.

Another trick is to have a bunch of macros that inject debugging output when enabled. Wrap every macro definition body in such a macro, and with debugging turned on (it can be selective), both the source and the output will be displayed. Yet another macro can be used to inject debugging info into the generated code.

I usually start writing macros the other way around - not from the code I'd like to see generated, but from the final form. Once I find the code with macros passable, I start implementing the macros, in small steps.

The same applies to DSL design in general - first I write the code I want to see, and only then do I fill in the implementation.


I can't remember the exact quote, but:

LISP's FOR macro got so clever that LISP programmers do everything they can to avoid using it


> This is getting to be a big problem with Rust.

I haven't seen any examples of this happening in actual code. Sure, you have lots of toy things but that's just stuff people do for fun.

And as was adequately explained in that thread, ORM is an existing pattern from other languages that people like using. It's got some nifty benefits to it; so while it's not what you'd usually do it's not an overcomplication. Like Patrick said, its only "problem" is that it's an ORM.

> Overly elaborate templates damaged C++; nobody could figure out what was going on. Use of C++ for new work declined.

This happened because C++ templates aren't typechecked, so working with them requires you to understand their internals to some degree ("what kind of type does this expect?"). Rust generics are typesafe. Rust macros aren't, but they're not encouraged as much as generics are, so you don't see many convoluted things except as fun experiments (who doesn't want to write a Brainfuck interpreter in Rust macros?). People understand that macros will give substandard error traces, and mostly only use them for quick and straightforward deduplication of code.


I can partly understand your points. I actually hated "traditions" like Just Another Perl Hacker, because they strongly reinforce the illusion that Perl is an obfuscated language.

My corresponding counterpoint is that you can write obfuscated code in any language, no exception---probably a side effect of Turing-completeness. If people have a general dislike of too much magic, and the magic can be replaced with science^W saner alternatives, there is no problem. The problem arises when you have to rely on magic for a certain class of tasks; there's no doubt that C++ (even the modern one) has this problem, but I think it's too early to claim that Rust has the same symptom.


> This is getting to be a big problem with Rust. The language, or maybe its community, encourages cutesy stuff like this.

> C++ went down this road.

Also a big problem in the Haskell community. Trying to prove all sorts of complicated properties in a simple type system leads to overcomplicated design.

I don't think I'm against static code generation. But it incurs mental overhead. If you do it using a language which wasn't designed for that, much more so.


It's not the intended use case for declarative macros. Procedural macros are designed for this.


If your macros aren't ugly, you're doing it wrong.


Macros should never be ugly. There are multiple ways of using compile-time metaprogramming in a nice and clean way.

Maybe even in Rust, although I have not tried it yet.


If your macro is ugly, but it makes your code pretty, it might mean you’ve successfully sequestered some ugliness, which is probably a good thing.


Of course. But there is no need to write ugly macros. Any macro of any complexity can be written in a nice and clean way.


You know that optical illusion where you can't tell whether you're looking at a cube from the top or the bottom, so your brain keeps flipping back and forth between them? This code is like that for me, but flipping between Perl and C.


The author appears to have a crate for it as well: https://github.com/jonas-schievink/rustasm6502


Link appears to be broken, working one is:

https://play.rust-lang.org/?gist=a18d697454f9261b28ff



