If you can observe GC from content then site behaviour will start to depend on t...

wahern · on May 27, 2020

The key to understanding Lua is that the C API is a first-class citizen, and the Lua authors strive to keep the Lua scripting and Lua C API semantics symmetric (my characterization). As a general rule, anything you can do in Lua script you can do from the C API--closures, coroutines, etc.--with the same logical semantics. (And using the standard C-compliant Lua C API--no C compiler extensions required.) The VM implementation, language semantics, and Lua C API strongly reflect each other and channel language design and VM implementation. Finalizers are critical for a good C API, and whether the finalizer is a lua_CFunction or an in-language function should be and is irrelevant. And a rule that didn't permit leaking would be rather brittle and error-prone from the perspective of a C module author.

Of course, relying on ephemerons merely to detect collection (as opposed to their primary purpose of caching and memoizing data associated with a particular blackbox object instance) is a rather obtuse hack, and not something I've seen in practice for anything other than debugging.
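
To make the distinction concrete, here is a rough sketch in Python rather than Lua. It assumes CPython, and `Handle`, `describe` and `_metadata` are made-up names. Note that Python's `weakref.WeakKeyDictionary` is not a true ephemeron table (a value that strongly references its key keeps the entry alive), so treat it only as an approximation of Lua's weak-keyed tables. The first half shows the intended use, caching data per object instance; the second half shows the "hack" of using the same machinery just to notice that an object was collected.

```python
import gc
import weakref


class Handle:
    """Stand-in for an opaque object we want to attach data to (hypothetical)."""


# Weak-keyed table: an entry disappears once its key becomes unreachable.
_metadata = weakref.WeakKeyDictionary()


def describe(handle):
    """Memoize a computed description per handle instance."""
    if handle not in _metadata:
        _metadata[handle] = {"id": id(handle), "label": "handle"}
    return _metadata[handle]


h = Handle()
first = describe(h)
assert describe(h) is first  # second call hits the per-instance cache
entries_while_alive = len(_metadata)

# The "hack": register a callback whose only job is to observe collection.
collected = []
weakref.finalize(h, collected.append, "h was collected")

del h, first
gc.collect()  # CPython normally frees on refcount drop; collect() covers cycles
entries_after = len(_metadata)
```

When exactly the entry vanishes and the callback fires depends on the collector, which is the kind of observable GC behaviour the thread is arguing about.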

I don't agree that JavaScript's semantics are what have permitted JavaScript optimizing compilers to work so well. Details like the above aren't necessarily difficult to implement efficiently; they're difficult, if not impossible, to implement if the implementation architecture didn't contemplate their existence, or if the architecture was predicated on their non-existence. For a while LuaJIT outperformed mature JavaScript engines despite having to deal with more complex semantics and more abstract constructs (e.g. stackful coroutines that were independent of the C stack, fidelity to the Lua C API, goto, variable return lists, guaranteed TCO, etc.), as well as all the same dynamic typing headaches as JavaScript and Python. And it did so with an appreciably smaller VM. And I have no doubt it could do it again with the benefit of another 10 years of knowledge and experience.

What Lua has going in its favor is 1) the authors are only weakly committed to backwards compatibility--there are always breakages with each new version, usually small but they add up over time; and 2) a strong commitment to a symmetric C API, which means the authors have to think long and hard about language constructs and architectural details. #1 provides them the liberty to experiment with better semantics, including discarding constructs or behaviors that don't work out. #2 is a constraint that prevents them from taking certain shortcuts, such as intermingling the "C stack" with the logical language stack. The reason JavaScript only supports explicit "async" methods rather than a more seamless control flow construct is that all implementations (except Duktape?) mingle C stack semantics with the in-language stack, including especially the JIT compiler components. This chumminess isn't necessary--LuaJIT isn't particularly handicapped by avoiding it--but it's not something you can fix after the fact; it'd require starting an implementation from scratch, and that's probably not ever going to happen again, at least not merely for the purposes of providing better control flow semantics. VM authors tend not to appreciate how architectural details limit the range and scope of future language constructs. The Lua C API gives the Lua authors fewer degrees of freedom, which paradoxically results, I think, in them avoiding implementation dead-ends.
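
As a rough illustration of what "explicit" suspension costs, here is a sketch using Python generators, which are stackless in the same sense as JavaScript's async functions and generators. All names (`read_value`, `task`, `run`) are hypothetical. Every frame between the scheduler and the point of suspension has to opt in, here with `yield from`; a stackful coroutine, as in Lua, can instead yield from inside an ordinary nested call.

```python
def read_value(source):
    # This helper can suspend only because it is itself a generator.
    value = yield ("need", source)
    return value * 2


def task():
    # Each intermediate frame must forward the suspension explicitly.
    total = yield from read_value("a")
    total += yield from read_value("b")
    return total


def run(gen, answers):
    """Tiny driver: answer each request and resume the suspended generator."""
    try:
        request = next(gen)
        while True:
            request = gen.send(answers[request[1]])
    except StopIteration as stop:
        return stop.value


result = run(task(), {"a": 1, "b": 2})
```

If `read_value` were a plain function, it would have no way to suspend `task`; that is the constraint the comment attributes to mixing the implementation's C stack with the language-level stack.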

I do agree that it's problematic to rely on certain aspects of a GC, like absolute timing in the case of mark & sweep style GC. But other aspects, like finalizers and object revival, are deliberate guarantees focused on language semantics. Another guarantee that Lua provides is that objects are destroyed in reverse order of creation, which is a useful guarantee from the perspective of a C module writer. Again, because of the nature of the Lua C API and the need to provide more than just a blackbox GC, Lua is far more deliberate about which aspects of the GC can be relied upon (and how), and which aren't their problem. Other languages avoid this--the very thought of providing any GC-specific language features is anathema--but in time invariably find themselves providing ad hoc guarantees and interfaces, or locked into accidental semantics.
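
For readers unfamiliar with finalizers and object revival, here is a minimal sketch in Python rather than Lua, assuming CPython. The `Resource` class and the `graveyard` list are made up for illustration; Lua's `__gc` metamethod plays the role of `__del__` here. The finalizer runs when the last reference goes away, and by storing a fresh reference to the object it revives it.

```python
import gc

graveyard = []  # hypothetical global that finalizers use to revive objects
log = []


class Resource:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        log.append(f"finalizing {self.name}")
        # Revival: a new strong reference keeps the object alive after all.
        graveyard.append(self)


r = Resource("conn")
del r          # last reference dropped, so CPython runs __del__ right away
gc.collect()   # included for implementations that finalize lazily

revived = graveyard[0]
```

Whether a revived object is finalized a second time is implementation-dependent in Python. That is the kind of detail the comment says Lua chooses to specify deliberately rather than leave to accident.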

klodolph · on May 27, 2020

Apps already depend on the specifics of individual GC implementations. At least, that’s my experience working with real-time apps.

chrisseaton · on May 27, 2020

How are you possibly writing any kind of real-time app in JavaScript with a GC?

jfkebwjsbx · on May 27, 2020

Real-time and general-purpose GCs do not sound like a good idea, except for very specialized GCs...

klodolph · on May 27, 2020

Talking about soft real-time here… do you think multimedia apps should be native-only? That it’s a bad idea to put multimedia apps on the web?

jfkebwjsbx · on May 27, 2020

The web does not imply a GC.

Most multimedia on the web is hardware-accelerated, SIMD-accelerated via plugins, or uses Wasm.

Any other way would be a huge waste of resources or simply impossible to do for high resolutions.

a1369209993 · on May 28, 2020

> That it’s a bad idea to put multimedia apps on the web?

Yes? Obviously? The only reason Flash ever worked for anything useful (-ish depending on your opinion of video games) was that it was in practice isolated from the rest of the web browser. JavaScript... isn't.

tinus_hn · on May 27, 2020

Nor real-time and JavaScript in a web app.

klodolph · on May 27, 2020

You might be forgetting that multimedia apps are often soft real-time.

gridlockd · on May 27, 2020

I do not buy into this argument. Finalizers are generally non-deterministic, regardless of implementation. Sure, sometimes that causes issues, but that is not a good argument against having them.

wruza · on May 27, 2020

But then, how did LuaJIT manage to do the same thing with all these semantics in place?

wahern · on May 27, 2020

I don't think the problem is the semantics, per se. It's knowing or anticipating the semantics before you ever write the first line of VM code. LuaJIT was written to provide near-perfect semantic and C API fidelity to PUC Lua 5.1; semantics that were and still are quite sophisticated relative to popular languages. (Lua is a well-disguised functional language, notwithstanding that dynamic typing is no longer en vogue.)

Today's modern JavaScript engines were written with the primary purpose of making then-existing JavaScript constructs fast, and those constructs were few and mostly simple. (Exception: JavaScript was an early adopter of lexical closures, notwithstanding its block scoping quirk. Prototypes are conceptually simple but, like closures, rather complex from the perspective of the VM and especially a JIT engine.) They took liberties and shortcuts--all useful, but few without similarly performant alternatives--that had the effect of constraining their ability to implement newer constructs. Because nobody would demand that the major browser implementations introduce drastic architectural changes, newer language constructs were and continue to be tailored to fit the design constraints of existing implementations.

Lua has accumulated so many sophisticated constructs because it's a tiny VM that is substantially rewritten with each new major version (5.1 -> 5.2 is a major version bump). And Lua isn't beholden to a strong backward compatibility guarantee; they can discard things that don't work, keeping the VM relatively simple and small and thus easier to refactor. It's notable that while LuaJIT is much faster than PUC Lua, LuaJIT is stuck at 5.1 + some 5.2 extensions. Most newer constructs and semantics added to PUC Lua since 5.1 are relatively difficult to add back into LuaJIT.

wruza · on May 28, 2020

These changes are not really important though, in my opinion. Lua authors do research first, not engineering first (still good and oldschool, as seen from sources). LuaJIT and 5.1 found themselves embedded in many environments, because it was one of the most successful variants. And later additions in 5.2/5.3 which are incompatible with LJ/5.1 were disputed on the mailing list. The biggest incompatible and LJ-unimplemented changes were _ENV and integers. _ENV was a cute alternative, but not a huge revolution. Integers... well, I hope Roberto and Luiz know what they’re after with that.

In the context of JS vs. LJ, LJ is a single variant of an always-compatible language, which I think makes it more similar to JS in this regard than to the Lua 5.x series.

Sophistifunk · on May 27, 2020

It's bollocks. JavaScript is the only GC system of any widespread use that doesn't allow programmers weakrefs. JVM languages all have it, .NET has it, Smalltalk, various collectors for C++, Cocoa (ARC as well as the briefly-viable GC), even ActionScript had it. Very few universes were torn asunder.