Because Akka can't magically patch over the JVM's shared memory model: http://do...

pron · on June 22, 2014

> And because the JVM does global stop-the-world garbage collection, which makes soft real-time implausible because of the unpredictability of GC affecting your actors. Erlang has per-process heaps.

Not quite. Every environment needs some shared memory semantics. In Erlang that's done with ETS tables, which don't undergo GC at all. Java's GCs are so good now that, if shared data structures are used sparingly, the JVM would still give you better performance than BEAM. Plus, you have commercial pauseless GCs, as well as hard realtime JVMs.

> Basically the Erlang VM was created for this use case while the JVM was not, and its not something you can just add with a library. edit: Also the lightweightness of Erlang processes compared to Java threads[1] and hot code upgrades.

The JVM is so versatile, though, that all this can actually be added as a library, including true lightweight threads and hot code swapping (see Quasar[1]).

Nevertheless, BEAM was indeed designed with process isolation in mind, and on the JVM, actors might interfere with one another's execution more than on BEAM, but even on BEAM you get the occasional full VM crash. If total process isolation is not your main concern, you might find that Java offers more than Erlang, all things considered.

[1]: https://github.com/puniverse/quasar

yummyfajitas · on June 22, 2014

Java/Scala do allow you to do bad things. So add the following to Hershel's question:

"Assume developers are non-malicious and will only pass immutable objects across actor/future boundaries."

Also, I'm not that familiar with Erlang's memory model, so I might be wrong on this. But as far as I'm aware the memory for a message in Erlang is shared between threads - it's only local variables that use private memory. This means Erlang will also need some sort of concurrent garbage collector - does Erlang's version not stop the world, or at least the messaging subsystem?
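To make the immutability assumption concrete, here is a toy sketch in Python rather than Scala or Java. `MutableOrder`, `FrozenOrder` and the `mailbox` queue are hypothetical names made up for the illustration; this is not Akka's API. It shows that enqueueing a mutable object only passes a reference, so a later mutation by the sender is visible to the receiver, while a frozen message makes the same mistake fail loudly.

```python
import queue
from dataclasses import dataclass, FrozenInstanceError


# Hypothetical message types, invented for this illustration.
@dataclass
class MutableOrder:
    items: list


@dataclass(frozen=True)
class FrozenOrder:
    items: tuple


mailbox = queue.Queue()  # stands in for an actor's mailbox

# Hazard: the sender keeps a reference and mutates it after sending.
order = MutableOrder(items=["a"])
mailbox.put(order)           # only a reference is enqueued
order.items.append("b")      # the receiver's message changes too
received = mailbox.get()
saw_mutation = received.items == ["a", "b"]

# With a frozen message, reassigning a field raises instead of
# silently changing what the receiver sees.
frozen = FrozenOrder(items=("a",))
mailbox.put(frozen)
try:
    frozen.items = ("a", "b")
    rejected = False
except FrozenInstanceError:
    rejected = True
frozen_received = mailbox.get()
```

Note that `frozen=True` is shallow: it only helps here because the field itself holds a tuple rather than a list.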

rdtsc · on June 22, 2014

> But as far as I'm aware the memory for a message in Erlang is shared between threads

Yes and no. Some large binaries (a specific Erlang data type that can, say, represent a packet or a block of data from disk) will be shared and reference counted when passed between processes instead of copied. They are immutable, just like most datatypes in Erlang. These binaries have their own GC algorithm, so it might just take longer sometimes for them to be reclaimed. But it seems all of that could presumably be done via atomic updates to counters and references.

In general, most messages are copied on send, so implementation-wise GC is very simple. On another level, because data in Erlang is immutable, the fact that messages get copied is also an implementation detail! One could conceive of another VM implementation that only passes references to immutable data on message send (well, except when it sends to another machine, of course). But that would make GC a bit more tricky, just like in the case of those binaries.
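A toy sketch of the "copying is an implementation detail" point, in Python rather than Erlang. `CopyingMailbox` and `SharingMailbox` are made-up names for illustration and say nothing about how BEAM actually implements sends. Copying isolates the receiver from later changes to a mutable value, but for an immutable value the receiver cannot tell whether it got a copy or a reference.

```python
import copy


class Mailbox:
    """Minimal FIFO mailbox; subclasses decide how a message is stored."""

    def __init__(self):
        self._messages = []

    def receive(self):
        return self._messages.pop(0)


class CopyingMailbox(Mailbox):
    def send(self, msg):
        # Copy on send: the receiver gets its own private value.
        self._messages.append(copy.deepcopy(msg))


class SharingMailbox(Mailbox):
    def send(self, msg):
        # Pass a reference: only safe when msg is immutable.
        self._messages.append(msg)


copying = CopyingMailbox()
sharing = SharingMailbox()

# Mutable data: the copy protects the receiver from later edits.
draft = ["header", "body"]
copying.send(draft)
draft.append("edited after send")
copied = copying.receive()

# Immutable data: both strategies deliver an equal value.
packet = ("header", "body")
copying.send(packet)
sharing.send(packet)
via_copy = copying.receive()
via_ref = sharing.receive()
```

The cost of the sharing strategy shows up elsewhere: the shared value can no longer be reclaimed by collecting a single process's heap.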

toast0 · on June 22, 2014

The large binary GC is actually pretty simple too: shared binaries are refcounted, and the references live in the process heap. When the references are GCed from the process, the shared binary can be freed. The reason they sometimes take a long time to free is that some types of processes will get references to a large number of binaries but not trigger a process garbage collection, leaving lots of binaries allocated in the shared space. Garbage collection for a process is only automatically triggered when the process heap would grow, so there are some common cases which result in bad behavior: processes that don't generate much garbage on their heap, but do touch a lot of binaries (often this is request routing); processes that grow their heap to some large size doing one kind of work, but then switch to another type that doesn't use much heap space, leaving a long time between GCs; and processes that touch a lot of binaries but then don't do any processing for a long time (maybe a periodic cleanup task).

Another common issue is taking references to a small part of a large shared binary.
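The refcounting behavior described above can be sketched as a toy model. This is Python, not BEAM internals: `SharedSpace`, `Process`, `heap_limit` and the rule "collect only when the heap would exceed its limit" are simplifying assumptions made up for the illustration, and every touched reference is treated as garbage immediately (as in a router that forwards and forgets).

```python
class SharedBinary:
    """A large binary stored outside any process heap."""

    def __init__(self, name, size):
        self.name, self.size, self.refcount = name, size, 0


class SharedSpace:
    def __init__(self):
        self.live = {}  # name -> SharedBinary

    def allocate(self, name, size):
        binary = SharedBinary(name, size)
        self.live[name] = binary
        return binary

    def release(self, binary):
        binary.refcount -= 1
        if binary.refcount == 0:  # last reference gone: free it
            del self.live[binary.name]

    def bytes_in_use(self):
        return sum(b.size for b in self.live.values())


class Process:
    def __init__(self, space, heap_limit):
        self.space = space
        self.heap_limit = heap_limit  # assumed GC trigger threshold
        self.heap_used = 0
        self.refs = []                # dead references still on the heap
        self.gc_runs = 0

    def touch(self, binary, heap_cost=1):
        # GC runs only when the heap would grow past its limit.
        if self.heap_used + heap_cost > self.heap_limit:
            self.collect()
        binary.refcount += 1
        self.refs.append(binary)
        self.heap_used += heap_cost

    def collect(self):
        for binary in self.refs:
            self.space.release(binary)
        self.refs.clear()
        self.heap_used = 0
        self.gc_runs += 1


MB = 1_000_000

# A router with a large heap touches 100 binaries and never collects.
router_space = SharedSpace()
router = Process(router_space, heap_limit=10_000)
for i in range(100):
    router.touch(router_space.allocate(f"req-{i}", MB))
router_before = router_space.bytes_in_use()
router.collect()  # an explicit collection finally frees them
router_after = router_space.bytes_in_use()

# A process with a small heap collects often and holds far less.
worker_space = SharedSpace()
worker = Process(worker_space, heap_limit=10)
for i in range(100):
    worker.touch(worker_space.allocate(f"req-{i}", MB))
```

In this model the router holds all 100 binaries until something forces a collection, while the small-heap process never pins more than its heap limit's worth. In real Erlang the explicit step corresponds loosely to calling `erlang:garbage_collect/0` in the affected process.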

vamega · on June 22, 2014

That's exactly right. Erlang has per-actor heaps, and its garbage collector only stops the actors that are being garbage collected. This is highly concurrent and is a great property to have when you're trying to keep your response times low.

waffle_ss · on June 22, 2014

> But as far as I'm aware the memory for a message in Erlang is shared between threads - it's only local variables that use private memory

No, the messages are truly copied: http://jlouisramblings.blogspot.dk/2013/10/embrace-copying.h...

edit: the exception being large binaries apparently

cwp · on June 22, 2014

Even if developers are non-malicious, they aren't infallible.

saryant · on June 22, 2014

Hopefully, with the upcoming Spores[1] feature in Scala, Akka will be able to enforce message immutability in some form. I was at Scala Days last week and the developer behind Spores gave a great talk on the sort of immutability guarantees this feature will allow. Worth watching once it's posted online.

[1] http://docs.scala-lang.org/sips/pending/spores.html

eeperson · on June 22, 2014

JVMs can have a separate heap for each thread. See the Avian JVM[1] for an example of this.

[1] http://oss.readytalk.com/avian/

rational-future · on June 22, 2014

>> JVM does global stop-the-world garbage collection

Does it? Back when I was in HFT, we were definitely running a JVM with background thread GC.

rdtsc · on June 22, 2014

Only Azul's JVM has managed to create a pause-less garbage collector. They use some pretty cool tricks.

It is really a fantastic piece of technology:

http://www.azulsystems.com/zing/pgc

It's worth marveling at the complexity alone, and at how they got it working.

Otherwise, besides those tricks, how would you do it when you have multiple threads accessing objects on a shared heap?

Erlang's VM is another wonderful piece of engineering. Each little process lives in its own memory heap, so pauseless garbage collection becomes trivial. It has many other really cool and unique features (hot code reloading, inter-node distribution, ability to load C code, etc.).

judk · on June 22, 2014

A few simple ways: put shared data in PermGen, and roll over to a new process when memory gets low (Erlang-style, but at the OS level).

rdtsc · on June 22, 2014

Well, I wouldn't say "rolling over" to a new process is exactly simple, but it is a good trick. Forking has its interesting dark cases that have to be handled: inherited file descriptors, what happens to threads, signals, and so on.

waffle_ss · on June 22, 2014

Were you paying huge money for Azul? http://www.azulsystems.com/zing/pgc

judk · on June 22, 2014

Is ConcurrentMarkAndSweep stop-the-world?

How hard are the limits of "soft" realtime?

syjer · on June 22, 2014

The Metronome GC from IBM is predictable (not hard real-time, though).