Stackless is interesting, but I found it to be a little unwieldy when I wrote a volume visualization test. When each task needs access to the entire volume of data, Stackless gets REALLY slow. I didn't look through the Stackless internals the way I looked into Erlang, but the slowdown was undeniable.
Again, for massive datasets, accessed by many cores over a massive number of threads, it needs a little more development.
TEST ALGORITHM:
Standard volume ray casting. Each pixel is processed separately: a ray is cast through the volume, with alpha-based early ray termination.
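For anyone who wants to reproduce the test, here is a rough sketch of that algorithm in plain Python. It is not the original test code. The names `cast_ray` and `render`, the `(color, alpha)` sample layout, the axis-aligned rays, and the 0.95 cutoff are all assumptions made for illustration. Note that every ray reads from the same shared `volume`, which is the access pattern described above.

```python
def cast_ray(volume, x, y, alpha_cutoff=0.95):
    """Composite samples front to back along z for pixel (x, y).

    Assumed layout: volume[z][y][x] is a (color, alpha) pair in [0, 1].
    Returns (accumulated_color, accumulated_alpha, samples_visited).
    """
    color = 0.0
    alpha = 0.0
    visited = 0
    for z_slice in volume:
        sample_color, sample_alpha = z_slice[y][x]
        # Only the light not yet absorbed can be contributed by this sample.
        weight = (1.0 - alpha) * sample_alpha
        color += weight * sample_color
        alpha += weight
        visited += 1
        # Early ray termination: stop once the ray is nearly opaque.
        if alpha >= alpha_cutoff:
            break
    return color, alpha, visited


def render(volume, alpha_cutoff=0.95):
    """Process each pixel separately; every ray reads the shared volume."""
    height = len(volume[0])
    width = len(volume[0][0])
    return [
        [cast_ray(volume, x, y, alpha_cutoff)[0] for x in range(width)]
        for y in range(height)
    ]
```

Each call to `cast_ray` is independent of the others, so in a tasklet or thread version each pixel (or row of pixels) could become its own task, with `volume` as the one piece of data they all share.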
That's an excellent test algorithm because of the shared memory issues. It might be even better if the rays affected the environment -- say, laser beams. (For some reason I have a picture of sharks with laser beams.)
Ray tracing is a nice problem domain -- complicated enough to be interesting, but not too complicated.