The future of programming languages in a massively concurrent world
31 points by NickSmith on Nov 4, 2007 | 32 comments
If, as it appears, whatever-it-is-that-replaces-Moore's-law states that the number of processor cores will double every 18 months from now on, then in 6 years' time 32-core machines will be commonplace, and the hacker's weapon of choice a gleaming new 64-core MacBook Pro.

It seems to me that in such a world any language that by default addresses 3% or less of the processing capacity will quickly lose popularity, and those that embrace concurrency at a fundamental level (not via yet another library) will become more and more relevant. IMO, Joe Armstrong talks a lot of sense on this issue in this (previously submitted) video: http://channel9.msdn.com/ShowPost.aspx?PostID=351659

My motivation for posting this is simply that I would love for Arc to succeed in its objectives. But to be a '100 year language' I imagine it would have to first thrive in the next 10 years; and to do that it must be seen as a great language for tomorrow's world, not today's. From what I've seen so far of Arc I get nothing but good vibes, and it would be a shame for it to be sidelined in the multi-core rush just around the corner.

My apologies if this has been discussed before. I am new here, couldn't figure out how to search past articles and Google just returns this home page.



I really hope that if the number of cores increases, it is done in a way that hides them from the programmer. For example, a 32-core CPU that appears as a very, very fast single core. That's where the concurrency/'threading' issues should live, not in everyone's code.

Threads are usually the problem, IMHO, not the solution.

I don't agree with the suggestion that javascript will need threads either. Javascript works extremely well in a single thread. There isn't really much of a need for threads. Having multiple cores doesn't change that, it just means you might need some abstraction layer like I described above, that utilizes all cores, whilst appearing as a single core.


You're talking about automatic parallelization. There's been quite a bit of research done in this area, but the results aren't that great. Some languages (particularly functional ones) are better suited to automatic parallelization than others, and it's pretty doubtful that you'll see a good automatically parallelizing compiler anytime soon, especially for an imperative interpreted language like Javascript.

See http://en.wikipedia.org/wiki/Automatic_parallelization for more info.


A common misconception is that Moore's "law" states that processor speed will double every 18 months or so. That's incorrect. In fact, Moore said that the NUMBER of transistors on a processor would double roughly every 18 months.

It turns out that they are correlated, since smaller transistors mean both faster transistors and more transistors per unit area.

My point is that more processor cores require more transistors, and thus the end of Moore's Law also means the end of growth in processor cores (ignoring things like architectural advances, or more but less powerful cores).

That said, I do agree that we will see an increasing number of cores, at least for awhile.

First of all, it's important to have an OS that efficiently manages the cores and the applications that run on them. This automatically benefits everyone who runs multiple applications on a multi-core machine, since each application gets a larger slice of time and fewer context switches.

Multiple cores could also eliminate dedicated components like GPUs which would bring down the cost of low end machines.

As far as programming languages go, I hear Erlang is good for concurrency, though I've never used it.


I think it's a race:

In one corner are languages like Erlang that have been designed for concurrency.

In another corner are languages with massive user bases, that don't do concurrency very well (Java for example), that will have to undergo modifications to work better.

In another corner, perhaps, are languages that are just now being created. They're the outsiders, but have more agility in their design because they don't have huge user bases.


I think 'race' is one way of looking at it. Another way to look at it is that there will be more than one 'winner'. Just like today, I think in the world of tomorrow people will use more than one programming language. So all of the languages that expect to be around tomorrow will need to evolve into languages for controlling concurrent systems on massively parallel architectures.

You are right, though, that some languages are all but given as winners.

Erlang is a VERY impressive language from a performance standpoint. That said, it really needs to be used in conjunction with some front end UI language to present the results of all of its processing. Make no mistake about it, if one has so much data that one will need say 32 cores to process it, then a clever presentation layer is a must. Another drawback here is that, for those Erlang programmers who are something other than elite, Erlang will handle all of the locking and synchronization on its own. I think experienced senior programmers see where that one is going. That said, the way Erlang implements concurrency is cool. I like simple!

In my opinion, Java is another clear winner here. From Wall Street systems with US$900,000,000,000 a day in transactions flying through them, to medical imaging systems rooting out cancerous pathologies, to petroleum exploration systems trying to find the precise limits of all of that new Cuban oil, Java is at the center. When it comes to handling massive datasets in a maintainable fashion, Java is second only to C and C++. Some would even say that C or C++ code is less maintainable than Java. I would say they should hire new C programmers.

Lastly, C, C++ and Assembly will always be around because there will always be someone (Carmack) who wants to outshine everyone else. And we can all agree that nothing flies like an assembly routine. As a bonus, we get exacting control over concurrency, synchronization and locking. It may be difficult to believe, but this option is attractive to a certain class of hacker.

Concurrency in today's web darlings, Ruby, PHP, and Python, will be challenging. I think someone out there will simply come up with a new language. Or the web guys will switch to Erlang over time. They will run into a data size problem, though, due to the way Erlang implements concurrency at a low level. It will be interesting to watch them solve that.

On the front end, it's easy... Microsoft wins. Anyone who understands concurrency in an intimate fashion knows the challenges of getting a scripting language like JavaScript to support it in a satisfactory fashion. Do you lock and synchronize for the developer? Do you let him/her? Do you copy the heap and send messages? What about client-side memory in a tabbed browser? The questions go on and on. Java may have somewhat of a chance here, depending how things go, but basically Microsoft will continue to have the majority lock.


Concurrency in web languages is a non-issue, because each request is independent (or should be, if you have a proper shared-nothing architecture). You simply run multiple processes and give each process a full core. Most FastCGI/SCGI webserver modules have functionality built-in to multiplex among backend processes.
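The shared-nothing idea above can be sketched in a few lines. This is a hypothetical illustration (the handler and its workload are made up, not any particular framework's API): each "request" is a pure function of its input, so a pool of worker processes, one per core, needs no shared state and no locks.

```python
# Shared-nothing request handling sketch: requests are independent, so we
# multiplex them across a pool of worker processes (roughly one per core).
# handle_request is a made-up stand-in for real per-request logic.
from multiprocessing import Pool, cpu_count

def handle_request(req_id):
    # Purely a function of its input: nothing shared, nothing to lock.
    body = sum(i * i for i in range(1000))
    return req_id, body

if __name__ == "__main__":
    with Pool(processes=cpu_count()) as pool:
        # The pool plays the role of the FastCGI-style multiplexer.
        responses = pool.map(handle_request, range(8))
    print(len(responses))
```

The point is architectural rather than clever: because no request touches another request's data, adding cores just means adding processes.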

Concurrency in the database is more interesting, particularly since that's where the bottleneck is in many web apps. But that's a C/C++ problem, and many DBMS vendors have already invested a large amount of effort into solving it.


Well, there is actually a latency gain to be had from parallelizing individual requests, rather than just running several requests in parallel.

Think about this example: You have 10 printers each capable of printing 10 pages per minute. Then 10 jobs are submitted each with 10 pages. If you run those jobs in parallel, all of them will finish after 60 seconds. If you parallelize each job and print page 1 on printer 1, page 2 on printer 2 etc., then the first job will finish in 6 seconds, the second in 12, and the last one in 60 seconds. The average latency is then (6 s + 12 s + ... + 60 s) / 10 = 33 s.

Your throughput will be the same, except for a bit of parallelization overhead.
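The arithmetic in the printer example can be checked with a quick sketch (printer counts and page rates exactly as given above):

```python
# Latency comparison: 10 jobs of 10 pages each, on 10 printers
# that each print 10 pages per minute (6 seconds per page).
PRINTERS = 10
JOBS = 10
PAGES_PER_JOB = 10
SECONDS_PER_PAGE = 60 / 10  # 6 s

# Coarse-grained: one whole job per printer; every job takes 60 s.
coarse = [PAGES_PER_JOB * SECONDS_PER_PAGE for _ in range(JOBS)]

# Fine-grained: each job's pages spread across all printers, jobs run
# in sequence; job k finishes after (k+1) "waves" of 6 s each.
fine = [(k + 1) * PAGES_PER_JOB * SECONDS_PER_PAGE / PRINTERS
        for k in range(JOBS)]

print(sum(coarse) / JOBS)  # average latency, coarse: 60.0 s
print(sum(fine) / JOBS)    # average latency, fine: 33.0 s
```

Same total work either way, so throughput is unchanged; only the average time-to-completion drops.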


Usually CPU latency for a web request is very tiny compared to network/DB/IO latency, though.


Sure, but I/O can be parallelized too. The argument is not specific to CPU at all.


Concurrency in today's web darlings, Ruby, PHP, and Python will be challenging.

Heard of Stackless Python? [http://www.stackless.com/]. Eve Online is the highest-profile app using it that I know of.

Introduction to Concurrent Programming with Stackless Python [http://members.verizon.net/olsongt/stackless/why_stackless.h...]


Stackless is interesting, but I found it to be a little unwieldy when I wrote a volume visualization test. When each task needs access to the entire volume of data, Stackless gets REALLY slow. I didn't look through the Stackless internals the way I looked into Erlang, but the slowdown was undeniable.

Again, for massive datasets, accessed by many cores over a massive number of threads, it needs a little more development.

TEST ALGORITHM:

Standard volume ray casting. Each pixel processed separately. Rays cast through the volume with alpha-based early ray termination.
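For readers unfamiliar with the test algorithm: here is a rough sketch of front-to-back compositing with alpha-based early ray termination. This is a hypothetical stand-in for the poster's actual code; the sample values and the 0.99 threshold are made up.

```python
# One ray of a volume ray caster: composite samples front to back and
# stop early once the accumulated alpha is effectively opaque.
def cast_ray(samples, termination_alpha=0.99):
    """samples: (color, alpha) pairs along the ray, front to back."""
    color, alpha = 0.0, 0.0
    for sample_color, sample_alpha in samples:
        # Front-to-back "over" compositing: the new sample contributes
        # only through whatever transparency remains.
        color += (1.0 - alpha) * sample_alpha * sample_color
        alpha += (1.0 - alpha) * sample_alpha
        if alpha >= termination_alpha:  # early ray termination
            break
    return color, alpha

# A fully opaque first sample terminates the ray immediately.
print(cast_ray([(1.0, 1.0), (0.5, 0.5)]))  # -> (1.0, 1.0)
```

This is also why it stresses shared memory: every ray (pixel) reads from the same large volume, so each task needs access to the whole dataset.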


That's an excellent test algorithm because of the shared memory issues. Might be better if the rays affected the environment -- say laser beams. (For some reason I have a picture of sharks with laser beams)

Ray-tracing is a nice, simple problem domain -- enough to be complicated, but not too complicated.


Isn't stackless not truly concurrent since it's still beholden to the Python GIL?

Or is that not the case?


Stackless is an example of the communicating sequential process model, not concurrency. You avoid all the issues around locking etc because switching between processes (tasklets) is explicit - it's cooperative multitasking not preemptive. It's a very nice language but doesn't help at all with running on multiple cores. Once you start having multiple processes running simultaneously and shared mutable state you have to start worrying about things like locks. Erlang nicely sidesteps the problem by not having mutable variables.
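The cooperative-switching point above can be illustrated without Stackless itself. The sketch below approximates tasklets with plain generators (Stackless's real tasklet/channel API is different; this is just an analogy): because control only transfers at explicit yield points, the shared log needs no lock.

```python
# Cooperative (non-preemptive) "tasklets" approximated with generators.
def tasklet(name, steps, log):
    for i in range(steps):
        log.append(f"{name}:{i}")  # safe unlocked: only one tasklet runs
        yield                      # explicit, cooperative switch point

def run(tasklets):
    # Trivial round-robin scheduler: switches ONLY where a tasklet yields.
    while tasklets:
        t = tasklets.pop(0)
        try:
            next(t)
            tasklets.append(t)
        except StopIteration:
            pass

log = []
run([tasklet("a", 2, log), tasklet("b", 2, log)])
print(log)  # -> ['a:0', 'b:0', 'a:1', 'b:1']
```

The interleaving is fully deterministic, which is exactly why this model doesn't, by itself, spread work across multiple cores.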


I don't think Erlang is that great for the scenarios that started this thread - it is designed for thousands of processes, not dozens. And by all reports its straight-line performance is much, much slower than anything else out there.

There are, of course, functional approaches like Haskell (but it will take longer to become mainstream).


Erlang is functional, but in any case, you're right about its performance - it's more or less that of a fast interpreted language, rather than a compiled language.


I never thought of it as functional - looked more like little islands of imperative programming to me (I guess I will have to check again).

I think fast interpreted is probably even overstating it, it makes all sorts of compromises. Erlang was also designed for very high availability - so that must cost something as well.

I guess we will see what happens, but I am very skeptical about erlang (not sure why, just a gut feel).


Erlang is most definitely functional, although it's not as pure as something like Haskell.

In terms of speed, this is something to look at:

http://shootout.alioth.debian.org/debian/benchmark.php?test=...

It's pretty fast for most things if you use HiPE.

Your skepticism about Erlang is justified in terms of "next big language":

http://journal.dedasys.com/articles/2007/10/09/languages-wor...

However, if you use it for what it was created for, it is very nice - it's the best thing out there.


Occam.

Specifically, http://transterpreter.org

Yes, the language it runs (Occam) is 20 years old. But the language was designed for programs running on dozens to thousands of nodes, and in the transterpreter implementation, there's the possibility of doing this on heterogeneous hardware, where the fast nodes do things like splitting and merging the data set, and the smaller "grunt compute" nodes do the actual work.

Parallel programming is hard, but that's inherent hardness. You can't get around things like memory bandwidth and latency at a programming language level, no matter how much you try. You can only get away from those things by dealing with the fact you have thousands of machines, or tens of thousands.

It's only going to get worse from here on in, as "faster" comes to mean more processors, not higher clock rates. You'll see this: 2 core! 3 core! 4 core! 8 core! and pretty soon (within 10 years) we'll see 64 and 128 core desktop machines, maybe even a revival of unusual architectures like wafer-scale integration with 3D optical interconnects (i.e. upward-pointing tiny lasers and photocells fabricated on the chip) to handle getting data on and off the processors.

We've seen unambiguously that GIGANTIC data sets have their own value. Google's optimization of their algorithms clearly uses enormous amounts of observed user behavior. Translation efforts with terabyte source corpora. Image integration algorithms like that thing Microsoft was demonstrating recently... gigantic data sets have power because statistics draw relationships out of the real world, rather than having programmers guess what the relationships are.

I strongly suspect that 20 years from now, there are going to be three kinds of application programming:

1> Interface programming

2> Desktop programming (in the sense of programming things which operate on your personal objects - these things are like pens and paper and you have your own.)

3> Infrastructure programming - supercomputer cluster programming (Amazon and Google are supercomputer applications companies) - which will provide yer basic services.

One of the concepts I'm pitching to the military right now is using the massive data sets they have from satellite sources to provide "precision agriculture" support for the developing world. Precision Agriculture in America is tractors with GPS units that vary their fertilizer and pesticide distribution on a meter-by-meter basis (robotic valves consult the dataset as you drive around the land.)

In a developing world context, your farmers get the GPS coordinates for their land tied to their cell phone numbers either by an aid worker, or by their own cell phone company.

Then the USG runs code over their sat data, and comes up with farming recommendations for that plot of land. If the plots are small enough (and they often are) the entire plot is a single precision agriculture cell.

But if you think about the size of the datasets - we're talking about doing this for maybe 20 - 30% of the planet's landmass - and the software to interpret the images is non-trivial and only going to get more complex as modeling of crops and farming practices improves...

Real applications - change the world applications - need parallel supercomputer programming. Occam was right in the same way that Lisp is right, but for a different class of problems. That's because Occam is CSP (communicating sequential processes) and those are a Good Thing. There may need to be refinements to handle the fact that we have much faster nodes, but much slower networks, than Occam was originally designed for - but that may also turn out to be a non-issue.

I'm also working on similar stuff around expert systems for primary health care - medical expert systems are already pretty well understood - so the notion is to develop an integrated set of medical practices (these 24 drugs which don't require refrigeration, don't produce overdose easily, and are less than $10 per course) with an expert system which can be accessed both by patients themselves to figure out if their symptoms are problematic or not, and by slightly trained health care workers who would use the systems to figure out what to prescribe from their standard pharmacopoeia.

It's not much, but for the poorest two or three billion, this could be the only health care service they ever see. None of the problems are particularly intractable, but you better bet there's a VAST - and I mean VAST - distributed call center application at the core of this.

Of course, the Right Way to do this is FOLDING@HOME or SETI - we've already proven that public interest supercomputing on a heterogeneous distributed network works.

Now we just need to turn it to something directly lifesaving, rather than indirectly important for broader reasons.

Remember that the richest 50% of the human race have cell phones already, and rumor has it (i.e. I read it on the internet) that phone numbers and internet users in Africa have doubled every year for the past seven years. 10 years from now the network is going to be ubiquitous, even among many of the very, very poorest.

We get a do-over here in our relationship with the developing world. We can't fix farm subsidies, but we can ensure that when they plug into the network for the first time, there is something useful there.


Hi Hexayurt, we're looking at a couple of different healthcare IT problems to solve, and creating expert systems for specific medical domains is one of the areas we are currently investigating.

Can I pick your brain a bit? Send me an e-mail, or let me know what your e-mail is.


Occam is yet another thing with its roots in Hoare's CSP. See my posts elsewhere on this story.


I'm surprised the decent solution to this isn't more widely known. People have mentioned Occam and Stackless Python; both interesting. But their ancestor is Hoare's CSP, and other descendants have included Squeak (not the Smalltalk relation), Newsqueak, Plan 9's Alef, Inferno's Limbo, and now libthread.

Channels with co-operating threads are easy to reason about. See Russ Cox's overview page http://swtch.com/~rsc/thread/ for more.
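As a rough illustration of the channel style (not the actual libthread or Newsqueak API - Python's stdlib has no channel type, so `queue.Queue` stands in for one): the two threads below never share mutable state directly; they only communicate by sending values over the channel, which is what makes the program easy to reason about.

```python
# CSP-style communication between two threads, with queue.Queue playing
# the role of a channel. None is used as a made-up close sentinel.
import threading
import queue

def producer(ch):
    for i in range(3):
        ch.put(i)          # send on the channel
    ch.put(None)           # signal that the channel is "closed"

def consumer(ch, results):
    while True:
        v = ch.get()       # receive; blocks until a value arrives
        if v is None:
            break
        results.append(v * 2)

ch = queue.Queue(maxsize=1)    # tiny buffer keeps the two sides in step
results = []
t1 = threading.Thread(target=producer, args=(ch,))
t2 = threading.Thread(target=consumer, args=(ch, results))
t1.start(); t2.start()
t1.join(); t2.join()
print(results)  # -> [0, 2, 4]
```

All synchronization lives in the channel itself; neither thread takes a lock explicitly.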


Stackless python does absolutely nothing to help with scaling applications to multiple cores. It allows you to write asynchronous applications to better utilize a single processor for operations that depend heavily on IO (or otherwise waiting for some resource).


I wasn't pushing Stackless, just saying that others have mentioned it and its ancestry has something in common with what I'm trying to sell: channels as a synchronisation method. See the references I gave for more details.


The server side is already prepared for this. There's nothing much to do. All the big web languages run as multiple processes (or threads) and so do all the big databases. I think we'll see a lot more server consolidation, which we're already seeing with the 4-core and 8-core machines of today.


Check out Cilk: http://en.wikipedia.org/wiki/Cilk . I'm not terribly familiar with it, but it extends C and adds a few keywords/abstractions to work with concurrency. It's been spun out of its original project at MIT into a startup as well (http://cilk.com/ ).


My bet is that existing object-oriented programming languages will become more "functional".

i.e.: Everything is an object... or a value!

http://virteal.com/ObjectVersusValue


There is some chicken-and-egg effect: multicore processors sell well only if there are applications/languages that exploit them, and vice versa. As far as I can see, OpenMP has some potential, as do Erlang and others.


No, manufacturers have their own reasons to sell multi-cores, namely that that's the only way they can sell them as being 'faster'. Intel is going to stop selling single-core processors not too long from now, for example.


The way things are headed, we'll soon be running most of our applications in the browser. Someone needs to come up with a multi-threaded JavaScript. Yes, I know, it's not going to be pretty.


Unless somebody comes up with an app that's doing massive amounts of data processing in JS, I think giving each tab/window its own thread in the browser will be an adequate solution for the foreseeable future.


Google Gears is multi-threaded.



