
Great idea. Since it's all theoretical currently, I'm wondering how well the compiler offloading will actually perform in practice. Itanium was capable of doing some amazing things, but the compiler tech never quite worked out.


Ah, but the Mill was primarily designed by a compiler writer ;)

Here's Ivan's bio that is tagged on his talks:

"Ivan Godard has designed, implemented or led the teams for 11 compilers for a variety of languages and targets, an operating system, an object-oriented database, and four instruction set architectures. He participated in the revision of Algol68 and is mentioned in its Report, was on the Green team that won the Ada language competition, designed the Mary family of system implementation languages, and was founding editor of the Machine Oriented Languages Bulletin. He is a Member Emeritus of IFIPS Working Group 2.4 (Implementation languages) and was a member of the committee that produced the IEEE and ISO floating-point standard 754-2011."

So actually it's been designed almost compiler-first :)


Still interested in how it works in practice. I'm pretty sure the Itanium team combined with Intel's compiler team have similar credentials.

I'm not saying it can't work, and I'm not saying it won't work, but we know that most code chases pointers. While CPU and compiler design is above my pay grade, I know that fancy CPU-design and compiler tricks that make things twice as fast on some benchmark often lead to only 2 to 3% performance gains on pointer-chasing code.

Not sure how the Mill is going to make my ruby webapp go 8 times as fast by issuing 33 instructions instead of 4.


> Not sure how the Mill is going to make my ruby webapp go 8 times as fast by issuing 33 instructions instead of 4.

8x speed is not being claimed; 10x power/performance is. That could mean that the app runs at the same speed but the CPU uses 10% of the power. A lot of the power saving probably comes from eliminating many parts of modern CPUs, like the out-of-order circuitry.


Ok, so now that it's 10x power/performance I buy 10 of these things and it still only delivers 5% more webpages.

This kind of mealy-mouthed microbenchmark crap is exactly what the industry doesn't need. If I have a bunch of code that is pure in-order mul/div/add/sub, then I put it on a GPU that I already have and it goes gangbusters. The problem is that most code chases pointers.

Like I said, great idea; I'd love to see something that can actually serve webpages 10x as fast or at 1/10th the power (and cost similar to today's systems).


I never thought of serving webpages as being CPU-bound. Anyway, to get a 10x speedup, you would have to buy enough of these to use as much power as whatever you're replacing. So if one Mill CPU uses 2% as much power as a Haswell, then you'd have to buy 50 of them to see a 10x performance improvement over the Haswell.


The speedup for Ruby will come from the Mill enabling faster DBs and services for you to use, and from Ruby VM improvements that are perhaps not Mill-specific.

If you pick Ruby as your platform, though, you are really picking a point on the runtime vs. development-speed tradeoff that suggests you plan to scale sideways rather than upwards anyway; in which case the hosting platform for your app may be interested in the Mill even if its users are ambivalent.

Pointer chasing is a major concern, and the Mill can't magic it away. But there are other parts of your Ruby webapp that are a big deal, such as event loops, continuations and garbage collection, where again the Mill has special sauce. There is also special attention paid to syscall performance on the Mill. Rails makes a staggering number of syscalls per request, while Django, to pick an alternative, makes very few; so I'd still hope Rails moderates its syscalls a bit.


The beauty of the Mill is that it's been designed from the start to make the compiler extremely simple and straightforward. There is no "magic" in the software here, it's all in the hardware.


Actually there's a fair bit of magic in the software as a result of exposing the hardware rather than trying to hide it. Once the software can know how long things will take, suddenly it can do things that in x86 land would be magical.

This seems to me philosophically what Sony was trying to do with the Cell processor. Expose the hardware to programmers so that they can manage things better. The big difference being that the Mill was designed by a compiler writer rather than a bunch of guys who design GPU pipelines.


> Actually there's a fair bit of magic in the software as a result of exposing the hardware rather than trying to hide it. Once the software can know how long things will take, suddenly it can do things that in x86 land would be magical.

Ahhh. When I think of magic, I think of stuff like optimizer heuristics that give incredible performance on very carefully written micro-benchmarks and abysmal performance in the worst case.


Yeah that makes perfect sense. I would probably refer to that as "cheating" rather than "magic" but I totally get the nomenclature mix-up.



