Hacker News

I don’t really have any insights to provide on “big companies are using sponsorships to whitewash their impact on society”, but something I’ve seen is that presentations and research are often done in the context of the company’s unique circumstances and are not necessarily broadly applicable. It’s only natural, of course, that a company will optimize its research toward things that are relevant to it, but when you’re listening you need to keep “what is the context this was written in?” in the back of your mind at all times.

For example, if you ever attend a C++ talk by Googlers, you’ll notice that they basically only talk about C++ as they use it, silently ignoring things they don’t care about. By virtue of google3 and their style guide, things like ABI compatibility are of very little consequence to them, and they can take away expensive-to-support APIs and present about how they “optimized” some part of the STL, or how support for exceptions is something they’ll look into later or (when they’re feeling uncharitable) bad actually™. Similarly most talks about Linux networking are driven by e.g. Facebook, who seem to slowly just be converging on running their entire stack using eBPF in the kernel. Apple presents their ML research about on-device learning and somewhere in the middle they’ll be like “oh also we have specialized silicon to do this efficiently otherwise it isn’t practical”. Microsoft will present virtualization research and you’ll find that their threat model is trying to prevent people from jailbreaking the Xbox.

I’m not trying to say this research is bad or not useful, but it’s important to put it in the context of where it’s coming from and who is funding it, because the entire thing, from the premise to the execution all the way to the conclusion, is going to depend on the circumstances the research was done in, and often it’ll be presented as a general result when it really only makes sense in the context of that particular company’s needs. If you’re being cynical, it’s a way to appear open and exercise soft power through mindshare, but in most cases I think the alternative (no research) is probably worse, so I’m not generally concerned.



> For example, if you ever attend a C++ talk by Googlers ... present about how they “optimized” some part of the STL

I'm guessing you're thinking about things like Swiss Tables. But the situation isn't that Swiss Tables "optimize" the STL's containers; they're the replacement you'd actually want. You can't "optimize" the STL containers, because they're defined in a way that's hostile to optimization.

Take std::unordered_set. Why is it so slow? Your standard library is obliged to make it work roughly the way it would have been explained in an introductory CS Data Structures class in the 1980s. None of that is necessary for the ordinary understanding or use of an "unordered set", which is why Swiss Tables has one that's much better; but if you paid attention in that class, you know there are buckets of keys with similar hash values, and that's exactly what the STL is obliged to provide.

If you just want an "unordered set", you do not want std::unordered_set despite the name; you want the much better replacements from Swiss Tables or various other offerings. It's unfortunate that std::unordered_set is in the standard and those are not.

Of course the other reason std::unordered_set is so slow for you isn't solved by Swiss Tables, but it was called out more than once by the Googlers presenting Swiss Tables: your hash function is garbage. Even if you insist on using std::unordered_set because it was good enough thirty years ago or whatever, this part of the lesson is invaluable anyway.

When using data structures that are faster because of hashing, you defeat them by using poor-quality hashes. To a first approximation, if you aren't sure you're using a good-quality hash, you probably aren't. In any optimisation quest, start by measuring, and in this case that means measuring: is your hash actually any good at... hashing?


To further support your point, the engineers at Facebook who worked on the F14 hashmaps/hashsets saw the same issues, and created two implementations as a result:

“Folly has chosen to expose a fast C++ class without reference stability as well as a slower C++ class that allocates each entry in a separate node. The node-based version is not fully standard compliant, but it is drop-in compatible with the standard version in all the real code we’ve seen.”

https://engineering.fb.com/2019/04/25/developer-tools/f14/


This is basically my point, though. The things you’ve said (which are basically just what Google pushes) are correct. For their use case, they’ve found some good wins and there’s lots of interesting things under the hood enabling this. That’s cool, but these improvements come at a cost: the ABI (and in some cases, the API) is different. For Google this is OK because they can just ask their clients to adapt. This might be fine for you as well. But it’s definitely not the case for everyone, and ignoring it (at best) or actively harping on the standard for making concessions for the “dumb” reason of stability is not appropriate.


I think you're arguing against a straw man? Maybe you can give a concrete example of the behaviour you're concerned about.

Neither of the Google C++ Swiss Table presentations I've seen was about how std::unordered_set can somehow be replaced by this completely different thing. The existing container is just useless; that's sad, but there's nothing to be done about it, because the cost of an ABI break is unacceptable.

Instead the talks were about why this thing (Swiss Tables) makes sense, how you can make use of it in your own software, and maybe about tricks you can use in your own data structures, or about how Hyrum's law interacts with this work.


Forget custom ML silicon: Apple also has custom Arm instructions that, as far as I'm aware, they still don't think we've earned the right to know about (officially).


That's something that ARM explicitly uses as a selling point to attract people to license their designs.

https://www.arm.com/technologies/custom-instructions


Apple have a FU licence, it's not that kind of arrangement.


Apple has an architectural license that allows them to build ARM-compatible processors with custom micro-architecture. I'm sure others like Google and Nvidia also have it.


Apple happens to be very quiet about GXF in their platform security guide, yes ;)


> For example, if you ever attend a C++ talk by Googlers, you’ll notice that they basically only talk about C++ as they use it, silently ignoring things they don’t care about.

Not all Googler talks.

My talks at CppCon (Polymorphism != Virtual https://m.youtube.com/watch?v=PSxo85L2lC0, Beyond Struct: Metaprogramming a Struct Replacement https://m.youtube.com/watch?v=FXfrojjIo80) had nothing to do with Google3 and use techniques that would likely be discouraged by the style guide (especially the struct one).


I sort of agree with what you just said. Perspective is very important, since solutions emerge from the problems these companies face. I would extend this a bit further: these solutions also improve our understanding and mental models for building better, more secure products. What company X does is not limited to X; it benefits others too.

As for corporate sway in research: it's my personal opinion (and only mine) that citizen awareness is generally high. HN and similar communities are quick to spot gaping holes or flaws, and alternatives are plentiful. There is, fortunately, a still-healthy ecosystem of indie developers who contribute to everything from the Linux kernel to iOS patches. As you mention, at present this does not seem like a big concern, and the alternative scenario (no academia-industry symbiosis) could be worse.


Yeah, I definitely don't want to come across as asking companies to stop publishing their research. The information is always interesting to see, and sometimes even trying to suss out the bias can help you better understand the companies themselves. Picking the example of C++: you can tell that Microsoft cares about ABI stability because they ship an OS that exposes APIs to binaries. (Apple cares as well, though they're far less vocal about C++; in the language they own, Swift, they've gone all the way to reifying an entire ABI-stable interface for generics.) The problem is when e.g. Google presents something about not caring about ABI stability and, intentionally or not, recruits people to their cause, to the extent that I see Windows programmers who ship closed-source software clamoring for Microsoft to "stop preventing C++ from being more efficient and better" because they read a bunch of stuff about how std::string could get a better layout or something. That position isn't necessarily wrong, but the perspective is easily skewed by what your goals are, and it's easy to accidentally assume Google's goals are the same as yours, because they certainly have no incentive to suggest otherwise.

The question of how much we actually avoid this is a complicated one to answer. I like to think that a lot of the obvious biases get caught, but I have also been around long enough to know that Hacker News is definitely not immune to this. My employer constantly falls into the trap of having a problem and then looking around to see how FAANG is solving it, then trying that solution largely uncritically, despite not quite being a FAANG. It's mildly amusing when you see a principal engineer with several times more experience (and compensation!) than you get tripped up by it, but it only emphasizes that evaluating research with a critical eye is difficult and everyone struggles with it to some extent.


> when you see a principal engineer with several times more experience (and compensation!) than you get tripped up by it.

Slightly off topic :) I feel that by the time people become Principals, they lose their laser-sharp focus because they are juggling too many things at once. Principals who work as ICs on a team, however, are much better, since they stay hands-on with the current problems.



