It's framed as being only for social media. But, really, it's about network access. Without network access, it's difficult to thrive in the modern world.
Are you not alarmed at the possibility that a person's network access could be cut arbitrarily and at-will?
Why? Kids have had access to the internet for over 30 years. What is the TikTok brainwashing (I don't use it), and how do you distinguish its danger from, say, Google News brainwashing, or even (gasp) public school brainwashing? I mean, if we're going to group-ban information, at least let people in the local communities make those decisions. Otherwise, we're going to get the Epstein class making these decisions.
That's the issue. Most Americans don't want the entire population of India to be in San Diego. To some of us, San Diego is one of the best areas in the country. Why destroy it?
The article describes "the pile" as an "unfiltered scrape by design". But, the paper actually describes it as a bizarre mix of curated sources. https://arxiv.org/pdf/2101.00027
Generally, I find LLMs are overtrained on promotional materials and professionally published content.
The last time I interviewed (around 10 years ago) I was surprised when 9 of the 10 senior developers didn't know how many bits were in basic elementary types.
(Then, shortly afterward I also tried to find a new job, realized the entire industry had changed, and was fortunate enough to decide it wasn't worth the trouble.)
> 9 of the 10 senior developers didn't know how many bits were in basic elementary types
That's likely thanks to C which goes to great pains to not specify the size of the basic types. For example, for 64 bit architectures, "long" is 32 bits on the Mac and 64 bits everywhere else.
The net result of that is I never use C "long", instead using "int" and "long long".
This mess is why D has 32 bit ints and 64 bit longs, whether it's a 32 bit machine or a 64 bit machine. The result is that we haven't had porting problems with integer sizes.
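For anyone who wants to see the mismatch on their own machine, here's a minimal C sketch (the exact numbers depend on the platform's data model, e.g. LP64 vs LLP64):

    #include <stdio.h>

    /* Prints the sizes of the basic integer types on the current target.
       On LP64 systems, long is 8 bytes; on LLP64 (64-bit Windows) it is 4.
       int and long long are the same size on both, which is why they are
       the safer pair to reach for. */
    int main(void) {
        printf("int:       %zu bytes\n", sizeof(int));
        printf("long:      %zu bytes\n", sizeof(long));
        printf("long long: %zu bytes\n", sizeof(long long));
        return 0;
    }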
It's substantially worse on the JVM. One's intuition from C just fails when you have to think about references vs primitives, and the overhead of those (with or without compressed OOPs).
I've met very few folks who understand the overheads involved, and how extreme the benefits can be from avoiding those.
Conversely, I've met many folks who come into managed environments and piss away time trying to wrangle the managed system into how they think it should work, instead of accepting that clever people wrote it and that following its guidelines yields acceptable outcomes.
It's the sort of insane stuff I've seen on the dotnet repo, where people try to tear apart the entire type system just because they think they've cracked some secret performance code.
My favourite JVM trivia, although I openly admit I don't know if it's still true, is the fact that the size of a boolean is not defined.
If you ask a typical grad the size of a bool, they will inevitably say one bit. But CPUs, RAM, etc. don't work like that; they typically expect WORD-sized chunks of memory, meaning that a boolean of one bit becomes a WORD-sized chunk, assuming it hasn't been packed.
". While it represents one bit of information, it is typically implemented as 1 byte in arrays, and often 4 bytes (an int) or more as a standalone variable on the stack "
In what way is it worse? The range of values they can contain is well-specified.
And you have a frame with an operand stack where you should be able to store at least a 32-bit value. `double` would just fill 2 adjacent slots.
And references are just pointers (possibly not using the whole value as an address, with some bits used as flags for e.g. the GC) pointing to objects, whose internal structure is an implementation detail but usually consists of a header plus the fields (which can again be reference types).
Pretty standard stuff; heap allocation is pretty common in C as well.
And unlike C, it will run the exact same way on every platform.
I’m saying very few folks understand the cost tradeoffs of using references/objects versus using primitives directly. The difference in memory used for significant amounts of data is huge.
Not to mention indirection costs, but that’s a different issue.
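A rough C analogue of that tradeoff, as a sketch (the managed case is heavier still, since every boxed object also carries a header and GC bookkeeping):

    #include <stdio.h>
    #include <stdlib.h>

    #define N 1000000

    int main(void) {
        /* Primitive-style layout: one contiguous block, N * 8 bytes. */
        double *flat = malloc(N * sizeof(double));

        /* Reference-style layout: N pointers plus N separate allocations,
           each with allocator overhead and poor cache locality. */
        double **boxed = malloc(N * sizeof(double *));
        for (size_t i = 0; i < N; i++)
            boxed[i] = malloc(sizeof(double));

        printf("flat:  %zu bytes\n", N * sizeof(double));
        printf("boxed: %zu bytes minimum, plus per-allocation overhead\n",
               N * (sizeof(double *) + sizeof(double)));

        for (size_t i = 0; i < N; i++) free(boxed[i]);
        free(boxed);
        free(flat);
        return 0;
    }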
That's a reasonable answer. But, I meant they seemed to have little understanding or interest. I don't interview much, and I'm probably a poor interviewer. But, I guess I was expecting some discussion.
Oooh, saw Andrei's name pop up and remember his books on C++ back in the day .. ran into a systems engineer a while ago who, during a tech review, asked why some data size wasn't 1000 instead of 1024.. like err ??
> That's likely thanks to C which goes to great pains to not specify the size of the basic types. For example, for 64 bit architectures, "long" is 32 bits on the Mac and 64 bits everywhere else.
Don't you mean Windows instead of Mac? Most Unix-like operating systems use LP64 while Windows uses LLP64.
Microsoft tried valiantly to make Win16 code portable to Win32, and Win32 to Win64. But it failed miserably, apparently because the programmers had never ported 16 bit C to 32 bit C, etc., and picked all the wrong abstractions.
> Even more fun is pointers, especially when windows / macos were switching from 32-bits to 64-bits (in different ways).
And yet even more of a fun time with porting pointer code was going from the various x86 memory models[0] to 32-bit. Depending on the program, the pain was either near, far, or huge... :-D
In ancient computing times, which is when C was birthed, the size of integers at the hardware level and their representation was much more diverse than it is today. The register bit-width was almost arbitrary, not the tidy powers of 2 that everyone is accustomed to today.
The integer representation wasn't always two's complement in the early days of computing, so you couldn't even assume that. C++ only required integer representations to be two's complement as of C++20, since the last architectures that didn't work this way had effectively been dead for decades.
In that context, an 'int' was supposed to be the native word size of an integer on a given architecture. A long time ago, 'int' was an abstraction over the dozen different bit-widths used in real hardware. In that context, it was an aid to portability.
C is a portable language, in that programs will likely compile successfully on a different architecture. Unfortunately, that doesn't mean they will run properly, as the semantics are not portable.
C certainly gives the illusion of portability. I recall a fellow who worked on DSP programming, where chars and shorts and ints and longs were all 32 bits. He said C was great because that would compile.
I suggested to him that he'd have a hard time finding any existing C code that ran correctly on it. After all, how are you going to write a byte to memory if you've only got 32 bit operations?
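For the curious, this is roughly the read-modify-write dance such a target forces on the compiler for every byte store (an illustrative C sketch, not actual DSP toolchain output):

    #include <stdint.h>
    #include <stdio.h>

    /* Emulate a byte store on a machine that can only load/store 32-bit
       words: load the containing word, mask out the byte, merge, store. */
    static void store_byte(uint32_t *mem, size_t byte_index, uint8_t value) {
        size_t   word  = byte_index / 4;
        unsigned shift = (unsigned)(byte_index % 4) * 8;
        uint32_t mask  = 0xFFu << shift;
        mem[word] = (mem[word] & ~mask) | ((uint32_t)value << shift);
    }

    int main(void) {
        uint32_t mem[2] = {0, 0};
        store_byte(mem, 5, 0xAB);               /* byte 1 of word 1 */
        printf("%08X %08X\n", mem[0], mem[1]);  /* 00000000 0000AB00 */
        return 0;
    }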
Anyhow, after 20 years of programming C, I took what I learned and applied it to D. The integral types are specified sizes, and 2's complement.
One might ask, what about 16 bit machines? Instead of trying to define how this would work in official D, I suggested a variant of D where the language rules were adapted to 16 bits. This is not objectively worse than what C does, and it works fine, and the advantage is there is no false pretense of portability.
I mean, as a senior developer, the number of bits in an "int" is "who the hell knows, because it has changed a bunch of times during my career, and that's what stdint.h is for." And let's not even talk about machines with 32-bit "char" types, which I actually had to program for once.
If the number of bits isn't actually included right in the type name, then be very sure you know what you're doing.
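Which, in practice, means reaching for the fixed-width names and asserting any width you actually depend on. A minimal C sketch (C11; names illustrative):

    #include <stdint.h>
    #include <limits.h>
    #include <assert.h>

    /* If the width matters, put it in the type name instead of guessing
       what int/long happen to be on this target. */
    int32_t      counter    = 0;   /* exactly 32 bits, two's complement */
    uint64_t     byte_total = 0;   /* exactly 64 bits */
    int_fast16_t loop_index = 0;   /* at least 16 bits, whatever is fastest */

    /* If code really does assume a plain type's width, fail loudly at
       compile time rather than misbehaving at run time. */
    static_assert(sizeof(int) * CHAR_BIT >= 32, "this code assumes 32-bit int");

    int main(void) { return 0; }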
The senior engineer answer to "How many bits are there in an int?" is "No, stop, put that down before you put your eye out!" Which, to be fair, is the senior engineer answer to a lot of things.
How many bits are in an `int` in C? What do you mean "at least 16", that's ridiculous, nobody would write a language that leaves the number of bits in basic elementary types partially specified‽
It is a good idea - most of the time you don't care, and on slower systems a large int is harmful since the system can't handle that much and it costs performance - go to the faster system with larger ints when you need larger ints.
Not directly related, but this is a good example of why I love dependency injection. In most systems, I typically define the interface, implement something super simple at first, and as I iterate I re-evaluate, and I can easily* swap between implementations.
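A minimal sketch of that pattern, kept in C to match the rest of the thread (all names here are hypothetical): the consumer depends on an interface, the first implementation is deliberately trivial, and later implementations can be swapped in without touching the caller.

    #include <stdio.h>

    /* The "interface": a struct of function pointers. */
    typedef struct {
        void (*log)(const char *msg);
    } Logger;

    /* Super-simple first implementation... */
    static void log_stdout(const char *msg) { printf("%s\n", msg); }
    static const Logger console_logger = { log_stdout };

    /* ...and a drop-in replacement (file, syslog, no-op for tests, ...). */
    static void log_nothing(const char *msg) { (void)msg; }
    static const Logger null_logger = { log_nothing };

    /* The dependency is injected as a parameter instead of hard-coded. */
    static void do_work(const Logger *logger) {
        logger->log("doing work");
    }

    int main(void) {
        do_work(&console_logger);   /* production wiring   */
        do_work(&null_logger);      /* swapped for testing */
        return 0;
    }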
https://chatgpt.com/share/69f246e5-e0e8-83ea-aa88-6d0024b915...