> I do not agree with you. What you are saying is that the compiler must keep the variable ordering on the stack the same. That is the only way that `p + 1 == &q` would hold consistently.
I think we're speaking past each other because that is in no way what I'm saying.
All I said was: there was a memory write... to an address... that came from an integer... that came from somewhere (a pointer). Since the compiler presumably has no idea where that "somewhere" was, it could've been q. It can't prove that it isn't q, because it knows that q's address was taken somewhere, and it has no idea where that address went. Hence it can't assume the target of the write can't have been q.
This assumes nothing about stacks, frames, or the ordering of any variables. You could order them any way you want and what I said would hold. It's literally just discussing how information could've flowed from one point to another.
It can't prove it isn't q even if q's address was never taken, though. Whether q's address was taken doesn't matter: p+1 would still point to q. If you have a pointer in scope, you can no longer remove or reorder any write at all; for all the compiler knows, those pointers could be pointing at volatile locations. And if there is multithreading, it cannot remove any write at all, since the other thread could have had a pointer to that location.
It no longer matters that a variable never had its address taken, since that address could have been obtained via pointer arithmetic from somewhere else.
> It can't prove it isn't q even if q's address was never taken though.
I was imprecise with the wording.
What I meant is that, when q's address is taken and shoved into an integer, the compiler (assuming it gives up upon seeing the integer conversion) fails to prove that the address points outside q on that path. It must therefore assume that the address might, in fact, point inside q (i.e. it might be guaranteed to overlap q), and thus it cannot assume q is unmodified.
If q's address was never taken, then the compiler could prove that q's address was unknown on that path, and hence could be anything. Since that means there exists an execution where the target doesn't overlap q, the compiler may treat an unmodified q as a valid execution -- i.e. it may assume q is unmodified.
To put it in perhaps clearer terms: when the address of q is untaken, that means the program's (defined) behavior is completely invariant with respect to (i.e. symmetric w.r.t., i.e. independent of) q's address. Thus it can act as if the target of the write is not q (...because we just established the program's defined behavior is independent of q's address). Whereas it cannot make that assumption when q's address is leaked somewhere.
Hmmm... that sounds like exposed provenance (https://doc.rust-lang.org/stable/std/ptr/index.html#exposed-...) to me. When a pointer's provenance is exposed, its allocation is added to a global exposed set, and any pointer without a clear provenance can then alias any (all) of the exposed allocations. What you are saying is that you want all pointers to behave this way. While that would prevent miscompiles, I would assume it would also lead to massive slowdowns in runtime performance.
I hadn't heard of that before, cool. That sounds similar, yeah. (At a quick glance at least. I can't tell if it's exactly equivalent but it certainly seems close.)
> I would assume that that would also lead to massive slowdowns in runtime performance.
My responses are:
1. I am amused that the implication of this would be that Rust is slower than C/C++.
2. I would not assume such a thing purely on faith. It may be true, but there have been lots of poor specifications in the standards based on erroneous presumptions about the performance cost/benefit trade-offs that subsequently didn't hold. I won't even go into the ones that touch UB -- just look at the argument evaluation order that they finally caved on and made stricter from C++14 to C++17. I find the standards are overzealous w.r.t. optimization in a lot of cases, and it takes them a long time to admit that.
3. If that's actually true in this particular case, the correct solution to that would've been to provide an opt-in flag to break programs (think -ffast-math) rather than laying incorrectness as the foundation.
1. Rust has proper references with much stricter aliasing requirements. So no, it would be C that would slow down. Rust would still be able to take advantage of these optimisations.
2. In this case, I would say the effects are much more obvious. Just take the example in the article. The compiler cannot replace a pointer access with its value if there has been any other pointer access in between. The compiler also cannot reorder any statements with pointer accesses; they would basically behave as memory barriers (if I understand them correctly). That will cause slowdowns at runtime (though compile times would speed up).
3. Sure. Perhaps that is an option. I wouldn't call it a flag to "break programs" though. Maybe a config file that tells the compiler which assumptions it may make (strict aliasing, associative floats, thread safety, etc.).
EDIT: In point 3, I would also add assumptions like whether the code can be modified at runtime (e.g. take a pointer to a function and write your own code there).
> In this case, I would say the effects are much more obvious. Just take the example in the article. The compiler cannot replace a pointer access with its value if there has been any other pointers access in between.
I have no idea what you mean. The whole point of this discussion is that the example in the article should not be optimized in the manner they're describing, because it would be wrong. That is... a good thing. Not a bad thing.
And the compiler certainly could replace a pointer access with its value even if there have been other pointer accesses in between, as long as that pointer wasn't leaked away into some opaque place via an integer.
> The compiler also cannot reorder any statements with pointer accesses. They would basically behave as memory barriers (if I understand them correctly).
No, I don't think you understood what I'm saying. See above. It would still be perfectly legal for the compiler to transform
char p[1] = {0};
opaque();
return p[0];
to
opaque();
return 0;
under what I wrote above, the intervening statement notwithstanding.