
Rather than trying to decide whether to require that all implementations use two's-complement math, or to suggest that all programs support unusual formats, the Standard should recognize some categories of implementations with various recommended traits, and programs that are portable among such implementations, while also recognizing categories of "unusual" implementations.

Recognizing common behavioral characteristics would actually improve the usability of arcane hardware platforms if there were ways of explicitly requesting the commonplace semantics when required. For example, suppose the Standard defined an intrinsic which, given a four-byte-aligned pointer, would store a 32-bit value in little-endian format with 8 bits per byte, leaving any bits beyond the eighth (on platforms whose bytes have them) in a state compatible with using "fwrite" on an octet-based stream. An octet-based big-endian platform could easily process that intrinsic as a byte-swap instruction followed by a 32-bit store, while a compiler for a 36-bit system could use a combination of addition and masking operations to spread out the bits.
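A portable fallback for such an intrinsic might look like the following sketch (the name store_u32le is made up here; on an octet-based platform a compiler could lower it to a byte-swap plus a 32-bit store, while the masking makes it behave sensibly even where CHAR_BIT > 8):

```c
#include <limits.h>

/* Hypothetical intrinsic: store a 32-bit value as four little-endian
   8-bit groups, one group per char.  On a platform with CHAR_BIT > 8
   the bits above the eighth are cleared, so fwrite to an octet-based
   stream would see the same four octets on every platform. */
static void store_u32le(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}
```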



This sounds like something memcpy would do already for you?


A 36-bit system with (it sounds like) 9-bit bytes stores bit 8 of an int in bit 8 of a char, and bit 9 of the int in bit 0 of the next char; memcpy won't change that. They're asking for something like:

  /* read: 9-bit chars, so each char carries a factor of 512 */
  unsigned int x = in[0] + 512*in[1] + 512*512*in[2] + 512*512*512*in[3];
  /* aka x = *(int*)in */
  
  /* write: emit only 8 bits per char, little-endian */
  out[0] = x & 255; x>>=8;
  out[1] = x & 255; x>>=8;
  out[2] = x & 255; x>>=8;
  out[3] = x & 255;
  /* *not* aka *(int*)out = x */


The amount of effort for a compiler to process optimally all 72 variations of "read/write a signed/unsigned 2/4/8-byte big/little-endian value from an address that is aligned on a 1/2/4/8-byte boundary" would be less than the amount of effort required to generate efficient machine code for all the ways that user code might attempt to perform such an operation in portable fashion. Such operations would have platform-independent meaning, and all implementations could implement them in conforming fashion by simply including a portable library, but on many platforms performance could be enormously improved by exploiting knowledge of the target architecture. Having such functions/intrinsics in the Standard would eliminate the need for programmers to choose between portability and performance, by making it easy for a compiler to process portable code efficiently.
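The portable library could be quite small; one of the proposed functions might look like this sketch (the name read_u32le is hypothetical). The Standard would only need to pin down the meaning; a compiler targeting x86 could lower the whole expression to a single unaligned load:

```c
/* Hypothetical portable fallback: read a 32-bit little-endian unsigned
   value from an address with no alignment guarantee.  Masking with 0xFF
   keeps the result correct even on platforms where chars are wider
   than 8 bits. */
static unsigned long read_u32le(const unsigned char *p)
{
    return  (unsigned long)(p[0] & 0xFF)
         | ((unsigned long)(p[1] & 0xFF) << 8)
         | ((unsigned long)(p[2] & 0xFF) << 16)
         | ((unsigned long)(p[3] & 0xFF) << 24);
}
```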


I'm not disagreeing, just showing code to illustrate why memcpy doesn't work for this. Although I do disagree that writing a signed value is useful - you can eliminate 18 of those variations with a single intmax_t-to-twos-complement-uintmax_t function (if you drop undefined behaviour for (unsigned foo_t)some_signed_foo this becomes a no-op). A set of sext_uintN functions would also eliminate 18 read-signed versions. Any optimizing compiler can trivially fuse sext_uint32(read_uint32le2(buf)), and minimal implementations would have less boilerplate to chew through.
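One of those sext_uintN functions could be written without any implementation-defined conversions, so even minimal compilers handle it; a sketch (assuming <stdint.h> types):

```c
#include <stdint.h>

/* Hypothetical sext_uint32: reinterpret the low 32 bits of an unsigned
   value as two's-complement and widen.  Expressed with arithmetic only,
   avoiding implementation-defined unsigned-to-signed conversion, so an
   optimizing compiler can fuse it with a preceding read while a minimal
   one can still compile it correctly. */
static int64_t sext_uint32(uint32_t v)
{
    return (v & 0x80000000u)
         ? (int64_t)(v & 0x7FFFFFFFu) - (int64_t)0x80000000
         : (int64_t)v;
}
```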


> Although I do disagree that writing a signed value is useful

Although the Standard defines the behavior of signed-to-unsigned conversion in a way that yields the same bit pattern as a two's-complement signed number, some compilers will issue warnings if a signed value is implicitly coerced to unsigned. Adding the extra 18 forms would generally require nothing more than defining an extra 24 macros, which seems like a reasonable way to prevent such issues.
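Such a macro wrapper could be a one-liner; a sketch, with hypothetical names and a minimal stand-in for the assumed unsigned primitive (the explicit cast keeps implicit-conversion warnings quiet while yielding the same bit pattern):

```c
#include <stdint.h>

/* Minimal stand-in for the assumed unsigned write primitive. */
static void write_u32le(unsigned char *p, uint32_t v)
{
    p[0] = v & 0xFF;         p[1] = (v >> 8) & 0xFF;
    p[2] = (v >> 16) & 0xFF; p[3] = (v >> 24) & 0xFF;
}

/* Hypothetical signed wrapper: the explicit casts make the
   signed-to-unsigned conversion visible to the compiler, so no
   implicit-coercion warning is issued, and the stored bit pattern
   is the two's-complement representation the Standard guarantees. */
#define write_i32le(p, v) write_u32le((p), (uint32_t)(int32_t)(v))
```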


Fair point; even if the combinatorial nature of it is superficially alarming, that's probably not a productive area to worry about feature creep in.


72 static inline functions. If a compiler does a good job of handling such things efficiently, most of them could be accommodated by chaining to another function once or twice (e.g., to read a 64-bit value that's known to be at least 16-bit aligned, on a platform that doesn't support unaligned reads, read and combine two 32-bit values that are likewise known to be 16-bit aligned).
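That chaining pattern might look like the following sketch (both names are hypothetical; a platform with unaligned 64-bit loads would simply replace the 64-bit function with a single instruction):

```c
#include <stdint.h>

/* Minimal stand-in for the 32-bit, 2-byte-aligned little-endian read. */
static uint32_t read_u32le_a2(const unsigned char *p)
{
    return  (uint32_t)p[0]        | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* 64-bit read known to be at least 16-bit aligned, expressed by
   chaining two 32-bit reads with the same alignment guarantee. */
static uint64_t read_u64le_a2(const unsigned char *p)
{
    return  (uint64_t)read_u32le_a2(p)
         | ((uint64_t)read_u32le_a2(p + 4) << 32);
}
```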

Far less bloat than would be needed for a compiler to recognize and optimize any meaningful fraction of the ways people might write code to work around the lack of portably-specified library functions.


Ah, I see.



