This code is incorrect in Standard C:
char buf[4096];
read(fd1, buf, 4096); // Assume error handling, omitted for brevity
struct read_format* rf = (struct read_format*) buf;
printf("%llu\n", rf->nr);
There are two distinct issues, which should not be conflated:
- buf might not be correctly aligned for struct read_format. If it isn't, the behaviour is undefined.
- Accessing rf->nr violates the strict aliasing rule and the behaviour is undefined. An object with declared type char cannot be read or written by an expression of type unsigned long long. Note that the converse is not true: an object of any type may be accessed through an expression of character type.
Why does it appear to work? Well, "undefined" does not mean "must explode". It means the C Standard no longer specifies the program's behaviour. This sort of code is somewhat common in real code bases. The major compiler vendors -- for now -- include logic so that this code will behave as "expected", otherwise too many people would complain.
The "expected" behaviour is that accessing *rf should behave as if there exists a struct read_format object at the address, and the bytes of that object are the same as the bytes of buf. This is similar to what would happen if the two were members of a union.
The code could be made compliant with a union:
union
{
    char buf[4096];
    struct read_format rf;
} u;
read(fd1, u.buf, sizeof u.buf);
printf("%llu\n", u.rf.nr);
The strict aliasing rule is "disabled" for union members accessed by name, and this also addresses the alignment problem, since the union is suitably aligned for all of its members.
It's up to you whether to be compliant, or to trust that compilers will continue to put practicality ahead of maximal optimization within the constraints permitted by the Standard.