Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c++ - Same strings in array have same memory address

Why do identical strings in a char* array have the same address?

Is this because of a compiler optimization?

Example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ARR_SIZE 7

int main(int argc, char** argv) {
  size_t i = 0, j = 0;

  char * myArr[ARR_SIZE] = {
    "This is the first string",
    "This is the second string",
    "This is Engie",
    "This is the third string",
    "This is Engie",
    "This is the fifth string",
    "This is Engie"

  };

  for (i = 0; i < ARR_SIZE; ++i){
    for (j = i + 1; j < ARR_SIZE; ++j){
      /* Compare the stored pointer values, not the string contents. */
      if (memcmp((myArr + i), (myArr + j), sizeof(char*)) == 0){
        fprintf(stdout, "%p, %p\n", (void*)*(myArr + i), (void*)*(myArr + j));
        fprintf(stdout, "found it start index: %zu, search index: %zu\n", i, j);
      }
    }
  }
  return 0;
}

GDB:

(gdb) x/7w myArr
0x7fffffffdd10: U"\x4007a8"
0x7fffffffdd18: U"\x4007c1"
0x7fffffffdd20: U"\x4007db"
0x7fffffffdd28: U"\x4007e9"
0x7fffffffdd30: U"\x4007db"
0x7fffffffdd38: U"\x400802"
0x7fffffffdd40: U"\x4007db"


(gdb) x/7s *myArr
0x4007a8:   "This is the first string"
0x4007c1:   "This is the second string"
0x4007db:   "This is Engie"
0x4007e9:   "This is the third string"
0x400802:   "This is the fifth string"
0x40081b:   "%p, %p\n"
0x400823:   ""

1 Answer

It is called constant merging, and it is typically enabled at higher optimization levels. The compiler finds constants with identical values and stores a single copy of each, so every use refers to the same address. That is good for memory usage and cache efficiency.

GCC has -fmerge-constants, which is also turned on by -O and the higher optimization levels.

Other compilers may or may not do it. It is compiler specific.

Since it is about the easiest optimization to implement, I would imagine most C and C++ compilers do it.
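Because merging is up to the compiler, code that needs to know whether two strings are the same should compare their contents, not their addresses. A minimal sketch, assuming a hypothetical helper named count_equal_pairs (not part of the question's code) that does the same pairwise scan with strcmp:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the pairs (i, j) with i < j whose strings
   have equal contents. It uses strcmp, so the result does not depend on
   whether the compiler merged identical literals into one address. */
size_t count_equal_pairs(const char *const *arr, size_t n) {
    size_t pairs = 0;
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = i + 1; j < n; ++j) {
            if (strcmp(arr[i], arr[j]) == 0) {
                ++pairs;
            }
        }
    }
    return pairs;
}
```

Called on the seven-element array from the question, this should report three matching pairs (indices 2/4, 2/6 and 4/6), with or without constant merging.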

This is a perfect example of why:

  1. You can't make assumptions about where a constant value will live (whether identical string literals share an address is unspecified)
  2. You shouldn't make changes to constant values (modifying a string literal is undefined behavior)

Even so, we see many questions from people (not you) who observed that they got away with modifying a constant string after casting away const.
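If the text really does need to change, the safe route is to copy the literal into storage you own and modify the copy. A minimal sketch, assuming a hypothetical helper named copy_and_capitalize (the name and the capitalization step are only for illustration):

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity cap, in bytes) and
   upper-cases the first character of the copy. The literal passed as src
   is never written to. Returns 0 on success, -1 if dst is too small. */
int copy_and_capitalize(char *dst, size_t cap, const char *src) {
    size_t len = strlen(src);
    if (len + 1 > cap) {
        return -1;              /* no room for the text plus the terminator */
    }
    memcpy(dst, src, len + 1);  /* copy includes the terminating '\0' */
    dst[0] = (char)toupper((unsigned char)dst[0]);
    return 0;
}
```

Declaring the destination as an array, for example char buf[32], gives each caller its own writable storage, so a change made through one copy cannot show up in another string that happened to be merged with it.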

