Is the code below safe? It might be tempting to write code akin to this:
#include <map>
const std::map<const char*, int> m = {
    {"text1", 1},
    {"text2", 2}
};
int main () {
    volatile const auto a = m.at("text1");
    return 0;
}
The map is intended to be used with string literals only.
I think it's perfectly legal, and it seems to work. However, I have never seen a guarantee that the same literal used in two different places yields the same pointer. I couldn't get the compiler to generate two separate pointers for literals with the same content, so I started to wonder how firm the assumption is.
I am only interested in whether literals with the same content can have different pointers. Or, more formally, can the code above throw an exception?
I know there are ways to write this so that it is sure to work, and I think the approach above is dangerous because the compiler could decide to use two different storage locations for the literal, especially if the uses are in different translation units. Am I right?
Whether two string literals with exactly the same content are the same object is unspecified, and in my opinion best not relied upon. To quote the standard:
[lex.string]
16 Evaluating a string-literal results in a string literal object with static storage duration, initialized from the given characters as specified above. Whether all string literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified.
If you wish to avoid the overhead of std::string, you can write a simple view type (or use std::string_view in C++17) that is a reference type over a string literal. Use it to do intelligent comparisons instead of relying upon literal identity.
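As a minimal sketch of that suggestion, assuming C++17 is available, the map from the question can be keyed by std::string_view so that lookups compare characters rather than addresses. The helper name lookup is hypothetical and only there for illustration:

```cpp
#include <cassert>
#include <map>
#include <string_view>

// Keys are views over the literals; std::map orders them by character
// content, so the address of each literal no longer matters.
const std::map<std::string_view, int> m = {
    {"text1", 1},
    {"text2", 2}
};

// Hypothetical helper (not part of the original question).
// Throws std::out_of_range only when no key with this content exists.
int lookup(std::string_view key) {
    return m.at(key);
}
```

Calling lookup("text1") finds the entry even if the literal at the call site lives at a different address than the one used to build the map. The same holds for a key that comes from a separate array such as const char buf[] = "text1";.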
The Standard does not guarantee the addresses of string literals with the same content will be the same. In fact, [lex.string]/16 says:
Whether all string literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified.
The second part even says you might not get the same address when a function containing a string literal is called a second time! Though I've never seen a compiler do that.
So using the same character array object when a string literal is repeated is an optional compiler optimization. With my installation of g++ and default compiler flags, I also find I get the same address for two identical string literals in the same translation unit. But as you guessed, I get different ones if the same string literal content appears in different translation units.
A related interesting point: it's also permitted for different string literals to use overlapping arrays. That is, given
const char* abcdef = "abcdef";
const char* def = "def";
const char* def0gh = "def\0gh";
it's possible you might find abcdef+3, def, and def0gh are all the same pointer.
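A small sketch of how one might probe this, using the three pointers above. The function names are hypothetical. The result of the address check is unspecified and may differ between compilers and flags, while the content check always holds:

```cpp
#include <cassert>
#include <cstring>

const char* abcdef = "abcdef";
const char* def = "def";
const char* def0gh = "def\0gh";

// Hypothetical probe: may return true or false, both are conforming,
// because the implementation is allowed (not required) to overlap literals.
bool literals_share_storage() {
    return abcdef + 3 == def && def == def0gh;
}

// Comparing by content is always reliable: each pointer designates
// the characters 'd', 'e', 'f' followed by a null terminator.
bool contents_match() {
    return std::strcmp(abcdef + 3, def) == 0 &&
           std::strcmp(def, def0gh) == 0;
}
```

Only contents_match() gives an answer you can depend on; literals_share_storage() merely reports what a particular build happened to do.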
Also, this rule about reusing or overlapping string literal objects applies only to the unnamed array object directly associated with the literal, used if the literal immediately decays to a pointer or is bound to a reference to array. A literal can also be used to initialize a named array, as in
const char a1[] = "XYZ";
const char a2[] = "XYZ";
const char a3[] = "Z";
Here the array objects a1, a2 and a3 are initialized using the literal, but are considered distinct from the actual literal storage (if such storage even exists) and follow the ordinary object rules, so the storage for those arrays will not overlap.
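To illustrate the contrast, here is a short sketch using the same three arrays. The helper name is hypothetical. Because a1, a2 and a3 are distinct named objects, the address comparisons below are guaranteed, unlike comparisons between bare literals:

```cpp
#include <cassert>

const char a1[] = "XYZ";
const char a2[] = "XYZ";
const char a3[] = "Z";

// Hypothetical helper: named arrays are separate objects, so neither
// identical content (a1 vs a2) nor a shared suffix (a1 + 2 vs a3)
// can make their storage coincide.
bool named_arrays_are_distinct() {
    const char* p1 = a1;      // first element of a1
    const char* p2 = a2;      // first element of a2
    const char* tail = a1 + 2; // the 'Z' inside a1
    const char* p3 = a3;      // first element of a3
    return p1 != p2 && tail != p3;
}
```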
No, the C++ standard makes no such guarantees.
That said, if all the code is in the same translation unit, it would be difficult to find a counterexample. If main() is in a different translation unit, a counterexample might be easier to produce.
If the map is in a different dynamically linked library or shared object, then the pointers are almost certainly not the same.
The volatile qualifier is a red herring.
The C++ standard does not require an implementation to de-duplicate string literals.
When a string literal resides in another translation unit or another shared library, de-duplicating it would have to be done by the linker (ld) or the runtime linker (ld.so), and neither is required to do so.