
At which exact statement does this program exhibit Undefined behavior as per the C++ standard?

(I am aware of the fact that returning address/reference to a variable local to the function should be avoided and a program should never do this.)


Does returning a reference to a local variable/reference result in Undefined Behavior? Or does the Undefined Behavior only occur later, when the returned reference is used (or "dereferenced")?

That is, at which exact statement (#1, #2 or #3) does the code sample below invoke Undefined Behavior? (I've written my theory alongside each one.)

#include <iostream>

struct A
{
    int m_i;
    A() : m_i(10)
    {
    }
};

A& foo()
{
    A a;
    a.m_i = 20;
    return a;
}

int main()
{
    foo();                // #1 - Not UB; return value was never used
    A const &ref = foo(); // #2 - Not UB; return value still not yet used
    std::cout << ref.m_i; // #3 - UB: returned value is used
}

I am interested to know what the C++ standard specifies in this regard.

I would like a citation from the C++ standard that tells me exactly which statement gives this code undefined behavior.

Discussions about how specific implementations handle this are welcome, but as I said, an ideal answer would cite a reference from the C++ Standard that clarifies this beyond doubt.

Alok Save asked Jan 20 '26 23:01


2 Answers

Of course, when the reference is first initialised inside foo, it is initialised validly, satisfying the following:

[C++11: 8.3.2/5]: There shall be no references to references, no arrays of references, and no pointers to references. The declaration of a reference shall contain an initializer (8.5.3) except when the declaration contains an explicit extern specifier (7.1.1), is a class member (9.2) declaration within a class definition, or is the declaration of a parameter or a return type (8.3.5); see 3.1. A reference shall be initialized to refer to a valid object or function. [ Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bit-field. —end note ]

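The function returns an lvalue reference, so the call expression foo() is an lvalue ([C++11: 5.2.2/10]); it would be an xvalue only if the return type were A&&, as the following describes. Either way, it is not a prvalue: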

[C++11: 3.10/1]: [..] An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for example). An xvalue is the result of certain kinds of expressions involving rvalue references (8.3.2). [ Example: The result of calling a function whose return type is an rvalue reference is an xvalue. —end example ] [..]

That means the following does not apply:

[C++11: 12.2/1]: Temporaries of class type are created in various contexts: binding a reference to a prvalue (8.5.3), returning a prvalue (6.6.3), a conversion that creates a prvalue (4.1, 5.2.9, 5.2.11, 5.4), throwing an exception (15.1), entering a handler (15.3), and in some initializations (8.5).

[C++11: 6.6.3/2]: A return statement with neither an expression nor a braced-init-list can be used only in functions that do not return a value, that is, a function with the return type void, a constructor (12.1), or a destructor (12.4).

A return statement with an expression of non-void type can be used only in functions returning a value; the value of the expression is returned to the caller of the function. The value of the expression is implicitly converted to the return type of the function in which it appears. A return statement can involve the construction and copy or move of a temporary object (12.2). [ Note: A copy or move operation associated with a return statement may be elided or considered as an rvalue for the purpose of overload resolution in selecting a constructor (12.8). —end note ] A return statement with a braced-init-list initializes the object or reference to be returned from the function by copy-list-initialization (8.5.4) from the specified initializer list. [ Example:

std::pair<std::string,int> f(const char* p, int x) {
   return {p,x};
}

—end example ]

Additionally, even if we interpret the following to mean that the return statement initialises a new reference "object", the referent (the local a) is probably still alive at that point:

[C++11: 8.5.3/2]: A reference cannot be changed to refer to another object after initialization. Note that initialization of a reference is treated very differently from assignment to it. Argument passing (5.2.2) and function value return (6.6.3) are initializations.

  • This makes #1 valid.

However, your initialisation of a new reference ref inside main quite clearly violates [C++11: 8.3.2/5]. I can't find wording for it, but it stands to reason that foo's scope has already been exited, and its local a destroyed, by the time that initialisation is performed.

  • This would make #2 (and consequently #3) invalid.

At the very least, there does not appear to be anything further stated about the matter in the standard, so if the above reasoning is not sufficient then we have to conclude that the standard is ambiguous in the matter. Fortunately, it's of little consequence in practice, at least in the mainstream.

Lightness Races in Orbit answered Jan 22 '26 12:01


Here's my incomplete and possibly insufficient view on the matter:

The only thing special about references is that at initialization time they must refer to a valid object. If the object later stops existing, using the reference is UB, and so is initializing another reference from the now-defunct reference.

The following much simpler example provides exactly the same dilemma as your question, I think:

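#include <functional>

T other;                             // any live T: reference_wrapper has no default constructor
std::reference_wrapper<T> r(other);

{
    T t;
    r = std::ref(t);                 // rebind r to the block-local t
}

// #1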

At #1, the reference inside r is no longer valid, but the program is fine. Just don't read r.

In your example, line #1 is fine, and line #2 isn't. The original version of line #2 called A::A(A const &) with argument foo(), which, as discussed, fails to initialize the constructor's parameter with a valid reference; the same goes for your edited version, A const &ref = foo();.

Kerrek SB answered Jan 22 '26 13:01