=============================================
Enable std::unique_ptr [[clang::trivial_abi]]
=============================================

Background
==========

Consider the following snippets:

.. code-block:: cpp

    void raw_func(Foo* raw_arg) { ... }
    void smart_func(std::unique_ptr<Foo> smart_arg) { ... }

    Foo* raw_ptr_retval() { ... }
    std::unique_ptr<Foo> smart_ptr_retval() { ... }

The argument ``raw_arg`` can be passed in a register, but ``smart_arg`` cannot under the current
implementation, because ``std::unique_ptr`` has a non-trivial destructor.

Specifically, in the ``smart_arg`` case, the caller secretly constructs a temporary ``std::unique_ptr``
in its stack frame, and then passes a pointer to it to the callee in a hidden parameter.
Similarly, the return value from ``smart_ptr_retval`` is secretly allocated in the caller and
passed as a hidden reference to the callee.

Goal
====

``std::unique_ptr`` is passed directly in a register.

Design
======

* Annotate the two definitions of ``std::unique_ptr`` with the ``clang::trivial_abi`` attribute.
* Put the attribute behind a flag, because this change can cause compilation and runtime breakages.

This comes with some side effects:

* ``std::unique_ptr`` parameters will now be destroyed by callees, rather than callers.
  It is worth noting that destruction by the callee is not unique to the use of the ``trivial_abi`` attribute.
  In most of Microsoft's ABIs, arguments are always destroyed by the callee.

  Consequently, this may change the destruction order for function parameters to an order that does not conform to the standard.
  For example:

  .. code-block:: cpp

      struct A { ~A(); };
      struct B { ~B(); };
      struct C { C(A, unique_ptr<B>, A) {} };
      C c{{}, make_unique<B>(), {}};

  In a conforming implementation, the destruction order for ``C::C``'s parameters is required to be ``~A(), ~B(), ~A()``, but with this mode enabled, we'll instead see ``~B(), ~A(), ~A()``.

* Reduced code size.

Performance impact
------------------

Google has measured performance improvements of up to 1.6% on some large server macrobenchmarks, and a small reduction in binary sizes.

This also affects null pointer optimization.

Clang's optimizer can now figure out when a ``std::unique_ptr`` is known to be *non*-null.
(Actually, this has been a *missed* optimization all along.)

.. code-block:: cpp

    struct Foo {
      ~Foo();
    };
    std::unique_ptr<Foo> make_foo();
    void do_nothing(const Foo&);

    void bar() {
      auto x = make_foo();
      do_nothing(*x);
    }

With this change, ``~Foo()`` will be called even if ``make_foo`` returns ``unique_ptr<Foo>(nullptr)``.
The compiler can now assume that ``x.get()`` cannot be null by the end of ``bar()``, because
the dereference of ``x`` would be UB if it were ``nullptr``. (This dereference would not have caused
a segfault, because no load is generated when a pointer is dereferenced only to bind a reference. This can be detected with ``-fsanitize=null``.)

Potential breakages
-------------------

The following breakages were discovered by enabling this change and fixing the resulting issues in a large code base.

- Compilation failures

  - Function definitions now require a complete type ``T`` for parameters of type ``std::unique_ptr<T>``. The following code will no longer compile.

    .. code-block:: cpp

        class Foo;
        void func(std::unique_ptr<Foo> arg) { /* never use `arg` directly */ }

    - Fix: Remove the forward declaration of ``Foo`` and include its proper header.

- Runtime failures

  - The lifetime of ``std::unique_ptr<>`` arguments ends earlier (at the end of the callee's body, rather than at the end of the full expression containing the call).

    .. code-block:: cpp

        util::Status run_worker(std::unique_ptr<Foo>);
        void func() {
          std::unique_ptr<Foo> smart_foo = ...;
          Foo* owned_foo = smart_foo.get();
          // Currently, the following would "work" because the argument to run_worker() is
          // destroyed at the end of the full expression containing the call.
          // With the new calling convention, it will be destroyed at the end of run_worker(),
          // making this an access to freed memory.
          owned_foo->Bar(run_worker(std::move(smart_foo)));
          // ^ <<< Crash expected here
        }

  - The lifetime of a local *returned* ``std::unique_ptr<>`` ends earlier.

    Spot the bug:

    .. code-block:: cpp

        std::unique_ptr<Foo> create_and_subscribe(Bar* subscriber) {
          auto foo = std::make_unique<Foo>();
          subscriber->sub([&foo] { foo->do_thing(); });
          return foo;
        }

    One could point out that this is an obvious stack-use-after-return bug.
    With the current calling convention, however, running this code with ASAN enabled would not report any "issue".
    So is this a bug in ASAN? (Spoiler: No)

    This currently "works" only because the storage for ``foo`` is in the caller's stack frame.
    In other words, ``&foo`` in the callee and ``&foo`` in the caller are the same address.

ASAN can be used to detect both of these.
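
Once such a failure is detected, the code has to stop depending on where the ``std::unique_ptr`` object itself lives. The following is a minimal sketch of one possible rewrite for each of the two runtime failures above, not a prescribed fix. The definitions of ``Foo`` and ``Bar``, the ``on_status`` member (standing in for the ``Bar(...)`` member call in the first example), the ``int`` status type (standing in for ``util::Status``), and the borrowing signature of ``run_worker`` are all hypothetical and exist only to make the sketch self-contained.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the types used in the snippets above.
struct Foo {
  int things_done = 0;
  int last_status = -1;
  void do_thing() { ++things_done; }
  void on_status(int status) { last_status = status; }
};

struct Bar {
  std::vector<std::function<void()>> callbacks;
  void sub(std::function<void()> cb) { callbacks.push_back(std::move(cb)); }
  void notify() { for (auto& cb : callbacks) cb(); }
};

// Argument lifetime: this variant borrows the Foo instead of taking
// ownership, so the caller's unique_ptr keeps the object alive until the
// caller is done with it, regardless of who destroys by-value parameters.
int run_worker(Foo& foo) {
  foo.do_thing();
  return 0;  // hypothetical "OK" status
}

int func() {
  auto smart_foo = std::make_unique<Foo>();
  smart_foo->on_status(run_worker(*smart_foo));
  return smart_foo->last_status;
}

// Returned local: capture the heap object's address by value rather than the
// local unique_ptr by reference, so the callback never refers to this
// function's stack frame. The callback must still not outlive the Foo.
std::unique_ptr<Foo> create_and_subscribe(Bar* subscriber) {
  assert(subscriber != nullptr);
  auto foo = std::make_unique<Foo>();
  Foo* raw = foo.get();
  subscriber->sub([raw] { raw->do_thing(); });
  return foo;
}

// Usage sketch: the callback runs while the returned unique_ptr is alive.
int demo_subscribe() {
  Bar bar;
  std::unique_ptr<Foo> foo = create_and_subscribe(&bar);
  bar.notify();
  return foo->things_done;
}
```

Both rewrites behave the same with and without ``trivial_abi``, because neither depends on the address or the destruction point of a ``std::unique_ptr`` parameter or return slot.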