Unless you're using a really old compiler, or working really hard at turning off all possible optimization, returning the value will normally be at least as efficient, and sometimes (often?) more efficient.
C++ has allowed what are called Return Value Optimization (RVO) and Named Return Value Optimization (NRVO) since it was first standardized in 1998 (and quite a while before, though what was or wasn't allowed was a bit more nebulous before the standard).
RVO/NRVO say that the compiler may elide the copy involved in returning a value like this, even if the copy constructor has observable side effects; in that case, those side effects simply never happen. That may not sound like much, but the intent (and the actual result) is that when returning a value would otherwise require a copy construction, that copy is almost always optimized away. Instead, the compiler creates the returned object in the place where the caller will see it, passes a reference to that object to the function as a hidden parameter, and the function constructs and (if necessary) manipulates the object through that reference.
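To make that "hidden parameter" idea a bit more concrete, here's a rough sketch in plain C++ of the transformation the compiler effectively performs. The names make_spaces and make_spaces_impl are purely illustrative; the compiler doesn't literally generate C++ like this, but the generated machine code follows the same shape:

    #include <memory>
    #include <new>
    #include <string>

    // What you write:
    std::string make_spaces(int n) {
        return std::string(n, ' ');
    }

    // Roughly what the compiler turns it into under RVO: the caller
    // supplies storage for the result, and the function constructs
    // the string directly in that storage.
    void make_spaces_impl(std::string *result, int n) {
        ::new (result) std::string(n, ' ');
    }

    void caller() {
        std::string s = make_spaces(10);   // what you write

        // Roughly what happens: no std::string is copied or moved;
        // the caller just passes the address of the result slot.
        alignas(std::string) unsigned char slot[sizeof(std::string)];
        auto *p = reinterpret_cast<std::string *>(slot);
        make_spaces_impl(p, 10);
        std::destroy_at(p);                // caller also owns destruction
    }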
So, let's put that to the test by compiling two short functions and looking at the code they produce:
    #include <string>

    std::string encode(int i) {
        return std::string(i, ' ');
    }

    void encode(int i, std::string &s) {
        s = std::string(i, ' ');
    }
The first produces this code:
    encode[abi:cxx11](int): # @encode[abi:cxx11](int)
        push rbx
        mov rbx, rdi
        movsxd rsi, esi
        lea rax, [rdi + 16]
        mov qword ptr [rdi], rax
        mov edx, 32
        call std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct(unsigned long, char)
        mov rax, rbx
        pop rbx
        ret
This was compiled with Clang, but gcc produces nearly identical code. MSVC produces slightly different code, but the three have one major characteristic in common: returning the string doesn't involve copying with any of them.
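If you want to see the "side effects may simply never happen" part directly, here's a small test (my own example, not from the question) with a copy constructor that prints something. With RVO in effect (and since C++17 this particular elision is actually mandatory, because the return expression is a prvalue), the "copied" message never appears:

    #include <iostream>

    struct Noisy {
        Noisy() { std::cout << "constructed\n"; }
        Noisy(const Noisy &) { std::cout << "copied\n"; }
    };

    Noisy make() {
        return Noisy();        // temporary returned by value
    }

    int main() {
        Noisy n = make();      // prints "constructed" only
    }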
Here's the code from the second version (this time compiled with gcc, but again, Clang is nearly identical, and MSVC fairly similar as well):
    encode(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&): # @encode(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)
        push r15
        push r14
        push rbx
        sub rsp, 32
        mov rbx, rsi
        movsxd rsi, edi
        lea r15, [rsp + 16]
        mov qword ptr [rsp], r15
        mov r14, rsp
        mov rdi, r14
        mov edx, 32
        call std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct(unsigned long, char)
        mov rsi, qword ptr [rsp]
        cmp rsi, r15
        je .LBB1_1
        lea rdx, [rbx + 16]
        mov rdi, qword ptr [rbx]
        mov rcx, qword ptr [rbx + 16]
        xor eax, eax
        cmp rdi, rdx
        cmovne rax, rdi
        mov qword ptr [rbx], rsi
        movups xmm0, xmmword ptr [rsp + 8]
        movups xmmword ptr [rbx + 8], xmm0
        test rax, rax
        je .LBB1_10
        mov qword ptr [rsp], rax
        mov qword ptr [rsp + 16], rcx
        jmp .LBB1_11
    .LBB1_1:
        cmp r14, rbx
        je .LBB1_2
        mov rdx, qword ptr [rsp + 8]
        test rdx, rdx
        je .LBB1_7
        mov rdi, qword ptr [rbx]
        cmp rdx, 1
        jne .LBB1_6
        mov al, byte ptr [rsi]
        mov byte ptr [rdi], al
        jmp .LBB1_7
    .LBB1_10:
        mov qword ptr [rsp], r15
        mov rax, r15
        jmp .LBB1_11
    .LBB1_6:
        call memcpy
    .LBB1_7:
        mov rax, qword ptr [rsp + 8]
        mov qword ptr [rbx + 8], rax
        mov rcx, qword ptr [rbx]
        mov byte ptr [rcx + rax], 0
        mov rax, qword ptr [rsp]
    .LBB1_11:
        mov qword ptr [rsp + 8], 0
        mov byte ptr [rax], 0
        mov rdi, qword ptr [rsp]
        cmp rdi, r15
        je .LBB1_13
        call operator delete(void*)
    .LBB1_13:
        add rsp, 32
        pop rbx
        pop r14
        pop r15
        ret
    .LBB1_2:
        mov rax, rsi
        jmp .LBB1_11
This doesn't do any copying either, but as you can see, it is just a tad longer and more complex...
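It's also worth looking at the two versions from the caller's side (hypothetical calling code, not part of the question). The output-parameter version forces a default construction followed by an assignment into an already-existing string, and dealing with that existing string (its buffer may need to be reused, copied into, or freed) is exactly where most of the extra code above comes from:

    #include <string>

    std::string encode(int i);             // return by value
    void encode(int i, std::string &s);    // output parameter

    void use_both() {
        std::string s1 = encode(10);   // result constructed directly into s1

        std::string s2;                // default-construct first...
        encode(10, s2);                // ...then assign into the existing object
    }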
Here's a link to the code on Godbolt in case you want to play with different compilers, optimization flags, etc.: https://godbolt.org/z/vGc6Wx
As for using a char* instead: it makes no difference.