Tail Recursion and Tail Call Optimization
What is a tail call?
Informally, a tail call is when a function calls another function, then returns immediately. In assembly, it's when a CALL
and a RET
instruction are found right next to each other. For example, in the following code:
myFunction:CALLfooCALLbarRET
CALL bar
is a tail call, but CALL foo
is not.
It's important that there is no operation between CALL bar
and RET
. Consider the following example:
intfoo(x){returnbar(x)*2;}
Even though this looks like a tail call, the caller has to modify the return value after the CALL
operation. As such, it's not actually a tail call. In assembly:
CALLbarSHLEAX,1RET
However the following is a tail call:
intfoo(x){returnbar(x*2);}
In assembly:
SHLEAX,1CALLbarRET
As you can see, there is no operation between CALL
and RET
this time.
Optimizing a tail call
A tail call can be optimized by simply replacing CALL
and RET
with JMP
. So looking at this example, one more time:
intfoo(x){returnbar(x*2);}
In assembly, no tail call optimization:
SHLEAX,1CALLbarRET
In assembly, with tail call optimization:
SHLEAX,1JMPbar
This has a minor speed up as you no longer have to save and restore the instruction pointer. The callee (in this case, bar
) will return all the way back to the caller of this function.
It doesn't have much of an impact on a function like this, but it can have immense impact on recursive calls. A recursive call that takes 1000 recursions will have to save and restore the instruction pointer 1001 times without tail call optimization, but only once with it.
Epilogues and Prologues
In the above examples, it is assumed that functions don't have a prologue or epilogue which is highly impractical. As such a proper implementation of tail call optimization is a lot trickier than shown here. This is only a rudimentary example.