Tail Recursion and Tail Call Optimization

From OSDev Wiki
Jump to navigationJump to search

What is a tail call?

Informally, a tail call is when a function calls another function, then returns immediately. In assembly, it's when a CALL and a RET instruction are found right next to each other. For example, in the following code:

myFunction:CALLfooCALLbarRET

CALL bar is a tail call, but CALL foo is not.

It's important that there is no operation between CALL bar and RET. Consider the following example:

intfoo(x){returnbar(x)*2;}

Even though this looks like a tail call, the caller has to modify the return value after the CALL operation. As such, it's not actually a tail call. In assembly:

CALLbarSHLEAX,1RET

However the following is a tail call:

intfoo(x){returnbar(x*2);}

In assembly:

SHLEAX,1CALLbarRET

As you can see, there is no operation between CALL and RET this time.

Optimizing a tail call

A tail call can be optimized by simply replacing CALL and RET with JMP. So looking at this example, one more time:

intfoo(x){returnbar(x*2);}

In assembly, no tail call optimization:

SHLEAX,1CALLbarRET

In assembly, with tail call optimization:

SHLEAX,1JMPbar

This has a minor speed up as you no longer have to save and restore the instruction pointer. The callee (in this case, bar) will return all the way back to the caller of this function.

It doesn't have much of an impact on a function like this, but it can have immense impact on recursive calls. A recursive call that takes 1000 recursions will have to save and restore the instruction pointer 1001 times without tail call optimization, but only once with it.

Epilogues and Prologues

In the above examples, it is assumed that functions don't have a prologue or epilogue which is highly impractical. As such a proper implementation of tail call optimization is a lot trickier than shown here. This is only a rudimentary example.