Intuition Behind the x86 "lea" Instruction
During the last meeting of the North Denver C++ Meetup, some people mentioned that lea is more confusing than other instructions. lea stands for "load effective address." The usual explanation is that it puts a memory address computed from the source into the destination. The syntax of lea in Intel syntax is the following:
lea destination, source
For example, if you have an array points of struct Point:
struct Point
{
    int x;
    int y;
    int z;
};
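The compiler may generate the following line for int x = points[i].y;
mov eax, [rbx+rcx*4 + 4]
In this case, the register rbx points to the array points, and eax is the register that holds x. Because sizeof(Point) is 12 and the scale factor in an x86 address can only be 1, 2, 4, or 8, rcx here holds the index variable i already multiplied by 3, so that rcx*4 equals i * 12. The trailing + 4 is the offset of y within the struct. Similarly, for int* x = &points[i].y;, compilers can generate
lea rax, [rbx+rcx*4 + 4]
The address expression is the same, but mov reads the memory at that address while lea only computes the address itself. The destination is the 64-bit rax because the result is a pointer.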
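To make the mov/lea contrast concrete, here is a minimal C++ sketch. The function names (load_y, address_of_y, address_of_y_by_hand, check_lea_intuition) are made up for illustration, and the by-hand version assumes a flat address space where converting a pointer to std::uintptr_t yields its numeric address, which holds on mainstream x86-64 platforms but is implementation-defined in general.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Point {
    int x;
    int y;
    int z;
};

// What mov does: compute the address of points[i].y, then read memory there.
int load_y(const Point* points, std::size_t i) {
    return points[i].y;
}

// What lea does: compute the same address, but never read memory.
const int* address_of_y(const Point* points, std::size_t i) {
    return &points[i].y;
}

// Hypothetical helper: the same address spelled out as
// base + index * element size + member offset.
std::uintptr_t address_of_y_by_hand(const Point* points, std::size_t i) {
    const auto base = reinterpret_cast<std::uintptr_t>(points);
    return base + i * sizeof(Point) + offsetof(Point, y);
}

// Hypothetical self-check tying the three functions together.
void check_lea_intuition() {
    const Point points[3] = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
    assert(load_y(points, 1) == 5);
    assert(*address_of_y(points, 1) == 5);
    assert(reinterpret_cast<std::uintptr_t>(address_of_y(points, 1)) ==
           address_of_y_by_hand(points, 1));
}
```

Calling check_lea_intuition() should pass silently: the pointer returned by address_of_y and the hand-computed integer describe the same location, and only load_y actually dereferences it.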
However, besides using it for address operations, compilers seem to prefer using lea over other arithmetic instructions as well, for efficiency reasons. For example, int y = x * 5; may generate
lea eax, [rdi + 4*rdi]
instead of the more intuitive version of
imul eax, edi, 5
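The arithmetic behind this trick can be modeled in a few lines of C++. This is only a sketch: effective_address, times3, times5, and times9 are hypothetical names, and the model covers just the base + index*scale + displacement form of an x86 address, with unsigned arithmetic standing in for the wrap-around of the real computation.

```cpp
#include <cstdint>

// Hypothetical model of an x86 address expression [base + index*scale + disp].
// On real hardware, scale can only be 1, 2, 4, or 8.
constexpr std::uint64_t effective_address(std::uint64_t base, std::uint64_t index,
                                          std::uint64_t scale, std::uint64_t disp = 0) {
    return base + index * scale + disp;
}

// Models lea eax, [rdi + 4*rdi]: a 32-bit destination keeps the low 32 bits.
constexpr std::uint32_t times5(std::uint32_t x) {
    return static_cast<std::uint32_t>(effective_address(x, x, 4));
}

// The same pattern with scale 2 and scale 8 gives multiplication by 3 and 9.
constexpr std::uint32_t times3(std::uint32_t x) {
    return static_cast<std::uint32_t>(effective_address(x, x, 2));
}
constexpr std::uint32_t times9(std::uint32_t x) {
    return static_cast<std::uint32_t>(effective_address(x, x, 8));
}

static_assert(times5(7) == 35, "7 + 4*7");
static_assert(times3(7) == 21, "7 + 2*7");
static_assert(times9(7) == 63, "7 + 8*7");
// 0xFFFFFFFF is -1 as a 32-bit pattern; the result is the pattern for -5.
static_assert(times5(0xFFFFFFFFu) == 0xFFFFFFFBu, "wraps like a 32-bit multiply");
```

The static_assert lines are checked at compile time, so the file compiling at all is the test. The last one shows why truncating to 32 bits is harmless: the low 32 bits of the wide sum match what a 32-bit multiply would produce.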
lea is, in my view, pointer arithmetic sandwiched between casts. For the x * 5 example, the equivalent C code is
int y = (int)(&((int*)x)[x]);
The above code first treats x as an int pointer ((int*)x), and then takes the address of the x-th element of that pointer. That part is essentially the address [rdi + 4*rdi]. Next, it assigns the lower 32 bits of the address as an integer value to the destination.
I hope this example gives you some intuitive understanding of lea. Of course, no sane C programmer would write this kind of code by hand. The one-line cast expression is not even conforming C++, for a good reason (C++ disallows casting a pointer to a smaller integer type such as int). However, from a machine's perspective, this kind of "reinterpret_cast" is essentially a no-op, and machine languages leverage that all the time.
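For readers who want to experiment with the cast sandwich without relying on non-conforming casts, here is one possible rewrite that does every step in integer types. It is a sketch, not what any compiler emits: times5_via_fake_address is a made-up name, the code assumes C++14 or later (for local variables in a constexpr function), and the final unsigned-to-signed conversion is implementation-defined before C++20, although mainstream compilers wrap modulo 2^32.

```cpp
#include <cstdint>

// Hypothetical, integer-only rewrite of: int y = (int)(&((int*)x)[x]);
constexpr std::int32_t times5_via_fake_address(std::int32_t x) {
    // (int*)x : reinterpret the integer as a 64-bit "address".
    // Whatever ends up in the upper 32 bits cannot change the low 32 bits
    // of the final result, so zero-extending is good enough here.
    const std::uint64_t base = static_cast<std::uint32_t>(x);

    // &base[x] : advance by x elements, each sizeof(int32_t) == 4 bytes wide.
    const std::uint64_t address = base + base * sizeof(std::int32_t);

    // (int) : keep only the low 32 bits and read them as a signed integer.
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(address));
}

static_assert(times5_via_fake_address(3) == 15, "3 + 4*3");
static_assert(times5_via_fake_address(-7) == -35, "negative inputs work too");
```

Each line corresponds to one layer of the sandwich: a cast in, the scaled addition that lea performs, and a cast back out.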