Do Comments In Code Register On The Stack?

Stack frames

A really quick explanation of stack frames and frame pointers

February 16, 2018

Understanding Frame Pointers

Each function has local memory associated with it to hold incoming parameters, local variables, and (in some cases) temporary variables. This region of memory is called a stack frame and is allocated on the process' stack. A frame pointer (the ebp register on Intel x86 architectures, rbp on 64-bit architectures) contains the base address of the function's frame. The code to access local variables within a function is generated in terms of offsets to the frame pointer. The stack pointer (the esp register on Intel x86 architectures or rsp on 64-bit architectures) may change during the execution of a function as values are pushed onto or popped off the stack (such as pushing parameters in preparation for calling another function). The frame pointer doesn't change throughout the function.

Here's what happens during a function call (there might be slight differences among languages/architectures):

Push the current value of the frame pointer (ebp/rbp). This saves it so we can restore it later.
Move the current stack pointer to the frame pointer. This defines the start of the frame.
Subtract the space needed for the function's data from the stack pointer. Remember that stacks grow from high memory to low memory. This puts the stack pointer past the space that will be used by the function so that anything pushed onto the stack now will not overwrite useful values.
Now execute the code for the function. References to local variables will be negative offsets to the frame pointer (e.g., "movl $123, -8(%rbp)").
On exit from the function, copy the value from the frame pointer to the stack pointer (this clears up the space allocated to the stack frame for the function) and pop the old frame pointer. This is accomplished by the "leave" instruction.
Return from the procedure via a "ret" instruction. This pops the return address from the stack and transfers execution to that address.

Basic example

Let's consider the following set of functions in a file called try.c

          void bar(int a, int b) {
              int x, y;

              x = 555;
              y = a+b;
          }

          void foo(void) {
              bar(111,222);
          }

We'll compile it via

          gcc -S -m32 try.c

The -S option tells the compiler to create an assembler file. The -m32 option tells the compiler to generate code for a 32-bit architecture. In this case, it keeps the numbers smaller and we don't have to worry about specifying -mno-red-zone (see more details, below).

gcc chooses to use the mov instruction (movl) instead of push here. The x86 instruction set does have push instructions that take a constant operand, but this version of gcc prefers to adjust the stack pointer once and then move the required parameters into the proper places at offsets from the stack pointer, which accomplishes the same thing.

The generated code is (removing lines that contain directives to the linker):

          bar:
              pushl   %ebp
              movl    %esp, %ebp
              subl    $16, %esp
              movl    $555, -4(%ebp)
              movl    12(%ebp), %eax
              movl    8(%ebp), %edx
              addl    %edx, %eax
              movl    %eax, -8(%ebp)
              leave
              ret
          foo:
              pushl   %ebp
              movl    %esp, %ebp
              subl    $8, %esp
              movl    $222, 4(%esp)
              movl    $111, (%esp)
              call    bar
              leave
              ret

We can annotate the code and trace it by starting at foo():

          bar:                        # --------- start of the function bar()
              pushl   %ebp            # save the incoming frame pointer
              movl    %esp, %ebp      # set the frame pointer to the current top of stack
              subl    $16, %esp       # grow the stack by 16 bytes (stacks grow down)
              movl    $555, -4(%ebp)  # x=555; x is located at [ebp-4]
              movl    12(%ebp), %eax  # 12(%ebp) is [ebp+12], which is the second parameter
              movl    8(%ebp), %edx   # 8(%ebp) is [ebp+8], which is the first parameter
              addl    %edx, %eax      # add them
              movl    %eax, -8(%ebp)  # store the result in y
              leave                   # restore esp from ebp and pop the saved frame pointer
              ret                     # pop the return address and jump to it
          foo:                        # --------- start of the function foo()
              pushl   %ebp            # save the current frame pointer
              movl    %esp, %ebp      # set the frame pointer to the current top of the stack
              subl    $8, %esp        # grow the stack by 8 bytes (stacks grow down)
              movl    $222, 4(%esp)   # this is effectively pushing 222 on the stack
              movl    $111, (%esp)    # this is effectively pushing 111 on the stack
              call    bar             # call = push the instruction pointer on the stack and branch to bar
              leave                   # done
              ret                     # return to foo's caller

Let's see what happens. In foo(), we need to prepare the stack for two parameters that will be sent to bar(). The compiler would like to do

          push $222
          push $111

but instead of pushing each value, the compiler generates code to subtract 8 from the stack pointer, making the stack grow by eight bytes (enough to hold two 32-bit values). It then uses addressing relative to the stack pointer to place the values 111 and 222 on the stack (see figure 1).

Figure 1. Before call to bar

Then foo calls bar. This pushes the return address onto the stack so it looks like this when execution starts at bar (figure 2):

Figure 2. At entry to bar

On entry to bar(), we save the previous value of ebp and set the frame pointer to the top of the stack (the current position of the stack pointer). Then we grow the stack by subtracting 16 from the stack pointer. Stacks on Intel architectures grow from high memory to low memory, so the top of the stack (the latest contents) is in low memory. The stack now looks like the one shown in figure 3. We have a stack frame for the function bar that holds local data for this instance of the function. Negative offsets from the frame pointer %ebp (toward the top of the stack, into lower memory) will refer to local data in bar. Positive offsets from %ebp will allow us to read incoming parameters.

Now we're ready to execute the trivial logic of the function. We set local variable x to 555. This variable is the very next set of four bytes after the saved ebp. The next statement adds the two parameters and stores the result into the local int y. The code for this is to read the value of b (which is [ebp+12]) and store it into register %eax. The value of a (which is [ebp+8]) is read into register %edx. The two values are added and the result is stored in y, which is [ebp-8]. Figure 4 shows the position of the parameters and local variables.

When we're done, we execute "leave", which sets the stack pointer to the value of the frame pointer (%ebp) and pops the saved value of the frame pointer (the one the function foo was using). Now the stack pointer is pointing to the return address inside foo that was saved when the call instruction was executed and our frame is effectively deallocated. The ret instruction pops the stack and transfers control back to foo right after the call bar instruction.

You might be wondering why the stack was adjusted by 16 bytes instead of the 8 that were needed to hold x and y. I don't know. That seems to be a multiple that gcc uses. If you allocate two more local ints, the frame remains the same size. If you allocate another int, the compiler grows the stack by 32 bytes.

More details about how frames are used

In 64-bit code, gcc (and other compilers) uses registers for the first few (6) parameters and these are copied into areas inside the function's frame.
As an optimization, the Intel x86-64 architecture allows functions to use space on the stack without adjusting the stack pointer if that space is <= 128 bytes. Interrupt handlers are guaranteed to not modify this region. You can search for "red zone" to read about this if you're interested. The gcc compiler can be told to ignore this via the -mno-red-zone option.
Since the compiler can keep track of what's going on with the stack at any point in time, the frame pointer isn't strictly necessary. You can compile code to use the stack pointer exclusively with the -fomit-frame-pointer option to gcc.

Exploiting buffer overflow

By exploiting a buffer overflow, you can write arbitrary data onto the stack. This means that you can change the return address of a function and also change the data past that return address: the local variables of previous functions. In a basic code injection attack, you can change the return address to the address of the buffer that you overwrote with code of your choosing. You have now injected code into the program. In a simple return-oriented-programming attack, you change the return address to the address of a library function such as system() and insert data on the stack to give system() the parameters you want (e.g., a command to execute). Note that the code illustrated above is not vulnerable to buffer overflow since we're using scalars (just ints) instead of arrays.
