NYCU-LYX

10-buffer-overflow

https://youtu.be/NEtHcWE3Ki0?si=dFJ6BSPrhXRJ8fCU&t=6450

1988	The Morris Internet Worm uses a buffer overflow exploit in “fingerd” as one of its attack mechanisms.
1995	A buffer overflow in NCSA httpd 1.3 was discovered and published on the Bugtraq mailing list by Thomas Lopatic.
1996	Aleph One published “Smashing the Stack for Fun and Profit” in Phrack magazine, giving a step by step introduction to exploiting stack-based buffer overflow vulnerabilities.
2001	The Code Red worm exploits a buffer overflow in Microsoft IIS 5.0.
2003	The Slammer worm exploits a buffer overflow in Microsoft SQL Server 2000.
2004	The Sasser worm exploits a buffer overflow in Microsoft Windows 2000/XP Local Security Authority Subsystem Service (LSASS).

10.1 Stack Overflows

Buffer Overflow Basics

A buffer overflow: also known as a buffer overrun or buffer overwrite
NISTIR 7298 (Glossary of Key Information Security Terms, May 2013)

A condition at an interface under which more input can be placed into a buffer or data holding area than the capacity allocated, overwriting other information. Attackers exploit such a condition to crash a system or to insert specially crafted code that allows them to gain control of the system.

Programming error: a process attempts to store data beyond the limits of a fixed sized buffer
- Overwrites adjacent memory locations
- Locations could hold other program variables and parameters
Buffer could be located on the stack, in the heap, or in the data section of the process

Consequences

Corruption of program data
Unexpected transfer of control
Memory access violations
Execution of code chosen by attacker

Basic Buffer Overflow Example

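#include <stdio.h>
#include <string.h>

#define FALSE 0
#define TRUE  1

void next_tag(char *tag);  /* assumed to be defined elsewhere; fills in the expected tag */

int main(int argc, char *argv[]) {
  int valid = FALSE;
  char str1[8];
  char str2[8];
  next_tag(str1);
  gets(str2);  /* unsafe: no limit on how many bytes are written into str2 */
  if (strncmp(str1, str2, 8) == 0)
    valid = TRUE;
  printf("buffer1: str1(%s), str2(%s), valid(%d)\n", str1, str2, valid);
}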

$ cc -g -o buffer1 buffer1.c
$ ./buffer1
START
buffer1: str1(START), str2(START), valid(1)
$ ./buffer1
EVILINPUTVALUE
buffer1: str1(TVALUE), str2(EVILINPUTVALUE), valid(0)
$ ./buffer1
BADINPUTBADINPUT
buffer1: str1(BADINPUT), str2(BADINPUTBADINPUT), valid(1)

Result: corruption of the variable str1

What if str1 is a password?
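
The three runs above can be reproduced without relying on undefined behavior by modeling the two buffers as adjacent fields of one struct. This is only a sketch: the struct frame_model layout (str2 directly below str1) and the name simulate_overflow are assumptions chosen to match the output shown, since a real compiler decides the actual stack layout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the stack region: str2 sits directly below str1,
 * as the sample output suggests. Real layouts depend on the compiler. */
struct frame_model {
    char str2[8];
    char str1[8];
    char spill[8];  /* stands in for whatever lies above str1 */
};

/* Stores tag in str1, then copies input (with its terminating NUL) starting
 * at str2, the way an unchecked gets() would. The copy is clamped to the
 * model so the demo itself stays in bounds.
 * Returns 1 if the first 8 bytes of str1 and str2 match, else 0. */
int simulate_overflow(struct frame_model *f, const char *tag, const char *input)
{
    size_t n = strlen(input) + 1;

    assert(strlen(tag) < sizeof f->str1);
    memset(f, 0, sizeof *f);
    strcpy(f->str1, tag);
    if (n > sizeof *f)
        n = sizeof *f;
    memcpy((char *)f, input, n);  /* spills from str2 into str1 when n > 8 */
    return strncmp(f->str1, f->str2, 8) == 0;
}
```

Calling simulate_overflow(&f, "START", "BADINPUTBADINPUT") leaves both fields holding BADINPUT, so the comparison succeeds even though the expected tag was START.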

What the Attacker Needs: Exploiting a Buffer Overflow

To identify a buffer overflow vulnerability in some program
- That can be triggered using externally sourced data under the attacker's control
To understand how that buffer will be stored in the process memory
- The potential for corrupting adjacent memory locations
- Potentially altering the flow of execution of the program

How to Identify Vulnerable Programs?

Inspecting the program source
Tracing the execution of programs as they process oversized input
Using tools to automatically identify potentially vulnerable programs
- Such as fuzzing: a software testing technique
  - Using randomly generated data as inputs to a program
  - The range of inputs can be very large
  - To test whether the program correctly handles all such abnormal inputs
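
A minimal sketch of the fuzzing idea, assuming a hypothetical function under test named store_input and a harness that places guard bytes right after the buffer. Real fuzzers are far more sophisticated; this only illustrates feeding random inputs and checking that nothing outside the buffer was touched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE   8
#define GUARD_SIZE 8
#define GUARD_BYTE 0xA5

/* Harness memory: a buffer followed by guard bytes that must never change. */
struct slot {
    char buf[BUF_SIZE];
    unsigned char guard[GUARD_SIZE];
};

/* Hypothetical function under test: stores input, truncating it to fit. */
void store_input(struct slot *s, const char *input, size_t len)
{
    assert(s != NULL && input != NULL);
    if (len >= BUF_SIZE)
        len = BUF_SIZE - 1;
    memcpy(s->buf, input, len);
    s->buf[len] = '\0';
}

/* Feeds `rounds` random inputs of random length to store_input() and
 * returns the number of rounds in which a guard byte was modified. */
int fuzz_store_input(unsigned seed, int rounds)
{
    int failures = 0;
    char input[64];

    srand(seed);
    for (int r = 0; r < rounds; r++) {
        struct slot s;
        size_t len = (size_t)rand() % sizeof input;

        for (size_t i = 0; i < len; i++)
            input[i] = (char)(rand() % 256);
        memset(s.guard, GUARD_BYTE, sizeof s.guard);
        store_input(&s, input, len);
        for (size_t i = 0; i < sizeof s.guard; i++) {
            if (s.guard[i] != GUARD_BYTE) {
                failures++;
                break;
            }
        }
    }
    return failures;
}
```

A non-zero result from fuzz_store_input would indicate that some oversized input escaped the buffer.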

Why Are Programs Not Necessarily Protected?

Basic machine level
- All the data manipulated by machine instructions: stored in either the processor’s registers or in memory
- Data’s interpretation: entirely determined by the function of the instructions
  - Can be treated as integer values, addresses of data, arrays of characters, etc.
- Responsibility on the assembly language programmer: ensuring that the correct interpretation is placed on any saved data value
Assembly language programs
- The greatest access to the resources
- But, at the highest cost and responsibility in coding effort
Modern high-level programming languages (e.g., Java, Python)
- Have a strong notion of type and valid operations
- Not vulnerable to buffer overflows
  - Do not allow more data to be saved into a buffer than it has space for
- Flexibility and safety come at a cost in resource use
  - Checks are imposed at both compile time and run time
- Distance from the underlying machine limits their usefulness for code that must access low-level resources (e.g., device drivers)
Between these two extremes: C and related languages
- Have many modern high-level control structures and data type abstractions
- But, allow direct access to low-level resources
  - Vulnerable to the buffer overflow
  - A large legacy of widely used, unsafe, and hence vulnerable code

Stack Buffer Overflows

Occur when the targeted buffer is located on the stack
- Stack: stores local variables in a function’s stack frame
Also referred to as stack smashing
First seen in the Morris Internet Worm in 1988
- An unchecked buffer overflow from the C gets() function in the fingerd daemon

Function Call Mechanisms

Stack frame: saving the following data
- Return address to the calling function
- Parameters passed to the called function
- Values of local variables
  
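  When one function calls another, it needs at least somewhere to save the return address so that the called function can return control when it finishes. All of these data are usually saved on the stack, in a structure called a stack frame.
Figure: stack frames when function P calls another function Q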

The process of function P calling another function Q can be summarized as follows. The calling function P:

1. Pushes the parameters for the called function onto the stack (typically in reverse order of declaration).
2. Executes the call instruction to call the target function, which pushes the return address onto the stack.

The called function Q:

3. Pushes the current frame pointer value (ebp on x86, which points to the calling routine’s stack frame) onto the stack.
4. Sets the frame pointer to be the current stack pointer value (i.e., the address of the old frame pointer), which now identifies the new stack frame location for the called function.
5. Allocates space for local variables by moving the stack pointer (esp on x86) down to leave sufficient room for them.
6. Runs the body of the called function.
7. As it exits, it first sets the stack pointer back to the value of the frame pointer (effectively discarding the space used by local variables).
8. Pops the old frame pointer value (restoring the link to the calling routine’s stack frame).
9. Executes the return instruction, which pops the saved address off the stack and returns control to the calling function.

Lastly, the calling function:

10. Pops the parameters for the called function off the stack.
11. Continues execution with the instruction following the function call.
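
The call and return sequence above can be traced on a toy model of the stack. This is a sketch only: struct machine, the 32-word stack, and the function names are hypothetical, and a real CPU does this with registers and machine instructions rather than C arrays.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 32

/* Toy machine state: the stack grows toward lower indices. */
struct machine {
    uint32_t stack[STACK_WORDS];
    int sp;            /* index of the top-of-stack word */
    int fp;            /* index holding the current frame's saved frame pointer */
    uint32_t result;   /* value computed by the callee */
};

static void push(struct machine *m, uint32_t v)
{
    assert(m->sp > 0);
    m->stack[--m->sp] = v;
}

static uint32_t pop(struct machine *m)
{
    assert(m->sp < STACK_WORDS);
    return m->stack[m->sp++];
}

/* Callee Q: builds its frame, adds its two parameters, tears the frame down.
 * Returns the address that control goes back to. */
static uint32_t run_callee(struct machine *m)
{
    push(m, (uint32_t)m->fp);   /* save the caller's frame pointer */
    m->fp = m->sp;              /* the new frame starts here */
    m->sp -= 1;                 /* reserve one word for a local variable */
    assert(m->sp >= 0);

    /* Body: fp+1 holds the return address, fp+2 and fp+3 the parameters. */
    m->stack[m->fp - 1] = m->stack[m->fp + 2] + m->stack[m->fp + 3];
    m->result = m->stack[m->fp - 1];

    m->sp = m->fp;              /* discard the locals */
    m->fp = (int)pop(m);        /* restore the caller's frame pointer */
    return pop(m);              /* pop the return address */
}

/* Caller P: pushes parameters and return address, calls Q, then cleans up. */
uint32_t call_q(struct machine *m, uint32_t a, uint32_t b, uint32_t return_addr)
{
    push(m, b);                 /* parameters in reverse order */
    push(m, a);
    push(m, return_addr);       /* what the call instruction does */
    uint32_t back = run_callee(m);
    pop(m);                     /* remove the parameters */
    pop(m);
    return back;
}
```

After call_q returns, sp and fp are back at their starting values, which is exactly what a buffer overflow breaks when it overwrites the saved frame pointer or return address.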

https://youtu.be/5iQkR69H_1M?feature=shared https://youtu.be/7ukTs4Bi7hI?feature=shared https://youtu.be/seo5Es4pycs?feature=shared

Stack Overflow Example

Local buffer overflow vulnerability: An exploit can overwrite the saved frame pointer and return address, leading to a stack overflow attack.
Layout of local variables: Local variables are allocated in the stack frame in order of declaration, growing downward in memory.
Process address space: A program has its own virtual address space with specific sections for code, data, heap, and stack.
Stack growth: The stack grows downward in memory, with stack frames placed one below another.

Stack Overflow Stack Values

Return address: 0x080483f0

Frame pointer value: 0xbffffbe8

More Stack Overflow Vulnerabilities

Potential for a buffer overflow: anywhere that data is copied or merged into a buffer
Occurs when the program does not check that the buffer is large enough, or that the copied data are correctly terminated
- Some of the data are read from outside the program
- Unsafe copy between functions in the same program

Example of an Unsafe Copy between Functions

Some Common Unsafe C Standard Library Routines

These routines are all suspect and should not be used without checking the total size of data being transferred
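
Routines such as gets(), strcpy(), strcat(), and sprintf() are commonly cited examples, because none of them takes the destination size. The sketch below shows bounded alternatives from the standard C library; the wrapper names read_line and join_path are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Reads one line into dst (capacity cap) using fgets() instead of gets().
 * Drops the trailing newline. Returns 0 on success, -1 on EOF or error. */
int read_line(FILE *in, char *dst, size_t cap)
{
    assert(in != NULL && dst != NULL);
    assert(cap > 0 && cap <= INT_MAX);
    if (fgets(dst, (int)cap, in) == NULL)
        return -1;
    dst[strcspn(dst, "\n")] = '\0';
    return 0;
}

/* Builds "<dir>/<name>" in dst using snprintf() instead of
 * strcpy()/strcat()/sprintf(). The output is always NUL-terminated.
 * Returns 0 on success, -1 if the result was truncated or an error occurred. */
int join_path(char *dst, size_t cap, const char *dir, const char *name)
{
    assert(dst != NULL && cap > 0);
    int n = snprintf(dst, cap, "%s/%s", dir, name);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

Both wrappers take the destination capacity explicitly, so the total size of the data being transferred is checked on every call.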

Shellcode

Code supplied by the attacker
- Often saved in the buffer being overflowed
- Traditionally transferred control to a user command-line interpreter (shell)
Simply machine code
- Specific to processor and operating system
- Traditionally needed good assembly language skills to create
Many sites and tools have been developed that automate this process
- e.g., Metasploit project https://www.metasploit.com/
  - Providing useful information to people who perform penetration testing, IDS signature development, and exploit research
  - Including an advanced open-source platform for developing, testing, and using exploit code

Example: Launching Shell on an Intel Linux System

Several requirements
- The high-level language specification of the shellcode must be compiled into equivalent machine language
- The required instructions must be included inline, rather than relying on library functions
- Position independent: cannot contain any absolute addresses
  - Only relative address references, offsets to the current instruction address
- Cannot contain any NULL bytes
  - In C, a string is always terminated with a NULL character, so copying the shellcode as a string would stop at the first NULL byte
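
The NULL-byte restriction can be checked mechanically. The sketch below uses a hypothetical helper, first_nul, to report where a string routine such as strcpy() would stop copying a byte sequence; the sample bytes in the usage note are harmless x86 NOPs (0x90), not real shellcode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the first zero byte in code[0..len), or len if
 * there is none. A string copy would stop at that index, so any bytes
 * after it would never reach the target buffer. */
size_t first_nul(const unsigned char *code, size_t len)
{
    assert(code != NULL);
    const unsigned char *p = memchr(code, 0, len);
    return p ? (size_t)(p - code) : len;
}
```

For the bytes 0x90 0x90 0x00 0x90 the function returns 2, meaning only the first two bytes would survive a string copy.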

Example of a Stack Overflow Attack

Scenario: an intruder has gained access to some system as a normal user, and wishes to exploit a buffer overflow in a trusted utility to gain greater privileges
How?
- Identify a suitable, vulnerable, trusted utility program: buffer4
- Analyze it by running the program under a debugger to determine
  - The likely location of the targeted buffer on the stack
  - How much data are needed to reach up to and overflow the old frame pointer and return address in its stack frame
- From this analysis, work out how many bytes are needed to fill the buffer and reach the saved frame pointer
Given the number of bytes needed to fill the buffer, what are the next steps?
- Allow a few more spaces at the end to provide room for the args array
- Extend the NOP sled at the start until the buffer is full
- Overwrite the return address so that it points into the NOP sled
How many bytes need to be packed into inp?
The attacker must also specify the commands to be run by the shell once the attack succeeds

Much More than this Attack Example

Exploit of a local vulnerability: enabling attacker to escalate his privileges
Some practical variances
- The buffer is likely to be larger (1024 is a common size)
- A targeted utility will likely use buffered rather than unbuffered input
  - The input library reads ahead by some amount beyond what the program has requested
  - When the execve("/bin/sh") function is called, this buffered input is discarded
Another possible target: a network daemon
- Listening for connection requests from clients
- Spawning a child process to handle that request
- Typically with the network connection mapped to its standard input/output
- May use the same type of unsafe input or buffer copy code
Attacker might want to create shellcode to perform somewhat more complex operations
Both the Metasploit project and the Packet Storm websites include many packaged shellcodes
- Set up a listening service to launch a remote shell
- Create a reverse shell that connects back to the attacker
- Use local exploits that establish a shell or execve a process
- Flush firewall rules that currently block other attacks
- Break out of a restricted execution environment, giving full access to the system

10.2 Defending Against Buffer Overflows

Two broad defense approaches
- Compile-time defense
  - Aims to harden new programs so that they resist attacks
- Run-time defense
  - Aims to detect and abort attacks in existing programs

Compile-Time Defenses

Choice of programming languages

Using a modern high-level language
Pros: not vulnerable to buffer overflow
- Having a strong notion of variable type and permissible operations
- Compilers include additional code to enforce range checks and permissible operations
Cons:
- Additional code must be executed at run time to impose checks
- Flexibility and safety come at a cost in resource use
  - Much less significant than it once was, due to the rapid increase in processor performance
- Access to some low-level instructions and hardware resources is lost

Safe coding techniques

C designers placed much more emphasis on space efficiency and performance considerations than on type safety
- Assumed programmers would exercise due care in writing code
Programmers need to inspect the code and rewrite any unsafe coding
- An example of this is the OpenBSD project
  - Programmers have audited the existing code base, including the operating system, standard libraries, and common utilities
  - This has resulted in what is widely regarded as one of the safest operating systems in widespread use
Write code not only for normal successful execution
- But with constant awareness of how things might go wrong
- Coding for graceful failure: always doing something sensible when the unexpected occurs

Figure 10.10a shows an example of an unsafe byte copy function. This code copies len bytes out of the from array into the to array starting at position pos and returning the end position. Unfortunately, this function is given no information about the actual size of the destination buffer to and hence is unable to ensure an overflow does not occur

If pos is too large, the copy overflows the to buffer. The calling code should ensure that the value of pos+len is not larger than the size of the to array.
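
One way to close the gap is to pass the destination size to the function itself. The following is a minimal sketch, assuming an added parameter to_size and the hypothetical name copy_buf_checked; it is not the code from Figure 10.10.

```c
#include <assert.h>
#include <stddef.h>

/* Bounds-checked variant of the byte copy described above.
 * to_size (the capacity of `to`) is an added parameter.
 * Copies len bytes from `from` into `to` starting at pos.
 * Returns the new end position, or -1 if the copy would not fit. */
int copy_buf_checked(char *to, size_t to_size, int pos, const char *from, int len)
{
    assert(to != NULL && from != NULL);
    if (pos < 0 || len < 0)
        return -1;
    if ((size_t)pos > to_size || (size_t)len > to_size - (size_t)pos)
        return -1;  /* pos + len would run past the end of `to` */
    for (int i = 0; i < len; i++)
        to[pos++] = from[i];
    return pos;
}
```

The check is written as len > to_size - pos rather than pos + len > to_size so that the addition cannot overflow.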

Figure 10.10b shows an example of an unsafe byte input function. It reads the length of binary data expected and then reads that number of bytes into the destination buffer. Again the problem is that this code is not given any information about the size of the buffer, and hence is unable to check for possible overflow.
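
The same repair applies to the input function: pass the buffer size and validate the declared length before reading. This is a sketch under stated assumptions (a 2-byte length in host byte order, an added to_size parameter, and the hypothetical name read_chunk_checked), not the code from the figure.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Reads a 2-byte length followed by that many bytes into `to`.
 * to_size (the capacity of `to`) is an added parameter.
 * Returns the number of bytes read, or -1 on a short read or when the
 * declared length is negative or larger than to_size. */
long read_chunk_checked(FILE *fil, char *to, size_t to_size)
{
    int16_t len;

    assert(fil != NULL && to != NULL);
    if (fread(&len, sizeof len, 1, fil) != 1)
        return -1;
    if (len < 0 || (size_t)len > to_size)
        return -1;  /* reject before any data reaches the buffer */
    if (fread(to, 1, (size_t)len, fil) != (size_t)len)
        return -1;
    return len;
}
```

Rejecting negative lengths matters as much as rejecting large ones, because a negative value converted to an unsigned size becomes a very large byte count.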

Language extensions and use of safe libraries

Handling dynamically allocated memory: more problematic
- The size information is not available at compile time
Requiring an extension and the use of library routines
- Cons
  - Generally, there is a performance penalty
  - Programs and libraries need to be recompiled with the modified compiler
  - Feasible for new OSes, but likely to have problems with third-party apps
Common Concern with C: the use of unsafe standard library routines
- Replacing these with safer variants
- A well-known example: Libsafe
  - Including additional checks to ensure that the copy operations do not extend beyond the local variable space
  - Dynamic library: does not require existing programs to be recompiled

Safe Coding Style

Google C++ Style Guide

Stack protection mechanisms

Add function entry and exit code to check the stack for signs of corruption
- Stackguard: the best-known protection mechanism, a GCC compiler extension
- Function entry code: writes a canary value below the old frame pointer address
- Function exit code: checks that the canary value has not changed
- The canary value: unpredictable and different on different systems. Why? Otherwise an attacker could simply include the known value in the overflow data and leave the canary apparently intact.
  
  Named after the miner’s canary used to detect poisonous air in a mine and thus warn the miners in time for them to escape.
- Cons
  - All programs needing protection need to be recompiled
  - The structure of the stack frame has changed: causing problems with programs, e.g., debuggers, that analyze stack frames
- Has been used to recompile an entire Linux distribution
Other variants: Stackshield and Return Address Defender (RAD)
- Also GCC extensions: including additional function entry and exit code
- Do not alter the structure of the stack frame
- Function entry code: writes a copy of the return address to a safe region of memory
- Function exit code: checks the return address in the stack frame against the saved copy
- Compatible with unmodified debuggers. Why? Because the stack frame layout is unchanged.
- Programs must be recompiled
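
The canary check can be illustrated on a model of a protected frame. This is a sketch only: struct guarded_frame and the function names are hypothetical, the canary is passed in as a parameter rather than generated randomly, and a real compiler emits the entry and exit checks itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a protected frame: the canary sits between the
 * local buffer and the saved control data. */
struct guarded_frame {
    char     buf[8];
    uint32_t canary;
    uint32_t saved_fp;
    uint32_t return_addr;
};

/* Function entry code: record the control data and write the canary. */
void frame_enter(struct guarded_frame *f, uint32_t canary,
                 uint32_t saved_fp, uint32_t return_addr)
{
    memset(f, 0, sizeof *f);
    f->canary = canary;
    f->saved_fp = saved_fp;
    f->return_addr = return_addr;
}

/* Simulated unchecked copy into buf, clamped to the model so the demo
 * itself stays in bounds. */
void unchecked_write(struct guarded_frame *f, const void *src, size_t n)
{
    if (n > sizeof *f)
        n = sizeof *f;
    memcpy((unsigned char *)f, src, n);
}

/* Function exit code: returns 1 if the canary is intact, 0 if corrupted. */
int frame_exit_ok(const struct guarded_frame *f, uint32_t canary)
{
    return f->canary == canary;
}
```

Any overflow long enough to reach return_addr must first pass through canary, so the exit check fails before the corrupted return address is ever used.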

Run-Time Defenses

Can be deployed as OS updates to provide protection for existing programs
- In contrast, compile-time approaches usually require recompilation of existing programs
- Involve changes to the memory management of the virtual address space of processes
Several approaches
- Executable Address Space Protection
  - Block the execution of code on the stack
    - Against the attacks: copying machine code into the targeted buffer and then transferring execution to it
  - Tag pages of virtual memory as being non-executable
    - Requires support from memory management unit (MMU)
    - Has long existed on SPARC systems, as used by Solaris
    - The no-execute bit is a more recent addition to the x86 family
    - Now a standard feature in recent OSes
  - Cons
    - Unable to support executable stack code
      - e.g., Java Runtime system, nested functions in C, Linux signal handlers
- Address space randomization
  - Manipulate location of key data structures
    - Stack, heap, global data
    - Using random shift for each process
    - Large address range on modern systems: wasting some has negligible impact
  - Randomize location of heap buffers
  - Random location of standard library functions
  - Example implementation:
    - OpenBSD includes all of these extensions in its security features.
- Guard pages
  - Place guard pages between critical regions of memory
    - Flagged in MMU as illegal addresses
    - Any attempted access aborts process
  - Further extension places guard pages between stack frames and heap buffers
    - Cost in execution time to support the large number of page mappings necessary
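
On POSIX systems a guard page can be set up by hand with mmap() and mprotect(). The sketch below is an illustration only, assuming a Linux-like system where MAP_ANONYMOUS is available; the names alloc_with_guard and free_with_guard are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps one usable page followed by one inaccessible guard page.
 * Returns the usable page, or NULL on failure.
 * *page_size receives the system page size on success. */
void *alloc_with_guard(size_t *page_size)
{
    assert(page_size != NULL);
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return NULL;
    size_t page = (size_t)ps;

    char *base = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Any access to the second page now faults instead of silently
     * corrupting whatever would otherwise follow the buffer. */
    if (mprotect(base + page, page, PROT_NONE) != 0) {
        munmap(base, 2 * page);
        return NULL;
    }
    *page_size = page;
    return base;
}

/* Releases both the usable page and its guard page. */
void free_with_guard(void *base, size_t page_size)
{
    munmap(base, 2 * page_size);
}
```

Writing to the last byte of the usable page succeeds, while writing one byte further would terminate the process with a memory access violation.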

10.3 Other Forms of Overflow Attacks

Replacement Stack Frame

Return to System Call

Heap Overflows

Attack buffer located in heap
- Typically located above program code
- Memory is requested by programs to use in dynamic data structures (such as linked lists of records)
No return address
- Hence no easy transfer of control
- May have function pointers to be exploited
- Or manipulate management data structures

Defense
- Making the heap non-executable
- Randomizing the allocation of memory on the heap

Global Data Area Overflows

Attack buffer located in global data
- May be located above program code
- Exploitable if a function pointer sits next to a vulnerable buffer
- Or if process management tables are adjacent
- Aim: overwrite a function pointer that is called later

Defense
- Making the global data region non-executable or placing it at a random location
- Moving function pointers or using guard pages

Other Types of Overflows

10.4 Key Terms, Review Questions, and Problems