Part I explained the basic features of ptrace and walked through a small example. As I said there, the main application of ptrace is accessing the memory or registers of a running process, whether for debugging or for some evil purpose. To do that we first need a basic idea of the binary format of executables; only then do we know how and where to access them. So I shall give you a small tutorial on ELF, the binary format used on Linux. The last section of this article presents a small program which accesses the registers and memory of another process and, by injecting some extra code, changes the output of that process.
Note: Please don't get confused. This is definitely an article about ptrace, not about ELF. But a basic knowledge of ELF is required for accessing the core images of processes, so it has to be explained first.
1. What is ELF?
ELF stands for Executable and Linking Format. It defines the format of executable binaries on Linux, and also of relocatable, shared object and core dump files. ELF is used by both linkers and loaders. They view the file from two different sides, so the format has to serve as a common interface for both.
An ELF file is organized into sections and segments. Relocatable files have section header tables, executable files have program header tables, and shared object files have both. In the coming sections I shall explain what these headers are.
2. ELF Headers
Every ELF file has an ELF header, which always starts at offset 0 in the file. It holds the details of the binary file: whether it should be interpreted, which data structures are related to the file, and so on.
The format of the header is given below (taken from /usr/src/include/linux/elf.h)
A short description of the fields follows.
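To make the header lookup concrete, here is a minimal sketch. It assumes the `Elf32_Ehdr` structure and the constants from glibc's user-space `<elf.h>`, which I take to mirror the kernel header quoted above, and it assumes the file uses the same byte order as the machine running the code. The function names `read_elf32_header` and `elf_type_name` are invented for this example.

```c
#include <assert.h>
#include <elf.h>      /* user-space copy of the ELF definitions (assumed) */
#include <stddef.h>
#include <string.h>

/* Copy the ELF header out of a buffer holding the first bytes of a file.
 * Returns 0 on success, -1 if the buffer is too short, lacks the ELF
 * magic number, or does not describe a 32-bit ELF image. */
int read_elf32_header(const unsigned char *buf, size_t len, Elf32_Ehdr *out)
{
    assert(buf != NULL && out != NULL);
    if (len < sizeof(Elf32_Ehdr))
        return -1;
    if (memcmp(buf, ELFMAG, SELFMAG) != 0)   /* magic is "\177ELF" */
        return -1;
    if (buf[EI_CLASS] != ELFCLASS32)
        return -1;
    memcpy(out, buf, sizeof(*out));          /* the header sits at offset 0 */
    return 0;
}

/* Translate e_type into the file kinds named in section 1. */
const char *elf_type_name(unsigned type)
{
    switch (type) {
    case ET_REL:  return "relocatable";
    case ET_EXEC: return "executable";
    case ET_DYN:  return "shared object";
    case ET_CORE: return "core dump";
    default:      return "unknown";
    }
}
```

To try it, read the first `sizeof(Elf32_Ehdr)` bytes of a binary into a buffer and pass them to `read_elf32_header`; the `e_entry` field of the result is the address at which execution starts.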
3. Sections And Segments
As said above, linkers treat the file as a set of logical sections described by a section header table, and loaders treat the file as a set of segments described by a program header table. The following subsections give details on sections and on segments/program headers.
3.1 ELF Sections and Section Headers
The binary file is viewed as a collection of sections. Each section is an array of bytes, and no byte belongs to more than one section. Although the file carries helper information for interpreting the contents of a section correctly, an application may interpret them in its own way.
There is a section header table, which is an array of section headers. The zeroth entry of the table is always NULL and describes no part of the binary. Each section header has the following format: (taken from /usr/src/include/linux/elf.h)
Now the fields in detail.
The remaining fields are self-explanatory.
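As an illustration of how the section header table is walked, here is a small sketch. It assumes the `Elf32_Ehdr` and `Elf32_Shdr` types from glibc's `<elf.h>`, a file image that has already been read into memory, and matching byte order. The helper names `get_section_header` and `section_name` are hypothetical. Section names are looked up in the string table whose index is stored in the header field `e_shstrndx`.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Copy section header number idx out of an in-memory file image.
 * Returns 0 on success, -1 if idx or the table lies outside the image. */
int get_section_header(const unsigned char *image, size_t len,
                       unsigned idx, Elf32_Shdr *out)
{
    Elf32_Ehdr eh;

    assert(image != NULL && out != NULL);
    if (len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);
    if (idx >= eh.e_shnum || eh.e_shentsize < sizeof *out)
        return -1;
    size_t off = (size_t)eh.e_shoff + (size_t)idx * eh.e_shentsize;
    if (off > len || len - off < sizeof *out)
        return -1;
    memcpy(out, image + off, sizeof *out);
    return 0;
}

/* Return the name of a section, or NULL if it cannot be resolved.
 * sh_name is an offset into the section-name string table. */
const char *section_name(const unsigned char *image, size_t len,
                         const Elf32_Shdr *sh)
{
    Elf32_Ehdr eh;
    Elf32_Shdr strtab;

    assert(image != NULL && sh != NULL);
    if (len < sizeof eh)
        return NULL;
    memcpy(&eh, image, sizeof eh);
    if (get_section_header(image, len, eh.e_shstrndx, &strtab) != 0)
        return NULL;
    /* The string table itself must lie inside the image. */
    if (strtab.sh_offset > len || strtab.sh_size > len - strtab.sh_offset)
        return NULL;
    if (sh->sh_name >= strtab.sh_size)
        return NULL;
    const unsigned char *name = image + strtab.sh_offset + sh->sh_name;
    /* Accept the name only if it is NUL-terminated within the table. */
    if (memchr(name, '\0', strtab.sh_size - sh->sh_name) == NULL)
        return NULL;
    return (const char *)name;
}
```

Calling `get_section_header` with index 0 returns the empty entry mentioned above, whose type is `SHT_NULL`.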
3.2 ELF Segments And Program Headers
The ELF segments are used during loading, i.e., when the image of the process is built in memory. Each segment is described by a program header. The file contains a program header table (usually near the ELF header), which is an array of program headers. The format of the program header is as follows.
The remaining fields are self-explanatory.
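The loader's view can be sketched in the same way. The code below assumes the `Elf32_Phdr` type and the `PT_LOAD` constant from glibc's `<elf.h>`, an in-memory file image, and matching byte order. The names `get_program_header` and `find_load_segment` are made up for the example. It finds the loadable segment whose memory range covers a given virtual address.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Copy program header number idx out of an in-memory file image.
 * Returns 0 on success, -1 if idx or the table lies outside the image. */
int get_program_header(const unsigned char *image, size_t len,
                       unsigned idx, Elf32_Phdr *out)
{
    Elf32_Ehdr eh;

    assert(image != NULL && out != NULL);
    if (len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);
    if (idx >= eh.e_phnum || eh.e_phentsize < sizeof *out)
        return -1;
    size_t off = (size_t)eh.e_phoff + (size_t)idx * eh.e_phentsize;
    if (off > len || len - off < sizeof *out)
        return -1;
    memcpy(out, image + off, sizeof *out);
    return 0;
}

/* Find the loadable segment whose memory range covers vaddr.
 * Returns its index in the program header table, or -1 if none does. */
int find_load_segment(const unsigned char *image, size_t len,
                      Elf32_Addr vaddr)
{
    Elf32_Phdr ph;

    for (unsigned i = 0; get_program_header(image, len, i, &ph) == 0; i++) {
        if (ph.p_type == PT_LOAD &&
            vaddr >= ph.p_vaddr && vaddr - ph.p_vaddr < ph.p_memsz)
            return (int)i;
    }
    return -1;
}
```

Once the segment is known, the file offset of an address inside it is `p_offset + (vaddr - p_vaddr)`, as long as the address falls within the first `p_filesz` bytes of the segment.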
4. Loading The ELF File
We have got some idea about the structure of ELF object files. Now we have to know how and where these files are loaded for execution. Usually we just type the program name at the shell prompt. In fact a lot of interesting things happen after the return key is hit.
First the shell calls a standard libc function of the exec family, which in turn calls the kernel routine. Now the ball is in the kernel's court. The kernel opens the file and determines the type/format of the executable. It then loads the ELF image and the needed libraries, initializes the program's stack, and finally passes control to the program code.
On 32-bit x86 Linux the program is typically loaded at 0x08048000 (you can see this in /proc/pid/maps), and the stack starts at 0xBFFFFFFF and grows towards numerically smaller addresses.
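The mappings of a process can be inspected from a program as well. The following is a minimal sketch that assumes the usual layout of a /proc/pid/maps line, which begins with `start-end perms` where both addresses are in hexadecimal. The function names `parse_maps_line` and `print_maps` are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Extract the address range and permission string from one line of
 * /proc/<pid>/maps. Returns 0 on success, -1 if the line does not match. */
int parse_maps_line(const char *line, unsigned long *start,
                    unsigned long *end, char perms[5])
{
    assert(line != NULL && start != NULL && end != NULL && perms != NULL);
    return sscanf(line, "%lx-%lx %4s", start, end, perms) == 3 ? 0 : -1;
}

/* Print the range and permissions of every mapping of process pid.
 * Returns the number of mappings printed, or -1 if the file cannot
 * be opened. */
int print_maps(pid_t pid)
{
    char path[64], line[512], perms[5];
    unsigned long start, end;
    int count = 0;

    snprintf(path, sizeof path, "/proc/%ld/maps", (long)pid);
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_maps_line(line, &start, &end, perms) == 0) {
            printf("%08lx-%08lx %s\n", start, end, perms);
            count++;
        }
    }
    fclose(fp);
    return count;
}
```

Calling `print_maps(getpid())` (with `<unistd.h>` included) lists the mappings of the calling process itself; the mapping whose path ends in `[stack]` is the stack.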
5. Code Injection
We have seen how programs are loaded into memory. So when we are given a process and its memory layout is known, we can trace it (if we have permission) and access the private data structures of the process. This is very easy to say, but not that easy to do. Why not give it a try?
First of all, let's write a program to access the registers of another program and modify them. Here we use the following values of
Now we are going to inject a small piece of our own code into the image of the process being traced, and force the process to execute that code by changing its instruction pointer.
What we do is very simple. First we attach to the process and read its register contents. Next we insert the code we want executed at some location on the stack and change the instruction pointer of the process to point to that location. Finally we detach from the process. The process then resumes and executes the injected code.
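The steps above can be sketched roughly as follows. This is not the listing that accompanies the article, only an outline under several assumptions: an x86 or x86-64 Linux system with glibc's `<sys/ptrace.h>` and `<sys/user.h>`, a target address 1024 bytes below the stack pointer (an arbitrary choice), and padding with the x86 NOP byte 0x90. The names `words_for`, `pack_word` and `inject_code` are hypothetical. Note that current kernels usually map the stack as non-executable, in which case the traced process faults instead of running the injected bytes.

```c
#include <assert.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>

/* PTRACE_POKEDATA writes one machine word at a time, so count the
 * words needed to hold len bytes. */
size_t words_for(size_t len)
{
    return (len + sizeof(long) - 1) / sizeof(long);
}

/* Pack up to one word of code, padding the tail with NOPs (0x90). */
long pack_word(const unsigned char *code, size_t len)
{
    unsigned char bytes[sizeof(long)];
    long word;

    memset(bytes, 0x90, sizeof bytes);
    memcpy(bytes, code, len < sizeof bytes ? len : sizeof bytes);
    memcpy(&word, bytes, sizeof word);
    return word;
}

#if defined(__i386__) || defined(__x86_64__)
/* Attach, copy code onto the tracee's stack, point the instruction
 * pointer at it, and detach. Returns 0 on success, -1 on failure. */
int inject_code(pid_t pid, const unsigned char *code, size_t len)
{
    struct user_regs_struct regs;
    unsigned long target;
    int status;

    assert(code != NULL && len > 0);
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) == -1)      /* wait for the tracee to stop */
        goto fail;
    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
        goto fail;
#if defined(__i386__)
    target = (unsigned long)regs.esp;
#else
    target = (unsigned long)regs.rsp;
#endif
    /* Assumed location: 1024 bytes below the stack pointer, word aligned. */
    target = (target - 1024) & ~(unsigned long)(sizeof(long) - 1);
    for (size_t i = 0; i < words_for(len); i++) {
        size_t off = i * sizeof(long);
        long word = pack_word(code + off, len - off);
        if (ptrace(PTRACE_POKEDATA, pid, (void *)(target + off),
                   (void *)word) == -1)
            goto fail;
    }
#if defined(__i386__)
    regs.eip = (long)target;
#else
    regs.rip = target;
#endif
    if (ptrace(PTRACE_SETREGS, pid, NULL, &regs) == -1)
        goto fail;
    return ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1 ? -1 : 0;
fail:
    ptrace(PTRACE_DETACH, pid, NULL, NULL);
    return -1;
}
#endif
```

A real tracer would also have to consider that the process may have been stopped inside a system call, since the kernel may then adjust the instruction pointer when the process resumes.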
We have two source files: one is the assembly code to be injected, and the other is the program which traces the process. I shall also provide a small program which we can trace.
The source files
Now compile the files.
Go to another console and run the sample program by typing
Come back and execute the tracer to catch the looping process and change its output. Type
Now go to where the sample program 'loop' runs and watch what happens. Definitely your play with ptrace has begun.
6. Looking Forward
In the first part we traced a process and counted the number of instructions it executed. In this part we studied the ELF file structure and injected a small piece of code into a process. In the next part I expect to access the memory space of a process. Till then, bye from Sandeep S.
Published in Issue 83 of Linux Gazette, October 2002