Executable files typically contain a file header at or near the start of the
file. This header contains 'magic numbers' that identify the file type. Beyond
this header, executable files are typically divided into SECTIONS. Each section
is characterized by name, permissions (RWX), size, file offset, and virtual
address.

.text   or _TEXT   R-X   code, const globals, large literals
.rodata            R--   const globals, large literals
.data   or _DATA   RW-   initialized globals and static locals
.bss    or _BSS    RW-   uninitialized globals and static locals

* Large literals are constant values too large to be handled conveniently
with immediate addressing, such as string literals and constant structures.
* The C language requires that the BSS be zeroed before main() is called.
Because every variable in the BSS has the same initial value (zero), only
the BSS size is stored in the executable file.
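
The magic-number check mentioned at the top of this section can be sketched
roughly as follows. This is a minimal illustration that only knows the ELF
magic; the function name `identify` is made up for this example, and other
formats would need their own signatures.

```python
# Sketch: classify a file by the magic number at the start of its header.
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def identify(header: bytes) -> str:
    """Return a coarse file type guessed from the leading bytes."""
    if header[:4] == ELF_MAGIC:
        # Byte 4 (EI_CLASS) distinguishes 32-bit from 64-bit ELF.
        ei_class = header[4:5]
        classes = {b"\x01": "ELF32", b"\x02": "ELF64"}
        return classes.get(ei_class, "ELF (unknown class)")
    return "unknown"
```

On a real file this could be called as identify(open(path, "rb").read(16)).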

* a.out (DJGPP a.out, Linux a.out)
* COFF (DJGPP COFF, coff-m68k, sh-coff)
ELF (Executable and Linking Format)
- ELF files: object files, shared libraries, executables
- A loader for ELF executables should locate the program headers and load the
SEGMENTS they describe, not the sections.
- Each program header has two different size values: size-on-disk (file size)
and size-in-memory (memory size). If the memory size is greater than the file
size, the segment contains the BSS, and the extra memory should be zeroed.
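
The file-size versus memory-size rule can be illustrated with a small sketch.
It is not a real loader: `load_segment` and its parameters are names chosen
for this example, and page alignment and permissions are ignored.

```python
def load_segment(image: bytes, offset: int, filesz: int, memsz: int) -> bytearray:
    """Return the in-memory contents of one segment.

    The first `filesz` bytes come from the file; the remaining
    `memsz - filesz` bytes (the BSS part) stay zero.
    """
    if memsz < filesz:
        raise ValueError("memory size must not be smaller than file size")
    data = image[offset:offset + filesz]
    if len(data) != filesz:
        raise ValueError("segment extends past the end of the file")
    memory = bytearray(memsz)   # bytearray(n) starts out as n zero bytes
    memory[:filesz] = data      # copy the part that is stored on disk
    return memory
```

For a segment with filesz 5 and memsz 8, the last three bytes of the result
are zero without ever having been stored in the file.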

- If an ELF object has external references (library function calls, etc.), the
assembler cannot determine the actual addresses of these functions.
- In this case the assembler produces relocations. Each relocation contains:
+ An index into the symbol table.
+ An offset into the .text section, which refers to the address of the
operand of the call instruction.
+ A tag which indicates what type of relocation is actually present.
- The linker processes all relocations, finds the actual addresses and patches
them back into the operands of the call instructions.
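
The patching step can be sketched with a toy model. It assumes a
little-endian target and an x86-style PC-relative relocation whose addend was
left in the operand by the assembler (value = symbol + addend - place). The
names `Relocation`, `apply_relocations` and the tag "PC32" are hypothetical
and do not correspond to any real linker API.

```python
import struct
from typing import List, NamedTuple


class Relocation(NamedTuple):
    symbol_index: int   # index into the symbol table
    offset: int         # offset of the call operand within .text
    kind: str           # tag naming the relocation type


def apply_relocations(text: bytearray, text_addr: int,
                      symbol_addrs: List[int],
                      relocs: List[Relocation]) -> None:
    """Patch each 32-bit call operand in place (toy model of a linker pass)."""
    for rel in relocs:
        if rel.kind != "PC32":
            raise ValueError(f"unsupported relocation type: {rel.kind}")
        # The addend is whatever the assembler left in the operand field.
        (addend,) = struct.unpack_from("<i", text, rel.offset)
        place = text_addr + rel.offset   # run-time address of the operand
        value = symbol_addrs[rel.symbol_index] + addend - place
        struct.pack_into("<i", text, rel.offset, value)
```

With .text placed at 0x1000, a call instruction e8 fc ff ff ff at offset 0 and
the target symbol at 0x2000, the operand becomes 0xffb, so the call lands on
0x1005 + 0xffb = 0x2000.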

- ELF shared libraries resolve symbols and external references at run time.
* This is done with the help of the symbol table and the list of relocations
(i.e. linking is performed at run time).

- ELF shared libraries are position independent (this means that you can load
them more or less anywhere in memory, and they will work). This can be
achieved in two ways:
1. Using relocations. Even for symbols (global variables / functions)
local to the shared library, relocations are generated. This results in
an enormous number of relocations that have to be performed.

2. The library is compiled to look up all local symbols in the GOT (Global
Offset Table) and the PLT (Procedure Linkage Table).

* The GOT is a table of pointers: one pointer for each global variable and
function (functions are handled differently, see below) used in the shared
library.
+ On library load, only the GOT needs to be filled in (a single
relocation per global variable).
+ The start of the GOT is always pointed to by one of the machine registers
(%ebx on 32-bit x86).
+ Benchmarks indicate that for most normal programs the drop in
performance is less than 3% in the worst case.
+ To achieve that, all shared library sources should be compiled as
position-independent code.

* The PLT is an array of jump instructions, one for each existing function.
Thus if a particular function is called from thousands of locations within
the shared library, control will always pass through one jump instruction.
+ Actually, the PLT takes the target addresses for its jump instructions
from the GOT.
+ If an application linked with the shared library has its own instance of a
function defined in the shared library, it can set the appropriate address
in the GOT, and the code of the shared library will use this redefined
function.
+ See lazy symbol binding, below.

- By default the .plt entries are all initialized by the linker not to
point to the correct target functions, but instead to point to the
dynamic loader itself. Thus, the first time you call any given function,
the dynamic loader looks up the function and fixes up the entry in .plt.
+ Set 'LD_BIND_NOW=1' to avoid lazy binding.
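
Lazy binding can be modelled in a few lines. The sketch below is only an
analogy: a dictionary stands in for the GOT, a closure stands in for the
resolver in the dynamic loader, and the class and method names are invented
for this example.

```python
class LazyLinker:
    """Toy model of lazy binding: slots start out pointing at the resolver."""

    def __init__(self, library):
        self.library = library   # name -> function, stands in for a loaded library
        self.got = {}            # name -> current call target
        self.lookups = 0         # how many times the resolver had to run
        for name in library:
            self.got[name] = self._make_stub(name)

    def _make_stub(self, name):
        def resolve_and_call(*args):
            self.lookups += 1
            target = self.library[name]   # the "symbol lookup"
            self.got[name] = target       # patch the slot; later calls skip the stub
            return target(*args)
        return resolve_and_call

    def call(self, name, *args):
        """Plays the role of a PLT entry: one indirect jump through the table."""
        return self.got[name](*args)

    def bind_now(self):
        """Eager binding, comparable in spirit to LD_BIND_NOW=1."""
        for name, target in self.library.items():
            self.got[name] = target
```

The first call through `call` pays for the lookup; every later call for the
same name goes straight to the target.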

- Header (Elf32_Ehdr in /usr/include/linux/elf.h, readelf -h object.o)
+ class (ELF32|ELF64), type (EXEC|REL|DYN), machine (X86-64|80386)
EXEC: executable file
DYN: shared object file
+ Entry point address (for EXEC and DYN; 0 for objects)
+ start, number, and size of program headers
+ start, number, and size of section headers
+ index of the section (.shstrtab) containing section names
- table of section headers (Elf32_Shdr, readelf -S)
+ section offset in the file
+ address in virtual memory where this section should be loaded (if 0,
the section will not be loaded into virtual memory)
- table of program headers (readelf -l)
+ distillation of the section header table needed to load the appropriate
sections of the executable into virtual memory
+ Type, Offset, VirtAddr, PhysAddr, FileSiz, MemSiz, Flags, Align
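
The header fields listed above can be read with a short sketch like the one
below. It assumes a little-endian file and follows the Elf32_Ehdr / Elf64_Ehdr
field order; `ElfHeader` and `parse_elf_header` are names chosen for this
example, and only a subset of the fields is kept.

```python
import struct
from typing import NamedTuple


class ElfHeader(NamedTuple):
    elf_class: str     # "ELF32" or "ELF64"
    e_type: int        # 1 = REL, 2 = EXEC, 3 = DYN
    e_machine: int     # 3 = 80386, 62 = X86-64
    e_entry: int       # entry point address
    e_phoff: int       # file offset of the program header table
    e_shoff: int       # file offset of the section header table
    e_phentsize: int   # size of one program header
    e_phnum: int       # number of program headers
    e_shentsize: int   # size of one section header
    e_shnum: int       # number of section headers
    e_shstrndx: int    # index of the section holding section names


def parse_elf_header(data: bytes) -> ElfHeader:
    """Decode the fixed-size ELF header found at the start of `data`."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[5] != 1:
        raise ValueError("only little-endian ELF is handled in this sketch")
    if data[4] == 1:
        elf_class, fmt = "ELF32", "<HHIIIIIHHHHHH"
    elif data[4] == 2:
        elf_class, fmt = "ELF64", "<HHIQQQIHHHHHH"
    else:
        raise ValueError("unknown ELF class")
    # The fields after the 16-byte e_ident array.
    (e_type, e_machine, _version, e_entry, e_phoff, e_shoff, _flags, _ehsize,
     e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = struct.unpack_from(fmt, data, 16)
    return ElfHeader(elf_class, e_type, e_machine, e_entry, e_phoff, e_shoff,
                     e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx)
```

Reading the first 64 bytes of a file is enough for either class, for example
parse_elf_header(open(path, "rb").read(64)); the result should agree with what
readelf -h prints.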
+ objdump -d -j <section name> - Disassemble code sections
+ objdump -s -j <section name> - Show section content

- symbol table (can be partially stripped to reduce executable size)
List of all symbols defined or referenced in the file:
addresses of variables
address associated with the symbol
tag indicating the type of the symbol
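
A single symbol table entry can be decoded along these lines. The sketch
assumes a little-endian ELF64 file and the Elf64_Sym layout; the function name
and the returned dictionary keys are invented for this example, and only the
most common type and binding codes are named.

```python
import struct

SYMBOL_TYPES = {0: "NOTYPE", 1: "OBJECT", 2: "FUNC", 3: "SECTION", 4: "FILE"}
SYMBOL_BINDINGS = {0: "LOCAL", 1: "GLOBAL", 2: "WEAK"}


def parse_elf64_symbol(entry: bytes, strtab: bytes) -> dict:
    """Decode one 24-byte Elf64_Sym entry, taking its name from `strtab`."""
    st_name, st_info, _st_other, st_shndx, st_value, st_size = struct.unpack(
        "<IBBHQQ", entry[:24])
    # st_name is an offset into the string table; names are NUL-terminated.
    end = strtab.index(b"\x00", st_name)
    return {
        "name": strtab[st_name:end].decode("ascii", errors="replace"),
        "type": SYMBOL_TYPES.get(st_info & 0x0F, "OTHER"),       # low 4 bits
        "binding": SYMBOL_BINDINGS.get(st_info >> 4, "OTHER"),   # high 4 bits
        "section_index": st_shndx,
        "value": st_value,   # the address associated with the symbol
        "size": st_size,
    }
```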

.shstrtab - List of section names
.interp   - The name of the dynamic loader
.dynamic  - Contains a distillation of the section headers needed
by the dynamic loader to do its job (an optimization to save on
parsing of the actual headers)
.hash     - Hash table for navigating .dynsym
.dynsym   - Dynamic symbol table
.dynstr   - Dynamic string table (names for the .dynsym entries)
.rel*     - One or more relocation sections (readelf -r)
type (R_X86_64_GLOB_DAT | R_X86_64_JUMP_SLOT)
Symbol Value (could be 0 if not known)
addend (constant added to the symbol value when the relocation is applied)
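
The fields of one relocation entry can be pulled apart as sketched below. The
sketch assumes a little-endian ELF64 file with Elf64_Rela entries (offset,
info, addend); the function name and dictionary keys are made up for this
example, and only the two relocation types named above are given names.

```python
import struct

R_X86_64_TYPES = {6: "R_X86_64_GLOB_DAT", 7: "R_X86_64_JUMP_SLOT"}


def parse_elf64_rela(entry: bytes) -> dict:
    """Decode one 24-byte Elf64_Rela entry."""
    r_offset, r_info, r_addend = struct.unpack("<QQq", entry[:24])
    rel_type = r_info & 0xFFFFFFFF   # low 32 bits: relocation type
    return {
        "offset": r_offset,              # location to be patched
        "symbol_index": r_info >> 32,    # high 32 bits: symbol table index
        "type": R_X86_64_TYPES.get(rel_type, str(rel_type)),
        "addend": r_addend,              # signed constant added to the symbol value
    }
```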
.got      - Global Offset Table
.plt      - Procedure Linkage Table

- Locates the .text section within the executable, loads it into the
appropriate portions of virtual memory, and marks these pages as read-only.
- Locates the .data section in the executable and loads it into the user's
address space, this time in read-write memory.
- Finds the location and size of the .bss section from the image header,
and adds the appropriate pages of memory to the user's address space.
- If the application is linked to a shared library, the name of the dynamic
linker is obtained from the executable. The kernel then transfers control
to the dynamic linker, not to the application.
- The dynamic loader initializes itself, loads the shared libraries
into memory, resolves the remaining relocations and then transfers control
to the application.
shared libraries (growing up)
kernel space (fixed size)

nm - list symbols from object files
ldd - shared library dependencies
readelf <opts> <obj|lib|app>
  -S - list of sections (obj)
  --segments - list of segments (program headers) and the section-to-segment
  mapping
objdump <opts> <obj|lib|app>
  --private-headers - Print ELF headers
strings - Prints all human-readable strings from an object file
pmap <pid> - memory map of an application
strace - traces the system calls an application executes
ltrace - traces library function calls (the library-call counterpart of strace)
ipcs - report interprocess communication facilities status