From: John Reiser <jreiser@BitWagon.com>
Date: Fri, 06 Jul 2001 14:05:40 -0700
To: Sebastian
Subject: Re: ELF in-memory problems
Message-ID: <3B462824.61030F40@BitWagon.com>
References: <20010706203002.A3717@nb.in-berlin.de>

Hello Sebastian,

I got your note; here are some comments.

> Since I need static and work-data in my loader segment (mapped at
> 0x05371000), I have two choices to make it writeable. The first is to put
> read + write + exec flags in the PT_LOAD header already. This was my first
> decision, but the kernel did not like it and behaved very weird, for example
> brk() could not shrink, only grow. So I sticked to the common layout of
> having PF_R + PF_X for the first and PF_R + PF_W for the second segment.
> Then brk() worked as expected.

My understanding is that in execve() the kernel sets brk(0) as the
largest (unsigned) value of (p_vaddr + p_memsz) over all PT_LOAD.
Then there is a special check for subsequently setting brk(x) less than
the initial value, which in the usual case prevents a program from
"committing suicide" by unmapping the initial contents.  This special
check is the reason why brk(0) works at all: if interpreted literally,
then brk(0) would mean "discard all memory at addresses > 0", which
would mean discarding the whole address space.

[By the way, in my opinion brk() is a historical relic that should
disappear.  I want to map _several_ ET_EXEC files into my address space
(each at a different address, of course).  I also want a binary
interface to /proc/self/maps, much like WinNT's VirtualQuery(), so that
I can build a user-mode, self-aware, page manager so that malloc(),
mmap(), dlopen() can co-operate in managing the address space.]

> * [004000de] old_mmap(0x8048000, 24151, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, 4211752, 0) = 0x8048000
> * [004000de] old_mmap(0x804e000, 4107, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, 4211752, 0x5000) = 0x804e000
>
> The lines marked with a '*' are buggy I think, you pass a pointer as a
> filedescriptor. The kernel ignores it though, but it looks very strange ;)

MAP_ANONYMOUS in the flags takes precedence over fd.  do_xmap() takes
advantage of this (and sizeof(int)==sizeof(void *)) to reduce code size
by merging two similar-but-different cases.

> Here are my theory why my code generates segfaults, ...

Can you observe the SIGSEGV using gdb?  The values in the pc, registers,
instruction being executed, and /proc/<pid>/maps at the time of SIGSEGV:
   (gdb) x/i $pc
   (gdb) info reg
   (gdb) bt
   (gdb) shell
   $ ps
     . . .
   $ cat /proc/<pid>/maps
     . . .
   $ exit
   (gdb)
usually help a lot to figure out what is wrong.  Please supply these
values if you write about this problem again.

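When an interactive gdb session is not convenient, the maps at the time
of SIGSEGV can also be captured from inside the process.  The following
is only a sketch, assuming a Linux /proc filesystem; the names
dump_self_maps, on_segv and install_segv_dump are made up for
illustration and are not part of upx or of your loader.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Copy /proc/self/maps to out_fd using only async-signal-safe calls
 * (open, read, write, close).  Returns bytes copied, or -1 on error. */
long dump_self_maps(int out_fd)
{
    char buf[4096];
    long total = 0;
    ssize_t n;
    int fd = open("/proc/self/maps", O_RDONLY);

    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {                 /* handle short writes */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(fd);
                return -1;
            }
            off += w;
        }
        total += n;
    }
    close(fd);
    return n < 0 ? -1 : total;
}

/* Hypothetical SIGSEGV handler: dump the address map to stderr. */
void on_segv(int sig, siginfo_t *si, void *ctx)
{
    static const char msg[] = "SIGSEGV; /proc/self/maps follows:\n";

    (void)sig; (void)si; (void)ctx;
    (void)write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)dump_self_maps(STDERR_FILENO);
}

/* Install the handler once.  SA_RESETHAND restores the default action
 * before the handler runs, so a second fault kills the process. */
int install_segv_dump(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO | SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Call install_segv_dump() early in main().  When the handler returns, the
faulting instruction re-executes under the default action, so the
process still dies with SIGSEGV at the original pc and gdb or a core
file sees the same state.
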
If necessary, then use several
   __asm__("int3");
in your C code to get close to the error, then watch using 'stepi'.
I use a macro
   (gdb) define g
   stepi
   x/i $pc
   end
to do this.

> ... why
> don't you just overwrite the ones already in the array, but append it to the
> array (so in your code there are actually two vectors for AT_PHDR for
> example).

I wasn't aware that there are two AT_PHDR; I'll have to check into this.
Look at /usr/src/linux/fs/binfmt_elf.c, function do_load_elf_binary():
        create_elf_tables((char *)bprm->p,
                        bprm->argc,
                        bprm->envc,
                        (interpreter_type == INTERPRETER_ELF ? &elf_ex : NULL),
                        load_addr, load_bias,
                        interp_load_addr,
                        (interpreter_type == INTERPRETER_AOUT ? 0 : 1));
then create_elf_tables():
        if (exec) {
                sp -= 11*2;

                NEW_AUX_ENT(0, AT_PHDR, load_addr + exec->e_phoff);
                NEW_AUX_ENT(1, AT_PHENT, sizeof (struct elf_phdr));
                  . . .
where 'exec' is the 4th argument.  So, if there is no INTERPRETER_ELF,
then the kernel supplies only AT_PLATFORM and AT_HWCAP; no AT_PHDR etc.

AHA! this is a clue.  The output from upx has no PT_INTERP for
/lib/ld-linux.so.2; instead, it uses whatever the user's Elf32_Phdr
specifies, and only later after decompression, not at kernel execve()
time.  If your program specifies a PT_INTERP to the kernel, then the
kernel maps it and runs it _first_, before any of the instructions in
the a.elf.  The PT_INTERP maps any shared libraries, then jumps to your
e_entry.  Your code then re-maps /lib/ld-linux.so.2, re-initializing
its variables.

Notice that upx/src/stub/l_lx_elf.c makes two assignments to 'entry':
        entry = do_xmap((int)f_decompress, ehdr, &xi, av);
        entry = do_xmap(fdi, ehdr, 0, 0);
The first is the user entry, the second is the entry to PT_INTERP,
if any.  Thus the entry to PT_INTERP supersedes the user entry, as far
as the upx decompressor is concerned.  The PT_INTERP later jumps to
user e_entry.

Look at upx/src/stub/*.lds to see how to avoid PT_INTERP.

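To see from inside a running process which AT_* values the kernel
actually supplied, and whether the program headers it points to contain
PT_INTERP, something like the following can be used.  This is a sketch
under assumptions: it relies on getauxval() from <sys/auxv.h>, which
needs a C library that provides it, and the function names
has_pt_interp and print_aux are hypothetical.

```c
#include <elf.h>
#include <link.h>       /* ElfW() selects Elf32_ or Elf64_ types */
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval() */

/* Walk the program headers named by AT_PHDR/AT_PHENT/AT_PHNUM.
 * Returns 1 if one of them is PT_INTERP, 0 if none is, and -1 if the
 * kernel supplied no AT_PHDR entry at all. */
int has_pt_interp(void)
{
    const char *base = (const char *)getauxval(AT_PHDR);
    unsigned long ent = getauxval(AT_PHENT);
    unsigned long num = getauxval(AT_PHNUM);
    unsigned long i;

    if (base == NULL || ent == 0)
        return -1;
    for (i = 0; i < num; i++) {
        const ElfW(Phdr) *ph = (const ElfW(Phdr) *)(base + i * ent);
        if (ph->p_type == PT_INTERP)
            return 1;
    }
    return 0;
}

/* Print the auxiliary-vector entries discussed above.
 * getauxval() returns 0 for an entry that is not present. */
void print_aux(void)
{
    printf("AT_PHDR  = %#lx\n", getauxval(AT_PHDR));
    printf("AT_PHENT = %lu\n",  getauxval(AT_PHENT));
    printf("AT_PHNUM = %lu\n",  getauxval(AT_PHNUM));
    printf("AT_BASE  = %#lx\n", getauxval(AT_BASE));
    printf("AT_ENTRY = %#lx\n", getauxval(AT_ENTRY));
    printf("PT_INTERP present: %d\n", has_pt_interp());
}
```

Calling print_aux() from main() of the program under test shows what
the kernel passed at execve() time; a nonzero AT_BASE together with a
PT_INTERP entry indicates that the kernel mapped an interpreter itself.
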
Run readelf --program-headers and see if your program has PT_INTERP
requesting /lib/ld-linux.so.2.  'readelf' is in GNU binutils, and some
Linux distributions (RedHat, for example):
-----
$ readelf --program-headers /bin/date

Elf file type is EXEC (Executable file)
Entry point 0x8048cd0
There are 6 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x000c0 0x000c0 R E 0x4
  INTERP         0x0000f4 0x080480f4 0x080480f4 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
   . . .
-----

> - You ommit any static/non-relocateable data from your code and copy
> your initialization code to 0x00400000, instead of making your own
> segment writeable, why?

The kernel maps the decompressor into memory at 0x00410000 (64KB up from
4MB), the code decompresses itself into memory at 0x00400000 (4MB), the
user's code gets regenerated into memory at 0x08048000 [or wherever].
The reason for no .data is to guarantee no absolute addresses in code,
which makes it easy to execute the code at an address different from
where it was linked: just move it, and no other adjustments are
required.  Being able to move the code makes it easier to uncompress
the compressed code of the [rest of the] decompressor.

Best wishes,

--
John Reiser, jreiser@BitWagon.com