Windows Kernel Vulnerability Research and Exploitation

Date: 08.10.2017

Gilad Bakas

Presentation Overview

Why Kernel?
What’s Different?
Technical Background
Vulnerability Research
Common and less common kernel bugs
Exploit Development
Examples: Use-after-free, DRM
Tips & Tricks
AFD.SYS: A simple kernel bug
Win32k.sys: A complex kernel bug
Windows 8 and the future
Questions

Why Kernel?

Used to be much harder
With DEP, ASLR, UAC, heap checks, Protected Mode, sandboxes, etc. now protecting User Mode, Kernel exploitation is on par and sometimes even easier
In parallel to the securing of User Mode, a lot of OS functionality was moved from User to Kernel, and new User-to-Kernel interfaces were introduced, thus drastically increasing the attack surface in the Kernel

Why Kernel Cont’d

On 64-bit systems, the Driver Signing Requirements prevent even an Administrator from running unsigned Kernel code, making exploitation the only alternative.
Many times, the payload uses a driver anyway, so it’s easier to just start from the Kernel

This is already happening

Quoted from a November 4 article by Gregg Keizer in Computerworld:
“Microsoft has been extremely busy patching pieces of the Windows kernel this year.
So far during 2011, Microsoft has patched 56 different kernel vulnerabilities with updates issued in February, April, June, July, August and October. In April alone, the company fixed 30 bugs, then quashed 15 more in July.”

What’s different?

If something goes wrong, it goes REALLY wrong. That means that even the smallest glitch leads to a BSOD and a system reboot.
No need to worry about permissions 
You have to master a lot more technical knowledge.
No process boundaries. This means that you have a lot more to play with, but also a lot more to mess with

Required Technical Background: things you have to master before you even begin

Kernel APIs
Memory Layout
Interrupts, IRQLs, DPCs, IRPs
Synchronization: Events, Spinlocks, Mutexes, Timers, Semaphores, Resources
Paging mechanism
Intel System Architecture
Device Driver structure, MJ functions, IOCTLs
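As a small illustration of the IOCTL item, the sketch below packs and unpacks a control code using the bit layout of the standard CTL_CODE macro from the WDK headers (device type in bits 16-31, access in bits 14-15, function in bits 2-13, transfer method in bits 0-1). The helper names and the struct are made up for this example, and some drivers build their codes with their own macros, so treat the decoded device type and function of an arbitrary code with care. The low two bits are the interesting part for research: method 3 (METHOD_NEITHER) means the driver receives raw user pointers and must validate them itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four CTL_CODE fields. */
typedef struct {
    uint32_t device_type;
    uint32_t access;
    uint32_t function;
    uint32_t method;   /* 0 = BUFFERED, 1 = IN_DIRECT, 2 = OUT_DIRECT, 3 = NEITHER */
} ioctl_fields;

/* Pack the fields the way the standard CTL_CODE macro does. */
static uint32_t make_ctl_code(uint32_t device_type, uint32_t function,
                              uint32_t method, uint32_t access)
{
    assert(device_type <= 0xFFFF && function <= 0xFFF && method <= 3 && access <= 3);
    return (device_type << 16) | (access << 14) | (function << 2) | method;
}

/* Split a control code back into its fields. */
static ioctl_fields split_ctl_code(uint32_t code)
{
    ioctl_fields f;
    f.device_type = code >> 16;
    f.access      = (code >> 14) & 0x3;
    f.function    = (code >> 2) & 0xFFF;
    f.method      = code & 0x3;
    return f;
}
```

For example, device type 0x22 (FILE_DEVICE_UNKNOWN), function 0x800, METHOD_NEITHER and FILE_ANY_ACCESS pack to 0x222003.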

Vulnerability Research

High-Level first
Look for complexity
Real challenge is figuring out where NOT to look
Interfaces where different teams have to cooperate are more vulnerable – e.g. interaction between User and Kernel
Privilege escalations are much easier than remote exploits
Multiple weak exploits can form one strong attack

Vulnerability Research – Cont’d

Three approaches for finding vulnerabilities:

The High Level approach

“Let’s first understand how this whole system works, and only then look for the holes.”

The Low Level approach

“This function looks complex, let’s break it down to the bit and see if it has any bugs”

The blackbox / brute force / fuzzing approach

“Let’s make this mothafucka crash by trying every possible input. We’ll worry about the details later”

Vulnerability Research – Cont’d

High Level approach process
Read everything you can possibly find about your target: white papers, help documents, bug reports, users forums etc.
Ask yourself “If I had to develop this code, how would I do it?”
Research to find all the possible ways to develop that functionality
Now that you know how it can be done, go to the code and find out which method it uses.
You now have a high level overview of how your target works!
With the knowledge of how it works, think of possible weaknesses, then search for them in the code

Vulnerability Research – Cont’d

Low Level approach process
Identify logically and technically complex functions / operations in the code
Completely analyze and/or reverse engineer the relevant functions, looking for bugs
If a bug is found, figure out if there’s a way to generate an attack flow that will trigger it

Vulnerability Research – Cont’d

Blackbox / brute force / fuzzing approach process
Identify all the possible code inputs that are under your control
Find out the structure of the input fields, including calculated data like CRCs, lengths, etc.
Test different inputs by:

Manually thinking of inputs that are likely to be mishandled
Writing a script/program to generate inputs for you
Using a fuzzing script / program / infrastructure

The goal is to get the best code coverage
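The point about calculated fields can be sketched as follows: if a mutated input fails its own checksum, the target rejects it before any interesting code runs, so the fuzzer has to re-seal the record after every mutation. The record layout here (2-byte little-endian length, payload, 4-byte CRC-32) and all function names are assumptions invented for this sketch, not the format of any real driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Recompute the trailing CRC over header + payload. */
static void seal_record(uint8_t *rec, size_t len)
{
    uint32_t crc = crc32_ieee(rec, 2 + len);
    for (int i = 0; i < 4; i++)
        rec[2 + len + i] = (uint8_t)(crc >> (8 * i));
}

/* Hypothetical layout: [len lo][len hi][payload ...][crc32, little-endian].
   Returns the record size, or 0 if the output buffer is too small. */
static size_t build_record(const uint8_t *payload, uint16_t len,
                           uint8_t *out, size_t cap)
{
    assert(out != NULL);
    size_t total = (size_t)len + 6;
    if (cap < total) return 0;
    out[0] = (uint8_t)(len & 0xFF);
    out[1] = (uint8_t)(len >> 8);
    if (len) memcpy(out + 2, payload, len);
    seal_record(out, len);
    return total;
}

/* The check a parser would run before touching the payload. */
static int record_is_valid(const uint8_t *rec, size_t size)
{
    if (size < 6) return 0;
    size_t len = (size_t)rec[0] | ((size_t)rec[1] << 8);
    if (2 + len + 4 != size) return 0;
    uint32_t stored = 0;
    for (int i = 0; i < 4; i++)
        stored |= (uint32_t)rec[2 + len + i] << (8 * i);
    return stored == crc32_ieee(rec, 2 + len);
}

/* Flip bits in one payload byte, then re-seal so the record still parses.
   Returns the record size, or 0 if the record is malformed or empty. */
static size_t mutate_record(uint8_t *rec, size_t size, size_t pos, uint8_t xor_mask)
{
    if (size < 6) return 0;
    size_t len = (size_t)rec[0] | ((size_t)rec[1] << 8);
    if (len == 0 || 2 + len + 4 != size) return 0;
    rec[2 + pos % len] ^= xor_mask;
    seal_record(rec, len);
    return size;
}
```

A fuzzing loop would call mutate_record with varying positions and masks and feed each result to the target; deliberately leaving the length or CRC wrong is a separate test case for the validation code itself.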

Common Bugs

Buffer overflows (stack and pool)
NULL dereference
Faulty input validation

Less common bugs

Use-after-free
Direct calls into User code
Logical bugs

Exploit Development

The more knowledge you have the better:

Constant memory addresses
Memory layouts
Heaps, Pools, Stacks
APIs, Objects
CPU
Assembly

Creativity
In kernel mode there are no process boundaries – we can use everything

Example 1: Use-after-free

The bug: object is freed but still kept in a linked list of active objects
To exploit we needed to get our own data into the freed buffer *before someone else does*
The solution:

Use a bug in one driver to cause CPU starvation, thus reducing the chance of anyone else stealing our buffer
Use some DPC code in a second driver that allocates a buffer with the right size and copies our data into it
Activate the code in our target driver that uses the freed object, causing our shellcode to be run

Example 2: DRM

This isn’t actually a kernel exploit, but it’s a great example of:

An insecure interface between User and Kernel code
How the interface points between different development teams are likely to be the weakest links

The system in this example is a DRM system that was meant to prevent movies from being copied by allowing playback on one machine only

Example 2: DRM Cont’d

Every movie is encrypted
Decryption code is embedded in the movie
The code is different in each movie
License is given per-computer, based on hardware signature
Accessing the hardware requires Kernel code, so the decryption code inside the movie calls a small driver, which calls straight back into the user code

Example 2: DRM Broken

Hook DeviceIoControl
Instead of calling the driver, we call the user-mode code within a try…catch statement
Any attempt to do something nasty like reading the BIOS and accessing hardware will generate an exception
We handle the exceptions by reading the data from a file instead of the BIOS/hardware
The decryption code is tricked to “see” the same hardware on all computers, so we can use the same license everywhere!

Tips & Tricks

Arbitrary write – good places to overwrite:

Generic places with fixed addresses (per OS build):
Callback function pointers
Data segment variables
GDT/LDT tables (http://j00ru.vexillium.org/?p=290)
Dispatch Table (used to be great, but it’s blocked on newer OS versions)

Tips & Tricks – Cont’d

Non-fixed addresses that can be extracted:
New technique (thanks to Gil Dabah and Tarjei Mandt): The Window Handle Table is mapped to user address space and contains Kernel pointers to objects with function pointers in them (see http://www.mista.nu/research/mandt-win32k-paper.pdf)
So it’s possible to:

Create a kernel window (a window for which win32k created and registered a window class so the window procedure is in kernel, such as menu and tooltip)
Get the pointer to the kernel window object from the Handle Table
Overwrite the WndProc Pointer
Send a message to the window to trigger the WndProc Pointer

Tips & Tricks – Cont’d

Non-fixed addresses that can be extracted:

Other Kernel pointers that are passed to user space.
Some Win32k.sys syscalls are defined as VOID or USHORT and leak full or partial kernel pointers in the return value (see http://j00ru.vexillium.org/?p=762)

Tips & Tricks – Cont’d

A BSOD is not the end:

There’s plenty of code that runs AFTER an exception, and many times that code calls callback functions that can be overwritten. Especially with ACCESS_VIOLATIONs, the flow goes to the page-fault handler first, so there are plenty of options for attack
Even inside KeBugCheck there are callbacks that can be overwritten
It’s a bit tricky to fix the context and resume normal execution afterwards – but it can be done.

Tips & Tricks – Cont’d

WOW64 processes:

In a 32-bit process on a 64-bit system, all the pointers returned by NtQuerySystemInformation are truncated to 32 bits – Very annoying!
This can be overcome by using the built-in call gate to temporarily switch to 64-bit mode, call NtQuerySystemInformation, then return to your 32-bit code. For more information see http://vxheavens.com/lib/vrg02.html (and thanks to Mark Dowd for the tip!)
The 64-bit TEB can be accessed directly, without the mess of switching to 64-bit mode, since it’s mapped at gs:0

From Kernel to User

Many times, kernel exploits are required to install user-mode payloads or perform operations that require running user-mode code
Contrary to common logic that says the more power the better, launching user-mode code that runs with SYSTEM privileges from the Kernel can be very tricky (due to the lack of API and OS support to do so).
In the following slides we’ll go over the different techniques that can be used, and the pros and cons of each of them

From Kernel to User – Changing the process token

Method
Change the token of a process we already have control of (e.g. the process that launched the exploit) to a SYSTEM token
Pros

Easiest way to implement
Very reliable

Cons

More noisy
The user-mode code has to do all the nasty work (e.g. injecting code to a system process), making it vulnerable to AV and security programs that hook user APIs

From Kernel to User – User-mode APCs

Method
Queue a user-mode APC to a target thread already running in a system process.
Pros

Gets you directly to where you want to be
Allows injection to any process on the system

Cons

Only threads in Alertable state can be targeted, and there is no generic way to find them. An alternative is to force a thread into an Alertable state, but this breaks its waiting state, causing the wait function to return mid-way, and may lead to system instability or crash.
Largely undocumented, and the relevant structures are different between OS versions.
Unless targeting a thread you have intimate knowledge of, this method may lead to deadlocks if the thread is holding some locks when it enters the wait state (e.g. the LoaderLock)

From Kernel to User – Thread Hijacking

Method
Change the context of an existing thread in a system process to execute injected code.
Pros

Gets you directly to where you want to be
Allows injection to any process on the system

Cons

Restoring the context can be very difficult.
Hijacking an arbitrary thread is extremely dangerous and may cause deadlocks, instability, or crashes

From Kernel to User – Creating a new thread

Method
Create a new user-mode thread in a system process
Pros

An almost perfect solution, gets you exactly to where you want without any dangers or context issues.
Allows injection to any process on the system

Cons

Extremely difficult to implement. In order for the new thread to function it has to be registered with CSRSS. The APIs and structures involved with that are complex, undocumented, and change constantly with Windows updates.

From Kernel to User – API hooking

Method
Hook a user-mode API that you know is going to be called or that you can cause to be called within a system process
Pros

Allows injecting directly into a system process
Very reliable

Cons

Finding a suitable API to hook may be difficult.
This method isn’t generic, and will only work on system processes that frequently call the targeted API.

An example of a simple Kernel Exploit – AFD.SYS

Let’s have a look at afd!AfdGetRemoteAddress

Can someone see the problem?

// Attacker controls OutputBuffer and OutputBufferLength
void IOCTL_handler(...) {
    [...]
    try {
        ProbeForWrite(OutputBuffer,
                      OutputBufferLength,
                      sizeof(UCHAR));
        RtlCopyMemory(
            OutputBuffer,
            (PUCHAR)context + endpoint->Common.VcConnecting.RemoteSocketAddressOffset,
            endpoint->Common.VcConnecting.RemoteSocketAddressLength
        );
    } except (AFD_EXCEPTION_FILTER(&status)) {
    }
    [...]
}
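For comparison, here is a minimal user-mode sketch of the check the handler is missing. In the code above the probe covers OutputBufferLength bytes, which the caller chooses, while the copy writes RemoteSocketAddressLength bytes; a caller that passes a length of 0 gets no validation at all, because ProbeForWrite does nothing for zero-length ranges. The function name copy_checked is hypothetical, and this is an illustration of the principle rather than the fix Microsoft shipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into a caller-supplied buffer only when the caller's declared
   length covers every byte that will actually be written.
   Returns the number of bytes copied, or 0 when the request is rejected. */
static size_t copy_checked(void *out, size_t out_len,
                           const void *src, size_t src_len)
{
    assert(src != NULL);
    if (out == NULL || out_len < src_len)
        return 0;               /* declared length is smaller than the write */
    memcpy(out, src, src_len);
    return src_len;
}
```

In a driver the same comparison would sit before the probe, so that the probed range and the copied range are guaranteed to be the same size.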

AFD.SYS - Continued

OK, so we can write data to any address we want, including kernel addresses, but we can’t really control what data!
The data written looks like this:
02 00 XX XX YY YY YY YY, where XXXX is the port and YYYYYYYY is the IP, and there has to be an active TCP connection for the function to work
What to do?

AFD.SYS - Continued

Our options:

Overwriting a flag
Maybe we don’t need full control of the data?

AFD.SYS - Continued

Solution:

We can connect to 127.0.0.1, that’s 7F 00 00 01.
Port 445 is always open on Windows machines, that’s 01 BD, so now we have 01 BD 7F 00 00 01
We want to overwrite a 32bit pointer, and we need an address that we can easily allocate
How about the middle four bytes of 01 BD 7F 00 00 01, i.e. BD 7F 00 00? Intel is Little Endian, so they read as the pointer 0x00007FBD. Perfect!
Now we just need a pointer to overwrite. Since this is an old bug that only works on XP, we can just use the Dispatch Table.
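The byte arithmetic above can be checked with a few lines of C. The sketch builds the 8 bytes described earlier (02 00, then port and IP in network byte order) and reads the little-endian 32-bit value that lands on a target when the write starts a given number of bytes before it. Both function names are invented for this example, and nothing here touches a socket or a driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build the 8 written bytes: 02 00, port (big-endian), IPv4 address. */
static void build_sockaddr_bytes(uint16_t port, const uint8_t ip[4], uint8_t out[8])
{
    out[0] = 0x02;
    out[1] = 0x00;
    out[2] = (uint8_t)(port >> 8);
    out[3] = (uint8_t)(port & 0xFF);
    memcpy(out + 4, ip, 4);
}

/* If the 8 bytes are written starting at (target - back_off), return the
   32-bit little-endian value that ends up at target. */
static uint32_t value_landing_on_target(const uint8_t data[8], size_t back_off)
{
    assert(back_off + 4 <= 8);
    const uint8_t *p = data + back_off;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

With port 445 and 127.0.0.1 the bytes are 02 00 01 BD 7F 00 00 01; backing off by 3 puts BD 7F 00 00 on the target, which reads as 0x00007FBD.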

AFD.SYS - Exploit

Allocate page at 0x7fb0 and copy the shellcode into it.
HookAddress = Dispatch table entry for ZwQueryIntervalProfile
connect() to 127.0.0.1:445
DeviceIoControl((HANDLE)sock, 0x1203F, NULL, 0, (PVOID)(HookAddress - 3), 0, &Result, NULL)
ZwQueryIntervalProfile()

AFD.SYS - Demo

Demo

Walk-through of a complex Kernel PE Exploit

Thanks to my friend Gil Dabah (creator of diStorm Disassembler)
This bug was silently fixed by MS in February

Background

When registering a Window Class it’s possible to request the OS to store some extra bytes with the window object
The extra bytes are appended to the WND structure in the kernel:

Background - Continued

Some special window types (Menus, Tooltips, etc) also have some private data that can only be accessed by the kernel:

Background - Continued

To change the data on the extra bytes, applications call the SetWindowLongPtr function with the index into the extra bytes and a new value.
The function then checks if the index provided is within the private data or the user extra bytes. If the index is within the private bytes, the function fails, so normally it’s impossible to change the private kernel data.

Background - Continued

In order to check if the index is within the private data, SetWindowLongPtr uses a table of window types with their corresponding total allocated bytes size (WND struct + private).
“Window type” refers to FNID, which is the real identifier of a window type, from a list of hard-coded values (unlike its Class).
The pseudo code for the check is:
if (index < (int)(window_class_alloc_sizes[fnid] - sizeof(WND))) FAIL;

The bug – part 1

By using the undocumented and unexported function RegisterClassExWOWW and supplying an internal window type and a negative number for the extra bytes, it’s possible to overwrite the table with our own value. The bug is that the extra bytes value isn’t verified:

The bug – part 1 - continued

With the table altered to hold a negative number of allocated bytes, the test code is tricked:
(index < (int)(window_class_alloc_sizes[fnid] - sizeof(WND))) == always FALSE for any non-negative index
We can now call SetWindowLongPtr with 0 as the index and change the private kernel data for the window
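The effect of a negative table entry on the check can be reproduced in plain C. The size 0x90 is a placeholder, not the real sizeof(WND); the function names are hypothetical; and the sketch uses signed int arithmetic throughout, whereas the original subtracts an unsigned sizeof and casts the result, which leads to the same comparison outcome. The "hardened" variant shows one possible guard and is an assumption, not the actual patch.

```c
#include <assert.h>
#include <limits.h>

enum { WND_SIZE_SKETCH = 0x90 };   /* placeholder for sizeof(WND) */

/* The check as described on the slide: non-zero means "index points into
   the private data, fail the call". */
static int slide_check_fails(int index, int table_entry)
{
    assert(table_entry >= INT_MIN + WND_SIZE_SKETCH);   /* avoid signed overflow */
    return index < table_entry - WND_SIZE_SKETCH;
}

/* One possible guard: a table entry smaller than the base structure can
   never be legitimate, so reject it before comparing. */
static int hardened_check_fails(int index, int table_entry)
{
    if (table_entry < WND_SIZE_SKETCH)
        return 1;
    return index < table_entry - WND_SIZE_SKETCH;
}
```

With a sane entry of 0x98 (8 private bytes), index 0 is rejected and index 8 is allowed. With an entry of -16, the private size becomes negative, so index 0 passes the slide's check while the hardened variant still rejects it.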

The bug – part 2

Now that we can overwrite private kernel data, we need to find a window type that has some useful stuff stored there.
The Menu window type stores a pointer to a structure, and during window destruction, a pointer in that structure is NULLed, giving us the ability to NULL any 32/64 bit value in the system – Bingo!

Exploitation

Since the Menu window private structure changes between Windows versions, we run the exploit twice:

The first time overwriting the pointer to the structure with a pointer to some non-NULL array, so that we can find out the offset where the NULL is written
The second time with a pointer to the address we want to NULL minus the offset found in the first stage
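The offset discovery in the first pass is essentially a sentinel scan: fill a buffer with a known non-zero pattern, let the code under study write its NULL somewhere inside it, then look for the zeroed slot. Below is a user-mode sketch of just the scan; find_zeroed_slot is a made-up name and the write is simulated by the caller, so nothing here interacts with win32k.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scan a buffer that was pre-filled with non-zero bytes and return the
   offset of the first run of `slot` zero bytes, or -1 if there is none.
   `slot` would be 4 or 8 depending on the pointer size. */
static long find_zeroed_slot(const unsigned char *buf, size_t size, size_t slot)
{
    assert(buf != NULL && slot > 0);
    for (size_t off = 0; off + slot <= size; off++) {
        size_t i = 0;
        while (i < slot && buf[off + i] == 0)
            i++;
        if (i == slot)
            return (long)off;
    }
    return -1;
}
```

The second pass then subtracts the discovered offset from the address that should be cleared, exactly as the bullet above describes.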

Exploitation - continued

All that’s left now is to allocate our shellcode at page 0, overwrite a function pointer, and then get it called
Easy! 

Exploitation - flow

1. Find the address of RegisterClassExWOWW using diStorm
2. RegisterClassExWOWW() passing the FNID for a Menu and a WNDCLASSEX structure with a negative number for the extra bytes
3. CreateWindow()
4. SetWindowLongPtr() with a non-NULL array
5. DestroyWindow()
6. Find offset
7. Repeat steps 3-5, this time passing the actual address to overwrite minus the offset
8. Get the overwritten pointer to be called

Windows 8 and the future

Questions? gbakas@gmail.com
