jona.t4
Wednesday 12 March 2014
Wednesday 27 November 2013
Hooking explained: detouring library calls and vtable patching in Windows/Linux/MAC-OSX
It is often useful to modify the behavior of an application without making extensive changes to its source code. Specifically, one might want to intercept calls to certain functions in order to execute custom code before or after the original code, or to retrieve or modify the parameters passed to a function. For example, it might be necessary to instrument the application for performance analysis or to add features to a program. Even when the source code of the program is not available, it is still possible to modify its behavior.
Here I will present the techniques I use on the different operating systems.
Please note that I don't claim that these techniques are the best solutions for all cases.
Appendix A: Windows DLL Injection
Appendix B: Import Address Table Hooking (IAT)
Appendix C: MS-Detours 1.5 (Direct3d)
Appendix D: virtual table patching
Appendix E: Example : hiding process(es) under windows
First of all, make sure you have downloaded MS-Detours 1.5 and added the corresponding files to your project. I am using version 1.5 because it's the simplest to use, and it does the job nicely.
What we did here is retrieve the address with GetProcAddress and pass it as the first parameter; the second parameter is a pointer to our own detour function (hkEndScene). Now you can add drawing functions to the original program; benchmarking programs make good use of this.
Whenever a class defines a virtual function (or method), most compilers add a hidden member variable to the class which points to a so-called virtual method table (VMT or vtable). This VMT is basically an array of pointers to (virtual) functions. The call goes through these pointers at runtime because at compile time it is not yet known whether the base function is to be called or a derived one implemented by a class that inherits from the base class. The code below shows an example of a VMT hook. If you want to implement this in Direct3D, you need to create a new device and use that to replace the original function in the original device.
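As a rough illustration of the mechanism, here is a minimal sketch in plain C that models a vtable by hand: an object carries a pointer to an array of function pointers, and the hook swaps one slot. All names (Object, object_vtable, vtable_hook and so on) are made up for this example. A real C++ vtable usually lives in read-only memory, so on Windows you would have to make it writable with VirtualProtect before patching it.

```c
#include <assert.h>
#include <stddef.h>

/* A hand-rolled "virtual method": first argument is the object itself. */
typedef int (*method_fn)(void *self, int arg);

typedef struct {
    method_fn *vtable; /* the hidden pointer a C++ compiler would add */
    int value;
} Object;

/* The original "virtual" method. */
int original_method(void *self, int arg)
{
    Object *obj = (Object *)self;
    return obj->value + arg;
}

/* Where the hook keeps the address it replaced. */
method_fn saved_original = NULL;

/* The detour: runs custom code, then forwards to the original. */
int hooked_method(void *self, int arg)
{
    return saved_original(self, arg) * 2;
}

/* One table shared by all objects of this "class". */
method_fn object_vtable[1] = { original_method };

void object_init(Object *obj, int value)
{
    obj->vtable = object_vtable;
    obj->value = value;
}

/* A virtual call: look up the slot, then call through it. */
int object_call(Object *obj, size_t slot, int arg)
{
    return obj->vtable[slot](obj, arg);
}

/* Replace one slot and return the previous entry so it can be restored. */
method_fn vtable_hook(method_fn *vtable, size_t slot, method_fn replacement)
{
    assert(vtable != NULL && replacement != NULL);
    method_fn old = vtable[slot];
    vtable[slot] = replacement;
    return old;
}
```

Because the table is shared, patching it through one object affects every object of the same class, which is why a freshly created Direct3D device can be used to reach the function used by the original device.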
In this example I will show how one can hook the system call that retrieves the list of processes and modify it so that it skips our process. For this I will use the mhook library, but you can also use any other hooking method described in this article. The system call that the Task Manager uses to retrieve the list of processes is called NtQuerySystemInformation (documented on MSDN). On MSDN we can also find the appropriate structures needed for this call.
Now all that is left is to define our detour function and use mhook to hook it.
First I will show our detour function.
What we basically do here is loop over the returned entries and check every process name. Once we find our process name, we skip that entry and return the result of the original call (without our process). Now we hook it using mhook.
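The skipping step can be sketched in portable C. The process list returned by NtQuerySystemInformation is a chain of records in which each record stores the byte offset to the next one (0 on the last record). The sketch below uses a simplified, made-up record type (ProcEntry with a plain char name instead of the real UNICODE_STRING) and hypothetical helper names, so it only models the unlinking logic and is not the Windows structure itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for one process record. */
typedef struct {
    uint32_t next_entry_offset; /* bytes to the next record, 0 on the last one */
    char     image_name[28];
} ProcEntry;

/* Unlink every record named `target` by making the previous record jump
   over it. Returns the number of hidden records. The first record cannot
   be unlinked this way and is left alone. */
int hide_process(void *buffer, const char *target)
{
    assert(buffer != NULL && target != NULL);
    ProcEntry *prev = NULL;
    ProcEntry *cur = (ProcEntry *)buffer;
    int hidden = 0;

    for (;;) {
        ProcEntry *next = cur->next_entry_offset
            ? (ProcEntry *)((uint8_t *)cur + cur->next_entry_offset)
            : NULL;

        if (prev != NULL && strcmp(cur->image_name, target) == 0) {
            /* prev now points past cur, or becomes the last record */
            prev->next_entry_offset =
                next ? prev->next_entry_offset + cur->next_entry_offset : 0;
            hidden++;
        } else {
            prev = cur;
        }

        if (next == NULL)
            break;
        cur = next;
    }
    return hidden;
}

/* Count the records that are still reachable from the start. */
int count_entries(const void *buffer)
{
    const ProcEntry *cur = (const ProcEntry *)buffer;
    int n = 1;
    while (cur->next_entry_offset != 0) {
        cur = (const ProcEntry *)((const uint8_t *)cur + cur->next_entry_offset);
        n++;
    }
    return n;
}
```

In a real detour you would first call the original function, check that it succeeded and that the caller asked for the process list, and only then run this kind of loop over the returned buffer.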
§1 Shared Libraries & Injection/Loading
Shared libraries are code objects that may be loaded during execution into the memory space associated with a process. Library code may be shared in memory by multiple processes as well as on disk. If virtual memory is used, processes execute the same physical page of RAM, mapped into the different address spaces of each process. This has advantages. For instance, on some systems applications were often only a few hundred kilobytes in size and loaded quickly; the majority of their code was located in libraries that had already been loaded for other purposes by the operating system.
To change code in another process we must load our own shared library into the address space of that process. On UNIX platforms this can be achieved with an environment variable (LD_PRELOAD on Linux, DYLD_INSERT_LIBRARIES on MAC-OSX) which instructs the loader to load the specified shared libraries first. Function and other symbol definitions in the specified libraries will be used instead of the original ones.
On Windows systems there is no such thing as LD_PRELOAD, so to achieve the same result we must use a little trick called DLL injection (on Windows shared libraries are .dll files, on Linux .so files and on MAC-OSX .dylib files). See Appendix A below for more information.
§2 Hooking/Detouring function calls
§2.1 UNIX/Linux
UNIX offers a simple way to override functions in a shared library with the LD_PRELOAD environment variable. When you write a twin of a function that is defined in an existing shared library, put it in your own shared library, and list that library in LD_PRELOAD, your function is used instead of the original one. It works exactly as on MAC-OSX (see below), except that you use LD_PRELOAD instead of DYLD_INSERT_LIBRARIES.
§2.2 MAC-OSX
Since MAC-OSX is also UNIX based, it works almost exactly the same as on Linux; only LD_PRELOAD is called DYLD_INSERT_LIBRARIES and shared libraries end in .dylib instead of .so. In this example I've detoured fopen in a test program. In 2003 Jonathan Rentzsch showed ways of detouring on MAC-OSX and released mach_star, but this method is way easier.
The dummy program:
Just a simple program that calls fopen.
#include <stdio.h>
int main(int argc, char** argv)
{
    printf("original program start\n");
    FILE* fileptr = fopen("hey.txt", "w"); // create a new file
    if (fileptr != NULL)                   // fopen can fail, e.g. when it is detoured
        fclose(fileptr);
    printf("original program quit\n");
    return 0;
}
Our detour library. This function will get called instead of the original one (see the intro), but we still need to call the original afterwards; that's what we use dlsym for.
#include <stdio.h>
#include <dlfcn.h>
FILE* fopen(const char *path, const char *mode)
{
    printf("Detoured fopen\n");
    // look up the next definition of fopen, i.e. the original one
    FILE* (*real_fopen)(const char*, const char*) =
        (FILE* (*)(const char*, const char*)) dlsym(RTLD_NEXT, "fopen");
    (void)mode; // the requested mode is deliberately ignored
    return real_fopen(path, "r"); // note r instead of w, this will prevent the program from creating files
}
Compiling the library:
gcc -fno-common -c fopenwrap.c
gcc -dynamiclib -o libhook.dylib fopenwrap.o
Running the program with DYLD_INSERT_LIBRARIES.
DYLD_FORCE_FLAT_NAMESPACE=1 DYLD_INSERT_LIBRARIES=libhook.dylib ./DummyFopen
You also need to define DYLD_FORCE_FLAT_NAMESPACE (it doesn't matter what value it has).
You can use the same technique to override a method in a class. Say there's a method named "libfff" in a class AAA.
class AAA
{
public:
    int m;
    AAA(){m = 1234;}
    void libfff(int a);
};
To override it, you first need to know the mangled symbol name of the method.
$ nm somelibrary.dylib | grep "T "
00000ed6 T __ZN3AAA6libfffEi
Then what you need to define is _ZN3AAA6libfffEi. Don't forget to remove the first '_'. If you see multiple symbols in the shared library and are not sure which one to override, you can check by demangling a symbol.
$ c++filt __ZN3AAA6libfffEi
AAA::libfff(int)
Now we can detour it like this.
#include <stdio.h>
#include <dlfcn.h>
// the declaration of class AAA (see above) must be visible here
typedef void (*AAAlibfffType)(AAA*, int);
static AAAlibfffType real_AAAlibfff;
extern "C" {
void _ZN3AAA6libfffEi(AAA* a, int b) {
    printf("--------AAA::libfff--------\n");
    printf("%d, %d\n", b, a->m);
    // look up the original method in the library that really defines it
    void* handle = dlopen("somelibrary.dylib", RTLD_NOW);
    if (handle)
        real_AAAlibfff = (AAAlibfffType)dlsym(handle, "_ZN3AAA6libfffEi");
    if (real_AAAlibfff) {
        printf("OK\n");
        real_AAAlibfff(a, b); // call the original
    }
}
}
§2.3 Microsoft Windows
This is the framework of a standard API hook. All of this resides in a DLL that will be injected into a process. For this example, I chose to hook the MessageBoxW function. Once this DLL is injected, it gets the address of the MessageBoxW function from user32.dll, and then the hooking begins. In the BeginRedirect function, the start of MessageBoxW is overwritten with an unconditional relative jump (JMP, opcode 0xE9) whose operand holds the distance to jump to our detour. The source is fully commented.
#include <windows.h>
#include <string.h>
#define SIZE 6
typedef int (WINAPI *pMessageBoxW)(HWND, LPCWSTR, LPCWSTR, UINT); // MessageBoxW prototype
int WINAPI MyMessageBoxW(HWND, LPCWSTR, LPCWSTR, UINT); // Our detour
void BeginRedirect(LPVOID);
pMessageBoxW pOrigMBAddress = NULL; // address of original
BYTE oldBytes[SIZE] = {0}; // backup
BYTE JMP[SIZE] = {0}; // JMP rel32 (5 bytes) followed by RET
// VirtualProtect fails when its last argument is NULL, so tmpProtect receives the previous protection
DWORD oldProtect, tmpProtect, myProtect = PAGE_EXECUTE_READWRITE;
INT APIENTRY DllMain(HMODULE hDLL, DWORD Reason, LPVOID Reserved)
{
    switch(Reason)
    {
    case DLL_PROCESS_ATTACH: // if attached
        pOrigMBAddress = (pMessageBoxW)
            GetProcAddress(GetModuleHandleA("user32.dll"), // get address of original
                           "MessageBoxW");
        if(pOrigMBAddress != NULL)
            BeginRedirect((LPVOID)MyMessageBoxW); // start detouring
        break;
    case DLL_PROCESS_DETACH:
        if(pOrigMBAddress != NULL)
        {
            VirtualProtect((LPVOID)pOrigMBAddress, SIZE, myProtect, &tmpProtect);
            memcpy((LPVOID)pOrigMBAddress, oldBytes, SIZE); // restore backup
            VirtualProtect((LPVOID)pOrigMBAddress, SIZE, oldProtect, &tmpProtect);
        }
        break;
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
        break;
    }
    return TRUE;
}
void BeginRedirect(LPVOID newFunction)
{
    BYTE tempJMP[SIZE] = {0xE9, 0x90, 0x90, 0x90, 0x90, 0xC3}; // 0xE9 = JMP, 0x90 = NOP, 0xC3 = RET
    memcpy(JMP, tempJMP, SIZE); // store jmp instruction to JMP
    // the jump distance is relative to the end of the 5-byte JMP (32-bit code only)
    DWORD JMPSize = ((DWORD)newFunction - (DWORD)pOrigMBAddress - 5);
    VirtualProtect((LPVOID)pOrigMBAddress, SIZE, // assign read write protection
                   PAGE_EXECUTE_READWRITE, &oldProtect);
    memcpy(oldBytes, (LPVOID)pOrigMBAddress, SIZE); // make backup
    memcpy(&JMP[1], &JMPSize, 4); // fill the nop's with the jump distance (JMP,distance(4bytes),RET)
    memcpy((LPVOID)pOrigMBAddress, JMP, SIZE); // set jump instruction at the beginning of the original function
    VirtualProtect((LPVOID)pOrigMBAddress, SIZE, oldProtect, &tmpProtect); // reset protection
}
int WINAPI MyMessageBoxW(HWND hWnd, LPCWSTR lpText, LPCWSTR lpCaption, UINT uiType)
{
    VirtualProtect((LPVOID)pOrigMBAddress, SIZE, myProtect, &tmpProtect); // assign read write protection
    memcpy((LPVOID)pOrigMBAddress, oldBytes, SIZE); // restore backup
    int retValue = MessageBoxW(hWnd, lpText, lpCaption, uiType); // get return value of original function
    memcpy((LPVOID)pOrigMBAddress, JMP, SIZE); // set the jump instruction again
    VirtualProtect((LPVOID)pOrigMBAddress, SIZE, oldProtect, &tmpProtect); // reset protection
    return retValue; // return original return value
}
The reason why we restore the backup before calling the original is that otherwise we would get an infinite loop: we would call a function that jumps to our function, which calls the function again, and so on. If you change the parameters of the call to MessageBoxW inside MyMessageBoxW, every message box shown by a process the DLL is injected into will have those parameters. See appendix C for the MS-Detours method, which is way easier and recommended.
Appendix A: Windows DLL injection
NOTE: the easy way is at the end of this appendix; I will start with the hardcore method first.
Welcome to appendix A. Here I will explain how to make another process load our DLL. What we do is allocate a chunk of memory in the target process for our assembly function, which calls LoadLibrary; we also need to allocate space for our DLL path name. Next we suspend the main thread of our target and modify the register that holds the address of the next instruction to be executed. Then we patch our allocated function so that it calls and returns to the right addresses. When we are done we resume the main thread.
#define PROC_NAME "lorem.exe" // block A-1
#define DLL_NAME "ipsum.dll"
// main()
void *dllString, *vfunc;
unsigned long ulproc_id, threadID, funcLen, oldIP, oldprot, loadLibAddy;
HANDLE hProcess, hThread;
CONTEXT ctx;
// size of the stub and address of LoadLibraryA (32-bit only: pointers are cast to unsigned long)
funcLen = (unsigned long)loadDll_end - (unsigned long)loadDll;
loadLibAddy = (unsigned long)GetProcAddress(GetModuleHandle("kernel32.dll"), "LoadLibraryA");
// open the target and allocate room for the DLL path and for the stub
ulproc_id = GetProcIdFromName(PROC_NAME); // see A-4
hProcess = OpenProcess((PROCESS_VM_WRITE | PROCESS_VM_OPERATION), FALSE, ulproc_id);
dllString = VirtualAllocEx(hProcess, NULL, (strlen(DLL_NAME) + 1), MEM_COMMIT, PAGE_READWRITE);
vfunc = VirtualAllocEx(hProcess, NULL, funcLen, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
WriteProcessMemory(hProcess, dllString, DLL_NAME, strlen(DLL_NAME) + 1, NULL); // include the terminating NUL
To continue we'll need a handle to a thread in the process. To get the thread ID for that, one can use the function shown in block A-2.
unsigned long GetThreadFromProc(char *procName) { // block A-2
    PROCESSENTRY32 pe;
    HANDLE thSnapshot, hProcess;
    BOOL retval, ProcFound = FALSE;
    unsigned long pTID, threadID = 0;
    thSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if(thSnapshot == INVALID_HANDLE_VALUE) {
        MessageBox(NULL, "Error: unable to create toolhelp snapshot", "Loader", MB_OK);
        return 0;
    }
    pe.dwSize = sizeof(PROCESSENTRY32);
    retval = Process32First(thSnapshot, &pe);
    while(retval) {
        if(StrStrI(pe.szExeFile, procName)) {
            ProcFound = TRUE;
            break;
        }
        retval = Process32Next(thSnapshot, &pe);
    }
    CloseHandle(thSnapshot);
    if(!ProcFound)
        return 0; // target process is not running
    // Take the address of the thread ID field in OUR thread environment block:
    // fs:[0x18] is the TEB itself, the thread ID sits 36 bytes into it.
    // This assumes the target's main thread has its TEB at the same address,
    // which is not guaranteed on newer Windows versions; enumerating threads
    // with Thread32First/Thread32Next is the more robust alternative.
    _asm {
        mov eax, fs:[0x18]
        add eax, 36
        mov [pTID], eax
    }
    hProcess = OpenProcess(PROCESS_VM_READ, FALSE, pe.th32ProcessID);
    ReadProcessMemory(hProcess, (const void *)pTID, &threadID, 4, NULL);
    CloseHandle(hProcess);
    return threadID;
}
This is a prototype for the function we are going to allocate in the target process, which will call LoadLibrary. The addresses are left as placeholders because we patch them later on, when we have the right values.
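__declspec(naked) void loadDll(void) { // prototype function
    _asm {
        // The placeholders must not fit in a sign-extended byte, otherwise the
        // assembler may pick a shorter encoding and shift the patch offsets.
        push 0xDEADBEEF // Placeholder for the return address
        // Save the flags and registers
        pushfd
        pushad
        push 0xDEADBEEF // Placeholder for the string address
        mov eax, 0xDEADBEEF // Placeholder for LoadLibrary
        call eax // Call LoadLibrary with the string parameter
        // Restore the registers and flags
        popad
        popfd
        ret
    }
}
__declspec(naked) void loadDll_end(void) { // marks the end of loadDll so its length can be computed
}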
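The patch offsets used later in this appendix (1, 8 and 13) follow from how the stub is encoded as 32-bit x86 machine code. As a hedged illustration, the sketch below builds the same instruction sequence by hand into a byte buffer and fills in the three addresses. It is plain portable C with made-up names (build_stub, put_u32), it assumes the full 5-byte `push imm32` encoding, and it uses 0xDEADBEEF as its placeholder value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    STUB_LEN    = 22, /* total size of the encoded stub            */
    OFF_RETURN  = 1,  /* operand of the first push (return address) */
    OFF_STRING  = 8,  /* operand of the second push (DLL path)      */
    OFF_LOADLIB = 13  /* operand of mov eax (LoadLibrary address)   */
};

/* Store a 32-bit value in little-endian byte order. */
static void put_u32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xFF);
    dst[1] = (uint8_t)((v >> 8) & 0xFF);
    dst[2] = (uint8_t)((v >> 16) & 0xFF);
    dst[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Copy the stub template and patch the three placeholders. */
void build_stub(uint8_t out[STUB_LEN], uint32_t return_addr,
                uint32_t string_addr, uint32_t loadlib_addr)
{
    static const uint8_t template_bytes[STUB_LEN] = {
        0x68, 0xEF, 0xBE, 0xAD, 0xDE, /* push 0xDEADBEEF    (return address) */
        0x9C,                         /* pushfd                              */
        0x60,                         /* pushad                              */
        0x68, 0xEF, 0xBE, 0xAD, 0xDE, /* push 0xDEADBEEF    (string address) */
        0xB8, 0xEF, 0xBE, 0xAD, 0xDE, /* mov eax, 0xDEADBEEF (LoadLibrary)   */
        0xFF, 0xD0,                   /* call eax                            */
        0x61,                         /* popad                               */
        0x9D,                         /* popfd                               */
        0xC3                          /* ret                                 */
    };
    assert(out != NULL);
    memcpy(out, template_bytes, STUB_LEN);
    put_u32(out + OFF_RETURN, return_addr);
    put_u32(out + OFF_STRING, string_addr);
    put_u32(out + OFF_LOADLIB, loadlib_addr);
}
```

Each offset is simply the position of the opcode plus one: the first push starts at byte 0, the second at byte 7 (after pushfd and pushad), and the mov at byte 12.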
Now we need to pause the thread in order to get its "context". The context of a thread is the current state of all of its registers, as well as other peripheral information. However, we're mostly concerned with the EIP register, which points to the next instruction to be executed. If we don't suspend the thread before retrieving its context information, it will continue executing, and by the time we get the information it will be invalid. Once we've paused the thread, we retrieve its context information using the GetThreadContext() function. We grab the address of the next instruction to be executed, so that we know where our function should return to. Then it's just a matter of patching up the function to have all of the proper pointers, and forcing the thread to execute it. (A-3)
// block A-3 (continues main() from block A-1)
threadID = GetThreadFromProc(PROC_NAME);
hThread = OpenThread((THREAD_GET_CONTEXT | THREAD_SET_CONTEXT | THREAD_SUSPEND_RESUME), FALSE, threadID);
SuspendThread(hThread);
ctx.ContextFlags = CONTEXT_CONTROL;
GetThreadContext(hThread, &ctx);
oldIP = ctx.Eip;
// Set the EIP of the context to the address of our stub in the target process
ctx.Eip = (DWORD)vfunc;
ctx.ContextFlags = CONTEXT_CONTROL;
// make our local copy of the prototype writable so it can be patched
VirtualProtect(loadDll, funcLen, PAGE_EXECUTE_READWRITE, &oldprot);
// Patch the first push instruction (return address)
memcpy((void *)((unsigned long)loadDll + 1), &oldIP, 4);
// Patch the 2nd push instruction (address of the DLL path in the target)
memcpy((void *)((unsigned long)loadDll + 8), &dllString, 4);
// Patch the mov eax placeholder to mov eax, LoadLibrary
memcpy((void *)((unsigned long)loadDll + 13), &loadLibAddy, 4);
// copy the patched stub into the target process
WriteProcessMemory(hProcess, vfunc, loadDll, funcLen, NULL);
// Set the new context of the target's thread
SetThreadContext(hThread, &ctx);
// Let the target thread continue execution, starting at our function
ResumeThread(hThread);
// give the stub time to run, then clean up
Sleep(8000);
VirtualFreeEx(hProcess, dllString, 0, MEM_RELEASE);
VirtualFreeEx(hProcess, vfunc, 0, MEM_RELEASE);
CloseHandle(hProcess);
CloseHandle(hThread);
unsigned long GetProcIdFromName(char *procName) { // block A-4
    PROCESSENTRY32 pe;
    HANDLE thSnapshot;
    BOOL retval, ProcFound = FALSE;
    thSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if(thSnapshot == INVALID_HANDLE_VALUE) {
        MessageBox(NULL, "Error: unable to create toolhelp snapshot", "Loader", MB_OK);
        return 0;
    }
    pe.dwSize = sizeof(PROCESSENTRY32);
    retval = Process32First(thSnapshot, &pe);
    while(retval) {
        if(StrStrI(pe.szExeFile, procName)) {
            ProcFound = TRUE;
            break;
        }
        retval = Process32Next(thSnapshot, &pe);
    }
    CloseHandle(thSnapshot);
    return ProcFound ? pe.th32ProcessID : 0; // 0 means the process was not found
}
There is another, easier way using the CreateRemoteThread call. It is extremely easy and relatively efficient. Before starting, though, we have to find the process to inject into. The Windows API provides a great function for doing this: CreateToolhelp32Snapshot.
#undef UNICODE
#include <vector>
#include <string>
#include <windows.h>
#include <Tlhelp32.h>
using std::vector;
using std::string;
int main(void) {
vector<string>processNames;
PROCESSENTRY32 pe32;
pe32.dwSize = sizeof(PROCESSENTRY32);
HANDLE hTool32 = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, NULL);
BOOL bProcess = Process32First(hTool32, &pe32);
if(bProcess == TRUE) {
while((Process32Next(hTool32, &pe32)) == TRUE)
processNames.push_back(pe32.szExeFile); // store every process name
}
CloseHandle(hTool32);
return 0;
}
I didn’t bother storing the value after I called Process32First because that will always be “[System Process]”, so there’s really no need. Process32Next returns TRUE on success, so just simply putting it in a loop and pushing the name of the process it received in a vector is what is needed. Once the loop is finished, every single process should be stored in processNames. This is great and all, but where does the DLL injection come in? Well, the PROCESSENTRY32 structure also has a member that holds the Process ID. Inside that loop, while we’re pushing the process names in our vector, we’re also going to inject the DLL.
while((Process32Next(hTool32, &pe32)) == TRUE) {
    processNames.push_back(pe32.szExeFile);
    if(strcmp(pe32.szExeFile, "notepad.exe") == 0) // if we found our target process
    {
        char* DirPath = new char[MAX_PATH];
        char* FullPath = new char[MAX_PATH];
        GetCurrentDirectory(MAX_PATH, DirPath); // get current directory
        sprintf_s(FullPath, MAX_PATH, "%s\\testdll.dll", DirPath); // append our dll to the current directory
        HANDLE hProcess = OpenProcess(PROCESS_CREATE_THREAD | PROCESS_VM_OPERATION |
            PROCESS_VM_WRITE, FALSE, pe32.th32ProcessID); // open process with extended access
        LPVOID LoadLibraryAddr = (LPVOID)GetProcAddress(GetModuleHandle("kernel32.dll"),
            "LoadLibraryA"); // get address of LoadLibraryA
        LPVOID LLParam = (LPVOID)VirtualAllocEx(hProcess, NULL, strlen(FullPath) + 1,
            MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE); // allocate some space for the dll path we made
        WriteProcessMemory(hProcess, LLParam, FullPath, strlen(FullPath) + 1, NULL); // write path (with terminating NUL) to process
        HANDLE hRemote = CreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)LoadLibraryAddr, // create new thread and call LoadLibraryA with our dll path as parameter
            LLParam, 0, NULL);
        if(hRemote != NULL)
            CloseHandle(hRemote);
        CloseHandle(hProcess);
        delete [] DirPath; // clean up
        delete [] FullPath;
    }
}
The code above is pretty straightforward: we first get the current directory and append our DLL name to it, so we can later write the full path into the target process's memory. Then we create a new thread in the target which calls LoadLibraryA with our DLL path as its parameter.
Appendix B: Import Address Table (IAT) Hooking
Before we jump into the Import Address Table you need a bit of background information, so I'll start with the PE format. The Portable Executable (PE) format is a file format for executables, object code, DLLs, FON font files, and others used in 32-bit and 64-bit versions of Windows operating systems. The PE format is a data structure that encapsulates the information necessary for the Windows OS loader to manage the wrapped executable code. This includes dynamic library references for linking, API export and import tables, resource management data and thread-local storage (TLS) data.
One section of note is the import address table (IAT), which is used as a lookup table when the application calls a function in a different module. Imports can be made both by ordinal and by name. Because a compiled program cannot know the memory location of the libraries it depends upon, an indirect jump is required whenever an API call is made. As the dynamic linker loads modules and joins them together, it writes actual addresses into the IAT slots, so that they point to the memory locations of the corresponding library functions. Though this adds an extra jump over the cost of an intra-module call, resulting in a performance penalty, it provides a key benefit: the number of memory pages that need to be copy-on-write changed by the loader is minimized, saving memory and disk I/O time. If the compiler knows ahead of time that a call will be inter-module (via a dllimport attribute), it can produce more optimized code that simply results in an indirect call opcode.
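The idea of a lookup table that can be repointed is easy to model outside of Windows. The following sketch in portable C is only a simulation under stated assumptions: ImportSlot, import_table, iat_find and iat_patch are hypothetical names, and the table is an ordinary array rather than a real PE import section. It shows that once every call goes through a slot, overwriting the slot redirects all callers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*api_fn)(int);

/* One simulated import: a name plus the slot the loader would fill in. */
typedef struct {
    const char *name;
    api_fn      addr;
} ImportSlot;

int real_api(int x) { return x + 1; }   /* stands in for the library function */
int hook_api(int x) { return x + 100; } /* our replacement                    */

/* The simulated table, terminated by an empty entry. */
ImportSlot import_table[] = {
    { "OtherFunction",  real_api },
    { "TargetFunction", real_api },
    { NULL, NULL }
};

/* Find the slot for `name`; returns NULL when it is not imported. */
api_fn *iat_find(ImportSlot *table, const char *name)
{
    assert(table != NULL && name != NULL);
    for (ImportSlot *slot = table; slot->name != NULL; slot++) {
        if (strcmp(slot->name, name) == 0)
            return &slot->addr;
    }
    return NULL;
}

/* Overwrite the slot and return the old address (NULL if not found). */
api_fn iat_patch(ImportSlot *table, const char *name, api_fn replacement)
{
    api_fn *slot = iat_find(table, name);
    if (slot == NULL)
        return NULL;
    api_fn old = *slot;
    *slot = replacement;
    return old;
}

/* How the application calls an import: always through the table. */
int call_import(const ImportSlot *table, size_t index, int arg)
{
    return table[index].addr(arg);
}
```

Keeping the address returned by iat_patch is what lets the hook call the original function afterwards, or restore the slot when it is done.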
IAT hooking has pros and cons:
Cons:
- The function you are hooking must be imported from another module; you can't just hook an arbitrary address in memory. This is not optimal for DirectX hooks, since you will only find createdevice (you can use that to get the device, though), but for OpenGL and such this is handy.
Pros:
- Less detectable. You can make this into a fully external hook, which is less likely to be flagged by antivirus or anti-cheat software because it doesn't use any suspicious calls.
This will be the procedure for internal (dll must be injected in target process) hooking:
- Retrieve DOS/NT Headers
- loop through the import descriptors
So first we get a handle to our main module:
int ip = 0;
if (module == 0)
module = GetModuleHandle(0);
Then we retrieve the headers (warning: whoever wrote the header file for the PE format is certainly a believer in long, descriptive names, along with deeply nested structures and macros. When coding with WINNT.H, it's not uncommon to have mind-blowing expressions):
// get the DOS header
PIMAGE_DOS_HEADER pImgDosHeaders = (PIMAGE_DOS_HEADER)module;
// get the NT header from the DOS header
PIMAGE_NT_HEADERS pImgNTHeaders = (PIMAGE_NT_HEADERS)((LPBYTE)pImgDosHeaders + pImgDosHeaders->e_lfanew);
// get the import descriptor from the NT header (it's all relative, so we keep adding (LPBYTE)pImgDosHeaders)
PIMAGE_IMPORT_DESCRIPTOR pImgImportDesc = (PIMAGE_IMPORT_DESCRIPTOR)((LPBYTE)pImgDosHeaders + pImgNTHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress);
// the size also comes from the NT header (it is a byte count, so no base address is added)
DWORD size = pImgNTHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].Size;
// check if the DOS header is a valid DOS header
if (pImgDosHeaders->e_magic != IMAGE_DOS_SIGNATURE)
    printf("e_magic is no valid DOS signature\n");
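The first two steps of that header walk can be tried without winnt.h. The sketch below is a portable C approximation that works on a raw byte buffer instead of a loaded module: it checks the "MZ" magic, reads e_lfanew from offset 0x3C of the DOS header, and verifies the "PE\0\0" signature at that offset. The function and macro names are made up for this example; only the magic values and the 0x3C offset come from the PE format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DOS_MAGIC       0x5A4Du     /* "MZ" read as a little-endian 16-bit value   */
#define PE_SIGNATURE    0x00004550u /* "PE\0\0" read as a little-endian 32-bit value */
#define E_LFANEW_OFFSET 0x3C        /* position of e_lfanew in the DOS header       */

static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the offset of the NT headers inside `image`,
   or -1 when the buffer does not look like a PE image. */
long find_nt_headers(const uint8_t *image, size_t size)
{
    assert(image != NULL);
    if (size < (size_t)E_LFANEW_OFFSET + 4)
        return -1;                              /* too small for a DOS header */
    if (read_u16(image) != DOS_MAGIC)
        return -1;                              /* e_magic is not "MZ"        */

    uint32_t e_lfanew = read_u32(image + E_LFANEW_OFFSET);
    if (e_lfanew > size || size - e_lfanew < 4)
        return -1;                              /* offset points outside      */
    if (read_u32(image + e_lfanew) != PE_SIGNATURE)
        return -1;                              /* no "PE\0\0" signature      */
    return (long)e_lfanew;
}
```

Checking the signatures before following any offset is the same precaution as the e_magic test above, just done before the pointers are used rather than after.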
Now we basically have enough information to start writing the loops that lead to the function pointer. Note that every imported DLL has its own IMAGE_IMPORT_DESCRIPTOR; that's why we loop through all of them:
for (IMAGE_IMPORT_DESCRIPTOR* iid = pImgImportDesc; iid->Name != NULL; iid++){}
Inside this loop, we loop through the functions; adding one to the index on FirstThunk gets you to the next thunk, and so on.
for (int funcIdx = 0; *(funcIdx + (LPVOID*)(iid->FirstThunk + (SIZE_T)module)) != NULL; funcIdx++){}
Now if you look at the IMAGE_IMPORT_BY_NAME structure that each OriginalFirstThunk entry points to, you can see that the name starts 2 bytes in (after the Hint field), so:
char* name = (char*)(*(funcIdx + (SIZE_T*)(iid->OriginalFirstThunk + (SIZE_T)module)) + (SIZE_T)module + 2);
When we have the name we can compare it with our target and patch the address. The function will look like this:
void** ninehook::IATfind(const char* function, HMODULE module){
    if (module == 0)
        module = GetModuleHandle(0);
    PIMAGE_DOS_HEADER pImgDosHeaders = (PIMAGE_DOS_HEADER)module;
    if (pImgDosHeaders->e_magic != IMAGE_DOS_SIGNATURE) {
        printf("e_magic is no valid DOS signature\n");
        return 0;
    }
    PIMAGE_NT_HEADERS pImgNTHeaders = (PIMAGE_NT_HEADERS)((LPBYTE)pImgDosHeaders + pImgDosHeaders->e_lfanew);
    PIMAGE_IMPORT_DESCRIPTOR pImgImportDesc = (PIMAGE_IMPORT_DESCRIPTOR)((LPBYTE)pImgDosHeaders + pImgNTHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress);
    // one descriptor per imported DLL
    for (IMAGE_IMPORT_DESCRIPTOR* iid = pImgImportDesc; iid->Name != NULL; iid++){
        // one thunk per imported function
        for (int funcIdx = 0; *(funcIdx + (LPVOID*)(iid->FirstThunk + (SIZE_T)module)) != NULL; funcIdx++){
            SIZE_T nameRva = *(funcIdx + (SIZE_T*)(iid->OriginalFirstThunk + (SIZE_T)module));
            if (IMAGE_SNAP_BY_ORDINAL(nameRva))
                continue; // imported by ordinal, there is no name to compare
            char* modFuncName = (char*)(nameRva + (SIZE_T)module + 2); // skip the 2-byte Hint
            if (!_stricmp(function, modFuncName))
                return funcIdx + (LPVOID*)(iid->FirstThunk + (SIZE_T)module);
        }
    }
    return 0;
}
And that's it! Now we can just patch the IAT entry that IATfind returned (funcptr below):
DWORD oldrights, newrights = PAGE_READWRITE;
VirtualProtect(funcptr, sizeof(LPVOID), newrights, &oldrights);
oldfunctionptr = *funcptr;
*funcptr = newfunction;
VirtualProtect(funcptr, sizeof(LPVOID), oldrights, &newrights);
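The lookup-and-patch steps above can also be tried without a PE file. What follows is only a minimal, portable sketch of the idea, not the real Windows structures: ToyModule, toyImport, toyFind, toyPatch and the other names are all hypothetical, and the "image" is an ordinary byte vector. It keeps the two things that matter: names are stored as a 2-byte hint followed by the string (hence the +2), and calls go through a table of addresses that ends with a NULL entry.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Fn = int (*)(int);

// Hypothetical toy module: nameRvas plays the role of OriginalFirstThunk,
// addresses plays the role of FirstThunk (terminated by a nullptr entry).
struct ToyModule {
    std::vector<unsigned char> image = std::vector<unsigned char>(2, 0); // padding so no RVA is 0
    std::vector<std::uint32_t> nameRvas;
    std::vector<Fn> addresses = std::vector<Fn>(1, nullptr);
};

// Store a {16-bit hint, name} entry in the image and bind the import to target.
void toyImport(ToyModule& m, const char* name, Fn target) {
    std::uint32_t rva = static_cast<std::uint32_t>(m.image.size());
    m.image.push_back(0);
    m.image.push_back(0);                                            // the hint
    m.image.insert(m.image.end(), name, name + std::strlen(name) + 1); // name + NUL
    m.nameRvas.push_back(rva);
    m.addresses.insert(m.addresses.end() - 1, target);               // terminator stays last
}

// Portable stand-in for _stricmp(a, b) == 0.
static bool equalsIgnoreCase(const char* a, const char* b) {
    for (; *a && *b; ++a, ++b)
        if (std::tolower(static_cast<unsigned char>(*a)) !=
            std::tolower(static_cast<unsigned char>(*b)))
            return false;
    return *a == *b;
}

// Return a pointer to the address slot of the named import, or nullptr.
// The pointer is only valid until the next toyImport call.
Fn* toyFind(ToyModule& m, const char* function) {
    for (std::size_t i = 0; m.addresses[i] != nullptr; ++i) {
        // +2 skips the hint that precedes the name
        const char* name =
            reinterpret_cast<const char*>(m.image.data() + m.nameRvas[i] + 2);
        if (equalsIgnoreCase(function, name))
            return &m.addresses[i];
    }
    return nullptr;
}

// Overwrite the slot and hand back the original address.
// No VirtualProtect is needed here because the vector is ordinary writable memory.
Fn toyPatch(Fn* slot, Fn replacement) {
    Fn original = *slot;
    *slot = replacement;
    return original;
}

// What the "program" does: every call goes through the table.
int callImport(ToyModule& m, const char* name, int arg) {
    Fn* slot = toyFind(m, name);
    assert(slot != nullptr && "import not found");
    return (*slot)(arg);
}

int realTwice(int x)  { return x * 2; }
int realNegate(int x) { return -x; }

Fn g_origTwice = realTwice;                          // filled in by toyPatch
int hookedTwice(int x) { return g_origTwice(x) + 1; } // custom code, then the original

ToyModule makeToyModule() {
    ToyModule m;
    toyImport(m, "Twice", realTwice);
    toyImport(m, "Negate", realNegate);
    return m;
}
```

After g_origTwice = toyPatch(toyFind(m, "twice"), hookedTwice), every callImport(m, "Twice", ...) ends up in hookedTwice, while "Negate" is untouched. That is the same effect the real IAT patch has on the real module.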
Appendix C: MS-Detours 1.5 (Direct3d)
There is one important function we are going to use: it's called DetourFunction. First we need a typedef of the function we are going to hook (EndScene in this case, since it is called after the drawing is done, so our hook can add its own drawing right before the frame is finished).
#pragma comment(lib, "d3d9.lib")
#pragma comment(lib, "d3dx9.lib")
// note that the device is a parameter; you can check this by reversing the calls of a real d3d program
typedef HRESULT(WINAPI* tEndScene)(LPDIRECT3DDEVICE9 pDevice);
tEndScene oEndScene = NULL;
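To actually hook EndScene we need the address of the original function. This can be done in two ways. The first is to reverse a sample Direct3D program, find the address of the EndScene call, and add that offset to the module base of d3d9.dll. The second is to use the GetProcAddress function. The problem with the first way is that it is platform dependent: the offset differs between the 32-bit and 64-bit versions of Windows (and between versions of d3d9.dll). Be aware that GetProcAddress only resolves functions that d3d9.dll exports by name; EndScene is a method of the IDirect3DDevice9 interface, so always check the returned address for NULL and, if the lookup fails, read the address from the device's virtual table instead (see Appendix D).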
HMODULE hd3d9 = GetModuleHandle("d3d9.dll");
// DetourFunction from MS-Detours: the first parameter is the original address and the second is our detour function
oEndScene = (tEndScene)DetourFunction( (LPBYTE)GetProcAddress(hd3d9, "EndScene" ), (LPBYTE)&hkEndScene);
// where our detour function would look something like this
HRESULT WINAPI hkEndScene(LPDIRECT3DDEVICE9 pDevice){
// do evil
return oEndScene(pDevice);
}
What we did here is retrieve the address with GetProcAddress and pass it as the first parameter; the second parameter is a pointer to our own detour function (hkEndScene). DetourFunction returns a pointer through which the original function can still be called, which we store in oEndScene. Now you can add drawing functions to the original program; benchmarking programs make good use of this.
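It helps to know what such a detour writes into the target. On 32-bit x86, an inline hook typically overwrites the first five bytes of the original function with a relative JMP (opcode 0xE9) to the detour, and keeps the displaced instructions in a trampoline that jumps back. The sketch below only computes those five bytes; it does not patch anything. The names encodeJmpRel32 and decodeJmpTarget are hypothetical and are not part of MS-Detours.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Build the 5 bytes of "JMP rel32" that, when placed at address `from`,
// transfer control to address `to` (32-bit x86 assumed).
std::array<std::uint8_t, 5> encodeJmpRel32(std::uint32_t from, std::uint32_t to) {
    // The displacement is measured from the end of the 5-byte instruction.
    // Unsigned wrap-around gives the two's complement value for backward jumps.
    std::uint32_t rel = to - (from + 5);
    std::array<std::uint8_t, 5> bytes = {{
        0xE9,
        static_cast<std::uint8_t>(rel & 0xFF),          // little-endian displacement
        static_cast<std::uint8_t>((rel >> 8) & 0xFF),
        static_cast<std::uint8_t>((rel >> 16) & 0xFF),
        static_cast<std::uint8_t>((rel >> 24) & 0xFF)
    }};
    return bytes;
}

// The reverse: given the patched bytes found at `from`, where does the jump land?
// Handy when inspecting an already hooked function in a debugger.
std::uint32_t decodeJmpTarget(std::uint32_t from, const std::array<std::uint8_t, 5>& bytes) {
    assert(bytes[0] == 0xE9 && "not a JMP rel32");
    std::uint32_t rel = static_cast<std::uint32_t>(bytes[1])
                      | (static_cast<std::uint32_t>(bytes[2]) << 8)
                      | (static_cast<std::uint32_t>(bytes[3]) << 16)
                      | (static_cast<std::uint32_t>(bytes[4]) << 24);
    return from + 5 + rel;
}
```

For example, a jump placed at 0x1000 that has to reach 0x2000 gets the displacement 0x2000 - 0x1005 = 0x0FFB, so the bytes are E9 FB 0F 00 00.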
Appendix D: Virtual Table (Vtable) Patching
#include <windows.h>
#include <tchar.h>
#include <iostream>
using namespace std;
class VirtualTable { // example class
public:
virtual void VirtualFunction01( void );
};
void VirtualTable::VirtualFunction01( void ) { // just a function as example
cout << "VirtualFunction01 called" << endl;
}
// pointer to original function
typedef void ( __thiscall* VirtualFunction01_t )( void* thisptr );
VirtualFunction01_t g_org_VirtualFunction01;
//our detour function
void __fastcall hk_VirtualFunction01( void* thisptr, int edx ) {
cout << "Custom function called" << endl;
//call the original function
g_org_VirtualFunction01(thisptr);
}
int _tmain(int argc, _TCHAR* argv[]) {
    VirtualTable* myTable = new VirtualTable();
    //get the pointer to the actual virtual method table from our pointer to our class instance
    void** base = *(void***)myTable;
    DWORD oldProtection, tmpProtection;
    // make the vtable slot writable (sizeof(void*) instead of 4, so it also covers a 64-bit pointer)
    VirtualProtect( &base[0], sizeof(void*), PAGE_EXECUTE_READWRITE, &oldProtection );
    //save the original function
    g_org_VirtualFunction01 = (VirtualFunction01_t)base[0];
    //overwrite
    base[0] = (void*)&hk_VirtualFunction01;
    //restore page protection (the last argument must be a valid pointer, VirtualProtect fails on NULL)
    VirtualProtect( &base[0], sizeof(void*), oldProtection, &tmpProtection );
    //call the virtual function (now hooked) from our class instance
    myTable->VirtualFunction01();
    delete myTable;
    return 0;
}
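One consequence of the example above is easy to miss: the table is shared by every instance of the class, so patching a slot hooks all objects at once. If only one object should be hooked, a common alternative is to give that object a private copy of the table and redirect its table pointer. The sketch below shows both variants on a hand-rolled table instead of a real compiler-generated vtable, so it stays portable. Object, Method, patchSharedTable and patchSingleObject are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <string>

struct Object;
using Method = std::string (*)(Object*);

struct Object {
    Method* vtable;     // stands in for the hidden vptr a compiler puts at offset 0
    std::string name;
};

std::string speak(Object* self) { return self->name + " speaks"; }

Method g_classTable[1] = { speak };   // one table, shared by every instance
Method g_origSpeak = speak;           // original slot value, used by the hook

// The detour: custom behaviour first, then the original function.
std::string hookedSpeak(Object* self) { return "[hooked] " + g_origSpeak(self); }

Object makeObject(const std::string& name) { return Object{ g_classTable, name }; }

// What a virtual call boils down to: fetch the slot, call it with `this`.
std::string callSpeak(Object& o) {
    assert(o.vtable != nullptr && o.vtable[0] != nullptr);
    return o.vtable[0](&o);
}

// Variant 1: overwrite the slot in the shared table (hooks every object).
Method patchSharedTable(Method replacement) {
    Method original = g_classTable[0];
    g_classTable[0] = replacement;
    return original;
}

// Variant 2: point one object at a private table (hooks only that object).
// privateTable must outlive the object.
void patchSingleObject(Object& o, Method* privateTable, Method replacement) {
    privateTable[0] = replacement;
    o.vtable = privateTable;
}
```

With two objects a and b, patchSingleObject(a, table, hookedSpeak) changes only what callSpeak(a) returns, while g_origSpeak = patchSharedTable(hookedSpeak) also changes b and every object created afterwards.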
Appendix E: Example : Hiding process under Windows
enum SYSTEM_INFORMATION_CLASS
{
SystemProcessInformation = 5
};
struct SYS_PROCESS_INFO
{
ULONG NextEntryOffset; // next entry
ULONG NumberOfThreads;
LARGE_INTEGER Reserved[3];
LARGE_INTEGER CreateTime;
LARGE_INTEGER UserTime;
LARGE_INTEGER KernelTime;
UNICODE_STRING ImageName; // Process name
};
NTSTATUS (__stdcall *origNtQuerySystemInformation)(SYSTEM_INFORMATION_CLASS, PVOID, ULONG, PULONG); // original function pointer
Now all that is left is to define our detour function and use mhook to hook it.
First I will show our detour function.
NTSTATUS WINAPI myNtQuerySystemInformation(SYSTEM_INFORMATION_CLASS SysInfoClass, PVOID SysInfo, ULONG SysInfoLength, PULONG RetLength)
{
    // let the real function fill the buffer first
    NTSTATUS Return = origNtQuerySystemInformation(SysInfoClass, SysInfo, SysInfoLength, RetLength);
    if((SysInfoClass == SystemProcessInformation) && (Return == STATUS_SUCCESS))
    {
        // pointer arithmetic goes through ULONG_PTR; a cast to int would truncate the address on 64-bit
        SYS_PROCESS_INFO* CurrentStructure = (SYS_PROCESS_INFO*)SysInfo;
        SYS_PROCESS_INFO* NextStructure = (SYS_PROCESS_INFO*)((ULONG_PTR)CurrentStructure + CurrentStructure->NextEntryOffset);
        while(CurrentStructure->NextEntryOffset != 0){
            if((wcsncmp(NextStructure->ImageName.Buffer, L"explorer.exe", NextStructure->ImageName.Length) == 0) || ((wcsncmp(NextStructure->ImageName.Buffer, L"notepad.exe", NextStructure->ImageName.Length) == 0)))
            {
                if(NextStructure->NextEntryOffset == 0) {
                    // the hidden entry was the last one: end the list at the current entry
                    CurrentStructure->NextEntryOffset = 0;
                }
                else {
                    // jump over the hidden entry and stay on the current one, so the new successor is checked too
                    CurrentStructure->NextEntryOffset = CurrentStructure->NextEntryOffset + NextStructure->NextEntryOffset;
                    NextStructure = CurrentStructure;
                }
            }
            CurrentStructure = NextStructure;
            NextStructure = (SYS_PROCESS_INFO*)((ULONG_PTR)CurrentStructure + CurrentStructure->NextEntryOffset);
        }
    }
    return Return;
}
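What we basically do here is walk the list and check the name of every process. When an entry matches one of our process names, we unlink it by adding its NextEntryOffset to that of the previous entry (or by setting the previous entry's offset to 0 if it was the last one), and then return the result of the original call without our process. Now we hook it using mhook.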
HMODULE hNTDLL = NULL;
while(hNTDLL == NULL) {
hNTDLL = GetModuleHandle("ntdll.dll");
}
origNtQuerySystemInformation = (NTSTATUS (__stdcall*)(SYSTEM_INFORMATION_CLASS, PVOID, ULONG, PULONG))GetProcAddress(hNTDLL, "NtQuerySystemInformation");
Mhook_SetHook((PVOID*)&origNtQuerySystemInformation, myNtQuerySystemInformation);
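The unlinking done in the detour function can be tried in isolation. Below is a minimal, portable sketch of the same walk over a list whose entries are chained by byte offsets. ToyEntry is a hypothetical, simplified stand-in for SYS_PROCESS_INFO (a fixed-size char name instead of a UNICODE_STRING), and buildList, hideEntries and listNames are helper names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

struct ToyEntry {
    std::uint32_t nextEntryOffset;   // bytes to the next entry, 0 marks the last one
    char imageName[28];
};

// Follow the byte offset to the next entry.
static ToyEntry* nextOf(ToyEntry* e) {
    return reinterpret_cast<ToyEntry*>(
        reinterpret_cast<unsigned char*>(e) + e->nextEntryOffset);
}

// Lay the entries out back to back, each one pointing sizeof(ToyEntry) bytes ahead.
std::vector<ToyEntry> buildList(const std::vector<std::string>& names) {
    assert(!names.empty());
    std::vector<ToyEntry> list(names.size());
    for (std::size_t i = 0; i < names.size(); ++i) {
        list[i].nextEntryOffset =
            (i + 1 < names.size()) ? static_cast<std::uint32_t>(sizeof(ToyEntry)) : 0;
        std::strncpy(list[i].imageName, names[i].c_str(), sizeof(list[i].imageName) - 1);
        list[i].imageName[sizeof(list[i].imageName) - 1] = '\0';
    }
    return list;
}

// Unlink every entry called `name`. As in the detour function, only the successor
// of the current entry is inspected, so the very first entry is never hidden.
void hideEntries(ToyEntry* first, const char* name) {
    ToyEntry* cur = first;
    while (cur->nextEntryOffset != 0) {
        ToyEntry* next = nextOf(cur);
        if (std::strcmp(next->imageName, name) == 0) {
            // Skip over the match; stay on cur so the new successor is checked as well.
            cur->nextEntryOffset =
                (next->nextEntryOffset == 0) ? 0
                                             : cur->nextEntryOffset + next->nextEntryOffset;
        } else {
            cur = next;
        }
    }
}

// What a task manager would see when it walks the list.
std::vector<std::string> listNames(ToyEntry* first) {
    std::vector<std::string> out;
    for (ToyEntry* e = first; ; e = nextOf(e)) {
        out.push_back(e->imageName);
        if (e->nextEntryOffset == 0) break;
    }
    return out;
}
```

The hidden entries are still in the buffer; nothing is erased. They just become unreachable for anyone who follows the offsets, which is all a process lister does.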