
Chapter 9. Exercises

1. Improving the Profiler

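The profiler from the chapter records every system call, including those that occur outside the given application code. Modify the profiler so that it determines where each system call originates and profiles only those issued by the target program itself. Be sure to consult the Pin user manual when implementing this feature.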
******* CONTROL TRANSFERS *******
0x00401000 <- 0x00403f7c: 1 (4.35%)
0x00401015 <- 0x0040100e: 1 (4.35%)
0x00401020 <- 0x0040118b: 1 (4.35%)
0x00401180 <- 0x004013f4: 1 (4.35%)
0x00401186 <- 0x00401180: 1 (4.35%)
0x00401335 <- 0x00401333: 1 (4.35%)
0x00401400 <- 0x0040148d: 1 (4.35%)
0x00401430 <- 0x00401413: 1 (4.35%)
0x00401440 <- 0x004014ab: 1 (4.35%)
0x00401478 <- 0x00401461: 1 (4.35%)
0x00401489 <- 0x00401487: 1 (4.35%)
0x00401492 <- 0x00401431: 1 (4.35%)
0x004014a0 <- 0x00403f99: 1 (4.35%)
0x004014ab <- 0x004014a9: 1 (4.35%)
0x00403f81 <- 0x00401019: 1 (4.35%)
0x00403f86 <- 0x00403f84: 1 (4.35%)
0x00403f9d <- 0x00401479: 1 (4.35%)
0x00403fa6 <- 0x00403fa4: 1 (4.35%)
0x7ff08edcb7bf <- 0x00403fb4: 1 (4.35%)
0x7ff08edcb830 <- 0x00401337: 1 (4.35%)
0x7ff0a27bcde7 <- 0x0040149a: 1 (4.35%)
0x7ff0a27bce05 <- 0x00404004: 1 (4.35%)
0x7ff0a27c3870 <- 0x00401026: 1 (4.35%)

******* FUNCTION CALLS *******
[_init ] 0x00401000 <- 0x00403f7c: 1 (25.00%)
[__libc_start_main@plt ] 0x00401180 <- 0x004013f4: 1 (25.00%)
[ ] 0x00401400 <- 0x0040148d: 1 (25.00%)
[ ] 0x004014a0 <- 0x00403f99: 1 (25.00%)

******* SYSCALLS *******
0: 1 (4.00%)
2: 2 (8.00%)
3: 2 (8.00%)
5: 2 (8.00%)
9: 7 (28.00%)
10: 4 (16.00%)
11: 1 (4.00%)
12: 1 (4.00%)
21: 3 (12.00%)
158: 1 (4.00%)
231: 1 (4.00%)
log_syscall is defined with the callback type below. Its threadIndex parameter (tid in the source code) is the Pin thread ID of the thread making the system call, so it can be used to tell which thread issued each system call.
◆ SYSCALL_ENTRY_CALLBACK
typedef VOID(* SYSCALL_ENTRY_CALLBACK) (THREADID threadIndex, CONTEXT *ctxt, SYSCALL_STANDARD std, VOID *v)

Call-back function before execution of a system call.

Parameters
[in] threadIndex  The Pin thread ID of the thread that executes the system call.
[in,out] ctxt  Application's register state immediately before execution of the system call. The tool may change this and affect the new register state.
[in] std  The system calling standard.
[in] v  The tool's call-back value.
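At first I built the filtering logic around tid, but the output was not what I expected. On closer inspection, every tid turned out to be 0.
In other words, all system calls made while running /bin/true are executed on a single thread. (An operating-system thread ID would normally never be 0, but a Pin thread ID is an index assigned by Pin that starts at 0 for the first thread.)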
#include <stdio.h>
#include <map>
#include <string>
#include <asm-generic/unistd.h>

#include "pin.H"

KNOB<bool> ProfileCalls(KNOB_MODE_WRITEONCE, "pintool", "c", "0", "Profile function calls");
KNOB<bool> ProfileSyscalls(KNOB_MODE_WRITEONCE, "pintool", "s", "0", "Profile syscalls");

std::map<ADDRINT, std::map<ADDRINT, unsigned long> > cflows;
std::map<ADDRINT, std::map<ADDRINT, unsigned long> > calls;
std::map<ADDRINT, unsigned long> syscalls;
std::map<ADDRINT, std::string> funcnames;

unsigned long insn_count = 0;
unsigned long cflow_count = 0;
unsigned long call_count = 0;
unsigned long syscall_count = 0;
/* added: Pin thread ID of the first thread seen; Pin thread IDs start at 0,
 * so 0 cannot double as the "not set yet" marker */
THREADID first_thread_id = INVALID_THREADID;

...

static void
log_syscall(THREADID tid, CONTEXT *ctxt, SYSCALL_STANDARD std, VOID *v)
{
  //printf("tid : 0x%x\n",tid);
  /* remember the first thread that makes a system call */
  if(first_thread_id == INVALID_THREADID) { first_thread_id = tid; }
  /* count only system calls made by that thread (including its first one) */
  if(first_thread_id == tid) {
    syscalls[PIN_GetSyscallNumber(ctxt, std)]++;
    syscall_count++;
  }
}

...

-----------------------------------------------------
binary@binary-VirtualBox:~/pin/pin-3.6-97554-g31f0a167d-gcc-linux$ ./pin -t ~/code/chapter9/profiler/obj-intel64/profiler.so -c -s -- /bin/true
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
tid : 0x0
executed 95 instructions

******* CONTROL TRANSFERS *******
0x00401000 <- 0x00403f7c: 1 (4.35%)
0x00401015 <- 0x0040100e: 1 (4.35%)
0x00401020 <- 0x0040118b: 1 (4.35%)
0x00401180 <- 0x004013f4: 1 (4.35%)
0x00401186 <- 0x00401180: 1 (4.35%)
0x00401335 <- 0x00401333: 1 (4.35%)
0x00401400 <- 0x0040148d: 1 (4.35%)
0x00401430 <- 0x00401413: 1 (4.35%)
0x00401440 <- 0x004014ab: 1 (4.35%)
0x00401478 <- 0x00401461: 1 (4.35%)
0x00401489 <- 0x00401487: 1 (4.35%)
0x00401492 <- 0x00401431: 1 (4.35%)
0x004014a0 <- 0x00403f99: 1 (4.35%)
0x004014ab <- 0x004014a9: 1 (4.35%)
0x00403f81 <- 0x00401019: 1 (4.35%)
0x00403f86 <- 0x00403f84: 1 (4.35%)
0x00403f9d <- 0x00401479: 1 (4.35%)
0x00403fa6 <- 0x00403fa4: 1 (4.35%)
0x7feaab29c7bf <- 0x00403fb4: 1 (4.35%)
0x7feaab29c830 <- 0x00401337: 1 (4.35%)
0x7feabec8dde7 <- 0x0040149a: 1 (4.35%)
0x7feabec8de05 <- 0x00404004: 1 (4.35%)
0x7feabec94870 <- 0x00401026: 1 (4.35%)

******* FUNCTION CALLS *******
[_init ] 0x00401000 <- 0x00403f7c: 1 (25.00%)
[__libc_start_main@plt ] 0x00401180 <- 0x004013f4: 1 (25.00%)
[ ] 0x00401400 <- 0x0040148d: 1 (25.00%)
[ ] 0x004014a0 <- 0x00403f99: 1 (25.00%)

******* SYSCALLS *******
0: 1 (4.00%)
2: 2 (8.00%)
3: 2 (8.00%)
5: 2 (8.00%)
9: 7 (28.00%)
10: 4 (16.00%)
11: 1 (4.00%)
12: 1 (4.00%)
21: 3 (12.00%)
158: 1 (4.00%)
231: 1 (4.00%)
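Because every system call here runs on the same Pin thread, a thread-ID filter cannot separate calls issued from the target's own code from those issued by the loader or shared libraries. One alternative is to filter on the address of the syscall instruction. The following is a minimal, Pin-independent sketch of that idea: `AddrRange`, `SyscallFilter`, and the addresses in the usage note are hypothetical names and values chosen for illustration. In a real Pin tool the range would presumably be recorded in an image-load callback (using `IMG_IsMainExecutable`, `IMG_LowAddress`, and `IMG_HighAddress`) and the address read with `PIN_GetContextReg(ctxt, REG_INST_PTR)`; that wiring is an assumption and should be checked against the Pin manual.

```cpp
#include <cstdint>
#include <map>

// Hypothetical model of the address range occupied by the main executable.
struct AddrRange {
    std::uint64_t low;   // lowest address of the image
    std::uint64_t high;  // highest address of the image (inclusive)

    bool contains(std::uint64_t addr) const {
        return addr >= low && addr <= high;
    }
};

// Counts system calls, but only those whose syscall instruction lies
// inside the main executable's address range.
class SyscallFilter {
public:
    explicit SyscallFilter(AddrRange main_image) : main_image_(main_image) {}

    // ip:     address of the syscall instruction
    // number: system call number
    // Returns true if the call was counted, false if it was filtered out.
    bool log(std::uint64_t ip, std::uint64_t number) {
        if (!main_image_.contains(ip)) {
            return false;  // issued from outside the target program
        }
        ++counts_[number];
        ++total_;
        return true;
    }

    // Number of counted calls with the given system call number.
    unsigned long count(std::uint64_t number) const {
        auto it = counts_.find(number);
        return it == counts_.end() ? 0 : it->second;
    }

    unsigned long total() const { return total_; }

private:
    AddrRange main_image_;
    std::map<std::uint64_t, unsigned long> counts_;
    unsigned long total_ = 0;
};
```

With a range such as `AddrRange{0x400000, 0x404fff}`, a call logged from 0x401337 would be counted while one logged from a 0x7ff... library address would be dropped. Note that in a dynamically linked program most syscall instructions live in libc, so a strict address filter may count very few calls; attributing a call to the code that invoked the library wrapper would need additional call-stack tracking.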

2. Investigating the Unpacked Files

While working through the unpacker, unpacking the packed /bin/ls produced several additional files. Examine what those files contain and consider why the unpacker dumped them as well.
Running the unpacker on the packed file produced three dumps.
binary@binary-VirtualBox:~/pin/pin-3.6-97554-g31f0a167d-gcc-linux$ head unpacker.log
------- unpacking binary -------
extracting unpacked region 0x0000000000800000 ( 53.7kB) wx entry 0x000000000080d465 // #1
extracting unpacked region 0x0000000000800000 ( 55.3kB) wx entry 0x000000000080d6d0 // #2
extracting unpacked region 0x0000000000400000 ( 118.6kB) wx entry 0x000000000040000c // #3
******* Memory access clusters *******
0x0000000000400000 ( 118.6kB) wx: ================================================================================
0x0000000000800000 ( 55.3kB) wx: =====================================
0x000000000061de00 ( 4.5kB) w-: ===
0x00007ffe403a7450 ( 3.8kB) w-: ==
0x00007f5253ca32a0 ( 3.3kB) w-: ==
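Dump #3 is nearly identical to the /bin/ls binary that upx itself produces when it unpacks the file, which suggests that the UPX stub allocates memory and unpacks the packed data into it.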
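As for why the same memory region was logged twice (#1 and #2), my inference is as follows. The logged entry points 0x80d465 and 0x80d6d0 appear to correspond to 0x40d465 and 0x40d6d0 in the disassembly, offset by 0x400000.

In the first case, 0x40d465 is the return address of the first call in the UPX stub. The unpacker presumably took it for an OEP and performed the dump:

LOAD:000000000040D460 public start
LOAD:000000000040D460 start proc near ; DATA XREF: LOAD:0000000000400018↑o
LOAD:000000000040D460 call loc_40D6B8
LOAD:000000000040D465 push rbp // this address
LOAD:000000000040D466 push rbx
LOAD:000000000040D467 push rcx
LOAD:000000000040D468 push rdx
LOAD:000000000040D469 add rsi, rdi

In the second case, 0x40d6d0 is initially not decoded as an instruction because the disassembly is misaligned there (note the stray db 0E8h at 0x40d6cf). Once control actually reaches it, it is detected as new code, and the code it calls performs a syscall and an indirect jmp. The indirect jmp inside sub_40D734 (jmp qword ptr [r15] at 0x40d7c1) seems to be the reason this dump was produced as well.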
LOAD:000000000040D6B8 loc_40D6B8: ; CODE XREF: start↑p
LOAD:000000000040D6B8 pop rbp
LOAD:000000000040D6B9 lea rax, [rbp-9]
LOAD:000000000040D6BD mov r15d, [rax]
LOAD:000000000040D6C0 mov edx, 0C8h
LOAD:000000000040D6C5 sub rax, r15
LOAD:000000000040D6C8 sub r15d, edx
LOAD:000000000040D6CB lea rcx, [rax+rdx]
LOAD:000000000040D6CB ; ---------------------------------------------------------------------------
LOAD:000000000040D6CF db 0E8h
LOAD:000000000040D6D0 ; ---------------------------------------------------------------------------
LOAD:000000000040D6D0 call sub_40D734
LOAD:000000000040D6D0 ; ---------------------------------------------------------------------------
LOAD:000000000040D6D5 db 2Fh ; /
LOAD:000000000040D6D6 db 70h ; p
LOAD:000000000040D6D7 db 72h ; r
LOAD:000000000040D6D8 db 6Fh ; o
LOAD:000000000040D6D9 db 63h ; c
LOAD:000000000040D6DA db 2Fh ; /
LOAD:000000000040D6DB db 73h ; s
LOAD:000000000040D6DC db 65h ; e
LOAD:000000000040D6DD db 6Ch ; l
LOAD:000000000040D6DE aFExe db 'f/exe',0
LOAD:000000000040D734 sub_40D734 proc near ; CODE XREF: LOAD:000000000040D6D0↑p
LOAD:000000000040D734 pop r9
LOAD:000000000040D736 mov rsi, rsp
LOAD:000000000040D739 lea rdi, [rsi-1010h]
LOAD:000000000040D740 mov rsp, rdi
LOAD:000000000040D743 push 7
LOAD:000000000040D745 pop rcx
LOAD:000000000040D746 rep movsq
LOAD:000000000040D749
LOAD:000000000040D749 loc_40D749: ; CODE XREF: sub_40D734+1B↓j
LOAD:000000000040D749 cmp qword ptr [rsi], 0
LOAD:000000000040D74D movsq
LOAD:000000000040D74F jnz short loc_40D749
LOAD:000000000040D751 mov rdx, rdi
LOAD:000000000040D754 stosq
LOAD:000000000040D756
LOAD:000000000040D756 loc_40D756: ; CODE XREF: sub_40D734+28↓j
LOAD:000000000040D756 cmp qword ptr [rsi], 0
LOAD:000000000040D75A movsq
LOAD:000000000040D75C jnz short loc_40D756
LOAD:000000000040D75E push rdi
LOAD:000000000040D75F
LOAD:000000000040D75F loc_40D75F: ; CODE XREF: sub_40D734+33↓j
LOAD:000000000040D75F cmp qword ptr [rsi], 0
LOAD:000000000040D763 movsq
LOAD:000000000040D765 movsq
LOAD:000000000040D767 jnz short loc_40D75F
LOAD:000000000040D769 lea r15, [rdi-8]
LOAD:000000000040D76D mov [rdx], rdi
LOAD:000000000040D770 mov eax, 3D202020h
LOAD:000000000040D775 stosd
LOAD:000000000040D776 mov edx, 1000h ; bufsiz
LOAD:000000000040D77B mov rsi, rdi ; buf
LOAD:000000000040D77E mov rdi, r9 ; path
LOAD:000000000040D781 push 59h ; 'Y'
LOAD:000000000040D783 pop rax
LOAD:000000000040D784 syscall ; LINUX - sys_readlink
LOAD:000000000040D786 test eax, eax
LOAD:000000000040D788 js short loc_40D78E
LOAD:000000000040D78A mov byte ptr [rsi+rax], 0
LOAD:000000000040D78E
LOAD:000000000040D78E loc_40D78E: ; CODE XREF: sub_40D734+54↑j
LOAD:000000000040D78E add r9, 0Fh
LOAD:000000000040D792 pop rcx
LOAD:000000000040D793 pop rsi
LOAD:000000000040D794 pop rdi
LOAD:000000000040D795 sub rsp, 800h
LOAD:000000000040D79C mov rdx, rsp
LOAD:000000000040D79F mov r8, rbp
LOAD:000000000040D7A2 push 0
LOAD:000000000040D7A4 call loc_40DC0A
LOAD:000000000040D7A9 pop rdx
LOAD:000000000040D7AA add rsp, 800h
LOAD:000000000040D7B1 pop rsi
LOAD:000000000040D7B2 pop rdi
LOAD:000000000040D7B3 pop rcx
LOAD:000000000040D7B4 pop rcx
LOAD:000000000040D7B5 shl ecx, 0Ch
LOAD:000000000040D7B8 add rdi, rcx
LOAD:000000000040D7BB sub esi, ecx
LOAD:000000000040D7BD push rax
LOAD:000000000040D7BE push 0Bh
LOAD:000000000040D7C0 pop rax
LOAD:000000000040D7C1 jmp qword ptr [r15]
LOAD:000000000040D7C1 sub_40D734 endp
LOAD:000000000040D7C1
LOAD:000000000040D7C4 ; ---------------------------------------------------------------------------
LOAD:000000000040D7C4 mov al, 0Bh
LOAD:000000000040D7C6 jmp short loc_40D7D5
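One further observation may help explain why the 0x800000 region was dumped twice: the offset of the second entry point within the region, expressed in kB, matches the size of the first dump. The sketch below only does that arithmetic. `offset_kb` is a hypothetical helper, and it assumes that "kB" in unpacker.log means 1024 bytes and that sizes are rounded to one decimal place.

```cpp
#include <cmath>
#include <cstdint>

// Distance from `base` to `addr` in kB (assuming 1 kB = 1024 bytes),
// rounded to one decimal place like the sizes printed in unpacker.log.
double offset_kb(std::uint64_t base, std::uint64_t addr) {
    double kb = static_cast<double>(addr - base) / 1024.0;
    return std::round(kb * 10.0) / 10.0;
}
```

For the logged values, `offset_kb(0x800000, 0x80d6d0)` comes to 53.7, the same figure as the size of dump #1, while the first entry 0x80d465 lies at about 53.1 kB, inside that dump. This suggests that dump #1 ended at or just before 0x80d6d0, so the later transfer to 0x80d6d0 landed outside the already dumped cluster and triggered a new dump of the region, which by then had grown to 55.3 kB. Because the log rounds sizes, this is suggestive rather than conclusive.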

3. Improving the Unpacker

Add a new command-line option to the unpacker. When the option is enabled, the tool should instrument every kind of control transfer, not only indirect ones, when looking for the jump to the OEP. Compare the runtime with and without the option. What would happen with a packing technique that jumps to the OEP through a direct control transfer?

1. Adding a New Option and Checking Direct Calls & Jumps

I added a new option and implemented it using an additional function (INS_IsDirectBranchOrCall).
KNOB<bool> AllChangeRoutine(KNOB_MODE_WRITEONCE, "pintool", "a", "0", "Check all control flow instructions");

/*****************************************************************************
 *                         Instrumentation functions                         *
 *****************************************************************************/
static void
instrument_mem_cflow(INS ins, void *v)
...
  if(AllChangeRoutine.Value()) {
    /* -a given: check direct as well as indirect branches and calls */
    if((INS_IsIndirectBranchOrCall(ins) || INS_IsDirectBranchOrCall(ins)) && INS_OperandCount(ins) > 0) {
      INS_InsertCall(
        ins, IPOINT_BEFORE, (AFUNPTR)check_indirect_ctransfer,
        IARG_INST_PTR, IARG_BRANCH_TARGET_ADDR,
        IARG_END
      );
    }
  } else {
    /* default: check indirect branches and calls only */
    if(INS_IsIndirectBranchOrCall(ins) && INS_OperandCount(ins) > 0) {
      INS_InsertCall(
        ins, IPOINT_BEFORE, (AFUNPTR)check_indirect_ctransfer,
        IARG_INST_PTR, IARG_BRANCH_TARGET_ADDR,
        IARG_END
      );
    }
  }
}
The check_indirect_ctransfer function contains the logic that detects the OEP and dumps memory, so I call the same function for direct transfers as well.
static void
check_indirect_ctransfer(ADDRINT ip, ADDRINT target)
{
  mem_cluster_t c;

  shadow_mem[target].x = true;
  if(shadow_mem[target].w && !in_cluster(target)) {
    /* control transfer to a once-writable memory region, suspected transfer
     * to original entry point of an unpacked binary */
    set_cluster(target, &c);
    clusters.push_back(c);
    /* dump the new cluster containing the unpacked region to file */
    mem_to_file(&c, target);
    /* we don't stop here because there might be multiple unpacking stages */
  }
}
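The check above relies on a "written, then executed" heuristic: a control transfer into memory that was written earlier, and that is not yet part of a dumped cluster, is treated as a possible jump to the OEP. The following is a simplified, Pin-independent model of that heuristic, not the book's implementation. `UnpackTracker`, `ShadowByte`, `Cluster`, and their members are hypothetical names, and the cluster is assumed to be the contiguous run of written bytes around the target.

```cpp
#include <cstdint>
#include <map>
#include <vector>

struct ShadowByte { bool w = false; bool x = false; };   // per-byte flags
struct Cluster { std::uint64_t base; std::uint64_t size; };

// Simplified model of "dump when control reaches previously written memory".
class UnpackTracker {
public:
    // Record a memory write of `len` bytes starting at `addr`.
    void on_write(std::uint64_t addr, std::uint64_t len) {
        for (std::uint64_t i = 0; i < len; ++i) shadow_[addr + i].w = true;
    }

    // Record a control transfer to `target`.
    // Returns true if it would trigger a dump of a new cluster.
    bool on_transfer(std::uint64_t target) {
        shadow_[target].x = true;
        if (!written(target) || in_cluster(target)) return false;
        clusters_.push_back(build_cluster(target));
        return true;
    }

    const std::vector<Cluster>& clusters() const { return clusters_; }

private:
    bool written(std::uint64_t addr) const {
        auto it = shadow_.find(addr);
        return it != shadow_.end() && it->second.w;
    }

    // True if `target` lies inside a cluster that was already dumped.
    bool in_cluster(std::uint64_t target) const {
        for (const Cluster& c : clusters_) {
            if (target >= c.base && target - c.base < c.size) return true;
        }
        return false;
    }

    // Grow outward from `target` over contiguous written bytes.
    Cluster build_cluster(std::uint64_t target) const {
        std::uint64_t lo = target, hi = target;
        while (lo > 0 && written(lo - 1)) --lo;
        while (hi < UINT64_MAX && written(hi + 1)) ++hi;
        return Cluster{lo, hi - lo + 1};
    }

    std::map<std::uint64_t, ShadowByte> shadow_;
    std::vector<Cluster> clusters_;
};
```

In this model, a transfer into an already dumped cluster is ignored, but if more bytes are written next to the cluster and control then reaches those new bytes, a second, larger cluster with the same base is recorded. That is one way the same base address can appear in more than one dump when unpacking happens in stages.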
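In the results, the extracted regions are the same in both modes, and the memory access clusters differ only slightly (mostly in their addresses, plus one cluster's size: 968B versus 952B).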
------- unpacking binary with direct&indirect-------
extracting unpacked region 0x0000000000800000 ( 53.7kB) wx entry 0x000000000080d465
extracting unpacked region 0x0000000000800000 ( 55.3kB) wx entry 0x000000000080d6d0
extracting unpacked region 0x0000000000400000 ( 118.6kB) wx entry 0x000000000040000c
******* Memory access clusters *******
0x0000000000400000 ( 118.6kB) wx: ================================================================================
0x0000000000800000 ( 55.3kB) wx: =====================================
0x000000000061de00 ( 4.5kB) w-: ===
0x00007fff9ff17c20 ( 3.8kB) w-: ==
0x00007ff5e88982a0 ( 3.3kB) w-: ==
0x00007ff5e93663f0 ( 3.0kB) w-: ==
0x0000000001ec0078 ( 2.8kB) w-: =
0x00007ff5e862f6a8 ( 2.6kB) w-: =
0x00007ff5e91206f8 ( 2.3kB) w-: =
0x00007ff5e911fb20 ( 2.1kB) w-: =
0x0000000001ec5028 ( 968B) w-:
0x00007fff9ff15ca8 ( 968B) w-:
0x00007ff5e9366008 ( 944B) w-:
0x00007ff5e911ed80 ( 624B) w-:
0x00007ff5e911c9d8 ( 600B) w-:
0x00007ff5e8898008 ( 592B) w-:
0x00007ff5e9124580 ( 436B) w-:
0x00007ff5e8897bb0 ( 416B) w-:
0x0000000001ebe618 ( 412B) w-:
0x0000000001ebf508 ( 412B) w-:
0x00007ff5e911c138 ( 408B) w-:
0x0000000001ebf370 ( 404B) w-:
0x0000000001ebe480 ( 404B) w-:
0x00007ff5e911bfa0 ( 400B) w-:
0x00007ff5e911b7c0 ( 328B) w-:
0x00007ff5e911c800 ( 328B) w-:
0x0000000001ebe920 ( 308B) w-:

------- unpacking binary with indirect-------
extracting unpacked region 0x0000000000800000 ( 53.7kB) wx entry 0x000000000080d465
extracting unpacked region 0x0000000000800000 ( 55.3kB) wx entry 0x000000000080d6d0
extracting unpacked region 0x0000000000400000 ( 118.6kB) wx entry 0x000000000040000c
******* Memory access clusters *******
0x0000000000400000 ( 118.6kB) wx: ================================================================================
0x0000000000800000 ( 55.3kB) wx: =====================================
0x000000000061de00 ( 4.5kB) w-: ===
0x00007ffeebb6c970 ( 3.8kB) w-: ==
0x00007f55b38c72a0 ( 3.3kB) w-: ==
0x00007f55b43c63f0 ( 3.0kB) w-: ==
0x0000000002215078 ( 2.8kB) w-: =
0x00007f55b362c6a8 ( 2.6kB) w-: =
0x00007f55b414a6f8 ( 2.3kB) w-: =
0x00007f55b4149b20 ( 2.1kB) w-: =
0x000000000221a028 ( 968B) w-:
0x00007ffeebb6aa08 ( 952B) w-:
0x00007f55b43c6008 ( 944B) w-:
0x00007f55b4148d80 ( 624B) w-:
0x00007f55b41469d8 ( 600B) w-:
0x00007f55b38c7008 ( 592B) w-:
0x00007f55b414e580 ( 436B) w-:
0x00007f55b38c6bb0 ( 416B) w-:
0x0000000002214508 ( 412B) w-:
0x0000000002213618 ( 412B) w-:
0x00007f55b4146138 ( 408B) w-:
0x0000000002214370 ( 404B) w-:
0x0000000002213480 ( 404B) w-:
0x00007f55b4145fa0 ( 400B) w-:
0x00007f55b4146800 ( 328B) w-:
0x00007f55b41457c0 ( 328B) w-:
0x0000000002213920 ( 308B) w-:
In addition, the log produced when direct calls are checked as well is far larger (121,007 bytes versus 50,764 bytes).
Mode          LastWriteTime          Length  Name
----          -------------          ------  ----
-a----        2021-12-13 5:43 AM     121007  allunpacker.log
-a----        2021-12-13 5:45 AM      50764  unpacker.log
Most packed files reach the OEP through an indirect transfer, but if a packer uses a direct call instead, this option should make it possible to detect that OEP as well.

2. When a Packer Calls or Jumps Directly to the OEP

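In that case the approach above should be able to detect the transfer. In practice, however, most packers reach the OEP through an indirect call in order to make analysis harder.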