C – Why does the INST_PTR (instruction pointer) value of the same program change between runs?


In Intel’s PinTool, you can use IARG_INST_PTR or INS_Address to print the instruction address of each instruction in the program. I have observed that running the same program at different times produces different instruction addresses for exactly the same instructions. However, I want the addresses to stay the same across runs. What is the root cause of this change? Below are two sample outputs that show the opcodes and instruction addresses of the first three instructions executed.

How can I get the PC of each instruction, or the address that objdump displays, from a PinTool?

–Run 1–

op:       MOV addr:0x00007fac87a8d2d0

op: CALL_NEAR addr:0x00007fac87a8d2d3

op:      PUSH addr:0x00007fac87a90a70

–Run 2–

op:       MOV addr:0x00007fc529f402d0

op: CALL_NEAR addr:0x00007fc529f402d3

op:      PUSH addr:0x00007fc529f43a70


(For a tl;dr version, there is a possible solution at the end.)

This is almost certainly because address space layout randomization (ASLR) is applied to shared libraries. Running the following command multiple times gives you an idea of how it works:

$ cat /proc/self/maps


/proc/self/ refers to the current process (the process that opens the file). There are also /proc/<pid>/ directories for specific PIDs. The maps file lists the memory mappings of the process, in this case the cat process itself.

This is the output of running it once on my system:

00400000-0040c000 r-xp 00000000 08:01 3409248            /bin/cat
0060b000-0060c000 r--p 0000b000 08:01 3409248            /bin/cat
0060c000-0060d000 rw-p 0000c000 08:01 3409248            /bin/cat
0063a000-0065b000 rw-p 00000000 00:00 0                  [heap]
7f017ef95000-7f017f761000 r--p 00000000 08:01 8126750    /usr/lib/locale/locale-archive
7f017f761000-7f017f91b000 r-xp 00000000 08:01 11155466   /lib/x86_64-linux-gnu/libc-2.19.so
7f017f91b000-7f017fb1a000 ---p 001ba000 08:01 11155466   /lib/x86_64-linux-gnu/libc-2.19.so
7f017fb1a000-7f017fb1e000 r--p 001b9000 08:01 11155466   /lib/x86_64-linux-gnu/libc-2.19.so
7f017fb1e000-7f017fb20000 rw-p 001bd000 08:01 11155466   /lib/x86_64-linux-gnu/libc-2.19.so
7f017fb20000-7f017fb25000 rw-p 00000000 00:00 0 
7f017fb25000-7f017fb48000 r-xp 00000000 08:01 11155454   /lib/x86_64-linux-gnu/ld-2.19.so
7f017fd1c000-7f017fd1f000 rw-p 00000000 00:00 0 
7f017fd23000-7f017fd47000 rw-p 00000000 00:00 0 
7f017fd47000-7f017fd48000 r--p 00022000 08:01 11155454   /lib/x86_64-linux-gnu/ld-2.19.so
7f017fd48000-7f017fd49000 rw-p 00023000 08:01 11155454   /lib/x86_64-linux-gnu/ld-2.19.so
7f017fd49000-7f017fd4a000 rw-p 00000000 00:00 0 
7fffacef5000-7fffacf16000 rw-p 00000000 00:00 0          [stack]
7fffacf5a000-7fffacf5c000 r-xp 00000000 00:00 0          [vdso]
7fffacf5c000-7fffacf5e000 r--p 00000000 00:00 0          [vvar]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0  [vsyscall]

The first three lines are the code segment, read-only data segment, and read-write data segment of the executable. The remaining lines are the heap, the stack, the individual segments of the shared libraries, a memory-mapped file (as a side note, libraries are also just memory-mapped files), and some internals related to how certain system calls are implemented.
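To make the column layout concrete, here is a minimal sketch that splits one maps line into its fields. The names Mapping and parse_maps_line are made up for this illustration; the field order (address range, permissions, file offset, device, inode, optional path) is the one visible in the output above.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int   # first address of the mapping
    end: int     # one past the last address
    perms: str   # e.g. "r-xp" for executable code
    offset: int  # offset into the backing file
    path: str    # backing file or pseudo-name such as [heap]; "" if anonymous


def parse_maps_line(line: str) -> Mapping:
    # Split into at most six fields so that the path keeps any inner spaces.
    fields = line.split(None, 5)
    start_hex, end_hex = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_hex, 16), int(end_hex, 16),
                   fields[1], int(fields[2], 16), path)


sample = ("7f017f761000-7f017f91b000 r-xp 00000000 08:01 11155466   "
          "/lib/x86_64-linux-gnu/libc-2.19.so")
m = parse_maps_line(sample)
print(hex(m.start), hex(m.end), m.perms, m.path)
```

On a Linux system the same function can be applied to every line of open("/proc/self/maps") to inspect the mappings of the running interpreter.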

If you repeat the command a few times, you will see all the mappings move randomly except for the code and data segments of the executable. This is a security measure. Not knowing where things are in memory makes some attacks harder to carry out, because an attacker can’t jump directly to an address that is known to contain a useful routine.
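The two runs in the question fit this picture. As a small check, the sketch below uses only the six addresses from the question and compares them pairwise. The variable names are chosen for illustration.

```python
# Addresses of the first three instructions, copied from the two runs above.
run1 = [0x00007fac87a8d2d0, 0x00007fac87a8d2d3, 0x00007fac87a90a70]
run2 = [0x00007fc529f402d0, 0x00007fc529f402d3, 0x00007fc529f43a70]

# How far each instruction moved between the two runs.
slides = {b - a for a, b in zip(run1, run2)}

# Distances between instructions within a single run.
deltas1 = [a - run1[0] for a in run1]
deltas2 = [a - run2[0] for a in run2]

print("all instructions moved by the same amount:", len(slides) == 1)
print("movement is a whole number of 4 KiB pages:",
      all(s % 0x1000 == 0 for s in slides))
print("relative layout unchanged:", deltas1 == deltas2)
```

All three instructions move by the same page-aligned amount, so only the load address of the mapping changes. Offsets relative to that load address stay the same from run to run, which is what the workaround at the end relies on.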

The main reason that address space randomization is not applied to the code and data segments of the executable itself is probably efficiency. Code that is not loaded at a fixed address must be position-independent, which adds some overhead. This is why shared libraries need to be explicitly compiled with -fPIC.

(Shared libraries also need to be position-independent for reasons other than security: if two libraries happened to get overlapping load addresses, using a fixed address for each library would cause problems.)

Unfortunately, I’m not familiar with PinTool. I believe GDB simply disables address space randomization (using the personality(2) system call) to get predictable addresses for shared libraries.

Address space randomization can be turned on or off for a single shell session (this also seems to use personality()), or globally by executing echo 0 > /proc/sys/kernel/randomize_va_space (see /proc/sys/).
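For illustration, here is a rough sketch of the personality(2) approach: set the ADDR_NO_RANDOMIZE flag on the current process and then exec the target program, which inherits the flag. The function names are invented for this example, the flag value is the one from <sys/personality.h> on Linux, and I have not tried this together with Pin. Some sandboxes and containers block this call, in which case an OSError is raised.

```python
import ctypes
import os

ADDR_NO_RANDOMIZE = 0x0040000  # flag value from <sys/personality.h> (Linux)
QUERY_PERSONA = 0xFFFFFFFF     # personality(0xffffffff) only reads the current value


def persona_without_aslr(current: int) -> int:
    # Keep every existing flag and add the "no randomization" one.
    return current | ADDR_NO_RANDOMIZE


def exec_without_aslr(argv):
    """Replace the current process with argv, with randomization disabled."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.personality.argtypes = [ctypes.c_ulong]
    libc.personality.restype = ctypes.c_int

    current = libc.personality(QUERY_PERSONA)
    if current == -1 or libc.personality(persona_without_aslr(current)) == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

    # The persona survives exec, so the new program is loaded without ASLR.
    os.execvp(argv[0], argv)
```

Calling exec_without_aslr(["cat", "/proc/self/maps"]) from a script several times should then print the same library addresses on every run, assuming the kernel allows the call.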

I found the following on this page; it may be relevant.

Does Pin change the application code and data addresses?

Note: Recent linux kernels intentionally move the location of stack and dynamically allocated data from run to run, even if you are not using pin. On RedHat-based systems you can workaround this by running Pin as follows:

$ setarch i386 pin -t pintool -- app

tl;dr

If all you need is to map an address reported by the PinTool, one that happens to come from a library, to the corresponding address in the objdump disassembly, and you don’t mind doing some manual work each time, the following should work:

  1. Print /proc/self/maps from your process. (You can also run it in the background and print /proc/<pid>/maps from the shell, for example using $! to get the PID.)

  2. Check which mapping the address belongs to. For a library, it will most likely be the library’s text segment (marked r-xp in the maps file).

  3. Subtract the start address of that mapping from the address you see in PinTool. If the mapping’s file offset (the third column) is not zero, add it to the result.

This will give you the address you see in the objdump disassembly when you run it on the same library. If the library has debugging information, you can also use addr2line(1) to get the source code lines.
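The three steps above can be sketched as a small lookup. This is only an illustration under assumptions: to_file_offset is a made-up helper, the maps text is copied from the cat output earlier, and hypothetical_addr is an invented address inside the libc text mapping rather than one taken from a real PinTool run. The helper also adds the file offset column of the mapping, which is 0 for the text mappings shown here.

```python
def to_file_offset(maps_text: str, addr: int):
    """Return (path, offset) for the mapping that contains addr, or None."""
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # not a maps line
        start, end = (int(x, 16) for x in fields[0].split("-"))
        if start <= addr < end:
            path = fields[5].strip() if len(fields) > 5 else ""
            # Step 3: subtract the mapping start, then add its file offset.
            return path, addr - start + int(fields[2], 16)
    return None


MAPS = """\
7f017f761000-7f017f91b000 r-xp 00000000 08:01 11155466   /lib/x86_64-linux-gnu/libc-2.19.so
7f017fb25000-7f017fb48000 r-xp 00000000 08:01 11155454   /lib/x86_64-linux-gnu/ld-2.19.so
"""

hypothetical_addr = 0x7f017f7802d0  # invented address inside the libc mapping
path, offset = to_file_offset(MAPS, hypothetical_addr)
print(path, hex(offset))  # /lib/x86_64-linux-gnu/libc-2.19.so 0x1f2d0
```

The printed offset is the value to look up in the objdump output for that library, or to pass to addr2line with the library as the executable.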

Of course there may be a better workflow. This worked for me at least when using dlopen(3) and dlsym(3). The core dump should contain the library load address, so maybe it can be used somehow….
