Wednesday, 29 January 2020
Friday, 3 January 2020
Linux, Tutorials
Harry
13:11
I have been looking for a tool or compiler option that generates a call graph for a user-space C program. Some time back I wrote a tool that depends on gdb to collect stack frames and builds a call graph from them using Python:
https://github.com/tarun27sh/gdb_graphs
The problem is that it's very slow. One Stack Overflow user was kind enough to try it, only to complain about how slow it is. So, it's time to explore more options:
1. use gcc -finstrument-functions
2. use LLVM to write a transform pass that adds profiler instructions to each function.
In this post I'll cover #1, and will try to cover #2 in a future post.
How to work with gcc -finstrument-functions?
Let's start with a simple hello world, hw.c:
$ cat hw.c
#include <stdio.h>
int main()
{
    printf("Hello World\n");
    return 0;
}
Compile with:
$ gcc hw.c    # generates a.out
and dump the assembly for main:
$ objdump -S a.out
. . .
0000000000400526 <main>:
400526: 55 push %rbp
400527: 48 89 e5 mov %rsp,%rbp
40052a: bf c4 05 40 00 mov $0x4005c4,%edi
40052f: e8 cc fe ff ff callq 400400 <puts@plt>
400534: b8 00 00 00 00 mov $0x0,%eax
400539: 5d pop %rbp
40053a: c3 retq
40053b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
. . .
This is the normal assembly gcc generates for our main function.
Now let's compile with -finstrument-functions and see what gets added:
$ gcc -finstrument-functions hw.c    # generates a.out
and dump assembly for main:
objdump -S a.out
. . .
00000000004005d6 <main>:
4005d6: 55 push %rbp
4005d7: 48 89 e5 mov %rsp,%rbp
4005da: 53 push %rbx
4005db: 48 83 ec 08 sub $0x8,%rsp
4005df: 48 8b 45 08 mov 0x8(%rbp),%rax
4005e3: 48 89 c6 mov %rax,%rsi
4005e6: bf d6 05 40 00 mov $0x4005d6,%edi
4005eb: e8 d0 fe ff ff callq 4004c0 <__cyg_profile_func_enter@plt>
4005f0: bf a4 06 40 00 mov $0x4006a4,%edi
4005f5: e8 96 fe ff ff callq 400490 <puts@plt>
4005fa: bb 00 00 00 00 mov $0x0,%ebx
4005ff: 48 8b 45 08 mov 0x8(%rbp),%rax
400603: 48 89 c6 mov %rax,%rsi
400606: bf d6 05 40 00 mov $0x4005d6,%edi
40060b: e8 a0 fe ff ff callq 4004b0 <__cyg_profile_func_exit@plt>
400610: 89 d8 mov %ebx,%eax
400612: 48 83 c4 08 add $0x8,%rsp
400616: 5b pop %rbx
400617: 5d pop %rbp
400618: c3 retq
400619: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
. . .
Now we see that gcc has injected two additional function calls:
1. __cyg_profile_func_enter@plt
2. __cyg_profile_func_exit@plt
@plt just means the calls go through the procedure linkage table, so the dynamic linker resolves these functions at run time.
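In C terms, the injected instructions amount to roughly the following. This is a hand-written sketch, not compiler output: hello() is a hypothetical stand-in for main, and the two hooks are dummy versions that only count calls (the counter names are made up for illustration). The comments map each argument back to the instructions in the dump above.

```c
#include <stdio.h>

/* Counters so the effect of the stand-in hooks is observable. */
int enter_calls;
int exit_calls;

/* Stand-in hooks with the signature gcc expects; they only count. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    enter_calls++;
}

void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    exit_calls++;
}

/* hello() is instrumented by hand. Build this file WITHOUT
 * -finstrument-functions, otherwise gcc adds its own calls on top
 * of the hand-written ones (and instruments the hooks as well). */
int hello(void)
{
    /* mov $0x4005d6,%edi  -> this_fn:   address of the function itself
     * mov 0x8(%rbp),%rax  -> call_site: this function's return address */
    __cyg_profile_func_enter((void *)hello, __builtin_return_address(0));

    puts("Hello World");
    int ret = 0;    /* the dump keeps this in %ebx across the exit hook */

    __cyg_profile_func_exit((void *)hello, __builtin_return_address(0));
    return ret;
}
```

Calling hello() once prints the greeting and leaves both counters at 1, which is the same enter/exit pairing the compiler-generated code produces.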
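If the program doesn't define these functions, the calls resolve to a default definition that just returns. The address in the dump below is in the shared-library range, which suggests the default comes from the C library rather than from gcc itself: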
(gdb) disas __cyg_profile_func_enter
Dump of assembler code for function __cyg_profile_func_enter:
0x00007ffff7b23200 <+0>: repz retq
End of assembler dump.
(gdb) disas __cyg_profile_func_exit
Dump of assembler code for function __cyg_profile_func_enter:
0x00007ffff7b23200 <+0>: repz retq
End of assembler dump.
(gdb)
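Turns out repz retq is an interesting topic in itself: compilers emit this two-byte return instead of a plain retq to avoid a branch-prediction penalty on some older AMD processors.
By defining these two functions ourselves, we can override the default do-nothing behavior.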
Define Entry/Exit hook
 1 #include <stdio.h>
 2
 3 void __attribute__((no_instrument_function))
 4 __cyg_profile_func_enter (void *this_fn,
 5                           void *call_site)
 6 {
 7     printf("[+]\n");
 8 }
 9
10 void __attribute__((no_instrument_function))
11 __cyg_profile_func_exit (void *this_fn,
12                          void *call_site)
13 {
14     printf("[-]\n");
15 }
16
17 int main()
18 {
19     printf("HW\n");
20     return 0;
21 }
- line #3,10 - tell gcc not to inject calls to the enter/exit hooks into the hooks themselves, which would otherwise recurse forever. The hooks are not static, so instrumented code in other object files can link to them too.
Compile and run
$ gcc -finstrument-functions hw.c    # generates a.out
$ ./a.out
[+]
HW
[-]
Great!
Now we can make use of the injected calls. The next step is to print the name of the function each hook is called from and generate some kind of command-line function graph, similar to what ftrace does.
But first, where are these functions declared/defined?
I searched the gcc source code and found the following references to the enter function, but none of them point to a definition that would produce the repz retq instruction.
Text string: __cyg_profile_func_enter
   File                                        Line   Text
0  gcc/testsuite/g++.dg/pr49718.C              5      /* { dg-final { scan-assembler-times "__cyg_profile_func_enter" 1 { target { ! { hppa*-*-hpux* } } } } } */
1  gcc/testsuite/g++.dg/pr49718.C              6      /* { dg-final { scan-assembler-times "__cyg_profile_func_enter,%r" 1 { target hppa*-*-hpux* } } } */
2  testsuite/gcc.c-torture/execute/eeprof-1.c  65     void __cyg_profile_func_enter (void*, void*) NOCHK;
3  testsuite/gcc.c-torture/execute/eeprof-1.c  69     void __cyg_profile_func_enter (void *fn, void *parent)
4  gcc/testsuite/gcc.dg/20001117-1.c           31     __cyg_profile_func_enter(void *this_fn, void *call_site)
5  gcc/testsuite/gcc.dg/instrument-1.c         6      /* { dg-final { scan-assembler "__cyg_profile_func_enter" } } */
6  gcc/testsuite/gcc.dg/instrument-2.c         6      /* { dg-final { scan-assembler-not "__cyg_profile_func_enter" } } */
7  gcc/testsuite/gcc.dg/instrument-3.c         6      /* { dg-final { scan-assembler-not "__cyg_profile_func_enter" } } */
8  gcc/testsuite/gcc.dg/pr78333.c              4      /* Add empty implementations of __cyg_profile_func_enter() and
9  gcc/testsuite/gcc.dg/pr78333.c              8      __cyg_profile_func_enter(void *this_fn, void *call_site)
a  gcc/tree.c                                  10683  local_define_builtin ("__cyg_profile_func_enter", ftype,
b  gcc/tree.c                                  10685  "__cyg_profile_func_enter", 0);
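Maybe this is something arch dependent. Another possibility is that the default doesn't live in gcc at all: the address gdb printed earlier (0x00007ffff7b23200) is in the range where shared libraries get mapped, and both hooks disassemble to the same address, which suggests a single no-op in the C library that goes by two names.
Let me know if you know exactly where the definition lives.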
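One way to narrow this down at run time is to ask the dynamic linker which loaded object provides the symbol. The sketch below assumes a dynamically linked program on a glibc-style system; providing_object() is a hypothetical helper name, while dlsym(), dladdr() and RTLD_DEFAULT are the standard dl* interfaces (older C libraries need -ldl when linking).

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

/* Return the path of the loaded object that defines `symbol`,
 * or NULL if the symbol cannot be found. Hypothetical helper. */
const char *providing_object(const char *symbol)
{
    /* Search the global scope the same way a normal call would. */
    void *addr = dlsym(RTLD_DEFAULT, symbol);
    Dl_info info;

    /* dladdr() returns 0 when the address is not inside any
     * loaded object. */
    if (addr == NULL || dladdr(addr, &info) == 0)
        return NULL;

    return info.dli_fname;
}
```

Printing providing_object("__cyg_profile_func_enter") from a program that does not define the hook itself shows which library the default comes from. On a typical Linux system I would expect that to be the C library, but check the output on your own machine.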
How to add code to generate call stacks?
Now the only thing our program has to do is define what these hooks do when called:
 1 #define _GNU_SOURCE
 2 #include <dlfcn.h>
 3 #include <stdio.h>
 4
 5 void __attribute__((no_instrument_function))
 6 __cyg_profile_func_enter (void *this_fn,
 7                           void *call_site)
 8 {
 9     Dl_info info;
10     if (dladdr(__builtin_return_address(0), &info) && info.dli_sname)
11         printf("[+] %s\n", info.dli_sname);
12 }
13
14 void __attribute__((no_instrument_function))
15 __cyg_profile_func_exit (void *this_fn,
16                          void *call_site)
17 {
18     Dl_info info;
19     if (dladdr(__builtin_return_address(0), &info) && info.dli_sname)
20         printf("[-] %s\n", info.dli_sname);
21 }
- line #1-3 - include the headers for the dl* APIs needed to get a symbol name from an address, plus stdio.h for printf
- line #5,14 - tell gcc not to inject calls to the enter/exit hooks into the hooks themselves
- line #10,19 - get the return address of the hook, which points into the instrumented function just after the injected call, and pass it to dladdr to get the symbol name (this_fn, the function's start address, would work as well). The check skips the print when no symbol is found. From gcc docs:
Built-in Function: void * __builtin_return_address (unsigned int level)
This function returns the return address of the current function, or of one of its callers. The level argument is number of frames to scan up the call stack. A value of 0 yields the return address of the current function, a value of 1 yields the return address of the caller of the current function, and so forth.
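Now compile with the following. -ldl links the dl* functions, and -rdynamic exports the executable's own symbols so that dladdr can find names such as main:
$ gcc -finstrument-functions hw.c -ldl -rdynamic    # generates a.out
or, to compile and link separately:
$ gcc -finstrument-functions -c hw.c -o hw.o    # generates hw.o
$ gcc hw.o -ldl -rdynamic    # generates a.out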
Finally run the executable:
$ ./a.out
[+] main
HW
[-] main
Now we get symbols too, and the overhead is much lower than with the gdb-based approach :)
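To get closer to an ftrace-style function graph, the hooks can indent each line by the current call depth. The following is a minimal sketch under a few assumptions: trace_depth, format_event(), trace_enter() and trace_exit() are hypothetical names, the depth counter is a plain global (so it is not thread-safe), and the symbol name is expected to come from the dladdr lookup shown above.

```c
#include <stdio.h>

/* Current nesting level. A plain global, so not thread-safe. */
int trace_depth;

/* Format one trace line into buf, indented two spaces per level.
 * Returns what snprintf returns (the length of the full line). */
__attribute__((no_instrument_function))
int format_event(char *buf, size_t size, int level, char marker,
                 const char *name)
{
    return snprintf(buf, size, "%*s[%c] %s",
                    level * 2, "", marker, name ? name : "?");
}

/* Call from the enter hook: print at the current depth, then descend. */
__attribute__((no_instrument_function))
void trace_enter(const char *name)
{
    char line[256];
    format_event(line, sizeof line, trace_depth, '+', name);
    trace_depth++;
    puts(line);
}

/* Call from the exit hook: climb back up, then print at that depth. */
__attribute__((no_instrument_function))
void trace_exit(const char *name)
{
    char line[256];
    if (trace_depth > 0)
        trace_depth--;
    format_event(line, sizeof line, trace_depth, '-', name);
    puts(line);
}
```

Inside the hooks, the printf calls would be replaced with trace_enter(info.dli_sname) and trace_exit(info.dli_sname). For a main that calls a function foo, the trace then nests like this:

[+] main
  [+] foo
  [-] foo
[-] main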
I added some sample code on how to add code for main, shared objects - check it out at:
References:
1. https://lwn.net/Articles/370423/
Feed, Ubuntu
Harry
03:16
# Use this tool for testing purposes only. If you like the software, buy it and support the developers.
Do you all know the pain of activating Office and Windows using KMSpico?
When you download the files you get a virus, and then uh-oh.
Well, you have come to the right place.
link https://drive.google.com/open?id=1dEwwe_tggztIIHUTOQ_6CCb1zFPdMx9x