[A Craftsman Must First Sharpen His Tools] Five Advanced Uses of gdb

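This article covers some advanced uses of gdb. Debugging is an essential skill throughout a developer's career, and gdb is the most commonly used debugger for development on Linux, so the techniques below are meant to strengthen that skill.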

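1. Debugging multiple threads with gdb

When gdb debugs a multithreaded program, all threads run at the same time by default. What if we want only one thread to keep running while the others stay paused? Let's see how to do that.

Consider the following multithreaded code: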

//test.cpp
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

void *print1_msg(void *arg)
{
    while(1)
    {
        printf("print1_msg\n");
        usleep(100);
    }
}

void *print2_msg(void *arg)
{
    while(1)
    {
        printf("print2_msg\n");
        usleep(100);
    }
}

int main()
{
    pthread_t id1, id2;
    pthread_create(&id1, NULL, print1_msg, NULL);
    pthread_create(&id2, NULL, print2_msg, NULL);
    pthread_join(id1, NULL);  // wait for the thread to finish; otherwise main returns at once and the thread never gets to run
    pthread_join(id2, NULL);

    return 0;
}

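Suppose a production issue requires the program to run only the thread function print1_msg, and we cannot modify the code. What can we do?

First build the executable with g++ -g test.cpp -pthread -o test, then start gdb with gdb ./test: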

(gdb) b test.cpp:28  # b is short for break: set a breakpoint
Breakpoint 1 at 0x40074e: file test.cpp, line 28.
(gdb) r  # r is short for run: start the program
Starting program: /root/test 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x40a00940 (LWP 26623)]
print1_msg
[New Thread 0x41401940 (LWP 26624)]

Thread 1 hit Breakpoint 1, main () at test.cpp:28
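28        pthread_join(id1, NULL);  // wait for the thread to finish; otherwise main returns at once and the thread never gets to run
(gdb) info thread   # list all threads of the current process; thread 1 is the main thread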
  Id   Target Id         Frame 
* 1    Thread 0x2aaaaae794d0 (LWP 26619) main () at test.cpp:28
  2    Thread 0x40a00940 (LWP 26623) 0x00000036f289a901 in nanosleep () from /lib64/libc.so.6
  3    Thread 0x41401940 (LWP 26624) 0x00000036f28d4971 in clone () from /lib64/libc.so.6
(gdb) thread 2  # switch to thread number 2
[Switching to thread 2 (Thread 0x40a00940 (LWP 26623))]
#0  0x00000036f289a901 in nanosleep () from /lib64/libc.so.6
(gdb) set scheduler-locking on   # only the currently selected thread is allowed to run
(gdb) c
Continuing.
print1_msg
print1_msg
print1_msg
print1_msg
print1_msg

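As the comments show, after these steps only thread 2, the one running print1_msg, keeps executing. Running set scheduler-locking off at this point lets all threads run again.

The gdb commands used most often when debugging multiple threads are listed below:

gdb command	What it does
info thread	Lists all threads of the current process. The first column is the thread number, thread 1 is the main thread, and the entry marked with * is the thread currently selected in gdb
thread id	Switches to the thread whose number is id; for example, thread 2 selects thread 2
set scheduler-locking on	Only the current thread is allowed to run; the other threads are effectively locked and stay paused
set scheduler-locking off	No thread is locked, so all threads run. This was the default in older gdb releases; newer releases default to replay, which behaves like off unless a recorded execution is being replayed
set scheduler-locking step	While single-stepping the current thread, the other threads do not run. The lock applies only to stepping: once you use continue, finish or until, the other threads are allowed to run again
show scheduler-locking	Shows the current scheduler-locking mode
thread apply 1 2 command	Applies a gdb command to threads 1 and 2; for example, thread apply 1 2 info locals prints the local variables of threads 1 and 2
thread apply all command	Applies the gdb command to every thread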
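
As a small aid for reading info thread output, here is a minimal sketch (not part of the original test.cpp) of a finite variant of the two-thread program that gives each thread a name. It assumes Linux with glibc, where pthread_setname_np is available as a GNU extension; WorkerArgs, worker and run_named_workers are hypothetical names chosen for this example.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstdio>

// Hypothetical per-thread settings for this sketch.
struct WorkerArgs {
    const char *name;  // thread name; Linux allows at most 15 characters
    int iterations;    // how many lines the thread should print
    int done;          // how many lines it actually printed
};

static void *worker(void *p)
{
    WorkerArgs *args = static_cast<WorkerArgs *>(p);
    // GNU extension (assumption: glibc on Linux): name the calling thread.
    pthread_setname_np(pthread_self(), args->name);
    for (int i = 0; i < args->iterations; ++i) {
        printf("%s\n", args->name);
        ++args->done;
    }
    return NULL;
}

// Starts two named threads, waits for both, and returns the total number
// of lines printed, or -1 if a thread could not be created.
int run_named_workers(int iterations)
{
    assert(iterations >= 0);
    WorkerArgs a = {"print1", iterations, 0};
    WorkerArgs b = {"print2", iterations, 0};
    pthread_t id1, id2;
    if (pthread_create(&id1, NULL, worker, &a) != 0)
        return -1;
    if (pthread_create(&id2, NULL, worker, &b) != 0) {
        pthread_join(id1, NULL);
        return -1;
    }
    pthread_join(id1, NULL);
    pthread_join(id2, NULL);
    return a.done + b.done;
}
```

If run_named_workers is called from main, the program is built with g++ -g -pthread, and a breakpoint is set inside worker, the names print1 and print2 should appear next to the thread entries in info thread, which makes it easier to pick the thread to select before turning scheduler-locking on.
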

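2. Debugging multiple processes with gdb

Two settings matter most when gdb debugs a program that forks: follow-fork-mode and detach-on-fork. They are set with set follow-fork-mode parent|child and set detach-on-fork on|off. The two settings generally work together, and their combined effect is as follows:

follow-fork-mode	detach-on-fork	Description
parent	on	The gdb default. gdb debugs only the parent process, and breakpoints and other commands affect only the parent. The child keeps running and is not controlled by gdb
parent	off	gdb controls both processes. The parent can be debugged normally, while the child is held suspended by gdb and does not continue
child	on	gdb controls only the child process, and all gdb commands affect only the child. The parent keeps running
child	off	gdb controls both processes. The child can be debugged normally, while the parent is held suspended by gdb and does not continue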

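A few other commands are also useful when debugging multiple processes:

gdb command	What it does
show follow-fork-mode	Shows the current follow-fork-mode setting
show detach-on-fork	Shows the current detach-on-fork setting
info inferiors	Lists the processes gdb can currently debug
inferior <infer number>	Switches to another process, where infer number is the process number printed by info inferiors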

The following example shows these commands in use. Consider this code:

//test.cpp
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    if ( fork() > 0)
    {
        while(1)
        {
            printf("this is parent\n");
            sleep(1);
        }
    }
    else
    {
        while(1)
        {
            printf("this is son\n");
            sleep(1);
        }
    }
    return 0;
}

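Real code should not be written this way, because it has to deal with zombie processes. The example is kept minimal on purpose so that nothing distracts from how gdb debugs multiple processes.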
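
The zombie-process caveat can be illustrated with a minimal sketch, assuming a POSIX system: the parent calls waitpid so that the exited child is reaped instead of lingering as a zombie. run_child_and_reap is a hypothetical helper name, not part of the original example.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>

// Forks a child that exits immediately with child_status, then reaps it
// in the parent. Returns the child's exit status, or -1 on any error.
int run_child_and_reap(int child_status)
{
    assert(child_status >= 0 && child_status <= 255);
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 // fork failed
    if (pid == 0)
        _exit(child_status);       // child: exit without flushing stdio buffers copied from the parent
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;                 // waiting failed
    // The child has now been reaped, so it cannot remain as a zombie.
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A long-running parent would typically call waitpid whenever a child ends, for example from its main loop or in response to SIGCHLD; which of these fits best depends on the program.
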

Suppose the fork example (test.cpp) is compiled into an executable named test. We debug it with gdb, starting with a couple of basic commands:

(gdb) show follow-fork-mode
Debugger response to a program call of fork or vfork is "parent".
(gdb) show detach-on-fork
Whether gdb will detach the child of a fork is on.

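This prints the default values of the two settings, which match the default scenario described earlier. Now set one breakpoint in the parent branch and one in the child branch, then run the program: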

(gdb) b test.cpp:11
Breakpoint 1 at 0x40064e: file test.cpp, line 11.
(gdb) b test.cpp:19
Breakpoint 2 at 0x400664: file test.cpp, line 19.
(gdb) r
Starting program: /root/test 
this is son

Breakpoint 1, main () at test.cpp:11
11                printf("this is parent\n");
(gdb) this is son
this is son
this is son
this is son

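Only the parent's breakpoint is hit, while the child keeps running on its own.

Next, suppose we want to debug only the child and do not want the parent to keep running. The gdb commands are: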

(gdb) set follow-fork-mode child
(gdb) set detach-on-fork off
(gdb) b test.cpp:11
Breakpoint 1 at 0x40064e: file test.cpp, line 11.
(gdb) b test.cpp:19
Breakpoint 2 at 0x400664: file test.cpp, line 19.
(gdb) r
Starting program: /root/test 
[New process 29409]
Reading symbols from /root/test...done.
Reading symbols from /usr/lib64/libstdc++.so.6...done.
[Switching to process 29409]

Thread 2.1 hit Breakpoint 2, main () at test.cpp:19
19                printf("this is son\n");
(gdb) c
Continuing.
this is son

Thread 2.1 hit Breakpoint 2, main () at test.cpp:19
19                printf("this is son\n");
(gdb) c
Continuing.
this is son

Thread 2.1 hit Breakpoint 2, main () at test.cpp:19
19                printf("this is son\n");

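Now both processes are controlled by gdb, and only the child hits a breakpoint. The parent is suspended, so it neither hits its breakpoint nor continues to run.

3. Debugging a process that is already running

In practice we often need to debug a process that is already running. There are two ways to do it:

gdb <program> PID or gdb -p PID, where program is the executable name and PID is the process ID reported by the operating system (use ps to look it up); the two forms have the same effect;
start gdb <program> first, then run attach PID at the gdb prompt, which likewise attaches gdb to that process.

Whichever way you attach, when you are done debugging, run detach at the gdb prompt to release the process from gdb.

Here is an example. Suppose we have the following code: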

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    int i = 0;
    while(1)
    {
        i++;
        sleep(1);
    }
    return 0;
}

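The program has been running for a while and we want to know the current value of i. First find the process ID with ps (29549 here), then attach with gdb -p 29549 and use the following gdb commands: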

(gdb) b test.cpp:10
Breakpoint 1 at 0x4005bb: file test.cpp, line 10.
(gdb) c
Continuing.

Breakpoint 1, main () at test.cpp:10
10            i++;
(gdb) p i
$1 = 165
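
If you control the source of the target program, a variant like the following can make attaching easier. This is only a sketch: count_ticks is a hypothetical name, the loop is finite so that it terminates, it prints its own PID so that ps is not needed, and the counter is declared volatile on the assumption that the program might be built with optimization, where a plain local variable could show up in gdb as <optimized out>.

```cpp
#include <unistd.h>
#include <cassert>
#include <cstdio>

// Counts from 0 up to ticks, sleeping delay_seconds between increments,
// and returns the final counter value.
long count_ticks(long ticks, unsigned delay_seconds)
{
    assert(ticks >= 0);
    // Print the PID so it can be passed straight to gdb -p.
    printf("pid = %ld\n", static_cast<long>(getpid()));
    fflush(stdout);

    volatile long i = 0;  // volatile: every access goes through memory
    while (i < ticks) {
        i = i + 1;
        sleep(delay_seconds);
    }
    return i;
}
```

With something like count_ticks(100000, 1) called from main, the printed PID can be used with gdb -p, and p i should then show the live counter once execution is stopped inside the loop.
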

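4. Debugging a core file with gdb

A core file is usually produced when a program crashes with a segmentation fault, for example by dereferencing a null pointer or accessing memory out of bounds. Core file generation has to be enabled first, which on Linux is normally done with ulimit: run ulimit -c to see the current limit, and if it is not unlimited, run ulimit -c unlimited. The other uses of ulimit are not covered here.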
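
The same limit can also be raised from inside the program. The sketch below assumes a POSIX system and uses getrlimit and setrlimit with RLIMIT_CORE; enable_core_dumps is a hypothetical helper name. It only raises the soft limit up to the hard limit, so it matches ulimit -c unlimited only when the hard limit is itself unlimited.

```cpp
#include <sys/resource.h>
#include <cassert>

// Raises the soft core-file size limit of the calling process to its
// hard limit. Returns true on success.
bool enable_core_dumps()
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return false;
    // Sanity check: the soft limit never exceeds the hard limit.
    assert(lim.rlim_cur <= lim.rlim_max);
    lim.rlim_cur = lim.rlim_max;  // an unprivileged process may go up to the hard limit
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Calling this early in main affects only the current process and its children. Where the core file is written and how it is named is still decided by the system configuration (on Linux, /proc/sys/kernel/core_pattern).
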

Core file generation is already enabled on my machine. Now consider the following code:

#include <stdio.h>

int main()
{
    char *str = NULL;
    printf("%s\n", str);
    return 0;
}

Running it prints Segmentation fault (core dumped), as expected, which means a core file was produced. On my machine the core file is named core.29626, so we can load it with gdb <program> core.29626 and debug it:

(gdb) bt
#0  0x00000036f2879ba0 in strlen () from /lib64/libc.so.6
#1  0x00000036f28631cb in puts () from /lib64/libc.so.6
#2  0x00000000004005c8 in main () at test.cpp:6

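The bt command shows exactly which function and which line of code caused the crash.

5. Inspecting the details of C++ class objects with gdb

Suppose we have the following C++ code: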

class CPeople
{
public:
    int age;
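public:
    virtual ~CPeople(){}  // virtual destructor: without it, deleting a CBigPeople through a CPeople* is undefined behavior
    virtual void print(){}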
};

class CBigPeople : public CPeople
{
public:
    int height;
};

int main()
{
    CPeople *people = new CBigPeople;
    delete people;
    return 0;
}

Use gdb to look at the object that people points to:

(gdb) p *people
$2 = {_vptr.CPeople = 0x4008b0 <vtable for CBigPeople+16>, age = 0}

By default gdb clearly does not show the object's actual type, only the declared type CPeople. Turning on one setting fixes that:

(gdb) set print object on
(gdb) p *people
$3 = (CBigPeople) {<CPeople> = {_vptr.CPeople = 0x4008b0 <vtable for CBigPeople+16>, age = 0}, height = 0}

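Here everything is printed on a single line. That is acceptable for a simple type but hard to read for a complex one, so we can have gdb print an indented, tree-like layout instead: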

(gdb) set print pretty on
(gdb) p *people
$5 = (CBigPeople) {
  <CPeople> = {
    _vptr.CPeople = 0x4008c0 <vtable for CBigPeople+16>, 
    age = 0
  }, 
  members of CBigPeople: 
  height = 0
}
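
What set print object on reveals is the difference between the declared (static) type of the pointer and the actual (dynamic) type of the object. As a rough illustration in C++ itself, the sketch below checks the dynamic type with dynamic_cast and typeid. is_big_people and same_dynamic_type are hypothetical helpers; the classes mirror the ones above, with default member initializers and a virtual destructor added.

```cpp
#include <cassert>
#include <typeinfo>

class CPeople
{
public:
    int age = 0;
    virtual ~CPeople() {}   // makes deletion through a base pointer safe
    virtual void print() {}
};

class CBigPeople : public CPeople
{
public:
    int height = 0;
};

// True when the object behind a CPeople pointer is really a CBigPeople.
bool is_big_people(const CPeople *people)
{
    assert(people != nullptr);
    return dynamic_cast<const CBigPeople *>(people) != nullptr;
}

// True when both pointers refer to objects of the same dynamic type.
bool same_dynamic_type(const CPeople *a, const CPeople *b)
{
    assert(a != nullptr && b != nullptr);
    return typeid(*a) == typeid(*b);
}
```

Both checks rely on the class being polymorphic, that is, having at least one virtual function. The vtable pointer shown in the gdb output as _vptr.CPeople is what lets gdb identify the derived type as well.
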

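6. Summary

gdb is a very powerful debugging tool on Linux. Look closely and you will find that it supports almost any debugging feature you can think of. It also pays to use gdb's help command, which lists the command classes and describes what each command does. If you do not want that much output, you can ask for a single command or a single group of commands, for example: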

(gdb) help shell  # describe the shell command, which runs shell commands directly from gdb
Execute the rest of the line as a shell command.
With no arguments, run an inferior shell.
(gdb) help info  # list all commands that start with info
Generic command for showing things about the program being debugged.

List of info subcommands:

info address -- Describe where symbol SYM is stored
info all-registers -- List of all registers and their contents
info args -- Argument variables of current stack frame
info auto-load -- Print current status of auto-loaded files
info auxv -- Display the inferior's auxiliary vector
info bookmarks -- Status of user-settable bookmarks
info breakpoints -- Status of specified breakpoints (all user-settable breakpoints if no argument)
info checkpoints -- IDs of currently known checkpoints
info classes -- All Objective-C classes
info common -- Print out the values contained in a Fortran COMMON block
info copying -- Conditions for redistributing copies of GDB
info dcache -- Print information on the dcache performance
info display -- Expressions to display when program stops
......  # the rest is omitted for brevity

That is all for this article. If you found it useful, remember to give it a like!