[Original] A Summary of Issues with Linux Background Services

Date: 2015-08-29 12:49:08


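A thousand words are no match for a direct experiment...

A small experiment based on sleep

      First, get an intuitive feel for how background services behave through a few experiments (be warned: the relevant concepts are only explained further down).

Running sleep in different ways on the command line

Identify the login shell and the pseudo-terminal.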
[root@YOYO ~]# ps ajxf          
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
...
    0     1     1     1 ?           -1 Ss       0   0:02 /sbin/init
...
    1  1654  1654  1654 ?           -1 Ss       0   0:00 /usr/sbin/sshd
 1654 15437 15437 15437 ?           -1 Ss       0   0:00  \_ sshd: root@pts/0,pts/1
15437 15441 15441 15441 pts/0    15441 Ss+      0   0:00      \_ -bash
15437 16237 16237 16237 pts/1    16258 Ss       0   0:00      \_ -bash
16237 16258 16258 16237 pts/1    16258 R+       0   0:00          \_ ps ajxf
...
[root@YOYO ~]#
Run sleep in the background (&), through setsid, through nohup, and in the foreground.
[root@YOYO ~]# jobs -l          
[root@YOYO ~]# 
[root@YOYO ~]# sleep 600 &        -- run in the background via &   -- 1
[1] 16261
[root@YOYO ~]# 
[root@YOYO ~]# setsid sleep 660   -- run in the background via setsid   -- 2
[root@YOYO ~]# 
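[root@YOYO ~]# nohup sleep 720 &   -- run in the background via nohup + &   -- 3
[2] 16271
[root@YOYO ~]# nohup: 忽略输入并把输出追加到"nohup.out"   -- zh_CN notice from nohup: input is ignored and output is appended to "nohup.out"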

[root@YOYO ~]# 
[root@YOYO ~]# sleep 780        -- run in the foreground   -- 4
^Z                              -- suspend
[3]+  Stopped                 sleep 780
[root@YOYO ~]# 
[root@YOYO ~]# jobs -l
[1]  16261 Running                 sleep 600 &
[2]- 16271 Running                 nohup sleep 720 &
[3]+ 16274 停止                  sleep 780   -- "停止" is the zh_CN rendering of "Stopped"
[root@YOYO ~]# 
[root@YOYO ~]# bg 3             -- resume it in the background
[3]+ sleep 780 &
[root@YOYO ~]# 
[root@YOYO ~]# jobs -l
[1]  16261 Running                 sleep 600 &
[2]- 16271 Running                 nohup sleep 720 &
[3]+ 16274 Running                 sleep 780 &
[root@YOYO ~]#
Inspect the process relationships at this point.
[root@YOYO ~]# ps ajxf          
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
...
    0     1     1     1 ?           -1 Ss       0   0:02 /sbin/init
...
    1  1654  1654  1654 ?           -1 Ss       0   0:00 /usr/sbin/sshd
 1654 15437 15437 15437 ?           -1 Ss       0   0:00  \_ sshd: root@pts/0,pts/1
15437 15441 15441 15441 pts/0    15441 Ss+      0   0:00      \_ -bash
15437 16237 16237 16237 pts/1    16282 Ss       0   0:00      \_ -bash
16237 16261 16261 16237 pts/1    16282 S        0   0:00          \_ sleep 600       -- 1
16237 16271 16271 16237 pts/1    16282 S        0   0:00          \_ sleep 720       -- 3
16237 16274 16274 16237 pts/1    16282 S        0   0:00          \_ sleep 780       -- 4
16237 16282 16282 16237 pts/1    16282 R+       0   0:00          \_ ps ajxf
...
    1 16265 16265 16265 ?           -1 Ss       0   0:00 sleep 660         -- 2
[root@YOYO ~]#
Close the ssh connection window, then check the state of the sleep processes from the other session (pts/0).
[root@YOYO ~]# ps ajxf
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
...
    0     1     1     1 ?           -1 Ss       0   0:02 /sbin/init
...
    1  1654  1654  1654 ?           -1 Ss       0   0:00 /usr/sbin/sshd
 1654 15437 15437 15437 ?           -1 Ss       0   0:00  \_ sshd: root@pts/0 
15437 15441 15441 15441 pts/0    16300 Ss       0   0:00      \_ -bash
15441 16300 16300 15441 pts/0    16300 R+       0   0:00          \_ ps ajxf
...
    1 16265 16265 16265 ?           -1 Ss       0   0:00 sleep 660      -- 2
    1 16271 16271 16237 ?           -1 S        0   0:00 sleep 720      -- 3
[root@YOYO ~]#
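Experiment conclusion
Starting a process in different ways leads to different outcomes when the ssh connection window is closed. Processes 1 and 4 both disappeared; process 3 survived, but some of its attributes changed (PPID became 1, TTY became ?, and TPGID became -1); only process 2 was completely unchanged.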

Running sleep in different ways inside a shell script

Test 1 (foreground process group)
[root@Betty ~]# vi test_1.sh 
 
#!/bin/sh 
sleep 600        # blocks the current shell script
 
[root@Betty ~]#
[root@Betty ~]# ./test_1.sh
(blocks)
Check from another window:
[root@Betty ~]# ps ajxf
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
...
    1  1860  1860  1860 ?           -1 Ss       0   0:00 /usr/sbin/sshd
 1860 13331 13331 13331 ?           -1 Ss       0   0:00  \_ sshd: root@pts/1,pts/2
13331 16993 16993 16993 pts/1    18649 Ss       0   0:00      \_ -bash
16993 18649 18649 16993 pts/1    18649 R+       0   0:00      |   \_ ps ajxf
13331 18572 18572 18572 pts/2    18632 Ss       0   0:00      \_ -bash
18572 18632 18632 18572 pts/2    18632 S+       0   0:00          \_ /bin/sh ./test_1.sh
18632 18633 18632 18572 pts/2    18632 S+       0   0:00              \_ sleep 600
Now close the window that started test_1.sh; all the corresponding processes are gone.
[root@Betty ~]# ps ajxf
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
...
    1  1860  1860  1860 ?           -1 Ss       0   0:00 /usr/sbin/sshd
 1860 13331 13331 13331 ?           -1 Ss       0   0:00  \_ sshd: root@pts/1
13331 16993 16993 16993 pts/1    18706 Ss       0   0:00      \_ -bash
16993 18706 18706 16993 pts/1    18706 R+       0   0:00          \_ ps ajxf

Test 2 (orphaned background process group)
[root@Betty ~]# vi test_2.sh 
 
#!/bin/sh 
sleep 600 &         # does not block the current shell script, because it runs in the background
  
[root@Betty ~]#
[root@Betty ~]# ./test_2.sh
[root@Betty ~]# 
Check from another window:
[root@Betty ~]# ps ajxf
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
...
    1  1860  1860  1860 ?           -1 Ss       0   0:00 /usr/sbin/sshd
 1860 13331 13331 13331 ?           -1 Ss       0   0:00  \_ sshd: root@pts/1,pts/0
13331 16993 16993 16993 pts/1    18778 Ss       0   0:00      \_ -bash
16993 18778 18778 16993 pts/1    18778 R+       0   0:00      |   \_ ps ajxf
13331 18734 18734 18734 pts/0    18734 Ss+      0   0:00      \_ -bash 
...
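    1 18763 18762 18734 pts/0    18734 S        0   0:00 sleep 600  -- the sleep started in the background; because it runs
                                                                       in the background, it does not block test_2.sh.
                                                                       Once test_2.sh finishes, its process exits on its own,
                                                                       and the sleep process is then adopted by init
Now close the window that started test_2.sh: the TTY and TPGID of the sleep 600 process have changed, but the process has not disappeared.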
[root@Betty ~]# ps ajxf
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
...
    1  1860  1860  1860 ?           -1 Ss       0   0:00 /usr/sbin/sshd
 1860 13331 13331 13331 ?           -1 Ss       0   0:00  \_ sshd: root@pts/1
13331 16993 16993 16993 pts/1    18816 Ss       0   0:00      \_ -bash
16993 18816 18816 16993 pts/1    18816 R+       0   0:00          \_ ps ajxf
...
    1 18763 18762 18734 ?           -1 S        0   0:00 sleep 600

Test 3 (foreground process group)
[root@Betty ~]# vi test_3.sh 
 
#!/bin/sh 
sleep 600 &
sleep 720
 
[root@Betty ~]#
[root@Betty ~]# ./test_3.sh
(blocks)
Check from another window:
[root@Betty ~]# ps ajxf
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
...
    1  1860  1860  1860 ?           -1 Ss       0   0:00 /usr/sbin/sshd
 1860 13331 13331 13331 ?           -1 Ss       0   0:00  \_ sshd: root@pts/1,pts/2
13331 16993 16993 16993 pts/1    18918 Ss       0   0:00      \_ -bash
16993 18918 18918 16993 pts/1    18918 R+       0   0:00      |   \_ ps ajxf
13331 18856 18856 18856 pts/2    18908 Ss       0   0:00      \_ -bash
18856 18908 18908 18856 pts/2    18908 S+       0   0:00          \_ /bin/sh ./test_3.sh
18908 18909 18908 18856 pts/2    18908 S+       0   0:00              \_ sleep 600
18908 18910 18908 18856 pts/2    18908 S+       0   0:00              \_ sleep 720
...
[root@Betty ~]#
Now close the window that started test_3.sh; all the corresponding processes are gone.
[root@Betty ~]# ps ajxf
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
...
    1  1860  1860  1860 ?           -1 Ss       0   0:00 /usr/sbin/sshd
 1860 13331 13331 13331 ?           -1 Ss       0   0:00  \_ sshd: root@pts/1
13331 16993 16993 16993 pts/1    18963 Ss       0   0:00      \_ -bash
16993 18963 18963 16993 pts/1    18963 R+       0   0:00          \_ ps ajxf
...
[root@Betty ~]#

Test 4 (background process group)
[root@Betty ~]# cat test_1.sh 

#!/bin/sh
sleep 600
 
[root@Betty ~]# 
[root@Betty ~]# 
[root@Betty ~]# ./test_1.sh &   -- run the script in the background
[1] 19016
[root@Betty ~]#
Check from another window:
[root@Betty ~]# ps ajxf
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
...
    1  1860  1860  1860 ?           -1 Ss       0   0:00 /usr/sbin/sshd
 1860 13331 13331 13331 ?           -1 Ss       0   0:00  \_ sshd: root@pts/1,pts/0
13331 16993 16993 16993 pts/1    19032 Ss       0   0:00      \_ -bash
16993 19032 19032 16993 pts/1    19032 R+       0   0:00      |   \_ ps ajxf
13331 18993 18993 18993 pts/0    18993 Ss+      0   0:00      \_ -bash
18993 19016 19016 18993 pts/0    18993 S        0   0:00          \_ /bin/sh ./test_1.sh
19016 19017 19016 18993 pts/0    18993 S        0   0:00              \_ sleep 600
...
[root@Betty ~]#
Now close the window that started test_1.sh in the background; all the corresponding processes are gone.
[root@Betty ~]# ps ajxf
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
...
    1  1860  1860  1860 ?           -1 Ss       0   0:00 /usr/sbin/sshd
 1860 13331 13331 13331 ?           -1 Ss       0   0:00  \_ sshd: root@pts/1
13331 16993 16993 16993 pts/1    19052 Ss       0   0:00      \_ -bash
16993 19052 19052 16993 pts/1    19052 R+       0   0:00          \_ ps ajxf
...
[root@Betty ~]#
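Experiment conclusion
In tests 1, 3 and 4 the script and its sleep processes all disappeared once the window was closed. Only in test 2, where the script had already exited and sleep had been adopted by init, did the process survive (with its TTY and TPGID reset).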

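Related concepts

To understand the experimental results above, you first need to understand the following concepts:

【Process group】

【Session】


【Login shell】

When you log in through a terminal or over the network, you get a login shell whose standard input, standard output and standard error are connected to a terminal device or a pseudo-terminal device.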
[root@Betty ~]# ps ajxf
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
...
    1  1860  1860  1860 ?           -1 Ss       0   0:00 /usr/sbin/sshd
 1860  2228  2228  2228 ?           -1 Ss       0   0:02  \_ sshd: root@pts/1,pts/2,pts/3,pts/4,pts/0,pts/5,pts/6,pts/7,pts/8,pts/9
 2228  2230  2230  2230 pts/1     2230 Ss+      0   0:00      \_ -bash
 2228  2747  2747  2747 pts/2     2747 Ss+      0   0:00      \_ -bash
 2228  2772  2772  2772 pts/3     2772 Ss+      0   0:00      \_ -bash
 2228  6750  6750  6750 pts/4    16910 Ss       0   0:00      \_ -bash
 6750 16910 16910  6750 pts/4    16910 Sl+      0   0:02      |   \_ ./modb /etc/modbcore.conf
 2228  7213  7213  7213 pts/0     7213 Ss+      0   0:00      \_ -bash
 2228 17072 17072 17072 pts/5    17072 Ss+      0   0:00      \_ -bash
 2228 17091 17091 17091 pts/6    17091 Ss+      0   0:00      \_ -bash
 2228 17111 17111 17111 pts/7    17111 Ss+      0   0:00      \_ -bash
 2228 17132 17132 17132 pts/8    17132 Ss+      0   0:00      \_ -bash
 2228 17154 17154 17154 pts/9    17175 Ss       0   0:00      \_ -bash
17154 17175 17175 17154 pts/9    17175 R+       0   0:00          \_ ps ajxf
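      As shown, every -bash above is a login shell created over an ssh network connection, and each one is attached to a pts pseudo-terminal; in other words, each login shell has a controlling terminal.
      Note also that for each login shell PID = PGID = SID, so the login shell is both the session leader and a process group leader.
      The login shell marks the start of a POSIX.1 session, and the terminal or pseudo-terminal is that session's controlling terminal.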
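
The PID = PGID = SID observation can also be checked by script rather than by reading ps output by eye. The following is only a minimal sketch, assuming a Linux /proc filesystem; the helper names proc_field and is_session_leader are made up for illustration and are not part of any standard tool.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original post). Assumes Linux /proc.

# Print one field of /proc/PID/stat: ppid, pgid, sid, tty_nr or tpgid.
proc_field() {
    _stat=$(cat "/proc/$1/stat") || return 1
    # Drop the leading "pid (comm) "; comm may itself contain spaces.
    _rest=${_stat##*) }
    # Remaining fields: state ppid pgrp session tty_nr tpgid ...
    set -- "$2" $_rest
    case $1 in
        ppid)   echo "$3" ;;
        pgid)   echo "$4" ;;
        sid)    echo "$5" ;;
        tty_nr) echo "$6" ;;
        tpgid)  echo "$7" ;;
        *)      return 1 ;;
    esac
}

# A process is a session leader when its PID equals its session ID.
is_session_leader() {
    [ "$(proc_field "$1" sid)" = "$1" ]
}

# A background child inherits the session of the shell that started it.
sleep 5 &
child=$!
echo "shell sid=$(proc_field $$ sid), child pgid=$(proc_field "$child" pgid) sid=$(proc_field "$child" sid)"
if is_session_leader "$child"; then
    echo "child $child is a session leader"
else
    echo "child $child is not a session leader"
fi
kill "$child"
```

Run from an ssh login, is_session_leader applied to the PID of the login shell itself should succeed, matching the -bash rows in the listing above.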

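【Pseudo-terminal】
      So that the same software can handle both terminal logins and network logins, the system uses a software driver called a pseudo-terminal. It emulates the behaviour of a serial terminal and maps terminal operations onto network operations, and vice versa.
      Whether you log in through a terminal (a hardwired connection plus the terminal device driver) or over the network (a network connection plus the pseudo-terminal device driver), you get a login shell whose standard input, standard output and standard error are connected to a terminal device or a pseudo-terminal device.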
[root@YOYO ~]# w
 14:58:20 up 2 days,  5:30,  1 user,  load average: 0.00, 0.00, 0.00
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/1    172.16.81.112    Mon09    0.00s  0.45s  0.07s w
[root@YOYO ~]# 
[root@YOYO ~]# 
[root@YOYO ~]# ps aux|grep bash
root      2030  0.0  0.0 108552  2000 pts/1    Ss   Jul13   0:00 -bash
root     24962  0.0  0.0 103256   856 pts/1    S+   14:58   0:00 grep bash
[root@YOYO ~]# ll /proc/2030/fd
总用量 0
lrwx------. 1 root root 64 7月  13 09:43 0 -> /dev/pts/1
lrwx------. 1 root root 64 7月  13 09:43 1 -> /dev/pts/1
lrwx------. 1 root root 64 7月  13 09:43 2 -> /dev/pts/1
lrwx------. 1 root root 64 7月  15 11:04 255 -> /dev/pts/1
[root@YOYO ~]#
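
The same check can be scripted. The sketch below assumes Linux /proc and uses a made-up helper name, fd_target; it reports, for the shell running it, whether each standard descriptor is a terminal and what it points at.

```shell
#!/bin/sh
# Sketch, assuming Linux /proc. fd_target is a hypothetical helper name.

# Print what file descriptor $2 of process $1 refers to.
fd_target() {
    readlink "/proc/$1/fd/$2"
}

for fd in 0 1 2; do
    if [ -t "$fd" ]; then
        echo "fd $fd is a terminal: $(fd_target $$ "$fd")"
    else
        echo "fd $fd is not a terminal: $(fd_target $$ "$fd")"
    fi
done
```

In an ssh login shell all three lines would be expected to name the same /dev/pts/N device, as in the /proc/2030/fd listing above; under cron or a pipeline they will not.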

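【Controlling terminal】


【Job control】

Job control is a feature that BSD added around 1980. It lets you start multiple jobs (process groups) on one terminal, and it controls which job may access the terminal and which jobs run in the background.

【Signals】

【Daemon】

【Orphaned process group】


Why the processes disappeared: SIGHUP

I previously wrote a blog post about the SIGHUP signal; some of its conclusions follow:


Using strace to study how the ways of running differ

      Now that we know the processes disappear because of SIGHUP, we can use strace to see what each way of running does about that signal.

Tracing a foreground run shows that nothing is done about SIGHUP.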
[root@YOYO ~]# strace sleep 10
execve("/bin/sleep", ["sleep", "10"], [/* 28 vars */]) = 0
brk(0)                                  = 0xafa000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe54ec09000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=70566, ...}) = 0
mmap(NULL, 70566, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe54ebf7000
close(3)                                = 0
open("/lib64/libc.so.6", O_RDONLY)      = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\356\0018?\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1926760, ...}) = 0
mmap(0x3f38000000, 3750152, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3f38000000
mprotect(0x3f3818a000, 2097152, PROT_NONE) = 0
mmap(0x3f3838a000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x18a000) = 0x3f3838a000
mmap(0x3f3838f000, 18696, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3f3838f000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe54ebf6000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe54ebf5000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe54ebf4000
arch_prctl(ARCH_SET_FS, 0x7fe54ebf5700) = 0
mprotect(0x3f3838a000, 16384, PROT_READ) = 0
mprotect(0x3f37a1f000, 4096, PROT_READ) = 0
munmap(0x7fe54ebf7000, 70566)           = 0
brk(0)                                  = 0xafa000
brk(0xb1b000)                           = 0xb1b000
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=99154480, ...}) = 0
mmap(NULL, 99154480, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe548d64000
close(3)                                = 0
nanosleep({10, 0}, NULL)                = 0                -- corresponds to sleep 10
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
[root@YOYO ~]#

Tracing a background run shows that, again, nothing is done about SIGHUP.
[root@YOYO ~]# strace sleep 10 &
[1] 2727
[root@YOYO ~]# execve("/bin/sleep", ["sleep", "10"], [/* 28 vars */]) = 0
brk(0)                                  = 0x1406000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff18790f000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=70566, ...}) = 0
mmap(NULL, 70566, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ff1878fd000
close(3)                                = 0
open("/lib64/libc.so.6", O_RDONLY)      = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\356\0018?\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1926760, ...}) = 0
mmap(0x3f38000000, 3750152, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3f38000000
mprotect(0x3f3818a000, 2097152, PROT_NONE) = 0
mmap(0x3f3838a000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x18a000) = 0x3f3838a000
mmap(0x3f3838f000, 18696, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3f3838f000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff1878fc000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff1878fb000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff1878fa000
arch_prctl(ARCH_SET_FS, 0x7ff1878fb700) = 0
mprotect(0x3f3838a000, 16384, PROT_READ) = 0
mprotect(0x3f37a1f000, 4096, PROT_READ) = 0
munmap(0x7ff1878fd000, 70566)           = 0
brk(0)                                  = 0x1406000
brk(0x1427000)                          = 0x1427000
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=99154480, ...}) = 0
mmap(NULL, 99154480, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ff181a6a000
close(3)                                = 0
nanosleep({10, 0}, NULL)                = 0              -- corresponds to sleep 10
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?

[1]+  Done                    strace sleep 10
[root@YOYO ~]#
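
Because sleep installs no handler, the default action for SIGHUP (terminate the process) applies to it. A small sketch that makes this visible from a script: bash and dash report a child killed by a signal as 128 plus the signal number, and SIGHUP is signal 1 on Linux. One assumption to be aware of: if the shell running this was itself started with SIGHUP ignored (for example under nohup), the child inherits that and will not be killed.

```shell
#!/bin/sh
# Sketch: a plain background sleep keeps the default SIGHUP disposition,
# so delivering SIGHUP terminates it.
sleep 10 &
pid=$!

kill -HUP "$pid"   # deliver the signal by hand instead of closing a window
wait "$pid"        # collect the child's exit status
status=$?

# 128 + 1 (SIGHUP) is the shell's way of saying "killed by SIGHUP".
echo "sleep (pid $pid) finished with status $status"
```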

Tracing setsid shows that it does nothing about SIGHUP either (why a process started through setsid does not exit is explained later).
[root@YOYO ~]# strace setsid sleep 10        
execve("/usr/bin/setsid", ["setsid", "sleep", "10"], [/* 28 vars */]) = 0
brk(0)                                  = 0x1274000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2dc78d4000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=70566, ...}) = 0
mmap(NULL, 70566, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f2dc78c2000
close(3)                                = 0
open("/lib64/libc.so.6", O_RDONLY)      = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\356\0018?\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1926760, ...}) = 0
mmap(0x3f38000000, 3750152, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3f38000000
mprotect(0x3f3818a000, 2097152, PROT_NONE) = 0
mmap(0x3f3838a000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x18a000) = 0x3f3838a000
mmap(0x3f3838f000, 18696, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3f3838f000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2dc78c1000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2dc78c0000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2dc78bf000
arch_prctl(ARCH_SET_FS, 0x7f2dc78c0700) = 0
mprotect(0x3f3838a000, 16384, PROT_READ) = 0
mprotect(0x3f37a1f000, 4096, PROT_READ) = 0
munmap(0x7f2dc78c2000, 70566)           = 0
brk(0)                                  = 0x1274000
brk(0x1295000)                          = 0x1295000
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=99154480, ...}) = 0
mmap(NULL, 99154480, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f2dc1a2f000
close(3)                                = 0
getpgrp()                               = 2743              -- get the process group ID
getpid()                                = 2744              -- get the process ID
setsid()                                = 2744              -- create a new session
execve("/usr/lib64/qt-3.3/bin/sleep", ["sleep", "10"], [/* 28 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/local/sbin/sleep", ["sleep", "10"], [/* 28 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/local/bin/sleep", ["sleep", "10"], [/* 28 vars */]) = -1 ENOENT (No such file or directory)
execve("/sbin/sleep", ["sleep", "10"], [/* 28 vars */]) = -1 ENOENT (No such file or directory)
execve("/bin/sleep", ["sleep", "10"], [/* 28 vars */]) = 0
brk(0)                                  = 0x1f96000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f60207fc000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=70566, ...}) = 0
mmap(NULL, 70566, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f60207ea000
close(3)                                = 0
open("/lib64/libc.so.6", O_RDONLY)      = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\356\0018?\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1926760, ...}) = 0
mmap(0x3f38000000, 3750152, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3f38000000
mprotect(0x3f3818a000, 2097152, PROT_NONE) = 0
mmap(0x3f3838a000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x18a000) = 0x3f3838a000
mmap(0x3f3838f000, 18696, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3f3838f000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f60207e9000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f60207e8000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f60207e7000
arch_prctl(ARCH_SET_FS, 0x7f60207e8700) = 0
mprotect(0x3f3838a000, 16384, PROT_READ) = 0
mprotect(0x3f37a1f000, 4096, PROT_READ) = 0
munmap(0x7f60207ea000, 70566)           = 0
brk(0)                                  = 0x1f96000
brk(0x1fb7000)                          = 0x1fb7000
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=99154480, ...}) = 0
mmap(NULL, 99154480, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f601a957000
close(3)                                = 0
nanosleep({10, 0}, NULL)                = 0                 -- corresponds to sleep 10
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
[root@YOYO ~]#
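
The trace shows setsid() succeeding, but not what it changes. The sketch below compares the session-related fields of the current shell with those of a command started through setsid, read from /proc/PID/stat. It assumes Linux and the setsid command from util-linux (or BusyBox); ids_of_stat is a made-up helper name.

```shell
#!/bin/sh
# Sketch, assuming Linux /proc and an installed setsid command.

# Turn one /proc/PID/stat line into "pid pgid sid tty_nr".
ids_of_stat() {
    _pid=${1%% *}
    _rest=${1##*) }          # skip "pid (comm) "; comm may contain spaces
    set -- $_rest            # state ppid pgrp session tty_nr ...
    echo "$_pid $3 $4 $5"
}

here=$(ids_of_stat "$(cat /proc/$$/stat)")
# The single-quoted $$ is expanded by the inner sh, that is, by the process
# setsid execs, so the line describes a process in the new session.
there=$(ids_of_stat "$(setsid sh -c 'cat /proc/$$/stat')")

echo "this shell   : pid pgid sid tty_nr = $here"
echo "under setsid : pid pgid sid tty_nr = $there"
```

For the second line, pid, pgid and sid should all be equal and tty_nr should be 0, meaning the process leads a new session that has no controlling terminal. This matches process 2 in the first experiment (TTY shown as ?), and a terminal hangup has nothing to deliver SIGHUP to in such a session.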

Tracing nohup shows that it sets SIGHUP to be ignored internally.
[root@YOYO ~]# strace nohup sleep 10 &
[1] 763
[root@YOYO ~]# execve("/usr/bin/nohup", ["nohup", "sleep", "10"], [/* 28 vars */]) = 0
brk(0)                                  = 0x138b000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f41c123d000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=70566, ...}) = 0
mmap(NULL, 70566, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f41c122b000
close(3)                                = 0
open("/lib64/libc.so.6", O_RDONLY)      = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\356\0018?\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1926760, ...}) = 0
mmap(0x3f38000000, 3750152, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3f38000000
mprotect(0x3f3818a000, 2097152, PROT_NONE) = 0
mmap(0x3f3838a000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x18a000) = 0x3f3838a000
mmap(0x3f3838f000, 18696, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3f3838f000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f41c122a000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f41c1229000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f41c1228000
arch_prctl(ARCH_SET_FS, 0x7f41c1229700) = 0
mprotect(0x3f3838a000, 16384, PROT_READ) = 0
mprotect(0x3f37a1f000, 4096, PROT_READ) = 0
munmap(0x7f41c122b000, 70566)           = 0
brk(0)                                  = 0x138b000
brk(0x13ac000)                          = 0x13ac000
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=99154480, ...}) = 0
mmap(NULL, 99154480, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f41bb398000
close(3)                                = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig -icanon -echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig -icanon -echo ...}) = 0
ioctl(2, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig -icanon -echo ...}) = 0
open("/dev/null", O_WRONLY)             = 3                      -- open /dev/null write-only
dup2(3, 0)                              = 0                      -- redirect standard input to /dev/null
close(3)                                = 0
umask(037777777177)                     = 022
open("nohup.out", O_WRONLY|O_CREAT|O_APPEND, 0600) = 3           -- open the nohup.out file
dup2(3, 1)                              = 1                      -- redirect standard output to nohup.out
close(3)                                = 0
umask(022)                              = 0177
open("/usr/share/locale/locale.alias", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2512, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f41c123c000
read(3, "# Locale name alias data base.\n#"..., 4096) = 2512
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7f41c123c000, 4096)            = 0
open("/usr/share/locale/zh_CN.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/zh_CN.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/zh_CN/LC_MESSAGES/coreutils.mo", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=286636, ...}) = 0
mmap(NULL, 286636, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f41bb352000
close(3)                                = 0
open("/usr/lib64/gconv/gconv-modules.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=26060, ...}) = 0
mmap(NULL, 26060, PROT_READ, MAP_SHARED, 3, 0) = 0x7f41bb34b000
close(3)                                = 0
write(2, "nohup: ", 7nohup: )                  = 7
write(2, "\345\277\275\347\225\245\350\276\223\345\205\245\345\271\266\346\212\212\350\276\223\345\207\272\350\277\275\345\212\240\345\210"..., 44忽略输入并把输出追加到"nohup.out") = 44
write(2, "\n", 1
)                       = 1
fcntl(2, F_DUPFD, 3)                    = 3
fcntl(3, F_GETFD)                       = 0
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
dup2(1, 2)                              = 2                       -- redirect standard error to nohup.out
rt_sigaction(SIGHUP, {SIG_IGN, [HUP], SA_RESTORER|SA_RESTART, 0x3f380326a0}, {SIG_DFL, [], 0}, 8) = 0     -- set the SIGHUP disposition to SIG_IGN
execve("/usr/lib64/qt-3.3/bin/sleep", ["sleep", "10"], [/* 28 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/local/sbin/sleep", ["sleep", "10"], [/* 28 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/local/bin/sleep", ["sleep", "10"], [/* 28 vars */]) = -1 ENOENT (No such file or directory)
execve("/sbin/sleep", ["sleep", "10"], [/* 28 vars */]) = -1 ENOENT (No such file or directory)
execve("/bin/sleep", ["sleep", "10"], [/* 28 vars */]) = 0
brk(0)                                  = 0x86a000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc801ecc000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=70566, ...}) = 0
mmap(NULL, 70566, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fc801eba000
close(3)                                = 0
open("/lib64/libc.so.6", O_RDONLY)      = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\356\0018?\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1926760, ...}) = 0
mmap(0x3f38000000, 3750152, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3f38000000
mprotect(0x3f3818a000, 2097152, PROT_NONE) = 0
mmap(0x3f3838a000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x18a000) = 0x3f3838a000
mmap(0x3f3838f000, 18696, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3f3838f000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc801eb9000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc801eb8000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc801eb7000
arch_prctl(ARCH_SET_FS, 0x7fc801eb8700) = 0
mprotect(0x3f3838a000, 16384, PROT_READ) = 0
mprotect(0x3f37a1f000, 4096, PROT_READ) = 0
munmap(0x7fc801eba000, 70566)           = 0
brk(0)                                  = 0x86a000
brk(0x88b000)                           = 0x88b000
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=99154480, ...}) = 0
mmap(NULL, 99154480, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fc7fc027000
close(3)                                = 0
nanosleep({10, 0}, NULL)                = 0                  -- corresponds to sleep 10
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?


[1]+  Done                    strace nohup sleep 10
[root@YOYO ~]#
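
The steps visible in this trace (redirect the standard streams, ignore SIGHUP, then exec the command) can be approximated in plain shell. This is a sketch and not nohup's actual implementation: the redirections here are unconditional, whereas the trace shows nohup first probing each descriptor with TCGETS; the log path and the helper name ignores_sighup are made up; and the SigIgn check assumes Linux's /proc/PID/status.

```shell
#!/bin/sh
# Sketch of what the trace shows nohup doing, in plain sh. Not nohup's real
# implementation; ignores_sighup and the log path are made up for illustration.

# True if process $1 ignores SIGHUP, judged from the SigIgn bitmask in
# /proc/PID/status (Linux). SIGHUP is signal 1, i.e. the lowest bit.
ignores_sighup() {
    _mask=$(sed -n 's/^SigIgn:[[:space:]]*//p' "/proc/$1/status")
    case $_mask in
        *[13579bdf]) return 0 ;;
        *)           return 1 ;;
    esac
}

log="${TMPDIR:-/tmp}/nohup-sketch.out"

# Ignore SIGHUP, then exec the command: an ignored disposition survives
# exec, so sleep itself needs no SIGHUP handling.
sh -c 'trap "" HUP; exec "$@"' sh sleep 5 </dev/null >>"$log" 2>&1 &
pid=$!

sleep 1                      # let the child install the trap and exec
kill -HUP "$pid"             # deliver the signal a hangup would send
sleep 1

if kill -0 "$pid" 2>/dev/null && ignores_sighup "$pid"; then
    survived=yes
else
    survived=no
fi
echo "pid $pid survived SIGHUP: $survived"
kill "$pid" 2>/dev/null
```

The point of the design is the order: the disposition is changed before exec, in the same process, which is exactly where rt_sigaction(SIGHUP, {SIG_IGN, ...}) sits in the trace relative to the final execve.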

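Source implementation of setsid and nohup

      The strace output above does not show why a program started through setsid is not terminated by SIGHUP (although in theory we know it is because the new process has no controlling terminal). Let's look at the source of the two commands.

Below is the core source of setsid (taken from util-linux-2.26).
[image: setsid source listing]
Below is the core source of nohup (taken from coreutils-8.24).
[image: nohup source listing]
As you can see, the logic in the source corresponds exactly to what strace showed.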

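Problems in the deployment tool scripts

      With the above in hand, it is easy to spot or explain the problems in the scripts we actually use (company-internal content, omitted).

Further reading

1. 《SIGHUP问题梳理》 (a walkthrough of SIGHUP issues)
2. 《Bg, Fg, &, Ctrl-Z – 5 Examples to Manage Unix Background Jobs》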



Original: http://my.oschina.net/moooofly/blog/498924
