supervisor log shows "WARN received SIGTERM indicating exit request" every 2 minutes

sunnyzls 2020-02-13 09:23:45
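I recently built an ASP.NET Core project by following a tutorial. After deploying it to CentOS 8, with nginx as the reverse proxy and supervisor keeping the process alive, the site was reachable only intermittently. Checking /tmp/supervisor.log, I found a "WARN received SIGTERM indicating exit request" event roughly every 2 minutes, after which my site's process died.
When I run the program by hand from the command line it never exits on its own, but under supervisor it dies every 2 minutes. I am new to this and have searched online for a long time without finding the cause.
Here is the content of ResultUploadSystem.ini in /etc/supervisor/conf.d/: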
[program:ResultUploadSystem]
command=dotnet ResultUploadSystem.dll
directory=/aspnetcore/publish
autostart=true
autorestart=true
startretries=5
startsecs=1
user=root
priority=999
stderr_logfile=/var/log/ResultUploadSystem.err.log
stdout_logfile=/var/log/ResultUploadSystem.out.log
environment=ASPNETCORE_ENVIORNMENT=Production
stopsignal=INT

Here is part of /tmp/supervisor.log:
2020-02-13 20:53:39,175 WARN received SIGTERM indicating exit request
2020-02-13 20:53:39,176 INFO waiting for ResultUploadSystem to die
2020-02-13 20:53:40,270 INFO stopped: ResultUploadSystem (exit status 0)
2020-02-13 20:54:22,659 CRIT Supervisor is running as root. Privileges were not dropped because no user is specified in the config file. If you intend to run as root, you can set user=root in the config file to avoid this message.
2020-02-13 20:54:22,659 INFO Included extra file "/etc/supervisor/conf.d/ResultUploadSystem.ini" during parsing
2020-02-13 20:54:22,671 INFO RPC interface 'supervisor' initialized
2020-02-13 20:54:22,672 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2020-02-13 20:54:22,672 INFO supervisord started with pid 11658
2020-02-13 20:54:23,675 INFO spawned: 'ResultUploadSystem' with pid 11661
2020-02-13 20:54:25,275 INFO success: ResultUploadSystem entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-02-13 20:55:53,465 WARN received SIGTERM indicating exit request
2020-02-13 20:55:53,465 INFO waiting for ResultUploadSystem to die
2020-02-13 20:55:54,534 INFO stopped: ResultUploadSystem (exit status 0)
2020-02-13 20:56:36,912 CRIT Supervisor is running as root. Privileges were not dropped because no user is specified in the config file. If you intend to run as root, you can set user=root in the config file to avoid this message.
2020-02-13 20:56:36,912 INFO Included extra file "/etc/supervisor/conf.d/ResultUploadSystem.ini" during parsing
2020-02-13 20:56:36,924 INFO RPC interface 'supervisor' initialized
2020-02-13 20:56:36,924 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2020-02-13 20:56:36,924 INFO supervisord started with pid 11739
2020-02-13 20:56:37,932 INFO spawned: 'ResultUploadSystem' with pid 11742
2020-02-13 20:56:39,510 INFO success: ResultUploadSystem entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-02-13 20:58:07,467 WARN received SIGTERM indicating exit request
2020-02-13 20:58:07,467 INFO waiting for ResultUploadSystem to die
2020-02-13 20:58:08,542 INFO stopped: ResultUploadSystem (exit status 0)
2020-02-13 20:58:51,149 CRIT Supervisor is running as root. Privileges were not dropped because no user is specified in the config file. If you intend to run as root, you can set user=root in the config file to avoid this message.
2020-02-13 20:58:51,149 INFO Included extra file "/etc/supervisor/conf.d/ResultUploadSystem.ini" during parsing
2020-02-13 20:58:51,158 INFO RPC interface 'supervisor' initialized
2020-02-13 20:58:51,158 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2020-02-13 20:58:51,158 INFO supervisord started with pid 11824
2020-02-13 20:58:52,163 INFO spawned: 'ResultUploadSystem' with pid 11827
2020-02-13 20:58:53,763 INFO success: ResultUploadSystem entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-02-13 21:00:21,791 WARN received SIGTERM indicating exit request
2020-02-13 21:00:21,793 INFO waiting for ResultUploadSystem to die
2020-02-13 21:00:22,904 INFO stopped: ResultUploadSystem (exit status 0)
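
One way to read the excerpt above is to measure how long each supervisord instance lived before it received SIGTERM. The sketch below is illustrative only: the function names are made up for this example, and SAMPLE_LOG holds six lines copied from the excerpt. On those lines the uptime comes out at just over 90 seconds each time, which may point at a fixed timeout rather than at a crash; systemd's default service start timeout is 90 seconds, though nothing in the log itself confirms that systemd is the sender.

```python
from datetime import datetime

# Six lines copied from the supervisor.log excerpt above.
SAMPLE_LOG = """\
2020-02-13 20:54:22,672 INFO supervisord started with pid 11658
2020-02-13 20:55:53,465 WARN received SIGTERM indicating exit request
2020-02-13 20:56:36,924 INFO supervisord started with pid 11739
2020-02-13 20:58:07,467 WARN received SIGTERM indicating exit request
2020-02-13 20:58:51,158 INFO supervisord started with pid 11824
2020-02-13 21:00:21,791 WARN received SIGTERM indicating exit request
"""


def parse_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def supervisord_uptimes(log_text):
    """Return the seconds between each 'supervisord started' and the next SIGTERM."""
    uptimes = []
    started_at = None
    for line in log_text.splitlines():
        if "supervisord started with pid" in line:
            started_at = parse_timestamp(line)
        elif "received SIGTERM" in line and started_at is not None:
            uptimes.append((parse_timestamp(line) - started_at).total_seconds())
            started_at = None
    return uptimes


if __name__ == "__main__":
    for seconds in supervisord_uptimes(SAMPLE_LOG):
        print(f"supervisord ran for {seconds:.1f} s before SIGTERM")
```

To run it against the real file, read /tmp/supervisor.log and pass its text to supervisord_uptimes instead of SAMPLE_LOG.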
sunnyzls 2020-02-15
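I reinstalled CentOS 8 and deployed the ASP.NET Core program again. This time supervisor worked.
The earlier failure was probably caused by doing more than was needed.
On CentOS 8, I installed supervisor with the following commands: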
yum install epel-release
yum install -y supervisor

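After installation, /etc/ already contains supervisord.conf (the main configuration file, which already enables "[include] files = supervisord.d/*.ini") and /etc/supervisord.d/ (the directory for the configuration files of supervised programs), and /usr/lib/systemd/system/ already contains supervisord.service (the service unit, which is tied to the files under /etc listed above).
Last time I created or modified all of these files and directories myself, following the procedure for CentOS 7. Either one of my changes broke something or the two sets of files conflicted; in the end the supervised program restarted roughly every 2 minutes.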
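
To check whether a machine has ended up with both layouts, a small script can look for the paths named in this thread. This is a minimal sketch under stated assumptions: the "packaged" paths are the ones described in this post, the "hand-made" paths are the ones that appear in the earlier logs and systemctl output, and the function names are invented for the example.

```python
from pathlib import Path

# Paths mentioned in this thread, written relative to the filesystem root.
# PACKAGED: created by the supervisor package on CentOS 8, per this post.
# HAND_MADE: the layout that appears in the earlier logs of this thread.
PACKAGED = ["etc/supervisord.conf", "etc/supervisord.d"]
HAND_MADE = ["etc/supervisor/supervisord.conf", "etc/supervisor/conf.d"]


def existing(root, relative_paths):
    """Return the entries of relative_paths that exist under root."""
    return [p for p in relative_paths if (Path(root) / p).exists()]


def report(root="/"):
    """Summarise which supervisor config layouts are present under root."""
    packaged = existing(root, PACKAGED)
    hand_made = existing(root, HAND_MADE)
    return {
        "packaged": packaged,
        "hand_made": hand_made,
        "both_present": bool(packaged) and bool(hand_made),
    }


if __name__ == "__main__":
    result = report()
    print("packaged layout: ", result["packaged"])
    print("hand-made layout:", result["hand_made"])
    if result["both_present"]:
        print("Both layouts exist; check which one supervisord.service uses.")
```

Finding both layouts does not prove a conflict; it only shows where to look next, for example at the -c argument in the ExecStart line of supervisord.service.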
github_36000833 2020-02-15
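First confirm whether the problem lies with supervisor itself or with this machine.
1. Remove ResultUploadSystem.ini from /etc/supervisor/conf.d/.
2. Create a simple task, my.conf, and copy it into /etc/supervisor/conf.d/.
---- Content of my.conf; its command prints the time every two seconds ----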
[program:my]
command=sh -c "while true; do echo $(date) && sleep 2; done"
3. Restart supervisord.
4. The supervisord log should show that program:my started successfully; in this example its process ID is 1981:
CRIT Supervisor running as root (no user in config file)
INFO Included extra file "/etc/supervisor/conf.d/my.conf" during parsing
INFO RPC interface 'supervisor' initialized
CRIT Server 'unix_http_server' running without any HTTP authentication checking
INFO supervisord started with pid 1978
INFO spawned: 'my' with pid 1981
INFO success: my entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
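Check whether supervisor is still being killed unexpectedly. If it is, and everything works on another machine, the problem is specific to this machine.
In general, when a process managed by supervisor crashes, supervisor should start it again; that is what supervisor is for, and supervisor itself should not crash.
For example, with program:my, if you run kill 1981 to kill process 1981 by hand, the supervisor log shows the task being restarted with the new pid 3690: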
INFO exited: my (terminated by SIGTERM; not expected)
INFO spawned: 'my' with pid 3690
INFO success: my entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
sunnyzls 2020-02-15
In reply to post #6 by github_36000833:
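With program:my, supervisor still restarted for no apparent reason. Nothing improved.
My system is CentOS 8. I then installed CentOS 7 and, with the same settings and steps, the program ran normally there. Under supervisor, the program was also restarted immediately after being killed.
I do not know why it fails on CentOS 8. I will reinstall CentOS 8 shortly and try again.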
X-i-n 2020-02-14
This configuration file is fine; your program may be throwing an exception. Check /var/log/ResultUploadSystem.err.log and /var/log/ResultUploadSystem.out.log for relevant messages.
sunnyzls 2020-02-14
I checked the cron scheduled tasks; there is nothing there.
I checked the state of supervisord with systemctl status supervisord. Right after running systemctl start supervisord, everything looked normal:
[root@localhost system]# systemctl status supervisord
● supervisord.service - Supervisor Daemon
Loaded: loaded (/usr/lib/systemd/system/supervisord.service; disabled; vendor preset: disabled)
Active: activating (start) since Fri 2020-02-14 22:46:46 CST; 6s ago
Cntrl PID: 11398 (supervisord)
Tasks: 16 (limit: 11344)
Memory: 56.9M
CGroup: /system.slice/supervisord.service
├─11398 /usr/bin/python3.6 /usr/bin/supervisord -c /etc/supervisor/supervisord.conf
└─11401 dotnet ResultUploadSystem.dll

2月 14 22:46:46 localhost.localdomain systemd[1]: Starting Supervisor Daemon...
2月 14 22:46:46 localhost.localdomain supervisord[11398]: 2020-02-14 22:46:46,477 CRIT Supervisor is running as root. Privileges>
2月 14 22:46:46 localhost.localdomain supervisord[11398]: 2020-02-14 22:46:46,478 INFO Included extra file "/etc/supervisor/conf.>
2月 14 22:46:46 localhost.localdomain supervisord[11398]: 2020-02-14 22:46:46,490 INFO RPC interface 'supervisor' initialized
2月 14 22:46:46 localhost.localdomain supervisord[11398]: 2020-02-14 22:46:46,490 CRIT Server 'unix_http_server' running without >
2月 14 22:46:46 localhost.localdomain supervisord[11398]: 2020-02-14 22:46:46,491 INFO supervisord started with pid 11398
2月 14 22:46:47 localhost.localdomain supervisord[11398]: 2020-02-14 22:46:47,495 INFO spawned: 'ResultUploadSystem' with pid 114>
2月 14 22:46:49 localhost.localdomain supervisord[11398]: 2020-02-14 22:46:49,083 INFO success: ResultUploadSystem entered RUNNIN>
lines 1-18/18 (END)

Some time later, after the ASP.NET Core process had died, systemctl status supervisord gave this:
[root@localhost system]# systemctl status supervisord
● supervisord.service - Supervisor Daemon
Loaded: loaded (/usr/lib/systemd/system/supervisord.service; disabled; vendor preset: disabled)
Active: activating (auto-restart) (Result: timeout) since Fri 2020-02-14 22:43:56 CST; 30s ago
Process: 11207 ExecStart=/usr/bin/supervisord -c /etc/supervisor/supervisord.conf (code=exited, status=0/SUCCESS)

2月 14 22:43:56 localhost.localdomain systemd[1]: supervisord.service: Failed with result 'timeout'.
2月 14 22:43:56 localhost.localdomain systemd[1]: Failed to start Supervisor Daemon.
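
The first status output shows "activating (start)" while the processes are already running, and the second shows "Result: timeout". One possible reading, which is an assumption and not something the thread confirms, is that systemd never considered the service fully started and stopped it when its start timeout expired. The sketch below asks systemd for the unit properties that govern this. The helper names are invented for the example; the property names passed to systemctl show (Type, PIDFile, TimeoutStartUSec, Restart, RestartUSec) are standard service properties.

```python
import subprocess

# Unit properties that describe how systemd starts and restarts a service.
PROPERTIES = ["Type", "PIDFile", "TimeoutStartUSec", "Restart", "RestartUSec"]


def parse_show_output(text):
    """Turn the 'Key=Value' lines printed by `systemctl show` into a dict."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            result[key] = value
    return result


def unit_properties(unit="supervisord"):
    """Ask systemd for the selected properties of a unit."""
    command = ["systemctl", "show", unit]
    for name in PROPERTIES:
        command += ["-p", name]
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode; works on Python 3.6 as well
        check=True,
    )
    return parse_show_output(completed.stdout)


if __name__ == "__main__":
    for key, value in unit_properties().items():
        print(f"{key} = {value}")
```

If Type or PIDFile does not match the way supervisord is actually launched (for example, the pidfile path configured in the supervisord.conf passed with -c), that mismatch would be worth investigating before looking elsewhere.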
github_36000833 2020-02-14
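supervisor.log says supervisor received a SIGTERM signal, meaning someone (or some other program) asked supervisor to stop. supervisor.log does not report any crash of the .NET Core program. Because supervisor runs its tasks as child processes, restarting supervisor also restarts all of its tasks, including your ASP.NET Core program. Check whether a cron job, an automatic update, malware, or something similar is causing supervisor to restart.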
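
The distinction drawn here can be read straight from the log wording. The following is a minimal sketch that labels supervisord log lines by what they indicate, using only message texts that appear in this thread; the function name and the label strings are invented for the example.

```python
def classify(line):
    """Label a supervisord log line by who was asked to stop."""
    if "received SIGTERM indicating exit request" in line:
        # supervisord itself received the signal and is shutting down.
        return "supervisord-asked-to-exit"
    if "exited:" in line and "not expected" in line:
        # A managed program ended without supervisord asking it to.
        return "child-exited-unexpectedly"
    if "stopped:" in line:
        # supervisord stopped the managed program deliberately.
        return "child-stopped-by-supervisord"
    return "other"


if __name__ == "__main__":
    samples = [
        "2020-02-13 20:53:39,175 WARN received SIGTERM indicating exit request",
        "2020-02-13 20:53:40,270 INFO stopped: ResultUploadSystem (exit status 0)",
        "INFO exited: my (terminated by SIGTERM; not expected)",
    ]
    for sample in samples:
        print(classify(sample), "<-", sample)
```

In the original post's log every cycle contains the first and second kinds of line and never the third, which is consistent with supervisor being stopped from outside rather than the application crashing.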
sunnyzls 2020-02-14
In reply to post #1 by X-i-n:

/var/log/ResultUploadSystem.err.log is empty;
The most recent content of /var/log/ResultUploadSystem.out.log is:
warn: Microsoft.AspNetCore.DataProtection.Repositories.EphemeralXmlRepository[50]
Using an in-memory repository. Keys will not be persisted to storage.
warn: Microsoft.AspNetCore.DataProtection.KeyManagement.XmlKeyManager[59]
Neither user profile nor HKLM registry available. Using an ephemeral key repository. Protected data will be unavailable when application exits.
warn: Microsoft.AspNetCore.DataProtection.KeyManagement.XmlKeyManager[35]
No XML encryptor configured. Key {e082db6b-b669-4b81-ad88-0b7e0ea8921e} may be persisted to storage in unencrypted form.
info: Microsoft.Hosting.Lifetime[0]
Now listening on: http://[::]:5000
info: Microsoft.Hosting.Lifetime[0]
Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
Hosting environment: Production
info: Microsoft.Hosting.Lifetime[0]
Content root path: /aspnetcore/publish
info: Microsoft.Hosting.Lifetime[0]
Application is shutting down...
warn: Microsoft.AspNetCore.DataProtection.Repositories.EphemeralXmlRepository[50]
Using an in-memory repository. Keys will not be persisted to storage.
warn: Microsoft.AspNetCore.DataProtection.KeyManagement.XmlKeyManager[59]
Neither user profile nor HKLM registry available. Using an ephemeral key repository. Protected data will be unavailable when application exits.
warn: Microsoft.AspNetCore.DataProtection.KeyManagement.XmlKeyManager[35]
No XML encryptor configured. Key {11984272-1ced-4b93-8587-9f239fb37c6c} may be persisted to storage in unencrypted form.
info: Microsoft.Hosting.Lifetime[0]
Now listening on: http://[::]:5000
info: Microsoft.Hosting.Lifetime[0]
Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
Hosting environment: Production
info: Microsoft.Hosting.Lifetime[0]
Content root path: /aspnetcore/publish
info: Microsoft.Hosting.Lifetime[0]
Application is shutting down...
warn: Microsoft.AspNetCore.DataProtection.Repositories.EphemeralXmlRepository[50]
Using an in-memory repository. Keys will not be persisted to storage.
warn: Microsoft.AspNetCore.DataProtection.KeyManagement.XmlKeyManager[59]
Neither user profile nor HKLM registry available. Using an ephemeral key repository. Protected data will be unavailable when application exits.
warn: Microsoft.AspNetCore.DataProtection.KeyManagement.XmlKeyManager[35]
No XML encryptor configured. Key {7c0d01f1-cbe1-49c7-a7e4-ba541f1b59a6} may be persisted to storage in unencrypted form.
info: Microsoft.Hosting.Lifetime[0]
Now listening on: http://[::]:5000
info: Microsoft.Hosting.Lifetime[0]
Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
Hosting environment: Production
info: Microsoft.Hosting.Lifetime[0]
Content root path: /aspnetcore/publish
info: Microsoft.Hosting.Lifetime[0]
Application is shutting down...

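While testing, I worried that my program was at fault, so I stopped supervisor and ran the program directly with the dotnet command; it did not die.
Here is the output when running the program directly with dotnet: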
info: Microsoft.Hosting.Lifetime[0]
Now listening on: http://[::]:5000
info: Microsoft.Hosting.Lifetime[0]
Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
Hosting environment: Production
info: Microsoft.Hosting.Lifetime[0]
Content root path: /aspnetcore/publish
warn: Microsoft.AspNetCore.HttpsPolicy.HttpsRedirectionMiddleware[3]
Failed to determine the https port for redirect.
info: ResultUploadSystem.Controllers.AccountController[0]
Logged in admin.
info: ResultUploadSystem.Controllers.AccountController[0]
admin logged out.
info: ResultUploadSystem.Controllers.AccountController[0]
Logged in test.
