NodeManager goes down with no ERROR or exception in the log

m0_46397006 2021-04-30 12:05:13
NodeManager log:

8 GB physical memory used; 8.7 GB of 16.8 GB virtual memory used
2021-04-30 10:58:30,785 INFO nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:removeOrTrackCompletedContainersFromContext(553)) - Removed completed containers from NM context: [container_e15_1619677419274_1197_01_000007]
2021-04-30 10:58:30,791 WARN containermanager.ContainerManagerImpl (ContainerManagerImpl.java:handle(1070)) - Event EventType: KILL_CONTAINER sent to absent container container_e15_1619677419274_1197_01_000054
2021-04-30 10:58:30,846 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 124051 for container-id container_e15_1619677419274_1197_01_000027: 881.4 MB of 8 GB physical memory used; 8.7 GB of 16.8 GB virtual memory used
2021-04-30 10:58:31,337 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:startContainerInternal(810)) - Start request for container_e15_1619677419274_1197_01_000065 by user agent
2021-04-30 10:58:31,370 INFO application.ApplicationImpl (ApplicationImpl.java:transition(304)) - Adding container_e15_1619677419274_1197_01_000065 to application application_1619677419274_1197
2021-04-30 10:58:31,371 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e15_1619677419274_1197_01_000065 transitioned from NEW to LOCALIZING
2021-04-30 10:58:31,371 INFO containermanager.AuxServices (AuxServices.java:handle(215)) - Got event CONTAINER_INIT for appId application_1619677419274_1197
2021-04-30 10:58:31,371 INFO yarn.YarnShuffleService (YarnShuffleService.java:initializeContainer(192)) - Initializing container container_e15_1619677419274_1197_01_000065
2021-04-30 10:58:31,371 INFO yarn.YarnShuffleService (YarnShuffleService.java:initializeContainer(289)) - Initializing container container_e15_1619677419274_1197_01_000065
2021-04-30 10:58:31,371 INFO containermanager.AuxServices (AuxServices.java:handle(215)) - Got event APPLICATION_INIT for appId application_1619677419274_1197
2021-04-30 10:58:31,372 INFO containermanager.AuxServices (AuxServices.java:handle(219)) - Got APPLICATION_INIT for service mapreduce_shuffle
2021-04-30 10:58:31,372 INFO mapred.ShuffleHandler (ShuffleHandler.java:addJobToken(681)) - Added token for job_1619677419274_1197
2021-04-30 10:58:31,373 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e15_1619677419274_1197_01_000065 transitioned from LOCALIZING to LOCALIZED
2021-04-30 10:58:32,024 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:startContainerInternal(810)) - Start request for container_e15_1619677419274_1197_01_000066 by user agent
2021-04-30 10:58:32,061 INFO application.ApplicationImpl (ApplicationImpl.java:transition(304)) - Adding container_e15_1619677419274_1197_01_000066 to application application_1619677419274_1197
2021-04-30 10:58:32,063 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e15_1619677419274_1197_01_000066 transitioned from NEW to LOCALIZING
2021-04-30 10:58:32,063 INFO containermanager.AuxServices (AuxServices.java:handle(215)) - Got event CONTAINER_INIT for appId application_1619677419274_1197
2021-04-30 10:58:32,063 INFO yarn.YarnShuffleService (YarnShuffleService.java:initializeContainer(192)) - Initializing container container_e15_1619677419274_1197_01_000066
2021-04-30 10:58:32,063 INFO yarn.YarnShuffleService (YarnShuffleService.java:initializeContainer(289)) - Initializing container container_e15_1619677419274_1197_01_000066
2021-04-30 10:58:32,063 INFO containermanager.AuxServices (AuxServices.java:handle(215)) - Got event APPLICATION_INIT for appId application_1619677419274_1197
2021-04-30 10:58:32,063 INFO containermanager.AuxServices (AuxServices.java:handle(219)) - Got APPLICATION_INIT for service mapreduce_shuffle
2021-04-30 10:58:32,064 INFO mapred.ShuffleHandler (ShuffleHandler.java:addJobToken(681)) - Added token for job_1619677419274_1197
2021-04-30 10:58:32,065 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e15_1619677419274_1197_01_000066 transitioned from LOCALIZING to LOCALIZED
2021-04-30 10:58:32,801 INFO nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:removeOrTrackCompletedContainersFromContext(553)) - Removed completed containers from NM context: [container_e15_1619677419274_1197_01_000008, container_e15_1619677419274_1165_01_000388, container_e15_1619677419274_1197_01_000006, container_e15_1619677419274_1198_01_000005, container_e15_1619677419274_1200_01_000002]

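The ResourceManager log shows nothing abnormal.
The crashes do not happen at any fixed time of day. All three nodes are affected, and it usually happens when the cluster's resources are fully used.
Any help would be much appreciated.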
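Since the crashes seem to line up with the cluster being full, one thing that can be checked from the log itself is how much physical memory the NodeManager reports for its containers just before it goes down. Below is a minimal sketch, assuming the `ContainersMonitorImpl` line format shown in the log above; the function names (`parse_usage`, `latest_usage_by_container`, `total_used_mb`) are made up for this illustration and are not part of Hadoop.

```python
import re

# Matches the "Memory usage of ProcessTree ..." lines written by
# ContainersMonitorImpl, in the format seen in the log above.
USAGE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ .*"
    r"Memory usage of ProcessTree (?P<pid>\d+) for container-id "
    r"(?P<cid>container_\S+): "
    r"(?P<used>[\d.]+) (?P<used_unit>[KMGT]?B) of "
    r"(?P<limit>[\d.]+) (?P<limit_unit>[KMGT]?B) physical memory used"
)

# Conversion factors to MB (assumes binary units, 1 GB = 1024 MB).
UNIT_TO_MB = {
    "B": 1.0 / (1024 * 1024),
    "KB": 1.0 / 1024,
    "MB": 1.0,
    "GB": 1024.0,
    "TB": 1024.0 * 1024,
}


def parse_usage(line):
    """Return a dict for a memory-usage log line, or None for other lines."""
    m = USAGE_RE.search(line)
    if m is None:
        return None
    return {
        "timestamp": m.group("ts"),
        "container": m.group("cid"),
        "used_mb": float(m.group("used")) * UNIT_TO_MB[m.group("used_unit")],
        "limit_mb": float(m.group("limit")) * UNIT_TO_MB[m.group("limit_unit")],
    }


def latest_usage_by_container(lines):
    """Keep only the most recent memory report seen for each container."""
    latest = {}
    for line in lines:
        rec = parse_usage(line)
        if rec is not None:
            latest[rec["container"]] = rec
    return latest


def total_used_mb(latest):
    """Sum the last reported physical memory of all containers, in MB."""
    return sum(rec["used_mb"] for rec in latest.values())
```

Feeding it the last few minutes of the NodeManager log before a crash (for example `latest_usage_by_container(open(path))`, where `path` is wherever the log lives on the node) gives a rough total that can be compared with the node's physical memory. It only counts what the NodeManager reports for containers, so the NodeManager JVM itself, the DataNode, and other processes on the host are not included.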



