Help: Oracle server hangs frequently

ligen119 2014-07-03 09:49:19

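We deployed a business application server for a customer. It runs Windows Server 2008 R2, with the Oracle 11g 32-bit client and 64-bit server installed, plus some other applications. The business application is served by Tomcat 6, a separate reporting system is also an Apache-based application, and both share the same JDK 6.
The server hung once right after deployment: it still answered ping, but remote connections failed; the web home page loaded but could not reach the database; and even with a monitor, mouse and keyboard attached locally we could not log in to the desktop. A forced power-off and reboot restored normal operation.
The Windows system log and application log showed no errors, but one application's log contained entries about memory overload, so I suspect the database exhausted memory and hung the machine. Could the experts please help analyze this?
Below is the alert log from the failure: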

Errors in file d:\app\administrator\diag\rdbms\uidb\uidb\trace\uidb_ckpt_2116.trc:

ORA-00206: error in writing (block 3, # blocks 1) of control file

ORA-00202: control file: 'D:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\UIDB\CONTROL02.CTL'

ORA-27072: File I/O error

OSD-04008: WriteFile() failure, unable to write to file

O/S-Error: (OS 1453) Insufficient quota to complete the requested service.

ORA-00206: error in writing (block 3, # blocks 1) of control file

ORA-00202: control file: 'D:\APP\ADMINISTRATOR\ORADATA\UIDB\CONTROL01.CTL'

ORA-27072: File I/O error

OSD-04008: WriteFile() failure, unable to write to file

O/S-Error: (OS 1453) Insufficient quota to complete the requested service.

Errors in file d:\app\administrator\diag\rdbms\uidb\uidb\trace\uidb_ckpt_2116.trc:

ORA-00221: error on write to control file

ORA-00206: error in writing (block 3, # blocks 1) of control file

ORA-00202: control file: 'D:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\UIDB\CONTROL02.CTL'

ORA-27072: File I/O error

OSD-04008: WriteFile() failure, unable to write to file

O/S-Error: (OS 1453) Insufficient quota to complete the requested service.

ORA-00206: error in writing (block 3, # blocks 1) of control file

ORA-00202: control file: 'D:\APP\ADMINISTRATOR\ORADATA\UIDB\CONTROL01.CTL'

ORA-27072: File I/O error

OSD-04008: WriteFile() failure, unable to write to file

O/S-Error: (OS 1453) Insufficient quota to complete the requested service.

CKPT (ospid: 2116): terminating the instance due to error 221

Tue Jul 01 19:42:47 2014

opiodr aborting process unknown ospid (973424) as a result of ORA-1092

Tue Jul 01 19:42:47 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:48 2014

opiodr aborting process unknown ospid (2628) as a result of ORA-1092

Tue Jul 01 19:42:48 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:48 2014

opiodr aborting process unknown ospid (3092) as a result of ORA-1092

Tue Jul 01 19:42:48 2014

opiodr aborting process unknown ospid (5484) as a result of ORA-1092

Tue Jul 01 19:42:48 2014

opiodr aborting process unknown ospid (3700) as a result of ORA-1092

Tue Jul 01 19:42:48 2014

opiodr aborting process unknown ospid (3796) as a result of ORA-1092

Tue Jul 01 19:42:48 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:48 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:48 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:48 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:48 2014

opiodr aborting process unknown ospid (156) as a result of ORA-1092

Tue Jul 01 19:42:49 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:50 2014

opiodr aborting process unknown ospid (3276) as a result of ORA-1092

Tue Jul 01 19:42:50 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:50 2014

opiodr aborting process unknown ospid (1032) as a result of ORA-1092

Tue Jul 01 19:42:50 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:50 2014

opiodr aborting process unknown ospid (3864) as a result of ORA-1092

Tue Jul 01 19:42:50 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:50 2014

opiodr aborting process unknown ospid (3748) as a result of ORA-1092

Tue Jul 01 19:42:50 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:51 2014

opiodr aborting process unknown ospid (973848) as a result of ORA-1092

Tue Jul 01 19:42:51 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:51 2014

opiodr aborting process unknown ospid (5508) as a result of ORA-1092

Tue Jul 01 19:42:51 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:51 2014

opiodr aborting process unknown ospid (3148) as a result of ORA-1092

Tue Jul 01 19:42:51 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:51 2014

opiodr aborting process unknown ospid (3156) as a result of ORA-1092

Tue Jul 01 19:42:51 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:53 2014

opiodr aborting process unknown ospid (5488) as a result of ORA-1092

Tue Jul 01 19:42:53 2014

opiodr aborting process unknown ospid (2348) as a result of ORA-1092

Tue Jul 01 19:42:53 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:42:53 2014

ORA-1092 : opitsk aborting process

Tue Jul 01 19:43:02 2014

Instance terminated by CKPT, pid = 2116

At this point the server hung and the database could no longer be reached.
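
With an alert log this repetitive, a quick tally of the error codes can make the chain easier to see (here, OS error 1453 under ORA-27072/ORA-00206, followed by a burst of ORA-1092 as the instance terminates). The following is only a minimal sketch, assuming the alert log has already been read into a string; the function name `summarize_codes` and the normalized key format are illustrative choices, not part of any Oracle tooling.

```python
import re
from collections import Counter

# Matches Oracle-style codes (ORA-00206, OSD-04008, TNS-12638) and the
# Windows "(OS 1453)" form that appears on O/S-Error lines.
CODE_RE = re.compile(r"\b(ORA|OSD|TNS)-(\d+)|\(OS (\d+)\)")


def summarize_codes(log_text):
    """Count each error code found in the given alert-log text.

    Oracle codes are zero-padded to five digits so that "ORA-1092" and
    "ORA-01092" are counted under the same key.
    """
    counts = Counter()
    for line in log_text.splitlines():
        for match in CODE_RE.finditer(line):
            if match.group(3):
                counts["OS-%d" % int(match.group(3))] += 1
            else:
                counts["%s-%05d" % (match.group(1), int(match.group(2)))] += 1
    return counts
```

Calling `summarize_codes(text).most_common()` on the excerpt above would list the codes by frequency; the underlying OS error, not the most frequent code, is usually the one worth chasing first.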

12 replies

ligen119 2016-02-01

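Quoting second11's reply (#11):

Has this been solved, OP?

Yes, it is solved. The server's C: drive was too small, and when virtual memory tried to expand there was not enough disk space, which hung the server. It was not Oracle or Tomcat that died.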
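
For anyone hitting the same symptom, a free-space check on the system drive is cheap to run before digging into the database. This is a minimal sketch using only the Python standard library; the 8 GiB threshold and the name `has_room` are hypothetical and should be tuned to how far the page file is allowed to grow on the server in question.

```python
import shutil

# Hypothetical threshold: headroom for the page file to grow. Tune per server.
MIN_FREE_BYTES = 8 * 1024**3


def has_room(path, min_free_bytes=MIN_FREE_BYTES):
    """Return (free_bytes, ok) for the volume that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.free, usage.free >= min_free_bytes
```

On the server described here, `has_room("C:\\")` would be the relevant call, since the page file lived on C: while the database files were on D:.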

second11 2016-01-25

Has this been solved, OP?

Yakecanz 2014-07-06

The control file is probably being held by another process.

huangdh12 2014-07-05

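Also, regarding the OS "insufficient quota" error, other people online have reported similar situations. Try searching for "配额不足，无法处理此命令" ("Not enough quota is available to process this command") and see whether anything matches.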

huangdh12 2014-07-05

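Are the Oracle service and Tomcat both running on the Server 2008 machine? Is the server 64-bit? How much memory does it have? What are the SGA and PGA settings in your Oracle parameters? Also, if the problem happens again, first check Task Manager to see whether memory or CPU usage looks abnormal.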

惜分飞 2014-07-04

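The description is rather vague; this probably needs deeper analysis inside the database. Add me on QQ (107644445) and I will take a look for you.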

lu仙深 2014-07-04

Try scheduling an automatic restart of the Oracle service at a fixed time every day and see whether that eases the problem.

惜分飞 2014-07-04

1. The D: drive is full. 2. A database needs someone to maintain it; it cannot be left unattended.

卖水果的net 2014-07-04

D:\APP\ADMINISTRATOR\ORADATA\UIDB\CONTROL01.CTL: either the fast recovery area is full or the D: drive is full. Clean it up manually.

ligen119 2014-07-04

Continuing the log from my previous post:

Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
Using parameter settings in server-side spfile D:\APP\ADMINISTRATOR\PRODUCT\11.2.0\DBHOME_1\DATABASE\SPFILEUIDB.ORA
System parameters with non-default values:
  processes                = 150
  memory_target            = 1G
  memory_max_target        = 2G
  control_files            = "D:\APP\ADMINISTRATOR\ORADATA\UIDB\CONTROL01.CTL"
  control_files            = "D:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\UIDB\CONTROL02.CTL"
  db_block_size            = 8192
  compatible               = "11.2.0.0.0"
  db_recovery_file_dest    = "D:\app\Administrator\flash_recovery_area"
  db_recovery_file_dest_size= 3912M
  undo_tablespace          = "UNDOTBS1"
  remote_login_passwordfile= "EXCLUSIVE"
  db_domain                = ""
  dispatchers              = "(PROTOCOL=TCP) (SERVICE=uidbXDB)"
  audit_file_dest          = "D:\APP\ADMINISTRATOR\ADMIN\UIDB\ADUMP"
  audit_trail              = "DB"
  db_name                  = "uidb"
  open_cursors             = 4000
  diagnostic_dest          = "D:\APP\ADMINISTRATOR"
Fri Jul 04 03:19:02 2014
PMON started with pid=2, OS id=2412 
Fri Jul 04 03:19:02 2014
VKTM started with pid=3, OS id=2416 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
Fri Jul 04 03:19:02 2014
GEN0 started with pid=4, OS id=2420 
Fri Jul 04 03:19:02 2014
DIAG started with pid=5, OS id=2424 
Fri Jul 04 03:19:02 2014
DBRM started with pid=6, OS id=2428 
Fri Jul 04 03:19:02 2014
PSP0 started with pid=7, OS id=2432 
Fri Jul 04 03:19:02 2014
DIA0 started with pid=8, OS id=2436 
Fri Jul 04 03:19:02 2014
MMAN started with pid=9, OS id=2440 
Fri Jul 04 03:19:02 2014
DBW0 started with pid=10, OS id=2444 
Fri Jul 04 03:19:02 2014
LGWR started with pid=11, OS id=2448 
Fri Jul 04 03:19:02 2014
CKPT started with pid=12, OS id=2452 
Fri Jul 04 03:19:02 2014
SMON started with pid=13, OS id=2456 
Fri Jul 04 03:19:02 2014
RECO started with pid=14, OS id=2460 
Fri Jul 04 03:19:02 2014
MMON started with pid=15, OS id=2464 
Fri Jul 04 03:19:02 2014
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Fri Jul 04 03:19:02 2014
MMNL started with pid=16, OS id=2468 
starting up 1 shared server(s) ...
ORACLE_BASE from environment = D:\app\Administrator
Fri Jul 04 03:19:05 2014
alter database mount exclusive
Successful mount of redo thread 1, with mount id 686344361
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Fri Jul 04 03:19:18 2014
Beginning crash recovery of 1 threads
 parallel recovery started with 7 processes
Started redo scan
Completed redo scan
 read 583 KB redo, 90 data blocks need recovery
Started redo application at
 Thread 1: logseq 2893, block 37660
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2893 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\UIDB\REDO01.LOG
Completed redo application of 0.30MB
Completed crash recovery at
 Thread 1: logseq 2893, block 38826, scn 159716486
 90 data blocks read, 90 data blocks written, 583 redo k-bytes read
Fri Jul 04 03:19:20 2014
Thread 1 advanced to log sequence 2894 (thread open)
Thread 1 opened at log sequence 2894
  Current log# 2 seq# 2894 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\UIDB\REDO02.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Jul 04 03:19:20 2014
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 2.
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
Fri Jul 04 03:19:28 2014
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Fri Jul 04 03:19:34 2014
QMNC started with pid=27, OS id=2960 
Fri Jul 04 03:19:39 2014
Completed: alter database open
Fri Jul 04 03:19:43 2014
Starting background process CJQ0
Fri Jul 04 03:19:43 2014
CJQ0 started with pid=34, OS id=3312 
Fri Jul 04 03:19:43 2014
db_recovery_file_dest_size of 3912 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Fri Jul 04 03:26:05 2014
Starting background process SMCO
Fri Jul 04 03:26:05 2014
SMCO started with pid=22, OS id=5908
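
Since memory settings came up in this thread, it can help to pull the size-valued parameters out of the startup banner and compare them with the machine's physical memory and page file. The sketch below is only an illustration under stated assumptions: it expects the "System parameters with non-default values" lines in the `name = value` form shown above, treats K/M/G as binary multiples, and ignores parameters without a unit suffix. The name `parse_size_params` is hypothetical.

```python
import re

# Binary multipliers for the unit suffixes seen in the banner (e.g. 1G, 3912M).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

# A parameter line such as "  memory_target            = 1G".
PARAM_RE = re.compile(r"^\s*(\w+)\s*=\s*(\d+)([KMG])\s*$")


def parse_size_params(banner_text):
    """Return {parameter_name: size_in_bytes} for size-suffixed parameters."""
    sizes = {}
    for line in banner_text.splitlines():
        match = PARAM_RE.match(line)
        if match:
            name, number, unit = match.groups()
            sizes[name] = int(number) * UNITS[unit]
    return sizes
```

Applied to the banner above, this would pick up `memory_target`, `memory_max_target` and `db_recovery_file_dest_size`, and skip plain numbers such as `processes` and quoted values such as `control_files`.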

ligen119 2014-07-04

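The D: drive still has more than 40 GB free, so it should not be a full-disk problem! It happened again this morning. This time I could log in to the server locally, but every application on it had stopped and the database was unreachable as well; a reboot restored it.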

Fri Jul 04 03:15:48 2014


***********************************************************************

Fatal NI connect error 12638, connecting to:
 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

  VERSION INFORMATION:
	TNS for 64-bit Windows: Version 11.2.0.1.0 - Production
	Oracle Bequeath NT Protocol Adapter for 64-bit Windows: Version 11.2.0.1.0 - Production
  Time: 04-JUL-2014 03:15:48
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12638
    
TNS-12638: Credential retrieval failed
    ns secondary err code: 0
    nt main err code: 0
    nt secondary err code: 0
    nt OS err code: 0
Fri Jul 04 03:18:51 2014
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on. 
IMODE=BR
ILAT =27
LICENSE_MAX_USERS = 0
SYS auditing is disabled

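This is the new log generated after yesterday's restart, when I cleared the alert log. Experts, please take a look: why did the database go down again?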