Help needed: Oracle 11g RAC node instance stops automatically
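Two servers running Red Hat 5.2 form an 11g RAC cluster. It ran normally for the first week, but recently the instance on node 1 (RAC1) has been going offline frequently, while all services on RAC2 stay up. When this happens, sometimes the RAC database as a whole can still be connected to and usage is not affected; other times the entire RAC database cannot be connected to at all (even though all services on the RAC2 node are normal at that moment). Today the failure occurred between 12 noon and 3 p.m.: the RAC1 instance dropped offline and the whole RAC database became unreachable. Rebooting the RAC1 server did not fix it. I then shut down the RAC1 server and rebooted the RAC2 server, after which the database could be reached through the single RAC2 node; starting RAC1 again afterwards brought everything back to normal. I have captured the logs below. Could an expert please take a look and tell me what the problem is? Many thanks!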
alert_rac1.log
Mon Nov 01 18:24:16 2010
System state dump is made for local instance
Errors in file /u01/app/oracle/diag/rdbms/rac/rac1/trace/rac1_diag_14583.trc:
ORA-29702: error occurred in Cluster Group Service operation
Trace dumping is performing id=[cdmp_20101101182416]
Instance terminated by LMON, pid = 14598
Tue Nov 02 15:02:50 2010
Some alert messages have been suppressed because they were produced too early
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 eth1 192.168.2.0 configured from OCR for use as a cluster interconnect
Interface type 1 eth0 26.27.17.0 configured from OCR for use as a public interface
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/product/11.1.0/db_1/dbs/arch
Using LOG_ARCHIVE_DEST_10 parameter default value as USE_DB_RECOVERY_FILE_DEST
WARNING: db_recovery_file_dest is same as db_create_file_dest
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 11.1.0.6.0.
Using parameter settings in server-side spfile +DATA/rac/spfilerac.ora
System parameters with non-default values:
processes = 150
spfile = "+DATA/rac/spfilerac.ora"
nls_language = "SIMPLIFIED CHINESE"
nls_territory = "CHINA"
memory_target = 13008M
control_files = "+DATA/rac/controlfile/current.283.731590265"
control_files = "+DATA/rac/controlfile/current.284.731590265"
db_block_size = 8192
compatible = "11.1.0.0.0"
cluster_database = TRUE
cluster_database_instances= 2
db_create_file_dest = "+DATA"
db_recovery_file_dest = "+DATA"
db_recovery_file_dest_size= 2G
thread = 1
undo_tablespace = "UNDOTBS1"
instance_number = 1
remote_login_passwordfile= "EXCLUSIVE"
db_domain = ""
dispatchers = "(PROTOCOL=TCP) (SERVICE=racXDB)"
local_listener = "(ADDRESS = (PROTOCOL = TCP)(HOST = 26.27.17.3)(PORT = 1521))"
remote_listener = "LISTENERS_RAC"
audit_file_dest = "/u01/app/oracle/admin/rac/adump"
audit_trail = "DB"
db_name = "rac"
open_cursors = 300
diagnostic_dest = "/u01/app/oracle"
Cluster communication is configured to use the following interface(s) for this instance
192.168.2.101
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Tue Nov 02 15:02:54 2010
PMON started with pid=2, OS id=12600
Tue Nov 02 15:02:54 2010
VKTM started with pid=3, OS id=12607 at elevated priority
VKTM running at (20)ms precision
Tue Nov 02 15:02:54 2010
DIAG started with pid=4, OS id=12611
Tue Nov 02 15:02:54 2010
DBRM started with pid=5, OS id=12613
Tue Nov 02 15:02:54 2010
PING started with pid=6, OS id=12615
Tue Nov 02 15:02:54 2010
PSP0 started with pid=7, OS id=12617
Tue Nov 02 15:02:54 2010
ACMS started with pid=8, OS id=12619
Tue Nov 02 15:02:54 2010
DSKM started with pid=9, OS id=12621
Tue Nov 02 15:02:54 2010
DIA0 started with pid=10, OS id=12623
Tue Nov 02 15:02:54 2010
LMON started with pid=9, OS id=12625
Tue Nov 02 15:02:54 2010
LMD0 started with pid=11, OS id=12627
Tue Nov 02 15:02:54 2010
LMS0 started with pid=12, OS id=12629 at elevated priority
Tue Nov 02 15:02:54 2010
LMS1 started with pid=13, OS id=12633 at elevated priority
Tue Nov 02 15:02:54 2010
RMS0 started with pid=14, OS id=12637
Tue Nov 02 15:02:54 2010
MMAN started with pid=15, OS id=12639
Tue Nov 02 15:02:54 2010
DBW0 started with pid=16, OS id=12641
Tue Nov 02 15:02:54 2010
DBW1 started with pid=17, OS id=12643
Tue Nov 02 15:02:54 2010
DBW2 started with pid=18, OS id=12645
Tue Nov 02 15:02:54 2010
LGWR started with pid=19, OS id=12647
Tue Nov 02 15:02:54 2010
CKPT started with pid=20, OS id=12649
Tue Nov 02 15:02:54 2010
SMON started with pid=21, OS id=12651
Tue Nov 02 15:02:54 2010
RECO started with pid=22, OS id=12653
Tue Nov 02 15:02:54 2010
RBAL started with pid=23, OS id=12655
Tue Nov 02 15:02:54 2010
ASMB started with pid=24, OS id=12657
Tue Nov 02 15:02:54 2010
MMON started with pid=25, OS id=12659
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Tue Nov 02 15:02:54 2010
MMNL started with pid=26, OS id=12661
starting up 1 shared server(s) ...
lmon registered with NM - instance id 1 (internal mem no 0)
alert_+asm1.log
Mon Nov 01 18:24:23 2010
Instance shutdown complete
Tue Nov 02 15:02:29 2010
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 eth1 192.168.2.0 configured from OCR for use as a cluster interconnect
Interface type 1 eth0 26.27.17.0 configured from OCR for use as a public interface
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/product/11.1.0/db_1/dbs/arch
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 11.1.0.6.0.
Using parameter settings in server-side pfile /u01/app/oracle/product/11.1.0/db_1/dbs/init+ASM1.ora
System parameters with non-default values:
large_pool_size = 12M
instance_type = "asm"
cluster_database = TRUE
instance_number = 1
asm_diskstring = "/dev/oracleasm/disks"
asm_diskgroups = "DATA"
diagnostic_dest = "/u01/app/oracle"
Cluster communication is configured to use the following interface(s) for this instance
192.168.2.101
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Tue Nov 02 15:02:33 2010
PMON started with pid=2, OS id=11952
Tue Nov 02 15:02:33 2010
VKTM started with pid=3, OS id=11954 at elevated priority
VKTM running at (20)ms precision
Tue Nov 02 15:02:33 2010
DIAG started with pid=4, OS id=11960
Tue Nov 02 15:02:33 2010
PING started with pid=5, OS id=11962
Tue Nov 02 15:02:33 2010
PSP0 started with pid=6, OS id=11966
Tue Nov 02 15:02:33 2010
DSKM started with pid=7, OS id=11968
Tue Nov 02 15:02:33 2010
DIA0 started with pid=8, OS id=11970
Tue Nov 02 15:02:33 2010
LMON started with pid=9, OS id=11972
Tue Nov 02 15:02:33 2010
LMD0 started with pid=7, OS id=11974
Tue Nov 02 15:02:33 2010
LMS0 started with pid=10, OS id=11978 at elevated priority
Tue Nov 02 15:02:33 2010
MMAN started with pid=11, OS id=11986
Tue Nov 02 15:02:34 2010
DBW0 started with pid=12, OS id=11988
Tue Nov 02 15:02:34 2010
LGWR started with pid=13, OS id=12003
Tue Nov 02 15:02:34 2010
CKPT started with pid=14, OS id=12012
Tue Nov 02 15:02:34 2010
SMON started with pid=15, OS id=12021
Tue Nov 02 15:02:34 2010
RBAL started with pid=16, OS id=12026
Tue Nov 02 15:02:34 2010
GMON started with pid=17, OS id=12030
lmon registered with NM - instance id 1 (internal mem no 0)
Reconfiguration started (old inc 0, new inc 32)
ASM instance
List of nodes:
0 1
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
* allocate domain 1, invalid = TRUE
* domain 0 valid = 1 according to instance 1
* domain 1 valid = 1 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
LMS 0: 0 GCS shadows traversed, 0 replayed
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Tue Nov 02 15:02:35 2010
LCK0 started with pid=18, OS id=12056
ORACLE_BASE from environment = /u01/app/oracle
Tue Nov 02 15:02:36 2010
SQL> ALTER DISKGROUP ALL MOUNT
NOTE: cache registered group DATA number=1 incarn=0x00b84487
NOTE:Loaded lib: /opt/oracle/extapi/32/asm/orcl/1/libasm.so
NOTE: Assigning number (1,2) to disk (/dev/oracleasm/disks/ORACLEASM3)
NOTE: Assigning number (1,1) to disk (/dev/oracleasm/disks/ORACLEASM2)
NOTE: Assigning number (1,0) to disk (/dev/oracleasm/disks/ORACLEASM1)
kfdp_query(): 2
kfdp_queryBg(): 2
NOTE: cache opening disk 0 of grp 1: DATA_0000 path:/dev/oracleasm/disks/ORACLEASM1
NOTE: F1X0 found on disk 0 fcn 0.95292
NOTE: cache opening disk 1 of grp 1: DATA_0001 path:/dev/oracleasm/disks/ORACLEASM2
NOTE: cache opening disk 2 of grp 1: DATA_0002 path:/dev/oracleasm/disks/ORACLEASM3
NOTE: cache mounting (not first) group 1/0x00B84487 (DATA)
kjbdomatt send to node 1
NOTE: attached to recovery domain 1
NOTE: opening chunk 1 at fcn 0.162241 ABA
NOTE: seq=48 blk=6201
NOTE: cache mounting group 1/0x00B84487 (DATA) succeeded
kfdp_query(): 3
kfdp_queryBg(): 3
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 1
SUCCESS: diskgroup DATA was mounted
SUCCESS: ALTER DISKGROUP ALL MOUNT
Tue Nov 02 15:02:50 2010
Starting background process ASMB
Tue Nov 02 15:02:50 2010
ASMB started with pid=20, OS id=12569
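
To line up the RAC1 termination with what the ASM instance and RAC2 logged at the same moment, it may help to pull only the timestamped key events out of each alert log. Below is a minimal sketch, assuming the 11g alert log layout shown above (a timestamp line followed by message lines). The function name `find_events` and the keyword list are illustrative choices for this post, not part of any Oracle tool.

```python
import re
from datetime import datetime

# Timestamp lines in these alert logs look like "Mon Nov 01 18:24:16 2010".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} \d{4}$")

# Illustrative keywords only (an assumption); extend as needed.
KEYWORDS = (
    "ORA-",
    "Instance terminated",
    "Instance shutdown",
    "Starting ORACLE instance",
)


def find_events(lines):
    """Return (timestamp, message) pairs for lines containing a keyword.

    Each message is paired with the most recent timestamp line above it,
    or None if no timestamp has been seen yet.
    """
    current = None
    events = []
    for raw in lines:
        line = raw.strip()
        if TS_RE.match(line):
            # Assumes English day/month names, as in the logs above.
            current = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
            continue
        if any(keyword in line for keyword in KEYWORDS):
            events.append((current, line))
    return events


# A short excerpt from alert_rac1.log above, used as sample input.
sample = """Mon Nov 01 18:24:16 2010
System state dump is made for local instance
Errors in file /u01/app/oracle/diag/rdbms/rac/rac1/trace/rac1_diag_14583.trc:
ORA-29702: error occurred in Cluster Group Service operation
Instance terminated by LMON, pid = 14598
Tue Nov 02 15:02:50 2010
Starting ORACLE instance (normal)
""".splitlines()

for ts, msg in find_events(sample):
    print(ts, msg)
```

Running the same function over alert_rac1.log, alert_+asm1.log and the RAC2 alert log and sorting the combined results by timestamp gives one timeline around 18:24 on Nov 01, which is easier to compare than the raw files.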