Help: Oracle 11g RAC node instance stops automatically

largerock2003 2010-11-02 07:13:56

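Two servers running Red Hat 5.2 form an 11g RAC. It worked normally for the first week, but recently the instance on node 1 (RAC1) has frequently gone offline, while all services on RAC2 have stayed normal. When this happens, sometimes the RAC database as a whole can still be connected to and usage is unaffected; at other times the whole RAC database cannot be connected to at all (even though every service on the RAC2 node is normal at that moment). Today the failure occurred between 12 noon and 3 pm: the RAC1 instance dropped and the whole RAC database became unreachable. Rebooting the RAC1 server did not fix it. After I shut down the RAC1 server and rebooted the RAC2 server, the database could be reached through the single RAC2 node, and once RAC1 was started again everything was normal. I have captured some logs below. I would be very grateful if an expert could take a look and tell me what the problem is. Many thanks!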
alert_rac1.log
Mon Nov 01 18:24:16 2010
System state dump is made for local instance
Errors in file /u01/app/oracle/diag/rdbms/rac/rac1/trace/rac1_diag_14583.trc:
ORA-29702: ???????????
Trace dumping is performing id=[cdmp_20101101182416]
Instance terminated by LMON, pid = 14598
Tue Nov 02 15:02:50 2010
Some alert messages have been suppressed because they were produced too early
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 eth1 192.168.2.0 configured from OCR for use as a cluster interconnect
Interface type 1 eth0 26.27.17.0 configured from OCR for use as a public interface
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/product/11.1.0/db_1/dbs/arch
Using LOG_ARCHIVE_DEST_10 parameter default value as USE_DB_RECOVERY_FILE_DEST
WARNING: db_recovery_file_dest is same as db_create_file_dest
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 11.1.0.6.0.
Using parameter settings in server-side spfile +DATA/rac/spfilerac.ora
System parameters with non-default values:
processes = 150
spfile = "+DATA/rac/spfilerac.ora"
nls_language = "SIMPLIFIED CHINESE"
nls_territory = "CHINA"
memory_target = 13008M
control_files = "+DATA/rac/controlfile/current.283.731590265"
control_files = "+DATA/rac/controlfile/current.284.731590265"
db_block_size = 8192
compatible = "11.1.0.0.0"
cluster_database = TRUE
cluster_database_instances= 2
db_create_file_dest = "+DATA"
db_recovery_file_dest = "+DATA"
db_recovery_file_dest_size= 2G
thread = 1
undo_tablespace = "UNDOTBS1"
instance_number = 1
remote_login_passwordfile= "EXCLUSIVE"
db_domain = ""
dispatchers = "(PROTOCOL=TCP) (SERVICE=racXDB)"
local_listener = "(ADDRESS = (PROTOCOL = TCP)(HOST = 26.27.17.3)(PORT = 1521))"
remote_listener = "LISTENERS_RAC"
audit_file_dest = "/u01/app/oracle/admin/rac/adump"
audit_trail = "DB"
db_name = "rac"
open_cursors = 300
diagnostic_dest = "/u01/app/oracle"
Cluster communication is configured to use the following interface(s) for this instance
192.168.2.101
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Tue Nov 02 15:02:54 2010
PMON started with pid=2, OS id=12600
Tue Nov 02 15:02:54 2010
VKTM started with pid=3, OS id=12607 at elevated priority
VKTM running at (20)ms precision
Tue Nov 02 15:02:54 2010
DIAG started with pid=4, OS id=12611
Tue Nov 02 15:02:54 2010
DBRM started with pid=5, OS id=12613
Tue Nov 02 15:02:54 2010
PING started with pid=6, OS id=12615
Tue Nov 02 15:02:54 2010
PSP0 started with pid=7, OS id=12617
Tue Nov 02 15:02:54 2010
ACMS started with pid=8, OS id=12619
Tue Nov 02 15:02:54 2010
DSKM started with pid=9, OS id=12621
Tue Nov 02 15:02:54 2010
DIA0 started with pid=10, OS id=12623
Tue Nov 02 15:02:54 2010
LMON started with pid=9, OS id=12625
Tue Nov 02 15:02:54 2010
LMD0 started with pid=11, OS id=12627
Tue Nov 02 15:02:54 2010
LMS0 started with pid=12, OS id=12629 at elevated priority
Tue Nov 02 15:02:54 2010
LMS1 started with pid=13, OS id=12633 at elevated priority
Tue Nov 02 15:02:54 2010
RMS0 started with pid=14, OS id=12637
Tue Nov 02 15:02:54 2010
MMAN started with pid=15, OS id=12639
Tue Nov 02 15:02:54 2010
DBW0 started with pid=16, OS id=12641
Tue Nov 02 15:02:54 2010
DBW1 started with pid=17, OS id=12643
Tue Nov 02 15:02:54 2010
DBW2 started with pid=18, OS id=12645
Tue Nov 02 15:02:54 2010
LGWR started with pid=19, OS id=12647
Tue Nov 02 15:02:54 2010
CKPT started with pid=20, OS id=12649
Tue Nov 02 15:02:54 2010
SMON started with pid=21, OS id=12651
Tue Nov 02 15:02:54 2010
RECO started with pid=22, OS id=12653
Tue Nov 02 15:02:54 2010
RBAL started with pid=23, OS id=12655
Tue Nov 02 15:02:54 2010
ASMB started with pid=24, OS id=12657
Tue Nov 02 15:02:54 2010
MMON started with pid=25, OS id=12659
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Tue Nov 02 15:02:54 2010
MMNL started with pid=26, OS id=12661
starting up 1 shared server(s) ...
lmon registered with NM - instance id 1 (internal mem no 0)
alert_+asm1.log
Mon Nov 01 18:24:23 2010
Instance shutdown complete
Tue Nov 02 15:02:29 2010
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 eth1 192.168.2.0 configured from OCR for use as a cluster interconnect
Interface type 1 eth0 26.27.17.0 configured from OCR for use as a public interface
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/product/11.1.0/db_1/dbs/arch
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 11.1.0.6.0.
Using parameter settings in server-side pfile /u01/app/oracle/product/11.1.0/db_1/dbs/init+ASM1.ora
System parameters with non-default values:
large_pool_size = 12M
instance_type = "asm"
cluster_database = TRUE
instance_number = 1
asm_diskstring = "/dev/oracleasm/disks"
asm_diskgroups = "DATA"
diagnostic_dest = "/u01/app/oracle"
Cluster communication is configured to use the following interface(s) for this instance
192.168.2.101
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Tue Nov 02 15:02:33 2010
PMON started with pid=2, OS id=11952
Tue Nov 02 15:02:33 2010
VKTM started with pid=3, OS id=11954 at elevated priority
VKTM running at (20)ms precision
Tue Nov 02 15:02:33 2010
DIAG started with pid=4, OS id=11960
Tue Nov 02 15:02:33 2010
PING started with pid=5, OS id=11962
Tue Nov 02 15:02:33 2010
PSP0 started with pid=6, OS id=11966
Tue Nov 02 15:02:33 2010
DSKM started with pid=7, OS id=11968
Tue Nov 02 15:02:33 2010
DIA0 started with pid=8, OS id=11970
Tue Nov 02 15:02:33 2010
LMON started with pid=9, OS id=11972
Tue Nov 02 15:02:33 2010
LMD0 started with pid=7, OS id=11974
Tue Nov 02 15:02:33 2010
LMS0 started with pid=10, OS id=11978 at elevated priority
Tue Nov 02 15:02:33 2010
MMAN started with pid=11, OS id=11986
Tue Nov 02 15:02:34 2010
DBW0 started with pid=12, OS id=11988
Tue Nov 02 15:02:34 2010
LGWR started with pid=13, OS id=12003
Tue Nov 02 15:02:34 2010
CKPT started with pid=14, OS id=12012
Tue Nov 02 15:02:34 2010
SMON started with pid=15, OS id=12021
Tue Nov 02 15:02:34 2010
RBAL started with pid=16, OS id=12026
Tue Nov 02 15:02:34 2010
GMON started with pid=17, OS id=12030
lmon registered with NM - instance id 1 (internal mem no 0)
Reconfiguration started (old inc 0, new inc 32)
ASM instance
List of nodes:
0 1
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
* allocate domain 1, invalid = TRUE
* domain 0 valid = 1 according to instance 1
* domain 1 valid = 1 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
LMS 0: 0 GCS shadows traversed, 0 replayed
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Tue Nov 02 15:02:35 2010
LCK0 started with pid=18, OS id=12056
ORACLE_BASE from environment = /u01/app/oracle
Tue Nov 02 15:02:36 2010
SQL> ALTER DISKGROUP ALL MOUNT
NOTE: cache registered group DATA number=1 incarn=0x00b84487
NOTE:Loaded lib: /opt/oracle/extapi/32/asm/orcl/1/libasm.so
NOTE: Assigning number (1,2) to disk (/dev/oracleasm/disks/ORACLEASM3)
NOTE: Assigning number (1,1) to disk (/dev/oracleasm/disks/ORACLEASM2)
NOTE: Assigning number (1,0) to disk (/dev/oracleasm/disks/ORACLEASM1)
kfdp_query(): 2
kfdp_queryBg(): 2
NOTE: cache opening disk 0 of grp 1: DATA_0000 path:/dev/oracleasm/disks/ORACLEASM1
NOTE: F1X0 found on disk 0 fcn 0.95292
NOTE: cache opening disk 1 of grp 1: DATA_0001 path:/dev/oracleasm/disks/ORACLEASM2
NOTE: cache opening disk 2 of grp 1: DATA_0002 path:/dev/oracleasm/disks/ORACLEASM3
NOTE: cache mounting (not first) group 1/0x00B84487 (DATA)
kjbdomatt send to node 1
NOTE: attached to recovery domain 1
NOTE: opening chunk 1 at fcn 0.162241 ABA
NOTE: seq=48 blk=6201
NOTE: cache mounting group 1/0x00B84487 (DATA) succeeded
kfdp_query(): 3
kfdp_queryBg(): 3
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 1
SUCCESS: diskgroup DATA was mounted
SUCCESS: ALTER DISKGROUP ALL MOUNT
Tue Nov 02 15:02:50 2010
Starting background process ASMB
Tue Nov 02 15:02:50 2010
ASMB started with pid=20, OS id=12569
largerock2003 2010-11-04
I also tried starting the resource manually. This is the result:
[root@rac1 bin]# ./crs_start ora.rac.rac1.inst
Attempting to start `ora.rac.rac1.inst` on member `rac1`
Start of `ora.rac.rac1.inst` on member `rac1` failed.
rac2 : CRS-1018: Resource ora.rac1.ASM1.asm (application) is already running on rac1

/// After stopping ora.rac1.ASM1.asm and starting the instance again, the following appeared:

[root@rac1 bin]# ./crs_start ora.rac.rac1.inst
Attempting to start `ora.rac1.ASM1.asm` on member `rac1`
Start of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
Attempting to start `ora.rac.rac1.inst` on member `rac1`
Start of `ora.rac.rac1.inst` on member `rac1` failed.
Attempting to stop `ora.rac1.ASM1.asm` on member `rac1`
Stop of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
rac2 : CRS-1019: Resource ora.rac1.ASM1.asm (application) cannot run on rac2

CRS-0215: Could not start resource 'ora.rac.rac1.inst'.
largerock2003 2010-11-04
The crsd log (the previous reply ran out of space):
2010-11-02 15:02:22.592: [ CLSVER][3086612880] Active Version from OCR:11.1.0.6.0
2010-11-02 15:02:22.592: [ CLSVER][3086612880] Active Version and Software Version are same
2010-11-02 15:02:22.592: [ CRSMAIN][3086612880] Initializing OCR
2010-11-02 15:02:22.600: [ OCRRAW][3086612880]proprioo: for disk 0 (/u01/raw/asmdisk2), id match (1), my id set (1619400550,1028247821) total id sets (1), 1st set (1619400550,1028247821), 2nd set (0,0) my votes (2), total votes (2)
2010-11-02 15:02:22.636: [ CRSD][3086612880] ENV Logging level for Module: allcomp 0
2010-11-02 15:02:22.637: [ CRSD][3086612880] ENV Logging level for Module: default 0
2010-11-02 15:02:22.638: [ CRSD][3086612880] ENV Logging level for Module: OCRRAW 0
2010-11-02 15:02:22.639: [ CRSD][3086612880] ENV Logging level for Module: OCROSD 0
2010-11-02 15:02:22.640: [ CRSD][3086612880] ENV Logging level for Module: OCRCAC 0
2010-11-02 15:02:22.641: [ CRSD][3086612880] ENV Logging level for Module: COMMCRS 0
2010-11-02 15:02:22.642: [ CRSD][3086612880] ENV Logging level for Module: COMMNS 0
2010-11-02 15:02:22.643: [ CRSD][3086612880] ENV Logging level for Module: CRSUI 0
2010-11-02 15:02:22.643: [ CRSD][3086612880] ENV Logging level for Module: CRSCOMM 0
2010-11-02 15:02:22.644: [ CRSD][3086612880] ENV Logging level for Module: CRSRTI 0
2010-11-02 15:02:22.645: [ CRSD][3086612880] ENV Logging level for Module: CRSMAIN 0
2010-11-02 15:02:22.646: [ CRSD][3086612880] ENV Logging level for Module: CRSPLACE 0
2010-11-02 15:02:22.647: [ CRSD][3086612880] ENV Logging level for Module: CRSAPP 0
2010-11-02 15:02:22.648: [ CRSD][3086612880] ENV Logging level for Module: CRSRES 0
2010-11-02 15:02:22.648: [ CRSD][3086612880] ENV Logging level for Module: CRSOCR 0
2010-11-02 15:02:22.649: [ CRSD][3086612880] ENV Logging level for Module: CRSTIMER 0
2010-11-02 15:02:22.650: [ CRSD][3086612880] ENV Logging level for Module: CRSEVT 0
2010-11-02 15:02:22.651: [ CRSD][3086612880] ENV Logging level for Module: CRSD 0
2010-11-02 15:02:22.652: [ CRSD][3086612880] ENV Logging level for Module: CLUCLS 0
2010-11-02 15:02:22.653: [ CRSD][3086612880] ENV Logging level for Module: CLSVER 0
2010-11-02 15:02:22.653: [ CRSD][3086612880] ENV Logging level for Module: CSSCLNT 0
2010-11-02 15:02:22.654: [ CRSD][3086612880] ENV Logging level for Module: OCRAPI 0
2010-11-02 15:02:22.655: [ CRSD][3086612880] ENV Logging level for Module: OCRUTL 0
2010-11-02 15:02:22.656: [ CRSD][3086612880] ENV Logging level for Module: OCRMSG 0
2010-11-02 15:02:22.657: [ CRSD][3086612880] ENV Logging level for Module: OCRCLI 0
2010-11-02 15:02:22.665: [ CRSD][3086612880] ENV Logging level for Module: OCRSRV 0
2010-11-02 15:02:22.666: [ CRSD][3086612880] ENV Logging level for Module: OCRMAS 0
2010-11-02 15:02:22.666: [ CRSMAIN][3086612880] Filename is /u01/app/oracle/product/11.1.0/crs_1/crs/init/rac1.pid
[ clsdmt][2862574480]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_CRSD))
2010-11-02 15:02:22.688: [ CRSMAIN][3086612880] Using Authorizer location: /u01/app/oracle/product/11.1.0/crs_1/crs/auth/
2010-11-02 15:02:22.705: [ CRSMAIN][3086612880] Initializing RTI
2010-11-02 15:02:22.717: [ CRSMAIN][3086612880] Initializing EVMMgr
2010-11-02 15:02:22.717: [CRSTIMER][2841594768] Timer Thread Starting.
2010-11-02 15:02:22.915: [ COMMCRS][2831104912]clsc_connect: (0xb5624560) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))

2010-11-02 15:02:23.365: [ COMMCRS][2831104912]clsc_connect: (0xb56245f8) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))

2010-11-02 15:02:24.442: [ CRSMAIN][3086612880] CRSD locked during state recovery, please wait.
2010-11-02 15:02:24.494: [ CRSMAIN][3086612880] CRSD recovered, unlocked.
2010-11-02 15:02:24.495: [ CRSMAIN][3086612880] QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
2010-11-02 15:02:24.499: [ CRSMAIN][3086612880] CRSD UI socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_UI_SOCKET))
2010-11-02 15:02:24.501: [ CRSMAIN][3086612880] E2E socket on: (ADDRESS=(PROTOCOL=tcp)(HOST=rac1-priv)(PORT=49896))
2010-11-02 15:02:24.501: [ CRSMAIN][3086612880] Starting Threads
2010-11-02 15:02:24.501: [ CRSMAIN][3086612880] CRS Daemon Started.
2010-11-02 15:02:24.501: [ CRSMAIN][116947856] Starting runCommandServer for (UI = 1, E2E = 0). 0
2010-11-02 15:02:24.501: [ CRSMAIN][119049104] Starting runCommandServer for (UI = 1, E2E = 0). 1
2010-11-02 15:02:24.520: [ CRSRES][3086612880] startup = 1
2010-11-02 15:02:24.532: [ CRSRES][3086612880] startup = 1
2010-11-02 15:02:24.546: [ CRSRES][3086612880] startup = 1
2010-11-02 15:02:24.558: [ CRSRES][3086612880] startup = 1
2010-11-02 15:02:24.574: [ CRSRES][3086612880] startup = 1
2010-11-02 15:02:24.622: [ CRSRES][2776554384] StopResource: setting CLI values
2010-11-02 15:02:24.634: [ CRSRES][2776554384] Attempting to stop `ora.rac1.vip` on member `rac2`
2010-11-02 15:02:24.646: [ CRSRES][2774453136] startRunnable: setting CLI values
2010-11-02 15:02:24.649: [ CRSRES][2774453136] Attempting to start `ora.rac1.ASM1.asm` on member `rac1`
2010-11-02 15:02:24.918: [ CRSRES][2776554384] Stop of `ora.rac1.vip` on member `rac2` succeeded.
2010-11-02 15:02:24.924: [ CRSRES][2776554384] startRunnable: setting CLI values
2010-11-02 15:02:24.924: [ CRSRES][2776554384] Attempting to start `ora.rac1.vip` on member `rac1`
2010-11-02 15:02:29.322: [ CRSRES][2776554384] Start of `ora.rac1.vip` on member `rac1` succeeded.
2010-11-02 15:02:29.360: [ CRSRES][2776554384] startRunnable: setting CLI values
2010-11-02 15:02:29.366: [ CRSRES][2776554384] Attempting to start `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
2010-11-02 15:02:35.564: [ CRSRES][2776554384] Start of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.
2010-11-02 15:02:36.546: [ CRSRES][2740882320] CRS-1002: Resource 'ora.rac1.LISTENER_RAC1.lsnr' is already running on member 'rac1'

2010-11-02 15:02:47.166: [ CRSRES][2740882320] startRunnable: setting CLI values
2010-11-02 15:02:47.172: [ CRSRES][2740882320] Attempting to start `ora.rac1.ons` on member `rac1`
2010-11-02 15:02:48.610: [ CRSRES][2740882320] Start of `ora.rac1.ons` on member `rac1` succeeded.
2010-11-02 15:02:48.636: [ CRSRES][2774453136] Start of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
2010-11-02 15:02:48.657: [ CRSRES][2774453136] startRunnable: setting CLI values
2010-11-02 15:02:48.660: [ CRSRES][2774453136] Attempting to start `ora.rac.rac1.inst` on member `rac1`
largerock2003 2010-11-04
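[Quote=Quoting reply #5 by inthirties:]
Are both target and state OFFLINE?

Or is target ONLINE while state is OFFLINE?

Check crsd.log.

Or try manually starting only the instance on the failing node, without starting the other one, and see what happens.
[/Quote]
I looked at this carefully yesterday. In every case target is ONLINE; only state ever goes OFFLINE. In the first case, only the instance resource on node 1 is OFFLINE and the database as a whole can still be used, presumably because node 2 is serving the requests. In the second case, both the instance resource and the jsd process on node 1 have state OFFLINE; then, even though every state on node 2 is normal, the whole database cannot be used.
crsd.log is as follows: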
2010-11-01 18:24:24.712: [ CRSRES][62442384] Stop of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
2010-11-01 18:24:24.721: [ CRSRES][62442384] rac2 : CRS-1019: Resource ora.rac1.ASM1.asm (application) cannot run on rac2


2010-11-02 14:20:14.329: [ CRSEVT][62442384] CAAMonitorHandler :: 0:Could not join /u01/app/oracle/product/11.1.0/crs_1/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

2010-11-02 14:20:14.329: [ CRSEVT][62442384] CAAMonitorHandler :: 0:Action Script /u01/app/oracle/product/11.1.0/crs_1/bin/racgwrap(check) timed out for ora.rac1.vip! (timeout=60)
2010-11-02 14:20:14.329: [ CRSAPP][62442384] CheckResource error for ora.rac1.vip error code = -2
2010-11-02 14:57:39.383: [ CRSEVT][58239888] Error dispatching EVM event; reconnecting
2010-11-02 14:57:39.565: [ COMMCRS][2798599056]clsc_connect: (0xae516a08) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))

2010-11-02 14:57:40.746: [ COMMCRS][2798599056]clsc_connect: (0xae516a08) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))

2010-11-02 14:57:41.428: [ COMMCRS][2798599056]clsc_connect: (0xae516a08) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))

2010-11-02 14:57:42.359: [ COMMCRS][2798599056]clsc_connect: (0xae516a08) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))

2010-11-02 14:57:43.789: [ COMMCRS][2798599056]clsc_connect: (0xae516a08) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))

2010-11-02 14:57:45.220: [ COMMCRS][2798599056]clsc_connect: (0xae516a08) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))

2010-11-02 14:57:45.901: [ COMMCRS][2798599056]clsc_connect: (0xae516a08) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))

2010-11-02 14:57:46.777: [ OCRSRV][2913987472]th_select_w_f_r: smwait error [1]
2010-11-02 14:57:46.777: [ OCRSRV][2937031568]th_select_w_f_r: smwait error [1]
2010-11-02 14:57:46.777: [ OCRSRV][2924477328]th_select_w_f_r: smwait error [1]
2010-11-02 14:57:46.777: [ OCRRAW][74300304]pr_io_wait: Error in smwait. retry.
2010-11-02 14:57:46.777: [ OCRSRV][2882517904]th_select_w_f_r: smwait error [1]
2010-11-02 14:57:46.777: [ OCRSRV][2958011280]th_select_w_f_r: smwait error [1]
2010-11-02 14:57:46.777: [ OCRSRV][2947521424]th_select_w_f_r: smwait error [1]
2010-11-02 14:57:46.777: [ OCRSRV][2872028048]th_select_w_f_r: smwait error [1]
2010-11-02 15:02:17.190: [ default][3086612880] CRS Daemon Starting
2010-11-02 15:02:17.204: [ CRSMAIN][3086612880] Checking the OCR device
2010-11-02 15:02:17.209: [ CRSMAIN][3086612880] Connecting to the CSS Daemon
2010-11-02 15:02:17.455: [ COMMCRS][51907472]clsc_connect: (0x8fee720) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac1_))

2010-11-02 15:02:17.455: [ CSSCLNT][3086612880]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac1_)), rc 9

2010-11-02 15:02:17.461: [ CRSRTI][3086612880] CSS is not ready. Received status 3 from CSS. Waiting for good status ..

2010-11-02 15:02:18.660: [ COMMCRS][51907472]clsc_connect: (0x8fee720) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac1_))

2010-11-02 15:02:18.660: [ CSSCLNT][3086612880]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac1_)), rc 9

2010-11-02 15:02:18.660: [ CRSRTI][3086612880] CSS is not ready. Received status 3 from CSS. Waiting for good status ..

2010-11-02 15:02:19.842: [ COMMCRS][51907472]clsc_connect: (0x8fee720) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac1_))

2010-11-02 15:02:19.843: [ CSSCLNT][3086612880]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac1_)), rc 9

2010-11-02 15:02:19.843: [ CRSRTI][3086612880] CSS is not ready. Received status 3 from CSS. Waiting for good status ..

2010-11-02 15:02:21.041: [ COMMCRS][51907472]clsc_connect: (0x8fee720) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac1_))

2010-11-02 15:02:21.042: [ CSSCLNT][3086612880]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac1_)), rc 9

2010-11-02 15:02:21.042: [ CRSRTI][3086612880] CSS is not ready. Received status 3 from CSS. Waiting for good status ..

2010-11-02 15:02:22.377: [ CRSMAIN][3086612880] CRSD running as the Privileged user
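The crsd.log excerpt above is long, so a small script can help pick out the lines that matter, such as the action script timeout reported for ora.rac1.vip. The following is only a minimal sketch: the regular expressions are assumptions derived from the line layout visible in this thread, and the function name `summarize_crsd_log` is hypothetical, not part of any Oracle tool.

```python
import re
from collections import Counter

# Assumed shape of a crsd.log line, based on the excerpt in this thread:
#   2010-11-02 14:20:14.329: [ CRSEVT][62442384] message...
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*(\w+)\]\[(\d+)\]\s*(.*)$"
)
# Matches e.g. "... timed out for ora.rac1.vip! (timeout=60)"
TIMEOUT_RE = re.compile(r"timed out for (\S+)! \(timeout=(\d+)\)")


def summarize_crsd_log(lines):
    """Count messages per module and collect action script timeouts."""
    modules = Counter()
    timeouts = []
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if m is None:
            continue  # blank line or continuation line without a timestamp
        ts, module, _thread, message = m.groups()
        modules[module] += 1
        t = TIMEOUT_RE.search(message)
        if t:
            timeouts.append((ts, t.group(1), int(t.group(2))))
    return modules, timeouts


# A few lines copied from the excerpt above.
SAMPLE = """\
2010-11-02 14:20:14.329: [ CRSEVT][62442384] CAAMonitorHandler :: 0:Action Script /u01/app/oracle/product/11.1.0/crs_1/bin/racgwrap(check) timed out for ora.rac1.vip! (timeout=60)
2010-11-02 14:20:14.329: [ CRSAPP][62442384] CheckResource error for ora.rac1.vip error code = -2
2010-11-02 14:57:39.383: [ CRSEVT][58239888] Error dispatching EVM event; reconnecting
2010-11-02 15:02:17.190: [ default][3086612880] CRS Daemon Starting
""".splitlines()

if __name__ == "__main__":
    modules, timeouts = summarize_crsd_log(SAMPLE)
    print(dict(modules))
    for ts, resource, seconds in timeouts:
        print(ts, resource, "check timed out after", seconds, "s")
```

To run it on a real file, pass `open(path)` in place of `SAMPLE`; lines that do not match the assumed layout are simply skipped.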
inthirties 2010-11-03
Are both target and state OFFLINE?

Or is target ONLINE while state is OFFLINE?

Check crsd.log.

Or try manually starting only the instance on the failing node, without starting the other one, and see what happens.
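The target/state distinction asked about here can be checked mechanically. Below is a minimal sketch that lists resources whose TARGET is ONLINE while STATE is OFFLINE. It assumes the long output form of `crs_stat` is a series of blank-line-separated `KEY=value` blocks (NAME, TYPE, TARGET, STATE); the sample text is illustrative, built from resource names mentioned in this thread, and is not real output from the poster's cluster. The function names are hypothetical.

```python
def parse_crs_stat(text):
    """Parse blank-line-separated KEY=value blocks into a list of dicts."""
    resources = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the current resource block.
            if current:
                resources.append(current)
                current = {}
            continue
        if "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    if current:
        resources.append(current)
    return resources


def wanted_but_down(resources):
    """Names of resources with TARGET=ONLINE whose STATE is OFFLINE."""
    return [
        r["NAME"]
        for r in resources
        if r.get("TARGET") == "ONLINE"
        and r.get("STATE", "").startswith("OFFLINE")
    ]


# Illustrative input only (assumed layout, not captured from the cluster).
SAMPLE = """\
NAME=ora.rac.rac1.inst
TYPE=application
TARGET=ONLINE
STATE=OFFLINE

NAME=ora.rac1.ASM1.asm
TYPE=application
TARGET=ONLINE
STATE=ONLINE on rac1

NAME=ora.rac1.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE on rac1
"""

if __name__ == "__main__":
    for name in wanted_but_down(parse_crs_stat(SAMPLE)):
        print("target ONLINE but state OFFLINE:", name)
```

A resource in this list is one that Clusterware is supposed to keep running but that is currently down, which is the situation worth chasing in crsd.log.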
largerock2003 2010-11-02
Yes, it is raw + ASM. Only the instance is offline.
Dave 2010-11-02

ORA-29702: error occurred in Cluster Group Service operation
Cause: An unexpected error occurred while performing a CGS operation.

Action: Verify that the LMON process is still active. Also, check the Oracle LMON trace files for errors.


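Only the instance is offline? What about the other processes? Is there anything else in the CRS log? What architecture does the RAC use? raw + ASM?

Check the system log:
/var/log/messages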
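When going through the alert log as suggested, it helps to pull out each ORA- error and instance termination together with the timestamp it was logged under, so the times can be compared against /var/log/messages and crsd.log. This is a minimal sketch, assuming the alert log layout seen in this thread (a standalone timestamp line followed by message lines); the function name `scan_alert_log` is hypothetical and the sample is modeled on the alert_rac1.log excerpt in the first post.

```python
import re

# Timestamp lines in the excerpt look like "Mon Nov 01 18:24:16 2010".
TS_RE = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$"
)
# Lines worth flagging: ORA- errors and instance terminations.
EVENT_RE = re.compile(r"ORA-\d{5}|Instance terminated by")


def scan_alert_log(lines):
    """Return (timestamp, line) pairs for ORA- errors and terminations."""
    current_ts = None
    events = []
    for raw in lines:
        line = raw.strip()
        if TS_RE.match(line):
            current_ts = line  # remember the most recent timestamp line
        elif EVENT_RE.search(line):
            events.append((current_ts, line))
    return events


SAMPLE = """\
Mon Nov 01 18:24:16 2010
System state dump is made for local instance
Errors in file /u01/app/oracle/diag/rdbms/rac/rac1/trace/rac1_diag_14583.trc:
ORA-29702: error occurred in Cluster Group Service operation
Trace dumping is performing id=[cdmp_20101101182416]
Instance terminated by LMON, pid = 14598
Tue Nov 02 15:02:50 2010
Starting ORACLE instance (normal)
""".splitlines()

if __name__ == "__main__":
    for ts, line in scan_alert_log(SAMPLE):
        print(ts, "|", line)
```

The timestamp is kept as a plain string on purpose, so the sketch does not depend on locale-specific date parsing.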


