Please help! System shuts down periodically.
Hi all,
Recently our system has been restarting or shutting down frequently. I've posted the system info and error messages below. Does anybody here know what is happening to our system? I would really appreciate any help. BTW, we replaced the motherboard with a new one yesterday, but the problem still happens. Thanks for your time.
asgard% uname -i
SUNW,Sun-Fire-280R
asgard% /usr/platform/`uname -i`/sbin/prtdiag
System Configuration: Sun Microsystems sun4u Sun Fire 280R (UltraSPARC-III)
System clock frequency: 150 MHz
Memory size: 3072 Megabytes
========================= CPUs ===============================================
      Run   E$   CPU     CPU
Brd  CPU   MHz   MB    Impl.   Mask
---  ---  ----  ----  -------  ----
 A    0    750   8.0  US-III   5.4
========================= Memory Configuration ===============================
           Logical  Logical  Logical
      MC   Bank     Bank     Bank         DIMM    Interleave  Interleaved
 Brd  ID   num      size     Status       Size    Factor      with
----  ---  ----     ------   -----------  ------  ----------  -----------
 CA    0     0      1024MB   no_status    512MB   2-way       0
 CA    0     1      512MB    no_status    256MB   2-way       1
 CA    0     2      1024MB   no_status    512MB   2-way       0
 CA    0     3      512MB    no_status    256MB   2-way       1
========================= IO Cards =========================
                         Bus  Max
 IO  Port Bus       Freq Bus  Dev,
Brd  Type  ID  Side Slot MHz  Freq Func State Name                             Model
---- ---- ---- ---- ---- ---- ---- ---- ----- -------------------------------- ----------------------
I/O  PCI   8    A    1    33   66  1,0  ok    TSI,gfxp                         GFXP
asgard% tail -100 /var/adm/messages
........
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 675889 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x000009b1.58b59d08
Nov 5 17:14:21 asgard AFSR 0x00000002<CE>.00000118 AFAR 0x00000000.2e37c0b0
Nov 5 17:14:21 asgard Fault_PC 0x100257c0 Esynd 0x0118 J0100
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 567426 kern.info] [AFT0] errID 0x000009b1.58b59d08 Corrected Memory Error on J0100 is Intermittent
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 884437 kern.info] [AFT0] errID 0x000009b1.58b59d08 Data Bit 87 was in error and corrected
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 721552 kern.info] [AFT2] errID 0x000009b1.58b59d08 E$tag PA=0x00000000.b2b7c080 does not match AFAR=0x00000000.2e37c080
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 186007 kern.info] [AFT2] errID 0x000009b1.58b59d08 PA=0x00000000.b2b7c080
Nov 5 17:14:21 asgard E$tag 0x00000001.65522920 E$state_2 Modified
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 895151 kern.info] [AFT2] E$Data (0x00) 0x418c256d.0005f37d 0x418c256d.0005f37d ECC 0x06a
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 895151 kern.info] [AFT2] E$Data (0x10) 0x00432b10.00432b50 0x00432b58.00432b60 ECC 0x1cd
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 895151 kern.info] [AFT2] E$Data (0x20) 0x00432b30.00432b38 0x00432b40.00432b48 ECC 0x0a9
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 895151 kern.info] [AFT2] E$Data (0x30) 0x00432b80.00432b88 0x00432b90.00432b98 ECC 0x162
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 929717 kern.info] [AFT2] D$ data not available
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 335345 kern.info] [AFT2] I$ data not available
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 203603 kern.warning] WARNING: [AFT1] Uncorrectable system bus (UE) Event detected by CPU0 Privileged Data Access at TL=0, errID 0x000009b1.58bc9ae0
Nov 5 17:14:21 asgard AFSR 0x00100006<RIV,UE,CE>.000000e2 AFAR 0x00000000.2e37c300
Nov 5 17:14:21 asgard Fault_PC 0x100257b0 Esynd 0x00e2 J0100 J0202 J0304 J0406
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 976007 kern.notice] [AFT1] errID 0x000009b1.58bc9ae0 Two Bits were in error
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 580991 kern.info] [AFT2] errID 0x000009b1.58bc9ae0 E$tag PA=0x00000000.a6b7c300 does not match AFAR=0x00000000.2e37c300
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 979062 kern.info] [AFT2] errID 0x000009b1.58bc9ae0 PA=0x00000000.a6b7c300
Nov 5 17:14:21 asgard E$tag 0x00000001.4d000010 E$state_4 Invalid
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 895151 kern.info] [AFT2] E$Data (0x00) 0x00000000.00000080 0x00000000.00000000 ECC 0x03e
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 895151 kern.info] [AFT2] E$Data (0x10) 0x000294b2.000000c0 0x00000000.00000000 ECC 0x1f8
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 895151 kern.info] [AFT2] E$Data (0x20) 0x00000000.00000000 0x00000000.00000000 ECC 0x000
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 895151 kern.info] [AFT2] E$Data (0x30) 0x00000000.00000000 0x00000000.00000000 ECC 0x000
Nov 5 17:14:21 asgard SUNW,UltraSPARC-III: [ID 929717 kern.info] [AFT2] D$ data not available
Nov 5 17:14:21 asgard unix: [ID 321153 kern.notice] NOTICE: Scheduling clearing of error on page 0x00000000.2e37c000
Nov 5 17:14:23 asgard unix: [ID 221039 kern.notice] NOTICE: Previously reported error on page 0x00000000.2e37c000 cleared
Nov 5 17:14:23 asgard SUNW,UltraSPARC-III: [ID 899313 kern.info] [AFT3] errID 0x000009b1.58bc9ae0 Above Error detected by protected Kernel code
Nov 5 17:14:23 asgard that will try to clear error from system
Nov 5 17:14:23 asgard SUNW,UltraSPARC-III: [ID 798685 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x000009b1.58bc9ae0
Nov 5 17:14:23 asgard AFSR 0x00100006<RIV,UE,CE>.000000e2 AFAR 0x00000000.2e37c300 INVALID
Nov 5 17:14:23 asgard Fault_PC 0x100257b0 Esynd 0x00e2 INVALID
Nov 5 17:14:23 asgard SUNW,UltraSPARC-III: [ID 213248 kern.warning] WARNING: [AFT1] Uncorrectable system bus (UE) Event detected by CPU0 Privileged Data Access at TL=0, errID 0x000009b1.b8641c70
Nov 5 17:14:23 asgard AFSR 0x00100006<RIV,UE,CE>.000000e4 AFAR 0x00000000.2e37e100
Nov 5 17:14:23 asgard Fault_PC 0x100257b0 Esynd 0x00e4 J0100 J0202 J0304 J0406
Nov 5 17:14:23 asgard SUNW,UltraSPARC-III: [ID 738383 kern.notice] [AFT1] errID 0x000009b1.b8641c70 Two Bits were in error
Nov 5 17:14:23 asgard SUNW,UltraSPARC-III: [ID 622977 kern.info] [AFT2] errID 0x000009b1.b8641c70 E$tag PA=0x00000000.0037e100 does not match AFAR=0x00000000.2e37e100
Nov 5 17:14:23 asgard SUNW,UltraSPARC-III: [ID 611851 kern.info] [AFT2] errID 0x000009b1.b8641c70 PA=0x00000000.0037e100
Nov 5 17:14:23 asgard E$tag 0x00000000.00000002 E$state_4 Invalid
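
For anyone wading through a longer copy of this log, here is a rough sketch of how the CE/UE events could be tallied per DIMM socket. It assumes the message layout shown above (an "[AFT0]"/"[AFT1]" event line, followed a line or two later by an "Esynd" line listing J-numbered sockets). The function name and regular expressions are my own and not part of any Solaris tool, so treat it as a starting point only.

```python
import re
from collections import Counter

# Event header, e.g. "[AFT1] Uncorrectable system bus (UE) Event detected by CPU0"
EVENT_RE = re.compile(r"\[AFT[01]\] (?:Corrected|Uncorrectable) system bus \((CE|UE)\) Event")
# DIMM socket labels such as J0100, as printed on the "Esynd" continuation line
DIMM_RE = re.compile(r"\bJ\d{4}\b")


def tally_dimm_errors(lines):
    """Count (error type, DIMM socket) pairs in an iterable of log lines.

    Hypothetical helper: it only understands the layout seen in this post.
    Events whose Esynd line is marked INVALID list no sockets and add nothing.
    """
    counts = Counter()
    pending = None  # error type of the most recent event header, if any
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            pending = match.group(1)
            continue
        if pending and "Esynd" in line:
            for dimm in DIMM_RE.findall(line):
                counts[(pending, dimm)] += 1
            pending = None
    return counts
```

It could be run as `tally_dimm_errors(open("/var/adm/messages"))` and the resulting counter printed. A socket that keeps showing up across many events would be the first DIMM I would try swapping, though that is only a guess from the log pattern.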