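Blade 150 crashes with error messages! Urgent help needed!
My Sun Blade 150 workstation hangs or reboots every few days. The messages log shows the following: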
Apr 8 12:15:42 Blade150 SUNW,UltraSPARC-IIe: [ID 589365 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU0 at TL=0, errID 0x0002ce1d.b80cb6ea
Apr 8 12:15:42 Blade150 AFSR 0x00000001<ME>.80300000<PRIV,UE,CE> AFAR 0x00000000.449bdba0
Apr 8 12:15:42 Blade150 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10032fbc
Apr 8 12:15:42 Blade150 UDBH 0x02e3<UE> UDBH.ESYND 0xe3 UDBL 0x0000 UDBL.ESYND 0x00
Apr 8 12:15:42 Blade150 UDBH Syndrome 0xe3 Memory Module DIMM1
Apr 8 12:15:43 Blade150 SUNW,UltraSPARC-IIe: [ID 464264 kern.info] [AFT2] errID 0x0002ce1d.b80cb6ea E$tag != PA from AFAR; E$line was victimized
Apr 8 12:15:43 Blade150 dumping memory from PA 0x00000000.449bdb80 instead
Apr 8 12:15:43 Blade150 SUNW,UltraSPARC-IIe: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000072.00000000
Apr 8 12:15:43 Blade150 SUNW,UltraSPARC-IIe: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000019.00000000
Apr 8 12:15:43 Blade150 SUNW,UltraSPARC-IIe: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x0000006d.20000000
Apr 8 12:15:43 Blade150 SUNW,UltraSPARC-IIe: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x00000065.00000000
Apr 8 12:15:43 Blade150 SUNW,UltraSPARC-IIe: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000061.82001a04
Apr 8 12:15:43 Blade150 SUNW,UltraSPARC-IIe: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000019.00000057
Apr 8 12:15:43 Blade150 SUNW,UltraSPARC-IIe: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.00000000
Apr 8 12:15:43 Blade150 SUNW,UltraSPARC-IIe: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x0000006d.a0000000
Apr 8 12:15:43 Blade150 unix: [ID 836849 kern.notice]
Apr 8 12:15:43 Blade150 ^Mpanic[cpu0]/thread=300015a3a40:
Apr 8 12:15:43 Blade150 unix: [ID 597583 kern.notice] [AFT1] errID 0x0002ce1d.b80cb6ea UE Error(s)
Apr 8 12:15:43 Blade150 See previous message(s) for details
Apr 8 12:15:43 Blade150 unix: [ID 100000 kern.notice]
Apr 8 12:15:44 Blade150 genunix: [ID 723222 kern.notice] 000002a100705580 SUNW,UltraSPARC-IIe:cpu_aflt_log+4e0 (2a10070563e, 1, 10148c80, 2a1007057c8, 2a10070568b, 10148ca
Apr 8 12:15:44 Blade150 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 000002a100705890 0000000000000003 0000000000000010
Apr 8 12:15:44 Blade150 %l4-7: 0000000000000000 0000000000000000 0000000000000000 000002a100705ba0
Apr 8 12:15:44 Blade150 genunix: [ID 723222 kern.notice] 000002a1007057d0 SUNW,UltraSPARC-IIe:cpu_async_error+868 (900000001, 3000020000, 9000000c3, 5800400000, 309015a3a40, 501041b2f
Apr 8 12:15:44 Blade150 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000070 000002a100705948 0000000000000000 00000000000002e3
Apr 8 12:15:44 Blade150 %l4-7: 00000000449bdbc0 0000000000020000 0000000900000000 000003e001492a00
Apr 8 12:15:44 Blade150 unix: [ID 100000 kern.notice]
Apr 8 12:15:45 Blade150 genunix: [ID 672855 kern.notice] syncing file systems...
Apr 8 12:15:45 Blade150 genunix: [ID 904073 kern.notice] done
Apr 8 12:15:46 Blade150 genunix: [ID 353387 kern.notice] dumping to /dev/dsk/c0t0d0s3, offset 429916160
Apr 8 12:15:54 Blade150 uata: [ID 606412 kern.warning] WARNING: timeout: reset bus chno = 0 targ = 0
Apr 8 12:16:15 Blade150 genunix: [ID 409368 kern.notice] ^M100% done: 7094 pages dumped, compression ratio 3.32,
Apr 8 12:16:15 Blade150 genunix: [ID 851671 kern.notice] dump succeeded
May 31 17:11:25 Blade150 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.8 Version Generic_108528-13 64-bit
May 31 17:11:25 Blade150 genunix: [ID 913631 kern.notice] Copyright 1983-2001 Sun Microsystems, Inc. All rights reserved.
May 31 17:11:25 Blade150 genunix: [ID 678236 kern.info] Ethernet address = 0:3:ba:27:a5:4b
May 31 17:11:25 Blade150 unix: [ID 389951 kern.info] mem = 524288K (0x20000000)
May 31 17:11:25 Blade150 unix: [ID 930857 kern.info] avail mem = 509108224
May 31 17:11:25 Blade150 rootnex: [ID 466748 kern.info] root nexus = Sun Blade 150 (UltraSPARC-IIe 650MHz)
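I have read some answers about this problem online: some say to replace the CPU, some say to update the system patches, and others say to disable power management. Right now I cannot tell whether the cause is hardware or software. Could someone give me a definite answer, or suggest some diagnostic methods? Thanks!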
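As a first diagnostic step, it may help to check whether the uncorrectable-error reports keep naming the same memory module (the report above names DIMM1). Below is a minimal Python sketch, not a definitive tool. It assumes the log is readable as plain text (on Solaris typically /var/adm/messages) and that each `[AFT1] Uncorrectable Memory Error` report is followed by a `Memory Module` line, as in the excerpt above. The function name `tally_memory_modules` is made up for illustration.

```python
import re
from collections import Counter

# Patterns taken from the wording of the log excerpt above.
UE_RE = re.compile(r"\[AFT1\] Uncorrectable Memory Error")
MODULE_RE = re.compile(r"Memory Module (\S+)")


def tally_memory_modules(lines):
    """Count how often each memory module is named in a UE report.

    `lines` is any iterable of log lines (a list or an open file).
    A "Memory Module" line is counted only if it follows an [AFT1]
    uncorrectable-error line, so unrelated mentions are ignored.
    """
    counts = Counter()
    in_ue_report = False
    for line in lines:
        if UE_RE.search(line):
            in_ue_report = True
            continue
        match = MODULE_RE.search(line)
        if in_ue_report and match:
            counts[match.group(1)] += 1
            in_ue_report = False
    return counts


def report(path="/var/adm/messages"):
    """Print one line per module, most frequently named first."""
    with open(path, errors="replace") as log:
        for module, hits in tally_memory_modules(log).most_common():
            print(module, hits)
```

Calling `report()` on the workstation (or `report("messages.0")` on a rotated log copied elsewhere) would list the modules named across all recorded panics. If every report points at the same module, reseating or swapping that DIMM would be a reasonable thing to try before suspecting the CPU or patches.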