




Examples of Common Hardware Faults

1. CPU and memory errors

When the system log /var/adm/messages* reports AFT-class errors against a CPU or a memory module, report the fault and have the part replaced promptly.

To check that the number of CPUs is correct, run psrinfo:

    psrinfo
    0       on-line   since 01/26/07 11:22:07
    2       on-line   since 01/26/07 11:22:05
    16      on-line   since 01/26/07 11:22:07
    18      on-line   since 01/26/07 11:22:07

Alternatively, /usr/platform/sun4u/sbin/prtdiag -v shows a more detailed hardware configuration:

    prtdiag -v | more
    system configuration: sun microsystems sun4u sun fire v440
    system clock frequency: 177 mhz
    memory size: 4gb

    ==== cpus ====
                  e$    cpu                   cpu   temperature
    cpu freq      size  implementation        mask  die  amb.  status  location
    --- --------  ----  --------------------  ----  ---  ----  ------  --------
    0   1593 mhz  1mb   sunw,ultrasparc-iiii  3.4   -    -     online  -
    1   1593 mhz  1mb   sunw,ultrasparc-iiii  3.4   -    -     online  -
    (remainder omitted)

a. CPU error example. In this example, cpu18 reported the error:

    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 289920 notice: aft0 ucc event detected by cpu18 in user mode at tl=0, errid 0x00420f56.380eacb0
    jun 27 17:50:30 v440 afsr 0x00000400.00000026 afar 0x000000a0.b2532b20
    jun 27 17:50:30 v440 fault_pc 0xfe1696a8 esynd 0x0026
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 173042 aft0 errid 0x00420f56.380eacb0 data bit 19 was in error and corrected
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 832860 aft2 errid 0x00420f56.380eacb0 pa=0x000000a0.b2532b00
    jun 27 17:50:30 v440 e$tag 0x000004a0.b2400001 e$state_0 shared
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 895151 aft2 e$data (0x00) 0xfe409880.fe40b1e4 0xfe4133f8.fe1ee4a4 ecc 0x03f
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 895151 aft2 e$data (0x10) 0xfe409370.fe40b120 0xfe40d77c.fe418850 ecc 0x17a
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 895151 aft2 e$data (0x20) 0xfe410fb8.fe40c874 0xfe07bdd0.fe406ad0 ecc 0x1a8
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 895151 aft2 e$data (0x30) 0xfe40fe14.fe4104c0 0xfe40df10.fe4052c4 ecc 0x14e
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 929717 aft2 d$ data not available
    jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 335345 aft2 i$ data not available

b. Memory error example. The log shows that the memory module at /n0/sb4/p3/e1 j7300 is faulty:

    may 14 17:39:20 hdb-lc lw8: id 408692 kern.notice main, up 153 days 12:05:38, memory 8,512,064
    may 14 21:39:20 hdb-lc lw8: id 994892 kern.notice main, up 153 days 16:05:38, memory 8,208,768
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 838864 notice: aft0 first error ucc event detected by cpu19 in user mode at tl=0, errid 0x002f28e2.e95593c0
    may 14 22:25:38 hdb-lc afsr 0x00000400.00000001 afar 0x00000023.f67bce70
    may 14 22:25:38 hdb-lc fault_pc 0x100fb8e60 esynd 0x0001 /n0/sb4/p3/e1 j7300
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 450664 aft0 errid 0x002f28e2.e95593c0 check bit 0 was in error and corrected
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 248406 aft2 errid 0x002f28e2.e95593c0 pa=0x00000023.f67bce40
    may 14 22:25:38 hdb-lc e$tag 0x0000008f.d9249049 e$state_1 shared
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x00) 0xde10a064.80a3e000 0x1240027e.01000000 ecc 0x02a
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x10) 0xd80d601e.80a32005 0x0240015c.80a7202b ecc 0x13a
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x20) 0x124000f5.01000000 0xd2176000.d406a030 ecc 0x186
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x30) 0x80a2400a.1640002d 0x01000000.c65e6180 ecc 0x1e8
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 929717 aft2 d$ data not available
    may 14 22:25:38 hdb-lc sunw,ultrasparc-iv: id 335345 aft2 i$ data not available
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 828558 notice: aft0 ucc event detected by cpu19 in user mode at tl=0, errid 0x002f28e2.e95593c0
    may 14 22:25:49 hdb-lc afsr 0x00200400.00000001 afar 0x00000023.f67bce70
    may 14 22:25:49 hdb-lc fault_pc 0x100fb8e60 esynd 0x0001 /n0/sb4/p3/e1 j7300
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 450664 aft0 errid 0x002f28e2.e95593c0 check bit 0 was in error and corrected
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 248406 aft2 errid 0x002f28e2.e95593c0 pa=0x00000023.f67bce40
    may 14 22:25:49 hdb-lc e$tag 0x0000008f.d9249049 e$state_1 shared
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x00) 0xde10a064.80a3e000 0x1240027e.01000000 ecc 0x02a
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x10) 0xd80d601e.80a32005 0x0240015c.80a7202b ecc 0x13a
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x20) 0x124000f5.01000000 0xd2176000.d406a030 ecc 0x186
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 895151 aft2 e$data (0x30) 0x80a2400a.1640002d 0x01000000.c65e6180 ecc 0x1e8
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 929717 aft2 d$ data not available
    may 14 22:25:49 hdb-lc sunw,ultrasparc-iv: id 335345 aft2 i$ data not available

2. Disk errors

a. In the output of the format command, the entry for the disk shows "type unknown" or "driver not found".

b. Check the disk information in the output of iostat -En and note whether the media error count is 0. A non-zero count is a warning sign.

c. For software RAID built with SDS (Solstice DiskSuite), report the fault promptly if any of the following occurs:

    - The output of metadb contains a line that begins with an upper-case letter.
    - The output of metastat shows a state other than okay for the RAID volume.
    - The system log contains warnings about meta devices.

Examples of normal metadb and metastat output, followed by warning messages from the system log:

    bash-2.03# metadb
            flags           first blk       block count
         a m  p  luo        16              1034            /dev/dsk/c1t0d0s7
         a    p  luo        1050            1034            /dev/dsk/c1t0d0s7
         a    p  luo        2084            1034            /dev/dsk/c1t0d0s7
         a    p  luo        16              1034            /dev/dsk/c1t1d0s7
         a    p  luo        1050            1034            /dev/dsk/c1t1d0s7
         a    p  luo        2084            1034            /dev/dsk/c1t1d0s7

    bash-2.03# metastat | more
    d0: mirror
        submirror 0: d1
          state: okay
        submirror 1: d2
          state: okay
        pass: 1
        read option: roundrobin (default)
        write option: parallel (default)
        size: 55092864 blocks

    d1: submirror of d0
        state: okay
        size: 55092864 blocks
        stripe 0:
            device     start block  dbase  state  hot spare
            c1t0d0s0   0            no     okay

    d2: submirror of d0
        state: okay
        size: 55092864 blocks
        stripe 0:
            device     start block  dbase  state  hot spare
            c1t1d0s0   0            no     okay

    nov 13 20:25:23 v440 md_stripe: id 641072 kern.warning warning: md: d32: read error on /dev/dsk/c1t1d0s3
    nov 13 20:25:24 v440 md_mirror: id 104909 kern.warning warning: md: d32: /dev/dsk/c1t1d0s3 needs maintenance

d. The system log /var/adm/messages* contains block errors for the disk. For example:

    nov 13 20:25:18 v440 scsi: id 107833 kern.warning warning: /pci9,600000/sunw,qlc2/fp0,0/ssdw500000e0114799c1,0 (ssd0):
    nov 13 20:25:18 v440 error for command: read(10) error level: retryable
    nov 13 20:25:18 v440 scsi: id 107833 kern.notice requested block: 111216288 error block: 111216305
    nov 13 20:25:18 v440 scsi: id 107833 kern.notice vendor: fujitsu serial number: 0530c049em
    nov 13 20:25:18 v440 scsi: id 107833 kern.notice sense key: media error
    nov 13 20:25:18 v440 scsi: id 107833 kern.notice asc: 0x11 (), ascq: 0x1, fru: 0x0
    nov 13 20:25:19 v440 scsi: id 243001 kern.warning warning: /pci9,600000/sunw,qlc2/fp0,0 (fcp0):
    nov 13 20:25:19 v440 fcp: wwn 0x500000e0114799c1 reset successfully
    nov 13 20:25:19 v440 scsi: id 107833 kern.warning warning: /pci9,600000/sunw,qlc2/fp0,0/ssdw500000e0114799c1,0 (ssd0):
    nov 13 20:25:19 v440 error for command: read(10) error level: retryable
    nov 13 20:25:19 v440 scsi: id 107833 kern.notice requested block: 111216288 error block: 111216305
    nov 13 20:25:19 v440 scsi: id 107833 kern.notice vendor: fujitsu serial number: 0530c049em
    nov 13 20:25:19 v440 scsi: id 107833 kern.notice sense key: media error
    nov 13 20:25:19 v440 scsi: id 107833 kern.notice asc: 0x11 (), ascq: 0x1, fru: 0x0
    nov 13 20:25:20 v440 scsi: id 243001 kern.warning warning: /pci9,600000/sunw,qlc2/fp0,0/ssdw500000e0114799c1,0 (ssd0):
    nov 13 20:25:20 v440 scsi transport failed: reason reset: retrying command
    nov 13 20:25:22 v440 scsi: id 107833 kern.warning warning: /pci9,600000/sunw,qlc2/fp0,0/ssdw500000e0114799c1,0 (ssd0):
    nov 13 20:25:22 v440 error for command: read(10) error level: retryable
    nov 13 20:25:22 v440 scsi: id 107833 kern.notice requested block: 111216288 error block: 111216305
    nov 13 20:25:22 v440 scsi: id 107833 kern.notice vendor: fujitsu serial number: 0530c049em
    nov 13 20:25:22 v440 scsi: id 107833 kern.notice sense key: media error
    nov 13 20:25:22 v440 scsi: id 107833 kern.notice asc: 0x11 (), ascq: 0x1, fru: 0x0

3. Network interface problems

For example, when the network connection drops intermittently, the system log /var/adm/messages* contains entries like these:

    mar 16 23:12:30 v440 genunix: id 408789 kern.warning warning: ce0: fault detected external to device; service degraded
    mar 16 23:12:30 v440 genunix: id 451854 kern.warning warning: ce0: xcvr addr:0x01 - link down
    mar 16 23:14:06 v440 genunix: id 408789 kern.notice notice: ce0: fault cleared external to device; service available
    mar 16 23:14:06 v440 genunix: id 451854 kern.notice notice: ce0: xcvr addr:0x01 - link up 100 mbps full duplex
    mar 16 23:14:16 v440 genunix: id 408789 kern.warning warning: ce0: fault detected external to device; service degraded
    mar 16 23:14:16 v440 genunix: id 451854 kern.warning warning: ce0: xcvr addr:0x01 - link down
    mar 16 23:14:54 v440 genunix: id 408789 kern.notice notice: ce0: fault cleared external to device; service available
    mar 16 23:14:54 v440 genunix: id 451854 kern.notice notice: ce0: xcvr addr:0x01 - link up 100 mbps full duplex
    mar 16 23:51:39 v440 genunix: id 408789 kern.warning warning: ce0: fault detected external to device; service degraded
    mar 16 23:51:39 v440 genunix: id 451854 kern.warning warning: ce0: xcvr addr:0x01 - link down
    mar 16 23:53:11 v440 genunix: id 408789 kern.notice notice: ce0: fault cleared external to device; service available
    mar 16 23:53:11 v440 genunix: id 451854 kern.notice notice: ce0: xcvr addr:0x01 - link up 100 mbps full duplex
    mar 16 23:53:23 v440 genunix: id 408789 kern.warning warning: ce0: fault detected external to device; service degraded
    mar 16 23:53:23 v440 genunix: id 451854 kern.warning warning: ce0: xcvr addr:0x01 - link down
    mar 16 23:54:02 v440 genunix: id 408789 kern.notice notice: ce0: fault cleared external to device; service available
    mar 16 23:54:02 v440 genunix: id 451854 kern.notice notice: ce0: xcvr addr:0x01 - link up 100 mbps full duplex

4. Diagnosing hardware faults with the system configuration command

prtdiag -v shows the detailed status of the system's CPUs, memory, PCI, I/O, fans, power supply modules, temperatures, OBP version, and indicator LEDs. For example, if a power supply has failed, its status shows fault instead of ok. Example prtdiag -v output from a V440:

    root@v440# /usr/platform/sun4u/sbin/prtdiag -v | more
    system configuration: sun microsystems sun4u sun fire v440
    system clock frequency: 177 mhz
    memory size: 4gb

    ==== cpus ====
                  e$    cpu                   cpu   temperature
    cpu freq      size  implementation        mask  die  amb.  status  location
    --- --------  ----  --------------------  ----  ---  ----  ------  --------
    0   1593 mhz  1mb   sunw,ultrasparc-iiii  3.4   -    -     online  -
    1   1593 mhz  1mb   sunw,ultrasparc-iiii  3.4   -    -     online  -

    ==== io devices ====
    bus   freq  slot +  name +
    type  mhz   status  path                                 model
    ----  ----  ------  -----------------------------------  --------------
    pci   66    mb      pci108e,abba (network)               sunw,pci-ce
                okay    /pci1c,600000/network2
    pci   66    pci2    pci100b,35 (network)                 sunw,pci-ce
                okay    /pci1d,700000/network2
    pci   66    pci4    pci100b,35 (network)                 sunw,pci-x-qge
                okay    /pci1d,700000/pci1/network0
    pci   66    pci4    pci100b,35 (network)                 sunw,pci-x-qge
                okay    /pci1d,700000/pci1/network1
    pci   66    pci4    pci100b,35 (network)                 sunw,pci-x-qge
                okay    /pci1d,700000/pci1/network2
    pci   66    pci4    pci100b,35 (network)                 sunw,pci-x-qge
                okay    /pci1d,700000/pci1/network3
    pci   33    mb      isa/su (serial)
                okay    /pci1e,600000/isa7/serial0,3f8
    pci   33    mb      isa/su (serial)
                okay    /pci1e,600000/isa7/serial0,2e8
    pci   33    mb      isa/rmc-comm-rmc_comm (seria+
                okay    /pci1e,600000/isa7/rmc-comm0,3e8
    pci   33    pci0    pci100b,35 (network)                 sunw,pci-ce
                okay    /pci1e,600000/network2
    pci   33    mb      pciclass,0c0310 (usb)
                okay    /pci1e,600000/usba
    pci   33    mb      pciclass,0c0310 (usb)
                okay    /pci1e,600000/usbb
    pci   33    mb      pci10b9,5229 (ide)
                okay    /pci1e,600000/ided
    pci   66    mb      pci108e,abba (network)               sunw,pci-ce
                okay    /pci1f,700000/network1
    pci   66    mb      scsi-pci1000,30 (scsi-2)             lsi,1030
                okay    /pci1f,700000/scsi2
    pci   66    mb      scsi-pci1000,30 (scsi-2)             lsi,1030
                okay    /pci1f,700000/scsi2,1

    ==== memory configuration ====
    segment table:
    base address   size  interleave factor  contains
    -------------  ----  -----------------  --------------------
    0x0            2gb   4                  bankids 0,1,2,3
    0x1000000000   2gb   4                  bankids 16,17,18,19

    bank table:
        physical location
    id  controllerid  groupid  size   interleave way
    --  ------------  -------  -----  --------------
    0   0             0        512mb  0,1,2,3
    1   0             1        512mb
    2   0             1        512mb
    3   0             0        512mb
    16  1             0        512mb  0,1,2,3
    17  1             1        512mb
    18  1             1        512mb
    19  1             0        512mb

    memory module groups:
    controllerid  groupid  labels        status
    ------------  -------  ------------  ------
    0             0        c0/p0/b0/d0
    0             0        c0/p0/b0/d1
    0             1        c0/p0/b1/d0
    0             1        c0/p0/b1/d1
    1             0        c1/p0/b0/d0
    1             0        c1/p0/b0/d1
    1             1        c1/p0/b1/d0
    1             1        c1/p0/b1/d1

    ==== environmental status ====
    fan speeds:
    location  sensor       status  speed
    --------  -----------  ------  --------
    ft0/f0    tach         okay    3792 rpm
    ft1/f0    tach         okay    3994 rpm
    ft1/f1    tach         okay    3947 rpm
    ps0       ff_pdct_fan  okay
    ps1       ff_pdct_fan  okay

    temperature
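The log signatures described in this document (AFT events naming a CPU, SCSI media errors, meta devices that need maintenance, and repeated link-down messages) can be counted mechanically instead of read by eye. The following is a minimal sketch in Python, not a vendor tool: the function name scan_messages, the RULES table, and the category labels are all hypothetical names chosen for illustration. The patterns are case-insensitive because the samples in this document appear in lower case, while a live /var/adm/messages may use mixed case and bracketed tags.

```python
import re
from collections import Counter

# Hypothetical rule table: (category label, pattern). Each pattern targets one
# of the fault signatures quoted in this document. The first rule that matches
# a line wins.
RULES = [
    # "aft0 ... event detected by cpu18" -> subject is the CPU id
    ("aft_cpu", re.compile(r"aft0.*?event detected by cpu(\d+)", re.I)),
    # "sense key: media error" -> no subject on this line
    ("disk_media_error", re.compile(r"sense key:\s*media error", re.I)),
    # "md: d32: /dev/dsk/c1t1d0s3 needs maintenance" -> subject is the device
    ("svm_needs_maintenance",
     re.compile(r"md:\s*\S+:\s*(\S+)\s+needs maintenance", re.I)),
    # "ce0: xcvr addr:0x01 - link down" -> subject is the interface
    ("link_down", re.compile(r"(\w+):\s*xcvr addr:\S+\s+-\s+link down", re.I)),
]


def scan_messages(lines):
    """Count fault signatures in log lines, keyed by (category, subject)."""
    hits = Counter()
    for line in lines:
        for category, pattern in RULES:
            match = pattern.search(line)
            if match:
                # Patterns without a capture group have no subject.
                subject = match.group(1) if pattern.groups else ""
                hits[(category, subject)] += 1
                break
    return hits


# Sample lines copied from the log excerpts in this document.
SAMPLE = """\
jun 27 17:50:30 v440 sunw,ultrasparc-iv: id 289920 notice: aft0 ucc event detected by cpu18 in user mode at tl=0, errid 0x00420f56.380eacb0
nov 13 20:25:18 v440 scsi: id 107833 kern.notice sense key: media error
nov 13 20:25:24 v440 md_mirror: id 104909 kern.warning warning: md: d32: /dev/dsk/c1t1d0s3 needs maintenance
mar 16 23:12:30 v440 genunix: id 451854 kern.warning warning: ce0: xcvr addr:0x01 - link down
mar 16 23:14:16 v440 genunix: id 451854 kern.warning warning: ce0: xcvr addr:0x01 - link down
""".splitlines()

for (category, subject), count in sorted(scan_messages(SAMPLE).items()):
    print(category, subject, count)
```

To scan a real log, pass an open file object to scan_messages in place of SAMPLE. A repeated link_down count for one interface, or any aft_cpu hit, corresponds to the cases above where the part should be reported for repair.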