




How to troubleshoot mailbox disk errors

May 17, 2017
How To

ARTICLE NUMBER
000025256

DESCRIPTION
A High Availability partner fails to take over after losing access to mailbox disks.
How can I tell which drives are used for cluster mailbox disks?

PROCEDURE
Mailbox disks are used by Data ONTAP to store active-active partner information such as the cf state. These mailbox disks are used in conjunction with the cluster interconnect to communicate heartbeats between the active-active partners. A lack of a heartbeat update might indicate that the partner storage system has failed and a cf takeover is required.

If both ports on the interconnect fail, each storage controller will continue to write to the mailbox disks to prevent a false takeover. Conversely, if a mailbox disk that belongs to a storage controller's partner becomes unreachable while the interconnect link is still valid, clustering will be disabled.

The following bullets contain information on common cf mailbox errors and troubleshooting steps.

Note: These do not apply to MetroCluster configurations.

Partner Not Responding

o Example:
Sat Sep 30 12:18:10 CEST storageA: cf.fsm.partnerNotResponding:notice: Cluster monitor: partner not responding

o Cause:
After waiting for several seconds, the local controller has seen no updates to the mailbox disks from its partner. This could indicate that the High Availability (HA) partner is down or unable to reach the mailbox disks. This error will not occur if the cluster interconnect links fail.

o Corrective Actions:
1. Verify the state of the HA partner.
2. Verify the FC-AL loops to the partner disk shelves.

Cluster monitor: takeover of filerB disabled (status of backup mailbox is uncertain)

o Example:
Sat Sep 30 12:18:10 CEST storageA: cf.fsm.takeoverOfPartnerDisabled:notice: Cluster monitor: takeover of storageB disabled (status of backup mailbox is uncertain)

o Cause:
StorageA has disabled clustering because it cannot determine the status of the backup mailbox disk. When a mailbox disk becomes unreadable, the local node cannot accurately determine the state of the partner. If the interconnect link is still enabled, clustering is disabled because the controller cannot accurately determine whether the drives are still accessible on its partner.

o Corrective Actions:
1. If the error clears within a minute, check whether the disk was updating firmware at the time of the error. During a disk firmware upgrade, the disk is taken offline briefly, which creates a false positive error message.
2. For FC-AL loops, run the fcadmin link_stats command several times over the course of five minutes to see whether errors are incrementing on the loop. For SAS systems, use sasadmin dev_stats to look for incrementing errors. If errors are incrementing, perform further troubleshooting to isolate the problematic component.
3. If the issue clears without manual intervention, monitor the system for a recurrence. It might have been caused by a transient condition that made the storage system miss an update to the mailbox disk.
4. Check for possible exposure to BUG 387507. If the storage system is running a version of Data ONTAP older than 7.3.3P2 and experiences high enough I/O, communication between the nodes of an HA pair might be delayed. This results in cf.fsm.takeoverOfPartnerDisabled messages being logged, indicating that the status of the backup mailbox is uncertain. These are then followed by cf.fsm.takeoverOfPartnerEnabled messages as communication between the HA pair nodes is reestablished.

Cluster monitor: takeover of StorageA disabled (partner mailbox disks not accessible or invalid)

o Example:
Disk ?.? is a backup mailbox disk
missing lock disks, possibly stale mailbox instance on backup side
Cluster monitor: backup mailbox error detected
Cluster monitor: takeover of StorageA disabled (partner mailbox disks not accessible or invalid)
Cluster failover of StorageA is not possible: partner mailbox disks not accessible or invalid.
Cluster is licensed but takeover of partner is disabled due to reason : partner mailbox disks not accessible or invalid

o Cause:
The partner mailbox disks for StorageA were unreachable, which leads to the disabling of clustering. This could be due to the drive having repeated errors and thus becoming unresponsive, or due to FC-AL loop issues.

o Corrective Actions:
1. Using the cf monitor all command in diag mode, verify the mailbox disks on both storage systems to determine which drive is missing:

StorageB - Good mailbox disks:
Disk 2a.16 is a primary mailbox disk
Disk 2a.73 is a primary mailbox disk
Disk 2b.16 is a backup mailbox disk
Disk 8a.72 is a backup mailbox disk

StorageA - Bad mailbox disks:
Disk 2a.16 is a primary mailbox disk
Disk 8a.72 is a primary mailbox disk
Disk 2b.16 is a backup mailbox disk
Disk ?.? is a backup mailbox disk

Notice how StorageA is missing disk 2b.73 (StorageB's 2a.73 in the example above). This indicates the mailbox disk with errors. Attempt replacing this drive.
2. Check the cabling for the shelves to verify it is correct.
3. Review the storage system's messages file to see if any errors were logged.
4. Run the fcadmin link_stats command several times over the course of five minutes to see whether errors are incrementing on the loop. If errors are incrementing, perform further troubleshooting to isolate the problematic component.

Cluster monitor: both partner mailbox disks are inaccessible. Check connectivity to the storage subsystem

o Example:
Cluster monitor: both partner mailbox disks are inaccessible. Check connectivity to the storage subsystem

o Cause:
Disks on the FC-AL adapter for the partner loop are inaccessible or have errors.

o Corrective Actions:
1. If a new disk shelf was added immediately before the error started, verify that the correct procedure was used to add this shelf. The Hardware Service Guide for each shelf module documents the procedure.
2. Check the cabling for the shelves to verify it is correct.
3. Review the storage system's messages file to see if any errors were logged.
4. Run the fcadmin link_stats command several times over the course of five minutes to see whether errors are incrementing on the loop. If errors are incrementing, perform further troubleshooting to isolate the problematic component.

StorageA reports "Partner down, takeover in 20 sec"; however, cluster takeover does not occur and partner StorageB shows that the cluster is up
-OR-
Cluster is licensed but takeover of partner is disabled due to the reason "partner mailbox disks not accessible or invalid"

o Cause:
The partner StorageB, which reports that StorageA is up, is using the wrong mailbox disks. This might occur when an additional shelf with a root volume is brought online. StorageB declares the new foreign root volume's disks as mailbox disks; however, StorageA continues to use the original mailbox disks. This mismatch means that the two HA partners are no longer using the same set of disks for cluster mailbox updates.

o Corrective Actions:
Verify that the mailbox disks are the same for both controllers using the cf monitor all command:

StorageA> priv set diag
StorageA*> cf monitor all
cf: Current monitor status (31Oct2008 13:52:15):
partner StorageB, VIA Interconnect is up (link 0 up, link 1 up)
state UP, time 658442833, event CHECK_FSM, elem ChkMbValid (12)
mirrorConsistencyRequired TRUE
takeoverByPartner 0x2000
mirrorEnabled TRUE, lowMemory FALSE, memio UNINIT, killPackets TRUE
degraded FALSE, reservePolicy ALWAYS_AFTER_TAKEOVER, resetDisks TRUE
timeouts:
fast 1000, slow 2500, mailbox 10000, connect 5000
operator 600000, firmware 15000 (recvd 658442833), dumpcore 60000
booting 300000 (recvd 0)
transit timer enabled TRUE, transit 600000 (last 474382882)
mailbox disks:
Disk 0a.18 is a local mailbox disk
Disk 0a.17 is a local mailbox disk
Disk 0b.18 is a partner mailbox disk
Disk 0b.17 is a partner mailbox disk

o Removing the disks for the foreign volume should allow StorageB to re-elect the proper disks as mailbox disks. Alternatively, the foreign volume can be taken offline and destroyed.

o In some cases, the cf monitor all output can be:

Cluster Monitor StorageA mailbox disks:
Disk 0a.16 is a local mailbox disk
Disk 0a.32 is a local mailbox disk

Cluster Monitor StorageB mailbox disks:
Disk 0b.39 is a local mailbox disk
Disk 0d.16 is a local mailbox disk
Disk 0c.16 is a partner mailbox disk
Disk 0a.32 is a partner mailbox disk

o Here StorageA is not showing the partner mailbox disks. Log on to StorageB, take the foreign aggregate offline, and destroy it. You must then zero the spare disks, because, as the active-active guide states, disks should be pre-zeroed and used as spares. Run the following command:

disk zero spares

Related Links:
3013305: What is a MailBox disk?
2011715: Error message: Permanent errors on all HA mailbox disks (while writing master block) in process fmmbx_instance
000001995: What are the common causes of High Availability Takeover Impossible events?
000014651: Simultaneous loss of both node mailbox disks causes system disruption

Transient mailbox errors occur on the storage controller, but get corrected automatically within a short time. The error messages are similar to the following:
Mon Apr 4 18:26:00 WIB tbs-fas3170-06:monitor.globalStatus.critical:CRITICAL: Controller failover of tbs-fas3170-05 is not possible: partner mail
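
The mailbox disk comparison described under the Corrective Actions above can be partly automated once the cf monitor all output from each controller has been saved as text. The following is a minimal Python sketch, not a NetApp tool. It assumes the output contains lines of the form "Disk <id> is a <role> mailbox disk", as in the examples in this article. The function names parse_mailbox_disks and find_problems are hypothetical, introduced only for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as "Disk 0a.18 is a local mailbox disk".
# The line format is taken from the examples in this article (an assumption
# about the output of other Data ONTAP releases).
MAILBOX_LINE = re.compile(r"Disk\s+(\S+)\s+is a\s+(\w+)\s+mailbox disk")


def parse_mailbox_disks(output):
    """Return a dict mapping each role (local, partner, ...) to its disk IDs."""
    roles = defaultdict(list)
    for disk, role in MAILBOX_LINE.findall(output):
        roles[role].append(disk)
    return dict(roles)


def find_problems(roles, expected_roles=("local", "partner")):
    """List roles with no disks and disks reported as the placeholder '?.?'."""
    problems = []
    for role in expected_roles:
        disks = roles.get(role, [])
        if not disks:
            problems.append(f"no {role} mailbox disks listed")
        for disk in disks:
            if disk == "?.?":
                problems.append(f"unresolved {role} mailbox disk (?.?)")
    return problems


# Sample text copied from the "In some cases" example in this article.
storage_a = """Disk 0a.16 is a local mailbox disk
Disk 0a.32 is a local mailbox disk"""

storage_b = """Disk 0b.39 is a local mailbox disk
Disk 0d.16 is a local mailbox disk
Disk 0c.16 is a partner mailbox disk
Disk 0a.32 is a partner mailbox disk"""

if __name__ == "__main__":
    for name, text in (("StorageA", storage_a), ("StorageB", storage_b)):
        roles = parse_mailbox_disks(text)
        print(name, roles, find_problems(roles) or "no problems found")
```

For output that uses the primary and backup roles, pass expected_roles=("primary", "backup"). The sketch flags only missing roles and ?.? placeholders. Deciding whether both controllers refer to the same physical disks still requires the manual comparison described above, because the two controllers can see the same disk under different adapter names.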