Title: | HSZ40 Product Conference |
Moderator: | SSDEVO::EDMONDS |
Created: | Mon Apr 11 1994 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 902 |
Total number of notes: | 3319 |
A customer has logged the following call on her HSZ40. Can anyone suggest why the HSZ40 might have performed a reset, and explain what the quoted message means?

FROM: Trish Thomas TAFE NSW
PH: 02 9950 1719
FAX: 02 9950 1601

PROBLEM: Disk error on ALPHA 2100, serial no. AY53411647

DESCRIPTION:
Machine Type AlphaServer 2100A 5/300 running VMS V6.2-1H3 in a VMScluster.
(HW Ver=04C700000000000000000018, SID=80000000, XSID=00000000)
Serial no. AY53411647
Disk RZ28M,RZ29B

PROBLEM:

1 All disks on the machine clocked one error early this morning. These disks are connected to an HSZ40 controller on a Storageworks shelf. It looks like the controller has reset itself and we don't know why. I have included an extract from DIAGNOSE & FMU.

2 We recently upgraded the firmware on the HSZ40 controller and initialized one of the disks with SAVE_CONFIGURATION. We are now getting this message on the HSZ40: "NVPM OEM information component initialized to default settings" Can you explain what this message means?

***************************************
DIAGNOSE

Logging OS                    1. OpenVMS
System Architecture           2. Alpha
OS version                    V6.2-1H2
Event sequence number         27566.
Timestamp of occurrence       23-MAY-1997 00:34:16
Time since reboot             7 Day(s) 11:17:24
Host name                     IRIVE2
System Model                  AlphaServer 2100 4/233
Entry type                    1. Device Error

---- Device Profile ----
Unit                          IRIVE2$DKB202
Product Name                  HSZ40 SCSI to SCSI Ctrl

-- Driver Supplied Info -
Device Firmware Revision      V31Z
VMS SCSI Error Type           5. Extended Sense Data from Device
SCSI ID                       x02
SCSI LUN                      x00
SCSI SUBLUN                   x02
Port Status                   x00000001 Success
Command Opcode                x0A Write (6 byte)
Command Data                  x0E x14 xD1 x10 x00
SCSI Status                   x02 Check Condition
Remaining Byte Length         160.

------- HSZ Data -------
Instance Code                 x03F40064 Device services had to reset the port to clear a bad condition. Note that in this instance the Associated Target, Associated ASC, and Associated ASCQ fields are undefined.
                              Component ID = Device Services.
                              Event Number = x000000F4
                              Repair Action = x00000000
                              NR Threshold = x00000064
Template Type                 x41 Device Services Non-Transfer Error.
Template Flags                x00 HCE = 0, Event did not occur during Host Command Execution.
Ctrl Serial #                 ZG61201027
Ctrl Software Revision        V31Z
RAIDSET State                 x00 NORMAL. All members present and reconstructed, IF LUN is configured as a RAIDSET.
Error Code                    x70 Current Error
Sense Key                     x06 Unit Attention
ASC & ASCQ                    xD203 ASC = x00D2 ASCQ = x0003 Device services had to reset the bus.
Associated Port               x01
Associated Target             x04
Associated ASC                x00
Associated ASCQ               x00

----- Software Info -----
UCB$x_ERTCNT                  16. Retries Remaining
UCB$x_ERTMAX                  16. Retries Allowable
IRP$Q_IOSB                    x0000000000000000
UCB$x_STS                     x08021810
                              Online
                              Software Valid
                              Unload At Dismount
                              Volume is Valid on the local node
                              Unit supports the Extended Function bit
IRP$L_PID                     x824E8730 Requestor "PID"
IRP$x_BOFF                    512. Byte Page Offset
IRP$x_BCNT                    8192. Transfer Size In Byte(s)
UCB$x_ERRCNT                  1. Errors This Unit
UCB$L_OPCNT                   1060333. QIO's This Unit
ORB$L_OWNER                   x00010004 Owners UIC
UCB$L_DEVCHAR1                x1C4D4008
                              Directory Structured
                              File Oriented
                              Sharable
                              Available
                              Mounted
                              Error Logging
                              Capable of Input
                              Capable of Output
                              Random Access

**************************************
Fault Management Utility

describe instance 031A4002

Instance Code: 031A4002
Description: Command timeout.
Reporting Component: 3.(03)
Description: Device Services
Reporting component's event number: 26.(1A)
Event Threshold: 2.(02)
Classification: HARD. Failure of a component that affects controller performance or precludes access to a device connected to the controller is indicated.

Regards,
Trish Thomas
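
The two instance codes quoted above (x03F40064 in the DIAGNOSE extract, 031A4002 in the FMU query) are each printed next to their decoded fields. Judging only from those two printouts, and not from HSZ40 documentation, the 32-bit code appears to pack the component ID, event number, repair action and NR threshold into its four bytes, most significant first. The Python sketch below works under that assumption; the names `InstanceCode`, `decode_instance_code` and `split_asc_ascq` are made up for illustration and are not part of any DEC tool.

```python
from typing import NamedTuple, Tuple


class InstanceCode(NamedTuple):
    """Fields of an HSZ instance code, under the assumed byte layout."""
    component_id: int   # assumed: bits 31..24
    event_number: int   # assumed: bits 23..16
    repair_action: int  # assumed: bits 15..8
    nr_threshold: int   # assumed: bits 7..0


def decode_instance_code(code: int) -> InstanceCode:
    """Split a 32-bit instance code into its four one-byte fields."""
    if not 0 <= code <= 0xFFFFFFFF:
        raise ValueError("instance code must fit in 32 bits")
    return InstanceCode(
        component_id=(code >> 24) & 0xFF,
        event_number=(code >> 16) & 0xFF,
        repair_action=(code >> 8) & 0xFF,
        nr_threshold=code & 0xFF,
    )


def split_asc_ascq(value: int) -> Tuple[int, int]:
    """Split the combined 16-bit 'ASC & ASCQ' value into (ASC, ASCQ)."""
    return (value >> 8) & 0xFF, value & 0xFF


if __name__ == "__main__":
    # The two codes that appear in the note: error log entry, then FMU query.
    for raw in (0x03F40064, 0x031A4002):
        f = decode_instance_code(raw)
        print(
            f"{raw:08X}: component=x{f.component_id:02X} "
            f"event=x{f.event_number:02X} "
            f"repair=x{f.repair_action:02X} "
            f"threshold=x{f.nr_threshold:02X}"
        )
    asc, ascq = split_asc_ascq(0xD203)
    print(f"ASC=x{asc:02X} ASCQ=x{ascq:02X}")
```

Decoded this way, both codes come from component x03 (Device Services), but the error log entry carries event xF4 (port reset) while the FMU query was for event x1A (command timeout), so the two extracts describe different events.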
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
888.1 | | SSDEVO::T_GONZALES | | Tue May 27 1997 11:12 | 6 |
You are getting a command timeout on a device on the HSZ port. The FMU information should have included port and target information, although I didn't see it in your insert. Any time a command timeout occurs on the HSZ port side, the HSZ will reset that port, and usually all units on that port will report the reset. The error log information is reporting that event. I recommend you change that device.
888.2 | still wondering about error message | GIDDAY::HIRSHMAN | Hugged your Webmeister today? | Mon Jun 02 1997 01:32 | 10 |
There was no command timeout error or port/target ID information in the error log, so I'm not sure where that leaves me. Also, do we need to do anything about the "NVPM OEM information component initialized to default settings" message (is it informational?), or should the customer just do a CLEAR CLI and ignore it? What does this message mean, anyway?

-Bret