| Title: | CSGUK_SYSTEMS |
| Notice: | No restrictions on keyword creation |
| Moderator: | KERNEL::ADAMS |
| Created: | Wed Mar 01 1989 |
| Last Modified: | Thu Nov 28 1996 |
| Last Successful Update: | Fri Jun 06 1997 |
| Number of topics: | 242 |
| Total number of notes: | 1855 |
*********** THIS AREA IS DEDICATED TO SDD / VAXSIMPLUS *********
| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 55.1 | Questions And Answers. | KERNEL::JAMES | Alan James CSC Basingstoke | Thu Jul 20 1989 09:49 | 129 |
Below are 20 Questions and Answers we compiled on Vaxsimplus.
Alan.
VAXSIMPLUS QUESTIONS
-------------------------
1. HOW DOES AN SDD NOTIFY DIFFER FROM A NORMAL RDC NOTIFY?
call type is sdd
problem summary includes vaxsimplus & event code
Only the 1st FRU that Spear Suggests will be called out (unless repeat call)
2. WHAT ARE THE FULL NAMES OF THE COMMAND FILES THAT ARE INVOLVED IN
STARTING THE VAXSIMPLUS PROCESS AT BOOT TIME?
sys$common:[sysmgr]sdd$startup.com
sys$common:[sysmgr]systartup_v5.com
@sdd$exe:vaxsim$startup
3. WHAT IS THE EFFECT OF THE VAXSIMPLUS PROCESS NOT RUNNING?
display appears to work but never clocks up any more errors
fault monitor will never have thresholds exceeded
4. HOW CAN YOU TELL IF VAXSIMPLUS CAN SEND MAIL?
by sending a test mail (vaxsim/f mail summ test.tmp)
by invoking errors, i.e. injecterr, to see if mail is sent
5. WHICH FILE HAS TO BE ACCESSED BY EVERY NODE?
HOW CAN YOU READ AND UPDATE THIS FILE?
vaxsim$cluster.dat
read: vaxsim$dump.com - output file @sdd$exe:vaxsim$dump
update & repair: vaxsim$fix.com - @sdd$exe:vaxsim$fix
6. WHAT IS MEL?
mel is no longer used ?
.mel is the output file from vaxsim/merge
7. WHAT IS VAXSIM/MERGE?
vaxsim/merge is a utility that allows you to create a single output
file from multiple error log files; it also eliminates duplicate events.
8. NAME THE MAJOR LOGICAL NAMES USED BY VAXSIMPLUS.
sdd$dat, sdd$exe, sdd$logs
vaxsim$batch, vaxsim$errorlogs, vaxsim$mailbox, vaxsim$alias
9. NAME THE ONLY VAXSIMPLUS MANUALS THAT CAN BE LEFT ON A CUSTOMER'S SITE.
Getting started with Vaxsimplus
Vaxsimplus User Guide
Release Notes
10. DURING THE INSTALLATION AT WHICH POINT DO YOU ISSUE THE WARNING NOT TO USE
THE HELP COMMAND? AND HOW WOULD YOU DO IT?
when you get to the point in the installation where vmsinstal prompts:
Do you want the vaxsimplus release notes queued to sys$print?
one method would be to send a system message to all the users (opcom)
(reply/all/bell "Please do not use help for 20 mins.")
11. BEFORE YOU MAKE ANY CHANGES USING VAXSIM$FIX WHAT PRECAUTION SHOULD
YOU TAKE?
run it in debug mode to ensure that correct data has been selected
12. WHEN VAXSIMPLUS IS INSTALLED HOW DO YOU INFORM IT THAT ERRORS FROM
PREVIOUS DAYS EXIST?
@sdd$exe:vaxsim$loader.com
(procedure to create vaxsimplus database from the existing errorlogs.)
13. WHAT IS THE EFFECT ON THE FAULT MANAGER OF a) ALTERING THE MARGIN
b) CHANGING THE CLIPPING LEVEL?
there is no effect on the fault manager.
14. ON THE DISPLAY YOU CAN GO UP OR DOWN FOUR LEVELS FROM THE SYSTEM LEVEL,
WHAT ARE THESE LEVELS CALLED?
sub-system level
unit
error class
error detail
15. HOW MANY WAYS CAN THE THRESHOLD BE EXCEEDED?
thresholds can be exceeded by hard errors, soft errors, media errors or
information errors by the display or fault manager.
16. WHAT IS MEANT BY THE FACT THAT SYS$COMMON AND V4COMMON ARE SYNONYM
DIRECTORIES? WHAT IS THE V5 EQUIVALENT OF V4COMMON?
synonym directories are the same directories
vms$common
17. WHAT DATE SHOULD THE CONTRACT EXPIRATION DATE BE SET TO?
three months after the system contract expiration date
18. WOULD YOU NORMALLY INSTALL COMPUTE, INSTRUCT AND VSR? WHY?
compute - no - statistics package, wastes system time
- at boot time allows customer to say why system went down.
instruct - yes - instruction course for Spear Basic
vsr - only if there is an 86** cpu in the cluster
19. HOW DOES SPEAR BASIC AFFECT VAXSIMPLUS?
Spear Basic has nothing to do with Vaxsimplus
(Spear Extended code is embedded in Vaxsimplus to generate event codes)
20. WHAT IS VAXSIM$ALIAS USED FOR?
To identify the cluster for mail notifications.
| |||||
| 55.2 | Ultrix SDD. | COMICS::TREVENNOR | A child of init | Sat Jul 22 1989 09:36 | 4 |
Shall we start a separate topic for Ultrix SDD as it develops? Or
bundle it here?
AT
| |||||
| 55.3 | Here will do. | KERNEL::JAMES | Alan James CSC Basingstoke | Mon Aug 14 1989 13:40 | 8 |
> Shall we start a separate topic for Ultrix SDD as it develops? Or
> bundle it here?
Bundle it here.
a.
| |||||
| 55.4 | VAXSIMPLUS Generated Fault Calls | KERNEL::JAMES | Alan James CSC Basingstoke | Mon Aug 14 1989 13:42 | 48 |
***** VAXSIMPLUS GENERATED FAULT CALLS *****
On receipt of an SDD call - a call that has been generated by
VAXSIMPLUS - please follow the guidelines below.
(This information, and the SDD Call Flows, will be put in the
Library (Box 8806) with the selection of manuals on SDD/VAXSIMPLUS.)
1. Obtain the VAXSIMPLUS Event Code and failing Device Name/Type
from Customer.
2. Convert the Event Code to a Spear Theory :-
$ run rsds$spear:spear
$ SPEAR> A
$ QUERY CR
$ THEORY NUMBER Enter Theory Number
$ O/P 1st Page to File CR
3. Select the 1st FRU that Spear calls out.
Where Spear cannot analyse the problem, connect to the system if
it is accessible, and carry out manual diagnosis.
4. Notify results :-
Call Type SDD
Problem Summary Device Name, Device Type, [Theory]
( e.g. DUA1 RA81 [1.15.8.10] )
Tech Findings VAXSIMPLUS - 1ST FRU - "FRU NAME"
All other Notify Fields to be filled out as normal.
Alan James.
| |||||
| 55.5 | USIM - introductory details. | COMICS::TREVENNOR | A child of init | Thu Sep 07 1989 13:01 | 39 |
What is USIM?
USIM is the ULTRIX equivalent of VAXSIM-PLUS for VMS. It performs
the same functions in a similar way.
Limitations:
Until host-based shadowing comes about, auto-copy is not supported
in USIM. In the first version no graphical interface is provided
for USIM, though this may well be added.
When?
USIM will ship with Ultrix V4.0 (The BIG one!) in early to mid CY 90.
Who is doing what?
I am tasked with creating the bulk of the documentation for the product
which is being written by Ultrix engineering. I already have a
simulator for the native UNIX style user interface, and will shortly
have a simulator for the Natural Language interface. At some later
stage access to these can be given - Alan James and I have been
discussing this. I would like to hear from anyone in the CSC who would
like to be involved with the USIM project in an advisory role vis-a-vis
its usability.
What systems?
USIM will run on ULTRIX/32, ULTRIX/RISC and later on A.N.Other Open
operating system whose existence we don't yet talk about (though I'm
sure you've heard). It will provide commonality tools to allow
imported, processed ULTRIX error log files to be used with VAXSIM
and (probably) vice versa.
Alan Trevennor.
| |||||
| 55.6 | Call Guidelines | KERNEL::JAMES | Alan James CSC Basingstoke | Tue Sep 12 1989 10:27 | 52 |
***** VAXSIMPLUS GENERATED FAULT CALLS *****
On receipt of an SDD call - a call that has been generated by
VAXSIMPLUS - please follow the guidelines below.
1. Obtain from the customer the "FAILING DEVICE NAME", "FAILING DEVICE
TYPE" and "VAXSIMPLUS EVENT CODE".
2. Convert the Event Code to a Spear Theory :-
$ run rsds$spear:spear
$ SPEAR> A
$ QUERY CR
$ THEORY NUMBER Enter Theory Number
$ O/P 1st Page to File CR
3. Note all the FRUs that Spear calls out.
If the 1st FRU is an HDA then automatic validation has been
carried out. Complete an HDA Validation Sheet found in a black
folder on the SDDPT desk.
Where Spear cannot analyse the problem, it may be necessary to
connect to the system, if it is on RDC, to carry out manual
diagnosis.
4. Notify results :-
Call Type SDD
Problem Summary Device Name, Device Type, [Theory]
( e.g. DUA1 RA81 [1.15.8.10] )
Tech Findings VAXSIMPLUS FRU LIST. FRU/PART, FRU/PART ....
( e.g. VAXSIMPLUS FRU LIST. SERVO 70-19045-01,
PERSONALITY 70-19046-01, POWER SUPPLY H776-D ...)
All other Notify Fields to be filled out as normal.
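For anyone who wants to build the Notify fields from a script rather than by hand, the layout in step 4 can be sketched as below. This is only an illustration of the field formats quoted above: the function names `problem_summary` and `tech_findings` are made up for the example and are not part of VAXsimPLUS, Spear or the Notify system.

```python
# Hypothetical helpers: they only reproduce the text layout shown in step 4.

def problem_summary(device_name, device_type, theory):
    """Build the Problem Summary field: Device Name, Device Type, [Theory]."""
    return f"{device_name} {device_type} [{theory}]"


def tech_findings(frus):
    """Build the Tech Findings field from (FRU name, part number) pairs."""
    fru_list = ", ".join(f"{name} {part}" for name, part in frus)
    return f"VAXSIMPLUS FRU LIST. {fru_list}"


# Values taken from the examples in the guidelines above.
summary = problem_summary("DUA1", "RA81", "1.15.8.10")
findings = tech_findings([
    ("SERVO", "70-19045-01"),
    ("PERSONALITY", "70-19046-01"),
    ("POWER SUPPLY", "H776-D"),
])

print(summary)
print(findings)
```

The FRU order is kept exactly as passed in, so listing the FRUs in the order Spear calls them out keeps the 1st FRU first.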
| |||||
| 55.7 | Bugs in SPEAR Extended.... | KERNEL::MCGAUGHRIN | What a Marvelous Delivery | Wed Apr 25 1990 11:04 | 40 |
There are as far as I know TWO known BUGS in SPEAR Extended, they
appear when you input the following Theory numbers....
Theory Number: 1.15.8.15
: 1.15.8.16
This causes "Invalid Theory Number !" as an error message.
These are known bugs, and the theory text can be accessed in the
following way....
SPEAR> T
THEORY mode
-----------
Desired Category (QUERY): DISK
Output first page to file (NONE): <Rtrn>
Type <cr> to confirm (/GO): <Rtrn>
This will then take you into a couple of pages of text, which you
can flip through quickly, it will then bring you to a MENU, numbered
1-16, choose the...
RA Drive Theories
And....again you come to another menu, here you choose...
HDA Theories
And...this will then give you the theory codes for the two failing
numbers. Read through the text and make your recommendations.
Regards Ian
| |||||
| 55.8 | VAXSIM+ and non-DEC devices | KERNEL::WRIGHTON | odd numbered release = bug insert | Sat Sep 15 1990 08:31 | 124 |
I received the following message from the states.
All,
Please find below a white paper/article written in response to a
legal protest by Berkshire Computer Products to a sole source
government contract. This article covers the reasons why a
non-Digital device is NOT compatible with VAXsimPLUS.
This article has been reviewed by legal for distribution
internal and external to Digital. Please do NOT modify the
information in the article but please DO DISTRIBUTE IT WIDELY!
Keith Norman
Fault Management CSSE
SUBJECT: VAXsimPLUS compatibility with non-Digital
drives
There are three major categories of information required
by the VAXsimPLUS knowledge-base in order for it
to provide the analysis and predictive diagnosis. These
categories are:
1. Error information and associated codes,
2. Device geometry tables, and
3. Failure signatures
Error information is obtained from the system error/event
log in VMS. The information placed here is obtained
from two sources; controller detected errors and drive
detected errors. These error packets contain the
associated Mass Storage Control Protocol (MSCP) error
codes. These MSCP codes are specific to the drive type
and have meaning specific to that device. An example is
that an MSCP code of XX for an RA82 does not have the
same meaning as an XX for an RA92.
The VAXsimPLUS knowledge-base has been coded to understand
these differences. If an MSCP code is received
from a device which attributes different meaning to
that code, VAXsimPLUS will be unaware of this and would
make inappropriate assumptions. Since these codes are
specific to the device design, it is unlikely that a
non-Digital device would have the identical design and
attribute the exact same meaning to that code.
The second piece of information required by the
knowledge-base is the exact device geometry. Geometry, in this
instance, refers to the number of heads, cylinders
and LBNs. Hard coded into VAXsimPLUS is the expected
geometry for the Digital RA series disk drives. A simple
MSCP command will confirm the geometry differences
between the Digital RA series drives and other,
non-Digital drives. No non-Digital drive geometry tables
exist in VAXsimPLUS.
If a non-Digital drive reports itself as a legitimate
RA series drive, then the geometry table of that RA
drive will be falsely applied to the non-Digital drive.
This could result in incorrect assumptions as to the
physical distribution of errors within the drive and
thus, incorrect diagnosis by the knowledge-base engine.
The third category of information required by VAXsimPLUS
is failure signatures and device specific rules.
Any given device will fail in a manner specific to that
class of devices. For instance, an RA82 will not fail
in the same way as an RA92. The design of the device,
the circuitry and mechanics used to build it and even
the specific components used in these designs have
their own unique failure signature.
The VAXsimPLUS rules have been constructed to recognize
the failure signatures of the RA series devices and
to make predictions based on them. There are no
non-Digital device rules coded into VAXsimPLUS.
If a non-Digital drive is analyzed using these rules,
an incorrect diagnosis/prediction is a virtual certainty.
Although the functions may be identical between
the two drives, differing designs, circuitry and components
used by these devices will result in different
failure signatures.
In summary, even if the first condition is met and a
non-Digital drive is 100% MSCP compliant and produces
the error codes exactly like the equivalent RA series
drive, it is still not compatible with VAXsimPLUS
because of points 2 and 3. These last two points cannot
be met without modifying the code which is protected
under copyright and has two patents pending.
| |||||
| 55.9 | THE FUTURE is SECURE | KERNEL::MCGAUGHRIN | What a Marvelous Delivery | Wed Oct 31 1990 16:06 | 645 |
From: FMCSSE::OPPENHEIM "FM CSSE DTN:523-2167 CXN2/35 05-Oct-1990 0752" 5-OCT-1990 10:33:54.35
To: @FM_GROUP
CC:
Subj: Emanuel Tucker's (Logistics) trip report from CX, worth reading!!
To all
During Emanuel's visit we attempted to understand/quantify and document
the impact on logistics for the SDD tools. The attached report has the
details and I feel is worth reading through; it's nice to see the
benefit to Digital of the tools everyone is working so hard on and, by the way,
doing a great job!!!
We attempted to keep the attached report as factual as possible and left
the subjectivity out.
In summary
The automation of the tools (ERFs, SPEAR, VAXsim) through VS+/SISR reduces
service calls by about 30% (this is very significant). We only calculated
savings for the RA8X, as we have the most information on this device family;
because of the volumes, the ROI is significant.
Some of the more interesting data: there are around 56,970 systems
under contract not using our tools. If we could get our tools utilized, it
would represent approx 50 million/yr expense reduction for RA8X devices alone.
These savings are based only on RA8X devices; we have not figured in the other
devices.
This is a great statement of benefit to the work you all have been doing,
Thanks much!!!
Andy O.
***********************************************************************
From: GENRAL::QETOO::TUCKER "Emanuel @275-2252, DAS1-2/B15 05-Oct-1990 0848" 5-OCT-1990 06:59:38.81
To: @CXNTRIP.DIS
CC: TUCKER
Subj: CXN Trip Report- Sept 26, 27
+-------------+ TM
| | | | | | | |
|d|i|g|i|t|a|l| I n t e r o f f i c e M e m o r a n d u m
| | | | | | | |
+-------------+
To: LARRY DOLAN From: Emanuel Tucker
PHIL PIETROWSKI Date: October 5, 1990
Dept: CSLHQ Engineering
cc: Distribution List Loc: DAS1-2/B15
DTN: 275-2252
ENET: QETOO::TUCKER
Subject: CXN Trip Report- Sept 26, 27
The purpose of the trip to CXN on Sept 26, 27 was two-fold:
1] to develop with CSSE, a model to describe the impact of SDD/
SISR on Consumption.
2] to participate in a review of the CSSE Fault Management
Operations Plan.
My host was Andy Oppenheim (CSSE Fault Management Senior Manager).
Analysis Process:
----------------
The large majority of this report is devoted to a review/analysis
aimed at the development of a model to describe the impact of
SDD/SISR on consumption. This study centered primarily around
the data in eight tables which summarize:
- the impact of SDD on calls, diagnosis time, RAXX consumption
and RAXX NPF
- SDD tool accuracy
- VAXsimPLUS utilization
Conclusions:
-----------
- THE KEY TO SUCCESSFUL IMPLEMENTATION OF SDD IS THE AUTOMATIC
UTILIZATION OF THE VARIOUS SDD TOOLS. THIS CONCLUSION IS
SUPPORTED BOTH BY THE US PILOT (VINSON) AND THE EUROPEAN
PILOT RECENTLY COMPLETED IN FINLAND.
- Automated SDD (i.e., VAXsimPLUS for RAXX disks) reduces RAXX
consumption by up to 33% depending upon the level of VAXsimPLUS
utilization and NPF rate. The NPF rate determines the maximum
consumption opportunity. The level of VAXsimPLUS utilization
determines the extent to which this opportunity can be realized.
- Approximately 22% of the world-wide VAX installed base utilizes
VAXsimPLUS, leaving the remaining 78% as "fertile ground" for
future VAXsimPLUS utilization potential.
- As a function of automated SDD, there is an opportunity to
reduce consumption for non-RAXX products, but no data
currently exists to quantify the magnitude of this opportunity.
The CSSE Fault Management Group has set a goal of 10%
consumption reduction for non-RAXX products which are capable
for SDD (i.e., CPUs, tapes, other disks ... ).
Next Steps:
----------
- Service Delivery must formulate a long-range automated SDD
utilization plan.
- Logistics must determine if the product plans for RAXX products
correctly account for VAXsimPLUS consumption impact, adjusting
these plans as appropriate.
- For the current products with significant consumption reduction
potential, CSSE Fault Management must determine if/how SDD
could be applied and then implement an aggressive plan to add
this SDD capability.
- For new products, Logistics must understand how SDD will enable
consumption avoidance, adjusting the long-range plan as
appropriate.
[Report details follow]
Wednesday, Sept 26: Andy Oppenheim (CSSE), Keith Brown (CSSE)
-------------------------------------------------------------
- A detailed review of SDD was made:
- ERF (Error Reporter Formatter) is the Error Logging Bit-to-text
Report generator delivered as an integral part of the VMS/
ULTRIX operating system. It was offered in 1977 on VMS and
1986 on ULTRIX. The ERF concept began in 1972 as SYSERR on
DECsystem 10's.
- SPEAR Basic is a summary generator which extracts significant
error information from the system error (event) log and
performs a summary. It is capable of sorting on specific
devices/event types. Under VAXsimPLUS, Spear Basic has been
renamed to Merge Error Log (MEL), a more descriptive title-
the MEL program takes the collected error logs, merges them
into a single file and strips off redundant entries such as
HSC broadcast messages. SPEAR Basic was developed in 1983
and exists only under VMS today.
- SPEAR Extended (Analysis) takes the output from SPEAR Basic,
applies a rule-based system to perform single and multiple
event correlation to determine the suspected FRU and/or
action to be taken to recover. SPEAR Extended was developed
in 1983 and exists only under VMS today.
- Using the RA81 as an example, from approximately 230 unique
error report types, SPEAR Extended utilizes approximately
67 non-media and 18 media specific algorithms to look for
failure signatures. These algorithms will then indict the
appropriate FRU (one of five).
Consistent, accurate usage of MEL (SPEAR Basic) and Analysis
(SPEAR Extended) will yield increased field call-out accuracy.
VAXsimPLUS was created for RAXX disks to insure consistent and
accurate usage of MEL and Analysis by automating the process,
removing the decision to run the tools from the field engineer
and placing the decision with the operating system. By
automating this process, we have the capability to dynamically
monitor soft or recoverable error events and look for specific
failure signatures. This provides the capability to invoke a
service request prior to catastrophic failure (i.e., predictive
maintenance).
To the above tools, VAXsimPLUS adds an automatic monitoring,
notification and autocopy function:
- "Monitor" automatically classifies the error events into
"buckets", for each device, to be displayed as well as
set up for the Analysis program.
- "Notify" sends VAXmail messages to the designated mail lists
as well as initiates Autocopy.
- If this notification is sent to the Customer to
initiate corrective action, then this activity is
called Customer Initiated Service Request (CISR).
- If this notification is sent to the CSC (Digital)
to initiate corrective action, then this activity
is called System Initiated Service Request (SISR).
- "Autocopy" invokes Volume Shadowing to copy the data from
the suspected candidate drive to a spare drive.
- more re VAXsimPLUS:
- Enables the automation of the corrective action/field
dispatching process.
- Deployment began in 1987.
- VMS only today.
- Enables a uniform, consistent, predictable corrective
action, resulting in both a more accurate and faster
diagnosis when compared to a manually-driven system.
- During SISR, the customer is also notified that a repair
plan is being put into place.
- NOW FOR THE SPECIFIC RESULTS:
NOTE: THE MAJORITY OF THE SAVINGS WHICH FOLLOW ARE CENTERED
AROUND RA DISKS. THIS IS THE AREA WHERE THE ROI
IS THE GREATEST AND WHERE, TO DATE, THE PRIMARY EFFORT
HAS BEEN PLACED ON CREATING ALGORITHMS AND ANALYSIS
IMPROVEMENTS.
CLEARLY, THERE ARE OTHER AREAS OF SIGNIFICANT RETURN
THAT MUST BE ADDRESSED. LOGISTICS IS LOOKING TO
THE FAULT MANAGEMENT CSSE GROUP TO TAKE A LEADERSHIP
ROLE IN ADDRESSING THESE OPPORTUNITIES.
-------------------------------------------------------------------
TABLE 1: Service Delivery: SISR Versus w/o SISR
                 US Pilot (1)          European Pilot (2)
                 (VINSON)              (Finland)
                 ----------------      --------------
When             FY89 Q2, Q3           FY90
Performed
# Sites          2 IN-DEC              1 IN-DEC
                 Districts             10 Customer
# Systems        approx 600            approx 55
# Calls          20-25% reduction      30% reduction
Diagnose         1 hour reduction/     Not Measured
Time             call
Consumption      33% reduction or      Not Measured
                 $288/disk
                 (Disk savings)
(1) Cost Model for SSP Savings, Brad Kennedy, 6/12/89.
(2) System Initiated Service Request Pilot Conclusions,
David Bell, 31 August 1990.
- There is a strong correlation between these two pilots regarding
the percentage reduction in calls.
- It is important to note that the US Pilot report attributes
savings to SSPs ... this can be quite misleading. The savings
detailed in this report are due to the implementation of
SISR and the automation of SDD and not due to the use of SSPs.
SSPs were used as the service delivery platform on which to
run the specific SDD/SISR applications ... other effective
service delivery platforms would yield similar results.
- Although the European Pilot made no conclusions regarding
Consumption, Emanuel will initiate a discussion with David
Bell to investigate the potential Consumption impacts of
this pilot.
-----------------------------------------------------------------
-----------------------------------------------------------------
TABLE 2- US Pilot (VINSON) Call-Out Accuracy Analysis (3)
                       # Calls      RA81/82
Call-out               w/Parts      Diagnosis
(per Theory Code)      Replaced     Accuracy
--------               ---------    --------
1st FRU                   38          80%
3rd FRU                    4           8%
4th FRU                    1           2%
Not on Theory List         2           4%
                          --
Totals                    45
(3) May 89 LARS Theory/FRU Data, Keith Brown, 3-JUN-89
- Per this analysis, the VAXsimPLUS theory code correctly indicted
the FRU as the 1st call-out, 80% of the time.
------------------------------------------------------------------
TABLE 3- European HDA VAXsimPLUS Statistics Jun/Jul 90 (4)
           # of                     # of
           VAXsim+    VAXsim+       No VAXsim+    No VAXsim+
           Drives     NTF           Drives        NTF
           -------    -------       ----------    ----------
RA81         10       1 = 10%          26          2 =  7%
RA82         50       0 =  0%          82         13 = 15%
RA90         12       2 = 16%          68          9 = 13%
            ---       --------        ---         --------
Totals       72       3 =  4%         176         24 = 13%
(4) VAXsimPLUS Info, David Bell, 20-AUG-1990
- Note that conclusions from this report are being made on a sample
of only two months of data.
- The overall 4% NTF (No Trouble Found) of VAXsimPLUS compared to
13% NTF for No VAXsimPLUS, indicates that VAXsimPLUS utilization
significantly improves the accuracy of the field diagnosis.
- VAXsimPLUS is doing an excellent job for the RA82. This is
especially significant in that the RA82 HDAs are swapped the
most today (132 out of 248 swaps were RA82 HDAs).
- The tool needs improvements for the RA81 and RA90 with the
RA90 being the most urgent since the RA90 population will
increase in the future while the RA81 population will decrease
in the future.
------------------------------------------------------------------
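The totals row of TABLE 3 and the swap counts quoted in the notes under it can be re-derived from the per-drive rows. The following is a minimal sketch, assuming the "Drives" columns count HDA swaps (as the "132 out of 248 swaps" remark suggests) and that percentages are truncated rather than rounded, which is what the printed figures appear to do. The variable and function names are invented for the illustration.

```python
# (drives swapped, of which No Trouble Found), copied from TABLE 3.
WITH_VAXSIM = {"RA81": (10, 1), "RA82": (50, 0), "RA90": (12, 2)}
WITHOUT_VAXSIM = {"RA81": (26, 2), "RA82": (82, 13), "RA90": (68, 9)}


def ntf_summary(group):
    """Return (total drives, total NTF, NTF percent truncated to an integer)."""
    drives = sum(d for d, _ in group.values())
    ntf = sum(n for _, n in group.values())
    return drives, ntf, 100 * ntf // drives


with_totals = ntf_summary(WITH_VAXSIM)
without_totals = ntf_summary(WITHOUT_VAXSIM)

# RA82 share of all HDA swaps in the sample.
ra82_swaps = WITH_VAXSIM["RA82"][0] + WITHOUT_VAXSIM["RA82"][0]
all_swaps = with_totals[0] + without_totals[0]

print("VAXsim+   :", with_totals)
print("No VAXsim+:", without_totals)
print("RA82 swaps:", ra82_swaps, "of", all_swaps)
```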
------------------------------------------------------------------
TABLE 4- RA81 Cumulative Field Population (5)
                     FY86      FY87      FY88      FY89      FY90
                     ----      ----      ----      ----      ----
Number of Drives     63,600    101,200   123,100   126,100   126,100
(5) per Bill Sutliff, Sept 26, 1990
------------
TABLE 5- RA81 Major FRU Consumption (Units) (6)
                       FY86     FY87     FY88     FY89     FY90
                       ----     ----     ----     ----     ----
54-15247               5,555    5,748    4,649    4,141    2,878
(Microprocessor Bd)
70-18491-01            3,706    4,049   12,271   13,921   15,622
70-19045-01            5,588    6,576    7,610    6,590    4,597
(Servo Bd)
(6) RA81 Consumption Data (per DSAS), Emanuel Tucker,
Sept 12, 1990.
------------
TABLE 6- RA81 Major FRU Consumption ($)
                Xfr(7)
                Cost      FY86     FY87     FY88     FY89     FY90
                ------    ----     ----     ----     ----     ----
54-15247          334     1.9M     1.9M     1.6M     1.4M     1.0M
70-18491-01      1752     6.5M     7.1M    21.5M    24.4M    27.4M
70-19045-01       507     2.8M     3.3M     3.9M     3.3M     2.3M
(7) FY91 Mfg Standard Cost.
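The dollar figures in TABLE 6 are the unit counts from TABLE 5 multiplied by the transfer cost, expressed in millions to one decimal place. A short sketch of that check follows; the names `TRANSFER_COST`, `UNITS` and `consumption_in_millions` are introduced here for illustration only.

```python
# Transfer cost per FRU in dollars (TABLE 6, FY91 Mfg Standard Cost).
TRANSFER_COST = {"54-15247": 334, "70-18491-01": 1752, "70-19045-01": 507}

# Units consumed per fiscal year FY86..FY90 (TABLE 5).
UNITS = {
    "54-15247": [5555, 5748, 4649, 4141, 2878],
    "70-18491-01": [3706, 4049, 12271, 13921, 15622],
    "70-19045-01": [5588, 6576, 7610, 6590, 4597],
}


def consumption_in_millions(part):
    """Units times transfer cost, in millions of dollars to one decimal."""
    cost = TRANSFER_COST[part]
    return [round(units * cost / 1e6, 1) for units in UNITS[part]]


dollars = {part: consumption_in_millions(part) for part in UNITS}

for part, row in dollars.items():
    print(part, " ".join(f"{value}M" for value in row))
```

Every computed value matches the corresponding TABLE 6 entry, so the two tables are consistent with each other.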
- HDA consumption has increased significantly over this period.
- An FCO was done on the HDAs in FY86 that affected the counts
of HDAs for FY86 and FY87. 32,582 HDAs were shipped to the
field under the EQ kit number instead of under 70-18491-01-
hence these are not included in the numbers above.
- HDA wear-out factor (or HDA life cycle) is now considered
to be 22 months.
These two major impacts "wash-out" any attempt to determine
potential SDD impact on RA81 HDAs.
- In reference to the Microprocessor and Servo Bd FRUs,
consumption has decreased appreciably (38% and 41% respectively)
from FY88 to FY90 with total drive population remaining nearly
constant. With VAXsimPLUS deployment beginning in 1987, it is
believed that one of the factors for this decrease in consumption
is the usage of VAXsimPLUS. Improved training/experience
undoubtedly also contributed to this consumption reduction.
-----------------------------------------------------------------
TABLE 7- Estimated VAXsimPLUS Utilization (8)
           # of       # of       # of                       Overall
           VAX        VMS        VAXsim+    VAXsim+         VAXsim+
           Systems    Systems    Systems    Penetration     Utilization
           -------    -------    -------    -----------     -----------
US         33,000     28,000     24,920       6,230            19%
                      (85%)      (89%)        (25%)
Europe     29,000     24,600     22,140       8,700            30%
                      (85%)      (90%)        (39%)
GIA        11,000      7,333      3,667       1,100            10%
                      (67%)      (50%)        (30%)
           ------     ------     ------      ------            ---
Totals     73,000     59,933     50,727      16,030            22%
                      (82%)      (85%)        (32%)
(8) per Andy Oppenheim, Bill Sutliff, Sept 26, 1990.
- With 16,030 systems currently utilizing VAXsim+, and an
average of three drives/systems, 48,090 drives would produce
a consumption savings of up to $13.849M.
- The $13.849M comes from $288 times 48,090 drives.
- The $288 comes from the projected US Pilot results
which gave a potential savings of up to $288/disk.
[ref TABLE 1]
- The above analysis does not take into account potential
material savings from non-RA devices for which VAXsim+ applies:
[i.e., assuming SDD hooks are available ... ]
- CPUs: 9000, 3000, 6000, 4000
- Devices: RV, RF, TA, TF, DEMNA, KDM70
For these product families a potential VAXsim+ savings impact
of 10% should be assumed. It is critical that pilots be run
to verify this assumption- it is being stated only for planning
purposes at this time.
- Anticipated VAXsim+ consumption savings have already been
factored into the product plans for the following products:
- RA82, 90 (Ron Milano)
- 9000 (Bob Myers)
- 6000 (Steve Dail)
****************************************************************
Action: Emanuel Tucker will contact the above CSSE mgrs to
understand VAXsim+ impacts on consumption for
these products.
****************************************************************
- Currently, VAXsim+ does not effectively diagnose 86XX and
memory products ... high NPF rates have been identified with
these products. CSSE has initiated efforts to understand
the causes of this high NPF rate and extend SDD as appropriate
to help reduce this phenomenon.
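The savings arithmetic quoted under TABLE 7 can be reproduced as a quick sanity check. This is a sketch under the report's own assumptions ($288 per disk, three drives per system); applying the same three-drives figure to the systems not yet using VAXsim+ is an extra assumption made here to show where the "approx 50 million" in the covering note could come from. The variable names are invented for the example.

```python
SAVINGS_PER_DISK = 288      # dollars, projected from the US pilot (TABLE 1)
DRIVES_PER_SYSTEM = 3       # average assumed in the report

total_systems = 73_000      # TABLE 7, total VAX systems
vaxsim_systems = 16_030     # TABLE 7, systems currently using VAXsim+

# Savings on systems already using VAXsim+.
drives_covered = vaxsim_systems * DRIVES_PER_SYSTEM
current_savings = drives_covered * SAVINGS_PER_DISK

# Systems not yet using the tools, and the same estimate applied to them
# (assumption: the three-drives average also holds for these systems).
unused_systems = total_systems - vaxsim_systems
potential_savings = unused_systems * DRIVES_PER_SYSTEM * SAVINGS_PER_DISK

utilization_pct = round(100 * vaxsim_systems / total_systems)

print(f"drives covered     : {drives_covered:,}")
print(f"current savings    : ${current_savings:,}")
print(f"systems not covered: {unused_systems:,}")
print(f"potential savings  : ${potential_savings:,}")
print(f"overall utilization: {utilization_pct}%")
```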
Thursday, September 27- Andy Oppenheim (CSSE), Dick Brown (CSSE),
Keith Norman (CSSE), Lou Hohos (Training), Stan Goldstein
(US Area), John Wiejaczka (US Area), Debbie Riggen (CSSE),
Shelley Haenze (CSSE), Ed Trotter (ASDS) and Keith Brown (CSSE)
----------------------------------------------------------------
- The large majority of the day was spent reviewing the contents
of the CSSE Fault Management Operations Plan. Significant
results of this review follow:
- Considerable concern was expressed about the lack of a "real"
service delivery strategy. Andy agreed to use the existing
virtual Fault Management Team to draft a "straw horse"
service delivery strategy. Without this strategy, we cannot
properly position our Fault Management tools to meet their
objectives.
- A Fault Management LRP is needed.
- Fault Management must be expanded to include non-Digital
products.
- Direct input from customers and field engineers is deemed
critical- possibly CABs (Customer Advisory Boards) can be
formed.
- On page 8, Logistics must be added to the pictorial
representation of Fault Management CSSE Customers. Emanuel
suggested that Logistics be added along side of the Service
Delivery inputs to insure a two-way dialogue between the
Fault Management Virtual Team and Logistics w/r/t SDD Product
Requirements.
- Each virtual team member must be responsible for "delivering
their respective organizations", i.e. act as a conduit/
approval between their groups and the Fault Management Team.
This will insure that conflicting Fault Management development
requirements are minimized.
- The tie-in with the DEFMA (Digital Equipment Fault Management
Architecture) is critical. Frank Robbins is driving DEFMA.
- Release 1.7 of VAXsimPLUS is expected to enable the design of
a modular SPEAR- thus significantly improving our ability to
more easily apply SPEAR to new products.
- Emanuel Tucker presented the latest results of the Logistics
SDD Module Testing project:
-----------------------------------------------------------------
TABLE 8- SDD Call-Out Accuracy Analysis (9)
                            # of
                # of        Correctly    Call-Out
                Returns     Indicted     Accuracy
                -------     ---------    --------
US Returns        95           58          60%
(VAXsimPLUS)
US Pilot          16           13          80%
(VINSON)
(9) SDD Module Testing Status Report,
Colleen Castonguay, August 20, 1990.
- The accuracy of the SDD/Predictive Maintenance Tools on RA
series disks as measured by Customer Services Logistics is
between 60% and 80% accurate. This was based on the
preliminary results of the SDD Module Testing project that
is being conducted in CSL Engineering.
- The worst case scenario of 60% is based upon the analysis of
95 VAXsimPLUS indicted RAXX modules from the US AREA where
Logistics was able to correlate the field failure to a
Logistics failure.
- The best case scenario of 80% is based upon the analysis of
16 VINSON indicted RAXX modules. We were able to demonstrate
from CSL testing (No Problem Present) and from the field
(VINSON and LARS) data analysis, that 20% of the FRUs
from the customer systems did not correct the problem.
-----------------------------------------------------------------
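The call-out accuracy column in TABLE 8 is the number of correctly indicted modules divided by the number of returns. A small sketch of that ratio is given below; the raw quotients come out slightly above the rounded 60% and 80% quoted in the report. The dictionary name `RETURNS` is made up for the illustration.

```python
# (number of returns, number correctly indicted), copied from TABLE 8.
RETURNS = {
    "US Returns (VAXsimPLUS)": (95, 58),
    "US Pilot (VINSON)": (16, 13),
}

# Call-out accuracy as a percentage of returns.
accuracy = {
    name: 100.0 * correct / total
    for name, (total, correct) in RETURNS.items()
}

for name, pct in accuracy.items():
    print(f"{name}: {pct:.2f}% correctly indicted")
```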
- SDD indicted modules are not returning at a rate consistent
with the level of field indictment. It was suggested that
the process be refined using a specific field area/district
to maximize two-way dialogue between the field and
Logistics. This "adopt a district" concept could have the
additional benefit of providing access to LARS information
and also provide a path for "repaired SDD modules" to return
to the field.
*****************************************************************
Action: Emanuel Tucker will investigate with Stan Goldstein the
practicality of a target area/district with which to work
the SDD Module Testing Project.
*****************************************************************
The overall results of the SDD Module Testing project were
considered consistent with analyses performed by CSSE on the
accuracy of the SDD tools [ref TABLE 2].
[end of report]
| |||||
| 55.10 | A fix for VMS 5.4 | KERNEL::MCGAUGHRIN | What a Marvelous Delivery | Wed Dec 05 1990 09:24 | 185 |
Patch for SDD V1.2A to work on VMS V5.4 --------------------------------------- This is to announce the availability of a patch kit for SDD V1.2A that enables it to work under VMS V5.4. This patch is intended to be used as a stop-gap until SDD V1.5 is available to help customers who are migrating to VMS V5.4. This patch will not help customers buying new hardware such as the VAX 9000/4000/ft3000/RA92. If these customers have a need for SDD and cannot wait for V1.5, then consider using the FT version (as I described previously). This patch is not expected to live very long. It will be replaced with SDD V1.5 which will ship as follows : - with new systems as part of Factory Installed Software program - from SSB as usual - with ISDS 0.3, as ISDS will be supporting the VMS V5.4 VAX 9000/4000/ft3000/RA92 environment. It is advantageous to use this patch where customers are simply upgrading to VMS V5.4 and do not require the new device support or bug fixes of SDD V1.5. It is also relatively cheap to apply the patch as it is small, less than 15 blocks on the customer system, thus it can be done quickly during a site visit or remotely, or even, if required, by the customer himself. Capability: This patch enables the existing VAXsimPLUS V1.2A to work under VMS V5.4 systems. It adds NO new functionality or device support (RA70/80/81/82/90 RF30/71 TA90) It fixes NO outstanding V1.2A bugs (TA90, AutoCOPY problems). If new functionality is required, then VAXsimPLUS V1.5 should be used. The patch may be applied pro-actively to VMS V5.x systems that will upgrade to V5.4 Due to its small size it can be installed remotely if required. Operation: This patch converts VMS V5.4 error log entries back to V5.2/5.3 format so that correct analysis may take place. The VAXsimPLUS monitor and the VAXsim/MERGE command support V5.4 without problem. Only the VAXsim/Fault_Manager cannot handle the error log records. 
The patch installs 1 image and a CLD file to provide a utility that
translates the temporary error log file used by VAXsimPLUS. It also
updates VAXsim$NOTIFY.COM to invoke the new image.

Pre-requisites:
All that is required is an existing installation of VAXsimPLUS V1.2A.

Installation:
Installation may be either via a standard VMS installation or remote.
The full VMSINSTAL kit is 30 blocks, which is small enough to be
downline loaded. Alternatively the kit may be broken out and just the
necessary files transferred. The installation checks for the presence
of SDD and that the patch has not already been applied. It asks no
questions and will complete the patch in less than 5 minutes. The kit
can be left on the customer site if desired, as it contains no
proprietary information.

Support:
All support issues/problems should be directed to ASDPO via the hotline
(DTN 828-5111) for priority 1 and 2 problems.

Documentation:
Documentation is bundled within the kit itself. It is not left on-site
when installation completes. A copy is appended to this message.

Availability:
The kit is available now from
CLARID::ASDPO1$:[BELL.PUBLIC.SDD]SDD_PATCH054.A

Acknowledgements:
I am indebted to Chris Loane for his dedication in providing all of us
with the utility to convert V5.4 error logs to V5.2/5.3 format and
making it work for SDD V1.2A.

VAXsimPLUS V1.2A Patch Kit
==========================
Use this kit in 2 ways :
- standard VMSINSTAL
- downline load

VMSINSTAL
=========
Just type
$ @SYS$UPDATE:VMSINSTAL SDD_PATCH054 directory_spec
This will do the work and install the images in the proper place.
Then check that everything is OK; refer to the CHECK OUT section below.

DOWNLINE LOAD
=============
Break up the kit and transfer the following files to the SDD$EXE
directory :
XLATERR.CLD
XLATERR.EXE
DO_PATCH.COM
To install, just execute DO_PATCH to update the VAXSIM$NOTIFY file :
$ @DO_PATCH

CHECK OUT
=========
Three new files should be created in SDD$EXE :
XLATERR.CLD
XLATERR.EXE
VAXSIM$NOTIFY.COM
There may also be a copy of DO_PATCH.COM if you downline loaded.
Do a DIFF of VAXSIM$NOTIFY.COM (there should be at least 2 versions).
You should see a single difference with the lines
$ SET COMMAND XLATERR.CLD
$ XLATE/INPUT='REPORT'.SYS/OUTPUT='REPORT'.SYS
If it's there, then all is OK.

ERRORS
======
Installation will check for the presence of the SDD$EXE logical. If it
does not exist then installation will fail with
SDD-E-NOTFND Cannot find SDD$EXE directory
Next it will check to see if XLATERR.CLD has already been installed in
SDD$EXE. If so, it assumes that the patch has been done and will not do
it again.
If installation fails in any other way, remove the 2 XLATERR.* files
from SDD$EXE and start again.
| |||||
| 55.11 | How to patch 1.2A for 5.4 | KERNEL::MCGAUGHRIN | What a Marvelous Delivery | Wed Dec 05 1990 09:27 | 49 |
<<< KERNEL::DISK$APD1:[NOTES$LIBRARY]CSCUK_DEVICES.NOTE;1 >>>
-< cscuk_devices >-
================================================================================
Note 96.1 GOOD NEWS FOR VMS 5.4.... 1 of 1
KERNEL::MCGAUGHRIN "What a Marvelous Delivery" 41 lines 4-DEC-1990 17:20
-< How to Patch V1.2A for VMS 5.4 >-
--------------------------------------------------------------------------------
This note explains how to patch VAXsimPLUS V1.2A remotely
so that it will continue to work once a customer has upgraded
VMS to V5.4.
You should have read the previous note before attempting this,
because you should be very clear about what the patch actually
does and, more importantly, does not do!
You will find all the relevant files in rsds$spear, and there
is also a README.TXT which you may wish to browse through
before going ahead.
There are THREE files which need to be installed. As usual,
you should make a connection to the customer's machine and
log in to the FIELD account or another privileged account.
REM> tr
CUST> set proc/priv=all
CUST> set def sdd$exe
REM> RFT
RFT> send rsds$spear:xlaterr.cld sdd$exe:xlaterr.cld
RFT> send rsds$spear:xlaterr.exe sdd$exe:xlaterr.exe
RFT> send rsds$spear:do_patch.com sdd$exe:do_patch.com
REM> tr
CUST> @DO_PATCH.COM
That is it! It should not take any more than a few minutes.
You can then check to see whether a new version of the file
VAXSIM$NOTIFY.COM has been created; if so, then all is OK.
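The check can be done from the customer session with standard DCL commands. The following is only a sketch, assuming the file names given in these notes and that SDD$EXE is defined; the exact output will vary from site to site.

```dcl
$ ! List all versions of the notify procedure; a patched system
$ ! should show at least two.
$ DIRECTORY/DATE SDD$EXE:VAXSIM$NOTIFY.COM;*
$ ! With only one file specified, DIFFERENCES compares the latest
$ ! version against the next lower version.
$ DIFFERENCES SDD$EXE:VAXSIM$NOTIFY.COM
$ ! Confirm that the newest version now refers to the XLATE utility.
$ SEARCH SDD$EXE:VAXSIM$NOTIFY.COM "XLATE"
```

If the SEARCH finds the SET COMMAND XLATERR.CLD and XLATE lines described in the previous note, the patch has been applied.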
| |||||
| 55.12 | SDD V1.5-127 RELEASED | KERNEL::MCGAUGHRIN | What a Marvelous Delivery | Tue Mar 26 1991 13:39 | 17 |
The latest "final V1.5 kit" has been placed with the rest of the ISDS
kits in the following location :
CLARID::ISDS003:[CUST.SDD015]
This kit will be sent to SSB on Monday 25-MAR-1991 for manufacture.
Before copying, read the READ_ME_FIRST file as it supplements the
release notes.
If you have any queries, I'll be out for the next 2 weeks ! :-)
Regards,
David Bell
| |||||
| 55.13 | SDD V1.5 LIMITATIONS | KERNEL::MCGAUGHRIN | What a Marvelous Delivery | Tue Mar 26 1991 13:40 | 30 |
IMPORTANT-IMPORTANT-IMPORTANT-IMPORTANT-IMPORTANT-IMPORTANT
-----------------------------------------------------------
This SDD kit is currently the V1.5-127 kit. It fixes several problems
related to the password generation on the SDD$MANAGER account.
Note very carefully the following limitations :
- VAX 9000 SPU and memory errors are not always handled correctly.
  Engineering says to rely on the SPU information.
- Not all VAX 4000 CPU errors are reported. There will be a patch to
  fix this in the future.
- When upgrading a V1.2A site it is recommended that the
  VAXsim$CLUSTER.DAT database be deleted before upgrading, followed by
  VAXsim$LOADER to reload information. BUT remember to save a copy of
  the mail list if you do this, as it is held in this database!
If you have any questions concerning SDD, please contact me.
David Bell @VBO
DTN: 828-5502
CLARID::BELL
[email protected]
| |||||
| 55.14 | [1.15.8.10] Are they injected errors? | KERNEL::MCGAUGHRIN | What a Marvelous Delivery | Thu May 16 1991 17:35 | 65 |
It is important that you are able to recognise whether a drive has
had errors injected onto it. Otherwise you could be sending engineers
to site unnecessarily. This reply explains what to look for.
1. INJECTERR
INJECTERR is a program written in Macro to inject DSA disk errors
into a System Errorlog primarily for demonstrating the SDD Tools
Kit functionality. It can also be used to verify correct operation
of the Tools Kit after installation.
It should be noted that INJECTERR will leave errors in the System
Errorlog file, but they will NOT show up in the output of the SHOW ERROR
command (basically, it doesn't increment the device's UCB error
count field).
Set-up :
INJECTERR resides in the SDD$EXE directory with the SDD Tools.
Before running INJECTERR you must set up your command tables by
typing the following :
$ SET COMMAND SDD$EXE:INJECTERR.CLD
Use :
The full format of the command to inject errors is as follows:
$ INJECT $2$DUA7/AFTER=0:0:10/DELAY=0:0:30/COUNT=6/CLASS=MEDIA/BELL
2. SDD FUNCTIONS
Once errors have been injected and the drive's monitor threshold has been
exceeded (warning condition), VAXsimPLUS triggers. It then collects the
Errorlog data, analyses that data for a failure mode (theory), and if one
is detected it sends mail.
If you inject MEDIA errors, the VAXsimPLUS theory received will be [1.15.8.10].
3. DSNlink (AES - SISR)
If an engineer wishes to test that both VAXsimPLUS and SISR are
working correctly, he can use INJECTERR to do this.
Typically he will inject MEDIA errors. This will cause mail to be sent
to those on the mailing lists. If DSNlink (AES) is installed, it will
also attempt to log a request into NICE at the CSC.
The Problem Description in the NICE request will include the VAXsimPLUS
Event Code [1.15.8.10].
4. How to spot INJECTED errors
If you are dealing with *ANY* VAXsimPLUS event code of [1.15.8.10], then
you should be able to recognise whether these errors have been injected
or whether they are real. The tool INJECTERR was specifically designed so
that you CAN recognise injected errors. The LBN to which these
errors are targeted is 48879, which is BEEF in hexadecimal (for RF**
drives, check the Command Reference Number). You can also check the time
span of these errors from the mail messages and determine whether it is
likely that these have been injected.
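The decimal-to-hex conversion can be confirmed at the DCL prompt. This is a minimal sketch using the standard F$FAO lexical function and the %X radix prefix; the symbol name lbn is purely illustrative, and the value would be the LBN taken from the errorlog entry or mail message.

```dcl
$ ! LBN as reported in decimal
$ lbn = 48879
$ ! The !XL directive formats a longword as 8 hexadecimal digits,
$ ! so this displays 0000BEEF
$ WRITE SYS$OUTPUT F$FAO("!XL", lbn)
$ ! Compare a reported LBN directly against the marker value
$ IF lbn .EQ. %XBEEF THEN WRITE SYS$OUTPUT "LBN matches the INJECTERR marker"
```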
| |||||
| 55.15 | Operator Requested Shutdown SICL calls | KERNEL::LOANE | Comfortably numb!! | Wed Jul 14 1993 12:22 | 35 |
Recently, calls logged via AES have indicated Operator Requested
Shutdowns. These come in 2 flavours: calls logged by CLUE and calls
logged by VAXsimPlus. Could I ask you to forward this mail to
whoever you believe may get one of these calls, so we can gather
information to discover the root cause.
********************************************************************
Firstly, calls logged via CLUE.
CLUE is installed (optionally) with SDD and with AES. However, in
Europe, the ability for CLUE to log calls via AES was turned off:
the file SDD$EXE:CLUE$STARTUP.COM has the following line commented out:-
$ !define/system/exec clue$log_call TRUE
I believe that this line is being un-commented, in which case the
CLUE$STARTUP.COM procedure will call SDD$EXE:CLUE$LOG_CALL.COM.
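To see which state a given system is in, something like the following could be used. It is a sketch using standard DCL commands only, and it assumes the file and logical names quoted above.

```dcl
$ ! Show the define line in the startup procedure. If the "!" after
$ ! the "$" is missing, the line has been un-commented.
$ SEARCH SDD$EXE:CLUE$STARTUP.COM "clue$log_call"
$ ! Show whether the logical is currently defined in the system table
$ SHOW LOGICAL/SYSTEM CLUE$LOG_CALL
```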
********************************************************************
Secondly, calls logged via VAXsimPlus.
We have experienced problems with VAXsimPlus logging SysBugChk
and/or CrshRstrt events, where the errorlogs supplied have nothing
to do with the event (typically, the errorlogs contain some Bus
errors). To provide a workaround to the problem, a `patch' was
written for the file SDD$EXE:FMGR$BUILD_SISR.COM. This involves
adding the lines:-
$ melselect = "/entry=(40,37,2,112)" ![CL] Check SysBugChk etc
$ gosub melcount ![CL] Count those records
$ if rec_ltd .gt. 0 then goto gotmel ![CL] Got records, melselect ok
I'd be interested in knowing if this (patched) version of the file
is on the system logging Operator Requested shutdowns and also,
what version of VAXsimPlus is installed (typing VAXSIM at the DCL
prompt will tell you).
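One way to gather both pieces of information on the system concerned is sketched below. It assumes the patch lines still carry the "[CL]" comment marker shown above.

```dcl
$ ! Lines added by the patch are tagged with "[CL]"; if nothing is
$ ! found, the file is probably the unpatched version
$ SEARCH SDD$EXE:FMGR$BUILD_SISR.COM "[CL]"
$ ! Typing VAXSIM at the DCL prompt reports the installed version
$ VAXSIM
```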
Cheers
Chris
| |||||