
Conference decwet::hsm-4-unix

Title: HSM for UNIX Platforms
Notice: Kit Info in note 2.1 -- Public Info Pointer in 3.1
Moderator: DECWET::TRESSEL
Created: Fri Jul 08 1994
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 238
Total number of notes: 998

225.0. "RWZ52 support?" by MARVEL::NICHOLSONA (rackin frackin varmit) Tue Dec 17 1996 01:45

225.1. "rw531 after all but still not working" by MARVEL::NICHOLSONA (rackin frackin varmit) Tue Dec 17 1996 08:59 (10 lines)
225.2. by DECWET::TRESSEL (Pat Tressel) Wed Dec 18 1996 15:45 (70 lines)
225.3. "here is the right stuff" by MARVEL::NICHOLSONA (rackin frackin varmit) Mon Dec 23 1996 04:54 (400 lines)
225.4. "Bad platters?" by DECWET::TRESSEL (Pat Tressel) Thu Jan 02 1997 23:10 (116 lines)

225.5. "system now crashing when jukebox attached" by KERNEL::NICHOLSONA (one suite too many can cause truth decay) Thu Feb 27 1997 03:27 (339 lines)
    
    
    Hi,
    
    
    Customer did have some hardware problems with the jukebox. The jukebox
    was sent away and now has come back. When they boot the system up with
    the jukebox powered on the system now crashes. Only if the system is
    booted up and the jukebox is powered off does the system remain up.
    If the jukebox is powered up after startup has completed the system
    does not crash.
    
    I have advised the customer now that I think the best way forward is to
    start from scratch and reinstall cam layered products and then hsm. 
    
    What do you think?
    
    At least then we would be starting from a known place.
    I am leaving the company tomorrow so if you could reply on the
    conference that would be great. 
    
    Below is a section of the crash-data file..
    
    
    _crash_data_collection_time: Thu Feb 13 18:29:34 GMT 1997
    _current_directory: /
    _crash_kernel: /var/adm/crash/vmunix.15
    _crash_core: /var/adm/crash/vmcore.15
    _crash_arch: alpha
    _crash_os: Dec 08
    Dec 08
    Dec 08
    Dec 08
    Dec 08
    DECnet/OSI for
    DECnet/OSI for
    Digital UNIX
    _host_version: Dec 08 00:00 OSF1 V3.2 (Rev. 214.61) 
    Dec 08 00:00 OSF1 V3.2 (Rev. 214.61) 
    Dec 08 00:00 OSF1 V3.2 (Rev. 214.61) 
    Dec 08 10:09 OSF1 V3.2 (Rev. 214.61) 
    Dec 08 10:09 OSF1 V3.2 (Rev. 214.61) 
    DECnet/OSI for Digital UNIX V3.2A-0 (Rev. 23.19); Fri Sep 15 13:21:53
    EDT 1995
    DECnet/OSI for Digital UNIX V3.2A-0 (Rev. 23.19); Fri Sep 15 13:21:53
    EDT 1995
    Digital UNIX V3.2D-1 (Rev. 41); Wed Jan 15 19:53:18 GMT 1997 
    _crash_version: Dec 08 00:00 OSF1 V3.2 (Rev. 214.61) 
    Dec 08 00:00 OSF1 V3.2 (Rev. 214.61) 
    Dec 08 00:00 OSF1 V3.2 (Rev. 214.61) 
    Dec 08 10:09 OSF1 V3.2 (Rev. 214.61) 
    Dec 08 10:09 OSF1 V3.2 (Rev. 214.61) 
    DECnet/OSI for Digital UNIX V3.2A-0 (Rev. 23.19); Fri Sep 15 13:21:53
    EDT 1995
    DECnet/OSI for Digital UNIX V3.2A-0 (Rev. 23.19); Fri Sep 15 13:21:53
    EDT 1995
    Digital UNIX V3.2D-1 (Rev. 41); Wed Jan 15 19:53:18 GMT 1997 
    
    _crashtime:  struct {
        tv_sec = 855857203
        tv_usec = 914609
    } 
    _boottime:  struct {
        tv_sec = 855857032
        tv_usec = 609024
    } 
    _config:  struct {
        sysname = "OSF1"
        nodename = "kofim8"
        release = "V3.2"
        version = "41"
        machine = "alpha"
    } 
    _cpu:  35 
    _system_string:  0xffffffffff801048 = "AlphaServer 2100 4/200" 
    _ncpus:  1 
    _avail_cpus:  1 
    _partial_dump:  1 
    _physmem(MBytes):  127 
    _panic_string:  0xfffffc000067b5a0 = "bread: size 0" 
    _paniccpu:  0 
    _panic_thread:  0xfffffc0004dddb80 
    _preserved_message_buffer_begin: 
    struct {
        msg_magic = 0x63061
        msg_bufx = 0x968
        msg_bufr = 0x86f
        msg_bufc = "PCXAL keyboard, language English (American)
    
    Alpha boot: available memory from 0xa44000 to 0x7ffe000
    Digital UNIX V3.2D-1 (Rev. 41); Wed Jan 15 19:53:18 GMT 1997 
    physical memory = 128.00 megabytes.
    available memory = 117.78 megabytes.
    using 484 buffers containing 3.78 megabytes of memory
    Firmware revision: 3.9
    PALcode: OSF version 1.35
    ibus0 at nexus
    AlphaServer 2100 4/200
    cpu 0 EV-4s 1mb b-cache
    gpc0 at ibus0
    pci0 at ibus0 slot 0
    tu0: DECchip 21040-AA: Revision: 2.3
    tu0 at pci0 slot 0
    tu0: DEC TULIP Ethernet Interface, hardware address: 08-00-2B-E2-68-05
    tu0: auto sensing: selected AUI (10Base2|5) port
    psiop0 at pci0 slot 1
    Loading SIOP: script 1001b00, reg 81000000, data 40759a08
    scsi0 at psiop0 slot 0
    rz0 at scsi0 bus 0 target 0 lun 0 (DEC     RZ28     (C) DEC D41C)
    rz1 at scsi0 bus 0 target 1 lun 0 (DEC     RZ26L    (C) DEC 440C)
    rz2 at scsi0 bus 0 target 2 lun 0 (DEC     RZ26L    (C) DEC 440C)
    rz3 at scsi0 bus 0 target 3 lun 0 (DEC     RZ26L    (C) DEC 440C)
    rz4 at scsi0 bus 0 target 4 lun 0 (DEC     RZ26L    (C) DEC 440C)
    rz5 at scsi0 bus 0 target 5 lun 0 (DEC     RZ29B    (C) DEC 0014)
    rz6 at scsi0 bus 0 target 6 lun 0 (DEC     RRD43   (C) DEC  1084)
    eisa0 at pci0
    ace0 at eisa0
    ace1 at eisa0
    lp0 at eisa0
    fdi0 at eisa0
    fd0 at fdi0 unit 0
    vga0 at eisa0
     1024x768 (QVision )
    aha0 at eisa0 slot 3
    scsi1 at aha0
    lvm0: configured.
    lvm1: configured.
    dli: configured
    SuperLAT. Copyright 1993 Meridian Technology Corp. All rights reserved.
    datalink: links=64, macs=6
    knbinit: sessions=256, names=64
    knbtcp: configured
    knbtcpd: configured
    knbadm configured
    nbeadmin_configure
    netbeuid_configure
    netbeui_configure
    x25_access: configured
    x25_ip: configured
    x25_relay: configured
    wandd_base: configured
    wandd_llc2: configured
    wan_utilities: configured
    ctf_base: configured
    Node ID is 08-00-2b-e2-68-05 (from device tu0)
    dna_netman: configured
    dna_dli: configured
    ADVFS: using 1152 buffers containing 9.00 megabytes of memory
    vm_swap_init: warning /sbin/swapdefault swap device not found
    vm_swap_init: in swap over commitment mode
    Node UID is c390bf00-85cb-11d0-8008-08002be26805
    dna_base: configured
    dna_rfc1006: configured
    dna_xti: configured
    panic (cpu 0): bread: size 0
    syncing disks... done
    device string for dump = SCSI 0 1 0 1 100 0 0 .
    DUMP.prom: dev SCSI 0 1 0 1 100 0 0 , block 131072
    device string for dump = SCSI 0 1 0 1 100 0 0 .
    DUMP.prom: dev SCSI 0 1 0 1 100 0 0 , block 131072
    "
    } 
    _preserved_message_buffer_end: 
    _kernel_process_status_begin: 
      PID   COMM
    00000   kernel idle
    00001   init
    00008   kloadsrv
    00038   update
    01066   knblink
    01068   sendmail
    01074   dllink
    01117   timed
    01172   mold
    01175   internet_mom
    01184   snmp_pe
    01190   inetd
    01195   cron
    01223   pwlic.reg
    01229   lpd
    01295   jmd
    01299   qr
    01326   coolsrvr
    01334   xdm
    01340   jmd
    01342   jmd
    01343   oss_exec
    01345   getty
    01352   nbelink
    01354   Xdec
    01364   pwalrtr
    01368   xdm
    01372   lmx.ctrl
    01383   getty
    01384   getty
    01385   getty
    01386   getty
    01387   getty
    01389   getty
    01392   getty
    01394   getty
    01395   getty
    01397   getty
    01398   getty
    01399   getty
    01401   getty
    01402   getty
    01405   getty
    01406   getty
    01407   getty
    01409   getty
    01410   getty
    01411   getty
    01412   getty
    01413   getty
    01414   getty
    01415   getty
    01416   getty
    01424   getty
    01441   Xsetup_0
    01451   dxconsole
    00670   usd
    00740   syslogd
    00742   binlogd
    00769   gated
    00839   named
    00872   portmap
    00874   mountd
    00876   nfsd
    00878   nfsiod
    00879   nfsiod
    00880   nfsiod
    00881   nfsiod
    00882   nfsiod
    00883   nfsiod
    00884   nfsiod
    00890   automount
    00912   dnalimd
    00915   dnaevld
    00922   ctfd
    00952   dnascd
    00953   dnansd
    00954   dnaksd
    00958   dnsadv
    00962   dtssd
    00965   dnanoded
    00976   dnamopd
    00986   rfc1006d
    01000   osaknmd
    01005   ftam_listener
    01007   ftam_listener
    _kernel_process_status_end: 
    _current_pid:  1340 
    _current_tid:  0xfffffc0004dddb80 
    _proc_thread_list_begin: 
    thread 0xfffffc0004dddb80 stopped at  [boot:1730 ,0xfffffc00004800cc]  Source not available
    _proc_thread_list_end: 
    _dump_begin: 
    >  0 boot(0x0, 0x0, 0xfffffc000067b5a0, 0xffffffffffffffff, 0xffffffff88b8f0a0)
         ["../../../../src/kernel/arch/alpha/machdep.c":1730, 0xfffffc00004800cc]
    
       1 panic(s = 0xfffffc000067b5a0 = "bread: size 0")
         ["../../../../src/kernel/bsd/subr_prf.c":757, 0xfffffc000043f3b4]
    pcpu = 0x2fa
    i = 0
    bootopt = 7606120
    mycpu = 4151676
    spl = 0
    prevcc = 18446739675667192340
    nextcc = 18446739675783402496
    timer = 1
    limit = 0
    
       2 bread(vp = 0xfffffc0005273c00, blkno = 0, size = 0, cred = 0xffffffffffffffff, bpp = 0xffffffff88b8f0a0)
         ["../../../../src/kernel/vfs/vfs_bio.c":393, 0xfffffc000044d46c]
    bp = (nil)
    error = -2144736992
    metadatatype = 0
    
       3 blkatoff(0x8052000006c1, 0x0, 0x0, 0xffffffff88b8f1d8, 0xffffffff88b8f250)
         ["../../../../src/kernel/ufs/ufs_lookup.c":1332, 0xfffffc0000270e34]
    
       4 scandir(0xfffffc0005273c00, 0x0, 0x0, 0xfffffc0000212270, 0x0)
         ["../../../../src/kernel/ufs/ufs_lookup.c":530, 0xfffffc000026fd34]
    
       5 ufs_lookup(0xfffffc0005273c00, 0xffffffff88b8f550, 0x3d, 0xfffffc0000295610, 0xfffffc0000295988)
         ["../../../../src/kernel/ufs/ufs_lookup.c":386, 0xfffffc000026fab0]
    
       6 namei(0xfffffc00006d61a0, 0xfffffc0000000001, 0xfffffc0005273c00, 0xfffffc0000000001, 0xfffffc0000299cf4)
         ["../../../../src/kernel/vfs/vfs_lookup.c":563, 0xfffffc00002954fc]
    
       7 vn_open(0xffffffff88b8f720, 0x1001, 0x0, 0x633231706f2f7665, 0xfffffc000048eaac)
         ["../../../../src/kernel/vfs/vfs_vnops.c":515, 0xfffffc0000298ee0]
    
       8 copen(0xfffffc0004ddd210, 0xffffffff88b8f8c8, 0xffffffff88b8f8b8, 0x0, 0x40289374bc6a7efa)
         ["../../../../src/kernel/vfs/vfs_syscalls.c":1824, 0xfffffc0000454f98]
    
       9 open(0xffffffff88b8f8b8, 0x0, 0x40289374bc6a7efa, 0xfffffc000048e4d4, 0xfffffc000048d8d8)
         ["../../../../src/kernel/vfs/vfs_syscalls.c":1781, 0xfffffc0000454ebc]
    
      10 syscall(0x140019228, 0x1, 0x0, 0x0, 0x2d)
         ["../../../../src/kernel/arch/alpha/syscall_trap.c":519, 0xfffffc000048d8d4]
    
      11 _Xsyscall(0x8, 0x1200477c8, 0x1400276e0, 0x11ffff258, 0x0)
         ["../../../../src/kernel/arch/alpha/locore.s":1094, 0xfffffc000047d024]
    
    _dump_end: 
    
    
    Anyway thanks for all the help and advice you have given me.
    
    Many thanks
    
    Avril
225.6. by DECWET::TRESSEL (Pat Tressel) Thu Feb 27 1997 16:18 (84 lines)
Avril --

> I am leaving the company tomorrow

I wish you well in your future endeavors!

-- Pat

-------------------------------------------------------------------------------

About the case:

>    Customer did have some hardware problems with the jukebox. The jukebox
>    was sent away and now has come back.

Sounds like it's worse than it was before...  What was done to the jukebox?
Is there any way a different jukebox (preferably one known to be working)
could be brought to the customer site for testing?

>    When they boot the system up with
>    the jukebox powered on the system now crashes. Only if the system is
>    booted up and the jukebox is powered off does the system remain up.
>    If the jukebox is powered up after startup has completed the system
>    does not crash.

The crash info shows the jmd trying to open something, probably on one of
the optical drives, and probably while trying to inventory the jukebox.
So the fact that the system stays up if the jukebox is turned on later is
only because the jmd decided there was no jukebox.  It won't go back and
try to look for, and inventory, a newly turned on jukebox, unless it's
restarted.  So if they restart HSM after the jukebox is turned on, I'd
presume the system would crash then.
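
One value in the trace in .5 fits the optical-drive guess. The fourth value shown for vn_open, 0x633231706f2f7665, may or may not be a real argument, but read as ASCII bytes with the lowest byte first (Alpha is little-endian) it spells part of a device name. A minimal sketch of the decoding in POSIX sh; the function name is made up for illustration:

```shell
# decode_le_ascii HEX
# Print the bytes of a hex value as ASCII characters, lowest byte first.
# Assumes an even number of hex digits and printable byte values.
decode_le_ascii() {
    hex=${1#0x}
    out=
    while [ "${#hex}" -ge 2 ]; do
        rest=${hex%??}          # everything except the last two digits
        byte=${hex#"$rest"}     # the last two digits = lowest remaining byte
        hex=$rest
        # turn the byte into an octal escape and let printf emit the character
        out=$out$(printf "\\$(printf '%03o' "0x$byte")")
    done
    printf '%s\n' "$out"
}

decode_le_ascii 0x633231706f2f7665    # prints: ev/op12c
```

The result, ev/op12c, looks like the tail of a name such as /dev/op12c, presumably the block-device counterpart of the /dev/rop12c seen earlier in this topic. That is only a hint, not proof of what was being opened.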

The crash appears to have happened because function bread was passed a 0
size argument -- it checks for this, and panics if it gets it.

The stack trace shown in the crash info looks wrong -- there are impossible
arguments in some of the calls, for instance, the open and copen calls
should both have the proc pointer in the first argument...but their first
arguments are not the same.  Also, the open flag appears to be 1, O_WRONLY,
which the jmd does not use.  So maybe the stack wasn't valid at the time
of the dump.  Or I'm confused about what it means...  I don't have access
to v3.2d-1 sources, but I looked at both v3.2c and v3.2f sources.

There are several possibilities:

1) The jukebox is still sick.  Although the stack trace shows the panic
   happening before the open gets to the point of trying to access the
   device, device-related things may have happened earlier, that messed up
   internal data structures or some such thing, and this panic is a
   consequence of that previous damage.

2) Their vmunix is corrupted.  Why don't they rebuild their kernel, and
   compare the new and old versions?  They should make sure the rebuild
   succeeds.  They can rebuild by doing:

     doconfig -c XXXX

   where XXXX is the name of their machine in uppercase.  They should
   answer no when doconfig asks them if they want to edit the configuration
   file.

   > I have advised the customer now that I think the best way forward is to
   > start from scratch and reinstall cam layered products and then hsm. 

   I don't see anything HSM-specific in this case, but if they did reinstall,
   they'd rebuild the kernel along the way...  It might confuse the issue
   to do both at once, though.

3) There's a bug in the OS.  I didn't see any patches in the v3.2d-1 patch
   set that mention a "bread: size 0" panic, but not all patches give their
   precise symptoms.  If a kernel rebuild doesn't help, it may be necessary
   to get help from the Unix group, to look at the crash dump.

4) Something else that talks to the device is incompatible with it.
   What SCSI adapter are they using on the bus the jukebox is on?  Are
   any other devices on the same bus?
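
For possibility 2, the argument to doconfig -c is the machine name in uppercase, as described there. A small sketch for deriving it from the host name; config_name is a made-up helper, and dropping a domain suffix is an assumption in case hostname returns a fully qualified name:

```shell
# config_name HOST
# Print the kernel configuration name for HOST: the short host name
# (anything from the first dot onward dropped) converted to uppercase.
config_name() {
    printf '%s\n' "${1%%.*}" |
        tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
}

config_name kofim8        # prints: KOFIM8
# On the customer system, something like:
#   doconfig -c "$(config_name "$(hostname)")"
```

The nodename in the crash data in .5 is kofim8, so the configuration name would presumably be KOFIM8.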

But before we get to these second-level possibilities, the first approach
would be to replace or repair the jukebox...after all, it *was* getting
those huge numbers of errors, and the behavior *did* change when it was
"fixed".

-- Pat
225.7. by DECWET::TRESSEL (Pat Tressel) Thu Feb 27 1997 22:04 (138 lines)
In order to prevent the system from crashing, another thing they can do
instead of powering off the jukebox is to take HSM out of the system startup.

To not start HSM during system startup, remove the link to its startup
procedure:

  rm /sbin/rc3.d/S70hsm

One of the tests I'll suggest below requires that HSM be running, so it
would have to be started by hand later, but we don't want it trying an
inventory when it starts.

To turn off the inventory, change the file /usr/efs/PolycenterHSM.startup to
set the inventory type to "none" by replacing the line:

  /usr/efs/bin/jmd

with

  /usr/efs/bin/jmd -i none
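
The same edit can be made with sed instead of by hand. A minimal sketch, assuming the jmd line in the startup file looks exactly as quoted above (no arguments and nothing else on the line); the function name is made up, and it writes the edited text to standard output so the result can be checked before it replaces the original:

```shell
# jmd_no_inventory FILE
# Print FILE with a bare "/usr/efs/bin/jmd" line changed to
# "/usr/efs/bin/jmd -i none".  Lines where jmd already has arguments,
# and all other lines, are passed through unchanged.
jmd_no_inventory() {
    sed 's|^\([[:space:]]*/usr/efs/bin/jmd\)[[:space:]]*$|\1 -i none|' "$1"
}

# Example (keep the original so the change can be undone):
#   cp /usr/efs/PolycenterHSM.startup /usr/efs/PolycenterHSM.startup.orig
#   jmd_no_inventory /usr/efs/PolycenterHSM.startup.orig > /usr/efs/PolycenterHSM.startup
```

If the real line differs from the one quoted, for instance if it already carries options, the pattern won't match and the file comes out unchanged, so compare the two files afterward.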

-------------------------------------------------------------------------------

Going back to some previous evidence:

    # file /dev/rop13c
    /dev/rop13c:	character special (54/21506) SCSI #1 RWZ52 disk #104 (SCSI ID #5) errors = 0/176 offline
    # file /dev/rop12c
    /dev/rop12c:	character special (54/20482) SCSI #1 RWZ52 disk #96 (SCSI ID #4) errors = 0/263 offline

There are errors on both drives, so whatever the problem is, it's not
drive-specific, i.e. there isn't a "bad" drive.  Anything common to both
drives, and anything on the path between the operating system, which reports
the errors, and the drives is suspect.

This includes the SCSI bus itself, which is why I asked about other devices
on that bus.  Going back to the scu output, I see there's a tape drive on
that bus too -- is the tape drive behaving ok?  It would be a simple test
to replace all the cables and terminators, and see if that makes a difference.
Another thing to try is to temporarily take the tape drive (and anything
else that may have been put on that bus since) off the bus -- have just one
cable straight from the 2100 to the jukebox, and terminate the bus at the
jukebox.  

(I recognize that "testing" is likely to crash the system if the cables
aren't the problem, so they should schedule this and not have users on at
the time.)

The external cables aren't all the cables there are:  There are also cables
inside the jukebox.  We had one case where a jukebox was shipped from the
factory that didn't have its internal cables connected.  So it would be good
to have all the internal connections checked.  Since the problems got worse
after the jukebox was serviced, one obvious thing to check is whether
everything was put back together correctly.

Speaking of putting things back together correctly...  Where were the platters
while the jukebox was being serviced?  Were they all taken out and kept at the
customer site?  If so, were they put back in the same slots, with the same
side up, as before?  (Since the jmd can't inventory the platters, if they're
moved, HSM will have obsolete info as to what platter is in what slot.)
Were they left in the jukebox?  If so, was Field Service aware that the
platters had data on them?  

One thing in common between the drives is their firmware revision level.
But my drives are at the same firmware rev level, though with the slight
difference that mine are pre-release models and didn't have their names
and vendor ids changed to RWZ52 and DEC yet.  I've used them under v3.2d-1
also, but on a different model of processor.  So the problem probably isn't
in the firmware, or the higher levels of Digital Unix.

The lower levels of the drivers (the port drivers) are specific to the type
of SCSI adapter -- a problem here would not be ruled out by the fact that
I've used RWZ52s under v3.2d-1, because I probably don't have the same type
of adapter.  On the other hand, neither 2100s nor RWZ52s are new, so it
would be surprising for something to surface now.

So the first thing to rule out is still a problem with the bus, whether
in or out of the jukebox.

-----------------------------------------------------------------------------

During system startup, the drivers are all told to initialize themselves.
Since the crash happened only after the system was fully up, startup
got past the initialization.  But that doesn't mean it got through without
errors, so...were there any errors reported during startup?  These are
recorded in /var/adm/messages -- look for the section where all the devices
are being listed.  Also during startup, when the bus is reset, the jukebox
should display its revision level on its front panel -- did anyone see this
happen?
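
How long the system stayed up before the panic can be read off the crash-data file in .5, which records _boottime and _crashtime as seconds since the epoch. A trivial sketch of the subtraction in shell arithmetic (the function name is made up):

```shell
# uptime_before_crash BOOT_SEC CRASH_SEC
# Print the number of whole seconds between boot and crash.
uptime_before_crash() {
    echo $(( $2 - $1 ))
}

uptime_before_crash 855857032 855857203    # prints: 171
```

That is a little under three minutes from boot to panic.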

Did anyone notice if HSM is able to load platters into drives?  That would
help us tell whether this was a problem affecting the optical drives only,
or the changer as well.  (If it's *all* the devices, the problem is likely
to be a generic bus problem; if the changer works fine, then either internal
connections in the jukebox, or the port driver / adapter interaction with
the optical drives become more likely.)  We can check whether platters can
be loaded, without having HSM try to read their filesystems.  (If they took
HSM out of the system startup, they'll need to start it by hand, by running
the modified /usr/efs/PolycenterHSM.startup with inventory set to none.)
The following won't work quite right if platters are not in the same slots
as they were originally -- we're going to ask HSM to load platters, so if we
ask it to load something from a slot that used to have a platter in it, and
there's no platter there now, it'll report an error.  But that won't mean
there's any problem with loading, just that the platters aren't in their
expected slots.

It would be best to do the following while near the jukebox, to see if it
is actually doing anything.

Before starting HSM:
Do "file" on both raw optical devices, as before -- both should be offline,
since HSM hasn't been asked to load anything yet.  If one *isn't* offline,
that means a platter was left in a drive somewhere along the way -- this
is going to complicate things, so use the jukebox front panel to unload
the platter.

After HSM is up, do listm -f to get a list of platter names.  Choose some
platter name and use the loadm command to tell HSM to load it, e.g.

  loadm 1004a

Does the jukebox do anything?

Do file on both rop devices again.  Loadm reported which drive it put the
platter in.  That drive should no longer say offline.

Now loadm another platter, which should go in the other drive.  Use file to
check that that drive is no longer offline.
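
The before-and-after checks with file can be scripted. A minimal sketch, assuming file prints one line per device in the form shown in the earlier output (device name, a colon, then a description that ends in "offline" when no platter is loaded). drive_state is a made-up name, and since the wording for a loaded drive isn't shown in this topic, it only reports "offline" or "not offline":

```shell
# drive_state
# Read "file" output on standard input and print, for each character
# special device, its name and whether the line says "offline".
drive_state() {
    awk '/character special/ {
        dev = $1
        sub(/:$/, "", dev)                 # drop the colon after the name
        if ($0 ~ /offline/) state = "offline"
        else                state = "not offline"
        print dev, state
    }'
}

# Example: file /dev/rop12c /dev/rop13c | drive_state
```

Run it once before starting HSM and again after each loadm; the drive that loadm names should change from "offline" to "not offline".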

If the changer operates, the problem is less likely to be external SCSI
cables, or other devices on the bus.  If the changer does not operate,
the problem is less likely to be specific to the optical drives, e.g.
less likely to be their firmware, but we don't really suspect that anyway.
This test doesn't rule out any of the other possibilities.

-- Pat
225.8. "is CLC31a supported in HSM?" by NETRIX::"[email protected]" (decatl::johnson) Tue Apr 01 1997 14:48 (14 lines)

It appears that they may be using CLC31a  (not CLC310)  and HSM 12a.
(The call's work units also seem to confirm this: CLCMC311 and CLCOP311.)

* Has CLC31a been tested with HSM?

	According to the NJA12a spd, only CLC300 and CLC310 is supported.

	
sid johnson  
customer support center/Atlanta

[Posted by WWW Notes gateway]
225.9. "is there still a hardware problem?" by DECWET::TRESSEL (Pat Tressel) Tue Apr 01 1997 15:58 (55 lines)
Sid --

CLC 3.1a is fine for HSM -- it's what most sites are using now.  The changes
between 3.1 and 3.1a were mainly for DUnix v4.0.

> According to the NJA12a spd, only CLC300 and CLC310 is supported.

That's because the spd was last updated before CLC 3.1a was released.
In general, the latest version of CLC should be used, since CLC contains
different copies of the driver for each version of DUnix, and will pick
whichever is appropriate.

The exception is OSMS/OSDS, because those products include drivers that
interact with CLC in the kernel -- they currently ship the particular
version of CLC that they need.

                                * * *

So this problem has not been resolved?

I don't think Avril tried the things in .4 on the repaired jukebox before
leaving -- that might be a good starting point.  In fact, since the repair
may have left the jukebox with odd settings, the whole configuration should
be checked.  So it would also be good to re-do the earlier checks, e.g. to
make sure that the SCSI ids are in the right order, so that platters get
loaded into the correct drives.

I think it's most likely that there is still a hardware problem -- that the
repair didn't fix it.  To see if it is a hardware problem, the simplest thing
would be to start swapping things one at a time, changing the easiest things
first: SCSI cables + terminators, SCSI adapter, the jukebox itself...

Is anything on the same bus with the jukebox?  Does that other thing work?
Could the jukebox be moved to a different bus?

Any way you could get the jukebox connected to a different machine?
How about one with a totally different type of SCSI adapter?  (This last
actually isn't a check for a hardware problem -- it's to see if there might
be a SCSI protocol "difference of opinion" between the Unix port driver and
the jukebox firmware.)

Some (other) non-hardware things to try:

Was the jukebox used for something else previously?  If so, maybe it needs
to be reset to factory default settings.  There should be a "test" through
the front panel that will do this.   Do they have the jukebox manual that
lists these "tests"?  If they don't, Field Service should have it.  (I haven't
found one here, yet.)  Or someone in the Optical group should know.  (The
optical support folks got downsized, and my contact there transferred to
another group.)

Maybe the firmware is messed up and needs to be reloaded?  Field Service,
again, should be able to do this.

-- Pat