
Conference spezko::cluster

Title:+ OpenVMS Clusters - The best clusters in the world! +
Notice:This conference is COMPANY CONFIDENTIAL. See #1.3
Moderator:PROXY::MOORE
Created:Fri Aug 26 1988
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:5320
Total number of notes:23384

5315.0. "Satellites boot slowly after 5.5-2 to 6.2 upgrade" by TAV02::GALIA (Galia Reznik, Israel Software Support) Tue May 27 1997 07:12

    Hi,
    
    Our customer has a large cluster containing the following (among
    other things):
    
    1. VAX 6xxx + VAX 7xxx, CI and DSSI, OpenVMS V5.5-2
    2. About 40 satellites: about 10 of them have a V5.5-2 system disk
       and about 30 have a V6.2 system disk; all of them get boot services
       from the 6xxx and 7xxx.
    3. DECnet IV
    
    A few weeks ago, all systems mentioned above were V5.5-2 and the 
    satellites booted fine: there was one connection message and the 
    satellite continued booting.
    The system disk of those 30 satellites was then upgraded to V6.2,
    and the other 10 were left at V5.5-2. Those left at V5.5-2 are fine.
    Those upgraded to V6.2 have the following problem:
    When a satellite starts booting, it goes through half-an-hour of
    'lost connection' and 'regained connection' until it finally boots.
    The startup itself is quick - the problem is while getting the boot
    services. 
    Satellites' NCP entry on the boot server (V5.5-2) looks like this:
    
    Hardware address      = <satellite's h/w address>
    Tertiary loader       = <V6.2 TERTIARY_VMB.EXE on the satellite's system>
    Load Assist Agent     = SYS$SHARE:NISCS_LAA.EXE on V5.5-2 boot server system
    Load Assist Parameter = V6.2 system disk name and root
    
    This V6.2 system disk is connected to HSC, just as it was with V5.5-2.
    
    There is one interesting fact, though: one of those 30 satellites that
    boot from V6.2 is a VAXstation 4000-90. It still boots fine. We could not
    find any differences in configuration between it and the others.
    It is connected exactly like the others and is physically near them,
    yet the others do not boot fine. The only difference is that it
    is a 4000-90 and the others are 4000-60s and 3100s. I don't feel this
    should matter that much.
    
    I'm aware of mixing V5.5-2 and V6.2 in one cluster and getting mixed
    boot services,  but is there anything else?
    
    Galia Reznik,
    MCS, Israel.
T.R    Title    User    Personal Name    Date    Lines
5315.1. "Questions, No Answers..." by XDELTA::HOFFMAN (Steve, OpenVMS Engineering) Tue May 27 1997 12:53 (23 lines)

   Ok, given the symptoms and the messages, it would appear that
   DECnet MOP (I assume they're using DECnet MOP and not LANCP MOP)
   is fine and there is a problem connecting with the disk services.

   What are the allocation class settings for the hosts and CI widgets,
   what is the general network configuration, and what sorts of
   system and network errors are present in the error log and in the
   network counters?

   Swap one of the slow-booting nodes onto the network connector
   used for the VAXstation 4000-90, and see whether this is related
   to a problem on the network cabling used by (most) satellite nodes.
   (If the bootstrap goes quickly, start looking at the cabling...)

   Also, run several passes of AUTOGEN with FEEDBACK on the core nodes,
   to try to clear up any system resource limitations (and specifically
   things like an underconfigured non-paged pool) that might exist on
   the disk servers.

   Also see what might have changed in the configuration during the
   upgrade. This looks like it might be hardware, so find out whether
   any changes were made to the cabling...
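
   The checks suggested above could be run with commands along the
   following lines. This is only an illustrative sketch: it assumes
   DECnet Phase IV (NCP), a suitably privileged account on each boot
   server, and that the commands are run on the V5.5-2 disk servers.
   The choice of /SINCE=YESTERDAY is an arbitrary example.

```dcl
$! Allocation class of this host (compare across all disk servers)
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> SHOW ALLOCLASS
SYSGEN> EXIT
$!
$! DECnet line and circuit counters: look for non-zero error counts
$ MCR NCP SHOW KNOWN LINES COUNTERS
$ MCR NCP SHOW KNOWN CIRCUITS COUNTERS
$!
$! Recent error log entries
$ ANALYZE/ERROR_LOG/SINCE=YESTERDAY
$!
$! Non-paged pool usage on the disk server
$ SHOW MEMORY/POOL/FULL
$!
$! One AUTOGEN pass with feedback (repeat after the system has run
$! under a typical load; new values apply at the next reboot)
$ @SYS$UPDATE:AUTOGEN SAVPARAMS SETPARAMS FEEDBACK
```

   Comparing the counters before and after a slow satellite boot makes
   it easier to tell whether errors accumulate during the boot itself.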

5315.2. by TAV02::GALIA (Galia Reznik, Israel Software Support) Sun Jun 01 1997 10:13 (15 lines)

    Hi,
    
    Finally all checks were completed, but no good news:
    Alloclass is ok everywhere, no network errors, no non-zero counters on
    lines, no cabling problem and no system resource problems. That 4000-90
    is quick in any situation - booting from 5.5-2 or from 6.2, even from a
    6.2 root of a slow-booting system. The slow-booting systems are slow
    even if booted from 4000-90 cabling. 
    The only difference I see is that the 4000-90 has ISA-0 and the others
    have SVA-0. Is there a problem booting a satellite with SVA-0 from a V6.2
    system disk using V5.5-2 MOP services? (As described in .0, when the
    system disk was V5.5-2, the boot was ok.)
    
    Thanks,
    Galia.