
Conference bulova::decw_jan-89_to_nov-90

Title:DECWINDOWS 26-JAN-89 to 29-NOV-90
Notice:See 1639.0 for VMS V5.3 kit; 2043.0 for 5.4 IFT kit
Moderator:STAR::VATNE
Created:Mon Oct 30 1989
Last Modified:Mon Dec 31 1990
Last Successful Update:Fri Jun 06 1997
Number of topics:3726
Total number of notes:19516

2238.0. "DECW sessions to CI failing despite DECW$SERVER_RETRY*" by GIDDAY::OMELEY (BUGCHECK .... a momentary lapse of reason) Fri Feb 09 1990 00:31

    Hi,

    	I have a customer who has an MI cluster and some standalone
    workstations. The CI part of the cluster has a range of VAXen
    (2x6000-320, 8650, 8300, 2x750) running A5.1. Mostly VS2000s hang
    off this, and there is a standalone VS3100 running VMS V5.2.

    	DECW sessions are run from the CI (FileView, DECTerm, Clock .. etc)
    and display on all the workstations (ie: the users customise what they
    want to run and where from). There are also some DECTerm sessions that
    are run local to the Workstation and are logged into the CI nodes via
    DECNet.

    	When the CI goes through a cluster transition they have recently
    seen that sessions to the 8650 died whilst those to other nodes in the
    CI cluster stayed up. The workload on the 8650 was not high and there
    is nothing unusual from the DECNet side either (a remote login from a
    workstation to the 8650 stayed alive, whilst the clock and other
    sessions to the node died). The systems appear for the greater part
    to be running the default settings. Only the 6000 and 8650 VAXen have
    lockdirwt non-zero, all CI nodes have votes=1 and the satellites are
    nonvoting.

    	I have asked them to look at the DECW$SERVER_RETRY_WRITE_MAX and
    _MIN values and set them suitably high (allowing for the unit shift 5.1
    - 5.2). On the V5.2 workstation, for example, they are _MAX = 200000
    and _MIN = 10000 (200 and 10 sec respectively). I suggested that _MIN
    was a bit large and that a value more like 500 (1/2 sec... default)
    would be better.
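
    	As a quick sanity check on the figures above, here is a minimal
    sketch of the arithmetic. It assumes the V5.2 values are in
    milliseconds (which is what "200000 = 200 sec" implies) and that a
    retry window longer than the transition is what is wanted; the helper
    names are made up for illustration and say nothing about how the
    server actually uses these values.

```python
# Values quoted for the V5.2 workstation, ASSUMED to be in milliseconds
# (the note equates 200000 with 200 sec and 10000 with 10 sec).
RETRY_WRITE_MAX_MS = 200_000
RETRY_WRITE_MIN_MS = 10_000
SUGGESTED_MIN_MS = 500          # the "1/2 sec" value suggested in the note

# Cluster state transitions are reported to last roughly 40 to 60 seconds.
TRANSITION_RANGE_S = (40, 60)


def ms_to_seconds(ms):
    """Convert a millisecond setting to seconds."""
    return ms / 1000.0


def covers_transition(max_ms, transition_s):
    """Hypothetical check: is the maximum retry window longer than the
    transition? This is an assumption about intent, not server behaviour."""
    return ms_to_seconds(max_ms) > transition_s


if __name__ == "__main__":
    for name, value in [("_MAX", RETRY_WRITE_MAX_MS),
                        ("_MIN", RETRY_WRITE_MIN_MS),
                        ("suggested _MIN", SUGGESTED_MIN_MS)]:
        print(f"{name}: {value} ms = {ms_to_seconds(value)} sec")
    worst_case = max(TRANSITION_RANGE_S)
    print("MAX longer than worst-case transition:",
          covers_transition(RETRY_WRITE_MAX_MS, worst_case))
```

    	On these numbers the 200 sec maximum is well clear of a 60 sec
    transition, which is part of why the dropped sessions are puzzling.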

    	I am at a bit of a loss to explain why some nodes kept the DECW
    sessions active and the 8650 didn't. I believe that this scenario has
    happened more than once when the cluster went through a state
    transition (which typically lasts 40 - 60 sec). Although I mentioned
    only the V5.2 node above, the behaviour was also noted on the
    satellites of the cluster.

    	Could someone suggest how this behaviour might be accounted for? I
    have scanned the notes to date and can't spot anything that describes
    similar behaviour. As the remote login session stayed alive I don't
    believe that DECNet is the culprit. Certainly there was nothing logged
    in the operator logs regarding adjacencies to the workstations, or in
    the accounting logs regarding process termination. Server and user
    logs are also clean (I believe).... all a bit of a mystery !!

    	Any help/pointers appreciated.

    	Thank you

    		Rob
    
2238.1 by DECWIN::JMSYNGE (James M Synge, VMS Development) Mon Feb 12 1990 10:34 (11 lines)

    My first recommendation is one your customer may not be ready for: UPGRADE!
    
    We fixed a lot of the problems with connections getting dropped for the
    V5.3 release.
    
    
    If that isn't possible, I'd ask the following question: Is the 8650 a
    router?  If so, NETACP may be doing a lot of activity following the
    cluster state transition.
    
    James