
Conference smurf::ase

Title:ase
Moderator:SMURF::GROSSO
Created:Thu Jul 29 1993
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2114
Total number of notes:7347

1928.0. "Director not exiting with NET PARTITION" by NETRIX::"[email protected]" ( decatl::johnson) Fri Mar 07 1997 17:35


This is a re-submission of 1907, since the reply does not appear to help in 
addressing what to check if the 'director does not exit'.

system A 
        has the disk service
system B
        has the director

According to customer, the following was the result after pulling
the network cable:

        the director on system B 
		DID NOT exit, and the logs on both systems show a
		net partition.  
	He was also able to go into ASEMGR on system B and display
		status.

QUESTION: Are there any known issues or anything we need to check 
	  before escalation? 

		The response to 1907 indicates that this is a BUG.

		But:

	The customer will repeat the testing to get logging info and
	   current checkit info if needed, but wasn't willing
	   to do it before seeing if he could get a 'fix' without taking
	   the system down again.
	
3.2g 
	ASE130 patches installed 
	and 'major 3.2g patches installed'
(this is a mission-critical customer and he says the MC group
has gone over the patches and hardware stuff).


Sid Johnson
Customer Support Center
[Posted by WWW Notes gateway]
T.R  Title  User  Personal Name  Date  Lines
1928.1  Where is the problem ?  BRSDVP::DEVOS  Manu Devos DEC/SI Brussels 856-7539  Sun Mar 09 1997 12:42  18
    Sid,
    
    What is the complaint of the customer?
    
    	- the fact that the asedirector is not killed ?
    	- the fact that he can call asemgr and see the system status?
    	- the fact that both daemon logs show a NET PARTITION ?
    	- the fact that in this NET PARTITION situation, he cannot 
    	  relocate, or restart any services ?
    
    The last point is the standard behaviour of ASE since the beginning.
    
    The first and second points may have changed since the first version,
    but they are not an objective per se, only a way to achieve the last point.
    
    The NET PARTITION seems obvious as the NET cable has been removed.
    
    Manu.
1928.2  Trust?  USCTR1::ASCHER  Dave Ascher  Mon Mar 10 1997 00:27  22
Manu,
        
    What is the complaint of the customer?
    
      If the director is supposed to exit when there is a net
      partition and it is not doing so - and if the CSC is insisting
      and ASE Engineering is insisting that this behavior is wrong
      then the customer has a right to be somewhat uncomfortable
      about what is going to happen in a real life situation. 
      
      These people are running mission critical applications with
      DECsafe; they have a strong need to know what might happen when
      various failure events occur. If they are observing a different
      behavior than what Digital is telling them they should observe
      under these conditions then we would seem to owe them either an
      explanation or a patch.  They would be idiots to simply shrug
      off a discrepancy like this and hope that all works out when
      a real failure happens. They are running the company's heart
      and soul and other sensitive bits on these platforms. They
      need to know how they are and are not covered.
      
      d
1928.3  ... ?  BRSDVP::DEVOS  Manu Devos DEC/SI Brussels 856-7539  Mon Mar 10 1997 15:42  16
    Hi Dave,
    
    Ok, I now understand your concern, and I am not able to confirm or
    deny that DECsafe has changed versus the explanation given in the
    Admin guide. If you really need an explanation, I think you should open
    a QAR on this issue, giving the ASE version and the Admin guide version
    you refer to.
    
    I believed, at first, that you wondered what is the expected behaviour
    of DECsafe in case of NET partition, and thus I concentrated my answer
    on that. By the way, the behaviour of DECsafe is still the same and
    only the way to achieve it has changed. Maybe Greg, Mitch or Doug can
    shed some light on the version evolution... ?
    
    Regards, Manu.
    
1928.4  USCTR1::ASCHER  Dave Ascher  Mon Mar 10 1997 21:51  13
re:   <<< Note 1928.3 by BRSDVP::DEVOS "Manu Devos DEC/SI Brussels 856-7539" >>>
                                   -< ... ? >-

   By the way, the behaviour of DECsafe is still the same and
    only the way to achieve it has changed.
    
    
    The externally observable behavior appears to be similar to the
    way it is 'supposed to be'. "Appears to be" is a bit shy of where
    one would like to be when dealing with a major corporation's
    sensitive parts.
    
    d
1928.5  BACHUS::DEVOS  Manu Devos DEC/SI Brussels 856-7539  Tue Mar 11 1997 06:04  34
    Hi,
    
    Re-reading notes 1907 and 1928 attentively, I see that you are running
    UNIX 3.2G and thus ASE 1.3. ASE 1.3 introduced support for multiple
    networks for inter-cluster communication (HSM_PATH_STATUS) and the
    network monitoring feature (HSM_NI_STATUS). HSM_PATH_STATUS covers
    the network partition. So it is quite possible that the code changed
    in version 1.3 compared with earlier versions.
    
    I do not have the ADMIN guide for version V1.2, only the ADMIN guide for
    version 1.3, so I cannot compare them. But, as far as I remember,
    before version V1.3 the asedirector simply exited in case of net
    partition. In the ASE 1.3 ADMIN guide AA-QP0TA-TE v1.3, on page 1-12, I
    can read:
    
    
    "For example, a full network partition occurs if only one network is
    used in an ASE and that path fails, or if more than one network is
    used and all the paths fail. If a full network partition occurs, the
    services continue to run on the member system and can be automatically
    failed over IF THE SYSTEM CRASHES, but you cannot use asemgr utility to
    change the ASE or to manually relocate services."
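
    The rule in that passage can be sketched roughly as follows. This is
    only an illustration of the documented condition, not ASE code: the
    function names and the dictionary of path states are hypothetical, and
    nothing here is taken from the asedirector or asemgr sources.

```python
from typing import Dict


def is_full_network_partition(path_up: Dict[str, bool]) -> bool:
    """Return True when every configured network path has failed.

    path_up maps a (hypothetical) network name to True if the path to the
    other member is up. With one network, losing it is a full partition;
    with several, all of them must be down.
    """
    if not path_up:
        raise ValueError("at least one network path must be configured")
    return not any(path_up.values())


def manual_changes_allowed(path_up: Dict[str, bool]) -> bool:
    """Per the quoted guide text, manual relocation and ASE changes are
    refused during a full network partition; services keep running."""
    return not is_full_network_partition(path_up)


# Single network, cable pulled: full partition, manual changes refused.
single = {"net0": False}
# Two networks, one still up: not a full partition.
dual = {"net0": False, "net1": True}

print(is_full_network_partition(single), manual_changes_allowed(single))
print(is_full_network_partition(dual), manual_changes_allowed(dual))
```

    In the customer's test (one network, cable pulled) the first case
    applies, which matches the guide's statement that asemgr cannot be used
    to relocate services while the partition lasts.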
    
    In appendices B1 and B2, I cannot find any statement that the
    asedirector exits when a network partition occurs.
    
    So, according to the documentation of ASE 1.3, the asedirector does not
    exit. If you have the chance to look in the ADMIN guide of ASE 1.2,
    then maybe you will find that the asedirector was killed in case of
    network partition. But I hope you agree that the BEHAVIOUR is not
    merely "appearing" to be the same, but is stated clearly in the ADMIN
    guide of ASE 1.3.
    
    regards, Manu. 
1928.6  re: .5: thanks  NETRIX::"[email protected]"  decatl::johnson  Wed Mar 12 1997 14:45  10

re: .5

Thanks.  I had read 1-12 but had not exactly interpreted its full meaning
and the difference between 1.2 and 1.3.

Sid Johnson
Customer Support Center/Atlanta
[Posted by WWW Notes gateway]
1928.7  USCTR1::ASCHER  Dave Ascher  Thu Mar 13 1997 00:16  7
re: .5
    
    
    Sounds good to me... It'd feel better if somebody from engineering
    would confirm it.