
Conference smurf::ase

Title: ase
Moderator: SMURF::GROSSO
Created: Thu Jul 29 1993
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 2114
Total number of notes: 7347

1919.0. "Same ASEV1.3, different behavior on AXP models!" by MANM01::JOELJOSOL () Wed Mar 05 1997 04:16

    I have set up an AlphaServer 1000A and an AlphaServer 4100 to
    simulate a customer problem involving two AlphaServer 8400s
    on ASE failover.
    
    The AlphaServer 4100 and the AlphaServer 1000A use a common
    mount point at the SAME time. The 1000A has the ASE service, while
    the 4100 has local storage on the same mount point that the
    1000A uses for the ASE service.
    
    Both systems run on Digital UNIX V3.2G and ASE V1.3. Balanced mode
    is the preferred policy with no migration when the other system
    comes up.
    
    This is the problem:
    
    On the 4100, when ASE fails over from the 1000A, it attempts
    to mount the ASE service on the 4100 mount point, which is
    currently in use. It fails, dismounts the mount point,
    tries again, and succeeds.
    
    When ASE then fails over from the 4100 back to the 1000A,
    which is now up and running and using the mount point for its
    local storage, ASE tries to use the mount point, fails, and
    the service remains unassigned.
    
    The 1000A behavior is what is happening on the 8400 pair.
    
    We put some print stubs inside the ase_mount_point. The
    4100 shows that it starts the ASE service, which fails, stops it,
    and then starts it again. The 1000A shows that it starts the
    ASE service and fails. Why the different failover behavior?
    
    /joeljosol
1919.1. "Customer customized the auto_mount_action..." by MANM01::JOELJOSOL Wed Mar 05 1997 04:52 (15 lines)

    The customer has customized the 1000A's ase_mount_action by
    inserting a line in the 'start' code block that checks the 'df'
    output for the mounted local storage and, if it is there, unmounts
    it. This insertion made the failover for the 1000A work.
    
    There is no need to replicate the change on the 4100, which
    already works fine: it unmounts the local storage after ASE fails
    on its first attempt, then mounts the service on the second try.
    
    The customer wants to know whether adding this line to an
    otherwise non-working script for the 1000A has any legal
    implications for support. Or, what needs to be found out to make
    the 1000A installation of ASE work like the 4100?
    
    /joeljosol
1919.2. "Shared storage must be uniquely defined with local storage" by NETRIX::"[email protected]" (Gregory P. Myrdal) Fri Mar 07 1997 13:50 (19 lines)

Joel,

The customer should not have local storage and ASE storage defined at the
same mount point.  Shared storage must be uniquely defined with local
storage on all ASE members.

There are times when the agent may try to force a start or stop of a service
before retrying.  I am not sure why you are seeing different behavior, but
since I think your problem lies in unique naming I will not be searching
for the reason.  I do not understand the customer's environment, but it looks
like they could be looking for trouble if they do not uniquely define storage.

If this has to be done I would not edit ase_mount_action.  This
would not be supported.  You could put a umount command in the start script
of the service action script since you know there will be a mount point 
there before the service starts.

-- Greg
[Posted by WWW Notes gateway]
1919.3. "IPMT?" by MANM01::JOELJOSOL Sun Mar 09 1997 23:02 (26 lines)

    Greg,
        
        I will have to go file an IPMT for this one. The guy in charge
        has not come back to me yet via e-mail.
        
        If I put the umount in the start script, ASE tries to mount the
        service first, before executing this script. The customer instead
        placed the umount call in the 'start' block of the
        ase_mount_action, where it behaved correctly. Actually, it is just
        one line that calls an external script that does a 'df' and then
        the 'umount'.
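
For readers trying to picture that external script, here is a minimal sketch of what such a helper could look like. It is not the customer's actual script: the function names (device_on, free_mount_point) and the DF_CMD / UMOUNT_CMD overrides are hypothetical, and it assumes 'df' prints the device in the first column and the mount point in the last column.

```shell
#!/bin/sh
# Hypothetical helper, not part of ASE and not the customer's script.

# Print the device mounted on mount point $1, reading df-style output
# on stdin. Assumption: device is the first field, mount point the last.
device_on() {
    awk -v mp="$1" 'NR > 1 && $NF == mp { print $1; exit }'
}

# Unmount $1 if something is currently mounted there; do nothing otherwise.
# DF_CMD and UMOUNT_CMD are illustrative overrides that allow a dry run.
free_mount_point() {
    mp=$1
    dev=$(${DF_CMD:-df} | device_on "$mp")
    if [ -n "$dev" ]; then
        echo "unmounting $dev from $mp"
        ${UMOUNT_CMD:-umount} "$mp"
    fi
}
```

A call such as `free_mount_point /oracle_data` would then be the single line placed ahead of the mount. Setting `UMOUNT_CMD=echo` first turns it into a dry run that only prints what would be unmounted.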
        
        This works for the 1000A. We don't have to do this for the 4100.
        However, the 8400s are behaving exactly like the 1000A. The
        correct behavior expected by the customer is, again:
        
          1. 8400-A holding the service on its /oracle_data mount point
             fails.
          2. 8400-B holding local storage on its /oracle_data detects
             failure, dismounts its local storage, mounts ASE service.
          3. 8400-A reboots.
          4. 8400-A detects ASE service is on 8400-B, mounts only
             local storage.
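
The four steps above boil down to one decision per node: take the ASE service if nobody else holds it, otherwise mount local storage only. The sketch below only illustrates that expectation. The function name mount_choice and its arguments are hypothetical, and it does not describe how the ASE agent actually decides.

```shell
#!/bin/sh
# Hypothetical decision helper illustrating the customer's expected behavior.
# $1 = this node's name
# $2 = node currently holding the ASE service ("" if unassigned or failed)
mount_choice() {
    self=$1
    holder=$2
    if [ -z "$holder" ] || [ "$holder" = "$self" ]; then
        # Step 2: no other member holds the service, so this node frees
        # the mount point and mounts the ASE service.
        echo "ase"
    else
        # Step 4: another member holds the service, so this node mounts
        # only its local storage.
        echo "local"
    fi
}
```

For example, `mount_choice 8400-B ""` corresponds to step 2 and `mount_choice 8400-A 8400-B` to step 4.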
    
    /joeljosol