Conference smurf::ase

Title:	ase

Moderator:	SMURF::GROSSO

Created:	Thu Jul 29 1993
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	2114
Total number of notes:	7347

1919.0. "Same ASEV1.3, different behavior on AXP models!" by MANM01::JOELJOSOL () Wed Mar 05 1997 04:16

    I have set up an AlphaServer 1000A and an AlphaServer 4100 to
    simulate a customer problem involving two AlphaServer 8400s
    during ASE failover.
    
    The AlphaServer 4100 and the AlphaServer 1000A use a common
    mount point at the SAME time. The 1000A has the ASE service, while
    the 4100 has local storage mounted at the same mount point that
    the 1000A uses for the ASE service.
    
    Both systems run on Digital UNIX V3.2G and ASE V1.3. Balanced mode
    is the preferred policy with no migration when the other system
    comes up.
    
    This is the problem:
    
    When ASE fails over from the 1000A to the 4100, it attempts
    to mount the ASE service on the 4100 mount point, which is
    currently in use, and fails. It then dismounts the mount point,
    tries again, and succeeds.
    
    When ASE then fails over from the 4100 back to the 1000A,
    which is now up and running and using the mount point for its
    local storage, ASE tries to use the mount point, fails, and
    the service remains unassigned.
    
    The 1000A behavior is what is happening on the 8400 pair.
    
    We put some print stubs inside the ase_mount_point. On the
    4100, they show that it starts the ASE service, which fails,
    stops it, and then starts it again. On the 1000A, they show
    that it starts the ASE service and fails. Why the different
    failover behavior?
    
    /joeljosol

T.R	Title	User	Personal Name	Date	Lines
1919.1	Customer customized the auto_mount_action...	MANM01::JOELJOSOL		`Wed Mar 05 1997 04:52`	15
	The customer has customized the 1000A's ase_mount_action by inserting a line in the 'start' code block that checks whether the 'df' output contains the mounted local storage and, if it does, unmounts it. This insertion made the failover to the 1000A work. The change does not need to be replicated on the 4100, which already works fine: it unmounts the local storage after ASE fails on its first attempt, then mounts the service on the second try. The customer wants to know whether there are any legal implications for support in adding this line to a script that otherwise does not work on the 1000A. Alternatively, what do we need to find out to make the 1000A installation of ASE behave like the 4100? /joeljosol
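For readers following along, here is a minimal sketch of the kind of 'df' check and unmount described above. It is not the customer's actual line, which is not shown in this note. The helper names is_mounted and release_mount_point are hypothetical and not part of ASE, and the sketch assumes that the mount point is the last field of each 'df' output line.

```sh
#!/bin/sh
# Sketch only: helper names are hypothetical and not part of ASE.

# Succeed (exit status 0) if the mount point given as $1 appears in the
# last column ("Mounted on") of df-style output read from standard input.
is_mounted() {
    awk -v mp="$1" 'NR > 1 && $NF == mp { found = 1 } END { exit(found ? 0 : 1) }'
}

# Unmount whatever is mounted at $1, but only if df reports it as mounted.
release_mount_point() {
    if df | is_mounted "$1"; then
        umount "$1"
    fi
}
```

A caller would invoke something like `release_mount_point /oracle_data` before the service mount is attempted. Keeping the 'df' parsing in its own function lets it be checked against saved 'df' output without unmounting anything.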
1919.2	Shared storage must be uniquely defined with local storage	NETRIX::"[email protected]"	Gregory P. Myrdal	`Fri Mar 07 1997 13:50`	19
	Joel, the customer should not have local storage and ASE storage defined at the same mount point. Shared storage must be defined uniquely with respect to local storage on all ASE members. There are times when the agent may try to force a start or stop of a service before retrying. I am not sure why you are seeing different behavior, but since I think your problem lies in unique naming, I will not be searching for the reason. I do not understand the customer's environment, but it looks like they could be looking for trouble if they do not uniquely define storage. If this has to be done, I would not edit ase_mount_action. This would not be supported. You could put a umount command in the start script of the service action script, since you know there will be a mount point there before the service starts. -- Greg [Posted by WWW Notes gateway]
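One rough way to spot the naming conflict described above is to check, on each member, whether the mount point planned for the ASE service is already declared for local storage. The sketch below assumes local file systems are listed in /etc/fstab with the mount point in the second field. The function name fstab_declares is hypothetical, and the check says nothing about how ASE itself records service mount points.

```sh
#!/bin/sh
# Sketch only: fstab_declares is a hypothetical helper, not an ASE command.

# Succeed (exit status 0) if the mount point given as $1 is declared in
# fstab-format text read from standard input. Comment lines are skipped.
fstab_declares() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit(found ? 0 : 1) }'
}
```

For example, `fstab_declares /oracle_data < /etc/fstab && echo "conflict: /oracle_data is local storage here"` could be run on each member before the service is defined.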
1919.3	IPMT?	MANM01::JOELJOSOL		`Sun Mar 09 1997 23:02`	26
	Greg, I will have to file an IPMT for this one. The guy in charge has not come back to me yet via e-mail. If I put the umount in the start script, ASE tries to mount the service first, before executing this script. The customer instead placed the umount in the 'start' block of ase_mount_action, where it behaved correctly. Actually, it is just one line that calls an external script that does a 'df' and then the 'umount'. This works for the 1000A. We don't have to do this for the 4100. However, the 8400s are behaving exactly like the 1000A. Again, the correct behavior expected by the customer is:
	1. 8400-A, holding the service on its /oracle_data mount point, fails.
	2. 8400-B, holding local storage on its /oracle_data, detects the failure, dismounts its local storage, and mounts the ASE service.
	3. 8400-A reboots.
	4. 8400-A detects that the ASE service is on 8400-B and mounts only its local storage.
	/joeljosol