Title: | ase |
Moderator: | SMURF::GROSSO |
Created: | Thu Jul 29 1993 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 2114 |
Total number of notes: | 7347 |
This is a re-submission of 1907, since the reply there does not appear to help in addressing what to check if the 'director does not exit'.

System A has the disk service; system B has the director.

According to the customer, the following was the result after pulling the network cable: the director on system B DID NOT exit, and the logs on both systems show a net partition. He was also able to go into ASEMGR on system B and display status.

QUESTION: Is there any known issue, or anything we need to check, before escalation? The response to 1907 indicates that this is a BUG. However, the customer will repeat the test to get logging info and current checkit info if needed, but was not willing to do it before seeing whether he could get a 'fix' without taking the system down again.

3.2g, ASE130 patches installed, and 'major 3.2g patches installed' (this is a mission-critical customer, and he says the MC group has gone over the patches and hardware).

Sid Johnson
Customer Support Center
[Posted by WWW Notes gateway]
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
1928.1 | Where is the problem ? | BRSDVP::DEVOS | Manu Devos DEC/SI Brussels 856-7539 | Sun Mar 09 1997 12:42 | 18 |
Sid, What is the complaint of the customer? - the fact that the asedirector is not killed? - the fact that he can call asemgr and see the system status? - the fact that both daemon logs show a NET PARTITION? - the fact that in this NET PARTITION situation, he cannot relocate or restart any services? The last point has been the standard behaviour of ASE since the beginning. The first and second points may have changed since the first version, but they are not an objective per se, only a way to achieve the last point. The NET PARTITION seems obvious, as the NET cable has been removed. Manu. | |||||
1928.2 | Trust? | USCTR1::ASCHER | Dave Ascher | Mon Mar 10 1997 00:27 | 22 |
Manu, What is the complaint of the customer? If the director is supposed to exit when there is a net partition and it is not doing so - and if the CSC is insisting and ASE Engineering is insisting that this behavior is wrong - then the customer has a right to be somewhat uncomfortable about what is going to happen in a real-life situation. These people are running mission-critical applications with DECsafe; they have a strong need to know what might happen when various failure events occur. If they are observing a different behavior than what Digital is telling them they should observe under these conditions, then we would seem to owe them either an explanation or a patch. They would be idiots to simply shrug off a discrepancy like this and hope that all works out when a real failure happens. They are running the company's heart and soul and other sensitive bits on these platforms. They need to know how they are and are not covered. d | |||||
1928.3 | ... ? | BRSDVP::DEVOS | Manu Devos DEC/SI Brussels 856-7539 | Mon Mar 10 1997 15:42 | 16 |
Hi Dave, OK, I now understand your concern... and I am not able to confirm or deny that DECsafe has changed relative to the explanation given in the Admin guide. If you really need an explanation, I think you should open a QAR on this issue, giving the ASE version and the Admin guide version you refer to. I believed, at first, that you were wondering what the expected behaviour of DECsafe is in case of a NET partition, and thus I concentrated my answer on that. By the way, the behaviour of DECsafe is still the same and only the way to achieve it has changed. Maybe Greg, Mitch or Doug can shed some light on the version evolution... ? Regards, Manu. | |||||
1928.4 | | USCTR1::ASCHER | Dave Ascher | Mon Mar 10 1997 21:51 | 13 |
re: <<< Note 1928.3 by BRSDVP::DEVOS "Manu Devos DEC/SI Brussels 856-7539" >>> -< ... ? >- "By the way, the behaviour of DECsafe is still the same and only the way to achieve it has changed." The externally observable behavior appears to be similar to the way it is 'supposed to be'. "Appears to be" is a bit shy of where one would like to be when dealing with a major corporation's sensitive parts. d | |||||
1928.5 | | BACHUS::DEVOS | Manu Devos DEC/SI Brussels 856-7539 | Tue Mar 11 1997 06:04 | 34 |
Hi, After re-reading notes 1907 and 1928 attentively, I see that you are running UNIX 3.2G and thus ASE 1.3. ASE 1.3 introduced support for multiple networks for the inter-cluster communication (HSM_PATH_STATUS) and the network monitoring feature (HSM_NI_STATUS). HSM_PATH_STATUS covers the network partition. So, it is quite likely that the code changed in version 1.3 compared with earlier versions. I do not have the ADMIN guide for version V1.2, only the one for version 1.3, so I cannot compare them. But, as far as I remember, before version V1.3 the asedirector simply exited in case of a net partition. In the ASE 1.3 ADMIN guide AA-QP0TA-TE v1.3, on page 1-12, I can read: "For example, a full network partition occurs if only one network is used in an ASE and that path fails, or if more than one network is used and all the paths fail. If a full network partition occurs, the services continue to run on the member system and can be automatically failed over IF THE SYSTEM CRASHES, but you cannot use asemgr utility to change the ASE or to manually relocate services." In appendices B1 and B2, I cannot find anything saying that the asedirector exits when a network partition occurs. So, according to the documentation of ASE 1.3, the asedirector does not exit. If you have the chance to look in the ADMIN guide of ASE 1.2, then maybe you will find that the asedirector was killed in case of a network partition. But I hope you agree that the BEHAVIOUR does not merely "appear" to be the same, but is stated clearly in the ADMIN guide of ASE 1.3. Regards, Manu. | |||||
1928.6 | re: .5: thanks | NETRIX::"[email protected]" | decatl::johnson | Wed Mar 12 1997 14:45 | 10 |
re: .5 Thanks. I had read 1-12 but had not exactly interpreted its full meaning and the difference between 1.2 and 1.3. Sid Johnson Customer Support Center/Atlanta [Posted by WWW Notes gateway] | |||||
1928.7 | | USCTR1::ASCHER | Dave Ascher | Thu Mar 13 1997 00:16 | 7 |
re: .5 Sounds good to me... It'd feel better if somebody from engineering would confirm it. |
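
For readers following the thread, the rule quoted in .5 from page 1-12 of the ASE 1.3 Admin guide can be written out as a small truth table. This is only an illustrative sketch of the documented behaviour, not ASE code: the function names `is_full_partition` and `ase_behaviour` and the dictionary keys are hypothetical, and the sketch says nothing about whether the asedirector process itself exits, which is the point still awaiting confirmation from engineering.

```python
def is_full_partition(paths_up):
    """Return True when every configured network path between members has failed.

    paths_up is a list of booleans, one per network used by the ASE
    (hypothetical representation, chosen for this sketch only).
    """
    return len(paths_up) > 0 and not any(paths_up)


def ase_behaviour(paths_up, member_crashed=False):
    """Summarize the behaviour described in the quoted Admin guide passage."""
    partitioned = is_full_partition(paths_up)
    return {
        "full_partition": partitioned,
        # Services continue to run on the member system unless it crashes.
        "services_keep_running": not member_crashed,
        # Automatic failover is still described as happening if the system crashes.
        "automatic_failover": member_crashed,
        # During a full partition asemgr cannot change the ASE
        # or manually relocate services.
        "asemgr_can_modify": not partitioned,
    }


if __name__ == "__main__":
    # Single network, cable pulled: the case tested by the customer.
    print(ase_behaviour([False]))
    # Two networks, one still up: not a full partition.
    print(ase_behaviour([True, False]))
```

In the customer's test (one network, cable pulled) the sketch reports a full partition with services still running and asemgr unable to modify the ASE, which matches what Manu describes as the standard behaviour.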