Conference iosg::all-in-1_v30

Title:*OLD* ALL-IN-1 (tm) Support Conference
Notice:Closed - See Note 4331.l to move to IOSG::ALL-IN-1
Moderator:IOSG::PYE
Created:Thu Jan 30 1992
Last Modified:Tue Jan 23 1996
Last Successful Update:Fri Jun 06 1997
Number of topics:4343
Total number of notes:18308

4159.0. "Updating shared drawer access caused node crash" by BIGTOY::LE (Hong Le, CSC Sydney) Thu May 12 1994 09:04

VMS 5.5-2 and IOS 3.0 (yes, version 3.0)

There appears to be a close relationship between node crashes and updating
the access list of a shared drawer. It happened 3 times on 2 different nodes of
a 5-node cluster, each time under very similar circumstances.

Customer accounts:

It seems like MOAS crashed while I was updating the access list of Shared
Drawer ADGSHR by giving read access to userid PHMANAGER.

. I was using the MASUP account which has control access to Shared Drawer
  ADGSHR. The only other accounts with control access are ALLIN1 and RBOWER
  (owner).

. From DRM I selected ADGSHR and did an E (edit access list).

. I added the userid PHMANAGER (a normal non-privileged account) and put Y
  across the READ column and did a Gold F to save the change. It would
  take several minutes and then I'd get the "...do you want to reconnect
  [Y]? " prompt.

I've done this before; the change would take several minutes, but it
always completed. Yesterday I tried the above process 3 times and all
3 attempts failed.

The current ACL for ADGSHR is:

IEJ2XWCL.WPL;1       [A353115,RBOWER]      (RWED,RWED,,)
          (IDENTIFIER=[SHARED,ADGSHR],ACCESS=READ+WRITE+DELETE+CONTROL)
...............
          (IDENTIFIER=[ALLIN1],ACCESS=READ+WRITE+DELETE+CONTROL)
          (IDENTIFIER=[A353115,MASUP],ACCESS=READ+WRITE+DELETE+CONTROL)
...............
          (IDENTIFIER=[A353115,RBOWER],ACCESS=READ+WRITE+DELETE+CONTROL)
...............

(some 58 ACEs in total; note those for ALLIN1, the owner RBOWER, and MASUP,
the account that was modifying the access list)

Please note that the owner RBOWER has exceeded his disk quota on the
disk where ADGSHR resides. This could be connected to the problem.

-------------- end of customer accounts
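
To gauge how large an access list like the one above has grown, the ACE
lines can be counted and grouped by access string. The following is only a
minimal Python sketch: it assumes the listing has been captured as plain text
in the layout shown above, and the names ACE_PATTERN and summarize_acl are
hypothetical, not part of ALL-IN-1 or VMS. The sample uses only the four ACEs
quoted in the customer's account.

```python
import re
from collections import defaultdict

# Matches one ACE line in the layout shown in the listing, e.g.
# (IDENTIFIER=[A353115,MASUP],ACCESS=READ+WRITE+DELETE+CONTROL)
# Assumption: the identifier is always in square brackets.
ACE_PATTERN = re.compile(r"\(IDENTIFIER=(\[[^\]]+\]),ACCESS=([A-Z+]+)\)")


def summarize_acl(listing):
    """Group ACE identifiers by access string (hypothetical helper)."""
    by_access = defaultdict(list)
    for identifier, access in ACE_PATTERN.findall(listing):
        by_access[access].append(identifier)
    return dict(by_access)


# Only the ACEs quoted in the note; the elided lines are skipped.
sample = """\
          (IDENTIFIER=[SHARED,ADGSHR],ACCESS=READ+WRITE+DELETE+CONTROL)
...............
          (IDENTIFIER=[ALLIN1],ACCESS=READ+WRITE+DELETE+CONTROL)
          (IDENTIFIER=[A353115,MASUP],ACCESS=READ+WRITE+DELETE+CONTROL)
...............
          (IDENTIFIER=[A353115,RBOWER],ACCESS=READ+WRITE+DELETE+CONTROL)
"""

summary = summarize_acl(sample)
total = sum(len(ids) for ids in summary.values())
print(f"{total} ACEs found")
for access, ids in summary.items():
    print(f"{access}: {len(ids)} identifier(s)")
```

Identifiers that share the same access string are the ones that could, in
principle, be covered by a single entry rather than one ACE each.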

A brief look at FC server transaction and error logs didn't reveal anything 
relevant around that time.

The system dump file showed XQPERR errors, i.e. in the file system XQP.
Unfortunately, disk-space-saving measures on these nodes meant that only a
short dump was produced, and our VMS expert couldn't obtain the XQP offset
needed for investigation.

Thanks for any comments

Hong

4159.1. by IOSG::PYE (Graham - ALL-IN-1 Sorcerer's Apprentice) Thu May 12 1994 10:00 (4 lines)

    Whatever the problem is, the use of ALL-IN-1 Groups to reduce the
    number of ACEs to rather less than 58 would be a good idea!
    
    Graham

4159.2. "group services was indeed the saviour" by BIGTOY::LE (Hong Le, CSC Sydney) Thu May 12 1994 14:57 (7 lines)

    .1
    
    Yes, they implemented this idea immediately after the crashes.
    
    Thanks for the suggestion.
    
    Hong