Title: | ACMS comments and questions |
Notice: | This is not an official software support channel. Kits 5.* |
Moderator: | CLUSTA::HALL AN |
Created: | Mon Feb 17 1986 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 4179 |
Total number of notes: | 15091 |
Hi,

    VMS VAX V6.2
    DECnet Phase IV
    ACMS V4.1
    DECforms V2.1b

I have a situation where I can see from DCL failure routines that the following DECnet object has remained on this Front End node for the past 6 days, although ACMS is taken down nightly at this customer site. This rogue object does not seem to be causing any problems for ACMS, but it is patently incorrect.

The object ACM$00010221 has what looks like a valid process id, 21400439...

    Object = ACM$00010221
    Number = 0    Process id = 21400439  <<<<<<<<<

    $ sh proc /cont/id=21400439          <<<<<<<<<
    %SYSTEM-W-NONEXPR, nonexistent process
    $ ANALYZE/SYSTEM
    SDA> set process/index=21400439      <<<<<<<<<
    SDA>
    SDA> show process
    Process index: 0039   Name: ACMS01CP057000   Extended PID: 21408A39  <<<<<<<<<
                   ^^^^         ^^^^^^^^^^^^^^                 ^^^^^^^^
    -------------------------------------------------------------------
    Status : 00040001 res,phdres
    Status2: 00000001 quantum_resched
    PCB address              B4DE6900    JIB address               B4901D40
    PHD address              EE351000    Swapfile disk address     00000000
    Master internal PID      00450039    Subprocess count                 0
    Internal PID             00450039    Creator internal PID      00000000
    Extended PID             21408A39    Creator extended PID      00000000
    State                         HIB    Termination mailbox           21A2
    Current priority                6    AST's enabled                 KESU
    Base priority                   4    AST's active                  NONE
    UIC                [00001,000004]    AST's remaining               2951
    Mutex count                     0    Buffered I/O count/limit 2198/2220
    Waiting EF cluster              1    Direct I/O count/limit   2220/2220
    Starting wait time       1B001B18    BUFIO byte count/limit ******/1921098
    Event flag wait mask     0000000C    # open files allowed left      158
    Local EF cluster 0       F0000080    Timer entries allowed left      98
    Local EF cluster 1       64000000    Active page table count          0
    Global cluster 2 pointer 00000000    Process WS page count         9217
    Global cluster 3 pointer 00000000    Global WS page count          1469
    SDA>

    NCP>
    Object = ACM$00010254
    Number = 0    Process id = 21408A39  <<<<<<<<<<<

I am unclear whether the process with extended PID 21408A39 in fact has two ACM$ DECnet objects associated with it.

Am I correct in thinking that SDA is just returning whatever process currently occupies index 0039, and that this output is therefore erroneous for the purpose of troubleshooting the rogue DECnet object after the event?

Also, I would like to discuss whether it is mandatory to stop and restart ACMS and DECnet if ACMS fails to shut down cleanly (for example, when rogue DECnet objects remain). I have seen these objects remain a few times in the past month, and on occasion I get SRVNOTFOUND errors reported in the Back End audit log the following day. I think these are due to the rogue DECnet objects.

Regards and Thanks,
P.J.Hanley.
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
4177.1 | OHMARY::HALL | Bill Hall - ACMS Engineering - ZKO2-2 | Tue Jun 03 1997 10:11 | 8 | |
I think the SDA command is being misinterpreted. There is a SET PROCESS/INDEX= and a SET PROCESS/ID=. In your example, you said 'set process/index=21400439' when it should have been 'set proc/id=21400439'. Do a SHOW SUMMARY next time and see if the process is still there. Bill | |||||
4177.2 | SIOG::HANLEY | Tue Jun 03 1997 10:49 | 43 | ||
Bill, Seems to work the same for /id or /index ... I did a SDA> show summary and the process pid 21400439 was not present. Then .... SDA> SDA> set proc/id=21400439 SDA> SDA> sh process Process index: 0039 Name: ACMS01CP057000 Extended PID: 21408A39 ------------------------------------------------------------------- Status : 00040001 res,phdres Status2: 00000001 quantum_resched PCB address B4DE6900 JIB address B4901D40 PHD address EE351000 Swapfile disk address 00000000 Master internal PID 00450039 Subprocess count 0 Internal PID 00450039 Creator internal PID 00000000 Extended PID 21408A39 Creator extended PID 00000000 State HIB Termination mailbox 21A2 Current priority 5 AST's enabled KESU UIC [00001,000004] AST's remaining 2943 Mutex count 0 Buffered I/O count/limit 2196/2220 Waiting EF cluster 1 Direct I/O count/limit 2220/2220 Starting wait time 1B001B18 BUFIO byte count/limit ******/1920650 Event flag wait mask 0000000C # open files allowed left 151 Local EF cluster 0 F0000080 Timer entries allowed left 98 Local EF cluster 1 6C000000 Active page table count 0 Global cluster 2 pointer 00000000 Process WS page count 15251 Global cluster 3 pointer 00000000 Global WS page count 1475 SDA> | |||||
4177.3 | OHMARY::HALL | Bill Hall - ACMS Engineering - ZKO2-2 | Tue Jun 03 1997 13:43 | 12 | |
Looks like the original process is gone but its PID remains tied to the DECnet object. Another CP was then created that just happens to use the same PID slot. PIDs are unique while VMS is running, but the slots are reused. We've already been in contact with DECnet engineering and supplied them with our code that deals with DECnet objects. They already have a case open for another customer. Bill | |||||
4177.4 | Continued. | KERNEL::PULLEY | Come! while living waters flow | Fri Jun 06 1997 05:34 | 22 |
On the day this object was first spotted attached to a nonexistent process: the machine had only been up one day; we couldn't find any record of it in accounting; and there was a system service exception in the error log from a development user that morning. I've asked them to try simply clearing the object to see if that hurts ACMS, and to find out what that user was doing and what they experienced. They haven't got back to me yet. They are using acms/enter/noreturn. Does that briefly run up a user process before handing them on to the CP, and then terminate the user process? There does seem to be some correlation between orphaned ACM$* network objects and a development user's system service exception, but there doesn't appear to be any match between the PID of that exception process and the PID attached to the network object. Does anyone just happen to know in what mode the exit handlers are run (presumably a DECnet one) that should clear up any objects?
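
To illustrate the point in .3 about PID slots being reused, here is a minimal Python sketch that splits the two extended PIDs quoted in this thread into a slot index and a sequence number. The field widths are assumptions, not documented values: a 9-bit index is inferred only because it makes 21408A39 decode to sequence 0045 and index 0039, which matches the internal PID 00450039 in the SDA output. The real widths depend on how the system is configured. The names `split_epid`, `INDEX_BITS`, and `LOCAL_BITS` are hypothetical.

```python
# Hypothetical helper; both field widths below are assumptions for this sketch.
INDEX_BITS = 9    # assumed width of the process-index (slot) field
LOCAL_BITS = 21   # assumed width of the index + sequence part of an extended PID


def split_epid(epid, index_bits=INDEX_BITS, local_bits=LOCAL_BITS):
    """Return (sequence, index) taken from the low bits of an extended PID."""
    local = epid & ((1 << local_bits) - 1)    # drop the assumed node-related bits
    index = local & ((1 << index_bits) - 1)   # slot in the process table
    sequence = local >> index_bits            # differs between users of one slot
    return sequence, index


stale = split_epid(0x21400439)    # PID still attached to ACM$00010221
current = split_epid(0x21408A39)  # PID of ACMS01CP057000 as shown by SDA

for name, (seq, idx) in (("stale", stale), ("current", current)):
    print(f"{name:8s} sequence={seq:04X} index={idx:04X}")

# Same slot but a different sequence number means two different processes.
print("same slot:", stale[1] == current[1], "| same process:", stale == current)
```

Under these assumptions both PIDs decode to index 0039, with sequence 0002 for the stale PID and 0045 for the current one. That is consistent with SDA landing on ACMS01CP057000 in slot 0039 even though process 21400439 no longer exists.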