Title: | DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1) |
Notice: | Welcome to the Digital UNIX Conference |
Moderator: | SMURF::DENHAM |
Created: | Thu Mar 16 1995 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 10068 |
Total number of notes: | 35879 |
The system is an Alpha 8200 that has recently had some hardware changes. It has a dual-CPU board, a single-CPU board, and a 2 GB memory module. The system ran all right for a while (3 days) before something strange happened. At 23:14 on Feb 14th, a process (pid 3090) showed up as a runaway on the new processor (processor #10). This processor shows 0% idle, 100% system, and process 3090 cannot be stopped with any kind of kill signal (note that this is a valid, running process, not "defunct"). psrinfo -v shows all three cpus are offline since the system boot. The processes running on cpu #10 are pid 3090 (running, priority varies between 48 and 53), an idle nfsd, and an idle rpc.statd. Any attempt to run a new process or program on that processor just hangs. We have debugged the process and found it is doing some tcp or listens or thread_idle all the time. Processor #10 accepts no new processes and does not kill pid 3090 in response to our signals. Can we release the binding between pid 3090 and cpu 10? [Posted by WWW Notes gateway]
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
8943.1 | need dump and file qar | SMURF::WOODWARD | | Tue Feb 25 1997 07:26 | 6 |
If the process doesn't die with a SIGKILL, then it is hung in a tight O/S loop. I'm afraid you won't get the processor back until the system is rebooted. If you want this debugged, then you need to force a crash dump and file a QAR. By the way, what O/S version are you running? /jim |
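For anyone wanting to confirm the "doesn't die with a SIGKILL" condition before forcing a dump, here is a minimal sketch in Python. It is an illustration only, not a Digital UNIX tool: the function name `survives_sigkill` and its `timeout` and `interval` parameters are made up for this example. It sends SIGKILL and then polls with signal 0, which only checks that the pid still exists.

```python
import os
import signal
import time


def survives_sigkill(pid: int, timeout: float = 5.0, interval: float = 0.1) -> bool:
    """Send SIGKILL to pid and report whether the pid still exists afterwards.

    Hypothetical helper for illustration. Returns False once the pid is
    gone, True if it is still present when the timeout expires.
    Caveat: an unreaped zombie ("defunct") child also still counts as
    existing, so a True result alone does not prove a kernel hang.
    """
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        return False  # the process was already gone

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)  # signal 0 delivers nothing; it only checks existence
        except ProcessLookupError:
            return False  # the process has exited and been reaped
        time.sleep(interval)
    return True  # still present after the timeout
```

Calling `survives_sigkill(3090)` as root and getting True for a process that ps reports as running (not defunct) would be consistent with the situation described above, where the process never returns from the kernel to act on the signal.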