
Conference turris::digital_unix

Title: DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1)
Notice: Welcome to the Digital UNIX Conference
Moderator: SMURF::DENHAM
Created: Thu Mar 16 1995
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 10068
Total number of notes: 35879

8943.0. "A process and cpu are stuck to each other. Can't kill/release this bond. Unix 3.2d, 8400, 3 cpus, 2gig RAM" by NETRIX::"[email protected]" (Sri) Mon Feb 24 1997 18:35

The system is an Alpha 8200 that has recently had some hardware changes. It
has a dual-CPU board, a single-CPU board and a 2 GB memory module. The system
ran all right for a while (3 days) before something strange happened. At 23:14
on Feb 14th, a process (pid 3090) showed up as a runaway on the new processor
(processor number 10). This processor indicates 0% idle, 100% system, and
process 3090 cannot be stopped with any kind of kill signal (note that this
is a valid, running process, not "defunct"). psrinfo -v shows all three cpus
are offline since the system boot. The processes running on cpu 10 are pid
3090 (running, priority varies between 48 and 53), an idle nfsd and an idle
rpc.statd. Any attempt to run a new process or program on that processor just
hangs. We have debugged the process and found it is doing some tcp or listens
or thread_idle all the time. Processor 10 doesn't accept any new processes to
run, nor does pid 3090 die in response to our signals. Can we release the bond
between pid 3090 and cpu 10?
[Posted by WWW Notes gateway]
T.R     Title                   User             Personal Name  Date                   Lines
8943.1  need dump and file qar  SMURF::WOODWARD                 Tue Feb 25 1997 07:26  6
    If the process doesn't die with a SIGKILL, then it is hung in a tight
    O/S loop.  I'm afraid you won't get the processor back until the system
    is rebooted.  If you want this debugged, then you need to force a crash
    dump and file a QAR.  By the way, what O/S version are you running?
    
    /jim/jim
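
The SIGKILL check described in the reply above can be illustrated with a minimal Python sketch. It assumes a generic POSIX system and is not Digital UNIX tooling; the helper names `process_gone` and `sigkill_and_confirm`, and the timeout and polling values, are hypothetical choices made only for this example.

```python
import os
import signal
import time


def process_gone(pid):
    """Return True if pid no longer exists (reaping it first if it is our child)."""
    try:
        reaped, _status = os.waitpid(pid, os.WNOHANG)
        return reaped == pid  # 0 means it is our child and still running
    except ChildProcessError:
        pass  # not a child of this process; fall back to a signal-0 probe
    try:
        os.kill(pid, 0)  # signal 0 checks existence and permission only
    except ProcessLookupError:
        return True
    except PermissionError:
        return False  # the process exists but belongs to another user
    return False  # still present (this includes zombies awaiting their parent)


def sigkill_and_confirm(pid, timeout=5.0, poll=0.1):
    """Send SIGKILL, then poll until the process disappears or the timeout expires."""
    os.kill(pid, signal.SIGKILL)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if process_gone(pid):
            return True
        time.sleep(poll)
    return process_gone(pid)
```

A `False` result is only consistent with the diagnosis in the reply; it does not prove it. A process can also outlive SIGKILL for a while if it is a zombie waiting on its parent or is blocked in an uninterruptible wait, so a crash dump remains the way to find out what the kernel is actually doing.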