
Conference turris::digital_unix

Title: DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1)
Notice: Welcome to the Digital UNIX Conference
Moderator: SMURF::DENHAM
Created: Thu Mar 16 1995
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 10068
Total number of notes: 35879

8552.0. "Unique name for core file???" by GALVIA::STONES (Tom Stones) Thu Jan 23 1997 04:43

T.R  Title  User  Personal Name  Date  Lines
8552.1  VAXCPU::michaud  Jeff Michaud - ObjectBroker  Thu Jan 23 1997 11:53  17
8552.2  You could QAR this.  NETRIX::"[email protected]"  Jarkko Hietaniemi  Mon Jan 27 1997 04:47  10
I heard that there is, in fact, such a feature in the free BSDs.
Something along the lines of core.`base-name-of-the-executable`.$$
(in shell notation here but handled by the kernel). If you feel
strongly enough for this feature I guess you could QAR it?

-- 
Jarkko Hietaniemi Digital/Finland MCS/OSS DTN 879-4738
[email protected]   +358-(0)9-434 4738

[Posted by WWW Notes gateway]
8552.3  warning (fwiw)  VAXCPU::michaud  Jeff Michaud - ObjectBroker  Mon Jan 27 1997 10:33  8
> I heard that there is, in fact, such a feature in the free BSDs.
> Something along the lines of core.`base-name-of-the-executable`.$$
> (in shell notation here but handled by the kernel). If you feel
> strongly enough for this feature I guess you could QAR it?

	Of course the obvious problem with turning on such a
	feature system-wide is that all your disc space can
	fill up with core files if you're not careful.
8552.4  SMURF::DENHAM  Digital UNIX Kernel  Mon Jan 27 1997 12:05  2
    Yep, we'd probably want to rig it up to work something like
    the uac command works for unaligned access fixup/reporting.
8552.5  TLE::REAGAN  All of this chaos makes perfect sense  Mon Jan 27 1997 12:52  9
    RE: .3
    
    >        Of course the obvious problem with turning on such a
    >        feature system-wide is that all your disc space can
    >        fill up with core files if you're not careful.
    
    You mean like version numbers?  :-) :-)
    
    				-John
8552.6  SMURF::PBECK  Paul Beck  Mon Jan 27 1997 13:24  1
    set file core /version_limit=5
8552.7  SMURF::DENHAM  Digital UNIX Kernel  Mon Jan 27 1997 14:28  4
    % set file core /version_limit=5
    set: Variable name must begin with a letter.
    
    Rats!
8552.8  Minor oversight...  SMURF::PBECK  Paul Beck  Mon Jan 27 1997 14:38  2
    I forgot to mention steps 1 and 2 (1: add version numbers to file
    system, and 2: write DCL shell).
8552.9  Real UNIX folks wouldn't get caught dead w/a DCL shell :-)  VAXCPU::michaud  Jeff Michaud - ObjectBroker  Mon Jan 27 1997 16:07  6
> I forgot to mention steps 1 and 2 (1: add version numbers to file
> system, and 2: write DCL shell).

	Well, I could see adding version numbers, but screw DCL (btw,
	there are DCL shells by 3rd parties that at least used to
	be available for ULTRIX).
8552.10  QAR entered.  GALVIA::STONES  Tom Stones  Fri Jan 31 1997 11:00  22
I've entered QAR 51279 :


There doesn't seem to be a simple way of causing core files to be
created with unique names.

I'm running parallel applications on a cluster.  A parallel application
(PVM, MPI or HPF) typically consists of multiple copies of the same code
running on one or more nodes of the cluster.  The multiple processes
often expect to be running in the same working directory.  When they
crash, they all create a file called "core".  By Sod's Law, the core file
with the info you're looking for is never the one that remains on the disk
after they've all finished overwriting each other.

I'd like a mechanism, such as setting an environment variable, to indicate
that I'd like unique names for any core files produced.  I'd be satisfied
with appending the pid to the name.  Better would be appending nodename
and pid, and best of all would be appending program name, host and pid.

The "High Performance Computing" and "Scientific and Technical" markets
are making more and more use of parallel codes like this, so a need for
such a mechanism is becoming more urgent.
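
For illustration only, here is a minimal C sketch of the most specific naming scheme asked for above (program name, host and pid). The function name unique_core_name and the exact "core.<program>.<host>.<pid>" layout are assumptions made for this example; nothing here is an existing Digital UNIX interface, and it only builds the name rather than changing what the kernel writes.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a Digital UNIX API): build a name of the form
 * "core.<program>.<host>.<pid>" in buf.
 * Returns 0 on success, -1 if the name does not fit in len bytes.
 */
int unique_core_name(char *buf, size_t len, const char *argv0,
                     const char *host, pid_t pid)
{
    /* Keep only the base name of the executable, dropping any directory. */
    const char *base = strrchr(argv0, '/');
    int n;

    base = (base != NULL) ? base + 1 : argv0;
    n = snprintf(buf, len, "core.%s.%s.%ld", base, host, (long)pid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would typically get the host from gethostname() and the pid from getpid(), then use the result as the target of a rename().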
8552.11  Would a last chance handler help?  HYDRA::NEWMAN  Chuck Newman, 508/467-5499 (DTN 297), MRO1-3/F26  Fri Jan 31 1997 12:34  11
As a workaround, how about implementing a last chance
handler that cooperates w/ other copies of the same
program such that only one process can die at a time,
and they rename old core files before they exit.

You'd probably need to find the last-chance handler
first, and call that from your last-chance handler
(or whatever it takes to make the core dump be
written).

I don't know if this would work -- just a thought...
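
One way the "only one process can die at a time" part might be sketched is with an advisory fcntl() lock on a file in the shared working directory. The helper name take_dump_lock and the lock-file idea are assumptions made for this example. It also assumes that the lock is released only when the process has finished dying (after the core is written) and that fcntl() locking works on the file system the cluster shares; neither has been verified here.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: take an exclusive advisory lock on lock_path,
 * blocking until no other process holds it. Returns the open descriptor
 * (leave it open so the lock is held until the process goes away),
 * or -1 on failure.
 */
int take_dump_lock(const char *lock_path)
{
    struct flock fl;
    int fd = open(lock_path, O_RDWR | O_CREAT, 0666);

    if (fd < 0)
        return -1;

    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;      /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;             /* 0 = through end of file, i.e. the whole file */

    if (fcntl(fd, F_SETLKW, &fl) != 0) {   /* F_SETLKW waits for the lock */
        close(fd);
        return -1;
    }
    return fd;
}
```

A dying process would call this first, rename any existing "core", and then let the dump proceed; the next process to crash blocks in F_SETLKW until the first one is gone.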
8552.12  VAXCPU::michaud  Jeff Michaud - ObjectBroker  Fri Jan 31 1997 14:23  9
Re: .11

	If by last chance you mean a signal handler, yes, that could
	work as well.  I've used such signal handlers in the past
	to syslog info from the signal context about the pc and address
	for segv and bus violations.  Such a handler could also rename
	an existing "core" file before re-asserting the signal (and make
	sure to restore the signal's behaviour to SIG_DFL before re-
	asserting).