
Conference clt::cma

Title:DECthreads Conference
Moderator:PTHRED::MARYSTEON
Created:Mon May 14 1990
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1553
Total number of notes:9541

1496.0. "printing a traceback on program exit" by PEACHS::LAMPERT (Pat Lampert, UNIX Applications Support, 343-1050) Tue Feb 25 1997 18:05

Lately we have had a couple of customers ask us about how they can 
print a trace of the stack(s) when exiting a threaded program after
a failure. They want this to happen automatically without the user having
to enter or use the debugger. 

For non-threaded code I have used an example routine called walk_stack in
the past, but this routine doesn't seem to work with threaded programs. My
other idea is to call pthread_debug_cmd after catching the exception, but
that call itself generates a SEGV.

What is the best way to accomplish this?

Pat


Here is my current hacked up test code along with the stack_walk routine
I have used in the past. 

Compile with:

    cc -o catchsegv catchsegv.c -threads

Run it:

    catchsegv

and from another terminal do a kill -SEGV on the process id.  It catches the
signal, but I don't know how to print a traceback...


#include <pthread.h>
#include <signal.h>
#include <stdio.h>	/* printf, perror */
#include <stdlib.h>	/* exit */

//char *programName;

    void handler(void)
    {
    sigset_t mask;
            if ( sigemptyset(&mask) == -1 )

            perror("sigemptyset"), exit (1);

            /* set up mask to catch SEGV */

         if ( sigaddset( &mask, SIGSEGV) == -1)
          perror("sigaddset"), exit(1);

          while (1) {
          switch  (sigwait(&mask)) {

          case SIGSEGV: {
			  printf("Caught segv\n");

//			  pthread_debug_cmd("stack"); 	//causes segv - why?
			  pthread_debug();		//This seems to work OK, but requires user interaction.
//			  walk_stack();			//This doesn't seem to work in threaded code.

                          pthread_exit(0);
			}

          case -1: perror("sigwait");
              exit(1);

          default: printf( "unknown signal\n");
                   break;
          } /* switch */

          printf("Counting \n");
          { long j,t=0; for (j=1;j<100000000; j++) t+=j;}
          printf("Count over\n");
          } /* while */
    }

    main(
     int argc,
     char **argv
	)
    {
          pthread_t       thread;
	 
//	  programName = argv[0];	// Must supply program name as argv[0] to use walk_stack()

//	  pthread_debug_cmd(); 		// try loading early per appendix D in manual. Didn't help.
          pthread_create(&thread,
                         pthread_attr_default,
                          (void *) handler,
                          NULL);
          pthread_join(thread,NULL);


          printf("main(): Goodbye\n");

    }
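
One detail the test program above leaves out is blocking the signal before
calling sigwait; sigwait is only defined to work on signals that are blocked
in the calling thread. The following is a minimal sketch of the
dedicated-waiter pattern, assuming the two-argument POSIX 1003.1c form of
sigwait rather than the one-argument form used above. The names struct waiter,
signal_waiter and wait_for_signal_in_thread are hypothetical, and kill() on the
process itself stands in for the "kill -SEGV" issued from another terminal.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper type for this sketch. */
struct waiter {
    sigset_t mask;   /* signals the waiter thread accepts */
    int      signo;  /* signal actually received, or -1 on error */
};

/* Thread body: wait synchronously for one signal from w->mask. */
static void *signal_waiter(void *arg)
{
    struct waiter *w = arg;
    int sig;

    w->signo = (sigwait(&w->mask, &sig) == 0) ? sig : -1;
    return NULL;
}

/* Block signo, start a waiter thread, send signo to the process, and
   return the signal number the waiter saw (or -1 on failure). */
int wait_for_signal_in_thread(int signo)
{
    struct waiter w;
    pthread_t tid;

    sigemptyset(&w.mask);
    sigaddset(&w.mask, signo);
    w.signo = -1;

    /* Block first: threads created afterwards inherit this mask, so the
       signal stays pending until the waiter picks it up with sigwait. */
    if (pthread_sigmask(SIG_BLOCK, &w.mask, NULL) != 0)
        return -1;
    if (pthread_create(&tid, NULL, signal_waiter, &w) != 0)
        return -1;

    kill(getpid(), signo);      /* stands in for an external kill */
    pthread_join(tid, NULL);
    return w.signo;
}
```

Calling wait_for_signal_in_thread(SIGUSR1) should return SIGUSR1. This only
covers signals sent to the process from outside; a genuine memory fault is
raised synchronously in the faulting thread and is not picked up this way.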




==================================

Here is the stack_walk routine if you want it for anything...

You need to define programName as a global in the main program (stack.c
declares it extern) and link with stack.o; you also need -lmld.

Run as follows:

catchsegv catchsegv

Then kill -SEGV from another terminal.


/*stack.c*/
#include <stdio.h>
#include <excpt.h>
#include <pdsc.h>
#include <ldfcn.h>

static elf32_LDFILE *ldptr = NULL;
extern char *programName;

WalkStack ()
{
  CONTEXT context;
  unsigned long stack[1024];
  int level = 0;

  printf("walkstack...\n");
  exc_capture_context(&context);

  do {
    stack[level] = context.sc_pc - 4;
    exc_virtual_unwind(0, &context);
    level++;
  } while (context.sc_pc && level < 1024);	/* stop before overflowing stack[] */

  stacktrace(level, stack);
}

stacktrace(unsigned int nEntries, unsigned long *stack)
{
  SYMR asym;
  unsigned istack = 0;
  int isym, ifd;
  char *pname;
  PDR apd;
  pPDR ppd = &apd;
  LDFILE *lfile;

  unsigned long pc;

  ldptr = ldopen(programName, NULL);
  if (ldptr == NULL) {
    printf("cannot read in %s\n", programName);
    return(-1);
  }
  lfile = (LDFILE *)&(ldptr->ldfile);

  if (PSYMTAB(ldptr) == NULL) {
    fprintf(stderr, "cannot read symbol table\n");
    return(-1);
  }

  /* same bound as nEntries-3, written to avoid unsigned wrap when nEntries < 3 */
  for (istack = 0; istack + 3 < nEntries; istack++) {
    
    pc = stack[istack];

	ldgetpd(lfile, ipd_adr(pc), ppd);
	isym = ppd->isym;
	if (isym != isymNil) {
	   if (ldtbread(lfile,isym,&asym) == FAILURE) {
		printf("cannot read %d symbol\n", ppd->isym);
		continue;
	    }
	    pname = (char *)ldgetname(lfile, &asym);
	    if (ppd->framereg != 30) {
		printf("%s has a non-sp framereg (%d)\n", pname,
		    ppd->framereg);
		/***return;*/
	    }
	    ifd = ld_ifd_symnum(lfile, ppd->isym);
	    printf("%4d: %s ", istack, pname);

	   printf ("[%s: %d, 0x%lx]\n", st_str_ifd_iss(ifd, 1),
	    SYMTAB(ldptr)->pline[ppd->iline+((pc-ppd->adr)/4)],
	    pc);

	} else {
	    printf("%4d: <name stripped>\n", istack);
	}

  }
}

ipd_adr (adr)
unsigned long adr;

{
    int		ilow;
    int		ihigh;
    int		ihalf;
    int		ilowold;
    int		ihighold;
    int		ipd;
    PDR		apd;
    LDFILE *lfile = (LDFILE *)&(ldptr->ldfile);

    if (PSYMTAB(ldptr) == NULL) {
      fprintf(stderr, "cannot read symbol table\n");
      return(-1);
    }

    ilow = 0;
    ihigh = SYMHEADER(lfile).ipdMax;
    /* binary search proc table */
    while (ilow < ihigh) {
	ihalf = (ilow + ihigh) / 2;
	ilowold = ilow;
	ihighold = ihigh;
	ldgetpd(lfile, ihalf, &apd);
	if (adr < apd.adr)
	    ihigh = ihalf;
	else if (adr > apd.adr)
	    ilow = ihalf;
	else {
	    ilow = ihigh = ihalf;
	    break;
	} /* if */
	if (ilow == ilowold && ihigh == ihighold)
	    break;
    } /* while */

    ipd =  ((ilow < ihigh || ihigh < 0) ? ilow : ihigh);

    return (ipd);

}

1496.1. "Yick. How about just taking a core file, instead?" by WTFN::SCALES (Despair is appropriate and inevitable.) Wed Feb 26 1997 13:23 (36 lines)

.0> Lately we have had a couple of customers ask us about how they can  print
.0> a trace of the stack(s) when exiting a threaded program after a failure. 

Actually, there is an effort underway to come up with a generalized set of
routines to do just that (well, almost...they do the stack walk part).

.0> For non-threaded code I have used an example routine called walk_stack in 
.0> the past, but this routine doesn't seem to work with threaded programs. 
[...]
.0> from another terminal do a kill -SEGV on the process id.

Have you tried it by generating the SEGV inside the program instead of
sending it from the outside??  There's a big difference!!  If you send it
from the outside, there's no guarantee which thread will get it, so there's
no telling which stack you'll walk (or even that the target thread won't
try to recover from it...)
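
To illustrate the difference, here is a small sketch, assuming POSIX threads
and sigaction. The names on_segv, raiser and segv_lands_on_raising_thread are
hypothetical. A thread raises SIGSEGV on itself and the handler records which
thread it ran on. raise() is only a stand-in for a genuine fault, chosen
because the handler can return safely afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <string.h>

static pthread_t handler_thread;            /* thread the handler ran on */
static volatile sig_atomic_t handler_ran;   /* set once the handler has run */

static void on_segv(int signo)
{
    (void)signo;
    handler_thread = pthread_self();
    handler_ran = 1;
}

/* Thread body: raise SIGSEGV on this thread, then check where it landed. */
static void *raiser(void *arg)
{
    int *matched = arg;

    raise(SIGSEGV);     /* directed at the calling thread */
    *matched = handler_ran &&
               pthread_equal(handler_thread, pthread_self()) != 0;
    return NULL;
}

/* Returns 1 if the handler ran on the raising thread, 0 if not,
   and -1 if the setup failed. */
int segv_lands_on_raising_thread(void)
{
    struct sigaction sa, old;
    pthread_t tid;
    int matched = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    if (pthread_create(&tid, NULL, raiser, &matched) != 0) {
        sigaction(SIGSEGV, &old, NULL);
        return -1;
    }
    pthread_join(tid, NULL);
    sigaction(SIGSEGV, &old, NULL);     /* restore the previous disposition */
    return matched;
}
```

With a signal sent from outside via kill, the system may pick any thread that
does not have the signal blocked, so the same check could fail there.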

.0> My other idea is to call pthread_debug_cmd after catching the exception
.0> but that generates a segv in itself.

I don't know why that would SEGV, but it's just a debugging routine, and it's
not exactly robust...

.0> What is the best way to accomplish this?

Probably the best thing to do is to trap the failure somehow and then walk
the stack.  The somehow is tricky, because it kind of requires you to know in
advance _how_ it's gonna fail.  And, if you replace the default signal
handlers, you may interfere with other threads' ability to deal (properly)
with those signals.  (Oh, and, in the case of the synchronous signals, you
can get multiple ones going at the same time, in case of catastrophic
failure, so it would be good if your walker routine were completely
reentrant...)
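
As a rough illustration of what "completely reentrant" could mean for the
output side of such a walker, the sketch below formats and prints a saved PC
using only caller-supplied stack storage and write(2), with no printf and no
writable static state. The names fmt_hex and report_pc are hypothetical; the
unwinding itself is not shown.

```c
#include <stddef.h>
#include <unistd.h>

/* Format v as "0x<hex>" into buf and return the length (excluding NUL).
   The caller supplies at least 2 + 2*sizeof(v) + 1 bytes. */
size_t fmt_hex(unsigned long v, char *buf)
{
    static const char digits[] = "0123456789abcdef";   /* read-only table */
    char tmp[2 * sizeof v];
    size_t n = 0, len = 0;

    do {                        /* collect digits, least significant first */
        tmp[n++] = digits[v & 0xf];
        v >>= 4;
    } while (v != 0);

    buf[len++] = '0';
    buf[len++] = 'x';
    while (n > 0)               /* copy them back in reading order */
        buf[len++] = tmp[--n];
    buf[len] = '\0';
    return len;
}

/* Write one saved PC as a line on fd, using only write(2). */
void report_pc(int fd, unsigned long pc)
{
    char line[2 + 2 * sizeof pc + 2];
    size_t len = fmt_hex(pc, line);
    ssize_t rc;

    line[len++] = '\n';         /* replace the NUL with a newline */
    rc = write(fd, line, len);
    (void)rc;                   /* nothing safe to do on failure here */
}
```

Because each call works only on its own locals, two threads (or a nested
signal on the same thread) can run report_pc at once without corrupting each
other's output buffers, though their lines may still interleave on the fd.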


				Webb

1496.2. "See 1227.4 and 1227.5" by DUCAT::ROSCOE Tue Mar 04 1997 10:25 (2 lines)

See notes 1227.4 and 1227.5 for generating a core file that contains the 
stack trace of the thread that caused the exception.
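
The referenced notes are not reproduced here. As a generic POSIX sketch (not
necessarily what those notes describe), a program can raise its core-file size
limit at startup and, if it installs its own handler, restore the default
action and re-raise so that the system still writes the core file. The names
enable_core_dumps and die_with_core are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit.
   Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}

/* Signal handler: put the default action back and re-raise, so the
   signal terminates the process (and dumps core) once we return. */
void die_with_core(int signo)
{
    signal(signo, SIG_DFL);
    raise(signo);
}
```

A program would call enable_core_dumps() early in main and install
die_with_core with sigaction for the signals it cares about; any logging
added to the handler would need to be limited to async-signal-safe calls.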