T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
68.1 | Is effect of backing up the PC defined? | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Wed Apr 16 1997 19:02 | 5 |
| It's relatively easy for a debugger to expand the current line-number table
into a sorted PC->line number map that is easy to search. It's not so clear
how to do that with the new proposed table, given the ADD_PC function. What
is supposed to happen if the same PC is described as part of several lines?
In any case, building the map seems to require a sort operation in general.
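For concreteness, a minimal sketch of the expand/sort/search approach (in C, with made-up types rather than the actual table format); the qsort is the step that ADD_PC seems to force:

    /* Expand the line-number program into (pc, line) pairs, sort them,
       then binary search.  If ADD_PC can step the PC backwards, the pairs
       are not produced in address order, hence the sort. */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct { unsigned long pc; int line; } pc_line;

    static int cmp_pc(const void *a, const void *b)
    {
        unsigned long pa = ((const pc_line *)a)->pc;
        unsigned long pb = ((const pc_line *)b)->pc;
        return (pa > pb) - (pa < pb);
    }

    /* last entry whose pc is <= the lookup pc */
    static int line_for_pc(const pc_line *map, size_t n, unsigned long pc)
    {
        size_t lo = 0, hi = n;
        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;
            if (map[mid].pc <= pc) lo = mid + 1; else hi = mid;
        }
        return lo ? map[lo - 1].line : -1;
    }

    int main(void)
    {
        /* out of address order because a cold fragment was moved */
        pc_line map[] = { {0x1200, 10}, {0x1210, 11}, {0x1000, 3}, {0x1010, 4} };
        size_t n = sizeof map / sizeof map[0];
        qsort(map, n, sizeof map[0], cmp_pc);
        printf("line for pc 0x1204: %d\n", line_for_pc(map, n, 0x1204));
        return 0;
    }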
|
68.2 | Does this answer the question, or did you have something else in mind? | QUARRY::petert | rigidly defined areas of doubt and uncertainty | Thu Apr 17 1997 17:56 | 26 |
| > Is the effect of backing up the PC defined?
> It's relatively easy for a debugger to expand the current line-number table
> into a sorted PC->line number map that is easy to search. It's not so clear
> how to do that with the new proposed table, given the ADD_PC function. What
> is supposed to happen if the same PC is described as part of several lines?
I think I know what you're questioning here, but let me just go over a
few things to make sure.
This proposal started out as a way to represent split routines in the symbol
table, but it was felt that my suggestions were a bit awkward and didn't address
all the possibilities, specifically problems with line numbers. So we
transitioned into this proposal, which deals only with line numbers. While
there is a many-PCs-to-one-line-number mapping, there is never (well, not considering
alternate entry points) a many-lines-to-one-PC mapping here. The only time the
PC would actually be backed up is when a section of code is split off
and placed in a "hot region". This could be a good distance away from
its previous position, though it may just be moved to the top of the
routine. Once the end of this section of code had been reached, the PC
would be adjusted again, likely forward, to where it had been before it
had been adjusted backwards. I don't foresee a case where several lines
would point to the same PC. I would consider that an error on the part
of the producer of the extended source info, but maybe I'm not thinking
object oriented enough...
PeterT
|
68.3 | | TLE::DMURPHY | | Fri Apr 18 1997 12:41 | 29 |
| > It's relatively easy for a debugger to expand the current line-number table
> into a sorted PC->line number map that is easy to search. It's not so clear
> how to do that with the new proposed table, given the ADD_PC function.
I'll be the first person to admit that there are more problems with
representing what the compilers are doing than are detailed
in this note. Having said that, let me explain how I thought we'd use the table.
The procedure descriptors, as well as the elements of the expanded
line number table, are sorted in ascending address order. (Well, the linker
only partially sorts them, but that's another issue.)
Locate the function for a given pc taking into account lexically enclosed
functions. Use the new representation to run a machine that transitions
pc/file/line until we get the desired PC. Set the debugger location state
given the file/line/column we arrived at by running the machine.
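Roughly, a sketch of running that machine (ADD_PC and SET_FILE are from the proposal; the other opcodes and the layouts here are placeholders I made up):

    /* Start from the function's initial file/line state and interpret
       commands until the target pc falls in the range covered by the
       current state. */
    #include <stdio.h>

    enum op { OP_ADD_PC, OP_ADD_LINE, OP_SET_FILE, OP_SET_LINE };

    typedef struct { enum op op; long arg; } cmd;
    typedef struct { long pc; int file; int line; } loc_state;

    static loc_state run_to_pc(loc_state st, const cmd *prog, int n, long target)
    {
        int i;
        for (i = 0; i < n; i++) {
            switch (prog[i].op) {
            case OP_ADD_PC:
                if (prog[i].arg > 0 && st.pc <= target && target < st.pc + prog[i].arg)
                    return st;               /* target lies in this pc range */
                st.pc += prog[i].arg;        /* may be negative for a moved fragment */
                break;
            case OP_ADD_LINE: st.line += (int)prog[i].arg; break;
            case OP_SET_FILE: st.file  = (int)prog[i].arg; break;
            case OP_SET_LINE: st.line  = (int)prog[i].arg; break;
            }
        }
        return st;
    }

    int main(void)
    {
        /* two instructions on line 5, then a fragment moved backwards */
        cmd prog[] = { { OP_SET_LINE, 5 }, { OP_ADD_PC, 8 },
                       { OP_ADD_LINE, 1 }, { OP_ADD_PC, -0x108 } };
        loc_state start = { 0x1000, 1, 1 };
        loc_state at = run_to_pc(start, prog, 4, 0x1004);
        printf("file %d, line %d\n", at.file, at.line);
        return 0;
    }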
> What is supposed to happen if the same PC is described as part of several
> lines?
This is certainly not something that can be done with the existing mechanism.
Is that your point? The compilers would not issue such a description until
we defined what was supposed to happen.
>In any case, building the map seems to require a sort operation in general.
I guess I don't see it as a simple map. It does require more work of
the debuggers but it more accurately captures the program state at debug
time.
|
68.4 | Does it meet the goals? | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Fri Apr 18 1997 13:10 | 15 |
| > Locate the function for a given pc taking into account lexically enclosed
> functions. Use the new representation to run a machine that transitions
> pc/file/line until we get the desired PC.
Two problems. Worst, this assumes that function fragments aren't interleaved
with each other, so that you can find the function before interpreting the
line-number info. I thought one reason for the proposal was so that
rarely-used code could be pushed far away from the "hot" code of an image.
And second, this requires a linear search to map PC to line number. That
doesn't sound too bad, until you realize that debuggers like to do this
operation for every line of machine code they display.
Maybe I'm misunderstanding. To find the function, do you use the debugger
info, or the Runtime Procedure Descriptor and Code Range Descriptor (which
can describe discontiguous, interleaved pieces of functions)?
|
68.5 | This only addresses a subset of the problem... | SMURF::PETERT | rigidly defined areas of doubt and uncertainty | Fri Apr 18 1997 14:23 | 49 |
| Let's be discontiguous:
> Maybe I'm misunderstanding. To find the function, do you use the
> debugger info, or the Runtime Procedure Descriptor and Code Range
> Descriptor (which can describe discontiguous, interleaved pieces of
> functions)?
Neither dbx nor ladebug (to my knowledge) uses the RPDs and code range
descriptors. While the potential seems great, they are not currently
part of non-shared objects, and there is no easy mapping from a
procedure descriptor, which contains all sorts of useful info, to
the RPD/CRD pair, which contains info useful for exception handling.
You still need to go through the symbol table to find the function.
> Two problems: Worst, this assumes that function fragments aren't
> interleaved with each other, so that you can find the function
> before interpreting the line-number info. I thought one reason
> for the proposal was so that rarely- used code could be pushed
> far away from the "hot" code of an image.
This started out as a split routine proposal, so that the hot and
cold regions could be described accurately, but that was seen as
missing some basic assumptions, such as how to deal with the line
numbers. OM can already split routines and executables and rearrange
them into hot and cold regions. But it does nothing to the line number
info. So once the routine is split, the line numbers no longer
represent the code accurately. This proposal is intended for producers
and consumers alike to define a way to map the line numbers in
cases where odd things have been done to the code, and to deal with
deficiencies like how to handle include files which contain code,
something we've ignored for a long time. The function will still
be found by start address + size of function for a particular
pc. Obviously this will not map well with split routines, but
it does allow a way of getting there by laboriously going
through the line number info if no other match is found.
I'd like to see us eventually come up with an easy way of mapping
which code ranges apply to which function, but no, that is not a
goal of THIS proposal.
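A sketch of that lookup order (hypothetical descriptor layout; line_program_covers() here is only a stub standing in for running the line-number machine over a procedure's commands):

    #include <stddef.h>

    typedef struct { unsigned long start; unsigned long size; /* ... */ } proc_desc;

    static int line_program_covers(const proc_desc *pd, unsigned long pc)
    {
        (void)pd; (void)pc;
        return 0;    /* stub: would replay pd's line-number commands and
                        report whether the pc is ever reached */
    }

    /* fast path: start address + size; slow path: laboriously replay each
       procedure's line-number info looking for the pc (split routines) */
    const proc_desc *find_proc(const proc_desc *pds, size_t n, unsigned long pc)
    {
        size_t i;
        for (i = 0; i < n; i++)
            if (pc >= pds[i].start && pc < pds[i].start + pds[i].size)
                return &pds[i];
        for (i = 0; i < n; i++)
            if (line_program_covers(&pds[i], pc))
                return &pds[i];
        return NULL;
    }

    int main(void)
    {
        proc_desc pds[] = { { 0x1000, 0x40 }, { 0x2000, 0x80 } };
        return find_proc(pds, 2, 0x2010) ? 0 : 1;
    }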
> And second, this requires a linear search to map PC to line
> number. That doesn't sound too bad, until you realize that
> debuggers like to do this operation for every line of machine
> code they display.
Yes it can be very compute intensive, but at least it's a way to deal
with it until we come up with a more complete answer.
PeterT
|
68.6 | | TLE::DMURPHY | | Fri Apr 18 1997 16:24 | 67 |
|
> Maybe I'm misunderstanding. To find the function, do you use the
> debugger info, or the Runtime Procedure Descriptor and Code Range
> Descriptor (which can describe discontiguous, interleaved pieces of
> functions)?
I believe that ladebug makes limited use of the RPD & CRD data.
One reason not to use it is that I believe code linked -static (the
kernel??) does not/did not retain the .xdata/.pdata sections containing
the RPD/CRD information.
The following is from a V3.2 /vmunix.
odump -h /vmunix
***SECTION HEADER***
Name Paddr Vaddr Size
Scnptr Relptr Lnnoptr
Nreloc Nlnno Flags
/vmunix:
.text 0xfffffc0000230000 0xfffffc0000230000 0x00000000002ef930
0x0000000000000270 0x0000000000000000 0x0000000000000000
0 0 0x00000020
.data 0xfffffc0000537340 0xfffffc0000537340 0x000000000006c480
0x00000000003075b0 0x0000000000000000 0x0000000000000000
0 0 0x00000040
.rdata 0xfffffc00005352a0 0xfffffc00005352a0 0x00000000000020a0
0x0000000000305510 0x0000000000000000 0x0000000000000000
0 0 0x00000100
.rconst 0xfffffc000051f930 0xfffffc000051f930 0x0000000000015970
0x00000000002efba0 0x0000000000000000 0x0000000000000000
0 0 0x02200000
.lit8 0xfffffc00005a37c0 0xfffffc00005a37c0 0x0000000000000880
0x0000000000373a30 0x0000000000000000 0x0000000000000000
0 0 0x08000000
.sdata 0xfffffc00005a4040 0xfffffc00005a4040 0x0000000000003330
0x00000000003742b0 0x0000000000000000 0x00000000003775e0
0 0 0x00000200
.sbss 0xfffffc00005a7370 0xfffffc00005a7370 0x0000000000000960
0x0000000000000000 0x0000000000000000 0x00000000003775e0
0 0 0x00000400
.bss 0xfffffc00005a7cd0 0xfffffc00005a7cd0 0x00000000000c3650
0x0000000000000000 0x0000000000000000 0x0000000000000000
0 0 0x00000080
> Two problems: Worst, this assumes that function fragments aren't
> interleaved with each other, so that you can find the function
> before interpreting the line-number info. I thought one reason
I think I've already agreed that the current symbol table format is a pretty
poor vessel for containing the information the compilers would use
to convey scheduling/optimization/other information.
This proposal seeks to cause minimal disturbance to the symbol table and
minimal breakage of the tools.
> And second, this requires a linear search to map PC to line
> number. That doesn't sound too bad, until you realize that
> debuggers like to do this operation for every line of machine
> code they display.
I'd be willing to bet that this is not as bad as you make it sound. Caching
search results and running the machine from known cached states would
most probably speed up succeeding mappings.
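For example, something along these lines (types invented for the sketch): memoize the pc/file/line states visited while replaying, and consult that cache before replaying again; a fancier version could also store a command index with each state and resume the machine from the closest cached state rather than from the top of the procedure.

    #include <stddef.h>

    typedef struct { long pc; int file; int line; } loc_state;

    #define NCACHE 128
    static loc_state cache[NCACHE];
    static size_t nused;

    /* record a state visited while running the machine */
    void cache_state(loc_state st)
    {
        cache[nused % NCACHE] = st;        /* simple ring buffer */
        nused++;
    }

    /* returns 1 and fills *out on a hit, 0 on a miss */
    int cache_lookup(long pc, loc_state *out)
    {
        size_t i, n = nused < NCACHE ? nused : NCACHE;
        for (i = 0; i < n; i++)
            if (cache[i].pc == pc) { *out = cache[i]; return 1; }
        return 0;
    }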
|
68.7 | approved | SMURF::LOWELL | | Mon Apr 21 1997 13:18 | 5 |
|
Approved on 4/21/97.
OF/STWG ECO ID = 13
|
68.8 | Various comments | FLYBA::BRENDER | Ron Brender | Wed Apr 23 1997 11:04 | 41 |
| I think there are a couple of different issues that are getting mixed up together
-- maybe these comments will help clarify at least what *I* had in mind...
Re .3: > Locate the function...
Re .4: > ...interleaved function fragments...
The source locator table does not, and is not intended to, address how to do
that. Additional symbol table information is needed (and I have a proposal in
mind that will support even interleaved function fragments). So, this is not
a complete solution to the "split routine problem" if you will. But it is a
necessary part, which also provides many other desirable features.
Re .1: > ...the same PC is described as part of several lines...
Ah, good question! This representation is intended to allow such descriptions,
in order to support the full glory :-) of debugging optimized code in the
future -- in particular, in order to deal with the likes of cross-jumping
optimizations (and others -- but not inlining, which is a different matter).
It is TBD when we will want to support this full generality, however. In the
short term, as Peter suggests, we may possibly limit ourselves to emitting
only a many-PCs-to-one-line mapping. However, it would please me greatly if
ladebug at least would take on the challenge of supporting the full generality
of the representation.
There is a half-way point here, which is a reasonable fallback for dbx and/or
ladebug. Allow a many-to-many mapping to be represented in the symbol table
but ignore duplicate PC information. This would allow GEM to emit the full
description in advance of when the debugger and other tools choose to deal
with it.
(GEM already knows how to do this, given a suitable object symbol
table representation! GEM needs to be a tad careful to arrange that the more/
most useful mapping comes first, but this is no different than being careful
to arrange that the only mapping is the more/most useful one in the current
scheme.)
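To illustrate the fallback (made-up types; not GEM's or the debuggers' actual data structures): while expanding the table, a consumer simply skips any entry whose PC it has already seen, so the first -- and, per the above, most useful -- mapping for a PC wins:

    #include <stddef.h>

    typedef struct { long pc; int file; int line; } pc_loc;

    /* append entry to map[0..*n) unless an earlier entry already names its pc */
    void add_unless_duplicate(pc_loc *map, size_t *n, pc_loc entry)
    {
        size_t i;
        for (i = 0; i < *n; i++)
            if (map[i].pc == entry.pc)
                return;                    /* later duplicates are ignored */
        map[(*n)++] = entry;
    }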
It is a weakness of the proposal that we did not point out and clarify this
issue.
|
68.9 | Fly in the ointment ? | GEMGRP::MONTELEONE | | Thu May 22 1997 15:25 | 21 |
|
Regarding this point:
The linker will concatenate the Optimization Tables of all modules
contributing to an image. Note in particular that the linker need
not parse or analyze the Optimization Table information in any
way. The resulting image Optimization Table will be accessed by
post link tools such as OM and debuggers.
Consider the set_file command, which references a file descriptor
index. Wouldn't the linker need to detect such commands and possibly
add to the IFT table for this case? How else would the post-link tools
know how to map the files referenced in the set_file command if there
were no such mapping?
Bob
|
68.10 | -.1 not a problem... | GEMGRP::MONTELEONE | | Thu May 22 1997 17:25 | 15 |
|
| This is not a problem after all, since the linker produces a complete
FIT for each file descriptor, with a mapping for every file
descriptor described in its .o file. I was under the impression that
the FIT for a file descriptor contained only entries that were
*referenced* in that file descriptor, but it doesn't - it produces
entries for all fellow files which come from the same .o file.
If the linker were ever to optimize this behavior, then there would
be a problem, but there is no problem now...
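As a sketch of the indirection involved (field names invented here, not the actual structure definitions): a set_file operand is a module-local index, and the consumer maps it through the FIT of the file descriptor that owns the optimization commands:

    #include <stddef.h>

    typedef struct {
        const int *fit;     /* local file index -> image-wide file descriptor index */
        size_t nfit;
    } file_desc_view;

    /* returns -1 if the local index has no mapping */
    int resolve_set_file(const file_desc_view *owner, int local_index)
    {
        if (local_index < 0 || (size_t)local_index >= owner->nfit)
            return -1;
        return owner->fit[local_index];
    }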
Bob
|
68.11 | Alternate entry point clarification... | GEMGRP::MONTELEONE | | Mon Jun 02 1997 15:35 | 33 |
|
This note clarifies the specification of alternate entry points
and extended source line information.
An entry point and all of its corresponding alternates will
reference the same set of PPOD structures. The main entry point
will reference the PPOD structures through its related procedure
descriptor via the iopt field as usual. The iopt field associated
with the alternate entry point procedure descriptors will be zero.
The PPOD structures associated with the alternate entry points must
be accessed via the procedure descriptor associated with the main
entry point.
To determine the initial source location state for an alternate
entry point, the initial state of the related main entry point
is retrieved (e.g. the low line number (lnLow) in the procedure descriptor,
the source file associated with the main entry point, etc.). The
PPOD structures associated with the main entry point are found
via the iopt field of the procedure descriptor. The PPOD structures
are interpreted and the source state is updated until the current
address is equal to the address specified in the alternate entry
point's procedure descriptor. At this point the initial source
location is known for the alternate entry point.
Note that this process is especially meaningful when there exist
file boundaries amongst entry points.
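A rough sketch of that procedure in C (the struct layouts are invented and run_ppod_step() is only a stub standing in for interpreting one PPOD command):

    typedef struct { long pc; int file; int line; } loc_state;
    typedef struct { long addr; int ln_low; int iopt; /* ... */ } proc_descr;

    /* stub: would decode and apply one command from the PPOD stream named
       by iopt, advancing *cursor; returns 0 when the stream is exhausted */
    static int run_ppod_step(int iopt, int *cursor, loc_state *st)
    {
        (void)iopt; (void)cursor; (void)st;
        return 0;
    }

    /* start from the main entry's initial state (its low line number and
       source file) and step until the alternate entry's address is reached */
    loc_state alt_entry_initial_state(const proc_descr *main_pd,
                                      const proc_descr *alt_pd,
                                      loc_state main_initial)
    {
        loc_state st = main_initial;
        int cursor = 0;
        while (st.pc != alt_pd->addr &&
               run_ppod_step(main_pd->iopt, &cursor, &st))
            ;    /* PPODs are found via the main entry's iopt field */
        return st;
    }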
Bob
|
68.12 | | SMURF::LOWELL | | Tue Jun 03 1997 10:37 | 17 |
| By making alternate entries share their PPOD structures with the
main entry point, we are establishing a precedent that we will
probably want to follow for anything else that is implemented
as optimization symbols.
(Ron, will this work for split-lifetimes etc.? Or can we expect at
some point that alternate entries will have some of their own opt
symbols, but they always have to look to the main entry for extended
source line information?)
One small correction on your clarification, Bob. The PDR iopt field
should be set to ioptNil, not zero.
Also, what did we decide to do with the lnLow field in alt entry PDRs?
We were planning on setting it to -1, right? Will this be a problem
for debuggers that don't know about the new line number info?
|
68.13 | | GEMGRP::MONTELEONE | | Tue Jun 03 1997 11:03 | 15 |
|
>>Also, what did we decide to do with the lnLow field in alt entry PDRs?
>>We were planning on setting it to -1, right? Will this be a problem
>>for debuggers that don't know about the new line number info?
No, we cannot set the lnLow field to -1, since the tools which use
the old line number table information depend on the low line number
value being accurate. The high line number will be -1, per usual.
If we decide to remove the old line number table information at
some point in the future, then we could set the lnLow field to -1.
Bob
|
68.14 | No iopts for alternate entries are foreseen | FLYBA::BRENDER | Ron Brender | Thu Jun 05 1997 17:14 | 20 |
| Re .12:
>(Ron, will this work for split-lifetimes etc.? Or can we expect at
>some point that alternate entries will have some of their own opt
>symbols, but they always have to look to the main entry for extended
>source line information?)
Good question! In fact, the split lifetime information by its nature applies
to a complete routine. Where the entry points are and how many is not relevant.
Trying to subdivide the information in some way related to entries would only
complicate matters.
Similarly for the semantic stepping information; while that probably *could*
be subdivided at alternate entry point boundaries, there is no reason to
want to do so.
So, at present, I don't see any reason for an alternate entry point to
want/need its own iopt information. There is certainly a possibility of doing
that in the future, I 'spose, but it doesn't appear on the horizon yet as
far as I can see...
|