| Date: Mon, 17 Feb 97 18:52:59 EST
From: William Beck <[email protected]>
To: [email protected], [email protected]
Cc: [email protected]
Apparently-To: [email protected]
Subject: Re: Problem executing with shared library
Hi Marvin,
The error message you report ("Failed to enable speculative environment")
doesn't sound good. I just missed the person who may be able to sort this
out for you, but I suspect that they will want the following:
1) A test case. It's nice if it is small, but size is no
problem. The test case does not need to do anything
useful beyond reproducing the error message.
Please don't forget to include any environment setup
info, how to run the test case and perhaps even the
expected results (say on the V3.2D-? and V4.0A systems).
2) The *exact* link command(s) used to build the
components of the test case. A build log is great.
It is probable that only the link command for
the ISV's 700Kb shared library is needed, but if
additional info is available, please send it along.
3) The "uname -a" output from the system where the test
case was built, where it runs and where it fails. For
example, was the system where it works a V3.2D-1 or
a V3.2D-2 system... they are different :-(
4) The "business info" for this problem... Is this a fire?
Is the ISV "down"? Is there a CLD or a QAR? Should we
get one opened? Are there any big deadlines coming up
for the ISV and/or DEC w.r.t. this problem? We need
this info so you can help us set our priorities correctly.
Feel free to give me a call at DTN 381-1186 until I find a good
home for this problem (hopefully tomorrow morning :-) I fear that
this will be a "nasty one". Let's get moving towards resolution ASAP.
I look forward to hearing more. Thank you for reporting this
problem!
Will
PS. If any (or all) of the four items requested above are expensive
to collect, just let us know and move on to the next. We
need an "expert's opinion" before you go to a lot of bother
to comply with these requests.
|
| To:[email protected]
cc:
Subject:ASAP Problem #3195
--------
Murali-
We have made contact with UNIX engineering and they are quite concerned
about your problem.
They have asked for some additional information.
1. A test case or "reproducer" if it is possible for you to send us one.
Please include any environment setup info, how to run the test case and
perhaps even the expected results (say on the V3.2D-? and V4.0A systems).
2. The *exact* link command(s) used to build the components of the test
case or your application. A build log is great. It is probable that only
the link command for the 700Kb shared library is needed, but if
additional info is available, please send it along.
3. The "uname -a" output from the system where the test case or
application was built, where it runs and where it fails. For example, was
the system where it works a V3.2D-1 or a V3.2D-2 system... they are
different.
4. What is the impact of this problem on your business plans, release
schedule, etc.?
Also, we are still interested in the results of the test of building on
4.0a.
Marvin Davis
[email protected]
|
| Date: Tue, 18 Feb 1997 16:50:24 -0600
Message-Id: <[email protected]>
From: [email protected] (Murali Somarouthu)
Subject: Re: ASAP Problem #3195
To: [email protected]
I have not been able to create a test case to reproduce it, but guess
what, I built it on 4.0 and it seems to run great. I have not used
any different options or any different compiler settings.
I am going to do a little more research and will let you know.
Thanks
-Murali
Date: Tue, 18 Feb 1997 20:47:41 -0600
Message-Id: <[email protected]>
From: [email protected] (Murali Somarouthu)
Subject: Re: ASAP Problem #3195
To: [email protected]
This is some additional info to my earlier e-mail. The shared
library which was built on 3.2 is giving the problem on 4.0.
Interestingly, I got all the object files for this shared library
built on 3.2, but created the shared library on 4.0, and it seems to
work.
It looks like some additional features in the 'ld' command on '4.0' are
making the difference. But I have not used any different options;
in fact, I used the same makefile with the following 'ld' options:
ld -o <problem shared library> -taso -shared a.o b.o c.o -ldepend1 -ldepend2
|
| To:[email protected]
cc:kenyon@hydra
Subject:Problem executing with shared library
--------
Will-
I forwarded your requests for information to our Software Partner who is
having the problem running the shared library on 4.0 that he built
under 3.2.
This yields the error message "Failed to enable speculative environment".
Here are some responses-
"I have not been able to create a test case to reproduce it, but guess
what, I built it on 4.0 and it seems to run great. I have not used
any different options or any different compiler settings.
I am going to do a little more research and will let you know."
"This is some additional info to my earlier e-mail. The shared
library which was built on 3.2 is giving the problem on 4.0.
Interestingly, I got all the object files for this shared library
built on 3.2, but created the shared library on 4.0, and it seems to
work.
It looks like some additional features in the 'ld' command on '4.0' are
making the difference. But I have not used any different options;
in fact, I used the same makefile with the following 'ld' options:
ld -o <problem shared library> -taso -shared a.o b.o c.o -ldepend1 -ldepend2"
Regards
Marvin Davis
Software Partner Engineering
[email protected]
dtn 297 6853
|
| To: [email protected]
Subject: Re: Speculative Env. Problem
In-Reply-To: Your message of "Wed, 19 Feb 97 17:16:31 EST."
<[email protected]>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 20 Feb 97 10:34:11 -0500
From: "Craig Neth DTN 381-2174" <[email protected]>
X-Mts: smtp
>Apparently ots_speculate.o is included in /usr/lib/cmplrs/cxx/libexc.a in
>Digital UNIX 3.2. It is however missing from libexc.a in UNIX 4.0a.
>
>Murali does not specifically call ots_speculate.o in his application.
Right. The routines in that file are provided to support a compiler
optimization that is switch controlled - the optimization is called
'speculative execution' and it is controlled by the -speculate switch.
Those routines are not meant to be called by an application programmer.
The compiler will insert references to them if that optimization
technique is asked for.
And yes, there were some packaging changes for this support between V3.2x
and V4.0. The C++ compiler was shipping that support as part of their
kit for a while until it could be integrated into the base system.
I guess one thing you could have them check quickly is whether or not
they are using the -speculate <option> switch on the compilation.
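A quick way to run that check is to search the makefiles or a build log
for the switch. This is only a minimal sketch, assuming a POSIX shell and
grep; the helper name uses_speculate and the file names in the usage line
are hypothetical:

```sh
# Hypothetical helper: list every line in the given files that mentions
# the -speculate switch. Exits 0 if at least one match is found.
uses_speculate() {
    # -e stops grep from reading the pattern as an option;
    # /dev/null forces file names into the output even for a single file.
    grep -n -e '-speculate' -- "$@" /dev/null
}
```

For example, `uses_speculate Makefile build.log` prints any matching lines
with file name and line number; no output means the switch was not found
in those files.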
>he is willing to send his binary objects if necessary.
I think this would be the fastest way for me to help resolve this. Can
you arrange this? Ideally, if you could get the .o files, libraries and
executables (as well as the makefiles) that should be all I would need. I
shouldn't need to see any sources (of course, if they want to send those
too it's ok, but I know some partners don't like to do that).
Thanks,
Craig
|
| Multiple correspondence:
To: [email protected]
Cc: [email protected], [email protected],
[email protected],
[email protected]
Subject: Re: Speculative Env. Problem Reproducer
In-Reply-To: Your message of "Mon, 24 Feb 97 13:50:51 EST."
<[email protected]>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Mon, 24 Feb 97 16:01:26 -0500
From: "Craig Neth DTN 381-2174" <[email protected]>
X-Mts: smtp
Davis,
Thanks for the reproducer. I now understand what Murali is doing to
trigger this undesired behavior, but I don't think it should be
happening.
I am also having troubles reproducing the problem - I get linker errors
when I build the reproducer (probably because I don't have the same
versions of everything that Murali does - can you tell me what version of
C++ they are using?)
The problem is occurring because of the way Murali is linking with libexc.
From his script:
ld -o libz.so -taso -shared z.o -all -L/usr/lib/cxx/cmplrs \
/usr/lib/cmplrs/cxx/libcxx.a -lexc -none -lc -lots
-all tells the linker to bring in every symbol in an archive library (.a)
file, regardless of whether or not that symbol is referenced by the
objects that precede it. It is in effect on the command line until the
-none option is seen. So, in the command line above, _all_ of the objects
in libcxx.a and libexc.a will be included in libz.so.
This is what is bringing in the 'speculative execution' environment -
parts of the speculative execution environment live in libexc.
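To see which objects an archive such as libexc.a actually holds (for
instance, whether ots_speculate.o is among them), the archive's table of
contents can be listed with ar. This is a minimal sketch, assuming a POSIX
shell with ar and grep; the helper name archive_has_member is hypothetical:

```sh
# Hypothetical helper: exit 0 if archive $1 contains a member named $2.
archive_has_member() {
    # "ar t" prints one member name per line;
    # grep -x -F matches the whole line as a fixed string.
    ar t "$1" 2>/dev/null | grep -q -x -F -e "$2"
}
```

For example, `archive_has_member /usr/lib/cmplrs/cxx/libexc.a ots_speculate.o`
succeeds only if that object is packaged in the archive on the system
where it is run.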
I don't know why Murali is doing this but I think it is a mistake to do so.
The speculative execution pieces of libexc should ONLY be brought in if
the application uses speculative execution (and the compiler/linker
will conspire to do this for you - no user action required). You can get
all sorts of misbehaviors by linking it in explicitly. [see below for the
details].
As an aside, it probably isn't a good idea to bind all of libcxx into his
.so either - it will make his library much bigger.
What I haven't yet figured out is why this is failing on V4.0b - if it
'worked' on V3.2x it should 'work' the same way on V4.0b, at least
according to what I know at the moment. I have to get it reproduced
before I'll be able to solve that. I will keep fiddling on my end, but if
you can get me more details on their exact config that would be helpful.
On a bookkeeping note, do you know if this problem is also being worked
through the IPMT system? I got a call today from the kernel developer who
works on this stuff and they have an IPMT case complaining about the same
symptoms - from a customer named 'ROLFE & NOLAN COMPUTER SERVICES'. Do
you know if this is the same set of folks?
Craig
[ Details on 'speculative execution'
Speculative execution is a scheduling optimization technique that is used
on pipelined processors. The basic idea is that the compiler writes code
that tries to do certain slow operations (like memory loads, and floating
point ops) before it knows for certain whether or not everything is ready
for those operations to be attempted.
As an example, consider a loop that accesses an array in sequential order,
with a precondition on the loop that the pointer to the array is not null.
If speculative execution is turned on, the compiler will schedule a load
to the first array location before the test of the loop. If the pointer
is non-null, this will mean that once the loop is entered the data for the
first array access is already nearby in the cache and so will be loaded
quickly, speeding the execution of the loop.
Of course, there will be times when the pointer _is_ null, and that will
cause undesired behaviors; behaviors that must be dealt with to ensure
proper program behavior. In particular, you would get a SEGV or SIGBUS
exception - something the program would not have gotten if this
optimization were not enabled.
The speculative execution environment installs handlers that handle such
undesired exceptions and dismiss them silently, so that the application
will continue to function. (In the case of our example, the SEGV will
be dismissed, the loop will not be entered, and everything will continue
just as if there was no speculative execution)
And so this is why linking in that stuff all the time is bad: Most
applications do not want SEGV, FPE, etc. errors to be dismissed
silently - they usually represent programming errors that should halt
execution of the application. By linking in this support, they may be
inadvertently allowing their application to continue after a serious
error.
]
To: [email protected]
Cc: [email protected], [email protected],
[email protected],
[email protected]
Subject: Re: Speculative Env. Problem Reproducer
In-Reply-To: Your message of "Mon, 24 Feb 97 13:50:51 EST."
<[email protected]>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Mon, 24 Feb 97 16:23:41 -0500
From: "Craig Neth DTN 381-2174" <[email protected]>
X-Mts: smtp
Marvin,
Whups, I made a small error in my last reply.
The linking problem is being caused when the _executable_ is linked, not
when the shared object is linked. The libz.so shared object doesn't have
any of the speculative execution stuff in it, but the main executable
certainly does, and that is what is causing the problem.
The -all -none stuff does NOT cause all of those libraries to be linked
into the .so file. Fifty lashes for me for not making sure I had
everything right before sending things off...
I've run some experiments, and I still can't reproduce this problem,
although I am positive now that it is because I do not have the right
version of C++.
Craig
|