T.R | Title | User | Personal Name | Date | Lines |
---|
3323.1 | requesting more info | HYDRA::EGGIMANN | | Fri Mar 14 1997 15:42 | 24 |
| From: HYDRA::EGGIMANN "14-Mar-1997 1508 -0500" 14-MAR-1997 15:24:01.35
To: US6RMC::"[email protected]"
CC: [email protected],PETE
Subj:
re: openVMS question
Hi Greg,
We'll start with the basics first. Can you tell me what version of the decc
compiler and rtl you're using? (I'm assuming you're using decc.) Run an
"Analyze/image decc$shr" (or vaxc$shr) and let me know the image file id and
date of link. There are -lots- of patches to the decc$shr.exe and vaxc$shr.exe
libraries which might be contributing and I want to rule those out first.
Also, you indicate (or imply) the problem has been isolated to smp environments.
Does this mean the problem does not reproduce when you turn off the other
processors, or when the non-smp version of vms is loaded? Does the problem
reproduce on versions more recent than v6.1 on alpha, and v5.5-2 on vax?
If you have a test case which you can give us, I can try some things on this
end in parallel.
Pete
|
3323.2 | looks like some old rtls | HYDRA::EGGIMANN | | Fri Mar 14 1997 16:49 | 40 |
| From: US6RMC::"[email protected]" "Greg Gudenburr" 14-MAR-1997
16:36:49.63
To: hydra::eggimann
CC:
Subj: Re[2]: openVMS question
Hi, Pete,
Thanks for the quick reply.
decc vax: image file identification: "V06.0-60"
link date/time 19-apr-1993 00:36:14:14
linker identification: "05-13"
compiler: 5.0-003
decc alpha: image file identification: t02.0-018
image file build identification: x5sc-ssb-0000
link date/time 13-apr-1994 16:08:48.76
link identification: t11-11
compiler: v4.0-000
The problem has only been seen on machines that are running smp
environments. Again, the problem has only been seen at customer sites
in production mode, so disabling one processor is not possible. I am
waiting to test right now with VMS with smp enabled but one processor.
I have been unable to reproduce the problem on uniprocessor machines.
At present I can not answer: Does the problem reproduce on versions
more recent than v6.1 on alpha, and v5.5-2 on vax?
I will check into a test case.
Greg
|
3323.3 | we'll need another day to get the kits | HYDRA::EGGIMANN | | Fri Mar 14 1997 17:15 | 16 |
| From: HYDRA::EGGIMANN "14-Mar-1997 1710 -0500" 14-MAR-1997 17:12:53.66
To: US6RMC::"[email protected]"
CC: PETE
Subj: RE: Re[2]: openVMS question
Greg,
I'm scrounging around for the latest patch kit for decc and vaxc. It looks
like you do not have any of the eco patch kits, but I'll verify once I find
the kit and look at the image ids.
It may be Monday before I can get my hands on the latest eco kits for both
these rtls. I can't say for sure if this will fix your problem, but it's
definitely the first and easiest thing to try.
Pete
|
3323.4 | Here's the latest applicable DEC C patch kits for these platforms | HYDRA::NEWMAN | Chuck Newman, 508/467-5499 (DTN 297), MRO1-3/F26 | Fri Mar 14 1997 18:12 | 16 |
| 04-FEB-1997 VAXACRT09_061
Description: Multiple Corrections For DEC C RTL;
OpenVMS VAX V5.5-2 through V6.1 VAXACRT09_061
27-JAN-1997 ALPACRT09_061
Description: Corrections For DEC C Run-Time Library (RTL);
OpenVMS Alpha V6.1 - V6.1-1H2 ALPACRT09_061
However, I would guess that it would not be a C RTL issue -- more likely
an OpenVMS issue.
-- Chuck Newman
|
3323.5 | recommendations from chuck | HYDRA::EGGIMANN | | Wed Apr 23 1997 16:26 | 30 |
| From: HYDRA::NEWMAN "Chuck Newman, 508/467-5499 (DTN 297), MRO1-3/F26"
2-APR-1997 16:12:34.57
To: eggimann
CC: and_me
Subj: More information for BMC.
Pete --
Here's some more information you can send to BMC. I remember a similar
problem in DECwindows and posted a note in VMSnotes # 412:
BMC's request:
On multiprocessor openVMS (vax 5.5.2, and alpha 6.1) systems it
appears that a virtual terminal may accept the command string that we
are sending with the ptd$write, but will never execute the command
string sent. This situation seems to happen most frequently when the
terminal has just been started (sys$cre). All return codes that we see
indicate a normal completion (ptd$write and buffer status). Is there
anything different about the way that virtual terminals are handled
in multiprocessor boxes? Or is there a special way that we should
check to see that the terminal is ready to accept commands?
Response:
I suspect that they send data to the terminal but there is no type-ahead
buffer yet.
The arrival of this data will start the processing to create the
type-ahead buffer. But any data sent in that burst will be lost. They
need to make sure that the application using the pseudo terminal has
queued a read to it before sending any data. There is a notification
AST that can be enabled to know when a read is ready.
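
A minimal sketch of the "queue a read before sending" advice, in portable C so
it can be tried off VMS. It assumes that only data sent before the first read
is posted is at risk, as the response suggests. All names here (pty_gate,
gate_send, gate_on_read_posted) are hypothetical, and gate_do_write only
stands in for the real PTD$WRITE call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PENDING 8    /* illustrative limits, not taken from the thread */
#define MAX_CMD     64

/* Hypothetical per-terminal state; none of these names are part of the PTD API. */
typedef struct {
    int    read_posted;                    /* nonzero once a read is known to be queued */
    size_t pending_count;                  /* commands held back so far */
    char   pending[MAX_PENDING][MAX_CMD];  /* held commands, oldest first */
    size_t sent_count;                     /* commands actually written */
    char   last_sent[MAX_CMD];             /* most recent command written */
} pty_gate;

/* Stand-in for the real write (PTD$WRITE in the thread); it only records the call. */
static void gate_do_write(pty_gate *g, const char *cmd)
{
    strcpy(g->last_sent, cmd);   /* safe: callers check the length first */
    g->sent_count++;
}

/* Returns 1 if written at once, 0 if held until a read is posted, -1 on error. */
int gate_send(pty_gate *g, const char *cmd)
{
    assert(g != NULL && cmd != NULL);
    if (strlen(cmd) >= MAX_CMD)
        return -1;
    if (g->read_posted) {
        gate_do_write(g, cmd);
        return 1;
    }
    if (g->pending_count == MAX_PENDING)
        return -1;
    strcpy(g->pending[g->pending_count++], cmd);
    return 0;
}

/* Call this from the read-notification routine: flush everything held back. */
void gate_on_read_posted(pty_gate *g)
{
    size_t i;

    assert(g != NULL);
    g->read_posted = 1;
    for (i = 0; i < g->pending_count; i++)
        gate_do_write(g, g->pending[i]);
    g->pending_count = 0;
}
```

In a real application gate_send would run in mainline code and
gate_on_read_posted at AST level, so both would need AST delivery blocked
around them (for example with sys$setast, as BMC already does elsewhere).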
|
3323.6 | more on the problem ... | HYDRA::EGGIMANN | | Wed Apr 23 1997 16:29 | 161 |
| From: US6RMC::"[email protected]" "Greg Gudenburr" 2-APR-1997
11:49:03.59
To: hydra::eggimann, [email protected]
CC:
Subj: Re[2]: openVMS question
Pete, we are having two basic problems with the code at present.
1 - An access violation occurs in a system service (intermittently). The call
stack is completely in our code and seems to indicate a corruption in the memory
of the system service routine.
2 - SMP problems:
We send data to the terminal with PTD$WRITE. Intermittently the complete
buffer is accepted, but no action is taken on the command sent (all return
codes are good).
Code fragments follow:
mainsetup code:
status = PTD$SET_EVENT_NOTIFICATION (ptd->channel, _BMCVmsPtdReadEvent,
(BMCuint32)ptd->ptdidx, 0, PTD$C_START_READ);
status = PTD$READ(ptdReadEF, ptd->channel, _BMCVmsPtdReadAst, ptd->ptdidx,
ptd->readBuf, ptd->bufSize+4);
ast routing routine:
static void
_BMCVmsPtdReadEvent(int ptdidx)
{
BMCVmsPtd *ptd;
...turn off ast's and save current state
ptd = _BMCVmsPtdFindIdx(ptdidx);
DbgWrtPtd("_BMCVmsPtdReadEvent",ptd,1); ...debug output
if (ptd) {
ptd->readReq = 1;
_BMCVmsPtdDoWrite(ptd); /* write to the ptd */
}
...return to previous state
}
Write routine called:
_BMCVmsPtdDoWrite(BMCVmsPtd *ptd)
{
Does the following:
status = PTD$WRITE(ptd->channel, 0, 0, ptd->writeBuf, writeLen, 0, 0);
}
Read AST routines:
static void
_BMCVmsPtdReadAst(int ptdidx)
{
status = PTD$READ(ptdReadEF, ptd->channel, _BMCVmsPtdReadAst, ptd->ptdidx,
ptd->readBuf, ptd->bufSize+4);
}
All routines turn off AST's when entered and restore the previous state when
exiting.
Greg
______________________________ Reply Separator _________________________________
Pete here is my previous message:
Hi, I am Greg Gudenburr, working for BMC software in Houston and am
having a problem with virtual terminals with openVMS. I was wondering
if you could answer or locate someone inside digital to answer a
question.
The product that I am supporting creates up to 32 FTA processes and
uses the non-blocking ptd$write and ptd$read to send and receive
command/text from the virtual terminal. AST's are used for
notification when a read has completed. Additionally, a mailbox is
set up to get notification when a process dies.
On multiprocessor openVMS (vax 5.5.2, and alpha 6.1) systems it
appears that a virtual terminal may accept the command string that we
are sending with the ptd$write, but will never execute the command
string sent. This situation seems to happen most frequently when the
terminal has just been started (sys$cre). All return codes that we see
indicate a normal completion (ptd$write and buffer status). Is there
anything different about the way that virtual terminals are handled
in multiprocessor boxes? Or is there a special way that we should
check to see that the terminal is ready to accept commands?
Where necessary we turn off AST delivery by using the sys$setast(0)
function call.
We have just discovered that the decc$set_reentrancy function could
be contributing to our problem. (This is not being used.)
It would be helpful if we had a contact with whom we could discuss this
issue. (713) 918 2211.
Thanks in advance
Greg Gudenburr
BMC software inc.
% ====== Internet headers and postmarks (see DECWRL::GATEWAY.DOC) ======
% Received: from mail13.digital.com by us6rmc.mro.dec.com (5.65/rmc-17Jan97) id
AA10471; Fri, 14 Mar 97 14:50:14 -0500
% Received: from dresden.bmc.com by mail13.digital.com (8.7.5/UNX 1.5/1.0/WV) id
OAA14842; Fri, 14 Mar 1997 14:35:54 -0500 (EST)
% Received: by dresden.bmc.com (1.40.112.8/16.2) id AA121548687; Fri, 14 Mar
1997 13:44:47 -0600
% Received: from cherry.bmc.com(172.17.1.25) by dresden.bmc.com via smap (3.2)
id xma012034; Fri, 14 Mar 97 13:44:28 -0600
% Received: from ccmail.bmc.com (banana.bmc.com [172.17.1.201]) by
cherry.bmc.com with SMTP (8.7.5/8.7.3) id NAA20100; Fri, 14 Mar 1997 13:35:30
-0600 (CST)
% Received: from ccMail by ccmail.bmc.com (IMA Internet Exchange 2.1 Enterprise)
id 00007689; Fri, 14 Mar 97 13:33:22 -0600
% Mime-Version: 1.0
% Date: Fri, 14 Mar 1997 13:36:39 -0600
% Message-Id: <[email protected]>
% From: [email protected] (Greg Gudenburr)
% Subject: openVMS question
% To: [email protected], hdlite::eggimann
% Content-Type: text/plain; charset=US-ASCII
% Content-Transfer-Encoding: 7bit
% Content-Description: cc:Mail note part
% ====== Internet headers and postmarks (see DECWRL::GATEWAY.DOC) ======
% Received: from mail13.digital.com by us6rmc.mro.dec.com (5.65/rmc-17Jan97) id
AA04420; Wed, 2 Apr 97 11:31:12 -0500
% Received: from dresden.bmc.com by mail13.digital.com (8.7.5/UNX 1.5/1.0/WV) id
LAA14139; Wed, 2 Apr 1997 11:10:14 -0500 (EST)
% Received: by dresden.bmc.com (1.40.112.8/16.2) id AA106737305; Wed, 2 Apr 1997
10:08:25 -0600
% Received: from cherry.bmc.com(172.17.1.25) by dresden.bmc.com via smap (3.2)
id xma010535; Wed, 2 Apr 97 10:08:13 -0600
% Received: from ccmail.bmc.com (banana.bmc.com [172.17.1.201]) by
cherry.bmc.com with SMTP (8.7.5/8.7.3) id KAA15994; Wed, 2 Apr 1997 10:09:53
-0600 (CST)
% Received: from ccMail by ccmail.bmc.com (IMA Internet Exchange 2.1 Enterprise)
id 00031E7D; Wed, 2 Apr 97 10:09:09 -0600
% Mime-Version: 1.0
% Date: Wed, 2 Apr 1997 10:12:08 -0600
% Message-Id: <[email protected]>
% From: [email protected] (Greg Gudenburr)
% Subject: Re[2]: openVMS question
% To: hydra::eggimann, [email protected]
% Content-Type: text/plain; charset=US-ASCII
% Content-Transfer-Encoding: 7bit
% Content-Description: cc:Mail note part
|
3323.7 | more recommendations from chuck | HYDRA::EGGIMANN | | Wed Apr 23 1997 16:31 | 107 |
| From: HYDRA::NEWMAN "Chuck Newman, 508/467-5499 (DTN 297), MRO1-3/F26"
7-APR-1997 10:49:00.26
To: [email protected], [email protected]
CC: EGGIMANN, and_me
Subj: Problems in SMP environment at BMC software inc.
Greg --
1) Accvio in mainline code.
My wording was poor -- I'm theorizing that a system service has been
invoked which (directly or indirectly) has an associated AST. Later, when
doing the sprintf, the AST executes and corrupts the stack (i.e., routine
ABC's stack-datum was passed to the AST, and the AST executes after
routine ABC has returned).
One way to check this would be to disable kernel mode AST's before the
sprintf, then enable them after. Also check the location of the address
that gets corrupted (if you can find it):
call disable_kernel_asts
call sprintf
check_location_of_corrupted_address(see_if_it_is_valid)
call enable_kernel_asts
check_location_of_corrupted_address(see_if_it_is_valid)
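
To illustrate the theory above, here is a small portable simulation. It is only
a sketch: pending_ast, start_async and deliver_ast are hypothetical stand-ins
for a system service that completes later at AST level, not real OpenVMS
calls. It shows the safe pattern, in which the datum handed to the AST
outlives the routine that started the request.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a completion AST queued by a system service. */
typedef struct {
    unsigned short *status_out;   /* where the "AST" will write on completion */
    int             armed;        /* nonzero while a completion is pending */
} pending_ast;

static pending_ast g_pending;

/* Pretend to start an asynchronous request that reports into *status_out later. */
void start_async(unsigned short *status_out)
{
    assert(status_out != NULL);
    g_pending.status_out = status_out;
    g_pending.armed = 1;
}

/* Pretend the completion AST fires now. */
void deliver_ast(void)
{
    if (g_pending.armed) {
        *g_pending.status_out = 1;   /* writes wherever the caller pointed */
        g_pending.armed = 0;
    }
}

/* Storage with static duration: still valid after the requesting routine returns. */
static unsigned short g_status;

/*
 * Safe pattern: pass static (or heap) storage to the asynchronous request.
 * The unsafe variant would declare "unsigned short status;" as a local here
 * and pass &status; deliver_ast() would then write into a dead stack frame.
 */
unsigned short *request_with_static_status(void)
{
    g_status = 0;
    start_async(&g_status);
    return &g_status;
}
```

If the accvio is caused this way, the corrupted address should fall inside a
stack frame that was live when the request was started, which is what the
check sequence above is meant to reveal.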
Also, it would be useful to see what the ptd structure looks like.
For example, if it looks like this:
short first;
short second;
Then you can have problems if the mainline code is working on "first"
while an AST is working on "second", since they are smaller than the
smallest datum that can be accessed by Alpha CPUs. If the structure
has packed alignment, the potential for the problem can extend to any
items, even 8-byte ones. Compiling with /REENT=AST may help with this
(does the same thing as DECC$SET_REENTRANCY). There may be pragmas
that would be relevant also.
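
A short sketch of the layout concern, assuming a 16-bit short and that the
smallest unit written independently is a 32-bit longword (the situation the
paragraph above describes for Alpha). The names shared_pair, split_pair and
same_longword are made up for the illustration; widening the fields is just
one possible workaround, separate from the /REENT=AST suggestion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout from the note: two 16-bit fields side by side. */
struct shared_pair {
    short first;    /* touched by mainline code */
    short second;   /* touched by an AST */
};

/* One possible workaround: give each field its own 32-bit slot. */
struct split_pair {
    int32_t first;
    int32_t second;
};

/*
 * Returns 1 if byte offsets a and b fall in the same aligned 4-byte unit,
 * assuming the structure itself starts on a 4-byte boundary.
 */
int same_longword(size_t a, size_t b)
{
    return (a / 4) == (b / 4);
}

/* Sanity checks on the two layouts (assumes sizeof(short) == 2). */
void check_layouts(void)
{
    /* Updating one short means rewriting the longword that holds both. */
    assert(same_longword(offsetof(struct shared_pair, first),
                         offsetof(struct shared_pair, second)));
    /* With 32-bit fields each update touches only its own longword. */
    assert(!same_longword(offsetof(struct split_pair, first),
                          offsetof(struct split_pair, second)));
}
```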
2) Re: buffer layout -- I don't have a hard copy of the manual, but my
figure 6-1 still shows the buffer as having the data start at an offset of 4
bytes into the buffer (so no change). Without seeing your ptd structure I can
only make guesses about fields, but my reading of what I see makes me
suspicious. Here's your code, followed by the pseudo-code from the PTD example.
Unfortunately, the pseudocode doesn't describe the buffer it's using.
status = PTD$READ(ptdReadEF, ptd->channel, _BMCVmsPtdReadAst, ptd->ptdidx,
ptd->readBuf, ptd->bufSize+4);
Call PTD$READ to start reading from the pseudoterminal
ASTADR = ft_read_ast
ASTPRM = buffer address
READBUF = I/O buffer + 8
READBUF_LEN = 500
Note that the documentation on the PTD$READ call indicates that you should pass
the address of the status field, *NOT* the address of the byte to receive the
first character. Also, it looks to me as though the buffer_length parameter
should be diminished by the sizes of the status and byte_count fields:
readbuf
OpenVMS usage: char_string
type: character coded text string
access: write only
mechanism: by reference
Address of the read I/O status longword. The first character
position in an I/O buffer to receive all output is this address
plus 4. The readbuf argument must be in the range specified
in the inadr argument of the PTD$CREATE routine,
otherwise an SS$_ACCVIO status is returned.
readbuf_len
OpenVMS usage: word_unsigned
type: word (unsigned)
access: read only
mechanism: by value
Number of characters that can be read from the pseudoterminal
and stored in the buffer specified by readbuf.
I'd code this as follows:
#define BUFFER_SIZE 508
struct
{
short status;
short byte_count;
char io_buffer[BUFFER_SIZE];
} input_buffer;
.
.
.
ptd$read(blah, blah, blah, blah, &input_buffer.status, BUFFER_SIZE);
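
The size arithmetic behind this can be checked with a small portable sketch.
The names ptd_read_buffer and required_buffer_bytes are hypothetical; the
layout simply follows the structure above (two 16-bit fields, then the data),
so the data starts 4 bytes in and the buffer must hold readbuf_len plus 4
bytes.

```c
#include <assert.h>
#include <stddef.h>

#define READ_DATA_SIZE 508   /* same value as BUFFER_SIZE above */

/* Same shape as the structure above: status, byte count, then the data. */
typedef struct {
    unsigned short status;
    unsigned short byte_count;
    char           data[READ_DATA_SIZE];
} ptd_read_buffer;

/*
 * Total bytes the buffer must provide for a given readbuf_len:
 * the 4 bytes of status and byte count, plus the data itself.
 */
size_t required_buffer_bytes(size_t readbuf_len)
{
    assert(readbuf_len <= READ_DATA_SIZE);
    return offsetof(ptd_read_buffer, data) + readbuf_len;
}
```

Read this way, passing ptd->bufSize+4 as the length only works if readBuf
provides at least ptd->bufSize+8 bytes.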
-- Chuck Newman
|
3323.8 | response from bmc | HYDRA::EGGIMANN | | Wed Apr 23 1997 16:33 | 219 |
| From: US6RMC::"[email protected]" "Greg Gudenburr" 4-APR-1997
15:10:45.04
To: hydra::eggimann, [email protected]
CC:
Subj: Re: greg - here's some comments
1 - An access violation occurs in a system service (intermittently). The call
stack is completely in our code and seems to indicate a corruption in the memory
of the system service routine.
| What system service is seeing the ACCVIO? Is it perhaps a completion AST
| writing to an area that was valid (on the stack) when the routine that set
| up the AST was called, but no longer valid when the AST was initiated?
|
| Also, does this application use multiple threads or multiple processes?
> No, the accvio is in the main line code that is not related to AST processing.
The system service call is sprintf. All parameters to the sprintf seem to be
valid. We have recently replaced the sprintf with the BSD safe sprintf routines
and the problem is now occurring in the memchr library routine within the BSD
sprintf.
>The application is single threaded.
2 - SMP problems:
We send data to the terminal with PTD$WRITE. Intermittently the complete
buffer is accepted, but no action is taken on the command sent (all return
codes are good).
| By "all return code are good" do you mean the return status only, or also
| the status in the buffer (first two bytes)?
> Yes we have checked both return codes.
Code fragments follow:
mainsetup code:
status = PTD$SET_EVENT_NOTIFICATION (ptd->channel, _BMCVmsPtdReadEvent,
(BMCuint32)ptd->ptdidx, 0, PTD$C_START_READ);
status = PTD$READ(ptdReadEF, ptd->channel, _BMCVmsPtdReadAst, ptd->ptdidx,
ptd->readBuf, ptd->bufSize+4);
| Do you *REALLY* mean ptd->bufSize+4? I hope the "+4" isn't supposed to
| account for the 4 bytes of status and byte count? In other words, for
| ptd->bufSize+4 to work (e.g., not overwrite someone else's buffer), the
| size of this buffer should be ptd->bufSize+8.
|
| Also, do you use "write-with-echo" in your application?
>In the I/O user's reference manual, page 6-4, the buffer is shown with the data
beginning at offset 4. Has this changed since the manual we have?
% ====== Internet headers and postmarks (see DECWRL::GATEWAY.DOC) ======
% Received: from mail13.digital.com by us6rmc.mro.dec.com (5.65/rmc-17Jan97) id
AA07189; Fri, 4 Apr 97 15:04:16 -0500
% Received: from dresden.bmc.com by mail13.digital.com (8.7.5/UNX 1.5/1.0/WV) id
OAA14983; Fri, 4 Apr 1997 14:50:23 -0500 (EST)
% Received: by dresden.bmc.com (1.40.112.8/16.2) id AA122443312; Fri, 4 Apr 1997
13:48:32 -0600
% Received: from cherry.bmc.com(172.17.1.25) by dresden.bmc.com via smap (3.2)
id xma012065; Fri, 4 Apr 97 13:48:05 -0600
% Received: from ccmail.bmc.com (banana.bmc.com [172.17.1.201]) by
cherry.bmc.com with SMTP (8.7.5/8.7.3) id NAA13992; Fri, 4 Apr 1997 13:49:49
-0600 (CST)
% Received: from ccMail by ccmail.bmc.com (IMA Internet Exchange 2.1 Enterprise)
id 00035559; Fri, 4 Apr 97 13:49:07 -0600
% Mime-Version: 1.0
% Date: Fri, 4 Apr 1997 13:52:35 -0600
% Message-Id: <[email protected]>
% From: [email protected] (Greg Gudenburr)
% Subject: Re: greg - here's some comments
% To: hydra::eggimann, [email protected]
% Content-Type: text/plain; charset=US-ASCII
% Content-Transfer-Encoding: 7bit
% Content-Description: cc:Mail note part
|
3323.9 | action plan to come into center | HYDRA::EGGIMANN | | Wed Apr 23 1997 16:35 | 9 |
| After my visit onsite with BMC, we agreed that the best thing to do
was to replicate the problem in the lab here in Marlboro. Greg is
working to create a reproducer and will schedule time in the porting
center when he is ready.
(It appears that in the process of doing this, the problem does not
reproduce on v7.1 ...)
Closing the call until they are ready for a visit.
|