T.R | Title | User | Personal Name | Date | Lines |
---|
1667.1 | PROBLEMS QUOTA EXCEEDED ???? | STKHLM::BERGGREN | Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287 | Wed Oct 23 1991 09:16 | 12 |
| Hi again,
Anybody seen this note? It's been nearly a week, and no answer...
Is there a 'PROBLEMS QUOTA EXCEEDED'-flag raised for me or what?
The problem is quite annoying since I can't notify the domain on all
events in one shot.
Is there anyone looking into this problem?
/Nils
|
1667.2 | Problem is indeed in MCC_EVENT_DUMP | TOOK::T_HUPPER | The rest, as they say, is history. | Wed Oct 23 1991 17:21 | 34 |
| RE .0:
Do you see this problem when the event trace is not on? The ACCVIO
shown in your log is in the trace code, not the mainline code of the
events manager. It also looks like this is V1.1 code. If the problem
is in the mainline event manager code, this is a serious problem. If
it is only in the trace code (MCC_EVENT_DUMP), there is less reason to
panic.
On digging through the MCC_EVENT_DUMP code, I see that there indeed is
a restriction on the size of dump lines - 1024 bytes. As the entire
event filter is being dumped as a single line (no end-of-line character
until the end of the filter list), and it takes 5 bytes in the printout
for each filter element, plus 10 bytes for the header, the expected
blowup is after 203 elements. Seems to fit the problem very closely.
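The arithmetic behind that estimate can be checked with a few lines of C. This is only a sketch: the constant and function names are made up for illustration, and the values (1024-byte line, 10-byte header, 5 bytes per element) are simply the ones quoted above, not read from the MCC sources.

```c
#include <stddef.h>

/* Hypothetical names; the values are the ones quoted in this note. */
enum {
    DUMP_LINE_MAX   = 1024, /* size limit of one dump line, in bytes  */
    DUMP_HEADER_LEN = 10,   /* bytes printed before the first element */
    DUMP_ELEM_LEN   = 5     /* bytes printed per filter element       */
};

/* Bytes needed to print a filter list of n elements on a single line. */
size_t dump_line_bytes(size_t n)
{
    return DUMP_HEADER_LEN + n * DUMP_ELEM_LEN;
}

/* Largest element count whose printout still fits in one dump line. */
size_t dump_max_elements(void)
{
    return (DUMP_LINE_MAX - DUMP_HEADER_LEN) / DUMP_ELEM_LEN;
}
```

With these numbers, 202 elements need 1020 bytes and the 203rd element pushes the line to 1025 bytes, one past the limit.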
Do you need to have each event code explicitly in the filter list, or
can you use a wildcarded event filter (event filter pointer is
MCC_K_NULL_PTR = all event codes)?
I suppose we could change the dump so that we did intervene a bit more
in the formatting, and force a printout (rather than continuing to
accumulate characters) after N characters. This is not a difficult
fix, and we will incorporate this, or something like it, in the V1.2
code. Thanks for finding this problem.
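A minimal sketch of what "force a printout after N characters" could look like is below. It is not the MCC_EVENT_DUMP code: the routine name, the buffer size constant, and the 4-hex-digit element format are all assumptions chosen to match the 5-bytes-per-element figure above.

```c
#include <stdio.h>
#include <string.h>

#define DUMP_LINE_BYTES 1024 /* assumed line buffer size, as quoted above */

/*
 * Print n event codes to 'out', starting a new line whenever the next
 * element would not fit in the line buffer. Returns the number of lines
 * written, or -1 on a formatting error. Illustrative only.
 */
int dump_codes_wrapped(FILE *out, const unsigned *codes, size_t n)
{
    char line[DUMP_LINE_BYTES];
    size_t used = 0;
    size_t i;
    int lines = 0;

    line[0] = '\0';
    for (i = 0; i < n; i++) {
        char elem[16];
        /* Each element prints as 4 hex digits plus a space: 5 bytes. */
        int len = snprintf(elem, sizeof elem, "%04x ", codes[i] & 0xffffu);
        if (len < 0 || (size_t)len >= sizeof elem)
            return -1;

        /* Flush first if this element (plus the NUL) would overflow. */
        if (used + (size_t)len + 1 > sizeof line) {
            fputs(line, out);
            fputc('\n', out);
            lines++;
            used = 0;
        }
        memcpy(line + used, elem, (size_t)len + 1);
        used += (size_t)len;
    }
    if (used > 0) { /* print whatever is left over */
        fputs(line, out);
        fputc('\n', out);
        lines++;
    }
    return lines;
}
```

With 5-byte elements this puts at most 204 codes on a line, so a list of 314 codes would come out as two lines instead of overrunning one.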
RE .1:
Sorry for the delay, but we are running flat out here on V1.2 code.
Some of us have been in the critical path over the last few weeks, and
the schedule is seriously tight. The light at the end of the tunnel is
getting brighter now, however.
Ted
|
1667.3 | Still ACC-VIO with no tracing on | STKHLM::BERGGREN | Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287 | Thu Oct 24 1991 08:23 | 151 |
|
Re .2
>>> Do you see this problem when the event trace is not on?
Yes. This could mean that garbage comes in from NOTIFICATION and
shows up in the trace code when tracing is enabled, but shows up
somewhere else when tracing is disabled.
Please note, that this is my interpretation....
>>> It also looks like this is V1.1 code
That's right, I'm running V1.1.
>>> If the problem is in the mainline event manager code, this
>>> is a serious problem.
Couldn't have said it better myself...
>>> Do you need to have each event code explicitly in the filter list, or
>>> can you use a wildcarded event filter (event filter pointer is
>>> MCC_K_NULL_PTR = all event codes)?
What can I do about it? As I understand it, it's the responsibility of
NOTIFICATION-FM to pass the event-list into my in_p-argument. If I do a
"NOTIFY DOMAIN AAA EVENTS=(ANY EVENTS), -
ENTITY LIST=(SEBAM .SEBAM.SEB075 APPLICATION *)"
then in_p contains all the events defined. Couldn't NOTIFICATION-FM pass
a MCC_K_NULL_PTR instead of all the events? However, that's just
part of the solution. If I really wanted to do a NOTIFY with an event-list
of more than 203 events, but not all the events defined, it would
blow up anyway...
I just tried to do a NOTIFY, as above, without any event-logging and
the DEBUG-bit set in my MCC_SEBAM_AM_LOG-logical and when doing a STEP
in the debugger at the mcc_event_get-call I get:
%SYSTEM-F-ACCVIO, access violation, reason mask=04,
virtual address=002033F0, PC=001185BA, PSL=03C00000
%DEBUG-E-LASTCHANCE, stack exception handlers lost, re-initializing stack
%DEBUG-I-EXITSTATUS, is '%NONAME-W-NORMAL, normal successful completion'
Because of the stack getting re-inited, I can't do a "SHOW CALLS"
to see where it crashes.
Doing the same thing, but ending the event-list at the 203rd event
still works OK.
I tried to run with MCC_NOTIFICATION_FM_LOG set to FFFFFFFF and
got the following output:
***>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<***
Notification FM; NOTIFY entry point.
Notification FM; dispatch validation complete, validation status:
%MCC-S-NORMAL, normal successful completion
***>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<***
* Notification FM NOTIFY call arguments *
***>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<***
entity [0] wild = NOT_WILD class = 8 id = 1 type = 5
instance = ....<...U..8......AAA..
%XAA0004003C04A0EE55F8BE389400070001034141410000
Notification FM; In_p :
[ 0 ] (
[ 3 ] 28 61 6e 79 20 65 76 65 6e 74 73 29 -- (any events)
[ 1 ] (
[ 1 ] (
[ 0 ] (
[ 1 ] 01
[ 2 ] 02 bc
[ 3 ] 01
[ 4 ] 05
[ 5 ] aa 00 04 00 3c 04 a0 ee 55 f8 be 38 94 00
11 00 01 05 53 45
42 41 4d 01 06 53 45 42 30 37 35 00 00
)
[ 1 ] (
[ 1 ] 03
[ 2 ] 1b 58
[ 3 ] 00
[ 4 ] 00
[ 5 ]
)
)
)
)Notification FM; In_q : NULL
Notification FM; Time Spec : 0 (NEXT)
Notification FM; State : MCC_K_HANDLE_FIRST
Notification FM; ** GETEVENT call arguments **
Notification FM; Verb code = 65
Notification FM; Partition = 15
Notification FM; In entity :
entity [0] wild = NOT_WILD class = 700 id = 1 type = 5
instance = ....<...U..8......SEBAM..SEB075..
%XAA0004003C04A0EE55F8BE38940011000105534542414D01065345423037350000
entity [1] wild = INSTANCE_FULL class = 7000
Notification FM; In_p :
[ 1 ] (
[ 1 ] (
[ 1 ] 1e 69
[ 1 ] 1e 6b
[ 1 ] 1e 6c
:
: ! all 314 event-codes (I did not count them...)
:
)
)Notification FM; In_q :
[ 0 ] (
[ 3 ] aa 00 04 00 3c 04 a0 ee 55 f8 be 38 94 00 07 00 01 03 41 41
41 00 00
)
Notification FM; Time Spec : 0 (NEXT)
Notification FM; State : MCC_K_HANDLE_FIRST
Notification FM; Time Spec after check for use of default (INFINITY)Notification
FM; Time Spec :
schedule
is NULL
scope
contains 1 elements
element 0
begin is ABSOLUTE: 24-OCT-1991 11:33:58.70
end is ABSOLUTE: 28-NOV-9723 23:59:59.99
periodicity is NULL
period_end is NULL
and then the ACC-VIO.
I don't know if this is of any help, but now you have it.
regards,
Nils
P.S. Ted, sorry if I push you, but "the schedule is seriously tight" for us
as well and "The light at the end of the tunnel is getting brighter
now, however." doesn't apply to us until all problems are tracked
down and fixed. However, I really appreciate that you ALL 'over there'
are so very helpful and doing a great job in answering all questions
raised in the conference (even if it sometimes takes a while).
|
1667.4 | Let's assume there are (at least) 2 problems here | TOOK::GUERTIN | Don't fight fire with flames | Fri Oct 25 1991 11:20 | 20 |
| I think Ted is right. Let's assume there are two problems here. First
of all the Event Trace facility cannot handle that many events, so you
cannot use the Event Trace to provide any additional information,
sorry. The second problem (the *REAL* problem) is that there is an
accvio when you do a getevent on hundreds of events. That accvio
appears to be happening in the threads code. Are you doing anything
with threads? Anything asynchronous (e.g., ASTs?). If you are going
thru the Notification FM, then you may have simply exceeded the number
of threads that can be handled within MCC. I can only guess. Could you
provide any additional information on your operating environment (you
are not using iconic map, correct?).
PS: I'm happy to see that you're less hostile about this. Monitoring
the notes file is NOT a required activity for developers. It should be
viewed as volunteer work, not as a duty. Hence, placing nasty-grams in
the notes file to get help is like having a flat tire and giving every
car that goes by the middle finger in hopes that someone will be upset
enough to stop and help you.
-Matt.
|
1667.5 | reply to .4 | STKHLM::BERGGREN | Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287 | Fri Oct 25 1991 12:14 | 34 |
| repl .4
>>> First of all the Event Trace facility cannot handle that many events,
>>> so you cannot use the Event Trace to provide any additional information,
>>> sorry.
I can live with that, but shouldn't it be fixed anyway?
>>> Are you doing anything with threads?
The only thing that's thread-specific is that I create a lock every
time my getevent-directive entry point gets called. The lock is
deleted in the end_directive-routine.
>>> Anything asynchronous (e.g., ASTs?)
No, all my communications are done with QIOW
>>> If you are going thru the Notification FM, then you may have
>>> simply exceeded the number of threads that can be handled within MCC.
I have the problem even if the NOTIFY-directive is the first and only
thing I do in an MCC session. (I guess that the number of threads that
can be handled within MCC is reset when invoking MCC...)
Is it correct that NOTIFICATION creates a thread for each entity
specified in the entity-list and each thread calls my GETEVENT-entry point?
In that case, 'thread-quota' shouldn't be the problem since I get the
ACC-VIO even if there's only one entity in the list. Am I out flying now
or...?
>>> Could you provide any additional information on your operating
>>> environment (you are not using iconic map, correct?).
Not using IMPM, right! MCC BMS v1.1.
What else do you need?
Thanks and regards,
Nils
|
1667.6 | Sounds stack related to me | TOOK::GUERTIN | Don't fight fire with flames | Mon Oct 28 1991 16:03 | 32 |
| RE:.-1
>> shouldn't it be fixed anyway?
Well actually the event trace logs were intended to help the MCC Kernel
developers debug the event manager. There has been talk about removing
it from the production code (I don't know about having an
mcc_kernel_shr.exe with event tracing in the toolkit).
Since it appears that the accvio is occurring in the threads code, it
implies several things:
1) Usually there is a stack problem.
2) Possibly a stack corruption. For example if you declare
int x[2];
then...
x[1] = 0;
x[2] = 0;
you've just clobbered the stack (since the C language indexes from 0).
Something like this commonly shows up in the MCC Kernel as an
ACCVIO or reserved operand fault. I have seen many many
string-copies copy off the end of the stack and corrupt memory.
3) Slight possibility that there is a stack overflow. Are you
allocating any large structures on the stack? Have any
extensively recursive routines?
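As a hedged illustration of the string-copy case in 2), here is one way to copy into a fixed-size stack buffer without running off its end. The helper name is invented for this example and is not an MCC routine; it simply truncates when the source is too long. (For the array example, the only valid indices of int x[2] are 0 and 1.)

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst (capacity dst_size bytes), always NUL-terminating
 * and never writing past the end of dst. Returns 1 if the whole string
 * fit, 0 if it was truncated or dst_size is 0. Illustrative helper only.
 */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst_size == 0)
        return 0;

    len = strlen(src);
    if (len >= dst_size) {
        /* Too long: keep what fits and terminate inside the buffer. */
        memcpy(dst, src, dst_size - 1);
        dst[dst_size - 1] = '\0';
        return 0;
    }
    memcpy(dst, src, len + 1); /* includes the terminating NUL */
    return 1;
}
```

Checking the return value tells the caller that data was cut off, which is much easier to debug than a return address that has been overwritten.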
I'm sorry but I'm starting to run out of suggestions. If you believe
this to be an MCC bug, please enter a QAR (or I can enter one for you).
Perhaps you can enter some relevant source code as well.
-Matt.
|
1667.7 | 512 event codes is OK for mcc_event_get | TOOK::T_HUPPER | The rest, as they say, is history. | Tue Oct 29 1991 11:30 | 14 |
| I have tried using 512 event codes going into a single mcc_event_get()
call, with no problem. There is nothing in the regular (non-trace)
event manager code that has a limited buffer for event codes. They are
encoded in list form, and the list can be as long as you wish.
Note that my test is using code that does not go through the mcc_call
interface, so the environment that the event manager is running in is
different (little in the way of stack-eating higher-level routines).
As Matt points out in .6, the problem is perhaps something that ends
up blowing up in the event manager because it is the lowest level of
code. When there is a stack corruption, the problem shows up where
routine returns are being made (return to outer space).
Ted
|
1667.8 | thread size from notify call to getevent | TOOK::CALLANDER | MCC = My Constant Companion | Tue Oct 29 1991 16:06 | 26 |
| Another problem, simply put, is that the notification FM isn't giving you
a big enough stack to do the job! We picked a number that seemed
big enough and that is what we use. You may well be exceeding
that amount of space. If you do a getevent from the FCL for
the same ANY EVENTS, does this problem go away? The main
difference between the two is that the getevent gets passed the FCL
primary thread for use in processing (this is the main stack thread,
with no real limit on its size), while the notify uses a thread that it
creates for itself.
I assume that you are doing your command on a single entity, because in
v1.1, if it is a wildcard we will do one getevent per entity requested
(I know you listed the command, but I didn't look closely enough).
As to the event id list being passed, well, that was done to help save
overhead. Since the event manager requires you to pass in the list of
events (or at least it did last time I checked) it was faster for the
underlying modules if the list was enumerated for the AMs before it
got passed down. This was especially helpful for modules like DNA5
where new events could be added to its dictionary at any time, and
they would have to read the information from the dictionary at run
time, while the FCL has the info at hand (in the parse tables) without
having to access the dictionary.
Sorry for the run on sentences.
|
1667.9 | I'll do some homework... | STKHLM::BERGGREN | Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287 | Wed Oct 30 1991 15:23 | 36 |
| RE .6
Matt,
I'll check the code for stack-problems. I don't think
that there is a stack overflow since I'm not having any
large structures or recursive routines.
RE .7
Ted,
It's nice to hear that it works for you. Too bad for me,
having to go over my code 'in depth' to find the problem.
RE .8
Jill,
>>>> If you do a getevent from the fcl for
>>>> the same ANY EVENTS, does this problem go away?
If I do a GETEVENT from the FCL, I get a MCC_K_NULL_PTR in
the IN_P-argument, so I don't have that problem with a
GETEVENT-directive. NOTIFY, on the other hand, changes
'any events' to a list of the events defined.
>>>> I assume that you are doing your command on a single entity,
Yes, I'm doing notify on a single entity.
Thanks all,
I'll do some homework now and look for coding-errors. I'll keep
you informed.
regards,
Nils
|
1667.10 | We need to do some research as well | TOOK::GUERTIN | Don't fight fire with flames | Wed Oct 30 1991 16:58 | 15 |
| Nils,
Although it is possible that you may have a coding bug, it is just as
likely (if not more) that it is our (MCC's) bug. We go through at
least 3 large management modules (FCL, Notification, Alarms, etc.), and any
one of these can be mismanaging stack memory for >203 events. I guess
I was really trying to say it *probably* isn't an MCC Event Manager
problem.
Jill, could we have someone test this case for notifying an entity with
several hundred events? I know we don't have any such entity around,
but maybe someone on the Notification Services team can think of
something creative.
-Matt.
|
1667.11 | Yes please, do some testing | STKHLM::BERGGREN | Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287 | Thu Oct 31 1991 06:00 | 22 |
| repl .10
I just did the same thing as Ted in .7, and it works with 310 events.
As he points out, the difference is that the MCC_CALL-mechanism is not
used. Using the MCC_CALL-mechanism could maybe twist something up, but
I'll go thru my code to see if I can find any problems.
However, I would very much appreciate it if you could do some testing
just to verify the functionality.
I tried to do it myself by creating (in DAP) 310 events for the NODE4
class. After a new PTB I went into MCC (forms mode) and did a GETEVENT
NODE4 SEB075 and pressed the <HELP>-key at the arguments-line. It
gave me all the events, including the 310 I just defined. However,
when looking at the event-list in the event-trace (MCC_EVENT_LOG=1 and
MCC_EVENT_TRACE=180) I just got the events originally defined, not the
new ones.... I don't understand that. Who is removing "my events"?
So, I just had to delete the 310 events from the NODE4 class and try to
think of something else....
Thanks for your help,
Nils
|
1667.12 | I will check for mem clobber in event id list build | TOOK::CALLANDER | MCC = My Constant Companion | Fri Nov 22 1991 22:40 | 6 |
| I will check the code that builds the event id list, to make sure that
we are not putting more in the list than we have allocated memory
for. That is the only thing that quickly comes to mind.
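The kind of check meant here could look roughly like the sketch below: every append compares the count against the allocated capacity before writing. The structure and routine names are hypothetical and are not the Notification FM's own data structures.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable list of event codes, for illustration only. */
struct event_list {
    unsigned short *codes; /* allocated storage, or NULL when empty */
    size_t count;          /* number of codes currently stored      */
    size_t capacity;       /* number of codes the storage can hold  */
};

/*
 * Append one code, growing the allocation first if it is full.
 * Returns 0 on success, -1 if memory could not be obtained (the list
 * is left unchanged in that case).
 */
int event_list_append(struct event_list *list, unsigned short code)
{
    if (list->count == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 64;
        unsigned short *p = realloc(list->codes, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        list->codes = p;
        list->capacity = new_cap;
    }
    list->codes[list->count++] = code; /* always within capacity here */
    return 0;
}
```

Starting from a zero-initialized struct, appending a few hundred codes never writes past the allocation, because the capacity is checked on every call rather than assumed from an earlier estimate.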
jill
|