T.R | Title | User | Personal Name | Date | Lines |
---|
3876.1 | The mice are running as fast as they can | TOOK::FONSECA | I heard it through the Grapevine... | Thu Oct 08 1992 12:23 | 31 |
| I suggest that you grab the new TSAM kit I've described in 3.last.
That may fix some of your hanging problems. But not all...
I'm sure I've said this before, but I'll say it again. TSAM is slow.
If *all* of the commands you are running alarms on won't
run in the time you have allotted for the repeat interval, you'll
just overload TSAM.
So in your example, you have got 16 SHOW commands being executed against
TSAM every 30 seconds. It just won't work. TSAM takes much longer
for each individual command to execute, and because of the way it was designed
all of those requests get queued up through one process.
Figure out how long the commands that you are alarming on take,
and set up your repeat time that way.
If the only alarms going through TSAM were these sixteen SHOWS,
and we had figured out that each command takes 5 seconds, then
the repeat interval for each alarm rule should at minimum be
5*16=80 seconds. I would make it two minutes, since you want your
alarms to be reliable, and you also want there to be breathing room
for you to execute other interactive TSAM commands on demand.
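The sizing arithmetic above can be sketched in a few lines of Python. This is only an illustration, not anything TSAM or MCC provides: the function name and the 1.5x headroom factor are assumptions chosen so that the 80 second minimum comes out at the two minutes suggested above. Measure your own per-command time and substitute it.

```python
import math


def min_repeat_interval(num_commands, seconds_per_command, headroom=1.5):
    """Estimate a repeat interval for alarm rules sharing one serialized queue.

    Hypothetical helper: 'headroom' is an assumed safety factor, not a
    TSAM parameter. Returns (bare_minimum, suggested) in seconds.
    """
    # All requests queue through one process, so the times simply add up.
    bare_minimum = num_commands * seconds_per_command
    # Leave breathing room for interactive commands issued on demand.
    suggested = math.ceil(bare_minimum * headroom)
    return bare_minimum, suggested


def interval_is_sufficient(interval_seconds, num_commands, seconds_per_command):
    """True if every queued command can finish before the rules fire again."""
    return interval_seconds >= num_commands * seconds_per_command


bare, suggested = min_repeat_interval(16, 5)
print(bare, suggested)
print(interval_is_sufficient(30, 16, 5))
```

With sixteen commands at 5 seconds each, a 30 second interval fails the check, which matches the overload described above.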
I'm completely aware of your frustration with slowness. I got here
long after the design for TSAM had this bottleneck (whose presence
was necessary but not insurmountable).
You need to let product management (Laurel BLUMON::Nelson) know that
speed in this product is important.
Good luck...
Dave
|
3876.2 | New kit and interval works | SNOC01::MISNETWORK | MCC=My Constant Confusion | Sun Oct 11 1992 23:05 | 23 |
| Thanks for the quick reply Dave.
I have installed the kit you mentioned, and increased the repeat
interval to 50 secs for now (10 secs below your recommendation), and all
seems to be working fine now. I will see how reliable they are, and if
need be extend the interval.
The need for the smallest possible interval is, like most alarms,
driven by the fact that we need to know about outages before the users
call us. The way I have set these alarms up for now is I check the link
status every 50 secs, another alarm is monitoring these alarms checking
for any exceptions, and if 3 exceptions occur in 3 minutes, then
another alarm is triggered to check the status of the server, and
depending on the result, a message is sent to us telling us either the
server's console is in use, or whatever else the error is. I will be
simplifying this soon by using the data collector in conjunction with
the link status rules.
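The "3 exceptions in 3 minutes" escalation described above amounts to a sliding-window counter. The Python below is a rough sketch of that logic only; it is not MCC alarm rule syntax, and the class name and timestamps are made up for illustration.

```python
from collections import deque


class ExceptionWindow:
    """Count exceptions inside a sliding time window (hypothetical helper)."""

    def __init__(self, threshold=3, window_seconds=180):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.events = deque()

    def record(self, timestamp):
        """Record one exception; return True when escalation should fire."""
        self.events.append(timestamp)
        # Drop exceptions that have aged out of the window.
        while self.events and timestamp - self.events[0] > self.window_seconds:
            self.events.popleft()
        return len(self.events) >= self.threshold


window = ExceptionWindow()
# Three failed link-status checks 50 seconds apart.
results = [window.record(t) for t in (0, 50, 100)]
print(results)
```

Only when the third exception lands inside the window would the follow-up check of the server status be triggered.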
Might need to let product management know that TSAM shouldn't mean Terribly
Slow Access Module =8')
Cheers,
Louis
|
3876.3 | | TOOK::FONSECA | I heard it through the Grapevine... | Mon Oct 12 1992 10:56 | 3 |
| Glad to see this helped. Terribly Slow AM... gotta remember that one! :-)
-Dave
|
3876.4 | TSAM+Data Collector works well | SNOC01::MISNETWORK | MCC=My Constant Confusion | Mon Oct 12 1992 22:02 | 11 |
|
All is well !!!!!!
Have set up the polling interval to be 1 minute, and am using the data
collector to tell me when the MUXserver links are down/up, and also to
notify us when the server is down (Cannot communicate with target). If
the console is in use, nothing happens (we do not get notified).
Cheers,
Louis
|