T.R | Title | User | Personal Name | Date | Lines |
---|
1781.1 | More requirements... | RIVAGE::SILVA | Carl Silva - Telecom Eng - DTN 828-5339 | Fri Nov 08 1991 06:03 | 66 |
| > As the trouble ticket note is generating so much heat :-) I decided to
> move my current pet topic out here.
Well, it's that the trouble ticket functionality seems to be a separate
domain from the alarm correlation/pre-processing. I know that a lot of RFPs
come in with the two areas lumped together; however, you could have a trouble
ticket management/tracking system without the alarm correlation (tickets
created manually).
> Having made a number of attempts to implement a large rule base for
> DECmcc I now feel that we must plan for some form of alarm
> pre-processing in the near future. Hearing that many of our customers
> are planning to have several hundred polls per hour means that
> customers will be screaming for this functionality in a few months. At
> the moment the only reason most of our customers are not polling every
> LAN bridge etc. on their network is that it is too much work to key in all
> these alarm rules and they haven't the time to write automatic load scripts.
Yes, it would be nice to have the alarm rules management functionality
expanded so that alarm templates could be defined and automatically
associated with managed object classes.
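A minimal Python sketch of the template idea follows. It is not DECmcc syntax: the `AlarmTemplate` and `AlarmRule` types, the `expand_templates` helper, and the rule expression text are all hypothetical, chosen only to show one template being instantiated for every entity of a matching class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmTemplate:
    """A rule pattern defined once per managed object class (hypothetical)."""
    object_class: str   # e.g. "BRIDGE"
    name: str           # short rule name
    expression: str     # illustrative condition text with an {entity} placeholder
    severity: str


@dataclass(frozen=True)
class AlarmRule:
    """One concrete rule for one entity, produced from a template."""
    entity: str
    name: str
    expression: str
    severity: str


def expand_templates(templates, entities):
    """Instantiate every template for every entity of the matching class.

    entities is an iterable of (object_class, entity_name) pairs.
    """
    by_class = {}
    for template in templates:
        by_class.setdefault(template.object_class, []).append(template)

    rules = []
    for object_class, entity in entities:
        for template in by_class.get(object_class, []):
            rules.append(AlarmRule(
                entity=entity,
                name=f"{template.name}_{entity}",
                expression=template.expression.format(entity=entity),
                severity=template.severity,
            ))
    return rules


# The expression below is made-up text, not a real DECmcc rule expression.
templates = [
    AlarmTemplate("BRIDGE", "port_down",
                  "{entity} port status <> forwarding", "warning"),
]
entities = [("BRIDGE", "br01"), ("BRIDGE", "br02"), ("STATION", "ws17")]
rules = expand_templates(templates, entities)
```

With this shape, adding a bridge to the network means adding one entry to `entities` rather than keying in another set of rules by hand.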
> It has been implied in another note that a customer who has hundreds of
> alarms triggering at the time has done something wrong.
I don't know if it was me but I didn't mean to imply that the customer
was doing something wrong. A fiber optic cable cut can generate many alarms.
All I was saying is that the type of alarms/type distribution versus time
should be looked at to detect patterns that may be able to be suppressed or
pre-processed.
> Given that a large number of alarms or exceptions will occur, we now need
> a way of pre-processing these alarms. I don't believe that much of
> this processing should take place within the Alarms FM. The Alarms
> FM should concentrate on providing the ability to collect large quantities
> of raw-alarm data in a very efficient manner and perform minimal data
> reduction. A separate FM should be able to take the raw data,
> perform some pre-processing and then feed a 'synthetic' alarm back into
> the system. This FM should be rule based and its rule base exposed to
> the customers. We already have some of the code to add some real
> analysis to this FM. The ELM developers have developed a piece of code which
> can automatically generate the LAN's spanning tree, and can add the
> functionality to determine which devices reside on each segment. If
> that map of the network is kept in memory within the Alarm
> Pre-Processing FM then problems due to bridge failures can be
> detected by noting which devices are causing exception messages.
> Exception messages which were caused as a result of the device that
> failed can be filtered out.
> By building similar internal maps for the routing tables of the various
> protocols, failure of routers and the associated 'exceptions' messages
> due to nodes becoming unreachable can be correlated to determine which
> component failed.
Sounds like we need an artificial intelligence kind of FM with a nice
user interface. Any volunteers? 8-)
> If this capability is not considered to be an important component of
> DECmcc then I would suggest that the DECmcc kit be shipped with a piece
> of code or an easy API which will allow customers to intercept Alarms
> on their way to the Notification FM, and allow customers to insert their
> own alarms into the system.
This can be done now. You can build an FM that receives the alarm
notifications from the alarms FM.
Carl
|
1781.2 | we need at least 2 answers | TOOK::MATTHEWS | | Fri Nov 08 1991 10:05 | 45 |
| I will start by agreeing that customers need much greater functionality
in the way of event/alarm correlation/filtering than we currently
are providing. There are many reasons we are not doing more, most
of them purely financial.
Yes, there is a need but it is not clear that there aren't many
different needs identified here and I doubt that a single "hack"
/"modification" will do any more than change the perception of
the need.
First, we need event correlation/filtering based upon topological
knowledge of the network. I.e., DECmcc needs to know that to get
to bridge X from the current instantiation of DECmcc requires
going through bridge Y, Z, and A. This way a DECmcc "function"
could make rational decisions about the effects of one of these
bridges going down and whether to correlate these "events".
You can do it with hardwired procedures as is suggested in the
note or you can base it upon topological knowledge
(which currently does not exist in DECmcc). Note that the
hardwired solution breaks down when customers provide dormant
paths in their network that are enabled by changing forwarding
database entries and creating alternate topologies. Yes, I know
that spanning tree doesn't allow cycles. But dormant links (i.e.
non-forwarding links which are potential cycles) are allowed and
can be used. Thus any hardwired procedure based on a static
topology will fail in this case. I suggest that there are
2 answers to this. First, there needs to be a way for customers
to write simple filtering scripts to reduce multiple events
occurring in some time scope to be correlated and generate a
"more meaningful" event to go into the alarms fm. This has
the advantage that it reduces the load on the alarms fm. It
has the disadvantage that it increases the delay for receiving
an event at the alarms fm interface. Second, we need to capture
topology data about networks including dynamic views and static
views so that an event filtering mechanism can provide event
filtering based upon topology.
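The first answer, a simple time-scope filter, might look like the Python sketch below. This is only an illustration under stated assumptions: the `correlate` function, the `(timestamp, source, kind)` event tuple, and the window and threshold values are hypothetical and are not part of any DECmcc interface.

```python
from collections import defaultdict


def correlate(events, window, threshold):
    """Collapse bursts of same-kind events into one summary event.

    events: list of (timestamp_seconds, source, kind), sorted by timestamp.
    A run of at least `threshold` events of one kind that all fall within
    `window` seconds of the first one becomes a single summary event.
    Smaller runs are passed through unchanged.
    """
    by_kind = defaultdict(list)
    for event in events:
        by_kind[event[2]].append(event)

    out = []
    for kind, evs in by_kind.items():
        i = 0
        while i < len(evs):
            start = evs[i][0]
            j = i
            # Collect everything of this kind inside the window.
            while j < len(evs) and evs[j][0] - start <= window:
                j += 1
            burst = evs[i:j]
            if len(burst) >= threshold:
                sources = sorted({source for _, source, _ in burst})
                # The summary is stamped with the end of the window, which
                # is the extra delay the filter adds before the alarms FM
                # sees anything.
                out.append((start + window, ",".join(sources),
                            f"{kind} (x{len(burst)})"))
            else:
                out.extend(burst)
            i = j
    return sorted(out)


events = [
    (0, "br1", "unreachable"),
    (2, "br2", "unreachable"),
    (5, "br3", "unreachable"),
    (40, "br9", "unreachable"),
    (41, "rt1", "high_load"),
]
correlated = correlate(events, window=10, threshold=3)
```

Here five raw events become three, which is the load reduction described above, and the summary event cannot be emitted before the window closes, which is the delay described above.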
Neither of these will be in V1.2. It is possible for the first
to be done by the next release after V1.2. The topology part
of the second is being planned but the actual filtering
algorithm/mechanism is not currently understood enough to
be planned.
wally
|
1781.3 | configuration management if necessary... | RIVAGE::SILVA | Carl Silva - Telecom Eng - DTN 828-5339 | Fri Nov 08 1991 10:19 | 16 |
| RE: .2,
> First, we need event correlation/filtering based upon topological
> knowledge of the network. I.e., DECmcc needs to know that to get
> to bridge X from the current instantiation of DECmcc requires
> going through bridge Y, Z, and A. This way a DECmcc "function"
> could make rational decisions about the effects of one of these
> bridges going down and whether to correlate these "events".
Yes, it is definitely clear that without configuration information in
the system, it is very difficult to do the alarm correlation. Without the
config info all you will have is the info contained in the alarm reports.
Are there plans for MCC to do configuration management?
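To show what the config info buys, here is a small Python sketch of path-based suppression. It assumes a `paths` map (hypothetical, not something DECmcc provides today) giving the ordered list of bridges between the management station and each entity. As .2 points out, a static map like this goes stale when dormant links are enabled, so a real version would have to refresh it.

```python
def split_root_causes(paths, unreachable):
    """Separate likely root causes from downstream symptoms.

    paths: entity -> ordered list of intermediate bridges between the
           management station and that entity (entity itself excluded).
    unreachable: set of entities currently reported unreachable.
    Returns (roots, suppressed), where suppressed maps each symptom to
    the first unreachable bridge on its path.
    """
    roots = set()
    suppressed = {}
    for entity in unreachable:
        blocker = next(
            (hop for hop in paths.get(entity, []) if hop in unreachable),
            None,
        )
        if blocker is None:
            # Nothing upstream is down, so this entity is a candidate cause.
            roots.add(entity)
        else:
            suppressed[entity] = blocker
    return roots, suppressed


# Bridge X is reached through Y, Z and A, as in the example from .2.
paths = {"Y": [], "Z": ["Y"], "A": ["Y", "Z"], "X": ["Y", "Z", "A"]}
unreachable = {"Z", "A", "X"}
roots, suppressed = split_root_causes(paths, unreachable)
```

With only the alarm reports, Z, A and X look like three independent failures. With the path information, Z is the single candidate and the other two reports can be attached to it.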
Carl
|
1781.4 | Alarm & Notification APIs? | ANDRIS::putnins | Hands across the Baltics | Fri Nov 08 1991 10:56 | 34 |
| Re: .1
> This can be done now. You can build an FM that receives the alarm
>notifications from the alarms FM.
Where can I obtain information on the Alarms and Notification FM
interfaces? I would like to be able to write an FM that screens events
and writes them into an external database so another, independent,
program can process them. This may be in addition to the facilities
offered by PNMP, below.
Re: .2
>First, there needs to be a way for customers
> to write simple filtering scripts to reduce multiple events
> occurring in some time scope to be correlated and generate a
> "more meaningful" event to go into the alarms fm.
It appears that the PNMP Alarm Handling FM will offer this capability
(see TAEC::PNMP conference, note 3.2):
The PNMP Alarm Handling FM provides the following alarm handling functions:
1. The ability to define an Operation Context for alarm handling.
2. Collecting alarm reports generated by managed objects or generated by
user defined rules.
3. Filtering alarm reports using an ISO conformant discriminator construct.
4. Creating Alarm Objects corresponding to the filtered alarm reports
and maintaining these objects. Alarm Objects can then be acknowledged,
handled, closed, archived and/or purged.
5. Notifying the PNMP PM when an Alarm Object has been created or when
its status has changed.
6. Escalation (with the creation of a new Alarm Object) when an Alarm
Object has not been acknowledged or handled before a specified time
(change in time per severity).
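The escalation function in the list above can be sketched in Python as follows. This is not the PNMP interface: the `AlarmObject` fields, the severity names, the deadline values, and the `escalate` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical per-severity deadlines, in seconds.
ESCALATION_DEADLINE = {"critical": 300, "major": 900, "minor": 3600}
# Hypothetical next severity when an alarm is escalated.
NEXT_SEVERITY = {"minor": "major", "major": "critical"}


@dataclass
class AlarmObject:
    ident: int
    source: str
    severity: str
    created_at: int               # seconds
    status: str = "outstanding"   # later: acknowledged, handled, closed
    escalated: bool = False


def escalate(alarms, now):
    """Create new alarm objects for alarms left outstanding too long.

    An alarm that is still outstanding (neither acknowledged nor handled)
    once its per-severity deadline has passed is marked as escalated, and
    a new alarm object is returned for it.
    """
    new = []
    next_id = max((a.ident for a in alarms), default=0) + 1
    for alarm in alarms:
        deadline = ESCALATION_DEADLINE.get(alarm.severity)
        if alarm.status != "outstanding" or alarm.escalated or deadline is None:
            continue
        if now - alarm.created_at >= deadline:
            alarm.escalated = True
            new.append(AlarmObject(
                ident=next_id,
                source=alarm.source,
                severity=NEXT_SEVERITY.get(alarm.severity, alarm.severity),
                created_at=now,
            ))
            next_id += 1
    return new


alarms = [
    AlarmObject(1, "br01", "minor", created_at=0),
    AlarmObject(2, "rt07", "major", created_at=0, status="acknowledged"),
]
new = escalate(alarms, now=3600)
```

The acknowledged alarm is left alone, while the unattended minor alarm produces one new alarm object at the next severity.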
|
1781.5 | PNMP Alarm Handling FM | RIVAGE::SILVA | Carl Silva - Telecom Eng - DTN 828-5339 | Fri Nov 08 1991 12:07 | 24 |
| >Where can I obtain information on the Alarms and Notification FM
>interfaces? I would like to be able to write an FM that screens events
>and writes them into an external database so another, independent,
>program can process them. This may be in addition to the facilities
>offered by PNMP, below.
The API is the normal interface that all modules use (mcc_call_access
or mcc_call_function).
>>First, there needs to be a way for customers
>> to write simple filtering scripts to reduce multiple events
>> occurring in some time scope to be correlated and generate a
>> "more meaningful" event to go into the alarms fm.
>
>It appears that the PNMP Alarm Handling FM will offer this capability
>(see TAEC::PNMP conference, note 3.2):
Yes, it will provide some of this functionality.
RE: .2,
Can you expand on your requirements?
Carl
|
1781.6 | As usual, sample code would do wonders.. | SUBWAY::REILLY | Mike Reilly - New York Bank District | Fri Nov 08 1991 12:15 | 30 |
|
re: .1
> This can be done now. You can build an FM that receives the alarm
> notifications from the alarms FM.
If you know of somewhere I could get my hands on some code which
retrieves alarms and inserts alarms into the system I would
be very interested in developing the link to a rule based system.
I would suggest that code such as this should be shipped with the
next release of DECmcc. This will allow customers to use their own
alarm pre-processing algorithms in the near term.
My current customer would like to define DECmcc alarm rules which
all have a severity of 'warning' or lower. When the alarms are passed
thru the pre-processing algorithm, a synthetic alarm would be
generated (if needed) with a severity of 'critical'. The network operations
staff would only have to watch for alarms which have a severity of
'critical'. The original alarms would still be available if needed.
We also need a fix to the problem of all exceptions being flagged as
'critical' for this scheme to work.
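The scheme could be sketched in Python like this. The dictionary layout of an alarm, the `preprocess` and `operator_view` helpers, and the promotion policy (one rule firing on at least three distinct entities) are all hypothetical stand-ins for whatever algorithm the customer would plug in.

```python
def preprocess(raw_alarms, min_entities=3):
    """Return the raw alarms plus any synthetic 'critical' alarms.

    Assumed policy: when the same rule has fired on at least
    `min_entities` distinct entities, add one synthetic critical alarm
    that names all of them. The raw alarms are kept untouched.
    """
    entities_by_rule = {}
    for alarm in raw_alarms:
        entities_by_rule.setdefault(alarm["rule"], set()).add(alarm["entity"])

    synthetic = [
        {
            "rule": rule,
            "entity": ",".join(sorted(entities)),
            "severity": "critical",
            "synthetic": True,
        }
        for rule, entities in sorted(entities_by_rule.items())
        if len(entities) >= min_entities
    ]
    return raw_alarms + synthetic


def operator_view(alarms):
    """What the operations staff watch: critical alarms only."""
    return [a for a in alarms if a["severity"] == "critical"]


raw = [
    {"rule": "bridge_unreachable", "entity": "br1", "severity": "warning"},
    {"rule": "bridge_unreachable", "entity": "br2", "severity": "warning"},
    {"rule": "bridge_unreachable", "entity": "br3", "severity": "warning"},
    {"rule": "high_load", "entity": "rt1", "severity": "warning"},
]
all_alarms = preprocess(raw)
watched = operator_view(all_alarms)
```

The operators see one critical alarm, and the four original warnings remain in `all_alarms` for anyone who needs the detail. This only works if nothing else is marked 'critical' by default.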
With regard to the development of an AI system to handle this, I
have heard that the south of France provides the ideal environment
for the development of AI systems :-).
- Mike
|
1781.7 | | RIVAGE::SILVA | Carl Silva - Telecom Eng - DTN 828-5339 | Fri Nov 08 1991 12:17 | 5 |
| > With regard to the development of an AI system to handle this, I
> have heard that the south of France provides the ideal environment
> for the development of AI systems :-).
Yes, it does!
|
1781.8 | 1.2 notif has a lot of what you are asking for | TOOK::CALLANDER | MCC = My Constant Companion | Fri Jan 03 1992 11:30 | 23 |
| okay, I know this discussion is old but...
The MRMs (module reference manuals) ship with the kit. The alarms and
notif ones will be updated along with the rest for final v1.2 shipment.
These documents explain the functions supported by each module.
Now as to most of the functionality you have listed, you should find
a lot of it in the notification services in the 1.2 field test kit.
The logging stuff didn't make it into field test but will be in the
final product. The log is supposed to allow logging based on the
filters you have defined.
As to an open api to do what you are looking for, the data collector
AM provides an easy to use open interface that allows for information
to be passed into mcc from any application you want to connect to
the api. The documentation on this is only being distributed to a
few specific field test sites so that we can first determine if our
implementation meets the need. If you want more information on
the data collector please send mail to Anne Pelagatti or Wally
Matthews (both are located on TOOK::)
jill
|