Title: | CSGUK_SYSTEMS |
Notice: | No restrictions on keyword creation |
Moderator: | KERNEL::ADAMS |
|
Created: | Wed Mar 01 1989 |
Last Modified: | Thu Nov 28 1996 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 242 |
Total number of notes: | 1855 |
146.0. "PRODUCTION CLUSTER " by KERNEL::ADAMS (An RD54 beats a platinum disc !!) Fri Nov 08 1991 16:54
From: NAME: Jeff Yates
FUNC: Customer Services
TEL: 833 <YATESJ AT A1_KERNEL @THESUN @UVO>
Date: 07-Nov-1991
Posted-date: 07-Nov-1991
Precedence: 1
Subject: Update on the power problems
Update on the power situation
A meeting was held on Wednesday 6th Nov, to review the power problems
experienced on Friday 1st November 1991.
Attendees
Jeff Yates
Simon Lobar
Kevin Gant
Mike Coggins
Mary Challenor
John Frawley
Erika Smith
Colin Tubb
Ray Stevens
The purpose of the meeting was to review the events that took place,
understand the outstanding issues, plan how to increase our resilience for the
future, and decide on a more suitable approach to managing such an event
if it recurs.
Current status
The UPS is currently out of circuit. No faults have been found with it.
Chloride want to progressively increase the loading on to the UPS, and we are
planning for the best way of arranging this. Also a mains physical integrity
check is felt advisable. Once again we are planning how to do this with
minimum disruption.
We can gain access to the Northern NICE providing we have a live ring main,
but the queue structures are sufficiently different to make this of limited
use. We are working to find out how to improve the usability of the two
systems in the event of either failing.
The phone system held up on battery power during the outage. Battery backup
is rated at 1.5 hours. We are considering whether this needs supplementing
with more batteries.
Numerous people appeared to have access to the plant room and the production
computer room, making control of the repair effort difficult. The access list
will be reviewed and any apparent anomalies discussed with the individuals
concerned to see if there are genuine access needs. Any unnecessary access
rights will be disabled.
A phone bell will be installed in the computer room.
Crisis Management
A crisis was defined as the production system (Kernel cluster)
being unavailable.
In this event, where the downtime is known, the IS group
should issue a tannoy message to the building via PM&S, giving
the reason for the outage, and the expected duration.
Where the downtime is not known, a Problem Manager should be
appointed by :-
- Either of the Service Centre Managers; or
- Either of the Operational Managers; or
- The Senior Manager present in either of the two
Service Centre Management teams
The responsibilities of the Problem Manager are :-
- To announce his or her presence to the building staff
- To co-ordinate the repair activities
- To provide regular updates to the building staff
- To ensure that non-required people are kept away from the
repair activities
- To take any decisions that impact the Service Centre businesses.
Regards
Jeff Yates