Conference smurf::ase

Title: ase
Moderator: SMURF::GROSSO
Created: Thu Jul 29 1993
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 2114
Total number of notes: 7347

1980.0. "Urgent!!! asemgr ................ Help needed!!" by MDR01::CARRANZ (MCS Madrid) Thu Apr 03 1997 07:02

We have a very urgent customer problem.

Customer Configuration:

We have two AlphaServer 4100 5/400 systems:
 
        Digital Unix V3.2F
        ASEBASE130 ----> DECSAFE
        ASECMS130 ----> DECSAFE
        2 GB of memory (4*512MB).
        3 power supplies
        4 CPUs, 3MB cache

        PCI/IO subsystem configuration:
        2*KZPSA connected to 2 HSZ40 FW V2.7
        1*KZPSC (3 Channels)
        2*DNSES (X.25)
        1*DE435
        1*DEFPA PCI TO FDDI ADAPTER
        1*KZPAA

        The system disk is attached to KZPSC (RAID 1).

	The customer has three ASE disk services.

(Apr 2 08:25) Yesterday one system (rtrprd2) hung. It was running just one disk
service and was not running the asedirector. We were not able to relocate
that disk service.

	We forced a crash (halt button and >>> crash).

	After that we could not run "asemgr": when we invoked it, it printed just
"....." and never responded.

	We have tried invoking asemgr from both ASE members (rtrprd2 and
	rtrprd1), though not at the same time.

	It looks like the ASE DB is locked.

Daemon.log message:

Apr  2 08:25:33 rtrprd1 DECsafe: rtrprd1 Director Error: freeLock: can't notify
EC to free DB lock

	Could anybody confirm this for us? And if it is locked,
	how do we unlock it?

	This is a serious problem for the customer. We have not yet rebooted
system 1, "rtrprd1" (the production system that runs the asedirector daemon);
we have rebooted the second one, "rtrprd2" (the system that hung yesterday).
	
	Now rtrprd1 is running all the ASE services (when rtrprd2 went down, rtrprd1
started its service).

	Thanks in advance to everybody,

	Carmen Arranz and Begoña Carvajal.

Apr  1 23:56:47 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 00:53:49 rtrprd1 os_mibs[1813]: os_mibs WARNING os_mibs.c line 366: Timeout waiting for response from master agent.
Apr  2 00:53:49 rtrprd1 os_mibs[1813]: os_mibs WARNING os_mibs.c line 388: Restaring protocol: Lost connection with Master Agent (snmpd).
Apr  2 00:54:19 rtrprd1 snmpd[1822]: Closing subagent os_mibs, reason: 0
Apr  2 01:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 02:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 03:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 04:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 05:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 06:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 07:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 08:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 08:25:33 rtrprd1 DECsafe: rtrprd2 AseMgr Error: ASE timeout - Unable to stop service.
Apr  2 08:25:33 rtrprd1 DECsafe: rtrprd1 Director Error: freeLock: can't notify EC to free DB lock
Apr  2 08:25:33 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to stop service 'rtrhis' - Relocation not successful.
Apr  2 08:25:33 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to relocate `rtrhis` to `rtrprd1`.
Apr  2 08:25:38 rtrprd1 DECsafe: rtrprd1 Director Error: freeLock: can't notify EC to free DB lock
Apr  2 08:27:33 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_STOP_SERVICE
Apr  2 08:27:33 rtrprd1 DECsafe: rtrprd1 Director Notice: received callback for an old event.  Discarding callback.
Apr  2 08:35:38 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_DAEMON_ONLINE
Apr  2 08:40:38 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_INQ_STATE
Apr  2 08:45:39 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_INQ_STATE
Apr  2 08:50:40 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_INQ_STATE
Apr  2 09:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 09:22:44 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_START_SERVICE
Apr  2 10:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 10:13:52 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Bad return from network modify script.
Apr  2 10:13:58 rtrprd1 DECsafe: rtrprd1 Director Warning: an ASEmgr exited with the db lock
Apr  2 10:27:47 rtrprd1 DECsafe: rtrprd2 AseMgr Error: ASE timeout - Unable to stop service.
Apr  2 10:27:47 rtrprd1 DECsafe: rtrprd1 Director Error: freeLock: can't notify EC to free DB lock
Apr  2 10:27:47 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to stop service 'rtrhis' - Relocation not successful.
Apr  2 10:27:47 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to relocate `rtrhis` to `rtrprd1`.
Apr  2 10:29:47 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_STOP_SERVICE
Apr  2 10:29:47 rtrprd1 DECsafe: rtrprd1 Director Notice: received callback for an old event.  Discarding callback.
Apr  2 10:37:48 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_DAEMON_ONLINE
Apr  2 10:42:49 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_INQ_STATE
Apr  2 10:47:03 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2 over the SCSI bus
Apr  2 10:47:04 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2-f over the network
Apr  2 10:47:08 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2 over the network
Apr  2 10:47:09 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.33.221: gateway 10.10.33.221
Apr  2 10:47:09 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.47.235: gateway 10.10.47.235
Apr  2 10:47:09 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:DOWN:10.10.47.235:DOWN
Apr  2 10:47:09 rtrprd1 DECsafe: local HSM Warning: member rtrprd2 is DOWN
Apr  2 10:47:10 rtrprd1 DECsafe: rtrprd1 Director ***ALERT: Member  is not available
Apr  2 10:47:10 rtrprd1 DECsafe: rtrprd1 Director Bug: getGenericPlan: can't get service list from config db
Apr  2 10:47:10 rtrprd1 DECsafe: rtrprd1 Director Error: can't get plan
Apr  2 10:47:10 rtrprd1 DECsafe: rtrprd1 Director Error: can't complete agent state change processing
Apr  2 10:47:10 rtrprd1 DECsafe: rtrprd1 Director ***ALERT: Member rtrprd2 is not available
Apr  2 10:47:15 rtrprd1 DECsafe: rtrprd1 Agent Notice: starting service rtrhis
Apr  2 10:47:38 rtrprd1 dnascd[10728]: Connect from LOCAL:.rtrprd2::uic=[0,0]root for application psw under root
Apr  2 10:47:39 rtrprd1 dnascd[1020]: Process exit (PID 10728).
Apr  2 10:47:39 rtrprd1 DECsafe: rtrprd1 Agent Notice: /var/ase/sbin/ase_run_sh rtrhis: rtrhis: start at Wed Apr  2 10:47:22 MET DST 1997
Apr  2 10:47:39 rtrprd1 DECsafe: rtrprd1 Director Notice: started service rtrhis on rtrprd1
Apr  2 10:47:49 rtrprd1 DECsafe: rtrprd1 Director Notice: msgSvc: unclaimed timeout
Apr  2 11:00:08 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2 over the SCSI bus
Apr  2 11:00:08 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2 over the network
Apr  2 11:00:08 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2-f over the network
Apr  2 11:00:10 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.33.221: gateway 10.10.33.221
Apr  2 11:00:10 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.47.235: gateway 10.10.47.235
Apr  2 11:00:10 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:UP:10.10.47.235:UP
Apr  2 11:00:10 rtrprd1 DECsafe: local HSM Notice: member rtrprd2 is UP
Apr  2 11:00:21 rtrprd1 DECsafe: rtrprd2 Agent Notice: initializing agent... stopping all services
Apr  2 11:00:22 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups: not currently mounted
Apr  2 11:00:22 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups already unmounted
Apr  2 11:00:23 rtrprd1 DECsafe: rtrprd2 Agent Notice: user script: rtrprd: boot at Wed Apr  2 11:00:23 MET DST 1997
Apr  2 11:00:24 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE: not currently mounted
Apr  2 11:00:24 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE already unmounted
Apr  2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK: not currently mounted
Apr  2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK already unmounted
Apr  2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE: not currently mounted
Apr  2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE already unmounted
Apr  2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE: not currently mounted
Apr  2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE already unmounted
Apr  2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA: not currently mounted
Apr  2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA already unmounted
Apr  2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM: not currently mounted
Apr  2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM already unmounted
Apr  2 11:00:26 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK: not currently mounted
Apr  2 11:00:26 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK already unmounted
Apr  2 11:00:26 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users: not currently mounted
Apr  2 11:00:26 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users already unmounted
Apr  2 11:00:26 rtrprd1 DECsafe: rtrprd2 Agent Notice: user script: rtrhis: boot at Wed Apr  2 11:00:26 MET DST 1997
Apr  2 11:00:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis: not currently mounted
Apr  2 11:00:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis already unmounted
Apr  2 11:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 12:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 12:44:39 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to get DB.
Apr  2 13:01:26 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to get DB.
Apr  2 13:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 13:18:12 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to get DB.
Apr  2 13:21:22 rtrprd1 DECsafe: rtrprd1 Director Notice: connected to by shutdown program
Apr  2 13:21:23 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_run_sh : sybase: not running in this member
Apr  2 13:21:24 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups: not currently mounted
Apr  2 13:21:24 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups already unmounted
Apr  2 13:21:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_run_sh rtrprd: rtrprd: stop at Wed Apr  2 13:21:25 MET DST 1997
Apr  2 13:21:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_run_sh rtrprd: rtrprd: not running in this member
Apr  2 13:21:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE: not currently mounted
Apr  2 13:21:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE already unmounted
Apr  2 13:21:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK: not currently mounted
Apr  2 13:21:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK already unmounted
Apr  2 13:21:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE: not currently mounted
Apr  2 13:21:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE already unmounted
Apr  2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE: not currently mounted
Apr  2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE already unmounted
Apr  2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA: not currently mounted
Apr  2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA already unmounted
Apr  2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM: not currently mounted
Apr  2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM already unmounted
Apr  2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK: not currently mounted
Apr  2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK already unmounted
Apr  2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users: not currently mounted
Apr  2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users already unmounted
Apr  2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_run_sh rtrhis: rtrhis: stop at Wed Apr  2 13:21:29 MET DST 1997
Apr  2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_run_sh rtrhis: rtrhis: not running in this member
Apr  2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis: not currently mounted
Apr  2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis already unmounted
Apr  2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Warning: aseagent exiting on request...
Apr  2 13:21:29 rtrprd1 DECsafe: rtrprd1 Agent Warning: sendReply: send failed
Apr  2 13:21:46 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2 over the SCSI bus
Apr  2 13:21:47 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.33.221: gateway 10.10.33.221
Apr  2 13:21:47 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.47.235: gateway 10.10.47.235
Apr  2 13:21:47 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:UP:10.10.47.235:UP
Apr  2 13:21:48 rtrprd1 DECsafe: local HSM ***ALERT: network ping to host rtrprd2 is working but SCSI ping is not
Apr  2 13:26:09 rtrprd1 DECsafe: rtrprd2 Agent Notice: initializing agent... stopping all services
Apr  2 13:26:09 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups: not currently mounted
Apr  2 13:26:09 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups already unmounted
Apr  2 13:26:11 rtrprd1 DECsafe: rtrprd2 Agent Notice: user script: rtrprd: boot at Wed Apr  2 13:26:10 MET DST 1997
Apr  2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE: not currently mounted
Apr  2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE already unmounted
Apr  2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK: not currently mounted
Apr  2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK already unmounted
Apr  2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE: not currently mounted
Apr  2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE already unmounted
Apr  2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE: not currently mounted
Apr  2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE already unmounted
Apr  2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA: not currently mounted
Apr  2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA already unmounted
Apr  2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM: not currently mounted
Apr  2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM already unmounted
Apr  2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK: not currently mounted
Apr  2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK already unmounted
Apr  2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users: not currently mounted
Apr  2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users already unmounted
Apr  2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: user script: rtrhis: boot at Wed Apr  2 13:26:14 MET DST 1997
Apr  2 13:26:15 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis: not currently mounted
Apr  2 13:26:15 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis already unmounted
Apr  2 13:26:21 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2 over the SCSI bus
Apr  2 13:26:23 rtrprd1 DECsafe: local HSM Notice: user script: change host 10.10.33.221: gateway 10.10.33.221
Apr  2 13:26:23 rtrprd1 DECsafe: local HSM Notice: user script: change host 10.10.47.235: gateway 10.10.47.235
Apr  2 13:26:23 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:UP:10.10.47.235:UP
Apr  2 13:26:23 rtrprd1 DECsafe: local HSM ***ALERT: network ping to host rtrprd2 is working and now SCSI ping is working also
Apr  2 13:42:47 rtrprd1 DECsafe: rtrprd2 Agent Warning: aseagent exiting on request...
Apr  2 13:43:01 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2 over the SCSI bus
Apr  2 13:43:06 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2 over the network
Apr  2 13:43:06 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2-f over the network
Apr  2 13:43:07 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.33.221: gateway 10.10.33.221
Apr  2 13:43:07 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.47.235: gateway 10.10.47.235
Apr  2 13:43:07 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:DOWN:10.10.47.235:DOWN
Apr  2 13:43:07 rtrprd1 DECsafe: local HSM Warning: member rtrprd2 is DOWN
Apr  2 13:43:08 rtrprd1 DECsafe: rtrprd1 Director ***ALERT: Member rtrprd2 is not available
Apr  2 13:45:08 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2 over the SCSI bus
Apr  2 13:45:20 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2-f over the network
Apr  2 13:45:31 rtrprd1 DECsafe: rtrprd2 Agent Notice: initializing agent... stopping all services
Apr  2 13:45:32 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups: not currently mounted
Apr  2 13:45:32 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups already unmounted
Apr  2 13:45:34 rtrprd1 DECsafe: rtrprd2 Agent Notice: user script: rtrprd: boot at Wed Apr  2 13:45:33 MET DST 1997
Apr  2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE: not currently mounted
Apr  2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE already unmounted
Apr  2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK: not currently mounted
Apr  2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK already unmounted
Apr  2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE: not currently mounted
Apr  2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE already unmounted
Apr  2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE: not currently mounted
Apr  2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE already unmounted
Apr  2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA: not currently mounted
Apr  2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA already unmounted
Apr  2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM: not currently mounted
Apr  2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM already unmounted
Apr  2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK: not currently mounted
Apr  2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK already unmounted
Apr  2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users: not currently mounted
Apr  2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users already unmounted
Apr  2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: user script: rtrhis: boot at Wed Apr  2 13:45:36 MET DST 1997
Apr  2 13:45:37 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis: not currently mounted
Apr  2 13:45:37 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis already unmounted
Apr  2 13:45:39 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.33.221: gateway 10.10.47.235
Apr  2 13:45:39 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.47.235: gateway 10.10.47.235
Apr  2 13:45:39 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:DOWN:10.10.47.235:UP
Apr  2 13:45:39 rtrprd1 DECsafe: local HSM Notice: member rtrprd2 is UP
Apr  2 13:45:39 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2 over the network
Apr  2 13:45:40 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.33.221: gateway 10.10.33.221
Apr  2 13:45:40 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.47.235: gateway 10.10.47.235
Apr  2 13:45:40 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:UP:10.10.47.235:UP
Apr  2 14:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 15:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
Apr  2 16:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
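
A minimal sketch, assuming the log above has been saved as a plain syslog-style text file: the shell functions below pull out the DECsafe Director and AseMgr warning, error, and bug lines, and count how often the director failed to release the DB lock. The function names and the file name daemon.log in the usage comments are illustrative assumptions, not ASE tools.

```shell
# Print only DECsafe Director/AseMgr lines tagged Error, Warning or Bug.
# Reads a syslog-style log on stdin.
ase_problems() {
    grep -E 'DECsafe: .* (Director|AseMgr) (Error|Warning|Bug):'
}

# Count the "freeLock" failures reported by the director.
# grep -c exits non-zero when the count is 0, so that status is ignored.
count_freelock() {
    grep -c "freeLock: can't notify EC to free DB lock" || true
}

# Usage (daemon.log is a placeholder for wherever the log was saved):
#   ase_problems < daemon.log
#   count_freelock < daemon.log
```

Filtering this way hides the hourly xntpd entries and the long runs of ase_mount_action notices, so the lock and timeout messages stand out.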
    
1980.1. "Quick response" by NETRIX::"[email protected]" (Gregory P. Myrdal) Thu Apr 03 1997 10:32 (35 lines)

Hi,

I cannot tell whether your database is locked.  This is not a common
problem with ASE.  The logs do indicate that there could be a db lock
problem; however, there is also a database fetch, which does not require
a db lock, and it fails too.  My first instinct is to say that the director
is very confused for some reason.  The reason is unknown, as it most likely
happened before the log entries you submitted.  This is also most likely why
the asemgr is hanging: it cannot talk to the asedirector correctly, and
it does not require a db lock to do so.

My suggestion is to try to get the director back into a sane state.  This
does not require rebooting any systems or shutting down any services.  If
rtrprd2 is not needed and is not running any services, I would shut it down
to single user to remove it from the equation.  Then I would make sure that
all of the ase daemons are running, using the 'ps ax | grep ase' command on rtrprd1.

Make sure you have an aseagent, asedirector, and asehsm.  You can then
kill -9 the asedirector process id.  Do a tail -f on the daemon.log and
save it off to a file for us.  The aseagent is responsible for starting a 
new daemon, so you should see messages in the log about rtrprd1 starting a
new one.  In case this does not work, we need to see these new entries in the
daemon.log to see what a new asedirector generates.  If this fails, I
would also suggest escalating this problem so you get the appropriate
attention and someone assigned to work it.
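
The steps above can be sketched as a pair of small shell helpers. This is only an illustration under stated assumptions: the function names are hypothetical, the PID is assumed to be the first column of `ps ax` output, and the log and output file names in the usage comments are placeholders. The helpers only read `ps` output; the `kill` itself is left as a commented manual step.

```shell
# Read `ps ax` output on stdin and print the name of each of the three
# ASE daemons mentioned above that does not appear in it.
missing_ase_daemons() {
    ps_out=$(cat)
    for d in aseagent asedirector asehsm; do
        printf '%s\n' "$ps_out" | grep -q "$d" || printf '%s\n' "$d"
    done
}

# Read `ps ax` output on stdin and print the PID (first column) of the
# first asedirector line, skipping any awk/grep helper processes.
director_pid() {
    awk '/asedirector/ && !/awk|grep/ { print $1; exit }'
}

# Usage on rtrprd1 (file names are placeholders):
#   ps ax | missing_ase_daemons        # no output means all three are present
#   tail -f daemon.log | tee /tmp/director_restart.log &
#   kill -9 "$(ps ax | director_pid)"
```

Starting the `tail -f` before the `kill` means the entries written by the new director are captured from the first line.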

The reason I think the asedirector is confused is because its configuration
(in memory) database looks to be hosed.

Good luck and let us know what happens.

-- Greg


[Posted by WWW Notes gateway]
1980.2. "doing it ..." by MDR01::CARVAJAL (MCS Madrid) Thu Apr 03 1997 11:04 (9 lines)
    Thanks a lot for your quick answer.
    
    We are going to follow your recommendations, but if the problem is not
    solved we have a short window in which to stop the machine or ... to recover a
    good asecdb.
    
    We will inform you asap.
    
    Carmen and Begoña
1980.3. "The asecdb is not a suspect .... yet" by SMURF::MYRDAL Thu Apr 03 1997 11:09 (8 lines)
    Note: nothing indicates that the asecdb is bad.  Also db locks are not
    placed in the filesystem version of the asecdb.  So what I am
    suggesting is that getting a new asecdb may not solve any problems.
    If you know you made changes to the asecdb and now are having problems,
    this is a horse of a different color.  But your original note did not
    mention this.
    
    -- Greg
1980.4. "What do you mean ...?" by MDR01::CARVAJAL (MCS Madrid) Thu Apr 03 1997 11:19 (9 lines)
    Like you, we assume the asecdb is not bad. But we need to tell the customer
    all the steps to carry out while the system is stopped.
    
    
    What do you mean by "placed in the filesystem of the asecdb"?
    
    We assume that a system reboot will solve the problem. Don't you?
    
    Begoña and Carmen
1980.5. by BACHUS::DEVOS (Manu Devos DEC/SI Brussels 856-7539) Thu Apr 03 1997 11:23 (6 lines)
    Hi,
    
    Don't reboot now !!
    
    Start by killing the asedirector....
    
1980.6. "OK.!!" by MDR01::CARVAJAL (MCS Madrid) Thu Apr 03 1997 11:27 (3 lines)
    OK!!!
    
    
1980.7. "asecdb clarification" by SMURF::MYRDAL Thu Apr 03 1997 11:30 (12 lines)
    Yes, try restarting the asedirector first.
    
    The asecdb lives in the filesystem and in the asedirector's memory. 
    Yes, I am suggesting that a restart of the daemons may fix the problem
    and that there is no problem with the on-disk copy of the database.
    There is a chance there might be a problem with the in-memory copy of
    the director's database (or, more likely, it's not there for some reason).
    When the director is restarted it will request a new copy from the
    agent, which gets it from disk.
    
    -- greg
    
1980.8. "Everything went OK as here suggested!!!" by MDR01::CARRANZ (MCS Madrid) Thu Apr 03 1997 13:22 (22 lines)
    Hi to everybody who has helped us,
    
    We have done what you said and now DECsafe is working as expected
    (OK!).
    
    1.- Shutdown now on rtrprd2
    2.- Verified that the ase daemons were running on rtrprd1
    3.- kill -9 asedirector daemon.
    4.- A new asedirector was started immediately.
    5.- asemgr works
    
    Only one entry in daemon.log:
    
    "Apr 3 .... DECsafe:rtrprd1 Agent Notice: Starting a new director".
    
    Now we are going home to rest.
    
    Many thanks,
    
    Begoña and Carmen
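
A minimal sketch for confirming the restart from the log, assuming the agent's notice contains the text quoted in the note above: the function counts matching lines in a log read on stdin. The function name and the file name daemon.log are illustrative assumptions.

```shell
# Count "starting a new director" notices in a syslog-style log on stdin.
# The match is case-insensitive because the note quotes the message loosely.
# grep -c exits non-zero when the count is 0, so that status is ignored.
count_director_restarts() {
    grep -ci 'Agent Notice: starting a new director' || true
}

# Usage (daemon.log is a placeholder for the saved log file):
#   count_director_restarts < daemon.log
```

A count of 1 after the `kill -9` matches the single entry reported here.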