Title: | ase |
Moderator: | SMURF::GROSSO |
Created: | Thu Jul 29 1993 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 2114 |
Total number of notes: | 7347 |
We have a very urgent customer problem.

Customer configuration: We have two AlphaServer 4100 5/400
  Digital Unix V3.2F
  ASEBASE130 ----> DECSAFE
  ASECMS130 ----> DECSAFE
  2 GB of memory (4*512MB)
  3 power supplies
  4 CPUs, 3MB cache

PCI/IO subsystem configuration:
  2*KZPSA connected to 2 HSZ40 FW V2.7
  1*KZPSC (3 channels)
  2*DNSES (X.25)
  1*DE435
  1*DEFPA PCI to FDDI adapter
  1*KZPAA

The system disk is attached to the KZPSC (RAID 1). The customer has three ASE disk services.

Yesterday (Apr 2 08:25) one system (rtrprd2) hung. It was running just one disk service and was not running the asedirector. We were not able to relocate that disk service. We forced a crash (halt button and >>> crash). After that we could not run "asemgr": when we invoked it, it only printed "....." and never responded. We have tried invoking asemgr from both ASE members (rtrprd2 and rtrprd1), obviously not at the same time. It looks like the ASE database is locked.

Daemon.log message:

Apr 2 08:25:33 rtrprd1 DECsafe: rtrprd1 Director Error: freeLock: can't notify EC to free DB lock

Could anybody confirm this for us? And if the database is locked, how do we unlock it?

This is a serious problem for this customer. We have not yet rebooted system 1, "rtrprd1" (the production system, which runs the asedirector daemon); we have rebooted the second one, "rtrprd2" (the system that hung yesterday). rtrprd1 now runs all the ASE services (when rtrprd2 went down, rtrprd1 started its service).

Thanks in advance to everybody,
Carmen Arranz and Begoña Carvajal.

Apr 1 23:56:47 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 00:53:49 rtrprd1 os_mibs[1813]: os_mibs WARNING os_mibs.c line 366: Timeout waiting for response from master agent. Apr 2 00:53:49 rtrprd1 os_mibs[1813]: os_mibs WARNING os_mibs.c line 388: Restaring protocol: Lost connection with Master Agent (snmpd). 
Apr 2 00:54:19 rtrprd1 snmpd[1822]: Closing subagent os_mibs, reason: 0 Apr 2 01:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 02:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 03:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 04:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 05:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 06:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 07:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 08:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 08:25:33 rtrprd1 DECsafe: rtrprd2 AseMgr Error: ASE timeout - Unable to stop service. Apr 2 08:25:33 rtrprd1 DECsafe: rtrprd1 Director Error: freeLock: can't notify EC to free DB lock Apr 2 08:25:33 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to stop service 'rtrhis' - Relocation not successful. Apr 2 08:25:33 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to relocate `rtrhis` to `rtrprd1`. Apr 2 08:25:38 rtrprd1 DECsafe: rtrprd1 Director Error: freeLock: can't notify EC to free DB lock Apr 2 08:27:33 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_STOP_SERVICE Apr 2 08:27:33 rtrprd1 DECsafe: rtrprd1 Director Notice: received callback for an old event. Discarding callback. 
Apr 2 08:35:38 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_DAEMON_ONLINE Apr 2 08:40:38 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_INQ_STATE Apr 2 08:45:39 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_INQ_STATE Apr 2 08:50:40 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_INQ_STATE Apr 2 09:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 09:22:44 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_START_SERVICE Apr 2 10:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 10:13:52 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Bad return from network modify script. Apr 2 10:13:58 rtrprd1 DECsafe: rtrprd1 Director Warning: an ASEmgr exited with the db lock Apr 2 10:27:47 rtrprd1 DECsafe: rtrprd2 AseMgr Error: ASE timeout - Unable to stop service. Apr 2 10:27:47 rtrprd1 DECsafe: rtrprd1 Director Error: freeLock: can't notify EC to free DB lock Apr 2 10:27:47 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to stop service 'rtrhis' - Relocation not successful. Apr 2 10:27:47 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to relocate `rtrhis` to `rtrprd1`. Apr 2 10:29:47 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_STOP_SERVICE Apr 2 10:29:47 rtrprd1 DECsafe: rtrprd1 Director Notice: received callback for an old event. Discarding callback. 
Apr 2 10:37:48 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_DAEMON_ONLINE Apr 2 10:42:49 rtrprd1 DECsafe: rtrprd1 Director Warning: timeout waiting on Reply to ASE_INQ_STATE Apr 2 10:47:03 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2 over the SCSI bus Apr 2 10:47:04 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2-f over the network Apr 2 10:47:08 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2 over the network Apr 2 10:47:09 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.33.221: gateway 10.10.33.221 Apr 2 10:47:09 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.47.235: gateway 10.10.47.235 Apr 2 10:47:09 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:DOWN:10.10.47.235:DOWN Apr 2 10:47:09 rtrprd1 DECsafe: local HSM Warning: member rtrprd2 is DOWN Apr 2 10:47:10 rtrprd1 DECsafe: rtrprd1 Director ***ALERT: Member is not available Apr 2 10:47:10 rtrprd1 DECsafe: rtrprd1 Director Bug: getGenericPlan: can't get service list from config db Apr 2 10:47:10 rtrprd1 DECsafe: rtrprd1 Director Error: can't get plan Apr 2 10:47:10 rtrprd1 DECsafe: rtrprd1 Director Error: can't complete agent state change processing Apr 2 10:47:10 rtrprd1 DECsafe: rtrprd1 Director ***ALERT: Member rtrprd2 is not available Apr 2 10:47:15 rtrprd1 DECsafe: rtrprd1 Agent Notice: starting service rtrhis Apr 2 10:47:38 rtrprd1 dnascd[10728]: Connect from LOCAL:.rtrprd2::uic=[0,0]root for application psw under root Apr 2 10:47:39 rtrprd1 dnascd[1020]: Process exit (PID 10728). 
Apr 2 10:47:39 rtrprd1 DECsafe: rtrprd1 Agent Notice: /var/ase/sbin/ase_run_sh rtrhis: rtrhis: start at Wed Apr 2 10:47:22 MET DST 1997 Apr 2 10:47:39 rtrprd1 DECsafe: rtrprd1 Director Notice: started service rtrhis on rtrprd1 Apr 2 10:47:49 rtrprd1 DECsafe: rtrprd1 Director Notice: msgSvc: unclaimed timeout Apr 2 11:00:08 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2 over the SCSI bus Apr 2 11:00:08 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2 over the network Apr 2 11:00:08 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2-f over the network Apr 2 11:00:10 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.33.221: gateway 10.10.33.221 Apr 2 11:00:10 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.47.235: gateway 10.10.47.235 Apr 2 11:00:10 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:UP:10.10.47.235:UP Apr 2 11:00:10 rtrprd1 DECsafe: local HSM Notice: member rtrprd2 is UP Apr 2 11:00:21 rtrprd1 DECsafe: rtrprd2 Agent Notice: initializing agent... 
stopping all services Apr 2 11:00:22 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups: not currently mounted Apr 2 11:00:22 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups already unmounted Apr 2 11:00:23 rtrprd1 DECsafe: rtrprd2 Agent Notice: user script: rtrprd: boot at Wed Apr 2 11:00:23 MET DST 1997 Apr 2 11:00:24 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE: not currently mounted Apr 2 11:00:24 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE already unmounted Apr 2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK: not currently mounted Apr 2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK already unmounted Apr 2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE: not currently mounted Apr 2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE already unmounted Apr 2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE: not currently mounted Apr 2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE already unmounted Apr 2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA: not currently mounted Apr 2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA already unmounted Apr 2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM: not currently mounted Apr 2 11:00:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM already unmounted Apr 2 11:00:26 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK: not currently mounted Apr 2 11:00:26 rtrprd1 DECsafe: rtrprd2 Agent Notice: 
/var/ase/sbin/ase_mount_action: /RTR/WORK already unmounted Apr 2 11:00:26 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users: not currently mounted Apr 2 11:00:26 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users already unmounted Apr 2 11:00:26 rtrprd1 DECsafe: rtrprd2 Agent Notice: user script: rtrhis: boot at Wed Apr 2 11:00:26 MET DST 1997 Apr 2 11:00:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis: not currently mounted Apr 2 11:00:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis already unmounted Apr 2 11:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 12:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 12:44:39 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to get DB. Apr 2 13:01:26 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to get DB. Apr 2 13:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 13:18:12 rtrprd1 DECsafe: rtrprd2 AseMgr Error: Unable to get DB. 
Apr 2 13:21:22 rtrprd1 DECsafe: rtrprd1 Director Notice: connected to by shutdown program Apr 2 13:21:23 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_run_sh : sybase: not running in this member Apr 2 13:21:24 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups: not currently mounted Apr 2 13:21:24 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups already unmounted Apr 2 13:21:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_run_sh rtrprd: rtrprd: stop at Wed Apr 2 13:21:25 MET DST 1997 Apr 2 13:21:25 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_run_sh rtrprd: rtrprd: not running in this member Apr 2 13:21:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE: not currently mounted Apr 2 13:21:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE already unmounted Apr 2 13:21:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK: not currently mounted Apr 2 13:21:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK already unmounted Apr 2 13:21:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE: not currently mounted Apr 2 13:21:27 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE already unmounted Apr 2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE: not currently mounted Apr 2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE already unmounted Apr 2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA: not currently mounted Apr 2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA already unmounted Apr 2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: 
/RTR/SHAREDMEM: not currently mounted Apr 2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM already unmounted Apr 2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK: not currently mounted Apr 2 13:21:28 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK already unmounted Apr 2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users: not currently mounted Apr 2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users already unmounted Apr 2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_run_sh rtrhis: rtrhis: stop at Wed Apr 2 13:21:29 MET DST 1997 Apr 2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_run_sh rtrhis: rtrhis: not running in this member Apr 2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis: not currently mounted Apr 2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis already unmounted Apr 2 13:21:29 rtrprd1 DECsafe: rtrprd2 Agent Warning: aseagent exiting on request... Apr 2 13:21:29 rtrprd1 DECsafe: rtrprd1 Agent Warning: sendReply: send failed Apr 2 13:21:46 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2 over the SCSI bus Apr 2 13:21:47 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.33.221: gateway 10.10.33.221 Apr 2 13:21:47 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.47.235: gateway 10.10.47.235 Apr 2 13:21:47 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:UP:10.10.47.235:UP Apr 2 13:21:48 rtrprd1 DECsafe: local HSM ***ALERT: network ping to host rtrprd2 is working but SCSI ping is not Apr 2 13:26:09 rtrprd1 DECsafe: rtrprd2 Agent Notice: initializing agent... 
stopping all services Apr 2 13:26:09 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups: not currently mounted Apr 2 13:26:09 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups already unmounted Apr 2 13:26:11 rtrprd1 DECsafe: rtrprd2 Agent Notice: user script: rtrprd: boot at Wed Apr 2 13:26:10 MET DST 1997 Apr 2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE: not currently mounted Apr 2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE already unmounted Apr 2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK: not currently mounted Apr 2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK already unmounted Apr 2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE: not currently mounted Apr 2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE already unmounted Apr 2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE: not currently mounted Apr 2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE already unmounted Apr 2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA: not currently mounted Apr 2 13:26:13 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA already unmounted Apr 2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM: not currently mounted Apr 2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM already unmounted Apr 2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK: not currently mounted Apr 2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: 
/var/ase/sbin/ase_mount_action: /RTR/WORK already unmounted Apr 2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users: not currently mounted Apr 2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users already unmounted Apr 2 13:26:14 rtrprd1 DECsafe: rtrprd2 Agent Notice: user script: rtrhis: boot at Wed Apr 2 13:26:14 MET DST 1997 Apr 2 13:26:15 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis: not currently mounted Apr 2 13:26:15 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis already unmounted Apr 2 13:26:21 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2 over the SCSI bus Apr 2 13:26:23 rtrprd1 DECsafe: local HSM Notice: user script: change host 10.10.33.221: gateway 10.10.33.221 Apr 2 13:26:23 rtrprd1 DECsafe: local HSM Notice: user script: change host 10.10.47.235: gateway 10.10.47.235 Apr 2 13:26:23 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:UP:10.10.47.235:UP Apr 2 13:26:23 rtrprd1 DECsafe: local HSM ***ALERT: network ping to host rtrprd2 is working and now SCSI ping is working also Apr 2 13:42:47 rtrprd1 DECsafe: rtrprd2 Agent Warning: aseagent exiting on request... 
Apr 2 13:43:01 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2 over the SCSI bus Apr 2 13:43:06 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2 over the network Apr 2 13:43:06 rtrprd1 DECsafe: local HSM Warning: Can't ping rtrprd2-f over the network Apr 2 13:43:07 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.33.221: gateway 10.10.33.221 Apr 2 13:43:07 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.47.235: gateway 10.10.47.235 Apr 2 13:43:07 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:DOWN:10.10.47.235:DOWN Apr 2 13:43:07 rtrprd1 DECsafe: local HSM Warning: member rtrprd2 is DOWN Apr 2 13:43:08 rtrprd1 DECsafe: rtrprd1 Director ***ALERT: Member rtrprd2 is not available Apr 2 13:45:08 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2 over the SCSI bus Apr 2 13:45:20 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2-f over the network Apr 2 13:45:31 rtrprd1 DECsafe: rtrprd2 Agent Notice: initializing agent... 
stopping all services Apr 2 13:45:32 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups: not currently mounted Apr 2 13:45:32 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /home/sybackups already unmounted Apr 2 13:45:34 rtrprd1 DECsafe: rtrprd2 Agent Notice: user script: rtrprd: boot at Wed Apr 2 13:45:33 MET DST 1997 Apr 2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE: not currently mounted Apr 2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /ARCHIVE.TAPE already unmounted Apr 2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK: not currently mounted Apr 2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /PRISM_REL_DISK already unmounted Apr 2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE: not currently mounted Apr 2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/ARCHIVE already unmounted Apr 2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE: not currently mounted Apr 2 13:45:35 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/QUEUE already unmounted Apr 2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA: not currently mounted Apr 2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/REFDATA already unmounted Apr 2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM: not currently mounted Apr 2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/SHAREDMEM already unmounted Apr 2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/WORK: not currently mounted Apr 2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: 
/var/ase/sbin/ase_mount_action: /RTR/WORK already unmounted Apr 2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users: not currently mounted Apr 2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /RTR/users already unmounted Apr 2 13:45:36 rtrprd1 DECsafe: rtrprd2 Agent Notice: user script: rtrhis: boot at Wed Apr 2 13:45:36 MET DST 1997 Apr 2 13:45:37 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis: not currently mounted Apr 2 13:45:37 rtrprd1 DECsafe: rtrprd2 Agent Notice: /var/ase/sbin/ase_mount_action: /rtrhis already unmounted Apr 2 13:45:39 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.33.221: gateway 10.10.47.235 Apr 2 13:45:39 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.47.235: gateway 10.10.47.235 Apr 2 13:45:39 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:DOWN:10.10.47.235:UP Apr 2 13:45:39 rtrprd1 DECsafe: local HSM Notice: member rtrprd2 is UP Apr 2 13:45:39 rtrprd1 DECsafe: local HSM Notice: Able to ping rtrprd2 over the network Apr 2 13:45:40 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.33.221: gateway 10.10.33.221 Apr 2 13:45:40 rtrprd1 DECsafe: local HSM Notice: /var/ase/sbin/ase_run_sh: change host 10.10.47.235: gateway 10.10.47.235 Apr 2 13:45:40 rtrprd1 DECsafe: local HSM ***ALERT: HSM_PATH_STATUS:10.10.33.221:UP:10.10.47.235:UP Apr 2 14:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 15:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000 Apr 2 16:07:43 rtrprd1 xntpd[1765]: hourly check: drift -0.02578109 compliance 0.00000000
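Most of a log like the one above is hourly xntpd noise and routine Agent notices. A filter along the following lines may help isolate the DECsafe entries that matter. This is only a sketch: the function name `decsafe_problems` is made up for illustration, and it assumes the log is read with one entry per line and that the severity keywords are the ones seen in this note (Error, Warning, Bug, ***ALERT).

```shell
#!/bin/sh
# Hypothetical helper, not part of ASE: reads a syslog-style daemon.log on
# stdin and prints only the DECsafe entries whose severity is Error,
# Warning, Bug or ***ALERT. Notice-level entries are dropped.
decsafe_problems() {
    grep 'DECsafe:' | grep -E ' (Error|Warning|Bug|\*\*\*ALERT): '
}
```

It would be used as `decsafe_problems < daemon.log`, with whatever path the daemon.log has on the system in question.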
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
1980.1 | Quick response | NETRIX::"[email protected]" | Gregory P. Myrdal | Thu Apr 03 1997 10:32 | 35 |
Hi, I cannot tell whether your database is locked or not. This is not a common problem with ASE. The logs do indicate that there could be a db lock problem; however, there is also a database fetch, which does not require a db lock, and it fails too. My first instinct is to say that the director is very confused for some reason. The reason is unknown, as it most likely happened before the log entries submitted. This is also most likely why the asemgr is hanging: the asemgr cannot talk to the asedirector correctly, and it does not require a db lock to do so. My suggestion is to try to get the director back into a sane state. This does not require rebooting any systems or shutting down any services. If rtrprd2 is not needed and is not running any services, I would shut it down to single-user mode to remove it from the equation. Then I would make sure that all of the ase daemons are running, via the 'ps ax | grep ase' command on rtrprd1. Make sure you have an aseagent, asedirector, and asehsm. You can then kill -9 the asedirector process id. Do a tail -f on the daemon.log and save it off to a file for us. The aseagent is responsible for starting a new daemon, so you should see messages in the log about rtrprd1 starting a new one. In case this does not work, we need to see these new entries in the daemon.log to see what a new asedirector generates. Also, if this fails, I would suggest escalating this problem so you get the appropriate attention and someone assigned to work it. The reason I think the asedirector is confused is that its configuration (in-memory) database looks to be hosed. Good luck and let us know what happens. -- Greg [Posted by WWW Notes gateway] | |||||
1980.2 | doing it ... | MDR01::CARVAJAL | MCS Madrid | Thu Apr 03 1997 11:04 | 9 |
Thanks a lot for your quick answer. We are going to follow your recommendations, but if the problem is not solved we have a little time to stop the machine or ... to recover a good asecdb. We will inform you asap. Carmen and Begoña | |||||
1980.3 | The asecdb is not a suspect .... yet | SMURF::MYRDAL | | Thu Apr 03 1997 11:09 | 8 |
Note: nothing indicates that the asecdb is bad. Also db locks are not placed in the filesystem version of the asecdb. So what I am suggesting is that getting a new asecdb may not solve any problems. If you know you made changes to the asecdb and now are having problems, this is a horse of a different color. But your original note did not mention this. -- Greg | |||||
1980.4 | What do you mean ...? | MDR01::CARVAJAL | MCS Madrid | Thu Apr 03 1997 11:19 | 9 |
We suppose, as you do, that the asecdb is not bad. But we need to tell the customer all the steps to carry out while the system is stopped. What do you mean by "placed in the filesystem version of the asecdb"? We suppose that a system reboot will solve the problem. Don't you? Begoña and Carmen | |||||
1980.5 | | BACHUS::DEVOS | Manu Devos DEC/SI Brussels 856-7539 | Thu Apr 03 1997 11:23 | 6 |
Hi, Don't reboot now!! Start by killing the asedirector.... | |||||
1980.6 | OK.!! | MDR01::CARVAJAL | MCS Madrid | Thu Apr 03 1997 11:27 | 3 |
OK!!! | |||||
1980.7 | asecdb clarification | SMURF::MYRDAL | | Thu Apr 03 1997 11:30 | 12 |
Yes, try restarting the asedirector first. The asecdb lives in the filesystem and in the asedirector's memory. Yes, I am suggesting that a restart of the daemons may fix the problem and that there is no problem with the on-disk copy of the database. There is a chance there might be a problem with the in-memory copy of the director's database (or, more likely, it's not there for some reason). When the director is restarted it will request a new copy from the agent, which gets it from disk. -- greg | |||||
1980.8 | Everything went OK as here suggested!!! | MDR01::CARRANZ | MCS Madrid | Thu Apr 03 1997 13:22 | 22 |
Hi everybody who has helped us, We have done what you said and now DECsafe is working as expected (OK!). 1.- shutdown now on rtrprd2. 2.- Verified that the ase daemons were running on rtrprd1. 3.- kill -9 of the asedirector daemon. 4.- Immediately a new asedirector was started. 5.- asemgr works. Only one entry in daemon.log: "Apr 3 .... DECsafe:rtrprd1 Agent Notice: Starting a new director". Now we are going home to rest. Many thanks, Begoña and Carmen |
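
For anyone who lands on this topic with the same symptoms, the steps from .1 and .8 can be sketched as a small script to run as root on the member that hosts the director. This is an illustration, not a supported tool. The function names, the capture file path, and the 60-second wait are all made up here. It also assumes that `ps ax` prints the command in the fifth column, and that the other member has already been taken out of the equation as described in .1.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from ASE itself.

# Print the PIDs of processes whose command basename is exactly $1.
# Reads `ps ax`-style output on stdin (assumed layout: PID TTY S TIME CMD).
pids_of() {
    awk -v name="$1" '$1 ~ /^[0-9]+$/ {
        n = split($5, parts, "/")
        if (parts[n] == name) print $1
    }'
}

# $1 = path to daemon.log (site-specific, so it is passed in).
restart_director() {
    log=$1
    snapshot=$(ps ax)

    # Step 2 in .8: all three ASE daemons must be present before we start.
    for d in aseagent asedirector asehsm; do
        if [ -z "$(printf '%s\n' "$snapshot" | pids_of "$d")" ]; then
            echo "missing daemon: $d" >&2
            return 1
        fi
    done

    # Save the new log entries to a file (capture path is an assumption).
    tail -f "$log" > /tmp/daemon.log.capture &
    tail_pid=$!

    # Step 3 in .8: kill the director; the aseagent should start a new one.
    for pid in $(printf '%s\n' "$snapshot" | pids_of asedirector); do
        kill -9 "$pid"
    done

    sleep 60            # arbitrary wait for the new director to log
    kill "$tail_pid"
}
```

After sourcing the file, it would be invoked as `restart_director <path to daemon.log>`. The capture file should then contain an entry like the "Starting a new director" notice quoted in .8. If it does not, that capture is what .1 asks to have posted.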