Title: | POLYCENTER System Watchdog for VMS OSF/1 ULTRIX HP-UX AIX SunOS |
Notice: | Wishes:406,FAQ:845,Kits-VMS:1000,UNIX:694 VMS ECO01 FT kit: 521 |
Moderator: | AZUR::HUREZ Z |
|
Created: | Fri May 15 1992 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 1033 |
Total number of notes: | 4584 |
1004.0. "Polycenter watchdog : creation failure" by NETRIX::"[email protected]" (Thierry FAIDHERBE) Wed Feb 19 1997 09:49
Note posted on Digital_unix conference with ID 8867....
Hi to all,
I received a customer problem report concerning POLYCENTER System Watchdog.
Problem description from the call:
message : EPLZ14:Sensor /usr/opt/PSW/psw_sensor_eth creation failure.
This is the 4th time today.
Other agents seem to work normally.
Consolidator : 2.2-03 on OpenVMS Alpha 6.2-1H2
Agent : 2.2 on Digital Unix 3.2d1
We have just enabled the logfile:
SNS> sh cons/full
Controller : V2.2-03
Consolidator : 135 V2.2-03
Profile : $1$DUA1:[SYS3.SYSEXE]SNS$PROFILE.DAT;213
Log file : SYS$SYSROOT:[SYSMGR]SNS$LOG.DAT;1 Enabled
Action routines : Enabled
DECtalk : Enabled
Mailbox : Enabled
Polling interval : 180
Before setting : Not specified
Since setting : Not specified
Watchdog information:
Node Status Class Version OS Version
QPLZ11 Enabled DEVELOPMENT XO2.20 OSF1 V3.2 62 alpha
QPLZ02 Enabled DEVELOPMENT V2.2-03 VMS V6.2
QPLZ01 Enabled DEVELOPMENT V2.2-03 VMS V6.2
EPLZ14 Enabled PRODUCTION XO2.20 OSF1 V3.2 41.64 alpha
EPLZ13 Enabled PRODUCTION XO2.20 OSF1 V3.2 41 alpha
EPLZ12 Enabled PRODUCTION XO2.20 OSF1 V3.2 41 alpha
EPLZ11 Enabled PRODUCTION XO2.20 OSF1 V3.2 41 alpha
EPLZ07 Enabled PRODUCTION V2.2-02 VMS V6.2-1H3
EPLZ05 Enabled DEFAULT V2.2-03 VMS V6.2
EPLZ04 Enabled PRODUCTION V2.2-02 VMS V6.2-1H3
EPLZ03 Enabled MANAGEMENT V2.2-03 VMS V6.2-1H2
EPLZ02 Enabled PRODUCTION V2.2-03 VMS V6.2
EPLZ01 Enabled PRODUCTION V2.2-03 VMS V6.2
We have another problem on this system: the cron process dumps core...
dbx /usr/sbin/cron core_mob
dbx version 3.11.8
Type 'help' for help.
Core file created by program "cron"
signal Segmentation fault at >*[NLstrdlen, 0x3ff800c7810] bis r1,
r2,r1
(dbx) t
> 0 NLstrdlen(0x140001698, 0x1400018a0, 0x11ffff330, 0x100000018,
0x3ff800cd490) [0x3ff800c7810]
1 _doprnt(0x140000251, 0x11ffffb30, 0x28, 0x7e04, 0x3ff800ece70)
[0x3ff800c4f38]
2 sprintf(0x140001af8, 0x140000230, 0x14000ab40, 0x140000258, 0x53300d6c8)
[0x3ff800c7914]
3 ex(0x43e33412a61d8290, 0x43e03411267dffff, 0x22737708a6100000,
0x27ba20006b5b4365, 0x4400041023bd5534) [0x120006d50]
4 ex(0x4787041c4b80005c, 0x239000013f800000, 0x400034004a3c0f51,
0xf63ffff44a271791, 0x221e00b2a77d8180) [0x120006afc]
I took a look in the Bookreader documentation and found:
Sensor Creation failure WDM Not enough resources; stop and restart psw_agent
I suspect that the Watchdog problem is related to the cron problem: if cron
cannot run either, there may be a general problem with process creation.
Does anybody have an explanation of, or more information about, the
"Not enough resources" error message?
Kind regards,
+---++---++---++---++---++---++---+ TM Digital Equipment Belgium
| || || || || || || | Multivendor Customer Services
| d || i || g || i || t || a || l | Thierry FAIDHERBE
| || || || || || || | DIGITAL Unix Support Team
+---++---++---++---++---++---++---+ Email [email protected]
Phone : +32 2 729 77 44 Fax : +32 2 729 77 65
With DIGITAL Unix, ... You get what you pay for ...
[Posted by WWW Notes gateway]
T.R | Title | User | Personal Name | Date | Lines |
---|
1004.1 | Try the sensor independently, with trace enabled | AZUR::HUREZ | Connectivity & Computing Services @VBE. DTN 828-5159 | Wed Feb 19 1997 12:23 | 16 |
| The PSW for UNIX Agent doesn't use cron. As a matter of fact, the psw_agent
process is a kind of dedicated cron process that itself schedules
the sensors according to its configuration file, psw_agent.conf.
Can you please try the following:
# csh
# setenv psw_trace on
# psw_sensor_eth 0 -x eth.log
and mail me the output logfile...
Thanks,
-- Olivier.
|
1004.2 | ETH is OK. What about psw_agent and system resources | AZUR::HUREZ | Connectivity & Computing Services @VBE. DTN 828-5159 | Tue Feb 25 1997 06:10 | 25 |
| OK, I got the logfiles... According to their contents, when the sensor
has a chance to run, it seems to perform its job alright...
Do you have a logfile for the Agent itself that would cover a period
of time during which at least one ETH sensor creation failure happened?
FYI, such a trace can be obtained using the following method:
o Become superuser and kill the existing psw_agent process
% su
# ps -e -opid,ucomm | grep psw_agent | grep -v psw_agent_ | \
grep -v grep | awk '{print "kill -USR2 " $1}' | sh
o Enable the trace facility and rerun the psw_agent process,
while specifying an output logfile (-x option)
# setenv psw_trace
# /usr/opt/psw/psw_agent -f/usr/opt/psw/psw_agent.conf -xAGENT.LOG &
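The filtering stage of the kill pipeline above (keep psw_agent, drop psw_agent_* and the grep itself) can also be written as a single exact match on the command column. The following is only a sketch for a POSIX shell: the function name agent_pids, the PIDs, and the name psw_agent_x are made up for illustration, and it prints the PIDs instead of killing anything so it can be tried safely first.

```shell
# Print the PIDs of processes whose command name is exactly "psw_agent".
# Input: lines of "<pid> <command>", as produced by `ps -e -opid,ucomm`.
# agent_pids is a hypothetical helper name, not part of PSW.
agent_pids() {
    awk '$2 == "psw_agent" { print $1 }'
}

# Dry run on canned input (hypothetical PIDs and process names):
printf '%s\n' '  PID COMMAND' '  412 psw_agent' '  498 psw_agent_x' '   77 cron' | agent_pids
```

On a live system the input would come from `ps -e -opid,ucomm | agent_pids`; the resulting PIDs can then be passed to kill as in the original command.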
Is it possible that the process table on the system concerned was
nearly full at that time?
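One rough way to check this is to compare the current number of processes with the configured limit. The sketch below assumes a POSIX shell; proc_usage and MAX_PROCS are hypothetical names, the value 1024 is only a placeholder for the real limit on the system concerned, and the 90% threshold is arbitrary.

```shell
# Report how full the process table is, given a limit supplied by the caller.
# usage: proc_usage <process_count> <max_processes>
proc_usage() {
    count=$1
    limit=$2
    pct=$(( count * 100 / limit ))
    echo "$count of $limit process slots in use (${pct}%)"
    # Succeed only while usage is below 90% (arbitrary threshold).
    [ "$pct" -lt 90 ]
}

# Count current processes; subtract 1 for the ps header line.
nprocs=$(( $(ps -e | wc -l) - 1 ))
# MAX_PROCS is a placeholder: set it to the real limit of the system.
proc_usage "$nprocs" "${MAX_PROCS:-1024}" || echo "process table nearly full"
```

Running this at the moment a sensor creation failure is reported would show whether the failure coincides with a nearly full process table.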
-- Olivier.
|