T.R | Title | User | Personal Name | Date | Lines |
---|
505.1 | copy of database file ? | HPCGRP::BENSON | | Wed Jun 04 1997 09:40 | 6 |
| Paul,
Do you have a copy of the database file ?
-Ed
|
505.2 | May be a problem in /etc/hosts | NNTPD::"[email protected]" | Richard Warren | Wed Jun 04 1997 10:19 | 20 |
| Paul,
From your posting you mention that attempting to start "farmd" by hand
didn't result in anything unusual being reported. While it might not
seem strange that "lead.corning.com" reports a ring connection being made
to "nic1", it tells me that the string "nic1" is being returned as the
primary network name by gethostbyaddr() on that machine. As a guess,
the hosts file could be reordered so that the fully qualified name is the
first entry on each line, with the "aliases" following, e.g.
149.42.1.2 corning.com corning
Doing the above would remove any doubt about matching database names to
actual hostnames as returned by gethostbyaddr(). As it stands now, the
comparison is done by strcasecmp() and would fail when attempting to match
a fully qualified name against an alias to figure out farm membership.
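To illustrate the mismatch described above, here is a minimal sketch in Python. It is not PSE code: the function names and the 192.0.2.10 address are hypothetical, and it assumes a files-based lookup in which the first name on an /etc/hosts line is reported as the primary name.

```python
# Hypothetical sketch; none of these names come from PSE itself.

def primary_name(hosts_text, address):
    """Return the first name listed for `address` in /etc/hosts-style text.

    Assumption: with a files-based lookup, the first name on the line is
    what gethostbyaddr() reports as the primary name; the rest are aliases.
    """
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2 and fields[0] == address:
            return fields[1]
    return None


def same_host(a, b):
    """Case-insensitive whole-string comparison, in the spirit of strcasecmp() == 0."""
    return a.lower() == b.lower()


# 192.0.2.10 is a documentation address, not the customer's real one.
SHORT_FIRST = "192.0.2.10 nic1 nic1.corning.com\n"
FQDN_FIRST = "192.0.2.10 nic1.corning.com nic1\n"

short = primary_name(SHORT_FIRST, "192.0.2.10")
full = primary_name(FQDN_FIRST, "192.0.2.10")
print(same_host(short, "nic1.corning.com"))  # False: alias vs. fully qualified name
print(same_host(full, "NIC1.corning.com"))   # True: only the case differs
```

A whole-string comparison never equates "nic1" with "nic1.corning.com", which is why the order of names on the hosts line matters here.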
Other than that, I'm wondering if I could get access to the machine?
Richard
[Posted by WWW Notes gateway]
|
505.3 | PSE still not starting 1 sys | CSC32::P_HILL | | Thu Jun 05 1997 15:26 | 92 |
|
After talking with the customer, it looks like this system is at a
secure site, so we will not be able to get remote access to the machine.
The customer did update the /etc/hosts file, though. Here's what he sent me:
It turns out that the /etc/hosts file on the system on which PSE did
not work had only short host names rather than fully-qualified names. I
have fixed this, and restarted PSE. From syslog:
Jun 4 11:34:19 lead farmd[649]: Farm domain (nicfarm) sockets opened successfully
Jun 4 11:34:22 lead farmd[649]: Ring connection with nic1.corning.com established. (Right)
Jun 4 11:34:22 lead farmd[649]: Ring connection with zinc.corning.com established. (Left)
Jun 4 11:34:22 lead farmd[649]: Reinitializing
Jun 4 11:34:22 lead farmd[649]: Warning! Using service entry (nicfarm 32/tcp),
which differs from database SERVICE_PORT definition (57039)!
Jun 4 11:34:22 lead farmd[649]: Farm domain (nicfarm) sockets opened successfully
And also:
# lspart -partition allmembers
Current farm:
nicfarm.db
Farm Attributes:
PSE_LOADSERVERS nic1 nic2
PSE_PREF_COMM shm mc fddi ethernet
PSE_FILESYSTEM nics1:/pse /pse nicu1:/u1 /nfs/u1 nicu2:/u2 /nfs/u2 nicu3:/u3 /nfs/u3 nicu4:/u4 /nfs/u4 nicu5:/u5 /nfs/u5 nicu6:/u6 /nfs/u6
PSE_DEFAULT_PARTITION smp_zinc
PSE_SERVICEPORT 57039
Partition data:
Members(6):
lead.corning.com No information available
nic1.corning.com Jobslots = 1 load_avg = 0.20
nic2.corning.com Jobslots = 1 load_avg = 0.19
nickel.corning.com Jobslots = 6 load_avg = 3.11
tin.corning.com Jobslots = 4 load_avg = 0.01
zinc.corning.com Jobslots = 12 load_avg = 3.51
PSE_PREF_COMM shm mc fddi ethernet
So, it still does not start up on the same system.
This line from the log is puzzling:
Jun 4 11:34:22 lead farmd[649]: Warning! Using service entry (nicfarm 32/tcp),
The definition of the nicfarm port in /etc/services is the same as our NIS entry:
nicfarm 57039/tcp
Here's the farm definition:
configuration_data PSE_PARTITIONS allmembers bigmembers smp_nickel smp_zinc smp_tin smp_lead
configuration_data PSE_DEFAULT_PARTITION smp_zinc
configuration_data PSE_LOADSERVERS nic1 nic2
configuration_data PSE_UPDATE_PERIOD 60
configuration_data PSE_WHICH_LOADAVG 5
configuration_data PSE_SERVICEPORT 57039
configuration_data PSE_FILESYSTEM nics1:/pse /pse
configuration_data PSE_FILESYSTEM nicu1:/u1 /nfs/u1
configuration_data PSE_FILESYSTEM nicu2:/u2 /nfs/u2
configuration_data PSE_FILESYSTEM nicu3:/u3 /nfs/u3
configuration_data PSE_FILESYSTEM nicu4:/u4 /nfs/u4
configuration_data PSE_FILESYSTEM nicu5:/u5 /nfs/u5
configuration_data PSE_FILESYSTEM nicu6:/u6 /nfs/u6
configuration_data PSE_PREF_COMM shm mc fddi ethernet
allmembers PSE_MEMBERS zinc nickel tin lead nic1 nic2
bigmembers PSE_MEMBERS zinc nickel tin lead
smp_nickel PSE_MEMBERS nickel
smp_zinc PSE_MEMBERS zinc
smp_tin PSE_MEMBERS tin
smp_lead PSE_MEMBERS lead
Paul
|
505.4 | Still no progress? | NNTPD::"[email protected]" | Richard Warren | Thu Jun 05 1997 16:07 | 14 |
| Re: .2
The problem, as you noticed, is that the farm daemon seems to get a
service port that doesn't match the database. When this happens I issue
a warning but take the /etc/services entry as the "real" value.
Port 32 will obviously not talk to port 57039! From the looks of things,
the database is correct, as is /etc/services (though you should have
nicfarm 57039/udp in addition to the tcp entry).
If the udp entry is missing, please add it to /etc/services.
In any case, I'd simply stop the existing farmd and try restarting it, in
the absence of a reliable explanation!
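As a quick sanity check before restarting, something like the following Python sketch could compare a copy of /etc/services against the database port. It is only an illustration: the function names are hypothetical, and it inspects just the text it is given, so it would not catch a different answer coming back from NIS.

```python
# Hypothetical sketch; not part of PSE or farmd.

def service_port(services_text, name, proto):
    """Look up `name`/`proto` in /etc/services-style text; return the port or None."""
    for line in services_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        port, _, entry_proto = fields[1].partition("/")
        names = [fields[0]] + fields[2:]  # official name plus any aliases
        if name in names and entry_proto == proto and port.isdigit():
            return int(port)
    return None


def check_farm_service(services_text, name, db_port):
    """Return a list of readable problems; an empty list means the entries agree."""
    problems = []
    for proto in ("tcp", "udp"):
        port = service_port(services_text, name, proto)
        if port is None:
            problems.append(f"{name}/{proto} entry is missing")
        elif port != db_port:
            problems.append(f"{name} {port}/{proto} differs from database port {db_port}")
    return problems


# The single tcp entry quoted in .3, checked against PSE_SERVICEPORT 57039.
SERVICES = "nicfarm 57039/tcp\n"
for problem in check_farm_service(SERVICES, "nicfarm", 57039):
    print(problem)  # nicfarm/udp entry is missing
```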
Richard
[Posted by WWW Notes gateway]
|