Title: | DECnet/OSI for OpenVMS |
Moderator: | TUXEDO::FONSECA |
Created: | Thu Feb 21 1991 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 3990 |
Total number of notes: | 19027 |
The intention of this note is to let you know what might happen if you install ECO-6 for V6.3 on a DNS Server node. If you then panic you will probably mess up your DNS namespace (as have a couple of my Customers); if you keep calm you can dig your way out of the situation.

As far as I know, there are no particular problems with ECO-6 on DNS Clerks or systems with LOCAL as their primary name service. Therefore, over 99% of nodes will probably be improved by installation of ECO-6 -- it is just DNS Server nodes which may suffer, in particular those with DECdns configured as the primary name service.

The problem on DNS Server nodes occurs when the CDI cache entry for the Server node itself is flushed. There are a number of events which may flush the CDI cache entry, so you may want to avoid as many of these causes as possible:

1. Installation of ECO-6 (the cache file format is changed, so the entire cache is deleted).

2. NET$CONFIGURE options 1 or 2 rename the node and flush the node's own entries.

3. Explicit use of NCL FLUSH on the node's own name or on "*".

4. In a very large network the cache might become full and older entries might be purged.

5. During normal operation old entries are periodically purged from the CDI cache. The purge interval is set by the "Session Control Naming Cache Timeout". By default, NET$CONFIGURE sets the value in NET$SEARCHPATH_STARTUP.NCL to be 30 days. You may edit this, but if you use NET$CONFIGURE option 2 to change it then your node's entry will be flushed anyway!

6. During the boot process, between the startup of NET$ACP and the execution of NET$SEARCHPATH_STARTUP, there is a 20 second window during which the cache timeout takes on the hard-coded value of 7 days. During this window, the node will almost certainly attempt to look up its own name, and if its own cache entry is more than 7 days old it will be deleted from the CDI cache. The hard-coded 7 day value can be overridden by defining the logical CDI_CACHE_TTL.
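As a rough illustration of cause 6, the following Python sketch models the boot-time check described above. It is a toy model under stated assumptions, not DECnet/OSI code: the function names are hypothetical, the logical's value is assumed to be in seconds, and a defined CDI_CACHE_TTL is assumed simply to replace the 7-day value during the window.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Hard-coded timeout said to apply during the 20 second boot window.
HARD_CODED_TTL = 7 * SECONDS_PER_DAY


def boot_window_ttl(cdi_cache_ttl=None):
    """Return the timeout (seconds) assumed to be in force during the boot window.

    cdi_cache_ttl models the CDI_CACHE_TTL logical; None means "not defined".
    Hypothetical helper: it only mirrors the prose description.
    """
    if cdi_cache_ttl is None:
        return HARD_CODED_TTL
    return cdi_cache_ttl


def own_entry_survives_boot(entry_age_days, cdi_cache_ttl=None):
    """True if the node's own cache entry is not older than the boot-window timeout."""
    entry_age_seconds = entry_age_days * SECONDS_PER_DAY
    return entry_age_seconds <= boot_window_ttl(cdi_cache_ttl)


if __name__ == "__main__":
    # A 10-day-old entry, with and without a 60-day CDI_CACHE_TTL defined.
    print(own_entry_survives_boot(10))
    print(own_entry_survives_boot(10, cdi_cache_ttl=60 * SECONDS_PER_DAY))
```

Under this model, an entry older than 7 days is lost at boot unless CDI_CACHE_TTL is defined with a value larger than the entry's age.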
In summary, if you have installed ECO-6 on your DNS Server, or if you need to install it, then take the following two actions and avoid the causes listed above, and you will probably be OK. (OK, so you can't install ECO-6 and also avoid cause #1!)

1. Edit the NET$SEARCHPATH_STARTUP.NCL to set the Timeout to be a large number of days.

2. Edit SYLOGICALS.COM to include something similar to the following, where the timeout is in units of 1 second:

    $ def /sys CDI_CACHE_TTL 5184000 ! 60 days

Setting CDI_CACHE_TTL to a negative value (any -ve value) is supposed to disable the timeout. I can't confirm it, although I can say that it has strange side effects, such as making the "Naming Cache Timeout" value non-displayable by NCL.

If you are unlucky enough to have an ECO-6 DNS Server system lose its own CDI cache entry then you may get a variety of symptoms, mostly involving logical link connect attempts hanging and eventually timing out. Even "SET HOST 0" is likely to hang and then time out.

When in this state, the problem is how to restore the node's own CDI cache entry. Sometimes the system seems to recover spontaneously, but more often it is necessary to carry out the following sequence, which has been used multiple times and has succeeded both in our lab and on Customer networks:

1. Sometimes a reboot is required. A side effect of the problem is that NET$ACP can consume all its VA and the only way out is to reboot.

2. Use NCL to rename the node into LOCAL: (for example, NCL RENAME NEW NAME = LOCAL:.FRED). Do not try to be clever and use NET$CONFIGURE option 2, as this will flush the cache as fast as it renames!

3. Use DNS$DIAG to disable node verification, to prevent every logical link resulting in recursive lookups to DNS:

    $ mc dns$diag
    DECdns Server Diagnostics - Version V2.020 (Dec 2 1996)
    diag> disable node_verification

4. Use DNS$DIAG to disable ACS, as your access control is almost certainly not set up to allow access from unvalidated LOCAL: connections:

    $ mc dns$diag
    DECdns Server Diagnostics - Version V2.020 (Dec 2 1996)
    diag> enable acs_override

5. Carry out operations such as SET HOST to the proper *fullname* (i.e. with namespace name, etc.):

    $ SET HOST NS:.XYZ.FRED

6. Use NCL to rename the node back to its proper name and check that it's still working:

    $ NCL RENAME NEW NAME = NS:.XYZ.FRED
    $ SET HOST 0

7. Force the CDI cache to be written to disk, using CDI trace to monitor CDI activity until the cache write occurs (15 minutes):

    $ mc ncl set ses con nam cac check int +0-08:00:00
    $ mc cdi$trace

8. Search the CDI cache for references to LOCAL: and then flush them out:

    $ @tt:/out=x.x
    _$ mc cdi_cache_dump
    _$ exit
    $ search x.x local
    $ mc ncl flu ses con nam cac ent "LOCAL:.FRED"

9. Make sure the CDI cache is written to disk before the next reboot:

    $ mc ncl set ses con nam cac check int +0-08:00:00
    $ mc cdi$trace

10. Restore normal DECdns access control:

    $ mc dns$diag
    DECdns Server Diagnostics - Version V2.020 (Dec 2 1996)
    diag> enable node_verification
    diag> disable acs_override

Regards,
John Weir
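Since CDI_CACHE_TTL is given in seconds, the arithmetic from days to seconds can be checked with a short Python sketch. The helper names are hypothetical and exist only for illustration; the line it prints follows the form of the "def /sys" example in the note and is not a complete SYLOGICALS.COM.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400 seconds in one day


def ttl_seconds(days):
    """Convert a whole number of days into seconds for CDI_CACHE_TTL."""
    if days <= 0:
        # Negative values are said to disable the timeout, with odd side
        # effects, so this sketch refuses to generate them.
        raise ValueError("days must be a positive number")
    return days * SECONDS_PER_DAY


def define_command(days):
    """Build a DCL line in the same form as the example in the note."""
    return f"$ def /sys CDI_CACHE_TTL {ttl_seconds(days)} ! {days} days"


if __name__ == "__main__":
    # Candidate timeouts mentioned in this topic: 30, 60 and 90 days.
    for days in (30, 60, 90):
        print(define_command(days))
```

For 60 days this reproduces the value 5184000 used in the note.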
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
3879.1 | IAMOSI::LEUNG | Wed Mar 12 1997 22:30 | 23 | ||
Clarification needed : I have a customer who needs to apply ECO6 because it contains a fix for a crash in net$transport_nsp (NETNOSTATE) and the system is a DECdns server. Do they have to do these steps BEFORE applying ECO6 : > > 1. Edit the NET$SEARCHPATH_STARTUP.NCL to set the Timeout to be > a large number of days. > > 2. Edit SYLOGICALS.COM to include something similar to the following, > where the timeout is in units of 1 second > > $ def /sys CDI_CACHE_TTL 5184000 ! 60 days Is there an official patch from Engineering? If the node were to lose its own CDI cache, after doing the steps mentioned (renaming, dns$diag, etc...) and rebooting, is it then ok to use commands that will flush cdi cache entries? Thanks Dennis | |||||
3879.2 | Preliminary fixes, only, so far | COMICS::WEIR | John Weir, UK Country Support | Thu Mar 13 1997 03:59 | 68 |
Dennis, >Clarification needed : > >I have a customer who needs to apply ECO6 because it contains a fix for a crash >in net$transport_nsp (NETNOSTATE) and the system is a DECdns server. > >Do they have to do these steps BEFORE applying ECO6 : >> >> 1. Edit the NET$SEARCHPATH_STARTUP.NCL to set the Timeout to be >> a large number of days. >> >> 2. Edit SYLOGICALS.COM to include something similar to the following, >> where the timeout is in units of 1 second >> >> $ def /sys CDI_CACHE_TTL 5184000 ! 60 days Until fixes are widely available, do the above, as you suggest. Note: This will not completely avoid the problem, as when you do the upgrade to ECO-6 the CDI cache is deleted and thus the node's own name is flushed along with everything else. Also, if you rename the node then its name will be flushed ... But, at least if you do the above then you should not suffer from periodic and possibly unexpected flushing of the node's own entry. The base note contains the full procedure for recovering, and it has been used on several Customer sites as well as our lab. Quite often you do not have to use the full procedure and sometimes things recover spontaneously. So, provided you give the Customer the recovery instructions, the impact of the cache entry being flushed is not disastrous and can be recovered quite quickly -- don't panic, just be prepared (and being prepared means knowing how to recover quickly). > >Is there an official patch from Engineering? No, but I have successfully tested the preliminary fixes. I don't have authority to distribute them on Easynet (so don't ask me). If you need the fixes then ask Engineering or IPMT. The preliminary fix is to set a flag on the node's own CDI cache entry and that of its Alias so that they do not get flushed by periodic cache timeouts. But, these entries can still be flushed by renaming the node (a bad idea on a DNS Server, anyway). 
So, be aware that you might still need the recovery procedure even with the fixes -- But, if you don't mess around (eg renaming your DNS Server) then once the fixes are in then you should not need the recovery procedure any more... >If the node were to lose its own CDI cache, after doing the steps mentioned >(renaming, dns$diag, etc...) and rebooting, is it then ok to use commands that >will flush cdi cache entries? I don't understand the question. Flushing the node's entry from the CDI cache may cause the problem, but if the node's entry has already been flushed, then you may repeat the flush as many times as you like without making the situation any worse ;-) ie flushing an empty cache is a NOOP. Regards, John | |||||
3879.3 | Clarification | NNTPD::"[email protected]" | Paul Sture | Tue May 13 1997 11:51 | 9 |
In note 3879, step 1 recommends setting the timeout in NET$SEARCHPATH_STARTUP.NCL to be a "large number of days". The default appears to be 90 days. What do you suggest for this "large number"? TIA Paul Sture [Posted by WWW Notes gateway] | |||||
3879.4 | 90 days should be enough (maybe) | COMICS::WEIR | John Weir, UK Country Support | Wed May 14 1997 06:43 | 16 |
Paul, > >to be a "large number of days". The default appears to be 90 days. What do you >suggest for this "large number"? > That all depends upon how quickly you think the fix will be produced and how readily your Customer will upgrade ... Engineering originally suggested 90 days. The default in NET$CONFIGURE is 30 days (I think). John | |||||
3879.5 | Large no of days | NNTPD::"[email protected]" | rtoms::support_pw | Wed May 21 1997 11:33 | 13 |
> That all depends upon how quickly you think the fix will be > produced and how readily your Customer will upgrade ... > > Engineering originally suggested 90 days. The default in > NET$CONFIGURE is 30 days (I think). Thanks for your reply. It's for our own DNS systems in Germany, so I've got control over applying the fix once it arrives. Regards Paul Sture [Posted by WWW Notes gateway] |