
Conference lassie::ucx

Title: DEC TCP/IP Services for OpenVMS
Notice: Note 2-SSB Kits, 3-FT Kits, 4-Patch Info, 7-QAR System
Moderator: ucxaxp.ucx.lkg.dec.com::TIBBERT
Created: Thu Nov 17 1994
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 5568
Total number of notes: 21492

5516.0. "NFS and RENAME errors" by QUARK::LIONEL (Free advice is worth every cent) Thu May 15 1997 12:53

[I received the following by e-mail from a customer.  Please follow up with
the customer directly - thanks. - Steve]

Hello Steve,

I contacted you before about Fortran and UCX questions, as you may remember.
You were kind enough to answer me, which encourages me to send you this
letter.

The situation I describe below concerns TCP/IP, I think. So if you feel that
it is outside your area, please pass it on to a UCX expert.

I'm using UCX and I mount 2 NFS disks that are on 2 HP-UX systems.
There are several batch jobs that run almost simultaneously and that access
these NFS disks. Initially the jobs perform 3 DCL commands:
    1. f$search(<file>)
    2. DIRECTORY <file>
    3. RENAME <file> <new_file>
Sometimes when RENAME is executed the following messages are displayed:
  %RENAME-E-OPENIN, error opening <file> as input
  -RMS-F-REENT, file could not be renamed and recovery failed; file has been lost
  -SYSTEM-W-NOSUCHFILE, no such file
This situation happens when I have 2 jobs that perform the commands
"simultaneously". The time difference can be less than 00:00:00.50. One job
wins and performs its commands, and the other one produces the messages above.
I have tested this behavior several times with the help of a simple job:
    $ cnt=0
    $ on error then exit
    $ set verify
    $ set prefix "(!%T) "
    $nxt_rename:
    $ cnt1=cnt+1
    $ directory/date 'p1':tmpfile.'cnt'
    $ rename/log 'p1':tmpfile.'cnt' .'cnt1'
    $ cnt=cnt1
    $ if cnt .le. 99 then goto nxt_rename
    $ exit
The job is submitted twice with the same /AFTER value. P1's value is a logical
name that points to one of the two HP-UX systems, a different one for each
job. tmpfile is an empty UNIX file, created (touch) before the jobs start.

In one batch log you find:
----------------------------------------------------------------------------
(08:35:01.23) $ directory/date nfsdisk1:tmpfile.0

Directory DNFS1:[000000]

TMPFILE.0;1          14-MAY-1997 08:28:16.00

Total of 1 file.
(08:35:01.31) $ rename/log nfsdisk1:tmpfile.0 .1
%RENAME-E-OPENIN, error opening DNFS1:[000000]TMPFILE.0;1 as input
-RMS-F-REENT, file could not be renamed and recovery failed; file has been lost
-SYSTEM-W-NOSUCHFILE, no such file
----------------------------------------------------------------------------
and in the second log file:
----------------------------------------------------------------------------
(08:35:01.22) $ directory/date nfsdisk2:tmpfile.0

Directory DNFS2:[000000]

TMPFILE.0;1          14-MAY-1997 08:33:06.00

Total of 1 file.
(08:35:01.29) $ rename/log nfsdisk2:tmpfile.0 .1
%RENAME-I-RENAMED, DNFS2:[000000]TMPFILE.0;1 renamed to
DNFS2:[000000]TMPFILE.1;1
----------------------------------------------------------------------------

My question is whether there is any explanation for this error, which looks
like an obvious conflict during NFS access. Since the NFS disks are on
different UNIX computers, the cause of the error should be looked for on the
VMS side, shouldn't it? Is it something in the UCX configuration that causes
the error, or is it a bug? I could synchronize my jobs to avoid the conflict,
but I don't think that is the right way. I can't synchronize all NFS accesses!

After each failed RENAME I find a file on the UNIX side. The file has a name
like .$NFS$9bbaed. It holds the contents of the lost file and it is a hidden
file (note the leading dot). Is there any way to access the hidden files
through NFS? What mechanism creates these files?

I hope you or one of your colleagues can answer my questions.

Best Regards
Teofil Smolowicz
E-mail: [email protected]
5516.1. "same problem" by UTRTSC::KNOPPERS (Oswald Knoppers) Tue May 20 1997 10:56 (33 lines)

Not a solution, but an occurrence of the same problem. My customer uses the
VMS NFS client in combination with an NT server running NFS-Maestro (whatever
this may be) software.

Renaming a file (always) results in:

$ RENAME x.x y.y
%RENAME-E-OPENIN, error opening DNFS1:[000000]X.X;1 as input
-RMS-F-REENT, file could not be renamed and recovery failed; file has been lost
-SYSTEM-W-NOSUCHFILE, no such file
$

The result is:

$ DIR

Directory DNFS1:[000000]

.$$NFS$$9B4B83;1

He also has a Digital UNIX client for this server, and renaming a file from
there works fine.

I have two questions:

- I guess this is some timing problem. Would this be a client or a server
  problem?
- How do I troubleshoot this? Does an nfs/debug=180 command only work on
  systems running the NFS server process?

Thanks in advance,

Oswald
5516.2. "how to analyze the trace" by UTRTSC::KNOPPERS (Oswald Knoppers) Wed May 21 1997 07:29 (8 lines)
Ok, I made a trace with tcpiptrace. Does anybody have some tips on how to
interpret this? tcpiptrace unfortunately has no knowledge of NFS....

For the interested... Trace is at utrtsc::netw_user:[knoppers]nfs.trace

Regards,

Oswald
5516.3. "re: .1, .2 -- better report this to Maestro vendor" by LASSIE::CORENZWIT (stuck in postcrypt queue) Wed May 21 1997 19:24 (51 lines)
>Ok, I made a trace with tcpiptrace. Does anybody have some tips on how to
>interpret this? tcpiptrace unfortunately has no knowledge of NFS....
    
    No tips.  Just get out your RFC 1094 and RFC 1057 and start slogging. 
    It gets faster with practice.  8-{
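
As a starting point for that slogging, here is a minimal Python sketch that decodes the fixed part of an RPC call header (layout from RFC 1057) and maps the procedure number to its NFS version 2 name (table from RFC 1094). It assumes you already have the raw UDP payload of each packet as bytes; how to get that out of a tcpiptrace file is not covered here. The function name decode_rpc_call and the sample packet are made up for illustration.

```python
import struct

# NFS version 2 procedure numbers as listed in RFC 1094.
NFS2_PROCS = {
    0: "NULL", 1: "GETATTR", 2: "SETATTR", 3: "ROOT", 4: "LOOKUP",
    5: "READLINK", 6: "READ", 7: "WRITECACHE", 8: "WRITE", 9: "CREATE",
    10: "REMOVE", 11: "RENAME", 12: "LINK", 13: "SYMLINK", 14: "MKDIR",
    15: "RMDIR", 16: "READDIR", 17: "STATFS",
}
NFS_PROGRAM = 100003  # RPC program number assigned to NFS


def decode_rpc_call(payload):
    """Decode the first six 32-bit words of an RPC call (RFC 1057).

    `payload` is assumed to be the UDP payload as bytes. Returns a dict,
    or None if the message is not an RPC version 2 call.
    """
    if len(payload) < 24:
        return None
    # All fields are big-endian unsigned 32-bit integers (XDR encoding).
    xid, msg_type, rpcvers, prog, vers, proc = struct.unpack(">6I", payload[:24])
    if msg_type != 0 or rpcvers != 2:  # msg_type 0 = CALL, 1 = REPLY
        return None
    name = None
    if prog == NFS_PROGRAM and vers == 2:
        name = NFS2_PROCS.get(proc)
    return {"xid": xid, "prog": prog, "vers": vers, "proc": proc, "name": name}


# Hypothetical sample: header of an NFS v2 RENAME call with a made-up xid.
sample = struct.pack(">6I", 0x1234, 0, 2, NFS_PROGRAM, 2, 11)
print(decode_rpc_call(sample))
```

Replies carry only the xid and a status, not the procedure number, so to label a reply you have to remember the xid of each call and match the reply against it.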
    
    But, compulsive personality that I am, I have had a look at your trace.
    It's a server problem.  
    
    For some background on why the UCX client does what it does, have a
    look at #4985.* in this conference.
    
    seq # 9, 10
    client does readdir; response shows directory containing only a.a
    
    some lookups and getattrs
    
    seq # 18, 19
    client does lookup on .$NFS$9b4043 to make sure it does not already
    exist; response confirms it does not
    
    seq # 20, 21
    client renames a.a to .$NFS$9b4043; gets successful status returned
    
    seq # 22, 23
    client does lookup on .$NFS$9b4043; response claims there is no such
    file!
    
    seq # 24, 25
    client checks directory attributes; modification time has changed since
    last attribute check before the rename
    
    seq # 26, 27
    client does readdir; response shows directory now contains only
    .$nfs$9b4043 (note the wrong case!)
    
    client does more housekeeping, including caching attributes on this
    .$nfs$9b4043 file
    
    seq # 37, 38
    client tries renaming .$NFS$9b4043 to b.b; server is still sure there
    is no such file
    
    seq # 39, 40
    client tries to back out the first rename by renaming .$NFS$9b4043 back
    to a.a; server is still sure there is no such file
    
    At this point the client gives up trying to go forward or backward on
    the rename from a.a to b.b.
    
    Julie
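
The sequence in the trace above can be replayed with a small Python model. This is only a sketch under stated assumptions: CaseFoldingServer is a hypothetical stand-in that stores the new name in lower case on RENAME but compares names case-sensitively afterwards, which is one behavior that would produce the responses seen in the trace (the trace itself only shows the symptom, not the server's internals). client_rename follows the client steps listed above and leaves out the getattr and readdir housekeeping; it is not the UCX client code.

```python
class CaseFoldingServer:
    """Hypothetical server: RENAME stores the new name in lower case,
    while later name comparisons are case-sensitive."""

    def __init__(self, names, fold_case=True):
        self.names = set(names)
        self.fold_case = fold_case

    def lookup(self, name):
        return name in self.names

    def rename(self, old, new):
        if old not in self.names:
            return False  # corresponds to a "no such file" status
        self.names.remove(old)
        self.names.add(new.lower() if self.fold_case else new)
        return True

    def readdir(self):
        return sorted(self.names)


def client_rename(server, old, new, temp=".$NFS$9b4043"):
    """Rename via a temporary name, with a back-out step on failure."""
    if server.lookup(temp):            # make sure the temp name is free
        return "temp name already exists"
    if not server.rename(old, temp):   # first rename: old -> temp
        return "first rename failed"
    if server.rename(temp, new):       # second rename: temp -> new
        return "renamed"
    if server.rename(temp, old):       # back out: temp -> old
        return "backed out"
    return "file lost"                 # neither forward nor backward worked


broken = CaseFoldingServer(["a.a"])
print(client_rename(broken, "a.a", "b.b"), broken.readdir())

well_behaved = CaseFoldingServer(["a.a"], fold_case=False)
print(client_rename(well_behaved, "a.a", "b.b"), well_behaved.readdir())
```

With the case-folding model the client ends in the "file lost" branch and the directory holds only the lower-case temporary name, matching the trace; with fold_case=False the same client steps complete normally.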
5516.4. "thanks" by UTRTSC::KNOPPERS (Oswald Knoppers) Tue May 27 1997 06:32 (8 lines)
>    But, compulsive personality that I am, I have had a look at your trace.
>    It's a server problem.  

Thanks for the analysis, I've informed the customer.

Regards,

Oswald