
Conference noted::hackers_v1

Title: -={ H A C K E R S }=-
Notice: Write locked - see NOTED::HACKERS
Moderator: DIEHRD::MORRIS
Created: Thu Feb 20 1986
Last Modified: Mon Aug 03 1992
Last Successful Update: Fri Jun 06 1997
Number of topics: 680
Total number of notes: 5456

446.0. "Efficient way to append RMS sequential files" by CREDIT::ROYAL () Wed Apr 15 1987 11:23

    
       What I'd like to do from a program or from DCL is to create an
    output file which consists of appended input files.  I know I can
    use the DCL APPEND command, but I'd like to do this in a highly
    efficient manner.  Basically I'd like to eliminate file copies;
    if there's a way I could manipulate the EOF and BEOF RMS
    pointers to create this file, that would seem most efficient.  Note
    that the input files do not need to exist after they are appended
    to the output file.  This seems like a real hack, so that's why
    I'm asking you seasoned hackers out there!!
    
       Thanks in advance for any suggestions you might have.
    
                        -- Phil
T.R  Title  User  Personal Name  Date  Lines
446.1. "" by PASTIS::MONAHAN () Thu Apr 16 1987 06:36 (18 lines)
    	A *real* hacker would just patch the file headers, so that the
    several files just became one file with extension headers. This
    will only work if you can guarantee that all the files will be on
    the same disk.
    
    	This should be fairly easy, but if you are concerned about
    efficient later access too, then you should repack the mapping pointers
    into the earlier file headers. This gets more messy, because then
    you might actually free up a file header, and have to dabble with
    the index file bitmap, which means locking the index file, ...
    
    	Better stick to linking the file headers together, and wait
    for an image backup/restore to do the repacking.
    
    	Don't blame me if you corrupt your disk, and I think only Eric
    would try to do it from DCL!
    
    		Dave
446.2. "" by CAFEIN::PFAU (Now where did I leave my marbles?) Thu Apr 16 1987 12:24 (8 lines)
    I don't think that will work.  What happens to the space left over in
    the last blocks of the intermediate files?  Unless the files in
    question use all of the space in their last blocks, RMS will think the
    new file is corrupted.  Also, take into account disk cluster factors.
    There may be unused blocks containing garbage after the end of the
    file. 
    
    tom_p
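
To put numbers on the two kinds of leftover space .2 mentions, here is a rough Python sketch. It assumes 512-byte blocks and that a file's allocation is just its used blocks rounded up to a whole cluster (no preallocation, no extra extents). The names slack, eof_block, first_free_byte and cluster_factor are made up for illustration; they are not RMS or XQP identifiers.

```python
BLOCK_SIZE = 512  # bytes per disk block (assumed)


def slack(eof_block, first_free_byte, cluster_factor):
    """Return (unused bytes in the last block, unused whole blocks after it).

    eof_block is 1-based; first_free_byte is the offset of the first
    unused byte inside that block. All parameter names are hypothetical.
    """
    # If the first free byte is 0, the data ends exactly at the end of
    # the previous block and the EOF block itself holds no data.
    used_blocks = eof_block if first_free_byte > 0 else eof_block - 1
    # Assumption: allocation is the used blocks rounded up to a cluster.
    allocated = -(-used_blocks // cluster_factor) * cluster_factor
    tail_bytes = BLOCK_SIZE - first_free_byte if first_free_byte > 0 else 0
    return tail_bytes, allocated - used_blocks


if __name__ == "__main__":
    # A file ending at block 8, byte 186, on a disk with cluster factor 3:
    print(slack(8, 186, 3))  # (326, 1)
```

Both numbers are space that would land in the middle of a merged file if the headers were simply chained together.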
446.3. "Oh well...." by DEBIT::ROYAL () Fri Apr 17 1987 15:01 (7 lines)
    
         Looks like I'll have to bag this idea, since I doubt I'd be
    able to ship a product that does this and can't guarantee any kind
    of integrity.  It would be a good hack though.
    
                          -- Phil
    
446.4. "Flag to indicate end of information in block" by DELNI::CANTOR (Dave C.) Mon Apr 20 1987 08:57 (10 lines)
      Re .2
      
      If the files have variable-length records, there's a flag word
      of all one-bits which goes in the place for the length of the
      next record and indicates that there are no more records in
      the block.   So the last block of each constituent file would
      have to be checked, and possibly munged, but it still looks
      possible.
      
      Dave C.
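
A small Python sketch of how a reader might walk one block of variable-length records and stop at the flag word described in .4. It assumes a 2-byte little-endian length before each record, records starting on even byte boundaries, and no record spanning a block boundary. The function names pack_block and records_in_block are illustrative only, not RMS routines.

```python
import struct

BLOCK_SIZE = 512       # bytes per disk block (assumed)
END_OF_BLOCK = 0xFFFF  # length word of all one-bits: no more records here


def pack_block(records):
    """Build one block: length word + data per record, then the flag word."""
    out = bytearray()
    for rec in records:
        out += struct.pack("<H", len(rec)) + rec
        if len(rec) & 1:          # pad odd-length records to an even boundary
            out += b"\x00"
    if len(out) + 2 > BLOCK_SIZE:
        raise ValueError("records plus flag word do not fit in one block")
    out += struct.pack("<H", END_OF_BLOCK)
    return bytes(out.ljust(BLOCK_SIZE, b"\x00"))


def records_in_block(block):
    """Yield each record in the block, stopping at the flag word."""
    pos = 0
    while pos + 2 <= len(block):
        (length,) = struct.unpack_from("<H", block, pos)
        if length == END_OF_BLOCK:
            return                # rest of the block is not record data
        pos += 2
        if pos + length > len(block):
            raise ValueError("record runs past the end of the block")
        yield block[pos:pos + length]
        pos += length + (length & 1)


if __name__ == "__main__":
    block = pack_block([b"abc", b"hello!"])
    print(list(records_in_block(block)))
```

Under these assumptions, munging the last block of a constituent file would amount to writing the flag word right after its final record.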
446.5. "" by PASTIS::MONAHAN () Mon Apr 20 1987 11:38 (21 lines)
    	As .2 said, there could be several unused blocks at the end
    of each file because of disk cluster factor or file extents. It
    would look a rather funny file to have the flag word as the first
    word in several successive blocks in the middle of the merged file,
    but it sounds as if it should be a legal structure. Of course, an
    extent that was larger than the last disk cluster could be pruned
    in the merge.
    
    	Connected with the flag word, we have recently discovered an
    anomaly between how RMS and XQP treat it. Directory files are nominally
    the same structure as RMS sequential files, but are normally accessed
    via XQP instead. RMS is fairly tolerant about the flag word, just
    treating it as an indication to skip to the next block earlier than
    it might have expected. XQP expects to find the flag word indicating
    no more records in the block after the last record in every block,
    and will even move a record into the next block to make room for
    the flag word.
    
    	We discovered this using RMS to write a directory file. RMS
    would not put in a flag word if there was no need for one, and XQP
    would get upset when it tried to read the directory later.
446.6. "That flag word" by MAY20::MINOW (I need a vacation) Wed Apr 22 1987 15:23 (18 lines)
re: .4
      If the files have variable-length records, there's a flag word
      of all one-bits which goes in the place for the length of the
      next record and indicates that there are no more records in
      the block.

Watch it: that flag word is not part of the file (according to the
RMS "last block byte count" or whatever it's called).  If you
copy the file, the flag word will (may?) disappear.

To prove this, just copy a large bunch of .exe's (such as SYS$SYSTEM)
to another directory, then compare the results using DIFFERENCES/MODE=HEX.
Some of the files will be different.

I found that out the hard way.

Martin. 

446.7. ".EXE files aren't variable-length" by DELNI::CANTOR (Dave C.) Fri Apr 24 1987 02:11 (8 lines)
      Re .6
      
      >To prove this, just copy a large bunch of .exe's ....

      But .EXE's don't have variable-length records; they have fixed-
      length records of 512 bytes each (full blocks).
      
      Dave C.
446.8. "But .EXEs do have variable length records..." by UFP::MURPHY (European or African Swallow?) Fri Apr 24 1987 10:44 (20 lines)
    Re: .7
    But .EXE's have variable-length records embedded in them; the debug
    information and traceback information is encoded exactly like RMS
    variable length records.
    Here's part of the DUMP/HEADER from an image file linked /DEBUG:
	-Rick

                             File Header
	...
    VAX-11 RMS attributes
        Record type:                      Fixed
        File organization:                Sequential
        Record attributes:                <none specified>
        Record size:                      512
        Highest block:                    9
        End of file block:                8
        End of file byte:                 186		<--- Note!
        Bucket size:                      0
        Fixed control area size:          0
        Maximum record size:              512
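
As a worked example of what the flagged line implies, this Python sketch turns the header fields above into a byte count. It assumes 512-byte blocks, that "End of file block" is 1-based, and that "End of file byte" is the offset of the first unused byte in that block. The function name logical_size is made up for illustration.

```python
BLOCK_SIZE = 512  # bytes per disk block (assumed)


def logical_size(eof_block, eof_byte):
    """Bytes that the header says belong to the file."""
    return (eof_block - 1) * BLOCK_SIZE + eof_byte


if __name__ == "__main__":
    size = logical_size(8, 186)       # values from the dump above
    allocated = 9 * BLOCK_SIZE        # "Highest block: 9"
    print(size)                       # 3770
    print(allocated - size)           # 838 bytes allocated but not in the file
```

So a file marked as fixed 512-byte records ends partway through a record, 186 bytes into block 8.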
446.9. "Misinformation in header" by DELNI::CANTOR (Dave C.) Sat Apr 25 1987 02:17 (9 lines)
      Re .8
      
      I consider that to be a bug.  It looks like purposeful
      misinformation.   If the entire .EXE file consists of variable
      length records in normal format, then the file should be so
      marked in the header; if not, then perhaps the record type
      should be shown as "undefined" rather than fixed-512.
      
      Dave C.
446.10. "Why, it's a feature!" by SQM::HALLYB (Are all the good ones taken?) Sat Apr 25 1987 18:28 (12 lines)
    I think the/one reason for the discrepancy is that prior to V4.4,
    the only shared sequential format was fixed-512.  In fact, images
    have a complicated internal structure with ISDs, symbols, code,
    flags, etc., etc.  Logically these would be separate records but
    in fact the format is defined sort of under-the-table.
    
    Net, the header "misinformation" is not a bug, it's the tip of the
    iceberg.  And since it's been that way since V1, odds are it really
    is an "implementation aspect".  (== a bug that does what we want,
    even if it looks strange).
    
      John
446.11. "Last block byte count" by MAY20::MINOW (I need a vacation) Mon Apr 27 1987 10:56 (12 lines)
Let me try (.6) again.  Although .exe's are stored as fixed-block, 512
byte records, RMS also stores a byte count for the last block in the file.
When you COPY a file, COPY notices the partial block count and copies
only the bytes that RMS says are in the file.

Other utility programs (such as DIFFERENCES) ignore the byte count,
comparing the garbage that the linker put in the .EXE (after the
"last byte") with the zeros that the disk controller filled the
last block of the copied file with.

Martin.
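
The effect described in .11 can be modelled in a few lines of Python. This is only a simulation under stated assumptions: 512-byte blocks, a copy that keeps the bytes up to the last-block byte count and zero-fills the rest of that block, and 0xAA standing in for whatever garbage the linker left behind. The name copy_logical is hypothetical.

```python
BLOCK_SIZE = 512  # bytes per disk block (assumed)


def copy_logical(data, eof_block, eof_byte):
    """Model a copy that keeps only the logical bytes, then zero-fills
    the remainder of the last block."""
    size = (eof_block - 1) * BLOCK_SIZE + eof_byte
    padded = -(-size // BLOCK_SIZE) * BLOCK_SIZE   # round up to a whole block
    return data[:size].ljust(padded, b"\x00")


# An 8-block "image" whose logical end is at block 8, byte 186.
size = 7 * BLOCK_SIZE + 186
payload = bytes(i % 251 for i in range(size))
garbage = b"\xAA" * (8 * BLOCK_SIZE - size)        # stand-in for linker leftovers
original = payload + garbage
copy = copy_logical(original, 8, 186)

same_logical = original[:size] == copy[:size]      # compare up to the byte count
same_blocks = original == copy                     # compare whole blocks

if __name__ == "__main__":
    print(same_logical, same_blocks)               # True False
```

A tool that honours the byte count sees identical files; one that compares whole blocks reports a difference in the last block.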