
Conference ulysse::rdb_vms_competition

Title:DEC Rdb against the World
Moderator:HERON::GODFRIND
Created:Fri Jun 12 1987
Last Modified:Thu Feb 23 1995
Last Successful Update:Fri Jun 06 1997
Number of topics:1348
Total number of notes:5438

1063.0. "Benchmarking against Adabas" by NOT003::DENTI (Ian Dent @NOT, Nottingham, UK) Fri Jan 17 1992 17:40

                       Suggestions for Benchmarking Adabas

	This document was prepared to discuss some areas to consider when
	running a benchmark of Rdb against Adabas on an IBM mainframe.        

        The following details some of the limits and restrictions in 
        Software AG's Adabas product for the IBM mainframe that I am 
        aware of. Note that my knowledge is 2 years old and some of the 
        limits may have been superseded. Any comments, corrections or 
        latest information would be welcomed.
        
        Records per table
        
        Each table (called a file) in an Adabas database is restricted to 
        16.7 million records. I know that Software AG have worked around 
        this limit by using "user exits". Each call made to the database 
        software can be intercepted by a user exit (a bit of code written 
        by the user and linked in with the Adabas image) and the call 
        changed if necessary. For large logical files (over 16 million 
        records) user exits have been used to partition the data over a 
        number of real Adabas files. For instance, to store 50 million 
        records, 4 Adabas files would be used (A, B, C, D). The user exit 
        would interrogate the database key and issue the correct database 
        command to A, B, C or D.
        
        I believe that Software AG have developed a product called 
        something like "Large File Option" which implements this feature 
        in a "product". 
        
 	There may be an overhead in executing the user exit for each 
	database command.
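        To make the partitioning idea concrete, the routing logic amounts 
        to something like the Python sketch below. This is only an 
        illustration of the idea; the real user exit is written against 
        Software AG's mainframe interface, and the file names and 
        per-file limit used here are assumptions for the example.

        # Sketch of the partitioning done via an Adabas "user exit".
        # NOT the real user-exit interface; it only shows how one logical
        # file of 50 million records might be routed across 4 physical
        # Adabas files (A, B, C, D), each holding under 16.7 million
        # records.

        ADABAS_FILE_LIMIT = 16_700_000      # records per physical file
        PARTITIONS = ["A", "B", "C", "D"]   # hypothetical physical files

        def route_command(record_key: int) -> str:
            """Return the physical file that should receive the database
            command for the given logical record key."""
            index = record_key // ADABAS_FILE_LIMIT
            if index >= len(PARTITIONS):
                raise ValueError("key outside the partitioned range")
            return PARTITIONS[index]

        # Example: logical record 40,000,000 lands in the third file.
        print(route_command(40_000_000))    # -> "C"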
        
        Tables per database
        
        Each database can have up to 255 tables (files in SAG jargon). Of 
        these about 20 are allocated for system information. 
        
        Databases per system
        
        There is no restriction on the number of databases that can be 
        run on the same system. Each database has all disk IO passed 
        through a single multi-threaded database nucleus process.
        
        If more than 1 database is to be run on the same system then the 
        restrictions on updates across the multiple databases should be 
        pointed out to customers. In order to implement 2 phase commit 
        I believe an extra product called Entire is needed. If this is 
        not used then updates can be made to the multiple databases but 
        without the protection of 2PC.
        
        Repeating Groups
        
        Adabas is not relational and allows repeating groups of fields to 
        be defined within a record (called periodic groups in SAG 
        jargon). There is a restriction of a maximum of 99 occurrences of 
        each repeating group.
        
        This feature allows owner-member relationships to be defined in a 
        single table and allows all related data to be retrieved in one 
        record IO.
        
        If you are specifying owner-member type relationships in the 
        specification then try and allow a maximum of more than 100 
        members per owner in order to preclude the use of periodic 
        groups.
        
        Multi-valued fields
        
        Much like periodic groups Adabas allows a single field to be 
        defined that can have multiple values (often used for, say, 
        address lines). There is a restriction of 191 on the number of 
        occurrences of the field that can be stored.
        
        This feature allows IOs to be reduced.
        
        Try to define the benchmark such that the maximum possible number 
        of entries for a field is over 191. This will ensure that the 
        field must be defined in a separate table.
        
        Multi-valued fields can be defined within a periodic group to 
        get many more occurrences allowed but at the expense of more
	programming effort.
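        To make the occurrence arithmetic concrete, here is a small 
        Python sketch of the limits discussed above; the function and 
        the example numbers are invented for illustration, only the 
        limits of 99 and 191 come from Adabas.

        # Illustration of the Adabas occurrence limits described above.
        MAX_PE_OCCURRENCES = 99    # periodic (repeating) group occurrences
        MAX_MU_OCCURRENCES = 191   # multi-valued field occurrences

        def fits_in_one_record(members_per_owner: int,
                               values_per_field: int) -> bool:
            """True if the owner-member data could be held in a single
            Adabas record using a periodic group and a multi-valued
            field."""
            return (members_per_owner <= MAX_PE_OCCURRENCES
                    and values_per_field <= MAX_MU_OCCURRENCES)

        print(fits_in_one_record(80, 150))   # True  - one record IO is enough
        print(fits_in_one_record(120, 150))  # False - members need another file
        print(fits_in_one_record(80, 250))   # False - values need another file

        # Nesting a multi-valued field inside a periodic group raises the
        # ceiling (at the cost of more programming effort, as noted above):
        print(MAX_PE_OCCURRENCES * MAX_MU_OCCURRENCES)   # 18909 values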
        
        Locking
        
        As all IO to a database is done from a single multi-threaded 
        nucleus process, the locking between users does not have to incur 
        the overhead of process communication. This means that locking is 
        very efficient. 
        
        Try and minimise the locking overhead in the benchmark. If 
        possible allow users to lock at the table level in order to 
        reduce the number of locks.
        
        Separation of data and indexes
        
        Adabas stores all data records in up to 16 DATA disk files. All 
        indexes are stored in up to 16 ASSO files. This means that data 
        cannot be stored together with the associated index.
        
        Try and find a benchmark that will allow Rdb to make use of data 
        and index clustering.
        
        Number of disk files
        
        Adabas allocates up to 16 DATA and 16 ASSO files on disk in order 
        to store all indexes and data records. Depending on how the IBM 
        mainframe operating system allows disks to be grouped together, 
        there may be a restriction on how many disk files are needed.
        
        WORK area
        
        Adabas uses a single WORK file to store transaction information 
        and for sorting. In order to get good performance from Adabas 
        this should be on a fast device such as a solid state disk. If 
        not on a fast disk this can become a bottleneck in Adabas.
        
        Think about restricting the use of solid state disks in the 
        benchmark or cast doubt on the use of a solid state disk for the 
        WORK area. WORK is used to recover incomplete transactions in the 
        event of a system failure.
        
        Global buffering
        
        Adabas does all IO through a single nucleus. This nucleus buffers 
        data and indexes very efficiently and in a normal time-sharing 
        environment the global buffering gives very good performance.
        
        For the benchmark, try and ensure that each user does not share 
        data and indexes with other users in order to restrict the 
        benefit of buffering.
        
        Ensure that the nucleus buffers are not pre-loaded at the 
        beginning of each benchmark run.
        
        Compound keys
        
        Adabas allows a superdescriptor to be defined which can consist 
        of up to 5 complete fields, or portions of fields, within a file.
        
        Try and ensure that some compound keys in the benchmark need 6 or 
        more portions. 
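        A superdescriptor is essentially a search key built by 
        concatenating up to 5 complete fields or byte ranges of fields. 
        A rough Python equivalent is sketched below; the record layout 
        and byte ranges are invented for illustration.

        # Sketch of what an Adabas superdescriptor amounts to: a key built
        # from up to 5 complete fields or portions (byte ranges) of fields.
        record = {
            "surname":  "DENT      ",   # 10 bytes
            "postcode": "NG1 5AB ",     # 8 bytes
            "account":  "00471234",     # 8 bytes
        }

        # Each component is (field name, start offset, length); Adabas
        # allows at most 5 such components in one superdescriptor.
        components = [
            ("surname",  0, 4),   # first 4 bytes of the surname
            ("postcode", 0, 3),   # outward part of the postcode
            ("account",  4, 4),   # last 4 digits of the account number
        ]
        assert len(components) <= 5, "at most 5 portions per superdescriptor"

        superdescriptor = "".join(record[name][start:start + length]
                                  for name, start, length in components)
        print(superdescriptor)   # -> "DENTNG11234"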
        
        Networking
        
        Any access between machines in a network will be done using 
        Adanet which passes the database commands from the originating 
        machine to the machine running the nucleus via any supported 
        networking protocol. Between IBM machines this is likely to be 
        very efficient and should be avoided in the benchmark if 
        possible.
        
        Journalling
        
        All transactions against an Adabas database are written to a 
        single PLOG (protection log). This can become a bottleneck for 
        Adabas and to get good performance should be positioned on a 
        solid state disk.
        
        As this is used to recover and roll forward from a disaster, 
        spread doubt about it being on a solid state disk and try and 
        insist that it is on as slow a device as possible. 
        
        Protection logging can be turned off. If Rdb is to run with after 
        image journalling then insist that the PLOG is on.
        
        ISNs
        
        Adabas stores each record using an ISN (internal sequence number) 
        which corresponds to an Rdb dbkey.
        
        Adabas allows the ISN to be user-specified such that, for 
        example, a unique customer number can be used as the ISN within a 
        file. As records can be accessed directly via ISN, this gives a 
        very quick access method.    
        
        Try and ensure that there is no suitable user field that can be 
        used as a user defined ISN.
        
        A common technique in Adabas is to store the ISN of a related 
        record as a field in a record. This allows related records to be 
        accessed very quickly once the initial record has been read. Try 
        and outlaw this technique from the benchmark.
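        The stored-ISN technique is essentially a hand-maintained 
        pointer: a record carries the ISN of its related record, so the 
        second read is a direct fetch rather than an index lookup. A 
        minimal Python sketch follows, with dictionaries standing in for 
        Adabas files and invented field names.

        # Sketch of the "store the ISN of a related record" technique.
        # Dictionaries stand in for Adabas files; keys play the role of ISNs.
        orders = {
            101: {"order_no": "A-1001", "customer_isn": 7},
            102: {"order_no": "A-1002", "customer_isn": 9},
        }
        customers = {
            7: {"name": "ACME LTD"},
            9: {"name": "GLOBEX PLC"},
        }

        def read_order_with_customer(order_isn: int):
            """Read an order, then fetch its customer directly via the
            stored ISN, with no index lookup for the second read."""
            order = orders[order_isn]                    # first read
            customer = customers[order["customer_isn"]]  # direct read by ISN
            return order, customer

        print(read_order_with_customer(101))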
        
        Counting
        
        Counts of records in a file matching a certain index value are 
        very quick and should be avoided.
        
        Data loading and backups
        
        Initial loading of data and backups are very quick. Try and avoid 
        testing data loading and backup speeds against Rdb.
        
        Recovery from disaster
        
        Roll forward of the database after a disaster and the restoration 
        of a backup is very quick. I believe that the journal only 
        contains updates to data and index records and does not contain 
        information on meta-data changes. If a metadata change has been 
        made then the roll forward will stop at the time of the metadata 
        change. The user must then make the same metadata change before 
        continuing with the roll forward.
        
        If you must test the speed of recovery from a disaster try and 
        include a metadata update in the transactions being recovered.
        
        Data compression
        
        Adabas compresses trailing spaces on text fields, leading zeroes 
        on numeric fields and removes null values.
        
        Try and bias the data such that Rdb compression methods can be 
        used but that Adabas ones cannot.
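        As a rough picture of what that compression does (this is only an 
        approximation of the idea, not Software AG's actual on-disk 
        format), consider the following Python sketch; the field names 
        and values are invented.

        # Approximate illustration of the Adabas compression described
        # above: trailing spaces dropped from text fields, leading zeroes
        # from numeric fields, and empty (null) values suppressed.
        def compress_text(value: str) -> str:
            return value.rstrip(" ")

        def compress_numeric(value: str) -> str:
            stripped = value.lstrip("0")
            return stripped if stripped else "0"

        fields = {"name": "SMITH     ", "balance": "0000012345", "phone": ""}

        compressed = {}
        for name, value in fields.items():
            if value == "":
                continue                    # null value suppressed entirely
            elif value.isdigit():
                compressed[name] = compress_numeric(value)
            else:
                compressed[name] = compress_text(value)

        print(compressed)   # {'name': 'SMITH', 'balance': '12345'}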
        
        Hash keys
        
        Adabas has a structure called an ADAM file which is like defining 
        a hash key and storing the records based on that key. This is 
        fairly simple. 
        
        If possible try and come up with a complicated situation using 
        multiple hash keys that Adabas cannot use.
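        The idea behind an ADAM file is simply that the record's storage 
        location is derived from the key itself, so a read on that key 
        can go straight to the right block. A minimal sketch of that 
        placement follows; the hash function and block count are 
        invented, Adabas uses its own scheme.

        # Sketch of the idea behind an Adabas ADAM file: the key itself
        # determines where the record is stored.
        import zlib

        NUMBER_OF_BLOCKS = 1_000   # hypothetical size of the DATA area

        def block_for_key(key: str) -> int:
            """Map a record key to a block number."""
            return zlib.crc32(key.encode()) % NUMBER_OF_BLOCKS

        # The same key always maps to the same block, so a read by key
        # needs no index lookup at all.
        print(block_for_key("CUST-00042"))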
        
        Page size
        
        Data and index records are stored in the database on pages whose 
        size is dependent on the disk device being used. The standard 
        page size is about 5000 bytes for data and 2500 for index 
        records.
        
        Try and ensure that some records are defined to be very long 
        (i.e. over 5000 bytes). Adabas does not allow records to be 
        fragmented over pages and thus the database will need to be 
        defined to include some user-specified segmentation. For example, 
        a logical record of length 9000 could be stored as two records, 
        each with a field holding the segment number (1 or 2) and the 
        actual segment of data (4500 bytes).
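        A small Python sketch of that segmentation follows; the segment 
        size of 4500 bytes comes from the example above, while the field 
        layout is invented for illustration.

        # Sketch of the user-specified segmentation described above: a
        # 9000-byte logical record split into segments that each fit on a
        # roughly 5000-byte data page. Not an actual Adabas format.
        SEGMENT_SIZE = 4500   # payload bytes per stored record

        def split_into_segments(logical_record: bytes) -> list:
            """Split a long logical record into stored records, each
            carrying a segment number plus up to SEGMENT_SIZE bytes."""
            return [
                {"segment_no": i + 1,
                 "data": logical_record[offset:offset + SEGMENT_SIZE]}
                for i, offset in enumerate(
                    range(0, len(logical_record), SEGMENT_SIZE))
            ]

        record = bytes(9000)               # a 9000-byte logical record
        segments = split_into_segments(record)
        print(len(segments), [len(s["data"]) for s in segments])  # 2 [4500, 4500]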
        
1063.1. "A few comments" by COPCLU::BRUNSGAARD (Curriculum Vitae, who's that ??) Wed Jan 22 1992 12:42
    Hi Ian,
    
    thanks for posting this, but I have a few comments.
    
    >>  I know that my knowledge is 2 years old and some of the 
    >>  limits may have been superseded. Any comments, corrections or latest
    >>	information would be welcomed.
        Remember this is also true with Rdb !
    
    
    >>  Counts of records in a file matching a certain index value are
    >>  very quick and should be avoided.
    Also, since counts are not very common in real applications, bias
    more towards EXISTS operations, since they are handled excellently
    in Rdb.
    
    >>Data loading and backups
    >>Initial loading of data and backups are very quick. Try and avoid
    >> testing data loading and backup speeds against Rdb.
    If you ask me then these two things have nothing to do with each other!
    Load is one thing.
    Backup is completely different.
    a) You are probably right about loading data, though I would be
    surprised if the performance of Adabas were far superior to Rdb's.
    b) I would be EXTREMELY surprised if Adabas were even close to Rdb's
    rates and functionality here !!
    I think that Rdb can kill any competitor in this area:
     Online backup of db, db-area and aij
     online restore of db-area and aij recovery (in V4.1)
    
    >>        Data compression
    
    >>        Adabas compresses trailing spaces on text fields, leading zeroes
    >>        on numeric fields and removes null values.
    
    >>        Try and bias the data such that Rdb compression methods can be
    >>        used but that Adabas ones cannot.
    or try to make Adabas compression do its best, as this is bound to
    cost a lot of CPU power to do this column-level wrap/unwrap.
    Or even make use of compression optional (not allowed) in the
    benchmark.
    
    Just my comments
    Lars