Title: | DEC Rdb against the World |
Moderator: | HERON::GODFRIND |
Created: | Fri Jun 12 1987 |
Last Modified: | Thu Feb 23 1995 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 1348 |
Total number of notes: | 5438 |
Suggestions for Benchmarking Adabas

This document was prepared to discuss some areas to consider when running a benchmark of Rdb against Adabas on an IBM mainframe. The following details some of the limits and restrictions in Software AG's Adabas product for the IBM mainframe that I am aware of. Note that my knowledge is two years old and some of the limits may have been superseded. Any comments, corrections or later information would be welcomed.

Records per table

Each table (called a file) in an Adabas database is restricted to 16.7 million records. I know that Software AG has worked around this limit by using "user exits". Each call made to the database software can be intercepted by a user exit (a piece of code written by the user and linked in with the Adabas image) and the call changed if necessary. For large logical files (over 16 million records) user exits have been used to partition the data over a number of real Adabas files. For instance, to store 50 million records, four Adabas files would be used (A, B, C, D). The user exit would interrogate the database key and issue the correct database command to A, B, C or D. I believe that Software AG has developed a product called something like "Large File Option" which implements this feature as a "product". There may be an overhead in executing the user exit for each database command.

Tables per database

Each database can have up to 255 tables (files in SAG jargon). Of these, about 20 are allocated for system information.

Databases per system

There is no restriction on the number of databases that can be run on the same system. Each database has all disk IO passed through a single multi-threaded database nucleus process. If more than one database is to be run on the same system, then the restrictions on updates across the multiple databases should be pointed out to customers. In order to implement two-phase commit I believe an extra product called Entire is needed. If this is not used then updates can be made to the multiple databases, but without the protection of 2PC.

Repeating groups

Adabas is not relational and allows repeating groups of fields to be defined within a record (called periodic groups in SAG jargon). There is a restriction of a maximum of 99 occurrences of each repeating group. This feature allows owner-member relationships to be defined in a single table and allows all related data to be retrieved in one record IO. If you are specifying owner-member type relationships in the specification, then try and require more than 99 members per owner in order to preclude the use of periodic groups.

Multi-valued fields

Much like periodic groups, Adabas allows a single field to be defined that can have multiple values (often used for, say, address lines). There is a restriction of 191 on the number of occurrences of the field that can be stored. This feature allows IOs to be reduced. Try to define the benchmark such that the maximum possible number of entries for a field is over 191. This will ensure that the field must be defined in a separate table. Multi-valued fields can be defined within a periodic group to get many more occurrences, but at the expense of more programming effort.

Locking

As all IO to a database is done from a single multi-threaded nucleus process, the locking between users does not have to incur the overhead of process communication. This means that locking is very efficient. Try and minimise the locking overhead in the benchmark. If possible, allow users to lock at the table level in order to reduce the number of locks.

Separation of data and indexes

Adabas stores all data records in up to 16 DATA disk files. All indexes are stored in up to 16 ASSO files. This means that data cannot be stored together with the associated index. Try and find a benchmark that will allow Rdb to make use of data and index clustering.
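The user-exit partitioning described under "Records per table" amounts to routing each database command to one of several physical files by key. A minimal sketch, assuming a simple range-based routing scheme (the key ranges, file names and function names are hypothetical illustrations, not Adabas features):

```python
# Sketch of user-exit style partitioning: route a logical record key to one
# of four physical Adabas files (A, B, C, D), each holding under 16.7M
# records. The 16.7M limit and the four-way split come from the text above;
# the range-based routing scheme itself is a hypothetical illustration.

FILE_LIMIT = 16_700_000          # records per physical Adabas file
PARTITIONS = ["A", "B", "C", "D"]

def route(record_key: int) -> str:
    """Pick the physical file for a logical record key by key range."""
    index = record_key // FILE_LIMIT
    if index >= len(PARTITIONS):
        raise ValueError("logical file full: another partition is needed")
    return PARTITIONS[index]

# A logical file of 50 million records spreads over all four partitions:
assert route(0) == "A"
assert route(20_000_000) == "B"
assert route(49_999_999) == "C"
```

A real user exit would do this inside the Adabas call path, which is where the per-command overhead mentioned above comes from.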
Number of disk files

Adabas allocates up to 16 DATA and 16 ASSO files on disk in order to store all indexes and data records. Depending on how the IBM mainframe operating system allows disks to be grouped together, there may be a restriction on how many disk files can be used.

WORK area

Adabas uses a single WORK file to store transaction information and for sorting. In order to get good performance from Adabas, this should be on a fast device such as a solid state disk. If not on a fast disk, it can become a bottleneck in Adabas. Think about restricting the use of solid state disks in the benchmark, or cast doubt on the use of a solid state disk for the WORK area. WORK is used to recover incomplete transactions in the event of a system failure.

Global buffering

Adabas does all IO through a single nucleus. This nucleus buffers data and indexes very efficiently, and in a normal time-sharing environment the global buffering gives very good performance. For the benchmark, try and ensure that each user does not share data and indexes with other users in order to restrict the benefit of buffering. Ensure that the nucleus buffers are not pre-loaded at the beginning of each benchmark run.

Compound keys

Adabas allows a superdescriptor to be defined which can consist of up to 5 complete fields, or portions of fields, within a file. Try and ensure that some compound keys in the benchmark need 6 or more portions.

Networking

Any access between machines in a network is done using Adanet, which passes the database commands from the originating machine to the machine running the nucleus via any supported networking protocol. Between IBM machines this is likely to be very efficient and should be avoided in the benchmark if possible.

Journalling

All transactions against an Adabas database are written to a single PLOG (protection log). This can become a bottleneck for Adabas and, to get good performance, the PLOG should be positioned on a solid state disk. As this is used to recover and roll forward from a disaster, spread doubt about it being on a solid state disk and try and insist that it is on as slow a device as possible. Protection logging can be turned off. If Rdb is to run with after image journalling, then insist that the PLOG is on.

ISNs

Adabas stores each record using an ISN (internal sequence number), which corresponds to an Rdb dbkey. Adabas allows the ISN to be user-specified such that, for example, a unique customer number can be used as the ISN within a file. As records can be accessed directly via ISN, this is a very quick access method. Try and ensure that there is no suitable user field that can be used as a user-defined ISN. A common technique in Adabas is to store the ISN of a related record as a field in a record. This allows related records to be accessed very quickly once the initial record has been read. Try and outlaw this technique from the benchmark.

Counting

Counts of records in a file matching a certain index value are very quick in Adabas and should be avoided.

Data loading and backups

Initial loading of data and backups are very quick. Try and avoid testing data loading and backup speeds against Rdb.

Recovery from disaster

Roll forward of the database after a disaster, and the restoration of a backup, are very quick. I believe that the journal only contains updates to data and index records and does not contain information on metadata changes. If a metadata change has been made, then the roll forward will stop at the time of the metadata change. The user must then make the same metadata change before continuing with the roll forward. If you must test the speed of recovery from a disaster, try and include a metadata update in the transactions being recovered.

Data compression

Adabas compresses trailing spaces on text fields, leading zeroes on numeric fields, and removes null values. Try and bias the data such that Rdb compression methods can be used but the Adabas ones cannot.
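The compression rules described under "Data compression" can be sketched as follows; the function names and record layout are illustrative assumptions for the sketch, not Adabas internals:

```python
# Sketch of Adabas-style field compression as described in the text:
# trailing spaces stripped from text fields, leading zeroes stripped from
# numeric fields, and null (empty) values removed entirely.
# Function and field names are illustrative only.

def compress_text(value: str) -> str:
    """Drop trailing spaces from a text field."""
    return value.rstrip(" ")

def compress_numeric(value: str) -> str:
    """Drop leading zeroes from a numeric field, keeping a lone zero."""
    if not value:
        return ""
    return value.lstrip("0") or "0"

def compress_record(fields: dict) -> dict:
    """Compress each field; null (empty) values are removed from the record."""
    out = {}
    for name, (kind, value) in fields.items():
        packed = compress_text(value) if kind == "text" else compress_numeric(value)
        if packed:
            out[name] = packed
    return out

record = {
    "name":    ("text", "SMITH     "),
    "balance": ("num",  "000420"),
    "note":    ("text", ""),          # null value: dropped entirely
}
assert compress_record(record) == {"name": "SMITH", "balance": "420"}
```

The point made in the reply below this note also applies: this per-column wrap/unwrap is not free, it costs CPU on every store and fetch.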
Hash keys

Adabas has a structure called an ADAM file, which is like defining a hash key and storing the records based on that key. This is fairly simple. If possible, try and come up with a complicated situation using multiple hash keys that Adabas cannot handle.

Page size

Data and index records are stored in the database on pages whose size depends on the disk device being used. The standard page size is about 5000 bytes for data and 2500 for index records. Try and ensure that some records are defined to be very long (i.e. over 5000 bytes). Adabas does not allow records to be fragmented over pages, so the database will need to be defined to include some user-specified segmentation. For example, a logical record of length 9000 could be stored as two physical records, each holding a segment number field (1 or 2) and a 4500-byte segment.
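The user-specified segmentation described under "Page size" can be sketched as follows; the helper names and the exact row layout are illustrative assumptions, not Adabas features:

```python
# Sketch of user-level record segmentation as described in the text: a
# 9000-byte logical record cannot fit on a ~5000-byte Adabas data page, so
# it is stored as two physical records, each carrying a segment number
# (1 or 2) plus a 4500-byte chunk. Helper names are illustrative only.

SEGMENT_SIZE = 4500

def split_record(key: str, data: bytes) -> list:
    """Split a long logical record into (key, segment_no, chunk) rows."""
    return [
        (key, seg_no, data[offset:offset + SEGMENT_SIZE])
        for seg_no, offset in enumerate(range(0, len(data), SEGMENT_SIZE), start=1)
    ]

def join_record(rows: list) -> bytes:
    """Reassemble the logical record from its segments, in segment order."""
    return b"".join(chunk for _, _, chunk in sorted(rows, key=lambda r: r[1]))

# A 9000-byte record becomes segments 1 and 2 of 4500 bytes each:
rows = split_record("CUST0001", b"x" * 9000)
assert [seg_no for _, seg_no, _ in rows] == [1, 2]
assert join_record(rows) == b"x" * 9000
```

Note the cost this imposes on the application: every read or write of a long record becomes two physical record operations plus the reassembly logic.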
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
1063.1 | A few comments | COPCLU::BRUNSGAARD | Curriculum Vitae, who's that ?? | Wed Jan 22 1992 12:42 | 43 |
Hi Ian, thanks for posting this, but I have a few comments.

>> I know that my knowledge is 2 years old and some of the
>> limits may have been superseded. Any comments, corrections or latest
>> information would be welcomed.

Remember this is also true with Rdb!

>> Counts of records in a file matching a certain index value are
>> very quick and should be avoided.

Also, since counts are not very common in real applications, bias more towards EXISTS operations, since they are handled excellently in Rdb.

>> Data loading and backups
>> Initial loading of data and backups are very quick. Try and avoid
>> testing data loading and backup speeds against Rdb.

If you ask me, these two things have nothing to do with each other! Load is one thing; backup is completely different.

a) You are probably right about loading data, though I would be surprised if the performance of Adabas were far superior to Rdb's.

b) I would be EXTREMELY surprised if Adabas were even close to Rdb's rates and functionality here!! I think that Rdb can kill any competitor in this area:

   online backup of db, db-area and aij
   online restore of db-area and aij recover (in V4.1)

>> Data compression
>> Adabas compresses trailing spaces on text fields, leading zeroes
>> on numeric fields and removes null values.
>> Try and bias the data such that Rdb compression methods can be
>> used but that Adabas ones cannot.

Or try to make Adabas compression do its best, as this is bound to cost a lot of CPU power to do this column-level wrap/unwrap. Or even make the use of compression optional (or not allowed) in the benchmark.

Just my comments,
Lars |