| Title: | DEC Rdb against the World |
| Moderator: | HERON::GODFRIND |
| Created: | Fri Jun 12 1987 |
| Last Modified: | Thu Feb 23 1995 |
| Last Successful Update: | Fri Jun 06 1997 |
| Number of topics: | 1348 |
| Total number of notes: | 5438 |
Suggestions for Benchmarking Adabas
This document was prepared to discuss some areas to consider when
running a benchmark of Rdb against Adabas on an IBM mainframe.
The following details some of the limits and restrictions in
Software AG's Adabas product for the IBM mainframe that I am
aware of. Note that my knowledge is 2 years old and some of the
limits may have been superseded. Any comments, corrections or latest
information would be welcomed.
Records per table
Each table (called a file) in an Adabas database is restricted to
16.7 million records. I know that Software AG have worked around
this limit by using "user exits". Each call made to the
database software can be intercepted by a user exit (a bit of
code written by the user and linked in with the Adabas image) and
the call changed if necessary. For large logical files (over 16
million records), user exits have been used to partition them over
a number of real Adabas files. For instance, to store 50 million
records, 4 Adabas files (A, B, C, D) would be used. The user exit
would interrogate the database key and issue the correct database
command to the appropriate file: A, B, C or D.
I believe that Software AG have developed a product, called
something like "Large File Option", which packages this
partitioning as a supported feature.
There may be an overhead in executing the user exit for each
database command.
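The routing that such a user exit performs can be sketched as follows. This is only an illustration of the partitioning idea, not Adabas code; the file names and the 16.7-million-record per-file limit are the only details taken from the text above.

```python
# Hypothetical sketch of the routing a user exit performs when one
# logical file is partitioned over several real Adabas files.
MAX_RECORDS_PER_FILE = 16_700_000   # Adabas per-file record limit

PARTITIONS = ["A", "B", "C", "D"]   # four physical Adabas files

def route(record_key: int) -> tuple[str, int]:
    """Map a logical record key to (physical file, key within that file)."""
    index, local_key = divmod(record_key, MAX_RECORDS_PER_FILE)
    if index >= len(PARTITIONS):
        raise ValueError("key exceeds the capacity of the partition set")
    return PARTITIONS[index], local_key
```

A command for logical record 40,000,000 would thus be redirected to file "C" with a local key of 6,600,000; the point of the section above is that this redirection costs an extra user-exit execution on every database call.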
Tables per database
Each database can have up to 255 tables (files in SAG jargon). Of
these about 20 are allocated for system information.
Databases per system
There is no restriction on the number of databases that can be
run on the same system. Each database has all disk IO passed
through a single multi-threaded database nucleus process.
If more than one database is to be run on the same system, then the
restrictions on updates across the multiple databases should be
pointed out to customers. In order to implement two-phase commit
(2PC), I believe an extra product called Entire is needed. If this
is not used, then updates can still be made to the multiple
databases, but without the protection of 2PC.
Repeating Groups
Adabas is not relational and allows repeating groups of fields to
be defined within a record (called periodic groups in SAG
jargon). There is a restriction of a maximum of 99 occurrences of
each repeating group.
This feature allows owner-member relationships to be defined in a
single table and allows all related data to be retrieved in one
record IO.
If you are specifying owner-member type relationships in the
specification, then try and ensure that the maximum number of
members per owner is more than 100, in order to preclude the use
of periodic groups.
Multi-valued fields
Much like periodic groups, Adabas allows a single field to be
defined that can have multiple values (often used for, say,
address lines). There is a restriction of 191 on the number of
occurrences of the field that can be stored.
This feature allows IOs to be reduced.
Try to define the benchmark such that the maximum possible number
of entries for a field is over 191. This will ensure that the
field must be defined in a separate table.
Multi-valued fields can be defined within a periodic group to
get many more occurrences allowed but at the expense of more
programming effort.
Locking
As all IO to a database is done from a single multi-threaded
nucleus process, the locking between users does not have to incur
the overhead of process communication. This means that locking is
very efficient.
Try and minimise the locking overhead in the benchmark. If
possible allow users to lock at the table level in order to
reduce the number of locks.
Separation of data and indexes
Adabas stores all data records in up to 16 DATA disk files. All
indexes are stored in up to 16 ASSO files. This means that data
cannot be stored together with the associated index.
Try and find a benchmark that will allow Rdb to make use of data
and index clustering.
Number of disk files
Adabas allocates up to 16 DATA and 16 ASSO files on disk in order
to store all indexes and data records. Depending on how the IBM
mainframe operating system allows disks to be grouped together,
there may be a restriction on how many disk files can be allocated.
WORK area
Adabas uses a single WORK file to store transaction information
and for use for sorting. In order to get good performance from
Adabas this should be on a fast device such as a solid state
disk. If not on a fast disk this can become a bottleneck in
Adabas.
Think about restricting the use of solid state disks in the
benchmark or cast doubt on the use of a solid state disk for the
WORK area. WORK is used to recover incomplete transactions in the
event of a system failure.
Global buffering
Adabas does all IO through a single nucleus. This nucleus buffers
data and indexes very efficiently and in a normal time-sharing
environment the global buffering gives very good performance.
For the benchmark, try and ensure that each user does not share
data and indexes with other users in order to restrict the
benefit of buffering.
Ensure that the nucleus buffers are not pre-loaded at the
beginning of each benchmark run.
Compound keys
Adabas allows a superdescriptor to be defined which can consist
of up to 5 complete fields, or portions of fields, within a file.
Try and ensure that some compound keys in the benchmark need 6 or
more portions.
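The idea of a superdescriptor can be sketched as a key concatenated from field portions. This is an illustration only; the field names and slices below are hypothetical, and only the 5-portion limit comes from the text above.

```python
# Illustrative sketch of building a compound key (superdescriptor)
# value from up to 5 portions of fields. Field names are hypothetical.
MAX_PORTIONS = 5   # Adabas limit on portions per superdescriptor

def superdescriptor(record: dict, portions: list) -> str:
    """Concatenate the selected portions of the record's fields."""
    if len(portions) > MAX_PORTIONS:
        raise ValueError("at most 5 portions per superdescriptor")
    return "".join(str(record[name])[part] for name, part in portions)

key = superdescriptor(
    {"surname": "GODFRIND", "postcode": "DK2800", "year": 1992},
    [("surname", slice(0, 4)), ("postcode", slice(0, 2)), ("year", slice(2, 4))],
)
# key is "GODFDK92" - the three portions run together into one key value
```

A benchmark key built from 6 or more portions would fail the `MAX_PORTIONS` check, which is exactly the restriction the section above suggests exploiting.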
Networking
Any access between machines in a network will be done using
Adanet which passes the database commands from the originating
machine to the machine running the nucleus via any supported
networking protocol. Between IBM machines this is likely to be
very efficient and should be avoided in the benchmark if
possible.
Journalling
All transactions against an Adabas database are written to a
single PLOG (protection log). This can become a bottleneck for
Adabas and to get good performance should be positioned on a
solid state disk.
As this is used to recover and roll forward from a disaster,
spread doubt about it being on a solid state disk and try and
insist that it is on as slow a device as possible.
Protection logging can be turned off. If Rdb is to run with after
image journalling then insist that the PLOG is on.
ISNs
Adabas stores each record using an ISN (internal sequence number)
which corresponds to an Rdb dbkey.
Adabas allows the ISN to be user-specified so that, for example, a
unique customer number can be used as the ISN within a file. As
records can be accessed directly via ISN, this makes for a very
quick access method.
Try and ensure that there is no suitable user field that can be
used as a user defined ISN.
A common technique in Adabas is to store the ISN of a related
record as a field in a record. This allows related records to be
accessed very quickly once the initial record has been read. Try
and outlaw this technique from the benchmark.
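The stored-ISN technique amounts to keeping a direct pointer in the record. The sketch below is hypothetical (the data and names are invented); it only illustrates why the second fetch is cheap.

```python
# Illustrative sketch of storing a related record's ISN as a field.
# All data here is hypothetical; dicts stand in for Adabas files,
# with the dict key playing the role of the ISN.
orders = {
    501: {"item": "VAX 6000", "customer_isn": 17},  # ISN 17 is the "pointer"
}
customers = {
    17: {"name": "ACME"},
}

def related_customer(order_isn: int) -> dict:
    """One direct ISN fetch replaces an indexed search on customer number."""
    return customers[orders[order_isn]["customer_isn"]]
```

Once the order has been read, the customer comes back in a single keyed lookup with no index traversal at all, which is why the section above suggests outlawing the technique.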
Counting
Counts of records in a file matching a certain index value are
very quick and should be avoided.
Data loading and backups
Initial loading of data and backups are very quick. Try and avoid
testing data loading and backup speeds against Rdb.
Recovery from disaster
Roll forward of the database after a disaster and the restoration
of a backup is very quick. I believe that the journal only
contains updates to data and index records and does not contain
information on meta-data changes. If a metadata change has been
made then the roll forward will stop at the time of the metadata
change. The user must then make the same metadata change before
continuing with the roll forward.
If you must test the speed of recovery from a disaster try and
include a metadata update in the transactions being recovered.
Data compression
Adabas compresses out trailing spaces on text fields and leading
zeroes on numeric fields, and removes null values entirely.
Try and bias the data such that Rdb compression methods can be
used but that Adabas ones cannot.
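The three compression rules described above can be sketched per field. This is only an illustration of the rules as stated, not of Adabas's on-disk format, and the `numeric` flag is an invented convenience.

```python
# Illustrative sketch of the field-level compression described above:
# trailing spaces trimmed from text, leading zeroes trimmed from
# numerics, and null (empty) values removed entirely.
def compress_field(value: str, numeric: bool = False):
    compressed = value.lstrip("0") if numeric else value.rstrip(" ")
    if numeric and value and not compressed:
        compressed = "0"       # an all-zero numeric keeps one digit
    return compressed or None  # None models a null value removed from storage
```

So "SMITH   " stores as "SMITH", "000042" as "42", and an empty field is dropped; data padded in the middle of fields, by contrast, would compress under neither rule.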
Hash keys
Adabas has a structure called an ADAM file, which is the equivalent
of defining a hash key and storing the records based on that key.
This is fairly simple to set up.
If possible try and come up with a complicated situation using
multiple hash keys that Adabas cannot use.
Page size
Data and index records are stored in the database on pages whose
size is dependent on the disk device being used. The standard
page size is about 5000 bytes for data and 2500 for index
records.
Try and ensure that some records are defined to be very long
(i.e. over 5000 bytes). Adabas does not allow records to be
fragmented over pages, and thus the database will need to be
defined to include some user-specified segmentation. For example,
a logical record of length 9000 bytes could be stored as two
physical records, each holding a segment number field (1 or 2) and
a 4500-byte segment of the data.
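The segmentation just described can be sketched as a split/reassemble pair. This is an illustration under the page sizes stated above, not an Adabas mechanism; the application would have to carry out both halves itself.

```python
# Illustrative sketch of user-specified segmentation for records longer
# than the data page: a 9000-byte logical record becomes two physical
# records of (segment number, 4500-byte segment).
SEGMENT_SIZE = 4500

def segment(logical_record: bytes) -> list:
    """Split a logical record into numbered physical segments."""
    return [
        (number, logical_record[offset:offset + SEGMENT_SIZE])
        for number, offset in enumerate(
            range(0, len(logical_record), SEGMENT_SIZE), start=1)
    ]

def reassemble(segments: list) -> bytes:
    """Rebuild the logical record from its segments, in segment-number order."""
    return b"".join(data for _, data in sorted(segments))
```

Note the cost this imposes: every read of a long logical record now needs two physical record retrievals plus the reassembly, which is the overhead the benchmark suggestion aims at.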
| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 1063.1 | A few comments | COPCLU::BRUNSGAARD | Curriculum Vitae, who's that ?? | Wed Jan 22 1992 12:42 | 43 |
Hi Ian,
thanks for posting this, but I have a few comments.
>> I know that my knowledge is 2 years old and some of the
>> limits may have been superseded. Any comments, corrections or latest
>> information would be welcomed.
Remember this is also true with Rdb !
>> Counts of records in a file matching a certain index value are
>> very quick and should be avoided.
Also, since counts are not very common in real applications, bias
the benchmark more towards EXISTS operations, since they are
handled excellently in Rdb.
>>Data loading and backups
>>Initial loading of data and backups are very quick. Try and avoid
>> testing data loading and backup speeds against Rdb.
If you ask me, these two things have nothing to do with each other!
Load is one thing.
Backup is completely different.
a) You are probably right about loading data, though I would be
surprised if the performance of Adabas were far superior to Rdb's.
b) I would be EXTREMELY surprised if Adabas were even close to
Rdb's rates and functionality here!!
I think that Rdb can kill any competitor in this area:
Online backup of db, db-area and aij
online restore of db-area and aij recovery (in V4.1)
>> Data compression
>> Adabas compresses trailing spaces on text fields, leading zeroes
>> on numeric fields and removes null values.
>> Try and bias the data such that Rdb compression methods can be
>> used but that Adabas ones cannot.
Or try to make Adabas compression do its best, as this is bound to
cost a lot of CPU power to do this column-level wrap/unwrap.
Or even make the use of compression optional (or not allowed) in
the benchmark.
Just my comments
Lars