| Title: | DEC Rdb against the World |
| Moderator: | HERON::GODFRIND |
| Created: | Fri Jun 12 1987 |
| Last Modified: | Thu Feb 23 1995 |
| Last Successful Update: | Fri Jun 06 1997 |
| Number of topics: | 1348 |
| Total number of notes: | 5438 |
This is an unofficial guide to competing with SYBASE, particularly in the
area of benchmarks. I would like to produce a more complete SYBASE
Competitive Guide. I would appreciate it if anyone who has competitive or
benchmarking tips not mentioned here would add them in replies to this note.
I will try to compile everything I get into a short VAX Document Guide.
Any corrections or suggestions on these or other topics would be welcome,
as I make no claim to absolute perfection, accuracy or any other saintly
qualities. Note that a lot of competitive information is contained
elsewhere in this conference (see note 58.*).
The information that I have placed here results from various competitive
situations that I have had with SYBASE. Note that I did not win all of these
encounters; some of my hard-earned scars are included here. After having
reviewed the SYBASE documentation and the SYBASE optimization course, I am more
convinced than ever that RDB is the superior product, for more reasons than are
mentioned here.
1. Control of the Benchmark and the Environment
It is important to try to control the substance of the benchmark and the
conditions under which the benchmark is run. While Digital normally seems
to come from behind in these contests (we always seem to find out about the
benchmark after SYBASE or Oracle has written it), we can have an impact on
the benchmark by understanding the client's requirements and relating them
to the benchmark. In many cases the benchmark that is proposed only vaguely
reflects a client's true production environment. The following points are
provided to suggest features and strengths of RDB that a benchmark should
emphasize. In most cases there are valid reasons that relate back to a
client's environment that should be exploited.
For most of these suggestions the name of the game is to minimize the number
of I/O operations performed by RDB while increasing the number done by
SYBASE. Fortunately, this is not that difficult to do. However, SYBASE
makes up for the minimal 'smarts' in their database engine by using a very
large data cache to cover a multitude of sins. In almost all TP and
interactive benchmarks, the goal is therefore to force SYBASE to become
I/O bound: minimize RDB's I/O requirements while blunting the effect of
the SYBASE cache.
o Suggest that RDB and SYBASE both use record locking for an apples-to-apples
comparison. Ensure 'smart' locking is done for updates. SYBASE does not
provide record locking and they claim that most applications don't really
need record locking. The alternative is to do record comparisons IN THE
APPLICATION CODE! This means an extra record retrieval must be done before
re-writing a record, to check that the original copy held in memory has not
changed. If it has changed, the user is told so and the transaction is
aborted, after the user has typed in the whole transaction! They must then
restart the transaction from scratch. This is much more complex code than
would be required to
support record locking. The extra programming effort required can destroy
a development group's productivity!
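The application-code alternative described above is what reply .9 calls "optimistic locking". A minimal sketch of the pattern, using SQLite and hypothetical table/column names purely for illustration:

```python
import sqlite3

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY,"
             " balance INTEGER, version INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100, 0)")
conn.commit()

def optimistic_update(conn, acct_id, delta):
    # 1. Read the record and remember its version (the 'original copy').
    old_balance, old_version = conn.execute(
        "SELECT balance, version FROM accounts WHERE id = ?", (acct_id,)
    ).fetchone()
    # ... user types in the transaction here, with no lock held ...
    # 2. Re-write only if nobody changed it meanwhile; bump the version.
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (old_balance + delta, acct_id, old_version),
    )
    conn.commit()
    # rowcount == 0 means the record changed: abort, user must retry.
    return cur.rowcount == 1

print(optimistic_update(conn, 1, 50))   # prints True: no conflict
```

Note that every application doing updates must carry this read-compare-retry logic, which is exactly the productivity cost the bullet above describes.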
o Try for multi record retrievals. Retrievals by a key that is used to
store a record can be done more quickly by RDB. SYBASE has a fixed page
size which means that if more records are retrieved than fit on a single
page extra I/O's are forced.
o Use the same environment for both benchmarks. SYBASE prefers not to run
their benchmarks while observed by Digital. Try to force SYBASE to run the
benchmark on a Digital facility using the same machine and O/S
environment that will be used to run the RDB benchmark. Digital should
do the tuning for both benchmarks as the SYBASE people are not
knowledgeable about system tuning and can create a very poor environment.
o Restart SYBASE DB server from scratch for each test. For 'fairness' shut
down the RDB monitor and restart it again for each benchmark run. This
technique will help to avoid SYBASE data caching between benchmarks.
SYBASE will usually protest this action for the reason just mentioned,
but will say that 'SYBASE needs some time to pagefault everything into
memory to run fast'. Use the next point to avoid SYBASE protests.
o Record timings should be taken only when the benchmark has reached a
steady state. RDB can take a significant hit when loading the metadata
during a bind to the database. Use 1 initial fetch for each relation to
load metadata for RDB and to initialize SYBASE. This benefits us more
than it does them. It also more closely matches the situation when a
real production (TP) environment is run.
o Try not to allow a remote client/server arrangement to be used if it doesn't
make sense for the customer. SYBASE loves to use a front end to drive the
benchmark while using the back end to run the database server. Though the
extra network overhead incurred usually benefits us (especially if you
connect the front end and the back end using asynchronous DECnet!) it is
much more difficult to measure overall resource utilization and
performance using this model.
- Use ACMS if necessary for an even comparison. This does require some
significant re-writing, but it can be very useful to show the client
how a true TP environment functions. It does require focusing the
customer on their application, not on the benchmark.
- Try to use SMP and clusters. SYBASE does not like these, even with
their 'Virtual Server'. This has not really been shown to work in a
VAX SMP environment, though they claimed they had it running on a VAX
6310 SMP machine (ha ha). As SYBASE uses a single database server, it
is very difficult for this to make use of an SMP machine. It is
possible for them to run in clusters as they treat the nodes as
separate machines.
- Try to use a single system for the front and back end processes as
SYBASE has problems because their server runs at a higher priority.
This can spoil response times for processes connected to the server in
a high throughput environment.
o Consider making cluster fail-over times a component for testing. SYBASE
must run a 'semi-hot' server on the second node with all of its memory
preallocated. SYBASE can detect the first node going away, but there
is a fairly wide window for SYBASE to experience data corruption and
integrity problems due to their cache not flushing at the right time.
o Have the customer specify that overall resource utilization should be
considered as one of the benchmark criteria. SYBASE typically sucks up much
more of the system's resources in terms of memory (when few jobs are
running), CPU and disk space! Make sure that the customer considers all of
these aspects as part of the DB Benchmark results.
o Try to specify background reporting jobs that 'sweep' through
large portions of data. This tends to negate SYBASE's Data
Caching as it forces the cache to be flushed and reloaded with other
data.
o Specify a small recovery time interval to force frequent
flushing of the SYBASE journal. I need to check this point as I have not
verified it.
o Go for large numbers of spindles and large databases. The larger the
database the lower the chances the SYBASE Caching will be effective.
A large database with hash keys can beat data caching almost any day.
o Sequential passes through database benefit us (even with updates) as we
can choose a page size that loads large amounts of data in a single read.
SYBASE has severe problems with this due to their fixed page size.
o Minimize range retrieval so that hashing can be used.
o Try for large empty records with lots of 'white space' if multi-record
retrieval is done as this allows us to use record compression to put more
records/page.
o Help write the benchmark code, if possible.
o Use SQL modules rather than embedded SQL, as this makes it faster to
compile and link 3GL programs. Embedded SQL sucks!
o Minimize the number of iterations that a single transaction is run to
minimize the effects of SYBASE Data Caching. On a recent benchmark I
forgot to prevent this and SYBASE asked for 30 iterations of a fairly
long query. Over the 30 iterations I calculated that roughly 60+% of the
relation was able to cache itself in memory. The result was that SYBASE had
faster response times, but much higher CPU utilization. If we had
limited the number of iterations to 5 the effects of caching would have
been negated.
o Make sure that all transactions are started and finished with a COMMIT.
SYBASE can perform database activities without starting a transaction.
They look roughly like file I/O. In addition, in one benchmark the customer
was doing updates, inserts and deletes and rolling back immediately afterward.
This does not test the database much! Both the RDB and the SYBASE
benchmarks should START transactions and COMMIT them explicitly for an
'Apples to Apples' comparison. In addition, try to avoid the SYBASE
partial commit. This allows commits of some data to be performed part
way through a transaction while still keeping everything else inside a
single transaction.
o Minimize the assignment of sequential incrementing keys. This tends to
impact performance because one record is locked all of the time. Either
ask for blocks of counters to be allocated or use some other method of
generating a unique record identifier.
Note that SYBASE uses a technique called Pre-emptive Key Splitting in
this kind of situation to avoid locking indexes during updates. This
means that if a locked B-Tree node is encountered the node will be split
by SYBASE leaving a locked and an unlocked half. The unlocked half is
then updated with the new record key. This can lead to longer I/O chains
on SYBASE, but it does minimize the bucket locking that will occur when
multiple processes are storing records with sequentially incrementing
keys.
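The 'blocks of counters' approach above can be sketched as follows; the block size, names, and the in-process lock standing in for the database's counter-record lock are all illustrative assumptions:

```python
import threading

class BlockAllocator:
    """Hand out unique ids by reserving blocks from a shared counter.

    Only the block grab touches shared state, so the 'one record locked
    all of the time' hot spot is hit once per BLOCK ids instead of once
    per id. (Block size and names are illustrative, not from any
    real product.)
    """
    BLOCK = 100

    def __init__(self):
        self._lock = threading.Lock()   # stands in for the counter-record lock
        self._next_block = 0
        self._local = threading.local() # each process/thread keeps its own block

    def next_id(self):
        local = self._local
        if getattr(local, "remaining", 0) == 0:
            with self._lock:            # short critical section per block
                local.next = self._next_block
                self._next_block += self.BLOCK
            local.remaining = self.BLOCK
        nid = local.next
        local.next += 1
        local.remaining -= 1
        return nid

alloc = BlockAllocator()
print([alloc.next_id() for _ in range(3)])   # prints [0, 1, 2]
```

The trade-off, as with any block scheme, is that ids are unique but not dense: blocks abandoned at process exit leave gaps in the sequence.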
2. Analysis
o Access paths and frequency
- Look for ways to hash keys for clustered retrievals
- Delete unnecessary keys.
o Review all access paths used in the transactions. Look for frequently
used key paths and optimize. Look for hot spots.
o Determine all relations which are read only so that they could be placed
in a storage area that is R/O to minimize locking.
3. Setup
o Record Compression
- If retrieving multiple records and there is a large amount of 'white
space' (repeating characters) in a record, use compression.
- If minimal or no white space or only single records being retrieved,
turn off compression.
- If CPU time is an issue, turn off compression.
Note: SYBASE does not have record compression. For sequential
multi record retrieval we can significantly save on I/O by
using record compression. However, this does increase RDBs CPU
utilization.
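How much multi-record I/O compression can save is easy to sketch. Rdb's actual compression algorithm is not described here, so this simple run-length scheme is only an illustrative stand-in:

```python
def rle_compress(record: str) -> bytes:
    """Collapse runs of repeating characters to (count, char) pairs."""
    out = bytearray()
    i = 0
    while i < len(record):
        j = i
        # Extend the run (count capped at 255 to fit one byte).
        while j < len(record) and record[j] == record[i] and j - i < 255:
            j += 1
        out += bytes([j - i]) + record[i].encode("latin-1")
        i = j
    return bytes(out)

# A padded fixed-width field: mostly trailing blanks ('white space').
rec = "SMITH".ljust(80)
packed = rle_compress(rec)
print(len(rec), len(packed))   # prints 80 12
```

More records per page means fewer reads on a sweep, at the cost of the CPU spent packing and unpacking, which matches the note's caveat about CPU utilization.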
o Multi Spindle Storage Areas and Partitioning
- If large numbers of processes doing single record retrieval,
split the relation between multiple storage areas to minimize spindle
contention.
Note: SYBASE now provides record partitioning but they can only divide a
table randomly between table files, they cannot partition by key
value.
o Hash Keys
- For single record retrieval by a primary key or for a join with
another relation using a foreign key.
- For high insertion rates with a sequentially incrementing key.
- When range or sequential retrieval in a sorted order is required, do
not use hash keys; provide a separate sorted index instead. Note that a
sorted index does not work well with the previous situation (high
insertion rates with a sequentially incrementing key); it can cause
significant performance bottlenecks.
o Joins
- In some benchmarks a multiple join may not specify retrieval of any
values from secondary relations. RDB can take advantage of this by
using sorted indexes with all keys used in the query as part of the
index. In these situations, the query optimizer will perform an index
only retrieval and avoid reading the record at all.
Note: The SYBASE query optimizer is not 'too bright' and will read all
records in a join, even if no values are retrieved.
Eg.
        Relation:     A                  B
                      FLDA               FLDA
                      FLDB               FLDB
                      FLDC               FLDD
        Cardinality:  100                1000
        Index:        A_IX               B_IX
                      FLDA (hashed)      FLDA (sorted)
                                         FLDB

        Select A.FLDA, A.FLDB, A.FLDC
          from A, B
          where A.FLDA = B.FLDA and A.FLDA between 10 and 14

        [results in 5 As and 50 Bs being read]
If 5 records fit on a page and are stored in FLDA order, RDB will perform
this query in 3 I/Os. (1 Hash read on A and 2 Index Reads on B). SYBASE
will perform this in roughly 15 I/Os (2 Index reads on A, 1 record read
on A, 2 Index reads on B, 10 data reads on B). This example assumes no
data caching, so in reality it would usually require fewer I/Os on an
active system.
The order of a join in a query can affect performance in SYBASE (but not
RDB!). Specifying inefficient orders for selection will slow down SYBASE
(unless you can convince the client that SYBASE is not allowed to change
the SQL code!) as they need to manually determine optimal queries to
take advantage of cardinality and indexes. A query that specifies a less
selective retrieval range first can force SYBASE to perform a much higher
number of I/O operations. Note that all appropriate fields should be
indexed in the query.
Eg.     Select # from Personnel
          where SEX = 'F'
            and AGE between 20 and 32
            and START_DATE < 5-JUL-1988
With 1000 employees the following key cardinalities might be found:
SEX 2 (M/F)
AGE 40
START_DATE 1000
RDB would check the above and would probably use the START_DATE to
qualify the selection. SYBASE would use SEX as it was specified first.
'Unoptimizing' the query order is a minor irritant to SYBASE unless they
have a less experienced person running the benchmark in which case it
becomes an easy win (if there is any such thing)! You may also want to
be a bit more 'subtle' in your 'unoptimization' and put AGE first which
causes more I/O.
Even if SYBASE suggests re-ordering the SQL statements, you can be
generous and agree to the change. Then point out to the client that RDB
will optimize the query regardless of the order because of a much more
intelligent query optimizer. It's good for a few brownie points.
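The optimizer behaviour described above can be sketched numerically. The uniform-distribution estimate (rows scanned = total rows / key cardinality) is a simplifying assumption, not either product's real cost model:

```python
# Key cardinalities from the Personnel example above (1000 employees).
cardinality = {"SEX": 2, "AGE": 40, "START_DATE": 1000}
ROWS = 1000

def rows_scanned(key):
    # Rough uniform-distribution estimate: rows per distinct key value.
    return ROWS // cardinality[key]

# A cost-based optimizer evaluates every indexed predicate and starts
# with the most selective one...
best = min(cardinality, key=rows_scanned)
print(best, rows_scanned(best))      # prints START_DATE 1

# ...while a rule-based engine just takes the first predicate written:
print("SEX", rows_scanned("SEX"))    # prints SEX 500
```

The gap (1 row vs. 500 candidate rows touched before further filtering) is why 'unoptimizing' the written order hurts only the engine that trusts that order.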
o Page Size
Choose a page size large enough to hold a set of range records
with a single I/O operation. If high update/write rates are
expected (with minimal range retrievals) choose a page size
that will minimize locking conflicts.
Note: SYBASE has a fixed page size of 2048 bytes. This can cause
fragmented records and much higher numbers of I/Os when the
benchmark is properly planned.
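The fixed versus tunable page-size argument reduces to simple arithmetic; the per-page overhead figure here is a hypothetical placeholder, not any product's real page layout:

```python
from math import ceil

def ios_for_range(n_records, record_bytes, page_bytes, overhead=32):
    """I/Os needed to fetch n records clustered contiguously on pages.

    overhead is a hypothetical per-page header size; real page layouts
    differ (this is only a sizing sketch).
    """
    per_page = (page_bytes - overhead) // record_bytes
    return ceil(n_records / per_page)

# A 50-record range retrieval of 200-byte records:
print(ios_for_range(50, 200, 2048))    # fixed 2 KB pages:  prints 5
print(ios_for_range(50, 200, 12288))   # tunable 12 KB page: prints 1
```

Sizing the page to the common range retrieval is what lets a properly planned benchmark turn the fixed 2048-byte page into extra I/Os for the other side.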
o Network Packet Size
If you are tuning the machine for the benchmark, which SYBASE is going
to run AND you are not using RDB in a client server arrangement THEN
choose a very small DECnet packet size (say 128). This will increase
the network overhead considerably, and in most cases, SYBASE people do
not know enough to check this parameter. This will increase their CPU
overhead considerably without giving any indication why.
o Journalling
Spread the Run Unit Journal over multiple drives to minimize disk
contention. Avoid after image journalling as we seem to take more of an
impact on this due to locking and contention for the Journal file.
SYBASE also seems to take a fairly liberal view of journalling and does
not seem to rigidly synchronize journal flushes with updates. In
addition, a COMMIT may return a success status to the user while the
pages are still being flushed back to the database.
o R/O areas
Use Read only areas wherever possible to minimize locking overhead. Make
system area Read Only to avoid updates to cardinality. Note that if
cardinality will change radically during a benchmark then an RO system
area should not be used.
Note that you could also use Pseudo Disks for R/O areas to increase
performance and to minimize disk contention.
o Use Areas
- When reserving a relation that is spread across multiple areas (for
performance) it appears to be faster not to specify a reserving list.
Normally, when a reserving clause is specified, locks will be taken out
on all of the areas. If no reserving clause is specified, only the area
that is being touched will have a lock taken out on it. Play with this
to see if it reduces CPU overhead.
- Use SPAM pages with low intervals for areas that have high update
rates.
o Locking
Leave lock granularity enabled unless it is a very small database.
Ensure that the benchmark does record locking for updates. For SYBASE
this means using the 'Apply lock holding' clause in the SQL statements.
Note: SYBASE does locking at the page level not at the record level.
Forcing SYBASE to use locking can create much higher contention
levels than for RDB which locks at a record level.
o Images
Install all images to conserve memory. RDB can be fairly memory
intensive compared to SYBASE so every bit counts.
Things to watch out for:
1) SYBASE BS:
They are usually not that familiar with VMS and will use gobbledygook to
justify a position. e.g. (taken from a public notes file on BITnet)
Q: Why does SYBASE consume more CPU than other competing products?
A: SYBASE runs only in user mode which is visible. Other products use
executive and kernel modes which are not charged to the process and are not
visible. If you measure overall CPU utilization SYBASE is roughly
equivalent to other products.
RESPONSE: In reality, SYBASE in some cases uses twice the CPU of RDB. This
is fairly obvious BS to us but not to customers. Keep probing the customer
for these kinds of misconceptions and point out the fallacies in their
statements. This may start to shake their faith in SYBASE's 'wunderkind'
technical reputation.
SYBASE says that their goal is to become as good a marketing company as
Oracle. Note that they are not stressing technical excellence! Point this
out to customers.
2) SYBASE claims that VMS record Locking is too expensive. SYBASE means
never having to say 'record lock!'. SYBASE has published articles
purporting to show how expensive VMS locking is. The articles revealed a
very incomplete knowledge of how VMS locking works.
RESPONSE: So how come SYBASE has added locking (albeit at a page level) to
V4.0?
3) SYBASE claims the Fastest Benchmarks. TP1 etc. 'Designed for performance'.
RESPONSE: Standard line about TP1 not being a true measurable repeatable
benchmark. Throw in suspicion about one vendor's ability to fairly benchmark
a competitor's products.
4) SYBASE claims that 'Stored procedures are faster than embedded procedures.'
RESPONSE: In fact, they are faster for SYBASE as they reduce network I/O.
SYBASE must use more network calls to send the SQL statements to the server.
Trying to apply this argument to RDB is specious as RDB uses DSRI code which
is much faster than interpreting SQL statements.
5) Server Size
SYBASE will usually try and allocate as much memory as is possible to the
server. Try and achieve an equivalent memory usage (if it is to our
advantage). RDB processes like to have about a megabyte of memory to run
comfortably, of which about 300 to 600 pages will be shareable. If
the customer wants 10 processes to run then you should request that both
Digital and SYBASE run in approximately 8 MB of memory. This forces SYBASE
to allocate only 6-7 MB to the server process. You must make sure that the
customer recognizes this as an apples to apples comparison and validates
that SYBASE is running with the right amount of memory. If a hundred
processes are to be run, propose ACMS (if possible) instead, and try to apply
the same arguments about total memory usage. Do not bring up the arguments
about memory unless you are sure that it will benefit us and harm SYBASE.
6) 4GL Benchmarks
This can be a real problem as nobody's 4GL performs well.
One of my clients created an application using APT-FORMS and estimated
they required two VUPS per user to run it. They tried to tell clients that
each clerk would require a Sun workstation to run a financial application.
They are now using DECforms. We also benchmarked Rally vs. APT-FORMS and
Rally came out a bit faster (mostly on initial FORM display). This was only
after the application reached a steady state.
7) Cost Of Ownership
SYBASE will insist on doing benchmarks in a client/server mode. Total
memory will usually be quite high. Also, the overall CPU and resource
utilization should be measured to provide equivalent costs of ownership. A
benchmark that runs fast but eats your whole CPU may not be too cost
effective.
Sybase also tends to take more disk resources in terms of spindles and
capacity. One reason for this is that Sybase tends to be very poor at
reclaiming disk space that it uses. In the event of a B-Tree node split, an
additional 8 pages of disk space are allocated. If more records are
written to the relation, then additional index nodes might occupy this
space, otherwise it is wasted. Have the customer compare the full disk
space utilization after the benchmark (if you think you are using less
space).
8) Failover
The need to be able to continue processing transparently on other nodes when
a single node dies should be stressed. On SYBASE, a 'hot backup' must be
left running on another node in the cluster. This backup process needs its
full complement of memory preallocated. This increases the cost of
redundancy significantly!
9) VAX/VMS
Sybase tends to concentrate on the UNIX marketplace. Therefore their
promises in the VMS market tend to be weak. They tend to release products
on UNIX first, with VMS an afterthought.
In addition, Sybase does not run well on VAXClusters. It does not like a
clustered environment and will only run a database on a single node.
Therefore it does not take advantage of the scalability of a cluster.
The Virtual Server concept that Sybase says will handle SMP does not seem to
be shipping yet. One of my client's says he saw it running on a VAX 6310
last October (89) at a SYBASE conference. When it was pointed out that this
was a single board CPU, he looked puzzled. SYBASE sells future heavily!
| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 720.1 | Excellent ! | TRCO01::MCMULLEN | | Fri Aug 24 1990 17:27 | 9 |
Neil,
Excellent guide. It is nice to see all these tips on paper. I hope
your boss knows the quality and effort you put into this.
Does anyone want to do the same for the rest of the competition
- ORACLE, INGRES, etc.
Ken
| 720.2 | Competitors? | POBOX::BOOTH | Bo knows MUMPS | Fri Aug 24 1990 21:36 | 13 |
But, Ken, to write the same for Oracle would interfere with our Oracle
Financials strategy (as well as our "Oracle for one-stop data
management shopping" strategy as shown in the software price book). If
we write about Ingres, what would that do to ULTRIX/SQL? Gee, the
Sybase writeup might hurt our Sybase Security strategy for government.
You act as if these products are competitors. Surely you know that
these products sell VAXes. How could they be competitors if we have all
these agreements with them?
With_tongue_firmly_planted_in_cheek
---- Michael Booth
| 720.3 | Cheek | TRCO01::MCMULLEN | | Fri Aug 24 1990 22:22 | 9 |
Michael,
But who's cheek?
Also, the next note contains some more cynical expressions about Digital's
agreements - is there field confusion?
Ken McMullen
| 720.4 | thanks, and initial thoughts | BROKE::WATSON | one of the multiple schemers | Sun Aug 26 1990 16:36 | 27 |
> <<< Note 720.0 by VAOU02::NJOHNSON "Westcoast Wiz" >>>
> -< SYBASE Competitive Benchmark Tips >-
Thanks very much for entering this.
> o Use same environment for both benchmarks. SYBASE prefers not to run
> their benchmarks observed by Digital. Try to force SYBASE to run the
> benchmark on a Digital facility using the same machine and O/S
> environment that will be used to run the RDB benchmark. Digital should
> do the tuning for both benchmarks as the SYBASE people are not
> knowledgeable about system tuning and can create a very poor environment.
Isn't the best way to get an apples to apples comparison to let the Db
vendors tune the same hardware configuration to produce the best
environment for their database? Let SYBASE produce a poor environment.
Let us use our knowledge of VMS to produce a good environment for our
product. Then let's see who can demonstrate the fastest system.
> The order of a join in a query can affect performance in SYBASE (but not
> RDB!).
This reminded me forcibly of a comparison INGRES likes to make between
themselves and ORACLE. Substitute ORACLE for SYBASE in the above
sentence, and you have a powerful point against ORACLE. INGRES, of
course, additionally substitutes INGRES for Rdb.
Andrew.
| 720.5 | Control the benchmark! | VAOU02::NJOHNSON | Westcoast Wiz | Thu Aug 30 1990 07:50 | 27 |
Michael, good to see that the move to Chicago has not affected your
subtle sense of humour. I miss that dry wit! How about a quarterly
NewsBooth?
Ken, my boss does not consider me to be a major database type of
person, so I am not going to disabuse her of the notion. Thanks for
the compliments.
Andrew, thanks for the reply. I agree that in some cases, Sybase can
hang themselves with their VMS knowledge. Unfortunately, local talent
varies considerably with Sybase, and you just might have an educated
one locally. My point was two-fold: 1- Digital should be setting the
hardware configuration so that we can make an apples-apples
price/performance comparison. 2- Later in the document I mention some
dirty tricks you can play if you are responsible for tuning. Normally,
we are a fair and above board group of people, but sometimes the
frustration with the B***S*** put out by our competitors gets to be too
much. In this case, having control of the tuning of the environment is
a great way to let off a little 'harmless' steam with a few innocent
'pranks'.
Thanks for all of your comments. Everyone else out there, please put
in your experiences and tips! A number of these points are applicable to
Oracle as well. Remember, Sybase is trying to be the next Oracle, so
keep those cards and letters coming and buy War Bonds!
Neil
| 720.6 | keep your enemies closer than your friends | JENNA::SANTIAGO | VMS and U___, perrrfect together (DTN:352-2866) | Fri Aug 31 1990 22:31 | 49 |
generally speaking, sybase is great at queries where the datacache is ~30MB;
i've seen them perform all sorts of tricks during a benchmark to not do two
things:

a) start a transaction (with BEGIN...COMMIT)
b) LOCK records (must explicitly state w/ HOLDLOCK clause in SELECT verb)

however i should point out that i also try and help sybase sell their
platform in circumstances when we could not win otherwise, given the
following conditions:

a) the customer doesn't code to dblib directly, but rather writes a jacket
   routine (or C macro) to perform all I/O operations; this allows a later
   move to SQLMOD if needed
b) stay the hell away from APT (Ape Toolset ;-)
c) balance stored procedure development with host code so as not to create
   a bottleneck in the dataserver

the points raised in .0 are several that we (my group in the NY DCC) go over
when we teach sybase (the actual, with labs, Fast Track for Programmers
<who come from a DOS environment and aren't database literate> Course). We
sprinkle in a heavy competitive emphasis as we take the approach of having
to know the third party products better than our competitor. The reasons
for this are simple:

a) the database vendor doesn't provide support or integration services among
   various h/w platforms; this they throw the vendor's way; if there is any
   resistance, they suggest another platform
b) keep the customer's application insulated from the proprietary interface
c) know how they compete and be prepared

The reaction customers get from this sort of support is that we've become
independent (as much as possible) at candidly presenting the two products.
We've even been asked to analyze and/or tune existing sybase systems BY
SYBASE, as we've developed a way to monitor the dataserver (with DECwatch -
an asset that used to be known as WHAT in the toolshed), something the
product doesn't have.
If anyone is interested, we'll probably teach it again soon; class is
limited to 20 students; for further info, mail to michaele lefferman@nyo
regarding the Sybase Fast Track Competitive Seminar; i teach it

/los
| 720.7 | I've seen version nos, date/time stamps etc in Rdb rows | CUBE3::MACKEY | ...however measured or far away... | Tue Sep 04 1990 15:59 | 25 |
> o Suggest that RDB and SYBASE both use record locking for an apple to apple
>   comparison. Ensure 'smart' locking is done for updates. SYBASE does not
>   provide record locking and they claim that most applications don't really
>   need record locking. The alternative is to do record comparisons IN THE
>   APPLICATION CODE! This means that an extra record retrieval must be done
>   before re-writing a record and checking that the original copy that we
>   have in memory has not changed. The user is then told that the record
>   changed and the transaction is aborted, after the user has typed in the
>   transaction! They must then restart the transaction in order to start
>   from scratch. This is much more complex code than would be required to
>   support record locking. The extra programming effort required can destroy
>   a development group's productivity!

The only concern I have here is the "check to see if the record has changed"
scenario is the one which, in many cases, the customer is *advised* to adopt.
Reason? Rdb does record locking, sure - but do you want to hold that/those
lock(s) across terminal I/O? I don't - and nor does the customer. In a
benchmark environment, fine - but real life can (sometimes :-) be different
from the benchmark.

Perhaps I don't understand when you say "'smart' locking ... for updates".

Kevin
| 720.8 | know thy enemy... | JENNA::SANTIAGO | VMS and U___, perrrfect together (DTN:352-2866) | Fri Sep 07 1990 04:30 | 39 |
re:-1
locking is the single biggest issue i've seen that forces the rdb software
specialist to have to do an actual database design to effectively
compete with sybase, which doesn't explicitly lock unless you say
HOLDLOCK on the SELECT verb AND you're in a BEGIN...COMMIT block;
as this is optional, the naive sybase support person (you'll note that
all sybase benchmarks are written at a sybase site by regional support
folks, then carried to the customer location) has the option to simply
ignore locking concerns, as is customary in the fantasy world called a
benchmark;
typically the hack-arounds i've seen range from using db-library calls
to timestamp/check a column for a new value, to altering a field w/ the
users' initials to leave an audit trail; it's also very customary to
add fields to the benchmark such as a record id field which resembles
an rms record's rfa (actually it is created by using a trigger to assign
an ascending (i.e., cardinality) value) so that the application doesn't
perform complex joins, but rather directly accesses the child records via
the record id kept in the parent record
it's also customary during a benchmark to load all parent/sibling
records into a host program array rather than joining;
what this all boils down to is knowing what you're competing with; a
tack i like to take is to give the customer a copy of codd & date's sql
reference manual (addison wesley) to outline how we're going to implement
the benchmark, and ask to openly discuss it w/ sybase
or anyone else present to drive towards a common/standard approach;
this only failed me one time, at citibank; however now i'm getting
calls from a local support person that the account wants them to
reference 3rd party 4gls (focus, powerhouse and smart star) as methods
to ease their 2yr backlog on sybase (yeah this product is reeeeal easy
to use...;-) i told them i'd be happy to meet with the account (same
folks who made the sybase decision) and explore their concerns with sybase
/los
| 720.9 | Optimistic <-> Pessimistic techniques | IJSAPL::OLTHOF | Henny Olthof @UTO 838-2021 | Fri Sep 07 1990 11:54 | 50 |
The technique described is known as "optimistic locking". It's
normally not a very good idea. I recently asked Phil Bernstein
about this (Phil is the architect of DECdta and of coming Rdb versions)
and I've posted his answer below.
Henny Olthof, EIS Holland
From: TPSYS::BERNSTEIN "22-Aug-1990 1135" 22-AUG-1990 17:37:44.95
To: IJSAPL::OLTHOF
CC: BERNSTEIN
Subj: optimistic locking
Henny,
Try looking in my book, "Concurrency Control and Recovery in Database Systems,"
coauthored with V. Hadzilacos and N. Goodman, published by Addison-Wesley.
Chapter 4 talks about optimistic techniques (called "certifiers" in the book).
There are lots of references there to other papers on the subject.
In general, these techniques are known to perform poorly, which is why they
aren't much used in practice.
-- Phil
======================================================================
From: TPSYS::CRL::"mrgate::""utreis::mrgate::ijsapl::olthof""" 22-AUG-1990 11:00:47.60
To: crl::bernstein
CC:
Subj: Try to be optimistic
From: NAME: OLTHOF <OLTHOF@IJSAPL@MRGATE@UTREIS@UTO>
To: Philip Bernstein@CRL,
Henny Olthof@UTO
Philip,
I attended your "Transaction Processing Internals" session in Cannes this
winter. I have a question from a customer who wants more information on
optimistic locking techniques.
What he looks for is a reference to a few pages that address:
- what is optimistic locking
- what are tradeoffs
The main reason for the interest is that they have an application that uses
optimistic locking (with Ingres?) and its performance is lousy.
Thanks,
Henny Olthof, EIS Holland
| |||||
| 720.10 | benchmark .ne. reality | JENNA::SANTIAGO | VMS and U___, perrrfect together (DTN:352-2866) | Fri Sep 07 1990 23:11 | 14 |
i think i'm getting my point across; as a benchmark can be a totally
bogus activity, bringing reality into the picture can sometimes hurt
you if all you have to say about someone else's benchmark is
bad-mouthing how it was run;
the problem is that their product gets sold and WE are forced to fix
it as it runs on our hardware and we're squeezed into a corner as the
3rd party database can opt for a migration to another hardware
platform;
now i'm not saying not to present the facts to the customer; it's just
that there's a danger in doing it
/los "who can't wait for a global buffer cache..."
| |||||
| 720.11 | Locking - Your Friend, not Foe! | VAOU02::NJOHNSON | Westcoast Wiz | Wed Sep 19 1990 07:16 | 55 |
re .7
I think that people get confused about the purpose of locking in the
first place. There seems to be this horror about holding a lock while
a person is doing terminal activity. The problem is, that if you want
to maintain application integrity and maintain a certain 'fairness' in
TP then you do have to hold record locks.
The analogy I use for my customers is to consider their application and
the various records as pieces of paper. The computer serves as a
messenger and copying service which whizzes the pieces of paper around
to the desk of each person who needs them. Now, the trick here is that each
piece of paper contains useful information, but you can only update the
original piece of paper, not a copy! Anyone is welcome to have a
photocopy made, but you must wait until you are given the original
before you can update it. Now, try to imagine a system using this
analogy where you are shown a particular piece of paper, it is then
rushed away again to someone else, you do some calculations at your
desk, ask for the piece of paper back, see if it was the same as before
it left, find out it wasn't and have to discard all of your
calculations.
If the above scenario has to do with inventory, pharmacy, car rentals,
airline reservations or anything else where information is updated from
multiple sources by a number of people, you can see that a person needs
to keep a firm grip on a piece of paper to avoid invalidating their
work, or embarrassing themselves in front of a customer who is trying to
rent a car!
I feel that this 'opportunistic' locking strategy is largely dependent
on luck! Most customers ignore the fact that they must develop a
recovery strategy for each point in the code where there is contention.
This leads to not only more overhead in production but also much longer
development times. If they don't do the work, they could be very
embarrassed (or sued!) when their applications hit the real world.
In terms of locking, in many cases you can delay locking until just
before you read the record. This is primarily true for applications
where you can estimate thresholds of depletion (i.e., if an inventory
item has a lot of stuff in stock, then don't lock it, just re-read it
and update it later). In the case of ascending sequences, you should
not read the 'Counting' record until just before you store a record and
perform a commit.
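The 'Counting' record rule above can be sketched as follows. This is a
minimal sketch only, assuming a hypothetical order counter, with sqlite3
standing in for Rdb: all the slow work (terminal IO, validation) happens
before the counter is touched, so it is locked only for the instant
between the read and the commit.

```python
import sqlite3

# Hypothetical counter record plus the table whose rows it numbers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counter (name TEXT PRIMARY KEY, next_id INTEGER)")
conn.execute("INSERT INTO counter VALUES ('orders', 1)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")

def store_order(conn, item):
    # All user interaction is finished by the time we get here; the
    # counter is read, bumped, and committed in one short transaction.
    with conn:  # BEGIN ... COMMIT
        (next_id,) = conn.execute(
            "SELECT next_id FROM counter WHERE name = 'orders'").fetchone()
        conn.execute("UPDATE counter SET next_id = next_id + 1 "
                     "WHERE name = 'orders'")
        conn.execute("INSERT INTO orders VALUES (?, ?)", (next_id, item))
    return next_id

store_order(conn, "widget")
store_order(conn, "gadget")
```

The counter record is the hottest spot in the database, so shrinking the
window in which it is held is what keeps the rest of the users moving.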
I would also ask the customer if locking the record for the duration
doesn't make more sense. After all, if they were still in a paper
office, they would still have to wait to get that piece of paper. A
computer does not change the fact that we need to work with a
particular instance of data which cannot change while we are using it.
It makes more sense to lock the record for a while (perhaps with a
screen timeout) than to force a user to re-key large amounts of data
when their transaction is rejected because one record changed while
they weren't looking.
Neil
| |||||
| 720.12 | y | ALICAT::CLEARY | A deviant having fun..." | Wed Sep 19 1990 08:57 | 7 |
I'm told that Sybase doesn't support SQL cursors - strange but true -
and that this means that it can only update a single record in each
transaction. You can probably use this to advantage if you are helping
to design the benchmark.
regards,
-mark
| |||||
| 720.13 | non-sequitur | LACKEY::HIGGS | SQL is a camel in disguise | Wed Sep 19 1990 17:04 | 8 |
> I'm told that Sybase doesn't support SQL cursors - strange but true -
> and that this means that it can only update a single record in each
> transaction. You can probably use this to advantage if you are helping
> to design the benchmark.
I don't know anything about SYBASE, but the lack of SQL cursor support doesn't
lead to the conclusion that you can only update a single record in each
transaction. You can do more than a single SQL statement in a transaction.
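For example, a single transaction can carry several UPDATE statements
with no cursor in sight. A minimal sketch (sqlite3 standing in for the
server; the accounts table is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])

# Two updates, one atomic transaction -- no cursor involved.
with conn:  # BEGIN ... COMMIT
    conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")
    conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 2")

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```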
| |||||
| 720.14 | Locks are nice but dangerous | KERNEL::JACKSON | Peter Jackson - UK CSC | Tue Sep 25 1990 14:05 | 12 |
Re .11
Holding database locks across terminal activity can cause serious problems.
I have seen it happen. You do not have complete control over what
is locked, so it is not safe to leave a (non-snapshot) transaction
open when doing anything that involves an indefinite delay.
An alternative to checking that nothing has changed, is to take
out your own locks (easy to do on VMS) for just the thing that
needs to be locked.
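The idea of taking out your own locks can be sketched like this -- a
minimal sketch only, assuming a hypothetical per-resource lock table,
with Python's threading.Lock standing in for the VMS lock manager:

```python
import threading
from collections import defaultdict

# Application-level locks, one per named resource (e.g. the record the
# user is editing), instead of database locks held across terminal IO.
_locks = defaultdict(threading.Lock)

def with_resource_lock(resource_name, work):
    lock = _locks[resource_name]
    # Bounded wait, so one abandoned terminal can't hang the system.
    if not lock.acquire(timeout=5):
        raise TimeoutError(f"{resource_name} is busy, try again later")
    try:
        return work()   # database transaction + terminal IO go here
    finally:
        lock.release()
```

The database transaction itself stays short; only this narrow,
application-chosen resource is held while the user thinks.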
Peter
| |||||
| 720.15 | Not locking is not nice and also dangerous! | VAOU02::NJOHNSON | Westcoast Wiz | Thu Nov 08 1990 07:13 | 14 |
Peter, I have to strongly disagree; the whole point in locking is to
guarantee transaction consistency. If you cannot provide a mechanism
to do this, then you are forcing the customer to write horrendous
amounts of code (including 'integrity' checkers) just to make sure that
they didn't double update something. Please elaborate on what you mean
by leaving a non-snapshot transaction open. A statement that 'Locks
are nice but dangerous' plays directly into the Sybase and Oracle way
of thinking.
To me, if you are not locking, or are having problems with locks, it is
bad design! And somewhere, sooner or later, it is going to come back
and byte you!
Neil
| |||||
| 720.16 | Terminal IO allows for indefinite delays | KERNEL::JACKSON | Peter Jackson - UK CSC | Tue Nov 13 1990 17:34 | 34 |
Re .15
I agree that not locking at all is a very bad idea, and that locks
are needed to ensure transaction consistency.
However, I have many times seen entire systems hung, because one user
has left her terminal while in the middle of a transaction, and it was
holding locks on the database. If you had had to sort out several of
these, you would have a horror of holding database locks across
terminal IO.
Where I used to work, we rewrote all the programs that did terminal IO
while in the middle of a transaction. This required a lot of work, but
the reduction in problems made it worthwhile. The extra effort required
to write the code so as not to hold *database* locks across terminal IO
is worth it, because of the easier maintenance once the system is
running.
In my experience, horrendous amounts of code are not needed. One
technique, which corresponds to the piece of paper analogy used
in an earlier reply, is to use non-database locks to lock a resource
corresponding to the piece of paper, and hold that across both the
database transaction and terminal IO. The technique we used to use
was to generate an error if something that we had used earlier had
changed in an incompatible manner. This would force the user to
do the work again, but because there normally was a real piece of
paper (a barcoded label in our case) this was extremely rare and
when it did occur it always turned out to be a problem in the way
the system was being used.
Peter
| |||||