Title: | DEC Rdb against the World |
Moderator: | HERON::GODFRIND |
Created: | Fri Jun 12 1987 |
Last Modified: | Thu Feb 23 1995 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 1348 |
Total number of notes: | 5438 |
This is an unofficial guide to competing with SYBASE, particularly in the area of benchmarks. I would like to produce a more complete SYBASE Competitive Guide, so I would appreciate it if anyone who has competitive or benchmarking tips not mentioned here would add them in replies to this note. I will try to compile everything I get into a short VAX Document guide. Any corrections or suggestions on these or other topics would be welcome, as I make no claim to absolute perfection, accuracy or any other saintly qualities.

Note that a lot of competitive information is contained elsewhere in this conference (see note 58.*). The information that I have placed here results from various competitive situations that I have had with SYBASE. Note that I did not win all of these situations, and some of my hard-earned scars are included here. After having reviewed the SYBASE documentation and the SYBASE optimization course, I am more convinced than ever that RDB is the superior product, for more reasons than are mentioned here.

1. Control of the Benchmark and the Environment

It is important to try to control the substance of the benchmark and the conditions under which the benchmark is run. While Digital normally seems to come from behind in these contests (we always seem to find out about the benchmark after SYBASE or Oracle has written it), we can have an impact on the benchmark by understanding the client's requirements and relating them to the benchmark. In many cases the benchmark that is proposed only vaguely reflects a client's true production environment.

The following points suggest features and strengths of RDB that a benchmark should emphasize. In most cases there are valid reasons, relating back to a client's environment, that should be exploited. For most of these suggestions the name of the game is to minimize the number of I/O operations performed by RDB while increasing the number done by SYBASE. Fortunately, this is not that difficult to do.
However, SYBASE makes up for the minimal 'smarts' in their database engine by using a very large data cache to cover a multitude of sins. In almost all TP and interactive benchmarks, then, the goal is to minimize RDB's I/O requirements while forcing SYBASE to become I/O bound and minimizing the effects of their cache.

o Suggest that RDB and SYBASE both use record locking for an apples-to-apples comparison. Ensure 'smart' locking is done for updates. SYBASE does not provide record locking, and they claim that most applications don't really need it. The alternative is to do record comparisons IN THE APPLICATION CODE! This means that an extra record retrieval must be done before re-writing a record, to check that the original copy that we have in memory has not changed. The user is then told that the record changed and the transaction is aborted, after the user has typed in the transaction! They must then restart the transaction in order to start from scratch. This is much more complex code than would be required to support record locking. The extra programming effort required can destroy a development group's productivity!

o Try for multi-record retrievals. Retrievals by a key that is used to store a record can be done more quickly by RDB. SYBASE has a fixed page size, which means that if more records are retrieved than fit on a single page, extra I/Os are forced.

o Use the same environment for both benchmarks. SYBASE prefers not to run their benchmarks observed by Digital. Try to force SYBASE to run the benchmark on a Digital facility using the same machine and O/S environment that will be used to run the RDB benchmark. Digital should do the tuning for both benchmarks, as the SYBASE people are not knowledgeable about system tuning and can create a very poor environment.

o Restart the SYBASE DB server from scratch for each test.
For 'fairness', shut down the RDB monitor and restart it again for each benchmark run. This technique will help to avoid SYBASE data caching between benchmarks. SYBASE will usually protest this action for the reason just mentioned, saying that 'SYBASE needs some time to pagefault everything into memory to run fast'. Use the next point to avoid SYBASE protests.

o Record timings only when the benchmark has reached a steady state. RDB can take a significant hit when loading the metadata during a bind to the database. Use one initial fetch for each relation to load metadata for RDB and to initialize SYBASE. This benefits us more than it does them. It also more closely matches the situation when a real production (TP) environment is run.

o Try not to allow a remote client/server arrangement to be used if it doesn't make sense for the customer. SYBASE loves to use a front end to drive the benchmark while using the back end to run the database. Though the extra network overhead incurred usually benefits us (especially if you connect the front end and the back end using asynchronous DECnet!), it is much more difficult to measure overall resource utilization and performance using this model.

- Use ACMS if necessary for an even comparison. This does require some significant re-writing, but it can be very useful to show the client how a true TP environment functions. It does require focusing the customer on their application, not on the benchmark.

- Try to use SMP and clusters. SYBASE does not like these, even with their 'Virtual Server'. This has not really been shown to work in a VAX SMP environment, though they claimed they had it running on a VAX 6310 SMP machine (ha ha). As SYBASE uses a single database server, it is very difficult for it to make use of an SMP machine. It is possible for them to run in clusters, as they treat the nodes as separate machines.
- Try to use a single system for the front- and back-end processes, as SYBASE has problems because their server runs at a higher priority. This can spoil response times for processes connected to the server in a high-throughput environment.

o Consider making cluster fail-over times a component for testing. SYBASE must run a 'semi-hot' server on the second node with all of its memory preallocated. SYBASE can detect the first node going away, but there is a fairly wide window for SYBASE to experience data corruption and integrity problems due to their cache not flushing at the right time.

o Have the customer specify that overall resource utilization should be considered as one of the benchmark criteria. SYBASE typically sucks up much more of the system's resources in terms of memory (when few jobs are running), CPU and disk space! Make sure that the customer considers all of these aspects as part of the DB benchmark results.

o Try to specify background reporting jobs that 'sweep' through large portions of data. This tends to negate SYBASE's data caching, as it forces the cache to be flushed and reloaded with other data.

o Specify a small recovery time interval to force frequent flushing of the SYBASE journal. (I need to check this point as I have not verified it.)

o Go for large numbers of spindles and large databases. The larger the database, the lower the chance that SYBASE caching will be effective. A large database with hash keys can beat data caching almost any day.

o Sequential passes through the database benefit us (even with updates), as we can choose a page size that loads large amounts of data in a single read. SYBASE has severe problems with this due to their fixed page size.

o Minimize range retrieval so that hashing can be used.

o Try for large, empty records with lots of 'white space' if multi-record retrieval is done, as this allows us to use record compression to put more records per page.

o Help write the benchmark code, if possible.
o Use SQL modules, as this makes it faster to compile and link 3GL programs. Embedded SQL sucks!

o Minimize the number of iterations that a single transaction is run, to minimize the effects of SYBASE data caching. On a recent benchmark I forgot to prevent this and SYBASE asked for 30 iterations of a fairly long query. Over the 30 iterations I calculated that roughly 60+% of the relation was able to cache itself in memory. The result was that SYBASE had faster response times, but much higher CPU utilization. If we had limited the number of iterations to 5, the effects of caching would have been negated.

o Make sure that all transactions are started and finished with a COMMIT. SYBASE can perform database activities without starting a transaction; these look roughly like file I/O. On the benchmark mentioned above, the customer was also doing updates, inserts and deletes and then rolling back immediately afterward. This does not test the database much! Both the RDB and the SYBASE benchmarks should START transactions and COMMIT them explicitly for an 'apples to apples' comparison. In addition, try to avoid the SYBASE partial commit. This allows commits of some data to be performed part way through a transaction while still keeping everything else inside a single transaction.

o Minimize the assignment of sequentially incrementing keys. This tends to impact performance because one record is locked all of the time. Either ask for blocks of counters to be allocated or use some other method of generating a unique record identifier. Note that SYBASE uses a technique called Pre-emptive Key Splitting in this kind of situation to avoid locking indexes during updates. This means that if a locked B-Tree node is encountered, the node will be split by SYBASE, leaving a locked and an unlocked half. The unlocked half is then updated with the new record key.
This can lead to longer I/O chains on SYBASE, but it does minimize the bucket locking that will occur when multiple processes are storing records with sequentially incrementing keys.

2. Analysis

o Access paths and frequency
  - Look for ways to hash keys for clustered retrievals.
  - Delete unnecessary keys.

o Review all access paths used in the transactions. Look for frequently used key paths and optimize. Look for hot spots.

o Determine all relations which are read-only so that they can be placed in a storage area that is R/O, to minimize locking.

3. Setup

o Record Compression
  - If retrieving multiple records and there is a large amount of 'white space' (repeating characters) in a record, use compression.
  - If there is minimal or no white space, or only single records are being retrieved, turn off compression.
  - If CPU time is an issue, turn off compression.
  Note: SYBASE does not have record compression. For sequential multi-record retrieval we can save significantly on I/O by using record compression. However, this does increase RDB's CPU utilization.

o Multi-Spindle Storage Areas and Partitioning
  - If large numbers of processes are doing single-record retrieval, split the relation between multiple storage areas to minimize spindle contention.
  Note: SYBASE now provides record partitioning, but they can only divide a table randomly between table files; they cannot partition by key value.

o Hash Keys
  - For single-record retrieval by a primary key, or for a join with another relation using a foreign key.
  - For high insertion rates with a sequentially incrementing key.
  - When range or sequential retrieval in a sorted order is required, do not use hash keys, or provide a separate sorted index. Note that a sequential index does not work well with the previous situation; it can cause significant performance bottlenecks.

o Joins
  - In some benchmarks a multiple join may not specify retrieval of any values from secondary relations.
RDB can take advantage of this by using sorted indexes with all keys used in the query as part of the index. In these situations, the query optimizer will perform an index-only retrieval and avoid reading the record at all.

Note: The SYBASE query optimizer is not 'too bright' and will read all records in a join, even if no values are retrieved.

Eg.  Relation:      A                  B
     Fields:        FLDA               FLDA
                    FLDB               FLDB
                    FLDC               FLDD
     Cardinality:   100                1000
     Index:         A_IX               B_IX
                    FLDA (hashed)      FLDA (sorted)
                                       FLDB

     Select A.FLDA, A.FLDB, A.FLDC     [results in 5 As and
       from A, B                        50 Bs being read]
       where A.FLDA = B.FLDA
         and A.FLDA between 10 and 14

If 5 records fit on a page and are stored in FLDA order, RDB will perform this query in 3 I/Os (1 hash read on A and 2 index reads on B). SYBASE will perform this in roughly 15 I/Os (2 index reads on A, 1 record read on A, 2 index reads on B, 10 data reads on B). This example assumes no data caching, so in reality it would usually require fewer I/Os on an active system.

The order of a join in a query can affect performance in SYBASE (but not RDB!). Specifying inefficient orders for selection will slow down SYBASE (unless you can convince the client that SYBASE is not allowed to change the SQL code!), as they need to manually determine optimal queries to take advantage of cardinality and indexes. A query that specifies a less selective retrieval range first can force SYBASE to perform a much higher number of I/O operations. Note that all appropriate fields should be indexed in the query.

Eg.  Select * from Personnel
       where SEX = 'F'
         and AGE between 20 and 32
         and START_DATE < 5-JUL-1988

With 1000 employees the following key cardinalities might be found:

     SEX         2 (M/F)
     AGE         40
     START_DATE  1000

RDB would check the above and would probably use START_DATE to qualify the selection. SYBASE would use SEX, as it was specified first.
'Unoptimizing' the query order is a minor irritant to SYBASE unless they have a less experienced person running the benchmark, in which case it becomes an easy win (if there is any such thing)! You may also want to be a bit more 'subtle' in your 'unoptimization' and put AGE first, which causes more I/O. Even if SYBASE suggests re-ordering the SQL statements, you can be generous and agree to the change. Then point out to the client that RDB will optimize the query regardless of the order because of a much more intelligent query optimizer. It's good for a few brownie points.

o Page Size
  Choose a page size large enough to hold a set of range records with a single I/O operation. If high update/write rates are expected (with minimal range retrievals), choose a page size that will minimize locking conflicts.
  Note: SYBASE has a fixed page size of 2048 bytes. This can cause fragmented records and much higher numbers of I/Os when the benchmark is properly planned.

o Network Packet Size
  If you are tuning the machine for the benchmark which SYBASE is going to run, AND you are not using RDB in a client/server arrangement, THEN choose a very small DECnet packet size (say 128). This will increase the network overhead considerably, and in most cases the SYBASE people do not know enough to check this parameter. This will increase their CPU overhead considerably without giving any indication why.

o Journalling
  Spread the Run Unit Journal over multiple drives to minimize disk contention. Avoid after-image journalling, as we seem to take more of an impact on this due to locking and contention for the journal file. SYBASE also seems to take a fairly liberal view of journalling and does not seem to rigidly synchronize journal flushes with updates. In addition, a COMMIT may return a success status to the user while the pages are still being flushed back to the database.

o R/O Areas
  Use read-only areas wherever possible to minimize locking overhead.
  Make the system area read-only to avoid updates to cardinality. Note that if cardinality will change radically during a benchmark, then an R/O system area should not be used. Note that you could also use pseudo disks for R/O areas to increase performance and to minimize disk contention.

o Use Areas
  - When reserving a relation that is spread across multiple areas (for performance), it appears to be faster not to specify a reserving list. Normally, when a reserving clause is specified, locks will be taken out on all of the areas. If no reserving clause is specified, only the area that is being touched will have a lock taken out on it. Play with this to see if it reduces CPU overhead.
  - Use SPAM pages with low intervals for areas that have high update rates.

o Locking
  Leave lock granularity enabled unless it is a very small database. Ensure that the benchmark does record locking for updates. For SYBASE this means using the 'Apply lock holding' clause in the SQL statements.
  Note: SYBASE does locking at the page level, not at the record level. Forcing SYBASE to use locking can create much higher contention levels than for RDB, which locks at a record level.

o Images
  Install all images to conserve memory. RDB can be fairly memory intensive compared to SYBASE, so every bit counts.

Things to watch out for:

1) SYBASE BS: They are usually not that familiar with VMS and will use gobbledygook to justify a position. E.g. (taken from a public notes file on BITnet):

   Q: Why does SYBASE consume more CPU than other competing products?
   A: SYBASE runs only in user mode, which is visible. Other products use executive and kernel modes, which are not charged to the process and are not visible. If you measure overall CPU utilization, SYBASE is roughly equivalent to other products.

   RESPONSE: In reality, SYBASE in some cases uses twice the CPU of RDB. This is fairly obvious BS to us, but not to customers.
   Keep probing the customer for these kinds of misconceptions and point out the fallacies in their statements. This may start to shake their faith in SYBASE's 'wunderkind' technical reputation. SYBASE says that their goal is to become as good a marketing company as Oracle. Note that they are not stressing technical excellence! Point this out to customers.

2) SYBASE claims that VMS record locking is too expensive. SYBASE means never having to say 'record lock'! SYBASE has published articles purporting to show how expensive VMS locking is. The articles revealed a very incomplete knowledge of how VMS locking works.

   RESPONSE: So how come SYBASE has added locking (albeit at a page level) to V4.0?

3) SYBASE claims the fastest benchmarks. TP1 etc. 'Designed for performance'.

   RESPONSE: Standard line about TP1 not being a true, measurable, repeatable benchmark. Throw in suspicion about one vendor's ability to fairly benchmark a competitor's products.

4) SYBASE claims that 'stored procedures are faster than embedded procedures'.

   RESPONSE: In fact, they are faster for SYBASE, as they reduce network I/O. SYBASE must use more network calls to send the SQL statements to the server. Trying to apply this argument to RDB is specious, as RDB uses DSRI code, which is much faster than interpreting SQL statements.

5) Server Size
   SYBASE will usually try to allocate as much memory as possible to the server. Try to achieve an equivalent memory usage (if it is to our advantage). RDB processes like to have about a megabyte of memory to run comfortably, of which about 300 to 600 pages will be shareable. If the customer wants 10 processes to run, then you should request that both Digital and SYBASE run in approximately 8 MB of memory. This forces SYBASE to allocate only 6-7 MB to the server process. You must make sure that the customer recognizes this as an apples-to-apples comparison and validates that SYBASE is running with the right amount of memory.
   If a hundred processes are to be run, propose ACMS (if possible) instead, and try to apply the same arguments about total memory usage. Do not bring up the arguments about memory unless you are sure that it will benefit us and harm SYBASE.

6) 4GL Benchmarks
   This can be a real problem, as nobody's 4GL performs well. One of my clients created an application using APT-FORMS and estimated they required two VUPs per user to run it. They tried to tell clients that each clerk would require a Sun workstation to run a financial application. They are now using DECforms. We also benchmarked Rally vs. APT-FORMS and Rally came out a bit faster (mostly on initial form display). This was only after the application reached a steady state.

7) Cost of Ownership
   SYBASE will insist on doing benchmarks in a client/server mode. Total memory will usually be quite high. Also, the overall CPU and resource utilization should be measured to provide equivalent costs of ownership. A benchmark that runs fast but eats your whole CPU may not be too cost effective. Sybase also tends to take more disk resources in terms of spindles and capacity. One reason for this is that Sybase tends to be very poor at reclaiming disk space that it uses. In the event of a B-Tree node split, an additional 8 pages of disk space are allocated. If more records are written to the relation, then additional index nodes might occupy this space; otherwise it is wasted. Have the customer compare the full disk space utilization after the benchmark (if you think you are using less space).

8) Failover
   The need to be able to continue processing transparently on other nodes when a single node dies should be stressed. On SYBASE, a 'hot backup' must be left running on another node in the cluster. This backup process needs its full complement of memory preallocated. This increases the cost of redundancy significantly!

9) VAX/VMS
   Sybase tends to concentrate on the UNIX marketplace.
   Therefore their promises in the VMS market tend to be weak. They tend to release products on UNIX first, with VMS an afterthought. In addition, Sybase does not run well on VAXclusters. It does not like a clustered environment and will only run a database on a single node. Therefore it does not take advantage of the scalability of a cluster. The Virtual Server concept that Sybase says will handle SMP does not seem to be shipping yet. One of my clients says he saw it running on a VAX 6310 last October (89) at a SYBASE conference. When it was pointed out that this was a single-board CPU, he looked puzzled. SYBASE sells futures heavily!
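The 'blocks of counters' suggestion in section 1 (avoiding a hot, always-locked counter record when assigning sequentially incrementing keys) can be sketched as follows. This is a generic illustration in Python, not tied to Rdb or SYBASE; the class name, block size, and the in-process lock standing in for the database lock on the counter record are all invented for the sketch:

```python
import threading

class BlockIdAllocator:
    """Hands out unique ids by reserving blocks of counters.

    Instead of locking the shared counter record on every insert
    (the classic hot spot with sequentially incrementing keys),
    each worker reserves a block of `block_size` ids in one short
    critical section, then assigns from its private block with no
    further contention.
    """

    def __init__(self, block_size=100):
        self.block_size = block_size
        self.next_block_start = 0
        self.lock = threading.Lock()     # stands in for the DB lock on the counter record
        self._local = threading.local()  # each thread keeps its own reserved block

    def _reserve_block(self):
        with self.lock:  # the only contended operation
            start = self.next_block_start
            self.next_block_start += self.block_size
        self._local.current = start
        self._local.limit = start + self.block_size

    def next_id(self):
        if getattr(self._local, "current", None) is None or \
           self._local.current >= self._local.limit:
            self._reserve_block()
        value = self._local.current
        self._local.current += 1
        return value

alloc = BlockIdAllocator(block_size=100)
ids = [alloc.next_id() for _ in range(5)]
print(ids)  # → [0, 1, 2, 3, 4]
```

The trade-off is the same one the note implies: ids are unique but not dense across workers (a crashed worker abandons the rest of its block), which is usually acceptable when the identifier only needs to be unique.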
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
720.1 | Excellent ! | TRCO01::MCMULLEN | Fri Aug 24 1990 18:27 | 9 | |
Neil,

Excellent guide. It is nice to see all these tips on paper. I hope your boss knows the quality and effort you put into this.

Does anyone want to do the same for the rest of the competition - ORACLE, INGRES, etc.?

Ken | |||||
720.2 | Competitors? | POBOX::BOOTH | Bo knows MUMPS | Fri Aug 24 1990 22:36 | 13 |
But, Ken, to write the same for Oracle would interfere with our Oracle Financials strategy (as well as our "Oracle for one-stop data management shopping" strategy as shown in the software price book). If we write about Ingres, what would that do to ULTRIX/SQL? Gee, the Sybase writeup might hurt our Sybase Security strategy for government.

You act as if these products are competitors. Surely you know that these products sell VAXes. How could they be competitors if we have all these agreements with them?

With_tongue_firmly_planted_in_cheek ---- Michael Booth | |||||
720.3 | Cheek | TRCO01::MCMULLEN | Fri Aug 24 1990 23:22 | 9 | |
Michael,

But whose cheek? Also, the next note contains some more cynical expressions about Digital's agreements - is there field confusion?

Ken McMullen | |||||
720.4 | thanks, and initial thoughts | BROKE::WATSON | one of the multiple schemers | Sun Aug 26 1990 17:36 | 27 |
> <<< Note 720.0 by VAOU02::NJOHNSON "Westcoast Wiz" >>>
> -< SYBASE Competitive Benchmark Tips >-

Thanks very much for entering this.

> o Use same environment for both benchmarks. SYBASE prefers not to run
> their benchmarks observed by Digital. Try to force SYBASE to run the
> benchmark on a Digital facility using the same machine and O/S
> environment that will be used to run the RDB benchmark. Digital should
> do the tuning for both benchmarks as the SYBASE people are not
> knowledgeable about system tuning and can create a very poor environment.

Isn't the best way to get an apples to apples comparison to let the DB vendors tune the same hardware configuration to produce the best environment for their database? Let SYBASE produce a poor environment. Let us use our knowledge of VMS to produce a good environment for our product. Then let's see who can demonstrate the fastest system.

> The order of a join in a query can affect performance in SYBASE (but not
> RDB!).

This reminded me forcibly of a comparison INGRES likes to make between themselves and ORACLE. Substitute ORACLE for SYBASE in the above sentence, and you have a powerful point against ORACLE. INGRES, of course, additionally substitutes INGRES for Rdb.

Andrew. | |||||
720.5 | Control the benchmark! | VAOU02::NJOHNSON | Westcoast Wiz | Thu Aug 30 1990 08:50 | 27 |
Michael, good to see that the move to Chicago has not affected your subtle sense of humour. I miss that dry wit! How about a quarterly NewsBooth?

Ken, my boss does not consider me to be a major database type of person, so I am not going to disabuse her of the notion. Thanks for the compliments.

Andrew, thanks for the reply. I agree that in some cases, Sybase can hang themselves with their VMS knowledge. Unfortunately, local talent varies considerably with Sybase, and you just might have an educated one locally. My point was two-fold:

1- Digital should be setting the hardware configuration so that we can make an apples-apples price/performance comparison.

2- Later in the document I mention some dirty tricks you can play if you are responsible for tuning. Normally, we are a fair and above-board group of people, but sometimes the frustration with the B***S*** put out by our competitors gets to be too much. In this case, having control of the tuning of the environment is a great way to let off a little 'harmless' steam with a few innocent 'pranks'.

Thanks for all of your comments. Everyone else out there, please put in your experiences and tips! A number of these points are applicable to Oracle as well. Remember, Sybase is trying to be the next Oracle, so keep those cards and letters coming and buy War Bonds!

Neil | |||||
720.6 | keep your enemies closer than your friends | JENNA::SANTIAGO | VMS and U___, perrrfect together (DTN:352-2866) | Fri Aug 31 1990 23:31 | 49 |
generally speaking, sybase is great at queries where the datacache is ~30MB; i've seen them perform all sorts of tricks during a benchmark to avoid doing two things:

a) starting a transaction (with BEGIN...COMMIT)
b) LOCKing records (must explicitly state w/ HOLDLOCK clause in SELECT verb)

however i should point out that i also try and help sybase sell their platform in circumstances when we could not win otherwise, given the following conditions:

a) the customer doesn't code to dblib directly, but rather writes a jacket routine (or C macro) to perform all I/O operations; this allows a later move to SQLMOD if needed
b) stay the hell away from APT (Ape Toolset ;-)
c) balance stored procedure development with host code so as not to create a bottleneck in the dataserver

the points raised in .0 are several that we (my group in the NY DCC) go over when we teach sybase (the actual, with labs, Fast Track for Programmers <who come from a DOS environment and aren't database literate> course). We sprinkle in a heavy competitive emphasis, as we take the approach of having to know the third party products better than our competitor. The reasons for this are simple:

a) the database vendor doesn't provide support or integration services among various h/w platforms; this they throw the vendor's way; if there is any resistance, they suggest another platform
b) keep the customer's application insulated from the proprietary interface
c) know how they compete and be prepared

The reaction customers get from this sort of support is that we've become independent (as much as possible) at candidly presenting the two products. We've even been asked to analyze and/or tune existing sybase systems BY SYBASE, as we've developed a way to monitor the dataserver (with DECwatch - an asset that used to be known as WHAT in the toolshed), something the product doesn't have.
If anyone is interested, we'll probably teach it again soon; class is limited to 20 students; for further info, mail to michaele lefferman@nyo regarding the Sybase Fast Track Competitive Seminar; i teach it /los | |||||
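The 'jacket routine' idea in condition (a) above - the application never calls dblib directly, so it can later move to SQLMOD - can be sketched like this. The sketch is in Python for brevity (the note assumes C/dblib); the class name and methods are invented, and sqlite3 merely stands in for the vendor library:

```python
import sqlite3

class DbJacket:
    """Hypothetical jacket layer: application code calls only these
    routines, never the vendor API directly, so the vendor library
    behind it can be swapped later without touching the application."""

    def __init__(self, conn):
        self._conn = conn  # vendor connection, hidden behind the jacket

    def execute(self, sql, params=()):
        # All statement execution funnels through one routine, so a
        # backend change is a change to this class only.
        self._conn.execute(sql, params)

    def fetch_all(self, sql, params=()):
        return self._conn.execute(sql, params).fetchall()

    def commit(self):
        self._conn.commit()

# Demo, with sqlite3 standing in for the vendor library:
db = DbJacket(sqlite3.connect(":memory:"))
db.execute("CREATE TABLE parts (id INTEGER, name TEXT)")
db.execute("INSERT INTO parts VALUES (?, ?)", (1, "widget"))
db.commit()
print(db.fetch_all("SELECT name FROM parts WHERE id = ?", (1,)))  # → [('widget',)]
```

The same shape works as a set of C jacket functions or macros wrapping dblib calls; the point is only that the proprietary interface appears in exactly one module.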
720.7 | I've seen version nos, date/time stamps etc in Rdb rows | CUBE3::MACKEY | ...however measured or far away... | Tue Sep 04 1990 16:59 | 25 |
> o Suggest that RDB and SYBASE both use record locking for an apple to apple
> comparison. Ensure 'smart' locking is done for updates. SYBASE does not
> provide record locking and they claim that most applications don't really
> need record locking. The alternative is to do record comparisons IN THE
> APPLICATION CODE! This means that an extra record retrieval must be done
> before re-writing a record and checking that the original copy that we
> have in memory has not changed. The user is then told that the record
> changed and the transaction is aborted, after the user has typed in the
> transaction! They must then restart the transaction in order to start
> from scratch. This is much more complex code than would be required to
> support record locking. The extra programming effort required can destroy
> a development group's productivity!

The only concern I have here is that the "check to see if the record has changed" scenario is the one which, in many cases, the customer is *advised* to adopt. Reason? Rdb does record locking, sure - but do you want to hold that/those lock(s) across terminal I/O? I don't - and nor does the customer. In a benchmark environment, fine - but real life can (sometimes :-) be different from the benchmark.

Perhaps I don't understand when you say "'smart' locking ... for updates".

Kevin | |||||
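The "check whether the record changed" pattern Kevin describes is usually implemented with a version (or timestamp) column, so the compare and the re-write collapse into a single guarded UPDATE. A minimal sketch in Python/sqlite3 (the table, column names, and amounts are invented for illustration; no lock is held across the terminal I/O):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE acct (id INTEGER PRIMARY KEY, bal INTEGER, ver INTEGER)")
conn.execute("INSERT INTO acct VALUES (1, 100, 0)")
conn.commit()

def read_for_display(conn, acct_id):
    # Read and release: nothing is locked while the user stares at the screen.
    return conn.execute("SELECT bal, ver FROM acct WHERE id = ?", (acct_id,)).fetchone()

def optimistic_update(conn, acct_id, new_bal, ver_seen):
    # The UPDATE succeeds only if nobody changed the row since we read it;
    # the version bump makes any concurrent change detectable.
    cur = conn.execute(
        "UPDATE acct SET bal = ?, ver = ver + 1 WHERE id = ? AND ver = ?",
        (new_bal, acct_id, ver_seen))
    conn.commit()
    return cur.rowcount == 1  # False means: tell the user, restart the transaction

bal, ver = read_for_display(conn, 1)
# ... terminal I/O happens here, with no database lock held ...
ok = optimistic_update(conn, 1, bal - 25, ver)
print(ok)                                         # → True: first writer wins
print(optimistic_update(conn, 1, bal - 50, ver))  # → False: stale version, must retry
```

This is exactly the trade-off in the note: the losing user finds out only after typing the transaction in, which is why it suits low-contention interactive work better than a hot benchmark table.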
720.8 | know thy enemy... | JENNA::SANTIAGO | VMS and U___, perrrfect together (DTN:352-2866) | Fri Sep 07 1990 05:30 | 39 |
re: .-1

locking is the single biggest issue i've seen that forces the rdb software specialist to have to do an actual database design to effectively compete with sybase, which doesn't explicitly lock unless you say HOLDLOCK on the SELECT verb AND you're in a BEGIN...COMMIT block; as this is optional, the naive sybase support person (you'll note that all sybase benchmarks are written at a sybase site by regional support folks, then carried to the customer location) has the option to simply ignore locking concerns, as is customary in the fantasy world called a benchmark;

typically the hack-arounds i've seen range from using db-library calls to timestamp/check a column for a new value, to altering a field w/ the user's initials to leave an audit trail; it's also very customary to add fields to the benchmark such as a record id field which resembles an rms record's rfa (actually it's created by using a trigger to assign an ascending (i.e., cardinality) value) so that the application doesn't perform complex joins, but rather directly accesses the child records via the record id kept in the parent record

it's also customary during a benchmark to load all parent/sibling records into a host program array rather than joining; what this all boils down to is knowing what you're competing with; a tack i like to take is to give the customer a copy of codd & date's sql reference manual (addison wesley) to outline how we're going to implement the benchmark, and ask to openly discuss it w/ sybase or anyone else present to drive towards a common/standard approach; this only failed me one time, at citibank; however now i'm getting calls from a local support person that the account wants them to reference 3rd party 4gls (focus, powerhouse and smart star) as methods to ease their 2yr backlog on sybase (yeah, this product is reeeeal easy to use... ;-) i told them i'd be happy to meet with the account (same folks who made the sybase decision) and explore their concerns with sybase

/los | |||||
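The trigger-assigned record-id trick /los describes - a surrogate ascending id kept in the parent so the application fetches children directly instead of joining - can be sketched in Python/sqlite3 (sqlite stands in for SYBASE here; table, trigger, and column names are all invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT, child_rid INTEGER);
CREATE TABLE child  (rid INTEGER, pid INTEGER, data TEXT);

-- The trigger assigns an ascending surrogate id on insert, much like
-- the benchmark trick described above (an rfa-like record id).
CREATE TRIGGER child_assign_rid AFTER INSERT ON child
BEGIN
  UPDATE child SET rid = NEW.rowid WHERE rowid = NEW.rowid;
END;
""")
conn.execute("INSERT INTO child (pid, data) VALUES (1, 'detail row')")
rid = conn.execute("SELECT rid FROM child WHERE pid = 1").fetchone()[0]
conn.execute("INSERT INTO parent VALUES (1, 'order', ?)", (rid,))
conn.commit()

# The application now fetches the child directly by its record id,
# sidestepping the join the benchmark would otherwise have to run:
row = conn.execute("SELECT data FROM child WHERE rid = ?", (rid,)).fetchone()
print(row)  # → ('detail row',)
```

As with any denormalization, the stored id becomes a maintenance liability outside the benchmark: every child insert, delete, or reorganization has to keep the parent's copy honest.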
720.9 | Optimistic <-> Pessimistic techniques | IJSAPL::OLTHOF | Henny Olthof @UTO 838-2021 | Fri Sep 07 1990 12:54 | 50 |
The technique described is known as "optimistic locking". It's normally not a very good idea. I asked Phil Bernstein recently about this (Phil is the architect of DECdta and coming Rdb versions) and I've posted his answer below.

Henny Olthof, EIS Holland

From: TPSYS::BERNSTEIN "22-Aug-1990 1135" 22-AUG-1990 17:37:44.95
To: IJSAPL::OLTHOF
CC: BERNSTEIN
Subj: optimistic locking

Henny,

Try looking in my book, "Concurrency Control and Recovery in Database Systems," coauthored with V. Hadzilacos and N. Goodman, published by Addison-Wesley. Chapter 4 talks about optimistic techniques (called "certifiers" in the book). There are lots of references there to other papers on the subject. In general, these techniques are known to perform poorly, which is why they aren't much used in practice.

-- Phil

======================================================================

From: TPSYS::CRL::"mrgate::""utreis::mrgate::ijsapl::olthof""" 22-AUG-1990 11:00:47.60
To: crl::bernstein
CC:
Subj: Try to be optimistic

From: NAME: OLTHOF <OLTHOF@IJSAPL@MRGATE@UTREIS@UTO>
To: Philip Bernstein@CRL, Henny Olthof@UTO

Philip,

I attended your "Transaction Processing Internals" session in Cannes this winter. I have a question from a customer who wants more information on optimistic locking techniques. What he is looking for is a reference to a few pages that address:

- what is optimistic locking
- what are the tradeoffs

The main reason for the interest is that they have an application that uses optimistic locking (with Ingres?) and its performance is lousy.

Thanks, Henny Olthof, EIS Holland
720.10 | benchmark .ne. reality | JENNA::SANTIAGO | VMS and U___, perrrfect together (DTN:352-2866) | Sat Sep 08 1990 00:11 | 14 |
i think i'm getting my point across. as a benchmark can be a totally bogus activity, bringing reality into the picture can sometimes hurt you if all you have to say about someone else's benchmark is bad-mouthing how it was run. the problem is that their product gets sold and WE are forced to fix it as it runs on our hardware, and we're squeezed into a corner as the 3rd-party database can opt for a migration to another hardware platform. now i'm not saying not to present the facts to the customer, it's just that there's a danger in doing it

/los "who can't wait for a global buffer cache..."
720.11 | Locking - Your Friend, not Foe! | VAOU02::NJOHNSON | Westcoast Wiz | Wed Sep 19 1990 08:16 | 55 |
re .7

I think that people get confused about the purpose of locking in the first place. There seems to be this horror about holding a lock while a person is doing terminal activity. The problem is that if you want to maintain application integrity and maintain a certain 'fairness' in TP, then you do have to hold record locks.

The analogy I use for my customers is to consider their application and the various records as pieces of paper. The computer serves as a messenger and copying service which whizzes the pieces of paper around to each person's desk who needs them. Now, the trick here is that each piece of paper contains useful information, but you can only update the original piece of paper, not a copy! Anyone is welcome to have a photocopy made, but you must wait until you are given the original before you can update it.

Now, try to imagine a system using this analogy where you are shown a particular piece of paper, it is then rushed away again to someone else, you do some calculations at your desk, ask for the piece of paper back, see that it is not the same as before it left, and have to discard all of your calculations. If the above scenario has to do with inventory, pharmacy, car rentals, airline reservations or anything else where information is updated from multiple sources by a number of people, you can see that a person needs to keep a firm grip on a piece of paper to avoid invalidating your work, or embarrassing yourself in front of a customer who is trying to rent a car!

I feel that this 'optimistic' locking strategy is largely dependent on luck! Most customers ignore the fact that they must develop a recovery strategy for each point in the code where there is contention. This leads not only to more overhead in production but also to much longer development times. If they don't do the work, they could be very embarrassed (or sued!) when their applications hit the real world.
In terms of locking, in many cases you can delay locking until just before you read the record. This is primarily true for applications where you can estimate thresholds of depletion (i.e., if an inventory item has plenty of stock, then don't lock it, just re-read it and update it later). In terms of ascending sequences, you should not read the 'counting' record until just before you store a record and perform a commit.

I would also ask the customer if locking the record for the duration doesn't make more sense. After all, if they were still in a paper office, they would still have to wait to get that piece of paper. A computer does not change the fact that we need to work with a particular instance of data which cannot change while we are using it. It makes more sense to lock the record for a while (perhaps with a screen timeout) than to force a user to re-key large amounts of data when their transaction is rejected because one record changed while they weren't looking.

Neil
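The "read the counting record only just before you store and commit" advice can be sketched as follows. This is a minimal illustration using SQLite from Python (the counter and order tables are invented for the example); the write lock is held only for the brief window between reading the counter and the commit, not across any user think time:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # we issue BEGIN/COMMIT ourselves
conn.execute("CREATE TABLE counter (name TEXT PRIMARY KEY, next_id INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO counter VALUES ('orders', 1)")

def store_order(conn, body):
    # Do the slow work (screen IO, validation) BEFORE touching the counter;
    # then lock, read, store and commit in one short burst.
    conn.execute("BEGIN IMMEDIATE")   # take the write lock only now
    (next_id,) = conn.execute(
        "SELECT next_id FROM counter WHERE name = 'orders'").fetchone()
    conn.execute("INSERT INTO orders VALUES (?, ?)", (next_id, body))
    conn.execute("UPDATE counter SET next_id = next_id + 1 WHERE name = 'orders'")
    conn.execute("COMMIT")            # the lock was held only for this window
    return next_id

print(store_order(conn, "12 widgets"))   # prints 1
```

Because the counter record is the hottest record in such a design, shrinking the window during which it is locked is what keeps the whole system from serializing on it.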
720.12 | y | ALICAT::CLEARY | A deviant having fun..." | Wed Sep 19 1990 09:57 | 7 |
I'm told that Sybase doesn't support SQL cursors - strange but true - and that this means that it can only update a single record in each transaction. You can probably use this to advantage if you are helping to design the benchmark.

regards, -mark
720.13 | non-sequitur | LACKEY::HIGGS | SQL is a camel in disguise | Wed Sep 19 1990 18:04 | 8 |
    I'm told that Sybase doesn't support SQL cursors - strange but true -
    and that this means that it can only update a single record in each
    transaction. You can probably use this to advantage if you are
    helping to design the benchmark.

I don't know anything about SYBASE, but the lack of SQL cursor support doesn't lead to the conclusion that you can only update a single record in each transaction. You can do more than a single SQL statement in a transaction.
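The point that a transaction is not limited to one statement is easy to demonstrate. A minimal sketch using SQLite from Python (the table and names are invented for the example) - two independent UPDATE statements, no cursor anywhere, committed as one unit:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 0)])

# Two separate UPDATE statements, no cursor involved, one atomic transaction:
conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
conn.execute("UPDATE accounts SET balance = balance + 30 WHERE name = 'bob'")
conn.execute("COMMIT")   # both updates commit (or roll back) together
```

What the lack of cursors actually costs is the ability to walk a result set and update the current row as you go - a different (and narrower) limitation than "one record per transaction".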
720.14 | Locks are nice but dangerous | KERNEL::JACKSON | Peter Jackson - UK CSC | Tue Sep 25 1990 15:05 | 12 |
Re .11

Holding database locks across terminal activity can cause serious problems. I have seen it happen. You do not have complete control over what is locked, so it is not safe to leave a (non-snapshot) transaction open when doing anything that involves an indefinite delay.

An alternative to checking that nothing has changed is to take out your own locks (easy to do on VMS) for just the thing that needs to be locked.

Peter
720.15 | Not locking is not nice and also dangerous! | VAOU02::NJOHNSON | Westcoast Wiz | Thu Nov 08 1990 07:13 | 14 |
Peter, I have to strongly disagree; the whole point of locking is to guarantee transaction consistency. If you cannot provide a mechanism to do this, then you are forcing the customer to write horrendous amounts of code (including 'integrity' checkers) just to make sure that they didn't double-update something.

Please elaborate on what you mean by leaving a non-snapshot transaction open.

A statement that 'Locks are nice but dangerous' plays directly into the Sybase and Oracle way of thinking. To me, if you are not locking, or are having problems with locks, it is bad design! And somewhere, sooner or later, it is going to come back and byte you!

Neil
720.16 | Terminal IO allows for indefinite delays | KERNEL::JACKSON | Peter Jackson - UK CSC | Tue Nov 13 1990 17:34 | 34 |
Re .15

I agree that not locking at all is a very bad idea, and that locks are needed to ensure transaction consistency. However, I have many times seen entire systems hang because one user left her terminal while in the middle of a transaction, and it was holding locks on the database. If you had had to sort out several of these, you would have a horror of holding database locks across terminal IO.

Where I used to work, we rewrote all the programs that did terminal IO while in the middle of a transaction. This required a lot of work, but the reduction in problems made it worthwhile. The extra effort required to write the code so as not to hold *database* locks across terminal IO is worth it, because of the easier maintenance once the system is running.

In my experience, horrendous amounts of code are not needed. One technique, which corresponds to the piece-of-paper analogy used in an earlier reply, is to use non-database locks to lock a resource corresponding to the piece of paper, and hold that across both the database transaction and terminal IO.

The technique we used was to generate an error if something that we had read earlier had changed in an incompatible manner. This would force the user to do the work again, but because there normally was a real piece of paper (a barcoded label in our case) this was extremely rare, and when it did occur it always turned out to be a problem in the way the system was being used.

Peter