
Conference wonder::turbolaser

Title:TurboLaser Notesfile - AlphaServer 8200 and 8400 systems
Notice:Welcome to WONDER::TURBOLASER in its new home shortly
Moderator:LANDO::DROBNER
Created:Tue Dec 20 1994
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1218
Total number of notes:4645

1154.0. "Cache X-access penalties vs MPP ?" by SPANIX::JULIANR (ALPHAbet = Our bet on ALPHA) Mon Mar 31 1997 10:34

	Could any of you spare some of your time to help me?

	We are in the process of bidding a large TL cluster to support a 1 TB
data warehouse using OPS. Our competitors are IBM SP2 and Siemens Nixdorf
Reliant (Pyramid) servers.

	The customer is asking:

	- How costly would it be, in terms of performance penalty, for one
CPU to access another CPU's cache, either within a single system or
between two TruCluster members?
	- How does that performance penalty compare against our competitors'
hybrid SMP/MPP servers?

	Thanks in advance,

Julián Rodríguez, UNIX Ambassador
Digital Spain
T.R      Title   User            Personal Name    Date                    Lines
1154.1           SX4GTO::OLSON   DBTC Palo Alto   Thu Apr 03 1997 14:33      43
    >	- How costly would it be, in terms of performance penalty, for one
    > CPU to access another CPU's cache, either within a single system or
    > between two TruCluster members?
    
    When you talk about databases (OPS on TruCluster in your example), it
    is important to define what you mean by cache.  To a database, the
    cache is the buffer pool in memory, as opposed to the rest of the
    data out on disk somewhere.  We're NOT talking about accessing the
    CPUs' onboard caches.
    
    An Oracle instance on a node in a TruCluster keeps its database cache
    in shared memory, so all CPUs in that instance/on that node can
    access it equally.  No performance penalty.
    
    An Oracle instance on another node in the TruCluster, if it needs
    access to data in the first instance's cache, must reach it via the
    TruCluster-provided communications mechanisms: over Memory Channel,
    ultimately, using the Distributed Lock Manager and some aspect of
    BSS/BSC (the Block Shipment Server and Client pieces of DRD; I'm not
    intimately familiar with this part of the mechanism).  This will be
    slower than accessing the shared memory on the same node.  How much
    slower depends on a number of factors, mainly how much MC bandwidth
    is consumed by other DLM and BSS traffic, which in turn depends on
    partitioning the data to keep as much of it as possible localized to
    one particular instance, so that these internode transfers don't have
    to happen most of the time.
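    
    Purely as a way to think about that trade-off, here is a rough sketch
    in Python.  The function name and the latency figures are illustrative
    assumptions of mine, not measured Memory Channel, DLM, or DRD numbers:
    
    def effective_block_access_cost(local_fraction,
                                    local_cost_us=1.0,      # assumed: hit in local shared memory
                                    remote_cost_us=500.0):  # assumed: fetch from a remote node's cache
        """Blended cost per block access, given how well the data is
        partitioned.  local_fraction is the share of accesses satisfied
        from the local instance's buffer cache; the rest must be fetched
        from the cache of an instance on another cluster node."""
        return (local_fraction * local_cost_us +
                (1.0 - local_fraction) * remote_cost_us)
    
    # Better partitioning (more locality) keeps the blended cost close to
    # the local shared-memory case:
    for f in (0.99, 0.95, 0.80):
        print(f, effective_block_access_cost(f))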
    
    >	- How does that performance penalty compare against our competitors'
    > hybrid SMP/MPP servers?
    
    That's an evolving science.  In the general case, we talk about
    scalability: measure performance (X) on one node, compare it to
    performance (Y) on N nodes; if Y = N*X, you've achieved 100%
    scalability, and you ask the other guys what they can do.  Our
    scalability in specific benchmarks so far ranges widely; one year
    ago, at the OPS announcement, TPC-C numbers on a 4-node cluster were
    roughly 30K, compared to 11K for a single node.  100% scalability
    would have been 44K for 4 nodes, so we achieved 30K/44K, or a little
    better than 67% (I repeat: a year ago, on this one application).
    Each application must be carefully tuned.  This area is getting a
    lot of attention right now in OPS' case, and I'm not one of the
    people doing the work, so I'm not in a position to comment further.
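    
    Just to spell out that arithmetic (the figures are only the ones
    quoted above; the little helper below is my own illustration, in
    Python):
    
    def scalability(single_node_perf, n_nodes, cluster_perf):
        """Fraction of perfect linear scaling achieved: Y / (N * X)."""
        return cluster_perf / (n_nodes * single_node_perf)
    
    # OPS TPC-C example above: ~11K tpmC on one node, ~30K tpmC on a
    # 4-node cluster; perfect scaling would have been 44K.
    print(scalability(11000, 4, 30000))   # ~0.68, a bit better than 67%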
    
    DougO