Conference wonder::turbolaser

Title: TurboLaser Notesfile - AlphaServer 8200 and 8400 systems
Notice: Welcome to WONDER::TURBOLASER in its new home shortly
Moderator: LANDO::DROBNER
Created: Tue Dec 20 1994
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 1218
Total number of notes: 4645

703.0. "Turbolaser arch. vs NUMA" by NETRIX::"[email protected]" (Trent Brennan) Thu Mar 28 1996 01:13

T.R | Title | User | Personal Name | Date | Lines
703.1 | is the competitors NUMA story better than ours? | MARVIN::RIGBY | No such thing as an alpha beta | Thu Mar 28 1996 08:34 | 8
703.2 | Alta Vista and NUMA | NETRIX::"[email protected]" | Trent Brennan | Thu Mar 28 1996 19:24 | 15
703.3 | personal opinions... | WIBBIN::NOYCE | EV5 issues 4 instructions per meter | Fri Mar 29 1996 08:29 | 25
703.4 | some comments | STAR::CROLL |  | Fri Mar 29 1996 11:17 | 56
703.5 |  | GNROSE::HAGGERTY | S.I. PM&D, Stow MA USA | Fri Mar 29 1996 17:06 | 3
703.6 | Excellent feedback | NETRIX::"[email protected]" | Trent Brennan | Sun Mar 31 1996 19:35 | 3
703.7 |  | SMARIO::BARKER | Cracking Toast, Gromit ! | Mon Apr 01 1996 05:01 | 13
703.8 | First NUMA delivery .... | CHEFS::BUXTONJ | Jon Buxton @HHL | Mon Apr 01 1996 05:36 | 25
703.9 | NUMA vs MC | HPCGRP::MANLEY |  | Mon Apr 01 1996 15:55 | 52
703.10 | MC vs. NUMA | STAR::CROLL |  | Tue Apr 02 1996 10:22 | 28
703.11 | apples vs. oranges | NAMIX::jpt | FIS and Chips | Wed Apr 03 1996 02:25 | 21
703.12 | 252 processors on Sequent | GNROSE::HAGGERTY | S.I. PM&D, Stow MA USA | Wed Apr 03 1996 11:02 | 4
703.13 | More on Sequent's NUMA ... | HPCGRP::MANLEY |  | Wed Apr 10 1996 19:41 | 95
703.14 | still worried | WIBBIN::NOYCE | EV5 issues 4 instructions per meter | Thu Apr 11 1996 08:44 | 20
703.15 |  | HPCGRP::MANLEY |  | Thu Apr 11 1996 11:03 | 56
703.16 | Sequents messages support our messages | NAMIX::jpt | FIS and Chips | Fri Apr 12 1996 05:27 | 24
703.17 | ESIS flash | FRUST::FUKS |  | Fri Apr 12 1996 08:03 | 14
703.18 |  | WONDER::VANDOREN |  | Fri Apr 12 1996 09:02 | 16
703.19 | Unigram.X = Electronic UNIX Newsletter | MINNY::sbu011.zuo.dec.com::DOLDER | The future isn't what it used to be | Mon Apr 15 1996 07:57 | 5
703.20 | Sequent ships system to Oracle | STOWOA::HAGGERTY | SBU ASE, Stow MA USA | Mon Aug 05 1996 11:17 | 66
703.21 | Sequent to demo machine | EPS::HAGGERTY | SBU ASE | Tue Nov 05 1996 09:29 | 95
703.22 | Need some ammo to pass on! | WOTVAX::HILTON | Save Water, drink beer | Fri Jan 24 1997 15:51 | 11
    Do we have any 'official' or unofficial white papers I can pass on to
    my customer?
    
    I'm in a competitive situation, down to just Digital and Sequent.
    Sequent are pushing NUMA.
    
    Has anyone done any papers or presentations they gave to customers?
    
    Thanks,
    
    Greg
703.23 | And now SGI's Origin 2000 | BEET::GONZALEZ | Steve Gonzalez | Wed Feb 05 1997 07:29 | 8
I have a customer who has been told by SGI that the new Origin 2000
is an industry standard implementation of NUMA by SGI.

Their homepage doesn't seem to reflect this, but the specs are impressive.

Any Thoughts?

Steve
703.24 | AKA Distributed Shared Memory. | USPS::FPRUSS | Frank Pruss, 202-232-7347 | Wed Feb 05 1997 16:08 | 12
    Yes, the Origin series is a NUMA machine.  They call it distributed
    shared memory. Interestingly, if you do a search on NUMA there, it
    takes you to the Origin overviews, even though the term does not seem
    to appear on the pages discovered.
    
    Of course, for reasons which should be obvious, we shouldn't be going
    around saying NUMA can't work ;-)
    
    I'm not sure that the phrase "Industry Standard NUMA" has any meaning,
    unless you are talking about Intel's SHV boardset (DG & others...)
    
    FJP
703.25 | why is SGI lying? | MSBCS::SCHNEIDER | individually twisted | Thu Feb 06 1997 08:06 | 4
    If SGI really claimed "Industry Standard NUMA", I think you should urge
    the customer to question this.  
    
    Chuck
703.26 | NUMA question | WOTVAX::HILTON | Save Water, drink beer | Thu Feb 06 1997 08:53 | 16
    One thing I'm struggling with.
    
    I'm competing against Sequent, who are bidding a 64-processor NUMA
    'box'.
    
    This I believe is made up of various quad boards with the memory
    interlinked.
    
    The customer has a concern with us using a cluster, and Oracle
    Parallel Server, however what would happen in the NUMA world when
    either a quad board, or the interconnect failed? Would Sequent not need
    to bid OPS to counter this potential problem?
    
    Thanks,
    
    Greg
703.27 | NUMA ain't all it's crocked up to be ... | HPCGRP::MANLEY |  | Thu Feb 06 1997 09:36 | 20
    
>    The customer has a concern with us using a cluster, and Oracle
>    Parallel Server, 

Why?

>                     however what would happen in the NUMA world when
>    either a quad board, or the interconnect failed?

Sequent implements NUMA using a uni-directional ring interconnect. If you
lose a single interconnect node, you lose everything. Furthermore, if a
failing quad board holds an updated piece of memory owned by another quad
board, or owns a piece of memory that's needed by another quad board, the
entire system is toast.
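
A toy model of why one dead node is fatal on a one-way ring, as a Python sketch. It assumes packets travel in one direction only and that there is no bypass around a failed node; real hardware may behave differently. The function names and the 16-quad figure (64 CPUs at 4 per quad board) are illustrative assumptions, not Sequent specifications.

```python
def reachable(n_nodes, failed, src):
    """Nodes that src can still deliver a packet to on a one-way ring.

    Packets only travel src -> src+1 -> ... and stop at the first failed node.
    """
    seen = set()
    node = (src + 1) % n_nodes
    while node != src and node not in failed:
        seen.add(node)
        node = (node + 1) % n_nodes
    return seen


def round_trip_ok(n_nodes, failed, src, dst):
    """True if a request src -> dst and its reply dst -> src both get through."""
    return (dst in reachable(n_nodes, failed, src)
            and src in reachable(n_nodes, failed, dst))


if __name__ == "__main__":
    quads = 16  # hypothetical: 64 CPUs / 4 per quad board
    print(round_trip_ok(quads, set(), 0, 3))  # healthy ring
    print(round_trip_ok(quads, {5}, 0, 3))    # quad 5 has failed
```

In this model a request plus its reply always traverses the whole ring, so with any single failed node no pair of surviving nodes can complete a round trip.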

>                                                     Would Sequent not need
>    to bid OPS to counter this potential problem?
 
?   

703.28 |  | KITCHE::schott | Eric R. Schott USG Product Management | Thu Feb 06 1997 10:55 | 9
Numa is not a cluster...if a customer wants availability, they
want to cluster in some form.  On performance, I would press
the customer: where are the performance numbers for NUMA, beyond
the fact that it has 64 CPUs?

A lot of vendors do an audit with 6, 10 or 20 cpus, and then say
64 must be good....don't believe it.


703.29 | How are they setting the bar? | USPS::FPRUSS | Frank Pruss, 202-232-7347 | Thu Feb 06 1997 12:15 | 9
    To reiterate, if you want high-availability, you need to implement a
    cluster.
    
    Sequent will also be able to cluster either their old SE kit or the new
    CC-NUMA kit.
    
    Is Sequent claiming that their 64 CPU CC-NUMA system will exceed the
    performance of a single 8400?  Based on what?  Was there a customer
    Benchmark?  
703.30 |  | WOTVAX::HILTON | Save Water, drink beer | Wed Feb 19 1997 05:56 | 8
    >> Numa is not a cluster...if a customer wants availability, they
    >> want to cluster in some form.
       
    
    In this week's PC Week, Sequent is quoted as saying its NUMA machine
    gives 99.99% availability. It doesn't say how, though!
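
    For scale, here is what a 99.99% figure means in downtime, as a small
    Python sketch. The series formula assumes independent parts that must
    all be up at once, which is an illustrative assumption and not a claim
    about how Sequent arrived at its number.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # non-leap year


def downtime_minutes_per_year(availability):
    """Expected unavailable minutes per year for a given availability (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def series_availability(part_availability, n_parts):
    """Availability of a system that needs all n independent parts up at once."""
    return part_availability ** n_parts


if __name__ == "__main__":
    print(downtime_minutes_per_year(0.9999))
    print(series_availability(0.9999, 16))  # 16 parts is a hypothetical count
```

    99.99% works out to a bit under an hour of downtime per year. If 16
    parts each had to reach 99.99% and all were required, the whole would
    come out at roughly 99.84%.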
    
    
703.31 | BT's issues with OPS | WOTVAX::HILTON | Save Water, drink beer | Thu Feb 20 1997 12:59 | 46
    These are the concerns BT have about OPS. Any comments?
    
    RealWorld is a BT-written benchmark that is very read/write heavy.
    
    RealWorld was run on a cluster using Oracle Parallel Server over TCP/IP
    (not a Digital cluster) with a bog-standard interconnect. With 2 nodes,
    when the benchmark's updates were run on both nodes, BT got negative
    scalability. They investigated and discovered that pinging was the
    problem, i.e. the DLM pings the data block from one node to the
    requesting node.
                                                                           
    BT then changed RealWorld so that each node accessed its own data,
    i.e. customers a-d were updated on node 1, e-g on node 2, etc. With the
    data partitioned this way the data blocks stopped being pinged, but the
    index header block was still being passed back and forth.
    
    I.e. when you do an insert/delete you change the index leaf in Oracle,
    which eventually changes the index header block, and that becomes the
    contention point. Although it's not causing a bandwidth problem, it
    becomes a latency issue.
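
    The effect can be sketched with a toy Python model that counts how
    often a block changes hands between nodes. The block names, the
    two-node alternating workload and the worst-case assumption that every
    transaction touches the index header are all hypothetical; this is not
    Oracle's actual locking logic.

```python
from collections import Counter


def count_pings(transactions):
    """Count ownership transfers per block.

    transactions: iterable of (node, [block ids updated by that transaction]).
    A "ping" is counted when a block was last updated by a different node.
    """
    owner = {}
    pings = Counter()
    for node, blocks in transactions:
        for block in blocks:
            previous = owner.get(block)
            if previous is not None and previous != node:
                pings[block] += 1
            owner[block] = node
    return pings


def partitioned_workload(n_transactions):
    """Alternate nodes; data is partitioned but the index header is shared."""
    for i in range(n_transactions):
        if i % 2 == 0:
            yield 1, ["data:a-d", "index:header"]
        else:
            yield 2, ["data:e-g", "index:header"]


if __name__ == "__main__":
    print(count_pings(partitioned_workload(10)))
```

    In this model the partitioned data blocks never move, while the shared
    index header changes hands on every transaction after the first.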
    
    Will memory channel address this?
    
    
    "Pinging" will still happen but will be faster. Will it saturate the
    bus?
    
    
    BT do not wish to partition their data.
    
    There is an architectural limitation in OPS when it comes to OLTP
    applications. If you are not able (or not willing) to partition your
    data and update it, potentially any node in the cluster can update any
    block of data. The Oracle OPS implementation does not have any concept
    of how distributed raw devices and the distributed lock manager are
    implemented and therefore does not 'know' that something like a memory
    channel even exists.
    
    If one node updates a block in its SGA and a second node in the cluster
    tries to update the same block, a so-called 'block ping' needs to
    happen: the block gets flushed out to disk from node 1 and is re-read
    on node 2 before it is updated. Now, potentially, if the application
    updates some small tables in nearly every transaction, these few
    data blocks travel between disk and memory constantly. This is a serious
    bottleneck and can only be resolved by database design and data
    partitioning.
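
    A back-of-envelope Python sketch of why this is a latency ceiling
    rather than a bandwidth problem. It assumes updates to one hot block
    are fully serialized and that each handoff costs one write plus one
    re-read. The timings are hypothetical placeholders, not measurements of
    any disk or interconnect, and the in-memory update time is ignored.

```python
def hot_block_ceiling(write_s, read_s, handoff_fraction=1.0):
    """Upper bound on updates per second to a single contended block.

    write_s, read_s:   assumed seconds to flush the block and to re-read it.
    handoff_fraction:  share of updates that arrive from the "other" node
                       and therefore force a ping (1.0 = strict alternation).
    """
    per_update = handoff_fraction * (write_s + read_s)
    if per_update == 0:
        return float("inf")  # no handoff cost, so no ceiling from pinging
    return 1.0 / per_update


if __name__ == "__main__":
    # Hypothetical figures: 10 ms write + 10 ms read per ping.
    print(hot_block_ceiling(0.010, 0.010))
    # Same block, but only half of the updates come from the other node.
    print(hot_block_ceiling(0.010, 0.010, handoff_fraction=0.5))
```

    With these assumed figures the hot block tops out at about 50 updates
    per second no matter how many CPUs sit behind it. A faster handoff path
    raises the ceiling but does not remove it.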