[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference 7.286::space

Title:Space Exploration
Notice:Shuttle launch schedules, see Note 6
Moderator:PRAGMA::GRIFFIN
Created:Mon Feb 17 1986
Last Modified:Thu Jun 05 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:974
Total number of notes:18843

625.0. "Shuttle Reliability" by 4347::GRIFFIN (Dave Griffin) Fri Jun 01 1990 19:52

[Couln't find a better place for this, so a new topic...  dg]
From: [email protected] (Henry Spencer)
Newsgroups: sci.space
Subject: shuttle reliability
Date: 31 May 90 03:48:34 GMT
Organization: U of Toronto Zoology

I finally got around to digging out my copy of the OTA report "Round Trip
To Orbit", which talks about future manned-spaceflight alternatives for
the US.  Although this is generally an interesting report, the real hot
spot is the comments on shuttle reliability...

- NASA's objective was a 97% probability of an orbiter lasting 100 flights.
	It's hard to say how well this has been achieved, since NASA makes
	little attempt at quantitative estimation of reliability.

- However, if reliability had been that high (equating to 99.97% chance of
	completing each mission successfully), the odds against Challenger
	being lost would have been 130:1.  It seems very likely that the
	reliability isn't that good.

- The best guess at reliability from the observed success rate is 96%.

- Shuttle launches to date provide only 50% confidence that the reliability
	is at least 94.3%.

- If reliability is now 94.3%, there is a better than even chance of losing
	at least one orbiter on the next 12 flights.  [These estimates will
	have improved a bit, as the report was written only four flights
	after Challenger.]

- If post-Challenger reliability is 98% -- there is little confidence of
	this in an objective statistical sense -- there is a 50% chance
	of losing another orbiter within 34 flights (about three years
	at planned near-term launch rates).

- At 98%, the chance of having four orbiters available for space station
	construction start is 28%.  The chance of having all four still
	operational at the end of station construction is 12%.

- At 98%, there is a 30-40% chance [reading off the graph, they don't cite
	a number for this one] that another orbiter will be lost before
	Endeavour is finished.

The report draws a number of conclusions about possible approaches to
take -- OTA tends to list alternatives rather than making recommendations --
but they say, loud and clear, that an ongoing shuttle fleet absolutely
requires ongoing orbiter construction.  And the decision to build the
next one will have to be made in the next year or two.
T.RTitleUserPersonal
Name
DateLines
625.1More on Shuttle ReliabilityLEVERS::CORTESWed Sep 05 1990 14:49760
    I think this note from the FLYING conference has not been reproduced 
    here yet. I looked around and could't find it. And I believe there is 
    some good information about the shuttle reliability included here.   
    
    Eladio
    
    
    
                     <<< MEIS::PUB$:[NOTES$LIBRARY]FLYING.NOTE;4 >>>
                             -< General Aviation >-
================================================================================
Note 3170.35             Shuttles grounded indefinitely                 35 of 46
CVG::LEKAS "From the Workstation of Tony Lekas"     740 lines  10-JUL-1990 13:14
                   -< Shuttle Reliability by R. P. Feynman >-
--------------------------------------------------------------------------------



        Personal observations on the reliability of  the  Shuttle
        by R.  P.  Feynman


        Introduction
        ------------

             It appears that there are  enormous  differences  of
        opinion  as  to the probability of a failure with loss of
        vehicle and of human  life.   The  estimates  range  from
        roughly  1  in  100  to 1 in 100,000.  The higher figures
        come from the working engineers, and the very low figures
        from management.  What are the causes and consequences of
        this lack of agreement?  Since 1 part  in  100,000  would
        imply  that  one  could put a Shuttle up each day for 300
        years expecting to lose only one, we could  properly  ask
        "What is the cause of management's fantastic faith in the
        machinery?"

             We have also found that certification criteria  used
        in  Flight  Readiness  Reviews  often develop a gradually
        decreasing strictness.  The argument that the  same  risk
        was  flown before without failure is often accepted as an
        argument for the safety of accepting it  again.   Because
        of this, obvious weaknesses are accepted again and again,
        sometimes  without  a  sufficiently  serious  attempt  to
        remedy  them,  or  to  delay  a  flight  because of their
        continued presence.

             There are several sources of information.  There are
        published criteria for certification, including a history
        of modifications in the form of waivers  and  deviations.
        In  addition, the records of the Flight Readiness Reviews
        for each flight document the arguments used to accept the
        risks  of  the flight.  Information was obtained from the
        direct testimony and the  reports  of  the  range  safety
        officer, Louis J.  Ullian, with respect to the history of
        success of solid fuel rockets.  There was a further study
        by  him  (as  chairman  of  the launch abort safety panel
        (LASP)) in an attempt to determine the risks involved  in
        possible  accidents  leading to radioactive contamination
        from attempting to fly a plutonium power supply (RTG) for
        future  planetary  missions.   The NASA study of the same        
        question is also available.  For the History of the Space
        Shuttle  Main  Engines,  interviews  with  management and
        engineers  at  Marshall,  and  informal  interviews  with
        engineers  at Rocketdyne, were made.  An independent (Cal
        Tech) mechanical engineer who consulted  for  NASA  about
        engines  was  also  interviewed  informally.   A visit to
        Johnson was made to gather information on the reliability
        of  the  avionics  (computers,  sensors,  and effectors).
        Finally there is a  report  "A  Review  of  Certification
        Practices,  Potentially  Applicable to Man-rated Reusable
        Rocket  Engines,"  prepared   at   the   Jet   Propulsion
        Laboratory  by  N.  Moore, et al., in February, 1986, for

                                                                Page 2


        NASA Headquarters, Office of Space Flight.  It deals with
        the  methods  used by the FAA and the military to certify
        their gas turbine and rocket engines.  These authors were
        also interviewed informally.


        Solid Rockets (SRB)
        -------------------

             An estimate of the reliability of solid rockets  was
        made  by  the  range  safety  officer,  by  studying  the
        experience of all previous  rocket  flights.   Out  of  a
        total  of  nearly  2,900  flights,  121 failed (1 in 25).
        This includes, however, what may be called, early errors,
        rockets  flown  for  the  first few times in which design
        errors are  discovered  and  fixed.   A  more  reasonable
        figure  for  the  mature  rockets might be 1 in 50.  With
        special care in the selection of parts and in inspection,
        a  figure  of  below  1 in 100 might be achieved but 1 in
        1,000 is probably not attainable with today's technology.
        (Since there are two rockets on the Shuttle, these rocket
        failure rates must be  doubled  to  get  Shuttle  failure
        rates from Solid Rocket Booster failure.)

             NASA officials argue that the figure is much  lower.
        They  point  out  that  these  figures  are  for unmanned
        rockets but since the Shuttle is a  manned  vehicle  "the
        probability  of mission success is necessarily very close
        to 1.0." It is not very clear  what  this  phrase  means.
        Does  it  mean  it  is  close to 1 or that it ought to be
        close to 1?  They go on  to  explain  "Historically  this
        extremely  high  degree of mission success has given rise
        to a difference in philosophy between manned space flight
        programs   and   unmanned   programs;   i.e.,   numerical
        probability usage versus  engineering  judgment."  (These
        quotations  are  from  "Space  Shuttle Data for Planetary
        Mission RTG Safety Analysis," Pages  3-1,  3-1,  February
        15,  1985, NASA, JSC.) It is true that if the probability
        of failure was as low as 1 in 100,000 it  would  take  an
        inordinate  number  of  tests to determine it ( you would
        get nothing but a string of perfect flights from which no
        precise figure, other than that the probability is likely
        less than the number of such flights  in  the  string  so
        far).   But,  if  the  real  probability is not so small,
        flights would show troubles, near failures, and  possible
        actual  failures with a reasonable number of trials.  and
        standard statistical  methods  could  give  a  reasonable
        estimate.   In  fact, previous NASA experience had shown,
        on occasion, just such difficulties, near accidents,  and
        accidents,  all  giving  warning  that the probability of
        flight failure was not so very small.  The  inconsistency
        of  the  argument  not  to  determine reliability through
        historical experience, as the range safety  officer  did,
        is   that   NASA   also  appeals  to  history,  beginning
        "Historically this high  degree  of  mission  success..."

                                                                Page 3


        Finally,   if   we  are  to  replace  standard  numerical
        probability usage with engineering judgment,  why  do  we
        find  such  an  enormous disparity between the management
        estimate and the judgment of  the  engineers?   It  would
        appear  that, for whatever purpose, be it for internal or
        external consumption, the management of NASA  exaggerates
        the reliability of its product, to the point of fantasy.

             The  history  of  the   certification   and   Flight
        Readiness  Reviews will not be repeated here.  (See other
        part of Commission reports.) The phenomenon of  accepting
        for  flight,  seals that had shown erosion and blow-by in
        previous flights, is very clear.  The  Challenger  flight
        is an excellent example.  There are several references to
        flights that had gone before.  The acceptance and success
        of  these  flights  is  taken as evidence of safety.  But
        erosion and blow-by are not  what  the  design  expected.
        They are warnings that something is wrong.  The equipment
        is not operating as expected, and therefore  there  is  a
        danger  that it can operate with even wider deviations in
        this unexpected and not thoroughly understood  way.   The
        fact  that  this  danger  did  not  lead to a catastrophe
        before is no guarantee that it will not  the  next  time,
        unless it is completely understood.  When playing Russian
        roulette the fact that the first shot got off  safely  is
        little comfort for the next.  The origin and consequences
        of the erosion and blow-by were not understood.  They did
        not   occur  equally  on  all  flights  and  all  joints;
        sometimes more, and sometimes less.   Why  not  sometime,
        when  whatever conditions determined it were right, still
        more leading to catastrophe?

             In spite of these  variations  from  case  to  case,
        officials  behaved  as  if  they  understood  it,  giving
        apparently  logical  arguments  to   each   other   often
        depending  on  the  "success"  of  previous flights.  For
        example.  in determining if flight 51-L was safe  to  fly
        in  the face of ring erosion in flight 51-C, it was noted
        that the erosion depth was only one-third of the  radius.
        It  had been noted in an experiment cutting the ring that
        cutting it as deep as one radius was necessary before the
        ring  failed.   Instead  of  being  very  concerned  that
        variations  of   poorly   understood   conditions   might
        reasonably  create  a  deeper  erosion  this time, it was
        asserted, there was "a safety factor of three." This is a
        strange use of the engineer's term ,"safety factor." If a
        bridge is built to withstand a certain load  without  the
        beams  permanently  deforming,  cracking, or breaking, it
        may be designed for the materials used to actually  stand
        up  under  three times the load.  This "safety factor" is
        to allow for uncertain excesses of load, or unknown extra
        loads,  or  weaknesses  in  the  material that might have
        unexpected flaws, etc.  If now the expected load comes on
        to  the new bridge and a crack appears in a beam, this is
        a failure of the design.  There was no safety  factor  at

                                                                Page 4


        all;  even  though  the  bridge did not actually collapse
        because the crack went only one-third of the way  through
        the  beam.  The O-rings of the Solid Rocket Boosters were
        not designed to erode.  Erosion was a clue that something
        was  wrong.   Erosion was not something from which safety
        can be inferred.

             There was no way, without full  understanding,  that
        one  could  have confidence that conditions the next time
        might not produce erosion three times  more  severe  than
        the   time   before.    Nevertheless,   officials  fooled
        themselves into thinking they had such understanding  and
        confidence, in spite of the peculiar variations from case
        to case.  A mathematical  model  was  made  to  calculate
        erosion.    This  was  a  model  based  not  on  physical
        understanding but on empirical curve fitting.  To be more
        detailed, it was supposed a stream of hot gas impinged on
        the O-ring material, and the heat was determined  at  the
        point  of  stagnation  (so far, with reasonable physical,
        thermodynamic laws).  But to determine  how  much  rubber
        eroded  it was assumed this depended only on this heat by
        a formula suggested by data on  a  similar  material.   A
        logarithmic  plot  suggested  a  straight line, so it was
        supposed that the erosion varied as the .58 power of  the
        heat,  the .58 being determined by a nearest fit.  At any
        rate, adjusting some other  numbers,  it  was  determined
        that  the  model  agreed  with  the  erosion (to depth of
        one-third the radius of the ring).  There is nothing much
        so   wrong   with   this   as   believing   the   answer!
        Uncertainties appear  everywhere.   How  strong  the  gas
        stream  might  be was unpredictable, it depended on holes
        formed in the putty.  Blow-by showed that the ring  might
        fail  even  though not, or only partially eroded through.
        The empirical formula was known to be uncertain,  for  it
        did not go directly through the very data points by which
        it was determined.  There were a  cloud  of  points  some
        twice  above,  and  some twice below the fitted curve, so
        erosions twice predicted were reasonable from that  cause
        alone.    Similar   uncertainties  surrounded  the  other
        constants in  the  formula,  etc.,  etc.   When  using  a
        mathematical  model  careful  attention  must be given to
        uncertainties in the model.


        Liquid Fuel Engine (SSME)
        ------------------------- 

             During the flight of 51-L the  three  Space  Shuttle
        Main  Engines  all  worked  perfectly,  even, at the last
        moment, beginning to shut down the engines  as  the  fuel
        supply  began  to fail.  The question arises, however, as
        to whether, had it failed, and we were to investigate  it
        in  as much detail as we did the Solid Rocket Booster, we
        would find a similar lack of attention to  faults  and  a
        deteriorating  reliability.   In  other  words,  were the

                                                                Page 5


        organization weaknesses that contributed to the  accident
        confined  to the Solid Rocket Booster sector or were they
        a more general characteristic of NASA?  To that  end  the
        Space  Shuttle  Main  Engines  and the avionics were both
        investigated.  No similar study of the  Orbiter,  or  the
        External Tank were made.

             The engine is a much more complicated structure than
        the  Solid Rocket Booster, and a great deal more detailed
        engineering goes into  it.   Generally,  the  engineering
        seems  to  be of high quality and apparently considerable
        attention is paid to deficiencies  and  faults  found  in
        operation.

             The usual way that such engines  are  designed  (for
        military   or   civilian  aircraft)  may  be  called  the
        component system,  or  bottom-up  design.   First  it  is
        necessary  to  thoroughly  understand  the properties and
        limitations of the materials  to  be  used  (for  turbine
        blades, for example), and tests are begun in experimental
        rigs to determine  those.   With  this  knowledge  larger
        component  parts  (such  as  bearings)  are  designed and
        tested individually.  As deficiencies and  design  errors
        are  noted  they  are corrected and verified with further
        testing.  Since one tests only  parts  at  a  time  these
        tests   and   modifications  are  not  overly  expensive.
        Finally one works up to the final design  of  the  entire
        engine, to the necessary specifications.  There is a good
        chance, by this  time  that  the  engine  will  generally
        succeed,  or  that  any  failures are easily isolated and
        analyzed  because  the  failure  modes,  limitations   of
        materials, etc., are so well understood.  There is a very
        good chance that the modifications to the engine  to  get
        around  the final difficulties are not very hard to make,
        for most  of  the  serious  problems  have  already  been
        discovered and dealt with in the earlier, less expensive,
        stages of the process.

             The Space Shuttle  Main  Engine  was  handled  in  a
        different manner, top down, we might say.  The engine was
        designed and put together all  at  once  with  relatively
        little  detailed  preliminary  study  of the material and
        components.   Then  when  troubles  are  found   in   the
        bearings, turbine blades, coolant pipes, etc., it is more
        expensive and difficult to discover the causes  and  make
        changes.   For  example,  cracks  have  been found in the
        turbine blades of the  high  pressure  oxygen  turbopump.
        Are  they  caused by flaws in the material, the effect of
        the oxygen atmosphere on the properties of the  material,
        the   thermal   stresses  of  startup  or  shutdown,  the
        vibration and stresses of steady running,  or  mainly  at
        some  resonance at certain speeds, etc.?  How long can we
        run from crack initiation to crack failure, and how  does
        this  depend  on power level?  Using the completed engine
        as a test bed to  resolve  such  questions  is  extremely

                                                                Page 6


        expensive.  One does not wish to lose an entire engine in
        order to find out where and how failure occurs.  Yet,  an
        accurate  knowledge  of  this information is essential to
        acquire a confidence in the engine  reliability  in  use.
        Without  detailed  understanding,  confidence  can not be
        attained.

             A further disadvantage of  the  top-down  method  is
        that,  if  an  understanding  of  a  fault is obtained, a
        simple fix, such as a new shape for the turbine  housing,
        may  be impossible to implement without a redesign of the
        entire engine.

             The Space Shuttle Main Engine is a  very  remarkable
        machine.  It has a greater ratio of thrust to weight than
        any previous engine.  It is built  at  the  edge  of,  or
        outside  of, previous engineering experience.  Therefore,
        as  expected,  many  different   kinds   of   flaws   and
        difficulties  have turned up.  Because, unfortunately, it
        was built in the top-down manner, they are  difficult  to
        find  and  fix.   The  design  aim  of  a  lifetime of 55
        missions equivalent firings (27,000 seconds of operation,
        either  in  a mission of 500 seconds, or on a test stand)
        has not been obtained.   The  engine  now  requires  very
        frequent  maintenance and replacement of important parts,
        such as turbopumps, bearings, sheet metal housings,  etc.
        The high-pressure fuel turbopump had to be replaced every
        three or four mission equivalents (although that may have
        been  fixed,  now) and the high pressure oxygen turbopump
        every five or six.  This is at most ten  percent  of  the
        original specification.  But our main concern here is the
        determination of reliability.

             In a total of about 250,000  seconds  of  operation,
        the  engines  have  failed  seriously  perhaps  16 times.
        Engineering pays close attention to  these  failings  and
        tries  to  remedy  them  as quickly as possible.  This it
        does by  test  studies  on  special  rigs  experimentally
        designed for the flaws in question, by careful inspection
        of the engine for suggestive clues (like cracks), and  by
        considerable  study  and analysis.  In this way, in spite
        of the difficulties  of  top-down  design,  through  hard
        work, many of the problems have apparently been solved.

             A list of  some  of  the  problems  follows.   Those
        followed by an asterisk (*) are probably solved:

        1.  Turbine blade cracks in high pressure fuel turbopumps
            (HPFTP).  (May have been solved.)

        2.  Turbine  blade  cracks  in   high   pressure   oxygen
            turbopumps (HPOTP).

                                                                Page 7


        3.  Augmented Spark Igniter (ASI) line rupture.*

        4.  Purge check valve failure.*

        5.  ASI chamber erosion.*

        6.  HPFTP turbine sheet metal cracking.

        7.  HPFTP coolant liner failure.*

        8.  Main combustion chamber outlet elbow failure.*

        9.  Main combustion chamber inlet elbow weld offset.*

       10.  HPOTP subsynchronous whirl.*

       11.  Flight acceleration  safety  cutoff  system  (partial
            failure in a redundant system).*

       12.  Bearing spalling (partially solved).

       13.  A  vibration  at  4,000  Hertz  making  some  engines
            inoperable, etc.


             Many  of  these  solved  problems  are   the   early
        difficulties  of a new design, for 13 of them occurred in
        the first 125,000 seconds and only three  in  the  second
        125,000  seconds.   Naturally, one can never be sure that
        all the bugs are out, and, for some, the fix may not have
        addressed  the  true cause.  Thus, it is not unreasonable
        to guess there may be at least one surprise in  the  next
        250,000  seconds,  a  probability of 1/500 per engine per
        mission.  On a mission there are three engines, but  some
        accidents  would  possibly  be contained, and only affect
        one engine.  The system can abort with only two  engines.
        Therefore  let  us  say that the unknown suprises do not,
        even  of  themselves,  permit  us  to  guess   that   the
        probability  of  mission  failure do to the Space Shuttle
        Main Engine is less than 1/500.  To this we must add  the
        chance  of  failure  from  known,  but  as  yet unsolved,
        problems (those without the asterisk in the list  above).
        These  we  discuss  below.  (Engineers at Rocketdyne, the
        manufacturer, estimate the total probability as 1/10,000.
        Engineers  at  marshal  estimate  it as 1/300, while NASA
        management, to whom these engineers report, claims it  is
        1/100,000.   An  independent engineer consulting for NASA
        thought 1 or 2 per 100 a reasonable estimate.)

             The history  of  the  certification  principles  for
        these  engines  is  confusing  and  difficult to explain.
        Initially the rule seems to have  been  that  two  sample
        engines  must  each  have  had  twice  the time operating
        without failure as the operating time of the engine to be
        certified  (rule  of  2x).   At  least  that  is  the FAA

                                                                Page 8


        practice, and NASA seems to have adopted  it,  originally
        expecting  the certified time to be 10 missions (hence 20
        missions for each sample).  Obviously the best engines to
        use  for  comparison  would  be  those  of greatest total
        (flight plus test) operating time -- the so-called "fleet
        leaders."  But  what if a third sample and several others
        fail in a short time?  Surely we will not be safe because
        two were unusual in lasting longer.  The short time might
        be more representative of the real possibilities, and  in
        the  spirit  of  the  safety  factor of 2, we should only
        operate at half the time of the short-lived samples.

             The slow shift toward decreasing safety  factor  can
        be  seen  in  many  examples.   We take that of the HPFTP
        turbine blades.  First of all  the  idea  of  testing  an
        entire  engine was abandoned.  Each engine number has had
        many important parts  (like  the  turbopumps  themselves)
        replaced  at frequent intervals, so that the rule must be
        shifted from engines to components.  We accept  an  HPFTP
        for  a  certification  time  if two samples have each run
        successfully for twice that time (and  of  course,  as  a
        practical  matter,  no longer insisting that this time be
        as large as 10 missions).  But  what  is  "successfully?"
        The  FAA calls a turbine blade crack a failure, in order,
        in practice, to really provide a  safety  factor  greater
        than  2.   There  is  some  time  that  an engine can run
        between the time a crack originally starts until the time
        it  has  grown  large  enough  to  fracture.  (The FAA is
        contemplating new rules that take this extra safety  time
        into  account,  but only if it is very carefully analyzed
        through known models within a known range  of  experience
        and  with  materials  thoroughly  tested.   None of these
        conditions apply to the Space Shuttle Main Engine.

             Cracks were found in many second stage HPFTP turbine
        blades.   In  one  case  three  were  found  after  1,900
        seconds, while in another they were not found after 4,200
        seconds,   although  usually  these  longer  runs  showed
        cracks.  To follow this story further we  shall  have  to
        realize that the stress depends a great deal on the power
        level.  The Challenger flight was to be at, and  previous
        flights  had  been at, a power level called 104% of rated
        power level during most of  the  time  the  engines  were
        operating.    Judging  from  some  material  data  it  is
        supposed that at the level 104% of rated power level, the
        time  to  crack is about twice that at 109% or full power
        level (FPL).  Future flights were to  be  at  this  level
        because  of heavier payloads, and many tests were made at
        this level.  Therefore dividing time at  104%  by  2,  we
        obtain  units  called equivalent full power level (EFPL).
        (Obviously, some uncertainty is introduced by  that,  but
        it  has  not been studied.) The earliest cracks mentioned
        above occurred at 1,375 EFPL.

                                                                Page 9


             Now the certification rule becomes "limit all second
        stage  blades to a maximum of 1,375 seconds EFPL." If one
        objects that the safety factor of 2 is lost it is pointed
        out  that  the  one  turbine  ran  for 3,800 seconds EFPL
        without cracks, and half of this is 1,900 so we are being
        more  conservative.   We  have  fooled ourselves in three
        ways.  First we have only one sample, and it is  not  the
        fleet  leader, for the other two samples of 3,800 or more
        seconds had 17 cracked blades between them.   (There  are
        59  blades  in the engine.) Next we have abandoned the 2x
        rule and substituted equal time.  And finally,  1,375  is
        where  we  did see a crack.  We can say that no crack had
        been found below 1,375, but the last time we  looked  and
        saw  no  cracks  was  1,100 seconds EFPL.  We do not know
        when the crack formed between these  times,  for  example
        cracks   may   have   formed   at   1,150  seconds  EFPL.
        (Approximately 2/3 of the blade sets tested in excess  of
        1,375  seconds  EFPL had cracks.  Some recent experiments
        have, indeed, shown cracks as early as 1,150 seconds.) It
        was important to keep the number high, for the Challenger
        was to fly an engine very close to the limit by the  time
        the flight was over.

             Finally it is claimed  that  the  criteria  are  not
        abandoned,  and  the system is safe, by giving up the FAA
        convention  that  there  should   be   no   cracks,   and
        considering  only a completely fractured blade a failure.
        With this definition no engine has yet failed.  The  idea
        is  that  since  there  is sufficient time for a crack to
        grow to a fracture we can insure  that  all  is  safe  by
        inspecting  all  blades  for  cracks.  If they are found,
        replace them, and if none are found we have  enough  time
        for  a  safe mission.  This makes the crack problem not a
        flight safety problem, but merely a maintenance problem.

             This may in fact be true.  But how well do  we  know
        that  cracks  always  grow slowly enough that no fracture
        can occur in a mission?  Three engines have run for  long
        times  with  a  few  cracked  blades (about 3,000 seconds
        EFPL) with no blades broken off.

             But a fix for this cracking may have been found.  By
        changing  the  blade shape, shot-peening the surface, and
        covering with insulation to exclude  thermal  shock,  the
        blades have not cracked so far.

             A very similar  story  appears  in  the  history  of
        certification  of  the  HPOTP,  but we shall not give the
        details here.

             It is evident, in summary, that the Flight Readiness
        Reviews  and certification rules show a deterioration for
        some of the problems of the  Space  Shuttle  Main  Engine
        that  is  closely  analogous to the deterioration seen in
        the rules for the Solid Rocket Booster.

                                                               Page 10


        Avionics
        --------

             By "avionics" is meant the computer  system  on  the
        Orbiter   as   well  as  its  input  sensors  and  output
        actuators.  At first we will restrict  ourselves  to  the
        computers   proper   and   not   be  concerned  with  the
        reliability of the input information from the sensors  of
        temperature,   pressure,   etc.,  nor  with  whether  the
        computer output is faithfully followed by  the  actuators
        of  rocket  firings,  mechanical  controls,  displays  to
        astronauts, etc.

             The computer system is very elaborate,  having  over
        250,000  lines  of  code.   It is responsible, among many
        other things, for the automatic  control  of  the  entire
        ascent  to orbit, and for the descent until well into the
        atmosphere (below Mach  1)  once  one  button  is  pushed
        deciding  the landing site desired.  It would be possible
        to make the entire landing automatically (except that the
        landing  gear  lowering  signal  is expressly left out of
        computer control, and must  be  provided  by  the  pilot,
        ostensibly  for  safety  reasons)  but  such  an entirely
        automatic landing is probably not  as  safe  as  a  pilot
        controlled  landing.  During orbital flight it is used in
        the control of payloads, in displaying information to the
        astronauts,  and  the  exchange  of  information  to  the
        ground.  It is evident that the safety of flight requires
        guaranteed  accuracy of this elaborate system of computer
        hardware and software.

             In brief, the hardware  reliability  is  ensured  by
        having  four  essentially  independent identical computer
        systems.  Where possible each sensor  also  has  multiple
        copies, usually four, and each copy feeds all four of the
        computer lines.  If the inputs from the sensors disagree,
        depending   on  circumstances,  certain  averages,  or  a
        majority selection is used as the effective  input.   The
        algorithm  used  by each of the four computers is exactly
        the same, so their inputs (since each sees all copies  of
        the  sensors)  are  the same.  Therefore at each step the
        results in each computer should be identical.  From  time
        to time they are compared, but because they might operate
        at slightly different speeds a  system  of  stopping  and
        waiting  at  specific  times  is  instituted  before each
        comparison is made.  If one of the  computers  disagrees,
        or  is  too  late  in  having its answer ready, the three
        which do agree are assumed to be correct and  the  errant
        computer is taken completely out of the system.  If, now,
        another computer fails, as judged by the agreement of the
        other two, it is taken out of the system, and the rest of
        the flight canceled, and descent to the landing  site  is
        instituted,  controlled  by  the two remaining computers.
        It is seen that this is  a  redundant  system  since  the
        failure of only one computer does not affect the mission.

                                                               Page 11


        Finally, as an extra feature of safety, there is a  fifth
        independent  computer,  whose  memory is loaded with only
        the programs of ascent and descent, and which is  capable
        of  controlling the descent if there is a failure of more
        than two of the computers of the main line four.

             There is not enough room in the memory of  the  main
        line  computers  for all the programs of ascent, descent,
        and payload programs in flight, so the memory  is  loaded
        about four time from tapes, by the astronauts.

             Because of the enormous effort required  to  replace
        the  software  for  such  an  elaborate  system,  and for
        checking a new system out, no change has been made to the
        hardware  since the system began about fifteen years ago.
        The  actual  hardware  is  obsolete;  for  example,   the
        memories  are  of  the  old  ferrite  core  type.   It is
        becoming more difficult to find manufacturers  to  supply
        such   old-fashioned   computers  reliably  and  of  high
        quality.  Modern computers are very much  more  reliable,
        can  run  much faster, simplifying circuits, and allowing
        more to be done, and would not require so much loading of
        memory, for the memories are much larger.

             The  software  is  checked  very  carefully   in   a
        bottom-up  fashion.   First,  each  new  line  of code is
        checked, then sections of code or  modules  with  special
        functions  are  verified.  The scope is increased step by
        step until  the  new  changes  are  incorporated  into  a
        complete  system  and  checked.   This complete output is
        considered  the  final  product,  newly  released.    But
        completely   independently   there   is   an  independent
        verification group, that takes an adversary  attitude  to
        the  software  development  group, and tests and verifies
        the software as if it were a customer  of  the  delivered
        product.   There  is additional verification in using the
        new programs in simulators, etc.  A discovery of an error
        during  verification  testing is considered very serious,
        and its origin  studied  very  carefully  to  avoid  such
        mistakes in the future.  Such unexpected errors have been
        found only about six times in  all  the  programming  and
        program  changing  (for new or altered payloads) that has
        been done.  The principle that is followed  is  that  all
        the  verification  is not an aspect of program safety, it
        is merely a test of that safety,  in  a  non-catastrophic
        verification.   Flight  safety  is to be judged solely on
        how well the programs do in the  verification  tests.   A
        failure here generates considerable concern.

             To summarize then, the  computer  software  checking
        system  and  attitude  is  of the highest quality.  There
        appears to be no process  of  gradually  fooling  oneself
        while  degrading standards so characteristic of the Solid
        Rocket  Booster  or  Space  Shuttle  Main  Engine  safety
        systems.   To be sure, there have been recent suggestions

                                                               Page 12


        by management to curtail  such  elaborate  and  expensive
        tests  as  being unnecessary at this late date in Shuttle
        history.   This  must  be  resisted  for  it   does   not
        appreciate  the  mutual subtle influences, and sources of
        error generated by even small changes of one  part  of  a
        program  on  another.   There  are perpetual requests for
        changes as new payloads and new demands and modifications
        are  suggested  by  the  users.   Changes  are  expensive
        because they require extensive testing.  The  proper  way
        to  save  money  is  to  curtail  the number of requested
        changes, not the quality of testing for each.

             One might add that the  elaborate  system  could  be
        very   much   improved   by   more  modern  hardware  and
        programming techniques.  Any  outside  competition  would
        have  all  the  advantages  of starting over, and whether
        that is a good idea for  NASA  now  should  be  carefully
        considered.

             Finally, returning to the sensors and  actuators  of
        the  avionics system, we find that the attitude to system
        failure and reliability is not nearly as good as for  the
        computer  system.   For  example,  a difficulty was found
        with certain temperature sensors sometimes failing.   Yet
        18  months  later the same sensors were still being used,
        still  sometimes  failing,  until  a  launch  had  to  be
        scrubbed  because  two  of  them failed at the same time.
        Even on a succeeding flight this  unreliable  sensor  was
        used  again.   Again reaction control systems, the rocket
        jets used for reorienting and control in flight still are
        somewhat  unreliable.   There is considerable redundancy,
        but a long history of failures, none  of  which  has  yet
        been  extensive  enough  to seriously affect flight.  The
        action of the jets is checked by sensors,  and,  if  they
        fail  to  fire  the computers choose another jet to fire.
        But they are not designed to fail, and the problem should
        be solved.

                                                               Page 13


        Conclusions
        -----------

             If a reasonable launch schedule is to be maintained,
        engineering  often  cannot be done fast enough to keep up
        with  the   expectations   of   originally   conservative
        certification  criteria designed to guarantee a very safe
        vehicle.  In these situations,  subtly,  and  often  with
        apparently logical arguments, the criteria are altered so
        that flights  may  still  be  certified  in  time.   They
        therefore  fly  in  a relatively unsafe condition, with a
        chance of failure of  the  order  of  a  percent  (it  is
        difficult to be more accurate).

             Official management, on the other  hand,  claims  to
        believe  the  probability  of failure is a thousand times
        less.  One reason for this may be an  attempt  to  assure
        the government of NASA perfection and success in order to
        ensure the supply of funds.  The other may be  that  they
        sincerely believed it to be true, demonstrating an almost
        incredible lack of communication between  themselves  and
        their working engineers.

             In  any  event  this  has   had   very   unfortunate
        consequences,  the  most serious of which is to encourage
        ordinary citizens to fly in such a dangerous machine,  as
        if  it  had  attained the safety of an ordinary airliner.
        The astronauts,  like  test  pilots,  should  know  their
        risks,  and  we  honor  them  for their courage.  Who can
        doubt that  McAuliffe  was  equally  a  person  of  great
        courage,  who was closer to an awareness of the true risk
        than NASA management would have us believe?

             Let us make  recommendations  to  ensure  that  NASA
        officials  deal  in  a  world of reality in understanding
        technological weaknesses and imperfections well enough to
        be  actively trying to eliminate them.  They must live in
        reality in comparing the costs and utility of the Shuttle
        to  other  methods  of  entering space.  And they must be
        realistic in making contracts, in estimating  costs,  and
        the  difficulty  of  the projects.  Only realistic flight
        schedules should  be  proposed,  schedules  that  have  a
        reasonable  chance  of  being  met.   If  in this way the
        government would not support them, then so be  it.   NASA
        owes  it  to the citizens from whom it asks support to be
        frank, honest, and informative, so  that  these  citizens
        can  make  the  wisest  decisions  for  the  use of their
        limited resources.

             For  a  successful  technology,  reality  must  take
        precedence  over  public  relations, for nature cannot be
        fooled.
		<<< End of extracted notes >>>

Extraction performed by BLOCK::CORTES at 04:09 on 5-Sep-90 (Wed)
 using BLOCK::USER$454:[CORTES]NOTES_BATCH.COM;1 -- v33
   with switches "/conf=<hispanic_heritage,space,giganet,reliability>/again_at=<tomorrow+0-04:00:00> /Not_Now=<No>"
625.2Space Shuttle statisticsADVAX::KLAESAll the Universe, or nothing!Thu Apr 25 1991 17:0148
Article         7900
From: [email protected]
Newsgroups: sci.space.shuttle
Subject: Charting a decade of the Shuttle
Date: 25 Apr 91 19:52:08 GMT
Sender: [email protected] (USENET News System)
Organization: NASA Johnson Space Flight Center
 
From the Johnson Space Center's
Space News Roundup Vol. 30 No. 15, April 12, 1991
with a correction in the April 19, edition:
 
"that shows for all intents and purposes, we've launched  1,200 tons
of payload every decade. It took us 215 launches in the '60s, 152
launches in the '70s and 102 launches in the '80s.  The Shuttle with 4
percent of all U.S. Launches, has launched 41 percent of all the mass.
Not including the orbiter." 
 
The article means that the Shuttle accounted for 41% of the '80s mass
The article also lists the total payload weight to orbit:
 
In 38 shuttle flights (up to and including STS-35):
 
People flown: 199
Days in orbit: 225.07
Man Hours in orbit: 28,688
Orbits: 3572
Max alt average: 189
Statute miles flown: 94,300,612
Total No. Payloads: 292
Orbiter Weight at liftoff (lbs): 9,061,659
Pounds to Orbit ( not including orbiter): 1,055,421
Payload deployed (lbs): 533,898
Payload Returned to Earth (lbs): 546,427
EVA man-hours: 136.66
 
Also the shuttle has a success to failure ratio of .974 with 1 being perfect.
Higher than any other US booster.
 
Ariane, the only other vehicle designed in the '70s and operated in the '80s,
had five failures in the first 40 flights.
 
-- 
 
Mike Kent -  	Lockheed Engineering and Sciences Company at NASA JSC
		2400 NASA Rd One, Houston, TX 77058 (713) 483-3791
		[email protected]

625.3The Space Shuttle as seen in 1971MTWAIN::KLAESAll the Universe, or nothing!Thu Oct 10 1991 15:3042
Article: 36405
From: [email protected] (Stephen Walton)
Newsgroups: sci.space
Subject: Golden Oldie
Date: 10 Oct 91 01:54:05 GMT
Sender: [email protected]
Organization: csun
 
A colleague was cleaning his office recently, and came upon a copy of
NASA/Ames "Educational Data Sheet #407," dated August 1971 and titled
"The Space Shuttle -- The Key to our Future in Space." Some excerpts
on the occasion of the just-passed 20th anniversary of its release: 
 
But the Space Shuttle as presently defined is a new and startling
concept.  It is different in many ways from any previous program in
space or aeronautics.  It promises to open the space frontier in the
way the railroads opened the west, or in the way the DC 3 airplane
began the great expansion of commercial aviation....
 
Preliminary estimates received from vehicle definition contractors
indicate that the development costs of the Shuttle will be in the $8 -
9 billion range, depending on the Shuttle configuration, spread over
about 8 years.  These estimates would include the operating site and
the manufacture and test of two boosters and two orbiters. 
 
NASA studies so far indicate that the Shuttle can be operated for a
cost of about $5 million per launch, less than it costs to use any
current launch vehicle except the small Scout. 
 
Studies made for NASA by an independent economic analysis firm
(Mathematica, of Princeton, NJ) show that an average rate of 40
missions per year over this 13-year period [1978-1990] would result in
savings equivalent to an annual return of 10 percent on the
government's investment in the Shuttle.  Fifty-five missions per year
would increase the annual return to 15 percent. 
 
[End of excerpts.]
 
--
Stephen R. Walton, Dept. of Physics and Astronomy, Cal State Northridge
[email protected] auto-forwards to my current machine (whatever it is)