T.R | Title | User | Personal Name | Date | Lines |
---|
2459.1 | WHERE DO WE KEEP THOSE TAPES???? | GJOVAX::SEVIC | | Thu Apr 08 1993 22:14 | 2 |
| If true sounds like nothing a restore of the old software couldn't
handle.
|
2459.2 | | SOLVIT::REDZIN::DCOX | | Fri Apr 09 1993 00:24 | 6 |
| The DECdirect order I placed last week was processed immediately and I
received the goods (floppies) within a couple of days. So.......if
it's broke, please do not fix it. :-)
Dave
|
2459.3 | | RCOCER::MICKOL | D-FENS | Fri Apr 09 1993 02:52 | 10 |
| The Field Admin (FOCUS, AQS) cluster (GREAT1::) for the northeast has been up
and down since last weekend. More down than up. I think they upgraded AQS
(Automated Quote System) last weekend. It has caused a fair amount of hassle
for us field types. I've got half a dozen quotes in the queue I can't get to
customers. I thought AQS was the only application affected, but I may be
wrong. I'm sure some heads will roll over this one.
Anyway, when customers see our lead times for PCs, they will probably assume
something is seriously broken...
|
2459.4 | | DPDMAI::DAWSON | t/hs+ws=Formula for the future | Fri Apr 09 1993 09:19 | 6 |
|
I am not sure of the status right now but Tuesday and
Wednesday, the system for DECDirect was down. Maybe it still is.
Dave
|
2459.5 | GREAT1 IS down.... | ODIXIE::SCRIVEN | | Fri Apr 09 1993 10:53 | 13 |
|
DecDirect and PCBYDEC's GREAT1 (Focus application) has been down since
Tuesday Afternoon. At least they have been UNABLE to process any
orders or ship trace requests.
I'm sitting on about 10 orders that cannot be booked in the field. I
wonder what their excuse is. The FOCUS systems in the field are
working OK to the best of my knowledge......
Turns those 45 to 90 day lead times on PC's into 90 to 120 I bet. Just
what we need.....
Toodles.....JP
|
2459.6 | I placed a DECDirect Order 4/8/93 1530 Hrs | MEMIT::YOUNG_J | | Fri Apr 09 1993 11:18 | 6 |
| I don't know about the systems _behind_ ordering, but I called
DECDirect yesterday around 3:30 pm EDT and placed my orders. I'm sure
my customer response contact was in _some_ kind of system, 'cause he
had to look up part numbers for a new item ...... and he found 'em.
... Maybe my order went in during one of the _up_ times???
|
2459.7 | DECdirect is up | RTL::LAPINE | | Fri Apr 09 1993 11:24 | 2 |
| The DECdirect system is up now. Apparently (finally) came up last night.
|
2459.8 | | NETWKS::GASKELL | | Fri Apr 09 1993 13:26 | 5 |
| When I called DECdirect yesterday morning they were asking in-house
orders to call again as they were having system problems. I am
assuming they wanted to concentrate on customer orders first.
They seemed to be up and running this morning when I called.
|
2459.9 | HP, not DEC...(?) | 35261::ROGERS | | Fri Apr 09 1993 15:37 | 17 |
| This whole thing sounds like it might be confusing us with HP. It has
been widely publicized that their revenues have suffered because all
their admin systems have broken down -- they "outgrew" their capacity.
It has been discussed by Wall Street, and HP had to make a public
admission.
The most recent mention in Computerworld, as I seem to recall, is that
HP has a new, ground-up redesigned Master System that should be ready
"real soon now."
Maybe the internet author got us confused with HP? Maybe their
outdated, overloaded system HAS been down for 11 days? If so, someone
should send him a correction over internet.
We aren't the only ones that have problems. Ours might be stodgy and
limited in flexibility, but at least they seem to work (mostly).
|
2459.10 | "Informed Customers" takes on a new meaning these days | AUSTIN::UNLAND | Digitus Impudicus | Fri Apr 09 1993 20:40 | 14 |
| I don't believe that the author confused us with HP. In talking to
my HP counterpart, they haven't had any recognizable system outage
in the past few days. On the other hand, I *do* know customers who
have called the local office this week because of issues with PCBYDEC
and DECdirect, and the local reps have had many problems trying to
get quotes out because of system problems.
I do think the Internet item suffers a bit from hyperbole and alarmism,
but no more than expected. People shouldn't be amazed when word gets
around so fast. Computer customers are *very* sensitive these days to
glitches in performance by the vendors. Many PC and Mini vendors are
hanging by a thread, and the customers know it.
Geoff
|
2459.11 | The *straight* scoop | ODAY40::USLSAT::FRICCHIONE | Rick Fricchione (MRO1-1/L87/297-2573) | Mon Apr 12 1993 17:52 | 57 |
| These are the facts... (I was one of the people working on the problem)
The Ordering and Selling systems across the US underwent a fairly major upgrade
beginning about a week ago. It was primarily in support of the new DPP
(Digital Pricing Program) program (major pricing and discounting changes,
a new discounting system, etc) but provides lots of other fixes and changes
as well. As much as 30-40 percent of the code changed in some systems.
This was up to 1000 plus changed source units.
This release was known as CTPS V4.0 (Customer Transaction Processing Systems)
and was probably the most tested release ever. However, as Murphy would have
it, some technology issues at the layered product interaction level caused
some down time which affected systems like the Electronic Store, DECdirect,
PC Direct and field sales order administration (AQS, FOCUS, etc). Most of
these technology issues were not something that you test for in the RTE/DTM
sense. This release was probably the most tested release of any we had done.
There are some issues however, that only seemed to appear when 700 to 900 sessions
of the application are running in a cluster with lots of other demands being
placed on it. No amount of RTE test scripts and load testing can simulate it.
No application bugs caused it. It was all at the RDB and VMS interaction level.
The basic technology issues boiled down to VMS lock remastering, RDB 4.0A, the
occurrence of FREEZELOCKS, MEMBIT locks, and a few other things which didn't
show up because they are basically results of exception conditions and
differences in application initialization. It's a bit complex to get into here,
but they go away with RDB V4.1A. We were locked into RDB V4.0A (V4.2 is SSB)
because of a common layered product upgrade process in the US called
PASE which means *everyone* has to go to V4.x at the same time. This puts
us behind SSB by at least six months or more. The problem
symptoms were database servers which started up fine, serviced requests fine
(for a while), but then went into a HIB state and stayed there. You basically
had to wait for VMS and RDB to decide whether this would occur or not. You
couldn't tell for a while. Obviously that makes it a tough call as to whether
to use the system or not.
After some "interesting" moments, we apparently found that "by standing on
one leg and holding the TV antennae over our head", we got things to
initialize and not go into a HIB state. We had some good help from RDB
engineering, and we had some strong people of our own working on it as well.
Trust me. We had no lack of "management support and attention" in getting
this fixed.
The systems are *UP*, and have been that way for several days. At no time
were the systems unavailable for more than a few hours. The timing of the
installation (first weeks of the fiscal quarter) was deliberate. If you
are going to risk down time, do it then. Orders were always flowing from
the entry systems to the fulfillment sites (that's why the floppies came)
but we deliberately held downstream feeds for a few days to make a possible
rollback easier if it came to that. They have been on-line for several days now.
If anyone has any questions on this, please send me mail directly. There is
nothing in the above that really could not be said to customers, but we should
be careful of the spin we put on it, as well as just forwarding bits and pieces
of information onto the INTERNET or elsewhere.
Rick
|
2459.12 | But I digress | FUNYET::ANDERSON | OpenVMS Forever! | Mon Apr 12 1993 18:59 | 9 |
| This is another example of the problems that can be caused by being forced to
run old software by CVMS or PASE or whatever.
The people who run most of Digital's production machines unfortunately adhere to
this policy which, I believe, causes more harm than good. Rdb at V4.0A (two
versions back) is nothing compared to all those IM&T VTX servers still at VTX
V4.1!
Paul
|
2459.13 | | RCOCER::MICKOL | D-FENS | Mon Apr 12 1993 22:55 | 7 |
| Re: .11: I beg to differ with you, but AQS on GREAT1 was down for much of the
past week. I tried to use it frequently and it was rarely available
and stable. I had a bunch of quotes queued up ready to be entered and
it wasn't until this morning at 4am that the problems got fixed.
Jim
|
2459.14 | Whoa! | LABRYS::CONNELLY | Network partner excited | Mon Apr 12 1993 23:21 | 47 |
|
re: .12
>This is another example of the problems that can be caused by being forced to
>run old software by CVMS or PASE or whatever.
I have to take (what i hope is a mild-tempered) exception to that, being one
of the folks who works on CVMS and has seen the PASE process at work.
I can't respond to .11, since i don't know the facts of this particular
problem, but i will say that the software contents of PASE (the Production
Applications Support Environments) are agreed upon by the developers of DEC's
internal applications (including Mr. Fricchione's group). Why have common
software environments for applications? Basically because we can't afford to
have dedicated hardware and software for each individual business application
needed by DEC. We have to share hardware, especially out in the data centers
beyond the pale of GMA. If application developers can't count on there being
a standard software environment on each system that they install on, it will
be a crap shoot as to whether the applications work from one site to the next.
Especially when multiple different business application groups in different
chains of command may be targeting their software for the same machine.
To avoid mass chaos in the implementation of important business applications,
the applications developers have a Product Architecture Committee where they
jointly decide on what the common (PASE) software environment will be. In
some cases this will mean that the last application development group ready
to go forward to a new version of a layered product (like RDB) will hold up
all the other developers. Not very pretty, but the choice is always stated
in terms of "break the business?"
Another comment: yes, RDB V4.0A has bugs. The latest version of RDB has
bugs too--i'll guarantee that. This is supposed to be production grade
software and it's bent and twisted out of shape by customers like Pfizer et
al. far more extremely than it is by DEC's mundane IS applications. There
isn't much you can do to avoid these bugs, just hope that the power-users
uncover them first. If anything, staying with an older version and applying
patches for known bugs to it should be safer than jumping to the latest and
greatest "bleeding edge" version.
IMHO, DEC desperately needs a strong CIO with authority over both the data
center/network infrastructure and ALL applications development. We've been
operating for years in a twilight zone where applications software and data
have been "owned" by the sponsoring business while the IS infrastructure
has been quasi-independent but beholden to all these "special interests".
I had been hopeful that Bob Palmer was going to fix this, but the latest
news on that score has not been very encouraging.
- paul
|
2459.15 | | ROWLET::AINSLEY | Less than 150 kts. is TOO slow! | Mon Apr 12 1993 23:48 | 6 |
| re: .11
Thanks for setting the record straight. To summarize the gory details,
it sounds like it was a matter of system and application tuning.
Bob
|
2459.16 | CVMS OK with me | CSOADM::ROTH | you just KEEP ME hangin' on... | Tue Apr 13 1993 08:43 | 11 |
| I'll defend CVMS as well.
In one of my previous forms I was a systems/application jockey for a
business-critical application. Having CVMS as a base actually FORCED
those that were developing/maintaining the application to run on a
version of VMS and layered products that were reasonably close to
current... prior to that, they would lag behind clinging to that 'oldie
but goodie' release of VMS. (e.g. was still running V3.x of VMS more
than a year after 4.x came out)
Lee (who just dated himself a bit)
|
2459.17 | | TOMK::KRUPINSKI | Slave of the Democratic Party | Tue Apr 13 1993 18:53 | 8 |
| re .11
See TPSYS::FORMAL_INSPECTION for a method that will allow you to
detect and eliminate many of those problems that cannot be found
via testing.
Tom_K
|
2459.18 | PASE/CVMS is not the issue. | ODAY40::FRICCHIONE | Rick Fricchione (MRO1-1/297-2573) | Wed Apr 14 1993 08:04 | 21 |
| Since people chose to interpret my note as a "blame PASE" note and not
as a "here's the facts" note, let *ME* set the record straight.
1. All US IM&T organizations are committed to PASE/CVMS as a process.
It works. My group is committed to it.
2. The characteristics of RDB V4.0a are such that it was not a tuning
issue or application performance issue. It was basically that when
you did x before y in the startup of the monitors, opening of the
database, firing up of the servers, etc, it didn't work in a high
load situation. We ran up to 500 sessions using RTE in a test
environment to simulate load and we didn't run into it. We now
know how to simulate the situation and can test for it. We do not
have this problem with V4.1.
I don't want to get into a PASE/CVMS discussion. That's not the issue
here. The issue is that there were problems, and we believe we have
addressed them until RDB V4.1 is implemented in these sites.
Rick
|
2459.19 | How many times does history have to repeat itself ? | PARITY::FAHERTY | | Wed Apr 14 1993 18:57 | 41 |
| Truth is, I think this particular Rdb problem has surfaced time and time again
over the last 2 years. I'd have to look through my old mail, but I think I've
personally helped resolve the problem for at least two projects, one in what
was at the time Al Aucoin's group, and have heard of several other projects
that resolved the problem themselves.
Unfortunately, our system at that time, and even still, tends to reward lone
wolves who fight fires by themselves in a vacuum, rather than putting an
emphasis on good, documented engineering process, and encouraging and rewarding
such things as defect prevention, idea/solution/experience sharing, and
continuous process improvement. When those lone wolves move on, the knowledge
they have inside their heads about the possibility, characteristics, and
solutions to such things as this Rdb problem, goes with them. I believe the
situation described in the last paragraph of .14 is the root cause of this (the
lack of sharing and collaboration).
This seems to me to be a glaring example of why we need to fully and
consistently embrace a mature, comprehensive software improvement model such as
the SEI Capability Maturity Model. The SEI model, which looks at software
organizations in terms of 5 levels of successive maturity, characterizes the
lowest level of maturity as being one where the success of the organization
relies on the strengths of individuals, rather than on the strengths of the
process. As your organization moves up the levels of maturity, the emphasis
shifts to the process, and your process becomes stronger, more refined, and
more complete.
Both the SEI model and the ISO 9000 standard also emphasize the importance of
putting controls in place to assure adequate and known quality of the products
and services provided by your subcontractors and upstream suppliers.
Fortunately, for some, I think things will begin to get better. Rick's group,
for example, is getting very serious about quality (independently of this
problem), is in the process of developing and implementing mechanisms and
processes which capture and leverage experience and learnings, and will be
looking into the possibility of applying the SEI model.
It's too bad these things weren't in place 2 years ago; they might have
prevented this problem from ever occurring again, and at least would have
saved a lot of redundant problem solving.
John Faherty
|
2459.20 | IMHO not! | ELWOOD::LANE | Half of everything is below average | Thu Apr 15 1993 09:24 | 23 |
| No comments on the Rdb problem but I will comment on your implication
that one or more gifted people working as individuals are at the lower
end of the food chain while a comprehensive organization with procedures
and processes is at the top.
If all you're interested in is quality, then perhaps you're right.
But a quality what? Compare the languages C and ADA.
C was invented by three guys whose names escape me at the moment (or
was that the transistor?) and ADA was invented by everybody and their
mother-in-law.
As a language, ADA has a much better quality than C. (Just what does
"char (*(*x())[])()" define, anyway?) but what's preferred? And why?
Quality is a property of something, not the result of some process
or procedure. Individuals can do extremely high quality work and
huge, highly structured organizations can produce junk although I'll
agree that this is usually the exception. On the other hand, individuals
usually produce innovative things while huge organizations usually produce
nothing.
Mickey.
|
2459.21 | Roger, Roger... Over, Over... What's our vector, Victor? | GOTIT::harley | Pay no attention to that man behind the curtain... | Thu Apr 15 1993 12:37 | 9 |
| I still want to know what the heck a
"technology issue at the layered product interaction level"
is...
Is that anything like calling a bug a "previously undocumented feature"?
/harley
|
2459.22 | IMHO, way ! | PARITY::FAHERTY | | Thu Apr 15 1993 13:51 | 81 |
| Re: .20:
I'll respond to reply 20, and then get off my soap box in this particular
conference and note, since I think we may be veering too far away from
the specific issue.
> No comments on the Rdb problem but I will comment on your implication
> that one or more gifted people working as individuals are at the lower
> end of the food chain while a comprehensive organization with procedures
> and processes is at the top.
First of all, we're not talking food-chain here. We're talking survival.
Second, the SEI model is not about comparison between organizations or
individuals, as I think you are implying. Rather, it is a tool for you to use
to determine where your organization is, where you want it to be, and how to
get there, in a gradual, least-cost, least-risk fashion. It's about
organizational AND individual growth. You own the data about where you're at
and going, because you own the process of getting there. Similar to a career
planning guide for individuals, the SEI CMM could be viewed as a growth
planning guide for software organizations (explicitly) and individuals
(implicitly). All too many people initially view the model the way you seem to
have, both those unfamiliar with the SEI or other improvement models, as well
as those who have incorrectly applied the model (because they came at it with a
similar perspective as yours).
Third, I'd put hiring and supporting good people at the top of the list, before
process, of important ingredients of "world-class" software organizations, but
process would be a close second, in order to be able to optimize the work of
those good people. I think most software improvement leaders and experts,
including those at the SEI, would agree with this.
Fourth, doesn't it make sense to put mechanisms in place to leverage the good
ideas and solutions of those gifted people ?
>
> If all you're interested in is quality, then perhaps you're right.
> But a quality what? Compare the languages C and ADA.
>
> C was invented by three guys whose names escape me at the moment (or
> was that the transistor?) and ADA was invented by everybody and their
> mother-in-law.
>
> As a language, ADA has a much better quality than C. (Just what does
> "char (*(*x())[])()" define, anyway?) but what's preferred? And why?
Precisely ! One of the benefits of a quality system based on a proven
model is that you have the best chance of those questions getting asked
in the first place, and answered for all to know.
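As an aside, the declarator quoted above does have a definite reading: x is a
function returning a pointer to an array of pointers to functions returning
char. A minimal sketch that spells it out follows. The names fn_t, first,
second, table and call_entry are hypothetical, picked only for illustration,
and the empty parameter lists are written as (void).

```c
#include <assert.h>

/* Hypothetical names throughout; only the shape of x matters. */
typedef char fn_t(void);                 /* a function returning char */

static char first(void)  { return 'a'; }
static char second(void) { return 'b'; }

/* An array of pointers to functions returning char. */
static fn_t *table[] = { first, second };

/* Same declarator as in the quote, with (void) parameter lists:
   x is a function returning a pointer to an array of pointers
   to functions returning char. */
char (*(*x(void))[])(void)
{
    return &table;
}

/* Reading it back in the order the declarator implies:
   call x, dereference the result, index the array, call the entry. */
char call_entry(int i)
{
    assert(i >= 0 && i < 2);             /* table has two entries */
    char (*(*p)[])(void) = x();
    return (*p)[i]();
}
```

With these definitions, call_entry(0) runs first() and call_entry(1) runs
second(). The typedef is not required; it is there only to show how much of
the declarator it absorbs.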
> Quality is a property of something, not the result of some process
> or procedure. Individuals can do extremely high quality work and
> huge, highly structured organizations can produce junk although I'll
> agree that this is usually the exception. On the other hand, individuals
> usually produce innovative things while huge organizations usually produce
> nothing.
One way (certainly not the only way) of viewing quality: a system built by,
from, and in support of an organization of individuals with quality attitudes
who want to prevent problems from occurring in the first place, never make the
same mistakes twice, and always repeat successes.
Here are a couple of interesting quotes from Bill Curtis of the SEI that I
think are somewhat pertinent:
<<< TPSYS::SYS$SYSDEVICE:[NOTES$LIBRARY]SEPF.NOTE;1 >>>
-< SEPF >-
================================================================================
Note 17.7 Boston SPIN 7 of 7
TOHOKU::TAYLOR "e-mail is the ethernet of the 90s" 9 lines 28-MAR-1993 17:40
-< 2 quotes by Dr. Bill Curtis of the SEI >-
--------------------------------------------------------------------------------
RE: Boston SPIN meeting 19-JAN-1993, talk by Dr. Bill Curtis of the SEI
I found two interesting quotes in my notes:
"Large projects are bus sensitive.
If a bus hits the lead person, the project dies."
"Process maturity lets you go home at night,"
because there is no overtime required.
|
2459.23 | a few more points | ODAY40::FRICCHIONE | Rick Fricchione (MRO1-1/297-2573) | Thu Apr 15 1993 23:50 | 56 |
| I have *NO* idea what some of the previous replies have to do with the
order processing problem we experienced. I'd suggest taking ideas on
who the next CIO should be, development methodologies, and hindsight in
general to the SOAPBOX notes file (off hours). I really don't have the
energy for it. It oversimplifies things and has little to do with
the original note.
The intent was to let people know what was going on since it seemed to
have some exposure internally and externally. Let's not mix the
religious cable channels with CNN (please).
A few RELEVANT points:
1. The problems still occur and will still occur until we go to RDB
V4.1. We know how to deal with them now though so the impacts are
minimized. Still there, but minimized. Planned upgrade: this
weekend.
2. "product interaction level" is management speak :-). To be honest
all we know is that dynamic lock remastering, RDB V4.0A, and VMS
V5.5-1 in this particular situation/load generate these problems. We
understand how to prevent them at startup, but basically every time
the cluster undergoes a state transition (as happened today:
$#@$%#@# node crash) VMS lock management and RDB send our database
servers FREEZELOCKS which seem never to free up. Also, under certain
conditions RDB V4.0A in this environment sends these locks when
a database recovery is performed (even if someone just CTRL/Ys
out of an interactive SQL read transaction). $DELPRC same thing.
3. We are also experiencing system performance issues due to a
completely changed (at least it seems that way) application
profile. Again, we probably could have done a better job of
characterization here, but hindsight is 20-20. We are working on
that as well. Nothing that a few 7620s couldn't fix. Tuning is
progressing but you need data for that and that takes time.
Giving everyone DECwindows terminals in the last year and some
group consolidations into this cluster basically tripled the
number of sessions to 1200-1500 simultaneous and that's pretty
"challenging".
4. We didn't go off and lone wolf this. We worked with Colorado, RDB
engineering and all the organizations who we believed could add
value at the time. Lots of people had seen similar situations
before. We had too. Few can fix it. Are these the wrong people?
5. I stand corrected on the uptime statement I made. There was some
additional downtime before I and others got involved. I don't know
how much though, but the system was in for only 2 days at that
point. Nowhere near the 11 days that someone stated. That part
is clearly wrong.
We continue to be up, processing orders and taking calls. We are
having a bumpy implementation due to these problems but we believe they
will be behind us soon.
Rick
|