T.R | Title | User | Personal Name | Date | Lines |
---|
834.1 | PTB by any chance? | QUARRY::petert | rigidly defined areas of doubt and uncertainty | Tue Feb 11 1997 16:22 | 3 |
| Check note 796 to see if this applies here.
PeterT
|
834.2 | Only the UI | RHETT::HALETKY | | Wed Feb 12 1997 14:07 | 5 |
| No, the problem occurs only in dxladebug, not in decladebug. There is
something in the UI.
-ed halekty
|
834.3 | Need some additional info | TLE::MAHONEY | | Wed Feb 12 1997 16:05 | 7 |
| Ed,
What version of Digital UNIX and ladebug are these customers running? Any
chance we can get a reproducer to try out?
Jack Mahoney
Ladebug
|
834.4 | more info, new customer | RHETT::SHEPPARD | | Fri Feb 14 1997 17:47 | 30 |
| Here's some additional info on yet another instance of the problem for
a different customer.
He can reproduce the problem at will with either dxdecladebug or the
3rd-party 'Greenhill' debugger (GUI) on a large C/FORTRAN application,
or on a small hello-world program in the following environments:
    Unix v4.0b
    Alphastation250
    ZLX-L1
    dxdecladebug v4.0-25

    Unix v4.0b
    Alphastation500
    ZLXP2-EZ5
    dxdecladebug v4.0-25

He starts dxdecladebug, but while the app is loading the console suddenly
returns to the blue CDE screen;
the console is hung, no keyboard response at all;
he can still log in via telnet -- network resources still working --
and says 'ps' shows dtwm is gone, so the window manager apparently crashed;
nothing appended to ~/.dt/errorlog;
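For anyone repeating the dtwm check from a telnet session, here is a rough sketch. The helper name proc_listed is made up for illustration, and the usage line assumes that ps on the system accepts -e -o comm, which has not been verified on these machines.

```shell
#!/bin/sh
# Hypothetical helper (not part of any DEC tool): succeed if a process with
# the given command name appears in a listing read from standard input,
# one command name or path per line.
proc_listed() {
    # $1 = command name to look for, e.g. dtwm
    awk -v name="$1" '
        {
            # Compare only the last path component, so /usr/dt/bin/dtwm
            # matches "dtwm" but dtwm-something does not.
            n = split($1, parts, "/")
            if (parts[n] == name) found = 1
        }
        END { exit !found }
    '
}

# Usage from a remote login (assumes ps supports -e -o comm):
#   ps -e -o comm | proc_listed dtwm || echo "dtwm is not running"
```

Matching on the exact command name avoids the usual false hit where grep finds its own entry in the ps output.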
Thanks in advance for any pointers or suggestions !
Steve Sheppard
Digital Unix / Ultrix Applications Support Team
|
834.5 | | QUARRY::petert | rigidly defined areas of doubt and uncertainty | Mon Feb 17 1997 17:14 | 46 |
| I've seen similar problems. It may be the db_enable problem which I've
mentioned before, or it may be something related. I've seen so many of them
on the various build levels that it's getting hard to remember which problem
is which.
Become superuser and check these variables with the debugger:
    tcp_noackwar  // if 1, set to 0. Mostly this seems to kill off rlogins and
                  // other windows after a period of no activity.
    db_enable     // if 1, may crash the system while debugging; set to 0 to
                  // avoid the crash.
Note that this did not crash all systems, e.g. a Flamingo had no problem
with this. But maybe use of dxladebug aggravates it enough to finally
cause a crash.
There is yet another problem which I've seen on Steel BL4, which may also
be a problem in PTB, though I haven't seen it. It sounds much like the
one described, where the console seems to lock up, but you can get into
the machine from another system. I don't know the particulars of this
one, or if there is a specific QAR, but I do know that it had something
to do with configuring STREAMS when the kernel is built. It seemed to
be a malloc problem, where one process was just chewing up memory.
It may be something similar here, if not the same exact problem.
For the db_enable or tcp_noackwar problems, you can do the following:
On the running kernel you can change this with ladebug or dbx:
    ladebug -k /vmunix
    assign db_enable = 0
    q
To patch this on the disk copy (so you don't have to do the above
every time you reboot) you must use dbx. Ladebug will not patch the
disk copy (a bug that needs fixing):
    dbx /vmunix
    patch db_enable = 0
    q
and then you should be able to debug without a problem.
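If you have to do this on several machines, the two command sequences above can be written out by a small helper. This is only a sketch: the function name kernel_var_cmds is made up, and it just prints the commands for pasting into the debugger. Whether ladebug or dbx will take them from a pipe is an assumption that has not been tested here.

```shell
#!/bin/sh
# Hypothetical helper: print the debugger commands from the steps above
# for one kernel variable.
#   $1 = verb: "assign" (running kernel, ladebug or dbx)
#              "patch"  (disk copy, dbx only)
#   $2 = kernel variable name, e.g. db_enable
#   $3 = new value, e.g. 0
kernel_var_cmds() {
    case "$1" in
        assign|patch) ;;
        *)
            echo "usage: kernel_var_cmds assign|patch VARIABLE VALUE" >&2
            return 1
            ;;
    esac
    # Emit the change followed by the quit command, as typed by hand above.
    printf '%s %s = %s\nq\n' "$1" "$2" "$3"
}

# Example: print the commands for the running kernel and for the disk copy.
#   kernel_var_cmds assign db_enable 0
#   kernel_var_cmds patch db_enable 0
```

The verb is checked up front so that a typo cannot produce a command the debugger would interpret as something else.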
There are actually some machines that do not fall down with this problem,
but the majority of newer systems do.
PeterT
|
834.6 | db_enable = 0 saves the day again | PEACHS::DALEY | Maybe I should drink more coffee...or less! | Fri Feb 21 1997 15:32 | 9 |
| The problem in .4 was db_enable. dxladebug caused
the Xserver (Xdec) to die. The system was up and available
over the network until the customer tried to get a login
prompt on the console while it was in text mode. Then
the system crashed.
Thanks for the pointers,
John
|
834.7 | We'll get to the bottom of this... | SMURF::PETERT | rigidly defined areas of doubt and uncertainty | Fri Feb 21 1997 17:43 | 11 |
| I just got an update for this system crashing problem, but it
still crashed my machine. I'll be working with the kernel person
responsible to get it resolved. It seems to affect different
systems in different ways, and the ones it seems to spare entirely
are the older machines, which makes me start to wonder if it's
a timing issue, or maybe dependent on the rev of the Alpha chip.
It hits my EV5 and EV56 machines, but seems to spare the Flamingos
and Sandpipers, which are EV4s at the most.
PeterT
|
834.8 | | TLE::MURRAY | Wanfang Murray | Mon Feb 24 1997 07:23 | 7 |
|
This problem sounds familiar. Jeff Denham of the kernel group
worked on a similar customer problem last month. I will send
this info to him to see if he recognizes it.
Wanfang
|