T.R | Title | User | Personal Name | Date | Lines |
---|
539.1 | Some things to try | HARMNY::CUMMINS | | Thu Mar 20 1997 13:51 | 65 |
| Looks like there are two independent things going on with this system..
1. AlphaBIOS hang.
2. SRM SHOW command returns "string too long".
My initial impression is that you haven't had, nor do you now have any
HW issues here and that you're running into two FW issues. Swapping the
Saddle has possibly complicated the picture (at least for me) since I'm
not sure from your mail whether the "string too long" problem occurred
after or before you swapped the Saddle..
The "string too long" problem is described in note 93.32. This is an SRM
console firmware "feature" that was changed for the better in the V4.8
console release (V3.9 FW CD has V4.8-5 and Web pages' /interim/as4x00
directory has V4.8-6, required for 466 MHz CPUs). If you don't wish to
update to V4.8-5 or -6, clearing out the ESC NVRAM area used to store the
SRM-defined, non-default environment variable (EV) values should fix this
particular problem. V4.8 console also adds a CLEAR_SRM_NVRAM script for
clearing out that area. See below for how to clear it when running
versions of console earlier than V4.8.
How the SRM EVs got to this state is presumably due to one of two things:
1. BOOTDEF_DEV was set to point to a boot device and then the boot
device was moved around within the system without clearing or
otherwise resetting the EV first.
2. The spare Saddle that was swapped had a non-default BOOTDEF_DEV
EV setting which happened to map to a non-boot device in your
machine.
As far as the AlphaBIOS hang is concerned, my first guess was the AlphaBIOS
problem with the fast NVRAM part described in note 93.8. However, you said
you have tried different versions of AlphaBIOS (including V5.24 which
fixed this problem), so apparently this is not the issue. Still, it
would be interesting to note whether this machine's new Saddle has the
fast or the slow NVRAM part on it - see note 93.8 for details. I would
recommend clearing the ARC NVRAM and starting over..
Here is how to clear the NVRAM regions if you don't have V4.8 console and
its CLEAR_SRM_NVRAM and CLEAR_ARC_NVRAM scripts. The 8KB ESC NVRAM is
divided by SW into three unique regions:
1. AlphaBIOS EVs; AlphaBIOS disk partition data; etc. (3KB in size)
2. ECU data (3KB in size)
3. SRM console EVs; SRM power-up NVRAM script; etc. (2KB in size)
To clear these respective regions:
P00>>> d eerom:0 -b -n bff 0 # ARC region (#1)
P00>>> d eerom:c00 -b -n bff 0 # ECU region (#2)
P00>>> d eerom:1800 -b -n 7ff 0 # SRM region (#3)
Or to clear all three at once:
P00>>> d eerom:0 -b -n 1fff 0 # All three regions
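As a cross-check on the offsets and counts above, here is a minimal Python sketch that derives the same deposit commands from the three region sizes. The region names and sizes come from this note; the helper name `deposit_commands` is hypothetical, and the reading that `-n N` covers N+1 byte locations is an assumption inferred from the values used above, not something taken from the console documentation.

```python
# Sketch only: rebuilds the eerom deposit commands from the region sizes
# given in the note. Assumes "-n N" means N+1 locations are written.
KB = 1024

# (name, size in bytes), in the order the regions sit in the 8KB ESC NVRAM
REGIONS = [
    ("ARC", 3 * KB),  # AlphaBIOS EVs, disk partition data, etc.
    ("ECU", 3 * KB),  # ECU data
    ("SRM", 2 * KB),  # SRM console EVs, power-up NVRAM script, etc.
]


def deposit_commands(regions=REGIONS):
    """Return (list of per-region commands, total NVRAM size in bytes)."""
    cmds = []
    offset = 0
    for name, size in regions:
        # Start offset and count are printed in hex, as the console expects.
        cmds.append(f"d eerom:{offset:x} -b -n {size - 1:x} 0  # {name} region")
        offset += size
    return cmds, offset


if __name__ == "__main__":
    cmds, total = deposit_commands()
    for cmd in cmds:
        print("P00>>> " + cmd)
    print(f"P00>>> d eerom:0 -b -n {total - 1:x} 0  # all regions, {total // KB}KB")
```

The region boundaries fall at 0, c00 and 1800 hex, and the three sizes add up to the full 8KB, which is why a single deposit with a count of 1fff covers everything.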
It is advisable that you record data you want to restore (if possible to
do so) prior to clearing these regions. V4.8 SRM console also adds save
and restore NVRAM scripts should you want to save your NVRAM contents to a
floppy diskette after getting things set up the way you like - especially
useful if you need to swap a PCI motherboard (Saddle) for some reason..
Let us know what you find out.
BC
|
539.2 | Still hangs, "string too long" fixed. | BRUNEL::KIRBY | | Fri Mar 21 1997 10:04 | 20 |
| Bill,
Thanks for your input. Yes, I would (now) agree I have 2 problems; I
just hadn't connected the "string too long" message with the line that had
yet to be output on the screen! Message now fixed as recommended by clearing
the ESC NVRAM.
As for the hang..........
I had checked the physical NVRAMs early on, both saddles have
200NS (-20 part number) chips so no issues there.
I have cleared all the NVRAM regions as you suggested, but no change.
Still hangs at the point I expect the picture to appear.
Any further thoughts?
Steve.
|
539.3 | | MAY30::CUMMINS | | Fri Mar 21 1997 10:12 | 38 |
| SROM checksum tests the system flash ROM's XSROM sectors.
XSROM in turn checksum tests the SRM console + PAL sectors.
However, the only time AlphaBIOS gets checksum tested is when running
LFU or using the SHOW FLASH command at console. It's possible, though
seemingly highly unlikely, that AlphaBIOS image in system flash ROM is
partially corrupted. To discern whether this is true, use the following
SRM console command:
P00>>> show flash
The last several lines of output will tell you whether the AlphaBIOS image
is okay or not..
I say that it's highly unlikely because you've swapped the PCI
motherboard already and this didn't fix the problem. Is said machine
a UNIX or VMS machine or is it an NT machine? If UNIX/VMS, is it able
to boot/run the operating system?
Another thing to check is whether memory is bad. We've seen this
before. To check, do one or both of the following:
P00>>> b -h pmem:0
.
.
P00>>> info 1
Does the INFO command say there are any bad pages? If so, how many?
Invoke SRM console's system TEST command.
P00>>> test
.
.
Let us know what you find out.
BC
|
539.4 | Still no good! | BRUNEL::KIRBY | | Fri Mar 21 1997 12:31 | 20 |
| Bill,
Already checked the flash, OK on both original and replacement hardware.
Memory is also good, both methods. (Had run "test" for a good while
initially, and have even tried another pair of memory modules).
This is an NT system, but has been brought into my office to fix and
the Customer would not let me have the disks (data security). Thus I have not
tried any O.S., but may be able to try Unix distribution CDs next week.
Any more ideas?
Steve.
|
539.5 | | HARMNY::CUMMINS | | Fri Mar 21 1997 13:05 | 10 |
| Was the system ever working? It sounds as if at one point it was
working (since you say the customer is an NT site and there are NT
disks that were on the machine).
Assuming that's correct, do you know when AlphaBIOS started
hanging? Was there a particular event that triggered it? E.g. a
firmware upgrade; a reboot after some AlphaBIOS setup/init, etc.?
Running out of ideas..
BC
|
539.6 | | HARMNY::CUMMINS | | Fri Mar 21 1997 13:06 | 11 |
| Have you tried both serial and VGA modes? Which mode are you using now?
If only one or the other, I would try connecting to the other.
P00>>> set os_type nt
P00>>> set console serial (or graphics)
P00>>> init
Pressing the Halt button in gets you back to SRM if hung at AlphaBIOS, but
I presume you already knew this..
BC
|
539.7 | Worked for 4 months! | BRUNEL::KIRBY | | Fri Mar 21 1997 13:36 | 21 |
| Bill,
The system worked for 4 months fine. As far as I can ascertain it was
just switched off for some reason, and would not work when switched on again.
I can run AlphaBIOS from the serial port no problems, but obviously
this is no use to the Customer.
I have just booted a Unix distribution CD, and that comes up fine into
XWindows, but as I have no disks I can go no further!
(In desperation) I have ordered a complete set of parts for Monday......
System motherboard, CPU, saddle, horse ..... to swap together in case I have
another "blow-up" situation. At least that should eliminate the hardware
rapidly.
Steve.
|
539.8 | | MAY30::CUMMINS | | Fri Mar 21 1997 16:11 | 1 |
| Are you sure the VGA card is in PCI0? It must hang off of IOD0.
|
539.9 | The "expletive" solution!!! | BRUNEL::KIRBY | | Thu Apr 03 1997 13:27 | 29 |
|
Well ....... hmmmmm .......... it's fixed now.
The actual hardware fault was the VRC15 monitor. Honest! Everything else back
to the original.
Not a lot I can say really.
I guess the "picture" puts the display into a different mode, and the faulty
monitor would not switch to the mode required. I subsequently tried it on my
laptop and it would not work on that either.
I got the system back to site, put the Customer's disks back in, reset the
RAID controller's parameters correctly so I could see his 3 partitions, and
reset the boot definitions. NT wouldn't boot! Re-installed NT and still it
wouldn't boot. "Hit any key to continue" was the continual message!
Notes 58 and 464 describe that problem (which required me to downgrade
AlphaBIOS temporarily to resolve), but I won't dwell on that here.
I guess this last problem occurred because of clearing the NVRAM!
Thanks for your time, ideas and suggestions ....
Steve.
|