
Conference mvblab::alphaserver_4100

Title: AlphaServer 4100
Moderator: MOVMON::DAVISS
Created: Tue Apr 16 1996
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 648
Total number of notes: 3158

516.0. "SWXCR and invalid OS selection..help" by LEDER1::BENDEL () Fri Feb 28 1997 14:25

    I have a 4100 with two SWXCR'S and one KZPBA. I installed NT on
    disk 0 of SWXCR 0. No problem, boots fine. Now for the detail, and
    the problem.
    
    SWXCR 0
    		All 7 drives are configured as JBOD.
    		NT 4.0 is installed, boots and runs fine on disk 0.
    		WINNT partition is NTFS, system partition obviously is FAT.
    
    Everything is working perfectly.
    
    I then wanted to create one logical RAID drive from 5 of the disks on
    SWXCR 0. I started SWXCRMGR and created a group and logical drive using
    the last 5 drives on the controller (not touching drive 0, which has NT
    4.0). No problem, until I tried to reboot.
    
    Now I get the "OS Selection is invalid"...it can't find the osloader
    files on disk 0 partition 2, as is specified. This has not changed.
    Hard disk utilities shows disk 0 as having the NTFS and FAT partitions.
    
    Does making any RAID changes on the SWXCR the OS is on cause it to
    be damaged, i.e. not bootable? What has happened, and how can I recover?
    
    thanks....... Steve
T.R  Title  User  Personal Name  Date  Lines
516.1  no help??  LEDER1::BENDEL  Wed Mar 05 1997 11:37  7
    Nobody has any ideas on what has to be done to recover, or is aware of
    this being a problem? (Did I do something I was not supposed to?)
    
    I will be reinstalling NT very soon if this leads nowhere, but I am
    attempting to verify whether this is a known problem, as well as how
    to recover. (Customers will NOT want to have to reinstall because of
    any changes to that controller setup.)
516.2  More info?  MOVMON::DAVIS  Wed Mar 05 1997 13:06  11
    Do you really have a KZPBA?  If so, it's not a supported option on this
    machine yet.
    
    Not that this is your SWXCR problem, it probably isn't.  
    
    You do have the hot fix for NT 4.0, yes?
    
    You may want to put more configuration info in here (SRM version,
    AlphaBIOS version, memory, # of CPUs, other I/O options, etc.)
    
    Todd
516.3  not even getting to the OS level..but...  LEDER1::BENDEL  Wed Mar 05 1997 14:07  14
    SRM 2.0-3, AlphaBIOS 5.24 (-4?)
    As far as SP2 and the HOTFIX, I had just pulled them over the net to
    install; that's when the system failed to boot :) However, I can't
    imagine the HOTFIX will fix a boot problem; it is supposed to be for
    pool corruptions (only?). I am not even into the OS level yet.
    
    The system was shipped a few months ago. I thought it was a KZPBA; it
    is the QLISP SCSI controller that replaced the standard NCR on new
    shipments.
    Maybe it is not a KZPBA. Is the KZPBA just not supported on NT? On the
    4100 at all? I will look into that particular controller, but, as you
    said, that's not my RAID controller problem.
    
                                                                  
516.4  Any chance of getting CPU, motherboard, Horse revs?  MOVMON::DAVIS  Thu Mar 06 1997 09:42  6
    You almost certainly have a KZPDA.  The KZPBA isn't shipping yet.  
    
    Any other PCI options?  We're up to SRM 4.8-6 and AlphaBIOS 5.28, but I
    don't suspect that's your problem either...
    
    Todd
516.5  Im off to try more things :)  LEDER1::BENDEL  Thu Mar 06 1997 09:48  9
    THAT is what it was...KZPDA....it's been a while since I have worked
    on this....easy to slip a character..... :):)
    
    I know the SRM and AlphaBIOS are not the latest; I do plan to upgrade
    them before further testing...but agree I don't think that is the problem.
    Just PCI video, RAID, and DE500. I'll try some things today..then a
    reinstall to try some more things.....
    
    thanks.....  Steve
516.6  try moving RAID controller  MOVMON::DAVIS  Thu Mar 06 1997 10:08  6
    One other thing to try... we've been seeing some issues with a fully
    loaded PCI segment 0 with nothing in PCI segment 1.  You may want to
    move the RAID controller to PCI 1.  (segment = independent PCI bus = 4
    slots.  Video card must be in PCI 0.)
    
    Todd
516.7  MAY30::CUMMINS  Thu Mar 06 1997 10:36  27
Steve,
    
I forwarded your note .0 to my AlphaBIOS/HAL counterpart in DECwest, but no
feedback from him yet.
    
The following is an excerpt from some recent qual team meeting minutes that
caught my eye. It doesn't sound at all like the issue you're running into
since you booted okay off disk0, partition2 initially, but thought it might
be worth posting nonetheless.. To date, I haven't seen any other reports of
problems like the one you're describing.

Here's the qual team mtg minute excerpt:
  
  Experienced a problem during NT install on DECWest Development During. NT
  would report that "Boot device not accessible". It turned out that I had
  installed unformatted RZ29's in a storage shelf. I had partitioned each
  disk using the AlphaBIOS Hard Disk Setup menu. This was not sufficient.
  It was also necessary to FAT format each drive using AlphaBIOS Hard Disk
  Setup. A quick format was sufficient. This might explain customer
  problems with NT install reporting "inaccessible boot device". It might
  also explain problems with switch between UNIX and NT. This problem
  occurs when one attempts to boot NT after UNIX. The NT boot selection in
  AlphaBIOS is valid, however, executing the boot selection results in a
  return to AlphaBIOS with no error information. Both of these bugs are
  potentially AlphaBIOS problems because, in both cases, an ARC app
  (setup.exe or osloader.exe), are using ARC callbacks to perform disk IO
  functions. Mark this as under investigation.
516.8  MAY30::CUMMINS  Thu Mar 06 1997 10:37  2
    DECwest contact for AlphaBIOS/HAL/NT issues on 4100/4000 is Matthew
    Buchman (OLEUM::BUCHMAN).
516.9  thanks......dont stop with ideas :)  LEDER1::BENDEL  Thu Mar 06 1997 15:22  8
    thanks....my first attempt to recreate this failed, I hate when that
    happens! I am trying different things now...SWXCR init is busy now,
    should take the rest of the day.
    
    Thanks for the involvement.....I know that what I did that caused this
    problem should not have caused it, and I know what I did.
    
    steve
516.10  Any ideas?  KUTIPS::ROBILLARD  Thu Mar 13 1997 16:18  16

I am running into a similar problem on my system. I created a logical drive
(RAID1) with 4 partitions. I loaded NT 3.51 and when I tried to reboot it
said it could not find the osloader files on disk 0 partition 4 . When I look 
at the "Hard disk setup" from AlphaBIOS it shows the logical drive as 
***Off-Line***. ?????

I power failed the system and then rebooted to no avail. I used SWXCRMGR to 
run a parity check and all is fine. I again used SWXCRMGR to fail the drives 
and then make them optimal...still nothing.

I can elaborate a little on my hardware configuration if you'd like but in the
meantime, what gives?

Ben
516.11  << Try Revalidating the OS Selections in AlphaBIOS >>  POBOXA::COMMO  I'll find no bug before its time!  Thu Mar 13 1997 17:46  21
This might not be the complete answer for this particular problem, but I've
seen similar problems when I've had a single disk and added another disk.
Under certain conditions the OS selections *appear* to become invalid.  I've
then gone into the AlphaBIOS Setup/Utilities/OSSelections.  Sure enough, you
have to <Enter> a number of times as it tells you that each component of the
OS selection is invalid, but then it displays the boot selections.  I then
select each OS selection in turn and hit the "Validate" function key (F9,
I think) and AlphaBIOS reports the selection as valid.  I then do an F10 to
store the info and we're off and running.

What I suspect happens is that AlphaBIOS displays the DiskNo/Partition to us
in one form but keeps it in a different form - with the appropriate mapping.
Adding a new disk or volume changes what the mapping has to be and hitting the
F9 (Validate Function) causes a remapping.  
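
To make the suspected remapping concrete, here is a minimal Python sketch. It is purely illustrative: the enumeration model, the disk labels, and the function names (enumerate_disks, resolve, revalidate) are all assumptions made up for this example, not AlphaBIOS internals. It only shows how a selection stored as a positional disk number can end up pointing at the wrong disk once the set of disks changes, and how rebuilding it from the disk's identity fixes that.

```python
# Hypothetical model only: none of these names come from AlphaBIOS.

def enumerate_disks(controllers):
    """Flatten (controller, units) pairs into one ordered list of disk labels."""
    return [f"{ctrl}:{unit}" for ctrl, units in controllers for unit in units]

def resolve(selection, disks):
    """Return (disk_label, partition) for a stored (disk_number, partition)."""
    disk_number, partition = selection
    if disk_number >= len(disks):
        return None  # stored number no longer refers to any disk
    return disks[disk_number], partition

def revalidate(target_disk, partition, disks):
    """Rebuild the stored selection from the disk's label (the modeled 'Validate' step)."""
    return disks.index(target_disk), partition

# One controller: the stored selection (disk 0, partition 2) finds the boot disk.
before = enumerate_disks([("swxcr0", ["d0"])])
stored = (0, 2)

# A second controller is assumed to enumerate first, shifting the numbering.
after = enumerate_disks([("kzpda0", ["d0"]), ("swxcr0", ["d0"])])

stale = resolve(stored, after)                # now points at the wrong disk
fixed = revalidate("swxcr0:d0", 2, after)     # remapped selection
```

In this model the stored selection is never corrupted; only the numbering underneath it moves, which would be consistent with a revalidate step being enough to recover.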

I am unhappy that all this sounds vague and invite the appropriate Wizards to
clarify.  I do know that I've seen this behavior before.  Whether it will clear
this particular problem is unknown since I've never installed NT on a disk
behind a Mylex.

- norm
516.12  KUTIPS::ROBILLARD  Mon Mar 17 1997 10:59  10

I'm not sure if that last response was for me, but logical drive 0 is 
***Off-Line*** when seen from AlphaBIOS. Even if I could change the 
OS selection to point to the drive, I still can't access the drive.

I'm thinking of re-initializing everything from SWXCRMGR and starting 
from scratch.

Ben
516.13  thanks.....I never could recreate it  LEDER1::BENDEL  Tue Mar 18 1997 10:31  13
    Whenever I've seen the offline status I had to recreate the disk and reinstall.
    I don't "think" I ever successfully restored the disk status to "ready".
    What you are seeing is yet another peculiarity of the RAID controller
    behavior that we have all come to love :)
    
    Sorry to say I think that disk is lost..but I don't think you need to
    reinitialize it, which only saves you an hour anyway.
    
    steve
    
    Norm  thanks for letting me know you have seen this behavior too.
    Unfortunately I was never able to successfully validate any new OS
    selections I created.....
516.14  ????????????  KUTIPS::ROBILLARD  Tue Mar 18 1997 15:32  18

FWIW, I was able to duplicate the error on this disk, and then take a 
different approach to get around the problem.  Initially, when I had
the problem, I configured the drive with 4 partitions using AlphaBIOS.
I would install NT, format some of the partitions, and manually 
use the format command from DOS to have my NT partition changed to 
NTFS at reboot time. Of course this never happened because my drive 
would go ***Off-Line***.

To get past this I created only one 32 MB boot partition (FAT) in 
AlphaBIOS and then let NT prompt me to create and format the NT
partition during the installation. I then created my last 2 partitions
after the installation of NT and voila, no more problem.

Strange! 

Ben 
516.15  saw the problem again....on a KZPDA  LEDER1::BENDEL  Tue Mar 25 1997 09:10  13
    Well, this problem has finally resurfaced. This time I was making
    changes to the SCSI configuration. Went into NT control panel, and
    removed the Mylex controllers from the config, since I was removing
    the swxcr's and drives connected to them. All that was to be left
    configured was one KZPDA SCSI controller and one BA356 shelf of
    drives.
    After doing the remove within NT, shutting down, physically removing
    the SWXCR's, when rebooting I got the "Invalid OS Selection"..again.
    Trying to verify the boot selection this time recovered it.
    
    Looks like random and infrequent corruption across my SCSI controllers?
    
    stay tuned.    Steve
516.16  .14 ...Thats the way!  QCAV01::VARUN  Tue May 06 1997 09:24  8
    The way described in .14 seems to be the only way to convert FAT to NTFS in
    WNT on Alpha.
    If partitions are created using arcinst for Alpha, arcinst doesn't
    allow this conversion, either through DOS or through Disk Manager in NT 4.0.
    
    regards,
    
    Varun