Title: | SABLE SYSTEM PUBLIC DISCUSSION |
Moderator: | COSMIC::PETERSON |
Created: | Mon Jan 11 1993 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 2614 |
Total number of notes: | 10244 |
I was trying to explain to a customer what happens if the OS (here Digital UNIX) finds a broken CPU during runtime. Before it panics, it masks the failed CPU so that the failed CPU is excluded during reboot. I then tried to simulate a CPU failure by setting cpu_enabled to 2 in order to exclude CPU 0. We tried this first on a 4100, but the console refused to disable the CPU. On a 2000 we could even set cpu_enabled to 0, but VMS saw CPU 0 anyway. I have not yet had time to install DU on that system. Can someone explain what happens if the primary CPU fails? And can I change the primary CPU from the console, i.e. exclude CPU 0? Or do I have to swap CPU boards? The 8x00 have a cpu_primary environment variable; it would be interesting to understand why they have it and the smaller ones don't. Regards Hartmut
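For readers unfamiliar with the mask, the question above treats cpu_enabled as a bitmask in which bit n stands for CPU n, so a value of 2 leaves only CPU 1 enabled. The following is a minimal sketch of that reading; the function names and the four-CPU default are assumptions made for illustration and are not console code.

```python
def enabled_cpus(mask: int, max_cpus: int = 4) -> list[int]:
    """Return the CPU numbers whose bit is set in mask.

    Assumption: bit n of cpu_enabled corresponds to CPU n,
    as implied by "set cpu_enabled to 2 in order to exclude CPU 0".
    """
    return [cpu for cpu in range(max_cpus) if mask & (1 << cpu)]


def mask_without(mask: int, cpu: int) -> int:
    """Return mask with the bit for the given CPU cleared."""
    return mask & ~(1 << cpu)


if __name__ == "__main__":
    print(enabled_cpus(0x2))            # CPUs left enabled by a value of 2
    print(hex(mask_without(0xF, 0)))    # all four CPUs minus CPU 0
```

Under this reading, a value of 0 would enable no CPU at all, which is consistent with the console or the OS not honoring it as written.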
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
2604.1 | CLOUD::SHIRRON | Stephen F. Shirron, 223-3198 | Thu May 22 1997 17:32 | 5 | |
The SROM code, which runs before the SRM console code, is the only entity capable of disabling CPU 0 and selecting a different CPU to be the primary. Using cpu_enabled will NOT cause a new primary to be selected. stephen | |||||
2604.2 | Two cents re: 4100/4000 platform and CPU_ENABLED | HARMNY::CUMMINS | Fri May 23 1997 19:08 | 31 | |
The 4100 platform supports up to four CPUs. Any CPU except CPU0 can be disabled. The CPU_ENABLED EV is used to perform this function. At power-up, each CPU in the system is told to start the SRM console, but only if enabled via CPU_ENABLED. We were requested by the operating system groups not to allow disabling CPU0. Technically, we had the option of only allowing it to be disabled if other CPUs were present/okay in the machine. Still, the possibility existed that, were we to allow CPU0 disables, a faulty CPU elsewhere in the system would preclude the system from coming up. In the end, the OS groups wanted no part of disabling CPU0, so we complied. It should be noted that the 4100/4000 system cannot operate without a CPU0, since it provides the oscillator for the system bus. IMHO, CPU_ENABLED is provided for two reasons: 1) To disable suspect (faulty HW) CPUs. We typically provide excellent fault coverage during power-up and auto-disable CPUs if we detect a fault anyway. CPUs 1,2,3 can be disabled on 4100. 2) To enable performance comparisons on SMP machines without requiring HW to be physically removed from the machine. Thus, one can measure performance on a quad-CPU 4100 and compare against performance on a triple, a dual, and a uni, simply by adjusting the CPU_ENABLED EV and rebooting the OS. This can be done on 4100. | |||||
2604.3 | CPU0 limitation is a step backward | STAR::jacobi.zko.dec.com::jacobi | Paul A. Jacobi - OpenVMS Systems Group | Tue May 27 1997 14:28 | 10 |
>>> It should be noted that the 4100/4000 system cannot operate without a CPU0, since it provides the oscillator for the system bus. Please consider removing this design limitation on future systems. The requirement for CPU0 to be present is a step backward in terms of CPU fail-over functionality that exists on Sable and even on old VAX6000 systems. -Paul | |||||
2604.4 | MAY30::CUMMINS | Wed May 28 1997 10:08 | 3 | ||
CPU0 need only be present (and have a working system bus oscillator). Most of the CPU (cache, EV5, SROM, etc.) can be terribly broken, and the console and the O/S should still come up on an SMP 4100/4000 machine.
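
The 4100 rule described in 2604.2 (any CPU except CPU0 may be disabled) and the performance-comparison use case can be modelled roughly as below. This is only a sketch of the stated policy, not the console's implementation: the function names are hypothetical, bit n is assumed to correspond to CPU n, and the choice of always keeping the lowest-numbered CPUs for the triple/dual/uni runs is one possible convention.

```python
CPU0_BIT = 0x1  # assumption: bit 0 of CPU_ENABLED corresponds to CPU0


def validate_cpu_enabled(mask: int) -> bool:
    """Model of the 4100 rule: accept a mask only if CPU0 stays enabled."""
    return bool(mask & CPU0_BIT)


def comparison_masks(num_cpus: int = 4) -> dict[int, int]:
    """Masks that enable the lowest n CPUs, for n = num_cpus down to 1.

    Every mask keeps CPU0 enabled, so each one passes validate_cpu_enabled.
    """
    return {n: (1 << n) - 1 for n in range(num_cpus, 0, -1)}


if __name__ == "__main__":
    print(validate_cpu_enabled(0x2))   # mask from the original question
    for n, mask in comparison_masks().items():
        print(f"{n} CPU(s): CPU_ENABLED = {mask:x}")
```

In this model the value 2 from the base note is rejected, matching the report that the 4100 console refused it, while the quad, triple, dual and uni masks all keep CPU0 enabled.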