T.R | Title | User | Personal Name | Date | Lines |
--- | --- | --- | --- | --- | --- |
105.1 | Module 10 Posted for Review | WHEEL::MOSTEIKA | Paul @ GSF/B21, DTN 264 (884)-2832 | Thu Sep 12 1991 14:16 | 8 |
| Please note that Module 10 - System Monitoring, is currently available on node
SUPER for your technical/content and organizational review.
SUPER::ES$REVIEW:[SYSNET_III]SYSNETIII_CHAP10.PS
Regards,
Paul M.
|
105.2 | Comments from DC Training Center | TEACH::WENDY | | Fri Sep 27 1991 13:28 | 66 |
|
SYSTEM MONITORING
10
1-9 I don't think this page really fits in here. We are talking
about monitoring a cluster here, not improving performance. Anyway this
tells them how to relieve a bottleneck, but not how to tell if they have
one, which would include another monitor display. Maybe show them the
commands to point the system to those files if you move them to another
disk. But really I think the page should go.
1-10 What do you do with this info? Like the instructor note says,
you can't do much with this info, so why show it to them? It just
generates questions that you can't answer because most of the
display is useless.
1-11 Same as above.
1-12a It says arriving and departing network packet rates are roughly
equivalent. What if they are not? What does that mean? Put that in here.
1-14 I don't think INIT is a good command to start off with here. Why
not just get into show cluster and assume you get the default display?
Since you are removing software and votes, why not show them what the
display looks like before you do that?
1-15 What do they do with the info in example 1-11? How would you know
if any of this is wrong?
1-23 If we have troubleshooting in this module maybe the module
should be called System Monitoring and Trouble Shooting.
1-26 Table 1-4 The bottom part of the diagram is confusing. The possible
causes do not correspond with the Trouble shooting technique.
1-27 Again the bottom of this chart stinks. It was bad in SMII and
here it is again. For example, "Type C and hit return." What are
you talking about, where do you type this and why? I have a feeling
you are talking about cancelling mount verification, but how can anyone
figure that out?? Very, very vague.
Also "Take appropriate action": you can't be so vague. What is appropriate
action? That is what they will want to know.
1-29 Do $Search/window=
That is much more useful.
1-31a. Dynamic is the number of pages that have been found as bad since the
last boot.
1-33 This is not enough. Either leave it out totally and restrict it to
the appendix or put more in the main part of the chapter.
1-35 typo. It is errorlog.sys not errorlog.sass.
1-38 Put something in instructor notes about what to get out of this
display.
1-45 Will they know what a machine check is or why you do a machine
check? What will they do with this report? Maybe put something in here.
1-46 What do you do with this info?
Wendy
|
105.3 | Is there a more recent version? | NWGEDU::RODENBURG | Ed. Services, The Netherlands | Fri Nov 08 1991 10:46 | 7 |
| I want to review the material I received. Is there a more recent
version of the chapter included into the presented complete
courseguide?
Otherwise I will start reviewing the current one next Monday.
Joop
|
105.5 | | SUPER::MATTHEWS | | Fri Nov 22 1991 15:34 | 2 |
| In the pilot teach, this chapter took approximately 2-1/2 hours. (The
appendix was not covered.)
|
105.6 | New Version of System Monitoring is Available | HARDY::MOSTEIKA | Paul, ZKO1-1/D42 DTN 381 (881)-1075 | Tue Dec 17 1991 15:43 | 16 |
| You will find a new version of the Monitoring Module in the review area (as well
as a revised Performance Module).
SUPER>dire ES$REVIEW:[SYSNET_III]SYSNETIII_CHAP9.PS;1
Directory ES$REVIEW:[SYSNET_III]
SYSNETIII_CHAP9.PS;1
1329 11-DEC-1991 15:12:39.86 11-DEC-1991 15:12:43.61
All your comments were carefully considered, and you'll find that almost all
were implemented.
Thanks for all the review comments.
Paul M.
|
105.7 | Trouble shooting needs to be part of the title | SOAEDS::TRAYSER | Seniority means a bigger shovel! | Fri Feb 28 1992 02:13 | 70 |
| This is primarily the SW/HW trouble shooting modules from the SM II course.
No significant problems with this module, although it has a few displays
that I know came from V4 VMS.
Overall this chapter has some of the better instructor pages, quite useful
and in many cases very clear explanation of the student page -- nice job!
9-9 - 9-18 --
These pages jump back and forth between VAXclusters and DECnet. I'd
suggest starting with the DECnet displays, this helps define some of
the VAXcluster displays. I'd suggest something like:
Start with 9-8, then move to 9-14 through 9-18. Then go back to
9-7 (maybe even add a MONITOR CLUSTER -- remember that 65% of my
students did NOT go to SysNet 2) and then on to 9-9 through 9-13.
9-10a, last line --
Last I checked this wasn't true for all cases. If a node is already
in the cluster using a quorum disk and another node wants to join but
with its differently named quorum disk, the disk is ignored and
(providing all else is fine) the VAX joins without its choice of
drive.
9-14 - 9-15 --
I'd suggest combining these pages. Also, don't turn on all events.
Show what logging we have on and maybe show adding a certain item
or two. And, so long as we are here, how about explaining why
we chose to be looking at this stuff.
9-14 - 9-16 --
OK, so where do we go from here? If this is supposed to be a task
oriented course, this lesson on events needs to lead somewhere. We
look at events, we turn them on, but....
9-17, 1st bullet --
"Network activity can be monitored..." and the first bullet mentions
LIST command in NCP. I don't know of any list commands that will
display Network activity.
whole page --
We seem to be telling them that we can show them lots of stuff, but
we don't show them any.
9-18 --
WAIT! Why are we zeroing counters before we show them something?! The
page title does not match page contents. Add a SHOW command or just
something useful with the counters. For that matter, let's leave the
zeroing to the DECnet Mgr I course.
9-20a, 4th paragraph --
Nit - misspelled "DECnet sofyware..."
9-20, 4th bullet --
Change this to "Is it reasonable to swap in another device?", to make
sure that if a disk pack goes bad they don't start swapping it to another
drive, thus destroying several drives - a real "classic" mistake.
9-22, last sentence --
"The final step is to reboot the node" -- HA! Kill this line, this
is bogus. Rebooting the VAX may not fix the problem!! Just end it with
something like "contact your support organization".
Actually this entire page doesn't make any sense. The problem statement
seems to be that the VAX is working but only from the console. But the
instructor's page is talking about a system hang. If the system is hung
how can we run LAT$STARTUP??
More later....
|
105.8 | Thanks for the Comments | SUPER::MOSTEIKA | Paul, ZKO1-1/D42 DTN 381 (881)-1075 | Mon Mar 02 1992 10:00 | 30 |
| Keep those comments coming. It would've been nice to have them earlier, but,
I know how it is when you're teaching.
The general theme of this module was to monitor on a routine basis, for the
purpose of avoiding system degradation. Whether it's performance related,
errors, or what have you. Also as a prelude to the Performance module.
We wanted to have them use the tools available to do this. While they're
looking at an error log, we might as well explain what it is they're looking at.
I beefed up the IG pages and reports, (thanks for the attaboy). You must realize
that some of these reports couldn't be created, and had to be edited to conform
to the newer version. Even the manual has some old reports in them. If you have
any that you feel are good, send 'em to us. When I was teaching, I got my best
examples from the field: students, support...
It does seem jumpy, but it's organized by tool/utility: MONITOR, SHOW, NCP.
Mostly all of the review stuff was ripped out - page counts. But personally, I
think it should be there.
It was decided to remove the info on events because the students need to know
the different DECnet layers to make sense out of this. Where do we go? The
instructor will tell them the important event classes to monitor.
Thanks for the feedback. We are compiling everyone's notes for the update.
Remember, if you have a good example, a bad example, a good lab (we have enough
bad ones ;-) ), send them to us. I'm still waiting on those labs Greg!
Regards,
Paul M.
|
105.9 | The rest of the module... | SOAEDS::TRAYSER | Seniority means a bigger shovel! | Mon Mar 02 1992 23:29 | 44 |
| More...
9-7 --
I think we need something different here. I'd suggest a MONITOR CLUSTER.
It's mostly a variation of MONITOR SYSTEM so it's easily understood. The
current display leads to numerous questions about MSCP Servers, which have
not been discussed in detail in this course series.
9-10 - 9-13 --
All of these pages have slight formatting problems on the margins,
especially if the line has a bullet.
9-13 --
Kill this page. We haven't discussed SYSAPs in any detail yet. I'd
suggest you toss in information on the WRITE, SAVE and PAN commands. And
maybe SHOW_CLUSTER$INIT logical would fit here well, I always teach these
and I get a good response.
9-23, #1 --
What is @CRASH on a uVAX-II equivalent to? How about a VAXstation? This
was mentioned on 9-22a, but how about an example for reference.
9-29, DRA0, NODE1_SYS_V4 --
Looks like an ancient system with RM05s or RM03s. WOW! Now, how about
some updated listings so we don't have to discuss old hardware info.
9-30 - 9-33 --
I don't usually teach VAXsim since, like the instructor's page states,
this is really a Customer Services (F.S.) tool. *IF* I was to teach it,
I would definitely do it AFTER I taught the 'built-in' stuff like error
log and Analyze. Please move it behind error log or in an appendix.
9-35 - 9-36 --
One of these is mostly redundant of the other, please delete or maybe
consolidate these pages.
9-37, bullet #1 --
Comment on bottom should begin with "SWISH$" to be more technically complete.
9-39, bullet 2 --
Indention problems, see 9-41 for proper indenting.
|
105.10 | | HARDY::MATTHEWS | | Fri Mar 06 1992 08:46 | 2 |
| page 9-53, which I'm sure you all teach in great detail :-) "exits" in
the last line of figure 9-5 should be "exist."
|