T.R | Title | User | Personal Name | Date | Lines |
---|
359.1 | On the FCVR bit .. | IOSG::MAURICE | IOSG ain't a place to raise a kid | Mon Mar 30 1992 19:17 | 15 |
| > Also, I understand that we recommend all V2.4 sites upgrade to K604
> before running the FCVR... The customer is reluctant to go to K604
> because of the potential impact on customizations with syntax errors,
> so they are wondering if there are any other options?
If you can't go to K604 then I would recommend that the customer still
tries to run FCVR. It's most likely to work, and if it doesn't there's
nothing disastrous that will happen.
How far can the customer go before getting syntax errors? (K601, K602,
K603?) What are they?
Cheers
Stuart
|
359.2 | Should they run an unpatched V2.4 FCVR? | SHALOT::LANPHEAR | Test the water or turn the tide? | Mon Mar 30 1992 21:39 | 14 |
| Hi Stuart,
They are running an _unpatched_ system, and they don't want to
install any patches if they don't have to. They understand that one of
the patches (603?) starts to evaluate syntax more carefully, and they
don't want any nasty messages to show up in front of company Executives...
And by the way, they have _never_ run the FCVR in over four years,
the SDAF goes from 900K+ to over 1.3M in the two weeks they go between
SDAF compression runs, and the janitor takes most of the weekend if
they compress the file cabinets during the run. They are worried that
the FCVR won't make it through in one weekend, and I had heard that
we don't recommend running the vanilla V2.4 FCVR... Do you still think
they should attempt an FCVR run?
|
359.3 | Go for it | IOSG::MAURICE | IOSG ain't a place to raise a kid | Tue Mar 31 1992 09:48 | 43 |
| Judge for yourself - here are the FCVR fixes in the K60x series:
1. FCVR can cause HIGH usage count. This can happen when the FCVR
deletes an SDAF record with attachments. The problem gets corrected
in the next FCVR run. In the meantime a HIGH usage count may cause
a file not to be deleted when it should. (The opposite problem of
a LOW usage count is the potentially serious one - where a file gets
deleted before it should).
2. FCVR can't handle search lists. This is a problem if your customer
has defined the OA$SHARxnnnn logicals as search lists. In this case
the FCVR will fail to delete files it should.
3. FCVR loops w. usage count >32767. This is a problem if there are
more than 32767 references to a shared mail message or attachment. It's
very unlikely to occur - but two sites did manage it! If it does, the
FCVR loops: it does no repairs, but causes no corruption either.
4. FCVR doesn't delete SDAF records for missing files. If a user has a
DOCDB record pointing to a shared file that does not exist then the
DOCDB record is not deleted.
Your customer is not likely to hit any of the above problems, and even
if they did the worst that can happen is that some files do not get
deleted when they should.
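One possible reading of the 32767 limit in fix 3: 32767 is the largest value a signed 16-bit integer can hold. The notes above do not say how the FCVR stores usage counts, so the 16-bit field in this Python sketch is an assumption, and `bump_usage_count` is a hypothetical helper, not FCVR code.

```python
import ctypes

INT16_MAX = 32767  # largest value of a signed 16-bit integer


def bump_usage_count(count: int) -> int:
    """Add one reference to a count held in an (assumed) signed 16-bit field."""
    # c_int16 does no overflow checking, so the value silently wraps around.
    return ctypes.c_int16(count + 1).value


if __name__ == "__main__":
    print(bump_usage_count(INT16_MAX - 1))  # still in range
    print(bump_usage_count(INT16_MAX))      # wraps to a negative number
```

A counter that turns negative at this point would never reach an expected positive total, which is one way a loop could arise; the real cause is not stated in this thread.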
As time to run is important to you, please read the instructions
carefully. Two points in particular:
1. Run TRM on a generic batch queue so that slave processes will fire.
2. Observe the recommendations for memory usage - to achieve fast
throughput a lot of memory is used, and so you want to avoid
excessive paging.
Given that, you can expect fast completion. On IOSG it used to take
about 14-15 hours to run - it now takes 1½ hours.
So please do run it, and also let us know how long it takes.
Cheers
Stuart
|
359.4 | We had to get rid of _lots_ of batch jobs | SHALOT::LANPHEAR | Test the water or turn the tide? | Thu Apr 02 1992 19:04 | 27 |
| Well, we tried it last night at 1am, and I had to remove about fifty
batch jobs this morning at 8. I selected the generic queue, but they
didn't mention that the processor specific queues all had job limits of
1. So, we were running five at a time (1/processor), but they each
took 1 - 1.5 hours to complete. Phases one & two completed by 5am, but
it looks as if all these batch jobs started at the same time. Can you
explain what all the slave processes do? I know there's one per
logical name found in the directory field of the profile, but how do
they relate to the three FCVR phases? SMLOG1.TMP contained:
ALL-IN-1 File Cabinet Verification and Repair Program
=====================================================
Version 2.4
Beginning Phase 1 at 01:08 AM -- scanning users' personal file cabinets
=======================================================================
and SMLOG2.TMP contained all the information from scanning the file
cabinets, and the totals for phase 1 and phase 2.
SMLOG3, SMLOG4, and SMLOG5 were all empty.
We are going to try again tonight with job limits of four per
processor, and see how far we get. Can anybody tell if we're on the
right track? We're running it in read-only mode for now.
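A rough way to see what raising the job limit buys is to count "waves" of batch jobs. The Python sketch below assumes five execution queues (from "five at a time"), uses 50 jobs purely as an illustrative figure, and assumes every job takes the same time however many run at once, which heavy paging could easily invalidate. `estimate_hours` is a hypothetical helper.

```python
import math


def estimate_hours(jobs, queues, job_limit, hours_per_job):
    """Return (waves, total_hours) if jobs run in equal-length waves."""
    concurrent = queues * job_limit        # jobs that can execute at once
    waves = math.ceil(jobs / concurrent)   # a partial last wave still costs a full slot
    return waves, waves * hours_per_job


if __name__ == "__main__":
    for limit in (1, 4):
        for hours in (1.0, 1.5):
            waves, total = estimate_hours(
                jobs=50, queues=5, job_limit=limit, hours_per_job=hours
            )
            print(f"job limit {limit}, {hours} h/job: {waves} waves, about {total} h")
```

Under these assumptions a job limit of 1 needs 10 waves and a job limit of 4 needs 3, so the elapsed time drops roughly in proportion as long as memory keeps up.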
Cheers, Dan'l
|
359.5 | If time is short ... | IOSG::MAURICE | IOSG ain't a place to raise a kid | Thu Apr 02 1992 19:35 | 33 |
| In 2.4 the FCVR goes through the profile and looks at the disc name in the
directory field. It then fires up one slave process for each different
disc name it finds. I know it will be small consolation, but in V3 it
is improved so that there is one slave process per real disc.
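The V2.4 rule described here (one slave per distinct disc name in the profile's directory field) can be sketched in Python as follows. The directory strings and both helper names are hypothetical, and the parsing simply takes whatever precedes the first colon.

```python
def disc_name(directory: str) -> str:
    """Return the part before the first ':' of a VMS-style file spec, uppercased."""
    return directory.split(":", 1)[0].strip().upper()


def slaves_needed(directories):
    """One slave per distinct disc name, as described for V2.4."""
    return len({disc_name(d) for d in directories if ":" in d})


if __name__ == "__main__":
    # Hypothetical directory fields taken from three profile records.
    profile = ["OA$USER1:[SMITH.A1]", "oa$user1:[JONES.A1]", "OA$USER2:[BROWN.A1]"]
    print(slaves_needed(profile))
```

Because the count is based on names rather than physical devices, several logical names that point at the same disc each get their own slave in V2.4.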
Each slave process does its share of Phase 1 processing, while the
Master gets on with Phase 2. When the Master has finished Phase 2 it
then waits for all slaves to complete. When that has happened it goes
into Phase 3. Phase 3 can complete in less than a minute if there are no
repairs. In your case there is potentially a very large number of
repairs, so it can take hours.
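The ordering described above (slaves share Phase 1 while the Master does Phase 2, and Phase 3 starts only once every slave has finished) can be sketched with Python threads. This is an illustration of the sequencing only: the three phase functions are placeholders and `run_fcvr` is a hypothetical name, not how the FCVR is implemented.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def run_fcvr(disc_names, phase1, phase2, phase3):
    """Run phase1 per disc in parallel with phase2, then phase3 once all are done."""
    with ThreadPoolExecutor(max_workers=max(1, len(disc_names))) as pool:
        slaves = [pool.submit(phase1, name) for name in disc_names]  # slaves start
        master_result = phase2()   # master works while the slaves run
        wait(slaves)               # master waits for every slave to complete
        slave_results = [s.result() for s in slaves]
    return phase3(slave_results, master_result)


if __name__ == "__main__":
    # Placeholder phase functions, purely for illustration.
    result = run_fcvr(
        ["DISC_A", "DISC_B"],
        phase1=lambda name: f"scanned {name}",
        phase2=lambda: "phase 2 done",
        phase3=lambda scans, master: (scans, master),
    )
    print(result)
```

The point of the structure is that Phase 3 cannot begin until the slowest slave finishes, so one overloaded queue delays the whole run.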
To improve Slave processing time I would recommend you turn off body
file checking until you have got the FCVR to clear up all the other
accumulated errors acquired over the years.
To improve Phase 3 consider:
a) Not bothering with a verify run. In verify mode it will print
full details of each usage count discrepancy. In repair mode
you just get a total. Reporting the discrepancies in detail may
take a long time.
b) There is an option of having messages that have no owner put
in the RECOVERED SHARED DOCUMENTS folder of a user, usually
the Manager. Creating an entry is expensive, and you may well
have hundreds of these. By blanking the name of the user who
will receive these (in the Master file record) they will instead
simply be deleted.
Cheers
Stuart
|
359.6 | Some problems with PORT PostScript printing | WELCLU::63945::MASON | Bruce Mason @WLO | Fri Apr 03 1992 15:38 | 27 |
| Going back to your original question about WPS-PLUS V4.0 printing to an HP
printer with PostScript cartridge. Yes, it does it, but the WPS-PLUS group
don't support it because the terminals group don't support attached
PostScript printers (since most attached PostScript printers are probably
attached to PCs rather than terminals, perhaps they asked the wrong people).
Things to watch for from my experience at a major ALL-IN-1 customer with
hundreds (thousands?) of attached PostScript printers:
The SETPAPERTRAY sequence is wrong, so printing with tray set to REAR may
print nothing, depending on how picky your printer is. Fix is available
from support centre or just set them all to null in the print table.
There are errors in the PostScript generated by WPS-PLUS. If you send too
many in too short a period, your printer may object and throw the rest of
the printout away. A good way to reproduce this is to use XP to print more
than about 6 messages to PORT PS. Most printers do an automatic reset
after a timeout, hence the time-related nature of this one. The only
workaround we could find was to do a .TEXT "<ESC>[5i^D<ESC>[4i" (printer
on, Control/D, printer off) between each document/attachment printed - we
used a copy of WPPPORT.SCP renamed to WPPPORT_PS.SCP.
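For reference, the byte sequence in that workaround can be spelled out in Python. `<ESC>[5i` and `<ESC>[4i` are the standard ANSI media-copy sequences for turning the attached printer port on and off, and Control/D is byte 0x04, which serially attached PostScript printers conventionally treat as end of job. The helper names are hypothetical and this is not the contents of WPPPORT_PS.SCP.

```python
ESC = b"\x1b"
PRINTER_ON = ESC + b"[5i"   # ANSI media copy: start passing data to the attached printer
PRINTER_OFF = ESC + b"[4i"  # ANSI media copy: stop passing data to the attached printer
CTRL_D = b"\x04"            # Control/D


def job_separator() -> bytes:
    """Printer on, Control/D, printer off."""
    return PRINTER_ON + CTRL_D + PRINTER_OFF


def join_documents(documents):
    """Place the separator between consecutive documents (illustrative only)."""
    return job_separator().join(documents)


if __name__ == "__main__":
    print(job_separator())
```

Sending the separator between documents gives the printer a clean job boundary instead of waiting for its timeout reset.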
Alternatively, wait until WPS-PLUS V4.1 in ALL-IN-1 V3.0 - maybe it will
all be fixed...
Have fun,
Bruce
|
359.7 | What about SDAF continuation record errors? | SHALOT::LANPHEAR | Test the water or turn the tide? | Mon Apr 13 1992 17:25 | 29 |
| Hmm... Looks like a net disconnect lost the last note. Anyway...
Bruce, since we're both working with different ends of the same
customer, I'll have Vid contact Mr. Carrier directly.
Back to the FCVR questions...
Phases 1 and 2 completed in 19 hours and 9 minutes. Phase one
complained about users logged in, since we were running it read-only on
Saturday. Phase 2, however, complained about:
Processing USERDISK3:[ALLIN1.DATA_SHARE]OA$DAF_E.DAT file at 02:11 AM
-----------------------------------------------
Error: continuation record not found when cont_flag set:
First record-key = OA$SHARE103:ZSTXP5J4D.WPL
Continuation record-key =OA$SHARE103:ZSTYFZASP.WPL
There were 29 such errors.
Then it went on to say Phase 3 will not continue due to these errors.
Now I can understand about the users logged in, but isn't it going to
fix the DAF for us? Is there something we have to do to the DAF?
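The check behind that error message can be pictured as follows. This Python sketch assumes each record carries a continuation flag and the key of its continuation record; the field names `cont_flag` and `cont_key` and the dictionary layout are hypothetical and do not reflect the real SDAF record format.

```python
def missing_continuations(records):
    """Return (first_key, continuation_key) pairs whose continuation is absent."""
    errors = []
    for key, rec in records.items():
        # A set flag promises a continuation record; report it if the key is not on file.
        if rec.get("cont_flag") and rec.get("cont_key") not in records:
            errors.append((key, rec.get("cont_key")))
    return errors


if __name__ == "__main__":
    # Keys taken from the log excerpt above; the record contents are made up.
    daf = {
        "OA$SHARE103:ZSTXP5J4D.WPL": {
            "cont_flag": True,
            "cont_key": "OA$SHARE103:ZSTYFZASP.WPL",
        },
    }
    for first, cont in missing_continuations(daf):
        print(f"First record-key = {first}, continuation record-key = {cont}")
```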
For stats, it processed 1972 DOCDBs, in 19:09, with a total of 850767
records. That's 34 seconds per user, on five nodes (4 processors each)
running three slaves per node.
Number of records in OA$SHARE: 489648 - 03:59 AM
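The per-user figure can be re-derived from the numbers quoted. A small Python sketch, where `throughput` is a hypothetical helper and the elapsed time is taken as exactly 19 hours 9 minutes:

```python
def throughput(elapsed_hhmm, users, records):
    """Return (seconds per user, records per second) for an HH:MM elapsed time."""
    hours, minutes = (int(part) for part in elapsed_hhmm.split(":"))
    seconds = hours * 3600 + minutes * 60
    return seconds / users, records / seconds


if __name__ == "__main__":
    per_user, per_second = throughput("19:09", 1972, 850767)
    print(f"{per_user:.1f} s per DOCDB, {per_second:.1f} records/s")
```

This works out to just under 35 seconds per DOCDB (the 34 seconds quoted, truncated) and roughly 12 records per second across all slaves.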
Thanks in advance, Dan'l
|
359.8 | | IOSG::MAURICE | IOSG ain't a place to raise a kid | Mon Apr 13 1992 19:10 | 14 |
| The continuation record errors will not prevent Phase 3, but the users
being logged in will. The log has been improved in V3 so that the
errors that prevent Phase 3 are clearly marked.
In earlier versions of the FCVR the continuation record errors did not
get repaired (only reported), but they are now. I can't remember which
patch introduced this capability.
The time to complete seems excessive - have you checked the memory
recommendation in the manual?
Cheers
Stuart
|
359.9 | Here's what they have... | SHALOT::LANPHEAR | Test the water or turn the tide? | Mon Apr 13 1992 19:55 | 32 |
| Hi Stuart,
One point to consider is that they have 99 logical names in use, so
there are 99 slaves which have to run.
The memory is as per the manual. The VMS account is set to have
a WSextent of 50,000, but the batch queues all have:
/BASE_PRIORITY=4 /CPUDEFAULT=INFINITE /CPUMAXIMUM=INFINITE /JOB_LIMIT=3
/OWNER=[SYSTEM] /PROTECTION=(S:RWE,O:RD,G:W,W) /WSDEFAULT=4096
/WSEXTENT=24000 /WSQUOTA=4096
UAF has:
Maxjobs: 0 Fillm: 500 Bytlm: 84000
Maxacctjobs: 0 Shrfillm: 0 Pbytlm: 0
Maxdetach: 0 BIOlm: 500 JTquota: 3072
Prclm: 10 DIOlm: 4096 WSdef: 1024
Prio: 4 ASTlm: 600 WSquo: 3072
Queprio: 0 TQElm: 300 WSextent: 50000
CPU: (none) Enqlm: 2000 Pgflquo: 150000
WSMAX runs from 28,000 to 50,000 on the systems. VIRTUALPAGECNT runs from
300,000 to 460,000 on the systems. There are digits on site who are
managing the systems, so if you can make any recommendations, I'll
pass them along.
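One thing the figures above may imply: the queue /WSEXTENT of 24000 is lower than the UAF WSextent of 50000. The Python sketch below assumes the usual VMS behaviour that a working-set value set on the batch queue is used in place of the UAF value when the job itself specifies none, and that WSMAX caps the result. That rule is an assumption here and should be checked against the VMS documentation for the version in use; `effective_wsextent` is a hypothetical helper.

```python
def effective_wsextent(uaf, queue, wsmax):
    """Assumed rule: a non-zero queue value replaces the UAF value; WSMAX caps it."""
    chosen = queue if queue else uaf   # 0 stands for "not set on the queue"
    return min(chosen, wsmax)


if __name__ == "__main__":
    for wsmax in (28000, 50000):
        print(wsmax, effective_wsextent(uaf=50000, queue=24000, wsmax=wsmax))
```

If that assumption holds, the slaves would be limited to 24000 pages on every node regardless of the account setting, which may be worth raising with the system managers.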
We are running the unpatched FCVR, so it looks like the continuation
record errors won't get fixed. I didn't think that they would prevent
the FCVR from entering phase 3, but I did wonder from the log file...
Cheers, Dan'l
|