
Conference forty2::mailbus

Title: MAILBUS - Message Router and its Gateways
Notice: Kit Copy Utility - 100.1, Problems - 5.*, Kit Support - 103.1
Moderator: FORTY2::YUILLE
Created: Thu Jun 11 1992
Last Modified: Thu Jun 05 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 3209
Total number of notes: 7125

3184.0. "Maximum transfer service images..." by TAV02::CHAIM (Semper ubi Sub ubi .....) Tue Mar 18 1997 06:45

Message Router V3.3a on VMS V6.2 DECNET phase IV

A customer recently upgraded from VMS V5.5-2 to V6.2 and from MR V3.2 to V3.3A.

He is having a problem: messages destined for this system are not always
picked up by MR and remain in the sending MR's mailbox.

He is seeing an error:

Maximum transfer services images

He told me that he installed MR with a default configuration.

What should he increase in order to overcome this problem?

Thanks,

Cb.
3184.1. by ACISS2::LENNIG (Dave (N8JCX), MIG, @CYO) Tue Mar 18 1997 12:47
    He should increase the 'number of transfer service images' setting.
    They'll need to switch to a Customized config to set it.
    
    As I recall, based upon customer demand, we changed the default for
    this from 'unlimited' to '20' in V3.3; check the release notes...
    
    Dave
3184.2. "specific references..." by FORTY2::LEWIS Wed Mar 19 1997 17:08
Specifically, Section 3.4.10 of the Message Router V3.3 Configuration Guide
describes this option.  As Dave says, the customer will have to swap to using a
customized configuration.

Also, Section 5.6 of the Message Router V3.3A Release Notes mentions the change
in the limit.

	Gill.
3184.3. "Increased to 300 - still has problem" by TAV02::CHAIM (Semper ubi Sub ubi .....) Tue Mar 25 1997 07:05
The customer reinstalled and reconfigured, accepting ALL the defaults except
the maximum number of transfer service images, which he set to 300.

Now, he has a six-node cluster. If he starts MR on four of the nodes in the
cluster, it appears to work fine; however, if he starts MR on a fifth node,
then after several hours all message transfer grinds to a halt. He needs MR
running on ALL six nodes since he does NOT use cluster aliasing and messages
are addressed to specific nodes.

He was not able to tell me if the errors are the same, so I have asked him to
start the fifth node and, as soon as message transfer stops working, to see
what errors (if any) are being generated. He did mention that he thinks the
behavior IS different: previously he would receive the "maximum ..." error even
when trying to send a message "interactively", and NOW he doesn't recall
receiving this error, but no messages are being transferred. At any rate, he
will call me as soon as he is able to reproduce the problem, which will take
several hours since, as I mentioned above, the problems start several hours
after the fifth node is started and NOT immediately.

In the meantime, perhaps someone has some suspicion as to what is happening.

He did mention one other thing:

Before the upgrade there would be a talker process on each node with the number
"1". After the upgrade there is still the one talker process on each node, but
the numbers are "1", "2", "3", "4" etc..

I assume that this is the expected behavior, but since he mentioned it...

Thanks,

Cb.
3184.4. by ACISS2::LENNIG (Dave (N8JCX), MIG, @CYO) Tue Mar 25 1997 11:37
    By any chance is the customer trying to run separate MR's on the
    nodes of the cluster? i.e. are/were there multiple [MB$...] trees?
    And while we're at it, are they running DDS in this environment?
    
    The rules for running MR in a cluster were rationalized in V3.3;
    this is covered in the docs/relnotes, and several notes in here.
    
    Dave
3184.5. "ONE [MB$...] - No DDS - stop one stops all" by TAV02::CHAIM (Semper ubi Sub ubi .....) Tue Mar 25 1997 12:30
Dave,

>
>    By any chance is the customer trying to run separate MR's on the
>    nodes of the cluster? i.e. are/were there multiple [MB$...] trees?

No, there is only ONE [MB$...] tree.

>    And while we're at it, are they running DDS in this environment?

No, DDS is not running.

The customer also told me that when they stop MR on one of the nodes, then all
the nodes currently running MR are stopped as well. This itself seems
suspicious and might perhaps help in pinpointing the customer problem.

Thanks,

Cb.



3184.6. by ACISS2::LENNIG (Dave (N8JCX), MIG, @CYO) Tue Mar 25 1997 16:16
    Issuing a stop is _supposed_ to shut it down cluster-wide. I find 
    it a bit worrisome that they consider this a change in behaviour, as
    it implies they had an "unusual" configuration before the upgrade.
    
    Do they have multiple system disks? Did they apply the upgrade to
    all system disks? (The docs give directions re: multiple system 
    disks, multiple sysuaf's, multiple PhIV DECNET databases, ...)
    
    Dave
3184.7. "Which document(s) ..??.." by TAV02::CHAIM (Semper ubi Sub ubi .....) Wed Mar 26 1997 05:04
Dave,

>    Do they have multiple system disks? Did they apply the upgrade to
>    all system disks? (The docs give directions re: multiple system 
>    disks, multiple sysuaf's, multiple PhIV DECNET databases, ...)
>    

I don't think that they have multiple system disks (I will ask him as soon as
he gets into the office); however, I know that they have multiple SYSUAF's.

In which document(s) are the directions?

Thanks,

Cb.
3184.8. by ACISS2::LENNIG (Dave (N8JCX), MIG, @CYO) Wed Mar 26 1997 12:15
    Try Section 7.4 of the "Configuration Guide"; there are other
    cluster related topics elsewhere in this and the other manuals.
    
    Dave
3184.9. "Several Questions..." by TAV02::CHAIM (Semper ubi Sub ubi .....) Wed Mar 26 1997 14:27
Dave,

Thanks for your reply.

I read the chapter you suggested. According to what the customer has told me he
manually made sure that all the SYSUAF.DAT files were identical with respect to
the MR accounts (UICs & PASSWORDS). 

According to the documentation, usually MR needs to be installed only once, but
that there are circumstances that might dictate installing on more than one of
the nodes, but it was NOT very clear what circumstances would dictate this, and
it didn't appear to me that having multiple SYSUAF.DAT files was one of them.
I'd appreciate it if you could clarify when multiple installations would indeed
be required.

I told the customer that when MR is stopped on one node, the EXPECTED
behaviour is indeed for MR to stop on the entire cluster. This somewhat
surprised him, but he accepts it.

I am planning on going on-site tomorrow morning. Currently all six nodes are
running MR and they have not had any problems so far for almost 24 hours. I was
going to reinstall and reconfigure, but we decided that if the system is still
running tomorrow that we would wait and see.

The customer told me that he uses start= on all his nodes, rather than
minstart= on all the nodes except the primary node. Could this be a factor?
Would you suggest using minstart= on the nodes other than the primary node?

I will be here at the office for about 15 minutes more. I tried calling your
DTN but I guess you are not in the office.

If possible I'd appreciate it if you could make a fast reply to my questions.

Thanks,

Cb.


3184.10. by ACISS2::LENNIG (Dave (N8JCX), MIG, @CYO) Wed Mar 26 1997 17:17
>According to the documentation, usually MR needs to be installed only once, but
>that there are circumstances that might dictate installing on more than one of
    
    re: Multiple system disks - see section 2.6 of the "Installation Guide"
    
    re: minstart/start etc - You are getting into an area of "what will
    work" vs "documented configs", and in particular one of the areas we
    tried to rationalize in V3.3. The answer to "what will work" varied
    across the various MAILbus components (MS, TS, DDS, ER, MRG, etc).
    The 'simplest' cluster config is to go ahead and give it an alias.
    
    re: multiple SYSUAF's - I believe if you read the VMS SPD, that nowadays
    they rather strongly state that a cluster is a single security domain,
    which having multiple (different) sysuaf's probably violates...
    
    Dave
3184.11. "Several Questions" by TAV02::GODOVNIK (Haim Godovnik) Thu Mar 27 1997 07:39
    I am talking to Chaim Budnick now on the phone and he has a couple of
    questions:
    
    1. The customer starts only the MS, TS and ER components. In this case,
    should he be using start or minstart on the nodes other than the
    primary node?
    2. I noticed that after MR is started, the MB$ROOT logical appears twice,
    once in super mode and once in exec mode. The definitions are
    identical, i.e. they both point to the same disk/directory.
    I stopped MR, deassigned all the MB$* logicals, and restarted MR from the
    SYSTEM account, and I still saw the 2 logicals.
    2.1 Is this the expected behaviour?
    2.2 If not, could this cause any problems? What could possibly cause
    this to happen?
    
    
    Thanks,
    
    Cb.
3184.12. "MB$ROOT both SUPER and EXEC .." by TAV02::CHAIM (Semper ubi Sub ubi .....) Sun Mar 30 1997 07:04
As my coworker stated, I was at the customer site last Thursday, and at that
time the system had been up for over 24 hours and there were no problems during
the course of the day.

I have the listings from show ms/ts/er and can post them if requested.

The only thing that I did notice was that the MB$ROOT system logical was
defined in both SUPER and EXEC mode. As a check I stopped all the MR components
in the cluster, deassigned all the MB$* logicals across the cluster, and then
issued
$@mb$control start=(ms,ts,er) on the primary node; MB$ROOT was again defined in
both SUPER and EXEC mode.

Thanks,

Cb.
3184.13. by ACISS2::LENNIG (Dave (N8JCX), MIG, @CYO) Mon Mar 31 1997 14:29
    Glad things are running for them.
    
    The duplicate MB$ROOT logical is known, non-problematic behaviour.
    
    Dave