Conference help::decnet-osi_for_vms

Title:DECnet/OSI for OpenVMS
Moderator:TUXEDO::FONSECA
Created:Thu Feb 21 1991
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:3990
Total number of notes:19027

3891.0. "Node translations suddenly fail" by FUNYET::ANDERSON (Where's the nearest White Castle?) Wed Feb 26 1997 12:08

All of a sudden, many incoming node translations are failing on node HUMANE.  A
number of people, including me, cannot open members-only Notes conferences for
this reason.

Running CDI$TRACE shows many requests similar to this one.  Is the DECdns back
translation broken or is there another reason for these errors?

I access HUMANE from FUNYET often.  I can't imagine I would be flushed out of
the cache.

Paul


11:44:58                     [DECnet/CDI Request 871093]
11:44:58    == 871093 ==   DECnet/CDI lookup request for "NET$490005AA0004005C1520"   ====
11:44:58
11:44:58 [looking up address]
11:44:58     - 871093 -  first, look up node "NET$490005AA0004005C1520"  ---
11:44:58
11:44:58 [2390278] lookup 2390278: "NET$490005AA0004005C1520"
11:44:58 (CDI cache) looking for "NET$490005AA0004005C15"
11:44:58 (CDI cache) entry "NET$490005AA0004005C15" not found
11:44:58 IN cdi$getnodeinfo - ADDING parent
11:44:58 [2390278] -1-   DECdns:  NET$490005AA0004005C1520
11:44:58 Converting NSAP "490005AA0004005C1520" (len=20) to DECdns objectname
11:44:58 Converted NSAP: "DEC:.DNA_BackTranslation.%x49.%x0005.%xAA0004005C15" (len=51)
11:44:58 [2390278] Looking up backtranslation entry:
11:44:58 [2390278]            "DEC:.DNA_BackTranslation.%x49.%x0005.%xAA0004005C15"
11:44:58 [2390278] Error converting name "DEC:.DNA_BackTranslation.%x49.%x0005.%xAA0004005C15" to DECdns opaque format
11:44:58 [2390278] DECdns returned "Message number 01DE864A"
11:44:58 [2390278] invalid name
11:44:58 [2390278] -2-   Domain:  NET$490005AA0004005C1520
11:44:58 [2390278] NSAP address lookup not supported by Domain
11:44:58 [2390278] Address not found
11:44:58 (InQ Remove) 1 NET$490005AA0004005C1520 0-Left
11:44:58
11:44:58 CDI request 871093 complete, returning to Session
11:44:58     with lookup status "Object not found"
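
The trace above shows the NSAP being turned into a DECdns back-translation name. The following is a minimal Python sketch of that conversion as it appears in this one trace, not the actual CDI code. The function names are hypothetical, and the field boundaries (1-byte AFI, 2-byte local area, 6-byte node ID, 1-byte selector) are assumed from the 10-byte NSAP logged here; other NSAP formats would split differently.

```python
def backtranslation_name(nsap_hex: str) -> str:
    """Rebuild the back-translation name logged for a 10-byte NSAP."""
    nsap = nsap_hex.upper()
    if len(nsap) != 20:
        raise ValueError("this sketch only handles the 10-byte NSAP from the trace")
    afi = nsap[0:2]          # authority and format identifier (49 in the trace)
    local_area = nsap[2:6]   # 2-byte local area
    node_id = nsap[6:18]     # 6-byte node ID
    # The last byte (nsap[18:20], the selector) does not appear in the name.
    return f"DEC:.DNA_BackTranslation.%x{afi}.%x{local_area}.%x{node_id}"


def phase_iv_address(node_id_hex: str) -> str:
    """Decode a Phase IV style node ID (AA-00-04-00-xx-yy) to area.node."""
    raw = bytes.fromhex(node_id_hex)
    if len(raw) != 6 or raw[:4] != bytes.fromhex("AA000400"):
        raise ValueError("not a Phase IV compatible node ID")
    value = raw[4] | (raw[5] << 8)   # 16-bit address, low byte first
    return f"{value >> 10}.{value & 0x3FF}"


name = backtranslation_name("490005AA0004005C1520")
print(name, len(name))
print(phase_iv_address("AA0004005C15"))
```

For the NSAP in the trace this reproduces the 51-character name that DECdns rejected, and the node ID decodes to Phase IV address 5.348, whose area agrees with the %x0005 field. The name itself is well formed, which fits the later finding that the failure was a resource problem rather than a bad name.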

3891.1. by DRAGNS::SAUNDERS Wed Feb 26 1997 13:19 (7 lines)

$ set message sys$message:dns$msg
$ exit %x01DE864A
%DNS-E-RESOURCEERROR, Insufficient resources to process request
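
The DCL trick above lets the message file do the lookup. As a rough cross-check, the fields packed into the 32-bit value can also be pulled apart by hand. This Python sketch assumes the standard OpenVMS condition value layout (severity in bits 0-2, message number in bits 3-15, facility number in bits 16-27, inhibit-message flag in bit 28); the function name is hypothetical, and it cannot recover the message text, which lives only in the message file.

```python
SEVERITY = {0: "W", 1: "S", 2: "E", 3: "I", 4: "F"}


def decode_condition(value: int) -> dict:
    """Split an OpenVMS condition value into its bit fields."""
    return {
        "severity": SEVERITY.get(value & 0x7, "?"),
        "message_number": (value >> 3) & 0x1FFF,
        "facility_number": (value >> 16) & 0xFFF,
        "inhibit_message": bool(value & 0x10000000),
    }


print(decode_condition(0x01DE864A))
```

For %x01DE864A the severity field comes out as E, matching the -E- in %DNS-E-RESOURCEERROR.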


John Saunders
DECdns Engineering

3891.2. "OK, now what?" by FUNYET::ANDERSON (Where's the nearest White Castle?) Wed Feb 26 1997 13:26 (6 lines)

Thanks for doing what I should have done!

What could the system be out of?  Keep in mind that HUMANE has recently added a
lot of Notes conferences and so has increased activity.

Paul

3891.3. "Might be running out of threads..." by STEVMS::PETTENGILL (mulp) Thu Feb 27 1997 00:32 (4 lines)

Do you have a local name database?  The corporate nameserver structure isn't
the most robust.  The servers that handle the synonym and backtranslation
directories get hit pretty hard, even by nodes with a local name directory,
because each node's towersets must be verified in those directories.

3891.4. "Reboot fixed it" by FUNYET::ANDERSON (Where's the nearest White Castle?) Thu Feb 27 1997 09:39 (14 lines)

There's no local database, just DECDNS and DOMAIN for node name translation.

I ran NET$SHUTDOWN last night and the system crashed but came right back up.

Running CDI$TRACE this morning shows none of the "resource" errors from
yesterday.

I had to flush the cache entry for node FUNYET; otherwise the notes I entered
still showed up with a numeric address.  So, unless I flush the whole cache,
people whose node names failed to translate yesterday will keep showing up with
numeric addresses until when?

If I'm running out of threads, what would I do about that?

Paul

3891.5. "Working..." by DRAGNS::SAUNDERS Thu Feb 27 1997 11:09 (13 lines)

I'm still checking to see how many different things RESOURCEERROR could be (I'm
new here).

In the meantime, do I understand you to be saying that your system CRASHED when
you ran NET$SHUTDOWN?

That is not a feature.

Could you please save the dump and let someone here look at it?

Thanks,
John Saunders
DECdns Engineering

3891.6. "Dump provided" by FUNYET::ANDERSON (Where's the nearest White Castle?) Thu Feb 27 1997 13:44 (23 lines)

John,

> I'm still checking to see how many different things RESOURCEERROR could be
> (I'm new here).

Thank you.  If it happened once, it will probably happen again, so I'd like to
know how to solve the resource problem.

> In the meantime, do I understand you to be saying that your system CRASHED
> when you ran NET$SHUTDOWN?

Yes it did.  I was at home, and it was shutting down transports when I lost the
connection to the system.

> That is not a feature.

Well, it provided a quick reboot, didn't it?

> Could you please save the dump and let someone here look at it?

The dump file is at HUMANE::DISK$KAKA:[DUMPS]SYSDUMP.DMP.

Paul

3891.7. "%SDA-E-DUMPEMPTY, dump file contains no valid dump" by TECMAN::SAUNDERS Thu Feb 27 1997 19:12 (22 lines)

Directory USER$62:[SAUNDERS.DUMPS]

HUMANE_CRASH.DMP;1    141485  27-FEB-1997 14:40:22.86

Total of 1 file, 141485 blocks.


However,

$ ana/crash HUMANE_CRASH.DMP
VAX/VMS System dump analyzer

%SDA-E-DUMPEMPTY, dump file contains no valid dump


Sorry, but are you sure the system rebooted? Were the SYSGEN parameters set to
create a dump? Please check DUMPSTYLE, SAVEDUMP, and DUMPBUG. Also, make sure
you have a SYS$SYSTEM:SYSDUMP.DMP and confirm that it is large enough to hold
all of your memory plus ERLBUFFERPAGES*ERRORLOGBUFFERS.
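
The sizing rule in the paragraph above can be written out as a quick check. This is only a sketch of that one rule of thumb for a full memory dump: it assumes memory and the ERLBUFFERPAGES units are both counted in 512-byte blocks, the function names are hypothetical, and the 64 MB, 2, and 4 in the usage lines are made-up illustration values rather than HUMANE's settings. A selective dump, as chosen through DUMPSTYLE, can legitimately be smaller.

```python
BLOCK_BYTES = 512


def min_full_dump_blocks(memory_mb: int, erlbufferpages: int,
                         errorlogbuffers: int) -> int:
    """Blocks needed for all of memory plus the error log buffers."""
    memory_blocks = memory_mb * 1024 * 1024 // BLOCK_BYTES
    return memory_blocks + erlbufferpages * errorlogbuffers


def dump_file_large_enough(file_blocks: int, memory_mb: int,
                           erlbufferpages: int, errorlogbuffers: int) -> bool:
    """Compare a dump file's size in blocks against the rule of thumb."""
    needed = min_full_dump_blocks(memory_mb, erlbufferpages, errorlogbuffers)
    return file_blocks >= needed


# Hypothetical system: 64 MB of memory, ERLBUFFERPAGES=2, ERRORLOGBUFFERS=4.
print(min_full_dump_blocks(64, 2, 4))
print(dump_file_large_enough(141485, 64, 2, 4))
```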

John Saunders
DECdns Engineering

3891.8. "It's a valid dump here" by FUNYET::ANDERSON (Where's the nearest White Castle?) Fri Feb 28 1997 09:40 (85 lines)

Here's what I get from the dump.

Paul


$ analyze /crash disk$kaka:[dumps]sysdump.dmp

OpenVMS (TM) Alpha system dump analyzer
...analyzing a compressed selective memory dump...

Dump taken on 26-FEB-1997 22:10:25.93
UNXSIGNAL, Unexpected signal name in ACP

SDA> show crash

System crash information
------------------------
Time of system crash: 26-FEB-1997 22:10:25.93

Version of system: OpenVMS (TM) Alpha Operating System, Version V7.1

System Version Major ID/Minor ID: 3/0

System type: DEC 4000 Model 610

Crash CPU ID/Primary CPU ID:  00/00

Bitmask of CPUs active/available:  00000001/00000001

CPU bugcheck codes:
	CPU 00 -- UNXSIGNAL, Unexpected signal name in ACP

System crash information
------------------------
System State at Time of Exception
---------------------------------

Saved Scratch Registers in Mechanism Array
------------------------------------------
Mechanism array inaccessible.

CPU 00 Processor crash information
----------------------------------

CPU 00 reason for Bugcheck: UNXSIGNAL, Unexpected signal name in ACP

Process currently executing on this CPU: SYSTEM

Current image file: $1$DIA0:[SYS0.SYSCOMMON.][SYSEXE]NCL.EXE;1

Current IPL: 8  (decimal)

CPU database address:  80C10000

CPUs Capabilities:    PRIMARY,QUORUM,RUN

General registers:

R0   = 00000000.00000004  R1   = 00000000.00000003  R2   = FFFFFFFF.8147EC58
R3   = 00000000.00000003  R4   = 00000000.00000003  R5   = FFFFFFFF.8147ECA8
R6   = FFFFFFFF.83D05B90  R7   = FFFFFFFF.852FFE18  R8   = FFFFFFFF.80EA6D40
R9   = 00000000.00000002  R10  = 00000000.00000000  R11  = 00000000.00000000
R12  = 00000000.11030010  R13  = FFFFFFFF.852FFEE0  R14  = 00000000.00000418
R15  = FFFFFFFF.83D05000  R16  = 00000000.0000041C  R17  = 0000FE00.00007E04
R18  = 00000000.00000000  R19  = 00000000.00001DA4  R20  = 00000000.0000000D
R21  = 00000000.3A750040  R22  = FFFFFFFF.83D06AC8  R23  = 00000000.00000008
R24  = FFFFFFFF.83D05828  AI   = 00000000.00000001  RA   = FFFFFFFF.852F51D4
PV   = FFFFFFFF.852FFEE0  R28  = 00000000.00000001  FP   = 00000000.7FFA1F20
PC   = FFFFFFFF.852F16CC  PS   = 10000000.00000804

Processor Internal Registers:

ASN  = 00000000.0000003E                     ASTSR/ASTEN =          0000000F
IPL  =          00000008  PCBB = 00000000.06E56080  PRBR = FFFFFFFF.80C10000
PTBR = 00000000.0000051D  SCBB = 00000000.000001A4  SISR = 00000000.00000008
VPTB = FFFFFFFC.00000000  FPCR = 00000000.00000000  MCES = 00000000.00000000

CPU 00 Processor crash information
----------------------------------
	KSP    = 00000000.7FFA1C10
	ESP    = 00000000.7FFA6000
	SSP    = 00000000.7FFAC100
	USP    = 00000000.7AFB3570

                No spinlocks currently owned by CPU 00

3891.9. by RMULAC.DVO.DEC.COM::S_WATTUM (Scott Wattum - FTAM/VT/OSAK Engineering) Fri Feb 28 1997 10:04 (13 lines)

>$ ana/crash HUMANE_CRASH.DMP
>VAX/VMS System dump analyzer

versus

>$ analyze /crash disk$kaka:[dumps]sysdump.dmp
>
>OpenVMS (TM) Alpha system dump analyzer

I don't think anyone ever mentioned what type of system this was.

--Scott

3891.10. "Thanks!" by DRAGNS::SAUNDERS Fri Feb 28 1997 12:21 (9 lines)

Thanks, Scott! I didn't realize OpenVMS VAX would react that way to an Alpha
dump, so I took the message at face value.

I've got it now...

John Saunders
DECdns Engineering

P.S. This required a V7.1 system.

3891.11. "RESOURCEERROR" by TECMAN::SAUNDERS Fri Feb 28 1997 16:20 (10 lines)

Well, I finally found the answer to the question, "what does RESOURCEERROR
mean"? It seems to imply that a memory allocation failed.

Now, as to the crash, you're running SSB OpenVMS V7.1 and SSB DECnet-Plus V7.1,
so please submit an IPMT case on this as soon as you get a chance. In the
meantime, I'll make the dump available to people over here.

Thanks,
John Saunders
DECdns Engineering

3891.12. by FUNYET::ANDERSON (Where's the nearest White Castle?) Mon Mar 03 1997 10:58 (12 lines)

> Well, I finally found the answer to the question, "what does RESOURCEERROR
> mean"? It seems to imply that a memory allocation failed.

What process is having this problem?  Is there something I can change to avoid
the problem?

> Now, as to the crash, you're running SSB OpenVMS V7.1 and SSB DECnet-Plus
> V7.1, so please submit an IPMT case on this as soon as you get a chance.

Will do.

Paul

3891.13. "QAR directions?" by FUNYET::ANDERSON (Where's the nearest White Castle?) Mon Mar 03 1997 12:09 (5 lines)

I've looked in this conference for a half hour and can't find how to submit a
QAR.  Could someone post the node/username/password and proper database for
DECnet-Plus V7.1?

Paul

3891.14. "I've been wondering about that too!" by DAVIDF::FOX (David B. Fox -- DTN 285-2091) Mon Mar 03 1997 13:27 (4 lines)

Since the QAR database is no longer on QAR1, this is a problem!  I have heard
that the database moved to BULEAN but have never seen any directions.

	David

3891.15. "IPMT, not QAR" by TECMAN::SAUNDERS Mon Mar 03 1997 14:18 (6 lines)

Please submit this as an IPMT. If you do not already have an account, please
contact MSE1::KCARROLL to get one.

Thanks,
John Saunders
DECdns Engineering

3891.16. by RMULAC.DVO.DEC.COM::S_WATTUM (Scott Wattum - FTAM/VT/OSAK Engineering) Tue Mar 04 1997 10:34 (6 lines)

Just so you know: you cannot submit a QAR against a version that is shipping;
only FT (field test) versions can have stuff QAR'd.  At least that's how it now
works with DECnet.

Now, this doesn't stop engineering from migrating level 3 IPMTs into the
QAR database, but we can only do that after agreement with the IPMT submitter.

3891.17. "It's baaaack" by FUNYET::ANDERSON (Exchange *this*) Wed Apr 09 1997 13:31 (8 lines)

Node name translations on HUMANE have started failing again with the

   DECdns returned "Message number 01DE864A"

messages.  I will gladly IPMT this problem, but perhaps someone would like
access to HUMANE to look around before I reboot it to solve the problem tonight?

Paul