T.R | Title | User | Personal Name | Date | Lines |
---|
167.1 | | MOVIES::WIDDOWSON | Rod | Mon Feb 10 1997 13:38 | 27 |
| Now I'm no expert, but I can ask the stupid questions,
Is this Oracle/RDB or Oracle V7 ? Moving to OPS would indicate the
latter, but if they are going from RDB to OPS I would not suspect a
plug-play replacement.
Off the top of my head V7.x should help a lot with the max number of
processes times max working set problem (but doesn't balance slots do
that as well? It's been too long).
How is the CPU used up? High MP sync might mean that the DLM is
hitting them already. Expect to get an Oracle/VMSCluster person in to
get them tuned correctly for a balanced cluster operation. It is my
(limited) experience that Oracle will often `punt to the DBA' when
faced with a tuning issue.
As general background I'd recommend you read Roy Davis's book
especially chapter 6 which will fill in what the DLM is (and why it is
the manager which is dynamic, not the locks - an interesting quirk of
English).
OK you experts, I'll stand back, you can start to take pot-shots at me.
Finally, in peering into my crystal ball, it might be useful for the
customer to upgrade to 7.1 because of Oracle futures...
/rod
|
167.2 | Do you need VLM support or OPS or both? | EVMS::NOEL | | Mon Feb 10 1997 14:36 | 25 |
| Do you need Oracle VLM support or Oracle Parallel Server or both?
I've contacted some Oracle 7 developers and product managers with
this question. I'll post their responses.
It sounds like you may want OpenVMS Alpha V7.1 and Oracle V7.3.2.3.2 which
is the Oracle7 release with VLM. It has not been released yet. I'm not
sure about Oracle Parallel Server, DLM and how that relates. I'm also not
sure how Oracle7 scales on 10 CPUs as opposed to OPS on 2 systems. It will
be interesting to find out how this works out.
I can comment on the OpenVMS Alpha V7.1 VLM features that help Oracle:
With OpenVMS Alpha V7.1, Oracle V7.3.2.3.2 can create their SGA larger
than the previous 450MB max. They now use 64-bit pointers
and also use a memory-resident global section for their SGA.
The memory-resident global section when accessed by the servers,
does not take up working set quota. This alleviates your
"process/working set size" VMS limits.
I can't comment on the new max size of the Oracle SGA, but we've
created larger SGAs than 6GB. Greater than twice as large, in fact...
Karen Noel
OpenVMS Engineering, VLM techie
|
167.3 | Response from Tony Lekas of Oracle | EVMS::NOEL | | Mon Feb 10 1997 17:03 | 41 |
| Tony also confirmed that Oracle V7.3.2.3.2 includes the Parallel Server
option. So the customer can use OPS and VLM on OpenVMS V7.1.
Karen,
There are several more or less separate issues listed in this mail.
Several issues are related to the use of a large SGA. With Oracle
7.3.2.3.2 (The VLM release) on OpenVMS V7.1 the SGA will only be limited by
the amount of memory available on the system. Working set size will not be an
issue.
Other issues concern ways of dealing with Oracle7 using 100% of 10
CPUs on an 8400. First I will assume that basic system tuning has been looked
at and that Oracle is actually using all of the CPUs. If not this should be
done.
The use of Oracle Parallel Server (OPS) across a cluster may be a
solution to this problem. However someone who understands OPS and the
customer's application should look at this. For more information on OPS
refer to the Oracle book: Oracle7 Parallel Server Concepts and Administration.
Tony
--------------------------------------------------------------------------
Tony Lekas Oracle Corporation
Principal Engineer 110 Spit Brook Rd.
DEC Products Division ZKO2-1/O19
Office: 603.881.0931 Nashua, NH, 03062
Internet: [email protected] FAX: 603.881.0120
--------------------------------------------------------------------------
|
167.4 | More info and specific questions. | ALCALA::BURGOS | Luis Burgos. NSC Spain. | Tue Feb 11 1997 05:46 | 77 |
|
First of all, thanks everybody for such quick answers.
Secondly, let me give you some more information.
The customer is using ORACLE7 7.1 with VMS 6.2, and this past weekend
they tried to move to ORACLE7 7.3 with VMS 7.1 because of how critical
the situation is. But due to some problem Oracle7 would not come up,
and they had to go back to their old VMS 6.2 setup. I don't know if
Oracle 7.1 will work on VMS 7.1, but if it does we may try that this
coming weekend, although I don't know what we would gain.
We are working with four local Oracle people, but I must confess that I
don't think they are quite up to speed. We wasted all last Friday trying
to use a 650 MB SGA on a test system, theoretically with the latest
Oracle7 version (the one supposed to be used on the production system),
modifying a lot of the system and user parameters, because we were able
to use a 300 MB SGA, but not a 500 MB SGA. Then on Friday evening, when
trying to recompile the code, I found a warning stating that the given
650 MB SGA was being automatically modified to 450 MB.
This is really my first contact with Oracle, and I really don't know
anything about how it works or its limits, but I would certainly
expect them to know it better and at least read their compilation
warning messages.
Anyway, the main problems the customer is having (high MP Sync, high
paging rate, and long connect times using Forms3) are pretty well
described in the Oracle7 V7.3 release notes. So as a first step we are
trying to move the clients to another system and install Oracle 7.3,
and as a second step to use two 8400s in a cluster with Oracle Parallel
Server. In theory the Oracle people are looking into the
application and the implications of moving it to OPS, but as I said I
don't know if they are up to it. They have also asked us for information
on how the VMS "Distributed Lock Manager" works, what its
limits are, and whether to use "Dynamic locks" or not (because of the
performance implications).
I have been looking through all the VMS documentation for this
information, but I can't seem to find anything. So I would really
appreciate any information or pointers regarding these particular points.
1.- How the VMS "Distributed Lock Manager" works.
2.- What its limits are (if any).
3.- What the implications are of using "Dynamic locks" (I don't know
if it's a VMS or Oracle option/term).
As you may imagine this is a big and critical customer, which is
already looking for somebody else to do the job. And although they
admit that part of the problem is their own lack of foresight and
their fault for writing the application poorly, they are asking us to
solve the problem any way we can. Another 8400 is about to be installed
just to handle the Forms3 clients as a front end. They are just waiting
for the 4 GB memory module to be released to be able to increase to 14
CPUs. And they would certainly buy another 8400 right now if that would
solve their problems. So any help means business, and is really welcome.
And regarding your replies:
.1 Could you specify which book this "Roy Davis's book" is, and
if there is any way to get it quickly in a foreign country?
.2 I think we need both. The 650 MB SGA was suggested by the Oracle
people. But the way things are going, with exponential growth from
providing service to the whole country, that's not going to be enough
at all.
.3 I'll ask the Oracle people for the "Oracle7 Parallel Server Concepts
and Administration" book.
Regards.
Luis Burgos.
|
167.5 | What caused the "drop-back?" | CRLRFR::BLUNT | | Tue Feb 11 1997 09:00 | 21 |
|
In reference to previous questions, (I think .-2) Oracle 7.1 will run
on V7.1. I think that the Oracle folks would prefer a migration to the
latest string of numbers (most recent release) soon after moving to VMS
V7.1. What were the problems that required a "drop-back" to V6.2? I
just did a customer upgrade to VMS V7.1 the weekend of 2/1, and found
that there were SEVERAL images that needed to be "registered as trusted
images" in order to get some things working (this site is running a
combo of Oracle, Oracle/RDB and RMS based databases).
Frankly, I'd think that there would be little benefit directly to
Oracle V7.1 on VMS V7.1. It is more likely that this is the next
logical step to upgrading the database to Oracle V7.3..., because V7.3
will not run on VMS V6.2. From what the Oracle folks here have said,
the upgrade is a "one-way street" in that once you have loaded the
upgrade, the internal database structure must be changed and the new
"format" is NOT compatible with Oracle V7.1. Downgrading to V7.1 from
V7.3 would require either a reload of the pre-V7.3 database or an
export/import.
bob
|
167.6 | | MOVIES::WIDDOWSON | Rod | Tue Feb 11 1997 09:42 | 14 |
| Luis,
"VAX Cluster Principles" Roy G Davis - Digital press
ISBN 1-55558-112-9
You may find it on the web or some of it. High MP_SYNCH is symptomatic
of the lockmanager being too busy (see what MON LOCK says), make sure
that LOCKDIRWT is set up sensibly across the cluster (is there a
cluster right now? if not LOCKDIRWT will have little effect). Also
check LOCKIDTBL and LOCKIDTBL_MAX.
I'm no lockmanager expert but you would be failing if you had hit any
limits (with a quota failure or some such). Can someone provide an
online pointer to tuning the lockmanager ?
|
167.7 | Info from Saar Maoz of Oracle | EVMS::NOEL | | Tue Feb 11 1997 11:56 | 95 |
| From: US2RMC::"[email protected]" "Saar f Maoz ?" 10-FEB-1997 20:22:45.14
To: [email protected]
CC: [email protected], [email protected], evms::noel
Subj: Re: Response from a Techie/PM at Oracle?
Karen,
"Typical" situation raised by this customer in Spain. Here are my
thoughts:
Prior to the Oracle VLM release it was not possible to have your SGA bigger
than ~450MB since there was not enough space to map the SGAPAD in P0 address
space.
The high CPU usage could be attributed to the fact that their SGA is smaller
than what it should be and thus causes a lot of misses, which will increase
the I/O load on the system. In addition to the above, in my opinion, MOST of
the CPU increase would be attributed to the mapping of the demand-zero global
section aka SGAPAD, the place where we map the SGA into. The customer will be
able to confirm this by telling us what the connect/disconnect times to the
instance are, and also whether there is a lot of MP-Sync CPU mode registered.
The obvious has already been said about such CPU behaviour but here it is
again: make sure the SQL you are running is doing the right thing for your
environment. Use tkprof on long running queries and make sure you are using
the right indexes, etc.
So, the CPU problem will most probably go away with the release of Oracle's
VLM on VMS 7.1.
Want to talk about OPS? Fine, first, as Tony Lekas mentioned go and browse
through a much improved "Oracle7 Parallel Server Concepts and Administration".
Then, there are a few VMS limitations as it stands right now:
The UAF limit of the number of locks a single process can handle is
32767, that is if you try to set that number higher you will get the following
error:
UAF> mod oracle/Enqlm=40000
%UAF-E-VALTOOBIG, value too large for field \40000\
There is however the PQL parameter PQL_MENQLM which sets the minimum number of
locks a process could hold. Turns out that if you set that parameter to 64K
you could get away with it on VMS version 6.1 & 7.1, not so the case in 6.2 or
7.0, seems like they enforce the limit there like they do in UAF.
OUTPUT ON VMS 6.2 & 7.0:
SYSGEN> SH PQL_MENQ
Parameter Name  Current  Default  Min.     Max.     Unit   Dynamic
--------------  -------  -------  -------  -------  ----   -------
PQL_MENQLM      200      4        4        32767    Locks  D
OUTPUT ON VMS 6.1 & 7.1:
SYSGEN> SH PQL_MENQ
Parameter Name  Current  Default  Min.     Max.     Unit   Dynamic
--------------  -------  -------  -------  -------  ----   -------
PQL_MENQLM      300      4        -1       -1       Locks  D
Either way you will be restricted to either 32K locks or 64K locks, which some
users find hard to get by with. Especially since there is no warning
mechanism; rather, the instance crashes, sometimes causing other member
instances to go down as well.
(To fix this some work will have to be done on VMS's DLM which is what Oracle
uses right now)
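Since the limits above are hit without any warning, one option is to watch
the headroom from outside. The following is only a minimal sketch under
stated assumptions: the function names are hypothetical, the lock count is
assumed to be obtained by some other means (for example from monitoring
output), "64K" is assumed to mean 65535, and the 10% threshold is arbitrary.
It is not an Oracle or OpenVMS interface.

```python
# Hypothetical helpers; nothing here is an Oracle or OpenVMS API.
LIMIT_32K = 32767   # UAF ENQLM ceiling quoted in the text above
LIMIT_64K = 65535   # assumed meaning of "64K" in the text above


def enqueue_headroom(locks_held: int, limit: int) -> float:
    """Return the unused fraction of the enqueue limit, from 0.0 to 1.0."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if locks_held < 0:
        raise ValueError("locks_held must be non-negative")
    return max(0.0, 1.0 - locks_held / limit)


def should_warn(locks_held: int, limit: int, threshold: float = 0.10) -> bool:
    """True when less than `threshold` of the limit remains unused."""
    return enqueue_headroom(locks_held, limit) < threshold
```

For example, a process holding 30000 locks against the 32767 limit has under
10% headroom left and would trigger the warning.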
Hope this answers your questions, and good luck with your evaluation, decision
and implementation.
Saar.
__ _ _ __ _ _ _ _ ___ _____________________________
(( /\\ /\\ ||) |\V/| /\\ /\\ >/ Senior Performance Engineer
_))//-\\//-\\||\ |||||//-\\\\//<_ Oracle Corporation - DEC SBU
TEL: 415.506.4967 FAX: 415.506.7304 [email protected] MS: 1OP5
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
////Share your knowledge with others and compete with yourself///
|
167.8 | Here it's what I found. | ALCALA::BURGOS | Luis Burgos. NSC Spain. | Tue Feb 11 1997 12:49 | 92 |
|
Thanks for all that good information. I was able to find a little bit
more using the Stars database. However I am not sure if I got all the
lock-related parameters, so if you know of any others, let me know. But
so far it seems that most if not all of the VMS lock limitations
are gone with VMS 7.1.
I still don't understand what the local Oracle person means when he
refers to using "permanent" vs "dynamic" locks. I'll have to get more
information from him.
Reply .7 is right on target. The present SGA and SGAPAD area is
only 200 MB, and I wonder if just increasing it to 400 MB would make a
noticeable difference. It may be a quick fix until we can make the
full migration to 7.3... I still don't know what happened this weekend.
I'll let you know if I find out.
Anyway, here is what I found, with my comments.
#####################################################################
----------------------------------------------------------------------
3.11.4 Lock Manager: Increased Quotas and Limits
----------------------------------------------------------------------
The OpenVMS lock manager has been enhanced for OpenVMS Version 7.1.
Some internal restrictions on the number of locks and resources
available on the system have been eased and a method to allow enqueue
limit quota (ENQLM) of greater than 32767 has been added. No
application changes are required to take advantage of these increases.
Specifically, the OpenVMS lock manager includes the following
additions:
o ENQLM greater than 32767 allowed
If you set ENQLM to a value of 32767 in the SYSUAF.DAT file,
OpenVMS treats it as no limit and allows an application to own
up to 16,776,959 locks, the architectural maximum allowed by
the OpenVMS lock manager.
>>> I guess it means it's now dynamic and can go up to 16,776,959
>>> locks with no problem.
o Sub-resources and sub-locks greater than 65535 allowed
o Resource hash table greater than 65535 allowed
>>> I don't know if they refer to a system/user parameter, or a program
>>> limitation (so no parameter to modify for it).
o LOCKIDTBL size restrictions removed (LOCKIDTBL_MAX obsolete)
>>> I guess this still needs to be set to a meaningful value.
While most processes do not require very many locks simultaneously
(typically less than 100), large scale database or server applications
can easily exceed the previous thresholds. For more information about
these enhancements, see the OpenVMS Programming Concepts Manual.
>>> I am trying to get this, but it seems I can only find version 7.0,
>>> I don't know if there will be much difference with V7.1.
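The ENQLM rule quoted in section 3.11.4 can be written down as a small
sketch. This only restates the release-note text (a SYSUAF value of 32767
is treated as no limit, up to the architectural maximum of 16,776,959
locks); the function name is hypothetical and nothing here queries a real
system.

```python
# Restates the V7.1 release-note rule quoted above; names are illustrative.
UAF_ENQLM_MAX = 32767        # largest value the SYSUAF field accepts
LOCK_ARCH_MAX = 16_776_959   # architectural maximum from the release notes


def effective_enqlm_v71(uaf_enqlm: int) -> int:
    """Return the lock limit implied by a SYSUAF ENQLM value on V7.1."""
    if not 0 <= uaf_enqlm <= UAF_ENQLM_MAX:
        raise ValueError("ENQLM must be between 0 and 32767")
    # 32767 is the special "no limit" value; anything lower is taken as-is.
    if uaf_enqlm == UAF_ENQLM_MAX:
        return LOCK_ARCH_MAX
    return uaf_enqlm
```

So an account with ENQLM set to 2000 is still limited to 2000 locks, while
one set to 32767 is limited only by the architectural maximum.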
----------------------------------------------------------------------
3.11.5 New CLUSTER_CREDITS System Parameter
----------------------------------------------------------------------
The CLUSTER_CREDITS parameter specifies the number of per-connection
buffers a node allocates to receiving VMS$VAXcluster communications.
This system parameter is provided to support lock-intensive
applications, such as large scale-databases, which may require more
per-connection buffers. Prior to this release, it was not possible to
change the default setting.
This system parameter is not dynamic; that is, if you change the
value, you must reboot the node on which you changed it.
A shortage of credits affects performance, since message transmissions
are delayed until free credits are available. These are visible as
credit waits in the SHOW CLUSTER display.
For instructions for using this system parameter, see OpenVMS Cluster
Systems.
>>> I wonder what a good value would be for the planned two-8400
>>> cluster. Given how their application is built, I guess I'll need a
>>> pretty big value.
Regards.
Luis Burgos.
|
167.9 | I'm not a locking guru -- yet | WAYLAY::GORDON | Resident Lightning Designer | Wed Feb 12 1997 10:14 | 35 |
| ... although someday I'm supposed to be. ;-)
... but I do know a bit about CLUSTER_CREDITS since I implemented it.
First, do a SHOW CLUSTER/CONTIN with the following commands: (you
may want to create a command file to do this)
add connections
add credits
add cr_w
remove connections/type=noopen/name=(SCS$DIRECTORY,MSCP$TAPE,MSCP$DISK,
VMS$TAPE_CL_DRVR,VMS$DISK_CL_DRVR,SCA$TRANSPORT)
(You may need to add a few names to the list depending on what else is
running. The goal is to only have VMS$VAXcluster connections shown. You
may also need SET CR_W/WIDTH=n if all you get is ***** in the CR_W column.)
Watch the cr_w column - that's credit waits. If it's increasing at
a reasonable rate, upping cluster credits *on the other node* (credits are
a throttle on the sender from the receiver's point of view) might help. You
should read the info in the new features manual. I will say that it's
deliberately conservative in what it recommends, but I wouldn't make a jump
of more than 15 or 20 at a time. Going beyond 50 probably won't make all that
much difference even though the max is 128. You should almost never
raise it on a satellite.
You're never going to eliminate credit waits, but cutting them down
can help locking performance. Note that CLUSTER_CREDITS only applies to
the VMS$VAXcluster SYSAP.
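The stepping advice above can be sketched roughly as follows. This is an
illustration only, not a tool that ships with VMS: the function names are
hypothetical, the CR_W readings are assumed to be copied by hand from two
SHOW CLUSTER samples, and the constants simply encode the numbers mentioned
in this reply (steps of at most 20, little gain past 50, maximum of 128).

```python
# Hypothetical planning helpers; values come from the advice in this reply.
HARD_MAX = 128      # maximum CLUSTER_CREDITS value mentioned above
SOFT_CEILING = 50   # beyond this, little further gain is expected
MAX_STEP = 20       # largest single jump suggested above


def credit_wait_rate(cr_w_before: int, cr_w_after: int, seconds: float) -> float:
    """Credit waits per second between two readings of the CR_W column."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (cr_w_after - cr_w_before) / seconds


def next_credits(current: int, step: int = 15, ceiling: int = SOFT_CEILING) -> int:
    """Propose the next CLUSTER_CREDITS value for the *receiving* node."""
    if not 0 < step <= MAX_STEP:
        raise ValueError("step should be between 1 and 20")
    if ceiling > HARD_MAX:
        raise ValueError("ceiling cannot exceed 128")
    # Never step past the ceiling, and never propose lowering the value.
    return min(current + step, max(current, ceiling))
```

For instance, a CR_W count that goes from 100 to 400 over 60 seconds is 5
waits per second, and stepping up from a value of 40 would stop at 50 rather
than 55.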
--Doug
|
167.10 | | ALCALA::BURGOS | Luis Burgos. NSC Spain. | Wed Feb 12 1997 12:21 | 25 |
|
The cluster is still not quite ready for any testing, but this is good
information to have as a starting point. I appreciate it.
Do you know how much extra memory each credit would cost? In other
words, would we be using a lot of memory if we used a value of 75 or
100?
And with respect to the last sentence,
======================================================================
You're never going to eliminate credit waits, but cutting them
down can help locking performance. Note that CLUSTER_CREDITS only applies
to the VMS$VAXcluster SYSAP.
======================================================================
I don't think I understood it well. Does it mean that user applications
don't use it? How about Oracle? Could you explain what that
limitation means?
Regards.
Luis Burgos.
|
167.11 | Wishy-washy answer | WAYLAY::GORDON | Resident Lightning Designer | Wed Feb 12 1997 17:51 | 18 |
| If you did the SHOW CLUSTER/CONT with connections added, you'll see
lots of SYSAPS that use credits. CLUSTER_CREDITS only applies to the
VMS$VAXcluster SYSAP. That happens to be the SYSAP that the lock manager uses.
Anyone who does distributed locking will "benefit". MSCP is another that has
a SYSGEN-settable number of credits.
> Do you know how much extra memory each credit would cost? In other
> words, would we be using a lot of memory if we used a value of 75 or
> 100?
The answer is "it depends." Several other parameters are calculated
from that including how many "warm packets" to keep around. And this is done
for each connection (so multiply it by the number of nodes-1 in the cluster)
I don't remember off-hand what the incremental cost in non-paged pool is. For
a two-node cluster with lots of memory, it's a drop in the bucket.
--Doug
|
167.12 | Huge "Global Valid Fault Rate" and "Demand Zero Fault Rate" | ALCALA::BURGOS | Luis Burgos. NSC Spain. | Fri Feb 14 1997 13:13 | 14 |
|
I have been reviewing the system usage, and I have found that there is
a high "Global Valid Fault Rate" and "Demand Zero Fault Rate". I
understand that the latter is caused by the SGA or SGAPAD mapping,
as pointed out in reply .7. I imagine the former is also related to
the same point, basically the huge amount of image activation that's
taking place. I just wonder if there is anything else that can be done
to reduce those rate counters.
Regards.
Luis Burgos.
|