Discussion:
Disk Device Busy (%) - What exactly is this?
Guillermo Alan Bort
14 years ago
Permalink
hi guys,
recently we got an alert from OEM on the metric Disk Device Busy (%) (it
was at about 99% or something like that).

My Oracle Forums and Metalink search yielded some interesting results
that say it's a false alert (what a shock!)

So, the Documentation points to space utilization:
http://download-east.oracle.com/docs/cd/B16240_01/doc/em.102/b16230/host.htm#BHAFEBDG

Metalink corrects this as a documentation bug (another shock!)

The description of the metric in the documentation leads us to believe that
the metric checks the disk capacity. But this is incorrect and we have a
documentation bug

Bug.5099684 : EXPLANATION OF DISK DEVICE BUSY METRIC IS WRONG:

In reality, this metric checks how "busy" the disk is.


So... what on earth does "busy" mean? Is it some sort of metric of how much
I/O is being done on the disk? if so... how can this be a percentage?
What's 100%?

I don't much care for the alert as I'm pretty sure that it's either a false
alert or something I can't do anything about in the short run... but I
would like to know what I'm seeing so I can find a way to prevent it from
happening again.

Oh, the databases are on a SAN with one of those cool RAID 5 sets striped across
50 disks or something. So I get LUNs, not physical disks. Does this mean a
particular LUN is busy?

Thanks
Alan.-


--
http://www.freelists.org/webpage/oracle-l
Taylor, Chris David
14 years ago
Permalink
I'm not sure how any OS "knows" the maximum IO bandwidth available to a specific device (especially LUNs), but "Device Busy (%)" is a common metric available on both *nix and Windows systems.

On many Linux systems, you can run iostat -x 2 100 and get interval snapshots showing the disk busy percentage (or disk utilization percentage). In Windows you use perfmon and the Logical (or Physical) disk performance counters.

Theoretically the percentage should never be higher than 100%. You should also notice disk queues and average wait times increasing the closer you get to 100% busy.
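
To make the "What's 100%?" question concrete, here is a minimal Python sketch of how a busy percentage can be derived on Linux. It assumes the /proc/diskstats layout, where the tenth statistic after the device name is the cumulative number of milliseconds the device had at least one I/O outstanding; as far as I know iostat's %util is computed from that counter in much the same way. The helper names and the two sample lines are made up for illustration.

```python
def io_ticks_ms(diskstats_line):
    """Return (device, ms spent doing I/O) from one /proc/diskstats line."""
    fields = diskstats_line.split()
    # fields[2] is the device name; fields[12] is the cumulative number of
    # milliseconds during which the device had at least one I/O in flight.
    return fields[2], int(fields[12])


def busy_percent(ticks_before_ms, ticks_after_ms, interval_ms):
    """Share of the interval during which the device had I/O outstanding."""
    return 100.0 * (ticks_after_ms - ticks_before_ms) / interval_ms


# Two made-up samples of the same device taken 2 seconds apart.
before = "8 0 sda 1000 0 8000 500 2000 0 16000 900 0 1200 1400"
after = "8 0 sda 1400 0 11200 700 2600 0 20800 1300 1 3180 3900"

dev, t0 = io_ticks_ms(before)
_, t1 = io_ticks_ms(after)
print(dev, busy_percent(t0, t1, interval_ms=2000))
```

Under this definition 100% means "the device was never idle during the interval". It is a measure of time, not of bandwidth, so the OS does not need to know the maximum throughput of the device to report it.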


Chris Taylor
Sr. Oracle DBA
Ingram Barge Company
Nashville, TN 37205

"Quality is never an accident; it is always the result of intelligent effort."
-- John Ruskin (English Writer 1819-1900)

Radoulov, Dimitre
14 years ago
Permalink
Hi,
I have always considered the iostat/sar -d busy metrics important.
When I saw your question I searched Google again and found this *old*
document that seems useful:

sunsite.uakom.sk/sunworldonline/swol-08-1999/swol-08-perf.html


Hope this helps
Dimitre
Guillermo Alan Bort
14 years ago
Permalink
Dimitre,
From that document I gather that the iostat busy metric is not all that
reliable when you have a complex disk subsystem. Quoting the document:
"*Wrap up*
So the real answer to our initial question is that the model of disk
behavior and performance that is embodied by the iostat report is too
simple to cope with the reality of a complex underlying disk subsystem. We
stay with the old report to be consistent and to offer users familiar data,
but in reality, a much more sophisticated approach is required. I'm working
(slowly) on figuring out how to monitor and report on complex devices like
this."

It did, however, shed some light on exactly what "disk device busy (%)"
means.

thanks for the replies.
Alan.-


Radoulov, Dimitre
14 years ago
Permalink
Hi Alan,
yes, this was new to me too. In another mail Grzegorz Goryszewski has
just posted a link to an article by Alex Gorbachev, which seems to confirm that too:

Quoting it:

Traditionally, it’s common to assume that the closer to 100% utilization
a device is, the more saturated it is.
This might be true when the system device corresponds to a single
physical disk.
However, with devices representing a LUN of a modern storage box, the
story might be completely different.
[...]
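
A small Python sketch of why this can happen. It treats "busy" as the fraction of a window in which at least one request is in flight, which is an assumption about how the OS counts it; the request lists are invented. A device that serves requests one at a time and a LUN that serves four at once both come out at 100% busy, although the LUN completes four times as many I/Os.

```python
def busy_percent(requests, window_ms):
    """Percent of the window with at least one request in flight.

    requests is a list of (start_ms, end_ms) pairs inside [0, window_ms].
    """
    busy = 0
    covered_until = 0
    for start, end in sorted(requests):
        start = max(start, covered_until)  # skip time already counted
        if end > start:
            busy += end - start
            covered_until = end
    return 100.0 * busy / window_ms


# One spindle: 10 ms requests served strictly one after another.
serial = [(t, t + 10) for t in range(0, 1000, 10)]

# A LUN that services 4 such requests at once.
parallel = [(t, t + 10) for t in range(0, 1000, 10) for _ in range(4)]

for name, reqs in (("single disk", serial), ("4-way LUN", parallel)):
    print(name, len(reqs), "IOs in 1 s,", busy_percent(reqs, 1000), "% busy")
```

Both lines report 100% busy, so on a striped LUN the percentage alone does not say whether there is headroom left; queue length and response time are needed alongside it.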

Regards
Dimitre
Taylor, Chris David
14 years ago
Permalink
I don't think that's actually 'accurate' though.

There are real IO limits on the following:
1.) LUNS themselves (how many disks in the stripe, RAID levels)
2.) IO Controller Card between the server and the LUN or disk

NOW, the question that 'should be' asked is:

"How does my OS determine the IO capacity of my storage?"

Imagine the OS gathers statistics on the IO subsystem (much like Oracle does on tables); then it could "know", within a reasonable margin of error, what the expected IO bandwidth is for the storage system (regardless of whether it is a LUN or a disk).

So, does ANYONE know how the OS (Windows, Linux etc) tries to determine the maximum IO available across a disk or LUN?

Chris Taylor
Sr. Oracle DBA
Ingram Barge Company
Nashville, TN 37205

"Quality is never an accident; it is always the result of intelligent effort."
-- John Ruskin (English Writer 1819-1900)

Grzegorz Goryszewski
14 years ago
Permalink
Hi,
check this one
http://www.pythian.com/news/247/basic-io-monitoring-on-linux/
Regards
GregG

Karl Arao
14 years ago
Permalink
To add to this blog link: if you have collectl installed somewhere, there's
a file called formatit.ph that contains all the formatting/formulas that
collectl uses. There's a section where the device busy % is derived
($dskUtil):
[***@desktopserver ~]# locate formatit.ph
/usr/share/collectl/formatit.ph
[***@desktopserver ~]# less /usr/share/collectl/formatit.ph

....

# we only need these if doing individual disk calculations
if ($subsys=~/D/)
{
  # if doing hires time, we need the interval duration and unfortunately at
  # this point in time $intSecs has not been set so we can't use it
  $microInterval=($fullTime-$lastSecs[$rawPFlag])*100 if $hiResFlag;

  $numIOs=$dskRead[$dskIndex]+$dskWrite[$dskIndex];
  $dskRqst[$dskIndex]= $numIOs ? ($dskReadKB[$dskIndex]+$dskWriteKB[$dskIndex])/$numIOs : 0;
  $dskQueLen[$dskIndex]= $dskWeighted[$dskIndex]/$microInterval*$HZ/1000;
  $dskWait[$dskIndex]= $numIOs ? ($dskReadTicks[$dskIndex]+$dskWriteTicks[$dskIndex])/$numIOs : 0;
  $dskSvcTime[$dskIndex]=$numIOs ? $dskTicks[$dskIndex]/$numIOs : 0;
  $dskUtil[$dskIndex]= $dskTicks[$dskIndex]*10/$microInterval;
}

....
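
For readers who do not read Perl, the formulas in the excerpt can be restated as a rough Python sketch. This is my reading of the excerpt, not collectl's own code: it assumes the tick counters are milliseconds (as in /proc/diskstats), that $microInterval is the interval in hundredths of a second, and that $HZ is 100. The function and key names are invented.

```python
def disk_metrics(reads, writes, read_kb, write_kb,
                 read_ticks_ms, write_ticks_ms,
                 io_ticks_ms, weighted_ticks_ms, interval_s):
    """Derive per-interval disk metrics from counter deltas."""
    num_ios = reads + writes
    interval_ms = interval_s * 1000.0
    return {
        # average request size in KB
        "rqst_kb": (read_kb + write_kb) / num_ios if num_ios else 0,
        # average number of requests queued or in flight (assumes HZ = 100)
        "queue_len": weighted_ticks_ms / interval_ms,
        # average time a request spent queued plus being serviced, in ms
        "wait_ms": (read_ticks_ms + write_ticks_ms) / num_ios if num_ios else 0,
        # busy time divided by completed requests, in ms
        "svc_ms": io_ticks_ms / num_ios if num_ios else 0,
        # percent of the interval with at least one request outstanding
        "util_pct": 100.0 * io_ticks_ms / interval_ms,
    }


# Made-up deltas for one device over a 2 second interval.
m = disk_metrics(reads=300, writes=100, read_kb=2400, write_kb=800,
                 read_ticks_ms=1500, write_ticks_ms=500,
                 io_ticks_ms=1800, weighted_ticks_ms=2000, interval_s=2)
print(m)
```

Note that the utilization line only divides busy time by elapsed time; nothing in it depends on how much I/O the device could do at most.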


If you are troubleshooting "slow IO", you also need to consider and
correlate the service times of the SAN, the Oracle datafiles, and the
session IO. Of course you need to sample them in a consistent and
fine-grained manner; I would use a 5-second interval for all 3 subsystems:
- SAN -> iostat -xnc 1 100000 | while read line; do echo "`date +%T`" "$line" ; done >> iostat_1.txt
- datafiles -> https://www.dropbox.com/s/jzcl5ydt29mvw69/PerformanceAndTroubleshooting/filestat.sql
- session -> @snapper ash=sql_id+sid+event+wait_class+module+service,stats 5 5 sid=<sid>
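
The timestamping loop in the iostat line is what makes the three samples comparable afterwards. A minimal Python equivalent, in case the shell loop is not convenient; the function name is made up and the sample input stands in for real iostat output:

```python
import io
from datetime import datetime


def stamp_lines(lines, now=datetime.now):
    """Yield each input line prefixed with the current HH:MM:SS time."""
    for line in lines:
        yield f"{now().strftime('%H:%M:%S')} {line.rstrip()}"


# Stand-in for piped iostat output; in real use pass sys.stdin instead.
sample = io.StringIO("sda 0.00 1.50 99.00\nsdb 0.00 0.20 3.00\n")
for stamped in stamp_lines(sample):
    print(stamped)
```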

I had a recent scenario on Solaris M5000/9000 where the SAN (Symmetrix) and
datafiles were in the 10-60 ms range while the Oracle sessions were doing slow
IO, with service times of around 900 ms to 1 sec. That issue turned out to be
related to CPU scheduling (they had a really high load average) and sessions
spinning on vxfslocks (due to concurrent IO not being set). It is something
you have to keep in mind when troubleshooting IO: there is the response
time of the kernel-mode calls down to the low-level components (not
preempted), plus the response time of the user-mode calls (session IO, not
being serviced properly because of preemption brought on by scheduling/lock
issues).

Here's the sample distribution of that scenario
http://karlarao.tiddlyspot.com/#%5B%5Bavg%20latency%20issue%5D%5D
--
Karl Arao
karlarao.wordpress.com
karlarao.tiddlyspot.com


kapil vaish
14 years ago
Permalink
Hi Guys ,
 
For some of our RAC envs, we do the following during a cold backup:

1. Shut down the DB
2. Take the backup
3. Start up the DB in restricted mode
4. Do maintenance work such as compiling packages, pinning packages, some index/table reorgs, etc.
5. Open the DB to users (in normal mode)

All of these DBs are in a RAC environment and we don't register them in the CRS. So, even when the server is shut down, the DB will not automatically start and open to users without the maintenance work being done.
Now there is a requirement to register all the DBs in CRS along with their services. With this registration, an init.d script is created to start CRS (crsctl start crs). CRS will then start up the database and open it to users, so I will not be able to do my maintenance work. Are there any options to start the DB in restricted mode with the RAC startup and then run scripts to do the maintenance automatically before the DB is opened to users?

Any pointers are appreciated.

thanks
kapil
kapil vaish
14 years ago
Permalink
Hi Guys ,
 
We have a physical standby database for one of our biggest databases. Scripts ship the archived logs to the standby server, and then manual recovery is performed (through scripts) using parallel 32. The archived log size is 2 GB and daily production archive generation is around 2.5 TB. We are trying to increase performance on our standby database. We tried tuning various standby-related parameters and IO; the best apply rate we could achieve is 45 sec per archive log. Can you suggest any other tuning you may have seen work in your environments? Any pointers are appreciated.

thanks
kapil Vaish
Guillermo Alan Bort
14 years ago
Permalink
<disclaimer>this is a strictly unhelpful comment </disclaimer>
I'm curious as to why you want to further reduce the apply time. Are you
experiencing a delay in the standby because it takes 45 seconds to apply
the archivelogs?

One of the key concepts of tuning is knowing when to stop, so perhaps if
you are experiencing no problems with this apply time it's time to leave it
be and move on to the next problem (there's always a next problem...
otherwise life would be boring)

hope that wasn't too unhelpful
Alan.-
kapil vaish
14 years ago
Permalink
Thanks Alan for your comments. Our standby is lagging behind even with this apply rate and we often have to use RMAN to sync it up.

Thanks
Kapil

Storey, Robert (DCSO)
14 years ago
Permalink
Just curious on my part. When you say the standby is lagging, how much of a lag are we talking about? Do you ship the logs as they are created, or do you batch them up?

In just doing the math, if I'm right, you're generating about 1250 logs a day, at a rate of about 52 an hour. So you are popping an archive log roughly every minute? I got those numbers from 2.5 TB = 2500 GB; at 2 GB per archive log that is 1250 logs a day, or about 52 an hour. That's an archive log a little over every minute.

I don't think you can tune the apply time that much. I thought that was a product of the processing speed of the hardware and the limit on how fast Oracle can apply redo. Maybe I'm wrong, but I think 45 seconds to apply 2 GB of redo seems pretty decent.

What is the implication of the standby lagging? Are you using it for real-time reporting? How long does it take to ship the logs to the new system? I guess I'm wondering: if it takes you 2 minutes to ship a log to the standby, then roughly 2 more logs are going to get generated in the meantime. From just a pure shipment standpoint, how can you ever catch up?
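
The back-of-the-envelope numbers above can be checked with a few lines of Python. This only restates figures from the thread (2.5 TB per day, 2 GB logs, 45 seconds per log) and assumes 1 TB = 1000 GB, as the post does.

```python
daily_redo_gb = 2500        # 2.5 TB per day, using 1 TB = 1000 GB as in the post
log_size_gb = 2
apply_seconds_per_log = 45  # best apply time reported for the standby

logs_per_day = daily_redo_gb / log_size_gb
logs_per_hour = logs_per_day / 24
seconds_between_logs = 86400 / logs_per_day
apply_hours_per_day = logs_per_day * apply_seconds_per_log / 3600

print(f"{logs_per_day:.0f} logs/day, {logs_per_hour:.1f} logs/hour")
print(f"one log every {seconds_between_logs:.1f} s")
print(f"{apply_hours_per_day:.1f} hours of apply work per day")
```

If the 45-second figure held for every log, roughly 15.6 hours of apply work would cover a full day of redo, so a growing lag would suggest that time is being lost somewhere other than in the apply step itself.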

Not helping your problem, sorry, but I'm curious as to the overall picture of how the lag is occurring and where.

Thanks


Subodh Deshpande
14 years ago
Permalink
rightoo alan :)
--
=============================================
TRUTH WINS AT LAST, DO NOT FORGET TO SMILE TODAY
=============================================


Subodh Deshpande
14 years ago
Permalink
Just curious to know:
why are you doing manual recovery?

Why did you not configure the standby to ship and apply the archives, and just monitor
the archive gap between primary and standby?

thanks..subodh
kapil vaish
14 years ago
Permalink
Thanks for all the answers, awesome team. Here are some answers to your questions.
Manual means through scripts only; this is not Data Guard. There is no issue with shipping time: we have plenty of archived logs available on the standby server to apply. The lag grows to 30-40 hours in 3-4 days and will continue to grow. This DR is used for multiple purposes and we cannot afford this much lag. This is a 3-node RAC, BTW.
What we are trying to figure out is whether this is a limitation of Oracle and cannot get any better, or whether there is other tuning that can be checked. We are continuously working with our storage/hw teams to take care of any contention.
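
As a rough cross-check of these numbers, the lag growth can be turned into an implied end-to-end time per log. This is only a sketch: it assumes the lag grows linearly, takes the midpoints of "30-40 hours" and "3-4 days", and reuses the 2.5 TB/day and 2 GB log size from earlier in the thread.

```python
logs_per_day = 1250          # 2.5 TB/day in 2 GB logs, from earlier in the thread
lag_hours = 35               # assumed midpoint of "30-40 hours"
elapsed_days = 3.5           # assumed midpoint of "3-4 days"

lag_growth_hours_per_day = lag_hours / elapsed_days
redo_hours_applied_per_day = 24 - lag_growth_hours_per_day
logs_applied_per_day = logs_per_day * redo_hours_applied_per_day / 24
effective_seconds_per_log = 86400 / logs_applied_per_day

print(f"lag grows about {lag_growth_hours_per_day:.0f} h per day")
print(f"about {logs_applied_per_day:.0f} logs applied per day")
print(f"about {effective_seconds_per_log:.0f} s per log end to end")
```

Under those assumptions the standby gets through a log roughly every two minutes on average, noticeably slower than the 45-second best case, which points at time spent between or around the individual log applies.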

Marcin Przepiorowski
14 years ago
Permalink
Hi,

Why are you not using Data Guard? In that case you could use real-time
apply, and it can work better than applying archive logs.
On the other hand, did you ever try to check why the standby is performing
poorly? You can use v$system_event / v$session_event and try to figure out where
Oracle is losing time. It can be an issue with applying logs, but it can
be an issue with DBWR doing checkpoints as well. I have seen a case where
MRP was able to apply a log in 20 s but the checkpoint took 40 s.
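
One way to do that is to snapshot the cumulative wait times before and after applying a log and look at the differences. A minimal Python sketch of the diffing step follows; it assumes you have already fetched event name and TIME_WAITED (hundredths of a second) from v$system_event into dictionaries, and the snapshot values shown are invented.

```python
def top_waits(before, after, limit=5):
    """Rank events by time waited between two snapshots.

    before/after map event name -> cumulative TIME_WAITED (centiseconds),
    e.g. as selected from v$system_event at two points in time.
    """
    deltas = {
        event: after[event] - before.get(event, 0)
        for event in after
    }
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return [(event, cs / 100.0) for event, cs in ranked[:limit] if cs > 0]


# Made-up snapshot values taken while one archived log was applied.
snap1 = {"db file parallel write": 120000, "checkpoint completed": 40000,
         "log file sequential read": 90000}
snap2 = {"db file parallel write": 121500, "checkpoint completed": 44000,
         "log file sequential read": 90500}

for event, seconds in top_waits(snap1, snap2):
    print(f"{seconds:8.1f} s  {event}")
```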

regards,
--
Marcin Przepiorowski
http://oracleprof.blogspot.com
kapil vaish
14 years ago
Permalink
Hi,
DG was not able to scale up to this level. We tried combinations of parallel threads, from 8 threads up to 64. We got the best performance with 32 threads. Will review the suggested docs.
 
Thanks
Kapil

Marcin Przepiorowski
14 years ago
Permalink
Hmmm, I have seen a couple of servers running at a similar load level (2 GB -
2.5 GB/min) in a DG configuration in max performance mode, but without RAC.

I would start with a detailed analysis of where Oracle is losing time, using
the Oracle wait interface.
--
Marcin Przepiorowski
http://oracleprof.blogspot.com
CRISLER, JON A
14 years ago
Permalink
I find it odd that you say DG did not scale. I would suggest going back and looking at DG again: if implemented properly, it should save a great deal of labor since it will manage archive gaps for you.
Have you implemented Statspack on the standby side? AWR reports will not be helpful if I recall correctly (i.e. they don't work on a standby, at least for 10g), but there is a technote that shows you how to add Statspack to a standby DB, so it would give you additional metrics to help diagnose the problem.

How hard is your interconnect running? We have found a lot of benefit in running 10 Gb Ethernet with jumbo frames. Also, have your sysadmin / platform engineer check your HBAs to make sure they are optimally set up for high I/O to your SAN (or network connections if NFS). Look at things like proper multipath setup, proper queue lengths, etc. Consider running Orion to benchmark your disk I/O, and compare that to the primary side. Run it against all LUNs / filesystems as well, as some might perform worse than others.

CRISLER, JON A
14 years ago
Permalink
From Oracle Support: Installing and Using Standby Statspack in 11g [ID 454848.1].
Typically nobody bothers with Statspack in 11g since AWR has many more metrics and capabilities, but this is a case where AWR does not work on a standby and Statspack does. You have to create dblinks from primary to standby to support this.
I have done this twice now; it's a little bit tricky to set up but works OK.
From: kapil vaish [mailto:kapilvaish1-/***@public.gmane.org]
Sent: Sunday, November 27, 2011 1:52 PM
To: CRISLER, JON A
Subject: Re: Standby Database performance

Hi ,
Can you point me to the Doc you are referring here ? Statspack on standby database .

Thanks
Kapil


Jorgensen, Finn
14 years ago
Permalink
Important information left out :

What version of Oracle?
What OS?
What kind of hardware?
Storage?
ASM?

Thanks,
Finn

kapil vaish
14 years ago
Permalink
DR 10.2.0.5, HP-Itanium, Super-Dome, EMC SAN, No ASM
 

Jorgensen, Finn
14 years ago
Permalink
Did you play with the parallel_execution_message_size parameter? I've seen some percentage of performance improvement when increasing that value.
Thanks,
Finn

From: kapil vaish [mailto:kapilvaish1-/***@public.gmane.org]
Sent: Wednesday, November 23, 2011 6:26 PM
To: Jorgensen, Finn; oracle-l-***@public.gmane.org
Subject: Re: Standby Database performance

DR 10.2.0.5, HP-Itanium, Super-Dome, EMC SAN, No ASM


--
http://www.freelists.org/webpage/oracle-l
kapil vaish
14 years ago
Permalink
Yes Finn. We did, it is now set to 64k.


--
http://www.freelists.org/webpage/oracle-l
Chitale, Hemant Krishnarao
14 years ago
Permalink
Have you actually tried *reducing* the degree of parallelism?
A high degree of parallelism causes the PQ slaves to interfere with each other in a recovery scenario.
This particularly happens with large transactions that update many indexes and then issue ROLLBACKs.

See these Oracle Support articles:
How to Disable Parallel Transaction Recovery When Parallel Txn Recovery is Active (238507.1)
Parallel Rollback may hang database, Parallel query servers get 100% cpu (144332.1)

 
Hemant K Chitale


--
http://www.freelists.org/webpage/oracle-l
Jorgensen, Finn
14 years ago
Permalink
I've used that note with great success when doing crash recovery in 11.2. Just FYI.

Thanks,
Finn



--
http://www.freelists.org/webpage/oracle-l
Andrew Kerber
14 years ago
Permalink
You might be able to do something with a database trigger (on startup of
the database). Though if you are backing them up, I don't see why you don't
put them in archivelog mode and run hot backups.
...
--
Andrew W. Kerber

'If at first you dont succeed, dont take up skydiving.'


--
http://www.freelists.org/webpage/oracle-l
kapil vaish
14 years ago
Permalink
Thanks Andrew. A trigger is one option, but it does not cover all of the maintenance work. Is there any way we can run a script during RAC startup?
These databases are in archivelog mode and hot backups are run. We take cold backups during specified windows.
 
Thanks
Kapil

--
http://www.freelists.org/webpage/oracle-l
CRISLER, JON A
14 years ago
Permalink
srvctl has options to start a database in modes other than a normal open. Example: srvctl start instance -d (dbname) -i (instancename) -o mount
So check the -o options for the one you want.

--
http://www.freelists.org/webpage/oracle-l
Sais, Gene
14 years ago
Permalink
I usually stop all of the instances (i.e. stop db) with srvctl and then start one instance with sqlplus and run my scripts.
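
The sequence described above (stop everything with srvctl, start one instance in SQL*Plus, run the scripts) might be sketched as a dry-run plan like the one below. This is an illustration only: the database name `ORCL`, the script name `maintenance.sql`, the use of STARTUP RESTRICT, and the final restart through srvctl are assumptions, not details from the post. It only prints the commands; nothing is executed.

```python
import shlex


def maintenance_plan(db_name, script_path):
    """Build the ordered commands for a stop / restricted start / restart cycle.

    Returns a list of (argv, stdin_text) pairs. Nothing is run here.
    db_name and script_path are hypothetical placeholders.
    """
    # SQL*Plus input: open one instance in restricted mode, run the
    # maintenance script, then shut the instance down again.
    sqlplus_input = "\n".join([
        "STARTUP RESTRICT",
        f"@{script_path}",
        "SHUTDOWN IMMEDIATE",
        "EXIT",
    ])
    return [
        # Stop all instances of the database through clusterware.
        (["srvctl", "stop", "database", "-d", db_name], None),
        # Assumes ORACLE_SID points at the local instance on this node.
        (["sqlplus", "-S", "/ as sysdba"], sqlplus_input),
        # Bring the whole database back under srvctl control.
        (["srvctl", "start", "database", "-d", db_name], None),
    ]


for argv, stdin_text in maintenance_plan("ORCL", "maintenance.sql"):
    print(shlex.join(argv))
    if stdin_text:
        for line in stdin_text.splitlines():
            print("    " + line)
```

Keeping the plan as data makes it easy to review before wiring it to anything that actually runs commands.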

--
http://www.freelists.org/webpage/oracle-l
Andy Colvin
14 years ago
Permalink
You could use srvctl to start the database in mount mode, then use sqlplus to start it up in restricted mode and perform your database maintenance. If you're using 11.2 GI, CRS should start automatically on bootup.
Andy Colvin

5605 N MacArthur Blvd
Suite 600
Irving, TX 75038
andy.colvin-***@public.gmane.org

--
http://www.freelists.org/webpage/oracle-l
Niall Litchfield
14 years ago
Permalink
I'm more than a little surprised that you are regularly compiling packages
in maintenance windows (as opposed to application deployment). Pinning
objects can (and should, IMO) be done via a startup trigger and a control
table. However, having said all that, I'd most likely achieve a quiet
maintenance window by ensuring all DBA access is via services and not
having the services autostart. A blunter instrument would be to keep the
listener(s) down.
...
--
http://www.freelists.org/webpage/oracle-l
Andrew Kerber
14 years ago
Permalink
Niall-

Now I feel stupid. Controlling access with services or the listener is the
obvious and easy way to do this. Good job spotting the obvious.

--
Andrew W. Kerber

'If at first you dont succeed, dont take up skydiving.'


--
http://www.freelists.org/webpage/oracle-l
Subodh Deshpande
14 years ago
Permalink
Hello Kapil,
What forces you to do the maintenance job this way? Is it your maintenance
procedure or a requirement?
I do not think users need to be affected by the maintenance work. Indexes can
be rebuilt online.
For package maintenance, you can create a scheduled job that fires when the
load is low.

thanks..subodh
...
--
=============================================
TRUTH WINS AT LAST, DO NOT FORGET TO SMILE TODAY
=============================================


--
http://www.freelists.org/webpage/oracle-l