Discussion:
Tuning Self-referencing Inserts
"MacGregor, Ian A." (Redacted sender "ian" for DMARC)
2018-11-30 18:31:26 UTC
Permalink
The problem is with a PeopleSoft insert statement whose select references the table being inserted into. The problem stems from having to read the table and maintain its indexes, even though the table is not that large. The buffer gets are extremely high.


Is there a generic answer to this problem? Would it help to rebuild with a higher percent free, in an attempt to have fewer rows per block so as to lessen contention?
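(To illustrate what I mean by a rebuild -- the PCTFREE value here is just an example:)

-- Rebuild the table leaving more free space in each block; the indexes go
-- UNUSABLE after a move and must be rebuilt.
alter table ps_psa_acctdstgl4 move pctfree 40;
alter index ps_psa_acctdstgl4 rebuild;  -- and likewise for each other index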


Also, for library cache pins, I don't understand how to read the p3raw value. I think the problem primarily lies with the maintenance of the unique index, but I am not 100% confident.


Ian A. MacGregor
SLAC National Accelerator Laboratory
Computing Division
To offer the best IT service at the lab and be the IT provider of choice.
Andy Sayer
2018-11-30 22:56:28 UTC
Permalink
I think we need to take a few steps back.

It’s hard to explain why such a statement would cause noticeable issues if
the table is really not that large. A simple insert into <target> statement
using a select from <target table> won’t take a silly amount of time -
it’s just the time to read the table blocks using a full tablescan, plus the
time to update the indexes, which might be some overhead but no different to
any other insert statement of the same volume.

Perhaps this insert is within some loop? Perhaps the self-reference is
written so that a silly execution plan is being used. It would be great if
you could share the execution plan you are seeing and the SQL being
executed. Is there PL/SQL involved?

One other thing that springs to mind that could have an impact is DML error
logging (it sounds like a situation where unique keys could give you
errors?)

As for library cache pins, this is unusual for an insert/select statement.
Things like running a PL/SQL procedure which another session is trying to
recompile while another session is executing it will bring around this
behaviour. Perhaps there is a function being called by your statement.
Again, what exactly are you seeing? This is a block so you should be able
to investigate what the blocking session is doing by checking v$session.
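For example, something like this (substitute the SID of the waiting session):

select sid, serial#, sql_id, event, state, blocking_session, blocking_instance
from   v$session
where  sid = <waiting session sid>;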

Hope this helps,
Andy
Post by "MacGregor, Ian A." (Redacted sender "ian" for DMARC)
The problem is with a PeopleSoft statement which is based on a select
statement which references the table being inserted. The problem stems
from having to read and build the indexes of the table which is not that
large. The buffer gets are extremely high.
Is there generic answer to this problem. Would it help to rebuild with
a higher percent free in an attempt to have few rows per block so as to
lessen contention.
Also for library cache pins I don't understand how to reads the p3raw
value. I think the problem primarily lies with the maintenance m of the
unique index bit I am. not 100% confident.
Ian A. MacGregor
SLAC National Accelerator Laboratory
Computing Division
To offer the best IT service at the lab and be the IT provider of choice.
"MacGregor, Ian A." (Redacted sender "ian" for DMARC)
2018-12-02 06:17:39 UTC
Permalink
The table is very small, about 64K. Here is an explain plan from a month ago.


-------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                  |                   |       |       |  2999 (100)|          |
|   1 |  LOAD TABLE CONVENTIONAL          |                   |       |       |            |          |
|   2 |   HASH UNIQUE                     |                   |     1 |   477 |  2999   (1)| 00:00:36 |
|   3 |    NESTED LOOPS                   |                   |     1 |   477 |  2994   (1)| 00:00:36 |
|   4 |     NESTED LOOPS                  |                   |   192 |   477 |  2994   (1)| 00:00:36 |
|*  5 |      HASH JOIN                    |                   |     2 |   254 |     6   (0)| 00:00:01 |
|   6 |       NESTED LOOPS                |                   |     8 |   608 |     3   (0)| 00:00:01 |
|*  7 |        TABLE ACCESS FULL          | PS_PSA_RULES_HDR  |     8 |   504 |     3   (0)| 00:00:01 |
|*  8 |        INDEX UNIQUE SCAN          | PS_PSA_RULES_LVL  |     1 |    13 |     0   (0)|          |
|*  9 |       TABLE ACCESS FULL           | PS_PSA_RULES_LN   |    32 |  1632 |     3   (0)| 00:00:01 |
|* 10 |      INDEX RANGE SCAN             | PS_PSA_ACCTG_TA34 |    96 |       |  1473   (1)| 00:00:18 |
|* 11 |       INDEX RANGE SCAN            | PS_PSA_ACCTDSTGL4 |     1 |    20 |     2   (0)| 00:00:01 |
|* 12 |       INDEX RANGE SCAN            | PS_PSA_ACCTDSTGL4 |     1 |    14 |     2   (0)| 00:00:01 |
|  13 |       SORT AGGREGATE              |                   |     1 |    19 |            |          |
|  14 |        TABLE ACCESS BY INDEX ROWID| PS_PSA_ACCTDSTGL4 |     1 |    19 |     3   (0)| 00:00:01 |
|* 15 |         INDEX RANGE SCAN          | PS_PSA_ACCTDSTGL4 |     1 |       |     2   (0)| 00:00:01 |
|* 16 |     TABLE ACCESS BY INDEX ROWID   | PS_PSA_ACCTG_TA34 |     1 |   350 |  1494   (1)| 00:00:18 |
-------------------------------------------------------------------------------------------------------

It took just over 11,000 seconds to insert 25,000 rows. There were 1,509,287,736 buffer gets associated with the instance. There is something definitely wrong. This is a 2-node RAC system. FYI the insert is called more than once. It runs pretty well the first time, then gets dramatically worse. The above is the second-worst case, from about a month ago. There doesn't seem to be anything wrong with the SQL.


For the latest run, I decided to do periodic probes of v$session_wait. Each time I did, I saw the library cache pin wait, all with "waited short time". V$session did not show any blocking session. Neither did the query tailored to find blockers for this wait. I did a system state dump.
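(The probes were essentially of this form:)

select sid, event, state, wait_time, seconds_in_wait, p1, p2, p3raw
from   v$session_wait
where  sid = <sid of the insert session>;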


-------------------------------------------------------------------------------
Oracle session identified by:
{
                instance: 1 (fsprd.fsprd1)
                   os id: 28554
              process id: 176, ***@erp-fprd-oracle01
              session id: 406
        session serial #: 251
}
is not in a wait:
{
               last wait: 120 min 20 sec ago
                blocking: 0 sessions
             current sql: INSERT INTO PS_PSA_ACCTDSTGL4 (PROCESS_INSTANCE, BUSINESS_UNIT_PC, PROJECT_ID, ACTIVITY_ID, RESOURCE_ID, LINE_SEQ_NBR, DEBIT_CREDIT, DST_USE, INTER_ORG_LEVEL, CONVERSION_RATE, ORG_TO_BOOK, COMBO_STATUS, PROJECT_FLAG, IU_ANCHOR_FLG, CONTRACT_NUM, CONTRACT_LINE_NUM, CONTRACT_PPD_SEQ, ACCT_PLAN_ID, EVENT_NUM, ACCOUNT, AL
             short stack: ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-qerixtFetch()+547<-subex1()+259<-subsr3()+183<-evaopn3()+2533<-expepr()+576<-evaiand()+51<-expeal()+23<-qerixtFetch()+800<-qerjotRowProc()+359<-qerhjInnerProbeHashTable()+491<-kdstf11001010000km()+617<-kdsttgr()+103608<-qertbFetch()+2455<-rwsfcd()+103<-qerhjFetch()+621<-qerjotFetch()+2025<-qerjotFetch()+2025<-qerghFetch()+315<-rwsfcd()+103<-qerltcFetch()+1223<-insexe()+691<-opiexe()+5632<-kpoal8()+2380<-opiodr()+917<-ttcpip()+2183<-opitsk(
            wait history:
              1.       event: 'library cache pin'
                  time waited: 0.000158 sec
                      wait id: 7433154   p1: 'handle address'=0x9c48940d0
                                         p2: 'pin address'=0x5eec1bd08
                                         p3: '100*mode+namespace'=0x41cdd00010002
              * time between wait #1 and #2: 0.000046 sec
              2.       event: 'library cache lock'
                  time waited: 0.000228 sec
                      wait id: 7433153   p1: 'handle address'=0x9c48940d0
                                         p2: 'lock address'=0x5eeef6eb0
                                         p3: '100*mode+namespace'=0x41cdd00010002
              * time between wait #2 and #3: 0.000516 sec
              3.       event: 'library cache pin'
                  time waited: 0.000250 sec
                      wait id: 7433152   p1: 'handle address'=0x9a2fdaf10
                                         p2: 'pin address'=0x99a42e960
                                         p3: '100*mode+namespace'=0x2160300010002
}



==========================================================================================


I thought the lock/pin was against an index, but the 100 indicates it is against the table. However, despite their showing up when I probed v$session_wait, they are not shown to be a significant wait. I was surprised to see them at all, because I associate such locks/pins with DDL, not DML.


The last point is that this is a 2-node RAC system; the database was cloned to a single-instance database, and there the program ran much more efficiently. The longest time for an insert statement was less than an hour.


Here are the RAC stats from the run done about a month ago. This covers a 10-hour period.


The top RAC-associated wait was ranked 10th.


Event                                     Waits       Total Wait Time (sec)   Wait Avg(ms)   % DB time   Wait Class
DB CPU                                                38K                                    96.6
SQL*Net message from dblink               3,038       497.6                   164            1.3         Network
db file sequential read                   1,233,886   474.4                   0              1.2         User I/O
control file sequential read              149,191     96.6                    1              .2          System I/O
direct path read                          71,377      89.5                    1              .2          User I/O
SQL*Net more data to dblink               45,096      51.1                    1              .1          Network
log switch/archive                        21          43.8                    2086           .1          Other
recovery area: computing obsolete files   10          22.3                    2234           .1          Other
log file sync                             33,439      19.2                    1              .0          Commit
gc current block 2-way                    42,984      16                      0              .0          Cluster

The db link wait is not associated with this process.




Ian A. MacGregor
SLAC National Accelerator Laboratory
Computing Division
To offer the best IT service at the lab and be the IT provider of choice.
Mark W. Farnham
2018-12-02 17:27:28 UTC
Permalink
Just quick random thoughts:



Parallel or serial? (Both any select feeding this and Parallel DML)

On RAC, using force local parallel?



Is there a reason the insert has not been arranged to be append? (Hard to
see if there is a logical impediment with the SQL truncated).
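i.e., if the logic allows, something like this (sketch only, column lists elided):

insert /*+ append */ into <target> ( <column list> )
select <column list> from <source>;  -- direct path: loads above the HWM and
                                     -- skips the search for free block space
commit;  -- required before the same session can read the table again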



After the dominant DB CPU wait, the SQL net wait from DBLINK seems
interesting. What's that about?



Decoupling the analysis of select time and insert time is often useful;
where are the values coming from?



When using conventional insert, the search for available space in existing
blocks can take considerable time. Depending on the mode of the cloning, you
may find that this issue is erased on the destination system. If this is
some sort of physical clone, no worries, but if the table has been reloaded,
the list of blocks with room for just one more row or a few rows pushed back
onto the freelist or freespace bitmap has been removed, as well as any issue
with delayed block cleanout.



Any of the table rebuilding exercises done on the RAC would quickly
eliminate this possible root cause and perhaps narrow the observed
difference between RAC and non-RAC. Setting PCTUSED to 0 would prevent a
recurrence IF that is the problem, possibly leading to a periodic copy-out
maintenance task IF space lost to accumulated deleted rows becomes
significant.



"Empty front" resulting from completely empty blocks at the beginning of the
table that are late on the next up freelist or freespace bitmap can exhibit
these symptoms of huge numbers of blocks read to get very few rows, but that
doesn't show up for indexed access, just fts. Which object is getting all
those reads?



That surely seems like a pantload of db file reads! Is that just this
operation, and how was that stat list produced?



mwf



Tim Gorman
2018-12-02 18:16:38 UTC
Permalink
Ian,


I wrote a PL/SQL package called ASH_XPLAN
<http://evdbt.com/download/presentation-real-time-sql-tuning-delivered-at-hotsos-symposium-2015-and-utoug-training-days-2015/>
that combines DBMS_XPLAN with information from V$ACTIVE_SESSION_HISTORY
to accumulate the ASH time spent in each step of the execution plan of a
presently-running SQL statement. By running ASH_XPLAN in a separate
SQL*Plus session repeatedly, I can see which way the wind blows in a
very short time, without having to wait for the statement to complete. 
Of course, please don't consider accumulated ASH time to be exactly
equal to elapsed time, as explained by Graham, Uri, and JB
<http://www.rmoug.org/wp-content/uploads/QEW_Presentations/2012/ASH%20Architecture%20and%20Advanced%20Usage.pdf>.


I'm aware that the DBMS_SQLTUNE and SQL Monitor functionality (thanks
Tim Hall!)
<https://oracle-base.com/articles/11g/real-time-sql-monitoring-11gr1> in
views like V$SQL_PLAN_STATISTICS does this as well, but I found it fun
and educational to solve the problem myself.  I was lonely for a nice
PL/SQL project that week.  :)


Either way, using ASH_XPLAN or DBMS_SQLTUNE, I strongly recommend
displaying the actual elapsed time for each step of the plan in an
actual execution to determine where the most time is being spent.
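At heart that is just an aggregation like this (a rough sketch, not the package itself; each ASH sample approximates one second):

select sql_plan_line_id, count(*) as ash_seconds
from   v$active_session_history
where  sql_id = '<the sql_id>'
group  by sql_plan_line_id
order  by 2 desc;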


Nothing really matters in optimization except elapsed time; everything
else is explanatory information.


Elapsed time (a.k.a. "life") is too short for guessing.


Hope this helps...


-Tim
Andy Sayer
2018-12-02 23:15:00 UTC
Permalink
Post by "MacGregor, Ian A." (Redacted sender "ian" for DMARC)
The table is very small about 64K.
What's more important is how much of the table is being read and how often.

Here is an explain plan from a month ago.


An explain plan from a month ago could be very misleading. You seem to have
access to the relevant privileges, so you should be looking at the real
execution plan in memory using

select * from table(dbms_xplan.display_cursor(sql_id=><the sql_id>));

The actual explain plan tells us the query is more complicated than a
simple select * from table.

This should be taken with a pinch of salt as it's an explain plan (so
unreliable) from a month ago (so the stats will have changed).
We can see that most of the cost in the query is in the index range scan on
line 10. It's not often you see an index range scan cost over 1,000, and it
usually suggests you are using a filter with poor selectivity on the index
(reading lots of the index) - this is quite a frequent mistake I see: just
because an index is being used, it doesn't mean it's being used well. The
predicates section of the plan will tell you exactly what's going on; you
should always include it (and the notes) when sharing the explain/execution
plan.
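For example:

select *
from   table(dbms_xplan.display_cursor(
         sql_id => '<the sql_id>',
         format => 'ALLSTATS LAST +PREDICATE +NOTE'));

(ALLSTATS LAST will also show actual rows and buffer gets per step, if rowsource statistics are being collected - statistics_level = all or the gather_plan_statistics hint.)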
We can see that there are a few scalar subqueries being used; this tends to
be an indicator of rushed SQL that can probably be rewritten to give the
CBO full power to execute the statement effectively.
The more rows you have driving these scalar subqueries, the more times
they get run, and the more work needs to be done.
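The usual shape of the fix is to turn the per-row scalar subquery into a join to a one-off aggregate, e.g. (hypothetical driver table, just to show the shape; the column names are from your insert list):

-- Per-row scalar subquery: re-probes the index once per driving row
select d.resource_id,
       (select max(x.line_seq_nbr)
        from   ps_psa_acctdstgl4 x
        where  x.resource_id = d.resource_id) as max_seq
from   driver_table d;

-- Aggregate once, then join
select d.resource_id, a.max_seq
from   driver_table d
join  (select resource_id, max(line_seq_nbr) as max_seq
       from   ps_psa_acctdstgl4
       group  by resource_id) a
  on   a.resource_id = d.resource_id;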

It took just over 11,000 seconds to insert 25,000 rows


Yeah, that is slow! It's certainly 10,000 times slower than just inserting
25,000 rows.

There were 1,509,2 87,736 buffer gets associated with the instance.


That, as a statement, is not helpful; we care about this statement, not the
instance as a whole (and with no indication of the period it covers).

There is something definitely wrong. This is a 2-node RAC system.


The likely truth is that the SQL is bad and it has a hugely inefficient
execution plan as a result.

Post by "MacGregor, Ian A." (Redacted sender "ian" for DMARC)
FYI the insert is called more than once. It runs pretty well the first time
then gets dramatically worse.
That's no surprise: those scalar subqueries are reading from that table, and
for every row in that table that gets found by one filter you seem to
read a lot of another index.

Post by "MacGregor, Ian A." (Redacted sender "ian" for DMARC)
The above is the second worse case from about a month ago. There doesn't
seem to be anything wrong with SQL
I would strongly disagree. Yes, you can probably make the statement more
efficient with more appropriate indexes, but rewriting your scalar
subqueries to better handle the volume would do wonders.

For the latest run, I decided to do periodic probes of v$session wait.


It sounds like v$session_event is what you're after; it stores aggregated
information for all the waits the session has experienced in its lifetime.
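E.g.:

select event, total_waits, time_waited_micro/1e6 as seconds_waited
from   v$session_event
where  sid = <sid>
order  by time_waited_micro desc;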

Post by "MacGregor, Ian A." (Redacted sender "ian" for DMARC)
Each time I did I saw the library cache pin wait all with "waited short
time". V$session did not show any blocking session. Neither did the
query tailored to find blockers for this wait. I did a system state dump
...
...
Actually, we can see that these waits were experienced a long time ago;
since then the session has just been on CPU (probably due to the loopy
reads of a lot of index, which is highly cached).

Post by "MacGregor, Ian A." (Redacted sender "ian" for DMARC)
...Here are the RAC stats from the run done about a month ago. This
covers a 10 hour period
...
When you are looking at things at the instance level, it is very difficult
to see the wood for the trees. Scoping is everything when it comes to
performance investigations: just as you wouldn't use an AWR report spanning
24 hours for a process that only runs at midday for 5 minutes, you
shouldn't use a full-instance AWR report for something running in one
session.
Instead, you can manually query v$active_session_history with filters to
properly drill down. You can even join from v$active_session_history to
v$sql_plan to see exactly what step of the plan was being executed most of
the time.
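Something like this (a sketch):

select p.id, p.operation, p.object_name, count(*) as ash_samples
from   v$active_session_history h
join   v$sql_plan p
  on   p.sql_id       = h.sql_id
 and   p.child_number = h.sql_child_number
 and   p.id           = h.sql_plan_line_id
where  h.sql_id = '<the sql_id>'
group  by p.id, p.operation, p.object_name
order  by ash_samples desc;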
Post by "MacGregor, Ian A." (Redacted sender "ian" for DMARC)
...
The db link wait is not associated with this process.
That is the problem with using an unscoped report: there's plenty of
irrelevant detail that you'd need to ignore. Although, the top event is
CPU by a long way.

In summary:
Rewrite the query to not use so many scalar subqueries.
Work out why those index range scans read so much of the index.

Hope this helps,
Andy
Post by "MacGregor, Ian A." (Redacted sender "ian" for DMARC)
Ian,
I wrote a PL/SQL package called ASH_XPLAN
<http://evdbt.com/download/presentation-real-time-sql-tuning-delivered-at-hotsos-symposium-2015-and-utoug-training-days-2015/>
that combines DBMS_XPLAN with information from V$ACTIVE_SESSION_HISTORY to
accumulate the ASH time spent in each step of the execution plan of a
presently-running SQL statement. By running ASH_XPLAN in a separate
SQL*Plus session repeatedly, I can see which way the wind blows in a very
short time, without having to wait for the statement to complete. Of
course, please don't consider accumulated ASH time to be exactly equal to
elapsed time, as explained by Graham, Uri, and JB
<http://www.rmoug.org/wp-content/uploads/QEW_Presentations/2012/ASH%20Architecture%20and%20Advanced%20Usage.pdf>
.
I'm aware that the DBMS_SQLTUNE and SQL Monitor functionality (thanks Tim
Hall!)
<https://oracle-base.com/articles/11g/real-time-sql-monitoring-11gr1> in
views like V$SQL_PLAN_STATISTICS does this as well, but I found it fun and
educational to solve the problem myself. I was lonely for a nice PL/SQL
project that week. :)
Either way, using ASH_XPLAN or DBMS_SQLTUNE, I strongly recommend
displaying the actual elapsed time for each step of the plan in an actual
execution to determine where the most time is being spent.
Nothing really matters in optimization except elapsed time, everything
else is explanatory information.
Elapsed time (a.k.a. "life") is too short for guessing.
Hope this helps...
-Tim
The table is very small about 64K. Here is an explain plan from a month ago.
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows |
Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | |
| 2999 (100)| |
| 1 | LOAD TABLE CONVENTIONAL | | |
| | |
| 2 | HASH UNIQUE | | 1 |
477 | 2999 (1)| 00:00:36 |
| 3 | NESTED LOOPS | | 1 |
477 | 2994 (1)| 00:00:36 |
| 4 | NESTED LOOPS | | 192 |
477 | 2994 (1)| 00:00:36 |
|* 5 | HASH JOIN | | 2 |
254 | 6 (0)| 00:00:01 |
| 6 | NESTED LOOPS | | 8 |
608 | 3 (0)| 00:00:01 |
|* 7 | TABLE ACCESS FULL | PS_PSA_RULES_HDR | 8 |
504 | 3 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | PS_PSA_RULES_LVL | 1 |
13 | 0 (0)| |
|* 9 | TABLE ACCESS FULL | PS_PSA_RULES_LN | 32 |
1632 | 3 (0)| 00:00:01 |
|* 10 | INDEX RANGE SCAN | PS_PSA_ACCTG_TA34 | 96 |
| 1473 (1)| 00:00:18 |
|* 11 | INDEX RANGE SCAN | PS_PSA_ACCTDSTGL4 | 1 |
20 | 2 (0)| 00:00:01 |
|* 12 | INDEX RANGE SCAN | PS_PSA_ACCTDSTGL4 | 1 |
14 | 2 (0)| 00:00:01 |
| 13 | SORT AGGREGATE | | 1 |
19 | | |
| 14 | TABLE ACCESS BY INDEX ROWID| PS_PSA_ACCTDSTGL4 | 1 |
19 | 3 (0)| 00:00:01 |
|* 15 | INDEX RANGE SCAN | PS_PSA_ACCTDSTGL4 | 1 |
| 2 (0)| 00:00:01 |
|* 16 | TABLE ACCESS BY INDEX ROWID | PS_PSA_ACCTG_TA34 | 1 |
350 | 1494 (1)| 00:00:18 |
-------------------------------------------------------------------------------------------------------
It took just over 11,000 seconds to insert 25,000 rows There were 1,509,2
87,736 buffer gets associated with the instance. There is something
definitely wrong. This is a 2-node RAC system. FYI the
insert is called more than once. It runs pretty well the first time then
gets dramatically worse. The above is the second worse case from about a
month ago. There doesn't seem to be anything wrong with SQL
For the latest run, I decided to do periodic probes of v$session wait.
Each time I did I saw the library cache pin wait all with "waited short
time". V$session did not show any blocking session. Neither did
the query tailored to find blockers for this wait. I did a system state
dump
-------------------------------------------------------------------------------
{
instance: 1 (fsprd.fsprd1)
os id: 28554
session id: 406
session serial #: 251
}
{
last wait: 120 min 20 sec ago
blocking: 0 sessions
current sql: INSERT INTO PS_PSA_ACCTDSTGL4 (PROCESS_INSTANCE,
BUSINESS_UNIT_PC, PROJECT_ID, ACTIVITY_ID, RESOURCE_ID, LINE_SEQ_NBR,
DEBIT_CREDIT, DST_USE, INTER_ORG_LEVEL, CONVERSION_RATE, ORG_TO_BOOK,
COMBO_STATUS, PROJECT_FLAG, IU_ANCHOR_FLG, CONTRACT_NUM, CONTRACT_LINE_NUM,
CONTRACT_PPD_SEQ, ACCT_PLAN_ID, EVENT_NUM, ACCOUNT, AL
ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-qerixtFetch()+547<-subex1()+259<-subsr3()+183<-evaopn3()+2533<-expepr()+576<-evaiand()+51<-expeal()+23<-qerixtFetch()+800<-qerjotRowProc()+359<-qerhjInnerProbeHashTable()+491<-kdstf11001010000km()+617<-kdsttgr()+103608<-qertbFetch()+2455<-rwsfcd()+103<-qerhjFetch()+621<-qerjotFetch()+2025<-qerjotFetch()+2025<-qerghFetch()+315<-rwsfcd()+103<-qerltcFetch()+1223<-insexe()+691<-opiexe()+5632<-kpoal8()+2380<-opiodr()+917<-ttcpip()+2183<-opitsk(
1. event: 'library cache pin'
time waited: 0.000158 sec
wait id: 7433154 p1: 'handle
address'=0x9c48940d0
p2: 'pin address'=0x5eec1bd08
'100*mode+namespace'=0x41cdd00010002
* time between wait #1 and #2: 0.000046 sec
2. event: 'library cache lock'
time waited: 0.000228 sec
wait id: 7433153 p1: 'handle
address'=0x9c48940d0
p2: 'lock
address'=0x5eeef6eb0
'100*mode+namespace'=0x41cdd00010002
* time between wait #2 and #3: 0.000516 sec
3. event: 'library cache pin'
time waited: 0.000250 sec
wait id: 7433152 p1: 'handle
address'=0x9a2fdaf10
p2: 'pin address'=0x99a42e960
'100*mode+namespace'=0x2160300010002
}
==========================================================================================
I thought the lock/pin was against an index, but the 100 indicates it is
against the table. However despite their showing up when I probed
v$session_wait they are not shown to be a significant wait. I was
surprised to see them however, because I associated such locks/pins with
DDL not DML.
The last point is that this is a 2 node RAC system, the database was
cloned to a single instance database and the program ran much more
efficiently. The longest time for an insert statement was less than an
hour.
Here are the RAC stats from the run done about a month ago. This covers a 10 hour period
The top RAC associated wait was ranked 10th
- *• *
*Event*
*Waits*
*Total Wait Time (sec)*
*Wait Avg(ms)*
*% DB time*
*Wait Class*
DB CPU
38K
96.6
SQL*Net message from dblink
3,038
497.6
164
1.3
Network
db file sequential read
1,233,886
474.4
0
1.2
User I/O
control file sequential read
149,191
96.6
1
.2
System I/O
direct path read
71,377
89.5
1
.2
User I/O
SQL*Net more data to dblink
45,096
51.1
1
.1
Network
log switch/archive
21
43.8
2086
.1
Other
recovery area: computing obsolete files
10
22.3
2234
.1
Other
log file sync
33,439
19.2
1
.0
Commit
gc current block 2-way
42,984
16
0
.0
Cluster
The db link wait is not associated with this process.
Ian A. MacGregor
SLAC National Accelerator Laboratory
Computing Division
To offer the best IT service at the lab and be the IT provider of choice.
------------------------------
*Sent:* Friday, November 30, 2018 2:56:28 PM
*Cc:* ORACLE-L
*Subject:* Re: Tuning Self-referencing Inserts
I think we need to take a few steps back.
It’s hard to explain why such a statement would cause noticeable issues if
the table is really not that large. A simple insert into <target> statement
using a select from <target table>, won’t take a silly amount of time -
it’s just the time to read the table blocks using a full tablescan then the
time to update the indexes which might be some overhead but no different to
any other insert statement of the same volume.
Perhaps this this insert is within some loop? Perhaps the self-reference
is written so that a silly execution plan is being used. It would be great
if you could share the execution plan you are seeing and the SQL being
executed. Is there PL/SQL involved?
One other thing that springs to mind that could have an impact is DML
error logging (it sounds like a situation where unique keys could give you
errors?)
As for library cache pins, this is unusual for an insert/select statement.
Things like running a PL/SQL procedure which another session is trying to
recompile while another session is executing it will bring around this
behaviour. Perhaps there is a function being called by your statement.
Again, what exactly are you seeing? This is a block so you should be able
to investigate what the blocking session is doing by checking v$session.
Hope this helps,
Andy
On Fri, 30 Nov 2018 at 18:32, MacGregor, Ian A. <
The problem is with a PeopleSoft statement which is based on a select
statement which references the table being inserted. The problem stems
from having to read and build the indexes of the table which is not that
large. The buffer gets are extremely high.
Is there generic answer to this problem. Would it help to rebuild with
a higher percent free in an attempt to have few rows per block so as to
lessen contention.
Also for library cache pins I don't understand how to reads the p3raw
value. I think the problem primarily lies with the maintenance m of the
unique index bit I am. not 100% confident.
Ian A. MacGregor
SLAC National Accelerator Laboratory
Computing Division
To offer the best IT service at the lab and be the IT provider of choice.
"MacGregor, Ian A." (Redacted sender "ian" for DMARC)
2018-12-03 03:06:37 UTC
Permalink
I'm afraid I posted some confusing information.


When I tried to obtain the plan for the run of a few days ago, the sql_id was already gone from the cache. This is why I posted last month's, where the behavior was the same.


Each time the insert runs it is assigned a new sql_id, because PeopleSoft uses its own process_id as part of the where clause. The same statement run by the same Oracle session may have a different PeopleSoft process_id associated with it. In this particular PeopleSoft task, the insert statement is run three different times, with three different sql_id's, by the same Oracle session. Each time, the run time and the number of buffer gets increase. The plan, however, is the same each time the statement is run, based on the plan's hash value.


The timing of 11,000 seconds was for the plan posted; i.e., both were gathered from a month ago. The 1,509,287,736 buffer gets were associated with that insert, not with the whole instance. This was the second-worst case for that particular PeopleSoft run. The next time the insert was called, it took over 1.8 billion gets to run the same insert statement with the same plan, inserting the same number of rows. The 1.5 billion gets are for the second insert execution, and the 1.8 billion for the third.


I should not have stuck with last month's information, because it created the impression that I was conflating the two runs. I apologize that my post caused such confusion.

=====================================================================



PeopleSoft inserts and deletes rows in what it calls temporary tables, which are regular table segments. The number of rows in these tables can obviously have an impact on the time it takes to run a statement. There are several times during the month when the same statement using the same plan runs without a problem; the problem only occurs at month's end. The table being inserted into is one of these "temporary" tables. These temporary tables may also be truncated. In examining the PeopleSoft trace I saw no indication of truncation of this table, although other tables were truncated. However, some process either deletes the rows or truncates the table, because the number of rows is currently 0.


========================================================================


I posted the library cache pin query as a general question, as I was surprised to see it at all when querying v$session_wait. I don't believe it is associated with a truncate; I could find no associated truncate statement. I was wondering if this could be RAC related.



I hope this clears up any confusion. I renew my request for help.
Ian A. MacGregor
SLAC National Accelerator Laboratory
Computing Division
To offer the best IT service at the lab and be the IT provider of choice.
________________________________
From: oracle-l-***@freelists.org <oracle-l-***@freelists.org> on behalf of Andy Sayer <***@gmail.com>
Sent: Sunday, December 2, 2018 3:15:00 PM
To: Tim Gorman
Cc: ORACLE-L
Subject: Re: Tuning Self-referencing Inserts

The table is very small about 64K.

What's more important is how much of the table is being read and how often.

Here is an explain plan from a month ago.

An explain plan from a month ago could be very misleading, you seem to have access to the relevant privileges so you should be looking at the real execution plan in memory using

select * from table(dbms_xplan.display_cursor(sql_id=><the sql_id>));

The actual explain plan tells us the query is more complicated than a simple select * from table.

This should be taken with a pinch of salt as it's an explain plan (so unreliable) from a month ago (so the stats would have changed)
We can see that most of the cost in the query is in the index range scan on line 10, it's not often you see an index range scan cost over 1,000 and it usually suggests you are using a filter with high index selectivity (reading lots of the index) - this is quite a frequent mistake I see, just because an index is being used, it doesn't mean it's being used well. The predicates section of the plan will tell you exactly what's going on, you should always include it (and the notes) when sharing the explain/execution plan.
We can see that there are a few scalar subqueries being used, this tends to be an indicator of rushed SQL that can probably be rewritten to allow the CBO full power to execute the statement effectively.
The more rows you have that drive these scalar subqueries, the more times they get run, the more work that needs to be done.

It took just over 11,000 seconds to insert 25,000 rows

Yeah, that is slow! It's certainly 10,000 times slower than just inserting 25,000 rows.

There were 1,509,2 87,736 buffer gets associated with the instance.

That as a statement is not helpful, we care about this statement not the instance as a whole (with no information of how long this was over)

There is something definitely wrong. This is a 2-node RAC system.

The likely truth is that the SQL is bad and it has a hugely inefficient execution plan as a result.

FYI the insert is called more than once. It runs pretty well the first time then gets dramatically worse.

That's no surprise, those scalar subqueries are reading from that table, for every row in that table that get's found from one filter you seem to read a lot of another index.

The above is the second worse case from about a month ago. There doesn't seem to be anything wrong with SQL

I would strongly disagree. Yes, you can probably make the statement more efficient with more appropriate indexes, but rewriting your scalar subqueries to better handle the volume would do wonders.

For the latest run, I decided to do periodic probes of v$session wait.

It sounds like v$session_event is what you're after, it stores aggregated information from all waits that the session experienced in its lifetime.

Each time I did I saw the library cache pin wait all with "waited short time". V$session did not show any blocking session. Neither did the query tailored to find blockers for this wait. I did a system state dump
...
is not in a wait:
...
Actually we can see that these waits were experienced a long time ago, since then the session has just been on CPU (which is probably due to the loopy reads of a lot of index which is highly cached).

...Here are the RAC stats from the run done about a month ago. This covers a 10 hour period
...


When you are looking at things at the instance level, it is very difficult to see the wood from the trees. Scoping is everything when it comes to performance investigations - just like you wouldn't do an AWR report over 24 hours when you care about a process that only runs at midday for 5 minutes, you shouldn't use a full instance AWR report for something running in one session.
Instead, you can manually query v$active_session_history with filters to properly drill down. You can even join from v$active_session_history to v$sql_plan to see exactly what step of the plan was being executed most of the time.
...
The db link wait is not associated with this process.
That is the problem with using an unscoped report, there's plenty of irrelevant details that you'd need to ignore. Although, the top event is CPU by a long way.

In summary:
Rewrite the query to not use so many scalar subqueries
Work out why those index range scans are so highly selective

Hope this helps,
Andy


On Sun, 2 Dec 2018 at 18:17, Tim Gorman <***@gmail.com<mailto:***@gmail.com>> wrote:

Ian,


I wrote a PL/SQL package called ASH_XPLAN<http://evdbt.com/download/presentation-real-time-sql-tuning-delivered-at-hotsos-symposium-2015-and-utoug-training-days-2015/> that combines DBMS_XPLAN with information from V$ACTIVE_SESSION_HISTORY to accumulate the ASH time spent in each step of the execution plan of a presently-running SQL statement. By running ASH_XPLAN in a separate SQL*Plus session repeatedly, I can see which way the wind blows in a very short time, without having to wait for the statement to complete. Of course, please don't consider accumulated ASH time to be exactly equal to elapsed time, as explained by Graham, Uri, and JB<http://www.rmoug.org/wp-content/uploads/QEW_Presentations/2012/ASH%20Architecture%20and%20Advanced%20Usage.pdf>.


I'm aware that the DBMS_SQLTUNE and SQL Monitor functionality (thanks Tim Hall!)<https://oracle-base.com/articles/11g/real-time-sql-monitoring-11gr1> in views like V$SQL_PLAN_STATISTICS does this as well, but I found it fun and educational to solve the problem myself. I was lonely for a nice PL/SQL project that week. :)


Either way, using ASH_XPLAN or DBMS_SQLTUNE, I strongly recommend displaying the actual elapsed time for each step of the plan in an actual execution to determine where the most time is being spent.


Nothing really matters in optimization except elapsed time, everything else is explanatory information.


Elapsed time (a.k.a. "life") is too short for guessing.


Hope this helps...


-Tim



On 12/1/18 22:17, MacGregor, Ian A. (Redacted sender ian for DMARC) wrote:

The table is very small about 64K. Here is an explain plan from a month ago.


-------------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

-------------------------------------------------------------------------------------------------------

| 0 | INSERT STATEMENT | | | | 2999 (100)| |

| 1 | LOAD TABLE CONVENTIONAL | | | | | |

| 2 | HASH UNIQUE | | 1 | 477 | 2999 (1)| 00:00:36 |

| 3 | NESTED LOOPS | | 1 | 477 | 2994 (1)| 00:00:36 |

| 4 | NESTED LOOPS | | 192 | 477 | 2994 (1)| 00:00:36 |

|* 5 | HASH JOIN | | 2 | 254 | 6 (0)| 00:00:01 |

| 6 | NESTED LOOPS | | 8 | 608 | 3 (0)| 00:00:01 |

|* 7 | TABLE ACCESS FULL | PS_PSA_RULES_HDR | 8 | 504 | 3 (0)| 00:00:01 |

|* 8 | INDEX UNIQUE SCAN | PS_PSA_RULES_LVL | 1 | 13 | 0 (0)| |

|* 9 | TABLE ACCESS FULL | PS_PSA_RULES_LN | 32 | 1632 | 3 (0)| 00:00:01 |

|* 10 | INDEX RANGE SCAN | PS_PSA_ACCTG_TA34 | 96 | | 1473 (1)| 00:00:18 |

|* 11 | INDEX RANGE SCAN | PS_PSA_ACCTDSTGL4 | 1 | 20 | 2 (0)| 00:00:01 |

|* 12 | INDEX RANGE SCAN | PS_PSA_ACCTDSTGL4 | 1 | 14 | 2 (0)| 00:00:01 |

| 13 | SORT AGGREGATE | | 1 | 19 | | |

| 14 | TABLE ACCESS BY INDEX ROWID| PS_PSA_ACCTDSTGL4 | 1 | 19 | 3 (0)| 00:00:01 |

|* 15 | INDEX RANGE SCAN | PS_PSA_ACCTDSTGL4 | 1 | | 2 (0)| 00:00:01 |

|* 16 | TABLE ACCESS BY INDEX ROWID | PS_PSA_ACCTG_TA34 | 1 | 350 | 1494 (1)| 00:00:18 |

-------------------------------------------------------------------------------------------------------

It took just over 11,000 seconds to insert 25,000 rows There were 1,509,2 87,736 buffer gets associated with the instance. There is something definitely wrong. This is a 2-node RAC system. FYI the insert is called more than once. It runs pretty well the first time then gets dramatically worse. The above is the second worse case from about a month ago. There doesn't seem to be anything wrong with SQL


For the latest run, I decided to do periodic probes of v$session wait. Each time I did I saw the library cache pin wait all with "waited short time". V$session did not show any blocking session. Neither did the query tailored to find blockers for this wait. I did a system state dump


-------------------------------------------------------------------------------

Oracle session identified by:

{

instance: 1 (fsprd.fsprd1)

os id: 28554

process id: 176, ***@erp-fprd-oracle01

session id: 406

session serial #: 251

}

is not in a wait:

{

last wait: 120 min 20 sec ago

blocking: 0 sessions

current sql: INSERT INTO PS_PSA_ACCTDSTGL4 (PROCESS_INSTANCE, BUSINESS_UNIT_PC, PROJECT_ID, ACTIVITY_ID, RESOURCE_ID, LINE_SEQ_NBR, DEBIT_CREDIT, DST_USE, INTER_ORG_LEVEL, CONVERSION_RATE, ORG_TO_BOOK, COMBO_STATUS, PROJECT_FLAG, IU_ANCHOR_FLG, CONTRACT_NUM, CONTRACT_LINE_NUM, CONTRACT_PPD_SEQ, ACCT_PLAN_ID, EVENT_NUM, ACCOUNT, AL

short stack: ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-qerixtFetch()+547<-subex1()+259<-subsr3()+183<-evaopn3()+2533<-expepr()+576<-evaiand()+51<-expeal()+23<-qerixtFetch()+800<-qerjotRowProc()+359<-qerhjInnerProbeHashTable()+491<-kdstf11001010000km()+617<-kdsttgr()+103608<-qertbFetch()+2455<-rwsfcd()+103<-qerhjFetch()+621<-qerjotFetch()+2025<-qerjotFetch()+2025<-qerghFetch()+315<-rwsfcd()+103<-qerltcFetch()+1223<-insexe()+691<-opiexe()+5632<-kpoal8()+2380<-opiodr()+917<-ttcpip()+2183<-opitsk(

wait history:

1. event: 'library cache pin'

time waited: 0.000158 sec

wait id: 7433154 p1: 'handle address'=0x9c48940d0

p2: 'pin address'=0x5eec1bd08

p3: '100*mode+namespace'=0x41cdd00010002

* time between wait #1 and #2: 0.000046 sec

2. event: 'library cache lock'

time waited: 0.000228 sec

wait id: 7433153 p1: 'handle address'=0x9c48940d0

p2: 'lock address'=0x5eeef6eb0

p3: '100*mode+namespace'=0x41cdd00010002

* time between wait #2 and #3: 0.000516 sec

3. event: 'library cache pin'

time waited: 0.000250 sec

wait id: 7433152 p1: 'handle address'=0x9a2fdaf10

p2: 'pin address'=0x99a42e960

p3: '100*mode+namespace'=0x2160300010002

}



==========================================================================================


I thought the lock/pin was against an index, but the 100 indicates it is against the table. However, despite their showing up when I probed v$session_wait, they are not shown to be a significant wait. I was surprised to see them at all, because I associate such locks/pins with DDL, not DML.


The last point is that this is a 2-node RAC system; the database was cloned to a single-instance database, and there the program ran much more efficiently. The longest time for an insert statement was less than an hour.


Here are the RAC stats from the run done about a month ago. This covers a 10-hour period.


The top RAC-associated wait was ranked 10th:


Event                                       Waits       Total Wait Time (sec)  Wait Avg(ms)  % DB time  Wait Class
------------------------------------------  ----------  ---------------------  ------------  ---------  ----------
DB CPU                                                   38K                                  96.6
SQL*Net message from dblink                       3,038  497.6                  164           1.3        Network
db file sequential read                       1,233,886  474.4                  0             1.2        User I/O
control file sequential read                    149,191  96.6                   1             .2         System I/O
direct path read                                 71,377  89.5                   1             .2         User I/O
SQL*Net more data to dblink                      45,096  51.1                   1             .1         Network
log switch/archive                                   21  43.8                   2086          .1         Other
recovery area: computing obsolete files              10  22.3                   2234          .1         Other
log file sync                                    33,439  19.2                   1             .0         Commit
gc current block 2-way                           42,984  16                     0             .0         Cluster



The db link wait is not associated with this process.




Ian A. MacGregor
SLAC National Accelerator Laboratory
Computing Division
To offer the best IT service at the lab and be the IT provider of choice.
Hemant K Chitale
2018-12-03 03:26:20 UTC
Permalink
You have two "temporary" tables PS_PSA_ACCTG_TA34 and PS_PSA_ACCTDSTGL4
What we did at a previous project is modify the batch job to gather_stats
immediately after such temporary tables are populated *before* they are
used in the SELECT portion of an INSERT.
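A minimal sketch of that change (the SYSADM owner and cascade setting are
assumptions; adjust to your environment):

begin
  -- gather table and index stats right after the "temporary" table is loaded,
  -- so the optimizer sees realistic month-end cardinalities
  dbms_stats.gather_table_stats(
    ownname => 'SYSADM',             -- assumed PeopleSoft schema
    tabname => 'PS_PSA_ACCTG_TA34',
    cascade => true);                -- include index statistics
end;
/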


Hemant K Chitale




On Mon, Dec 3, 2018 at 11:08 AM MacGregor, Ian A. <
Post by "MacGregor, Ian A." (Redacted sender "ian" for DMARC)
I'm afraid I posted some confusing information.
When I tried to obtain the plan for the run of a few days ago, the sql_id
was already gone from the cache. This is why I posted last month's,
where the behavior was the same.
Each time the insert runs it is assigned a new sql_id, because PeopleSoft
uses its own process_id as part of the where clause. The same statement run
by the same Oracle session may have a different PeopleSoft process_id
associated with it. Running this particular PeopleSoft task, the insert
statement is run three different times with three different sql_id's by
the same Oracle session. Each time, the run time and the number of buffer
gets increase. The plan, however, is the same each time the statement is
run, based on the plan's hash value.
The timing of 11,000 seconds was for the plan posted; i.e., both were
gathered a month ago. The 1,509,287,736 buffer gets were
associated with that insert, and not with the whole instance. This was
the second-worst case for that particular PeopleSoft run. The next time
the insert was called, it took over 1.8 billion gets to run the same
insert statement with the same plan, inserting the same number of rows.
The 1.5 billion gets are for the second insert execution and the 1.8 billion
for the third.
I should not have just stuck with last month's information, because there
was an assumption that I was trying to conflate the two runs. I apologize that
my post caused such confusion.
=====================================================================
PeopleSoft inserts and deletes rows into what it calls temporary tables,
which are regular table segments. The number of rows in these tables can
obviously have an impact on the time it takes to run a statement. There
are several times during the month in which the same statement using the
same plan runs without a problem. The problem only occurs at month's end.
The table being inserted into is a "temporary" table. These temporary
tables may also be truncated. In examining the PeopleSoft trace I saw
no indication of truncation of this table, although other tables were
truncated. However, some process either deletes the rows or truncates the
table, because the number of rows is currently 0.
========================================================================
I posted the library cache pin query as a general question, as I was
surprised to see it at all when querying v$session waits. I don't believe it
is associated with a truncate; I could find no associated truncate statement. I was
wondering if this could be RAC related.
I hope this clears up any confusion. I renew my request for help.
Ian A. MacGregor
SLAC National Accelerator Laboratory
Computing Division
To offer the best IT service at the lab and be the IT provider of choice.
------------------------------
*Sent:* Sunday, December 2, 2018 3:15:00 PM
*To:* Tim Gorman
*Cc:* ORACLE-L
*Subject:* Re: Tuning Self-referencing Inserts
The table is very small, about 64K.
What's more important is how much of the table is being read and how often.
Here is an explain plan from a month ago.
An explain plan from a month ago could be very misleading; you seem to
have access to the relevant privileges, so you should be looking at the real
execution plan in memory using
select * from table(dbms_xplan.display_cursor(sql_id=><the sql_id>));
The actual explain plan tells us the query is more complicated than a
simple select * from table.
This should be taken with a pinch of salt, as it's an explain plan (so
unreliable) from a month ago (so the stats will have changed).
We can see that most of the cost in the query is in the index range scan
on line 10. It's not often you see an index range scan cost over 1,000, and
it usually suggests you are using a filter with poor selectivity against
the index (reading lots of the index). This is quite a frequent mistake I see: just
because an index is being used, it doesn't mean it's being used well. The
predicates section of the plan will tell you exactly what's going on; you
should always include it (and the notes) when sharing the explain/execution
plan.
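For example, a sketch that asks DBMS_XPLAN for those sections explicitly
(the sql_id is a placeholder):

select *
from   table(dbms_xplan.display_cursor(
         sql_id          => '&sql_id',
         cursor_child_no => null,
         format          => 'TYPICAL +PREDICATE +NOTE'));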
We can see that there are a few scalar subqueries being used. This tends
to be an indicator of rushed SQL that can probably be rewritten to allow
the CBO full power to execute the statement effectively.
The more rows you have that drive these scalar subqueries, the more times
they get run and the more work that needs to be done.
It took just over 11,000 seconds to insert 25,000 rows.
Yeah, that is slow! It's certainly 10,000 times slower than just inserting 25,000 rows.
There were 1,509,287,736 buffer gets associated with the instance.
That, as a statement, is not helpful; we care about this statement, not the
instance as a whole (and with no indication of how long this was over).
There is something definitely wrong. This is a 2-node RAC system.
The likely truth is that the SQL is bad and it has a hugely inefficient
execution plan as a result.
FYI the insert is called more than once. It runs pretty well the first
time then gets dramatically worse.
That's no surprise: those scalar subqueries are reading from that table,
and for every row that gets found by one filter you seem to
read a lot of another index.
The above is the second-worst case from about a month ago. There doesn't
seem to be anything wrong with the SQL.
I would strongly disagree. Yes, you can probably make the statement more
efficient with more appropriate indexes, but rewriting your scalar
subqueries to better handle the volume would do wonders.
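To illustrate the kind of rewrite meant here (hypothetical tables t and d,
not the actual PeopleSoft SQL): a per-row scalar subquery such as

select t.id,
       (select max(d.amt) from d where d.id = t.id) as max_amt
from   t;

can usually be turned into a join to a pre-aggregated inline view, so the
aggregate is computed once per key rather than once per driving row:

select t.id, a.max_amt
from   t
       left join (select id, max(amt) as max_amt
                  from   d
                  group  by id) a
         on a.id = t.id;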
For the latest run, I decided to do periodic probes of v$session wait.
It sounds like v$session_event is what you're after; it stores aggregated
information on all the waits the session has experienced in its lifetime.
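For example (a sketch; :sid is the target session's SID):

select event, total_waits,
       round(time_waited_micro / 1e6, 1) as seconds_waited
from   v$session_event
where  sid = :sid
order  by time_waited_micro desc;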
Each time I did I saw the library cache pin wait all with "waited short
time". V$session did not show any blocking session. Neither did the
query tailored to find blockers for this wait. I did a system state dump
...
...
Actually, we can see that these waits were experienced a long time ago;
since then the session has just been on CPU (probably due to the
loopy reads of a lot of index blocks which are highly cached).
...Here are the RAC stats from the run done about a month ago. This
covers a 10 hour period
...
When you are looking at things at the instance level, it is very difficult
to see the wood from the trees. Scoping is everything when it comes to
performance investigations - just like you wouldn't do an AWR report over
24 hours when you care about a process that only runs at midday for 5
minutes, you shouldn't use a full instance AWR report for something running
in one session.
Instead, you can manually query v$active_session_history with filters to
properly drill down. You can even join from v$active_session_history to
v$sql_plan to see exactly what step of the plan was being executed most of
the time.
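A sketch of that drill-down (the sql_id and the 30-minute window are
placeholders; ASH requires the Diagnostics Pack, and if several child
cursors share the plan you may need to handle child_number as well):

select h.sql_plan_line_id,
       p.operation,
       p.object_name,
       count(*) as samples   -- each ASH sample is roughly one second of DB time
from   v$active_session_history h
       left join v$sql_plan p
         on  p.sql_id          = h.sql_id
         and p.plan_hash_value = h.sql_plan_hash_value
         and p.id              = h.sql_plan_line_id
where  h.sql_id      = '&sql_id'
and    h.sample_time > systimestamp - interval '30' minute
group  by h.sql_plan_line_id, p.operation, p.object_name
order  by samples desc;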
...
The db link wait is not associated with this process.
That is the problem with using an unscoped report: there are plenty of
irrelevant details that you'd need to ignore. Although the top event is
CPU by a long way.
Rewrite the query to not use so many scalar subqueries.
Work out why those index range scans read so much of the index.
Hope this helps,
Andy
Tanel Poder
2018-12-03 18:39:09 UTC
Permalink
Ian,

If V$SESSION.STATE is anything other than WAITING then you're on CPU
(or at least Oracle thinks you are). So WAIT*ED* SHORT TIME = ON CPU. You
were likely burning CPU doing all these logical I/Os and not waiting for a
library cache pin all this time; the pin just happened to be one unlucky
wait event that you hit just before getting back onto CPU for a long time.
Unfortunately both the OEM "session current state" page and V$SESSION itself
show the latest wait event even when the session is no longer waiting.

The wait analysis dump you pasted also shows *"is not in a wait:"* for the
session, and the library cache pin is shown in the wait history as the last
event before the session got back on CPU:

session serial #: 251
}
*is not in a wait:*
{
*last wait: 120 min 20 sec ago*

...

wait *history*:
1. event: 'library cache pin'
time waited: 0.000*158* sec
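
A quick way to check this directly (a sketch; :sid is the SID of the
session of interest):

select sid, state, event,
       case
         when state = 'WAITING' then 'waiting on ' || event
         else 'ON CPU (last event waited for: ' || event || ')'
       end as current_activity
from   v$session
where  sid = :sid;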

--
Tanel Poder
https://blog.tanelpoder.com


l***@bluewin.ch
2018-12-03 11:57:41 UTC
Permalink
Hi,
the estimates in this plan are incorrect. If they were correct, the statement would be much faster.
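One quick way to confirm this would be to re-run the statement with rowsource
statistics and compare estimated to actual rows, e.g. (a sketch; the hint and
format are standard, the statement itself is a placeholder):

-- re-run the problem statement with rowsource statistics enabled, e.g.:
-- insert /*+ gather_plan_statistics */ into ps_psa_acctdstgl4 (...) select ... ;

-- then, from the same session, compare E-Rows (estimated) to A-Rows (actual):
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));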
Regards
Lothar