Discussion:
Looking for ideas on blocked sessions updating partitioned table with CLOB
Chris Taylor
2018-08-30 23:21:05 UTC
Permalink
Env: 12.1.0.2

We have a table that stores session data (base64 encoded) from web sessions.
The table has 32 HASH partitions and contains a CLOB.

The table is hash partitioned by SESSION_ID which is a 64-char string.

We have this code that executed in PHP into the Oracle database:
UPDATE sessions_table
SET session_data = :session_data // (CLOB)
WHERE session_id = :session_id;

For 95% of the sessions this runs very,very fast.

For 5% of the sessions that LOB being passed from PHP is 15MB and for some
reason PHP is really slow about transmitting those session OBJECTS over to
the database. (Verified through strace etc).

While those sessions are waiting to complete the update to the CLOB, they
end up BLOCKING other sessions that aren't trying to update the same
session id but are in the same partition.

I believe they're blocked on the CLOB on the same partition that the long
running session has open as I've confirmed that the SESS_ID being updated
is different in most cases.

So, here's my question, what performance strategy should I be investigating
for CLOBS. Should I just add more partitions and spread out the likelihood
that a session will end up in the same partition? Seems logical but I'm
not sure how CLOBS play into this.

The CLOBS are stored like this:
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes

Looking for any ideas on how to prevent a session from blocking other
non-related sessions that are updating the CLOB.

I'm trying to mitigate the blocking at the db layer while we stand up a
product such as Redis to handle the session caching at the server level
(and remove it from the db).

Chris
Stefan Knecht
2018-08-31 03:02:17 UTC
Permalink
When you say "blocking" - what event are the blocked sessions waiting on?

Also, what's your TNS config - particularly SDU sizes between the mid-tier
and the database (TNS connection string and the receiving listener)?

Are you using securefile or basicfile LOBs?

What OS is the database on and what block size are you using in the
tablespace where the lobs are stored?

On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor <
Post by Chris Taylor
Env: 12.1.0.2
We have a table that stores session data (base64 encoded) from web sessions.
The table has 32 HASH partitions and contains a CLOB.
The table is hash partitioned by SESSION_ID which is a 64-char string.
UPDATE sessions_table
SET session_data = :session_data // (CLOB)
WHERE session_id = :session_id;
For 95% of the sessions this runs very,very fast.
For 5% of the sessions that LOB being passed from PHP is 15MB and for some
reason PHP is really slow about transmitting those session OBJECTS over to
the database. (Verified through strace etc).
While those sessions are waiting to complete the update to the CLOB, they
end up BLOCKING other sessions that aren't trying to update the same
session id but are in the same partition.
I believe they're blocked on the CLOB on the same partition that the long
running session has open as I've confirmed that the SESS_ID being updated
is different in most cases.
So, here's my question, what performance strategy should I be
investigating for CLOBS. Should I just add more partitions and spread out
the likelihood that a session will end up in the same partition? Seems
logical but I'm not sure how CLOBS play into this.
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes
Looking for any ideas on how to prevent a session from blocking other
non-related sessions that are updating the CLOB.
I'm trying to mitigate the blocking at the db layer while we stand up a
product such as Redis to handle the session caching at the server level
(and remove it from the db).
Chris
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net | @zztat_oracle | fb.me/zztat | zztat.net/blog/
Chris Taylor
2018-08-31 13:23:46 UTC
Permalink
Ah, very good questions.

For the blocking issue:
1. Blocking Type = enq: TX - row lock contention, yet the SESS_IDs (PK) is
different.
2. Basicfile LOBs (I assume these were carried over from an upgrade from a
prev version)

For the PHP object transfer being slow issue:
1. TNS settings:
db server:
sqlnet.ora
---------------------------------
send_buf_size=2097152
recv_buf_size=2097152
default_sdu_size=32768
default_tdu_size=32768

tnsnames.ora
-----------------------------
Nothing on the tnsnames.ora

web/app server:
tnsnames.ora
-------------------------------
Nothing specified here either for SDU etc

2. OS = Red hat Linux 6.8 64-bit , kernel 2.6.32-642.el6.x86_64

3. Block Size = 8192

Thanks,
Chris
Post by Stefan Knecht
When you say "blocking" - what event are the blocked sessions waiting on?
Also, what's your TNS config - particularly SDU sizes between the mid-tier
and the database (TNS connection string and the receiving listener)?
Are you using securefile or basicfile LOBs?
What OS is the database on and what block size are you using in the
tablespace where the lobs are stored?
On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor <
Post by Chris Taylor
Env: 12.1.0.2
We have a table that stores session data (base64 encoded) from web sessions.
The table has 32 HASH partitions and contains a CLOB.
The table is hash partitioned by SESSION_ID which is a 64-char string.
UPDATE sessions_table
SET session_data = :session_data // (CLOB)
WHERE session_id = :session_id;
For 95% of the sessions this runs very,very fast.
For 5% of the sessions that LOB being passed from PHP is 15MB and for
some reason PHP is really slow about transmitting those session OBJECTS
over to the database. (Verified through strace etc).
While those sessions are waiting to complete the update to the CLOB, they
end up BLOCKING other sessions that aren't trying to update the same
session id but are in the same partition.
I believe they're blocked on the CLOB on the same partition that the long
running session has open as I've confirmed that the SESS_ID being updated
is different in most cases.
So, here's my question, what performance strategy should I be
investigating for CLOBS. Should I just add more partitions and spread out
the likelihood that a session will end up in the same partition? Seems
logical but I'm not sure how CLOBS play into this.
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes
Looking for any ideas on how to prevent a session from blocking other
non-related sessions that are updating the CLOB.
I'm trying to mitigate the blocking at the db layer while we stand up a
product such as Redis to handle the session caching at the server level
(and remove it from the db).
Chris
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Jonathan Lewis
2018-08-31 13:55:41 UTC
Permalink
There is a major problem with basicfile LOBs (with similar, but not so drastic symptoms appearing for securefile LOBs), so as a first strategic step you should probably be planning to find a painless way to move to securefile LOBs.

I've written an entire system about a problem with basicfile LOBs when you do inserts and deletes - but the problem is the same if you do lots of updates because for a LOB an update is a delete followed by an insert (with the old LOB left in the LOB segment rather than being copied to the UNDO segment). The way LOBs handle deletes is that the LOB index is a two-part index of which the first part indexes in time order the chunks of LOBs that have been deleted and the second part indexes by lobid the current LOB chunks.

A problem appears when you try to insert a LOB and there isn't any free space in the segment, but there is plenty of freeable space. Your session will free up all the freeable space (deleting LOB index entries) as it goes. This can take a very long time and while it's happening any other session that want to insert a LOB will wait for your session to finish freeing up the space.

Unfortunately I've only see HW enqueues appearing as a result of this action, I don't think I've seen TX enqueues (and I wouldn't really expect to see the "enq TX - row lock contention" enqueue). THe series start at this URL: http://jonathanlewis.wordpress.com/2017/01/26/basicfile-lobs/

Different possibility - is the LOB declared with multiple freepools ? This is the default and it means the LOBindex has only the two parts I describe above. If you have N freepools then the index consistents of N pairs of parts. This doesn't avoid the problem I've described above (your session will try to free ALL the free space from ALL the pools if it needs space), but having a single freepool may explain your observations. If I insert a very large LOB PERHAPS you can't insert one until I finish insert LOBIndex entries into my bit of index, in which case maybe you'd show a row wait of some sort - though, again I'd expect a different wait to appear (maybe buffer busy).

Another thought - your LOBs enable in-row storage: how many of them are short enough to fit in the row (a few, lots, most) and do they get updated many times in situ, and are many of them likely to start short and then grow a few times before getting too long ? How long are the rest of the columns in the row ? I'm thinking about the possibility that your smaller LOBs spend some time growing in-row and causing row migration before they get long enough to move into the LOB segment - and row migraion does funny things with ITLs. Maybe something odd happens as a row that's migrated moves its LOB to the LOB segment and the row migrates back to it's original block.

---

In the absence of answers - a plan for the future might be to range partition by sess_id and hash subpartiiton by sess_id: I'm assuming that sess_id is a value that's going to increase with time when I say that. If that is the case then the benefit of doing this composite partitioning is that on a regular basis you get a new partition and all the action moves into it and any garbage that's happened in the earlier partition can be cleared up (or dropped).

If you stick with basicfiles then look at freepools just in case it's relevant - securefile lobs automatically have a better strategy for concurrency but there is a parameter you can set of increase concurrency. (Can't remember which it is at present).

Since I've mentioned the series - one feature of the mechanism is that the LOB segment can become much larger than it needs to be, so if you can sum() the sizes of the lobs that exceed the in-linesize and it's much smaller than segment size you'll know that your update mechanism is introducing some sort of problem.

Regards
Jonathan Lewis











Regards
Jonathan Lewis




________________________________________
From: oracle-l-***@freelists.org <oracle-l-***@freelists.org> on behalf of Chris Taylor <***@gmail.com>
Sent: 31 August 2018 14:23:46
To: ***@gmail.com
Cc: ORACLE-L
Subject: Re: Looking for ideas on blocked sessions updating partitioned table with CLOB

Ah, very good questions.

For the blocking issue:
1. Blocking Type = enq: TX - row lock contention, yet the SESS_IDs (PK) is different.
2. Basicfile LOBs (I assume these were carried over from an upgrade from a prev version)

For the PHP object transfer being slow issue:
1. TNS settings:
db server:
sqlnet.ora
---------------------------------
send_buf_size=2097152
recv_buf_size=2097152
default_sdu_size=32768
default_tdu_size=32768

tnsnames.ora
-----------------------------
Nothing on the tnsnames.ora

web/app server:
tnsnames.ora
-------------------------------
Nothing specified here either for SDU etc

2. OS = Red hat Linux 6.8 64-bit , kernel 2.6.32-642.el6.x86_64

3. Block Size = 8192

Thanks,
Chris


On Thu, Aug 30, 2018 at 10:02 PM Stefan Knecht <***@gmail.com<mailto:***@gmail.com>> wrote:
When you say "blocking" - what event are the blocked sessions waiting on?

Also, what's your TNS config - particularly SDU sizes between the mid-tier and the database (TNS connection string and the receiving listener)?

Are you using securefile or basicfile LOBs?

What OS is the database on and what block size are you using in the tablespace where the lobs are stored?

On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor <***@gmail.com<mailto:***@gmail.com>> wrote:
Env: 12.1.0.2

We have a table that stores session data (base64 encoded) from web sessions.
The table has 32 HASH partitions and contains a CLOB.

The table is hash partitioned by SESSION_ID which is a 64-char string.

We have this code that executed in PHP into the Oracle database:
UPDATE sessions_table
SET session_data = :session_data // (CLOB)
WHERE session_id = :session_id;

For 95% of the sessions this runs very,very fast.

For 5% of the sessions that LOB being passed from PHP is 15MB and for some reason PHP is really slow about transmitting those session OBJECTS over to the database. (Verified through strace etc).

While those sessions are waiting to complete the update to the CLOB, they end up BLOCKING other sessions that aren't trying to update the same session id but are in the same partition.

I believe they're blocked on the CLOB on the same partition that the long running session has open as I've confirmed that the SESS_ID being updated is different in most cases.

So, here's my question, what performance strategy should I be investigating for CLOBS. Should I just add more partitions and spread out the likelihood that a session will end up in the same partition? Seems logical but I'm not sure how CLOBS play into this.

The CLOBS are stored like this:
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes

Looking for any ideas on how to prevent a session from blocking other non-related sessions that are updating the CLOB.

I'm trying to mitigate the blocking at the db layer while we stand up a product such as Redis to handle the session caching at the server level (and remove it from the db).

Chris







--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net<http://zztat.net/> | @zztat_oracle | fb.me/zztat<http://fb.me/zztat> | zztat.net/blog/<http://zztat.net/blog/>
--
http://www.freelists.org/webpage/oracle-l
Stefan Knecht
2018-08-31 15:04:20 UTC
Permalink
You can also look at the p1/p2/p3 values of the sessions that are blocked
by the TX contention - and see which object the contention is on. If it's
the lobindex, that may lead to further clues about some of the
possibilities raised by Jonathan

Stefan
Post by Jonathan Lewis
There is a major problem with basicfile LOBs (with similar, but not so
drastic symptoms appearing for securefile LOBs), so as a first strategic
step you should probably be planning to find a painless way to move to
securefile LOBs.
I've written an entire system about a problem with basicfile LOBs when you
do inserts and deletes - but the problem is the same if you do lots of
updates because for a LOB an update is a delete followed by an insert (with
the old LOB left in the LOB segment rather than being copied to the UNDO
segment). The way LOBs handle deletes is that the LOB index is a two-part
index of which the first part indexes in time order the chunks of LOBs that
have been deleted and the second part indexes by lobid the current LOB
chunks.
A problem appears when you try to insert a LOB and there isn't any free
space in the segment, but there is plenty of freeable space. Your session
will free up all the freeable space (deleting LOB index entries) as it
goes. This can take a very long time and while it's happening any other
session that want to insert a LOB will wait for your session to finish
freeing up the space.
Unfortunately I've only see HW enqueues appearing as a result of this
action, I don't think I've seen TX enqueues (and I wouldn't really expect
to see the "enq TX - row lock contention" enqueue). THe series start at
this URL: http://jonathanlewis.wordpress.com/2017/01/26/basicfile-lobs/
Different possibility - is the LOB declared with multiple freepools ? This
is the default and it means the LOBindex has only the two parts I describe
above. If you have N freepools then the index consistents of N pairs of
parts. This doesn't avoid the problem I've described above (your session
will try to free ALL the free space from ALL the pools if it needs space),
but having a single freepool may explain your observations. If I insert a
very large LOB PERHAPS you can't insert one until I finish insert LOBIndex
entries into my bit of index, in which case maybe you'd show a row wait of
some sort - though, again I'd expect a different wait to appear (maybe
buffer busy).
Another thought - your LOBs enable in-row storage: how many of them are
short enough to fit in the row (a few, lots, most) and do they get updated
many times in situ, and are many of them likely to start short and then
grow a few times before getting too long ? How long are the rest of the
columns in the row ? I'm thinking about the possibility that your smaller
LOBs spend some time growing in-row and causing row migration before they
get long enough to move into the LOB segment - and row migraion does funny
things with ITLs. Maybe something odd happens as a row that's migrated
moves its LOB to the LOB segment and the row migrates back to it's original
block.
---
In the absence of answers - a plan for the future might be to range
partition by sess_id and hash subpartiiton by sess_id: I'm assuming that
sess_id is a value that's going to increase with time when I say that. If
that is the case then the benefit of doing this composite partitioning is
that on a regular basis you get a new partition and all the action moves
into it and any garbage that's happened in the earlier partition can be
cleared up (or dropped).
If you stick with basicfiles then look at freepools just in case it's
relevant - securefile lobs automatically have a better strategy for
concurrency but there is a parameter you can set of increase concurrency.
(Can't remember which it is at present).
Since I've mentioned the series - one feature of the mechanism is that the
LOB segment can become much larger than it needs to be, so if you can sum()
the sizes of the lobs that exceed the in-linesize and it's much smaller
than segment size you'll know that your update mechanism is introducing
some sort of problem.
Regards
Jonathan Lewis
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 14:23:46
Cc: ORACLE-L
Subject: Re: Looking for ideas on blocked sessions updating partitioned table with CLOB
Ah, very good questions.
1. Blocking Type = enq: TX - row lock contention, yet the SESS_IDs (PK) is different.
2. Basicfile LOBs (I assume these were carried over from an upgrade from a prev version)
sqlnet.ora
---------------------------------
send_buf_size=2097152
recv_buf_size=2097152
default_sdu_size=32768
default_tdu_size=32768
tnsnames.ora
-----------------------------
Nothing on the tnsnames.ora
tnsnames.ora
-------------------------------
Nothing specified here either for SDU etc
2. OS = Red hat Linux 6.8 64-bit , kernel 2.6.32-642.el6.x86_64
3. Block Size = 8192
Thanks,
Chris
When you say "blocking" - what event are the blocked sessions waiting on?
Also, what's your TNS config - particularly SDU sizes between the mid-tier
and the database (TNS connection string and the receiving listener)?
Are you using securefile or basicfile LOBs?
What OS is the database on and what block size are you using in the
tablespace where the lobs are stored?
On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor <
Env: 12.1.0.2
We have a table that stores session data (base64 encoded) from web sessions.
The table has 32 HASH partitions and contains a CLOB.
The table is hash partitioned by SESSION_ID which is a 64-char string.
UPDATE sessions_table
SET session_data = :session_data // (CLOB)
WHERE session_id = :session_id;
For 95% of the sessions this runs very,very fast.
For 5% of the sessions that LOB being passed from PHP is 15MB and for some
reason PHP is really slow about transmitting those session OBJECTS over to
the database. (Verified through strace etc).
While those sessions are waiting to complete the update to the CLOB, they
end up BLOCKING other sessions that aren't trying to update the same
session id but are in the same partition.
I believe they're blocked on the CLOB on the same partition that the long
running session has open as I've confirmed that the SESS_ID being updated
is different in most cases.
So, here's my question, what performance strategy should I be
investigating for CLOBS. Should I just add more partitions and spread out
the likelihood that a session will end up in the same partition? Seems
logical but I'm not sure how CLOBS play into this.
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes
Looking for any ideas on how to prevent a session from blocking other
non-related sessions that are updating the CLOB.
I'm trying to mitigate the blocking at the db layer while we stand up a
product such as Redis to handle the session caching at the server level
(and remove it from the db).
Chris
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
http://fb.me/zztat> | zztat.net/blog/<http://zztat.net/blog/>
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net | @zztat_oracle | fb.me/zztat | zztat.net/blog/
Chris Taylor
2018-08-31 16:38:38 UTC
Permalink
Oh - interesting. I hadn't even considered that.

Chris
Post by Stefan Knecht
You can also look at the p1/p2/p3 values of the sessions that are blocked
by the TX contention - and see which object the contention is on. If it's
the lobindex, that may lead to further clues about some of the
possibilities raised by Jonathan
Stefan
On Fri, Aug 31, 2018 at 8:55 PM, Jonathan Lewis <
Post by Jonathan Lewis
There is a major problem with basicfile LOBs (with similar, but not so
drastic symptoms appearing for securefile LOBs), so as a first strategic
step you should probably be planning to find a painless way to move to
securefile LOBs.
I've written an entire system about a problem with basicfile LOBs when
you do inserts and deletes - but the problem is the same if you do lots of
updates because for a LOB an update is a delete followed by an insert (with
the old LOB left in the LOB segment rather than being copied to the UNDO
segment). The way LOBs handle deletes is that the LOB index is a two-part
index of which the first part indexes in time order the chunks of LOBs that
have been deleted and the second part indexes by lobid the current LOB
chunks.
A problem appears when you try to insert a LOB and there isn't any free
space in the segment, but there is plenty of freeable space. Your session
will free up all the freeable space (deleting LOB index entries) as it
goes. This can take a very long time and while it's happening any other
session that want to insert a LOB will wait for your session to finish
freeing up the space.
Unfortunately I've only see HW enqueues appearing as a result of this
action, I don't think I've seen TX enqueues (and I wouldn't really expect
to see the "enq TX - row lock contention" enqueue). THe series start at
this URL: http://jonathanlewis.wordpress.com/2017/01/26/basicfile-lobs/
Different possibility - is the LOB declared with multiple freepools ?
This is the default and it means the LOBindex has only the two parts I
describe above. If you have N freepools then the index consistents of N
pairs of parts. This doesn't avoid the problem I've described above (your
session will try to free ALL the free space from ALL the pools if it needs
space), but having a single freepool may explain your observations. If I
insert a very large LOB PERHAPS you can't insert one until I finish insert
LOBIndex entries into my bit of index, in which case maybe you'd show a row
wait of some sort - though, again I'd expect a different wait to appear
(maybe buffer busy).
Another thought - your LOBs enable in-row storage: how many of them are
short enough to fit in the row (a few, lots, most) and do they get updated
many times in situ, and are many of them likely to start short and then
grow a few times before getting too long ? How long are the rest of the
columns in the row ? I'm thinking about the possibility that your smaller
LOBs spend some time growing in-row and causing row migration before they
get long enough to move into the LOB segment - and row migraion does funny
things with ITLs. Maybe something odd happens as a row that's migrated
moves its LOB to the LOB segment and the row migrates back to it's original
block.
---
In the absence of answers - a plan for the future might be to range
partition by sess_id and hash subpartiiton by sess_id: I'm assuming that
sess_id is a value that's going to increase with time when I say that. If
that is the case then the benefit of doing this composite partitioning is
that on a regular basis you get a new partition and all the action moves
into it and any garbage that's happened in the earlier partition can be
cleared up (or dropped).
If you stick with basicfiles then look at freepools just in case it's
relevant - securefile lobs automatically have a better strategy for
concurrency but there is a parameter you can set of increase concurrency.
(Can't remember which it is at present).
Since I've mentioned the series - one feature of the mechanism is that
the LOB segment can become much larger than it needs to be, so if you can
sum() the sizes of the lobs that exceed the in-linesize and it's much
smaller than segment size you'll know that your update mechanism is
introducing some sort of problem.
Regards
Jonathan Lewis
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 14:23:46
Cc: ORACLE-L
Subject: Re: Looking for ideas on blocked sessions updating partitioned table with CLOB
Ah, very good questions.
1. Blocking Type = enq: TX - row lock contention, yet the SESS_IDs (PK) is different.
2. Basicfile LOBs (I assume these were carried over from an upgrade from a prev version)
sqlnet.ora
---------------------------------
send_buf_size=2097152
recv_buf_size=2097152
default_sdu_size=32768
default_tdu_size=32768
tnsnames.ora
-----------------------------
Nothing on the tnsnames.ora
tnsnames.ora
-------------------------------
Nothing specified here either for SDU etc
2. OS = Red hat Linux 6.8 64-bit , kernel 2.6.32-642.el6.x86_64
3. Block Size = 8192
Thanks,
Chris
When you say "blocking" - what event are the blocked sessions waiting on?
Also, what's your TNS config - particularly SDU sizes between the
mid-tier and the database (TNS connection string and the receiving
listener)?
Are you using securefile or basicfile LOBs?
What OS is the database on and what block size are you using in the
tablespace where the lobs are stored?
On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor <
Env: 12.1.0.2
We have a table that stores session data (base64 encoded) from web sessions.
The table has 32 HASH partitions and contains a CLOB.
The table is hash partitioned by SESSION_ID which is a 64-char string.
UPDATE sessions_table
SET session_data = :session_data // (CLOB)
WHERE session_id = :session_id;
For 95% of the sessions this runs very,very fast.
For 5% of the sessions that LOB being passed from PHP is 15MB and for
some reason PHP is really slow about transmitting those session OBJECTS
over to the database. (Verified through strace etc).
While those sessions are waiting to complete the update to the CLOB, they
end up BLOCKING other sessions that aren't trying to update the same
session id but are in the same partition.
I believe they're blocked on the CLOB on the same partition that the long
running session has open as I've confirmed that the SESS_ID being updated
is different in most cases.
So, here's my question, what performance strategy should I be
investigating for CLOBS. Should I just add more partitions and spread out
the likelihood that a session will end up in the same partition? Seems
logical but I'm not sure how CLOBS play into this.
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes
Looking for any ideas on how to prevent a session from blocking other
non-related sessions that are updating the CLOB.
I'm trying to mitigate the blocking at the db layer while we stand up a
product such as Redis to handle the session caching at the server level
(and remove it from the db).
Chris
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
http://fb.me/zztat> | zztat.net/blog/<http://zztat.net/blog/>
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Mark W. Farnham
2018-08-31 20:32:03 UTC
Permalink
IF you have periodic slack time. (Don’t miss that IF).



and IF the quantity of updates is relatively modest between slack time windows



and IF your total size of LOBs that change is contained (such as some date based partitioning that correlates with the possibility of update.)



That is three IFs for those keeping score, then it is possible to work around the catastrophic work profile



by copying from one tablespace to another for all the partitions still possibly being updated and then use partition exchange.



This may be useful IF (fourth IF) the total bulk of your LOBs won’t fit in a slack window when you can migrate them all to securefiles LOBs, or get you to that longer window that you need to accomplish the change to the better technology.



mwf







From: oracle-l-***@freelists.org [mailto:oracle-l-***@freelists.org] On Behalf Of Chris Taylor
Sent: Friday, August 31, 2018 12:39 PM
To: ***@gmail.com
Cc: ***@jlcomp.demon.co.uk; ORACLE-L
Subject: Re: Looking for ideas on blocked sessions updating partitioned table with CLOB



Oh - interesting. I hadn't even considered that.



Chris



On Fri, Aug 31, 2018 at 10:04 AM Stefan Knecht <***@gmail.com> wrote:

You can also look at the p1/p2/p3 values of the sessions that are blocked by the TX contention - and see which object the contention is on. If it's the lobindex, that may lead to further clues about some of the possibilities raised by Jonathan



Stefan







On Fri, Aug 31, 2018 at 8:55 PM, Jonathan Lewis <***@jlcomp.demon.co.uk> wrote:


There is a major problem with basicfile LOBs (with similar, but not so drastic symptoms appearing for securefile LOBs), so as a first strategic step you should probably be planning to find a painless way to move to securefile LOBs.

I've written an entire system about a problem with basicfile LOBs when you do inserts and deletes - but the problem is the same if you do lots of updates because for a LOB an update is a delete followed by an insert (with the old LOB left in the LOB segment rather than being copied to the UNDO segment). The way LOBs handle deletes is that the LOB index is a two-part index of which the first part indexes in time order the chunks of LOBs that have been deleted and the second part indexes by lobid the current LOB chunks.

A problem appears when you try to insert a LOB and there isn't any free space in the segment, but there is plenty of freeable space. Your session will free up all the freeable space (deleting LOB index entries) as it goes. This can take a very long time and while it's happening any other session that want to insert a LOB will wait for your session to finish freeing up the space.

Unfortunately I've only see HW enqueues appearing as a result of this action, I don't think I've seen TX enqueues (and I wouldn't really expect to see the "enq TX - row lock contention" enqueue). THe series start at this URL: http://jonathanlewis.wordpress.com/2017/01/26/basicfile-lobs/

Different possibility - is the LOB declared with multiple freepools ? This is the default and it means the LOBindex has only the two parts I describe above. If you have N freepools then the index consistents of N pairs of parts. This doesn't avoid the problem I've described above (your session will try to free ALL the free space from ALL the pools if it needs space), but having a single freepool may explain your observations. If I insert a very large LOB PERHAPS you can't insert one until I finish insert LOBIndex entries into my bit of index, in which case maybe you'd show a row wait of some sort - though, again I'd expect a different wait to appear (maybe buffer busy).

Another thought - your LOBs enable in-row storage: how many of them are short enough to fit in the row (a few, lots, most) and do they get updated many times in situ, and are many of them likely to start short and then grow a few times before getting too long ? How long are the rest of the columns in the row ? I'm thinking about the possibility that your smaller LOBs spend some time growing in-row and causing row migration before they get long enough to move into the LOB segment - and row migraion does funny things with ITLs. Maybe something odd happens as a row that's migrated moves its LOB to the LOB segment and the row migrates back to it's original block.

---

In the absence of answers - a plan for the future might be to range partition by sess_id and hash subpartiiton by sess_id: I'm assuming that sess_id is a value that's going to increase with time when I say that. If that is the case then the benefit of doing this composite partitioning is that on a regular basis you get a new partition and all the action moves into it and any garbage that's happened in the earlier partition can be cleared up (or dropped).

If you stick with basicfiles then look at freepools just in case it's relevant - securefile lobs automatically have a better strategy for concurrency but there is a parameter you can set of increase concurrency. (Can't remember which it is at present).

Since I've mentioned the series - one feature of the mechanism is that the LOB segment can become much larger than it needs to be, so if you can sum() the sizes of the lobs that exceed the in-linesize and it's much smaller than segment size you'll know that your update mechanism is introducing some sort of problem.

Regards
Jonathan Lewis











Regards
Jonathan Lewis




________________________________________
From: oracle-l-***@freelists.org <oracle-l-***@freelists.org> on behalf of Chris Taylor <***@gmail.com>
Sent: 31 August 2018 14:23:46
To: ***@gmail.com
Cc: ORACLE-L
Subject: Re: Looking for ideas on blocked sessions updating partitioned table with CLOB

Ah, very good questions.

For the blocking issue:
1. Blocking Type = enq: TX - row lock contention, yet the SESS_IDs (PK) is different.
2. Basicfile LOBs (I assume these were carried over from an upgrade from a prev version)

For the PHP object transfer being slow issue:
1. TNS settings:
db server:
sqlnet.ora
---------------------------------
send_buf_size=2097152
recv_buf_size=2097152
default_sdu_size=32768
default_tdu_size=32768

tnsnames.ora
-----------------------------
Nothing on the tnsnames.ora

web/app server:
tnsnames.ora
-------------------------------
Nothing specified here either for SDU etc

2. OS = Red hat Linux 6.8 64-bit , kernel 2.6.32-642.el6.x86_64

3. Block Size = 8192

Thanks,
Chris


On Thu, Aug 30, 2018 at 10:02 PM Stefan Knecht <***@gmail.com<mailto:***@gmail.com>> wrote:
When you say "blocking" - what event are the blocked sessions waiting on?

Also, what's your TNS config - particularly SDU sizes between the mid-tier and the database (TNS connection string and the receiving listener)?

Are you using securefile or basicfile LOBs?

What OS is the database on and what block size are you using in the tablespace where the lobs are stored?

On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor <***@gmail.com<mailto:***@gmail.com>> wrote:
Env: 12.1.0.2

We have a table that stores session data (base64 encoded) from web sessions.
The table has 32 HASH partitions and contains a CLOB.

The table is hash partitioned by SESSION_ID which is a 64-char string.

We have this code that executed in PHP into the Oracle database:
UPDATE sessions_table
SET session_data = :session_data // (CLOB)
WHERE session_id = :session_id;

For 95% of the sessions this runs very,very fast.

For 5% of the sessions that LOB being passed from PHP is 15MB and for some reason PHP is really slow about transmitting those session OBJECTS over to the database. (Verified through strace etc).

While those sessions are waiting to complete the update to the CLOB, they end up BLOCKING other sessions that aren't trying to update the same session id but are in the same partition.

I believe they're blocked on the CLOB on the same partition that the long running session has open as I've confirmed that the SESS_ID being updated is different in most cases.

So, here's my question, what performance strategy should I be investigating for CLOBS. Should I just add more partitions and spread out the likelihood that a session will end up in the same partition? Seems logical but I'm not sure how CLOBS play into this.

The CLOBS are stored like this:
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes

Looking for any ideas on how to prevent a session from blocking other non-related sessions that are updating the CLOB.

I'm trying to mitigate the blocking at the db layer while we stand up a product such as Redis to handle the session caching at the server level (and remove it from the db).

Chris







--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!

Visit us at zztat.net<http://zztat.net/> | @zztat_oracle | fb.me/zztat<http://fb.me/zztat> | zztat.net/blog/<http://zztat.net/blog/>
--
//

zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!

Visit us at <http://zztat.net/> zztat.net | @zztat_oracle | fb.me/zztat | zztat.net/blog/
Chris Taylor
2018-09-18 14:22:26 UTC
Permalink
So, I need to own up to something.

My whole approach to this problem was wrong. I had been told that the
problem with this table was the CLOB and the number of sessions updating
the table at the same time. So, my whole approach was based on that
information being TRUE.

I was blinded to the 'real' problem and wasn't actively looking for an
alternative answer until I posted this thread. Due to the discussion here,
I took a few steps back and actually asked myself "why would this be row
locks/blocks?" and WHY don't I see internal locking on the clob if the clob
is the problem?

At that point I was in 'question everything' mode and started really
looking at the Event that the blocking session was sitting in (instead of
just assuming it was a problem with design).

The blocking sessions were ALWAYS sitting in EVENT: SQL*Net MORE DATA FROM
CLIENT and WAIT_CLASS: NETWORK and the wait times were reaching upward of 2
minutes before the wait_time_micro restarted counting.

When I saw that (and really looked at it), I was like that makes no sense.
Why would a session be getting latency from the PHP server? (Why is it
taking so long to send an object from the PHP server to the database?)

Running strace on the sessions on both the db server and the php server
showed the sessions just sitting in either read() or write() to the network
socket and never completing.

Large file transfers INTO the main db server would stall. (Transfers OUT
of the db server were fine) - this affected SQLNet, SCP, ncat etc.
Outgoing was fine, Inbound was terrible.

Ultimately we discovered these 2 settings had been DISABLED on this box at
some point (as its been around a while):

/proc/sys/net/ipv4/tcp_sack

/proc/sys/net/ipv4/tcp_timestamps

Both of those were "0" whereas all our other db servers those were "1" and
weren't experiencing any problem.

After setting both of those to "1" , our sessions now complete and no
longer block.

I know some of you may be thinking "How could he miss all the network
waits?" - And that's a good question. All I can say is sometimes you get
"blinded" by assumptions and aren't really seeing the problem for what it
is.

I've known this a long, long time. So, remember kiddos you may have to
question 'what you know versus what you think you know' ;)

Also, the very fact of sharing this problem with the you on the list gave
me the ability to step back and look at it with a new mindset.

Anyway, I wanted to share that as a testament (and a huge thank you) to the
community we have here.

Thanks,
Chris
If the event is "enq TX: row lock wait" the p1/p2/p3 values will be about
the TX lock in v$lock. p2 = id1, p3 = id2, and p1 will encode "TX" and the
lock mode.
Select p1raw (if its available) or to_char(p1,'XXXXXXXXXXXXXXXX') and
you'll get 00000000054580006 - or possibly a 4 on the end. If it's a mode 4
lock then that's a clue that it's not really about a table row and more
likely to be about an index/IOT or some internal anomaly.
If you've got the ASH data then current_obj# may give you the object_id of
the object being accessed, but it's not entirely reliable.
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 16:04:20
To: Jonathan Lewis
Subject: Re: Looking for ideas on blocked sessions updating partitioned table with CLOB
You can also look at the p1/p2/p3 values of the sessions that are blocked
by the TX contention - and see which object the contention is on. If it's
the lobindex, that may lead to further clues about some of the
possibilities raised by Jonathan
Stefan
On Fri, Aug 31, 2018 at 8:55 PM, Jonathan Lewis <
There is a major problem with basicfile LOBs (with similar, but not so
drastic symptoms appearing for securefile LOBs), so as a first strategic
step you should probably be planning to find a painless way to move to
securefile LOBs.
I've written an entire system about a problem with basicfile LOBs when you
do inserts and deletes - but the problem is the same if you do lots of
updates because for a LOB an update is a delete followed by an insert (with
the old LOB left in the LOB segment rather than being copied to the UNDO
segment). The way LOBs handle deletes is that the LOB index is a two-part
index of which the first part indexes in time order the chunks of LOBs that
have been deleted and the second part indexes by lobid the current LOB
chunks.
A problem appears when you try to insert a LOB and there isn't any free
space in the segment, but there is plenty of freeable space. Your session
will free up all the freeable space (deleting LOB index entries) as it
goes. This can take a very long time and while it's happening any other
session that want to insert a LOB will wait for your session to finish
freeing up the space.
Unfortunately I've only see HW enqueues appearing as a result of this
action, I don't think I've seen TX enqueues (and I wouldn't really expect
to see the "enq TX - row lock contention" enqueue). THe series start at
this URL: http://jonathanlewis.wordpress.com/2017/01/26/basicfile-lobs/
Different possibility - is the LOB declared with multiple freepools ? This
is the default and it means the LOBindex has only the two parts I describe
above. If you have N freepools then the index consistents of N pairs of
parts. This doesn't avoid the problem I've described above (your session
will try to free ALL the free space from ALL the pools if it needs space),
but having a single freepool may explain your observations. If I insert a
very large LOB PERHAPS you can't insert one until I finish insert LOBIndex
entries into my bit of index, in which case maybe you'd show a row wait of
some sort - though, again I'd expect a different wait to appear (maybe
buffer busy).
Another thought - your LOBs enable in-row storage: how many of them are
short enough to fit in the row (a few, lots, most) and do they get updated
many times in situ, and are many of them likely to start short and then
grow a few times before getting too long ? How long are the rest of the
columns in the row ? I'm thinking about the possibility that your smaller
LOBs spend some time growing in-row and causing row migration before they
get long enough to move into the LOB segment - and row migraion does funny
things with ITLs. Maybe something odd happens as a row that's migrated
moves its LOB to the LOB segment and the row migrates back to it's original
block.
---
In the absence of answers - a plan for the future might be to range
partition by sess_id and hash subpartiiton by sess_id: I'm assuming that
sess_id is a value that's going to increase with time when I say that. If
that is the case then the benefit of doing this composite partitioning is
that on a regular basis you get a new partition and all the action moves
into it and any garbage that's happened in the earlier partition can be
cleared up (or dropped).
If you stick with basicfiles then look at freepools just in case it's
relevant - securefile lobs automatically have a better strategy for
concurrency but there is a parameter you can set of increase concurrency.
(Can't remember which it is at present).
Since I've mentioned the series - one feature of the mechanism is that the
LOB segment can become much larger than it needs to be, so if you can sum()
the sizes of the lobs that exceed the in-linesize and it's much smaller
than segment size you'll know that your update mechanism is introducing
some sort of problem.
Regards
Jonathan Lewis
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 14:23:46
Cc: ORACLE-L
Subject: Re: Looking for ideas on blocked sessions updating partitioned table with CLOB
Ah, very good questions.
1. Blocking Type = enq: TX - row lock contention, yet the SESS_IDs (PK) is different.
2. Basicfile LOBs (I assume these were carried over from an upgrade from a prev version)
sqlnet.ora
---------------------------------
send_buf_size=2097152
recv_buf_size=2097152
default_sdu_size=32768
default_tdu_size=32768
tnsnames.ora
-----------------------------
Nothing on the tnsnames.ora
tnsnames.ora
-------------------------------
Nothing specified here either for SDU etc
2. OS = Red hat Linux 6.8 64-bit , kernel 2.6.32-642.el6.x86_64
3. Block Size = 8192
Thanks,
Chris
When you say "blocking" - what event are the blocked sessions waiting on?
Also, what's your TNS config - particularly SDU sizes between the mid-tier
and the database (TNS connection string and the receiving listener)?
Are you using securefile or basicfile LOBs?
What OS is the database on and what block size are you using in the
tablespace where the lobs are stored?
On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor <
Env: 12.1.0.2
We have a table that stores session data (base64 encoded) from web sessions.
The table has 32 HASH partitions and contains a CLOB.
The table is hash partitioned by SESSION_ID which is a 64-char string.
UPDATE sessions_table
SET session_data = :session_data // (CLOB)
WHERE session_id = :session_id;
For 95% of the sessions this runs very,very fast.
For 5% of the sessions that LOB being passed from PHP is 15MB and for some
reason PHP is really slow about transmitting those session OBJECTS over to
the database. (Verified through strace etc).
While those sessions are waiting to complete the update to the CLOB, they
end up BLOCKING other sessions that aren't trying to update the same
session id but are in the same partition.
I believe they're blocked on the CLOB on the same partition that the long
running session has open as I've confirmed that the SESS_ID being updated
is different in most cases.
So, here's my question, what performance strategy should I be
investigating for CLOBS. Should I just add more partitions and spread out
the likelihood that a session will end up in the same partition? Seems
logical but I'm not sure how CLOBS play into this.
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes
Looking for any ideas on how to prevent a session from blocking other
non-related sessions that are updating the CLOB.
I'm trying to mitigate the blocking at the db layer while we stand up a
product such as Redis to handle the session caching at the server level
(and remove it from the db).
Chris
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net<http://zztat.net><http://zztat.net/> |
@zztat_oracle | fb.me/zztat<http://fb.me/zztat><http://fb.me/zztat> |
zztat.net/blog/<http://zztat.net/blog/><http://zztat.net/blog/>
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
http://fb.me/zztat> | zztat.net/blog/<http://zztat.net/blog/>
Chris Stephens
2018-09-18 14:35:21 UTC
Permalink
great follow-up chris!

On Tue, Sep 18, 2018 at 9:24 AM Chris Taylor <
Post by Chris Taylor
So, I need to own up to something.
My whole approach to this problem was wrong. I had been told that the
problem with this table was the CLOB and the number of sessions updating
the table at the same time. So, my whole approach was based on that
information being TRUE.
I was blinded to the 'real' problem and wasn't actively looking for an
alternative answer until I posted this thread. Due to the discussion here,
I took a few steps back and actually asked myself "why would this be row
locks/blocks?" and WHY don't I see internal locking on the clob if the clob
is the problem?
At that point I was in 'question everything' mode and started really
looking at the Event that the blocking session was sitting in (instead of
just assuming it was a problem with design).
The blocking sessions were ALWAYS sitting in EVENT: SQL*Net MORE DATA FROM
CLIENT and WAIT_CLASS: NETWORK and the wait times were reaching upward of 2
minutes before the wait_time_micro restarted counting.
When I saw that (and really looked at it), I was like that makes no
sense. Why would a session be getting latency from the PHP server? (Why
is it taking so long to send an object from the PHP server to the database?)
Running strace on the sessions on both the db server and the php server
showed the sessions just sitting in either read() or write() to the network
socket and never completing.
Large file transfers INTO the main db server would stall. (Transfers OUT
of the db server were fine) - this affected SQLNet, SCP, ncat etc.
Outgoing was fine, Inbound was terrible.
Ultimately we discovered these 2 settings had been DISABLED on this box at
/proc/sys/net/ipv4/tcp_sack
/proc/sys/net/ipv4/tcp_timestamps
Both of those were "0" whereas all our other db servers those were "1" and
weren't experiencing any problem.
After setting both of those to "1" , our sessions now complete and no
longer block.
I know some of you may be thinking "How could he miss all the network
waits?" - And that's a good question. All I can say is sometimes you get
"blinded" by assumptions and aren't really seeing the problem for what it
is.
I've known this a long, long time. So, remember kiddos you may have to
question 'what you know versus what you think you know' ;)
Also, the very fact of sharing this problem with the you on the list gave
me the ability to step back and look at it with a new mindset.
Anyway, I wanted to share that as a testament (and a huge thank you) to
the community we have here.
Thanks,
Chris
On Fri, Aug 31, 2018 at 11:48 AM Jonathan Lewis <
If the event is "enq TX: row lock wait" the p1/p2/p3 values will be about
the TX lock in v$lock. p2 = id1, p3 = id2, and p1 will encode "TX" and the
lock mode.
Select p1raw (if its available) or to_char(p1,'XXXXXXXXXXXXXXXX') and
you'll get 00000000054580006 - or possibly a 4 on the end. If it's a mode 4
lock then that's a clue that it's not really about a table row and more
likely to be about an index/IOT or some internal anomaly.
If you've got the ASH data then current_obj# may give you the object_id
of the object being accessed, but it's not entirely reliable.
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 16:04:20
To: Jonathan Lewis
Subject: Re: Looking for ideas on blocked sessions updating partitioned table with CLOB
You can also look at the p1/p2/p3 values of the sessions that are blocked
by the TX contention - and see which object the contention is on. If it's
the lobindex, that may lead to further clues about some of the
possibilities raised by Jonathan
Stefan
On Fri, Aug 31, 2018 at 8:55 PM, Jonathan Lewis <
There is a major problem with basicfile LOBs (with similar, but not so
drastic symptoms appearing for securefile LOBs), so as a first strategic
step you should probably be planning to find a painless way to move to
securefile LOBs.
I've written an entire system about a problem with basicfile LOBs when
you do inserts and deletes - but the problem is the same if you do lots of
updates because for a LOB an update is a delete followed by an insert (with
the old LOB left in the LOB segment rather than being copied to the UNDO
segment). The way LOBs handle deletes is that the LOB index is a two-part
index of which the first part indexes in time order the chunks of LOBs that
have been deleted and the second part indexes by lobid the current LOB
chunks.
A problem appears when you try to insert a LOB and there isn't any free
space in the segment, but there is plenty of freeable space. Your session
will free up all the freeable space (deleting LOB index entries) as it
goes. This can take a very long time and while it's happening any other
session that want to insert a LOB will wait for your session to finish
freeing up the space.
Unfortunately I've only see HW enqueues appearing as a result of this
action, I don't think I've seen TX enqueues (and I wouldn't really expect
to see the "enq TX - row lock contention" enqueue). THe series start at
this URL: http://jonathanlewis.wordpress.com/2017/01/26/basicfile-lobs/
Different possibility - is the LOB declared with multiple freepools ?
This is the default and it means the LOBindex has only the two parts I
describe above. If you have N freepools then the index consistents of N
pairs of parts. This doesn't avoid the problem I've described above (your
session will try to free ALL the free space from ALL the pools if it needs
space), but having a single freepool may explain your observations. If I
insert a very large LOB PERHAPS you can't insert one until I finish insert
LOBIndex entries into my bit of index, in which case maybe you'd show a row
wait of some sort - though, again I'd expect a different wait to appear
(maybe buffer busy).
Another thought - your LOBs enable in-row storage: how many of them are
short enough to fit in the row (a few, lots, most) and do they get updated
many times in situ, and are many of them likely to start short and then
grow a few times before getting too long ? How long are the rest of the
columns in the row ? I'm thinking about the possibility that your smaller
LOBs spend some time growing in-row and causing row migration before they
get long enough to move into the LOB segment - and row migraion does funny
things with ITLs. Maybe something odd happens as a row that's migrated
moves its LOB to the LOB segment and the row migrates back to it's original
block.
---
In the absence of answers - a plan for the future might be to range
partition by sess_id and hash subpartiiton by sess_id: I'm assuming that
sess_id is a value that's going to increase with time when I say that. If
that is the case then the benefit of doing this composite partitioning is
that on a regular basis you get a new partition and all the action moves
into it and any garbage that's happened in the earlier partition can be
cleared up (or dropped).
If you stick with basicfiles then look at freepools just in case it's
relevant - securefile lobs automatically have a better strategy for
concurrency but there is a parameter you can set of increase concurrency.
(Can't remember which it is at present).
Since I've mentioned the series - one feature of the mechanism is that
the LOB segment can become much larger than it needs to be, so if you can
sum() the sizes of the lobs that exceed the in-linesize and it's much
smaller than segment size you'll know that your update mechanism is
introducing some sort of problem.
Regards
Jonathan Lewis
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 14:23:46
Cc: ORACLE-L
Subject: Re: Looking for ideas on blocked sessions updating partitioned table with CLOB
Ah, very good questions.
1. Blocking Type = enq: TX - row lock contention, yet the SESS_IDs (PK) is different.
2. Basicfile LOBs (I assume these were carried over from an upgrade from a prev version)
sqlnet.ora
---------------------------------
send_buf_size=2097152
recv_buf_size=2097152
default_sdu_size=32768
default_tdu_size=32768
tnsnames.ora
-----------------------------
Nothing on the tnsnames.ora
tnsnames.ora
-------------------------------
Nothing specified here either for SDU etc
2. OS = Red hat Linux 6.8 64-bit , kernel 2.6.32-642.el6.x86_64
3. Block Size = 8192
Thanks,
Chris
When you say "blocking" - what event are the blocked sessions waiting on?
Also, what's your TNS config - particularly SDU sizes between the
mid-tier and the database (TNS connection string and the receiving
listener)?
Are you using securefile or basicfile LOBs?
What OS is the database on and what block size are you using in the
tablespace where the lobs are stored?
On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor <
Env: 12.1.0.2
We have a table that stores session data (base64 encoded) from web sessions.
The table has 32 HASH partitions and contains a CLOB.
The table is hash partitioned by SESSION_ID which is a 64-char string.
UPDATE sessions_table
SET session_data = :session_data // (CLOB)
WHERE session_id = :session_id;
For 95% of the sessions this runs very,very fast.
For 5% of the sessions that LOB being passed from PHP is 15MB and for
some reason PHP is really slow about transmitting those session OBJECTS
over to the database. (Verified through strace etc).
While those sessions are waiting to complete the update to the CLOB, they
end up BLOCKING other sessions that aren't trying to update the same
session id but are in the same partition.
I believe they're blocked on the CLOB on the same partition that the long
running session has open as I've confirmed that the SESS_ID being updated
is different in most cases.
So, here's my question, what performance strategy should I be
investigating for CLOBS. Should I just add more partitions and spread out
the likelihood that a session will end up in the same partition? Seems
logical but I'm not sure how CLOBS play into this.
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes
Looking for any ideas on how to prevent a session from blocking other
non-related sessions that are updating the CLOB.
I'm trying to mitigate the blocking at the db layer while we stand up a
product such as Redis to handle the session caching at the server level
(and remove it from the db).
Chris
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net<http://zztat.net><http://zztat.net/> |
@zztat_oracle | fb.me/zztat<http://fb.me/zztat><http://fb.me/zztat> |
zztat.net/blog/<http://zztat.net/blog/><http://zztat.net/blog/>
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
http://fb.me/zztat> | zztat.net/blog/<http://zztat.net/blog/>
Tim Gorman
2018-09-18 14:41:21 UTC
Permalink
Chris,

You have a gift for telling the story.  This could make an amazing
newsletter article or a fantastic presentation that you'll enjoy
creating and presenting.  The fun part of such an article or
presentation would be explaining the underlying technology, so that the
reader/audience enjoys the punchline as much as we have.

For an article in a newsletter, please consider RMOUG SQL>Update
<http://www.rmoug.org/newsletter/>, the NoCOUG Journal
<http://nocoug.org/newsletter.html>, IOUG Select
<http://select.ioug.org/>, or UKOUG Oracle Scene
<http://www.ukoug.org/membership-new/member-activities/oracle-scene/>? 
There might be others I have overlooked, for which I apologize.

Hope this helps!

-Tim
Post by Chris Taylor
So, I need to own up to something.
My whole approach to this problem was wrong.  I had been told that the
problem with this table was the CLOB and the number of sessions
updating the table at the same time.  So, my whole approach was based
on that information being TRUE.
I was blinded to the 'real' problem and wasn't actively looking for an
alternative answer until I posted this thread.  Due to the discussion
here, I took a few steps back and actually asked myself "why would
this be row locks/blocks?" and WHY don't I see internal locking on the
clob if the clob is the problem?
At that point I was in 'question everything' mode and started really
looking at the Event that the blocking session was sitting in (instead
of just assuming it was a problem with design).
The blocking sessions were ALWAYS sitting in EVENT: SQL*Net MORE DATA
FROM CLIENT and WAIT_CLASS: NETWORK and the wait times were reaching
upward of 2 minutes before the wait_time_micro restarted counting.
When I saw that (and really looked at it), I was like that makes no
sense. Why would a session be getting latency from the PHP server?
(Why is it taking so long to send an object from the PHP server to the
database?)
Running strace on the sessions on both the db server and the php
server showed the sessions just sitting in either read() or write() to
the network socket and never completing.
Large file transfers INTO the main db server would stall.  (Transfers
OUT of the db server were fine) - this affected SQLNet, SCP, ncat
etc.  Outgoing was fine, Inbound was terrible.
Ultimately we discovered these 2 settings had been DISABLED on this
/proc/sys/net/ipv4/tcp_sack
/proc/sys/net/ipv4/tcp_timestamps
Both of those were "0" whereas all our other db servers those were "1"
and weren't experiencing any problem.
After setting both of those to "1" , our sessions now complete and no
longer block.
I know some of you may be thinking "How could he miss all the network
waits?"  - And that's a good question.  All I can say is sometimes you
get "blinded" by assumptions and aren't really seeing the problem for
what it is.
I've known this a long, long time.  So, remember kiddos you may have
to question 'what you know versus what you think you know' ;)
Also, the very fact of sharing this problem with the you on the list
gave me the ability to step back and look at it with a new mindset.
Anyway, I wanted to share that as a testament (and a huge thank you)
to the community we have here.
Thanks,
Chris
On Fri, Aug 31, 2018 at 11:48 AM Jonathan Lewis
If the event is "enq TX: row lock wait" the p1/p2/p3 values will
be about the TX lock in v$lock.  p2 = id1, p3 = id2, and p1 will
encode "TX" and the lock mode.
Select p1raw (if its available) or to_char(p1,'XXXXXXXXXXXXXXXX')
and you'll get 00000000054580006 - or possibly a 4 on the end. If
it's a mode 4 lock then that's a clue that it's not really about a
table row and more likely to be about an index/IOT or some
internal anomaly.
If you've got the ASH data then current_obj# may give you the
object_id of the object being accessed, but it's not entirely
reliable.
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 16:04:20
To: Jonathan Lewis
Subject: Re: Looking for ideas on blocked sessions updating
partitioned table with CLOB
You can also look at the p1/p2/p3 values of the sessions that are
blocked by the TX contention - and see which object the contention
is on. If it's the lobindex, that may lead to further clues about
some of the possibilities raised by Jonathan
Stefan
On Fri, Aug 31, 2018 at 8:55 PM, Jonathan Lewis
There is a major problem with basicfile LOBs (with similar, but
not so drastic symptoms appearing for securefile LOBs), so as a
first strategic step you should probably be planning to find a
painless way to move to securefile LOBs.
I've written an entire system about a problem with basicfile LOBs
when you do inserts and deletes - but the problem is the same if
you do lots of updates because for a LOB an update is a delete
followed by an insert (with the old LOB left in the LOB segment
rather than being copied to the UNDO segment).  The way LOBs
handle deletes is that the LOB index is a two-part index of which
the first part indexes in time order the chunks of LOBs that have
been deleted and the second part indexes by lobid the current LOB
chunks.
A problem appears when you try to insert a LOB and there isn't any
free space in the segment, but there is plenty of freeable space.
Your session will free up all the freeable space (deleting LOB
index entries) as it goes.  This can take a very long time and
while it's happening any other session that want to insert a LOB
will wait for your session to finish freeing up the space.
Unfortunately I've only see HW enqueues appearing as a result of
this action, I don't think I've seen TX enqueues (and I wouldn't
really expect to see the "enq TX - row lock contention" enqueue). 
http://jonathanlewis.wordpress.com/2017/01/26/basicfile-lobs/
Different possibility - is the LOB declared with multiple
freepools ? This is the default and it means the LOBindex has only
the two parts I describe above.  If you have N freepools then the
index consistents of N pairs of parts. This doesn't avoid the
problem I've described above (your session will try to free ALL
the free space from ALL the pools if it needs space), but having a
single freepool may explain your observations.  If I insert a very
large LOB PERHAPS you can't insert one until I finish insert
LOBIndex entries into my bit of index, in which case maybe you'd
show a row wait of some sort - though, again I'd expect a
different wait to appear (maybe buffer busy).
Another thought - your LOBs enable in-row storage:  how many of
them are short enough to fit in the row (a few, lots, most) and do
they get updated many times in situ, and are many of them likely
to start short and then grow  a few times before getting too long
? How long are the rest of the columns in the row ? I'm thinking
about the possibility that your smaller LOBs spend some time
growing in-row and causing row migration before they get long
enough to move into the LOB segment - and row migraion does funny
things with ITLs. Maybe something odd happens as a row that's
migrated moves its LOB to the LOB segment and the row migrates
back to it's original block.
---
In the absence of answers - a plan for the future might be to
range partition by sess_id and hash subpartiiton by sess_id:  I'm
assuming that sess_id is a value that's going to increase with
time when I say that.  If that is the case then the benefit of
doing this composite partitioning is that on a regular basis you
get a new partition and all the action moves into it and any
garbage that's happened in the earlier partition can be cleared up
(or dropped).
If you stick with basicfiles then look at freepools just in case
it's relevant - securefile lobs automatically have a better
strategy for concurrency but there is a parameter you can set of
increase concurrency. (Can't remember which it is at present).
Since I've mentioned the series - one feature of the mechanism is
that the LOB segment can become much larger than it needs to be,
so if you can sum() the sizes of the lobs that exceed the
in-linesize and it's much smaller than segment size you'll know
that your update mechanism is introducing some sort of problem.
Regards
Jonathan Lewis
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 14:23:46
Cc: ORACLE-L
Subject: Re: Looking for ideas on blocked sessions updating
partitioned table with CLOB
Ah, very good questions.
1. Blocking Type = enq: TX - row lock contention, yet the SESS_IDs
(PK) is different.
2. Basicfile LOBs (I assume these were carried over from an
upgrade from a prev version)
     sqlnet.ora
     ---------------------------------
     send_buf_size=2097152
     recv_buf_size=2097152
     default_sdu_size=32768
     default_tdu_size=32768
     tnsnames.ora
     -----------------------------
     Nothing on the tnsnames.ora
      tnsnames.ora
      -------------------------------
      Nothing specified here either for SDU etc
2. OS = Red hat Linux 6.8 64-bit , kernel 2.6.32-642.el6.x86_64
3. Block Size = 8192
Thanks,
Chris
On Thu, Aug 30, 2018 at 10:02 PM Stefan Knecht
When you say "blocking" - what event are the blocked sessions waiting on?
Also, what's your TNS config - particularly SDU sizes between the
mid-tier and the database (TNS connection string and the receiving
listener)?
Are you using securefile or basicfile LOBs?
What OS is the database on and what block size are you using in
the tablespace where the lobs are stored?
On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor
Env: 12.1.0.2
We have a table that stores session data (base64 encoded) from web sessions.
The table has 32 HASH partitions and contains a CLOB.
The table is hash partitioned by SESSION_ID which is a 64-char string.
UPDATE sessions_table
SET session_data = :session_data  // (CLOB)
WHERE session_id = :session_id;
For 95% of the sessions this runs very,very fast.
For 5% of the sessions that LOB being passed from PHP is 15MB and
for some reason PHP is really slow about transmitting those
session OBJECTS over to the database. (Verified through strace etc).
While those sessions are waiting to complete the update to the
CLOB, they end up BLOCKING other sessions that aren't trying to
update the same session id but are in the same partition.
I believe they're blocked on the CLOB on the same partition that
the long running session has open as I've confirmed that the
SESS_ID being updated is different in most cases.
So, here's my question, what performance strategy should I be
investigating for CLOBS.  Should I just add more partitions and
spread out the likelihood that a session will end up in the same
partition?  Seems logical but I'm not sure how CLOBS play into this.
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes
Looking for any ideas on how to prevent a session from blocking
other non-related sessions that are updating the CLOB.
I'm trying to mitigate the blocking at the db layer while we stand
up a product such as Redis to handle the session caching at the
server level (and remove it from the db).
Chris
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net
<http://zztat.net><http://zztat.net><http://zztat.net/> |
@zztat_oracle | fb.me/zztat
<http://fb.me/zztat><http://fb.me/zztat><http://fb.me/zztat> |
zztat.net/blog/
<http://zztat.net/blog/><http://zztat.net/blog/><http://zztat.net/blog/>
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net <http://zztat.net><http://zztat.net/> |
@zztat_oracle | fb.me/zztat
<http://fb.me/zztat><http://fb.me/zztat> | zztat.net/blog/
<http://zztat.net/blog/><http://zztat.net/blog/>
Chris Taylor
2018-09-18 16:02:54 UTC
Permalink
Well, that's not intimating. Not at all :)

Talking to Oracle-L is one thing, putting it out for a wider audience? I'm
not so sure about all that .

Chris
Post by Tim Gorman
Chris,
You have a gift for telling the story. This could make an amazing
newsletter article or a fantastic presentation that you'll enjoy creating
and presenting. The fun part of such an article or presentation would be
explaining the underlying technology, so that the reader/audience enjoys
the punchline as much as we have.
For an article in a newsletter, please consider RMOUG SQL>Update
<http://www.rmoug.org/newsletter/>, the NoCOUG Journal
<http://nocoug.org/newsletter.html>, IOUG Select <http://select.ioug.org/>,
or UKOUG Oracle Scene
<http://www.ukoug.org/membership-new/member-activities/oracle-scene/>?
There might be others I have overlooked, for which I apologize.
Hope this helps!
-Tim
So, I need to own up to something.
My whole approach to this problem was wrong. I had been told that the
problem with this table was the CLOB and the number of sessions updating
the table at the same time. So, my whole approach was based on that
information being TRUE.
I was blinded to the 'real' problem and wasn't actively looking for an
alternative answer until I posted this thread. Due to the discussion here,
I took a few steps back and actually asked myself "why would this be row
locks/blocks?" and WHY don't I see internal locking on the clob if the clob
is the problem?
At that point I was in 'question everything' mode and started really
looking at the Event that the blocking session was sitting in (instead of
just assuming it was a problem with design).
The blocking sessions were ALWAYS sitting in EVENT: SQL*Net MORE DATA FROM
CLIENT and WAIT_CLASS: NETWORK and the wait times were reaching upward of 2
minutes before the wait_time_micro restarted counting.
When I saw that (and really looked at it), I was like that makes no
sense. Why would a session be getting latency from the PHP server? (Why
is it taking so long to send an object from the PHP server to the database?)
Running strace on the sessions on both the db server and the php server
showed the sessions just sitting in either read() or write() to the network
socket and never completing.
Large file transfers INTO the main db server would stall. (Transfers OUT
of the db server were fine) - this affected SQLNet, SCP, ncat etc.
Outgoing was fine, Inbound was terrible.
Ultimately we discovered these 2 settings had been DISABLED on this box at
/proc/sys/net/ipv4/tcp_sack
/proc/sys/net/ipv4/tcp_timestamps
Both of those were "0" whereas all our other db servers those were "1" and
weren't experiencing any problem.
After setting both of those to "1" , our sessions now complete and no
longer block.
I know some of you may be thinking "How could he miss all the network
waits?" - And that's a good question. All I can say is sometimes you get
"blinded" by assumptions and aren't really seeing the problem for what it
is.
I've known this a long, long time. So, remember kiddos you may have to
question 'what you know versus what you think you know' ;)
Also, the very fact of sharing this problem with the you on the list gave
me the ability to step back and look at it with a new mindset.
Anyway, I wanted to share that as a testament (and a huge thank you) to
the community we have here.
Thanks,
Chris
On Fri, Aug 31, 2018 at 11:48 AM Jonathan Lewis <
If the event is "enq TX: row lock wait" the p1/p2/p3 values will be about
the TX lock in v$lock. p2 = id1, p3 = id2, and p1 will encode "TX" and the
lock mode.
Select p1raw (if its available) or to_char(p1,'XXXXXXXXXXXXXXXX') and
you'll get 00000000054580006 - or possibly a 4 on the end. If it's a mode 4
lock then that's a clue that it's not really about a table row and more
likely to be about an index/IOT or some internal anomaly.
If you've got the ASH data then current_obj# may give you the object_id
of the object being accessed, but it's not entirely reliable.
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 16:04:20
To: Jonathan Lewis
Subject: Re: Looking for ideas on blocked sessions updating partitioned table with CLOB
You can also look at the p1/p2/p3 values of the sessions that are blocked
by the TX contention - and see which object the contention is on. If it's
the lobindex, that may lead to further clues about some of the
possibilities raised by Jonathan
Stefan
On Fri, Aug 31, 2018 at 8:55 PM, Jonathan Lewis <
There is a major problem with basicfile LOBs (with similar, but not so
drastic symptoms appearing for securefile LOBs), so as a first strategic
step you should probably be planning to find a painless way to move to
securefile LOBs.
I've written an entire system about a problem with basicfile LOBs when
you do inserts and deletes - but the problem is the same if you do lots of
updates because for a LOB an update is a delete followed by an insert (with
the old LOB left in the LOB segment rather than being copied to the UNDO
segment). The way LOBs handle deletes is that the LOB index is a two-part
index of which the first part indexes in time order the chunks of LOBs that
have been deleted and the second part indexes by lobid the current LOB
chunks.
A problem appears when you try to insert a LOB and there isn't any free
space in the segment, but there is plenty of freeable space. Your session
will free up all the freeable space (deleting LOB index entries) as it
goes. This can take a very long time and while it's happening any other
session that want to insert a LOB will wait for your session to finish
freeing up the space.
Unfortunately I've only see HW enqueues appearing as a result of this
action, I don't think I've seen TX enqueues (and I wouldn't really expect
to see the "enq TX - row lock contention" enqueue). THe series start at
this URL: http://jonathanlewis.wordpress.com/2017/01/26/basicfile-lobs/
Different possibility - is the LOB declared with multiple freepools ?
This is the default and it means the LOBindex has only the two parts I
describe above. If you have N freepools then the index consistents of N
pairs of parts. This doesn't avoid the problem I've described above (your
session will try to free ALL the free space from ALL the pools if it needs
space), but having a single freepool may explain your observations. If I
insert a very large LOB PERHAPS you can't insert one until I finish insert
LOBIndex entries into my bit of index, in which case maybe you'd show a row
wait of some sort - though, again I'd expect a different wait to appear
(maybe buffer busy).
Another thought - your LOBs enable in-row storage: how many of them are
short enough to fit in the row (a few, lots, most) and do they get updated
many times in situ, and are many of them likely to start short and then
grow a few times before getting too long ? How long are the rest of the
columns in the row ? I'm thinking about the possibility that your smaller
LOBs spend some time growing in-row and causing row migration before they
get long enough to move into the LOB segment - and row migraion does funny
things with ITLs. Maybe something odd happens as a row that's migrated
moves its LOB to the LOB segment and the row migrates back to it's original
block.
---
In the absence of answers - a plan for the future might be to range
partition by sess_id and hash subpartiiton by sess_id: I'm assuming that
sess_id is a value that's going to increase with time when I say that. If
that is the case then the benefit of doing this composite partitioning is
that on a regular basis you get a new partition and all the action moves
into it and any garbage that's happened in the earlier partition can be
cleared up (or dropped).
If you stick with basicfiles then look at freepools just in case it's
relevant - securefile lobs automatically have a better strategy for
concurrency but there is a parameter you can set of increase concurrency.
(Can't remember which it is at present).
Since I've mentioned the series - one feature of the mechanism is that
the LOB segment can become much larger than it needs to be, so if you can
sum() the sizes of the lobs that exceed the in-linesize and it's much
smaller than segment size you'll know that your update mechanism is
introducing some sort of problem.
Regards
Jonathan Lewis
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 14:23:46
Cc: ORACLE-L
Subject: Re: Looking for ideas on blocked sessions updating partitioned table with CLOB
Ah, very good questions.
1. Blocking Type = enq: TX - row lock contention, yet the SESS_IDs (PK) is different.
2. Basicfile LOBs (I assume these were carried over from an upgrade from a prev version)
sqlnet.ora
---------------------------------
send_buf_size=2097152
recv_buf_size=2097152
default_sdu_size=32768
default_tdu_size=32768
tnsnames.ora
-----------------------------
Nothing on the tnsnames.ora
tnsnames.ora
-------------------------------
Nothing specified here either for SDU etc
2. OS = Red hat Linux 6.8 64-bit , kernel 2.6.32-642.el6.x86_64
3. Block Size = 8192
Thanks,
Chris
When you say "blocking" - what event are the blocked sessions waiting on?
Also, what's your TNS config - particularly SDU sizes between the
mid-tier and the database (TNS connection string and the receiving
listener)?
Are you using securefile or basicfile LOBs?
What OS is the database on and what block size are you using in the
tablespace where the lobs are stored?
On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor <
Env: 12.1.0.2
We have a table that stores session data (base64 encoded) from web sessions.
The table has 32 HASH partitions and contains a CLOB.
The table is hash partitioned by SESSION_ID which is a 64-char string.
UPDATE sessions_table
SET session_data = :session_data // (CLOB)
WHERE session_id = :session_id;
For 95% of the sessions this runs very,very fast.
For 5% of the sessions that LOB being passed from PHP is 15MB and for
some reason PHP is really slow about transmitting those session OBJECTS
over to the database. (Verified through strace etc).
While those sessions are waiting to complete the update to the CLOB, they
end up BLOCKING other sessions that aren't trying to update the same
session id but are in the same partition.
I believe they're blocked on the CLOB on the same partition that the long
running session has open as I've confirmed that the SESS_ID being updated
is different in most cases.
So, here's my question, what performance strategy should I be
investigating for CLOBS. Should I just add more partitions and spread out
the likelihood that a session will end up in the same partition? Seems
logical but I'm not sure how CLOBS play into this.
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes
Looking for any ideas on how to prevent a session from blocking other
non-related sessions that are updating the CLOB.
I'm trying to mitigate the blocking at the db layer while we stand up a
product such as Redis to handle the session caching at the server level
(and remove it from the db).
Chris
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net<http://zztat.net><http://zztat.net/> |
@zztat_oracle | fb.me/zztat<http://fb.me/zztat><http://fb.me/zztat> |
zztat.net/blog/<http://zztat.net/blog/><http://zztat.net/blog/>
--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
http://fb.me/zztat> | zztat.net/blog/<http://zztat.net/blog/>
Tim Gorman
2018-09-18 16:27:30 UTC
Permalink
Think of it this way:  *this* is the audience who will pull you or your
story to pieces, so you've already cleared that hurdle.

I didn't intend to cause stress, just letting you know that stories like
this are *valuable*.  Anyone can find problems, fewer can present a
solution or two, fewer still can honestly describe what it takes to find
a solution.

I enjoyed your summary email, so I know I'd enjoy a longer treatise,
that's all.
Well, that's not intimating.  Not at all :)
Talking to Oracle-L is one thing, putting it out for a wider audience?
I'm not so sure about all that .
Chris
Chris,
You have a gift for telling the story.  This could make an amazing
newsletter article or a fantastic presentation that you'll enjoy
creating and presenting.  The fun part of such an article or
presentation would be explaining the underlying technology, so
that the reader/audience enjoys the punchline as much as we have.
For an article in a newsletter, please consider RMOUG SQL>Update
<http://www.rmoug.org/newsletter/>, the NoCOUG Journal
<http://nocoug.org/newsletter.html>, IOUG Select
<http://select.ioug.org/>, or UKOUG Oracle Scene
<http://www.ukoug.org/membership-new/member-activities/oracle-scene/>?
There might be others I have overlooked, for which I apologize.
Hope this helps!
-Tim
Post by Chris Taylor
So, I need to own up to something.
My whole approach to this problem was wrong.  I had been told
that the problem with this table was the CLOB and the number of
sessions updating the table at the same time.  So, my whole
approach was based on that information being TRUE.
I was blinded to the 'real' problem and wasn't actively looking
for an alternative answer until I posted this thread.  Due to the
discussion here, I took a few steps back and actually asked
myself "why would this be row locks/blocks?" and WHY don't I see
internal locking on the clob if the clob is the problem?
At that point I was in 'question everything' mode and started
really looking at the Event that the blocking session was sitting
in (instead of just assuming it was a problem with design).
The blocking sessions were ALWAYS sitting in EVENT: SQL*Net MORE
DATA FROM CLIENT and WAIT_CLASS: NETWORK and the wait times were
reaching upward of 2 minutes before the wait_time_micro restarted
counting.
When I saw that (and really looked at it), I was like that makes
no sense.  Why would a session be getting latency from the PHP
server?  (Why is it taking so long to send an object from the PHP
server to the database?)
Running strace on the sessions on both the db server and the php
server showed the sessions just sitting in either read() or
write() to the network socket and never completing.
Large file transfers INTO the main db server would stall.
(Transfers OUT of the db server were fine) - this affected
SQLNet, SCP, ncat etc.  Outgoing was fine, Inbound was terrible.
Ultimately we discovered these 2 settings had been DISABLED on
/proc/sys/net/ipv4/tcp_sack
/proc/sys/net/ipv4/tcp_timestamps
Both of those were "0" whereas all our other db servers those
were "1" and weren't experiencing any problem.
After setting both of those to "1" , our sessions now complete
and no longer block.
I know some of you may be thinking "How could he miss all the
network waits?"  - And that's a good question.  All I can say is
sometimes you get "blinded" by assumptions and aren't really
seeing the problem for what it is.
I've known this a long, long time.  So, remember kiddos you may
have to question 'what you know versus what you think you know' ;)
Also, the very fact of sharing this problem with the you on the
list gave me the ability to step back and look at it with a new
mindset.
Anyway, I wanted to share that as a testament (and a huge thank
you) to the community we have here.
Thanks,
Chris
On Fri, Aug 31, 2018 at 11:48 AM Jonathan Lewis
If the event is "enq TX: row lock wait" the p1/p2/p3 values
will be about the TX lock in v$lock.  p2 = id1, p3 = id2, and
p1 will encode "TX" and the lock mode.
Select p1raw (if its available) or
to_char(p1,'XXXXXXXXXXXXXXXX') and you'll get
00000000054580006 - or possibly a 4 on the end. If it's a
mode 4 lock then that's a clue that it's not really about a
table row and more likely to be about an index/IOT or some
internal anomaly.
If you've got the ASH data then current_obj# may give you the
object_id of the object being accessed, but it's not entirely
reliable.
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 16:04:20
To: Jonathan Lewis
Subject: Re: Looking for ideas on blocked sessions updating
partitioned table with CLOB
You can also look at the p1/p2/p3 values of the sessions that
are blocked by the TX contention - and see which object the
contention is on. If it's the lobindex, that may lead to
further clues about some of the possibilities raised by Jonathan
Stefan
On Fri, Aug 31, 2018 at 8:55 PM, Jonathan Lewis
There is a major problem with basicfile LOBs (with similar,
but not so drastic symptoms appearing for securefile LOBs),
so as a first strategic step you should probably be planning
to find a painless way to move to securefile LOBs.
I've written an entire system about a problem with basicfile
LOBs when you do inserts and deletes - but the problem is the
same if you do lots of updates because for a LOB an update is
a delete followed by an insert (with the old LOB left in the
LOB segment rather than being copied to the UNDO segment). 
The way LOBs handle deletes is that the LOB index is a
two-part index of which the first part indexes in time order
the chunks of LOBs that have been deleted and the second part
indexes by lobid the current LOB chunks.
A problem appears when you try to insert a LOB and there
isn't any free space in the segment, but there is plenty of
freeable space. Your session will free up all the freeable
space (deleting LOB index entries) as it goes.  This can take
a very long time and while it's happening any other session
that want to insert a LOB will wait for your session to
finish freeing up the space.
Unfortunately I've only see HW enqueues appearing as a result
of this action, I don't think I've seen TX enqueues (and I
wouldn't really expect to see the "enq TX - row lock
http://jonathanlewis.wordpress.com/2017/01/26/basicfile-lobs/
Different possibility - is the LOB declared with multiple
freepools ? This is the default and it means the LOBindex has
only the two parts I describe above.  If you have N freepools
then the index consistents of N pairs of parts. This doesn't
avoid the problem I've described above (your session will try
to free ALL the free space from ALL the pools if it needs
space), but having a single freepool may explain your
observations.  If I insert a very large LOB PERHAPS you can't
insert one until I finish insert LOBIndex entries into my bit
of index, in which case maybe you'd show a row wait of some
sort - though, again I'd expect a different wait to appear
(maybe buffer busy).
Another thought - your LOBs enable in-row storage: how many
of them are short enough to fit in the row (a few, lots,
most) and do they get updated many times in situ, and are
many of them likely to start short and then grow  a few times
before getting too long ? How long are the rest of the
columns in the row ? I'm thinking about the possibility that
your smaller LOBs spend some time growing in-row and causing
row migration before they get long enough to move into the
LOB segment - and row migraion does funny things with ITLs.
Maybe something odd happens as a row that's migrated moves
its LOB to the LOB segment and the row migrates back to it's
original block.
---
In the absence of answers - a plan for the future might be to
range partition by sess_id and hash subpartiiton by sess_id: 
I'm assuming that sess_id is a value that's going to increase
with time when I say that.  If that is the case then the
benefit of doing this composite partitioning is that on a
regular basis you get a new partition and all the action
moves into it and any garbage that's happened in the earlier
partition can be cleared up (or dropped).
If you stick with basicfiles then look at freepools just in
case it's relevant - securefile lobs automatically have a
better strategy for concurrency but there is a parameter you
can set of increase concurrency. (Can't remember which it is
at present).
Since I've mentioned the series - one feature of the
mechanism is that the LOB segment can become much larger than
it needs to be, so if you can sum() the sizes of the lobs
that exceed the in-linesize and it's much smaller than
segment size you'll know that your update mechanism is
introducing some sort of problem.
Regards
Jonathan Lewis
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 14:23:46
Cc: ORACLE-L
Subject: Re: Looking for ideas on blocked sessions updating
partitioned table with CLOB
Ah, very good questions.
1. Blocking Type = enq: TX - row lock contention, yet the
SESS_IDs (PK) is different.
2. Basicfile LOBs (I assume these were carried over from an
upgrade from a prev version)
     sqlnet.ora
     ---------------------------------
     send_buf_size=2097152
     recv_buf_size=2097152
     default_sdu_size=32768
     default_tdu_size=32768
     tnsnames.ora
     -----------------------------
     Nothing on the tnsnames.ora
      tnsnames.ora
      -------------------------------
      Nothing specified here either for SDU etc
2. OS = Red hat Linux 6.8 64-bit , kernel 2.6.32-642.el6.x86_64
3. Block Size = 8192
Thanks,
Chris
On Thu, Aug 30, 2018 at 10:02 PM Stefan Knecht
When you say "blocking" - what event are the blocked sessions waiting on?
Also, what's your TNS config - particularly SDU sizes between
the mid-tier and the database (TNS connection string and the
receiving listener)?
Are you using securefile or basicfile LOBs?
What OS is the database on and what block size are you using
in the tablespace where the lobs are stored?
On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor
Env: 12.1.0.2
We have a table that stores session data (base64 encoded)
from web sessions.
The table has 32 HASH partitions and contains a CLOB.
The table is hash partitioned by SESSION_ID which is a 64-char string.
UPDATE sessions_table
SET session_data = :session_data  // (CLOB)
WHERE session_id = :session_id;
For 95% of the sessions this runs very,very fast.
For 5% of the sessions that LOB being passed from PHP is 15MB
and for some reason PHP is really slow about transmitting
those session OBJECTS over to the database.  (Verified
through strace etc).
While those sessions are waiting to complete the update to
the CLOB, they end up BLOCKING other sessions that aren't
trying to update the same session id but are in the same
partition.
I believe they're blocked on the CLOB on the same partition
that the long running session has open as I've confirmed that
the SESS_ID being updated is different in most cases.
So, here's my question, what performance strategy should I be
investigating for CLOBS.  Should I just add more partitions
and spread out the likelihood that a session will end up in
the same partition? Seems logical but I'm not sure how CLOBS
play into this.
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes
Looking for any ideas on how to prevent a session from
blocking other non-related sessions that are updating the CLOB.
I'm trying to mitigate the blocking at the db layer while we
stand up a product such as Redis to handle the session
caching at the server level (and remove it from the db).
Chris
--
//
zztat - The Next-Gen Oracle Performance Monitoring and
Reaction Framework!
Visit us at zztat.net
<http://zztat.net><http://zztat.net><http://zztat.net/> |
@zztat_oracle | fb.me/zztat
<http://fb.me/zztat><http://fb.me/zztat><http://fb.me/zztat>
| zztat.net/blog/
<http://zztat.net/blog/><http://zztat.net/blog/><http://zztat.net/blog/>
--
//
zztat - The Next-Gen Oracle Performance Monitoring and
Reaction Framework!
Visit us at zztat.net <http://zztat.net><http://zztat.net/> |
@zztat_oracle | fb.me/zztat
<http://fb.me/zztat><http://fb.me/zztat> | zztat.net/blog/
<http://zztat.net/blog/><http://zztat.net/blog/>
Tanel Poder
2018-09-18 15:06:17 UTC
Permalink
Good troubleshooting... And that is an important starting point - don't
troubleshoot what users say they *think* is happening, but just ask how do
they experience the problem and then follow the chain of measurable
evidence from there and go where-ever it leads you to, even if it's way
different from what the friendly DBA thinks that is happening. My usual
sequence goes from high level to microscopic: V$SESSION + wait events ->
V$SESSTAT for extra hints & evidence -> strace / pstack. And can't even
always trust Oracle's own instrumentation (sometimes it's wrong or
missing), so taking the screwdriver and opening up the process up from
outside (pstack, strace, dtrace/perf probe) is sometimes needed.

Btw, I wrote the ash_wait_chains scripts exactly for the "follow the chain
of evidence" reasons. Have a script automatically walk through the
blocker/waiter tree and report what everyone on a chain is doing, so you'd
see the complex wait topology & what the ultimate blocker is doing
sooner... :-)

For anyone interested, here's a "buffer busy wait" chain example:
-
https://blog.tanelpoder.com/2013/11/06/diagnosing-buffer-busy-waits-with-the-ash_wait_chains-sql-script-v0-2/

And I just published videos of my ashtop.sql & ash_wait_chains.sql hacking
sessions (close to 2 hours of free-form hacking :-)
- https://www.youtube.com/tanelpoder

But, back to the original topic - I *still *sometimes get distracted and
end up researching that fun exotic latch or mutex wait event that someone
tells is the problem in their DB, only to then realize that it only took 4%
of total response time and totally disappeared after they fixed the real
problem of excessive CPU usage and oversubscription due to a SQL plan gone
bad. I think I should hang out with Cary Millsap more!

Tanel.

On Tue, Sep 18, 2018 at 10:24 AM Chris Taylor <
Post by Chris Taylor
So, I need to own up to something.
My whole approach to this problem was wrong. I had been told that the
problem with this table was the CLOB and the number of sessions updating
the table at the same time. So, my whole approach was based on that
information being TRUE.
I was blinded to the 'real' problem and wasn't actively looking for an
alternative answer until I posted this thread. Due to the discussion here,
I took a few steps back and actually asked myself "why would this be row
locks/blocks?" and WHY don't I see internal locking on the clob if the clob
is the problem?
At that point I was in 'question everything' mode and started really
looking at the Event that the blocking session was sitting in (instead of
just assuming it was a problem with design).
Mladen Gogala
2018-09-19 14:52:41 UTC
Permalink
Yes, that is the problem with selective acknowledgement or "SACK". To
make the matter worse, your packet time stamps were turned off, so there
was no protection against out of sequence packets. If the client
expected acknowledgement for the TCP packet which would never come
because selective acknowledgements were turned off, that would appear as
an infinite wait. Here is a decent discussion of that parameter:

https://serverfault.com/questions/10955/when-to-turn-tcp-sack-off
Post by Chris Taylor
Running strace on the sessions on both the db server and the php
server showed the sessions just sitting in either read() or write() to
the network socket and never completing.
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
Loading...