I'm not so sure about all that .
that the reader/audience enjoys the punchline as much as we have.
There might be others I have overlooked, for which I apologize.
Post by Chris TaylorSo, I need to own up to something.
My whole approach to this problem was wrong. I had been told
that the problem with this table was the CLOB and the number of
sessions updating the table at the same time. So, my whole
approach was based on that information being TRUE.
I was blinded to the 'real' problem and wasn't actively looking
for an alternative answer until I posted this thread. Due to the
discussion here, I took a few steps back and actually asked
myself "why would this be row locks/blocks?" and WHY don't I see
internal locking on the clob if the clob is the problem?
At that point I was in 'question everything' mode and started
really looking at the Event that the blocking session was sitting
in (instead of just assuming it was a problem with design).
The blocking sessions were ALWAYS sitting in EVENT: SQL*Net MORE
DATA FROM CLIENT and WAIT_CLASS: NETWORK and the wait times were
reaching upward of 2 minutes before the wait_time_micro restarted
counting.
When I saw that (and really looked at it), I was like that makes
no sense. Why would a session be getting latency from the PHP
server? (Why is it taking so long to send an object from the PHP
server to the database?)
Running strace on the sessions on both the db server and the php
server showed the sessions just sitting in either read() or
write() to the network socket and never completing.
Large file transfers INTO the main db server would stall.
(Transfers OUT of the db server were fine) - this affected
SQLNet, SCP, ncat etc. Outgoing was fine, Inbound was terrible.
Ultimately we discovered these 2 settings had been DISABLED on
/proc/sys/net/ipv4/tcp_sack
/proc/sys/net/ipv4/tcp_timestamps
Both of those were "0" whereas all our other db servers those
were "1" and weren't experiencing any problem.
After setting both of those to "1" , our sessions now complete
and no longer block.
I know some of you may be thinking "How could he miss all the
network waits?" - And that's a good question. All I can say is
sometimes you get "blinded" by assumptions and aren't really
seeing the problem for what it is.
I've known this a long, long time. So, remember kiddos you may
have to question 'what you know versus what you think you know' ;)
Also, the very fact of sharing this problem with the you on the
list gave me the ability to step back and look at it with a new
mindset.
Anyway, I wanted to share that as a testament (and a huge thank
you) to the community we have here.
Thanks,
Chris
On Fri, Aug 31, 2018 at 11:48 AM Jonathan Lewis
If the event is "enq TX: row lock wait" the p1/p2/p3 values
will be about the TX lock in v$lock. p2 = id1, p3 = id2, and
p1 will encode "TX" and the lock mode.
Select p1raw (if its available) or
to_char(p1,'XXXXXXXXXXXXXXXX') and you'll get
00000000054580006 - or possibly a 4 on the end. If it's a
mode 4 lock then that's a clue that it's not really about a
table row and more likely to be about an index/IOT or some
internal anomaly.
If you've got the ASH data then current_obj# may give you the
object_id of the object being accessed, but it's not entirely
reliable.
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 16:04:20
To: Jonathan Lewis
Subject: Re: Looking for ideas on blocked sessions updating
partitioned table with CLOB
You can also look at the p1/p2/p3 values of the sessions that
are blocked by the TX contention - and see which object the
contention is on. If it's the lobindex, that may lead to
further clues about some of the possibilities raised by Jonathan
Stefan
On Fri, Aug 31, 2018 at 8:55 PM, Jonathan Lewis
There is a major problem with basicfile LOBs (with similar,
but not so drastic symptoms appearing for securefile LOBs),
so as a first strategic step you should probably be planning
to find a painless way to move to securefile LOBs.
I've written an entire system about a problem with basicfile
LOBs when you do inserts and deletes - but the problem is the
same if you do lots of updates because for a LOB an update is
a delete followed by an insert (with the old LOB left in the
LOB segment rather than being copied to the UNDO segment).Â
The way LOBs handle deletes is that the LOB index is a
two-part index of which the first part indexes in time order
the chunks of LOBs that have been deleted and the second part
indexes by lobid the current LOB chunks.
A problem appears when you try to insert a LOB and there
isn't any free space in the segment, but there is plenty of
freeable space. Your session will free up all the freeable
space (deleting LOB index entries) as it goes. This can take
a very long time and while it's happening any other session
that want to insert a LOB will wait for your session to
finish freeing up the space.
Unfortunately I've only see HW enqueues appearing as a result
of this action, I don't think I've seen TX enqueues (and I
wouldn't really expect to see the "enq TX - row lock
http://jonathanlewis.wordpress.com/2017/01/26/basicfile-lobs/
Different possibility - is the LOB declared with multiple
freepools ? This is the default and it means the LOBindex has
only the two parts I describe above. If you have N freepools
then the index consistents of N pairs of parts. This doesn't
avoid the problem I've described above (your session will try
to free ALL the free space from ALL the pools if it needs
space), but having a single freepool may explain your
observations. If I insert a very large LOB PERHAPS you can't
insert one until I finish insert LOBIndex entries into my bit
of index, in which case maybe you'd show a row wait of some
sort - though, again I'd expect a different wait to appear
(maybe buffer busy).
Another thought - your LOBs enable in-row storage: how many
of them are short enough to fit in the row (a few, lots,
most) and do they get updated many times in situ, and are
many of them likely to start short and then grow a few times
before getting too long ? How long are the rest of the
columns in the row ? I'm thinking about the possibility that
your smaller LOBs spend some time growing in-row and causing
row migration before they get long enough to move into the
LOB segment - and row migraion does funny things with ITLs.
Maybe something odd happens as a row that's migrated moves
its LOB to the LOB segment and the row migrates back to it's
original block.
---
In the absence of answers - a plan for the future might be to
range partition by sess_id and hash subpartiiton by sess_id:Â
I'm assuming that sess_id is a value that's going to increase
with time when I say that. If that is the case then the
benefit of doing this composite partitioning is that on a
regular basis you get a new partition and all the action
moves into it and any garbage that's happened in the earlier
partition can be cleared up (or dropped).
If you stick with basicfiles then look at freepools just in
case it's relevant - securefile lobs automatically have a
better strategy for concurrency but there is a parameter you
can set of increase concurrency. (Can't remember which it is
at present).
Since I've mentioned the series - one feature of the
mechanism is that the LOB segment can become much larger than
it needs to be, so if you can sum() the sizes of the lobs
that exceed the in-linesize and it's much smaller than
segment size you'll know that your update mechanism is
introducing some sort of problem.
Regards
Jonathan Lewis
Regards
Jonathan Lewis
________________________________________
Sent: 31 August 2018 14:23:46
Cc: ORACLE-L
Subject: Re: Looking for ideas on blocked sessions updating
partitioned table with CLOB
Ah, very good questions.
1. Blocking Type = enq: TX - row lock contention, yet the
SESS_IDs (PK) is different.
2. Basicfile LOBs (I assume these were carried over from an
upgrade from a prev version)
   sqlnet.ora
   ---------------------------------
   send_buf_size=2097152
   recv_buf_size=2097152
   default_sdu_size=32768
   default_tdu_size=32768
   tnsnames.ora
   -----------------------------
   Nothing on the tnsnames.ora
   tnsnames.ora
   -------------------------------
   Nothing specified here either for SDU etc
2. OS = Red hat Linux 6.8 64-bit , kernel 2.6.32-642.el6.x86_64
3. Block Size = 8192
Thanks,
Chris
On Thu, Aug 30, 2018 at 10:02 PM Stefan Knecht
When you say "blocking" - what event are the blocked sessions waiting on?
Also, what's your TNS config - particularly SDU sizes between
the mid-tier and the database (TNS connection string and the
receiving listener)?
Are you using securefile or basicfile LOBs?
What OS is the database on and what block size are you using
in the tablespace where the lobs are stored?
On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor
Env: 12.1.0.2
We have a table that stores session data (base64 encoded)
from web sessions.
The table has 32 HASH partitions and contains a CLOB.
The table is hash partitioned by SESSION_ID which is a 64-char string.
UPDATE sessions_table
SET session_data = :session_data // (CLOB)
WHERE session_id = :session_id;
For 95% of the sessions this runs very,very fast.
For 5% of the sessions that LOB being passed from PHP is 15MB
and for some reason PHP is really slow about transmitting
those session OBJECTS over to the database. (Verified
through strace etc).
While those sessions are waiting to complete the update to
the CLOB, they end up BLOCKING other sessions that aren't
trying to update the same session id but are in the same
partition.
I believe they're blocked on the CLOB on the same partition
that the long running session has open as I've confirmed that
the SESS_ID being updated is different in most cases.
So, here's my question, what performance strategy should I be
investigating for CLOBS. Should I just add more partitions
and spread out the likelihood that a session will end up in
the same partition? Seems logical but I'm not sure how CLOBS
play into this.
Segment Name = SYS_LOB0051640773C00002$$
Index Name = SYS_LOB0051640773C00002$$
Chunk = 8192
PCTVERSION = 10
CACHE = YES
LOGGING = None
Encrypt = None
Compression = None
IN_ROW = Yes
Partition = Yes
Retention = Yes
Looking for any ideas on how to prevent a session from
blocking other non-related sessions that are updating the CLOB.
I'm trying to mitigate the blocking at the db layer while we
stand up a product such as Redis to handle the session
caching at the server level (and remove it from the db).
Chris
--
//
zztat - The Next-Gen Oracle Performance Monitoring and
Reaction Framework!
Visit us at zztat.net
<http://zztat.net><http://zztat.net><http://zztat.net/> |
@zztat_oracle | fb.me/zztat
<http://fb.me/zztat><http://fb.me/zztat><http://fb.me/zztat>
| zztat.net/blog/
<http://zztat.net/blog/><http://zztat.net/blog/><http://zztat.net/blog/>
--
//
zztat - The Next-Gen Oracle Performance Monitoring and
Reaction Framework!
Visit us at zztat.net <http://zztat.net><http://zztat.net/> |
@zztat_oracle | fb.me/zztat
<http://fb.me/zztat><http://fb.me/zztat> | zztat.net/blog/
<http://zztat.net/blog/><http://zztat.net/blog/>