Discussion:
UNDO Space Error
Pap
2021-03-20 18:22:31 UTC
Permalink
Hello Listers, Its version 11.2.0.4 of oracle. We suddenly started
encountering ORA-30036(ORA-30036: unable to extend segment by 8 in undo
tablespace 'UNDOTBS2') in one of the databases. We don't have any changes
done to the code. We already increased the UNDO tablespace size from ~100Gb
to ~190GB but still facing the same and this time we want to find the root
cause rather than keep on increasing the size of tablespace.

During one of the failures I saw DBA_UNDO_EXTENTS was showing almost all of
the extent status as UNEXPIRED. dba_free_space was showing zero space for
that tablespace. We have the data files in the undo tablespace set as
autoextend ON. And we are using AUTO undo management with UNDO retention
set as 900.

I understand there are two types of UNDO noted by oracle , one is UNDO read
which the SELECT query sometimes fails with Ora-01555 (but here we are not
encountering that). The other one is UNDO generation because of the
DML(INSERT/UPDATE/DELETE) and in this case our failure is because of the
same UNDO and each time it's mostly INSERT queries failing while doing data
load. So is there any way I can track the exact session/sql/user which is
generating maximum UNDO from any historical AWR views and also during run
time?

How to debug from history and get the cause of this sudden increase in UNDO
space consumption? Any other fix other than increasing UNDO space?

Regards

pap
Sayan Malakshinov
2021-03-20 18:26:30 UTC
Permalink
Hi Pap,

Have a look at V$transaction



--
Best regards,
Sayan Malakshinov
Oracle performance tuning expert
Oracle Database Developer Choice Award winner
Oracle ACE Associate
http://orasql.org
Post by Pap
Hello Listers, Its version 11.2.0.4 of oracle. We suddenly started
encountering ORA-30036(ORA-30036: unable to extend segment by 8 in undo
tablespace 'UNDOTBS2') in one of the databases. We don't have any changes
done to the code. We already increased the UNDO tablespace size from ~100Gb
to ~190GB but still facing the same and this time we want to find the root
cause rather than keep on increasing the size of tablespace.
During one of the failures I saw DBA_UNDO_EXTENTS was showing almost all
of the extent status as UNEXPIRED. dba_free_space was showing zero space
for that tablespace. We have the data files in the undo tablespace set as
autoextend ON. And we are using AUTO undo management with UNDO retention
set as 900.
I understand there are two types of UNDO noted by oracle , one is UNDO
read which the SELECT query sometimes fails with Ora-01555 (but here we are
not encountering that). The other one is UNDO generation because of the
DML(INSERT/UPDATE/DELETE) and in this case our failure is because of the
same UNDO and each time it's mostly INSERT queries failing while doing data
load. So is there any way I can track the exact session/sql/user which is
generating maximum UNDO from any historical AWR views and also during run
time?
How to debug from history and get the cause of this sudden increase in
UNDO space consumption? Any other fix other than increasing UNDO space?
Regards
pap
Pap
2021-03-20 18:34:50 UTC
Permalink
Thank You Sayan.

I checked using the below query during run time, but don't see any one
specific session coming out as one of suspects. But I will try to see it
again when the issue occurs. But another doubt I have is , if there exists
any historical AWR view matching v$transaction which will help me go back
and check the exact snap times(when failure occurred) to see what exact
session/process is the culprit and find the cause?


select b.addr,

a.sid,

a.username,

b.xidusn,

b.used_urec,

(b.used_ublk*8192)/1024 size_mb

from

gv$session a,

gv$transaction b

where

a.saddr = b.ses_addr

and a.inst_id=b.inst_id

order by b.used_ublk desc nulls last
Post by Sayan Malakshinov
Hi Pap,
Have a look at V$transaction
--
Best regards,
Sayan Malakshinov
Oracle performance tuning expert
Oracle Database Developer Choice Award winner
Oracle ACE Associate
http://orasql.org
Post by Pap
Hello Listers, Its version 11.2.0.4 of oracle. We suddenly started
encountering ORA-30036(ORA-30036: unable to extend segment by 8 in undo
tablespace 'UNDOTBS2') in one of the databases. We don't have any changes
done to the code. We already increased the UNDO tablespace size from ~100Gb
to ~190GB but still facing the same and this time we want to find the root
cause rather than keep on increasing the size of tablespace.
During one of the failures I saw DBA_UNDO_EXTENTS was showing almost all
of the extent status as UNEXPIRED. dba_free_space was showing zero space
for that tablespace. We have the data files in the undo tablespace set as
autoextend ON. And we are using AUTO undo management with UNDO retention
set as 900.
I understand there are two types of UNDO noted by oracle , one is UNDO
read which the SELECT query sometimes fails with Ora-01555 (but here we are
not encountering that). The other one is UNDO generation because of the
DML(INSERT/UPDATE/DELETE) and in this case our failure is because of the
same UNDO and each time it's mostly INSERT queries failing while doing data
load. So is there any way I can track the exact session/sql/user which is
generating maximum UNDO from any historical AWR views and also during run
time?
How to debug from history and get the cause of this sudden increase in
UNDO space consumption? Any other fix other than increasing UNDO space?
Regards
pap
Mark W. Farnham
2021-03-20 22:24:50 UTC
Permalink
Changes to the volume of data can drive something like this without changing the code.



IF you have any jobs that have a single commit at completion (regardless of batch size that is being handled), and especially if the input batch sizes have tended to rise over time, those are worth checking.



IF you find the culprit, the options are:



1) Changing the code to handle a defined “monolith size” (NOT 1 except under specific conditions of small integral transactions that must be separately committed and verified) and toss a commit once per monolith (and at the end when you underflow the monolith size for the last chunk.

2) Monitor the inbound size of “batches” and hand your transaction processor small enough bites to work with your existing code.



OR it could be something completely different. But this is worth checking. Oracle will handle pretty big chunks, which sometimes causes developers to ignore the fact that nothing handles arbitrarily large chunks.



Good luck,



mwf



From: oracle-l-***@freelists.org [mailto:oracle-l-***@freelists.org] On Behalf Of Pap
Sent: Saturday, March 20, 2021 2:35 PM
To: Sayan Malakshinov
Cc: Oracle L
Subject: Re: UNDO Space Error



Thank You Sayan.

I checked using the below query during run time, but don't see any one specific session coming out as one of suspects. But I will try to see it again when the issue occurs. But another doubt I have is , if there exists any historical AWR view matching v$transaction which will help me go back and check the exact snap times(when failure occurred) to see what exact session/process is the culprit and find the cause?


select b.addr,

a.sid,

a.username,

b.xidusn,

b.used_urec,

(b.used_ublk*8192)/1024 size_mb

from

gv$session a,

gv$transaction b

where

a.saddr = b.ses_addr

and a.inst_id=b.inst_id

order by b.used_ublk desc nulls last







On Sat, Mar 20, 2021 at 11:56 PM Sayan Malakshinov <***@gmail.com> wrote:

Hi Pap,



Have a look at V$transaction






--

Best regards,
Sayan Malakshinov
Oracle performance tuning expert
Oracle Database Developer Choice Award winner
Oracle ACE Associate
http://orasql.org



сб, 20 Ќар. 2021 г., 21:22 Pap <***@gmail.com>:

Hello Listers, Its version 11.2.0.4 of oracle. We suddenly started encountering ORA-30036(ORA-30036: unable to extend segment by 8 in undo tablespace 'UNDOTBS2') in one of the databases. We don't have any changes done to the code. We already increased the UNDO tablespace size from ~100Gb to ~190GB but still facing the same and this time we want to find the root cause rather than keep on increasing the size of tablespace.

During one of the failures I saw DBA_UNDO_EXTENTS was showing almost all of the extent status as UNEXPIRED. dba_free_space was showing zero space for that tablespace. We have the data files in the undo tablespace set as autoextend ON. And we are using AUTO undo management with UNDO retention set as 900.

I understand there are two types of UNDO noted by oracle , one is UNDO read which the SELECT query sometimes fails with Ora-01555 (but here we are not encountering that). The other one is UNDO generation because of the DML(INSERT/UPDATE/DELETE) and in this case our failure is because of the same UNDO and each time it's mostly INSERT queries failing while doing data load. So is there any way I can track the exact session/sql/user which is generating maximum UNDO from any historical AWR views and also during run time?

How to debug from history and get the cause of this sudden increase in UNDO space consumption? Any other fix other than increasing UNDO space?

Regards

pap
Mladen Gogala
2021-03-21 01:15:39 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>And now something completely different:</p>
<p><img alt="Something Completely Different GIFs | Tenor"
class="n3VNCb"
src="Loading Image..."
data-noaft="1" style="width: 220px; height: 162px; margin:
16.05px 0px;"></p>
<p>I am sorry, I couldn't resist. <br>
</p>
<div class="moz-cite-prefix">On 3/20/21 6:24 PM, Mark W. Farnham
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:08b001d71dd7$d8096ef0$881c4cd0$@rsiz.com"><span
style="font-size:14.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D">OR
it could be something completely different. But this is worth
checking. Oracle will handle pretty big chunks, which sometimes
causes developers to ignore the fact that nothing handles
arbitrarily large chunks.</span></blockquote>
<pre class="moz-signature" cols="72">--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
<a class="moz-txt-link-freetext" href="https://dbwhisperer.wordpress.com">https://dbwhisperer.wordpress.com</a>
</pre>
</body>
</html>
--
http://www.freelists.org/webpage/oracle-l
Mladen Gogala
2021-03-21 00:55:19 UTC
Permalink
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>As opposed to ORA-01555 which is a nightmare to debug and
resolve, ORA-30036 doesn't present such problems because it's
thrown by the same transaction that causes it. The cause is very
simple: the current transaction ran out of space in the undo
segment. At this point, I'd like to remind you of the common
saying that disks are cheap, especially is someone else is buying
them. So, what you need to do is to figure out which transaction
is causing the problem. Second thing is figuring out what to do
about that. I've frequently seen this with data purges.  If you
want to get rid of 1/3 of the rows in a billion rows table, based
on a date coliumn, it is conceivable that you might need few KB of
the UNDO space. The solution is usually to write a PL/SQL
procedure which will clean the rows in batches of 100K.</p>
<p>So, what were you doing? Can you post the SQL which has thrown
the error? Without the exact statement, cardinality of the tables
involved and rough estimate how much data are you modifying, it's
hard to tell you anything more.<br>
</p>
<div class="moz-cite-prefix">On 3/20/21 2:22 PM, Pap wrote:<br>
</div>
<blockquote type="cite"
cite="mid:***@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<p
style="font-family:inherit;color:inherit;font-weight:inherit;font-size:14px;word-break:break-word;line-height:inherit;box-sizing:border-box;padding:0px;margin:3px
0px
14px;border:0px;font-style:inherit;vertical-align:baseline;outline:0px;text-overflow:ellipsis">Hello
Listers, Its version 11.2.0.4 of oracle. We suddenly started
encountering ORA-30036(ORA-30036: unable to extend segment by
8 in undo tablespace 'UNDOTBS2') in one of the databases. We
don't have any changes done to the code. We already increased
the UNDO tablespace size from ~100Gb to ~190GB but still
facing the same and this time we want to find the root cause
rather than keep on increasing the size of tablespace.</p>
<p
style="font-family:inherit;color:inherit;font-weight:inherit;font-size:14px;word-break:break-word;line-height:inherit;box-sizing:border-box;padding:0px;margin:3px
0px
14px;border:0px;font-style:inherit;vertical-align:baseline;outline:0px;text-overflow:ellipsis"><span
style="color:inherit;font-family:inherit;font-style:inherit;font-weight:inherit">During
one of the failures I saw DBA_UNDO_EXTENTS was showing
almost all of the extent status as UNEXPIRED. dba_free_space
was showing zero space for that tablespace. We have the data
files in the undo tablespace set as autoextend ON. And we
are using AUTO undo management with UNDO retention set as
900.</span><br>
</p>
<p
style="font-family:inherit;color:inherit;font-weight:inherit;font-size:14px;word-break:break-word;line-height:inherit;box-sizing:border-box;padding:0px;margin:3px
0px
14px;border:0px;font-style:inherit;vertical-align:baseline;outline:0px;text-overflow:ellipsis"><span
style="color:inherit;font-family:inherit;font-style:inherit;font-weight:inherit">I
understand there are two types of UNDO noted by oracle , one
is UNDO read which the SELECT query sometimes fails with
Ora-01555 (but here we are not encountering that). The other
one is UNDO generation because of the
DML(INSERT/UPDATE/DELETE) and in this case our failure is
because of the same UNDO and each time it's mostly INSERT
queries failing while doing data load. So is there any way I
can track the exact session/sql/user which is generating
maximum UNDO from any historical AWR views and also during
run time?</span></p>
<p
style="font-family:inherit;color:inherit;font-weight:inherit;font-size:14px;word-break:break-word;line-height:inherit;box-sizing:border-box;padding:0px;margin:3px
0px
14px;border:0px;font-style:inherit;vertical-align:baseline;outline:0px;text-overflow:ellipsis"><span
style="color:inherit;font-family:inherit;font-style:inherit;font-weight:inherit">How
to debug from history and get the cause of this sudden
increase in UNDO space consumption? </span><span
style="color:inherit;font-family:inherit;font-style:inherit;font-weight:inherit">Any
other fix other than increasing UNDO space?</span></p>
<p
style="font-family:inherit;color:inherit;font-weight:inherit;font-size:14px;word-break:break-word;line-height:inherit;box-sizing:border-box;padding:0px;margin:3px
0px
14px;border:0px;font-style:inherit;vertical-align:baseline;outline:0px;text-overflow:ellipsis">Regards</p>
<p
style="font-family:inherit;color:inherit;font-weight:inherit;font-size:14px;word-break:break-word;line-height:inherit;box-sizing:border-box;padding:0px;margin:3px
0px
14px;border:0px;font-style:inherit;vertical-align:baseline;outline:0px;text-overflow:ellipsis">pap</p>
<h1
style="font-family:inherit;color:rgb(85,90,98);font-size:32px;word-break:break-word;line-height:1.25;box-sizing:border-box;padding:0px;margin:0px;border:0px;font-style:inherit;vertical-align:baseline;outline:0px;text-overflow:ellipsis"><br>
</h1>
<div><br>
</div>
</div>
</blockquote>
<pre class="moz-signature" cols="72">--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
<a class="moz-txt-link-freetext" href="https://dbwhisperer.wordpress.com">https://dbwhisperer.wordpress.com</a>
</pre>
</body>
</html>
--
http://www.freelists.org/webpage/oracle-l
Pap
2021-03-21 05:16:05 UTC
Permalink
Thank you.

We do use batching for large DML to do it in bulk fashion. However, I got
one of the INSERT queries which has failed and looks something as below.
And I do see this table TAB(masked actual name) has 4 indexes and I think
all the UNDO generated must be because of that direct path load is
happening so table block related UNDO will be zero. And below is the sample
insert query which has failed with Ora-30036.

I do see the execution plan has changed from past but i can't relate
howcome change in path can result in higher UNDO space consumption for the
INSERT query. It should depend on the number of rows those are loaded to
table only. Please correct me if my understanding is wrong. And also even
if the plan suggests the expected rows are higher in the new plan which is
pointing towards a higher number of rows being loaded to the table too, but
i had verified with the application team and they mentioned the number of
rows loaded were around the same as previous executions which got loaded
during the failure days. So still wondering what must be the cause of this
sudden failure?

INSERT /*+ append parallel(16) nologging */ INTO USER1.TAB (c1, c2, c3,
c4... c11) SELECT...;
As opposed to ORA-01555 which is a nightmare to debug and resolve,
ORA-30036 doesn't present such problems because it's thrown by the same
transaction that causes it. The cause is very simple: the current
transaction ran out of space in the undo segment. At this point, I'd like
to remind you of the common saying that disks are cheap, especially is
someone else is buying them. So, what you need to do is to figure out which
transaction is causing the problem. Second thing is figuring out what to do
about that. I've frequently seen this with data purges. If you want to get
rid of 1/3 of the rows in a billion rows table, based on a date coliumn, it
is conceivable that you might need few KB of the UNDO space. The solution
is usually to write a PL/SQL procedure which will clean the rows in batches
of 100K.
So, what were you doing? Can you post the SQL which has thrown the error?
Without the exact statement, cardinality of the tables involved and rough
estimate how much data are you modifying, it's hard to tell you anything
more.
Hello Listers, Its version 11.2.0.4 of oracle. We suddenly started
encountering ORA-30036(ORA-30036: unable to extend segment by 8 in undo
tablespace 'UNDOTBS2') in one of the databases. We don't have any changes
done to the code. We already increased the UNDO tablespace size from ~100Gb
to ~190GB but still facing the same and this time we want to find the root
cause rather than keep on increasing the size of tablespace.
During one of the failures I saw DBA_UNDO_EXTENTS was showing almost all
of the extent status as UNEXPIRED. dba_free_space was showing zero space
for that tablespace. We have the data files in the undo tablespace set as
autoextend ON. And we are using AUTO undo management with UNDO retention
set as 900.
I understand there are two types of UNDO noted by oracle , one is UNDO
read which the SELECT query sometimes fails with Ora-01555 (but here we are
not encountering that). The other one is UNDO generation because of the
DML(INSERT/UPDATE/DELETE) and in this case our failure is because of the
same UNDO and each time it's mostly INSERT queries failing while doing data
load. So is there any way I can track the exact session/sql/user which is
generating maximum UNDO from any historical AWR views and also during run
time?
How to debug from history and get the cause of this sudden increase in
UNDO space consumption? Any other fix other than increasing UNDO space?
Regards
pap
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217https://dbwhisperer.wordpress.com
-- http://www.freelists.org/webpage/oracle-l
Loading...