Thanks to all who chimed in on what I thought was a DCD issue. It turned out
to be the network all along; it just coincided with our switchover to new
hardware, huge pages, changed initialization parameters, a new version of
the Oracle listener, etc.
What I learned along the way is that the original error gave a real clue as
to where it was failing: look for the lowest-level error number in the stack.
In our case that was 110, the standard Linux OS ETIMEDOUT ("Connection timed
out") error. If the failure had been at the Oracle level, that code would
have been an Oracle error instead.
Fatal NI connect error 12170.
VERSION INFORMATION:
TNS for Linux: Version 11.2.0.3.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.3.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.3.0 - Production
Time: 18-AUG-2014 03:25:46
Tracing not turned on.
Tns error struct:
ns main err code: 12535
TNS-12535: TNS:operation timed out
ns secondary err code: 12560
nt main err code: 505
TNS-00505: Operation timed out
nt secondary err code: 110
nt OS err code: 0
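
A minimal sketch of the "look at the lowest-level code" rule, using the error stack from the log above. The function name deepest_error is made up for illustration, and the errno lookup only gives a meaningful name when run on Linux, where 110 is ETIMEDOUT (other platforms number their errors differently).

```python
import errno
import re

# Error stack copied from the log above.
TNS_ERROR_STRUCT = """\
ns main err code: 12535
ns secondary err code: 12560
nt main err code: 505
nt secondary err code: 110
nt OS err code: 0
"""

def deepest_error(text):
    """Return (layer, code) for the last non-zero 'err code' line."""
    found = None
    for line in text.splitlines():
        m = re.match(r"\s*(.+?) err code:\s*(\d+)\s*$", line)
        if m and int(m.group(2)) != 0:
            found = (m.group(1), int(m.group(2)))
    return found

layer, code = deepest_error(TNS_ERROR_STRUCT)
# errno.errorcode maps numbers to names for the platform Python runs on.
print(layer, code, errno.errorcode.get(code, "unknown"))
```

A code that resolves to an OS errno name points at the OS or network layer rather than at Oracle itself.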
It was a week of torture because this is an Oracle Forms application that
requires a dedicated connection for the entire workday, so any sort of
network flakiness shows up immediately.
Rebooting quite a few of the routers and switches along the path fixed things.
On Fri, Aug 15, 2014 at 11:52 AM, Jeremy Schneider <
Post by Jeremy Schneider
Probably most people wouldn't notice, but I mistakenly got that example
output from the database server. You actually should run it on the client
server. Here's what it looks like from two different clients.
tcp 0 0 ::ffff:192.168.1.216:43940 ::ffff:192.168.1.130:1521 ESTABLISHED keepalive (2271.79/0/0)
tcp 0 0 ::ffff:192.168.1.216:42615 ::ffff:192.168.1.130:1521 ESTABLISHED keepalive (4657.81/0/0)
tcp 110 0 ::ffff:192.168.1.216:40552 ::ffff:192.168.1.130:1521 ESTABLISHED keepalive (1074.00/0/0)
tcp 2970 0 ::ffff:192.168.1.181:60553 ::ffff:192.168.1.170:1521 ESTABLISHED off (0.00/0/0)
tcp 2910 0 ::ffff:192.168.1.181:59678 ::ffff:192.168.1.170:1521 ESTABLISHED off (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.181:60610 ::ffff:192.168.1.170:1521 ESTABLISHED off (0.00/0/0)
tcp 2980 0 ::ffff:192.168.1.181:59744 ::ffff:192.168.1.170:1521 ESTABLISHED off (0.00/0/0)
OS settings are identical on these two servers.
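
With more than a handful of connections, eyeballing the timer column gets tedious. Here is a rough sketch that tallies the timer type per client host. It assumes one connection per line in the column order shown above (as netstat prints it before mail wrapping); the name timer_by_client is made up for illustration, and the sample is three lines taken from the output above.

```python
from collections import Counter

# Three lines from the client-side output above, one connection per line.
SAMPLE = """\
tcp 0 0 ::ffff:192.168.1.216:43940 ::ffff:192.168.1.130:1521 ESTABLISHED keepalive (2271.79/0/0)
tcp 2970 0 ::ffff:192.168.1.181:60553 ::ffff:192.168.1.170:1521 ESTABLISHED off (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.181:60610 ::ffff:192.168.1.170:1521 ESTABLISHED off (0.00/0/0)
"""

def timer_by_client(netstat_text):
    """Count ESTABLISHED connections per (local host, timer type)."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Columns: proto recv-q send-q local foreign state timer (values)
        if len(fields) < 8 or fields[5] != "ESTABLISHED":
            continue
        local_host = fields[3].rsplit(":", 1)[0]  # strip the port
        counts[(local_host, fields[6])] += 1
    return counts

for (host, timer), n in sorted(timer_by_client(SAMPLE).items()):
    print(host, timer, n)
```

Any host that shows up with "off" is a client whose connections are not asking for keepalive.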
-J
--
http://about.me/jeremy_schneider
On Fri, Aug 15, 2014 at 12:42 PM, Jeremy Schneider <
Post by Jeremy Schneider
Adding just two more points, since I have been recently working on DCD
with RH Linux myself.
strace is quite detailed, but a much easier way to do the job is just use
"netstat -nto|grep 1521" or replace 1521 with your listener port if it's
non-default. The "o" option is the magic one for keepalive. In the far
right column you should see the string "keepalive" rather than "off" and it
will tell you the actual amount of time remaining on each keepalive
connection.
tcp 0 0 192.168.1.130:1521 192.168.1.130:22335 ESTABLISHED keepalive (2380.58/0/0)
tcp 0 0 192.168.1.130:1521 192.168.1.104:56698 ESTABLISHED off (0.00/0/0)
tcp 0 0 192.168.1.130:1521 192.168.1.146:56850 ESTABLISHED off (0.00/0/0)
tcp 0 0 192.168.1.130:1521 192.168.1.130:31120 TIME_WAIT timewait (13.21/0/0)
Notice that the connection from the db server to itself (130) above has
keepalive enabled, but the connections from the clients (104 and 146) do not.
Which brings up a second point. We were using the thin JDBC client in some
cases and discovered that keepalive was not enabled by this driver unless
you switched to the long format and explicitly specified "(enable=broken)"
in the long TNS entry. This is in addition to the kernel settings, which
must be correctly configured.
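
To illustrate why some connections show "off": keepalive is a per-socket option that a program has to ask for, and the kernel settings only take effect on sockets that did. The sketch below is plain Python, not Oracle code; that "(enable=broken)" leads the client to request the same socket option is an assumption here.

```python
import socket

# A fresh TCP socket: keepalive is off unless the program asks for it.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
before = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)

# Opt in. The kernel's tcp_keepalive_* settings only matter from here on.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
after = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)

print("SO_KEEPALIVE before:", before, "after:", after)
sock.close()
```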
-Jeremy
--
http://about.me/jeremy_schneider
On Thu, Aug 14, 2014 at 12:43 PM, Riyaj Shamsudeen <
Post by Riyaj Shamsudeen
Hello April,
Since you have set sqlnet.expire_time to 10 minutes, every 10
minutes a TCP/IP packet is sent to that client port. If a TCP ACK is
received within a short interval, then both the tcp_keepalive and SQLNET
timers are reset. If the TCP ACK is not received, then the TCP
retransmission code kicks in: the packet is retransmitted up to
tcp_retries2 (default 15) times, with an exponential backoff controlled by
the TCP retransmission interval.
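
As a back-of-the-envelope check on how long that retransmission phase can last, here is a small sketch. It assumes the retransmission timeout starts at 200 ms, doubles after each unacknowledged send, and is capped at 120 s, which are the usual Linux bounds; the real starting value depends on the measured round-trip time, so treat the result as a ballpark only. The function name is made up for illustration.

```python
def retransmit_window(retries=15, rto_min=0.2, rto_max=120.0):
    """Seconds from first send until the kernel gives up on the connection.

    Assumes the retransmission timeout starts at rto_min, doubles after
    every unacknowledged send, and is capped at rto_max.
    """
    # retries + 1 waits: one after the original send, one after each retry.
    return sum(min(rto_min * 2 ** i, rto_max) for i in range(retries + 1))

total = retransmit_window()
print(f"{total:.1f} s (about {total / 60:.0f} minutes)")
```

Under those assumptions the default of 15 retries works out to roughly a quarter of an hour, nowhere near 2 hours.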
So, in your case, TCP shouldn't kill the connection at 2 hours at
all, from the host side. However, I have seen port-level timeouts in
switch/firewall configurations that are normally set to 2 hours. Check with
the network group to see if that is happening.
a. Create a sqlplus connection from that client machine to the database.
b. Identify the dedicated server process for that connection and strace it:
strace -tttT -o /tmp/dcd.lst -p <pid>
c. Just keep the sqlplus connection idle during this period; don't even
press Enter.
Reading the /tmp/dcd.lst file, you should see packets every 10
minutes. If it dies after 2 hours, then check with firewall/network group.
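
Reading a long trace by hand is tedious, so here is a rough sketch that lists the pauses between traced calls. It assumes the file was written with -ttt (each line starts with epoch seconds) and that the probes show up as system calls in the trace, as described above. The name idle_gaps and the 60-second threshold are arbitrary choices for illustration.

```python
import os

def idle_gaps(lines, min_gap=60.0):
    """Yield (timestamp, gap_seconds) for calls that follow a long pause.

    Expects `strace -ttt` output: each line starts with epoch seconds.
    """
    previous = None
    for line in lines:
        first = line.split(" ", 1)[0]
        try:
            stamp = float(first)
        except ValueError:
            continue  # not a timestamped syscall line
        if previous is not None and stamp - previous >= min_gap:
            yield stamp, stamp - previous
        previous = stamp

TRACE = "/tmp/dcd.lst"  # path from step b
if os.path.exists(TRACE):
    with open(TRACE) as fh:
        for stamp, gap in idle_gaps(fh):
            print(f"{stamp:.0f}: activity after {gap:.0f} s idle")
```

Gaps of about 600 seconds would match the 10-minute expire_time; a trace that simply stops around the 2-hour mark points back at the network.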
Hope this helps,
Cheers
Riyaj Shamsudeen
Principal DBA,
Ora!nternals - http://www.orainternals.com - Specialists in
Performance, RAC and EBS
Blog: http://orainternals.wordpress.com/
Oracle ACE Director and OakTable member <http://www.oaktable.com/>
Co-author of the books: Expert Oracle Practices
<http://tinyurl.com/book-expert-oracle-practices/>, Pro Oracle SQL
<http://tinyurl.com/ahpvms8>, Expert RAC Practices 12c
<http://tinyurl.com/expert-rac-12c>, Expert PL/SQL Practices
<http://tinyurl.com/book-expert-plsql-practices>
Post by April Sims
Need some help in resolving our new idle timeouts seen since going to 12c.
I have a document
Oracle Net 12c: New Implementation of Dead Connection Detection (DCD)
(Doc ID 1591874.1)
We are on Linux RH 64-bit so this is applicable.
# cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
# cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
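
For reference, a small sketch of what these three kernel values mean for an idle connection whose peer has silently gone away. It only applies to sockets that have keepalive enabled and have not overridden the timers per socket; the function name is made up for illustration.

```python
def keepalive_timeline(idle_time, interval, probes):
    """Return (first_probe_s, give_up_s) for an idle connection whose peer is gone."""
    first_probe = idle_time
    # Each unanswered probe is followed by one interval of waiting; after
    # the last probe goes unanswered the connection is dropped.
    give_up = idle_time + interval * probes
    return first_probe, give_up

first, dead = keepalive_timeline(7200, 75, 9)
print(f"first probe after {first / 3600:.1f} h, dead peer dropped after {dead} s")
```

With these values the first kernel probe would not go out until the connection has been idle for 7200 seconds, which is exactly 2 hours.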
sqlnet.ora
SQLNET.EXPIRE_TIME = 10
SQLNET.INBOUND_CONNECT_TIMEOUT = 120
listener.ora
INBOUND_CONNECT_TIMEOUT_LISTENER_listenername = 120
Any suggestions on the changes I need to make to prevent a 2 hour idle
timeout?
thanks,
--
April C. Sims
http://aprilcsims.wordpress.com
Twitter, LinkedIn
Oracle Database 11g - Underground Advice for Database Administrators
https://www.packtpub.com/oracle-11g-database-implementations-guide/book
OCP 8i, 9i, 10g, 11g DBA
Southern Utah University