Friday, August 6, 2010

Huge number of oracle processes and oracle is not releasing process

Problem Symptoms
In the database there is huge number of Oracle database processes and within several hours the processes reached the max limit of PROCESSES parameter. No one then can connect to database and therefore it is needed to restart the Server or need to kill the oracle process from operating system.

Recent changes are the,
1)Enable DCD.
2)SQLNET.EXPIRE_TIME is set to 1 or more.

Cause of the Problem
This problem is specific with Oracle Database release 10.2.0.3 and on Windows Platform. It is due to the the DCD (dead connection detection) mechanism. The orphaned processes are not being cleaned up even though DCD is enabled and SQLNET.EXPIRE_TIME is set to 1 or more.

Also the number of processes exceeds too much than the number of sessions.
SQL> select count(*) from v$session;

COUNT(*)
----------
55

SQL> select count(*) from v$process;

COUNT(*)
----------
600
In this case, the orphaned resources are not released if only the client application is terminated. After only the client computer has been rebooted, the DCD releases those resources. For example, a Windows application is killed but Windows machine remains running, the probe packet may be received and discarded as if the connection is still active. As it currently stands, it appears that DCD detects dead client machines, but not dead client processes.

Solution of the Problem
1. Comment the SQLNET.EXPIRE_TIME parameter in your sqlnet.ora file.
2. Reboot the server. Server reboot is funny thing to solve the problem.
3. Set KeepAlive mechanism in Windows 2000/NT.
To do so,
- Start > Run > Regedit.exe
- In the Registry, navigate to:
HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet/Services/Tcpip/Parameters

- Edit - Add value: KeepAliveTime (REG_DWORD)
Decimal -> 180000 (for example = 3 minutes)

- The parameter will not be present the first time you modify the parameter, so you will need to add it. The default value is 2 hours.
4. Watch to see if there are any check if orphaned SHAD processes that exist.

Monday, August 2, 2010

ORA-28056: Writing audit records to Windows Event Log failed

Problem Description
Connecting as sysdba fails with ORA-28056.

$ sqlplus / as sysdba
ORA-28056: Writing audit records to Windows Event Log failed.

Cause of the Problem
The problem happened because Event Viewer log is full and not able to log anymore events.

Solution of the Problem
You need to free an event log.

- Log in as a administrator or a member of the Administrators group in order to free an event log.

- Open Event Viewer. To do so click on Start Menu, click Control Panel, click Performance and Maintenance, click Administrative Tools, and then double-click Event Viewer.

Or, click on Start Menu, click Control Panel, click Administrative Tools, and then double-click Event Viewer.

- Now you can follow any of the following steps.

1) When a log is full, it stops recording new events. Clearing the log is one way to free the log and start recording new events. To do so, on the Action menu (left click)or on the left side Application/System/Security (as available) right click and select Clear All events.

2) You can also free a log and start recording new events by overwriting old events. To overwrite events, on the Action menu, or on the left side Application/System/Security (as available) click Properties, and then click Overwrite events as needed. This ensures that all new events are written to the log, even when the log is full.

3) You can also start logging new events by increasing the maximum log size. To increase the log size, on the Action menu or on the left side Application/System/Security (as available) click Properties, and then increase the Maximum log size by typing a bigger value.

Sunday, August 1, 2010

ORA-12640, ORA-21561: OID generation failed

Problem Description
In the sqlnet.ora file there is an entry SQLNET.AUTHENTICATION_SERVICES=NONE and now whenever you connect to database using RMAN it fails with following error.
ORA-21561: OID generation failed

Whenever you change the sqlnet.ora entry to SQLNET.AUTHENTICATION_SERVICES=NTS, it fails with error ORA-12640: Authentication adapter initialization failed.

Cause of the Problem
The problem happened due to lower value of Windows SharedSection heap size.

Solution of the Problem
1. Click on start menu on your windows machine and then go to Run.

2. Type regedit in the Run box and press Enter.

3. In the Registry Editor window navigate to
\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\SubSystems\
You will see key name Windows.

4. Right click on windows name and select modify. You should see value data field containing an entry like below.

%SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows SharedSection=1024,3072,512 Windows=On SubSystemType=Windows ServerDll=basesrv,1 ServerDll=winsrv:UserServerDllInitialization,3 ServerDll=winsrv:ConServerDllInitialization,2 ProfileControl=Off MaxRequestThreads=16

From the field value data the parameter SharedSection's third value (512) is the size of the desktop heap for each desktop that is associated with a "noninteractive" window station.

Increase the desktop heap size that is modify the third value to 1024, so the values are now listed as 1024, 3072, 1024.

After you modify the entry try to connect to database using RMAN and it should resolve the problem.

General Solutions of ORA-12547: TNS: lost contact

In the post Troubleshoot ORA-12547: TNS: lost contact it is already discussed about the specific causes and solutions of ORA-12547: TNS: lost contact. If you don't have those specific scenarios then this post will provide you general solutions for error ORA-12547 which might help you to avoid this error.

1. Verify that environmental variable ORACLE_HOME, ORACLE_SID, PATH, LD_LIBRARY_PATH are correct.

$ echo $ORACLE_HOME
$ echo $ORACLE_SID
$ echo $LD_LIBRARY_PATH
$ echo $PATH


2. On AIX, HP-UX, Linux, MacOSX, Solaris, Tru64 system make sure kernel parameters settings are ok. To set, check and modify your kernel parameters have a look at the post How to check and modify Kernel Parameters in Linux
and Install Oracle on Linux.

3. There may have incorrect permissions on the ORACLE.exe
The 'ls' command should show permissions 6751 (as follows)

$ cd $ORACLE_HOME/bin
$ ls -l oracle


The output should be
-rwsr-s--x 1 oracle dba

If not then please execute the following
$ chmod 6751 oracle

4. Check whether current ulimit settings is ok.
$ ulimit -a

A sample output is,
$ ulimit -a 
    core file size        (blocks, -c) unlimited 
    data seg size         (kbytes, -d) unlimited 
    file size             (blocks, -f) unlimited 
    open files                    (-n) 256 
    pipe size          (512 bytes, -p) 10 
    stack size            (kbytes, -s) 8192 
    cpu time             (seconds, -t) unlimited 
    max user processes            (-u) 29995 
    virtual memory        (kbytes, -v) unlimited 
If open files (-n) is set to lower then make it unlimited or set it to a higher value, for example:
$ulimit -n 4096

5. Check the permission on the library files under $ORACLE_HOME/lib and $ORACLE_HOME/lib32 are ok. Since since sqlplus and other binaries require these libraries so if sqlplus does not have permission on these file it may throw ORA-12517: TNS: lost contact. It should have 755 on unix system.

You can give 755 permissions to $ORACLE_HOME/lib and $ORACLE_HOME/lib32
$ chmod -R 755 $ORACLE_HOME/lib
$ chmod -R 755 $ORACLE_HOME/lib32
$ cd $ORACLE_HOME/bin
$ relink all