I just wanted to test a simple rman backup and recovery operation. I took RMAN backup, deleted datafiles and then performed RECOVER DATABASE operation followed by RESTORE DATABASE command. I see my RECOVER DATABASE fails with ORA-00600 [3020] like below.
RMAN> recover database; Starting recover at 20-MAR-11 using channel ORA_DISK_1 starting media recovery media recovery failed RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of recover command at 03/20/2011 21:16:36 ORA-00283: recovery session canceled due to errors RMAN-11003: failure during parse/execution of SQL statement: alter database recover if needed start ORA-00283: recovery session canceled due to errors ORA-00600: internal error code, arguments: [3020], [1], [1674], [4195978], [], [], [], [] ORA-10567: Redo is inconsistent with data block (file# 1, block# 1674) ORA-10564: tablespace SYSTEM ORA-01110: data file 1: 'X:\ORACLE\PRODUCT\10.2.0\ORADATA\KADAVIL\SYSTEM01.DBF' ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 239
From the alert log the entries are following:
Sun Mar 20 21:00:05 2011 alter database recover datafile list 1 , 2 , 3 , 4 , 5 , 6 Completed: alter database recover datafile list 1 , 2 , 3 , 4 , 5 , 6 Sun Mar 20 21:00:05 2011 alter database recover if needed start Media Recovery Start Sun Mar 20 21:00:06 2011 Recovery of Online Redo Log: Thread 1 Group 1 Seq 20 Reading mem 0 Mem# 0 errs 0: X:\ORACLE\PRODUCT\10.2.0\ORADATA\KADAVIL\REDO01.LOG Sun Mar 20 21:00:07 2011 Recovery of Online Redo Log: Thread 1 Group 2 Seq 21 Reading mem 0 Mem# 0 errs 0: X:\ORACLE\PRODUCT\10.2.0\ORADATA\KADAVIL\REDO02.LOG Sun Mar 20 21:00:08 2011 Errors in file x:\oracle\product\10.2.0\admin\kadavil\udump\kadavil_ora_3696.trc: ORA-00600: internal error code, arguments: [3020], [1], [1658], [1], [21], [75], [16], [] ORA-10567: Redo is inconsistent with data block (file# 1, block# 1658) ORA-10564: tablespace SYSTEM ORA-01110: data file 1: 'X:\ORACLE\PRODUCT\10.2.0\ORADATA\KADAVIL\SYSTEM01.DBF' ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 237 Sun Mar 20 21:00:11 2011 Media Recovery failed with error 600 ORA-283 signalled during: alter database recover if needed start ... Sun Mar 20 21:13:26 2011
Cause of the Problem
This is really terrible Oracle bug as now I can't even open my database. My Database is down! In fact there were no transaction going. I just tested simple rman restore & recovery procedures and hit this bug.
From the error, we see this corruption is in the system tablespace, not user tablespace. So this is an internal transaction.
This is a common error when doing recovery if you are not on patched version 10.2.0.3 or higher.
Prior to Oracle 10.2.0.3 it did not record all the changes in the redo stream and when you do recovery you get this error. So, there is an inconsistency between the information stored in the redo and the information stored in a database block being recovered.
After searching to Oracle support I see so many related bugs exist regarding ORA-00600 [3020].
| NB | Bug | Fixed | Description |
| 9847338 | Session hang after applying the patch for Bug 9587912 which causes ORA-600 [3020] | ||
| 11689702 | 12.1.0.0 | ORA-600 [3020] during recovery after datafile RESIZE (to smaller size) | |
| 8774868 | 11.2.0.1.2, 11.2.0.2, 12.1.0.0 | OERI[3020] reinstating primary | |
| + | 8597106 | 11.2.0.2, 12.1.0.0 | Lost Write in ASM when normal redundancy is used |
| 10094823 | 12.1.0.0 | Block change tracking on physical standby can cause data loss | |
| 10071193 | 11.2.0.3, 12.1.0.0 | Lost write / ORA-600 [kclchkblk_3] / ORA-600 [3020] in RAC | |
| P | 8635179 | 10.2.0.5, 11.2.0.2, 12.1.0.0 | Solaris: directio may be disabled for RAC file access. Corruption / Lost Write |
| 9587912 | 11.2.0.2, 12.1.0.0 | ORA-600 [3020] in datafile that went offline/online in a RAC instance | |
| + | 10209232 | 11.2.0.2.1, 12.1.0.0 | ORA-1578 / ORA-600 [3020] Corruption. Misplaced Blocks and Lost Write in ASM |
| + | 10425010 | 12.1 | Stale data blocks may be returned by Flash Cache |
| 8826708 | 10.2.0.5, 11.2.0.2 | ORA-600 [3020] for block type 0x3a (58) during recovery for block restored by RMAN backup | |
| 8230457 | 10.2.0.4.1, 10.2.0.5, 11.1.0.7.1, 11.2.0.1 | Physical standby media recovery gets OERI[krr_media_12] | |
| 4453449 | 10.2.0.2, 11.1.0.6 | OERI:3020 / corruption errors from multiple FLASHBACK DATABASE | |
| 4594917 | 9.2.0.8, 10.2.0.2, 11.1.0.6 | Write IO error can cause incorrect file header checkpoint information | |
| 4637668 | 10.2.0.3, 11.1.0.6 | IMU transactions can produce out-of-order redo (OERI [3020] on recovery) | |
| 5610267 | 10.2.0.5 | MRP terminated by ORA-600[krr_media_12] / OERI:3020 after flashback | |
| 7197445 | 10.2.0.4.1, 10.2.0.5 | Standby Recovery session cancelled due to ORA-600 [3020] "CHANGE IN FUTURE OF BLOCK" | |
| 3560209 | 10.2.0.1 | OERI[3020] stuck recovery under RAC | |
| 3762714 | 9.2.0.7, 10.1.0.4, 10.2.0.1 | ALTER DATABASE RECOVER MANAGED STANDBY fails with OERI[3020] | |
| * | 3381950 | 10.2.0.1 | Backups from RAC DB before Data Guard Failover cannot be used |
| 3397181 | 9.2.0.5, 10.1.0.3, 10.2.0.1 | ALTER SYSTEM KILL SESSION of recovery slave causes stuck recovery | |
| 3535712 | 9.2.0.6, 10.1.0.4 | OERI[3020] / ORA-10567 from RAC with standby in max performance mode | |
| 4594912 | 9.2.0.8, 10.1.0.2 | Incorrect checkpoint possible in datafile headers | |
| 3635331 | 9.2.0.6, 10.1.0.4 | Stuck recovery (OERI:3020) / ORA-1172 on startup after a crash | |
| 2322620 | 9.2.0.1 | OERI:3020 possible on recovery of LOB DATA | |
| P+ | 656370 | 7.3.3.4, 7.3.4.0, 8.0.3.0 | AlphaNT only: Corrupt Redo (zeroed byte) OERI:3020 |
- '*' against a bug indicates that an alert exists for that issue.
- '+' indicates a particularly notable bug.
- 'P' indicates a port specific bug.
Solution of the Problem
Solution 01:
The fix is to do a manual recovery with allow 1 corruption.
That is "recover database allow 1 corruption;" which will skip the bad transaction. We need to repeat this command until the recovery completes.
RMAN> recover database allow 1 corruption; Starting recover at 20-MAR-11 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: sid=320 devtype=DISK starting media recovery media recovery failed RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of recover command at 03/20/2011 21:53:19 ORA-00283: recovery session canceled due to errors RMAN-11003: failure during parse/execution of SQL statement: alter database recover if needed start allow 1 corruption ORA-00283: recovery session canceled due to errors ORA-00600: internal error code, arguments: [3020], [1], [1674], [4195978], [], [], [], [] ORA-10567: Redo is inconsistent with data block (file# 1, block# 1674) ORA-10564: tablespace SYSTEM ORA-01110: data file 1: 'X:\ORACLE\PRODUCT\10.2.0\ORADATA\KADAVIL\SYSTEM01.DBF' ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 239 RMAN> recover database allow 1 corruption; Starting recover at 20-MAR-11 using channel ORA_DISK_1 starting media recovery media recovery complete, elapsed time: 00:00:02 Finished recover at 20-MAR-11
Solution 02:
Solution 01 is recommended. However after you run "recover database allow 1 corruption;" in a row for 10 times still your problem is not resolved then you might need to do incomplete recovery. For example:
SQL> recover database until time 'YYYY-MON-DD:HH:MI:SS';
No comments:
Post a Comment