今天做测试数据导入的时候,执行imp过程中alert.log提示了错误,
Hex dump of (file 10, block 220258) in trace file /DATA/app/ora11g/diag/rdbms/wdep/WDEP/trace/WDEP_ora_23477.trc Corrupt block relative dba: 0x02835c62 (file 10, block 220258) Bad check value found during buffer read Data in bad block: type: 6 format: 2 rdba: 0x02835c62 last change scn: 0x0000.000b1e8c seq: 0x5 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x1e8c0605 check value in block header: 0xab9c computed block checksum: 0x4 Reading datafile '/DATA/oracle/oradata_WDEP/WDEP/acpt_dat01.dbf' for corruption at rdba: 0x02835c62 (file 10, block 220258) Reread (file 10, block 220258) found same corrupt data (no logical check) Thu Jan 18 14:17:22 2018 Corrupt Block Found TSN = 9, TSNAME = ACPE RFN = 10, BLK = 220258, RDBA = 42163298 OBJN = 15099, OBJD = 15099, OBJECT = TBL, SUBOBJECT = TBL_2014Q3 SEGMENT OWNER = XNDEP, SEGMENT TYPE = Table Partition Errors in file /DATA/app/ora11g/diag/rdbms/wdep/WDEP/trace/WDEP_ora_23477.trc (incident=8239): ORA-01578: ORACLE data block corrupted (file # 10, block # 220258) ORA-01110: data file 10: '/DATA/oracle/oradata_WDEP/WDEP/acpt_dat01.dbf' Incident details in: /DATA/app/ora11g/diag/rdbms/wdep/WDEP/incident/incdir_8239/WDEP_ora_23477_i8239.trc Errors in file /DATA/app/ora11g/diag/rdbms/wdep/WDEP/trace/WDEP_ora_23477.trc (incident=8240): ORA-01578: ORACLE data block corrupted (file # 10, block # 220258) ORA-01110: data file 10: '/DATA/oracle/oradata_WDEP/WDEP/acpe_dat01.dbf' Incident details in: /DATA/app/ora11g/diag/rdbms/wdep/WDEP/incident/incdir_8240/WDEP_ora_23477_i8240.trc Thu Jan 18 14:17:31 2018 Dumping diagnostic data in directory=[cdmp_20180118141731], requested by (instance=1, osid=23477), summary=[incident=8239]. Thu Jan 18 14:17:31 2018 Sweep [inc][8239]: completed
从错误信息看,ORA-01578提示10号文件的220258块corrupted,而且提示很明确,是ACPE表空间,object_name是TBL表,他是一张分区表,出现错误的块属于TBL_2014Q3子分区,
TSN = 9, TSNAME = ACPE RFN = 10, BLK = 220258, RDBA = 42163298 OBJN = 15099, OBJD = 15099, OBJECT = TBL, SUBOBJECT = TBL_2014Q3 SEGMENT OWNER = XNDEP, SEGMENT TYPE = Table Partition
此时执行select count(*)等操作,会因为有corrupted的块,提示错误,无法直接得到检索结果,只有检索非TBL_2014Q3分区才能有返回。
测试库没有开启备份,TBL是一张分区表,数据量上亿了,重新执行一次导入操作,效率实在太低,作为测试数据,允许删除些数据,若可以将有corrupted的数据删除,就可以继续操作。
Maclean Liu这篇文章《利用构造ROWID实现无备份情况下绕过ORA-1578、ORA-8103、ORA-1410等逻辑/物理坏块问题》(http://www.askmaclean.com/archives/build-rowid-workaround-1578-1410-8103.html),提出了可以使用MOS的脚本《SCRIPT to skip ORA-8103 ORA-1410 ORA-1578 ORA-600 [kdsgrp1] when reading a TABLE (Doc ID 1527738.1)》,来做无备份情况下绕过ORA-01578,对于数据的恢复。
他的整体思路,就是逐行读取原表数据,碰见一些ORA错误,就将其记录于错误信息表,包括rowid和错误号,将正常数据存入备份表,这张备份表就忽略了,所有corrupted块的数据,进而可以提供正常的访问。
分为两种场景,有索引可以用来获取ROWID,没有索引、索引不可用,必须通过dbms_rowid.ROWID_CREATE来构造ROWID,由于上面示例中,索引尚未建立,所以选择第二种方法,如下是操作过程。
- 创建备份数据表,用于存储可用的原表数据,
SQL create table tbl_new as select * from tbl where 1=2; Table created.
- 创建错误信息表,用于存储错误的行rowid和错误号,
SQL create table bad_rows (row_id rowid, oracle_error_code varchar2(50)); Table created.
- 执行脚本,
SQL DECLARE 2 TYPE RowIDTab IS TABLE OF ROWID INDEX BY BINARY_INTEGER; 3 CURSOR Crowid_info IS 4 select Do.DATA_OBJECT_ID dataid, 5 DE.FILE_ID fid, 6 DE.BLOCK_ID blkid, 7 DE.BLOCKS blkcnt 8 from dba_objects DO, dba_extents DE 9 where DO.OBJECT_NAME = '&TABNAME' 10 and DE.PARTITION_NAME='&PARTITION_NAME' --如果是非分区表,可以注释 11 and nvl(DO.SUBOBJECT_NAME,'-1') = nvl(DE.PARTITION_NAME,'-1') 12 and DO.OBJECT_NAME = DE.SEGMENT_NAME 13 and DO.owner = '&OWNER' 14 order by 1, 2, 3 asc; 15 bad_rows number := 0; 16 errors varchar2(500); 17 error_code varchar2(500); 18 myrowid rowid; 19 BEGIN 20 /* Maclean Liu http://www.askmaclean.com * Copy Right 2013-4-20 */ 21 execute immediate 'alter session set commit_write=''batch,nowait'' '; --设置commite_write,为了提升效率
22 for i in Crowid_info loop 23 for j in 0 .. i.blkcnt - 1 loop --逐个数据块操作
24 for z in 0 .. 2000 loop --2000行记录逐条遍历25 begin 26 myrowid := dbms_rowid.ROWID_CREATE(1, 27 i.dataid, 28 i.fid, 29 i.blkid + j, 30 z); 31 insert into &backup_table --向备份数据表,执行插入操作
32 select /*+ ROWID(A) */ 33 * 34 from &source_table A 35 where rowid = myrowid; 36 EXCEPTION 37 when OTHERS then --碰见ORA-01410/ORA-08103/ORA-01578错误,则立即记录错误信息表
38 BEGIN 39 errors := SQLERRM; 40 error_code := SQLCODE; 41 if (error_code like '%1410%' or error_code like '%8103%' or error_code like '%1578%') then 42 bad_rows := bad_rows + 1; 43 insert into bad_rows values (myrowid, error_code); 44 commit; 45 else 46 raise; 47 end if; 48 END; 49 commit; 50 end; 51 end loop; 52 end loop; 53 end loop; 54 dbms_output.put_line('Total Bad Rows: ' || bad_rows); 55 commit; 56 END; 57 /
输入参数,分别为待恢复的表名、分区名称、表所属用户名、备份数据表名、原表名称,
Enter value for tabname: TBL old 9: where DO.OBJECT_NAME = '&TABNAME' new 9: where DO.OBJECT_NAME = 'TBL' Enter value for partition_name: TBL_2014Q3 old 10: and DE.PARTITION_NAME='&PARTITION_NAME' new 10: and DE.PARTITION_NAME='TBL_2014Q3' Enter value for owner: XNDEP old 13: and DO.owner = '&OWNER' new 13: and DO.owner = 'XNDEP' Enter value for backup_table: TBL_NEW old 31: insert into &backup_table new 31: insert into TBL_NEW Enter value for source_table: TBL old 34: from &source_table A new 34: from TBL A
接下来就是等待,可以检索错误信息表,
SQL select * from bad_rows where rownum 11; ROW_ID ORACLE_ERROR_CODE ------------------ -------------------------------------------------- AAADr7AAKAAAwCAAFK -8103 AAADr7AAKAAAwCAAFL -8103 AAADr7AAKAAAwCAAFM -8103 AAADr7AAKAAAwCAAFN -8103 AAADr7AAKAAAwCAAFO -8103 AAADr7AAKAAAwCAAFP -8103 AAADr7AAKAAAwCAAFQ -8103 AAADr7AAKAAAwCAAFR -8103 AAADr7AAKAAAwCAAFS -8103 AAADr7AAKAAAwCAAFT -8103 10 rows selected.
记录了ORA-01410、ORA-08103和ORA-01578这三种错误,
SQL select distinct oracle_error_code from bad_rows; ORACLE_ERROR_CODE -------------------------------------------------- -1410 -8103 -1578
这三种错误解释,
SQL !oerr ora 1410 01410, 00000, "invalid ROWID" // *Cause: // *Action:
SQL !oerr ora 8103 08103, 00000, "object no longer exists" // *Cause: The object has been deleted by another user since the operation // began, or a prior incomplete recovery restored the database to // a point in time during the deletion of the object. // *Action: Delete the object if this is the result of an incomplete // recovery.
SQL !oerr ora 1578 01578, 00000, "ORACLE data block corrupted (file # %s, block # %s)" // *Cause: The data block indicated was corrupted, mostly due to software // errors. // *Action: Try to restore the segment containing the block indicated. This // may involve dropping the segment and recreating it. If there // is a trace file, report the errors in it to your ORACLE // representative.
但说实话,由于上面采用的是INSERT,将正常数据导入的方式,对于这种上亿数据量的表,并不是很合适,一次次执行插入,非常耗时,对于一些数据量适中的表,可以说这是一种简便的方法,可以绕开ORA-01578的错误,保证正常的数据可以访问。
除此之外,MOS有以下两个脚本,在10231事件错误的情况下,可以绕开堆表ORA-00600,以及IOT索引组织表ORA-00600,朋友们可以参考,
SKIP ORA-600 in a Table
This is useful when the ORA-600 is produced by a non-existent chained row (invalid nrid) like ORA-600 [kdsgrp1] and when event 10231 does not work.
SKIP ORA-600 in IOT
This is useful when the ORA-600 is produced by a non-existent chained row (invalid nrid) like ORA-600 [kdsgrp1] and when event 10231 does not work for an Index Organized Table (IOT).如果您觉得本文有帮助,欢迎朋友们关注转发:bisal的个人杂货铺,