今天在一個報表數據庫後台發現了這個錯誤簡單描述一下問題的解決過程
詳細的錯誤信息為
Fri Feb ::
Errors in file /u/oracle/admin/repdb/bdump/repdb_j_trc:
ORA: internal error code arguments: [] [xADEEE] [] [] [] [] [] []
Fri Feb ::
Errors in file /u/oracle/admin/repdb/bdump/repdb_j_trc:
ORA: internal error code arguments: [] [xADEEE] [] [] [] [] [] []
進一步檢查對應的trace文件
bash$ more /u/oracle/admin/repdb/bdump/repdb_j_trc
/u/oracle/admin/repdb/bdump/repdb_j_trc
Oraclei Enterprise Edition Release bit Production
With the Partitioning OLAP and Oracle Data Mining options
JServer Release Production
ORACLE_HOME = /data/oracle/product/
System name: SunOS
Node name: newreport
Release:
Version: Generic_
Machine: sunu
Instance name: repdb
Redo thread mounted by this instance:
Oracle process number:
Unix process pid: image: oracle@newreport (J)
*** SESSION ID:() ::
*** ::
ksedmp: internal or fatal error
ORA: internal error code arguments: [] [xADEEE] [] [] [] [] [] []
Current SQL statement for this session:
DECLARE job BINARY_INTEGER := :job; next_date DATE := :mydate; broken BOOLEAN := FALSE; BEGIN P_GENERATE_REPDATA(FRT
); :mydate := next_date; IF broken THEN :b := ; ELSE :b := ; END IF; END;
Call Stack Trace
calling call entry argument values in hex
location type point (? means dubious value)
ksedmp()+ CALL ksedst()+ FFFFFFFFFFF ?
? ?
E ?
FFFFFFFFFFFCC ?
DC ?
kgeriv()+ PTR_CALL ? ?
D ? C ?
D ? DC ?
kgesiv()+ CALL kgeriv()+ DE ? C ?
? C ?
FFFFFFFFFFF ?
DB ?
kgesic()+ CALL kgesiv()+ DE ? C ?
AD ? ?
FFFFFFFFFFF ?
?
kglgob()+ CALL kgesic()+ DE ? C ?
AD ? ?
ADEEE ? A ?
kgldpo()+ CALL kglgob()+ ? ?
ADEEE ?
FFFFFFFFFFFA ?
? ?
kgldon()+ CALL kgldpo()+ ? ?
ADA ? ?
?
FFFFFFFFFFFBEE ?
pkldon()+ CALL kgldon()+ DE ?
FFFFFFFFFFFDE ?
ADA ? ?
?
FFFFFFFFFFFDE ?
pkloud()+ CALL pkldon()+ FFFFFFFFFFFAA ?
FFFFFFFFFFFDE ?
ADA ? ?
?
FFFFFFFFFFFDE ?
phnnrl_name_resolve CALL pkloud()+ FCA ?
_by_loading()+ FFFFFFFFFFFEC ?
? ?
? ACE ?
phngdl_get_defining CALL phnnrl_name_resolve ? ?
_libunit()+ _by_loading()+ FFFFFFFFFFF ?
FFFFFFFFFFF ?
? ?
phnrpls_resolve_pre CALL phngdl_get_defining FFFFFFFFFFF ?
fix_libscope()+ _libunit()+ FFFFFFFFFFFA ?
FFFFFFFFFFFC ?
? ?
?
無論是從trace文件對應的名稱還是從trace文件中對應的語句都可以確定引起問題的是一個JOB檢查metalinkOracle在文檔Doc ID 中對這個錯誤的已知bug進行了匯總不過這些bug的描述似乎沒有和當前十分相符的
查看文檔的描述發現ORA錯誤的第二個參數這裡是xADEEE代表Library Cache Object Handle看來問題可能和LATCH有關
但是根據信息在V$LATCH和V$LATCH_CHILDREN視圖中沒有找到有價值的信息
這個JOB由於失敗會自動再次執行檢查JOB運行時的V$LOCK信息
SQL> SELECT ADDR TYPE ID ID LMODE REQUEST BLOCK
FROM V$LOCK
WHERE SID = ;
ADDR TY ID ID LMODE REQUEST BLOCK
CU E+
F JQ
從V$LOCK中看不到什麼特別有價值的信息接著檢查V$SESSION_WAIT看看這個JOB在等待什麼
SQL> SELECT EVENT PTEXT PRAW PTEXT PRAW STATE
FROM V$SESSION_WAIT
WHERE SID = ;
EVENT PTEXT PRAW PTEXT PRAW STATE
library cache pin handle address ADEEE pin address BAA WAITING
這次的信息就明顯了ORA錯誤的第二個參數就是V$SESSION_WAIT視圖的PRAW的值而且從等待事件上也可以看到問題就是出現在LIBRARY CACHE PIN的過程中
重新查看METALINK的信息這個錯誤可能發生在一個長時間運行的進程在其運行過程中所依賴的對象被編譯或者刪除了
檢查JOB調用的過程的狀態
SQL> SELECT OWNER OBJECT_NAME OBJECT_TYPE STATUS
FROM DBA_OBJECTS
WHERE OWNER = FUJIANREP
AND OBJECT_NAME = P_GENERATE_REPDATA;
OWNER OBJECT_NAME OBJECT_TYPE STATUS
FUJIANREP
果然問題過程處於不正常的狀態
將JOB至於BROKEN狀態避免JOB再次運行
SQL> EXEC DBMS_JOBBROKEN( TRUE)
PL/SQL procedure successfully completed
SQL> COMMIT;
Commit complete
殺掉JOB對應的PROCESS
SQL> SELECT SPID FROM V$PROCESS WHERE ADDR IN (SELECT PADDR FROM V$SESSION WHERE SID = );
SPID
SQL> HOST kill
下面用重新編譯該過程
SQL> ALTER PROCEDURE P_GENERATE_REPDATA COMPILE;
ALTER PROCEDURE P_GENERATE_REPDATA COMPILE
*
ERROR at line
:
ORA
: timeout occurred while waiting to lock object FUJIANREP
P_GENERATE_REPDATA
由於從V$LOCK和V$LATCH無法得到信息只能看看有沒有其他人當前在訪問P_GENERATE_REPDATA所依賴的對象
SQL> SELECT * FROM V$ACCESS
WHERE (OWNER OBJECT) IN
(SELECT REFERENCED_OWNER REFERENCED_NAME
FROM DBA_DEPENDENCIES
WHERE OWNER = FUJIANREP
AND NAME = P_GENERATE_REPDATA);
SID OWNER OBJECT TYPE
FUJIANREP CAT_BUYER SYNONYM
FUJIANREP CAT_CATEGORY SYNONYM
FUJIANREP CAT_DOSEAGE_FORM SYNONYM
FUJIANREP CAT_DRUG SYNONYM
FUJIANREP CAT_ENTERPRISE SYNONYM
FUJIANREP CAT_METRIC SYNONYM
FUJIANREP CAT_ORG SYNONYM
FUJIANREP CAT_PRODUCT SYNONYM
FUJIANREP CAT_QUALITY_DEFINE SYNONYM
FUJIANREP GOV_CAT_BUYER TABLE
FUJIANREP GOV_CAT_ENTERPRISE TABLE
FUJIANREP GOV_S_MO_BU TABLE
FUJIANREP GOV_S_MO_BU_EN TABLE
FUJIANREP GOV_S_MO_BU_PR TABLE
FUJIANREP GOV_S_MO_EN TABLE
FUJIANREP GOV_S_MO_ME TABLE
FUJIANREP GOV_S_MO_ME_CA TABLE
FUJIANREP GOV_S_MO_ME_PR TABLE
FUJIANREP GOV_S_MO_ORDER TABLE
FUJIANREP GOV_S_YE_ORDER TABLE
FUJIANREP GRP_HOSPITAL TABLE
FUJIANREP GRP_LEVEL TABLE
FUJIANREP ORD_ORDER TABLE
FUJIANREP ORD_ORDER_ITEM TABLE
FUJIANREP ORD_ORDER_ITEM_REP CURSOR
FUJIANREP ORD_ORDER_RECEIVE TABLE
FUJIANREP ORD_ORDER_RECEIVE_REP SYNONYM
FUJIANREP ORD_ORDER_REP CURSOR
FUJIANREP ORD_ORDER_RETURN TABLE
FUJIANREP ORD_ORDER_RETURN_REP CURSOR
FUJIANREP PLT_PLAT CURSOR
FUJIANREP USER_TAB_PARTITIONS CURSOR
NDMAIN CAT_BUYER TABLE
NDMAIN CAT_CATEGORY TABLE
NDMAIN CAT_DOSEAGE_FORM TABLE
NDMAIN CAT_DRUG TABLE
NDMAIN CAT_ENTERPRISE TABLE
NDMAIN CAT_METRIC TABLE
NDMAIN CAT_ORG TABLE
NDMAIN CAT_PRODUCT TABLE
NDMAIN CAT_QUALITY_DEFINE TABLE
NDMAIN ORD_ORDER VIEW
NDMAIN ORD_ORDER_ITEM VIEW
NDMAIN ORD_ORDER_RECEIVE VIEW
NDMAIN ORD_ORDER_RETURN VIEW
NDMAIN PLT_PLAT TABLE
PUBLIC USER_TAB_PARTITIONS SYNONYM
SYS STANDARD PACKAGE
SYS STANDARD PACKAGE
SYS SYS_STUB_FOR_PURITY_ANALYSIS PACKAGE
SYS USER_TAB_PARTITIONS VIEW
rows selected
對象果然被其他人所訪問看看這個會話在做什麼
SQL> SELECT SID SERIAL# USERNAME PROGRAM TERMINAL
FROM V$SESSION
WHERE SID = ;
SID SERIAL# USERNAME PROGRAM TERMINAL
FUJIANREP PlSqlDevexe LIBY
沒想到是同事的連接的會話看看他在干什麼
SQL> SELECT SQL_TEXT FROM V$SQL
WHERE ADDRESS IN
(SELECT SQL_ADDRESS FROM V$SESSION
WHERE SID = );
SQL_TEXT
ALTER TABLE GOV_S_MO_EN TRUNCATE PARTITION P
居然是TRUNCATE分區操作難怪會導致過程處於INVALID狀態不過這個操作應該不會持續很長時間的難道這個操作一直沒有完成嗎
SQL> SELECT EVENT PTEXT P PTEXT P PTEXT P SECONDS_IN_WAIT
FROM V$SESSION_WAIT WHERE SID = ;
EVENT PTEXT P PTEXT P PTEXT P SECONDS_IN_WAIT
db file sequential read file# block# blocks
這個等待已經發生了幾十天了顯然這是一個僵死的會話
從後台kill掉對應的進程
SQL> SELECT SPID FROM V$PROCESS
WHERE ADDR IN (SELECT PADDR FROM V$SESSION WHERE SID = );
SPID
SQL> HOST kill
切換為FUJIANREP用戶再次編譯過程
SQL> ALTER PROCEDURE P_GENERATE_REPDATA COMPILE;
Procedure altered
至此問題解決將JOB重新設置BROKEN即可
SQL> EXEC DBMS_JOBBROKEN( FALSE)
PL/SQL procedure successfully completed
SQL> COMMIT;
Commit complete
From:http://tw.wingwit.com/Article/program/Oracle/201311/17705.html