Sunday, September 21, 2014

Informatica Audit Framework

Different name but the requirement or functionality is same:-


1) An ETL/Informatica Framework for Operational Metadata Logging

2) Process Control / Audit of Workflows in Informatica 

3) ETL Audit and Control information

4) A Scheduler framework for Audit - Balance and Control

5) Collecting Session Stat for audit purpose

below is the approach to implement the above requirement:-

Approach:-

The above requirement can be implemented using Informatica repository Views:-

The 2 vies used for this framework are:-
REP_SESS_LOG and REP_SESS_TBL_LOG one can find these view in the data base where Informatica repository server installed.

Let's understand what blow query will do. 

 Both query's will return last session run related records.

for example in the development Env we have 6 months of data in the views, out of that we will get only last run information for each session in a folder (as we are filtering data based on the folder first) **Where SUBJECT_AREA=PROJECT_FOLDER.





Querys:-

SELECT 
A.SUBJECT_AREA               AS FOLDER_NAME,
B.WORKFLOW_NAME              AS WORKFLOW_NAME     ,    
B.SESSION_NAME               AS SESSION_NAME, 
B.MAPPING_NAME               AS MAPPING_NAME,
A.TYPE_NAME                  AS TABLE_TYPE_NAME ,
A.TABLE_NAME                 AS TARGET_TABLE_NAME ,
A.TABLE_INSTANCE_NAME        AS TARGET_INSTANCE_NAME,
B.SUCCESSFUL_SOURCE_ROWS     AS TOTAL_SRC_RECORD_COUNT,
B.FAILED_SOURCE_ROWS         AS REJECTED_SRC_RECORD_COUNT,
A.SUCCESSFUL_AFFECTED_ROWS   AS AFFECTED_TGT_RECORD_COUNT,
A.FAILED_ROWS                AS REJECTED_TGT_RECORD_COUNT,
TO_CHAR(B.ACTUAL_START,'MM/DD/YYYY HH24:MI:SS') AS SESSION_START_DATE, 
TO_CHAR(B.SESSION_TIMESTAMP,'MM/DD/YYYY HH24:MI:SS') AS SESSION_END_DATE, 
TO_CHAR(A.START_TIME  ,'MM/DD/YYYY HH24:MI:SS') AS TARGET_LOAD_START_TIME,
TO_CHAR(A.END_TIME,'MM/DD/YYYY HH24:MI:SS')  AS    TARGET_LOAD_END_TIME,
THROUGHPUT                   AS TARGET_THROUGHPUT,
B.FIRST_ERROR_MSG            AS FIRST_ERROR_MSG,
B.FIRST_ERROR_CODE           AS FIRST_ERROR_CODE,
B.LAST_ERROR                 AS LAST_ERROR,
B.LAST_ERROR_CODE            AS LAST_ERROR_CODE
FROM 
DEV9.REP_SESS_TBL_LOG A , DEV9.REP_SESS_LOG B
WHERE 
A.WORKFLOW_ID=B.WORKFLOW_ID
AND 
A.SESSION_ID=B.SESSION_ID   
AND 
A.END_TIME<=B.ACTUAL_START
AND 
A.END_TIME>=B.SESSION_TIMESTAMP
AND
A.SUBJECT_AREA='BI_CORE_Work'            
ORDER BY 
A.START_TIME DESC;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
SELECT * FROM (SELECT 
SUBJECT_AREA AS FOLDER_NAME, 
WORKFLOW_NAME,
SESSION_NAME, 
MAPPING_NAME,
TRUNC (MOD ( (SESSION_TIMESTAMP - ACTUAL_START)* 24, 24))
|| ' HOUR '|| TRUNC (MOD ( (SESSION_TIMESTAMP - ACTUAL_START) * 24 * 60, 60))
|| ' MIN '|| TRUNC ( MOD ( (SESSION_TIMESTAMP - ACTUAL_START) * 24 * 60 * 60, 60))
|| ' SEC ' AS RUN_TIME,
ACTUAL_START AS SESSION_START_DATE, 
SESSION_TIMESTAMP AS SESSION_END_DATE, 
SUCCESSFUL_SOURCE_ROWS AS SRC_SUCCESSFUL_RECORD_COUNT, 
SUCCESSFUL_ROWS AS        TGT_SUCCESSFUL_RECORD_COUNT,
FAILED_SOURCE_ROWS AS     SRC_REJECTED_RECORD_COUNT, 
FAILED_ROWS AS            TGT_REJECTED_ROWS_COUNT,
FIRST_ERROR_CODE  AS ERROR_CODE,
DECODE (RUN_STATUS_CODE, 1, 'SUCCEEDED', 2, 'DISABLED', 3, 'FAILED', 4, 'STOPPED', 5, 'ABORTED',
6, 'RUNNING', 7, 'SUSPENDING', 8, 'SUSPENDED', 9, 'STOPPING', 10, 'ABORTING', 11, 'WAITING',
12, 'SCHEDULED', 13, 'UNSCHEDULED', 14, 'UNKNOWN', 15, 'TERMINATED', 'UNKNOWN') AS RUN_STATUS,
FIRST_ERROR_MSG AS FIRST_ERR_OF_THE_MAPPING,
LAST_ERROR AS LAST_ERR_OF_THE_MAPPING,
ROW_NUMBER() OVER(PARTITION BY SESSION_ID,WORKFLOW_ID 
ORDER BY workflow_run_id DESC) AS LATES_VESRION_ID
FROM 
DEV9.REP_SESS_LOG
WHERE
 SUBJECT_AREA='BI_CORE_Work' AND SESSION_NAME='s_m_veeba_bi_load_STG_SRVC_stg')
 WHERE LATES_VESRION_ID=1;

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
SELECT 
A.SUBJECT_AREA               AS FOLDER_NAME,
B.WORKFLOW_NAME              AS WORKFLOW_NAME     ,    
B.SESSION_NAME               AS SESSION_NAME, 
B.MAPPING_NAME               AS MAPPING_NAME,
A.TYPE_NAME                  AS TABLE_TYPE_NAME ,
A.TABLE_NAME                 AS TARGET_TABLE_NAME ,
A.TABLE_INSTANCE_NAME        AS TARGET_INSTANCE_NAME,
B.SUCCESSFUL_SOURCE_ROWS     AS TOTAL_SRC_RECORD_COUNT,
B.FAILED_SOURCE_ROWS         AS REJECTED_SRC_RECORD_COUNT,
A.SUCCESSFUL_AFFECTED_ROWS   AS AFFECTED_TGT_RECORD_COUNT,
A.FAILED_ROWS                AS REJECTED_TGT_RECORD_COUNT,
TO_CHAR(B.ACTUAL_START,'MM/DD/YYYY HH24:MI:SS') AS SESSION_START_DATE, 
TO_CHAR(B.SESSION_TIMESTAMP,'MM/DD/YYYY HH24:MI:SS') AS SESSION_END_DATE, 
TO_CHAR(A.START_TIME  ,'MM/DD/YYYY HH24:MI:SS') AS TARGET_LOAD_START_TIME,
TO_CHAR(A.END_TIME,'MM/DD/YYYY HH24:MI:SS')  AS    TARGET_LOAD_END_TIME,
THROUGHPUT                   AS TARGET_THROUGHPUT,
B.FIRST_ERROR_MSG            AS FIRST_ERROR_MSG,
B.FIRST_ERROR_CODE           AS FIRST_ERROR_CODE,
B.LAST_ERROR                 AS LAST_ERROR,
B.LAST_ERROR_CODE            AS LAST_ERROR_CODE
FROM 
DEV9.REP_SESS_TBL_LOG A , DEV9.REP_SESS_LOG B
WHERE 
A.WORKFLOW_ID=B.WORKFLOW_ID
AND 
A.SESSION_ID=B.SESSION_ID   
AND 
A.END_TIME>=B.ACTUAL_START
AND 
A.END_TIME<=B.SESSION_TIMESTAMP
AND
A.SUBJECT_AREA='BI_CORE_Work'            
ORDER BY 
A.START_TIME DESC;

No comments :