Saturday, September 20, 2014

Data Base Sequence Issue in Migration

In migration project the common scenario is the sequence.
It might be Trigger commonly used in Oracle DB.
Identity column in Sql server for Sequence.
Identity/Sequence Objects In DB2

For more information on how Identity or Sequence works please refer 

For Triggers:-

Lets see how to handle the above scenario in Netezza.

To generate Sequence in query use the below syntax.

SELECT
         (SELECT Nvl(Max(prod_dw_id),0)FROM   product_dim) 

+ Row_number() OVER (partition BY 'A' ORDER BY 'A') AS sequance,
         rest OF the required columns
FROM     your_table.

If one want to do both insert and update then first flag the records for insert 'I' or Update 'U'.
Based on the flag increment the sequence for Inset flagged records and skip for update once.

Ex:-
SELECT CASE
         WHEN flag = 'I' THEN 

(SELECT Nvl(Max(prod_dw_id), 0) FROM   product_dim)
 + Row_number() OVER (partition BY 'A' ORDER BY 'A')
         ELSE '-1'
       END AS sequance, 

rest OF the required columns
FROM   (SELECT CASE
                 WHEN prod_dw_id IS NOT NULL THEN 'U'
                 ELSE 'I'
               END AS flag,
        rest OF the required columns
        FROM   stg_prod
               LEFT OUTER JOIN roduct_dim
                            ON prod_id = dim.prod_id) a  

To handle the same in Informatica 

Get the MAX(PROD_DW_ID) from the (product_dim)table using lookup and use sequence generator or use expression transformation to increment the sequence from the MAX(PROD_DW_ID)

Netezza Migration Part-1 (Charset for Netezza)

When one want to do migration from DB(ex:-Oracle) to DB(Ex:-Netezza)  the first thing that comes to mind is History migration.

Where to start what all the thing one needs to look into.
1) Data Types
Data types won't supported by Netezza
Ex:- Large Objects (LOBs), Character Large Object ( CLOB ), NCLOB etc.,
For more information on the data types supported by netezza please visit http://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a003181109.htm

Once the model is developed and physical Data model is ready. Then we start with Data migration.
Come up with the list of the tables need to migrate, in general the below tables will be consider for migration
Facts , Dimension , Relation, Static/Lookup/Lov tables, Aggregate  tables (if any).

Ways To migrate data:-
Informatica, Unix Script, and 3 Party Tools.

Lets talk about informatica and unix way(easy and Automatic ways).
1) visio informatica mapping template(need manual effort to parameterized  the connection string) Informatica power center 9.1 is not supporting Mapping Visio (for netezza).
2) create a Unix script(generic) to connect to Oracle and get the data and place the file in Unix (Ex:- Source dir), from there use Netezza load utilities to get the data from the file to Netezza table.

Things to keep in mind when doing migration using Informatica.
Code Page:- it plays a key role.So lets see which code page needs to be selected for read and write.

one needs to understand the database Character set of source and target.

For example oracle DB CHARACTERSET is UTF-8, Then use UTF-8 while creating connection in informatica to read data from oracle. 

Query to find out CHARACTERSET for oracle DB

SELECT value 
FROM   nls_database_parameters 
WHERE  parameter = 'NLS_CHARACTERSET'; 

Output:- AL32UTF8((this will vary from DB to DB)


Query to find out CHARACTERSET for Sql Server DB


SELECT Serverproperty('collation') AS CHARACTERSET; 

Output:- SQL_Latin1_General_CP1_CI_AS (this will vary from DB to DB)

Lets breaks up the above output into interesting parts:
  1. latin1 makes the server treat strings using charset latin 1, basically ascii
  2. CI case insensitive comparisons so 'ABC' would equal 'abc'
  3. AS accent sensitive, so 'ΓΌ' does not equal 'u'
In the above case i will use MS Windows Latin 1 code page in informatica  

Query to find out CHARACTERSET for Netezza DB

SELECT DB_CHARSET,DATABASE FROM _v_database WHERE DATABASE='YOUR_DB_NAME';

Output:- LATIN 9 (this will vary from DB to DB)
Based on the above out put, create ISO 8859-1 Western European code page in informatica