Saturday, September 20, 2014

Netezza Migration Part-1 (Charset for Netezza)

When one want to do migration from DB(ex:-Oracle) to DB(Ex:-Netezza)  the first thing that comes to mind is History migration.

Where to start what all the thing one needs to look into.
1) Data Types
Data types won't supported by Netezza
Ex:- Large Objects (LOBs), Character Large Object ( CLOB ), NCLOB etc.,
For more information on the data types supported by netezza please visit http://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a003181109.htm

Once the model is developed and physical Data model is ready. Then we start with Data migration.
Come up with the list of the tables need to migrate, in general the below tables will be consider for migration
Facts , Dimension , Relation, Static/Lookup/Lov tables, Aggregate  tables (if any).

Ways To migrate data:-
Informatica, Unix Script, and 3 Party Tools.

Lets talk about informatica and unix way(easy and Automatic ways).
1) visio informatica mapping template(need manual effort to parameterized  the connection string) Informatica power center 9.1 is not supporting Mapping Visio (for netezza).
2) create a Unix script(generic) to connect to Oracle and get the data and place the file in Unix (Ex:- Source dir), from there use Netezza load utilities to get the data from the file to Netezza table.

Things to keep in mind when doing migration using Informatica.
Code Page:- it plays a key role.So lets see which code page needs to be selected for read and write.

one needs to understand the database Character set of source and target.

For example oracle DB CHARACTERSET is UTF-8, Then use UTF-8 while creating connection in informatica to read data from oracle. 

Query to find out CHARACTERSET for oracle DB

SELECT value 
FROM   nls_database_parameters 
WHERE  parameter = 'NLS_CHARACTERSET'; 

Output:- AL32UTF8((this will vary from DB to DB)


Query to find out CHARACTERSET for Sql Server DB


SELECT Serverproperty('collation') AS CHARACTERSET; 

Output:- SQL_Latin1_General_CP1_CI_AS (this will vary from DB to DB)

Lets breaks up the above output into interesting parts:
  1. latin1 makes the server treat strings using charset latin 1, basically ascii
  2. CI case insensitive comparisons so 'ABC' would equal 'abc'
  3. AS accent sensitive, so 'ü' does not equal 'u'
In the above case i will use MS Windows Latin 1 code page in informatica  

Query to find out CHARACTERSET for Netezza DB

SELECT DB_CHARSET,DATABASE FROM _v_database WHERE DATABASE='YOUR_DB_NAME';

Output:- LATIN 9 (this will vary from DB to DB)
Based on the above out put, create ISO 8859-1 Western European code page in informatica 




1 comment :

Melisa said...

I really enjoyed while reading your article, the information you have mentioned in this post was damn good. Keep sharing your blog with updated and useful information.
Regards,
Informatica training in chennai|Informatica training in chennai|Best Informatica Training In Chennai