Wednesday, October 8, 2014

Records Are Rejected While Loading into Netezza Using Informatica

Loading special characters into Netezza? (How far is this statement correct?)

In many communities I have seen developers face issues with a few characters while loading data into Netezza using Informatica.

Most developers assume these are junk or special characters, but in most scenarios that is not the case. So what are they? They are extended ASCII characters; the chart is below.
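
A quick way to confirm this on a rejected record is to look at its raw byte values. The sketch below is only an illustration in Python; the function name `extended_bytes` and the sample record are hypothetical and not part of Informatica or Netezza.

```python
def extended_bytes(raw: bytes):
    """Return (offset, value) for every byte in the 128-255 range."""
    return [(i, b) for i, b in enumerate(raw) if b >= 128]


# Hypothetical sample record: "Café ½" stored as ISO 8859-1 bytes.
record = b"Caf\xe9 \xbd"
print(extended_bytes(record))  # [(3, 233), (5, 189)]
```

Any offsets reported here point at extended ASCII characters, which can then be looked up in the chart.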

To load these characters successfully into Netezza (or any database), please refer to the link below.
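
Before loading, it can help to check whether every value fits the target character set. This is a minimal sketch, assuming the target database uses Latin-9 (Python codec `iso8859_15`); the helper name `unencodable_chars` is made up for illustration.

```python
def unencodable_chars(value: str, codec: str = "iso8859_15"):
    """Return the characters of value that the given codec cannot represent."""
    bad = []
    for ch in value:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


# Assumption: the target charset is Latin-9; pass another codec name if not.
print(unencodable_chars("caf\u00e9"))   # é fits in Latin-9
print(unencodable_chars("y\u1ebfn"))    # U+1EBF is outside every 8-bit Latin code page
```

A value that returns a non-empty list cannot be stored in that character set as-is, so it is likely to be rejected unless the target column or code page is changed.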

The extended ASCII codes (character codes 128-255)

There are several different variations of the 8-bit ASCII table. The table below follows ISO 8859-1, also called ISO Latin-1. Codes 128-159 contain the Microsoft® Windows Latin-1 (Windows-1252) extended characters.
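
The difference between the two variations shows up when the same byte is decoded both ways. The sketch below uses Python's built-in `latin-1` and `cp1252` codecs; the function name `decode_both` is hypothetical.

```python
def decode_both(code: int):
    """Decode one byte as ISO 8859-1 and as Windows-1252."""
    raw = bytes([code])
    latin1 = raw.decode("latin-1")      # every byte value is valid in ISO 8859-1
    try:
        cp1252 = raw.decode("cp1252")
    except UnicodeDecodeError:
        cp1252 = None                   # code is not assigned in Windows-1252
    return latin1, cp1252


for code in (0x80, 0x81, 0xE9):
    print(code, [repr(c) for c in decode_both(code)])
```

Codes 160-255 decode to the same character in both; the two only disagree in the 128-159 range.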

DEC  OCT  HEX  BIN       Symbol  Description
128  200  80   10000000  €       Euro sign
129  201  81   10000001          (not used)
130  202  82   10000010  ‚       Single low-9 quotation mark
131  203  83   10000011  ƒ       Latin small letter f with hook
132  204  84   10000100  „       Double low-9 quotation mark
133  205  85   10000101  …       Horizontal ellipsis
134  206  86   10000110  †       Dagger
135  207  87   10000111  ‡       Double dagger
136  210  88   10001000  ˆ       Modifier letter circumflex accent
137  211  89   10001001  ‰       Per mille sign
138  212  8A   10001010  Š       Latin capital letter S with caron
139  213  8B   10001011  ‹       Single left-pointing angle quotation mark
140  214  8C   10001100  Œ       Latin capital ligature OE
141  215  8D   10001101          (not used)
142  216  8E   10001110  Ž       Latin capital letter Z with caron
143  217  8F   10001111          (not used)
144  220  90   10010000          (not used)
145  221  91   10010001  ‘       Left single quotation mark
146  222  92   10010010  ’       Right single quotation mark
147  223  93   10010011  “       Left double quotation mark
148  224  94   10010100  ”       Right double quotation mark
149  225  95   10010101  •       Bullet
150  226  96   10010110  –       En dash
151  227  97   10010111  —       Em dash
152  230  98   10011000  ˜       Small tilde
153  231  99   10011001  ™       Trade mark sign
154  232  9A   10011010  š       Latin small letter s with caron
155  233  9B   10011011  ›       Single right-pointing angle quotation mark
156  234  9C   10011100  œ       Latin small ligature oe
157  235  9D   10011101          (not used)
158  236  9E   10011110  ž       Latin small letter z with caron
159  237  9F   10011111  Ÿ       Latin capital letter Y with diaeresis
160  240  A0   10100000          Non-breaking space
161  241  A1   10100001  ¡       Inverted exclamation mark
162  242  A2   10100010  ¢       Cent sign
163  243  A3   10100011  £       Pound sign
164  244  A4   10100100  ¤       Currency sign
165  245  A5   10100101  ¥       Yen sign
166  246  A6   10100110  ¦       Pipe, broken vertical bar
167  247  A7   10100111  §       Section sign
168  250  A8   10101000  ¨       Spacing diaeresis - umlaut
169  251  A9   10101001  ©       Copyright sign
170  252  AA   10101010  ª       Feminine ordinal indicator
171  253  AB   10101011  «       Left double angle quotes
172  254  AC   10101100  ¬       Not sign
173  255  AD   10101101          Soft hyphen
174  256  AE   10101110  ®       Registered trade mark sign
175  257  AF   10101111  ¯       Spacing macron - overline
176  260  B0   10110000  °       Degree sign
177  261  B1   10110001  ±       Plus-or-minus sign
178  262  B2   10110010  ²       Superscript two - squared
179  263  B3   10110011  ³       Superscript three - cubed
180  264  B4   10110100  ´       Acute accent - spacing acute
181  265  B5   10110101  µ       Micro sign
182  266  B6   10110110  ¶       Pilcrow sign - paragraph sign
183  267  B7   10110111  ·       Middle dot - Georgian comma
184  270  B8   10111000  ¸       Spacing cedilla
185  271  B9   10111001  ¹       Superscript one
186  272  BA   10111010  º       Masculine ordinal indicator
187  273  BB   10111011  »       Right double angle quotes
188  274  BC   10111100  ¼       Fraction one quarter
189  275  BD   10111101  ½       Fraction one half
190  276  BE   10111110  ¾       Fraction three quarters
191  277  BF   10111111  ¿       Inverted question mark
192  300  C0   11000000  À       Latin capital letter A with grave
193  301  C1   11000001  Á       Latin capital letter A with acute
194  302  C2   11000010  Â       Latin capital letter A with circumflex
195  303  C3   11000011  Ã       Latin capital letter A with tilde
196  304  C4   11000100  Ä       Latin capital letter A with diaeresis
197  305  C5   11000101  Å       Latin capital letter A with ring above
198  306  C6   11000110  Æ       Latin capital letter AE
199  307  C7   11000111  Ç       Latin capital letter C with cedilla
200  310  C8   11001000  È       Latin capital letter E with grave
201  311  C9   11001001  É       Latin capital letter E with acute
202  312  CA   11001010  Ê       Latin capital letter E with circumflex
203  313  CB   11001011  Ë       Latin capital letter E with diaeresis
204  314  CC   11001100  Ì       Latin capital letter I with grave
205  315  CD   11001101  Í       Latin capital letter I with acute
206  316  CE   11001110  Î       Latin capital letter I with circumflex
207  317  CF   11001111  Ï       Latin capital letter I with diaeresis
208  320  D0   11010000  Ð       Latin capital letter ETH
209  321  D1   11010001  Ñ       Latin capital letter N with tilde
210  322  D2   11010010  Ò       Latin capital letter O with grave
211  323  D3   11010011  Ó       Latin capital letter O with acute
212  324  D4   11010100  Ô       Latin capital letter O with circumflex
213  325  D5   11010101  Õ       Latin capital letter O with tilde
214  326  D6   11010110  Ö       Latin capital letter O with diaeresis
215  327  D7   11010111  ×       Multiplication sign
216  330  D8   11011000  Ø       Latin capital letter O with slash
217  331  D9   11011001  Ù       Latin capital letter U with grave
218  332  DA   11011010  Ú       Latin capital letter U with acute
219  333  DB   11011011  Û       Latin capital letter U with circumflex
220  334  DC   11011100  Ü       Latin capital letter U with diaeresis
221  335  DD   11011101  Ý       Latin capital letter Y with acute
222  336  DE   11011110  Þ       Latin capital letter THORN
223  337  DF   11011111  ß       Latin small letter sharp s - ess-zed
224  340  E0   11100000  à       Latin small letter a with grave
225  341  E1   11100001  á       Latin small letter a with acute
226  342  E2   11100010  â       Latin small letter a with circumflex
227  343  E3   11100011  ã       Latin small letter a with tilde
228  344  E4   11100100  ä       Latin small letter a with diaeresis
229  345  E5   11100101  å       Latin small letter a with ring above
230  346  E6   11100110  æ       Latin small letter ae
231  347  E7   11100111  ç       Latin small letter c with cedilla
232  350  E8   11101000  è       Latin small letter e with grave
233  351  E9   11101001  é       Latin small letter e with acute
234  352  EA   11101010  ê       Latin small letter e with circumflex
235  353  EB   11101011  ë       Latin small letter e with diaeresis
236  354  EC   11101100  ì       Latin small letter i with grave
237  355  ED   11101101  í       Latin small letter i with acute
238  356  EE   11101110  î       Latin small letter i with circumflex
239  357  EF   11101111  ï       Latin small letter i with diaeresis
240  360  F0   11110000  ð       Latin small letter eth
241  361  F1   11110001  ñ       Latin small letter n with tilde
242  362  F2   11110010  ò       Latin small letter o with grave
243  363  F3   11110011  ó       Latin small letter o with acute
244  364  F4   11110100  ô       Latin small letter o with circumflex
245  365  F5   11110101  õ       Latin small letter o with tilde
246  366  F6   11110110  ö       Latin small letter o with diaeresis
247  367  F7   11110111  ÷       Division sign
248  370  F8   11111000  ø       Latin small letter o with slash
249  371  F9   11111001  ù       Latin small letter u with grave
250  372  FA   11111010  ú       Latin small letter u with acute
251  373  FB   11111011  û       Latin small letter u with circumflex
252  374  FC   11111100  ü       Latin small letter u with diaeresis
253  375  FD   11111101  ý       Latin small letter y with acute
254  376  FE   11111110  þ       Latin small letter thorn
255  377  FF   11111111  ÿ       Latin small letter y with diaeresis
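
The DEC, OCT, HEX and BIN columns are four notations for the same code, so any row can be recomputed when checking a suspect byte. A small sketch follows; the function name `table_row` is hypothetical.

```python
def table_row(code: int) -> str:
    """Format one character code as decimal, octal, hexadecimal and binary."""
    return f"{code:d}  {code:o}  {code:02X}  {code:08b}"


print(table_row(128))  # 128  200  80  10000000
print(table_row(233))  # 233  351  E9  11101001
```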
Check this out 

If you face any problem implementing this, feel free to ask in the comments below. Please also leave your views and share this blog with your friends. Cheers!

3 comments :

Unknown said...

Hi Vamshi,
We are migrating from SQL Server to Netezza. While loading a couple of tables we are facing a canonical error. On further analysis we found that the data set contains Latin letters, e.g. yến Nguyễn.
In NZ _v_database, the db_charset is set to LATIN9.
In SQL Server, the db_charset is SQL_Latin1_General_CP1_CI_AS.
In Informatica, the session code page is set to ISO 8859-1 Western European, but I am still getting these records rejected. Can you please suggest a solution?

Thanks,
Jahnavi

Unknown said...

Hi Vamshi,

We are migrating from SQL Server to Netezza. While loading a couple of tables we are getting a canonical error. The data looks like, e.g.: yến Nguyễn.
In NZ _v_database, the db_charset is set to LATIN9.
In SQL Server, the db_charset is SQL_Latin1_General_CP1_CI_AS.
In Informatica, the code page is ISO 8859-1 Western European.
We are still facing the rejection. Can you please suggest a solution?

krish said...

Hi, can you check the Netezza ODBC connection you are using? There are two things to check. One is at the connection level, where you are using the ISO 8859-1 Western European code page.

The other is at the server level: in the ODBC file, change the code page to either LATIN or UTF8.