IBM InfoSphere QualityStage in DataStage tutorial - Pharma Jobs

Sunday, February 16, 2014

IBM InfoSphere QualityStage in DataStage tutorial

1. Why investigate:

→ Discover trends and potential anomalies in the data
→ Identify invalid and default values in the data
→ Verify the reliability of the data in the fields to be used as matching criteria
→ Gain a complete understanding of the data in context
Investigate:
Verify the domain:
Review each field and verify that the data matches the metadata
Identify the data formats and the missing and default values
Identify the data anomalies:
Format
Structure
Content

Features of Investigate:
Analyze free-form and single-domain columns
Provide a frequency distribution of distinct values and patterns
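
To make the idea of a frequency distribution concrete, here is a minimal Python sketch. It is not QualityStage code; the function name and the sample values are assumptions made for illustration only.

```python
from collections import Counter

def frequency_distribution(values):
    """Return (value, count, percent) rows, most frequent value first."""
    counts = Counter(values)
    total = len(values)
    return [
        (value, count, round(100.0 * count / total, 2))
        for value, count in counts.most_common()
    ]

# Hypothetical column values, including a blank and a lower-case variant
sample_states = ["NY", "CA", "NY", "", "ny", "NY"]
for row in frequency_distribution(sample_states):
    print(row)
```

Rare values at the bottom of such a list (the blank and "ny" here) are usually the anomalies worth a closer look.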

Investigate methods:
Character Discrete
Character Concatenate
Word Investigate


INVESTIGATE STAGE:
1. Character Discrete Investigate, C mask:
Job:
clip_image002

Add the columns you want to investigate:
clip_image004

Click Change Mask and select the C mask for all the fields.

The output has 5 columns:
1. qsInvColumnName
2. qsInvPattern
3. qsInvSample
4. qsInvCount
5. qsInvPercent
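
The shape of this report can be imitated in a few lines of Python. This is only a sketch under assumptions: with a C mask on every position the pattern is taken to be the actual value, and the rounding of the percentage is my own choice, not something the stage is known to do.

```python
from collections import Counter

def c_mask_report(column_name, values):
    """Imitate a Character Discrete report where every position uses the C mask."""
    counts = Counter(values)
    total = len(values)
    return [
        {
            "qsInvColumnName": column_name,
            "qsInvPattern": value,   # C mask keeps the characters as they are
            "qsInvSample": value,    # one example value for the pattern
            "qsInvCount": count,
            "qsInvPercent": round(100.0 * count / total, 2),  # assumed rounding
        }
        for value, count in counts.most_common()
    ]

# Hypothetical column and values
for row in c_mask_report("State", ["NY", "CA", "NY", "NY"]):
    print(row)
```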

The output looks like this:
clip_image006

The screenshot above shows only one field; the output has the same layout for every other field you selected.
clip_image008
clip_image010




2. Character Discrete Investigate, T mask:
Job:
clip_image012

Add the columns you want to investigate:
clip_image004

Click Change Mask and select the T mask for all the fields.

The output has 5 columns:
1. qsInvColumnName
2. qsInvPattern
3. qsInvSample
4. qsInvCount
5. qsInvPercent

The output looks like this:
clip_image014
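
As a rough illustration of what a type (T) mask does, the Python sketch below replaces letters with "a" and digits with "n" and then counts the resulting patterns. It is a simplification under assumptions: all other characters, including spaces, are left unchanged here, which may differ from what the stage shows, and the column name and values are made up.

```python
from collections import Counter

def t_mask(value):
    """Replace each letter with 'a' and each digit with 'n'; keep the rest."""
    out = []
    for ch in value:
        if ch.isalpha():
            out.append("a")
        elif ch.isdigit():
            out.append("n")
        else:
            out.append(ch)  # assumption: other characters are kept as they are
    return "".join(out)

def t_mask_report(column_name, values):
    """Return (column, pattern, sample, count, percent) rows, most frequent first."""
    patterns = Counter()
    samples = {}
    for value in values:
        pattern = t_mask(value)
        patterns[pattern] += 1
        samples.setdefault(pattern, value)  # first value seen for the pattern
    total = len(values)
    return [
        (column_name, pattern, samples[pattern], count,
         round(100.0 * count / total, 2))
        for pattern, count in patterns.most_common()
    ]

# Hypothetical policy numbers
for row in t_mask_report("PolicyNumber", ["AB-1234", "CD-5678", "12345"]):
    print(row)
```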


3. Character Discrete Investigate, X mask:
Job:
clip_image016

Add the columns you want to investigate:
clip_image018
Here I selected the C, T and X masks alternately for the Policy Number column.
The output looks like this:
clip_image020
clip_image022
clip_image024
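
Mixing masks position by position can be pictured with the following Python sketch. It assumes that C keeps the character, T shows its type (letter as "a", digit as "n") and X skips the position; the function name, the mask string and the sample value are hypothetical.

```python
def apply_mask(value, mask):
    """Apply one mask character (C, T or X) to each position of value."""
    out = []
    # zip stops at the shorter input, so positions beyond the mask are dropped
    for ch, m in zip(value, mask):
        if m == "C":
            out.append(ch)  # keep the actual character
        elif m == "T":
            # show the character type instead of the character
            out.append("a" if ch.isalpha() else "n" if ch.isdigit() else ch)
        elif m == "X":
            continue        # skip this position entirely
        else:
            raise ValueError(f"unknown mask character: {m!r}")
    return "".join(out)

# Hypothetical policy number with the masks C, T, X applied alternately
print(apply_mask("AB1234", "CTXCTX"))
```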

Character Concatenate Investigate, T mask:

Job:
clip_image026

Add the columns you want to concatenate and investigate:
clip_image028

The output looks like this:
clip_image030
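
The difference from the discrete method is that the selected columns are analyzed together as one combined value. A minimal Python sketch of that idea follows; the column names and rows are hypothetical, and details such as field widths or separators in the real stage are not modeled.

```python
from collections import Counter

def type_pattern(value):
    """T-style pattern: letters become 'a', digits become 'n', the rest is kept."""
    return "".join(
        "a" if ch.isalpha() else "n" if ch.isdigit() else ch for ch in value
    )

def concatenate_report(rows, columns):
    """Count the combined pattern of the chosen columns across all rows."""
    patterns = Counter()
    for row in rows:
        combined = "".join(type_pattern(str(row[col])) for col in columns)
        patterns[combined] += 1
    total = len(rows)
    return [
        (pattern, count, round(100.0 * count / total, 2))
        for pattern, count in patterns.most_common()
    ]

# Hypothetical input rows
rows = [
    {"State": "NY", "Zip": "10001"},
    {"State": "CA", "Zip": "9000A"},
    {"State": "TX", "Zip": "73301"},
]
for line in concatenate_report(rows, ["State", "Zip"]):
    print(line)
```

A combination that occurs only once, such as the Zip value ending in a letter here, stands out immediately in the combined pattern.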

Character Concatenate Investigate, C mask:
Job:
clip_image032
Add the columns you want to concatenate and investigate:
clip_image034

The output looks like this:
clip_image036



Character Concatenate Investigate, X mask:
Job:

clip_image038

Add the columns you want to concatenate and investigate:
clip_image040
The output looks like this:
clip_image042


Word Investigate: adding a rule set to the NAME field
Job:
clip_image044
Select the Name rule set from USNAME.
clip_image046

Select the full name field from the available data columns.

The output looks like this:
clip_image048

Word Investigate: adding a rule set to the NAME field
Token report and pattern report
Job:
clip_image050

Select the Name rule set from USNAME
and add the full name field from the available data columns.
If you want both reports, the token report and the pattern report, tick both the Token report box and the Pattern report box.

clip_image052

The pattern report output looks like this:
The Word Investigate pattern report gives 5 columns at the output:
1. qsInvColumnName
2. qsInvPattern
3. qsInvSample
4. qsInvCount
5. qsInvPercent

Pattern report output:
clip_image054

The token report output looks like this:
The Word Investigate token report gives 3 columns at the output:
1. qsInvCount
2. qsInvWord
3. qsInvClassCode

Token report output:
clip_image056

Classification Code table:
clip_image058
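
The relationship between the two reports can be sketched in Python. Everything here is an assumption for illustration: the tiny classification table is invented and is not the real USNAME table, and the fallback codes ("^" for numbers, "+" for unclassified words, "?" for mixed tokens) are my simplification of the default classes.

```python
from collections import Counter

# Hypothetical classification table; the real rule set has its own, larger one
CLASS_TABLE = {"MR": "P", "DR": "P", "JOHN": "F", "MARY": "F", "JR": "G"}

def classify(token):
    """Return a one-character class code for a token."""
    if token in CLASS_TABLE:
        return CLASS_TABLE[token]
    if token.isdigit():
        return "^"   # assumed code for a numeric token
    if token.isalpha():
        return "+"   # assumed code for an unclassified alphabetic word
    return "?"       # assumed code for anything else

def word_investigate(values):
    """Build a token report and a pattern report from free-form values."""
    token_counts = Counter()
    pattern_counts = Counter()
    for value in values:
        tokens = value.upper().split()
        codes = [classify(token) for token in tokens]
        token_counts.update(zip(tokens, codes))
        pattern_counts["".join(codes)] += 1
    # Token report rows follow the order qsInvCount, qsInvWord, qsInvClassCode
    token_report = [
        (count, word, code)
        for (word, code), count in token_counts.most_common()
    ]
    return token_report, pattern_counts.most_common()

# Hypothetical full names
tokens, patterns = word_investigate(
    ["Mr John Smith", "Mary Jones", "Dr John Doe Jr"]
)
print(tokens)
print(patterns)
```

The token report counts individual words with their class code, while the pattern report counts the sequence of class codes for each whole value.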

2. STANDARDIZE STAGE:
Example job:
clip_image060

Open the Standardize stage and click the Rule Set field. In the Standardize Rules folder, open the OTHER folder, then the COUNTRY folder, and select COUNTRY.
→ Next, in the Literal text field enter ZQUSZQ
→ Add the columns you want from the available data columns
→ Select <literal> and the AddressLine1, city, state and Zip columns in the selected columns list
After entering everything you will get the screen below:
clip_image062
→ Next click OK
You will get the screen below:
clip_image064
Next click OK
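
One way to picture the literal is as a piece of text placed in front of the selected columns before the rule set reads them. The Python sketch below works under that assumption and reads the two letters between the ZQ delimiters as a default country code. Both function names and the sample row are hypothetical, and this is not how the rule set itself is implemented.

```python
import re

def build_standardize_input(literal, row, columns):
    """Join the literal and the selected column values into one string."""
    parts = [literal] + [str(row[col]).strip() for col in columns]
    return " ".join(part for part in parts if part)

def default_country(text):
    """Return the two-letter code between ZQ delimiters, or None if absent."""
    match = re.search(r"ZQ([A-Z]{2})ZQ", text)
    return match.group(1) if match else None

# Hypothetical input row
row = {"AddressLine1": "100 Main St", "City": "Boston",
       "State": "MA", "Zip": "02110"}
text = build_standardize_input(
    "ZQUSZQ", row, ["AddressLine1", "City", "State", "Zip"]
)
print(text)
print(default_country(text))
```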

Output:


At the output it gives an additional column, i.e. ISOCOUNTRYCODE AND










