IBM InfoSphere QualityStage in DataStage tutorial - Pharma Jobs

Sunday, February 16, 2014

1. Why investigate:

→ Discover trends and potential anomalies in the data
→ Identify invalid and default values in the data
→ Verify the reliability of the data in the fields to be used as matching criteria
→ Gain a complete understanding of the data in context
Investigate:
Verify the domain:
Review each field and verify that the data matches the metadata
Identify the data formats, missing values and default values
Identify the data anomalies:
Format
Structure
Content

Features of Investigate:
Analyzes free-form and single-domain columns
Provides a frequency distribution of distinct values and patterns

Investigate methods:
Character discrete
Character concatenate
Word investigate


INVESTIGATE STAGE:
1. Character Discrete Investigate, C mask:
Job:

Add the columns you want to investigate:

Click Change Mask and select the C mask for all the fields.

The output has 5 columns:
1. qsInvColumnName
2. qsInvPattern
3. qsInvSample
4. qsInvCount
5. qsInvPercent

The output looks like below:

The screenshot above shows only one field; the output has the same layout for each of the other fields you selected.
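
Since the screenshots are not available here, the following is a minimal Python sketch of what a C mask report roughly computes, not QualityStage code. It assumes that with the C mask each distinct value is its own pattern, and it reuses the five output column names listed above as dictionary keys. The function name `investigate_c_mask` and the sample data are made up for illustration.

```python
from collections import Counter

def investigate_c_mask(column_name, values):
    """Build a frequency report where every distinct value is its own pattern."""
    counts = Counter(values)
    total = len(values)
    report = []
    # most_common() returns the values ordered by descending count
    for value, count in counts.most_common():
        report.append({
            "qsInvColumnName": column_name,
            "qsInvPattern": value,   # assumption: C mask keeps the characters as-is
            "qsInvSample": value,
            "qsInvCount": count,
            "qsInvPercent": round(100.0 * count / total, 2),
        })
    return report

# Hypothetical sample data for a State column
rows = investigate_c_mask("State", ["NY", "CA", "NY", "TX", "NY"])
for row in rows:
    print(row)
```

A report like this makes default or invalid values easy to spot, because they show up as unexpectedly frequent or unexpectedly rare entries.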




2. Character Discrete Investigate, T mask:
Job:

Add the columns you want to investigate:

Click Change Mask and select the T mask for all the fields.

The output has 5 columns:
1. qsInvColumnName
2. qsInvPattern
3. qsInvSample
4. qsInvCount
5. qsInvPercent

The output looks like below:
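
To give an idea of the T mask without the screenshot, here is a small Python sketch. It assumes the T (type) mask replaces each alphabetic character with `a`, each digit with `n` and each blank with `b`, and leaves special characters unchanged. The helper names `t_mask` and `investigate_t_mask` and the sample values are hypothetical, and this is only an approximation of what the stage does.

```python
def t_mask(value):
    """Replace each character with its type: a = alpha, n = numeric, b = blank."""
    out = []
    for ch in value:
        if ch.isalpha():
            out.append("a")
        elif ch.isdigit():
            out.append("n")
        elif ch == " ":
            out.append("b")
        else:
            out.append(ch)  # special characters are kept as they are
    return "".join(out)

def investigate_t_mask(column_name, values):
    """Group values by their type pattern and report count and percent."""
    counts, samples = {}, {}
    for value in values:
        pattern = t_mask(value)
        counts[pattern] = counts.get(pattern, 0) + 1
        samples.setdefault(pattern, value)  # keep the first value seen as the sample
    total = len(values)
    ordered = sorted(counts.items(), key=lambda item: -item[1])
    return [
        {
            "qsInvColumnName": column_name,
            "qsInvPattern": pattern,
            "qsInvSample": samples[pattern],
            "qsInvCount": count,
            "qsInvPercent": round(100.0 * count / total, 2),
        }
        for pattern, count in ordered
    ]

# Hypothetical policy numbers
report = investigate_t_mask("PolicyNumber", ["AB-1234", "CD-5678", "12345"])
for row in report:
    print(row)
```

Here two differently valued policy numbers fall under one pattern, which is the point of the T mask: it shows the formats in a column rather than the individual values.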


3. Character Discrete Investigate, X mask:

Add the columns you want to investigate:
Here I selected the C, T and X masks alternately for the Policy Number column.
The output looks like below:
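
The mixed mask used for Policy Number can be sketched in Python as follows. This assumes one mask character per character position, where C keeps the character, T replaces it with its type (`a`, `n` or `b`) and X drops the position from the pattern. The function `apply_mask` and the sample value are hypothetical and only illustrate the idea.

```python
def apply_mask(value, mask):
    """Apply a per-position mask made of the letters C, T and X.

    Positions beyond the end of the mask are ignored in this sketch.
    """
    out = []
    for ch, m in zip(value, mask):
        if m == "C":
            out.append(ch)          # keep the character itself
        elif m == "T":
            if ch.isalpha():
                out.append("a")     # alphabetic
            elif ch.isdigit():
                out.append("n")     # numeric
            elif ch == " ":
                out.append("b")     # blank
            else:
                out.append(ch)      # special characters stay as they are
        elif m == "X":
            continue                # skip this position entirely
        else:
            raise ValueError("mask may only contain C, T or X")
    return "".join(out)

# Hypothetical policy number with the masks C, T, X applied alternately
print(apply_mask("AB1234", "CTXCTX"))
print(apply_mask("AB1234", "TTTTTT"))
```

Using X on positions that vary from row to row (for example a running sequence number) keeps them from splitting the report into thousands of one-off patterns.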

Character Concatenate Investigate, T mask:

Job:


Add the columns you want to concatenate and investigate:

The output looks like below:
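
As a rough Python sketch of the concatenate method: the selected columns are joined into one string per row, and that combined string is investigated as a single field. The sketch assumes the T mask (`a` = alpha, `n` = numeric, `b` = blank), joins the columns without a separator, and uses made-up column names and data; the real stage may differ in these details.

```python
from collections import Counter

def type_pattern(value):
    """T-mask style pattern: a = alpha, n = numeric, b = blank, others unchanged."""
    out = []
    for ch in value:
        if ch.isalpha():
            out.append("a")
        elif ch.isdigit():
            out.append("n")
        elif ch == " ":
            out.append("b")
        else:
            out.append(ch)
    return "".join(out)

def concatenate_investigate(rows, columns):
    """Join the chosen columns per row and count the resulting patterns."""
    patterns = Counter()
    for row in rows:
        combined = "".join(row[name] for name in columns)
        patterns[type_pattern(combined)] += 1
    total = len(rows)
    return [
        {
            "qsInvPattern": pattern,
            "qsInvCount": count,
            "qsInvPercent": round(100.0 * count / total, 2),
        }
        for pattern, count in patterns.most_common()
    ]

# Hypothetical input rows
data = [
    {"State": "NY", "Zip": "10001"},
    {"State": "CA", "Zip": "9000A"},
    {"State": "TX", "Zip": "75001"},
]
result = concatenate_investigate(data, ["State", "Zip"])
for row in result:
    print(row)
```

Investigating the columns together shows relationships across fields, such as a state code that is always followed by a five-digit zip except in a few suspect rows.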

Character Concatenate Investigate, C mask:
Job:
Add the columns you want to concatenate and investigate:

The output looks like below:



Character Concatenate Investigate, X mask:
Job:


Add the columns you want to concatenate and investigate:
The output looks like below:


Word Investigate: adding a rule set to the NAME field
Job:
Select the Name rule set from USNAME.

Select the full name field from Available Data Columns.


The output looks like below:

Word Investigate: adding a rule set to the NAME field, with token report and pattern report
Job:

Select the Name rule set from USNAME
and add the full name field from Available Data Columns.
If you want both reports, tick both the Token Report box and the Pattern Report box.


The pattern report output looks like below:
The Word Investigate pattern report has 5 columns at output:
1. qsInvColumnName
2. qsInvPattern
3. qsInvSample
4. qsInvCount
5. qsInvPercent

Pattern report output:

The token report output looks like below:
The Word Investigate token report has 3 columns at output:
1. qsInvCount
2. qsInvWord
3. qsInvClassCode

Token report output:
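
The two reports can be approximated in Python as below. This is only a sketch: the tiny classification table and its class codes (P, F, G) are invented for the example and are not the real USNAME rule set, and the sketch assumes `^` for a numeric token and `+` for an unclassified alphabetic word. The output keys reuse the report column names listed above.

```python
from collections import Counter

# Hypothetical classification table: token -> class code
CLASSIFICATIONS = {"MR": "P", "MRS": "P", "JOHN": "F", "MARY": "F", "JR": "G"}

def classify(token):
    """Return the class code of a token (assumed defaults: ^ numeric, + unknown word)."""
    if token in CLASSIFICATIONS:
        return CLASSIFICATIONS[token]
    if token.isdigit():
        return "^"
    return "+"

def word_investigate(column_name, values):
    """Return (token_report, pattern_report) for a free-form column."""
    token_counts, pattern_counts, samples = Counter(), Counter(), {}
    for value in values:
        tokens = value.upper().replace(".", " ").split()
        codes = [classify(token) for token in tokens]
        for token, code in zip(tokens, codes):
            token_counts[(token, code)] += 1
        pattern = "".join(codes)
        pattern_counts[pattern] += 1
        samples.setdefault(pattern, value)  # first value seen for this pattern
    total = len(values)
    token_report = [
        {"qsInvCount": n, "qsInvWord": word, "qsInvClassCode": code}
        for (word, code), n in token_counts.most_common()
    ]
    pattern_report = [
        {
            "qsInvColumnName": column_name,
            "qsInvPattern": pattern,
            "qsInvSample": samples[pattern],
            "qsInvCount": n,
            "qsInvPercent": round(100.0 * n / total, 2),
        }
        for pattern, n in pattern_counts.most_common()
    ]
    return token_report, pattern_report

# Hypothetical full names
token_report, pattern_report = word_investigate(
    "FullName", ["Mr. John Smith", "Mary Jones", "John Doe Jr"]
)
print(token_report)
print(pattern_report)
```

The token report counts individual words with their class codes, while the pattern report counts whole values by the sequence of class codes they produce.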

Classification Code table:

2. STANDARDIZE STAGE:
Example job:

Open the Standardize stage, click the Rule Set text field, open the Standardize Rules folder, then the OTHER folder, then the COUNTRY folder, and select COUNTRY.
→ Next, enter ZQUSZQ in the Literal text field.
→ Add the columns you want from Available Data Columns.
→ Select <literal> and the AddressLine1, AddressLine2, city, state and Zip columns in the Selected Columns list.
You will get the screen below after entering everything.
→ Next, click OK.
You will get the screen below.
Next, click OK.

Output:


At output it gives additional columns, i.e. ISOCOUNTRYCODE and
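
The role of the ZQUSZQ literal can be sketched in Python like this. The sketch assumes the literal has the form ZQ + two-letter country code + ZQ and supplies the default country (here US) when no country can be recognised in the address columns. The lookup table, the function names and the `identified` key are hypothetical; only ISOCOUNTRYCODE comes from the output described above.

```python
import re

# Hypothetical lookup of country names that may appear in the address text
KNOWN_COUNTRIES = {"USA": "US", "UNITED STATES": "US", "CANADA": "CA", "INDIA": "IN"}

def default_country(literal):
    """Extract the default country code from a literal such as ZQUSZQ."""
    match = re.fullmatch(r"ZQ([A-Z]{2})ZQ", literal)
    if match is None:
        raise ValueError("literal must look like ZQ<two-letter code>ZQ")
    return match.group(1)

def standardize_country(literal, *fields):
    """Return the country code for one record, falling back to the literal's default."""
    text = " ".join(fields).upper()
    for name, code in KNOWN_COUNTRIES.items():
        # whole-word match, so INDIA does not match inside INDIANA
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            return {"ISOCOUNTRYCODE": code, "identified": True}
    return {"ISOCOUNTRYCODE": default_country(literal), "identified": False}

# Hypothetical records: AddressLine1, city, state, Zip (and country where present)
print(standardize_country("ZQUSZQ", "10 Main St", "Albany", "NY", "12207"))
print(standardize_country("ZQUSZQ", "10 Main St", "Toronto", "ON", "Canada"))
```

In this sketch the first record gets the default US code because no country name is found, while the second is recognised as CA from its own data.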










