Quality Stage Full Notes & Quality Stage Scenarios - Pharma Jobs

Saturday, December 28, 2013

QualityStage:

Investigate Stage:

Investigate: 3 methods:
1. Character Discrete -> C, T, X masks
2. Character Concatenate -> C, T, X masks
3. Word Investigate -> uses a rule set; produces Token and Pattern reports
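
To make the three masks concrete, here is a minimal Python sketch of how a per-character mask could be applied to one value. It is only an illustration, not the stage's implementation: apply_mask is a hypothetical helper, and the choice to pass non-alphanumeric characters through unchanged under the T mask is an assumption.

```python
def apply_mask(value: str, mask: str) -> str:
    """Apply a per-character mask of C, T and X to one column value."""
    out = []
    for ch, m in zip(value, mask):
        if m == "C":
            # C: keep the character itself
            out.append(ch)
        elif m == "T":
            # T: keep only the type of the character
            if ch.isalpha():
                out.append("a")
            elif ch.isdigit():
                out.append("n")
            else:
                out.append(ch)  # assumption: other characters pass through
        elif m == "X":
            # X: skip this position entirely
            continue
        else:
            raise ValueError(f"unknown mask character: {m!r}")
    return "".join(out)


print(apply_mask("AB-1234", "CCCTTTT"))  # AB-nnnn
print(apply_mask("AB-1234", "TTTTTTT"))  # aa-nnnn
print(apply_mask("AB-1234", "XXXCCCC"))  # 1234
```

C is useful when the actual values matter, T when only the shape of the data matters, and X when part of a value should be ignored.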

Investigate default column names for Pattern Report:

1. QsInvColumnName
2. QsInvPattern
3. QsInvSample
4. QsInvCount
5. QsInvPercentage
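
The five columns above can be pictured as one row per distinct pattern. The sketch below builds such rows in Python under stated assumptions: type_pattern and pattern_report are hypothetical helpers, the dictionary keys are plain stand-ins for the report columns, and the pattern is a simple type pattern (a for letters, n for digits).

```python
from collections import Counter


def type_pattern(value: str) -> str:
    """Reduce a value to its character types: a = letter, n = digit."""
    return "".join(
        "a" if c.isalpha() else "n" if c.isdigit() else c for c in value
    )


def pattern_report(column_name, values):
    """Return one row per pattern: column, pattern, sample, count, percent."""
    counts = Counter()
    samples = {}
    for v in values:
        p = type_pattern(v)
        counts[p] += 1
        samples.setdefault(p, v)  # keep the first value seen as the sample
    total = len(values)
    return [
        {
            "column": column_name,
            "pattern": p,
            "sample": samples[p],
            "count": n,
            "percent": round(100.0 * n / total, 2),
        }
        for p, n in counts.most_common()
    ]


rows = pattern_report("Zip", ["12345", "9021", "1234a", "60614"])
for row in rows:
    print(row)
```

Rare patterns at the bottom of such a report are usually the first place to look for invalid values.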

Investigate default column names for Column Report:

1. QsInvCount
2. QsInvWord
3. QsInvClassCode
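
A count / word / class code report can be sketched as follows. Everything here is illustrative: CLASSIFICATIONS is a hypothetical stand-in for a rule set's classification table, and the fallback codes (^ for numeric, ? for unclassified alphabetic, @ for mixed) are assumptions modeled loosely on rule set class conventions.

```python
from collections import Counter

# Hypothetical classification table: token -> class code
CLASSIFICATIONS = {"ST": "T", "AVE": "T", "N": "D", "S": "D"}


def class_code(token: str) -> str:
    """Return a class code for one token (fallback codes are assumptions)."""
    if token in CLASSIFICATIONS:
        return CLASSIFICATIONS[token]
    if token.isdigit():
        return "^"   # numeric token
    if token.isalpha():
        return "?"   # alphabetic token not in the table
    return "@"       # mixed or other


def token_report(values):
    """Return rows of count, word and class code, most frequent first."""
    counts = Counter(tok for v in values for tok in v.upper().split())
    return [
        {"count": n, "word": w, "class_code": class_code(w)}
        for w, n in counts.most_common()
    ]


for row in token_report(["123 N Main St", "45 Oak Ave", "9 Main St"]):
    print(row)
```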

Lab:
Character Discrete, C mask (select one or many columns)
Character Concatenate, C mask (select two or more columns to concatenate)

Word Investigate: FullName
Token Report
Pattern Report
Word Investigate: Address (pass address line 1, address line 2)
Token Report
Pattern Report
Word Investigate: Area (City, State, Zip)
Token Report
Pattern Report
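
The difference between the two character methods can be sketched in Python. This is a simplified illustration with a C mask on every column (values used as-is); discrete and concatenate are hypothetical helper names, and the sample rows are made up.

```python
from collections import Counter


def discrete(rows, columns):
    """Character Discrete: one frequency distribution per column."""
    return {col: Counter(r[col] for r in rows) for col in columns}


def concatenate(rows, columns):
    """Character Concatenate: one distribution over the joined columns."""
    return Counter("".join(r[col] for col in columns) for r in rows)


rows = [
    {"State": "IL", "Zip": "60614"},
    {"State": "IL", "Zip": "60614"},
    {"State": "CA", "Zip": "60614"},
]

print(discrete(rows, ["State", "Zip"]))
print(concatenate(rows, ["State", "Zip"]))
```

The discrete result shows nothing unusual about Zip, while the concatenated result exposes the CA + 60614 combination, which is why concatenate is used to check relationships between columns.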

1. Why investigate:

-> Discover trends and potential anomalies in the data
-> Identify invalid and default values in the data
-> Verify the reliability of the data in the fields to be used as matching criteria
-> Gain a complete understanding of the data in context
Investigate:
Verify the domain:
-> Review each field and verify that the data matches the metadata
-> Identify the data formats, missing values, and default values
Identify the data anomalies:
-> Format
-> Structure
-> Content
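
Verifying a domain can be sketched as a small profiling pass over one field. This is a minimal sketch, not part of QualityStage: profile_field is a hypothetical helper, DEFAULT_VALUES is a made-up list of placeholder values, and the expected format is supplied as a regular expression.

```python
import re

# Hypothetical placeholder values that often stand in for "no data"
DEFAULT_VALUES = {"N/A", "UNKNOWN", "99999"}


def profile_field(values, expected_format):
    """Count missing, default, badly formatted and acceptable values."""
    pattern = re.compile(expected_format)
    result = {"missing": 0, "default": 0, "bad_format": 0, "ok": 0}
    for v in values:
        text = (v or "").strip()
        if not text:
            result["missing"] += 1
        elif text.upper() in DEFAULT_VALUES:
            result["default"] += 1
        elif not pattern.fullmatch(text):
            result["bad_format"] += 1
        else:
            result["ok"] += 1
    return result


print(profile_field(["60614", "", None, "99999", "6061A", "n/a"], r"\d{5}"))
```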

Features of Investigate:
Analyze free form and single domain columns
Provide frequency distribution  of distinct values and patterns

Investigate methods:
Character Discrete
Character Concatenate
Word Investigate


2. Standardize stage:

1. Country identifier:
-> Select the COUNTRY rule set (listed under Others)
-> Pass the literal ZQUSZQ and add the columns addressline1, addressline2, City, State, Zip
-> Filter on the flag: records with flag ‘Y’ are US records
-> Split US and non-US records into separate targets
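
The split step can be sketched in Python as below. The field name us_flag is hypothetical; use whatever flag column the country identifier output exposes in the actual job.

```python
def split_by_country_flag(records, flag_field="us_flag"):
    """Split records into (US, non-US) based on a 'Y' flag."""
    us, non_us = [], []
    for rec in records:
        # Records flagged 'Y' are treated as US; everything else is non-US
        (us if rec.get(flag_field) == "Y" else non_us).append(rec)
    return us, non_us


records = [
    {"City": "Chicago", "us_flag": "Y"},
    {"City": "Toronto", "us_flag": "N"},
    {"City": "Boston", "us_flag": "Y"},
]
us, non_us = split_by_country_flag(records)
print(len(us), len(non_us))  # 2 1
```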

2. Apply the USPREP rule set to separate the name, address, and area components into their own domains

-> Select the USPREP rule set from the standardize rules
-> Pass ZQNAMEZQ and add the column “Fullname”
-> Pass ZQADDRZQ and add the column “addressline1”
-> Pass ZQADDRZQ and add the column “addressline2”
-> Pass ZQAREAZQ and add the column “City”
-> Pass ZQAREAZQ and add the column “State”
-> Pass ZQAREAZQ and add the column “Zip”
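
Conceptually, each literal tags the column that follows it with a domain hint. The Python sketch below shows one way to picture the resulting input string; it is only an illustration of the layout listed above, and build_prep_input is a hypothetical helper, not something the stage exposes.

```python
# Literal / column pairs, in the order listed in the steps above
USPREP_LAYOUT = [
    ("ZQNAMEZQ", "Fullname"),
    ("ZQADDRZQ", "addressline1"),
    ("ZQADDRZQ", "addressline2"),
    ("ZQAREAZQ", "City"),
    ("ZQAREAZQ", "State"),
    ("ZQAREAZQ", "Zip"),
]


def build_prep_input(record):
    """Join each literal with its column value into one delimited string."""
    parts = []
    for literal, column in USPREP_LAYOUT:
        parts.append(literal)
        parts.append(str(record.get(column, "")).strip())
    # Drop empty column values but keep every literal
    return " ".join(p for p in parts if p)


record = {
    "Fullname": "John Smith",
    "addressline1": "123 Main St",
    "addressline2": "",
    "City": "Chicago",
    "State": "IL",
    "Zip": "60614",
}
print(build_prep_input(record))
```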

Standardize USNAME, USADDR, USAREA

1. Select the USNAME rule set from the standardize rules and add the column NameDomain_USPREP
2. Select New Process, select the USADDR rule set, and add the column AddressDomain_USPREP
3. Select New Process, select the USAREA rule set, and add the column AreaDomain_USPREP

Rules                                              Columns
USNAME.SET                               NameDomain_USPREP
USADDR.SET                               AddressDomain_USPREP
USAREA.SET                               AreaDomain_USPREP

Investigate unhandled patterns

Take the output of the above job as input and use 3 Investigate stages:
1. InvUnhandledName
2. InvUnhandledAddr
3. InvUnhandledArea

InvUnhandledName:

Select the method Character Concatenate for Name
Select the columns:
UnhandledPattern_USNAME -> set C mask
UnhandledData_USNAME -> set X mask
InputPattern_USNAME -> set X mask
NameDomain_USPREP -> set X mask

InvUnhandledAddr:

Select the method Character Concatenate for Address
Select the columns:
UnhandledPattern_USADDR -> set C mask
UnhandledData_USADDR -> set X mask
InputPattern_USADDR -> set X mask
AddressDomain_USPREP -> set X mask

InvUnhandledArea:

Select the method Character Concatenate for Area
Select the columns:
UnhandledPattern_USAREA -> set C mask
UnhandledData_USAREA -> set X mask
InputPattern_USAREA -> set X mask
AreaDomain_USPREP -> set X mask
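
The effect of putting a C mask on the unhandled pattern and X masks on the other columns can be sketched as follows: rows are grouped by the unhandled pattern only, and the other columns are carried along as a sample. This is a rough Python illustration under that assumption; unhandled_report is a hypothetical helper and the sample rows are invented.

```python
from collections import Counter


def unhandled_report(rows, domain, sample_cols):
    """Count rows per unhandled pattern and keep one sample row for each."""
    pattern_col = f"UnhandledPattern_{domain}"
    counts = Counter()
    samples = {}
    for row in rows:
        pat = row.get(pattern_col, "").strip()
        if not pat:
            continue  # fully handled record: nothing to investigate
        counts[pat] += 1
        # X-masked columns do not affect grouping; keep them as a sample
        samples.setdefault(pat, {c: row.get(c, "") for c in sample_cols})
    return [(pat, n, samples[pat]) for pat, n in counts.most_common()]


rows = [
    {"UnhandledPattern_USNAME": "++^", "UnhandledData_USNAME": "JOHN SMITH 3"},
    {"UnhandledPattern_USNAME": "", "UnhandledData_USNAME": ""},
    {"UnhandledPattern_USNAME": "++^", "UnhandledData_USNAME": "MARY JONES 2"},
    {"UnhandledPattern_USNAME": "+@", "UnhandledData_USNAME": "ACME C3PO"},
]
for line in unhandled_report(rows, "USNAME", ["UnhandledData_USNAME"]):
    print(line)
```

The most frequent unhandled patterns are the natural candidates for rule set overrides.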

