
Wednesday, February 5, 2014

Data Warehousing Concepts & Interview Questions with Answers





Q : How strong are you in UNIX?
Ans :
1) I have the UNIX shell scripting knowledge that Informatica work typically requires, for example running workflows from UNIX using pmcmd.
Below is a script to start a workflow from UNIX:
cd /pmar/informatica/pc/pmserver/
/pmar/informatica/pc/pmserver/pmcmd startworkflow -u $INFA_USER -p $INFA_PASSWD -s $INFA_SERVER:$INFA_PORT -f $INFA_FOLDER -wait $1 >>$LOG_PATH/$LOG_FILE
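If this script is saved as, say, start_wf.sh (a name used here only for illustration) with the INFA_* and LOG_* variables exported in the environment, a workflow can be started by passing its name as the first argument:
./start_wf.sh wf_daily_load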

2) If we are supposed to process flat files with Informatica but those files exist on a remote server, then we have to write a script to FTP them to the Informatica server before starting to process them.

3) File watching: if an indicator file is available in the specified location, then we start our Informatica jobs; otherwise we send an email notification using the mailx command saying that the previous jobs did not complete successfully.

4) Using a shell script, update the parameter file with the session start time and end time (a sketch of points 3 and 4 follows below).
This is the kind of scripting knowledge I have. If a new UNIX requirement comes up, I can research it and implement the solution.
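For illustration only, a minimal sketch of the file-watch, mailx notification, and parameter-file steps; every path, file name, and parameter name below is made up:

#!/bin/ksh
IND_FILE=/data/inbound/load_ready.ind                      # indicator file dropped by the upstream job
PARAM_FILE=/pmar/informatica/pc/params/wf_daily_load.param

if [ -f "$IND_FILE" ]; then
    # indicator file found, so start the workflow (re-using the pmcmd script above)
    ./start_wf.sh wf_daily_load
else
    # indicator file missing: previous job did not complete, notify support via mailx
    echo "Indicator file $IND_FILE not found; previous job may have failed." |
        mailx -s "wf_daily_load not started" support_team@example.com
    exit 1
fi

# stamp this run's start time into the parameter file for the next run
RUN_START=$(date '+%m/%d/%Y %H:%M:%S')
{
  printf '[MY_FOLDER.WF:wf_daily_load.ST:s_m_daily_load]\n'
  printf '$$SESSION_START_TIME=%s\n' "$RUN_START"
} > "$PARAM_FILE"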

Q : What is the use of Shortcuts in Informatica?

Ans :
If we copy source definitions, target definitions, or mapplets from a shared folder to any other folder, they become shortcuts.
Let's assume we have imported some source and target definitions into a shared folder, and we then use those definitions as shortcuts in mappings in other folders.
If any modification occurs in the backend (database) structure, such as adding new columns or dropping existing columns in either the source or the target, we simply re-import the definition into the shared folder and the change is automatically reflected in every folder/mapping where those source or target definitions are used.

Q : How to do Dynamic File generation in Informatica?

Ans :

The requirement: generate a separate file for every employee (files should be generated per Name). For the data below, the mapping has to generate 5 flat files, and the name of each flat file is the corresponding employee name.
Below is my mapping.
Source (Table) -> SQ -> Target (FF)
Source:
Dept Ename EmpNo
A S 22
A R 27
B P 29
B X 30
B U 34

This functionality was added in Informatica 8.5; it was not available in earlier versions.
We can achieve it by using a Transaction Control transformation and the special "FileName" port in the flat-file target.
To generate the target file names from the mapping, we make use of the special "FileName" port in the target file. You can't create this special port with the usual New Port button; there is a special button labelled "F" at the right-most corner of the target flat file when viewed in the Target Designer.
When you have different sets of input data that need different target files, use the same target instance, but with a Transaction Control transformation that defines the boundary between the source sets.
In the target flat file's Columns tab there is an option to add the file name as a column; when you click it, a non-editable FileName column is added to the target metadata.

In the Transaction Control transformation, give the condition:
IIF(NOT ISNULL(EMP_NO), TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)
Map the Ename column to the target's FileName column (the requirement is to name each file after the employee).
The mapping will look like this:
Source -> SQ -> Transaction Control -> Target
Run it, and separate files will be created, one per Ename.
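Outside Informatica, the same per-employee split can be sketched with awk; the input layout (Dept, Ename, EmpNo, whitespace-separated with a header row) follows the sample source above, and the file name source_data.txt is made up:

# write each data row to a file named after the employee (column 2 = Ename)
awk 'NR > 1 { print > ($2 ".dat") }' source_data.txt
# for the sample rows this produces S.dat, R.dat, P.dat, X.dat and U.dat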

Q : How do you populate the 1st record to the 1st target, the 2nd record to the 2nd target, the 3rd record to the 3rd target, and the 4th record back to the 1st target through Informatica?

Ans :
We can do this with a Sequence Generator by setting End Value = 3 and enabling the Cycle option. Then in a Router create 3 groups:
In the 1st group specify the condition seq NEXTVAL = 1 and pass those records to the 1st target. Similarly,
In the 2nd group specify the condition seq NEXTVAL = 2 and pass those records to the 2nd target.
In the 3rd group specify the condition seq NEXTVAL = 3 and pass those records to the 3rd target.
Since we have enabled the Cycle option, after reaching the end value the Sequence Generator starts again from 1, so for the 4th record NEXTVAL is 1 and it goes to the 1st target.
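The same round-robin distribution can be sketched outside Informatica with awk (the input and target file names here are made up):

# record 1 -> target_1.dat, record 2 -> target_2.dat, record 3 -> target_3.dat, record 4 -> target_1.dat, ...
awk '{ print > ("target_" ((NR - 1) % 3 + 1) ".dat") }' input_records.txt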

Q : How do you perform incremental load logic (Delta/CDC)?

Ans :
Incremental means: suppose today we processed 100 records; in tomorrow's run we need to extract only the records that were newly inserted or updated after the previous run, based on the last-updated timestamp of yesterday's run. This process is called incremental, delta, or CDC (change data capture).
Approach 1: Using SETMAXVARIABLE
1) First create a mapping variable ($$Pre_sess_max_upd) and assign it an initial value of an old date (01/01/1940).
2) Then override the Source Qualifier query to fetch only rows with LAST_UPD_DATE >= $$Pre_sess_max_upd (the mapping variable).
3) In an Expression transformation, assign the maximum LAST_UPD_DATE value to $$Pre_sess_max_upd using SETMAXVARIABLE.
4) Because it is a mapping variable, the maximum LAST_UPD_DATE value is stored in the repository, so in the next run the Source Qualifier query fetches only the records updated or inserted after the previous run.

Approach 2: Using a parameter file
1) First create a mapping parameter ($$Pre_sess_start_tmst) and assign it an initial value of an old date (01/01/1940) in the parameter file.
2) Then override the Source Qualifier query to fetch only rows with LAST_UPD_DATE >= $$Pre_sess_start_tmst (the mapping parameter).
3) Update the mapping parameter ($$Pre_sess_start_tmst) value in the parameter file, using a shell script (see the sketch after this list) or another mapping, after the first session completes successfully.
4) Because it is a mapping parameter, we have to update the value in the parameter file every time after the main session completes.
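A minimal sketch of step 3, assuming a made-up parameter file path and made-up folder/workflow/session names:

#!/bin/ksh
PARAM_FILE=/pmar/informatica/pc/params/wf_incr_load.param
NEW_TMST=$(date '+%m/%d/%Y %H:%M:%S')      # e.g. the current run's start time

# rewrite the parameter file so the next run picks up the new lower bound
{
  printf '[MY_FOLDER.WF:wf_incr_load.ST:s_m_incr_load]\n'
  printf '$$Pre_sess_start_tmst=%s\n' "$NEW_TMST"
} > "$PARAM_FILE"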
Approach 3: Using Oracle control tables
1) First create two control tables, cont_tbl_1 and cont_tbl_2, each with the columns session_st_time and wf_name.
2) Then insert one record into each table with session_st_time = 1/1/1940 and the workflow name (steps 1 and 2 are sketched below).
3) Create two stored procedures. The first updates cont_tbl_1 with the session start time; set its Stored Procedure Type property to Source Pre-load.
4) For the second stored procedure, set the Stored Procedure Type property to Target Post-load; this procedure updates session_st_time in cont_tbl_2 from cont_tbl_1.
5) Then override the Source Qualifier query to fetch only rows with LAST_UPD_DATE >= (SELECT session_st_time FROM cont_tbl_2 WHERE wf_name = 'actual workflow name').
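For illustration, the control tables and seed rows from steps 1 and 2 could be created from a shell session like this (the connection string, column types, and workflow name are assumptions):

sqlplus -s etl_user/etl_pwd@ORADB <<'EOF'
-- two control tables with the same structure, as named in the steps above
CREATE TABLE cont_tbl_1 (session_st_time DATE, wf_name VARCHAR2(100));
CREATE TABLE cont_tbl_2 (session_st_time DATE, wf_name VARCHAR2(100));

-- seed each table with the old date and the workflow name
INSERT INTO cont_tbl_1 VALUES (TO_DATE('01/01/1940','MM/DD/YYYY'), 'wf_incr_load');
INSERT INTO cont_tbl_2 VALUES (TO_DATE('01/01/1940','MM/DD/YYYY'), 'wf_incr_load');
COMMIT;
EOF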

Q : What is the difference between a dynamic lookup cache and a static lookup cache?

Ans :
1) With a dynamic lookup cache, the cache is refreshed as soon as a record is inserted, updated, or deleted in the lookup table during the session, whereas with a static lookup cache the cache is not refreshed even if a record is inserted or updated in the lookup table; it is refreshed only in the next session run.
2) The best example of where we need a dynamic cache: suppose the first record and the last record of the source are for the same key, but with a change in the address. What the mapping has to do here is insert the first record and update the target table with the last record.
3) If we use a static lookup, the first record goes to the lookup, finds no match in the cache for the condition, and returns a null value, so the Router sends that record to the insert flow.

4) But this record is still not available in the cache, so when the last record reaches the lookup it again finds no match, returns null values, and again goes to the insert flow through the Router, although it should go to the update flow, because the cache was not refreshed when the first record was inserted into the target table. If we use a dynamic lookup we can achieve the requirement: when the first record is inserted, the cache is immediately refreshed with the target data, so when we process the last record it finds a match in the cache, the lookup returns the value, and the Router routes that record to the update flow.
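To make the difference concrete outside Informatica, here is a tiny awk sketch that uses an associative array as the lookup cache; the sample rows and field names are made up:

# two source rows for the same customer key, the second with a new address
printf '101,Old Street\n101,New Street\n' |
awk -F',' '{
    if ($1 in cache)
        print "route to UPDATE flow: " $0       # match found in the cache
    else {
        print "route to INSERT flow: " $0       # no match, so insert
        cache[$1] = $2   # dynamic cache: register the inserted row immediately;
                         # a static cache would skip this line, so the second
                         # row would also (wrongly) be routed to the insert flow
    }
}'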
