
Wednesday, February 5, 2014

Data Warehousing Concepts & Interview Questions with Answers





Q : How strong are you in UNIX?
Ans :
1) I have the UNIX shell scripting knowledge that Informatica work typically requires, for example running workflows from UNIX using pmcmd.
Below is a script to start a workflow from UNIX:
cd /pmar/informatica/pc/pmserver/
/pmar/informatica/pc/pmserver/pmcmd startworkflow -u $INFA_USER -p $INFA_PASSWD -s $INFA_SERVER:$INFA_PORT -f $INFA_FOLDER -wait $1 >>$LOG_PATH/$LOG_FILE
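If this script is saved as, say, start_wf.sh (a name used here only for illustration) with the INFA_* and LOG_* variables exported in the environment, a workflow can be started by passing its name as the first argument:
./start_wf.sh wf_daily_load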

2) If we are supposed to process flat files with Informatica but those files exist on a remote server, then we have to write a script to FTP them to the Informatica server before starting to process them.

3) File watching: if an indicator file is available in the specified location, then we start our Informatica jobs; otherwise we send an email notification using the mailx command saying that the previous jobs did not complete successfully.

4) Using a shell script, update the parameter file with the session start time and end time (a sketch of points 3 and 4 follows below).
This is the kind of scripting knowledge I have. If a new UNIX requirement comes up, I can research it and implement the solution.
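For illustration only, a minimal sketch of the file-watch, mailx notification, and parameter-file steps; every path, file name, and parameter name below is made up:

#!/bin/ksh
IND_FILE=/data/inbound/load_ready.ind                      # indicator file dropped by the upstream job
PARAM_FILE=/pmar/informatica/pc/params/wf_daily_load.param

if [ -f "$IND_FILE" ]; then
    # indicator file found, so start the workflow (re-using the pmcmd script above)
    ./start_wf.sh wf_daily_load
else
    # indicator file missing: previous job did not complete, notify support via mailx
    echo "Indicator file $IND_FILE not found; previous job may have failed." |
        mailx -s "wf_daily_load not started" support_team@example.com
    exit 1
fi

# stamp this run's start time into the parameter file for the next run
RUN_START=$(date '+%m/%d/%Y %H:%M:%S')
{
  printf '[MY_FOLDER.WF:wf_daily_load.ST:s_m_daily_load]\n'
  printf '$$SESSION_START_TIME=%s\n' "$RUN_START"
} > "$PARAM_FILE"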

Q : What is the use of Shortcuts in Informatica?

Ans :
If we copy source definitions, target definitions, or mapplets from a shared folder to any other folder, they become shortcuts.
Let's assume we have imported some source and target definitions into a shared folder, and we then use those definitions as shortcuts in mappings in other folders.
If any modification occurs in the backend (database) structure, such as adding new columns or dropping existing columns in either the source or the target, we simply re-import the definition into the shared folder and the change is automatically reflected in every folder/mapping where those source or target definitions are used.

Q : How to do Dynamic File generation in Informatica?

Ans :

The requirement: generate a separate file for every employee (files should be generated per Name). For the data below, the mapping has to generate 5 flat files, and the name of each flat file is the corresponding employee name.
Below is my mapping.
Source (Table) -> SQ -> Target (FF)
Source:
Dept Ename EmpNo
A S 22
A R 27
B P 29
B X 30
B U 34

This functionality was added in Informatica 8.5; it was not available in earlier versions.
We can achieve it by using a Transaction Control transformation and the special "FileName" port in the flat-file target.
To generate the target file names from the mapping, we make use of the special "FileName" port in the target file. You can't create this special port with the usual New Port button; there is a special button labelled "F" at the right-most corner of the target flat file when viewed in the Target Designer.
When you have different sets of input data that need different target files, use the same target instance, but with a Transaction Control transformation that defines the boundary between the source sets.
In the target flat file's Columns tab there is an option to add the file name as a column; when you click it, a non-editable FileName column is added to the target metadata.

In the Transaction Control transformation, give the condition:
IIF(NOT ISNULL(EMP_NO), TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)
Map the Ename column to the target's FileName column (the requirement is to name each file after the employee).
The mapping will look like this:
Source -> SQ -> Transaction Control -> Target
Run it, and separate files will be created, one per Ename.
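Outside Informatica, the same per-employee split can be sketched with awk; the input layout (Dept, Ename, EmpNo, whitespace-separated with a header row) follows the sample source above, and the file name source_data.txt is made up:

# write each data row to a file named after the employee (column 2 = Ename)
awk 'NR > 1 { print > ($2 ".dat") }' source_data.txt
# for the sample rows this produces S.dat, R.dat, P.dat, X.dat and U.dat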

Q : How do you populate the 1st record to the 1st target, the 2nd record to the 2nd target, the 3rd record to the 3rd target, and the 4th record back to the 1st target through Informatica?

Ans :
We can do this with a Sequence Generator by setting End Value = 3 and enabling the Cycle option. Then in a Router create 3 groups:
In the 1st group specify the condition seq NEXTVAL = 1 and pass those records to the 1st target. Similarly,
In the 2nd group specify the condition seq NEXTVAL = 2 and pass those records to the 2nd target.
In the 3rd group specify the condition seq NEXTVAL = 3 and pass those records to the 3rd target.
Since we have enabled the Cycle option, after reaching the end value the Sequence Generator starts again from 1, so for the 4th record NEXTVAL is 1 and it goes to the 1st target.
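The same round-robin distribution can be sketched outside Informatica with awk (the input and target file names here are made up):

# record 1 -> target_1.dat, record 2 -> target_2.dat, record 3 -> target_3.dat, record 4 -> target_1.dat, ...
awk '{ print > ("target_" ((NR - 1) % 3 + 1) ".dat") }' input_records.txt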

Q : How do you perform incremental load logic (Delta/CDC)?

Ans :
Incremental means: suppose today we processed 100 records; in tomorrow's run we need to extract only the records that were newly inserted or updated after the previous run, based on the last-updated timestamp of yesterday's run. This process is called incremental, delta, or CDC (change data capture).
Approach 1: Using SETMAXVARIABLE
1) First create a mapping variable ($$Pre_sess_max_upd) and assign it an initial value of an old date (01/01/1940).
2) Then override the Source Qualifier query to fetch only rows with LAST_UPD_DATE >= $$Pre_sess_max_upd (the mapping variable).
3) In an Expression transformation, assign the maximum LAST_UPD_DATE value to $$Pre_sess_max_upd using SETMAXVARIABLE.
4) Because it is a mapping variable, the maximum LAST_UPD_DATE value is stored in the repository, so in the next run the Source Qualifier query fetches only the records updated or inserted after the previous run.

Approach 2: Using a parameter file
1) First create a mapping parameter ($$Pre_sess_start_tmst) and assign it an initial value of an old date (01/01/1940) in the parameter file.
2) Then override the Source Qualifier query to fetch only rows with LAST_UPD_DATE >= $$Pre_sess_start_tmst (the mapping parameter).
3) Update the mapping parameter ($$Pre_sess_start_tmst) value in the parameter file, using a shell script (see the sketch after this list) or another mapping, after the first session completes successfully.
4) Because it is a mapping parameter, we have to update the value in the parameter file every time after the main session completes.
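A minimal sketch of step 3, assuming a made-up parameter file path and made-up folder/workflow/session names:

#!/bin/ksh
PARAM_FILE=/pmar/informatica/pc/params/wf_incr_load.param
NEW_TMST=$(date '+%m/%d/%Y %H:%M:%S')      # e.g. the current run's start time

# rewrite the parameter file so the next run picks up the new lower bound
{
  printf '[MY_FOLDER.WF:wf_incr_load.ST:s_m_incr_load]\n'
  printf '$$Pre_sess_start_tmst=%s\n' "$NEW_TMST"
} > "$PARAM_FILE"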
Approach 3: Using Oracle control tables
1) First create two control tables, cont_tbl_1 and cont_tbl_2, each with the columns session_st_time and wf_name.
2) Then insert one record into each table with session_st_time = 1/1/1940 and the workflow name (steps 1 and 2 are sketched below).
3) Create two stored procedures. The first updates cont_tbl_1 with the session start time; set its Stored Procedure Type property to Source Pre-load.
4) For the second stored procedure, set the Stored Procedure Type property to Target Post-load; this procedure updates session_st_time in cont_tbl_2 from cont_tbl_1.
5) Then override the Source Qualifier query to fetch only rows with LAST_UPD_DATE >= (SELECT session_st_time FROM cont_tbl_2 WHERE wf_name = 'actual workflow name').
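For illustration, the control tables and seed rows from steps 1 and 2 could be created from a shell session like this (the connection string, column types, and workflow name are assumptions):

sqlplus -s etl_user/etl_pwd@ORADB <<'EOF'
-- two control tables with the same structure, as named in the steps above
CREATE TABLE cont_tbl_1 (session_st_time DATE, wf_name VARCHAR2(100));
CREATE TABLE cont_tbl_2 (session_st_time DATE, wf_name VARCHAR2(100));

-- seed each table with the old date and the workflow name
INSERT INTO cont_tbl_1 VALUES (TO_DATE('01/01/1940','MM/DD/YYYY'), 'wf_incr_load');
INSERT INTO cont_tbl_2 VALUES (TO_DATE('01/01/1940','MM/DD/YYYY'), 'wf_incr_load');
COMMIT;
EOF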

Q : What is the difference between a dynamic lookup cache and a static lookup cache?

Ans :
1) With a dynamic lookup cache, the cache is refreshed as soon as a record is inserted, updated, or deleted in the lookup table during the session, whereas with a static lookup cache the cache is not refreshed even if a record is inserted or updated in the lookup table; it is refreshed only in the next session run.
2) The best example of where we need a dynamic cache: suppose the first record and the last record of the source are for the same key, but with a change in the address. What the mapping has to do here is insert the first record and update the target table with the last record.
3) If we use a static lookup, the first record goes to the lookup, finds no match in the cache for the condition, and returns a null value, so the Router sends that record to the insert flow.

4) But this record is still not available in the cache, so when the last record reaches the lookup it again finds no match, returns null values, and again goes to the insert flow through the Router, although it should go to the update flow, because the cache was not refreshed when the first record was inserted into the target table. If we use a dynamic lookup we can achieve the requirement: when the first record is inserted, the cache is immediately refreshed with the target data, so when we process the last record it finds a match in the cache, the lookup returns the value, and the Router routes that record to the update flow.
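To make the difference concrete outside Informatica, here is a tiny awk sketch that uses an associative array as the lookup cache; the sample rows and field names are made up:

# two source rows for the same customer key, the second with a new address
printf '101,Old Street\n101,New Street\n' |
awk -F',' '{
    if ($1 in cache)
        print "route to UPDATE flow: " $0       # match found in the cache
    else {
        print "route to INSERT flow: " $0       # no match, so insert
        cache[$1] = $2   # dynamic cache: register the inserted row immediately;
                         # a static cache would skip this line, so the second
                         # row would also (wrongly) be routed to the insert flow
    }
}'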
