Wednesday, September 14, 2016

Job Sequence Scenario

I have four jobs: JobA, JobB, JobC, and JobD. I need to trigger JobD when any two of the three jobs (JobA, JobB, JobC) are successful.
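
The trigger rule can be checked outside Datastage with a few lines of code. This is a minimal Python sketch, not Datastage syntax. The function name `should_trigger_job_d` and the boolean status flags are assumptions made for illustration, and "any two" is read here as "at least two".

```python
def should_trigger_job_d(job_a_ok: bool, job_b_ok: bool, job_c_ok: bool) -> bool:
    """Return True when at least two of the three upstream jobs succeeded."""
    # Booleans count as 0/1, so summing them gives the number of successes.
    return sum([job_a_ok, job_b_ok, job_c_ok]) >= 2


if __name__ == "__main__":
    print(should_trigger_job_d(True, True, False))   # two successes
    print(should_trigger_job_d(True, False, False))  # one success
```

If JobD must run only when exactly two jobs succeed, the comparison would be `== 2` instead.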

Job Sequence:




Monday, June 24, 2013

Datastage scheduler Vs Third party scheduler

Datastage scheduler:

1) Using Datastage Director, we can schedule only Datastage jobs.

2) The main drawback of the Datastage scheduler is that it cannot schedule other components such as Unix scripts, SQL scripts, or Informatica jobs. In any project, dependencies exist across all of these components, and in such scenarios you need a dedicated scheduling tool.

Third party scheduler:

1) TWS, Autosys, and Control-M are popular schedulers available in the market.


2) With a third-party scheduler we can schedule all components, including Unix scripts, SQL scripts, and Informatica jobs. This matters when a project executes multiple types of components on different machines, such as Unix and Windows machines. Each scheduler also has its own features, such as e-mail notification, which the Datastage scheduler does not provide.

Thursday, April 18, 2013

Datastage scenario with small example


Consider Input file.txt as below

telephoneno 
09700020075 
919889110102 
918571233668

Output.txt

telephoneno 
09700020075 
09889110102 
08571233668

The output length should be 11. If the first two characters are "91", I need to replace them with "0".

Ans:
In the Transformer stage, use the following derivation:

If (Len(InLink.Phone) = 11) Then InLink.Phone Else "0" : Right(InLink.Phone,10)
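
To check the logic outside Datastage, here is a rough Python equivalent of the derivation above. The function name `normalize_phone` is hypothetical. It mirrors the Len and Right calls and, like the derivation, assumes that every value that is not 11 characters long should be reduced to its last 10 characters with a leading "0".

```python
def normalize_phone(phone: str) -> str:
    """Mirror the Transformer derivation: keep 11-character values,
    otherwise prefix "0" to the rightmost 10 characters."""
    if len(phone) == 11:
        return phone
    return "0" + phone[-10:]


if __name__ == "__main__":
    # Sample values taken from the input file above.
    for raw in ["09700020075", "919889110102", "918571233668"]:
        print(normalize_phone(raw))
```

Note that neither the derivation nor this sketch tests for the "91" prefix explicitly; both rely on the length alone, which is enough for the sample data.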

Common steps for Datastage job development:

1) Understand the requirement clearly, including the exceptions.

2) Write the algorithm in simple English first. Do not go straight to job development.

Algorithm:
a) Check whether the length of the string is 11.
b) If it is 11, pass the input through; otherwise extract the rightmost 10 characters and prefix them with a zero.

3)Now convert the Algorithm into Datastage stages

a) To implement the first point of the algorithm, we need the Len function and an If statement.
b) To implement the second point of the algorithm, we need the Right function.

4) In most cases the requirement is split into several jobs, and after forming the algorithm we have to identify the stages used in each job. This makes debugging easier than implementing the entire requirement in one job.

To identify when to use which stage, the following link will help you: when-to-use-which-stage-in-datastage

5) Tune performance if you face any problems; this is an iterative process.

6) Finally, connect all the jobs through a job sequence according to their dependencies, because the input of one Datastage job may depend on the output of another Datastage job.
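
The dependency ordering described in step 6 can be illustrated with a small Python sketch. The job names and the dependency map below are made up for illustration. The standard-library `graphlib.TopologicalSorter` (Python 3.9+) shows one way to derive a valid run order; it is not how Datastage itself resolves a job sequence.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs: each key maps to the set of jobs whose output it needs.
dependencies = {
    "LoadStaging": set(),
    "CleansePhones": {"LoadStaging"},
    "BuildReport": {"CleansePhones"},
}


def run_order(deps):
    """Return the jobs in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(run_order(dependencies))
```

A job appears in the result only after every job it depends on, which is the same constraint a job sequence has to satisfy.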

Wednesday, April 17, 2013

Difference between sequential file stage and Data set stage?


1) When you use a sequential file as a source, the data has to be converted from ASCII to the native format, whereas with datasets no conversion is required. Also, by default, sequential files are processed sequentially only. Sequential files can hold up to 2GB only, and they do not support NULL values. All of these limitations can be overcome by using the Data Set stage, but the choice depends on the requirement. For example, if you want to capture rejected data, you need to use the Sequential File stage or the File Set stage.

2) The Sequential File stage is used to extract data from flat files and to load data into flat files, with a limit of 2GB. The Data Set stage is an intermediate stage; data is loaded into a dataset in parallel, which improves performance.

3) A dataset mainly consists of two kinds of files:

a) The descriptor file, which holds the metadata and the data location but not the actual data itself.
b) The data files, which hold the data itself, one file per partition.

4) The orchadmin command is used to delete datasets, whereas the Unix rm command is used to remove flat files.

Complete information about orchadmin can be found at the link below:
orchadmin