Showing posts with label Datastage-Scenarios. Show all posts
Showing posts with label Datastage-Scenarios. Show all posts

Wednesday, September 14, 2016

Job Sequence Scenario

I have 4 Jobs JobA,JobB,JobC,JobD.I need to trigger JobD when any of two Jobs(JobA,JobB,JobC) are successful.

Job Sequence:




Thursday, April 18, 2013

Datastage scenario with small example


Consider Input file.txt as below

telephoneno 
09700020075 
919889110102 
918571233668

Output.txt

telephoneno 
09700020075 
09889110102 
08571233668 and length should be 11.If first two characters is "91" i need to replace as "0"

Ans:
In Transformer use following derivation

If (Len(InLink.Phone) = 11) Then InLink.Phone Else "0" : Right(InLink.Phone,10)

Common steps for Datastge Job Development:

1)Understand the requirement clearly and also exceptions

2)Form the algorithm in simple english and Do not go for Job development directly

Algorithm:
a)Check whether length of the string is 11 or not
b)If it 11 pass the input else extract right most 10 characters and append with zero

3)Now convert the Algorithm into Datastage stages

a)To implement first point in the Algorithm we need to use Len function and also If statement
b)To implement second point in the Algorithm we need to use Right function

4)In most of the cases requirement is Spilted into Jobs and we have to identity the stages used in each Job after forming the algorithm  which makes you debugging easier instead of implementing entire requirement in one Job.

To identify when to use which stage the following link will  help you-when-to-use-which-stage-in-datastage

5)Performance tuning if you are facing any problem which is an iterative approach.

6)Finally connect all the Jobs through Job sequence depends on Dependency because one Datastage Job output may depends upon the Input of other Datstage Job

Sunday, April 14, 2013

How to Use different Configuration file for different Job activities in a Job sequence?


Define a APT_CONFIG_FILE  Environmental variable at Job level and mention the path names of configuration files to the jobs in the Job activities in a Job sequence.



Tuesday, January 29, 2013

Double convert in Datastage


Convert Function

Replaces every instance of specified characters in a string with substitute characters.

Syntax-Convert (list, new.list, string)
The two lists of characters correspond. The first character of new.list replaces all instances of the first character of list, the second replaces the second, and so on. If the two lists do not contain the same number of characters:
1)Any characters in list with no corresponding characters in new.list are deleted from the result. 
2)Any surplus characters in new.list are ignored.
3)We can not replace a string with string  by using convert function we have to use parallel Ereplace c++ routine

Example-How to separate alphabets and numeric values?

Input               
col             
Abc123456            
Pqrs698               

output              
col_1,col_2                 
Abc,123456
Pqrs,698

Convert(Convert('ABCDEFGHIJKLMNOPQRSTUVWXYZZ','',Upcase(Col))
,'',Col)=col_1
Convert(Convert('0123456789','',Upcase(Col)),'',Col)=col_2

Here I am using  second point as a reference

Sunday, January 27, 2013

How to find the list of sequences that are restart able in a Datastage project?


1)Get the the list of the Jobs using dsjob -ljobs <project> 
2)Identify restartability of Job by using dsjob -jobinfo <project> <job> command[Job Restartable].If the Job Restartable=0 means then that sequence is non-restartable.
3)The step 2 should be repeated for all the Jobs in a project[Use Unix scripting or Loop activity in Datastage]


Counter Using Transformer


In a parallel job  a unique counter is generated by using special parallel macros.
  
1.  Create a stage variable for the counter, eg. SVCounter.  

2.  At the Stage Properties form set the Initial Value of the Stage Variable to 
"@PARTITIONNUM - @NUMPARTITIONS + 1".

3.Set the derivation of the stage variable to "svCounter + @NUMPARTITIONS". 

4.Create a column(counter) in the output of the Transformer and  set the derivation of the column(counter) to svCounter(stage variable)

 For 4 node configuration each instance will start at a different number, eg. -3, -2,-1,0 When the counter is incremented each instance is increment by the number of partitions, eg. 4. This gives us a sequence in instance 1 of 1, 5, 9, 13... Instance 2 is 2, 6, 10, 14... etc.In this process the counter will always start from 1.This method only works if the input data is evenly balanced i.e. equal number of rows going through each partition.

In 8.5 version  developer can generate the unique counter(Surrogate Key) by using Surrogate Key tab in the transformer.User need to mention database sequence name and database crendentials where database sequence name is created.

Wednesday, December 5, 2012

How to replace character by string in Datastage?


1)convert function replaces a single character with another single character .It is not possible to replace single  character by string(group of characters).

2)Datastage does not have any inbuilt  functionto  implement this functionality.User has to write parallel routine which can called from transformer.The code for this routine can be found in the below link


How to find the nth highest salary in file?


Job:
sequential file-->sortstage-->transformer stage--->targetsequentialfile
sort stage settings
Sal-desc 

Transformer stage settings
constraint-@INROWNUM=n

Both the sort stage and transformer must run in sequential mode.

Sunday, December 2, 2012

How to find the list of the job names and folder where a particular stage is used?

1)Go to Advanced Find (The magnifier-glass-icon on the tool-bar) 

2)Select the stage[Ex:- Pivot Stage] from the Where Used Option

How to store the output of a Repository query in a file?

1)Open  the Administrator
Administrator Client Component-> Projects Tab-> Command Button
2)COMO ON filename 
3)Then execute your query[ex:-SELECT NAME FROM DS_JOBS]and output will be written to a file in Projects/ProjectName/'&COMO&'/ path
4)Do not forget to turn off COMO. 
COMO OFF

Saturday, December 1, 2012

How to fetch the every nth record in sequential file?

Job design:
Sequentialfile--->samplestage--->Sequentialfile

In sample stage select the following settings

SampleMode-Period
Period(Per partition)-n
[To select the  every 3 record in sequential file mention 3].
Execution Mode-sequential Mode

samplestage extract every third row from input.

List of the stages that used in a jobs in a project

1)Get the list of Jobs in a project Using following command

dsjob [ options ] -ljobs projectname

2)For each Job get the list of stages in a Job

dsjob [ options ] -lstages projectname

3)Use  above commands and loop concepts in unix shell scripting to achieve the required output

How to find job modified date in 8.5

Method-1
Export the job as a DSX and look at the DateModified immediately lines below the BEGIN DSJOB and Identifier lines

ex-
BEGIN DSJOB 
   Identifier "JobNameGoesHere" 
   DateModified "2010-10-24" 
   TimeModified "10.22.00"
   
 Method-2  
2)a)Go to Advanced Find Option
b)change view-->List
c)Type Jobname in Name to find Option
d)Click Find Option

Datastage will show Job Type,Job modified date and who modified the Job etc.Method 2 is possible if the Datastage version is 8x,in such cases user can Use method 2