Showing posts with label Datastage-Scenarios. Show all posts
Showing posts with label Datastage-Scenarios. Show all posts
Wednesday, September 14, 2016
Thursday, April 18, 2013
Datastage scenario with small example
Consider Input file.txt as below
telephoneno
09700020075
919889110102
918571233668
Output.txt
telephoneno
09700020075
09889110102
08571233668 and length should be 11.If first two characters is "91" i need to replace as "0"
Ans:
In Transformer use following derivation
If (Len(InLink.Phone) = 11) Then InLink.Phone Else "0" : Right(InLink.Phone,10)
Common steps for Datastge Job Development:
1)Understand the requirement clearly and also exceptions
2)Form the algorithm in simple english and Do not go for Job development directly
Algorithm:
a)Check whether length of the string is 11 or not
b)If it 11 pass the input else extract right most 10 characters and append with zero
3)Now convert the Algorithm into Datastage stages
a)To implement first point in the Algorithm we need to use Len function and also If statement
b)To implement second point in the Algorithm we need to use Right function
4)In most of the cases requirement is Spilted into Jobs and we have to identity the stages used in each Job after forming the algorithm which makes you debugging easier instead of implementing entire requirement in one Job.
To identify when to use which stage the following link will help you-when-to-use-which-stage-in-datastage
5)Performance tuning if you are facing any problem which is an iterative approach.
6)Finally connect all the Jobs through Job sequence depends on Dependency because one Datastage Job output may depends upon the Input of other Datstage Job
Sunday, April 14, 2013
How to Use different Configuration file for different Job activities in a Job sequence?
Define a APT_CONFIG_FILE Environmental variable at Job level and mention the path names of configuration files to the jobs in the Job activities in a Job sequence.
Tuesday, January 29, 2013
Double convert in Datastage
Convert Function
Replaces every instance of specified characters in a string with substitute characters.
Syntax-Convert (list, new.list, string)
The two lists of characters correspond. The first character of new.list replaces all instances of the first character of list, the second replaces the second, and so on. If the two lists do not contain the same number of characters:
1)Any characters in list with no corresponding characters in new.list are deleted from the result.
2)Any surplus characters in new.list are ignored.
3)We can not replace a string with string by using convert function we have to use parallel Ereplace c++ routine
Example-How to separate alphabets and numeric values?
Input
col
Abc123456
Pqrs698
output
col_1,col_2
Abc,123456
Pqrs,698
Convert(Convert('ABCDEFGHIJKLMNOPQRSTUVWXYZZ','',Upcase(Col))
,'',Col)=col_1
Convert(Convert('0123456789','',Upcase(Col)),'',Col)=col_2
Here I am using second point as a reference
Sunday, January 27, 2013
How to find the list of sequences that are restart able in a Datastage project?
1)Get the the list of the Jobs using dsjob -ljobs <project>
2)Identify restartability of Job by using dsjob -jobinfo <project> <job> command[Job Restartable].If the Job Restartable=0 means then that sequence is non-restartable.
3)The step 2 should be repeated for all the Jobs in a project[Use Unix scripting or Loop activity in Datastage]
Counter Using Transformer
In a parallel job a unique counter is generated by using special parallel macros.
1. Create a stage variable for the counter, eg. SVCounter.
2. At the Stage Properties form set the Initial Value of the Stage Variable to
"@PARTITIONNUM - @NUMPARTITIONS + 1".
3.Set the derivation of the stage variable to "svCounter + @NUMPARTITIONS".
4.Create a column(counter) in the output of the Transformer and set the derivation of the column(counter) to svCounter(stage variable)
For 4 node configuration each instance will start at a different number, eg. -3, -2,-1,0 When the counter is incremented each instance is increment by the number of partitions, eg. 4. This gives us a sequence in instance 1 of 1, 5, 9, 13... Instance 2 is 2, 6, 10, 14... etc.In this process the counter will always start from 1.This method only works if the input data is evenly balanced i.e. equal number of rows going through each partition.
In 8.5 version developer can generate the unique counter(Surrogate Key) by using Surrogate Key tab in the transformer.User need to mention database sequence name and database crendentials where database sequence name is created.
Wednesday, December 5, 2012
How to replace character by string in Datastage?
1)convert function replaces a single character with another single character .It is not possible to replace single character by string(group of characters).
2)Datastage does not have any inbuilt functionto implement this functionality.User has to write parallel routine which can called from transformer.The code for this routine can be found in the below link
How to find the nth highest salary in file?
Job:
sequential file-->sortstage-->transformer stage--->targetsequentialfile
sort stage settings
Sal-desc
Transformer stage settings
constraint-@INROWNUM=n
Both the sort stage and transformer must run in sequential mode.
Sunday, December 2, 2012
How to find the list of the job names and folder where a particular stage is used?
1)Go to Advanced Find (The magnifier-glass-icon on the tool-bar)
2)Select the stage[Ex:- Pivot Stage] from the Where Used Option
2)Select the stage[Ex:- Pivot Stage] from the Where Used Option
How to store the output of a Repository query in a file?
1)Open the Administrator
Administrator Client Component-> Projects Tab-> Command Button
2)COMO ON filename
3)Then execute your query[ex:-SELECT NAME FROM DS_JOBS]and output will be written to a file in Projects/ProjectName/'&COMO&'/ path
4)Do not forget to turn off COMO.
COMO OFF
Saturday, December 1, 2012
How to fetch the every nth record in sequential file?
Job design:
Sequentialfile--->samplestage--->Sequentialfile
In sample stage select the following settings
SampleMode-Period
Period(Per partition)-n
[To select the every 3 record in sequential file mention 3].
Execution Mode-sequential Mode
samplestage extract every third row from input.
Sequentialfile--->samplestage--->Sequentialfile
In sample stage select the following settings
SampleMode-Period
Period(Per partition)-n
[To select the every 3 record in sequential file mention 3].
Execution Mode-sequential Mode
samplestage extract every third row from input.
List of the stages that used in a jobs in a project
1)Get the list of Jobs in a project Using following command
dsjob [ options ] -ljobs projectname
2)For each Job get the list of stages in a Job
dsjob [ options ] -lstages projectname
3)Use above commands and loop concepts in unix shell scripting to achieve the required output
dsjob [ options ] -ljobs projectname
2)For each Job get the list of stages in a Job
dsjob [ options ] -lstages projectname
3)Use above commands and loop concepts in unix shell scripting to achieve the required output
How to find job modified date in 8.5
Method-1
Export the job as a DSX and look at the DateModified immediately lines below the BEGIN DSJOB and Identifier lines
ex-
BEGIN DSJOB
Identifier "JobNameGoesHere"
DateModified "2010-10-24"
TimeModified "10.22.00"
Method-2
2)a)Go to Advanced Find Option
b)change view-->List
c)Type Jobname in Name to find Option
d)Click Find Option
Datastage will show Job Type,Job modified date and who modified the Job etc.Method 2 is possible if the Datastage version is 8x,in such cases user can Use method 2
Export the job as a DSX and look at the DateModified immediately lines below the BEGIN DSJOB and Identifier lines
ex-
BEGIN DSJOB
Identifier "JobNameGoesHere"
DateModified "2010-10-24"
TimeModified "10.22.00"
Method-2
2)a)Go to Advanced Find Option
b)change view-->List
c)Type Jobname in Name to find Option
d)Click Find Option
Datastage will show Job Type,Job modified date and who modified the Job etc.Method 2 is possible if the Datastage version is 8x,in such cases user can Use method 2
Subscribe to:
Posts (Atom)