Stuff: Datastage-Scenarios

Showing posts with label Datastage-Scenarios. Show all posts

Wednesday, September 14, 2016

Job Sequence Scenario

I have 4 Jobs JobA,JobB,JobC,JobD.I need to trigger JobD when any of two Jobs(JobA,JobB,JobC) are successful.

Job Sequence:

Thursday, April 18, 2013

Datastage scenario with small example

Consider Input file.txt as below

telephoneno

09700020075

919889110102

918571233668

Output.txt

telephoneno

09700020075

09889110102

08571233668 and length should be 11.If first two characters is "91" i need to replace as "0"

Ans:

In Transformer use following derivation

If (Len(InLink.Phone) = 11) Then InLink.Phone Else "0" : Right(InLink.Phone,10)

Common steps for Datastge Job Development:

1)Understand the requirement clearly and also exceptions

2)Form the algorithm in simple english and Do not go for Job development directly

Algorithm:

a)Check whether length of the string is 11 or not

b)If it 11 pass the input else extract right most 10 characters and append with zero

3)Now convert the Algorithm into Datastage stages

a)To implement first point in the Algorithm we need to use Len function and also If statement

b)To implement second point in the Algorithm we need to use Right function

4)In most of the cases requirement is Spilted into Jobs and we have to identity the stages used in each Job after forming the algorithm which makes you debugging easier instead of implementing entire requirement in one Job.

To identify when to use which stage the following link will help you-when-to-use-which-stage-in-datastage

5)Performance tuning if you are facing any problem which is an iterative approach.

6)Finally connect all the Jobs through Job sequence depends on Dependency because one Datastage Job output may depends upon the Input of other Datstage Job

Sunday, April 14, 2013

How to Use different Configuration file for different Job activities in a Job sequence?

Define a APT_CONFIG_FILE Environmental variable at Job level and mention the path names of configuration files to the jobs in the Job activities in a Job sequence.

Tuesday, January 29, 2013

Double convert in Datastage

Convert Function

Replaces every instance of specified characters in a string with substitute characters.

Syntax-Convert (list, new.list, string)

The two lists of characters correspond. The first character of new.list replaces all instances of the first character of list, the second replaces the second, and so on. If the two lists do not contain the same number of characters:

1)Any characters in list with no corresponding characters in new.list are deleted from the result.

2)Any surplus characters in new.list are ignored.

3)We can not replace a string with string by using convert function we have to use parallel Ereplace c++ routine

Example-How to separate alphabets and numeric values?

Input

col

Abc123456

Pqrs698

output

col_1,col_2

Abc,123456

Pqrs,698

Convert(Convert('ABCDEFGHIJKLMNOPQRSTUVWXYZZ','',Upcase(Col))

,'',Col)=col_1

Convert(Convert('0123456789','',Upcase(Col)),'',Col)=col_2

Here I am using second point as a reference

Sunday, January 27, 2013

How to find the list of sequences that are restart able in a Datastage project?

1)Get the the list of the Jobs using dsjob -ljobs <project>
2)Identify restartability of Job by using dsjob -jobinfo <project> <job> command[Job Restartable].If the Job Restartable=0 means then that sequence is non-restartable.
3)The step 2 should be repeated for all the Jobs in a project[Use Unix scripting or Loop activity in Datastage]

Counter Using Transformer

In a parallel job a unique counter is generated by using special parallel macros.

1. Create a stage variable for the counter, eg. SVCounter.

2. At the Stage Properties form set the Initial Value of the Stage Variable to

"@PARTITIONNUM - @NUMPARTITIONS + 1".

3.Set the derivation of the stage variable to "svCounter + @NUMPARTITIONS".

4.Create a column(counter) in the output of the Transformer and set the derivation of the column(counter) to svCounter(stage variable)

For 4 node configuration each instance will start at a different number, eg. -3, -2,-1,0 When the counter is incremented each instance is increment by the number of partitions, eg. 4. This gives us a sequence in instance 1 of 1, 5, 9, 13... Instance 2 is 2, 6, 10, 14... etc.In this process the counter will always start from 1.This method only works if the input data is evenly balanced i.e. equal number of rows going through each partition.

In 8.5 version developer can generate the unique counter(Surrogate Key) by using Surrogate Key tab in the transformer.User need to mention database sequence name and database crendentials where database sequence name is created.

Wednesday, December 5, 2012

How to replace character by string in Datastage?

1)convert function replaces a single character with another single character .It is not possible to replace single character by string(group of characters).

2)Datastage does not have any inbuilt functionto implement this functionality.User has to write parallel routine which can called from transformer.The code for this routine can be found in the below link

PXEReplace

How to find the nth highest salary in file?

Job:
sequential file-->sortstage-->transformer stage--->targetsequentialfile
sort stage settings
Sal-desc

Transformer stage settings
constraint-@INROWNUM=n

Both the sort stage and transformer must run in sequential mode.

Sunday, December 2, 2012

How to find the list of the job names and folder where a particular stage is used?

1)Go to Advanced Find (The magnifier-glass-icon on the tool-bar)

2)Select the stage[Ex:- Pivot Stage] from the Where Used Option

How to store the output of a Repository query in a file?

1)Open the Administrator

Administrator Client Component-> Projects Tab-> Command Button

2)COMO ON filename

3)Then execute your query[ex:-SELECT NAME FROM DS_JOBS]and output will be written to a file in Projects/ProjectName/'&COMO&'/ path

4)Do not forget to turn off COMO.

COMO OFF

Saturday, December 1, 2012

How to fetch the every nth record in sequential file?

Job design:
Sequentialfile--->samplestage--->Sequentialfile

In sample stage select the following settings

SampleMode-Period
Period(Per partition)-n
[To select the every 3 record in sequential file mention 3].
Execution Mode-sequential Mode

samplestage extract every third row from input.

List of the stages that used in a jobs in a project

1)Get the list of Jobs in a project Using following command

dsjob [ options ] -ljobs projectname

2)For each Job get the list of stages in a Job

dsjob [ options ] -lstages projectname

3)Use above commands and loop concepts in unix shell scripting to achieve the required output

How to find job modified date in 8.5

Method-1
Export the job as a DSX and look at the DateModified immediately lines below the BEGIN DSJOB and Identifier lines

ex-
BEGIN DSJOB
Identifier "JobNameGoesHere"
DateModified "2010-10-24"
TimeModified "10.22.00"

Method-2
2)a)Go to Advanced Find Option
b)change view-->List
c)Type Jobname in Name to find Option
d)Click Find Option

Datastage will show Job Type,Job modified date and who modified the Job etc.Method 2 is possible if the Datastage version is 8x,in such cases user can Use method 2

Pages