Stuff

Sunday, April 14, 2013

Filter stage

1)Filter stage is used to transform (not modify) records from the input link based on a specific condition

2)Different criteria/requirements can be specified to direct data to different output links from the Filter stage.

3)This processing stage can have a single input link and any number of output links. It can also, optionally, have a reject link.

4)Switch stage supports only 128 output links whereas Filter stage can have any number of output links

Friday, April 12, 2013

When to use which stage in Datastage?

Copy STAGE-To drop a Particular column

Sort STAGE-sorting,generating Key change and similar to order by clause in oracle

Filter STAGE-Similar to where clause in oracle but we can not perform Join operation

Lookup,Join,Merge-To perform Join operation

Pivot Enterprise STAGE-Rows to columns and columns to Rows

External Filter STAGE-Filter the records by using Unix filter commands like Grep etc

MODIFY STAGE-Metadata conversion,Null Handling and similar to conversion functions in oracle

FUNNEL STAGE -Combining the multiple input data into a single output.Metadata should be same for all the inputs

REMOVE DUPLICATES STAGE-To remove duplicate values from a single sorted input.

ENCODE / DECODE STAGES:To encode/compress a data set using UNIX encoding commands like gzip etc

TRANSFORMER STAGE:
a)Filtering the Data(constraints)
b)Metadata conversion(Using Functions)
c)Rows to columns and columns to Rows(Using Stage variables)
d)Looping
e)Creating a counter(Using macros)-Counter using Transformer

SURROGATE KEY GENERATOR STAGE-To generate SURROGATE KEYs similar to oracle Database sequence

Aggregator Stage:To perform Group by Operations like max,min etc similar to Group by clause in oracle

ROW GENERATOR STAGE:To generate a set of mock data fitting the specified metadata when no real data is available

XML OUTPUT STAGE -To convert tabular data such as tables and sequential files to XML hierarchical structures.

SWITCH STAGE- It performs an operation similar to the switch statement in C and to filter the data

CHANGE CAPTURE STAGE-To identify Delta changes(inserts,updates,deletes etc) between two sources

oracle connector-To connect to the oracle Database.

Aggregator stage

1)Aggregator stage is a processing stage that can have one input and one output link. It classifies records from the input link into groups and computes the totals or performs specified aggregator functions for each group

2)Records can be grouped on one or more keys

3)In parallel environment, we need to be careful when partitioning. It can affect the result of the aggregator. If the records that fall in the same group are in different partitions, then the generated output will be wrong.Therefore, it is better to do Hash partition on grouping keys before the aggregator stage so that records with same keys will go to same partition.

4)In Aggregator two grouping methods(Hash and sort) are present.Please find the following link-Grouping Methods for more information about grouping methods in aggregator stage

Copy Stage

1)Copy stage is a processing stage that can have a single input and any number of output links

2)Copy stage is used to copy a single input data set to a number of output data sets

3)This stage is generally used for following things

a)Columns can be dropped.

b)The order of the columns can be altered.

Pages