Stuff: Merge stage

Sunday, April 14, 2013

Merge stage

1) The Merge stage is a processing stage. It can have any number of input links (one master link plus one or more update links), a single output link, and as many reject links as there are update links.

2) The input data sets to the Merge stage must be key partitioned and sorted. This ensures that rows with the same key column values are located in the same partition and are processed by the same node.

3) The Merge stage combines the master data with the data from one or more update links wherever the keys match.

4) The master and update links must carry duplicate-free data. If the input data contains duplicate keys, the output will be incorrect.

5) Check the link ordering to make sure the master and update links are assigned correctly; otherwise the output will be incorrect.
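
The points above can be illustrated with a small Python sketch. This is not DataStage code, only a rough model of the behaviour described: one master input, one or more update inputs, one output, and one reject list per update link. The names merge_stage and check_duplicate_free are hypothetical. The sketch assumes that unmatched master rows are kept in the output, that master columns win when a column name appears on both sides, and it uses dictionary lookups instead of the sorted, partitioned streaming that the real stage relies on.

```python
def check_duplicate_free(rows, key):
    """Raise ValueError if two rows share the same key value (point 4)."""
    seen = set()
    for row in rows:
        if row[key] in seen:
            raise ValueError(f"duplicate key value: {row[key]!r}")
        seen.add(row[key])


def merge_stage(master, updates, key):
    """Rough model of a merge: returns (output_rows, rejects_per_update_link).

    master  -- list of dicts, the master link (must come first, point 5)
    updates -- list of lists of dicts, one inner list per update link
    key     -- name of the key column shared by all links
    """
    check_duplicate_free(master, key)
    for update_rows in updates:
        check_duplicate_free(update_rows, key)

    # One lookup table per update link, keyed on the merge key.
    lookups = [{row[key]: row for row in update_rows} for update_rows in updates]

    output = []
    for master_row in master:
        merged = dict(master_row)
        for lookup in lookups:
            match = lookup.get(master_row[key])
            if match is not None:
                for column, value in match.items():
                    # Assumption: existing (master) columns take precedence.
                    merged.setdefault(column, value)
        # Assumption: master rows without any matching update are kept.
        output.append(merged)

    # Update rows whose key has no master row go to that link's reject list.
    master_keys = {row[key] for row in master}
    rejects = [
        [row for row in update_rows if row[key] not in master_keys]
        for update_rows in updates
    ]
    return output, rejects


master = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
update_1 = [{"id": 1, "city": "X"}, {"id": 3, "city": "Z"}]
merged_rows, rejected = merge_stage(master, [update_1], "id")
```

Here the row with id 3 exists only on the update link, so it lands in the reject list for that link, while the master row with id 2 passes through unchanged. Swapping the master and update arguments would change which rows are treated as rejects, which is why the link ordering matters.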
