1) The Sort stage is a processing stage that sorts its input data.
2) Need for sorting:
Some stages require sorted input,
e.g. the Join and Merge stages.
3) The Sort stage requires a 'key' to be specified, by which the sort is performed. Multiple sort keys can be specified.
4) The sort operation is performed partition-wise, so each partition is sorted independently of the others. To sort the complete data set, change the Sort stage execution mode to sequential.
5) There are two ways to perform a sort in DataStage:
a) Within a stage: on the input link Partitioning tab, set partitioning to anything other than Auto and select the Perform sort option.
b) In a separate Sort stage, which has more options, such as allow duplicates, case sensitivity, and sort order (ascending / descending).
By default, both methods use the tsort operator, which can be identified in the job score.
6) Partitioning keys are often different from sorting keys.
Keyed partitioning (e.g. Hash) is used to group related records into the same partition, whereas sort keys are used to establish order within each partition.
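The difference between partition-wise sorting (point 4) and the roles of partitioning keys versus sort keys (point 6) can be illustrated with a small sketch. This is plain Python, not DataStage code, and it does not reproduce how the tsort operator works internally. The function names hash_partition and sort_rows, the column names cust_id and order_date, and the partition count of 2 are all hypothetical and chosen only for illustration.

```python
from zlib import crc32


def hash_partition(rows, key, num_partitions):
    """Place rows with the same key value in the same partition (keyed partitioning)."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # crc32 gives a repeatable hash, so equal keys always map to the same partition.
        digest = crc32(str(row[key]).encode("utf-8"))
        partitions[digest % num_partitions].append(row)
    return partitions


def sort_rows(rows, sort_keys):
    """Return the rows ordered by the given sort keys, in ascending order."""
    return sorted(rows, key=lambda row: tuple(row[k] for k in sort_keys))


# Hypothetical input data.
rows = [
    {"cust_id": 3, "order_date": "2024-03-01"},
    {"cust_id": 1, "order_date": "2024-02-15"},
    {"cust_id": 2, "order_date": "2024-01-20"},
    {"cust_id": 1, "order_date": "2024-01-05"},
    {"cust_id": 3, "order_date": "2024-01-10"},
    {"cust_id": 2, "order_date": "2024-03-12"},
]

# Partition on cust_id only, then sort each partition on cust_id and order_date.
partitions = hash_partition(rows, "cust_id", 2)
sorted_partitions = [sort_rows(p, ["cust_id", "order_date"]) for p in partitions]

# The analogue of sequential execution mode: one sort over the complete data set.
fully_sorted = sort_rows(rows, ["cust_id", "order_date"])

for number, partition in enumerate(sorted_partitions):
    print("partition", number, partition)
print("sequential", fully_sorted)
```

Here the partitioning key (cust_id) only decides which partition a record goes to, while the sort keys (cust_id, order_date) decide the order inside each partition. Reading the sorted partitions one after another does not, in general, give a totally ordered result, which is why a complete sort needs sequential execution.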