Saturday, February 9, 2013

Difference between Normal lookup and Sparse lookup?

1)The first input link to lookup stage is called the ‘Primary’ link. Other links   are called ‘Lookup’ links. When  lookup  links are  from a stage  that  is other  than a database stage, all data from the  lookup  link  is read  into memory. Then,  for each row  from  the primary  link,  the  lookup  is performed. If the source of lookups is a database, there can be two types of lookups: 

 Normal lookup: 
All the data from the database is read into memory, and then lookup is performed. 

Sparse  lookup:  For  each  incoming  row  from  the  primary  link,  the  SQL  is  fired  on database at run time.

2)Sparse lookups can be used if the input data is smaller than the reference data.

4 comments:

  1. Here the important aspect is Sparse lookup can be used only when the reference link is Database. Could you please share some lights if it can be still used if ref link is sequential file? Thanks

    ReplyDelete
  2. No, Sparse lookup is used for DB stages only because DataStage will send your rows in the stream link one by one to the reference DB and the join will be performed in the DB. Since join is a relational concept and does not apply to non-structured data like flat files.

    ReplyDelete
  3. lookupfileset is used for sparse lookup for sequential files(flatfiles also)

    ReplyDelete