Learning DataStage Job Design – Part 1 (introduction to stages and palette components)

Before we begin

A DataStage job is an executable object that performs ETL (Extract – Transform – Load) operations. The job contains various stages interconnected by links. A link acts as a pipe that carries data from one stage to another.

Designing a simple Job

Let's start by designing a simple job with the following requirement. We have a file containing a mark list as follows:

NAME TEST1 TEST2 TEST3 TEST4
MANO 89 87 68 77
ROHIT 76 78 67 90
VINO 88 99 76 89
TARUN 65 76 98 78

This is the input file. It contains only the marks; from it we need to produce a report with the following columns:

NAME PERCENTAGE

So we can define the mapping as follows:

INPUT FILE COLUMN | OUTPUT FILE COLUMN | COMMENTS
NAME              | NAME               | (Direct Mapping)
TEST1             |                    | (Not Mapped)
TEST2             |                    | (Not Mapped)
TEST3             |                    | (Not Mapped)
TEST4             |                    | (Not Mapped)
                  | PERCENTAGE         | (TEST1 + TEST2 + TEST3 + TEST4)


The above file (typically an Excel file), which holds the information about how each input column is mapped to an output column, is called the mapping document. It can be maintained as a version-controlled file so that changes to the job can be tracked through changes to this document.
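As a worked example of the PERCENTAGE derivation, assuming each test is scored out of 100 (an assumption, since the input file does not state the maximum mark), MANO's row would give (89 + 87 + 68 + 77) / 4 = 321 / 4 = 80.25.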

Now we have everything needed to start designing the job. We can represent the logic as follows:

[Figure LD-1 – job design diagram]

Now let us see the components required for designing this job.

Reading and writing the files – needs the Sequential File Stage

Selecting the required columns – needs the Transformer Stage

Performing the calculation – needs the Transformer Stage
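Before building the job, it can help to see the same logic written out in ordinary code. The short Python sketch below is only an illustration of what the stages will do, not part of DataStage itself; it assumes a whitespace-delimited input file named marks.txt with a header row, and that each test is scored out of 100, so the percentage is the sum of the four tests divided by 4.

# Emulates the flow: Sequential File (read) -> Transformer (derive) -> Sequential File (write).
# Assumptions: "marks.txt" is whitespace-delimited with a header row,
# and each test is out of 100, so PERCENTAGE = (TEST1 + TEST2 + TEST3 + TEST4) / 4.
with open("marks.txt") as src, open("percentage.txt", "w") as out:
    header = src.readline().split()      # NAME TEST1 TEST2 TEST3 TEST4
    out.write("NAME PERCENTAGE\n")
    for line in src:
        row = dict(zip(header, line.split()))
        total = sum(int(row[c]) for c in ("TEST1", "TEST2", "TEST3", "TEST4"))
        out.write(f"{row['NAME']} {total / 4}\n")

In the actual job, this calculation is written as a derivation expression on the output column of the Transformer stage rather than as code.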

Let us look into each of these components.
