Overview
This section contains information about managing and creating individual job steps.
What is a Job Step?
A job step is one action that is performed after the dataset has been retrieved and (optionally) delta hashed.
A job can have multiple steps. Each step has an Order field. Steps are executed in ascending order based on the Order number.
Order Tip!
When creating job steps, it is a good practice to separate your order numbers by at least 10 (e.g. 10, 20, 30, 40). This allows for future growth/changes, where you may want to insert a step between two other steps.
Job steps can also be instructed to execute conditionally, based on whether the previous step succeeded or failed. (This is useful if, for example, you want to create a step that fires a webhook notification only if the "main" step fails.)
Job Step Components
Job steps are made up of three main components:
General Settings
This includes things like the step name, order, and condition.
Connection Type
This specifies how you want to transfer your data, for example, via HTTP or FTP.
Data Format
This specifies what you want to transfer, i.e. the payload. For example: JSON or delimited (CSV/TSV).
Generally speaking, you can use any Connection Type with any Data Format, e.g. you could HTTP POST and provide a pipe-delimited file as part of that request.
However, there may be limitations with specific, non-standard combinations, as not all combinations and options are vetted.
Connection Types
HTTP
HTTP Connection
Specify this connection type to initiate an HTTP request to an endpoint.
The HTTP request timeout is 15 minutes and cannot be changed.
The default User-Agent string, if not otherwise specified, is set to: CSI_DataStation/*** dotnet/***, where *** represents the version of each component.
Method
Select the HTTP method. Common values are GET or POST.
Target URL
Enter the target URL where the HTTP request should be made. Query string parameters may be included.
It is not possible to "parameterize" or add dynamic tokens to the URL at this time. The URL must be static.
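For example, a hypothetical Target URL including a query string parameter:
CODE
https://api.example.org/v1/contacts/import?source=datastation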
Content Type
Specify the HTTP Content-Type header value.
For example, if you are sending JSON data, enter: application/json
Headers
Enter one or more additional headers that should be sent, in the format: Header: Header Value. Separate multiple headers with a line break.
Do not send any control headers, or headers that are dynamically generated. For example, do not include Content-Length or Location.
If you wanted to send an Authorization header and a User-Agent header, for example, you would enter the following into the Headers field:
CODE
Authorization: Bearer AaBbCcDdEeFf0123456789
User-Agent: CSI_DataStation/1.0
Fail Step on 4xx/5xx Status
If enabled, and the response status is between 400 and 599, the step is marked as "failed" (which, in turn, will cause the job itself to be marked as either failed or partially succeeded, depending on the other steps).
If disabled, the response status code is printed in the log file, but otherwise ignored.
Payload Chunking
Extremely large payloads can be broken up into multiple smaller requests.
Consider a data source containing the letters A through F.
When Payload Chunking is off, the dataset is sent as:
CODE
[
{ "letter": "A" },
{ "letter": "B" },
{ "letter": "C" },
{ "letter": "D" },
{ "letter": "E" },
{ "letter": "F" }
]
But if Payload Chunking is on and the chunk size is set to 3, the data is sent as two requests:
CODE
[
{ "letter": "A" },
{ "letter": "B" },
{ "letter": "C" }
]
... DELAY n SECONDS ...
[
{ "letter": "D" },
{ "letter": "E" },
{ "letter": "F" }
]
Payload Chunk Size
Specify the maximum number of records that should be sent with each request.
Delay Between Chunks
Specify the amount of time, in seconds, that the DataStation should wait before sending the next chunk of data.
To disable chunking, set the Payload Chunk Size to 0.
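For example (illustrative numbers): with a dataset of 1,000 records, a Payload Chunk Size of 250, and a Delay Between Chunks of 10, the data would be sent as four requests of 250 records each, with a 10-second pause between requests.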
FTP
FTP Connection
Specify this connection type to initiate an FTP file upload.
FTP Server / Hostname
Enter the FTP server / hostname. This can be a publicly resolvable hostname, such as ftp.example.org, or an IP address.
Port Number
This is almost always 21, unless the FTP server administrator has told you otherwise.
Enable FTPS
If enabled, the connection will attempt to use FTPS (FTP over SSL / TLS).
If FTPS is enabled, the Port Number above should be changed to 990, or whatever the FTPS port is of the destination server.
Enable Legacy TLS 1.0 / SSL3 Protocols
If enabled, FTPS connections over TLS 1.0 and SSL3 will be allowed.
Otherwise, only TLS 1.1 / TLS 1.2 connections are allowed.
Username
Enter the username of the user to log in as.
Password
Enter the password of the user above.
Path / Folder
Enter a fully-qualified file name/path (relative to the FTP root) of the destination file.
This setting also defines your file name.
For example, if you need to upload a file to the (relative) folder "/drop", and you want the file to be called "sample-transfer.csv", you would enter:
/drop/sample-transfer.csv
into this field.
If you just want to upload a file to the root / home folder of the FTP server, you may just enter the file name into this field, for example: sample-transfer.csv
If you would like to add a date or a random GUID to make your file name unique, the following tokens are available:
"${date:...}"
"${guid}"
For example, if you wanted to send a file with the date appended to the end of the file name (e.g. MyFile-2023-01-03.csv), you would enter:
/test/myFile-${date:yyyy-MM-dd}.csv
For example, if you wanted to send a file with a random GUID appended to the end of the file name, you would enter:
/test/myFile-${guid}.csv
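The ${guid} token produces a random GUID, so /test/myFile-${guid}.csv might produce a file named along the lines of /test/myFile-3f2504e0-4f89-41d3-9a0c-0305e82c3301.csv (the exact value is random and the format shown here is illustrative). If the date token follows .NET-style format strings, as the yyyy-MM-dd example suggests (verify in your environment), a pattern such as ${date:yyyyMMdd-HHmm} could also be used to include the time of day.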
Enable Passive Mode
If your FTP server requires passive file transfers, enable this option.
Verify with your FTP server administrator which file transfer mode is required (active or passive). If this setting is incorrect, it can lead to failures when uploading.
Overwrite Existing Files
If enabled, and if the FTP server allows this operation, the DataStation can overwrite an existing file of the same name on the server, if it finds one.
This is useful, for example, during nightly uploads, where the destination system retrieves the file and processes it, but does not remove it from the FTP folder.
SFTP
SFTP Connection
Specify this connection type to initiate a file upload via an SFTP connection.
Not to be confused with FTP or FTPS, SFTP is the SSH File Transfer Protocol. It enables file transfers over an SSH connection, typically to a Unix or Linux-based server.
iTransfer will always accept (and write to the log) whatever host key it receives from the host. (It is not currently possible to specify an allowed list of host keys.)
At this time, it is not possible to authenticate via a Public/Private keypair. Support for public key authentication will be added in a future release.
Server / Hostname
Enter the SFTP server / hostname. This can be a publicly resolvable hostname, such as ftp.example.org, or an IP address.
Port Number
This is almost always 22, unless the SSH/SFTP server administrator has told you otherwise.
Username
Enter the username of the user to log in as.
Password
Enter the password of the user above.
Path / Folder
Enter a fully-qualified file name/path (relative to the SFTP root) of the destination file.
This setting also defines your file name.
For example, if you need to upload a file to the (relative) folder "/drop", and you want the file to be called "sample-transfer.csv", you would enter:
/drop/sample-transfer.csv
into this field.
If you just want to upload a file to the root / home folder of the SFTP server, you may just enter the file name into this field, for example:
sample-transfer.csv
If you would like to add a date or a random GUID to make your file name unique, the following tokens are available:
"${date:...}"
"${guid}"
For example, if you wanted to send a file with the date appended to the end of the file name (e.g. MyFile-2023-01-03.csv), you would enter:
/test/myFile-${date:yyyy-MM-dd}.csv
For example, if you wanted to send a file with a random GUID appended to the end of the file name, you would enter:
/test/myFile-${guid}.csv
Overwrite Existing Files
If enabled, and if the SFTP server allows this operation, the DataStation can overwrite an existing file of the same name on the server, if it finds one.
This is useful, for example, during nightly uploads, where the destination system retrieves the file and processes it, but does not remove it from the SFTP folder.
Data Formats
JSON Data Format
Specify this data format if you want to transmit JSON data.
Template
JSON data is built using a template.
The JSON data entered into the Template field is wrapped into a JSON Array, and transmitted in bulk to the third party.
For example, given this source data table:
| FirstName | LastName | FavoriteNumber |
|---|---|---|
| John | Smith | 42 |
| Bob | Jones | 3 |
| Alice | Thompson | 85 |
And the following template:
CODE
{
"first": "$$FirstName$$",
"last": "$$LastName$$",
"favNum": $$FavoriteNumber$$
}
The following payload will be produced:
CODE
[
{
"first": "John",
"last": "Smith",
"favNum": 42
},
{
"first": "Bob",
"last": "Jones",
"favNum": 3
},
{
"first": "Alice",
"last": "Thompson",
"favNum": 85
}
]
Notice that:
The resulting payload is automatically enclosed in a JSON array [...].
Each row/object (except the last) is automatically suffixed with a comma (,); do not include one in your template.
The name of the column from the source data, enclosed in $$...$$, is replaced with the value from each row of the source data table.
Strings must be enclosed in "...", per the JSON specification.
The FavoriteNumber field, being numeric, does not need to be enclosed in "...", although if the receiving party needs it to be a string and not a number, it can optionally be enclosed.
The names of the columns do not need to match the property names that are sent (e.g. "favNum" is the JSON property name, but "FavoriteNumber" is the source table's column name).
You may hard-code data if you wish; each property does not have to contain a $$...$$ placeholder.
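For example, the following template mixes a hard-coded property (the "source" value here is purely hypothetical) with placeholders:
CODE
{
  "source": "nightly-export",
  "first": "$$FirstName$$",
  "last": "$$LastName$$",
  "favNum": $$FavoriteNumber$$
}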
Insert fields from IQA
Click on a blue field name to automatically insert it into the Template field where your cursor is currently located.
Auto-Template
Click this button to automatically generate a template based on the source columns. This is an excellent time-saver if your dataset contains many columns, but please take note of the following warnings, and always review your template before saving.
Every field is included in the template.
The JSON property name is copied directly from the column name as-is.
All fields are generated as strings. If you have numeric or boolean (true/false) fields, you need to remove the quotation marks surrounding these placeholders.
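For example, using the source columns from the earlier example, an auto-generated template would look similar to the following (illustrative; the exact output depends on your source columns):
CODE
{
  "FirstName": "$$FirstName$$",
  "LastName": "$$LastName$$",
  "FavoriteNumber": "$$FavoriteNumber$$"
}
Because FavoriteNumber is numeric, you would remove the quotation marks around that placeholder (and rename the properties, if desired) before saving.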
Enable Single Row Mode
If this mode is enabled, only a single row of data is sent.
This mode is useful for:
Creating a data source with only one row of data, containing aggregate/reporting numbers, such as totals and other statistics.
Manually entering a static JSON payload to send as a webhook, e.g. in a subsequent step marked as Only on Failure where you want to send a webhook message to another online service such as Slack or Microsoft Teams.
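For example, a static JSON payload for a hypothetical Slack incoming webhook notification might look like the following (the exact payload format depends on the receiving service; consult that service's webhook documentation):
CODE
{
  "text": "DataStation job step failed: Transmit Data to Third Party"
}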
Wrapper Template
In the above template example, notice that the root object is a JSON array.
Some systems are unable to accept an array as the root object, or otherwise require the array to be nested within a parent object.
If this setting is populated, the JSON array is wrapped in an outer JSON object, which is then sent.
Again, using the preceding example:
CODE
[
{
"first": "John",
"last": "Smith",
"favNum": 42
},
...
]
If we enter a wrapper template like so:
CODE
{
"success": true,
"data": %%data%%
}
The actual payload that is sent will look like this:
CODE
{
"success": true,
"data": [
{
"first": "John",
"last": "Smith",
"favNum": 42
},
...
]
}
As shown in the previous example, you may hard-code other properties to send, such as "success": true. However, replacement tokens from the source data table ($$...$$) are not allowed in this field.
If you don't want to use a wrapper template, simply leave this field blank.
Delimited (CSV) Data Format
Specify this data format to create any flat-file delimited data, such as CSV, TSV, pipe-separated, etc.
At this time, the only line (or record) delimiters allowed are standard "line breaks". You may choose between Windows-style (CR LF) or Unix-style (LF).
You may use any typeable character for the field delimiter (such as a comma, pipe, semicolon, or other symbol).
For non-typeable characters, such as tabs, you will need to compose your line template in a text editor such as Notepad++, ensuring that the tab or other character is represented correctly, then copy and paste the template into the Line Template field.
Header Template
If you'd like your file to contain a header line, enter the header line as it should appear in the file verbatim.
Line Template
Enter the line template. This template is repeated once per record.
A column name from the source data table, surrounded by $$...$$, will be replaced with the value from the current row.
CSV Data Tip
If you are writing CSV data, it is always a good idea to enclose each field in double quotes ("...").
For example, instead of this line template: $$FirstName$$,$$LastName$$
Use this template instead: "$$FirstName$$","$$LastName$$"
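For example, using the source data table from the JSON section above, a Header Template of "FirstName","LastName","FavoriteNumber" and a Line Template of "$$FirstName$$","$$LastName$$","$$FavoriteNumber$$" would produce the following file:
CODE
"FirstName","LastName","FavoriteNumber"
"John","Smith","42"
"Bob","Jones","3"
"Alice","Thompson","85"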
Insert fields from IQA
Click on a blue field name to automatically insert it into the Line Template field where your cursor is currently located.
Auto-Template
Click this button to automatically generate a template based on the source columns. This is an excellent time-saver if your dataset contains many columns, but please always review your template before saving.
Tip
The Auto-Template button will populate both the Header Template and Line Template fields for you.
If you don't want a header in your file, simply blank out that field.
Enable UNIX line endings
If enabled, line endings will be written in the LF format (\n or char(10)).
Otherwise, line endings will be written in the CR LF format (\r\n or char(13)char(10)).
For more information, please refer to Newline on Wikipedia.
Conditional Multi-Step Logic
All steps except the first can define conditional logic, where the step will only execute if the previous step succeeded or failed.
The first step is always executed. The condition field is always ignored for a given job's first step (as defined by its Order field).
After the first step, each subsequent step is checked against an internal "success" flag. If the flag matches the condition, the step is run. Otherwise, it is skipped.
Ways That a Step Can Fail
A step is marked as failed if:
It throws an unknown error during processing
The network transmission failed (connection interrupted, host not found, timed out, invalid username/password, etc.)
For the HTTP transmission step, if the Fail Step on 4xx/5xx status option is checked, and the remote server responds with a status code between 400 and 599
Success Flag Logic
The internal success flag is only set to false if a step actually runs and fails; skipped steps do not change the flag.
If the current step is set to execute, the success flag is first reset to true, and then the step is run. Only if the current step runs AND fails is the flag then set to false.
This allows you to define both an "Only on Success" step and an "Only on Failure" step, each of which checks the outcome of the "main" step.
Consider the following example:
| Order | Step Name | Step Condition | Success Flag Before Execution | Step Outcome | Success Flag After Execution |
|---|---|---|---|---|---|
| 10 | Transmit Data to Third Party | — (First step is always run.) | true | Success | true |
| 20 | Failure Notification | Only on Failure | — | Skipped | — |
| 30 | Success Notification | Only on Success | true | Success | true |
That was a simple example. Now, let's look at a more complex one.
| Order | Step Name | Step Condition | Success Flag Before Execution | Step Outcome | Success Flag After Execution |
|---|---|---|---|---|---|
| 10 | Transmit Data to Third Party | — (First step is always run.) | true | Error | false |
| 20 | Success Notification | Only on Success | — | Skipped | — |
| 30 | Failure Notification | Only on Failure | false | Success | true |
| 40 | Job Completion Notification | Always | true | Success | true |
In the preceding example, notice that the Success Notification step did NOT reset the Success Flag, because the step did not run.
Then, when the third step (Failure Notification) executed, it was able to read the success flag left by the first step, which was still false.
Additionally, a Job Completion Notification step was added, which has a condition of "Always". Irrespective of the outcome of the other steps, this step will always run.