Sorter is an active transformation which sorts the incoming data based on one or more key values (in an ascending, descending or mixed order). The Sorter transformation attribute, 'Distinct' provides a facility to remove duplicates from the input rows.
Sorter Cache Size - the Integration Service passes all rows into the Sorter transformation before it performs the sort operation. Any amount between 1 MB and 4 GB can be specified for the Sorter cache size.
Case Sensitive and Null Treated Low - options used to precise the output order
Work Directory - the directory used to create temporary files while it sorts data. After the rows are sorted, the temporary files get deleted automatically.
Distinct - if distinct output rows is checked, the Mapping Designer configures all ports as part of the sort key.
Key properties worth looking at:
The Aggregator transformation calculates aggregates such as sums, minimum or maximum values across multiple groups of rows. Sorted Input - should be checked when the input data is already sorted by groups.
Aggregator Data Cache Size
Aggregator Index Cache Size - default is "Auto", index cache contains all group by ports.
The Aggregator transformation can apply expressions to its ports however those expressions will be applied to a group of rows unlike the Expression transformation which applies calculations on a row-by-row basis only.
Aggregate functions are created in output ports only. Function grouping requirements are set using the Aggregator Group By port.
Available aggregate expressions: AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM
Examples / useful tips
Sorting the records prior to passing them on to an Aggregator transformation may improve the overall performance of the aggregation task.
Using a Sorter transformation may improve performance over an 'Order By' clause in a SQL override statement in aggregate session when the source is a database because the source database may not be tuned with the buffer sizes needed for a database sort.
In Aggregator checking the sorted input attribute will bypass caching.