Data Process Engine: esProc Finds Differences between CSV files

5/05/2015


userName and date together form the logical primary key of both old.csv and new.csv. We want to find the rows that are new, deleted, and updated between the two files.

The source data is as follows:

As the data above shows, in new.csv the 2nd and 3rd rows are new and the 4th row is updated; in old.csv the 3rd row has been deleted.
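To make the three categories concrete, here is a minimal Python sketch (not esProc code). The sample rows are hypothetical and are not the contents of old.csv or new.csv; the `classify` function and the `value` column are illustrative names only.

```python
def classify(old_rows, new_rows, key=("userName", "date")):
    """Split rows into new, deleted and updated lists by a logical primary key."""
    def index(rows):
        # Map each row's key tuple to the row itself
        return {tuple(r[k] for k in key): r for r in rows}

    old_idx, new_idx = index(old_rows), index(new_rows)
    # New: key present only in the new file
    added = [r for k, r in new_idx.items() if k not in old_idx]
    # Deleted: key present only in the old file
    deleted = [r for k, r in old_idx.items() if k not in new_idx]
    # Updated: key present in both files, but some other column differs
    updated = [r for k, r in new_idx.items() if k in old_idx and r != old_idx[k]]
    return added, deleted, updated


# Hypothetical sample rows, for illustration only
old_rows = [
    {"userName": "alice", "date": "2015-01-01", "value": "10"},
    {"userName": "bob", "date": "2015-01-02", "value": "20"},
    {"userName": "carol", "date": "2015-01-03", "value": "30"},
]
new_rows = [
    {"userName": "alice", "date": "2015-01-01", "value": "10"},
    {"userName": "bob", "date": "2015-01-02", "value": "25"},
    {"userName": "dave", "date": "2015-01-04", "value": "40"},
]

added, deleted, updated = classify(old_rows, new_rows)
```

In this made-up data, the `dave` row is new, the `carol` row is deleted, and the `bob` row is updated because its key is unchanged while `value` differs.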

esProc code:

A1, B1: Retrieve the two comma-separated files.
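For comparison, the same retrieval step could look roughly like this in Python, using the standard `csv` module. The text below is hypothetical content standing in for a file, and `read_csv_text` is an illustrative helper name.

```python
import csv
import io


def read_csv_text(text):
    """Parse comma-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical file content, for illustration only
old_text = "userName,date,value\nalice,2015-01-01,10\nbob,2015-01-02,20\n"
old_rows = read_csv_text(old_text)
```

For a file on disk, the same reader can be given a file object opened with `open("old.csv", newline="")` instead of the `io.StringIO` wrapper.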

A2, B2: Sort the data by the key, because the merge function used in the next step requires sorted input.
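Sorting matters because two key-ordered sequences can be compared in a single forward pass. The Python sketch below illustrates that idea by collecting the rows of one sorted list whose key is absent from the other. It is an analogue under assumed data, not esProc's implementation; `row_key`, `merge_diff`, and the sample rows are hypothetical.

```python
def row_key(row):
    """Composite key (userName, date) used for both sorting and comparison."""
    return (row["userName"], row["date"])


def merge_diff(a, b, key=row_key):
    """Return rows of sorted list a whose key does not occur in sorted list b."""
    out = []
    j = 0
    for row in a:
        ka = key(row)
        # Advance through b past every key smaller than the current key of a
        while j < len(b) and key(b[j]) < ka:
            j += 1
        # Keep the row if b is exhausted or its current key is different
        if j == len(b) or key(b[j]) != ka:
            out.append(row)
    return out


# Hypothetical unsorted rows, for illustration only
new_rows = [
    {"userName": "dave", "date": "2015-01-04"},
    {"userName": "alice", "date": "2015-01-01"},
    {"userName": "bob", "date": "2015-01-02"},
]
old_rows = [
    {"userName": "carol", "date": "2015-01-03"},
    {"userName": "alice", "date": "2015-01-01"},
]

# Both inputs must be sorted by the same key before merging
new_sorted = sorted(new_rows, key=row_key)
old_sorted = sorted(old_rows, key=row_key)

only_in_new = merge_diff(new_sorted, old_sorted)
only_in_old = merge_diff(old_sorted, new_sorted)
```

The dates here are compared as strings, which gives the right order only because the assumed format is year-month-day; other date formats would need parsing first.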

A3: Find the new records by the key. The merge function merges data sets; the @d option computes the difference during the merge. Similar options include @u for union and @i for intersection. The computed result is as follows: