3/26/2015

esProc Assists Java to Join Text Files


JAVA lacks the class library for handling text file JOIN operation. So it’s quite complicated to write code specifically for it, especially when the files are too big to be loaded into the memory all at once. esProc can help JAVA do this kind of operation. Below is an example showing the steps in detail.

order.txt is a text file, in which the first row contains column names and SellerId column is the logical foreign key that points to EID column in employee.txt. It is required to retrieve OrderID, Client, Name and Dept to create a new file. Below are selections of the source data.

Order.txt

Employee.txt

Suppose both files are not big, then we can perform an inner join using the following esProc code:

A1, A2Retrieve files to memory. @t indicates making the first row as column names.

A3Perform inner join and the result is as follows:

A4: Retrieve the desired columns from A3.

A5: Export them to a new file. The result is as follows:

If left join is required, then just use @1 option with JOIN function. The code will be =join@1(A1:o,SellerId; A2:e,EId) and the result file is this:

Similarly we can use @f option for a full join.

However if one of the files is big, or both are big, we can handle the operation with esProc cursor. As the following code shows:

A1,A2Open the big files with cursors.

A3: Inner join of cursors, which, different from the in-memory inner join, requires each cursor is ordered. So sortx function is used here to sort.

A4: Retrieve the desired columns to create a new cursor.

A5: Export the cursor data to a new file.

Similarly, @1 indicates the left join and @f the full join.

         // establish a connection via esProc JDBC
         Class.forName("com.esproc.jdbc.InternalDriver");
         con= DriverManager.getConnection("jdbc:esproc:local://");
         // call esProc script whose name is test
         st =(com.esproc.jdbc.InternalCStatement)con.prepareCall("call test()");
         st.execute();//execute esProc stored procedure
         ResultSet set = st.getResultSet();//get the result

No comments:

Post a Comment