5/17/2015

Performance Test of esProc Script Execution

Sometimes we may encounter computations too complicated to be handled solely using library functions. We need to hardcode the computational logic using scripts. In that case the interpretative and executional performance of the script becomes really important. Not a few traditional database tasks, being unable to be expressed directly in SQL, require stored procedure to perform data traversal. 

We’ll test performance of executing esProc scripts and compare it with executing script handling the same task with Oracle’s stored procedure.

Test data:
ORDERID CLIENT     SELLERID AMOUNT ORDERDATE NOTE
1       287  47     5825         2013-05-31       gafcaghafdgie f ci…
2       89     22     8681         2013-05-04       gafcaghafdgie f ci…
3       47     67     7702         2009-11-22       gafcaghafdgie f ci…
4       76     85     8717         2011-12-13       gafcaghafdgie f ci…
5       307  81     8003         2008-06-01       gafcaghafdgie f ci…
6       366  39     6948         2009-09-25       gafcaghafdgie f ci…
7       295  8       1419         2013-11-11       gafcaghafdgie f ci…
8       496  35     6018         2011-02-18       gafcaghafdgie f ci…
9       273  37     9255         2011-05-04       gafcaghafdgie f ci…
10     212  0       2155         2009-03-22       gafcaghafdgie f ci…
The test involves two cases: small data and big data. The former will be loaded into memory in one go to be processed by loop. The latter will be imported into memory in batches or row by row to be processed by loop.

Case one: Small data


Below is an esProc script:

The execution time of the esProc script only includes the execution of A3’s loop, without the data importing in B2 counted. As the execution time is short, we’ll repeat the execution five times in case that large error will be produced due to the tiny result value.

Below is the Oracle’s stored procedure for doing the same task:
create or replace procedure orderssp is
type v_table is table of orders%rowtype index by binary_integer;  
my_table v_table;  
ret2 number;num1 number;num2 number;num3 number;date1 date;
begin
  date1:=to_date('20150101','yyyymmdd');
  select * bulk collect into my_table from orders;  
for j in 1..5
    loop
        for i in 1..my_table.count  
        loop
          num1:= floor(date1-my_table(i).orderdate);
          num2:= instr(my_table(i).note,'a');
          num3:= (my_table(i).client+my_table(i).sellerid);
          if num3<>0 then ret2:=num1* num2/num3;
          else ret2:=num1*num2;
          end if;
        end loop;
    end loop; 
end orderssp;
Similarly, the execution time of the Oracle’s stored procedure will not include the data importing time (select * bulk collect into my_table from orders;). It is the time taken to complete the for loop, and executions will repeat five times.

Test data amount: 4 million rows. Oracle data table is 457M and esProc data file is 418M.

Test results (measured by seconds):

Case two: Big data

esProc script:

esProc execution time = Total execution time – data fetching time. The data fetching time is the time spent in fetching data, without real computation counted, i.e. the execution time after removing code from C3 to C6. We’ll do the execution twice to get the total execution time and the data fetching time respectively.

Below is the Oracle’s stored procedure for doing the same task:
create or replace procedure mem is
 cursor mem_cur is
select * from orders;
l_mem orders%rowtype;
ret2 number;num1 number;num2 number;num3 number;date1 date;
begin
  date1:=to_date('20150101','yyyymmdd');
open mem_cur;
  loop
       fetch mem_cur into l_mem;
       exit when mem_cur%notfound;
    num1:= floor(date1-l_mem.orderdate);
    num2:= instr(l_mem.note,'a');
    num3:= (l_mem.client+l_mem.sellerid);
    if num3<>0 then ret2:=num1* num2/num3;
          else ret2:=num1*num2;
          end if;
 end loop;
 close mem_cur;
end mem;

Likewise the time of executing Oracle’s stored procedure computing= Total execution time – data fetching time. The data fetching time is the time spent in fetching data, without real computation counted, that is, the time spent in executing the loop body where only fetch statement and exit when statement exist.

Test data amount: 2.4 billion rows. Oracle data table is 28G and esProc data file is 25G.

Test result (measured by seconds):

Conclusion:

For small data that can be loaded into memory all at once, esProc and Oracle are equal in their computing performance.

For big data that cannot be entirely loaded into memory, esProc excels in both data fetching and computing performance. The interpreter of Oracle’s stored procedure performs poorly.

In real world business, when dealing with complicated computations that cannot be directly coded using SQL and require handling by the stored procedure after data are fetched row by row, we can move the data out of the database and use esProc to process it. By doing so, higher computing performance can be achieved.

Test environment:
CPU: Core(TM) i5-3450  Four cores, four threads
Memory capacity: 16GB
OSCENT OS
Oracle11gMaximum memory 14G
esProc 3.1 versionMaximum memory 14G

No comments:

Post a Comment