Sometimes we may encounter computations too
complicated to be handled solely using library functions. We need to hardcode
the computational logic using scripts. In that case the interpretative and
executional performance of the script becomes really important. Not a few traditional
database tasks, being unable to be expressed directly in SQL, require stored procedure
to perform data traversal.
We’ll test performance of executing esProc
scripts and compare it with executing script handling the same task with Oracle’s
stored procedure.
Test data:
ORDERID CLIENT SELLERID AMOUNT ORDERDATE NOTE
1 287 47 5825 2013-05-31 gafcaghafdgie f ci…
2 89 22 8681 2013-05-04 gafcaghafdgie f ci…
3 47 67 7702 2009-11-22 gafcaghafdgie f ci…
4 76 85 8717 2011-12-13 gafcaghafdgie f ci…
5 307 81 8003 2008-06-01 gafcaghafdgie f ci…
6 366 39 6948 2009-09-25 gafcaghafdgie f ci…
7 295 8 1419 2013-11-11 gafcaghafdgie f ci…
8 496 35 6018 2011-02-18 gafcaghafdgie f ci…
9 273 37 9255 2011-05-04 gafcaghafdgie f ci…
10 212 0 2155 2009-03-22 gafcaghafdgie f ci…
…
The test involves two cases: small data and
big data. The former will be loaded into memory in one go to be processed by
loop. The latter will be imported into memory in batches or row by row to be
processed by loop.
Case one: Small data
Below is an esProc script:
The execution time of the esProc script
only includes the execution of A3’s loop, without the data importing in B2
counted. As the execution time is short, we’ll repeat the execution five times
in case that large error will be produced due to the tiny result value.
Below is the Oracle’s stored procedure for
doing the same task:
create or replace procedure orderssp is
type v_table is table of orders%rowtype
index by binary_integer;
my_table v_table;
ret2 number;num1 number;num2 number;num3
number;date1 date;
begin
date1:=to_date('20150101','yyyymmdd');
select * bulk collect into my_table from orders;
for j in 1..5
loop
for i in 1..my_table.count
loop
num1:= floor(date1-my_table(i).orderdate);
num2:= instr(my_table(i).note,'a');
num3:= (my_table(i).client+my_table(i).sellerid);
if num3<>0 then ret2:=num1* num2/num3;
else ret2:=num1*num2;
end if;
end loop;
end loop;
end orderssp;
Similarly, the execution time of the Oracle’s
stored procedure will not include the data importing time (select * bulk
collect into my_table from orders;). It is the time taken to complete the for loop, and executions will repeat
five times.
Test data amount: 4 million rows. Oracle
data table is 457M and esProc data file is 418M.
Case two: Big data
esProc execution time = Total execution
time – data fetching time. The data fetching time is the time spent in fetching
data, without real computation counted, i.e. the execution time after removing
code from C3 to C6. We’ll do the execution twice to get the total execution
time and the data fetching time respectively.
Below is the Oracle’s stored procedure for
doing the same task:
create or replace procedure mem is
cursor mem_cur is
select * from orders;
l_mem orders%rowtype;
ret2 number;num1 number;num2 number;num3 number;date1 date;
begin
date1:=to_date('20150101','yyyymmdd');
open mem_cur;
loop
fetch mem_cur into l_mem;
exit when mem_cur%notfound;
num1:= floor(date1-l_mem.orderdate);
num2:= instr(l_mem.note,'a');
num3:= (l_mem.client+l_mem.sellerid);
if num3<>0 then ret2:=num1* num2/num3;
else ret2:=num1*num2;
end if;
end
loop;
close mem_cur;
end mem;
Likewise the time of executing Oracle’s
stored procedure computing= Total execution time – data fetching time. The data
fetching time is the time spent in fetching data, without real computation
counted, that is, the time spent in executing the loop body where only fetch statement and exit when statement exist.
Test data amount: 2.4 billion rows. Oracle
data table is 28G and esProc data file is 25G.
Conclusion:
For small data that can be loaded into
memory all at once, esProc and Oracle are equal in their computing performance.
For big data that cannot be entirely loaded
into memory, esProc excels in both data fetching and computing performance. The
interpreter of Oracle’s stored procedure performs poorly.
In real world business, when dealing with
complicated computations that cannot be directly coded using SQL and require
handling by the stored procedure after data are fetched row by row, we can move
the data out of the database and use esProc to process it. By doing so, higher
computing performance can be achieved.
Test environment:
CPU: Core(TM) i5-3450 Four cores, four threads
Memory capacity: 16GB
OS:CENT OS
Oracle11g:Maximum memory 14G
esProc 3.1 version:Maximum
memory 14G
No comments:
Post a Comment