This article aims to test esProc
performance of retrieving files and compares it with the result got by using Oracle
JDBC.
The test data is some order records saved
in the orders.b file. Import the file
with esProc and its content would be like this:
ORDERID CLIENT SELLERID AMOUNT ORDERDATE NOTE
1 287 47 5825 2013-05-31 gafcaghafdgie f ci…
2 89 22 8681 2013-05-04 gafcaghafdgie f ci…
3 47 67 7702 2009-11-22 gafcaghafdgie f ci…
4 76 85 8717 2011-12-13 gafcaghafdgie f ci…
5 307 81 8003 2008-06-01 gafcaghafdgie f ci…
6 366 39 6948 2009-09-25 gafcaghafdgie f ci…
7 295 8 1419 2013-11-11 gafcaghafdgie f ci…
8 496 35 6018 2011-02-18 gafcaghafdgie f ci…
9 273 37 9255 2011-05-04 gafcaghafdgie f ci…
10 212 0 2155 2009-03-22 gafcaghafdgie f ci…
…
note field is the string field for increasing each record’s length and
doesn’t have practical meaning. Test process: Write a Java program to call esProc
JDBC and Oracle JDBC respectively for retrieving and traversing the same data,
and record the time taken. The Java program, esProc script, the data file and
Oracle are all held in one machine, so no network is needed to transmit them.
Data volume: 890M
Hardware and
software configuration for the test machine: Normal PC
CPU: Core(TM) i5-3450 4 cores, 4 threads
Memory: 16GB
SSD
OS:CENT OS
Oracle:11g
esProc version: 3.1
esProc script esProcjdbc.dfx for data
retrieval (cellset parameter fileName is the file name):
|
A
|
1
|
=file(fileName).import@b()
|
2
|
result A1
|
The
test program of calling esProc JDBC by Java:
import
java.sql.*;
import
java.text.SimpleDateFormat;
import
java.util.Date;
public class
esproc {
public static void testDataServer(String
input){
Connection con = null;
try{
SimpleDateFormat df
= new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); Date
begin=new Date();
System.out.println("begin"+df.format(begin));
long diff=0;
Class.forName("com.esproc.jdbc.InternalDriver");
con=
DriverManager.getConnection("jdbc:esproc:local://");
com.esproc.jdbc.InternalCStatement
st=
(com.esproc.jdbc.InternalCStatement)con.prepareCall("call
esProcjdbc(?)");
st.setObject(1,input);
ResultSet set
=st.executeQuery();
int len=0;
while(set.next())
{
String
orderid=set.getString("ORDERID");
String
client=set.getString("CLIENT");
String
sellerid=set.getString("SELLERID");
float
amount=set.getFloat("AMOUNT");
Timestamp
ORDERDATE=set.getTimestamp("ORDERDATE");
String
note=set.getString("NOTE");
len++;
}
System.out.println("len="+len);
diff=((new
Date()).getTime()-begin.getTime())/1000;
System.out.println("time
of all="+diff);
}
catch(Exception e){
System.out.println(e);
}
finally{
if (con!=null) {
try {
con.close();
}
catch(Exception
e) {
System.out.println(e);
}
}
}
}
public static void main(String[] args)
{
testDataServer(“/ssd/data/orders.b”);
}
}
The SQL statement that Oracle implements:
Select * from
orders;
The
test program of calling Oracle JDBC by Java:
package jdbc;
import
java.sql.*;
import
java.util.Date;
import
java.text.SimpleDateFormat;
public class
oracle {
public static void testDataServer(String
input){
Connection con = null;
try{
SimpleDateFormat df
= new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); Date
begin=new Date();
System.out.println("begin"+df.format(begin));
long diff=0;
Class.forName("oracle.jdbc.driver.OracleDriver");
con=
DriverManager.getConnection("jdbc:oracle:thin:@192.168.1.21:1521:orcl","test","test");
Statement
st=con.createStatement();
ResultSet set
=st.executeQuery(input);
int len=0;
while(set.next())
{
String
orderid=set.getString("ORDERID");
String
client=set.getString("CLIENT");
String
sellerid=set.getString("SELLERID");
float
amount=set.getFloat("AMOUNT");
Timestamp
ORDERDATE=set.getTimestamp("ORDERDATE");
String
note=set.getString("NOTE");
len++;
}
System.out.println("len="+len);
diff=((new
Date()).getTime()-begin.getTime())/1000;
System.out.println("time
of all="+diff);
}
catch(Exception e){
System.out.println(e);
}
finally{
if (con!=null) {
try {
con.close();
}
catch(Exception
e) {
System.out.println(e);
}
}
}
}
public static void main(String[] args)
{
testDataServer(“select * from
orders”);
}
}
Test
result:
|
esProc
|
Oracle
|
Execution
time
|
13
seconds
|
43
seconds
|
Conclusion:
Data retrieval using Oracle JDBC is more
than one time slower than that via esProc JDBC. The reason is that data flow needs
to be converted to object through Oracle JDBC, which takes time. There is no
need for Java-based esProc to do the same thing, so it works much faster.
One point to note is that the time taken by
the first execution of calling Oracle JDBC to retrieve data is not used by the
test. On the other hand, because the data volume is well within 6G, Oracle’s
maximum memory capacity, it can make full use of the cache memory to save data.
Therefore most of the execution time is devoted to conversion to object via
JDBC.
This shows that esProc approach of retrieving
data into the external memory is more efficient than Oracle’s throughput capacity.
Generally data computing within database
will achieve higher performance. But in some cases you need to retrieve a great
amount of data from the database. For example, data outside the database is
involved, or the algorithm is too complex and specialized to be expressed in
SQL. In these cases, JDBC for database connectivity will be the key factor to determine
the performance.
Yet better performance can be achieved by
using esProc to retrieve data from the database and save it in files for use in
later computation. To handle or display a large amount of historical data by
report applications, it is more efficient to retrieve data into files before
processing than to process it by retrieving data directly from the database.
Within the log analysis system, you can save data as files and use esProc to
handle the computation if you must retrieve data from the database using a
programming language to perform certain operations involving log files.
No comments:
Post a Comment