Heterogeneous datasources are frequently
needed in report development, but they are difficult to realize using the
reporting tool, like JasperReport, alone. To present the result
of joining two MongoDB collections together, for instance. Though JasperReport
has the functions of virtual data source or table join, but they are offered
only in the commercial or higher versions and probably won’t appear in free
version. Moreover, these two functions support connection to only two
datasources. If users need more, the program development becomes complicated.
Additionally, they don’t support subsequent structured data computing of the
joined data as SQL does, due to providing only a graphical interface.
esProc has a powerful structured data
computing engine, supports heterogeneous datasources and is easy to be integrated,
thus it is useful for assisting the reporting tool to realize MongoDB join
conveniently. The following example will show you how esProc works to create a
join in MongoDB.
There are two collections – sales and emp – in MongoDB. Logically, sale’s
SellerId field is equivalent to a foreign key that points to emp’s EId field. The task is to query
orders in sales by the time range,
create a left join with emp and
present the result in the report. Some of the source data are as follows:
Collection sales
Collection emp
esProc script for completing the task:
A1=MongoDB("mongo://localhost:27017/test?user=root&password=sa")
This line of code establishes the
connection to MongoDB. User name and password can be specified through the two
parameters user and password.
As connecting to an ordinary database, esProc
also supports connecting to MongoDB through JDBC. But since the third-party
JDBC is not free, and has a weaker function than the official library function –
for example, it cannot retrieve the multilayer data, esProc will directly
encapsulate the native methods, thereby retaining the function and syntax of
MongoDB. find function, for example,
can be used in this case.
A2=A1.find("sales","{'$and':[{'OrderDate':{'$gte':'"+string(begin)+"'}},{'OrderDate':{'$lte':'"+string(end)+"'}}]}","{_id:0}").fetch()
This line of code finds records during a
certain time period from MongoDB’s collection sales. find function’s
first parameter is the collection’s name, its second parameter is the query
condition that is defined according to MongoDB syntax and the third one
specifies the field to be returned. Note that begin and end in the query
condition are external parameters passed from the report, respectively
representing the beginning date and the ending date of OrderDate.
find function will return a cursor, which means it doesn’t retrieve data
into the memory entirely and thus supports processing big data. Data of the
result cursor can be further processed with functions like skip, sort and conj and data retrieval won’t start
until either fetch function or groups function, or for statement appears. fetch()
function is used in this example to fetch the data into the memory. Suppose the
time period is from 2009-01-01 to 2009-12-31, result of A2 is as follows:
A3=A1.find("emp",,"{_id:0}").fetch()
This line of code retrieves all data, except for _id field, from collection emp unconditionally. Result is as follows:
A4=A1.close()
This line of code closes the connection to
MongoDB established in A1.
A5=join@1(A2:sales,SellerId;A3:emp,EId)
This line of code creates a left join between A2 and A3. The fields to join them together are A2’s SellerId and A3’s EId. Intuitively, two parts of the joined data are respectively named sales and emp. join function is used to perform the join operation, in which @1 option means left join. The result can be seen in the left part of the following figure:
It can
be seen that some of the SellerId in sales
cannot find corresponding records in emp
because of the left join. We can use @f
to perform a full join; without any option, the function will perform an inner
join.
A6=A5.new(sales.OrderID:OrderID,sales.Client:Client,sales.Amount:Amount,sales.OrderDate:OrderDate,emp.Name:Name,emp.Dept:Dept,emp.Gender:Gender)
A5 joins the data together. A6 gets the fields we want from the result of joining and creates a two-dimensional table using new function. For example, sales.OrderID:OrderID means getting sales.OrderID field from A5 and rename it OrderID (because reporting tools cannot identify field names like sales.OrderID). Result is as follows:
Now all data are ready for creating the
report. The final step is to return A6’s two-dimensional table to the reporting
tool with result A6. esProc offers JDBC
interface to be integrated with the reporting tool, and the latter will identify
it as the ordinary database. Please refer to related documents for the
integration solution.
Define two parameters – Pbegin and Pend – in the report according to the corresponding esProc parameters. Click Preview to see the report:
The way the reporting tool calls the esProc
script is the same as that it calls the stored procedure. Save the esProc
script as, say, mongodbJoin.dfx to be
called by mongodbJoin $P{pbegin},$P{pend} in JasperReport’s SQL designer.
No comments:
Post a Comment