MongoDB's script language has limited computational ability for complicated
operations, so problems of this kind are difficult to solve with it alone. In
many cases, you can only perform further computation after retrieving the
desired data. Implementing this kind of set operation in a high-level
programming language such as Java is no easier. In such cases, you can use
esProc to assist with the computation on MongoDB data. The following example
shows how esProc works.
There is a collection – test – in MongoDB, as shown below:
> db.test.find({},{"_id":0})
{ "value" : NumberLong(112937552) }
{ "value" : NumberLong(715634640) }
{ "value" : NumberLong(487229712) }
{ "value" : NumberLong(79198330) }
{ "value" : NumberLong(440998943) }
{ "value" : NumberLong(93148782) }
{ "value" : NumberLong(553008873) }
{ "value" : NumberLong(336369168) }
{ "value" : NumberLong(369669461) }
…
Specifically, test holds multiple values, each of which is a string of digits.
Each digit string must be compared with every other digit string to find, for
that string, the largest number of same digits and the largest number of
different digits over all comparisons. For example, if the digit 1 appears in
both the first row and the nth row, it counts as one same digit. If it appears
only in the first row and not in the nth row, it counts as one different
digit.
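To make the counting rule concrete, here is a minimal Python sketch of a single comparison between two rows. The function name compare_digits is hypothetical, and the sketch assumes that each distinct digit of the current row is counted once, as the cell-by-cell explanation of the esProc script describes.

```python
def compare_digits(current: str, other: str) -> tuple[int, int]:
    """Return (same, diff) for the distinct digits of `current` checked against `other`."""
    same = diff = 0
    # dict.fromkeys drops repeated digits while keeping first-seen order
    for digit in dict.fromkeys(current):
        if digit in other:
            same += 1   # the digit also appears in the other row
        else:
            diff += 1   # the digit appears only in the current row
    return same, diff


# First two rows of the sample collection
pair_result = compare_digits("112937552", "715634640")
print(pair_result)  # (4, 2): digits 1, 3, 7, 5 are shared; 2 and 9 are not
```

The required result for a row is then the largest same and the largest diff seen across its comparisons with all other rows.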
esProc code:
A1: Connect to MongoDB. The host and port are localhost:27017.
The database name, user name and password are all test.
A2: The find function fetches data from MongoDB and creates a cursor. test
is the collection, the filtering condition is null, and the _id key is
excluded from the result. As you can see, esProc's find function takes the
same parameter format as the find statement in MongoDB. An esProc cursor
supports fetching and processing data in batches, which avoids the memory
overflow that can occur when a large data set is imported all at once. Because
the data size here is small, the fetch function is used to retrieve all the
records from the cursor at once.
A3: Add two new columns to A2 for storing the largest counts of same and
different digits and, at the same time, convert the values into strings.
A4: Loop over the records in A3. The loop body covers cells B4-D10.
B4: Get the value of the current loop member.
C4: Use array@s to split the column value into a sequence of single
characters and remove the duplicate values.
B5: Run an inner loop over the records in A3. The loop body is C6-D10.
C5: If the inner loop's position is the same as the outer loop's current
position, that is, both loops are on the same record, skip the current inner
iteration and move on to the next.
C6: Get the value of the current inner loop member.
C7: Define two variables, same and diff, for storing the counts of same and
different digits found in the current comparison. Both are initialized to zero.
C8: The loop function checks, one by one, whether each digit in the sequence
split from the outer loop's value appears in the inner loop's value. If it
does, same increases by one; otherwise diff increases by one.
C9, C10: Compare same and diff with the values stored in the current A4
record, and assign the larger values back to that record's same and diff.
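For readers who want to follow the logic outside esProc, the steps above can be sketched in Python. This is not the esProc script itself. It is an illustration built from the cell descriptions, with the cell names in the comments showing the assumed correspondence. The names fetch_values and max_same_diff are hypothetical, and fetch_values assumes a pymongo Collection object is passed in (connection and authentication are left out).

```python
SAMPLE_VALUES = [112937552, 715634640, 487229712, 79198330, 440998943,
                 93148782, 553008873, 336369168, 369669461]


def fetch_values(collection):
    """A2: read every document without _id and return the raw values.

    `collection` is assumed to be a pymongo Collection, for example
    pymongo.MongoClient("localhost", 27017)["test"]["test"].
    """
    return [doc["value"] for doc in collection.find({}, {"_id": 0})]


def max_same_diff(values):
    # A3: convert the values to strings and add the two result columns
    rows = [{"value": str(v), "same": 0, "diff": 0} for v in values]
    for i, row in enumerate(rows):                    # A4: outer loop
        digits = list(dict.fromkeys(row["value"]))    # B4, C4: distinct digits
        for j, other in enumerate(rows):              # B5: inner loop
            if i == j:                                # C5: skip the same record
                continue
            other_value = other["value"]              # C6
            same = diff = 0                           # C7
            for digit in digits:                      # C8
                if digit in other_value:
                    same += 1
                else:
                    diff += 1
            row["same"] = max(row["same"], same)      # C9
            row["diff"] = max(row["diff"], diff)      # C10
    return rows


rows = max_same_diff(SAMPLE_VALUES)
print(rows[0])
```

Run against only the nine sample values listed earlier, the first row ends up with same = 5 (its best match is 93148782) and diff = 4 (its worst match is 440998943). The real collection contains more rows, so its results may differ.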
Note: esProc does not ship with MongoDB's Java driver. To access MongoDB from
esProc, you must first put MongoDB's Java driver (esProc requires version
2.12.2 or above, e.g. mongo-java-driver-2.12.2.jar) into [esProc installation
directory]\common\jdbc.
The esProc script that assists MongoDB with the computation is easy to
integrate into a Java program. You just need to add one more line of code,
A11, namely result A3, to return the result to the Java program as a result
set. For the detailed code, please refer to the esProc Tutorial. Likewise,
MongoDB's Java driver must be on the Java program's classpath before the
program accesses MongoDB by calling an esProc script.