I would like to extend the Apache Drill Mongo Storage Plugin to push down INNER JOINs by rewriting them into the Mongo aggregation pipeline.
Where should I start to implement this rewrite in Apache Drill?
Here is a SQL example:
SELECT *
FROM `mymongo.db`.`test` `test`
  INNER JOIN `mymongo.db`.`test2` `test2`
  ON (`test`.`id` = `test2`.`fk`)
WHERE `test2`.`date` = '09.05.2017'
I have found the push-down of WHERE clauses in the Mongo Storage Plugin, but I am still struggling to do the same for INNER JOINs. What would the constructor of public class MongoPushDownInnerJoinScan extends StoragePluginOptimizerRule look like? Which equivalent of MongoGroupScan (AbstractGroupScan) would I have to implement? Any help would be very much appreciated.
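If you want to express a SQL-style join in the aggregation framework, you can use the $lookup pipeline stage. Note that $lookup on its own behaves like a left outer join; adding a $unwind stage on the output array field afterwards drops the unmatched documents, which gives inner join semantics.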
{
    $lookup: {
        from: <collection to join>,
        localField: <field from the input documents>,
        foreignField: <field from the documents of the "from" collection>,
        as: <output array field>
    }
}
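For the SQL example in the question, the pipeline that a push-down rule would have to generate could look roughly like the sketch below. It is only an illustration using plain Java collections: it does not use the Drill or MongoDB driver APIs, and the class name InnerJoinPipelineSketch and the helpers doc and innerJoin are hypothetical names invented for this example. It assumes the pipeline runs on the `test` collection and that the joined documents are stored under the field `test2`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: builds the aggregation stages as plain maps.
// Nothing here is part of Apache Drill or the MongoDB Java driver.
public class InnerJoinPipelineSketch {

    // Builds a single-entry ordered map, i.e. { key: value }.
    static Map<String, Object> doc(String key, Object value) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(key, value);
        return m;
    }

    // Returns the stages $lookup, $unwind, $match for an equi-join
    // followed by an equality filter on a field of the joined collection.
    static List<Map<String, Object>> innerJoin(String from, String localField,
                                               String foreignField, String as,
                                               String filterField, Object filterValue) {
        Map<String, Object> lookup = new LinkedHashMap<>();
        lookup.put("from", from);
        lookup.put("localField", localField);
        lookup.put("foreignField", foreignField);
        lookup.put("as", as);

        return Arrays.asList(
            // Left outer join: matches are collected in the array field "as".
            doc("$lookup", lookup),
            // $unwind drops documents whose array is empty, so only matched
            // pairs remain (inner join semantics).
            doc("$unwind", "$" + as),
            // WHERE clause on the joined side, addressed through the "as" field.
            doc("$match", doc(as + "." + filterField, filterValue))
        );
    }

    public static void main(String[] args) {
        // test INNER JOIN test2 ON test.id = test2.fk WHERE test2.date = '09.05.2017'
        System.out.println(innerJoin("test2", "id", "fk", "test2", "date", "09.05.2017"));
    }
}
```

In a real rule, these maps would presumably be translated into the document type the plugin already uses for its filter push-down; that mapping is not shown here.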