I have an Oozie workflow that is supposed to run every X minutes. It first reads a value from an HBase table, then runs an incremental Sqoop action based on that value. To make the workflow work, I need to somehow capture the new --last-value from the Sqoop action and write it back to HBase, so that the next run of the workflow can read it again, and so on.
How can I do this, or might there be a better way?
Jonas
I think this blog post might give you some hints: http://www.tanzirmusabbir.com/2013/05/chunk-data-import-incremental-import-in.html
Basically, it keeps startindex and chunksize in job.properties. The startindex is used in the WHERE condition of the Sqoop job, and a shell script updates startindex after the Sqoop job finishes.
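To make the "update startindex after the Sqoop job" step concrete, here is a rough sketch of it in Python rather than shell. It is not the code from the blog post. It assumes job.properties holds plain key=value lines with the keys startindex and chunksize, and the function name advance_start_index and the sample nameNode value are made up for illustration.

```python
import re


def advance_start_index(properties_text: str) -> str:
    """Return the properties text with startindex advanced by chunksize.

    Assumes simple key=value lines; the keys 'startindex' and 'chunksize'
    are the ones described in the blog post, the rest is illustrative.
    """
    # First pass: collect the key=value pairs, skipping blanks and comments.
    values = {}
    for line in properties_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key, _, value = stripped.partition("=")
        values[key.strip()] = value.strip()

    # The next chunk starts where the current one ends.
    new_start = int(values["startindex"]) + int(values["chunksize"])

    # Second pass: rewrite only the startindex line, keep everything else.
    updated = []
    for line in properties_text.splitlines():
        if re.match(r"\s*startindex\s*=", line):
            updated.append(f"startindex={new_start}")
        else:
            updated.append(line)
    return "\n".join(updated) + "\n"


if __name__ == "__main__":
    # Hypothetical file contents, for demonstration only.
    sample = "nameNode=hdfs://localhost:8020\nstartindex=0\nchunksize=1000\n"
    print(advance_start_index(sample), end="")
```

In a real workflow you would read job.properties from disk, pass its contents through this function, and write the result back before the next scheduled run.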