
Problems running Storm with additional classpath

How can I run a Storm topology with additional classpath entries?

My output directory is as follows:

  1. myapp.jar (the manifest's classpath references the conf and lib directories)
  2. lib (directory)
  3. conf (directory)

I know of the following solutions, but neither suits me, and I don't think either is best practice:

  1. Pack those files into the jar.
  2. Put those files in Storm's lib directory.

ref: https://groups.google.com/forum/#!topic/storm-user/YqNr82Y3Nac

user2550587 asked Dec 07 '25 02:12


2 Answers

There is no clean way that I am aware of to extend the classpath of the Storm workers. Apart from the topology jar itself, the classpath is defined at JVM start-up, so anything present there is visible to, and thus shared by, all topologies running on that node. Moreover, because Storm runs as a cluster, putting topology-specific files on the workers' filesystems makes deployment trickier, since you have to copy or update those files on every node. The Storm deployer is meant to hide that from us.

Bundling dependent jars into myapp.jar has worked well for me so far, since it ensures my dependencies are always deployed and updated on every node. Bundling config files there technically works as well, but it makes myapp.jar environment-specific, which is indeed not best practice.

I usually copy any supplementary config files to the node from which I deploy (not the nodes where the topologies run), serialise them into a JSON-friendly format and add them to the Storm config at deployment time. That way I can read them back in any prepare() method whenever my topology starts on some node of the cluster. Here again, this approach ensures that my config is present and up to date on every node of the cluster.
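A minimal sketch of that idea, under some assumptions: the config is a Java properties file, and a plain HashMap stands in for the Storm config object so the example runs without Storm on the classpath. The class name, the key "myapp.config" and the sample settings are hypothetical, chosen only for illustration. In a real topology the map would be the config passed at submission time, and the read would happen inside prepare().

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class TopologyConfigSketch {

    // Hypothetical key under which the external settings are stored.
    static final String APP_CONFIG_KEY = "myapp.config";

    /** Deployment side: turn properties text into a map of plain strings. */
    static Map<String, String> toJsonFriendlyMap(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> result = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            result.put(name, props.getProperty(name));
        }
        return result;
    }

    /** Deployment side: attach the settings to the topology config. */
    static void attach(Map<String, Object> stormConf, Map<String, String> appConfig) {
        stormConf.put(APP_CONFIG_KEY, appConfig);
    }

    /** Worker side: what a prepare() method could do with the map it receives. */
    @SuppressWarnings("unchecked")
    static String readSetting(Map<String, Object> stormConf, String name) {
        Map<String, String> appConfig = (Map<String, String>) stormConf.get(APP_CONFIG_KEY);
        return appConfig == null ? null : appConfig.get(name);
    }

    public static void main(String[] args) {
        // In a real deployment this text would be read from a file in conf/.
        String fileContents = "db.host=example.internal\ndb.port=5432\n";

        // Stand-in for the Storm config object (assumption: it behaves as a Map).
        Map<String, Object> stormConf = new HashMap<>();
        attach(stormConf, toJsonFriendlyMap(fileContents));

        System.out.println(readSetting(stormConf, "db.host"));
    }
}
```

Keeping the values as plain strings in a nested map is a deliberate choice here: it stays trivially serialisable, and each bolt or spout can parse only the settings it needs.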

Svend answered Dec 08 '25 14:12


The script ${STORM_HOME}/bin/storm.py, which is invoked when you run the storm command, picks up the environment variable STORM_EXT_CLASSPATH and adds it to the classpath.

Setting that environment variable should solve your problem:

export STORM_EXT_CLASSPATH={myoutput_dir}/lib

Follow-up: this doesn't work because of a known bug in storm.py: STORM_EXT_CLASSPATH is not added properly. Each individual character of its value is added as a separate entry instead of the whole string (even if you use square brackets, quotes, etc.).

wizetmin answered Dec 08 '25 15:12


