We are running a Spark Streaming application on a standalone cluster (Spark 1.6).
Logging in Spark seems to be somewhat scattered, and I am trying to configure a Nagios log file monitor that checks the log files for certain "errors" and sends out alerts.
My current understanding of Spark's logs is the following: the spark-master and spark-worker daemons write to static log file locations, while the driver and executor (application) logs end up in dynamic locations — Spark generates new directories per application, under /var/run/spark/work in my case.
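For reference, each application gets its own subdirectory there, something like the following (the application and executor IDs below are only illustrative):

/var/run/spark/work/app-20160101120000-0000/0/stdout
/var/run/spark/work/app-20160101120000-0000/0/stderr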
My issue:
Monitoring the static-location log files for spark-master and spark-worker is straightforward. I am a bit confused as to how the dynamically located app and driver logs can be monitored.
From what I read in the documentation, it seems that with spark-submit I can pass a -Dlog4j.configuration option pointing to a custom log4j.properties file.
Can this be configured to stream the logs to a local syslog at a static location, and then have Nagios monitor that static log?
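Something along these lines is what I have in mind — a minimal sketch, assuming a log4j 1.x SyslogAppender, a local rsyslog daemon accepting UDP on port 514, and that the properties file is present at the same path on every node (/etc/spark/log4j-syslog.properties is just my chosen path):

# /etc/spark/log4j-syslog.properties
# Route all application logging to the local syslog daemon under facility LOCAL1
log4j.rootCategory=INFO, syslog
log4j.appender.syslog=org.apache.log4j.net.SyslogAppender
log4j.appender.syslog.SyslogHost=localhost
log4j.appender.syslog.Facility=LOCAL1
log4j.appender.syslog.layout=org.apache.log4j.PatternLayout
log4j.appender.syslog.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n

# Point both the driver and the executors at that file on spark-submit
spark-submit \
  --master spark://<master>:7077 \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/etc/spark/log4j-syslog.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/etc/spark/log4j-syslog.properties" \
  <rest of the submit arguments>

# /etc/rsyslog.d/30-spark.conf
# Write that facility to one static file that Nagios can watch
local1.*    /var/log/spark/spark-apps.log

I have not verified this end to end, so treat the property names and paths above as a starting point rather than a recipe.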
What have others done in this case?
Is it YARN or Spark standalone? If you are using YARN, you can get all the consolidated logs with the command:
yarn logs -applicationId <application ID> <options>
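For example, to dump everything for one application into a single file (the application ID below is just a placeholder):

yarn logs -applicationId application_1452250000000_0001 > application.log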