Notes and ideas about Java, Scala, Big Data, NoSQL, Quality and Software Deploy
Keeping a Spark Job From Dying When Disconnecting From the Shell
Today I launched a Spark job that was taking too long to complete, and I forgot to start it through screen, so I needed to find a way to keep it running after disconnecting my terminal from the cluster.
$ spark-submit ....
14/08/29 23:57:32 INFO TaskSetManager: Starting task 1.0:3303 as TID 11603 on executor 0: ip-xxxxx.ec2.internal (PROCESS_LOCAL)
14/08/29 23:57:32 INFO TaskSetManager: Serialized task 1.0:3303 as 2721 bytes in 0 ms
14/08/29 23:57:32 INFO TaskSetManager: Finished TID 11596 in 7724 ms on ip-xxxxx.ec2.internal (progress: 3298/4150)
14/08/29 23:57:32 INFO DAGScheduler: Completed ShuffleMapTask(1, 3296)
Here I sent the job to the background by pressing CTRL+Z, which suspends (stops) the process:
[1]  + suspended  java
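CTRL+Z only suspends the process, so before disowning it I resume it in the background:

$ bg %1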
$ disown %1
Running disown with the %jobnumber removes the job from the shell's job table, so the shell won't send it SIGHUP (and kill it) when I leave the main terminal.
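As a quick sanity check (assuming a bash-like shell), jobs should no longer list the disowned job:

$ jobs
$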
After logging out, I opened another terminal to make sure everything worked as expected, and voilà :)
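From the new session, a ps is enough to confirm the driver is still running (this assumes the job was launched via spark-submit, as above):

$ ps aux | grep spark-submit

For next time, starting the job under nohup (or inside screen/tmux) avoids the whole rescue; a minimal sketch, redirecting the output to a log file of your choice:

$ nohup spark-submit .... > spark-job.log 2>&1 &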