Today I launched a Spark job that was taking too long to complete, and I forgot to start it through screen, so I needed to find a way to keep it running after disconnecting my terminal from the cluster.
$ spark-submit ....
14/08/29 23:57:32 INFO TaskSetManager: Starting task 1.0:3303 as TID 11603 on executor 0: ip-xxxxx.ec2.internal (PROCESS_LOCAL)
14/08/29 23:57:32 INFO TaskSetManager: Serialized task 1.0:3303 as 2721 bytes in 0 ms
14/08/29 23:57:32 INFO TaskSetManager: Finished TID 11596 in 7724 ms on ip-xxxxx.ec2.internal (progress: 3298/4150)
14/08/29 23:57:32 INFO DAGScheduler: Completed ShuffleMapTask(1, 3296)
First I pressed Ctrl+Z, which suspends the foreground job and returns the prompt (running bg afterwards resumes the suspended job in the background):
$ jobs
suspended  java
$ disown %1
Running disown %1 (where %1 is the job number reported by jobs) removes the job from the shell's job table, so it won't be sent the SIGHUP signal when I leave the main terminal.
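The whole rescue sequence can be sketched as below. This is a minimal reproduction, not the original session: sleep stands in for the long-running spark-submit, and the job number assumes this is the shell's only background job.

```shell
# Stand-in for the long-running spark-submit process
sleep 300 &
pid=$!

# In a real interactive session you would press Ctrl+Z to suspend the
# foreground job, then run `bg %1` to resume it in the background.

# Remove job 1 from the shell's job table so it is not sent SIGHUP
# when the terminal closes
disown %1

# Verify the process is still alive (kill -0 sends no signal, it only checks)
kill -0 "$pid" && echo "still running"

kill "$pid"   # clean up the stand-in
```

Note that disown only stops the shell from forwarding SIGHUP; a suspended job stays suspended, which is why the job must be resumed with bg before you log out.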
After logging out, I opened another terminal to make sure everything worked as expected, and voilà :)
$ ps aux | grep spark
spark     9643  1.6  6.6 2038688 506744 ?      Sl   19:33   4:25 /usr/lib/jvm/java-1.7.0/bin/java -cp :::/root/ephemeral-hdfs/conf:/root/spark/conf:/root/spark/lib/spark-assembly-1.0.1-hadoop1.0.4.jar:/root/spark/lib/datanucleus-api-jdo-3...
I could also check the Spark UI and see my job slowly progressing…
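For next time, the same effect can be had up front without screen by launching the job under nohup, which makes it ignore SIGHUP from the start. A small sketch (sleep again stands in for the real spark-submit command, and job.log is a placeholder name):

```shell
# Start the job already detached from the terminal: nohup makes it
# immune to SIGHUP, and redirecting output avoids the default nohup.out
nohup sleep 300 > job.log 2>&1 &
pid=$!

# Confirm it is running independently of this terminal
kill -0 "$pid" && echo "detached and running"

kill "$pid"; rm -f job.log   # clean up the stand-in
```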