Distributed computing using Celery, dispy, parallel python (pp), or something else?

You can think of WorkQueue as a layer on top of HTCondor, although it doesn't actually need HTCondor. Let me explain. HTCondor and Torque (mentioned below) are job schedulers and resource allocators. You tell those systems you have N resources, and when you submit a task or job, they play matchmaker and match your job's requirements to the resources available. For example, if you have 10 cores available in your cluster and your job needs 4, the batch scheduler will reserve 4 of those cores for your job.
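To make the matchmaking idea concrete, here is a toy sketch of the 10-core / 4-core example. `ToyScheduler`, `submit`, and `finish` are names I made up for illustration; this is not how HTCondor or Torque are implemented or driven, just the bookkeeping idea in a few lines.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScheduler:
    """Hypothetical stand-in for a batch scheduler (not a real HTCondor/Torque API)."""
    total_cores: int
    reserved: dict = field(default_factory=dict)  # job name -> cores held

    @property
    def free_cores(self) -> int:
        return self.total_cores - sum(self.reserved.values())

    def submit(self, job: str, cores_needed: int) -> bool:
        """Reserve cores for the job if enough are free; otherwise refuse."""
        if job in self.reserved:
            raise ValueError(f"job {job!r} is already running")
        if cores_needed > self.free_cores:
            return False  # a real scheduler would keep the job queued instead
        self.reserved[job] = cores_needed
        return True

    def finish(self, job: str) -> None:
        """Release the cores held by a finished job."""
        del self.reserved[job]


cluster = ToyScheduler(total_cores=10)
started = cluster.submit("my_job", cores_needed=4)
print(started, cluster.free_cores)  # prints: True 6
```

Real schedulers match on much more than core count (memory, architecture, priorities, and so on), but the reserve-then-release pattern is the part that matters for the rest of this answer.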

With WorkQueue, you have a master/worker model. The worker is just a static executable that you can deploy on any machine and in any manner you see fit. This means, for instance, that if you have access to an HTCondor cluster or a Torque cluster, you can submit WorkQueue workers as tasks. That said, if you don't have a batch scheduler, you can still use WorkQueue by launching the workers manually via, say, SSH, or on your local multi-core machine. Once the workers are running (regardless of how they were started), they connect to the master you write and receive commands to execute.
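If the master/worker split is hard to picture, here is a rough standard-library sketch of the pattern. To be clear, this is not the WorkQueue API: `worker` and `run_master` are hypothetical names, and I'm using `multiprocessing.connection` on the loopback interface with local threads as stand-in workers. The point is only that the master doesn't care how the workers were started, as long as they connect and ask for work.

```python
import subprocess
import sys
import threading
from multiprocessing.connection import Client, Listener, wait


def worker(address):
    """Connect to the master, run each command it sends, and report back."""
    with Client(address) as conn:
        while True:
            command = conn.recv()
            if command is None:  # sentinel: the master has no more work
                break
            done = subprocess.run(command, capture_output=True, text=True)
            conn.send((done.returncode, done.stdout.strip()))


def run_master(commands, n_workers=2):
    """Hand commands to whichever workers are idle and collect the results."""
    pending = list(commands)
    results = []
    # Port 0 lets the OS pick a free port; listener.address reports it.
    with Listener(("127.0.0.1", 0), backlog=n_workers) as listener:
        # Workers are local threads here purely for the demo. In practice they
        # could be started by a batch system, over SSH, or by hand.
        threads = [threading.Thread(target=worker, args=(listener.address,))
                   for _ in range(n_workers)]
        for t in threads:
            t.start()
        idle = [listener.accept() for _ in range(n_workers)]
        busy = []
        while pending or busy:
            # Give every idle worker a command while there is work left.
            while pending and idle:
                conn = idle.pop()
                conn.send(pending.pop(0))
                busy.append(conn)
            # Block until at least one busy worker has a result ready.
            for conn in wait(busy):
                results.append(conn.recv())
                busy.remove(conn)
                idle.append(conn)
        # All work is done, so every connection is idle: tell workers to exit.
        for conn in idle:
            conn.send(None)
            conn.close()
        for t in threads:
            t.join()
    return results


if __name__ == "__main__":
    jobs = [[sys.executable, "-c", f"print({n} * {n})"] for n in range(4)]
    print(sorted(run_master(jobs)))
```

Results come back in whatever order the workers finish, which is why the demo sorts them. WorkQueue itself handles a lot that this sketch ignores (file transfer, worker failures, and so on).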

This is really flexible since it allows you to take advantage of existing resources (HTCondor, Torque, multi-core, independent machines) AND mix them! For instance, in one project we had workers running in Amazon EC2, Windows Azure, an SGE cluster, and an HTCondor cluster.

So, to answer your question about setup: getting started with WorkQueue is very easy. You just write a master and start a worker. How that worker is deployed is up to you. You can use any batch system (HTCondor, Torque, SGE, SLURM, etc.) or none at all (SSH, local multi-core). Of course, if you want to automate and monitor things, having some sort of batch system would help, but you don't need one in order to use WorkQueue.

I hope that helps. I recommend skimming this paper: http://ccl.cse.nd.edu/research/papers/wq-python-pyhpc2011.pdf for more information about WorkQueue.

PS. HTCondor and Torque can seem daunting to set up, but if your needs and requirements are pretty basic, it is very much possible to get them up and running on your own. Of the two, I feel HTCondor is easier to set up and manage.

/r/Python Thread