Making it easier to run image analysis in the cloud: announcing Distributed-CellProfiler

December 28, 2016

Beth Cimini

There’s nothing more exciting than getting back a big batch of data from your automated microscope: finally, you have the results of your screen, your timelapse, or whatever you’ve spent the last weeks or months preparing.  That excitement can quickly turn to sadness, though, when you realize that neither your laptop nor the old general-use computer in the lab is up to analyzing thousands (or tens of thousands, or hundreds of thousands!) of images.  But congratulations!  Once you outgrow processing on a single local computer, you’ve reached an elite level of CellProfiler users.

Hopefully, your institution has access to a large server or cluster, along with an IT department that can help you get CellProfiler installed on it and your images processing at top speed.  If, sadly, that’s not true for you, we’ve been working on a tool that may help: Distributed-CellProfiler.

Distributed-CellProfiler takes advantage of Amazon Web Services (AWS), which allows you to upload and store files, rent computing power, and much more.  This means that once your images are uploaded to the cloud, you can run your analyses from anywhere and don’t need to buy or maintain any hardware of your own.  Full instructions on what you’ll need, how to get started, and how to use it are on our wiki, but we know you may have some questions:

  • Is this free?  AWS does have a free tier of resources, but if you’re working on this scale you’re likely going to have to pay something for storage and computing.  The good news is that you only pay for what you use and you can ‘bid’ how much you’re willing to pay for the computer time, so you should be able to find an option that works for your budget.  You’re also saving the money you would have had to spend on a big new computer or on paying into a local cluster, and there’s no upkeep time, fees, or hassle to worry about!
  • Won’t everyone see my data if I put it in the cloud?  Not at all!  You can configure your privacy settings however you like.
  • I’m not good at computers.  Will I be able to do this?  We think so!  You will have to install some things and work a bit from the command line, but we provide step-by-step instructions and helpful hints to get you started.  If you were able to learn your microscope’s software and how to make your CellProfiler pipeline, after investing a small amount of time you can definitely learn to do this too.
  • I have an idea for a cool addition to Distributed-CellProfiler.  What can I do?  Like everything else we make, Distributed-CellProfiler is free and open-source, so we welcome input and code contributions from the whole community.  Feel free to file a feature request or make your own fork of the code to add it yourself.  The more input we have from you, the better the software will become!
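
To make the “Is this free?” answer above a little more concrete, here is a back-of-the-envelope sketch of how you might estimate a bill before you start.  It is not part of Distributed-CellProfiler; the function name, its parameters, and every number in the example are hypothetical placeholders, not real AWS prices.  Plug in your own timings and the current rates for whatever machine type and storage you choose.

```python
def estimate_cloud_cost(n_images, seconds_per_image, cores_per_machine,
                        price_per_machine_hour, storage_gb,
                        price_per_gb_month, months_stored=1.0):
    """Rough cost estimate for a cloud analysis run.

    All inputs are values you supply yourself; nothing here looks up
    real prices. Assumes each core processes one image at a time and
    that the work divides evenly across cores.
    """
    # Total processing time if everything ran on a single core.
    core_hours = n_images * seconds_per_image / 3600.0
    # Spread that work across the cores of the machines you rent.
    machine_hours = core_hours / cores_per_machine
    compute_cost = machine_hours * price_per_machine_hour
    # Storage is billed per gigabyte for as long as the files stay uploaded.
    storage_cost = storage_gb * price_per_gb_month * months_stored
    return {
        "machine_hours": machine_hours,
        "compute_cost": compute_cost,
        "storage_cost": storage_cost,
        "total_cost": compute_cost + storage_cost,
    }


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    estimate = estimate_cloud_cost(
        n_images=100_000,
        seconds_per_image=36,
        cores_per_machine=4,
        price_per_machine_hour=0.10,  # hypothetical price you bid or pay
        storage_gb=500,
        price_per_gb_month=0.02,      # hypothetical storage rate
    )
    for name, value in estimate.items():
        print(f"{name}: {value:.2f}")
```

Since you choose the price you are willing to pay for computer time, trying a few values of `price_per_machine_hour` is a quick way to see how much your bid affects the total.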

What else would you like to know?  Are there other ways you’ve found to process big image sets?