Python on Cloud Foundry
By Ian
I’m very happy to be giving a talk at the latest PyData conference in New York this weekend.
This is a long post, but I wanted a place to collect all the code I am showing in my talk and to provide a few more resources for those interested in trying out Python on Cloud Foundry further.
Resources
- Simple Flask app
- Slides from the talk
- My new Cloud Foundry buildpack using conda
- Video of Flask app deployment
- Update: Video of PyData NY talk
What is Cloud Foundry?
My talk is about how to use Python and the PyData stack on Cloud Foundry, the open source cloud platform. Cloud Foundry started life at VMware, and development transferred to Pivotal when it was formed. Cloud Foundry has grown much bigger since then, with over 30 companies joining together to form the Cloud Foundry Foundation, which will guide the development of the open source project.
On one level, Cloud Foundry is a simple way to deploy and scale cloud-based web applications. Instead of a complicated process to set up a host, install a web server, configure a load balancer and so on, Cloud Foundry does all this for you, letting you concentrate on your application rather than the scaffolding around it.
On another level Cloud Foundry provides protection against cloud lock-in, where your application deployment process is so tied in to Amazon Web Services, Google Compute Engine or another provider that you can’t easily move your applications if you want to. In addition, Cloud Foundry lets you build on-site private clouds and your apps will never know the difference compared to a hosted public cloud installation.
Cloud Foundry for Data Scientists
As a data scientist, I tend not to want to get involved in setting up or maintaining systems. Cloud Foundry has given me a really simple way to write, deploy and iterate on web apps that display results, process incoming data or bind to existing data stores. More on that later.
Quick Howto
- Deploy application in the current directory
cf push myapp
- Scale up and out quickly
cf scale myapp -i 5 -m 1G
- Create and bind services
cf bind-service myapp redis
Python on Cloud Foundry
Python is a first-class language on Cloud Foundry, and standard Python web apps can be auto-detected and built. Cloud Foundry deploys applications in containers and uses buildpacks to install the runtime (a Python interpreter in the case of Python apps) and any dependencies for the application, and then to launch the app.
The official Cloud Foundry Python buildpack uses pip to install dependencies and is simple to use with (non-PyData) Python web applications. (Update: The official buildpack now includes conda support as of v1.5.6.)
A simple Flask web app like this one can be deployed using the Cloud Foundry command line interface with cf push, and Cloud Foundry will make sure to install Python, the Flask package and all its dependencies before starting the server with the command in the Procfile.
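As a rough sketch of what such an app needs in order to start cleanly on the platform, the example below reads its listening port from the environment. It assumes the platform supplies the port in a PORT environment variable, and it uses a tiny standard-library WSGI callable as a stand-in for a Flask application object so that the sketch has no third-party dependencies. The names listen_port, app and serve are hypothetical and not taken from the app linked above.

```python
import os
from wsgiref.simple_server import make_server


def listen_port(environ=os.environ, default=5000):
    """Return the port to listen on, falling back to a local default."""
    # Assumption: the platform provides the port in the PORT variable.
    return int(environ.get("PORT", default))


def app(environ, start_response):
    """Minimal WSGI app standing in for a Flask application object."""
    body = b"Hello from Cloud Foundry\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def serve():
    """Start the server; a Procfile command would end up calling this."""
    # Bind to all interfaces so the platform's router can reach the app.
    make_server("0.0.0.0", listen_port(), app).serve_forever()
```

With Flask itself, the same idea would typically look like app.run(host="0.0.0.0", port=listen_port()), started from a Procfile line such as "web: python hello.py" (the file name here is only an illustration).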
Here’s a video of the whole process:
Data services
The ephemeral nature of cloud applications means that you cannot rely on the local storage of the container to persist your data. (See the rules for 12 Factor Apps for more info.) Of course, in the era of big data you probably need to have a distributed data store in the first place.
You can use Cloud Foundry services to make setting up and connecting with data storage and processing systems simple. Here’s an example using the RedisCloud service to create a Redis store.
- Create a service from the command line
cf create-service rediscloud PLAN_NAME INSTANCE_NAME
- Bind the service to your app
cf bind-service APP_NAME INSTANCE_NAME
Your application should look in the VCAP_SERVICES environment variable to find details of the services available to it in JSON format:
{ "rediscloud": [
    { "name": "rediscloud-42", "label": "rediscloud", "plan": "20mb",
      "credentials": { "port": "6379",
                       "hostname": "pub-redis-6379.us-east-1-2.3.ec2.redislabs.com",
                       "password": "your_redis_password" } } ]
}
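Here is a minimal sketch of how an app might pull the Redis credentials out of that JSON. It assumes the variable is named VCAP_SERVICES (the name Cloud Foundry uses) and that the structure matches the example above. The function name redis_credentials and its parameters are hypothetical.

```python
import json
import os


def redis_credentials(environ=os.environ, label="rediscloud"):
    """Return the credentials of the first bound service with this label.

    Returns None when no such service is bound, e.g. when running locally.
    """
    # VCAP_SERVICES holds a JSON object keyed by service label.
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    instances = services.get(label, [])
    if not instances:
        return None
    return instances[0]["credentials"]
```

The returned dictionary can then be handed to a Redis client. With the redis package, for example, that would be something like redis.Redis(host=creds["hostname"], port=int(creds["port"]), password=creds["password"]). Note that the port arrives as a string in this example and needs converting.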
There are many different data services available for both hosted and packaged instances of Cloud Foundry.
PyData stack on Cloud Foundry
The current Python buildpack uses pip to install dependencies. Anyone who has tried to install NumPy or SciPy using pip knows that the process can be lengthy and painful, often requiring manual intervention to correct library paths and install Fortran compilers.
Fortunately Continuum Analytics' conda package manager was created to solve these problems by packaging and distributing the standard tools of the Python data stack in compiled binaries.
I wanted to build web apps on Cloud Foundry using the PyData stack so, with help from a colleague, I have written a Cloud Foundry buildpack which uses both conda and pip to install required packages. (Update: the official Python buildpack now includes conda support, so you don’t need to use my now-deprecated buildpack.)
You can specify packages to be installed by conda in the conda_requirements.txt file. These will be installed first, followed by the packages in requirements.txt, which will be installed as usual by pip.
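The ordering described here can be sketched as a small function that builds the install commands for an app directory. This is only an illustration of the conda-then-pip order, not the buildpack's actual code. The function name install_plan is hypothetical, and the commands are returned as lists rather than executed.

```python
from pathlib import Path


def install_plan(app_dir):
    """Return install commands in order: conda packages first, then pip."""
    app_dir = Path(app_dir)
    conda_file = app_dir / "conda_requirements.txt"
    pip_file = app_dir / "requirements.txt"

    plan = []
    if conda_file.exists():
        # Binary packages such as NumPy and SciPy come from conda.
        plan.append(["conda", "install", "--yes", "--file", str(conda_file)])
    if pip_file.exists():
        # Everything else is installed by pip as usual.
        plan.append(["pip", "install", "-r", str(pip_file)])
    return plan
```

An app with only a requirements.txt would get just the pip step, which matches how a plain (non-PyData) Python app is handled.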
As an example of a PyData web app, Adam Hajari has created an R Shiny equivalent called Spyre. This can be easily deployed to Cloud Foundry by specifying the conda and pip requirements as described above. If you want to try it yourself, I’ve put together this gist with the simple sine wave example from Adam’s notes.
Summary
Why is Cloud Foundry useful for data scientists? Being able to forget about server provisioning and configuration, and to concentrate instead on creating compelling visualisations and data-driven apps, is a welcome step forward for my workflow.
If you want to try out Cloud Foundry for yourself, there are a number of hosted options available, one of which is Pivotal Web Services (which is run by Pivotal, my employer). There is a six-week introductory trial available, and you can estimate your monthly running costs after that.
Further Reading
If you want to learn more about how to use Cloud Foundry or how to write cloud ready applications here are a few links to get you started:
- Cloud Foundry Docs
- Cloud Foundry Docs: App Design for the Cloud
- Flask Mega Meta Tutorial for Data Science
- Twelve Factor Apps
- Cloud Foundry Meetups