Posted by: John Bresnahan | March 26, 2013

The Cost Of Client Side Image Downloads On the CPU

Introduction

In a previous post I discussed the importance of managing the resource consumption of large file transfers.  Here I illustrate one of the less often considered resources involved: the CPU.  The NIC (and the network in general) is always thought of as a consumable resource in a data transfer.  To a lesser extent disk bandwidth is considered, and on occasion the system bus as well.  However, the effects on the CPU tend to be underestimated.

Nova-compute is co-located with the hypervisor and shares the CPUs that run user virtual machines (VMs), so the effect of the OpenStack services on the CPU is important to consider.  The degree to which nova-compute is itself a noisy neighbor to user VMs should be minimized, or at least understood.  Currently nova-compute downloads images by importing the python-glanceclient module and calling into it, which performs the transfer over HTTP or HTTPS.  Let’s take a look at the cost to the CPU of such a download.

The Experiment

I used the following setup to study this problem:

  • A Lenovo T530 laptop with 4 cores and 8GB RAM
  • A VM with 2GB of memory and 2 cores running on the above laptop
  • Fedora 18 installed on the VM
  • DevStack running a Glance server
  • A 2GB image (just a file I made from /dev/urandom)

I uploaded the image to the Glance service.  I then downloaded the image with two different clients:

  1. curl: This shows the results from a client without the overhead of Python, and thus provides a baseline for the best-case situation.
  2. python-glanceclient: This provides a much more realistic look at the load that nova-compute introduces on a hypervisor.

The download trials were performed with both HTTP and HTTPS over the loopback interface.  The data was written to /dev/null, thus eliminating any file system overhead.  The CPU load was measured in two different ways.  First I used GNU time 1.7, which gave an overall summary of the resources consumed.  Then, to show the load over time, I used top (the exact command was top -b -d 1 -p <pid>).
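
For readers who want to reproduce the summary measurement without GNU time, here is a minimal Python sketch.  It assumes a Unix-like system (it relies on the standard resource module), and the function name measure_cpu_load is hypothetical, not something from the original experiment.  It runs a command to completion and reports CPU load as user plus system CPU seconds divided by elapsed wall-clock time, which is the ratio GNU time reports as its CPU percentage.

```python
import resource
import subprocess
import sys
import time


def measure_cpu_load(argv):
    """Run argv to completion and report elapsed time and CPU load.

    CPU load is (user + system CPU seconds) / elapsed seconds,
    expressed as a percentage of one CPU.
    """
    # RUSAGE_CHILDREN accumulates usage of child processes that have
    # been waited for, so take a snapshot before and after the run.
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    elapsed = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    cpu_seconds = user + system
    return {
        "elapsed": elapsed,
        "cpu_seconds": cpu_seconds,
        "cpu_load_percent": 100.0 * cpu_seconds / elapsed,
    }
```

For example, measure_cpu_load(["curl", "-s", "-o", "/dev/null", url]) would time a curl download, where url stands for whatever image URL your own Glance endpoint exposes.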

Results

CPU Load Measured From GNU Time

Client   Protocol   Time     CPU Load
curl     HTTP        9.61s   12.00%
curl     HTTPS      35.65s   89.00%
glance   HTTP        9.57s   66.00%
glance   HTTPS      39.98s   90.00%
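
To make the table easier to compare, the elapsed time and CPU load can be combined into total CPU-seconds per download, along with the effective throughput.  The sketch below does only that arithmetic on the figures above; the helper name summarize is hypothetical, and it assumes the 2GB image is 2048MB.

```python
# (client, protocol, elapsed seconds, CPU load as a fraction of one CPU),
# copied from the GNU time table.
RESULTS = [
    ("curl", "HTTP", 9.61, 0.12),
    ("curl", "HTTPS", 35.65, 0.89),
    ("glance", "HTTP", 9.57, 0.66),
    ("glance", "HTTPS", 39.98, 0.90),
]

# Assumption: the 2GB image is 2 * 1024 MB.
IMAGE_MB = 2 * 1024


def summarize(results, image_mb=IMAGE_MB):
    """Derive CPU-seconds and throughput for each measured download."""
    rows = []
    for client, protocol, elapsed, load in results:
        rows.append({
            "client": client,
            "protocol": protocol,
            # CPU load is CPU time / elapsed time, so multiply back.
            "cpu_seconds": elapsed * load,
            "throughput_mb_s": image_mb / elapsed,
        })
    return rows


for row in summarize(RESULTS):
    print("%-6s %-5s %6.2f CPU-s %7.1f MB/s" % (
        row["client"], row["protocol"],
        row["cpu_seconds"], row["throughput_mb_s"]))
```

By this arithmetic the glance HTTP download costs roughly 6.3 CPU-seconds against roughly 1.2 for curl, even though both finish in about the same wall-clock time.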

CPU Load Over Time From top

[Figure: CPU load over time during the curl downloads]

[Figure: CPU load over time during the glance download]

Conclusions

The results above show that downloading images introduces substantial processing overhead, especially once the expense of SSL is added.  They also show that even without SSL, an HTTP download can be costly on the CPU when the client is Python code: glance over HTTP took about as long as curl but used 66% of a CPU against curl's 12%.  While the load lasts only a short time, in the case of SSL it lasts long enough that it should be managed.  Further, the problem is exacerbated when many images are downloaded at the same time (which is likely to happen as machines like the SM15000 are considered for use with OpenStack).
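
As a back-of-the-envelope illustration of the concurrency concern, the sketch below estimates how many simultaneous downloads it would take to saturate a given number of cores.  It rests on a loud assumption that was not measured here: that each download's CPU load stays the same when several run at once, so the loads simply add up.  Both function names are hypothetical.

```python
def cores_consumed(concurrent_downloads, load_per_download):
    """Cores kept busy, assuming per-download load adds up linearly."""
    return concurrent_downloads * load_per_download


def downloads_to_saturate(cores, load_per_download):
    """Concurrent downloads needed to keep all cores fully busy."""
    return cores / load_per_download


# glance over HTTPS measured at 90% of one CPU in the table.
GLANCE_HTTPS_LOAD = 0.90

# On the 2-core test VM, a little over two concurrent downloads would
# be enough to occupy both cores under this linear assumption.
print(downloads_to_saturate(2, GLANCE_HTTPS_LOAD))
```

Real contention will not be this tidy, but the estimate suggests that a handful of concurrent HTTPS downloads could compete seriously with user VMs on a small node.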

Responses

  1. Thanks for the post, which suggests we need to change how nova-compute fetches images from Glance.

