Posted by: John Bresnahan | March 7, 2014

OpenStack and Docker

Introduction

I recently tried to set up OpenStack with docker as the hypervisor on a single node and I ran into mountains of trouble.  I tried with DevStack and entirely failed using both the master branch and stable/havana.  After much work I was able to launch a container, but the network was not right.  Ultimately I found a path that worked.  This post explains how I did it.

Create the base image

CentOS 6.5

The first step is to have a VM that can support this.  Because I was using RDO this needed to be a Red Hat derivative.  I originally chose a stock vagrant CentOS 6.5 VM.  I got everything set up and then ran out of disk space (many bad words were said).  Thus I used packer and the templates here to create a CentOS VM with 40GB of disk space.  I had to change the “disk_size” value under “builders” to something larger than 40000. Then I ran the build.

packer build template.json
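
For reference, the "disk_size" change sits in the first entry of the template's "builders" list.  It looks roughly like this (the builder type and the other fields come from whatever stock template you start with, so treat this as a sketch rather than a complete builder definition):

"builders": [
  {
    "type": "virtualbox-iso",
    "disk_size": 40960,
    ...
  }
]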

When this completed I had a centos-6.5 vagrant box ready to boot.

Vagrant

I wanted to manage this VM with Vagrant, and because OpenStack is fairly intolerant of HOST_IP changes I had to inject an interface with a static IP address.  Below is the Vagrant file I used:

Vagrant.configure("2") do |config|
  config.vm.box = "centos-6.5-base"
  config.vm.network :private_network, ip: "172.16.129.26"
  ES_OS_MEM = 3072
  ES_OS_CPUS = 1
  config.vm.hostname = "rdodocker"
  config.vm.provider :virtualbox do |vb|
    vb.customize ["modifyvm", :id, "--memory", ES_OS_MEM]
    vb.customize ["modifyvm", :id, "--cpus", ES_OS_CPUS]
  end
end

After running vagrant up to boot this VM I got the following error:

The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!
/sbin/ifdown eth1 2> /dev/null
Stdout from the command:

Stderr from the command:

Thankfully this was easily solved.  I sshed into the VM with vagrant ssh, and then ran the following:

cat <<EOM | sudo tee /etc/sysconfig/network-scripts/ifcfg-eth1 >/dev/null
DEVICE="eth1"
ONBOOT="yes"
TYPE="Ethernet"
EOM

After that I exited from the ssh session and repackaged the VM with vagrant package --output centos-6.5-goodnet.box.  I added the new box to vagrant and altered my Vagrant file to boot it.  I now had a base image on which I could install OpenStack.
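
Spelled out, that repackaging sequence looks something like the following (the box name is arbitrary; the Vagrant file edit is just pointing config.vm.box at the new box):

vagrant package --output centos-6.5-goodnet.box
vagrant box add centos-6.5-goodnet centos-6.5-goodnet.box
# point config.vm.box at "centos-6.5-goodnet" in the Vagrant file, then:
vagrant up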

RDO

Through much trial and error I came to the conclusion that I needed the icehouse development release of RDO.  Unfortunately this alone was not enough to properly handle docker.  I also had to install nova from the master branch into a python virtualenv and reconfigure the box to use that nova code.  This section has the specifics of what I did.

RDO Install

I followed the instructions for installing RDO that are here, only instead of running packstack --allinone I used a custom answer file.  I generated a template answer file with the command packstack --gen-answer-file=~/answers.txt.  Then I opened that file and replaced every IP address with the IP address that vagrant was injecting into my VM (in my case this was 172.16.129.26).  I also set the following:

CONFIG_NEUTRON_INSTALL=n

This is very important.  The docker driver does not work with neutron (I learned this the hard way).  I then installed RDO with the command packstack --answer-file=answers.txt.
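
As a concrete sketch, the IP substitution and the install boil down to something like this (the placeholder stands for whatever address packstack detected when it generated the answer file):

# run from the directory containing answers.txt
sed -i 's/<address packstack generated>/172.16.129.26/g' answers.txt
packstack --answer-file=answers.txt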

Docker

Once RDO was installed and working (without docker) I set up the VM such that nova would use docker.  The instructions here are basically what I followed.  Here is my command set:

sudo yum -y install docker-io
sudo service docker start
sudo chkconfig docker on
sudo yum -y install docker-registry
sudo usermod -G docker nova
sudo service redis start
sudo chkconfig redis on
sudo service docker-registry start
sudo chkconfig docker-registry on

I edited the file /etc/sysconfig/docker-registry and added the following:

export SETTINGS_FLAVOR=openstack
export REGISTRY_PORT=5042 
. /root/keystonerc_admin 
export OS_GLANCE_URL=http://172.16.129.26:9292

Note that some of the values in that file were already set.  I removed those entries.

OpenStack for Docker Configuration

I changed this entry in /etc/nova/nova.conf:

compute_driver = docker.DockerDriver

and I set this value (uncommenting it) in /etc/glance/glance-api.conf:

container_formats = ami,ari,aki,bare,ovf,docker

Hand Rolled Nova

Unfortunately docker does not work with the current release of RDO icehouse.  Therefore I had to get the latest code from the master branch of nova.  Further, I had to install it; to be safe I put it in its own Python virtualenv.  In order to install nova like this a lot of dependencies must be installed.  Here is the yum command I used to install what I needed (and in some cases just wanted).

yum update
yum install -y telnet git libxslt-devel libffi-devel python-virtualenv mysql-devel

Then I installed nova and its package specific dependencies into a virtualenv that I created.  The command sequence is below:

git clone https://github.com/openstack/nova.git
cd nova
virtualenv --no-site-packages /usr/local/OPENSTACKVE
source /usr/local/OPENSTACKVE/bin/activate
pip install -r requirements.txt
pip install qpid-python
pip install mysql-python
python setup.py install

At this point I had an updated version of the nova software but it was running against an old version of the database.  Thus I had to run:

nova-manage db sync

The final step was to change all of the nova startup scripts to point to the code in the virtualenv instead of the code installed to the system.  I did this by opening up every file at /etc/init.d/openstack-nova-* and changing the exec="/usr/bin/nova-$suffix" line to exec="/usr/local/OPENSTACKVE/bin/nova-$suffix".  I then rebooted the VM and I was FINALLY all set to launch docker containers that I could ssh into!
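
If the exec= lines in those scripts look exactly as above, the edit can be made in one shot with something along these lines:

sudo sed -i 's|exec="/usr/bin/nova-$suffix"|exec="/usr/local/OPENSTACKVE/bin/nova-$suffix"|' /etc/init.d/openstack-nova-*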

Posted by: John Bresnahan | August 26, 2013

HTTP GET-outta here!

In the upcoming Havana release of OpenStack, virtual machine images can be downloaded a lot more efficiently.  This post will explain how to configure Glance and Nova so that VM images can be copied directly via the file system instead of routing all of the data over HTTP.

Historically How An Image Becomes a Law

Often times the architecture of an OpenStack deployment looks like the following:

GlanceNovaSameDisk

In the above, Glance and Nova-compute are both backed by the same file system.  Glance stores VM images available for boot on the same file system onto which Nova-compute downloads these images and uses them as an instance store.  Even though the file system is the same, in previous releases of OpenStack a lot of unnecessary work had to be done to retrieve the image.  The following steps were needed:

  1. Glance opened the file on disk and read it into user space memory.
  2. Glance marshaled the data in the image into the HTTP protocol.
  3. The data was sent through the TCP stack.
  4. The data was received through the TCP stack.
  5. Nova-compute marshaled the HTTP data into memory buffers.
  6. Nova-compute sent the data buffers back to the file system.

That is a lot of unneeded memory copies and processing.  If HTTPS is used the process is even more laborious.

Direct Copy

In the Havana release a couple of patches have made it so the file can be directly accessed, and thus all of the HTTP protocol processing is skipped.  The specifics involved with these patches will not otherwise be discussed in this post.

Setting Up Glance

In order to make this process work the first thing that must be done is to describe the Glance file system store in a JSON document.  There are two mandatory pieces of information that must be determined.

  1. The mount point.  This is simply the point at which the associated file system is mounted; the command line tool df can help you determine this.
  2. A unique ID.  This ID is what Glance and Nova use to determine that they are talking about the same file system.  It can be anything you like; the only requirement is that it must match what is given to Nova (described later).  You can use uuidgen to generate this value (see the example commands below).
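
For example, something like the following gathers both values (the path is only an illustration; use whatever directory actually backs your Glance file system store):

df /var/lib/glance/images     # shows the mount point backing the store
uuidgen                       # generates a suitable unique ID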

Once you have this information create a file that looks like the following:
{
"id": "b9d69795-5951-4cb0-bb5c-29491e1e2daf",
"mountpoint": "/"
}

Now edit your glance-api.conf file and make the following changes:
show_multiple_locations = True
filesystem_store_metadata_file = <path to the JSON file created above>

This tells Glance to expose direct URLs to clients and to associate the metadata described in your JSON file with all URLs that come from the file system store.

Note that this metadata will only apply to new images.  Anything that was in the store prior to this configuration change will not have the associated metadata.

Setting Up Nova Compute

Now we must configure Nova Compute to make use of the new information that Glance is exposing.  Edit nova.conf and add the following values:


allowed_direct_url_schemes=file

[image_file_url]
filesystems=myfs

[image_file_url:myfs]
id=b9d69795-5951-4cb0-bb5c-29491e1e2daf
mountpoint=/

This tells Nova to use direct URLs if they have the file:// scheme and they are advertised with the id b9d69795-5951-4cb0-bb5c-29491e1e2daf.  It also sets where Nova has this file system mounted.  This may be different from where Glance has it mounted; if so, the correct path will be calculated.  For example, if Glance has the file system mounted at /mnt/gluster and Nova has it mounted at /opt/gluster, the paths with which each accesses the file will be different.  However, because Glance tells Nova where it has it mounted, Nova can construct the correct path.

Verify the Configuration

To verify that things are set up correctly do the following:

Add a new image to Glance

$ wget http://cloud.fedoraproject.org/fedora-19.x86_64.qcow2
$ glance image-create --file fedora-19.x86_64.qcow2 --name fedora-19.x86_64.qcow2 --disk-format qcow2 --container-format bare --progress
[=============================>] 100%
+------------------+--------------------------------------+
| Property         | Value                                |
+------------------+--------------------------------------+
| checksum         | 9ff360edd3b3f1fc035205f63a58ec3e     |
| container_format | bare                                 |
| created_at       | 2013-08-26T20:23:12                  |
| deleted          | False                                |
| deleted_at       | None                                 |
| disk_format      | qcow2                                |
| id               | 461bf150-4d41-47be-967f-3b4dbafd7fa5 |
| is_public        | False                                |
| min_disk         | 0                                    |
| min_ram          | 0                                    |
| name             | fedora-19.x86_64.qcow2               |
| owner            | 96bd6038d1e4404e83ad12108cad7029     |
| protected        | False                                |
| size             | 237371392                            |
| status           | active                               |
| updated_at       | 2013-08-26T20:23:16                  |
+------------------+--------------------------------------+

Verify that Glance is exposing the direct URL information

$ glance --os-image-api-version 2 image-show 461bf150-4d41-47be-967f-3b4dbafd7fa5

...
| id | 461bf150-4d41-47be-967f-3b4dbafd7fa5 |
| locations | [{u'url': u'file:///opt/stack/data/glance/images/461bf150-4d41-47be-967f-3b4dbafd7fa5', u'metadata': {u'mountpoint': u'/', u'id': u'b9d69795-5951-4cb0-bb5c-29491e1e2daf'}}] |
...

Make sure that the metadata above matches what you put in the JSON file.

Boot the image

$ nova boot --image 461bf150-4d41-47be-967f-3b4dbafd7fa5 --flavor 2 testcopy
+--------------------------------------+--------------------------------------+
| Property                             | Value                                |
+--------------------------------------+--------------------------------------+
| OS-EXT-STS:task_state                | scheduling                           |
| image                                | fedora-19.x86_64.qcow2               |
| OS-EXT-STS:vm_state                  | building                             |
.....

Verify the copy in logs

The image should successfully boot.  However, we not only want to know that it booted, we also want to know that it propagated with a direct file copy.  The easiest way to verify this is by looking at nova-compute's logs.  Look for the string Successfully transferred using file.  Lines like the following should be found:


2013-08-26 19:39:38.170 INFO nova.image.download.file [-] Copied /opt/stack/data/glance/images/70746c77-5625-41ff-a3f8-a3dfb35d33e5 using <nova.image.download.file.FileTransfer object at 0x416f410>
2013-08-26 19:39:38.172 INFO nova.image.glance [req-20b0ce76-1f70-482b-a72b-382621e9c8f9 admin admin] Successfully transferred using file

If these lines are found, then congratulations, you just made your system more efficient.

Posted by: John Bresnahan | August 14, 2013

Preparing Fedora 19 for OpenStack Glance Development

For development of Glance I have recently been using the publicly available Fedora 19 VM which RDO made available here.  From time to time I find that I need to boot the VM clean (e.g. verifying that my environment is not influencing recent changes).  In this brief post I will describe how to prepare that clean VM instance with all the dependencies needed for Glance development.

Installing the Dependencies

Run the following to install all the base deps:

sudo yum update -y
sudo yum install git vim gcc postgresql-devel mariadb-devel python-virtualenv libffi-devel libxslt-devel

Install Glance Into A Virtualenv

Now run the following to create a python virtual environment and install glance and its deps into it.

git clone git://github.com/openstack/glance.git
virtualenv --no-site-packages VE
source VE/bin/activate
cd glance/
pip install -r requirements.txt
pip install -r test-requirements.txt
python setup.py develop
./run_tests.sh -N

At this point you have a configured python virtualenv with Glance installed into it and you are ready to start developing Glance!

Posted by: John Bresnahan | August 7, 2013

Download Modules In Nova

A patch has recently been accepted into Nova that allows images to be downloaded in user-defined ways.  In the past images could only be fetched via the Glance REST API, but now it is possible to fetch images via customized protocols (think bittorrent and direct access to swift).

Loading Modules

In nova.conf there is the option allowed_direct_url_schemes. This is a list of strings that tell nova what download modules should be imported at load time.

Download modules are loaded via stevedore.  Each download module must be registered with Python as an entry point.  The name space must be nova.image.download.modules and the name must match the strings in the allowed_direct_url_schemes list.

When Nova-compute starts it will walk the list of values in allowed_direct_url_schemes and for each one it will ask stevedore to return the module associated with that name under the nova.image.download.modules name space.  Nova will then make a call into that module (<module>.get_schemes()) to get a list of URL schemes for which the module can be used.  The module will then be associated with each of the strings returned by get_schemes() in a look-up table.

Later, when Nova attempts to download an image it will ask Glance for a list of direct URLs where that image can be found.  It will then walk that list to see if any of the URL schemes can be found in the download module look-up table.  If so it will be used for that download.

A File Example

For example, here is the setup.cfg for the file download module:

[entry_points]
nova.image.download.modules =
file = nova.image.download.file

This sets up the module nova.image.download.file as an entry point.

When allowed_direct_url_schemes = file Nova will ask stevedore for the nova.image.download.modules:file entry point.  Stevedore will return the module nova.image.download.file.  Nova will then call nova.image.download.file.get_schemes().  This call will return the list of strings file and filesystem.  Two entries will then be added to the download module look-up table:

download_modules['file'] = nova.image.download.file
download_modules['filesystem'] = nova.image.download.file

When booting, if Glance returns a direct URL of file:///var/lib/glance/images/fedora19.qcow2 Nova will look up file in the download_modules table and thereby get the nova.image.download.file module for use in the download (I feel like I have said download too many times.  download).

The Interface

Download modules are python modules that must have the following interface functions:

def get_download_hander(**kwargs):
    returns a nova.image.download.base.TransferBase
def get_schemes():
    returns a list of strings representing schemes

The meat of the work is done by an implementation of nova.image.download.base.TransferBase.  This object only needs to have a single method:

download(self, url_parts, dst_path, metadata, **kwargs):

It is called with the URL to download (already parsed with urlparse), the path to the local destination file, and metadata describing the transfer (this last one will be described in a later post).  When it returns Nova will assume that the data has been downloaded to dst_path.  If anything goes wrong one of the following errors should be raised:

nova.exception.ImageDownloadModuleError
nova.exception.ImageDownloadModuleLoadError
nova.exception.ImageDownloadModuleMetaDataError
nova.exception.ImageDownloadModuleNotImplementedError
nova.exception.ImageDownloadModuleConfigurationError
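
To tie the interface together, a minimal, hypothetical download module might look like the sketch below.  The module layout and class name here are made up; only the function names, the download() signature, and the base class location come from the description above.

import shutil

from nova.image.download import base


class LocalCopyTransfer(base.TransferBase):
    """Toy transfer handler that copies an image from a locally mounted path."""

    def download(self, url_parts, dst_path, metadata, **kwargs):
        # url_parts has already been run through urlparse, so the source
        # path is simply its path component.  A real module would validate
        # the metadata and raise ImageDownloadModuleError on failure.
        shutil.copyfile(url_parts.path, dst_path)


def get_download_hander(**kwargs):
    # Name spelled as in the interface description above.
    return LocalCopyTransfer()


def get_schemes():
    # The URL schemes this module will be registered under.
    return ['file', 'filesystem']

Registering such a module is then just a matter of exposing it under the nova.image.download.modules entry point name space (as in the setup.cfg example above) and giving it an entry point name that appears in allowed_direct_url_schemes.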

Posted by: John Bresnahan | July 29, 2013

RDO on Fedora 19

A few months back I installed RDO on Fedora 18 and it just worked!  Unfortunately I ran into a few blockers when trying to install it on Fedora 19.  In this post I will describe my experience setting up RDO on Fedora 19 and how I made it work.

Cheat-Sheet

Here is the set of commands, in order, that I used to install RDO on Fedora 19.  An explanation of why each was done can be found in the later sections of the post.

  1. yum update -y
  2. yum install -y http://repos.fedorapeople.org/repos/openstack/openstack-havana/rdo-release-havana-2.noarch.rpm
  3. yum install -y python-django14
  4. yum install -y ruby
  5. yum install -y openstack-packstack
  6. Fix nova_compute.pp
    1. Open /usr/lib/python2.7/site-packages/packstack/puppet/templates/nova_compute.pp
    2. Delete lines 40 – 45 (6 lines total)
  7. adduser jbresnah
  8. passwd jbresnah
  9. usermod -G wheel jbresnah
  10. su - jbresnah
  11. packstack --allinone --os-quantum-install=n
  12. sed -i 's/cluster-mechanism/ha-mechanism/' /etc/qpidd.conf
  13. reboot

RDO

RDO is Red Hat's community distribution of OpenStack.  The quick start for installing it can be found here.  The documentation there will be the most authoritative source as it will be kept up to date and this blog post will likely not.  That said, these are the steps I had to do in order to get the present day version of Fedora 19 to work with the current packstack.

Update the system, install initial dependencies and create a user

  1. yum update -y
  2. yum install -y http://repos.fedorapeople.org/repos/openstack/openstack-havana/rdo-release-havana-2.noarch.rpm

Note that step #2 includes the correct path to the Havana release, which is different from the RDO instructions, which point at the Grizzly release.

Bugs

Bug 1: puppet/util/command_line

Before continuing with the installation instructions note that there are a couple of bugs to work around.  The first manifests itself with the following error message:


2013-07-28 07:09:29::ERROR::ospluginutils::144::root:: LoadError: no such file to load -- puppet/util/command_line
require at org/jruby/RubyKernel.java:1027
require at /usr/share/rubygems/rubygems/core_ext/kernel_require.rb:51
(root) at /usr/bin/puppet:3

Details on this bug can be found here: 961915 and 986154.

The work around is easy, simply install ruby with the following command:

yum install -y ruby

Bug 2: kvm.modules

The second bug results in this error:
ERROR : Error during puppet run : Error: /bin/sh: /etc/sysconfig/modules/kvm.modules: No such file or directory

Details on it can be found here.  To work around it do the following as root:

  1. Open /usr/lib/python2.7/site-packages/packstack/puppet/templates/nova_compute.pp
  2. Delete lines 40 – 45 (6 lines total)

Bug 3: django14

Another bug detailed here requires that you explicitly install django14.
yum install python-django14

Bug 4: qpidd

The file /etc/qpidd.conf has an invalid value in it.  To correct this run the following:

sed -i 's/cluster-mechanism/ha-mechanism/' /etc/qpidd.conf

Install OpenStack with packstack

At this point we are ready to install with packstack.  You need to run packstack as a non-root user:


su - jbresnah
packstack --allinone --os-quantum-install=n

Once this completes OpenStack should be installed and all set.  From there proceed with the instructions in the RDO Quick Start starting here.

Note: Running RDO in a Fedora 19 VM on Fedora 19

I tried this out in a VM and had some trouble.  My host OS is Fedora 19 and the VM in which I was installing RDO was also Fedora 19.  The problem is that on both the host and the guest the IP address for the network bridge was the same: 192.168.122.1.  This ultimately caused a failure in packstack:

ERROR : Error during puppet run : Error: /usr/sbin/tuned-adm profile virtual-host returned 2 instead of one of [0]

The solution is to edit the file (on the guest VM) /etc/libvirt/qemu/networks/default.xml and change the IP address and DHCP range to a different address.  I used the following:

<network>
  <name>default</name>
  <uuid>e981cfed-949b-4763-be0b-3e5a91701882</uuid>
  <bridge name="virbr0" />
  <mac address='52:54:00:28:8E:A9'/>
  <forward/>
  <ip address="192.168.122.1" netmask="255.255.255.0">
    <dhcp>
      <range start="192.168.122.2" end="192.168.122.254" />
    </dhcp>
  </ip>
</network>
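
Note that libvirt will not pick up a direct edit of that file until the default network is redefined and restarted (or the machine is rebooted); something like the following should do it:

sudo virsh net-define /etc/libvirt/qemu/networks/default.xml
sudo virsh net-destroy default
sudo virsh net-start default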

Then re-run packstack with the answer file that was left in the user's home directory:

packstack --answer-file=packstack-answers-20130729-111751.txt

Posted by: John Bresnahan | July 27, 2013

Hacking with devstack in a VM

When I first started working on OpenStack I spent too much time trying to find a good development environment. I even started a series of blog posts (that I never finished) talking about how I rolled my own development environments. I had messed around with devstack a bit, but I had a hard time configuring and using it in a way that was comfortable for me. I have since rectified that problem and I now have a methodology for using devstack that works very well and saves me much time. I describe it in detail in this post.

Run devstack in a VM

For whatever reason, there may be a temptation to run devstack locally (easier access to local files, IDE, debugger, etc); resist this temptation.  Put devstack in a VM.

Here are the steps I recommend for your devstack VM:

  1. Create a fedora 18 qcow2 base VM
  2. Create a child VM from that base
  3. Run devstack in the child VM

In this way you always have a fresh, untouched fedora 18 VM (the base image) around for creating new child images which can be used for additional disposable devstack VMs or whatever else.

Create the base VM

The following commands will create a Fedora 18 base VM.

qemu-img create -f qcow2 -o preallocation=metadata devstack_base.qcow2 12G
wget http://fedora.mirrors.pair.com/linux/releases/18/Live/x86_64/Fedora-18-x86_64-Live-Desktop.iso
virt-install --connect qemu:///system -n devstack_install -r 1024 --disk path=`pwd`/devstack_base.qcow2 -c `pwd`/Fedora-18-x86_64-Live-Desktop.iso --vnc --accelerate --hvm

At this point a window will open that looks like this:

fedora18_startup

Simply follow the installation wizard just as you would when installing Fedora on bare metal.  When partitioning the disks make sure that the swap space is not bigger than 1GB.  Once it is complete run the following commands:

sudo virsh -c qemu:///system undefine devstack_install
sudo chown <username>:<username> devstack_base.qcow2

Import into virt-manager

From here on I recommend using the GUI program virt-manager.  It is certainly possible to do everything from the command line but virt-manager will make it a lot easier.

Find the Create a new virtual machine button (shown below) and click it:

virt-manager_new_vm_circle

This will open a wizard that will walk you through the process.  In the first dialog click on Import existing disk image as shown below:

create_new_vm_import

Once this is complete run your base VM.  At this point you will need to complete the Fedora installation wizard.

Before you can ssh into this VM you need to determine its IP address and enable sshd in it.  To do this log in via the display and get a shell.  Run the following commands as root:

sudo systemctl enable sshd.service
sudo systemctl start sshd.service

You can determine the IP address by running ifconfig.  In the sample session below note that the VM's IP address is 192.168.122.52.

fedora_18_terminal

ssh into the VM and install commonly used software with the following commands:

ssh root@192.168.122.52
yum update -y
yum install -y python-netaddr git python-devel python-virtualenv telnet
yum groupinstall 'Development Tools'

You now have a usable base VM. Shut it down.

Create the Child VM

The VM we created above will serve as a solid clean base upon which you can create disposable devstack VMs quickly.  In this way you will know that your environment is always clean.

Create the child VM from the base:

qemu-img create -b devstack_base.qcow2 -f qcow2  devstack.qcow2

Again import this child into virt-manager.  Configure it with at least 2GB of RAM.  When you get to the final screen you have to take additional steps to make sure that virt-manager knows this is a qcow2 image.  Make sure that the Customize configuration before install option is selected (as shown below) and then click on Finish.

create_new_vm_customize

In the next window find Disk 1 on the left hand side and click it.  Then on the right hand side find Storage format and make sure that qcow2 is selected.  An example screen is below:

create_new_vm_qcow2

Now click on Begin Installation and your child VM will boot up.  Just as you did with the base VM, determine its IP address and log in from a host shell.

Install devstack

Once you have the VM running, ssh into it.  devstack cannot be installed as root so be sure to add a user that has sudo privileges.  Then log into that user's account and run the following commands (note that the first 2 commands are working around a problem that I have observed with tgtd).

mv /etc/tgt/conf.d/sample.conf /etc/tgt/conf.d/sample.conf.back
service tgtd restart
git clone git://github.com/openstack-dev/devstack.git
cd devstack
./stack.sh

devstack will now ask for a bunch of passwords; just press enter for them all and wait (a long time) for the script to finish.  When it ends you should see something like the following:

Horizon is now available at http://192.168.122.127/
Keystone is serving at http://192.168.122.127:5000/v2.0/
Examples on using novaclient command line is in exercise.sh
The default users are: admin and demo
The password: 5b63703f25be4225a725
This is your host ip: 192.168.122.127
stack.sh completed in 172 seconds.
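
If you would rather not answer the password prompts every time, devstack will also read the values from a localrc file in the devstack directory.  Something like the following (the values, and possibly the exact variable names depending on your devstack version, are just an example) makes subsequent runs non-interactive:

ADMIN_PASSWORD=devstackpass
MYSQL_PASSWORD=devstackpass
RABBIT_PASSWORD=devstackpass
SERVICE_PASSWORD=devstackpass
SERVICE_TOKEN=devstacktoken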

Hacking With devstack

If the above was painful do not fret, you never have to do it again.  You may choose to create more child VMs, but for the most part you can use your single devstack VM over and over.

Checkout devstack

In order to run devstack commands you have to first set some environment variables.  Fortunately devstack has a very convenient script for this named openrc.  You can source it as the admin user or the demo user.  Here is an example of setting up an environment for using OpenStack shell commands as the admin user and admin tenant:

. openrc admin admin

It is that easy!  Now let's run a few OpenStack commands to make sure it works:

[jbresnah@localhost devstack]$ . openrc admin admin
[jbresnah@localhost devstack]$ glance image-list
+--------------------------------------+---------------------------------+-------------+------------------+----------+--------+
| ID                                   | Name                            | Disk Format | Container Format | Size     | Status |
+--------------------------------------+---------------------------------+-------------+------------------+----------+--------+
| a3e245c2-c8fa-4885-9b2e-2fc2e5f358a1 | cirros-0.3.1-x86_64-uec         | ami         | ami              | 25165824 | active |
| e6554b2a-cc75-42bf-8278-e3fc3f97501b | cirros-0.3.1-x86_64-uec-kernel  | aki         | aki              | 4955792  | active |
| f2750476-4125-46f1-8339-f94140c40ba3 | cirros-0.3.1-x86_64-uec-ramdisk | ari         | ari              | 3714968  | active |
+--------------------------------------+---------------------------------+-------------+------------------+----------+--------+
[jbresnah@localhost devstack]$ glance image-show cirros-0.3.1-x86_64-uec
+-----------------------+--------------------------------------+
| Property              | Value                                |
+-----------------------+--------------------------------------+
| Property 'kernel_id'  | e6554b2a-cc75-42bf-8278-e3fc3f97501b |
| Property 'ramdisk_id' | f2750476-4125-46f1-8339-f94140c40ba3 |
| checksum              | f8a2eeee2dc65b3d9b6e63678955bd83     |
| container_format      | ami                                  |
| created_at            | 2013-07-26T23:20:18                  |
| deleted               | False                                |
| disk_format           | ami                                  |
| id                    | a3e245c2-c8fa-4885-9b2e-2fc2e5f358a1 |
| is_public             | True                                 |
| min_disk              | 0                                    |
| min_ram               | 0                                    |
| name                  | cirros-0.3.1-x86_64-uec              |
| owner                 | 6f4bdfaac28349b6b8087f51ff963cd5     |
| protected             | False                                |
| size                  | 25165824                             |
| status                | active                               |
| updated_at            | 2013-07-26T23:20:18                  |
+-----------------------+--------------------------------------+
[jbresnah@localhost devstack]$ glance image-download --file local.img a3e245c2-c8fa-4885-9b2e-2fc2e5f358a1
[jbresnah@localhost devstack]$ ls -l local.img
-rw-rw-r--. 1 jbresnah jbresnah 25165824 Jul 26 19:29 local.img

In the above session we listed all of the images that are registered with Glance, got specific details on one of them, and then downloaded it.  At this point you can play with the other OpenStack clients and components as well.

Screen

devstack runs all of the OpenStack components under screen.  You can attach to the screen session by running:

screen -r

You should now see something like the following:

devstackscreen

Notice all of the entries on the bottom screen toolbar.  Each one of these is a session running an OpenStack service.  The output is a log from that service.  To toggle through them hit <ctrl+a+space>.

Making a Code Change

In any given screen session you can hit <ctrl+c> to kill a service, and then <up arrow> <enter> to restart it.  The current directory is the top of that service's Python source tree, as if you had checked it out from git.  You can make changes in that directory and you do not need to install them in any way.  Simply kill the service (<ctrl+c>), make your change, and then restart it (<up arrow>, <enter>).

In the following screen cast you see me do the following:

  1. Connect to the devstack screen session
  2. Toggle over to the glance-api session
  3. Kill the session
  4. Alter the configuration
  5. Make a small code change
  6. Restart the service
  7. Verify the change

Posted by: John Bresnahan | July 15, 2013

Quality Of Service In OpenStack

In this post I will be exploring the current state of quality of service (QoS) in OpenStack.  I will be looking at both what is possible now and what is on the horizon and targeted for the Havana release.  Note that I am truly only intimately familiar with Glance and thus part of the intention of this post is to gather information from the community.  Please let me know what I have missed, what I have gotten incorrect, and what else might be out there.

Introduction

The term quality of service traditionally refers to a user's reservation, or guarantee, of a certain amount of network bandwidth.  Instead of letting current network traffic and TCP flow control and back-off algorithms dictate the rate of a user's transfer across a network, the user would request N bits/second over a period of time.  If the request is granted the user could expect to have that amount of bandwidth at their disposal.  It is quite similar to resource reservation.

When considering quality of service in OpenStack we really should look beyond networks and at all of the resources on which there is contention, the most important of which are:

  • CPU
  • Memory
  • Disk IO
  • Network IO
  • System bus

Let us take a look at QoS in some of the prominent OpenStack components.

Keystone and Quotas

While quotas are quite different from QoS they do have some overlapping concepts and thus will be discussed here briefly.  A quota is a set maximum amount of a resource that a user is allowed to use.  This does not necessarily mean that the user is guaranteed that much of the given resource; it just means that is the most they can have.  That said, quotas can sometimes be manipulated to provide a type of QoS (ex: set a bandwidth quota to 50% of your network resources per user and then only allow two users at a time).

Currently there is an effort in the keystone community to add centralized quota management for all OpenStack components to keystone.  Keystone will provide management interfaces to the quota information.  When a user attempts to use a resource OpenStack components will query Keystone for the particular resource’s quota.  Enforcement of the quota will be done by that OpenStack service, not by Keystone.

The design for quota management in keystone seems fairly complete and is described here.  The implementation does not appear to be targeted for the Havana release but hopefully we will see it some time in the I cycle.  Note that once this is in Keystone the other OpenStack components must be modified to use it so it will likely be some time before this is available across OpenStack.

Glance

Glance is the image registry and delivery component of OpenStack.  The main resources that it uses are network bandwidth when uploading/downloading images and the storage capacity of backend storage systems (like swift and GlusterFS).  A user of Glance may wish to get a guarantee from the server that when it starts uploading or downloading an image that server will deliver N bits/second.  In order to achieve this Glance not only has to reserve bandwidth on the worker's NIC and the local network, but it also has to get a similar QoS guarantee from the storage system which houses its data (swift, GlusterFS, etc).

Current State

Glance provides no first class QoS features.  There is no way at all for a client to negotiate or discover the amount of bandwidth which can be dedicated to them.  Even using outside OS-level services to work around this issue is unlikely to help.  The main problem is reserving the end-to-end path (from the network all the way through to the storage system).

Looking forward

In my opinion the solution to adding QoS to Glance is to get Glance out of the image delivery business.  Efforts are well underway (and should be available in the Havana release) to expose the underlying physical locations of a given image (things like http:// and swift://).  In this way the user can negotiate directly with the storage system for some level of QoS, or it can use Staccato to handle the transfer for it.

Cinder

QoS for Cinder appears to be underway for the Havana release.  Users of Cinder can ask for a specific volume type.  Part of that volume type is a string that defines the QoS of the volume IO (fast, normal, or slow).  Backends that can handle all of the demands of the volume type become candidates for scheduling.

More information about QoS in cinder can be found in the following links:

Quantum/Neutron

Neutron (formerly known as Quantum) provides network connectivity as a service.  A blueprint for QoS in Neutron can be found here and additional information can be found here.

This effort is targeted for the Havana release.  In the presence of Neutron plugins that support QoS (Cisco, Nicira, ?) this will allow users to reserve network bandwidth.

Nova

In nova all of the resources in the above list are used.  User VMs necessarily use some amount of CPU, memory, IO, and network resources. Users truly interested in a guaranteed level of quality of service need a way to pin all of those resources.  An effort for this in Nova is documented here with this blueprint.

While this effort appears to be what is needed in Nova, it is unfortunately quite old and currently marked as obsolete.  However, the effort seems to have gained new life recently, as shown by this email exchange.  A definition of work can be found here with the blueprint here.

This effort will operate similarly to how Cinder is proposing QoS.  A set of strings will be defined: high (1 vCPU per CPU), normal (2 vCPUs per CPU), low (4 vCPUs per CPU).  This type string would then be added as part of the instance type when requesting a new VM instance.  Memory commitment is not addressed in this effort, nor is network and disk IO (however those are best handled by Neutron and Cinder respectively).

Unfortunately nothing seems to be scheduled for Havana.

Current State

Currently in nova there is the following configuration option:

# cpu_allocation_ratio=16.0

This sets the ratio of virtual CPUs to physical CPUs.  If this value is set to 1.0 then the user will know that the number of CPUs in its requested instance type maps to full system CPUs.  Similarly there is:

# ram_allocation_ratio=1.5

which does the same thing for RAM.  While these do give a notion of QoS to the user they are too coarsely grained and can be inefficient when considering users that do not need/want such QoS.
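
For example, an operator who wants a strict one-to-one mapping of virtual to physical resources could set something like the following in nova.conf on the compute nodes:

cpu_allocation_ratio=1.0
ram_allocation_ratio=1.0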

Swift

Swift does not have any explicit QoS options.  However it does have a rate limiting middleware which provides a sort of quota on bandwidth for users.  How to set these values can be found here.

Posted by: John Bresnahan | June 29, 2013

tempest devstack and bugs oh my!

One of the best things about OpenStack development is the rich testing and review frameworks around it, the most important of which are tempest and devstack.  In this post I will discuss how I recently squashed a particularly painful bug (read: I make dumb mistakes) in my latest patch to Glance.

The Bug

My patch seemed like a fairly minor change.  Functionally it added little, but it touched quite a few files.  My first patch passed all the automated tests and received some thoughtful reviews from the Glance community.  I made the needed changes, which seemed innocuous, but they caused it to fail the gate tests.  This was perplexing to me because it passed all the unit tests and functional tests just fine, as well as my own manual tests.  Walking through the code with a suspicious eye-ball also gave me no hints as to the problem.

The output from tempest on review.openstack.org was fantastic.  Not only did I get the output from all of the nosetests under python 2.6/7 and the output of the gate tests, but I could also access the logs of the devstack screen sessions (more on this later).  This output was found on the gerrit review page for the patch I submitted.  Jenkins added comments to it that look like the following:

jenins_fail

By clicking on one of the failure links (like the one circled above) I was taken to quite a bit of information, specifically this page where I found the console log (the output from the tempest tests) and the screen session logs (which are basically the logs from each OpenStack service).

This was amazingly helpful.  Unfortunately in my case all I found out was that the bug was indeed my fault, and that I still had no idea why.  At this point I knew that I needed to step through the code.

This was a problem for me.  I had a good development environment for stepping through Glance's unit and functional tests, but not the gate tests.  I needed a way to run these failing tests in my own debug environment.  After many fruitless attempts (which included writing my own client to simulate the test which failed — working harder not smarter) here is what I ultimately did.

Devstack

I had a Fedora 18 devstack enabled VM ready for use (if you do not have a devstack VM make one).  I originally created it using virt-manager.  I ran it with 2 vCPUs and 4GB of RAM.  To ‘devstack enable’ it I cloned devstack from github and then ran stack.sh (it is pretty much that easy, see http://devstack.org).  I then learned how easy it was to run tempest gate tests on a devstack VM.  This was literally all I did:

nosetests tempest

boom! The gate tests were running.  Not with my faulty code, and not in my debug environment.  But it was on my machine and under my control. I was half of the way there.

My Patch

To get my devstack VM to run my comically brain-dead and buggy code I had to do the following:

cd /opt/stack/glance
git fetch https://review.openstack.org/openstack/glance refs/changes/92//8 && git checkout FETCH_HEAD
git checkout -b squish_this_bug

Note that the 2nd command can be copied from any git review by clicking on the button shown below.

gerrit

Now that I had the code in the right place I just needed to restart Glance in devstack.  To do this I simply attached to the screen session with:

screen -r

Then I found the glance-api and glance-registry sessions by hitting <ctrl+a+space_bar> until I saw g-reg/g-api in the bottom toolbar:

devstack_screen

At this point I was at a command line as though I had typed the command.  So all I had to do was hit <ctrl+c> to kill it and then press the up arrow and enter to restart it.  Now devstack was running my troubled code.

Stepping Though The Code

I am partial to pycharm for my IDE and debugger (hey jetbrains, wanna hook a bruddah up with a free open source license?).  In a previous blog I talked about how to get that running.  I will review and expand on that a bit here.

NFS and Remote Debugging

Even though my devstack VM was running on my laptop just like pycharm, it was still a remote process.  The first thing I had to do was give pycharm and devstack access to the same files.  I did this by running NFS inside of the devstack VM and exporting an NFS mount at /opt/stack.  I then used my host laptop as an NFS client and I mounted the VM's /opt/stack file system on my laptop in the same place (note: it must be mounted at /opt/stack).  At this point both the laptop and the VM have access to the devstack source code.

pycharm

Next I needed to create a pycharm project for glance under /opt/stack/glance. I started pycharm and clicked on File->Open Directory and then selected the directory /opt/stack/glance.  From there pycharm did the rest of the work modulo a few questions which had easy answers (click next).  Finally I had to create a remote project as outlined in my previous post.

Once pycharm was configured to accept remote connections I only needed to tell Glance to connect to it on start up.  To do this I opened /etc/glance/glance-api.conf (or /etc/glance/glance-registry.conf depending upon which service I was fighting with at the time) and added the following line:

pydev_worker_debug_host = <IP of host machine>

Then I just killed the process in the screen session and restarted it (as described above with <ctrl+c> and up arrow) and everything was all connected.

From there I was able to finally determine that in python there is in fact a very big difference between None and [].  Does anyone want to fund my first trip to PyCon?!

Posted by: John Bresnahan | April 25, 2013

A Picture Can Beat 1000 Dead Horses

Unless this is your first time reading my blog, you are probably aware that I am beginning to become obsessed with the idea of a data transfer service.  In this post I continue the topic from my previous post by introducing a couple of diagrams.

the_wild_west

A diagram of a possible swift deployment is on the right side.  On the left is a client to that service.  The swift deployment is very well managed, redundant, and highly available.  The client speaks to swift via a well defined REST API, using supported client side software to interpret the protocol.  However, between the server side network protocol interpreter and the client side network protocol interpreter is the wild west.

The wild west is completely unprotected and unmanaged.  Many things can occur that cause a lost, slow, or disruptive transfer.  For example:

  • Dropped connections
  • Congestion events
  • Network partitions

Such problems make data transfer expensive.  Ideally there would be a service to oversee the transfer.  Transfers could be check-pointed as they progress so that if a connection is dropped they could be restarted with minimal loss.  It could also try to maximize the efficiency of the pathway between the source and the destination by tuning the protocols in use (like setting a good value for the TCP window), using multicast protocols where appropriate (like bittorrent), or scheduling transfers so as to not shoot itself in the foot.

A safer architecture would look like this:

tamed_west

The transfer service is now in a position to manage the transfer, and thus it allows for the following:

  • A fire-and-forget asynchronous transfer request from the client.
  • Babysit and checkpoint the transfer.  If it fails, restart it from the last checkpoint.
  • Schedule transfer for optimal times.
  • Prioritize transfers and their use of the network.
  • Coalesce transfer requests and schedule appropriately and into multicast sessions.
  • Negotiate the best possible protocol between the two endpoints.
  • Verify that the data is successfully written to the destination storage system and verify its integrity.

Posted by: John Bresnahan | April 24, 2013

Storage != Transfer

In this post I argue that the concepts of data transfer and data storage should not be conflated into a single solution.  Like many problems in computer science, by abstracting problems into their own solution space, they can be more easily solved.  I believe that OpenStack can benefit from a new component that offloads the burden of optimally transferring images from existing components like nova-compute and swift.

Storage Systems

Within the OpenStack world there are a few interesting storage systems. Swift, Gluster, and Ceph are just three that immediately come to mind. These systems do amazing things like data redundancy, distribution, high availability, parallel access, and consistency to name just a few.  As such systems get more complex they can become aware of caching levels and tertiary storage. Storage systems also need to be concerned with the integrity of the physical media used to store the data which quickly leads to a system of checksums and forward error correction.  One can imagine how complex that can become.

I have probably missed many other challenges, and yet that list alone is nearly daunting.  In addition to it, storage systems need an access protocol that enables reading and writing data.  The access protocol is used in many ways including random access, block level IO, small chunks, large chunks, and parallel IO.

With the access protocol users can also stream large data sets from the storage system to a client (and thereby to another storage system), even across a WAN.  However, I argue that such actions are often best left to a service dedicated to that job (as I described in a previous post).  The storage system's control domain ends at its API.  After that, all bytes coming and going are in the wild west.

Transfer Service

A transfer service's primary responsibility is moving data from one place to another in the most efficient, safe, and effective way.  GridFTP and Globus Online provide good examples of transfer services.  The transfer service's job is to make the lawless land between two storage systems safer.  Its duty is to make sure that all bytes (or bytes that look just like them) make it across the network and to the destination, safely and quickly and without disruption to other travellers.

When dealing with large data set transfers the following must be considered:

  • Restart transfers that fail after partial completion without having to retransmit large amounts of data.
  • Negotiate the fastest/best protocol between endpoints.
  • Set protocol specific parameters for optimal performance (eg: TCP window size).
  • Schedule transfer for an optimal time (which can prevent thrashing).
  • Manage the resources it is using (network, CPU, etc) on both the source and destination and prevent overheating.
  • Allow for 3rd party transfers (do not force the end user to speak every complex protocol).

Just as the transfer service is not concerned with data once it safely hits a storage system, the storage system should not be concerned with the above list.  Yet both services are needed in an offering like OpenStack.

Summary

When data is written to storage it should be kept safe and available.  When it is read the exact same data should be immediately available and correct.  That is the charge placed on the storage system, and that is where its charge should reasonably end.  The storage system cannot be responsible for making sure the data safely and efficiently crosses networks, which are often outside of its control, to other storage systems.  That is asking too much of one logical component.  That is the job of a transfer service.
