One of the best things about OpenStack development is the rich set of testing and review frameworks around it, the most important of which are tempest and devstack. In this post I will discuss how I recently squashed a particularly painful bug (read: I make dumb mistakes) in my latest patch to Glance.
My patch seemed like a fairly minor change. Functionally it added little, but it touched quite a few files. My first patch set passed all the automated tests and received some thoughtful reviews from the Glance community. I made the requested changes, which seemed innocuous, but they caused the patch to fail the gate tests. This was perplexing to me because it passed all the unit and functional tests just fine, as well as my own manual tests. Walking through the code with a suspicious eye also gave me no hints as to the problem.
The output from tempest on review.openstack.org was fantastic. Not only did I get the output from all of the nosetests under python 2.6/7 and the output of the gate tests, I could also access the logs of the devstack screen sessions (more on this later). This output was found on the gerrit review page for the patch I submitted. Jenkins added comments to it that look like the following:
By clicking on one of the failure links (like the one circled above) I was taken to quite a bit of information, specifically this page where I found the console log (the output from the tempest tests) and the screen session logs (which are basically the logs from each OpenStack service).
This was amazingly helpful. Unfortunately in my case all I found out was that the bug was indeed my fault, and that I still had no idea why. At this point I knew that I needed to step through the code.
This was a problem for me. I had a good development environment for stepping through Glance’s unit and functional tests, but not the gate tests. I needed a way to run these failing tests in my own debug environment. After many fruitless attempts (which included writing my own client to simulate the failing test, working harder not smarter) here is what I ultimately did.
I had a Fedora 18 devstack enabled VM ready for use (if you do not have a devstack VM make one). I originally created it using virt-manager. I ran it with 2 vCPUs and 4GB of RAM. To ‘devstack enable’ it I cloned devstack from github and then ran stack.sh (it is pretty much that easy, see http://devstack.org). I then learned how easy it was to run tempest gate tests on a devstack VM. This was literally all I did:
Boom! The gate tests were running. Not with my faulty code, and not in my debug environment, but on my machine and under my control. I was halfway there.
To get my devstack VM to run my comically brain-dead and buggy code I had to do the following:
cd /opt/stack/glance
git fetch https://review.openstack.org/openstack/glance refs/changes/92//8 && git checkout FETCH_HEAD
git checkout -b squish_this_bug
Note that the 2nd command can be copied from any gerrit review by clicking on the button shown below.
Now that I had the code in the right place I just needed to restart Glance in devstack. To do this I simply attached to the screen session with:
Then I found the glance-api and glance-registry sessions by hitting <ctrl+a+space_bar> until I saw g-reg/g-api in the bottom toolbar:
At this point I was at a command line as though I had typed the command myself. So all I had to do was hit <ctrl+c> to kill it, then press the up arrow and Enter to restart it. Now devstack was running my troubled code.
Stepping Through The Code
I am partial to pycharm for my IDE and debugger (hey jetbrains, wanna hook a bruddah up with a free open source license?). In a previous blog I talked about how to get that running. I will review and expand on that a bit here.
NFS and Remote Debugging
Even though my devstack VM was running on my laptop just like pycharm, it was still a remote process. The first thing I had to do was give pycharm and devstack access to the same files. I did this by running an NFS server inside the devstack VM and exporting /opt/stack. I then used my host laptop as an NFS client and mounted the VM’s /opt/stack file system on my laptop in the same place (note: it must be mounted at /opt/stack so that file paths match on both sides). At this point both the laptop and the VM had access to the devstack source code.
Next I needed to create a pycharm project for glance under /opt/stack/glance. I started pycharm and clicked on File->Open Directory and then selected the directory /opt/stack/glance. From there pycharm did the rest of the work modulo a few questions which had easy answers (click next). Finally I had to create a remote project as outlined in my previous post.
Once pycharm was configured to accept remote connections I only needed to tell Glance to connect to it on start up. To do this I opened /etc/glance/glance-api.conf (or /etc/glance/glance-registry.conf depending upon which service I was fighting with at the time) and added the following line:
pydev_worker_debug_host = <IP of host machine>
Then I just killed the process in the screen session and restarted it (as described above with <ctrl+c> and up arrow) and everything was all connected.
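For anyone wondering what that config line amounts to, here is a rough sketch of the idea, not Glance's actual code. It assumes the service reads the debug host at start up and hands it to pydevd.settrace, which is the call pycharm's remote debug server waits for. The function name, the plain dict standing in for the parsed config, the pydev_worker_debug_port key and the 5678 default are all my own assumptions for illustration.

```python
def attach_debugger_if_configured(conf, settrace=None):
    """Connect to a waiting remote debugger if a debug host is configured.

    conf is a plain dict standing in for the parsed service config
    (an assumption; the real service uses its own config objects).
    settrace defaults to pydevd.settrace and is injectable so the sketch
    can be exercised without a debugger installed.
    """
    host = conf.get("pydev_worker_debug_host")
    if not host:
        # Option absent or empty: start the service normally.
        return False

    # Hypothetical port option; 5678 is used here as an assumed default.
    port = int(conf.get("pydev_worker_debug_port", 5678))

    if settrace is None:
        # pydevd ships with pycharm's debug helpers and must be importable
        # by the process being debugged.
        import pydevd
        settrace = pydevd.settrace

    # Dial out to the IDE, which must already be listening on host:port.
    settrace(host, port=port, stdoutToServer=True, stderrToServer=True)
    return True
```

The important detail is the direction of the connection: the service dials out to the IDE, so pycharm has to be listening before the process is restarted in the screen session.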
From there I was able to finally determine that in python there is in fact a very big difference between None and . Does anyone want to fund my first trip to PyCon?!
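To make that kind of lesson concrete, here is a minimal sketch of how None and an empty value can be confused. The exact value on the other side of the comparison is not shown above, so the empty string and empty list below are only illustrative assumptions, and both helper functions are hypothetical rather than taken from the Glance patch.

```python
def uses_truthiness(value, default="fallback"):
    # Treats every falsy value (None, "", [], {}, 0) as "missing".
    return default if not value else value


def uses_identity(value, default="fallback"):
    # Treats only None as "missing"; empty values pass through unchanged.
    return default if value is None else value


if __name__ == "__main__":
    for candidate in (None, "", []):
        print(repr(candidate),
              repr(uses_truthiness(candidate)),
              repr(uses_identity(candidate)))
```

The two helpers agree for None and for any non-empty value, and disagree only for empty ones, which is the sort of difference that unit tests can miss when they never pass in an empty value.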