New Year, New Blog

Just a short message to let you know that, in an effort to blog more, I’ve decided to kick-start the new year with a new blog, where all of my previous (and, I admit, old) OpenStack posts can also be found.


Upgrade Ubuntu 12.04 Precise to 12.10 Quantal causes black screen

I’ve just upgraded my 12.04 desktop to the Ubuntu 12.10 Quantal release and, disappointingly, the upgrade was not smooth: X failed to start. Given I use ATI’s proprietary driver, I assumed this was the cause. Unfortunately that didn’t seem to be the case, as running “Failsafe X” also resulted in a black screen.

Recalling the pain of upgrading from 11.04 to 11.10 and getting similar symptoms (although those issues related to networking), I ran through the same steps and had X back again. The steps I took were:

  1. Boot into a safe session by selecting Advanced Ubuntu Options from the Grub menu, then choosing recovery mode
  2. Drop to a root shell
  3. rm -rf /var/run /var/lock
  4. ln -s /run /var
  5. ln -s /run/lock /var
  6. reboot
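
Before deleting anything, it can be worth confirming that the symlinks really are the problem. The following is a minimal sketch of such a check and is not part of the fix above: the function name check_run_links and its optional root argument are made up for illustration, so the check can be tried against a scratch directory first.

```shell
#!/bin/sh
# Report whether var/run and var/lock under a given root are symlinks to
# /run and /run/lock, which is the layout the steps above create.
# The optional root argument is an assumption added for illustration;
# with no argument the real /var/run and /var/lock are inspected.
check_run_links() {
    root="${1:-}"
    rc=0
    for pair in "var/run:/run" "var/lock:/run/lock"; do
        link="$root/${pair%%:*}"   # path to inspect, e.g. /var/run
        want="${pair##*:}"         # expected symlink target, e.g. /run
        if [ -L "$link" ] && [ "$(readlink "$link")" = "$want" ]; then
            echo "ok: $link -> $want"
        else
            echo "needs fixing: $link"
            rc=1
        fi
    done
    return "$rc"
}

# Example (read-only, safe to run as a normal user):
#   check_run_links
```

The check only reads the filesystem, so it is safe to run before and after the steps to see whether anything changed.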

I then had X back and lightdm working. I went to log in, but Unity didn’t seem to work (I still need to troubleshoot and will update here). My current workaround was to install Gnome.

  1. Get to a console (try CTRL+ALT+F1, although this might not work; if not, reboot into a root shell again)
  2. (If you don’t have networking, as the root user run the following: dhclient eth0)
  3. apt-get update
  4. apt-get install gnome
  5. reboot or restart lightdm (service lightdm restart)
  6. Under your username, choose the Ubuntu symbol above the > symbol to choose window managers and select Gnome
  7. Log in

To get Unity back:

  1. Fire up a console
  2. sudo apt-get remove --purge unity
  3. sudo apt-get install unity ubuntu-desktop

(thanks @EwanToo!)

I have no option to install the proprietary driver now, so I’ll see how my ATI FireGL card performs under the GPL driver…

My response to “OpenStack is overstretched” – The Register

What do we get from writing an opinion online? Notoriety? Fame? Publicity? I’m doing it because sometimes I need to fill in the blank space between the last post and this one.
I’ve just finished reading a Register article entitled “OpenStack is overstretched” in which Enrico Signoretti writes that, due to the popularity of OpenStack, it seems like everyone needs to stick their oar in, good and bad, and the outcome is too many cooks.
Whilst I don’t question that juggling a large number of vendors’ interests in a popular Open Source Cloud “Operating System” is a large burden, isn’t this where Open Source projects shine? Sub-groups work on particular code that may have only a small group of people’s interests at heart, but the source code is available to do that, right? What makes that feature get into the final cut? Does it matter so long as the core platform exists to serve the general public?
There is no doubt that OpenStack is a challenging product to use and implement, and equally one that must keep the release manager, and the community as a whole, busy. But their interest is in making a fantastic Open Source product available to all who want to take it in any direction they want.
We live in a world where we expect products to have one-click installs, and in fact part of my job is to get to this stage in all aspects by automating everything. Sometimes it isn’t possible, but we can make it easier and certainly strive to get there. What we can do is contribute back, rather than sit back and watch a product mature enough to then add your own SKU that you can resell to keep your shareholders happy.
It sounds like people are frustrated because OpenStack has the potential to be big but it’s not ready to shine yet. There are companies using it, there are companies investigating it and there are companies that are contributing to it.

This is Open Source. You get the choice of what you want to do with it.

Ubuntu 11.10 Oneiric Ocelot on the desktop – my thoughts (and it’s not good)

So Ubuntu 11.10, aka Oneiric Ocelot, has been out for a short while now and so far it has been nothing but pain for people upgrading from an earlier release. Not only are the bugs racking up (and some are showstoppers, as my post regarding “Waiting for network configuration” shows), but the move to Unity seems disastrous and is losing people’s allegiance to what was once the desktop Linux of choice for many.
Has Ubuntu lost its way here? Ubuntu’s parental backers, Canonical, are concentrating their efforts on their Ubuntu Cloud Infrastructure project and Ubuntu 11.10 on a server is great, even bringing with it an easier way to get OpenStack installed.
For me, Unity is a mistake. It made sense on my netbook, and maybe it makes sense on a touchscreen, but it doesn’t make sense on my desktop. Integration with even the most basic apps is causing problems (Gvim anyone? Empathy?), it’s sluggish (Gwibber status updates

Overall I’ve lost my faith in Ubuntu on the desktop, which is a shame as it was on the way to making adoption of an Open Source desktop possible.

sshfs – SSH FileSystem, replacement to NFS over wireless?

sshfs (SSH FileSystem) is a FUSE based file system that allows mounting of remote directories using SSH (or more specifically, sftp).
I have all my Linux machines mounting remote directories from a central NAS on my home network. Some machines are connecting at 100Mb LAN (though reduced as I actually run PowerLine adapters between the machines and the NAS), but the more frequently used machines (laptops/netbooks) connect via wireless. One connects at 802.11n, another connects at 802.11g speeds.

Recently, NFS performance has been poor on the UNR Acer Aspire 1 (A110) netbook, which is the most used device in the home.  To improve this without opting to add on an 802.11n USB dongle I recently started to look at sshfs.  SSH is stable – I’ve never had issues scping files between NAS and netbook.  NFS hooks at lower levels into the kernel, and when NFS hangs, the netbook becomes unstable enough to warrant a reboot. “Turn it off and on again” just isn’t an option for something that “should just work”.

sshfs uses SFTP to present the filesystem on the local machine so make sure the sftp subsystem is enabled in your sshd_config on the server:

Subsystem       sftp    /usr/libexec/sftp-server
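
A quick way to check this from a shell on the server is sketched below. The helper name sftp_enabled and the default path /etc/ssh/sshd_config are assumptions for illustration; on a NAS the file may well live somewhere else.

```shell
#!/bin/sh
# Succeed (exit status 0) if the given sshd_config contains an active,
# uncommented "Subsystem sftp ..." line. The default path is an assumption.
sftp_enabled() {
    conf="${1:-/etc/ssh/sshd_config}"
    # Commented-out lines start with '#', so they do not match this pattern.
    grep -Eqi '^[[:space:]]*Subsystem[[:space:]]+sftp[[:space:]]+' "$conf"
}

# Example:
#   sftp_enabled && echo "sftp subsystem is enabled"
```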

The great thing about sshfs being based on sftp and FUSE is that you can run it as a regular user (as shown here initially).

On my Qnap NAS I have a number of NFS exports. To test the performance I used an NFS export containing lots of photos.


The NFS export is the following:

"/share/NFS/test" *(rw,nohide,async,no_root_squash)

This is mounted with the following options in fstab:

nas:/share/NFS/test /media/test nfs rw,bg,tcp,wsize=32768,rsize=32768,vers=3


The general syntax for sshfs is

sshfs user@host:/directory /mountpoint

After a number of tests, the following options gave me good performance with reliability:

sshfs -o idmap=user -o uid=1000 -o gid=1000 -o workaround=nodelaysrv:buflimit -o no_check_root -o kernel_cache -o auto_cache admin@nas:/share/MD0_DATA/test /media/sshfs

Note that when you execute this command it will ask for the password of the username specified. Normal SSH rules apply here – access using ssh keys is the way to provide secure, seamless, unprompted access [or prompted if you protect your key with a passphrase, of course].

So this gives me two areas on my wireless laptop:

/media/test is the NFS mount point of /share/MD0_DATA/test on ‘nas’

/media/sshfs is the sshfs mount point of /share/MD0_DATA/test on ‘nas’

SSH Timeout

It is crucial that you add the following to your SSH client config, which is located at ~/.ssh/config (or /etc/ssh/ssh_config for all users on the machine)

ServerAliveInterval 15
ServerAliveCountMax 3

This is to avoid the SSHFS mount hanging after a timeout – which is quite messy to clean up.
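
If you would rather script this than edit the file by hand, something along these lines should work. It is only a sketch: it assumes the per-user client config lives at ~/.ssh/config (the usual OpenSSH location), applies the settings to every host via a Host * block, and the function name ensure_keepalive is made up for the example.

```shell
#!/bin/sh
# Append the keep-alive settings to an SSH client config unless a
# ServerAliveInterval line is already present. The default path and the
# "Host *" scope are assumptions; narrow the Host pattern if you only
# want this for the NAS.
ensure_keepalive() {
    config="${1:-$HOME/.ssh/config}"
    mkdir -p "$(dirname "$config")"
    touch "$config"
    chmod 600 "$config"
    if grep -qi '^[[:space:]]*ServerAliveInterval' "$config"; then
        echo "keep-alive already configured in $config"
        return 0
    fi
    printf 'Host *\n    ServerAliveInterval 15\n    ServerAliveCountMax 3\n' >> "$config"
    echo "keep-alive settings added to $config"
}

# Example:
#   ensure_keepalive
```

Because ssh uses the first value it finds for each option, appending a Host * block at the end of the file leaves any earlier per-host settings in charge.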

Performance sshfs vs NFS

Performance tests were rough and ready, but I needed to represent the real world.  I timed directory listings/finds and also compared the two visually using Gnome, as fixing perceived performance on a netbook is the crux of the issue.

The test area had 4,501 photos of various formats and file sizes.


time find /media/sshfs

real 0m6.254s
user 0m0.010s
sys 0m0.110s


time find /media/test

real 0m3.738s
user 0m0.020s
sys 0m0.110s


I can repeat those tests over and over and I get NFS consistently quicker than sshfs. Visually I see Gnome creating thumbnails slower under sshfs than I do under NFS – but it is still acceptable.  The reason I’m looking at improving performance of the remote filesystems currently mounted under NFS is because of the instability witnessed using NFS – although there is a caveat to this stability…
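
To make the repeated runs less tedious, the timing can be wrapped in a small loop. This is a rough sketch rather than a proper benchmark: the function name bench_find is made up, and it assumes GNU date (for the %N nanosecond format), which is what Ubuntu ships.

```shell
#!/bin/sh
# Time a recursive listing of a directory several times and print the
# entry count and elapsed wall-clock time for each run.
# Assumes GNU date, whose %N format gives nanoseconds.
bench_find() {
    dir="$1"
    runs="${2:-3}"
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s%N)
        entries=$(find "$dir" | wc -l | tr -d ' ')
        end=$(date +%s%N)
        echo "run $i: $entries entries in $(( (end - start) / 1000000 )) ms"
        i=$((i + 1))
    done
}

# Example:
#   bench_find /media/sshfs 5
#   bench_find /media/test 5
```

Bear in mind that later runs may be served partly from caches (the sshfs mount above uses kernel_cache and auto_cache), so the first run after mounting is probably the most honest number.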

So, NFS is quicker than sshfs, but is it quicker by enough to rule sshfs out? I think sshfs is a great idea and will certainly be used for some parts of my home network.  It will easily work its way into the enterprise too as a replacement for the age-old habit of using scp – especially when used with an automount set up, and it answers that niggling question of “do I really want to run NFS on that server just to access some files?”.

Will I use this completely at home in place of NFS? I’m not so sure – the jury’s still out.  I’m currently using UNR on the netbook and I’ve tracked down another issue with NFS over wireless: the latest kernels seem to be the cause of the instability with my ath5k wireless driver.  It appears that Ubuntu kernels 2.6.32-23 and later are causing my issue.  I’m currently running an older 2.6.32-21 kernel and all is well… for now.

First foray with Drupal

I’ve recently started to use Drupal for a community intranet portal system as I see it as a good fit for what the project wants to achieve.

I decided to dive straight into the latest 7 alpha release as it has the features and design of what I expect of a modern CMS portal system.

Over the next few months I’ll be documenting my trials and tribulations of using Drupal!

How to solve a problem like scraping

It’s been a while since I last blogged but it doesn’t mean I’ve disappeared. See it as me being deep in thought.

I work for a large web site operating in EMEA that has lots of invaluable data available to the public. This is great, but other people want to take that data wholesale without going through the proper authorised channels. This is known as scraping – effectively “Site Content Raping” to coin a not so nice phrase.

Scraping is very easy to do. There are tools out there that, in a few clicks, will spider your site and download the content. After all, the data is public and the hyperlinks are designed to take you through the data. The web search engine bots effectively scrape our site too, but the difference is that they report back the links. Scraping content involves downloading the data itself, and that is what causes legal issues. In truth, scraping is a legal issue, but the legal route to stopping it is hit by two problems: it’s a lengthy process, and it needs evidence of the scraping activity to show it’s breaking the terms and conditions of your site.

The problem with scraping is being able to identify it in the first place. Some scrapes are relatively benign and easy to spot. Ironically, they’re not usually an issue unless the lazy way they’ve implemented their scrape causes site capacity issues. But well designed site infrastructure should be able to cope with any surge in demand however it is presented. Most scraping activity remains under the radar though. Spotting the trail involves understanding how the site can be scraped in the first place and the methods used to evade detection, and the hardest challenge of all is distinguishing this from the millions of legitimate requests hitting the site at the same time.

There are a number of ways to tackle the problem:

– Employ a 3rd party to monitor and report on the scraping activity in real-time on your behalf as part of a monthly service operational expense
– Implement ingress filtering of your data to report on activities in real-time using equipment maintained and set up by teams internally
– Implement log analysis after the event

I’m looking at log analysis to tackle the issue. This involves large data set processing using Hadoop and custom scripts to slice and dice the information, helping to form conclusions and write the reports that evidence scraping activity.
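
The basic idea can be tried on a single log file before any Hadoop is involved. The sketch below is only an illustration, not the pipeline described above: it assumes an access log whose first whitespace-separated field is the client IP (as in the common Apache log format), and the function name top_clients and its threshold argument are made up for the example.

```shell
#!/bin/sh
# List clients with at least a given number of requests in an access log,
# busiest first. Assumes the first field of every line is the client IP.
top_clients() {
    log="$1"
    min="${2:-100}"
    awk -v min="$min" '
        { hits[$1]++ }                      # count requests per client
        END {
            for (ip in hits)
                if (hits[ip] >= min)
                    print hits[ip], ip      # "count ip"
        }
    ' "$log" | sort -rn
}

# Example:
#   top_clients /var/log/apache2/access.log 1000
```

Raw request counts will only catch the lazy scrapes; anything that spreads itself across many addresses or throttles its requests would need more than this to spot.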

Over the next few months I aim to track my successes and failures in combating this problem.

Ubuntu Lucid Lynx 10.04 Beta Announced

The Register has a great short article introducing you to the delights of Ubuntu Lucid Lynx’s first Beta, announced today.  You can grab the Beta version here.

The release sees new branding, which has gotten rid of the Human theme in favour of a more professional looking desktop in an attempt to make Linux desktops look less Linux-like.  The release also ventures into the world of online music with the imminent launch of Ubuntu’s U1 music store, which ties Rhythmbox to MP3 purchases.

Try it out.

I gave it a go on VirtualBox and the video drivers from the guest additions kept stack tracing. Given my current stable Karmic Koala setup, I don’t particularly want to run this Beta on real hardware just yet.