Ben Clifford Technical Blog: barwench

Showing posts with label barwench. Show all posts

12 June, 2012

fcgi, haskell, cpanel, php, drupal

I played with fastcgi, which is like CGI but doesn't have to spawn a new process each time.

The initial motivation for this was a server which has a bunch of drupal websites. It was previously running in plain CGI mode, which forks a PHP process for every page request (about 15 spawns per second on this server), with each site's PHP running under a different user account. (The other mode we've tried with this is using mod_php, which runs things much faster but doesn't provide as much isolation between web sites as using CGI as everything runs as the www-data unix user, rather than as a per-site user).

I thought I'd have to do more compiling, but it turns out fastcgi support for both apache and for PHP was already available. On my dev server I needed to apt-get the fastcgi apache module; on the production server, which uses cpanel, fastcgi support was already installed and switching it on was a single mouse click.

Here's a plot of the server CPU load before and after the switch:

There's a clearly visible daily cycle, using up almost 8 cores worth of CPU before the change. At the end of the 30th, I switched on fastcgi, and woo, the load drops right down and stays down. That's just what I wanted.

Reading more, cpanel disrecommends using fastcgi, and recommends somethign else - ruid2 - which looks like it does something similar but different. That recommendation seems mostly because fastcgi has a lot of tweakables that are hard to get right. see this thread.

caveats

I discovered a few interesting things during deployment:

Firstly, a potential attack on directories that have the ExecCGI option enabled - this is discussed in the context of the nginx web server here.

Another was a bug with a specific version of mod_fcgid and the specific configuration I set up, which resulted in a new PHP process being spawned for every page request, and then staying resident (!). Other people have experienced this and it was straightforward to tweak it so that it didn't happen.

haskell

I have a few apps for my own use written in Haskell, and one (a photo ranking app) struggles when called through the regular CGI interface, due to loading the vote/photo database each time. I've considered putting that into snap, a haskell framework, but it seemed interesting to see if I could get fastcgi running under Haskell.

apt-get install libcfgi-dev; cabal install fcgi got me the modules installed. I had some trouble running the hello-world app here

that came down to me not compiling with the -threaded option.

(I also tried the haskell direct-fastcgi module, but the home page for it is gone, and there is no example code so I rapidly gave up)

barwen.ch

I made an fcgi-bin directory available to all barwen.ch users, running FastCGI code under user accounts. There isn't much CGI going on on barwen.ch, but it seemed easy enough to deploy and make available, and is one more feature for the feature list.

10 April, 2012

commandline RSS->text tool using Haskell arrows

I wanted barwen.ch to display news updates at login. I already have an RSS feed from the drupal installation on the main page; and that RSS feed is already gatewayed into the IRC channel. So that seemed an obvious place to get news updates.

I wrote a tool, rsstty, to output the headlines to stdout. Then, I wired it into the existing update-motd installation to fire everything someone logs in.

So you can say:

$ rsstty http://s0.barwen.ch/rss.xml
 * ZNC hosting(Thu, 01 Mar 2012 10:09:15 +0000)
 * finger server with cgi-like functionaity(Wed, 22 Feb 2012 18:43:08 +0000)
 * Welcome, people who are reading the login MOTD(Fri, 17 Feb 2012 23:56:44 +0000)
 * resized and rebooted(Wed, 25 Jan 2012 12:23:39 +0000)
 * One time passwords (HOTP/TOTP)(Wed, 18 Jan 2012 11:33:45 +0000)

I wrote the code in Haskell, using the arrow-xml package.

arrow-xml is a library for munging XML data. Programming using it is vaguely reminiscent of XSLT, but it is embedded inside Haskell, so you get to use Haskell syntax and Haskell libraries.

The interesting arrow bit of the code is this. Arrow syntax is kinda awkward to get used to Haskell and sufficiently different from regular syntax and monad syntax that even if you know those you have to get used to it. If you want to get even more confused, try to figure out how it ties into category theory - possibly the worst possible way to learn arrows ever.

But basically, the below definition make a Haskell arrow which turns a url (to an RSS feed) into a stream of one line text headlines with title and date (as above)

> arrow1 urlstring =
>  proc x -> do
>   url <- (arr $ const urlstring) -< x

This turns the supplied filename into a stream of just that single filename. (i.e. awkward plumbing)

>   rss <- readFromDocument [withValidate no, withCurl []] -< url

This uses that unixy favourite, curl (which already has Haskell bindings), to convert a stream of URLs into a stream of XML documents retrieved from those URLs - for each URL, there will be one corresponding XML document.

>   item <- deep (hasName "item" <<< isElem) -< rss

Now convert a stream of XML documents into a stream of <item> XML elements. Each XML document might have multiple item elements (and probably will - each RSS news item is supplied as an <item>) so there will be more things in the output stream than in the input stream.

>   title <- textOfChild "title" -< item
>   pubdate <- textOfChild "pubDate" -< item

Next, I'm going to pull out the text of the <title> and <pubdate> child elements of the items - there should be one each per item

>   returnA -< " * " ++ title ++ "(" ++ pubdate ++ ")\n"

When we get to this point, we should have a stream of items, a stream of titles corresponding to each item, and a stream of pubdates corresponding to each title. So now I can return (using the arrow-specific returnA) what I want using regular Haskell string operations: a stream of strings describing each item.

The above arrow is wrapped in code which feeds in the URL from the command line, and displays the stream of one-line news items on stdout.

The other interesting bit is a helper arrow, textOfChild which extracts the text content of a named child of each element coming through an XML stream. Each part of this helper arrow is another arrow, and they're wired together using <<<. To read it, imagine feeding in XML elements at the right hand side, with each arrow taking that stream and outputting a different stream: first each element is converted into a stream of its children; then only the element children are allowed through; then only the elements with the supplied name; then all of the children of any elements so selected; and then the text content of those. (its quite a long chain, but thats what the XML infoset looks like...)

> textOfChild name =
>  textNodeToString <<< getChildren <<< hasName name <<< isElem <<< getChildren

28 January, 2012

one line to make your site look nice on iPhone

Well, I came across a magic line of HTML for making a website look basically readable on an iPhone. Not magic in the sense that I don't understand what it does. But magic in the sense that its a single line that is the first big step to making a site look OK.

The line is (to go in your <head> section:
<meta name="viewport" content="width = device-width" />

What it does is make the iPhone web browser render the HTML at a sensible readable font size, with word wrapping at the end of the screen. (the default, otherwise, is to try to fit a regular screen worth of pixels across, then zoom out to make it fit on the small iPhone screen - that means the user has to zoom and pan to do anything).

Now my pages still look like crappy hand written HTML, but at least they're readable on an iPhone now.

I added this to the shellinabox installation I have on barwen.ch, and now its much prettier to use a browser-based shell on an iphone - you get a 30 character terminal thats at sensible font size, rather than a wide wide terminal at unreadable font size.

03 December, 2011

DHmPC

I have a server running inside EC2. It gets its network details using dhcp.

ubuntu@s0:~$ hostname
s0.barwen.ch
ubuntu@s0:~$ hostname --fqdn
hostname: Name or service not known

grr.

This happens pretty much every time the VM reboots. Its something to do with getting a new private IP address each time it reboots.

Although this manages to work:

hostname --all-fqdns
s0.barwen.ch polm.pl stacheldraht.it s0.barwen.ch

The autoconfigured resolv.conf looks like this:

nameserver 172.16.0.23
domain eu-west-1.compute.internal
search eu-west-1.compute.internal

and if I comment out or remove both the domain and search lines, then everything works...

Those lines are wrong anyway - this machine is in my barwen.ch domain. It just happens to be hosted in amazon's network...

16 July, 2011

browserid in-browser shell logins

I've previously wired up my shell server barwen.ch to allow browser-based logins using OpenID and shellinabox. I've written about that on this blog before. I saw a few articles about Mozilla's BrowserID. The code snippets there looked like they would integrate well with the code I had already. So my evening project (which ended up only taking about half an hour) was to prototype BrowserID-based shell login.

It works basically the same as for the OpenID login on barwen.ch:

To get set up: you need to sign up for a barwen.ch account which will cost you 50 cents on PayPal; you need to send me the email that you use for BrowserID (instead of / in addition to an SSH key).

To actually log in, go to the login page http://s0.barwen.ch/~benc/browserid.html and log in. A terminal will appear in your browser. You do shell stuff.

This code is pretty crappy so I don't really want to release it until I've had a thought about the security for at least an hour (though you can find fragments of it elsewhere on this blog). I especially think that there might be some attacks possible by using freaky email addresses vs my unsanitised string handling. (I'm looking at you, Bobby Tables).

Ben Clifford Technical Blog