Troy Hunt

Another technology blog...

Request Validation, DotNetNuke and design utopia

Tuesday, March 9, 2010

It’s a hot summer day in Perth over on the western seaboard of Australia and the local pub is packed with patrons downing cold beers. You’re in your shiny new Ferrari – red, of course – and come cruising past the pub in full view of the enthralled audience. As any red-blooded, testosterone fuelled Aussie bloke would do, you give the Italian thoroughbred a full redline launch to the delight of the crowd. Right up until you run into the street sign:

image

Why did this happen? Well there’s the fact the guy was allegedly drunk which is never a good idea in any car, let alone a supercar with a few hundred kilowatts. But there’s also an interesting element of the story captured in the news article:

Witnesses said the driver revved the Italian stallion at the lights next to the Windsor Hotel, at the intersection of Mends Street and Mill Point Rd, dropped the clutch and burned off around the corner

Ah, “dropped the clutch and burned off” suggests our inebriated show pony hit the button with the car and the squiggly lines✝:

image

As many car manufacturers now do, Ferrari fitted the 360 with traction control. Left on, which is the default, loss of traction will cut power to the wheels allowing the vehicle (and hopefully the driver) to regain control and continue on without causing a scene. However, turn it off and suddenly the driver has to take a whole lot more responsibility for their actions because the safety net is gone. This is precisely what Microsoft did when they introduced Request Validation to .NET.

✝ Ok, I’m wildly speculating for the sake of edutainment. And yes trainspotters, I referenced “dropping the clutch” then used an image of the F1 gearbox which obviously doesn’t have a manually actuated clutch. Edutainment!

About request validation

Back in the days of classic ASP, pretty much any data could readily be sent to the server through input mechanisms such as form fields and query strings without any parsing beyond what the competent developer would apply. And here’s your first problem; the term “competent developer” is a little like “common sense” in that neither is that common. Many devs simply did not (and still do not) understand what needs to be parsed and what the potential ramifications are when it doesn’t happen.

Why is request validation even important? Put simply, because you want to control the information which gets sent to your application and potentially stored or redisplayed. Failure to do this opens up a whole range of potential exploits to those with malicious intent.

XSS

Before I delve into specific vulnerabilities, I want to quickly draw attention to the Open Web Application Security Project or OWASP for short. OWASP defines a “Top 10” of the common web application security risks and has become the de facto guideline for all web developers conscious of security implications in their apps.

Onto the vulnerabilities; one of the most common exploits that take advantage of missing input validation is Cross Site Scripting or XSS. Here’s what OWASP has to say about XSS:

XSS flaws occur whenever an application takes untrusted data and sends it to a web browser without proper validation and escaping. XSS allows attackers to execute script in the victim’s browser which can hijack user sessions, deface web sites, or redirect the user to malicious sites.

Let’s break this down a little. “Untrusted data” refers to data sent to the application from third parties outside those explicitly trusted by the application, for example form and querystring inputs. “Proper validation and escaping” involves ensuring the data, although untrusted, is consistent with what is expected from an external party.
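To make those two ideas concrete, here's a minimal sketch in C#. It isn't taken from any particular application: the `NameInput` class, the allowed-name pattern and the length limit are all assumptions chosen for illustration. Validation rejects anything that doesn't look like the expected input, while escaping HTML-encodes untrusted text (here via `System.Net.WebUtility.HtmlEncode`) so the browser shows it as text instead of running it.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class NameInput
{
    // Validation: accept only what a name is expected to look like.
    // The pattern and 50 character limit are assumptions; adjust to suit your data.
    private static readonly Regex AllowedName =
        new Regex(@"^[A-Za-z][A-Za-z '\-]{0,49}\z");

    public static bool IsValid(string name)
    {
        return name != null && AllowedName.IsMatch(name);
    }

    // Escaping: HTML-encode untrusted text before writing it into markup.
    public static string Escape(string untrusted)
    {
        return WebUtility.HtmlEncode(untrusted ?? string.Empty);
    }

    public static void Main()
    {
        var attack = "Tr<script>alert('Hello');</script>oy";

        Console.WriteLine(IsValid("Troy"));   // True
        Console.WriteLine(IsValid(attack));   // False
        Console.WriteLine(Escape(attack));    // angle brackets come out as &lt; and &gt;
    }
}
```

Doing both is deliberate: validation keeps bad data out of the application, and escaping covers the cases where legitimate input still contains characters that mean something in HTML.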

Worked examples

XSS exposes itself in a number of different forms but the most commonly seen and easily identifiable exploit involves rewriting querystring parameters directly to the page without validating or escaping characters. Here’s a classic example of what immediately appears to be a potentially vulnerable URL:

http://troyhunt.com/Default.aspx?Name=Troy

In this use case, my name would appear somewhere within the page. Changing the querystring value to “John”, would change the name displayed on the page accordingly. This in itself may pose a vulnerability (direct rewriting of page content) but there are cases where this is valid. It’s the next step in the discovery phase which unearths the vulnerability. What if we changed the URL as follows:

http://troyhunt.com/Default.aspx?Name=Tr<script>alert('Hello');</script>oy

There are three likely scenarios:

  1. The page will identify that a script tag is not valid input for the querystring parameter and the application will reject the request.
  2. The page will display the name exactly as it is entered in the querystring by escaping the non-alphanumeric characters and rendering them in HTML as Tr&lt;script&gt;alert('Hello');&lt;/script&gt;oy which will then display on the page precisely as entered in the querystring.
  3. The page will display the name “Troy” and you will get a JavaScript alert saying “Hello”.

This example is seemingly harmless but stop and think for a moment what is going on in the third scenario. You are controlling not only the HTML content which gets rendered to the browser but the script which executes within it. Let’s run through the possibilities for a moment:

  1. You can rewrite the page contents by closing the tag where the text is displaying and entering your own.
  2. You can insert CSS which changes the layout of the page and substitutes it with your own.
  3. You can write tags which look like username and password fields with a submit button which sends the credentials to your server.
  4. You can insert an IFrame which loads malware from your server and prompts the user to install it.
  5. You can access the user’s cookies for the site (which may include sensitive data if poorly designed) and submit them to your own server.
  6. You can do all of the above in external .js or .css files by including external references.

There are some great examples in the Cross Site Scripting FAQ over on the cgisecurity.com site which talks about XSS in more detail and provides some excellent references if you want to see how much further the exploit can be pushed.

What makes these exploits particularly cunning is that as far as the user is concerned, they’re visiting a legitimate site. The domain looks valid and they’re actually seeing what they probably expected to, there’s just other stuff going on in the background.

Social engineering

In the example above, there is a dependency on someone loading a maliciously formed URL. There are other XSS exploits which are more covert, such as successfully injecting the examples into a database which then renders them to unsuspecting users, but let’s focus on this one for the purposes of this post.

Tricking someone into following an exploited URL is simply a matter of social engineering. This term refers to manipulating the behaviour of an individual so that they perform an action which may not be overtly dangerous. It’s simply trickery on the part of the party with malicious intent.

Encouraging the loading of a malicious URL could be done by distributing it through popular channels, such as social networking, with the promise of an alluring website. If the unsuspecting party can be convinced the website is trustworthy, often because the domain and website content both appear legitimate, then the battle is halfway over. It’s then a matter of exercising any number of the potential XSS exploits.

.NET request validation

The entire preamble so far was so that the XSS technique and potential exploit was clear. As I said earlier on, in the old days of classic ASP the exploits above could easily be exercised if the developer hadn’t explicitly parsed input. Then along came .NET and with it the concept of Request Validation.

The beauty of Request Validation is that like the Ferrari’s traction control, it’s just there. You don’t have to turn it on, you don’t have to write code, and in fact you don’t even need to know it exists. It just works. To demonstrate this, here’s an entirely random querystring on a legitimate .NET page with custom errors turned off:

image

In this case there was no vulnerability to potentially exploit but Request Validation rolled out the safety net and identified a potentially malicious request anyway. This is a very good thing because no matter how (in)competent the developer is, this particular XSS mechanism cannot be exploited by many common string manipulation techniques.

However, there is a downside to the catchall approach. There are legitimate purposes for posting HTML tags to the server. One very common example of this is rich text editors. Many of these allow for editing in WYSIWYG or directly in HTML and as such this becomes a use case for legitimately posting HTML formatted strings to the server. However, just like the traction control button, Request Validation can be simply turned off and it can be done at the page level:

<%@ Page Language="C#" ValidateRequest="false" %>

Whilst this leaves that particular page vulnerable if no manual input parsing is performed, the scope is limited and all other pages still benefit from the full gamut of Request Validation protection. The other alternative is to turn off Request Validation across the entire application which leaves the web.config looking like this:

<configuration>
  <system.web>
    <pages validateRequest="false" />
  </system.web>
</configuration>

To get a sense of how easy it is to slip up when no global input validation exists, take a look at A XSS Vulnerability in Almost Every PHP Form I’ve Ever Written. Or XSS Woes for a similar experience. Very innocuous development practices, very big XSS holes.

The DotNetNuke story

For those not already familiar with this product, DotNetNuke is an open source, .NET based content management system that has now been around in different forms for about 8 years. It’s achieved a strong following by those wanting to quickly and easily deploy CMS and portal style functionality in the .NET realm without the financial and administrative burden of SharePoint.

Being a content management system, there are a lot of places where text, rich text at that, needs to be submitted to the server. A decision appears to have been made to completely disable Request Validation at the DotNetNuke application level as demonstrated in the web.config file above. What this means is that the native protection described above is gone – the traction control is permanently off - and if the developer adds new features and hasn’t explicitly coded against the potential for an exploit, they could be vulnerable.

Unfortunately for DotNetNuke, in November 2009 they identified a vulnerability with their own codebase which spans versions 4.8 through 5.1.4 or to put it another way, nearly two years’ worth of revisions. This is a significant period of time and as such, a significant number of web applications exist out there with this vulnerability. In fact I found many vulnerable sites after just a few minutes of running carefully crafted Google searches.

The bottom line is that many DotNetNuke sites now sit out in the publicly facing domain with a very easily compromised exploit. This happens in part because the search page was not correctly parsing input but this alone would not have been a problem if Request Validation hadn’t been turned off.

Design utopia

When I came across a site with the vulnerability described above I fired out a quick tweet and got an interesting response. The feedback from DotNetNuke Corp was that .NET Request Validation was a “crude filter” and that it didn’t make sense in a product allowing online editing of HTML.

What Joe was trying to say (and elaborated on in subsequent tweets), was that Request Validation doesn’t provide the same fidelity as custom parsing of request content. And he’s absolutely right. I got similar feedback from a query on Stack Overflow with the response explaining how global input validation “makes programmers lazy” and that they “must be responsible for the vulnerabilities they create”. And I couldn’t agree more.

The problem with the feedback above is that it’s like saying we should just teach people to drive properly in the first place and forget about the traction control. Yes, we should and it’s an admirable pursuit but it’s simply not going to happen and we’ll need to continue catering for those with ambition that exceeds ability. In a utopian world, all developers would understand the OWASP Top 10 and wouldn’t introduce vulnerabilities into their code. But this just doesn’t happen which is why we have Request Validation along with web server level defences such as IIS UrlScan and browser level defences like the IE8 XSS Filter.

Unfortunately the reality is that people will continue to crash cars they are unequipped to handle and developers will keep writing vulnerable code because they don’t understand fundamental security concepts. Just as our friend in the Ferrari had the traction control button, we have a safety net for developers building on ASP.NET. If you turn either of these off, you’d better really know what you’re doing because you’re now on your own without a safety net.



The no-name infrared IP camera for DIY baby monitoring

Thursday, February 25, 2010

As a new parent, I obsess about what the baby is doing. Is he awake, asleep, sucking his thumb or even still breathing? I mean I want to be quiet, just not too quiet. Do I try and sneak in commando style just to make sure he’s all good and risk waking a sleeping baby (this is never a good idea!), or do I sit in anticipation waiting for the baby monitor to confirm signs of life? I’m sure new parent paranoia is not unique to me but I like to have a little more control over my environment than just wondering what on earth is going on behind that closed door.

Recently I placed a laptop in his room with a webcam then fired up a Skype session and monitored him from the desktop in another room. I actually found it disturbingly addictive watching the actions of another whilst obscured from vision however I assure you my voyeurism begins and ends with him! The potentially creepy aspect of this aside, it was interesting to watch him go through sleep cycles and how he behaved as he woke up. It was also pretty darn amusing to see his reaction when my voice came out of nowhere telling him to “GO BACK TO SLEEP” :)

This setup was enlightening but unsustainable and impractical. It also became pretty useless once it got dark but by now I was excited and a more permanent solution was on the cards. Enter the IP camera.

About IP cameras

When it comes to cameras, and by this I mean the video kind, for this sort of thing there are a few different routes you can go down:

  1. Dedicated device and receiver unit: these are expensive as you’re getting a display unit with the package
  2. USB camera: this is essentially my prototype which has a dependency on being plugged in to a PC
  3. IP based camera: this can be wired or wireless and obtains its own IP address making it a network connected device

The advantage of the IP camera is that it is a fully autonomous unit with no dependency on other devices other than a router to give it a network identity. Wireless versions are also ubiquitous which is pretty handy if you don’t have ethernet cabling all over the place. Finally, exposing data over IP opens up a world of possibilities in terms of how it is received; PC, mobile device, internal network or public internet. Lots of options to play with.

Choosing a camera

There are a heap of wireless IP cameras on eBay but the one which keeps coming up is the little guy on the right. I say “little guy” because I have absolutely no idea who makes it or what the model is. Other than being described as “IP Camera”, it basically has no name.

Moving on, as well as being IP based and wifi enabled, this particular device supports night vision, remote pan and tilt, has a built in mic and speaker plus has a bunch of software features including motion detection, image capture, FTP, mail, externally accessible viewing and support for mobile devices. Not bad for under $90 Aus delivered or about $80 American, especially considering the specs:

Image Compression Format: Standard M-JPEG
Sensor: CMOS, 300,000 pixel
Image Resolution Rate: VGA (640x480) / QVGA (320x240)
Network interface: RJ-45/10-100 Base T, 802.11b/g
Network protocol: TCP/IP, FTP, SMTP, HTTP, UDP, DHCP, NTP, DDNS, UPNP, DNS, PPPOE
Image Max Transmission Rate: 30 frame/second (QVGA), 15 frame/second (VGA)
Alert control: Output 1 router (5VDC, 0.1A); input: 1 router (closure Trigger)
Motion Detection: Support
Software Update: Users automatically upgrade
Monitor Mode: IE browse or special program
Playback Mode: Microsoft Media Player
Security: 3rd ranks password authority setting
Minimum illumination: 2.0Lux@550nm
Auto White Balance: Support
Working environment: -10C° – 50C°, 20% - 80%PH

The seller I ended up going with was savemoneyforyou in Hong Kong and I had it on my desk two weeks later which isn’t too bad given the price which obviously included bargain basement shipping. This particular seller was pretty good in answering a couple of questions very promptly and at the time of writing had 99.3% positive feedback from 4,654 buyers. That’s a pretty acceptable risk for 90 bucks.

What you get

You get a box (no name!):

image

And you get a whole bunch of bits inside including a power supply (with Australian socket converter and a ridiculously short cable), a mounting bracket with an extra base plate and adjustable thumbscrews, a mini CD, an antenna and of course, a camera:

image

You don’t get a manual, all the info is on CD which is just fine. What’s not so fine is the quality of information in the PDF manual. This is Engrish personified and in some cases it’s almost impossible to understand what’s being said. But hey, who ever reads documentation anyway?! One thing I did enjoy reading though was the mission statement on the final page:

image

Indeed.

Making it work

On the disk you get a little executable which allows you to find the device by browsing the network. Plug it into wired ethernet and by default, it grabs 192.168.1.126 and exposes the device over port 81.

image

Pointing IE to the IP gives you an ActiveX installer which then fires you into the video and voila, webcam running!

image

Other than the obvious video image, you also get controls to pan and tilt plus set the thing “on patrol” which means it spins around all over the place. Major geek-out factor when you first see it moving! Below the controls for resolution, refresh rate, brightness and contrast you also get the ability to record in AVI, snapshot in JPG, listen to audio or broadcast your own through the built-in speaker. The final cog icon at the bottom launches you into the configuration.

Configuration

You’ll find most of the usual IP appliance options here; device info, account configuration, wireless settings etc. You also get all the options to distribute output via means such as mail and FTP or trigger an alarm based on an event.

image

The only thing I really needed to configure was my time zone and wifi settings which I obviously did whilst still connected directly to the router by ethernet. However, no matter what I did I couldn’t get the device to connect while MAC address control was enabled or while the SSID wasn’t being broadcast. I’m not real happy about this and will persevere some more but at least the WPA configuration went smoothly.

Bundled software

Quite frankly, the less said about this the better; it’s a disaster zone. Other than the little IP Camera Finder app mentioned earlier on, everything else is just a train wreck of bad usability, total Windows app design inconsistency and functional flaws. Let me demonstrate; the following is a screen grab of the “IP Camera Super-Client”:

image

I have no idea what this does as it won’t load without throwing an exception. Maybe it’s not compatible with Win 7 x64 but I have no way of knowing because there’s no app specific documentation. Instead, you get sentences like this (take a deep breath before reading):

When finishing adding camera, once needs to do other settings, please select the camera, and right click to choose equipment setting and then set the following parameters, also when click one monitor screen, click , and set the camera, basic information, equipment parameters, alarm, record, action plans, additional information will appear;

And then there is no additional information. When you try to exit the app you’re prompted with “Be sure to close the procedure?”, after which you’ll get another exception and get to go through the whole process again. Resource Manager –> End Process.

But wait, there’s more; next up is the impressively named IP Camera Central Management System. It’s a bit hard for me to say exactly what this does because no matter what I did it couldn’t find the device. From the information in the manual, it appears to facilitate scheduling of events and auto recording of activity. What really worried me though was a certain odd behaviour when I was trying to get the thing to run. Look very carefully for anything unusual in the following image:

image

Yep, it’s a big freakin’ Mickey Mouse. My first reaction was “oh crap, I’ve just installed some Chinese software and now I have a Disney invader” but his behaviour seems pretty innocuous beyond this app. He only appears when running dual screens and moving focus off the app then he disappears a couple of seconds after the mouse stops. There’s no unusual network activity on Resource Monitor while this is going on and everything else on the machine seems to be business as usual so hopefully it’s simply either the world’s most poorly hidden Easter egg or professionalism and usability just mean something different where this thing was built. The final straw with this app is that you need to enter your password to exit it. C’mon, I’m trying to stop doing things, do I really need to authenticate?!

In short, don’t install the software. The browser based edition works just fine, at least for my purposes.

Image quality

There’s some interesting behaviour here so let me start with a control photo. The image below is my home office as seen through a Canon DSLR:

image

And here’s the IP cam in the same light:

image

Obviously you’re not going to get anywhere near the quality of a DSLR and the overexposure is no surprise but what I did find odd was the behaviour of the blacks. The throw rug and cushion on the couch plus the camera bag to the left of the chair all came out a light blue. Intriguingly though, the chair itself shows the full black appearance it naturally has. Even more intriguing was that as I sat in the chair, in my black shirt, the image I saw on the camera showed me in light blue. Here’s the same DSLR versus IP cam comparison:

image

Picture perfect image reproduction was never really important but this behaviour did take me by surprise. I can only assume it’s somehow related to the filters or infrared capabilities of the device. Which is a good segue to the next image. Here’s how the same shot looks with no light. And by “no light” I mean during the night with no sunlight creeping through the shutters and every single light turned off:

image

This image is a real pleasant surprise. I honestly didn’t expect a $90 “night vision” webcam to have anything like the clarity in the above shot. This is more than sufficient for the purposes of “baby spying” and clearly exposes every feature in the room. The only letdown is the red dot in the middle of the chair which is obviously a defect. Similarly, you can see what looks like a tiny white lens flare in the centre of the black part of the chair in the previous image. Bit of a shame but again, it’s hard to expect perfection from 90 bucks.

You start to get the gist of how enough light is generated to create the image above when you see this thing running in the dark from the front. A total of 10 red LEDs surrounding the lens light up once light levels drop sufficiently (the green LED indicates activity and is also visible under fully lit circumstances):

image

Having a bit of a trawl through the web, it seems that higher quality IR cameras have less visible light emitting from the diodes and that the appearance above is pretty typical of what the low entry price gets you. Higher quality LEDs apparently emit light at longer wavelengths whilst the cheaper versions, such as the one above, have shorter wavelengths which create the visible reddish or pinkish glow.

iPhone compatibility

photo1

This was important to me as both my wife and I have iPhones and we needed them to work while wandering around the house. The web interface comes with three different mechanisms for viewing the video feed:

  1. “ActiveX Mode” for use in IE which obviously uses an embedded control
  2. “Server Push Mode” for other browsers which pushes down a stream of JPG images
  3. “Mobile Phone” which just refreshes a JPG every few seconds

The mobile phone mode runs pretty well (see image to the right) but the controls are very clunky to use. I found Safari on the iPhone does a pretty good job of rendering the “Server Push Mode” anyway which pretty much makes the mobile version obsolete for my purposes.

Server push mode renders with all the controls surrounding it as seen in one of the first images above. Neither the “picture frame” effect nor the controls are really needed for the purposes of mobile viewing so I’ve just bookmarked the video stream itself directly which is at /videostream.cgi on the device.

Audio

Just don’t even bother. As soon as either the mic or the speaker is activated the frame rate drops dramatically. I suspect this is more to do with the processing power of the device than the increased bandwidth but either way, the end result is the video becomes pretty useless. It doesn’t really concern me as we have an audio baby monitor but it could be frustrating in other circumstances.

Let the baby spying begin!

So configuration complete I installed it just above the little guy’s bed at an angle that gave a pretty full picture of what’s going on. The camera simply remains turned on then once he gets put to bed we get a nice 640x480 view of the following:

image

This is just what I was after! The software sucks, the audio is dismal and I have no idea what the manual says, but to get a picture like the above in total darkness and transmitted wirelessly so that it’s consumable on PC or mobile and to do it all for only $90 is a very pleasing result.



Creating Subversion pre-commit hooks in .NET

Saturday, February 13, 2010

A while back I wrote about Creating your own custom Subversion management layer which involved rolling your own UI in .NET to perform common management tasks in SVN such as provisioning a repository or managing permissions. This is a great way of quickly and easily giving users a self-service mechanism for managing their own repositories in a controlled, secure fashion.

Continuing the theme of customising SVN to do your bidding I thought I’d share some info on commit hooks. There are a heap of examples out there in Python and Perl but not much in the .NET realm so hopefully this will make someone’s life a little easier.

As with the previous blog post, all the info in this post relates to a VisualSVN Server instance of Subversion. Having said that, there’s nothing specific to this particular SVN distribution so the concepts and code snippets should be equally relevant to any others.

About commit hooks

A common SVN practice, and a good use case for writing a bit of code, is to create event hooks which fire at certain points in the transaction lifecycle of a commit to a repository. Valid transaction lifecycle events include start-commit (before a transaction is created), pre-commit (the transaction is complete but not committed) and post-commit (the transaction is committed and a new revision has been created). There are half a dozen other hook events relating to different server events but we’ll ignore them for the purpose of this post.

Let’s look a bit more into what the SVN book has to say about the pre-commit hook:

This is run when the transaction is complete, but before it is committed. Typically, this hook is used to protect against commits that are disallowed due to content or location (for example, your site might require that all commits to a certain branch include a ticket number from the bug tracker, or that the incoming log message is non-empty). The repository passes two arguments to this program: the path to the repository, and the name of the transaction being committed. If the program returns a non-zero exit value, the commit is aborted and the transaction is removed. If the hook program writes data to stderr, it will be marshalled back to the client.

By far the most common use case for pre-commit hooks is ensuring a revision is accompanied by an appropriate comment. “Appropriate” may simply mean characters must exist or it could lay out some rules for what constitutes an acceptable pattern. Another common use case is to prohibit certain file patterns. For example, the Thumbs.db file really shouldn’t exist in your repository and you may want to enforce its exclusion.
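As a rough sketch, those two rules could be expressed in C# along the following lines. The `CommitRules` class and its method names are hypothetical, and the "at least one letter or digit" test simply mirrors the grep check in the stock pre-commit template. The sample paths in `Main` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CommitRules
{
    // Rule 1: the log message must contain at least one letter or digit.
    public static bool HasComment(string logMessage)
    {
        return !string.IsNullOrEmpty(logMessage)
            && logMessage.Any(char.IsLetterOrDigit);
    }

    // Rule 2: return every path in the commit whose file name is Thumbs.db.
    public static IEnumerable<string> ProhibitedPaths(IEnumerable<string> paths)
    {
        return paths.Where(p => string.Equals(
            Path.GetFileName(p), "Thumbs.db", StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        // Hypothetical repository paths, purely for demonstration.
        var paths = new[] { "trunk/site/Default.aspx", "trunk/site/images/Thumbs.db" };

        Console.WriteLine(HasComment("Fixed the search page encoding"));
        Console.WriteLine(HasComment("   "));

        foreach (var path in ProhibitedPaths(paths))
        {
            Console.WriteLine("Prohibited: " + path);
        }
    }
}
```

Keeping the rules as plain functions over strings means they can be exercised without a repository; where the log message and paths actually come from is a separate concern.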

How a commit hook is invoked

As with most things Subversion, it’s pretty simple. If a correctly named executable or script exists in the “hooks” folder of the repository it will be executed at the appropriate time in the transaction. Here you can see a series of template files automatically created by SVN when the repository is provisioned:

image

Each template file has a sample Unix script inside so if you’re running in that environment you can just drop the .tmpl extension, give it execute rights and you’re good to go. I’m going to write .NET based hooks in this example but just for a sense of what comes out of the box, here’s the contents of the pre-commit.tmpl file:

REPOS="$1"
TXN="$2"

# Make sure that the log message contains some text.
SVNLOOK=/usr/local/bin/svnlook
$SVNLOOK log -t "$TXN" "$REPOS" | \
   grep "[a-zA-Z0-9]" > /dev/null || exit 1

# Check that the author of this commit has the rights to perform
# the commit on the files and directories being modified.
commit-access-control.pl "$REPOS" "$TXN" commit-access-control.cfg || exit 1

# All checks passed, so allow the commit.
exit 0

There are two main behaviours this script has in common with the one I’ll write in this post:

  1. There are two arguments being passed to the script declared as REPOS and TXN.
  2. An exit code is returned: 1 for a failure or 0 for success.
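In .NET terms, those two behaviours map onto the `args` array and the integer returned from `Main`. The following is only a sketch of that contract: the `PreCommitHook` class, the `IsAcceptable` placeholder and the message text are assumptions, not part of SVN or of the solution built in this post.

```csharp
using System;

public static class PreCommitHook
{
    // Placeholder for the real checks; always accepts in this sketch.
    private static bool IsAcceptable(string repos, string txn)
    {
        return true;
    }

    // Returns the process exit code: 0 allows the commit, non-zero rejects it.
    public static int Run(string[] args)
    {
        if (args.Length < 2)
        {
            // Anything written to stderr is relayed back to the committing client.
            Console.Error.WriteLine("Expected two arguments: REPOS and TXN.");
            return 1;
        }

        var repos = args[0]; // path to the repository
        var txn = args[1];   // name of the transaction being committed

        if (!IsAcceptable(repos, txn))
        {
            Console.Error.WriteLine("Commit rejected by the pre-commit hook.");
            return 1;
        }

        return 0;
    }

    public static int Main(string[] args)
    {
        return Run(args);
    }
}
```

Declaring `Main` with an `int` return type is what lets the console app hand an exit code back to SVN.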

Creating the solution and reading the args

As mentioned above, all we need is an executable so let’s just build a console app called “SvnPreCommitHooks”:

image

You’ll get a Program.cs file to act as the entry point to the console app. Let’s start off by following the pattern in the template file above and declaring variables for the two arguments that will be passed to the hook:

namespace SvnPreCommitHooks
{
  class Program
  {
    static void Main(string[] args)
    {
      // SVN passes the repository path first, then the transaction name.
      var repos = args[0];
      var txn = args[1];
    }
  }
}

The repos argument is the full path of the repository which going by the folder structure in the grab above will be c:\Repositories\BlogTemplate. The txn argument is the commit transaction name and this is something we need to understand a little bit more about before we proceed.

Understanding transactions

The commit transaction name is generated by SVN and comprises the next revision number and the transaction number in a format such as “3-6” (revision 3, transaction 6). If the transaction succeeds and becomes a new revision then the next txn value will be “4-7”. If it fails then no revision will be created and the next txn value will be “3-7”. The Subversion book explains the process in Understanding Transactions and Revisions:

Every revision begins life as a transaction tree. When doing a commit, a client builds a Subversion transaction that mirrors their local changes (plus any additional changes that might have been made to the repository since the beginning of the client's commit process), and then instructs the repository to store that tree as the next snapshot in the sequence. If the commit succeeds, the transaction is effectively promoted into a new revision tree, and is assigned a new revision number. If the commit fails for some reason, the transaction is destroyed and the client is informed of the failure.
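The hook never needs to pick the transaction name apart as it simply hands the value to svnlook, but to make the naming convention concrete, here's a small sketch that splits a name into its two parts. It assumes the two-part, hyphen-separated format described above always holds; TxnName and TrySplit are hypothetical names used purely for illustration:

```csharp
using System;

static class TxnName
{
    // Splits a name such as "3-6" at the first hyphen. Both halves are
    // returned as strings because how each part is encoded is an
    // assumption here, not something the hook relies on.
    public static bool TrySplit(string txn, out string revisionPart, out string transactionPart)
    {
        revisionPart = null;
        transactionPart = null;
        if (string.IsNullOrEmpty(txn))
        {
            return false;
        }
        var dash = txn.IndexOf('-');
        if (dash <= 0 || dash == txn.Length - 1)
        {
            return false; // no hyphen, or nothing on one side of it
        }
        revisionPart = txn.Substring(0, dash);
        transactionPart = txn.Substring(dash + 1);
        return true;
    }
}
```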

You really don’t need to understand the inner workings of SVN transactions to build hooks but one important point to comprehend is that in order to reject a commit at the pre-commit event we still need to create an entire transaction on the server. This means transferring all the intended commit content to the server which, depending on the data being committed, may mean waiting while large volumes of information are transferred. If the commit is then rejected, you’ll need to remedy the root cause and go through the entire process again!

Using svnlook to inspect the transaction

So the pre-commit arguments tell us where the repository is and what transaction we need to inspect before promoting it as a new revision. The next thing we need to do is inspect the transaction so we can understand exactly what it is the author is trying to put into the repository. This is where svnlook comes in.

The svnlook command comes packaged with VisualSVN Server and its location is added to the “path” system environment variable on install so it can be invoked from any directory on the system. To get an idea of the sort of information this command can retrieve, here’s what it reports for the last log message and then the last changed paths in the BlogTemplate repository:

image

The log message is pretty straightforward but let’s look further at the changed paths. What we’ve got here is a collection of rows with each one specifying an action (“D” for deleted, “U” for updated and “A” for added) followed by the path of the file within the repository. In the example above I’ve ditched the .suo file (solution user options shouldn’t be maintained under source control as they’re user specific) and updated the “Blogger Template.xml” file. The commands above have only specified a repository path argument for the respective subcommands but another valid argument is the transaction which is what we’re going to look at next.

Back to the console app; we’ve got the repos and txn arguments and we now know what svnlook can do with them so let’s tie it all together:

private static string GetSvnLookOutput(string repos, string txn, string subcommand)
{
  // Requires "using System.Diagnostics;" at the top of Program.cs.
  var processStartInfo = new ProcessStartInfo
  {
    FileName = "svnlook.exe",
    UseShellExecute = false,
    CreateNoWindow = true,
    RedirectStandardOutput = true,
    RedirectStandardError = true,
    Arguments = String.Format("{0} -t \"{1}\" \"{2}\"", subcommand, txn, repos)
  };

  using (var process = Process.Start(processStartInfo))
  {
    // Drain stderr in the background so a full error buffer can't block svnlook.
    process.BeginErrorReadLine();
    var output = process.StandardOutput.ReadToEnd();
    process.WaitForExit();
    return output;
  }
}

The subcommand parameter is going to allow us to reuse the method for both “log” to get the commit comment and “changed” to get the files and actions. The ProcessStartInfo class allows us to reference both the svnlook command and the parameters after which the Process class will run svnlook and give us back the output in a string. We’ll go back up to the entry point of the console app and save the output to variables:

var log = GetSvnLookOutput(repos, txn, "log");
var changedPaths = GetSvnLookOutput(repos, txn, "changed");

Validating the log message

We’re going to say a valid log message must contain at least 20 characters and 5 words so let’s encapsulate that within another method. As the message could be invalidated by two different conditions we’re going to give the user a bit of information about what went wrong rather than just returning a boolean. No errors will mean a null message:

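private static string GetLogMessageErrors(string log)
{
  // svnlook output ends with a line break, so trim before measuring.
  var message = log.Trim();
  if(message.Length < 20)
  {
    return "Message is less than 20 characters.";
  }
  // Split on any whitespace and drop empty entries so repeated spaces
  // or line breaks aren't counted as words.
  var words = message.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
  if(words.Length < 5)
  {
    return "Message is less than 5 words.";
  }
  return null;
}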

Validating the changed paths

Moving on to the changed paths, cast your mind back to the svnlook output further up. We’ll start by removing the trailing line return with a TrimEnd then split the changedPaths string into lines based on Environment.NewLine. Each row then consists of a character to represent the change type, 3 spaces, then the changed path which obviously contains the file name. Let’s extract all these out into variables then apply a little conditional logic:

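private static string GetFileNameErrors(string changedPaths)
{
  // Requires "using System.IO;" and "using System.Text.RegularExpressions;".
  var changeRows = Regex.Split(changedPaths.TrimEnd(), Environment.NewLine);
  foreach (var changeRow in changeRows)
  {
    // Skip anything too short to hold a change type, the padding and a path.
    if (changeRow.Length < 5)
    {
      continue;
    }
    var changeType = changeRow[0];
    var filePath = changeRow.Substring(4);
    var fileName = Path.GetFileName(filePath);
    // Deletions are allowed so an existing Thumbs.db can still be removed.
    if(changeType != 'D' && fileName == "Thumbs.db")
    {
      return "Thumbs.db file was found.";
    }
  }
  return null;
}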

One thing worth noting here is that we’re only going to return an error if the change type is not “D” for “Delete”. Although deleting a Thumbs.db file should be impossible if the hooks are applied to a brand new repository (the file could never have been added in the first place), we may well apply the pre-commit hook to an existing repository. In this scenario the hook would still prohibit adding or changing the Thumbs.db file but obviously we’d like to retain the capability to remove it.
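If more than one file name ever needs blocking, the same “anything but a delete” rule can be driven from a list. This is only a sketch under my own assumptions: BlockedFiles, FindFirst and the blockedNames collection are hypothetical names, and the rows are assumed to follow the svnlook layout described earlier (change type, padding, then the path from the fifth character):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class BlockedFiles
{
    // Returns the first added or updated path whose file name appears in
    // blockedNames, or null when every row is acceptable. Deletions are
    // skipped so unwanted files can still be removed from the repository.
    public static string FindFirst(IEnumerable<string> changeRows, ICollection<string> blockedNames)
    {
        foreach (var row in changeRows)
        {
            // A usable row needs the change type, the padding and a path.
            if (row == null || row.Length < 5 || row[0] == 'D')
            {
                continue;
            }
            var path = row.Substring(4);
            if (blockedNames.Contains(Path.GetFileName(path)))
            {
                return path;
            }
        }
        return null;
    }
}
```

Passing in a HashSet<string> built with StringComparer.OrdinalIgnoreCase would also catch variations such as “thumbs.db”, which may or may not be what you want depending on the clients committing to the repository.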

Exiting the hook

That’s pretty much all the legwork done; we now just need to invoke the two methods created above and exit the program with a message and either a success or failure:

var logValidation = GetLogMessageErrors(log);
if(logValidation != null)
{
  Console.Error.WriteLine(logValidation);
  Environment.Exit(1);
}

var changedPathsValidation = GetFileNameErrors(changedPaths);
if (changedPathsValidation != null)
{
  Console.Error.WriteLine(changedPathsValidation);
  Environment.Exit(1);
}

Environment.Exit(0);

It’s pretty obvious from the above but an exit code of 1 will tell SVN to roll back the transaction while 0 indicates success. In both the error states I’ve written a bit of information about what went wrong to the standard error stream, which SVN relays back to the client, to try and help the user rectify things on the next go.
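Given a rejected commit means transferring everything to the server again, one variation worth considering is reporting every problem in one go rather than stopping at the first. The following is a sketch of that idea and not what the code above does; HookResult and Report are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class HookResult
{
    // Writes every non-null error to the supplied writer and returns the
    // exit code the hook should finish with: 1 if anything failed, else 0.
    public static int Report(IEnumerable<string> errors, TextWriter stderr)
    {
        var exitCode = 0;
        foreach (var error in errors)
        {
            if (error == null)
            {
                continue; // null means that particular check passed
            }
            stderr.WriteLine(error);
            exitCode = 1;
        }
        return exitCode;
    }
}
```

In Main this could replace the two if blocks with a single call along the lines of Environment.Exit(HookResult.Report(new[] { logValidation, changedPathsValidation }, Console.Error)).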

Joining all the dots

All we need to do now is compile the project and get the executable invoked on pre-commit. To save on redundancy, we’re going to place the compiled SvnPreCommitHooks.exe directly into the c:\Repositories folder then create a little bootstrapper to place in each repository’s hooks folder. Let’s save the following as pre-commit.cmd:

C:\Repositories\SvnPreCommitHooks.exe %1 %2

The command file can now be replicated across as many repositories as required without increasing maintenance burden should the console app need to change. It’s also very easy to tie the creation of this file into the custom repository provisioning process I referenced right at the start of this post so that all new repositories can take advantage of the hook.
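As a rough sketch of how that replication might be automated, the method below writes the bootstrapper into the hooks folder of every repository directly under a root folder. HookProvisioner and WriteBootstrappers are hypothetical names, and I’m assuming that any subfolder containing a “hooks” folder is a repository:

```csharp
using System;
using System.IO;

static class HookProvisioner
{
    // Writes pre-commit.cmd into the hooks folder of every repository found
    // directly under repositoriesRoot. Returns the number of files written.
    public static int WriteBootstrappers(string repositoriesRoot, string hookExePath)
    {
        var written = 0;
        foreach (var repoDir in Directory.GetDirectories(repositoriesRoot))
        {
            var hooksDir = Path.Combine(repoDir, "hooks");
            if (!Directory.Exists(hooksDir))
            {
                continue; // assumed not to be a repository
            }
            var cmdPath = Path.Combine(hooksDir, "pre-commit.cmd");
            // Quote the executable path in case it ever contains spaces.
            File.WriteAllText(cmdPath, "\"" + hookExePath + "\" %1 %2" + Environment.NewLine);
            written++;
        }
        return written;
    }
}
```

Calling HookProvisioner.WriteBootstrappers(@"C:\Repositories", @"C:\Repositories\SvnPreCommitHooks.exe") would overwrite any existing pre-commit.cmd, so treat it with care on repositories that already have their own hooks.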

Testing the hook

This is the fun part! Let’s run through all the test cases using TortoiseSVN:

No message with one valid file and one invalid file:

image

Fail with an error notification about the message being too short:

image

Less than 5 words with one valid file and one invalid file:

image

Fail with an error notification about an insufficient number of words:

image

Valid message with one valid file and one invalid file:

image

Fail with an error notification about a disallowed file:

image

Valid message with one valid file:

image

Success!

image

Wrapup

This has been an intentionally simple illustration, intended more to point the .NET reader in the right direction than to illustrate the full potential of SVN hooks. A more applicable real world solution might involve a data driven set of rules or greater integration with other systems such as bug trackers or change logs. There might also be author or even time of day based rules depending on the requirement.

Hopefully this is enough to cover the SVN idiosyncrasies and from here on in it’s all business as usual .NET code wise. SVN is a fantastically versatile product and a few little tweaks like this can both increase the value of the tool and make for a more productive source control experience.



Disclaimer

Opinions expressed here are my own and may not reflect those of my employer, my colleagues, my mates, my wife and so on and so forth. Unless I’m quoting someone, they’re my own opinions and may not necessarily be cohesive nor entertaining but hey, at least they’re original!