Ah source control, if there’s a more essential tool which indiscriminately spans programming languages without favour, I’m yet to see it. It’s an essential component of how so many of us work; the lifeblood of many development teams, if you like. So why do we often get it so wrong? Why are some of the really core fundamentals of version control systems often so poorly understood?
I boil it down to 10 practices – or “commandments” if you like – which often break down or are not properly understood to begin with. These are all relevant to version control products of all types and programming languages of all flavours. I’ll pick some examples from Subversion and .NET but they’re broadly applicable to other technologies.
1. Stop right now if you’re using VSS – just stop it!
It’s dead. Let it go. No really, it’s been on life support for years, taking its dying gasps as younger and fitter VCS tools have rocketed past it. And now it’s really seriously about to die as Microsoft finally pulls the plug next year (after several stays of execution).
In all fairness, VSS was a great tool. In 1995. It just simply got eclipsed by tools like Subversion then the distributed guys like Git and Mercurial. Microsoft has clearly signalled its intent to supersede it for many years now – the whole TFS thing wasn’t exactly an accident!
The point is that VSS is very broadly, extensively, almost unanimously despised due to a series of major shortcomings by today’s standards. Colloquially known as Microsoft’s source destruction system, somehow it manages to just keep clinging on to life despite extensively documented glitches and shortcomings, and despite essential functionality (by today’s standards) which simply doesn’t exist.
2. If it’s not in source control, it doesn’t exist
Repeat this mantra daily – “The only measure of progress is working code in source control”. Until your work makes an appearance in the one true source of code truth – the source control repository for the project – it simply doesn’t exist.
Sure, you’ve got it secreted away somewhere on your local machine but that’s not really doing anyone else any good now, is it? They can’t take your version, they can’t merge theirs, you can’t deploy it (unless you’re deploying it wrong) and you’re one SSD failure away from losing it all permanently.
Once you take the mindset of it not existing until it’s committed, a whole bunch of other good practices start to fall into place. You break tasks into smaller units so you can commit atomically. You integrate more frequently. You insure yourself against those pesky local hardware failures.
But more importantly (at least for your team lead), you show that you’re actually producing something. Declining burn down charts or ticked off tasks lists are great, but what do they actually reconcile with? Unless they correlate with working code in source control, they mean zip.
3. Commit early, commit often and don’t spare the horses
Further to the previous point, the only way to avoid “ghost code” – that which only you can see on your local machine – is to get it into VCS early and often and don’t spare the horses. Addressing the issues from the previous point is one thing the early and often approach achieves, but here are a few others which can make a significant difference to the way you work:
- Every committed revision gives you a rollback position. If you screw up fundamentally (don’t lie, we all do!), are you rolling back one hour of changes or one week?
- The risk of a merge nightmare increases dramatically with time. Merging is never fun. Ever. When you’ve not committed code for days and you suddenly realise you’ve got 50 conflicts with other people's changes, you’re not going to be a happy camper.
- It forces you to isolate features into discrete units of work. Let’s say you’ve got a 3 man day feature to build. Oftentimes people won’t commit until the end of that period because they’re trying to build the whole box and dice into one logical unit. Of course a task as large as this is inevitably comprised of smaller, discrete functions and committing frequently forces you to identify each of these, build them one by one and commit them to VCS.
When you work this way, your commit history inevitably starts to resemble a semi-regular pattern of multiple commits each work day. Of course it’s not always going to be a consistent pattern, there are times we stop and refactor or go through testing phases or any other manner of perfectly legitimate activities which interrupt the normal development cycle.
However, when I see an individual – and particularly an entire project – where I know we should be in a normal development cycle and there are entire days or even multiple days where nothing is happening, I get very worried. I’m worried because as per the previous point, no measurable work has been done but I’m also worried because it usually means something is wrong. Often development is happening in a very “boil the ocean” sort of way (i.e. trying to do everything at once) or absolutely nothing of value is happening at all because people are stuck on a problem. Either way, something is wrong and source control is waving a big red flag to let you know.
4. Always inspect your changes before committing
Committing code into source control is easy – too easy! (Makes you wonder why the previous point seems to be so hard.) Anyway, what you end up with is changes and files being committed with reckless abandon. “There’s a change somewhere beneath my project root – quick – get it committed!”
What happens is one (or both) of two things: Firstly, people inadvertently end up with a whole bunch of junk files in the repository. Someone sees a window like the one below, clicks “Select all” and bingo – the repository gets polluted with things like debug folders and other junk that shouldn’t be in there.
Or secondly, people commit files without checking what they’ve actually changed. This is real easy to do once you get things like configuration or project definition files where there is a lot going on at once. It makes it really easy to inadvertently put things into the repository that simply weren’t intended to be committed and then of course they’re quite possibly taken down by other developers. Can you really remember everything you changed in that config file?
The solution is simple: you must inspect each change immediately before committing. This is easier than it sounds, honest. The whole “inadvertently committed file” thing can be largely mitigated by using the “ignore” feature many systems implement. You never want to commit the Thumbs.db file so just ignore it and be done with it. You also may not want to commit every file that has changed in each revision – so don’t!
As for changes within files, you’ve usually got a pretty nifty diff function in there somewhere. Why am I committing that Web.config file again?
Ah, I remember now, I wanted to decrease the maximum invalid password attempts from 5 down to 3. Oh, and I played around with a dummy login page which I definitely don’t want to put into the repository. This practice of pre-commit inspection also makes it much easier when you come to the next section…
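If your VCS client doesn't put a diff in front of you, the same pre-commit check can be sketched in a few lines of Python with the standard difflib module. This is only an illustration: the function name pending_changes and the sample config contents are made up for the example, and a real client would compare against the last committed revision for you.

```python
import difflib


def pending_changes(committed_text, working_text, name="Web.config"):
    """Return only the removed/added lines between two versions of a file."""
    diff = difflib.unified_diff(
        committed_text.splitlines(),
        working_text.splitlines(),
        fromfile=name + " (committed)",
        tofile=name + " (working copy)",
        lineterm="",
        n=0,  # no context lines, just the changes themselves
    )
    return [
        line
        for line in diff
        # keep "-" and "+" lines, skip the "---" / "+++" file header lines
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Hypothetical before/after contents of a config file
committed = '<membership>\n  <add name="p" maxInvalidPasswordAttempts="5" />\n</membership>'
working = '<membership>\n  <add name="p" maxInvalidPasswordAttempts="3" />\n</membership>'

for line in pending_changes(committed, working):
    print(line)
```

Two lines come back, one removed and one added, which is exactly the sort of reminder you want before you write the commit message.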
5. Remember the axe-murderer when writing commit messages
There’s an old adage (source unknown), along the lines of “Write every commit message like the next person who reads it is an axe-wielding maniac who knows where you live”. If I was that maniac and I’m delving through reams of your code trying to track down a bug and all I can understand from your commit message is “updated some codes”, look out, I’m coming after you!
The whole idea of commit messages is to explain why you committed the code. Every time you make any change to code, you’re doing it for a reason. Maybe something was broken. Maybe the customer didn’t like the colour scheme. Maybe you’re just tweaking the build configuration. Whatever it is, there’s a reason for it and you need to leave this behind you.
Why? Well there are a few different reasons and they differ depending on the context. For example, using a “blame” feature or other similar functionality which exposes who changed what and hopefully, why. I can’t remember what I was doing in the Web.config of this project 18 months ago or why I was mucking around with app settings, but because I left a decent commit message, it all becomes very simple:
It’s a similar thing for looking at changes over time. Whether I want to see the entire history of a file, like below, or I just want to see what the team accomplished yesterday, having a descriptive paper trail of comments means it doesn’t take much more than a casual glance to get an idea of what’s going on.
And finally, commit messages are absolutely invaluable when it comes to tracking down errors. For example, getting to the bottom of why the build is breaking in your continuous integration environment. Obviously my example is overtly obvious, but the point is that bringing this information to the surface can turn tricky problems into absolute no-brainers.
With this in mind, here are some commit message anti-patterns:
- Some shit.
- It works!
- fix some fucking errors
- fix
- Fixed a little bug...
- Updated
- typo
- Revision 1024!!
Ok, I picked these all out of the Stack Overflow question “What is the WORST commit message you have ever authored” but the thing is that none of them are that dissimilar to many of the messages I’ve seen in the past. They tell you absolutely nothing about what has actually happened in the code; they’re junk messages.
One last thing about commit messages; subsequent commit messages from the same author should never be identical. The reason is simple: you’re only committing to source control because something has changed since the previous version. Your code is now in a different state to that previous version and if your commit message is accurate and complete, it logically cannot be identical. Besides, if it was identical (perhaps there’s a legitimate edge-case there somewhere), the log is now a bit of a mess to read as there’s no way to discern the difference between the two commits.
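The two rules above (say why, and never repeat the previous message) are simple enough to check mechanically. Here's a rough Python sketch; the function name, the junk list and the three-word minimum are all arbitrary assumptions for illustration, and no heuristic like this can tell you whether a message is actually any good.

```python
# Messages (lower-cased, trailing punctuation stripped) that explain nothing
JUNK_MESSAGES = {"fix", "fixed", "typo", "updated", "update", "it works"}


def message_problems(message, previous_message=None, min_words=3):
    """Return a list of reasons to reject a commit message (empty list = fine)."""
    problems = []
    text = message.strip()
    if not text:
        problems.append("empty message")
    elif text.lower().rstrip(".!") in JUNK_MESSAGES or len(text.split()) < min_words:
        problems.append("message does not explain why the change was made")
    # Consecutive commits describe different changes, so the text should differ
    if text and previous_message is not None and text == previous_message.strip():
        problems.append("identical to the previous commit message")
    return problems


print(message_problems("fix"))
print(message_problems("Decrease max invalid password attempts from 5 to 3"))
```

A check like this could sit in a client-side script or a server hook; either way it only catches the laziest messages and the rest is still up to you.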
6. You must commit your own changes – you can’t delegate it
As weird as this sounds, it happens and I’ve seen it more than once, most recently just last week. What’s happening here is that the source control repository is being placed on a pedestal. For various reasons, the team is viewing it as this sanitised, pristine environment of perfect code. In order to maintain this holy state, code is only committed by a lead developer who carefully aggregates, reviews and (assumedly) tweaks and improves the code before it’s committed.
It’s pretty easy to observe this pattern from a distance. Very infrequent commits (perhaps weekly), only a single author out of a team with multiple developers and inevitably, conflict chaos if anyone else has gone near the project during that lengthy no-commit period. Very, very nasty stuff.
There are two major things wrong here: Firstly, source control is not meant to be this virginal, unmolested stash of pristine code; at least not throughout development cycles. It’s meant to be a place where the team integrates frequently, rolls back when things go wrong and generally comes together around a single common base. It doesn’t have to be perfect throughout this process, it only has to (try to) achieve that state at release points in the application lifecycle.
The other problem – and this is the one that really blows me away – is that from the developer’s perspective, this model means you have no source control! It means no integration with code from peers, no rollback, no blame log, no nothing! You’re just sitting there in your little silo writing code and waiting to hand it off to the boss at some arbitrary point in the future.
Don’t do this. Ever.
7. Versioning your database isn’t optional
This is one of those ones that everyone knows they should be doing but very often they just file it away in the “too hard” basket. The problem you’ve got is that many (most?) applications simply won’t run without their database. If you’re not versioning the database, what you end up with is an incomplete picture of the application which in practice is rendered entirely useless.
Most VCS systems work by simply versioning files on the file system. That’s just fine for your typical app files like HTML pages, images, CSS, project configuration files and anything else that sits on the file system in nice discrete little units. Problem is that’s not quite the way relational databases work. Instead, you end up with these big old data and log files which encompass a whole bunch of different objects and data. This is pretty messy stuff when it comes to version control.
What changes the proposition of database versioning these days is the accessibility of tools like the very excellent SQL Source Control from Red Gate. I wrote about this in detail last year in the post about Rocking your SQL Source Control world with Red Gate so I won’t delve into the detail again; suffice to say that database versioning is now easy!
Honestly, if you’re not versioning your databases by now you’re carrying a lot of risk in your development for no good reason. You have no single source of truth, no rollback position and no easy collaboration with the team when you make changes. Life is just better with the database in source control :)
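If a dedicated tool isn't an option, a different and much cruder technique gets the schema into plain text files that any VCS can handle: keep each schema change as an ordered script and record in the database which ones have been applied. To be clear, this is not how SQL Source Control works; it's a minimal sketch using Python's built-in sqlite3 module, with a made-up users table and a hypothetical migrate function, and it uses SQLite's user_version pragma as the version marker.

```python
import sqlite3

# Ordered schema changes; in practice each would be a file in the repository.
# The table and columns are invented for this example.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE users ADD COLUMN email TEXT",
]


def migrate(conn):
    """Apply any migrations newer than the version stored in the database."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, sql in enumerate(MIGRATIONS, start=1):
        if version > current:
            conn.execute(sql)
            # PRAGMA values cannot be bound as parameters; version is an int
            conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
print(migrate(conn))  # schema version after the first run
print(migrate(conn))  # running it again applies nothing new
```

The appeal is that the scripts diff, merge and roll back like any other source file, which is the whole point of the commandment.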
8. Compilation output does not belong in source control
Here’s an easy way of thinking about it: nothing that is automatically generated as a result of building your project should be in source control. For the .NET folks, this means pretty much everything in the “bin” and “obj” folders which will usually be .dll and .pdb files.
Why? Because if you do this, your co-workers will hate you. It means that every time they pull down a change from VCS they’re overwriting their own compiled output with yours. This is both a merge nightmare (you simply can’t do it), plus it may break things until they next recompile. And then once they do recompile and recommit, the whole damn problem just gets repeated in the opposite direction and this time you’re on the receiving end. Kind of serves you right, but this is not where we want to be.
Of course the other problem is that it’s just wasteful. It’s wasted on the source control machine disk, it’s wasted in bandwidth and additional latency every time you need to send it across the network and it’s sure as hell a waste of your time every time you’ve got to deal with the inevitable conflicts that this practice produces.
So we’re back to the “ignore” patterns mentioned earlier on. Once paths such as “bin” and “obj” are set to ignore, everything gets really, really simple. Do it once, commit the rule and everybody is happy.
In fact I’ve even gone so far as to write pre-commit hooks that execute on the VCS server just so this sort of content never makes it into source control to begin with. Sure, it can be a little obtrusive getting your hand slapped by VCS but, well, it only happens when you deserve it! Besides, I’d far rather put the inconvenience back on the perpetrator than pass it on to the entire team by causing everyone to have conflicts when they next update.
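For the curious, the core of such a hook can be sketched in Python. This is not the hook I actually run, just an illustration assuming Subversion, where a pre-commit hook is handed the repository path and transaction name and "svnlook changed -t TXN REPOS" lists what the transaction touches, one status-and-path line per entry. The function names and the blocked folder list are assumptions for the example.

```python
import subprocess

BLOCKED_DIRS = {"bin", "obj"}  # build output folders to keep out of the repository


def blocked_paths(changed_lines):
    """Return added/updated paths that sit inside a blocked folder."""
    offenders = []
    for line in changed_lines:
        # svnlook prints the status in the first columns, the path from column 5
        status, path = line[:4].strip(), line[4:].strip()
        if not path or status.startswith("D"):
            continue  # deleting build output is welcome, so let it through
        segments = [s.lower() for s in path.split("/") if s]
        # For a file only its parent folders count; a folder path ends in "/"
        folders = segments if path.endswith("/") else segments[:-1]
        if BLOCKED_DIRS.intersection(folders):
            offenders.append(path)
    return offenders


def changed_in_transaction(repos, txn):
    """List the changes in a pending Subversion transaction via svnlook."""
    result = subprocess.run(
        ["svnlook", "changed", "-t", txn, repos],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


# Example with hand-written svnlook-style lines (no repository needed)
sample = ["A   trunk/Web/bin/Web.dll", "U   trunk/Web/Web.config"]
print(blocked_paths(sample))
```

A real hook script would call changed_in_transaction with its two arguments, write the offending paths to stderr and exit with a non-zero code so that the commit is rejected.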
9. Nobody else cares about your personal user settings
To be honest, I think that quite often people aren’t even aware they’re committing their own personal settings into source control. Here’s what the problem is: many tools will produce artefacts which manage your own personal, local configurations. They’re only intended to be for you and they’ll usually be different to everyone else's. If you put them into VCS, suddenly you’re all overwriting each other’s personal settings. This is not good.
Here’s an example of a typical .NET app:
The giveaway should be the extensions and type descriptions but in case it’s not immediately clear, the .ReSharper.user file and the .suo (Solution User Options) file are both, well, yours. They’re nobody else's.
Here’s why: Let’s take a look inside the ReSharper file:
<Configuration>
  <SettingsComponent>
    <string />
    <integer />
    <boolean>
      <setting name="SolutionAnalysisEnabled">True</setting>
    </boolean>
  </SettingsComponent>
  <RecentFiles>
    <RecentFiles>
      <File id="F985644D-6F99-43AB-93F5-C1569A66B0A7/f:Web.config" caret="1121" fromTop="26" />
      <File id="F985644D-6F99-43AB-93F5-C1569A66B0A7/f:Site.Master.cs" caret="0" fromTop="0" />
In this example, the fact that I enabled solution analysis is recorded in the user file. That’s fine by me, I like it, other people don’t. Normally because they’ve got an aging, bargain basement PC, but I digress. The point is that this is my setting and I shouldn’t be forcing it upon everyone else. It’s just the same with the recent files node; just because I recently opened these files doesn’t mean it should go into someone else’s ReSharper history.
Amusing sidenote: the general incompetence of VSS means ignoring .ReSharper.user files is a bit of a problem.
It’s a similar story with the .suo file. Whilst there’s not much point looking inside it (no pretty XML here, it’s all binary), the file records things like the state of the solution explorer, publishing settings and other things that you don’t want to go forcing on other people.
So we’re back to simply ignoring these patterns again. At least if you’re not running VSS, that is.
10. Dependencies need a home too
This might be the last of the Ten Commandments but it’s a really, really important one. When an app has external dependencies which are required for it to successfully build and run, get them into source control! The problem people tend to have is that they get everything behaving real nice in their own little environment with their own settings and their own local dependencies then they commit everything into source control, walk away and think things are cool. And they are, at least until someone else who doesn’t have the same local dependencies available pulls it down and everything fails catastrophically.
I was reminded of this myself today when I pulled an old project out of source control and tried to build it:
I’d worked on the assumption that NUnit would always be there on the machine but this time that wasn’t the case. Fortunately the very brilliant NuGet bailed me out quickly, but it’s not always that easy and it does always take some fiddling when you start discovering that dependencies are missing. In some cases, they’re not going to be publicly available and it can be downright painful trying to track them down.
I had this happen just recently where I pulled down a project from source control, went to run it and discovered there was a missing assembly located in a path that began with “c:\Program Files…”. I spent literally hours trying to track down the last guy who worked on this (who of course was on the other side of the world), get the assembly, put it in a “Libraries” folder in the project and actually get it into VCS so the next poor sod who comes across the project doesn’t go through the same pain.
Of course the other reason this is very important is that if you’re working in any sort of continuous integration environment, your build server isn’t going to have these libraries installed. Or at least you shouldn’t be dependent on it. Doug Rathbone made a good point about this recently when he wrote about Third party tools live in your source control. It’s not always possible (and we had some good banter to that effect), but it’s usually a pretty easy proposition.
So do everyone a favour and make sure that everything required for your app to actually build and run is in VCS from day 1.
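One cheap way to spot the “c:\Program Files” style of problem before it bites someone else is to scan the project file for references that point outside the source tree. The Python sketch below is an illustration only: it assumes an older-style .csproj that uses the MSBuild 2003 XML namespace and HintPath elements, the function name is hypothetical, and both paths in the sample are invented.

```python
import ntpath
import xml.etree.ElementTree as ET

# Namespace used by older-style .csproj files (an assumption of this sketch)
MSBUILD_NS = "{http://schemas.microsoft.com/developer/msbuild/2003}"


def absolute_hint_paths(csproj_xml):
    """Return reference paths that are absolute, i.e. tied to one machine."""
    root = ET.fromstring(csproj_xml)
    hints = [el.text.strip() for el in root.iter(MSBUILD_NS + "HintPath") if el.text]
    # ntpath applies Windows path rules regardless of the OS running the check
    return [path for path in hints if ntpath.isabs(path)]


# Invented project fragment: one reference inside the tree, one outside it
SAMPLE_CSPROJ = r"""
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <Reference Include="nunit.framework">
      <HintPath>..\Libraries\nunit.framework.dll</HintPath>
    </Reference>
    <Reference Include="Vendor.Lib">
      <HintPath>C:\Program Files\Vendor\Vendor.Lib.dll</HintPath>
    </Reference>
  </ItemGroup>
</Project>
"""

for path in absolute_hint_paths(SAMPLE_CSPROJ):
    print("Not in source control:", path)
```

Only the second reference is flagged; the relative path into a “Libraries” folder is the pattern you want, because it travels with the project.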
Summary
None of these things are hard. Honestly, they’re really very basic: commit early and often, know what you’re committing and that it should actually be in VCS, explain your commits and make sure you do it yourself, don’t forget the databases and don’t forget the dependencies. But please do forget VSS :)
Software architect and Microsoft MVP, you’ll usually find me writing about security concepts and process improvement in software delivery.
67 comments:
Brilliant. You wrote down most of the good practices that I've been preaching in despair for a few years now.
I just can't get why people commit their Thumbs.db files inadvertently or their configuration files so happily. Same goes for writing (good) comments and not keeping uncommitted local changes for days (or months). All that is so important.
I would add two more commandments: One, keep an eye on others' commits, ie periodically browse the recent history of changes, and the associated comments, specially what might affect your own areas of the code. Two, remember that all changes on a commit should be coherent, ie while possible, avoid combining on a single commit two changes that are conceptually unrelated, and of course, avoid splitting one change into several commits (more so when the project remains in a weird or even un-compilable state between those commits!)
Thanks for the post!
Excellent post. I'd like to add in addition to the stop using VSS, that you should also take a long hard look at open source alternatives if you're using TFS, as in my experience TFS is also a bit, well, poopy.
Just one tiny point based on a recent disaster I had. When using a DVCS, a commit that's not pushed is not safe yet. A hard disk crash or some other kind of system outage can make you lose the commit. I lost over 2 weeks of work this way.
Very complete memo, thanks. note that #10 is relative to the building tool used (ie: you should probably not commit maven dependencies).
haha, love the "works on my machine" badge :)
@noufal It is a requirement that you understand how to use a DVCS otherwise pain and sorrow can follow. But that is true with anything.
You should add "0. Stop using Windows, it just generally fucks things up and isn't POSIX-based, so git and other modern tools are unusably frustrating."
Fantastic post!
as usual you cut to chase and nailed all the points perfectly.
So many of the things we take for granted everyday would be so much easier if they were defined this way from day one
as you were...
I strongly disagree about compilation output not belonging in the repository.
With the use of the repository and the compilation output, we can overcome the problem of others not being able to build or merge code from the repository into their own workspaces. In most cases, it is easier to allow workmates to download the binaries and link them to their own code.
@Anonymous - With the caveat that I appreciate there are always edge cases and I'm sure there may be valid reasons, what's the cause of "problem of others not being able to build or merge code from the repository into their own workspaces"? What's the technology and what's causing the problem? Is putting compilation output into VCS simply working around the root cause of another problem?
Completely agree with Doug Rathbone, just a fantastic post. In my opinion the most important commandment is "writing commit messages" because that's the first thing anybody looks at in case of any issue, difference or review. If you have a meaningful comment or JIRA link, that would be just what the doctor ordered.
Something I'd like to add:
Add dummy files to empty directories. Remember that empty directories can't be stored in SCM. It can be frustrating if a folder (e.g. cache or tmp) is missing because its contents are ignored the wrong way.
Excellent post. Following the ten commandments is not too difficult; making your co-workers do it too is much harder...
I agree with all of them although I usually struggle with 'Commit early, commit often'. I prefer to make sure the task is complete and does not break the build before committing the change.
In response to: “Write every commit message like the next person who reads it is an axe-wielding maniac who knows where you live”
My coworker writes:
"My next commit message is 'IM READY FOR YOU!'"
@Stefano Thanks for the feedback - which is the SCM tool which doesn't allow you to store empty folders?
Great post.
One tip for VS users who use tortoise SVN is to change tortoise's options to use "_svn" instead of ".svn". Will save many headaches in the future.
For larger scale products, I would agree that it would make sense not to include compiled outputs in the repository and have another location for tagged builds that refer back to milestones in the repo.
For small embedded work, it makes sense to include binaries in the repository as part of the history especially when you're dealing with tools (compilers, assemblers, loaders, synthesizers, PARTs, and fitters) that are regularly being updated.
I'd add an 11th: do not commit sensitive material. Unfortunately many times I've seen developers committing files with credentials and plain-text passwords in it, or even (GASP!) source code with hard-coded passwords in them. Then they go out in the wild for everyone who has a checked out copy of the source. Chaos ensues.
1.5 Always work in branches, never trunk
This is an unstated assumption based on 3 (committing often). Developers hate things that slow them down. If given the option, many will code everything on their machine for a long time and then dump it into trunk. "The build will find the problems."
Also, use a VCS client that generates false positives when merging (SmartSVN) over one that generates false negatives (Tortoise). There's nothing more fun than finding incorrectly auto-merged bits in your code after the fact.
The last one's a bit iffy. Plenty of frameworks have systems where external dependencies don't need to be in source control and can be accessed via an external service. Take Maven in Java, for example.
I think it is Mercurial (hg) the SCM that doesn't store empty folders. It's very good for me, though
Rule #1: Your changes pass all tests before committing.
Enforce this in a commit hook.
Once you have that, you can enforce any other rule by adding a new test :-)
For example: formatting doesn't matter when working. Run astyle on each commit automatically. All complaints about lack of style adherence are now gone!
No "print" feature on this blog makes me very very sad :/
Hi there,
Thank you for the great post.
We recently started using Mercurial as our source control solution. Having worked with SVN for a while it was a mind shift to get used to.
Mercurial allows you to commit to a repository located on your local machine and then push to the server repository when your changes are done. For me this works like a dream.
Mercurial's merging also works like a dream... I would recommend to give Mercurial a try.
"Stop right now if you’re using VSS – just stop it!
Same thing is true for: SVN / CVS / PVCS / Perforce / Clear Case and other non distributed last century tools
"Versioning your database isn’t optional"
Think about versioning NoSQL databases.. And then think again :)
"Nobody else cares about your personal user settings"
But I do. And if I work with people who also use e.g. IDEA, it does make sense, especially in case of a tricky setup. As a whole (read "released"), I agree, but as development goes, if you ever developed in a huge team [unfortunately there are still huge teams in lots of places], having .idea or .settings or .somethingelse helps more than it hurts. Again, when you release, it should not be there.
"Dependencies need a home too"
WHAT!?? Maybe it is a Microsofty thing ( I am no hater, just not sure how .NET dependency management work [ NMaven, Byldan ? ] ), but in the rest of the world Gradle / Maven / Ivy / Bundler /etc.. do that for you.
There is no point to upload MBs of dependencies into SCM. Especially bad, when people upload these dependencies without versions => fired! :)
A good read overall => thanks!
/Anatoly
Regarding #10 - what if my 3rd party tools/libraries require installation? the proposed method for keeping them under VCS is fine, but what if i rely on some external library that has to be installed? do i install this (somehow.. how i don't know) every time before i build my software on the build server/on my machine? how should this be tackled..?
To the last point i would add that you should use externals in SVN to save dependencies so that they are more reusable.
@Troy, " which is the SCM tool which doesn't allow you to store empty folders?" mercurial and git for two.
Yeah, Troy, all very obvious, and of course we know that already ... but why don't we do it then? Thanks for reminding us, and for putting it all in a nice list.
I am personally very happy about your commandment #1. Please see my source destruction experience with SourceAnywhere, if you are interested: http://stackoverflow.com/questions/5726031/sourceanywhere-standalone-and-visual-studio-2010-file-reverted-to-old-version-d
One thing I'm having trouble working out is to do with #10. If I am using an open source library (say from github) in a project do I:
A) Just use the current versioning and get any new changes that come down.
B) Remove any previous versioning information and put the project in my own repository.
A) Makes it easier to keep the library up to date but can make it difficult to understand where all the dependencies in the project live.
B) Makes it easier to get all the source required to build a project but makes keeping the library source up to date difficult.
So far I've been using B
"No 5: Remember the axe-murderer when writing commit messages" is worth the price of admission alone. Am in a heated discussion with a vendor about this topic right now, and I might just send this onto them!
Although I've always hated the feature "Blame" purely because of the negative connotation. "Annotate" would be a nicer name for it.
I sympathise with the comments regarding "Section 10: Dependencies". We're a MSFT outfit too and we try to follow what Troy has suggested here but as it stands we have about 50 separate software packages we need to install on each build agent. It's just plain nasty, and one of the biggest issues is that certain software vendors make you license the components on the build agent itself (shout outs to Telerik and RedGate!).
Thanks everyone for the feedback. Let me caveat my responses by saying they come from a primarily Microsoft focus, but I’ll try to be as general as possible.
Re dependencies in source control, the intent is to say that projects should be compilable and runnable straight out of VCS without chasing around after assemblies or other artefacts. For a .NET app, this would mean committing dependent libraries to the project or referencing externals but the end result is the same; a developer or build server can pull the project and immediately run it. A few MB – or even tens of MB – is a small price to pay. Obviously Maven reduces the VCS dependency but the objective is the same.
@jwatte having a pre-commit dependency on passing tests is a tricky one, particularly when compilation is involved and even more so once the numbers of tests start racking up. A commit that takes several minutes isn’t going to encourage anyone to use VCS. Best off leaving this to the CI server.
@sdfasfd dependencies requiring install (or licenses) is the challenge Doug Rathbone and I were debating recently in the context of CI. I think it’s spelled out quite well here: http://troy.hn/fsCIan
@Alex for dependencies from an open source project, I’d personally just take the compiled assembly (either from the project or build it yourself) and use that as the reference and update it as and when required. I’m not too concerned about an external dependencies source code versioning history so would agree with your solution.
@Daniel yeah, as much as I love the Red Gate tools, the installation and licensing can be a bit of a barrier in terms of dependency configuration. That’s one of the areas discussed with Doug in the link above.
"However, when I see an individual – and particularly an entire project – where I know we should be in a normal development cycle and there are entire days or even multiple days where nothing is happening, I get very worried."
Not necessarily. I have many commits going into my local repo. When it comes to doing the Push I might rebase or cherry pick them into one commit. To the outsider this might look like I haven't done anything for a while.
I know you mentioned committing config files, but it was not as blatant as point 9).
You see there is the other side where teams do not bother to commit the config files from all regions (Prod, UAT, QA) and then they have lost their ability to control the source.
Sometimes they think that because these are product config files (Tomcat, Apache etc.) they don't need to.
I see it as if you make a change (CRUD) from the vanilla in any way then this must be placed under management. This includes CSS and HTML files. A concept even server side devs do not understand, thinking only code needs to go in there. It should be almost everything!
Hi everybody,
regarding #10 I recently reorganized my company's projects so that there is a separate repo called "Libraries" where we keep all third party or internal libraries in binary format, then we use svn:externals to pull in those libs into other projects. This way we are not polluting each and every repository with the same 25 megs reporting library.
Every release goes into a separate folder, so that if project Foo references Bar v1.5.2 it will keep referencing the right version until we decide to migrate project Foo to use Bar vNew.
For our internal libraries we omit the revision number from the folder name, so that bug fixes can be incorporated without the need to alter every svn:external to point to a new revision. Of course we have a rule that every breaking change to a library must trigger at least a minor-version increment, and that would go into a separate folder.
We are still not happy with libraries that need to be installed (like Telerik's products).
Although you can do the same trick, then you have to carefully watch your .csproj file every time you drag something from the VS Toolbox, as it will probably pull in the wrong reference. And it's against the license anyway, so don't do that :-)
Hope it helps...
-Rodrigo-
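The folder scheme Rodrigo describes above can be sketched in a few lines. This is only an illustration: the `library_folder` helper, the `v` prefix, the path layout and the internal library name `Core` are hypothetical, and it assumes "revision number" means the third part of a three-part version.

```python
def library_folder(name: str, version: str, internal: bool) -> str:
    """Return the folder an svn:externals entry would point at (hypothetical layout)."""
    parts = version.split(".")
    if internal:
        # Internal libraries: drop the revision part so bug fixes land in the
        # same folder and existing externals pick them up without being edited.
        parts = parts[:2]
    # Third-party libraries keep the full version, so each release gets its own folder.
    return f"Libraries/{name}/v{'.'.join(parts)}"


print(library_folder("Bar", "1.5.2", internal=False))
print(library_folder("Core", "2.3.7", internal=True))
```

With this layout a breaking change bumps at least the minor version and so lands in a new folder, which is the rule described in the comment.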
Very nice article.
http://stackoverflow.com/questions/909338/what-is-the-worst-commit-message-you-have-ever-authored
was really funny.
One thing I think you may want to mention is using auto-formatting tools. This helps with point #4, inspecting your code: it is much easier to inspect the code for changes if there are no formatting-only changes to parse through. A lot of people do this as a pre-commit hook but I think it's better to do it client side before committing so real changes are not missed in the noise of formatting updates.
Good post, but I beg to differ on saving binaries. When VS compiles it embeds current date and perhaps other data, so re-builds are not identical. When I receive a mini crash dump from the field I need the exe and pdb to debug. Anyone have a way around this?
Good points on putting dependencies under control. I got bit by this problem recently: when running unit tests, the ReSharper I'm using has a dependency on a specific NUnit DLL (the version which came with the ReSharper package). I had problems using the latest downloaded NUnit DLLs until I referenced the ones which came with ReSharper. So I put those into a 3rd party DLL folder so I know those specific versions belong with the project I'm working on, to help avoid the "works on my machine" issues.
All cool, but I'm not sure practically how commandment 10 is supposed to work. Our code has dependencies on DirectX and Visual Assert (CFIX) etc - are you saying these (installer based) should be added to source control? To exaggerate the point would you add Visual Studio to source control because the code won't build without it?
Great post.
Great! Thanks for the 10 year old news!
@Phillipus One of the problems with long periods between commits - even with locally versioned DVCS systems - is that often (usually?) there are no backups of the local versions. You're one HDD failure away from blowing your work away. There's already a comment from @noufal above about losing a fortnight of work this way.
@Anonymous1 Config files are obviously very language specific. Fortunately for .NET, we now have config transforms which make it pretty simple to version different environment settings in web apps: http://troy.hn/d6QdDU
@Anonymous2 Re formatting, again, it's language specific. For .NET, it's easy to drop in StyleCop and set the build to fail if rules are broken which can then surface problems at compile time on the client or on the CI server.
@geomayfiel You're talking about build artefacts and the right place for that is on your CI server. It's easy to then grab these along with the source when you need them without them polluting the VCS repository and causing the other dramas I mentioned above.
@Anonymous3 Re dependencies, why stop there? Put the whole damn Windows installer into VCS! The reasonable assumption is that a developer with a fresh machine and a clean IDE install can pull the project and build it. Likewise the CI machine should be able to do the same so yes, you're making an assumption about a baseline state of the machine doing the building but it's a pretty reasonable one :)
Troy, fair comment about local backups. A good working practice is to push to a backup repo which could be on the network, another drive or hosted on-line. The beauty of Git is that you can have many repos to push to, local and non-local.
For my project I have 3 hosted Git repos that I push to - two "official", one at SourceForge and one at Git (a mirror, if you like), and one marked "experimental" on SourceForge. The "experimental" repo is the one I push to as a backup.
Thank you for sharing.
It is always a delight to feel what I believe – that information is inspiration
Here's another rule: never commit reformatted code (just a bunch of changes in whitespace) together with actual code changes. It kills me when I'm having to review those!
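As a rough illustration of that rule, here is a minimal sketch of how a review or pre-commit script might flag a commit that mixes reformatting with real changes. The function names are hypothetical and the heuristic is crude: it ignores all whitespace, so it would misjudge languages or string literals where whitespace is significant.

```python
def is_whitespace_only_change(before: str, after: str) -> bool:
    """True when the texts differ, but only in whitespace (crude heuristic)."""
    if before == after:
        return False

    def squash(text: str) -> str:
        # Remove every whitespace character so only the "real" content is compared.
        return "".join(text.split())

    return squash(before) == squash(after)


def mixes_formatting_and_logic(changes) -> bool:
    """changes: iterable of (before, after) pairs, one per changed file or hunk."""
    flags = [is_whitespace_only_change(b, a) for b, a in changes if b != a]
    # Mixed means at least one whitespace-only change and at least one real change.
    return any(flags) and not all(flags)


commit = [("a  = 1", "a = 1"), ("b = 1", "b = 2")]
print(mixes_formatting_and_logic(commit))
```

A commit flagged this way could then be split into one formatting-only commit and one with the actual change.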
I have to disagree on the committing of configuration files. Well, at least in part. Mainly I think it's naive (and dangerous) to think you can make a blanket statement about config files and whether they can / should be in VCS. Taking Eclipse for instance, I've been able to enforce format and style consistency, thereby reducing the number of merge conflicts, by committing and requiring the use of various code formatters and templates. I've also been able to speed the ramp up time for new developers by giving them useful user oriented config files and shortcuts that would have taken them months to figure out on their own. There are SOME config files that we don't commit, but not many. Maybe the .NET tools don't have the same team oriented benefits that Eclipse config files do, but stating you should never commit IDE / User config files is a bit presumptuous.
Interesting stuff!
i need to forward this to my team.. what else can you do if you want to track changes with very informative comments like 'updated', 'committed'.. *sigh*
"No 5: Remember the axe-murderer when writing commit messages" is worth the price of admission alone. Am in a heated discussion with a vendor about this topic right now, and I might just send this onto them!
Although I've always hated the feature "Blame" purely because of the negative connotation. "Annotate" would be a nicer name for it.
One thing I'm having trouble working out is to do with #10. If I am using an open source library (say from github) in a project do I:
A) Just use the current versioning and get any new changes that come down.
B) Remove any previous versioning information and put the project in my own repository.
A) Makes it easier to keep the library up to date but can make it difficult to understand where all the dependencies in the project live.
B) Makes it easier to get all the source required to build a project but makes keeping the library source up to date difficult.
So far I've been using B
Yeah, Troy, all very obvious, and of course we know that already ... but why don't we do it then? Thanks for reminding us, and for putting it all in a nice list.
I am personally very happy about your commandment #1. Please see my source destruction experience with SourceAnywhere, if you are interested: http://stackoverflow.com/questions/5726031/sourceanywhere-standalone-and-visual-studio-2010-file-reverted-to-old-version-d
Regarding #10 - what if my 3rd party tools/libraries require installation? The proposed method for keeping them under VCS is fine, but what if I rely on some external library that has to be installed? Do I install this (somehow... how, I don't know) every time before I build my software on the build server or on my machine? How should this be tackled?
"Stop right now if you’re using VSS – just stop it!
Same thing is true for: SVN / CVS / PVCS / Perforce / ClearCase and other non-distributed last-century tools
"Versioning your database isn’t optional"
Think about versioning NoSQL databases.. And then think again :)
"Nobody else cares about your personal user settings"
But I do. And if I work with people who also use e.g. IDEA, it does make sense especially in case of tricky setup. As a whole ( read "released" ), I agree, but as development goes, if you ever developed in a huge team [ unfortunately there are still huge teams in lots of places ], having .idea or .settings or .somethingelse helps more than hurts. Again, when you release, it should not be there.
"Dependencies need a home too"
WHAT!?? Maybe it is a Microsofty thing ( I am no hater, just not sure how .NET dependency management works [ NMaven, Byldan ? ] ), but in the rest of the world Gradle / Maven / Ivy / Bundler / etc. do that for you.
There is no point to upload MBs of dependencies into SCM. Especially bad, when people upload these dependencies without versions => fired! :)
A good read overall => thanks!
/Anatoly
Hi there,
Thank you for the great post.
We recently started using Mercurial as our source control solution. Having worked with SVN for a while it was a mind shift to get used to.
Mercurial allows you to commit to a repository located on your local machine and then push it to the server repository when your changes are done. For me this works like a dream.
Mercurial's merging also works like a dream... I would recommend to give Mercurial a try.
No "print" feature on this blog makes me very very sad :/
I think it is Mercurial (hg) the SCM that doesn't store empty folders. It's very good for me, though
For larger scale products, I would agree that it would make sense not to include compiled outputs in the repository and have another location for tagged builds that refer back to milestones in the repo.
For small embedded work, it makes sense to include binaries in the repository as part of the history especially when you're dealing with tools (compilers, assemblers, loaders, synthesizers, PARTs, and fitters) that are regularly being updated.
Excellent post. Following the ten commandments is not too difficult; making your co-workers do it too is much harder...
I agree with all of them although I usually struggle with 'Commit early, commit often'. I prefer to make sure the task is complete and does not break the build before committing the change.
@noufal It is a requirement that you understand how to use a DVCS otherwise pain and sorrow can follow. But that is true with anything.
Really great. Very informative, especially as I am about to start learning Git :p Keep rocking!!
Cheers!!
Troy if you want to find a truly innovative solution you need to check out Source Reliance by Core Software Technologies.
It has the best feature set on the market, a dedicated database and it outperforms all other vendors that charge 10 times as much.
It was a long time in the making and you can tell because it is bug free (like mission critical software should be)!
For the ultimate in source control... check out PlasticSCM (http://www.plasticscm.com)
To add to number 5: always include a reference to the card/bug/Jira number etc. in the commit comments. It really helps when you're trying to understand why a change was made a few months later.
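A check like this is easy to automate. The sketch below assumes two hypothetical reference formats, a Jira-style key such as `PROJ-123` and an issue number such as `#123`; the pattern and the function name are illustrative only and would need adjusting to whatever tracker a team actually uses.

```python
import re

# Assumed formats (not from any particular tracker's spec):
#   "PROJ-123"  Jira-style key: uppercase project code, dash, number
#   "#123"      plain issue number
TICKET_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b|#\d+\b")


def has_ticket_reference(message: str) -> bool:
    """True when the commit message mentions at least one ticket reference."""
    return TICKET_PATTERN.search(message) is not None


for message in ("PROJ-42: fix null check in login", "Fix redirect loop (#311)", "updated"):
    print(message, "->", has_ticket_reference(message))
```

Hooked into a commit-message check, this would reject the 'updated' and 'committed' style of message complained about earlier in the thread, although it cannot tell whether the referenced ticket is the right one.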