Even the best of us makes mistakes. And sometimes, we don't notice 'till it's too late to turn back. This weekend, I'm working on refactoring/remodelling a fairly large feature which is part of a big legacy application at work. A seemingly innocent design decisions made years ago, has come back to haunt me, making things that should be easy difficult, and things that should just work, well, not work at all.
Years ago, the legacy application in question was party ported from unmanaged C++ into .NET. Not only that, it was ported from a flat file-based application to a database-backed application. Naturally, this created the need for a data access layer. An extension for ADO.NET was written, leveraging DataSets with all kinds of neat auto-magic features. Years of joy followed (well, disregarding the constantly growing list of reported bugs, of course).
Fast forward to present day, and finally we're getting ready to start killing off at the last bits of that old C++ code. Not only that, but we're taking on the momentous task of refactoring and cleaning up the entire application using modern-day architecture and general good-practice idioms.
Something != Nothing
Okay, so there I am, happily building a sparkling, new domain model with all the right entities in all the right places, happily staying on track with the help of Eric Evans excellent Domain Driven Design book. We've decided to use Developer Express' eXpress Persistent Objects (XPO) as our O/R mapper this time around, which among other things lets us take advantage of all the lovely new LINQ stuff (which, by the way, makes abstracting away the data access layer in unit tests a joy - look out for a post on that topic soon). So, all my tests are dressed up in their finest green and all is joy. Until I put together a simple test application to test things out against a real database... BAM! XPO blows up, complaining that the referential integrity of the database is compromised. Say what? I poke around for a bit, and quickly realise that even though the database which was modelled all those years ago and has incrementally grown to several times it's own weight since then is normalized quite well, there's something weird about these foreign keys... They've all got NOT NULL constraints. Well okay, so that might not be such a big problem in and of itself - but considering that several of the relationships are 1 <-> 0-1, this clearly poses a question of how we represent the case where, say, a Person doesn't own a Pet; what do we set the Person.PetID column to? Well, in our case, 0 is treated as the 'magic' value meaning 'nothing'. Trouble is, that's really a lie and the database doesn't know it, so it thinks that our Person really does have a pet - the pet with ID = 0. No Pet with that ID exists however, so when XPO tries to load the Person, it can't find it's Pet. Time to call the neighbours and organize a search party.
Obviously, in hindsight it makes perfect sense that the PetID column should be nullable, so that any Person who doesn't own a Pet can have Person.PetID = NULL. The difference may seem insignificant; it is however a huge difference semantically, and say one thing for computers, say they're pedantic about semantics.