Behavior Driven Development (BDD) with SpecFlow and ASP
This is a three-part series introducing the concept of double-blind test-driven development in Rails. This post defines the concept itself, and lays the groundwork by showing the way tests are more commonly written. The next couple posts will show how to double-blind test various common rails elements, and how to make this added layer of protection automatic and quick.Looking at a rails application that was built with test-driven development, you might expect to see something like this:Truth be told, if you’re seeing this in the wild the app is probably doing pretty good. This level of testing works great during the early stages of an app, when things are simple. But as things grow and/or multiple developers become involved, you need more.Consider models where the associations and validations stretch into the dozens of lines. The more careful and specific you are about validations, the easier it is to get conflicting or overlapping validations. I actually came up with the concept of double-blind testing while retro-testing models in a client app that previously had no validation specs.In the world of scientific studies, you always need a control group. One set of participants gets the latest and greatest new diet pill, while the other gets a placebo. Researchers used to think this was good enough, and probably pretty funny to watch the placebo users rave about their shrinking waistlines. But it turns out studies like this still allowed some bias – as researchers observed the effects, their *own* preconceived notions tainted results. Enter the double-blind study.In a double-blind study, the researchers themselves are unaware of which participants are in the control group, and which are being tested. Both sides are “blind”. They may have lost funny patient anecdotes, but they gained research reliability.As I said, in the early stages of an app the tests I showed above work great, as long as you’re using TDD and the red-green-refactor cycle. This means you write the test, run it, and it fails. Then you write the simplest code that will make the test pass, run the test again, and confirm that it passes. Most testing tools will literally show red or green as you do this. Then, as you start to amass tests, you’re free to refactor your code (abstracting common code into helper methods, changing for readability, etc) and run the tests again at any time. You will see failures if you broke anything. If not, you’ve more or less guaranteed your code refactoring works properly.The problem comes in when you start changing old code, or adding tests to processes that didn’t initially happen. What I’m calling double-blind testing is this:each test needs to verify the object’s behavior before testing what changes.As an example, let’s rewrite one of the tests from above:This is the basic pattern for all double-blind testing. We’re not leaving anything to chance. In the original version, we expected our object to be invalid, we treated it as such, and we got the result we expected. Do you see the problem with this?Here’s an exercise: can you make the original test pass, even though the object validation is not working correctly? There’s actually a style of pair programming that routinely does exactly this. One developer writes the test, and the other writes just enough code to make it pass, with the good-natured intention of tripping up the first developer whenever possible. If you wrote the original test, I could satisfy it by just adding the error message to every record on validation, regardless of whether it’s true! Your test would pass, but the app would fail.The test is now “double-blind” in the sense that we as testers have factored out our own expectations from the test. In this case, we expect the error message to not be there until we initialize the object a certain way, and this can be bad. It may sound far-fetched or paranoid*, but in large codebases your original tests are often abused in this very way. The “you” that writes new code today is often at odds with the “you” from three months ago that wrote the older code with a different understanding of the problem at hand.Now that I’ve laid out the justification, let’s take a closer look at how the test changed. The first thing I did was create a version of the object that I believe should NOT trigger the error message. Then I run through two cases that should. You can see right away, I was forced to be more *specific* about what should trigger an error. Instead of just a blank object with no values set, I’ve proactively set the attribute in question to both nil and blank. A key element here is to try to work with the *same* object, modifying between tests, rather than creating a new object each time. My test wouldn’t have been as specific if I’d just recreated a blank Teacher object and run a single validation check.Also, with the increased code comes the increased chance of typos. We don’t want to DRY test code up too much, because a good rule is to keep your tests are readable (non-abstract) as possible. But I’ve specified the error message at the top of the test, and reused that string over and over. I did this in a way that DRY’s the code and adds readability. You can see at a glance that all three tests are checking for the same error.Finally, the first time I run the object’s validation, notice I’m not asserting that it should be valid. If I had written teacher.should be_valid on line 8 of the double-blind test, I’d have to take the extra time to make sure every other part of the object was valid. Not only is this time-consuming, it’s very brittle. Any future validations would break this test.If you use factories often, you may suggest setting it up that way since a factory-generated object should always be valid. Then you could assert validity. However, this only slows down your test suite. it’s enough just to run valid? on the object, which triggers all the validation checks to load up our errors hash.I believe this is a new concept – I was already coding most of my tests this way, but it didn’t dawn on me how valuable it was until I started retro-testing previously testless code. The value showed itself right away.I would love to hear feedback on this – if you think it’s unnecessary (I tend to be very rainman-ish about my testing code) or even detrimental. However, if you think it’s too much work, I ask you to hold your criticism until you’ve read part 3 of this article, where I show how to use your own RSpec matchers to greatly speed this process.Tags: bdd, double-blind, Rails, rspec, Ruby, ruby on rails, shoulda, tdd, test-driven development, Testing, validation This entry was posted on January 31, 2011 at 10:00 am and is filed under Database, Rails, Ruby, Testing. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site. I’ve been coding my unit tests in Java code this way for a long time, without the benefit of TDD. I’m just paranoid enough that I don’t trust tests that always pass any more than I trust tests that always fail — I want to see them do both, given appropriate criteria.Some people so smart that they writing tests, som people so smart that they do not writing tests. (joke)Fill in your details below or click an icon to log in: You are commenting using your WordPress.com account. ( Log Out / Change ) You are commenting using your Twitter account. ( Log Out / Change ) You are commenting using your Facebook account. ( Log Out / Change ) You are commenting using your Google+ account. ( Log Out / Change )Connecting to %s Notify me of new comments via email. Create a free website or blog at WordPress.com. Entries (RSS) and Comments (RSS).
This is part of a series of introductory guidelines for software development. It’s a continuation of the previous post about Dependency Injection.One of the primary reasons to adopt Dependency Injection is that it is impossible to achieve a good layered architecture without it. Many developers argue that IoC container is only beneficial if you are doing test-driven-development. I disagree. I can’t quite think of how else you can build any layered architecture at all without IoC.First, let’s take a moment to define layered architecture. There is a common misconception about layered architecture, which is largely contributed by misleading diagrams in most textbooks and articles. This is how layered architecture often described in textbooks.I think that’s rather deceptive. Unfortunately, developers often take this diagram too literally. Each layer depends on the layers below it. Presentation layer depends on business-layer, and then both depend on data-access layer.There is a huge flaw in that diagram. Data-access layer should be at the outmost part of application. Business layer should be at the core! But in this diagram, Data-access layer is becoming the core foundation of the whole application structure. It is becoming the critical layer. Any change on data-access layer will affect all other layers of the application.Firstly, this architecture shows an incongruity. Data-access layer, in reality, can never be implemented without dependencies to business detail. E.g. DbUserRepository needs to depend on User. So what often happens is developers start introducing a circular dependency between Business layer and DataAccess layer. When circular dependency happens between layers, it’s no longer a layered architecture. The linearity is broken.Secondly, which is more important, this architecture tightly couples the whole application to infrastructure layer. Databases, Web-service, configurations, they are all infrastructure. This structure could not stand without the infrastructure. So in this approach, developers build the system starting from writing the infrastructure plumbing first (e.g. designing database-tables, drawing ERD diagram, environment configuration), then followed by writing the business code to fill the gaps left by the infrastructural bits and pieces.It’s a bit upside-down and not quite the best approach. If a business-layer couples itself with infrastructure concerns, it’s doing way too much. Business layer should know close to nothing about infrastructure. It should be in the very core of the system. Infrastructure is only a plumbing to support the business-layer, not the other way around. Infrastructure details are most likely to change very frequently, so we definitely do not want to be tightly coupled to it.Business layer is an area where you have the real meat of your application. You want it to be clean of any infrastructure concerns. Development effort starts from designing your domain-code, not data-access. You want to be able to just write the business code right away without setting up, or even thinking about, the necessary plumbings. One of the guideline in previous post is that we want classes within business layer to be POCO. All classes in business layer should describe purely domain logic WITHOUT having any reference to infrastructure classes like data-access, UI, or configuration. Once we have all domain layer established, then we can start implementing infrastructure plumbing to support it. We get all our domain models baked properly first, before we start thinking about designing the database tables to persist them.So this is a more accurate picture for layered architecture.Infrastructure sits at the top of the structure. Domain layer is now the core layer of the application. It doesn’t have any dependency to any other layer. From code perspective, Domain project does not have any reference to any other project, or any persistence library. Databases, just like other infrastructure (UI, web-services), are external to the application, not the centre.Thus, there is no such thing as “database application”. The application might use database as its storage, but simply because we have some external infrastructure code in the neighbourhood implementing the interfaces. The application itself is fully decoupled from database, file-system, etc. This is the primary premise behind layered architecture.Alistair Cockburn formalizes this concept as Hexagonal Architecture, so does Jeffrey Palermo as Onion Architecture, but they truly are merely formalized vocabularies to refer to the old well-known layered architecture wisdom.To describe the concept, let’s switch the diagrams into onion-rings. Jeffrey Palermo describes bad layering architecture as follows:In onion diagram, all couplings are toward the centre. The fundamental rule is that all code can only depend on layers more central, never on the layers further out from the core. This way, you can completely tear out and replace the skin of the onion without affecting the core. But you can’t easily take away the core without breaking all outer layers.In the diagram above, the infrastructure sits in the core of the onion. It becomes an unreplaceable part of the system. The structure cannot stand without infrastructure layer in the core, and the business layer is doing too much by embracing the infrastructure.This is the better architecture model:UI and infrastructure sits right at the edge skin of the application. They are replaceable parts of the system. You can take the infrastructure layer completely and the entire structure stays intact.Take a look at the diagram again.If you follow the previous post, we were building “resend PIN to email” functionality. AuthenticationService is in the domain layer at the core of application, and it does not have knowledge about SMTP client or database (or even where User information is stored). Notice that DbUserRepository and SmtpService are in outmost layer of the application, and they depend downward on the interfaces in the core (IUserRepository, IEmailService) and can implement them. Dependency Injection is the key.At runtime, IoC container will resolve the infrastructure classes that implement the service interfaces (IUserRepository, ICommunicationService) and pass them into UserAuthenticationService constructor.My typical project structure looks like this in Visual Studio:I separate each layer into its own project. One of the benefits is that it physically enforces linear dependencies. Each layer can only depend on the layer below it. If a developer introduces a circular dependency between Domain layer and Data-Access, Visual Studio (or Java’s Eclipse) won’t even compile. Here is how these projects work together within the architecture:Every component depends only to the components in the layer below it. As you can see, the only project that has reference to System.Data and NHibernate library is Sheep.Infrastructure.Data. Likewise, the only project with System.Web.Services reference is Sheep.Infrastructure.BackEnds. And none of those infrastructure projects is referenced by any other part of the application (and this is enforced by Visual Studio). Therefore, they can be totally taken away and replaced without affecting the application core. On the flip side, the whole infrastructure layer is tightly coupled on the core layer.Domain layer is POCO. It does not depend on any other project, and it does not contain any reference to any dll for particular technology. Not even to IoC framework. It contains just pure business logic code. In order to be executed, this assembly has to be hosted either within a system that can provide infrastructure implementation, or within test system that provides mock dependencies. Hence notice that we have 2 versions of top-level skins in the structure: Infrastructure and Test.So what’s the benefit?Basically, it all describes Single Responsibility Principle (each class should only have 1 reason to change). In the classic diagram, being tightly coupled to data-access layer, domain layer needs to change because of some changes in infrastructure details. Whereas in layered architecture, you can completely tear apart the infrastructure layer at the skin, and replace with arbritary implementation without even laying a finger on any domain class at the core. Thus there is only 1 reason the classes in domain layer need to change: when the business logic changes.Having said that, this architecture is not necessarily the best approach for all kind of applications. For simple form-over-data applications, Active-Record approach is probably better. It couples the whole application directly to data-access concerns, but it allows a rapid pace of data-driven development. However, for most non-trivial business applications, loose coupling is a very crucial aspect of maintainability.This whole post is actually an introduction to our next dicussion about Domain Driven Design in some later post. To be continued in part 3, Object Relational MappingHi Hendry! I am following these series enthusiastically! Your postings have a high quality and I really enjoy reading them. Keep up the good work!Regards,Daniel Lidström Stockholm, SwedenHi HendryVery nice and informative series. If possible, can you adopt a step-by-step approach to the email app? It would really help us newbies in the IOC world !Again thanks for taking time to do this.rgds SunitHendry Luk,Very informative, baru sadar kalo diagram layer yang ada di buku-buku (bahkan buku tentang DDD sekalipun) ngaco 🙂@Sunit, unfortunately developing a usable application isn’t quite the objective of these posts. Short snippets and examples are easier to highlight specific concepts and practices without the need to bring in various other concerns. But if you’re interested in seeing those concepts in a real-world application, there are already several open-source applications written firmly in the same spirit. E.g.: – Code Camp Server – Sharp Architecture – Suteki Shop – MVC Storefront – Car Trackr@Dicky, I probably wouldn’t go as far as to say diagrams in textbooks are incorrect. They merely describe *logical* layering of applications, and they should NOT be intepreted as how you should structure your code and dependencies physically.Where is part3?great writing… mas Hendry. still waiting for part three.Another brilliant posting from Hendry. It would be a-must-to-read article to the Juniors in our Office. Thank you, Hendry 😀Thank’s Hendry….. Your article helped me a lot 🙂Please bare with me as I am really trying hard to get my head around this (new to IoC)!I am confused by this Project Structure for two reasons.1) In the diagram, there is no arrow showing Sheep.Infrastructure.Web using Sheep.Infrastructure.Data. If the IoC mapping is done in the initialisation of Web, surely it must know about the DAL classes in Data in order to register them against Domain’s dependencies?2) In order to keep all of Domain’s classes in a single dependency graph, wouldn’t Domain then need a class (E.g. DomainEngine) with a bazillion parameters for it’s constructor? For example, I may have persistance classes for each object, e.g. IPersonDb, IAccountDb, with methods like IPersonDb.GetPerson(personId). Accessing such a class might look like myDomainEngineFromContainer.MyIPersonStore. So, DomainEngine would have dependencies for at least as many classes as I have models. What am I missing?Your help is much appreciated.I wouldn’t group infrastructure with UI and SIL concerns. You may want to create two “blue boxes” side-by-side; one for infrastructure and one for UI/SIL implementations. Also, instead of Service Interfaces I would call this Domain Contracts.Thanks for the reference. I enjoyed your post.Thanks Jeffrey. Your article on Onion Architecture has been very helpfulDependency injection relates to a object configuration in which an external entity is set, on the other hand it is an alternative to have the object configure itself. Even I think one should apply this as without that good layered architecture is not possible.Great controversial article that makes one think about data access.Fill in your details below or click an icon to log in: You are commenting using your WordPress.com account. ( Log Out / Change ) You are commenting using your Twitter account. ( Log Out / Change ) You are commenting using your Facebook account. ( Log Out / Change ) You are commenting using your Google+ account. ( Log Out / Change )Connecting to %s Notify me of new comments via email.