The Morning Scrub

Thursday, September 3, 2009

Introducing Cronus

Over the last few years I have been assembling the skills needed to create a “Core” framework that all my future projects will be based on. This framework is a library of features specifically designed to eliminate the need to reinvent the wheel every time a new project is started. Let’s face it, everyone has this problem. Each time a new project is developed, there is a need for certain functionality such as a security model, configuration model, notification system, and navigation system. This is where Cronus comes in.

The creation of this framework is, among other things, intended to save development time. For this reason I have chosen to name the project after the Greek god Cronus, aka the god of human time.

In my next post I will discuss the advantages and disadvantages of using Cronus 1.0.

Sunday, August 30, 2009

Building A Better “Where” Clause

Over my career I have learned a lot about SQL Server; unfortunately, some fundamental ideas get forgotten along the way. This weekend I went to the Jacksonville Code Camp 2009, where I sat in on a session titled “Building a Better Where Clause”. Here are a few of my notes in case anyone forgot or never knew them.

- Try to use indexed fields whenever possible. Searching an index is much, much faster than searching un-indexed fields. This means that if you are not searching on the primary key or a foreign key, consider whether the field in question should be searched in the first place. If it should, consider adding an index on that field.

- When writing your where clause, take the order into account. Place the expressions most likely to return false first. Take a look at the queries below. Both return the same results; however, unless people with the last name “smith” make up 50% of the table, the first query would evaluate faster than the second.

SELECT fname, lname, gender FROM People WHERE lname = 'smith' AND gender = 'male'
SELECT fname, lname, gender FROM People WHERE gender = 'male' AND lname = 'smith'

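This is because, as the system evaluates each row to decide whether it should be included in the result set, it can move on to the next row as soon as part of the where clause evaluates to false. Keep in mind that SQL Server’s query optimizer is free to reorder predicates, so the written order is a hint at best and not a guarantee.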

- Avoid wrapping table columns listed in the where clause in a function. Doing so generally prevents SQL Server from using an index seek on that column, which can mean a significant drop in performance.

- Avoid wrapping table columns with upper() or lower() in the where clause unless your SQL Server is set to be case sensitive. By default SQL Server is not case sensitive, meaning the functions are not needed. As noted above, using these functions prevents the use of indexes.

- Avoid using any form of “NOT”, including NOT NULL, if possible, as forms of NOT can also prevent indexes from being used.

- When using an in() list, if possible place the values most likely to be found first. Once the comparison evaluates to true, the system can immediately move on to the next record.
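The point about wrapping columns in functions can be checked by looking at a query plan. The sketch below is only an illustration and makes a loud assumption: it uses Python’s built-in sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server, so the plan text and optimizer details will differ from what SQL Server shows. The People table, the sample rows, and the index name idx_people_lname are made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption made
# only so the example is self-contained; the two optimizers differ in detail).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE People (id INTEGER PRIMARY KEY, fname TEXT, lname TEXT, gender TEXT)"
)
conn.execute("CREATE INDEX idx_people_lname ON People (lname)")
conn.executemany(
    "INSERT INTO People (fname, lname, gender) VALUES (?, ?, ?)",
    [("John", "smith", "male"), ("Jane", "doe", "female"), ("Bob", "jones", "male")],
)


def query_plan(sql):
    """Return SQLite's EXPLAIN QUERY PLAN detail text for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row is the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


# Bare column in the predicate: the index on lname can be used.
plain_plan = query_plan("SELECT fname, lname FROM People WHERE lname = 'smith'")

# Column wrapped in a function: the index on lname no longer applies.
wrapped_plan = query_plan("SELECT fname, lname FROM People WHERE upper(lname) = 'SMITH'")

print("bare column:   ", plain_plan)
print("wrapped column:", wrapped_plan)
```

In SQL Server the equivalent check would be to compare the execution plans of the two queries and look for an index seek versus a scan.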

Saturday, August 22, 2009

Hudson, More than Just a River in New York….

In the past I have written at length about the value of implementing a few select agile principles. One of these principles, “Continuous Integration” (CI), provides so much “bang for your buck” that it just can’t be ignored.

For so long I have wanted to implement continuous integration practices. Unfortunately, until now I was never able to get it going. I did some research on “CruiseControl.NET”, but everything I read pointed to a long, drawn-out installation and configuration process. Given that this was a one-man show and I would likely not be able to obtain sys admin support, it became less likely that this would be an option. Another issue was the requirement of a server with IIS, meaning I could not simply install it on my workstation for initial testing. Needless to say, this was a deal breaker. I looked into another product, “TeamCity”, which sounded easier to get up and running but, as far as I could tell, it also required IIS.

With this, I started trying to implement my CI plans via hook scripts attached to our SVN server. These scripts, though they appeared to work, were slow and seemed like a house of cards ready to collapse. The idea was to add a post-commit script to automatically check out new work. The script then called on NAnt to build the project. NAnt would also call on FxCop (for static code analysis) and NUnit (for the eventual unit tests we would have). Again, this worked, kind of, but I never felt comfortable sending it into the wild. After all this, the project got shelved for a time.

Then one fateful day over lunch I read an article featured in CODE magazine titled “Hudson Continuous Integration Server” by Eric Anderson. Hudson is a continuous integration tool written in Java. As a committed .Net enthusiast I must say I was skeptical, but I was pretty lucky to find such a jewel. Within an hour of reading the article I was up and running. I won’t go into the details of specific configuration steps; Mr. Anderson does this so well. Instead I will discuss my impressions of the application and the overall contribution it has made.

Since the app is written in Java, you must have a recent edition of the JVM. As far as I can tell, this is the only real requirement. I simply entered a few command line statements, and the server was running on my workstation. The whole thing is packaged up into a single Java archive file, meaning everything needed is included in this file; upon execution it unpacks itself and runs, so no permanent installation is required.

A few more statements and the system was alive and available for configuration via a web browser (there is a built-in web server). The URL was http://localhost:8080/. The article used a repo from Google Code; with that and, of course, Visual Studio, I was ready to go. I copied the configuration examples in the article and did a few tests.

Soon I had it linked to a few departmental repositories, and now any time anyone checks in code, Hudson checks out a copy and attempts to compile it, reporting the results. This is a huge advancement. Until now, team members checked in code whenever, not knowing that they had forgotten to add one file or another to the repo. Now, within a minute, Hudson checks out the new changes and attempts to compile them. After I got the system installed on one of our servers, Hudson was also able to send email, notifying the team of each good or bad build.

Before this system was in place, people would check stuff in, then others might check more stuff in, and finally at some point someone did an update and their build was broken, stopping work. What followed was usually someone stomping around saying that this system sucks. After that we would attempt to identify the team member who had the missing files and get them added to the repo. Knowing immediately if you broke the build will reduce these incidents significantly; in most cases the team member who forgot a file will know and fix it before anyone else on the team has a problem. Our team needed something better, and here it is.

During my research on “CruiseControl.NET” I became aware of a really nice satellite application: CCTray. CCTray is a small Windows app that sits in the system tray and pulls a feed from the CI server for all projects a specific user subscribes to; it then shows red, yellow, or green depending on the state of the build. The CODE magazine article outlined the basics of what needed to be done to use CCTray with Hudson and again, after a little tinkering, we were up and running.

Now that we have been running Hudson for well over a month, I have increased the scope of its responsibilities considerably. I use it to generate change reports for my monthly reports, to analyze and report on adherence to coding standards via its FxCop support, and to report on compiler warnings raised by MSBuild.

I am totally satisfied with this product and encourage anyone else who wants to implement CI on a shoestring to give it a look.

Saturday, February 14, 2009

What is unit testing and why should we do it?

Throughout the lifecycle of any project, many types of testing should be, and are, done. Usability testing is needed to check whether the user interface is easy to use and understand. Security testing is essential for software that processes confidential data and to prevent system intrusion by hackers. Performance or load testing checks whether the software can handle large quantities of data or users; this is generally referred to as software scalability. Each of these types performs an important role. One type of testing that is often overlooked is known as unit testing.

Unit testing is the practice of using small bits of code to exercise code that you have written. In many ways it is something that we have all been doing all along. Every time you single out a specific method and test it intensely to make sure it works as expected, you are in a way unit testing your code. This is ok, but wouldn’t it be nice if you could re-execute these tests over and over again throughout the lifetime of the project, to ensure that the classes and methods you worked so hard to write continue to work as designed even as other aspects of the project are added or modified? Too often we as a team write something and test it, and later someone else is directed to make some other change that causes previously coded features to fail unexpectedly. This is where unit testing comes in.

Instead of manually testing each component and visually inspecting the results, we can use a testing framework to exercise the code and evaluate whether it still works. This works because we simply write a test that emulates the manual test that would otherwise have been performed. For example, if you had written a math class and needed to test the add method, you could call that method, passing 2 and 3, and output the result. If it returned 5, it would seem reasonable to say that the add method works. Instead, you could create a small test method that also calls the add method. It would look something like this:

[Test]
public void TestAdd()
{
    // Add(2, 3) should return 5; any other result fails the test.
    Assert.AreEqual(5, MyMathClass.Add(2, 3));
}

This method says that your add method should return 5 if it is passed 2 and 3; if it does not, the test is reported as a failure. The test is now saved for all to re-run at any time. As each developer writes more components, they contribute more tests, all of which are run over and over again.
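To show how saved tests become a suite that anyone can re-run, here is a minimal sketch of the same idea using Python’s built-in unittest module. It is only an illustration in a different language, not the NUnit setup discussed in this post; the MyMath class and the test names are hypothetical.

```python
import unittest


class MyMath:
    """Hypothetical stand-in for the math class described above."""

    @staticmethod
    def add(a, b):
        return a + b


class MyMathTests(unittest.TestCase):
    def test_add_small_numbers(self):
        # Passing 2 and 3 should give 5; anything else is a failure.
        self.assertEqual(5, MyMath.add(2, 3))

    def test_add_negative_number(self):
        # A second test contributed later simply joins the same suite.
        self.assertEqual(-1, MyMath.add(2, -3))


def run_suite():
    """Load every test in MyMathTests and run them all in one go."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MyMathTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_suite()
print("tests run:", result.testsRun, "failures:", len(result.failures))
```

Each new test method is picked up automatically the next time the suite runs, which is what makes re-running everything after each change cheap.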

There are many excuses not to test
Though I think few would even attempt to argue that testing in general is not needed, much of it is usually left to the end of the project or even skipped completely. In fact, it is often regarded as a nuisance that causes more work than it is worth. I have heard countless excuses for why testing is treated as a second-class citizen. My favorite is that we are too busy and simply don’t have time to do any more than minimal testing. Personally, I am shocked every time I hear these words spoken. Consider for a moment how much time has been spent debugging code that was believed to be stable, only to find out later that it is not. How much time has been spent trying to track down bugs reported by users that cannot be easily replicated? In the end, the cost of the test-at-the-end strategy is far more than that of a pay-as-you-go approach.

One analogy I recently read said that it is far easier to mow the yard each week than to cut down a virtual forest after letting it grow unchecked. Think about this for a moment: if more attention was paid to testing as the application was developed, many of the bugs would have been found and fixed before the system became too complex and too many other components were built atop the flawed logic.

Coding with Confidence
Invariably, by the time a project enters the maintenance phase, the team has already begun to distrust the code base, and many already look forward to the day it is retired. As features are changed, developers do the absolute minimum to meet the new requirements. This means a lot of legacy code is left in the project for no other reason than that the team member is terrified that removing or refactoring any more than is absolutely necessary will lead to cascading failures across the system. This is where unit testing frameworks are so powerful. Tests that were previously written can be run at any time, ensuring each atomic element continues to perform as designed. This allows developers to code with confidence, instead of coping with the paralysis that so often causes us to do only the bare minimum.

The next step
So where do we start? Now that we hopefully agree in principle that unit testing is a good thing, we need a road map to implementation. The first step is to adopt a testing framework on which we can build all of our tests.

There are quite a few frameworks available; the leaders in the .Net community include Microsoft Test, MBUnit, and NUnit, among others.

Microsoft’s product does offer integration with Visual Studio. It first became available in Visual Studio 2005 Team System, and limited support is now available in Visual Studio 2008 Professional, but there are many features that remain unavailable to us. Additionally, it has generally been regarded as immature by many prominent members of the .Net community.

That leaves MBUnit and NUnit; both are variations of the popular xUnit family of testing suites derived from Java’s JUnit.

MBUnit is the newer of the two frameworks; it actually builds upon NUnit and has some interesting new features. Unfortunately, it is not nearly as widely used as NUnit and has far less documentation available. It is for that reason I have chosen to use NUnit. It has been around for many years, and the amount of information and support for this product is virtually endless. The only thing that remains is to get started with our first test…

(TestDriven.Net is a Visual Studio add-in that runs NUnit tests from inside the IDE, which allows for tighter Visual Studio integration.)

NUnit is probably the most mature unit testing framework of the bunch. It also has the biggest adoption and community support. There are more articles, discussions, tools, and add-ons for NUnit than for all of the other frameworks combined. Just look at a Google Trends comparison of searches for each of the frameworks. No other even comes close.

Friday, January 23, 2009

What is Continuous Integration and Why Do We Need It II

So now that we have a very general understanding of what CI is, the next step is to take a look and see how it can work for us. On projects where source control is used, we can have the system attempt to build before code is actually accepted into the repository. This will seriously cut down on the issues we had recently with someone forgetting to include a file and failing the build for the next person who comes along. This is a good start, but we can do better.

Over the last few months I have written and sometimes pulled articles about different utilities that can help us increase departmental productivity. Now it is time to take advantage of a few of them.

Let’s Review:
FxCop
“FxCop is a free static code analysis tool from Microsoft that checks .NET managed code assemblies for conformance to Microsoft's .NET Framework Design Guidelines.” In short, this is an application that looks at the compiled code and verifies proper usage of the coding standards set by Microsoft. It flags unused variables, misspellings in code, naming convention violations, and tons of others. Unfortunately, someone has to run it at specific intervals, generate a report, and see to it that the changes are made; at that point it has to be run again, and the cycle starts over. Imagine sending something back and forth over and over, not to mention that there is no way to identify the developer responsible for the offending code.

NUnit
NUnit is a popular testing framework for Microsoft .Net. Its main function is to allow developers to develop unit tests for their application.

A unit is the smallest testable part of an application. In object-oriented programming, the smallest unit is a method, which may belong to a base/super class, abstract class, or derived/child class. A unit test is designed specifically to make sure a single unit is functioning as expected. Once developed, these unit tests can be run each time a change is made, thus identifying any new bugs that were introduced. Unfortunately, like FxCop, these tests only work if they are run, and sometimes people forget to do this before checking their code in.

NAnt
NAnt is a free and open source software tool for automating software build processes. NAnt executes its build process based on an XML build file. It can in turn call command line instances of NUnit, FxCop, NDoc, and other tools during the build process. It can be configured so that if a unit test fails or a standard is not met, the build fails.

NAnt can also handle automated deployment. If the build is successful, it will generate a release build, zip it, and place it in the releases folder for later deployment to the production server. All of this is kicked off by one command line call.
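The fail-fast behaviour described above can be sketched in a few lines. This is not a NAnt build file; it is a hypothetical Python illustration in which each step is a plain function returning an exit code, standing in for a command line call to the compiler, NUnit, FxCop, or a zip tool. The step names and the run_build helper are assumptions made for the example.

```python
def run_build(steps):
    """Run each (name, action) step in order and stop at the first failure.

    Each action is a callable returning an exit code (0 means success),
    standing in for a command line tool. Returns a tuple of
    (succeeded, names_of_completed_steps).
    """
    completed = []
    for name, action in steps:
        exit_code = action()
        if exit_code != 0:
            print(f"BUILD FAILED at step '{name}' (exit code {exit_code})")
            return False, completed
        completed.append(name)
    print("BUILD SUCCEEDED")
    return True, completed


# Placeholder steps; the failing "unit tests" step simulates a broken test.
steps = [
    ("compile", lambda: 0),
    ("unit tests", lambda: 1),
    ("static analysis", lambda: 0),
    ("zip release", lambda: 0),
]

succeeded, completed = run_build(steps)
```

Because the packaging step comes last, a release archive is only produced when every earlier check has passed.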

This is good; however, someone still has to execute NAnt to build and verify their code before checking it in. This is where CI servers such as CruiseControl.Net and others come in. They bind to the source control product, such as Subversion, and on check-in download a fresh version of the code. From here they execute NAnt (including FxCop, NUnit, and others); if anything goes wrong, the team receives an email and the code is rejected. The developer fixes whatever issue was reported and then the code is accepted. The better news is that all these utilities are open source and free. Even better, I have already done most of the work to integrate everything together. The only thing I don’t have is the integration server mentioned above, because that will require some further assistance from the network team.

What is Continuous Integration and Why Do We Need It?

Continuous Integration is a set of software engineering practices that speed up the delivery of software by decreasing integration times.

Sounds good but what are these practices?
Although these practices existed before, extreme programming practitioners recommend that teams require their developers to continuously integrate. I will give more detail about the practices later on, for now here is the list:
· Maintain a code repository
· Automate the build
· Make your build self-testing
· Everyone commits every day
· Every commit (to mainline) should be built
· Keep the build fast, so that if there is a problem with integration, it is quickly identified
· Test in a clone of the production environment
· Make it easy to get the latest deliverables
· Everyone can see the results of the latest build
· Automate Deployment

Maintain a code repository
I think everyone is on board with the concept of a code repository. It provides versioning of the source code and allows management to see who did what and when, through comments placed in the revision log each time code is checked in. Consider for a moment how many times code has been checked in, only to later cause someone else to download code that won’t compile. How much time was spent trying to figure out what the problem was and resolve it?

Automate the build
Enter the automated build. This means that the entire system should be buildable using a single command. This is in contrast to a manual build process, where a person has to perform multiple, often tedious and error-prone tasks. The goal of this automation is to create a one-step process for turning source code into a working system. This is done to save time and to reduce errors.

Each time the system is built, the code is compiled; if for any reason the build fails (including syntax errors or missing resources), everything stops. Code that fails the build should not be accepted into the repository.

Make your build self-testing
The system should be written with tests that verify that it performs as intended. These tests should be run as part of the build. The best way to make the build self-testing is to create “unit tests” for each granular function within the business layer. At build time each of these unit tests could be run to ensure that previously written functionality still works and is compatible with the latest updates. To get the most from unit testing a strict barrier between the UI, business, and data layer should be implemented.

The value of a self testing build grows exponentially as your application gets older and more complex.

Everyone commits every day
By committing regularly, every committer can reduce the number of conflicting changes. Checking in a week’s worth of work runs the risk of conflicting with other features, and such conflicts can be very difficult to resolve. Early, small conflicts in an area of the system cause team members to communicate about the changes they are making.

Every commit (to mainline) should be built
Commits to the current working version should be built to verify they have been integrated correctly. A common practice is to use Automated Continuous Integration. For many, continuous integration is synonymous with using Automated Continuous Integration where a continuous integration server or daemon monitors the version control system for changes, then automatically runs the build process.
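As a rough sketch of what such a server or daemon does on each polling cycle, the Python below compares the newest revision in version control with the last one it built and triggers a build only when something changed. Everything here is hypothetical: poll_once, the simulated revision numbers, and fake_build are placeholders, and a real CI server would query the repository and launch the actual build script instead.

```python
def poll_once(last_built, get_head_revision, build):
    """Build the head revision if it differs from the last one built.

    Returns the revision that is now considered handled, so the caller
    can pass it back in on the next poll.
    """
    head = get_head_revision()
    if head == last_built:
        return last_built  # nothing new was committed
    succeeded = build(head)
    status = "good build" if succeeded else "BROKEN build"
    print(f"revision {head}: {status}")
    return head


# Simulated repository history: the head revision seen on each poll.
observed_heads = iter([101, 101, 102, 103])
built = []


def fake_build(revision):
    """Placeholder build: records the revision and fails for 102."""
    built.append(revision)
    return revision != 102  # pretend revision 102 forgot a file


last = None
for _ in range(4):
    last = poll_once(last, lambda: next(observed_heads), fake_build)
```

In this sketch a broken revision is not rebuilt on the next poll; the server simply waits for the fixing commit, which then shows up as a new head revision.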

Keep the build fast
The build process needs to be fast; this way, if there is a problem with integration, it is quickly identified and addressed by the appropriate team members.

Test in a clone of the production environment
Testing only in a separate test environment can lead to failures in tested systems when they are deployed to production, because the production environment may differ from the test environment in a significant way. Testing in a clone of the production environment reduces that risk.

Everyone can see the results of the latest build
It should be easy to find out whether the build is broken and who made the change. This provides a valuable “state of the application” while at the same time providing a measure of accountability.

Automate Deployment
Automated deployment removes the human factor or “Magic” from deployment, as the system always follows the same procedures and therefore is less error prone.

Tuesday, January 13, 2009

Source Control

For some time now I have been very excited about the benefits offered by source control (SC). It provides a clear and precise record of changes made to code, along with who made them and when; if good comments are added, it can also say why.

Through our recent .Net projects we have become more familiar with one specific system for source control, Subversion (SVN), along with its popular client TortoiseSVN. I would agree that there have been ups and downs; however, looking back I think we are much better off having used it than if we had not. Recently, a new client application became available, Ankh. This app is not exactly a replacement for Tortoise, but more of a companion. The most important service that Ankh offers is that it integrates directly into Visual Studio. Though I have not yet had time to play around with Ankh, it sounds very exciting.

Having enjoyed all the benefits of source control on our .Net projects for more than a year now, I have been very interested in using it with other languages our department develops with, most notably ColdFusion. Unfortunately, it doesn’t seem to be conducive to the way our CF apps are developed. Though each developer working on a .Net project runs a copy locally when developing, our network administrators’ policy is not to install CF express on each machine. This means that all CF source code resides on the server and is edited from there, with developers sharing a single copy. Though I really would like to use source control, I have yet to devise a practical approach to doing so; therefore, all CF source code remains outside of source control.

Another area that would be nice to have under SC is the database. I have read many articles explaining that schema should be kept in SC, but I have yet to read a comprehensive approach that would also cover data, particularly configuration data that is just as important as the schema itself. Without this data under SC, it seems like using it at all would be pointless.

One idea that I recently had was to ban all config data from being kept in the database, instead it would be kept in xml config files that could be kept under SC.
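To make the idea concrete, here is a minimal sketch of reading configuration from an XML file instead of a database table. It uses Python’s standard xml.etree.ElementTree module purely for illustration; the settings layout, the key names, and the values are all hypothetical, and in practice the XML would live in a file checked into source control alongside the code.

```python
import xml.etree.ElementTree as ET

# Hypothetical config content; in practice this would be read from a file
# kept under source control rather than embedded in the script.
CONFIG_XML = """
<settings>
  <setting key="smtpHost" value="mail.example.com" />
  <setting key="pageSize" value="25" />
</settings>
"""


def load_settings(xml_text):
    """Parse <setting key=... value=... /> elements into a dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {node.get("key"): node.get("value") for node in root.findall("setting")}


settings = load_settings(CONFIG_XML)
print(settings["smtpHost"], int(settings["pageSize"]))
```

Because the file is plain text, every change to a setting shows up in the revision log with an author, a date, and a comment, just like a code change.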

Recently, I read some interesting thoughts on the matter. Though I have not had time to read all the content, it is what led me to the XML config file idea… If this interests you, please read the 5 part series available at http://www.codinghorror.com/blog/archives/001050.html
