devCatharsis

I spend a considerable part of my time reviewing and refactoring code: code written by interns I coach, master's thesis students I supervise, external projects I'm asked to audit, and last but not least my own code.

For ages, code review and refactoring were considered a luxury most projects could easily live without, but in recent years they have finally gained mainstream acceptance. Yes, today I can even explicitly add refactoring tasks to my project plans :)

Tools have been accompanying this trend. Tools like ReSharper work at what Patrick Smacchia refers to as the micro level (i.e. the structure of method bodies). But when projects fail technically, they usually do so at the macro level (i.e. the structure of classes, namespaces and assemblies), and this is where NDepend comes to the rescue.

NDepend is a tool that analyzes the structure of a .NET code base against a set of design rules, suggests refactorings and lets you compare metrics across versions. Yes, it does part of my job. Oops...

Here are some of the basic metrics anyone would expect to find:

Lines Of Code
Lines Of Comment
Number of IL Instructions
Number of Namespaces
Number of Types
Number of Methods
Number of Fields
Percentage Coverage
Cyclomatic Complexity

And here are some cooler ones:

Afferent coupling (Ca): The number of types outside this assembly that depend on types within this assembly. High afferent coupling indicates that the concerned assemblies have many responsibilities.
Efferent coupling (Ce): The number of types inside this assembly that depend on types outside this assembly. High efferent coupling indicates that the concerned assembly is dependent. Notice that types declared in framework assemblies are taken into account.
Lack of Cohesion Of Methods (LCOM): The single responsibility principle states that a class should not have more than one reason to change. Such a class is said to be cohesive. A high LCOM value generally pinpoints a poorly cohesive class. There are several LCOM metrics. The LCOM takes its values in the range [0-1]. The LCOM HS (HS stands for Henderson-Sellers) takes its values in the range [0-2]. An LCOM HS value higher than 1 should be considered alarming.
Relational Cohesion (H): Average number of internal relationships per type. Let R be the number of type relationships that are internal to this assembly (i.e that do not connect to types outside the assembly). Let N be the number of types within the assembly. H = (R + 1)/ N. The extra 1 in the formula prevents H=0 when N=1. The relational cohesion represents the relationship that this assembly has to all its types.
Instability (I): The ratio of efferent coupling (Ce) to total coupling. I = Ce / (Ce + Ca). This metric is an indicator of the package's resilience to change. The range for this metric is 0 to 1, with I=0 indicating a completely stable package and I=1 indicating a completely instable package.
Abstractness (A): The ratio of the number of internal abstract types (i.e abstract classes and interfaces) to the number of internal types. The range for this metric is 0 to 1, with A=0 indicating a completely concrete assembly and A=1 indicating a completely abstract assembly.
Distance from main sequence (D): The perpendicular normalized distance of an assembly from the idealized line A + I = 1 (called main sequence). This metric is an indicator of the assembly's balance between abstractness and stability. An assembly squarely on the main sequence is optimally balanced with respect to its abstractness and stability. Ideal assemblies are either completely abstract and stable (I=0, A=1) or completely concrete and instable (I=1, A=0). The range for this metric is 0 to 1, with D=0 indicating an assembly that is coincident with the main sequence and D=1 indicating an assembly that is as far from the main sequence as possible. The picture in the report reveals if an assembly is in the zone of pain (I and A both close to 0) or in the zone of uselessness (I and A both close to 1).
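The formulas above are simple enough to try by hand. Here is a minimal Python sketch of H, I, A and D, assuming the per-assembly counts are already known. The `AssemblyStats` record, the function names and the sample numbers are mine, for illustration only; this is not NDepend's implementation. D is taken as |A + I - 1|, the usual normalized form of the distance to the line A + I = 1.

```python
from dataclasses import dataclass


@dataclass
class AssemblyStats:
    """Hypothetical per-assembly counts (NDepend computes the real ones)."""
    name: str
    ca: int                  # afferent coupling (Ca)
    ce: int                  # efferent coupling (Ce)
    abstract_types: int      # internal abstract classes and interfaces
    total_types: int         # N: all internal types
    internal_relations: int  # R: type relationships internal to the assembly


def relational_cohesion(s: AssemblyStats) -> float:
    # H = (R + 1) / N
    return (s.internal_relations + 1) / s.total_types


def instability(s: AssemblyStats) -> float:
    # I = Ce / (Ce + Ca). Returning 0.0 for an assembly with no coupling
    # at all is an assumption of this sketch, not something the text defines.
    total = s.ca + s.ce
    return s.ce / total if total else 0.0


def abstractness(s: AssemblyStats) -> float:
    # A = abstract internal types / internal types
    return s.abstract_types / s.total_types


def distance_from_main_sequence(s: AssemblyStats) -> float:
    # D = |A + I - 1|: 0 on the main sequence, 1 at the far corners
    return abs(abstractness(s) + instability(s) - 1)


# Made-up numbers: many dependents, few abstractions.
core = AssemblyStats("Core", ca=30, ce=10, abstract_types=2,
                     total_types=20, internal_relations=45)
print(f"{core.name}: H={relational_cohesion(core):.2f} "
      f"I={instability(core):.2f} A={abstractness(core):.2f} "
      f"D={distance_from_main_sequence(core):.2f}")
# Core: H=2.30 I=0.25 A=0.10 D=0.65
```

With I and A both low, this made-up assembly leans towards the zone of pain: many types depend on it, yet it offers few abstractions to shield them from change.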

Most of these metrics come with recommendations. For example:

Methods where NbLinesOfCode is higher than 20 are hard to understand and maintain. Methods where NbILInstructions is higher than 40 are extremely complex and should be split in smaller methods (except if they are automatically generated by a tool).

Or this:

Code where the percentage of comment is lower than 20% should be more commented. However overly commented code (>40%) is not necessarily a blessing as it can be considered as an insult to the intelligence of the reader. Guidelines about code commenting can be found here.
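To see how such recommendations turn into automatic checks, here is a small Python sketch that applies the thresholds quoted above (20 lines of code, 40 IL instructions, 20% and 40% of comments). The function names and the sample values are hypothetical; NDepend expresses these checks as rules of its own rather than as code like this.

```python
def method_warnings(name, lines_of_code, il_instructions):
    """Return the recommendations a method violates, using the quoted thresholds."""
    warnings = []
    if lines_of_code > 20:
        warnings.append(f"{name}: {lines_of_code} lines of code (> 20), "
                        "hard to understand and maintain")
    if il_instructions > 40:
        warnings.append(f"{name}: {il_instructions} IL instructions (> 40), "
                        "consider splitting it into smaller methods")
    return warnings


def comment_warning(percent_comment):
    """Return a remark about the comment ratio, or None if it is in the 20-40% band."""
    if percent_comment < 20:
        return "under-commented (< 20%)"
    if percent_comment > 40:
        return "possibly over-commented (> 40%)"
    return None


# Made-up measurements for a single method and its enclosing code.
for line in method_warnings("CustomerService.Save", lines_of_code=35, il_instructions=120):
    print(line)
print(comment_warning(12))
```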

Let's give it a try against a small R&D project written with NetTiers. I'll use VisualNDepend, the interactive tool. A command line utility is also available, but I'll save it for a future post on continuous integration.

The rules are expressed using a cool DSL for code metrics: Code Query Language (CQL). You can express queries like:

// <Name>A stateless type might be turned into a static type</Name>
WARN IF Count > 0 IN SELECT TOP 10 TYPES WHERE 
  SizeOfInst == 0 AND 
  NbInterfacesImplemented == 0 AND // To be accurate, this constraint doesn't take 
                                   // account of types that implement some interfaces.
  !IsStatic AND 
  !IsGeneric AND 
  !IsInterface 
// It indicates stateless types that might eventually be turned into static classes.
// See the definition of the SizeOfInst metric here http://www.ndepend.com/Metrics.aspx#SizeOfInst

Cool, you can use the set of rules that come defined out-of-the-box, and you can always write your own.
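If CQL is new to you, the stateless-type query above reads roughly like a filter over a collection of type records. The Python sketch below mimics it under that assumption. The `TypeInfo` record and the sample types are hypothetical, and how NDepend orders results before applying TOP 10 is not modelled (the sketch simply keeps the first ten matches).

```python
from dataclasses import dataclass


@dataclass
class TypeInfo:
    """Hypothetical stand-in for the per-type facts the query inspects."""
    name: str
    size_of_inst: int               # SizeOfInst: bytes of instance state
    nb_interfaces_implemented: int  # NbInterfacesImplemented
    is_static: bool = False
    is_generic: bool = False
    is_interface: bool = False


def stateless_candidates(types, top=10):
    """SELECT TOP 10 TYPES WHERE ... (first `top` matches, in input order)."""
    matches = [
        t for t in types
        if t.size_of_inst == 0
        and t.nb_interfaces_implemented == 0
        and not t.is_static
        and not t.is_generic
        and not t.is_interface
    ]
    return matches[:top]


types = [
    TypeInfo("StringHelper", size_of_inst=0, nb_interfaces_implemented=0),
    TypeInfo("Customer", size_of_inst=24, nb_interfaces_implemented=1),
    TypeInfo("MathUtil", size_of_inst=0, nb_interfaces_implemented=0, is_static=True),
]

result = stateless_candidates(types)
if len(result) > 0:  # WARN IF Count > 0
    print("Stateless types that might become static:",
          [t.name for t in result])
```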

Here's another favorite of mine:

// <Name>Fields should be marked as ReadOnly when possible</Name>
WARN IF Count > 0 IN SELECT FIELDS WHERE IsImmutable AND !IsInitOnly

// A field that matches the condition IsImmutable is a field that is assigned only by constructors of its class.
// For an instance field, this means its value will remain constant throughout the lifetime of the object.
// For a static field, this means its value will remain constant throughout the lifetime of the program.
// In both cases, such field can safely be marked with the C# readonly keyword (ReadOnly in VB.NET).

// The condition IsInitOnly matches fields that are marked with the C# readonly keyword (ReadOnly in VB.NET).

And look, we can see it against the metric view:

Oops, lots of potential refactoring here (the blue components)... The size of each component in this view is directly proportional to the metric, in this case the number of fields per type. Yes, NetTiers applications tend to do that.

As expected, NDepend helps you identify dependencies. For a start, it does so graphically:

But graphical dependencies are hard to scale, so the matrix representation is my favorite one to keep track of dependencies:
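To give an idea of why a matrix scales better than a graph, here is a minimal Python sketch that builds one from a list of "uses" edges and spots mutual dependencies by looking at symmetric cells. The assembly names and the edge list are made up, and the cell meaning (number of recorded edges) is an assumption of the sketch rather than a description of NDepend's matrix.

```python
def dependency_matrix(edges):
    """Build a square matrix where cell [row][col] counts 'row uses col' edges."""
    names = sorted({name for edge in edges for name in edge})
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for user, used in edges:
        matrix[index[user]][index[used]] += 1
    return names, matrix


def mutual_dependencies(names, matrix):
    """Pairs that depend on each other show up as non-zero symmetric cells."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if matrix[i][j] and matrix[j][i]:
                pairs.append((names[i], names[j]))
    return pairs


# Hypothetical assemblies: (user, used)
edges = [
    ("Web", "Services"),
    ("Services", "Data"),
    ("Services", "Entities"),
    ("Data", "Entities"),
    ("Entities", "Data"),  # a cycle between Data and Entities
]

names, matrix = dependency_matrix(edges)
for name, row in zip(names, matrix):
    print(f"{name:<10}", row)
print("Mutually dependent:", mutual_dependencies(names, matrix))
```

Adding an assembly only adds one row and one column, whereas the equivalent graph quickly turns into a tangle of crossing arrows.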

Finally an important artifact to give you an idea of where you stand is this graph that represents abstractness and stability:

I read a lot of code - anything I can get my hands on. One thing I've been doing lately is running NDepend against all the code I can - I find it most educational.

The tool itself is very drillable and effective. The only problem I had with it was the lack of support for some actions on x64, but full 64-bit support was added in version 2.10.

Give it a try. A must.

Sunday, September 28, 2008

Dependent on NDepend
