Comparing LINQ’s maintainability
LINQ hides the gory details of how algorithms work, making code measurably more maintainable. Maintainability is correlated with both cost[pdf] and defects[pdf]. In the following example, we’ll use Visual Studio 2010 to measure maintainability.
Set Intersection
Imagine we have two sets of integers, and we need to find the intersection – that is, numbers that are in both sets. In addition, the intersection set should not contain duplicates. Below are four C# functions, followed by a comparison of their maintainability indexes.
Function 1
A junior developer might begin by taking each number in set A, and searching set B for that number. The C# code for this might look like:
{
var intersections = new List();
foreach (int a in SetA)
foreach (int b in SetB)
if (a == b)
if (Contains(intersections, a))
intersections.Add(a);
return intersections.ToArray();
}
bool Contains(List set, int target)
{
foreach (int i in set)
if (i == target)
return true;
return false;
}
Function 2
A more senior developer might review Function 1 and notice that it is slow when operating on large sets. There are three nested loops, including the .Contains, which is great than O(n2) complexity.
A more efficient solution would be to sort both sets first, so that we can find the intersection using a single loop. For example:
{
//sort sets and use parallel pointers O(n)
var sortedA = new List(SetA);
var sortedB = new List(SetB);
var intersect = new List();
sortedA.Sort();
sortedB.Sort();
int indexA = 0;
int indexB = 0;
while (indexA < sortedA.Count && indexB < sortedB.Count)
{
if (sortedA[indexA] < sortedB[indexB])
indexA++;
else if (sortedB[indexB] < sortedA[indexA])
indexB++;
else
{
intersect.Add(sortedA[indexA]);
indexA++;
indexB++;
}
}
return intersect.ToArray();
}
This is the type of code that gives the author pride, and the maintainer a headache. It may be fast, but the intent is not obvious.
Function 3
Now let’s try a query using LINQ.
{
return (from a in SetA
where SetB.Contains(a)
select a).Distinct().ToArray();
}
Note that this function creates a nested loop. For each number in set A, we must search for it in set B.
Function 4
Fortunately, LINQ already contains a number of basic math operations, including intersect. For example:
{
return SetA.Intersect(SetB).ToArray();
}
Maintainability
The LINQ functions appear to be the most maintainble. However, we need a more systematic approach. Starting with Visual Studio 2008, the Visual Studio now includes a static code analysis tool, which we can use to compute a maintainability index. These numbers range from 0 to 100, where 100 is the most maintainable. I’ll discuss how this works in a future post.
Using Visual Studio 2010 Ultimate RC1, we get the following results:
| Function | Maintainability Index |
|---|---|
| 1 | 64 |
| 2 | 54 |
| 3 | 73 |
| 3 | 83 |
These metrics show that LINQ does produce more maintainable code. In the next post, we’ll measure the performance of each algorithm.
