Scientist.Net

January 29, 2017

The purpose of the library is to allow you to try new code in a small sample of production usage - effectively, testing in production. The idea being that if you’re refactoring an important part of the system, you can re-write, and then call your new code on occasion; it’s logged and, should it reveal a major issue, can be simply switched off.

The first port of call is the GitHub repository:

Which adds this:

The following is some test code; there are two methods, an old, slow method, and a refactored new method:




class LegacyCode
{
    public void OldMethod1()
    {
        System.Threading.Thread.Sleep(1000);
        System.Console.WriteLine("This is old code");
    }
}
class RefactoredCode
{
    public void RefactoredNewMethod()
    {
        System.Console.WriteLine("RefactoredNewMethod called");
    }
}
static void Main(string[] args)
{
    System.Console.WriteLine("Start Test");
 
    for (int i = 1; i <= 100; i++)
    {
        Scientist.Science<bool>("Test", testNewCode =>
        {
            testNewCode.Use(() =>
            {
                new LegacyCode().OldMethod1();
                return true;
            });
            testNewCode.Try(() =>
            {
                new RefactoredCode().RefactoredNewMethod();
                return true;
            });
        });
    }
 
    System.Console.ReadLine();
}

In the code above you’ll notice that the call to Scientist looks a little forced - that’s because it insists on a return value from the experiments (and experiment being a trial of new code).

As you can see, Scientist is managing the calls between the new and old method:

One thing that wasn’t immediately obvious to me here was exactly how / what it does with this; especially given that the Try and Use blocks were not always appearing in a consistent order; the following test revealed it more clearly:

Because the order of the runs are randomly altered, I had assumed that which code was called was also randomly determined; in fact, both code paths are run. This is a hugely important distinction, because if you are changing data in one or the other, you need to factor this in.

Statistics

Scientist collects a number of statistics on the run; to see these, you need to implement an IResultPublisher; for example:



public class ResultPublisher : IResultPublisher
{
    public Task Publish<T, TClean>(Result<T, TClean> result)
    {
        System.Console.WriteLine($"Publishing results for experiment '{result.ExperimentName}'");
        System.Console.WriteLine($"Result: {(result.Matched ? "MATCH" : "MISMATCH")}");
        System.Console.WriteLine($"Control value: {result.Control.Value}");
        System.Console.WriteLine($"Control duration: {result.Control.Duration}");
        foreach (var observation in result.Candidates)
        {
            System.Console.WriteLine($"Candidate name: {observation.Name}");
            System.Console.WriteLine($"Candidate value: {observation.Value}");
            System.Console.WriteLine($"Candidate duration: {observation.Duration}");
        }
 
        return Task.FromResult(0);
    }
}

The code in here is executed for every call:

We’ve clearly sped up the call, but does it still do the same thing?

Matches… and mismatches

There’s a lot of information in the trace above. One thing that Scientist.Net does allow you to do is to compare the results of a function; let’s change the initial experiment a little:



public bool OldMethod1(int test)
{            
    System.Threading.Thread.Sleep(1000);
    System.Console.WriteLine("This is old code");
    return test >= 50;
}

public bool RefactoredNewMethod(int test)
{
    System.Console.WriteLine("RefactoredNewMethod called");
 
    return test >= 50;
}

for (int i = 1; i <= 100; i++)
{
    var result = Scientist.Science<bool>("Test", testNewCode =>
    {
        testNewCode.Use(() =>
        {
            return new LegacyCode().OldMethod1(i);                        
        });
        testNewCode.Try(() =>
        {
            return new RefactoredCode().RefactoredNewMethod(i);                        
        });
    });
}

Now we’re returning a boolean flag to say that the number is greater or equal to 50, and returning that. Finally, we need to change ResultPublisher (otherwise we won’t be able to see the wood for the trees:




public Task Publish<T, TClean>(Result<T, TClean> result)
{
    if (result.Mismatched)
    {
        System.Console.WriteLine($"Publishing results for experiment '{result.ExperimentName}'");
        System.Console.WriteLine($"Result: {(result.Matched ? "MATCH" : "MISMATCH")}");
        System.Console.WriteLine($"Control value: {result.Control.Value}");
        System.Console.WriteLine($"Control duration: {result.Control.Duration}");
        foreach (var observation in result.Candidates)
        {
            System.Console.WriteLine($"Candidate name: {observation.Name}");
            System.Console.WriteLine($"Candidate value: {observation.Value}");
            System.Console.WriteLine($"Candidate duration: {observation.Duration}");
        }
    }
 
    return Task.FromResult(0);
}

If we run that:

Everything is the same. So, let’s break the new code:




public bool RefactoredNewMethod(int test)
{
    System.Console.WriteLine("RefactoredNewMethod called");
 
    return test > 50;
}

Now we have a bug in the new code, so what happens:

We have a mismatch. The old code is now behaving differently, and so Scientist has identified this.

Summary

I came across this on this episode of .Net Rocks with Phil Haack. There are more features here, too - you can control the way the comparison works, categorise the results, and so forth.

References

http://haacked.com/archive/2016/01/20/scientist/

https://visualstudiomagazine.com/articles/2016/11/01/testing-experimental-code.aspx

https://github.com/github/Scientist.net