Using BenchmarkDotNet to profile string comparison

Introduction

String comparison and manipulation of strings are some of the slowest and most expensive (in terms of GC) things that you can do in .Net. In my head, I’ve always believed that using String.Compare outperforms string1.ToUpper() == string2.ToUpper(), which I think I once saw on a StackOverflow post.

In this post, I will do some actual testing on the various methods using BenchMarkDotNet (which I have previously written about).

Setting Up BenchmarkDotNet

There’s not much to this – just install a NuGet package:

Install-Package BenchmarkDotNet

Other than that, you just need to decorate your methods with:

[Benchmark]

You can’t (ATM) specify method parameters, but you can decorate a set-up method, or you can specify some parameters in a public variable:


        [Params("test1", "test2", "I am an aardvark")]
        public string _string1;

        [Params("test1", "Test2", "I Am an AARDVARK")]
        public string _string2;

Finally, in the main method, you run the class:


        static void Main(string[] args)
        {
            BenchmarkRunner.Run<StringCompareCaseSensitive>();
        }

Once run, the results are output into the following directory:

bin\Debug\BenchmarkDotNet.Artifacts\results

Comparing strings

Case sensitive

The following are the ways that I can think of to compare a string where the case is known:

string1 == string2

string1.Equals(string2) – with various flags

string.Compare(string1, string2)

string.CompareOrdinal(string1, string2)

string1.CompareTo(string2)

string1.IndexOf(string2) – with various flags

And the results were:

This is definitely not what I expected. String.Compare is actually slower that a straightforward comparison, and not by a small amount.

Case insensitive

The following are the ways that I can think of to compare a string where the case is not known:

String1.ToUpper() == string2.ToUpper()

String1.ToLower() == string2.ToLower()

string1.Equals(string2) – with various flags

string.Compare(string1, string2, true)

string1.IndexOf(string2) -with various flags

Results:

So, it looks like the most efficient string comparison is:

_string1.Equals(_string2, StringComparison.OrdinalIgnoreCase);

But why?

Nobody knows – Looking at the IL

The good thing about .Net, is that if you want to see what your code looks like once it’s “compiled”, you can. It’s not perfect, because you still can’t see the actual, executed code, but it still gives you a good idea of why it’s slow or fast. However, because all of the functions in question are system functions, looking at the IL for the test code is pretty much pointless.

Let’s run ildasm:

(bet you’re glad I included that screenshot)

The string comparison functions are in mscorelib.dll:

Here’s the code in there:

.method public hidebysig static int32  Compare(string strA,
                                               string strB,
                                               valuetype System.StringComparison comparisonType) cil managed
{
  .custom instance void System.Security.SecuritySafeCriticalAttribute::.ctor() = ( 01 00 00 00 ) 
  // Code size       0 (0x0)
} // end of method String::Compare

To be honest, I spent a while burrowing down this particular rabbit hole… but finally decided to see what ILSpy had to say about it… it looks like there is a helper method in the string class that, for some reason, ildasm doesn’t show. Let’s have a look what it does for:

string.Compare(_string1, _string2, true) == 0

The decompiled version is:

[__DynamicallyInvokable]
public static int Compare(string strA, string strB, bool ignoreCase)
{
    if (ignoreCase)
    {
        return CultureInfo.CurrentCulture.CompareInfo.Compare(strA, strB, CompareOptions.IgnoreCase);
    }
    return CultureInfo.CurrentCulture.CompareInfo.Compare(strA, strB, CompareOptions.None);
}

And the static method CompareInfo.Compare:

public virtual int Compare(string string1, string string2, CompareOptions options)
{
    if (options == CompareOptions.OrdinalIgnoreCase)
    {
        return string.Compare(string1, string2, StringComparison.OrdinalIgnoreCase);
    }
    if ((options & CompareOptions.Ordinal) != CompareOptions.None)
    {
        if (options != CompareOptions.Ordinal)
        {
            throw new ArgumentException(Environment.GetResourceString("Argument_CompareOptionOrdinal"), "options");
        }
        return string.CompareOrdinal(string1, string2);
    }
    else
    {
        if ((options & ~(CompareOptions.IgnoreCase | CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreSymbols | CompareOptions.IgnoreKanaType | CompareOptions.IgnoreWidth | CompareOptions.StringSort)) != CompareOptions.None)
        {
            throw new ArgumentException(Environment.GetResourceString("Argument_InvalidFlag"), "options");
        }
        if (string1 == null)
        {
            if (string2 == null)
            {
                return 0;
            }
            return -1;
        }
        else
        {
            if (string2 == null)
            {
                return 1;
            }
            return CompareInfo.InternalCompareString(this.m_dataHandle, this.m_handleOrigin, this.m_sortName, string1, 0, string1.Length, string2, 0, string2.Length, CompareInfo.GetNativeCompareFlags(options));
        }
    }
}

And further:

Well… I couldn’t get further, so I asked Microsoft… the impression is that this function is generated at runtime.

There was a link to some code in this answer, too. While I couldn’t really identify any actual comparison code from this, I did notice that there was a check like this:

#ifndef FEATURE_CORECLR

So… does .NetCore work any better?

Having created a new .Net Core project, and copying the files across (I was going to add them as a link, but InvariantCulture has been removed (or rather, not included) in Core.

Anyway, the results from .Net Core (for case sensitive checks) are:

And case in-sensitive:

Conclusion

So, the clear winner across all tests for case sensitive checks is to use:

string1.Equals(string2)

And .Net Core is slightly faster than 4.6.2.

For case insensitive the clear winner is (by a large margin):

string1.Equals(string2, StringComparison.OrdinalIgnoreCase);

And, again, there’s around a 15 – 20% speed boost using .Net Core.

References

There is a GitHub repository for the code in this post here.

https://msdn.microsoft.com/en-us/library/fbh501kz%28v=vs.110%29.aspx?f=255&MSPPError=-2147217396

https://github.com/dotnet/BenchmarkDotNet/issues/60

http://mattwarren.org/2016/02/17/adventures-in-benchmarking-memory-allocations/

https://www.hanselman.com/blog/BenchmarkingNETCode.aspx

http://pmichaels.net/2016/11/04/message-persistence-in-rabbitmq-and-benchmarkdotnet/

https://blog.codinghorror.com/the-real-cost-of-performance/

https://msdn.microsoft.com/en-us/library/aa309387%28v=vs.71%29.aspx?f=255&MSPPError=-2147217396

http://ilspy.net/

http://stackoverflow.com/questions/9491337/what-is-dllimportqcall

Leave a Reply

Your email address will not be published. Required fields are marked *