
Finding Duplicate Values in a Dictionary Using C#

Due to a series of blog posts that I’m writing on TFS and MS Cognitive Services, I came across a requirement to identify duplicate values in a dictionary. For example, imagine you had an actual physical dictionary, and you wanted to find all the words that meant the exact same thing. Here’s the set-up for the test:

Dictionary<int, string> test = new Dictionary<int, string>()
{
    { 1, "one"},
    { 2, "two" },
    { 3, "one" },
    { 4, "three" }
};
DisplayDictionary("Initial Collection", test);

I’m outputting to the console at every stage, so here’s the helper method for that:


private static void DisplayDictionary(string title, Dictionary<int, string> test)
{
    Console.WriteLine(title);
    foreach (var it in test)
    {
        Console.WriteLine($"Key: {it.Key}, Value: {it.Value}");
    }
}

Finding Duplicates

LINQ has a method that looks as though it should do this: Intersect, which returns the elements common to two sequences. For flat collections this works excellently, but not so well for dictionaries; here was my first attempt:


Dictionary<int, string> intersect = test.Intersect(test)
    .ToDictionary(i => i.Key, i => i.Value);
DisplayDictionary("Intersect", intersect);

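As you can see, the intersect doesn’t work very well this time (don’t tell Chuck). Intersecting the dictionary with itself compares whole key/value pairs, so every entry matches itself and all four come back.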
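To make the difference concrete, here is a minimal sketch contrasting Intersect on two flat collections with Intersect on a dictionary against itself. The class and method names (IntersectDemo, CommonValues, SelfIntersectCount) and the second word list are made up for illustration; only the dictionary contents come from the set-up above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectDemo
{
    // Intersect returns the distinct elements that appear in BOTH sequences.
    public static List<string> CommonValues(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Intersect(second).ToList();
    }

    // Intersecting a dictionary with itself compares whole KeyValuePair entries
    // (key AND value), so every entry matches itself and nothing is filtered out.
    public static int SelfIntersectCount(Dictionary<int, string> source)
    {
        return source.Intersect(source).Count();
    }

    public static void Main()
    {
        var a = new[] { "one", "two", "three" };
        var b = new[] { "three", "four", "one" };
        Console.WriteLine(string.Join(", ", CommonValues(a, b))); // one, three

        var test = new Dictionary<int, string>
        {
            { 1, "one" }, { 2, "two" }, { 3, "one" }, { 4, "three" }
        };
        Console.WriteLine(SelfIntersectCount(test)); // 4
    }
}
```

So Intersect answers "what do these two sequences share?", which is a different question from "which values occur more than once in this one collection?".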

Manual Intersect

The next stage then is to roll your own; a pretty straightforward lambda in the end:


var intersect2 = test.Where(i => test.Any(t => t.Key != i.Key && t.Value == i.Value))
    .ToDictionary(i => i.Key, i => i.Value);
DisplayDictionary("Manual Intersect", intersect2);
 

This works much better: only the entries with keys 1 and 3, which share the value “one”, are returned.
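The Where/Any version compares every entry against every other entry, which is fine for four items but grows quadratically. One possible alternative, sketched here as an assumption rather than anything from the original code, is to group by value and keep only the groups with more than one member. The names DuplicateValueFinder and FindDuplicateValues are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateValueFinder
{
    // Returns only the entries whose value is shared with at least one other key.
    public static Dictionary<int, string> FindDuplicateValues(Dictionary<int, string> source)
    {
        return source
            .GroupBy(pair => pair.Value)          // one group per distinct value
            .Where(group => group.Count() > 1)    // keep values used by more than one key
            .SelectMany(group => group)           // flatten back to key/value pairs
            .ToDictionary(pair => pair.Key, pair => pair.Value);
    }

    public static void Main()
    {
        var test = new Dictionary<int, string>
        {
            { 1, "one" }, { 2, "two" }, { 3, "one" }, { 4, "three" }
        };

        foreach (var it in FindDuplicateValues(test))
        {
            Console.WriteLine($"Key: {it.Key}, Value: {it.Value}");
        }
    }
}
```

For the test dictionary this yields the same two entries (keys 1 and 3) as the manual intersect.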

Assigning an index to a collection using LINQ

This is a neat little trick that I came across when looking for a way to arbitrarily assign a unique index to each element in a collection. Imagine a scenario where you have a collection of a class such as the following:

    class TestIndex
    {
        public string Title { get; set; }
        public int Amount { get; set; }
        public int Index { get; set; }

        public override string ToString()
        {
            return string.Format("{0}: {1}, {2}", Index, Title, Amount);
        }

    }

Here’s the code to populate it:

List<TestIndex> l = new List<TestIndex>()
{
    new TestIndex() { Title="test1", Amount=20 },
    new TestIndex() { Title="test2", Amount=30 },
    new TestIndex() { Title="test3", Amount=5 },
    new TestIndex() { Title="test4", Amount=30 }
};

I’ve overridden ToString so that we can see what’s in it:

foreach (var t in l)
{
    Console.WriteLine(t.ToString());
}


Running this shows that every element has a property called Index, but it contains nothing (well, the default of 0). However, it can be populated in a single line:

var newList = l.Select((el, idx) => { el.Index = idx; return el; });

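One thing worth keeping in mind with this trick: Select is lazily evaluated, so the lambda that sets Index only runs when the result is enumerated (by a foreach, ToList and so on). The following is a small sketch of that behaviour; Item is a hypothetical, cut-down stand-in for TestIndex, and the other names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, cut-down stand-in for the TestIndex class.
public class Item
{
    public string Title { get; set; }
    public int Index { get; set; }
}

public static class DeferredIndexDemo
{
    public static List<Item> BuildItems()
    {
        return new List<Item>
        {
            new Item { Title = "test1" },
            new Item { Title = "test2" },
            new Item { Title = "test3" }
        };
    }

    public static void Main()
    {
        var items = BuildItems();

        // Nothing runs here: Select only builds a query.
        var query = items.Select((el, idx) => { el.Index = idx; return el; });
        Console.WriteLine(items[2].Index); // 0

        // ToList enumerates the query, which runs the lambda for each element.
        var indexed = query.ToList();
        Console.WriteLine(items[2].Index); // 2
    }
}
```

Note also that the lambda mutates the original objects, so the new sequence and the source list hold the same, now-indexed, instances.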

The Select method has an overload that tells you the index of each element, based on its position in the sequence at that point; for example, after sorting:

var newList = l.OrderBy(a => a.Amount).Select((el, idx) => { el.Index = idx; return el; });

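If changing the original objects is not wanted, the same indexed overload can project into a new value instead. This is a sketch under a few assumptions: it uses value tuples (C# 7 or later) rather than the TestIndex class, and RankDemo and RankByAmount are hypothetical names. The data matches the list above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RankDemo
{
    // Sorts by Amount, then pairs each item with its position in the sorted order.
    // The source items are left untouched; a new tuple is created per element.
    public static List<(int Index, string Title, int Amount)> RankByAmount(
        IEnumerable<(string Title, int Amount)> source)
    {
        return source
            .OrderBy(item => item.Amount)
            .Select((item, idx) => (Index: idx, Title: item.Title, Amount: item.Amount))
            .ToList();
    }

    public static void Main()
    {
        var data = new[] { ("test1", 20), ("test2", 30), ("test3", 5), ("test4", 30) };

        foreach (var row in RankByAmount(data))
        {
            Console.WriteLine($"{row.Index}: {row.Title}, {row.Amount}");
        }
        // 0: test3, 5
        // 1: test1, 20
        // 2: test2, 30
        // 3: test4, 30
    }
}
```

OrderBy is a stable sort, so test2 stays ahead of test4 even though both have an Amount of 30.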

Complete Code Listing

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            List<TestIndex> l = new List<TestIndex>()
            {
                new TestIndex() { Title="test1", Amount=20 },
                new TestIndex() { Title="test2", Amount=30 },
                new TestIndex() { Title="test3", Amount=5 },
                new TestIndex() { Title="test4", Amount=30 }
            };

            foreach (var t in l)
            {
                Console.WriteLine(t.ToString());
            }

            Console.WriteLine(" - - - ");

            var newList = l.Select((el, idx) => { el.Index = idx; return el; });

            foreach (var t in newList)
            {
                Console.WriteLine(t.ToString());
            }

            Console.ReadLine();
        }
    }

    class TestIndex
    {
        public string Title { get; set; }
        public int Amount { get; set; }
        public int Index { get; set; }

        public override string ToString()
        {
            return string.Format("{0}: {1}, {2}", Index, Title, Amount);
        }
    }