
Friday, January 23, 2015

C# LINQ: GroupBy


Background

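A project I worked on recently used LINQ heavily, but I never had time to dig into how it actually works.
GroupBy came up especially often, to compute per-group sums, averages, maximums, minimums, counts, and so on.
This post explains GroupBy by showing Query Syntax and Method Syntax side by side. It also uses LINQPad to display the results, which makes it easier to see what GroupBy is doing.
Note: for more on Query Syntax, Method Syntax, and LINQPad, see my earlier article on the subject.

The complete sample code is on GitHub and can be downloaded from there.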


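Code Examples

The examples below are all taken from MSDN with some modifications, mainly the addition of the Query Syntax versions.

Example 1

This example groups pets by age, but the element placed in each group is the Pet.Name property (of type string).
I added the Query Syntax version, which makes the meaning easier to follow.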

using System;
using System.Collections.Generic;
using System.Linq;

class Pet
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static void GroupByEx1()
{
    // Create a list of pets.
    List<Pet> pets =
        new List<Pet>{ new Pet { Name="Barley", Age=8 },
                           new Pet { Name="Boots", Age=4 },
                           new Pet { Name="Whiskers", Age=1 },
                           new Pet { Name="Daisy", Age=4 } };

    // Group the pets using Age as the key value 
    // and selecting only the pet's Name for each value.

    // Query Syntax
    // -----------------
    //IEnumerable<IGrouping<int, string>> query1 =
    var query1 =
        from pet in pets
        group pet.Name by pet.Age;

    //Method Syntax
    // -----------------
    IEnumerable<IGrouping<int, string>> query2 =
        pets.GroupBy(pet => pet.Age, pet => pet.Name);

    // Iterate over each IGrouping in the collection.
    Console.WriteLine("===== Query Syntax =====");
    foreach (IGrouping<int, string> petGroup in query1)
    {
        // Print the key value of the IGrouping.
        Console.WriteLine(petGroup.Key);
        // Iterate over each value in the 
        // IGrouping and print the value.
        foreach (string name in petGroup)
            Console.WriteLine("  {0}", name);
    }

    Console.WriteLine("===== Method Syntax =====");
    foreach (IGrouping<int, string> petGroup in query2)
    {
        // Print the key value of the IGrouping.
        Console.WriteLine(petGroup.Key);
        // Iterate over each value in the 
        // IGrouping and print the value.
        foreach (string name in petGroup)
            Console.WriteLine("  {0}", name);
    }

    //// for LINQPad
    //// ------------------
    //query1.Dump();
    //query2.Dump();

    /*
     This code produces the following output:
    ===== Query Syntax =====
    8
      Barley
    4
      Boots
      Daisy
    1
      Whiskers
    ===== Method Syntax =====
    8
      Barley
    4
      Boots
      Daisy
    1
      Whiskers
    */
}

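The LINQPad output below shows three groups:
Group 1: Key = 8. The group is an IGrouping<int, string>, where int is the type of the Key; in plain terms, it contains the single element "Barley".
Group 2: Key = 4. The group is an IGrouping<int, string>, where int is the type of the Key; in plain terms, it contains the two elements "Boots" and "Daisy".
Group 3: Key = 1. The group is an IGrouping<int, string>, where int is the type of the Key; in plain terms, it contains the single element "Whiskers".
Note: only the Name field is projected as the group element, so each group stores values of type string.


LINQ GroupBy output (Example01)

Tracing the source code shows that the groups are ultimately instances of Grouping, a class that implements IGrouping<TKey, TElement> and IList<TElement> and keeps its elements in an internal TElement[] elements array, as listed below. This matches the structure shown in the figure above.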

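// Nested inside Lookup<TKey, TElement> in the reference source,
// which is where the type parameters TKey and TElement come from.
internal class Grouping : IGrouping<TKey, TElement>, IList<TElement>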
{
    internal TKey key;
    internal int hashCode;
    internal TElement[] elements;
    internal int count;
    internal Grouping hashNext;
    internal Grouping next;

    internal void Add(TElement element) {
        if (elements.Length == count) Array.Resize(ref elements, checked(count * 2));
        elements[count] = element;
        count++;
    }

    public IEnumerator<TElement> GetEnumerator() {
        for (int i = 0; i < count; i++) yield return elements[i];
    }

    IEnumerator IEnumerable.GetEnumerator() {
        return GetEnumerator();
    }

    // DDB195907: implement IGrouping<>.Key implicitly
    // so that WPF binding works on this property.
    public TKey Key {
        get { return key; }
    }

    int ICollection<TElement>.Count {
        get { return count; }
    }

    bool ICollection<TElement>.IsReadOnly {
        get { return true; }
    }

    void ICollection<TElement>.Add(TElement item) {
        throw Error.NotSupported();
    }

    void ICollection<TElement>.Clear() {
        throw Error.NotSupported();
    }

    bool ICollection<TElement>.Contains(TElement item) {
        return Array.IndexOf(elements, item, 0, count) >= 0;
    }

    void ICollection<TElement>.CopyTo(TElement[] array, int arrayIndex) {
        Array.Copy(elements, 0, array, arrayIndex, count);
    }

    bool ICollection<TElement>.Remove(TElement item) {
        throw Error.NotSupported();
    }

    int IList<TElement>.IndexOf(TElement item) {
        return Array.IndexOf(elements, item, 0, count);
    }

    void IList<TElement>.Insert(int index, TElement item) {
        throw Error.NotSupported();
    }

    void IList<TElement>.RemoveAt(int index) {
        throw Error.NotSupported();
    }

    TElement IList<TElement>.this[int index] {
        get {
            if (index < 0 || index >= count) throw Error.ArgumentOutOfRange("index");
            return elements[index];
        }
        set {
            throw Error.NotSupported();
        }
    }
}
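
To make the role of this class more concrete, here is a simplified sketch of how grouping can work. It is not the framework's actual implementation: each new key creates a bucket, buckets are kept in the order their keys first appear, and every element is appended to the bucket for its key. The names GroupingSketch and SimpleGroupBy are hypothetical and exist only for this illustration, and a List<T> stands in for the resizable TElement[] array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingSketch
{
    // Hypothetical helper (not part of LINQ): groups elements by key and
    // returns the buckets in the order in which each key was first seen.
    // Unlike the real GroupBy, it runs eagerly and does not accept null keys.
    public static List<KeyValuePair<TKey, List<TElement>>> SimpleGroupBy<TSource, TKey, TElement>(
        IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        Func<TSource, TElement> elementSelector)
    {
        var buckets = new Dictionary<TKey, List<TElement>>();
        var order = new List<TKey>(); // remembers first-seen key order

        foreach (TSource item in source)
        {
            TKey key = keySelector(item);
            List<TElement> bucket;
            if (!buckets.TryGetValue(key, out bucket))
            {
                // First time this key appears: create its bucket.
                bucket = new List<TElement>();
                buckets.Add(key, bucket);
                order.Add(key);
            }
            // Append the projected element, like Grouping.Add above.
            bucket.Add(elementSelector(item));
        }

        return order
            .Select(k => new KeyValuePair<TKey, List<TElement>>(k, buckets[k]))
            .ToList();
    }
}
```

For instance, calling GroupingSketch.SimpleGroupBy(pets, pet => pet.Age, pet => pet.Name) on the pets list from Example 1 would yield buckets for the keys 8, 4, and 1, in that order, which mirrors the grouping shown in the LINQPad output.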


Example 2

This example also groups pets by age, but the element placed in each group is the whole Pet object.

public static void GroupByEx2()
{
    // Create a list of pets.
    List<Pet> pets =
        new List<Pet>{ new Pet { Name="Barley", Age=8 },
                           new Pet { Name="Boots", Age=4 },
                           new Pet { Name="Whiskers", Age=1 },
                           new Pet { Name="Daisy", Age=4 } };

    // Group the pets using Age as the key value 
    // and keeping the whole Pet object as each value.

    // Query Syntax
    // -----------------
    //IEnumerable<IGrouping<int, Pet>> query1 =
    var query1 =
        from pet in pets
        group pet by pet.Age;

    //Method Syntax
    // -----------------
    IEnumerable<IGrouping<int, Pet>> query2 =
        pets.GroupBy(pet => pet.Age, pet => pet);

    // Iterate over each IGrouping in the collection.
    Console.WriteLine("===== Query Syntax =====");
    foreach (IGrouping<int, Pet> petGroup in query1)
    {
        // Print the key value of the IGrouping.
        Console.WriteLine(petGroup.Key);
        // Iterate over each value in the 
        // IGrouping and print the value.
        foreach (var item in petGroup)
            Console.WriteLine("  {0} : {1}", item.Name, item.Age ) ;
    }

    Console.WriteLine("===== Method Syntax =====");
    foreach (IGrouping<int, Pet> petGroup in query2)
    {
        // Print the key value of the IGrouping.
        Console.WriteLine(petGroup.Key);
        // Iterate over each value in the 
        // IGrouping and print the value.
        foreach (var item in petGroup)
            Console.WriteLine("  {0} : {1}", item.Name, item.Age);
    }

    //// for LINQPad
    //// ------------------
    //query1.Dump();
    //query2.Dump();

    /*
     This code produces the following output:
    ===== Query Syntax =====
    8
      Barley : 8
    4
      Boots : 4
      Daisy : 4
    1
      Whiskers : 1
    ===== Method Syntax =====
    8
      Barley : 8
    4
      Boots : 4
      Daisy : 4
    1
      Whiskers : 1
    */
}

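The LINQPad output below shows three groups:
Group 1: Key = 8. The group is an IGrouping<int, Pet>, where int is the type of the Key; in plain terms, it contains the single Pet object { 8, "Barley" }.
Group 2: Key = 4. The group is an IGrouping<int, Pet>, where int is the type of the Key; in plain terms, it contains the two Pet objects { 4, "Boots" } and { 4, "Daisy" }.
Group 3: Key = 1. The group is an IGrouping<int, Pet>, where int is the type of the Key; in plain terms, it contains the single Pet object { 1, "Whiskers" }.
Note: the whole Pet object is projected as the group element, so each group stores values of type Pet.

LINQ GroupBy output (Example02)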
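
Because the element selector in the Method Syntax above is the identity function (pet => pet), the same grouping can also be written with the single-argument GroupBy(keySelector) overload, which keeps each source element as it is. The following is a minimal sketch under that assumption; the class name GroupByIdentityDemo and the method GroupByAge are made up for this illustration, and Pet is repeated only so the snippet stands on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the Pet class from Example 1, repeated here
// so that this snippet compiles by itself.
public class Pet
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class GroupByIdentityDemo
{
    // Hypothetical helper: equivalent to
    // pets.GroupBy(pet => pet.Age, pet => pet).
    public static IEnumerable<IGrouping<int, Pet>> GroupByAge(IEnumerable<Pet> pets)
    {
        return pets.GroupBy(pet => pet.Age);
    }
}
```

The two-argument form is only needed when the group should hold something other than the source element itself, as in Example 1.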


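Example 3

This example groups pets by age (note that Age is now of type double, and the key is Math.Floor of the age) and retrieves per-group data such as Count, Min, and Max, along with the list of members in each group. Average follows the same pattern.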

class Pet2
{
    public string Name { get; set; }
    public double Age { get; set; }
}

public static void GroupByEx3()
{
    // Create a list of pets.
    List<Pet2> petsList =
        new List<Pet2>{ new Pet2 { Name="Barley", Age=8.3 },
                           new Pet2 { Name="Boots", Age=4.9 },
                           new Pet2 { Name="Whiskers", Age=1.5 },
                           new Pet2 { Name="Daisy", Age=4.3 } };

    // Group Pet objects by the Math.Floor of their age.
    // Then project an anonymous type from each group
    // that consists of the key, the count of the group's
    // elements, and the minimum and maximum age in the group.

    // Query Syntax
    // -----------------
    var query1 = from p in petsList
                group p by Math.Floor(p.Age)
                    into pets
                    select new
                    {
                        Key = pets.Key,
                        Count = pets.Count(),
                        Min = pets.Min(pet => pet.Age),
                        Max = pets.Max(pet => pet.Age),
                        DetailList = new List<Pet2>(pets)
                    };

    // Method Syntax
    // -----------------
    var query2 = petsList.GroupBy(
        pet => Math.Floor(pet.Age),
        (age, pets) => new
        {
            Key = age,
            Count = pets.Count(),
            Min = pets.Min(pet => pet.Age),
            Max = pets.Max(pet => pet.Age),
            DetailList = new List<Pet2>(pets)
        });


    // Iterate over each anonymous type.
    Console.WriteLine("");
    Console.WriteLine("===== Query Syntax =====");
    foreach (var result in query1)
    {
        Console.WriteLine("Age group: " + result.Key);
        Console.WriteLine("Number of pets in this age group: " + result.Count);
        Console.WriteLine("Minimum age: " + result.Min);
        Console.WriteLine("Maximum age: " + result.Max);
        foreach (var item in result.DetailList)
        {
            Console.WriteLine("    Members: " + item.Name + item.Age);
        }
    }

    Console.WriteLine("===== Method Syntax =====");
    foreach (var result in query2)
    {
        Console.WriteLine("Age group: " + result.Key);
        Console.WriteLine("Number of pets in this age group: " + result.Count);
        Console.WriteLine("Minimum age: " + result.Min);
        Console.WriteLine("Maximum age: " + result.Max);
        foreach (var item in result.DetailList)
        {
            Console.WriteLine("    Members: " + item.Name + item.Age);
        }
    }

    //// for LINQPad
    //// ------------------
    //query1.Dump();
    //query2.Dump();

    /*  This code produces the following output:

    ===== Query Syntax =====
    Age group: 8
    Number of pets in this age group: 1
    Minimum age: 8.3
    Maximum age: 8.3
        Members: Barley8.3
    Age group: 4
    Number of pets in this age group: 2
    Minimum age: 4.3
    Maximum age: 4.9
        Members: Boots4.9
        Members: Daisy4.3
    Age group: 1
    Number of pets in this age group: 1
    Minimum age: 1.5
    Maximum age: 1.5
        Members: Whiskers1.5
    ===== Method Syntax =====
    Age group: 8
    Number of pets in this age group: 1
    Minimum age: 8.3
    Maximum age: 8.3
        Members: Barley8.3
    Age group: 4
    Number of pets in this age group: 2
    Minimum age: 4.3
    Maximum age: 4.9
        Members: Boots4.9
        Members: Daisy4.3
    Age group: 1
    Number of pets in this age group: 1
    Minimum age: 1.5
    Maximum age: 1.5
        Members: Whiskers1.5
    */
}

The LINQPad output is shown below:
LINQ GroupBy output (Example03)
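
The introduction also mentions per-group sums and averages, which Example 3 does not compute. The same pattern extends to them through the standard Sum and Average operators. The following is a small sketch using the Pet2 data shape from Example 3; the names AgeGroupStats, GroupByAggregateDemo, and Summarize are hypothetical, and a named result class replaces the anonymous type only so that the result can be returned from a method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the Pet2 class from Example 3, repeated here
// so that this snippet compiles by itself.
public class Pet2
{
    public string Name { get; set; }
    public double Age { get; set; }
}

// Hypothetical result type for this illustration.
public class AgeGroupStats
{
    public double Key { get; set; }
    public double Sum { get; set; }
    public double Average { get; set; }
}

public static class GroupByAggregateDemo
{
    // Groups pets by Math.Floor(Age) and computes the sum and the
    // average of the ages within each group.
    public static List<AgeGroupStats> Summarize(IEnumerable<Pet2> petsList)
    {
        var query = from p in petsList
                    group p by Math.Floor(p.Age) into pets
                    select new AgeGroupStats
                    {
                        Key = pets.Key,
                        Sum = pets.Sum(pet => pet.Age),
                        Average = pets.Average(pet => pet.Age)
                    };
        return query.ToList();
    }
}
```

Since every group produced by GroupBy contains at least one element, calling Average inside the projection does not run into the empty-sequence exception that Average throws on an empty collection of non-nullable values.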



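Summary

This post walked through several examples of C# LINQ GroupBy in both Query Syntax and Method Syntax, and used LINQPad to display the structure of the results, as a reference for anyone interested.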

