C# 개발자로서 알아야 할 사항 - 컬렉션

18587 단어 csharpdotnet


이 게시물에서는 주요 C# 컬렉션을 살펴보고, 주요 기능을 요약하고, 이에 대한 몇 가지 은밀한 사항을 신속하게 알려드리겠습니다.

목차



  • Arrays
  • Features of an array


  • Collections
  • Common features of almost all collections
  • Selecting the best collection for the job


  • Lists
  • List add operations can be costly
  • Example of an O(n) operation using RemoveAt(0) with a List
  • Do not perform O(n) operations inside a loop!
  • Sorted List

  • LinkedList

  • Dictionary and related types
  • Features of a Dictionary
  • Sorted Dictionary

  • Stack
  • Queue

  • HashSet
  • The best solution for dealing with duplicates
  • Comparison between Dictionary and HashSet
  • SortedSet


  • Read-Only Collection
  • Caution ReadOnlyDictionary it's not really read-only

  • Immutable Collections
  • Some examples of time complexity in collections

  • 어레이

    An array is a reference type that contains a fixed-size list of elements that can be accessed using a positional index number. They are available through the System.Array 네임스페이스.

    어레이의 기능

    • Strongly typed
    • Useful when an array size can in known at design time.
    • No ability to add or remove elements.
    • Allows for multiple dimensions (array of an array)
    • Using an array might allow you to squeeze out a bit more performance with large datasets.
    • Two similar arrays are not equal when compared (arr1 == arr2) because they are reference types. The memory location is what's compared and not the content of the array.
    • To compare the content of the array, we can use the possibly expensive SequenceEqual() method.
    • When assigning array X to array Y, the equality of (x == y) equates to true.
    • An array occupies a single continuous block of memory.

    컬렉션

    A collection is a class that provides in-memory management of groups of items; by providing methods for adding, removing, or finding things in the collection. The items can be similar (strongly typed) or different from each other.

    There are four main types of collections: generic collections and non-generic collections, concurrent and immutable. These are the following namespaces:

  • Not Strongly typed & Strongly typed System.Collections &( System.Collections.Generic )
  • 동시( System.Collections.Concurrent )
  • 불변( System.Collections.Immutable )

  • 거의 모든 컬렉션의 공통 기능

    • Used when the number of elements is unknown at design time, or it's required to have the ability to add and remove elements.
    • The ability to iterate through the collection (implements the IEnumerable interface).
    • A method to copy it's elements to an array with CopyTo().
    • Provide information about the capacity of the collection and the number of elements currently in it.
    • They are 0-indexed, meaning the first element is always at [0].
    • Two similar collections are not equal when comparing (arrayX == arrayZ) because they are reference types. The memory location (reference) of the object is what's compared and not the content of the collection.

    작업에 가장 적합한 컬렉션 선택

    In general, avoid using non-generic collections. I recommend using the generic and concurrent versions of the collections because of their greater type safety and other improvements.

    To learn how to choose the appropriate collection, check and .

    .NET 컬렉션 살펴보기



    다음으로 속성을 빠르게 소개하고 일부 .NET 컬렉션을 관찰하겠습니다.

    기울기

    • Occupies a single continuous block of memory.
    • Optimized for fast lookups
    • To look up an element using the List .[] index access is an O(1) operation.
    • To lookup an element using a method like List.Find(x => x == y) the operation will be O(n).

    목록 추가 작업은 비용이 많이 들 수 있습니다.

    1. When a new List is created; Internally, an array, of a specific size, int[] is created with an initial capacity of N;
    2. After N add operations, the array is not big enough. A new array with more capacity is created, the contents of the current array are copied, the previous array gets discarded, and the new element is added.

    An operation to add an element to a List can be costly because it can cause memory relocations, which makes the operation possibly slow. Usually, the List is allocated with enough space, but attention should be given to this for a list where large datasets are being inserted.

    목록과 함께 RemoveAt(0)을 사용하는 O(n) 작업의 예

    It's important to note that we cannot have empty spaces in memory.
    When removing the first element in a list, all its elements must be moved (-1 positions) in memory. This means that removing an element at for a list of n elements:

    • At [0] position of an array is an O(1) operation
    • At [x] position of an array is an O(n-x) operation
    • At [n-1] position of an array is an O(n) operation.

    Or we can say that the RemoveAt() method is an O(n) operations where n is (Count - index).

    루프 내에서 O(n) 작업을 수행하지 마십시오!

    Look at the following piece of code:

    for(int i = lst.Count-1; i  0; i--){
       if(isPair(anotherList[i])){
          lst.RemoveAt(i);
       }
    }
    

    The RemoveAt() is O(n), and the entire loop could be in the worst case also O(n), which creates a complexity of O(n²), which should be avoided whenever possible; this is even a bigger problem for large datasets. In general, we should avoid O(n) operations inside loops. An alternative to this is using LINQ method RemoveAll():

        list.RemoveAll(x => isPair(x));
    

    SortedList

    • Functionally a dictionary
    • Internally uses a list
    • Uses less memory than SortedDictionary

    LinkedList

    This is a very well known data structure where it's data elements order is not given by their physical placement in memory. Instead, each element points to the next.

    • Optimized for fast changes (adding and removing is fast). Very scalable.
    • Contrary to a List, when removing an element in the middle of the LinkedList, there is no need for memory relocations. Only update the references in the previous and next node.
    • The lookup is slow (because we have to jump from one place to another).
    • Under the hood, the data is saved to a LinkedListNode.
    • No direct indexed element access.

    Because of its optimization for changes, a LinkedList could be used as a temporary working set for operations that are performing a large number of changes to it; it can then be copied to another collection when done with editing the List.

    사전 및 관련 유형

    사전의 특징

    • Lists are not suitable for lookups in large data sets; the Dictionary fetches an element for the cost of O(1) when trying to get an element from it!
    • The Dictionary uses keys to index its elements; they can be of any data type, some better than others. The keys are case sensitive, and we must specify a StringComparer when using string keys to avoid problems.
    • The Dictionary returns a key-value pair and not just the value.
    • The items in a dictionary are not ordered.
    • When using custom types for the key, we must implement an IEquatable interface and specify an override to the following methods: "bool Equals(object obj)" method, and the "GetHashCode(…)" method.
      • Optionally, you can also specify the "bool operator==" operator.
    • Make sure to have consistency between the Equality comparator and the HashCode generator. Take advantage of existing MS implementations.

    정렬된 사전

    • Keyed access to items
    • Sorts items only by key
    • Guaranteed sort order when enumerating
    • Modifications scale better

    스택

    We pop the last item in the collection and push items to it.
    Useful when we have items (or tasks) to be processed. New items are added to the collection; Items are removed as they are processed.

    • An Undo operation is a good use case of this data structure.
    • Items are stored in order (like List ).
    • Retrieving an item removes it.
    • No direct element lookup.
    • Provides the most recent item in the collection.

    대기줄

    • Behaves like a real queue; the first item in is the first item out (FIFO).
    • Retrieving an item with "Dequeue()" removes it from the queue, but we can use "Peek()" to see the next item in the queue without removing it.
    • No direct element lookup but supports enumeration with the foreach loop.
    • Provides the item waiting the longest time in the queue.

    해시셋

    • Useful to enforce uniqueness out of the box.
    • Allows for set operations (unions and intersections).
    • Similar to dictionaries (But lack keys and don't support lookups).

    중복 처리를 위한 HashSet

    There are two ways to deal with unwanted duplicates: the native way and the better away (just kidding, kind of).

    순진한 방법:




    var uniqueListOfNumbers = listOfNumbers.Distinct().ToList();
    


    이 전략은 소규모 컬렉션에는 적합하지만 대규모 컬렉션에는 적합하지 않습니다.

    더 나은 방법



    또 다른 전략은 HashSet을 사용하는 것입니다. 이렇게 하면 HashSet에 요소를 추가할 때 요소를 세트에 추가할 때 중복 값을 무시합니다! 고유성을 적용하는 데 매우 확장 가능하고 효율적입니다.

    Dictionary와 HashSet의 비교

    • They are both unordered bags of objects and use similar data structures internally.
    • Dictionaries have keys and use them to perform lookups | HashSet does not have keys and does not support lookup (only enumeration).
    • Dictionaries have unique keys | HashSet has unique values.
    • Dictionaries throw an exception when adding duplicates | HashSet ignores when adding duplicates.

    SortedSet

    • Used to order a set as you add elements to it.
    • Requires that an implementation of IComparer is provided so that the sorted set can use it to compare elements when sorting. * We are the ones that decide how the sorting happens.

    읽기 전용 컬렉션

    Read-only collections only allow reading of the collection (pretty self-explanatory). This is useful if you have a collection that you want to be able to modify internally, but you want it to be read-only to external users of your library or application.

    주의 ReadOnlyDictionary 실제로는 읽기 전용이 아닙니다.

    When we update a standard Dictionary , and we then create a new ReadOnlyDictionary based on the standard dictionary Dictionary the changes will also be reflected in the ReadOnlyDictionary, maybe defeating our initial goal. To avoid this, we use ImmutableDictionary, which will guarantee that no changes are made to the collection.

    This happens because the ReadOnlyDictionary adds a read-only wrapper around the original collection reference! This is true to all ReadOnly collections.

    불변 컬렉션

    An immutable collection prevents any changes to the collection. It's useful to enforce that a dataset should not be tampered with, possibly avoiding other programmers to introduce bugs in your code.

    • Although immutable collections cannot be modified by "normal" C# code, they could be manipulated with reflection or unmanaged code.
    • Immutable collections are guaranteed to be thread-safe because of their immutable nature.

    컬렉션의 시간 복잡성에 대한 몇 가지 참고 사항

    As you should know, the time complexity of any piece of code is an approximation of the time (it's not an absolute measure) that piece of code takes to execute. With collections, it will tell you how the performance scales as the collection size increases.

    오(1)



    컬렉션의 크기와 관계없이 작업을 실행하는 데 걸리는 일정한 시간을 나타냅니다.
  • 배열 또는 목록에서 인덱싱된 조회를 위한 조회 시간입니다.

  • 에)



    컬렉션 크기에 따라 선형적으로 확장되는 시간을 나타냅니다.
  • 목록에서 n번째 항목을 제거합니다.
  • n개의 항목이 있는 컬렉션을 열거하는 시간입니다.

  • 오(n²)



    비선형 방식으로 확장되는 시간을 나타냅니다.
  • 광범위한 컬렉션의 경우 매우 느림
  • 루프 내부에 O(n) 작업을 넣는 것과 같은 작업을 수행하지 않는 한 .NET에서는 드뭅니다.

  • O(log n) & O(n log n)


  • 후드 아래의 배열보다 더 복잡한 알고리즘을 사용하는 일부 컬렉션에서 발견됩니다.
  • 대규모 컬렉션의 경우 O(log n)은 O(1)만큼 성능이 좋습니다.
  • 대규모 수집의 경우 O(n log n)은 O(n)만큼 성능이 좋습니다.

  • 참조



    [1]. Arrays - https://docs.microsoft.com

    좋은 웹페이지 즐겨찾기