2010年10月25日星期一

The Blog of Tony Jiang: SortedSet vs HashSet

SortedSet vs HashSet


HashSet is very good at add and search operations. Any search operation (Contains, Remove, and similar operations) are O(1). That's great. However, on the minus side, the HashSet is not a sorted collection. Therefore, enumerating the elements in a sorted order forces you to copy the items to a different collection (like a List) and sort the resulting list. You could construct a LINQ query to order the elements, however internally that query will likely use some form of temporary storage to create the sorted sequence. That means every sort will be an expensive operation. Sort is typically an O(n ln n) operation, Also, because the HashSet does not have a sort method, you'll also have increased memory pressure and time cost to copy the elements.

SortedSet is new to .NET 4.0 System.Collections.Generic namespace. SortedSet has different characteristics. The sorted set ensures that the elements in the set are always in sorted order. Every Add operation places the new element in the correct location in the set. That means Add is an O(ln n) operation. The SortedSet must perform a binary search to find the correct location for the new element. The search happens on any of the search actions (Contains, Remove, etc). Those operations also have an O(ln n) performance characteristic. That sounds like the SortedSet is always slower than the HashSet. No one would use it if it was always slower. SortedSet is much faster for iterating the set in sorted order. It's already in the correct order, so the enumeration becomes an O(n) operation.

Conclusion
SortedSet will typically be faster than HashSet when the majority of your operations require enumerating the set in one particular order. If, instead, most of the operations are searching, you'll find better performance using the HashSet. The frequency of insert operations also has an effect on which collection would be better. The more frequently insert operations occur, the more likely HashSet will be faster.

Posted via email from 米良的实验室

没有评论:

发表评论