# Hash Set
A set is a collection of elements that is kind of like an array but with two important differences: the order of the elements in the set is unimportant and each element can appear only once.

If the following were arrays, they'd all be different. However, they all represent the same set:

```swift
[1, 2, 3]
[2, 1, 3]
[3, 2, 1]
[1, 2, 2, 3, 1]
```

Because each element can appear only once, it doesn't matter how often you write the element down -- only one of them counts.

> **Note:** I often prefer to use sets over arrays when I have a collection of objects but don't care what order they are in. Using a set communicates to the programmer that the order of the elements is unimportant. If you're using an array, then you can't assume the same thing.

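As a quick check of the claim above, here is a minimal sketch that uses Swift's built-in `Set` (not the `HashSet` developed in this article). The constant names are only for illustration.

```swift
// Four literals that are all different when treated as arrays...
let a = [1, 2, 3]
let b = [2, 1, 3]
let c = [3, 2, 1]
let d = [1, 2, 2, 3, 1]

// ...compare unequal as arrays, because order and repetition matter there.
let arraysEqual = (a == b)

// Converted to sets, order and duplicates no longer matter.
let setsEqual = Set(a) == Set(b) && Set(b) == Set(c) && Set(c) == Set(d)

// The duplicates in `d` are stored only once.
let uniqueCount = Set(d).count
```

Here `arraysEqual` is `false`, `setsEqual` is `true`, and `uniqueCount` is `3`.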
Typical operations on a set are:

- insert an element
- remove an element
- check whether the set contains an element
- take the union with another set
- take the intersection with another set
- calculate the difference with another set

Union, intersection, and difference are ways to combine two sets into a single one: the union contains every element that appears in either set, the intersection contains only the elements that appear in both, and the difference contains the elements of one set that are not in the other.

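To make the list of operations concrete, here is a short sketch using Swift's built-in `Set`, whose methods are named `insert`, `remove`, `contains`, `union`, `intersection`, and `subtracting`. The `HashSet` built later in this article uses its own method names, and the variable names below are purely illustrative.

```swift
var numbers: Set<Int> = [1, 2, 3, 4]

numbers.insert(5)                    // insert an element
numbers.remove(1)                    // remove an element
let hasTwo = numbers.contains(2)     // check membership

let other: Set<Int> = [4, 5, 6]

// Combine the two sets in the three ways described above.
let union = numbers.union(other)                 // elements in either set
let common = numbers.intersection(other)         // elements in both sets
let onlyInNumbers = numbers.subtracting(other)   // elements of `numbers` not in `other`

// Sets are unordered, so sort before displaying.
let sortedUnion = union.sorted()     // [2, 3, 4, 5, 6]
```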
As of Swift 1.2, the standard library includes a built-in `Set` type, but here I'll show how you can make your own. You wouldn't use this in production code, but it's instructive to see how sets are implemented.

It's possible to implement a set using a simple array, but that's not the most efficient way, because checking whether an element is already present means scanning the whole array. Instead, we'll use a dictionary. Since Swift's dictionary is built using a hash table, our own set will be a hash set.

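For comparison, this is roughly what the array-based approach could look like. `ArraySet` is a hypothetical type written only for this illustration; it is not part of the article's code.

```swift
// Hypothetical array-backed set, shown only to contrast with the dictionary approach.
struct ArraySet<T: Equatable> {
  private var elements = [T]()

  init() {}

  // O(n): the array must be scanned first so that no duplicate is stored.
  mutating func insert(_ element: T) {
    if !elements.contains(element) {
      elements.append(element)
    }
  }

  // O(n): a linear scan over all stored elements.
  func contains(_ element: T) -> Bool {
    return elements.contains(element)
  }

  var count: Int {
    return elements.count
  }
}

var slowSet = ArraySet<Int>()
slowSet.insert(1)
slowSet.insert(2)
slowSet.insert(2)                // ignored, 2 is already present
let slowCount = slowSet.count    // 2
```

Every insert and lookup here walks the array, which is the cost the dictionary-based version avoids.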
## The code

Here are the beginnings of `HashSet` in Swift:

```swift
public struct HashSet<T: Hashable> {
  fileprivate var dictionary = Dictionary<T, Bool>()

  public init() {

  }

  public mutating func insert(_ element: T) {
    dictionary[element] = true
  }

  public mutating func remove(_ element: T) {
    dictionary[element] = nil
  }

  public func contains(_ element: T) -> Bool {
    return dictionary[element] != nil
  }

  public func allElements() -> [T] {
    return Array(dictionary.keys)
  }

  public var count: Int {
    return dictionary.count
  }

  public var isEmpty: Bool {
    return dictionary.isEmpty
  }
}
```

The code is really very simple because we rely on Swift's built-in `Dictionary` to do all the hard work. The reason we use a dictionary is that dictionary keys must be unique, just like the elements of a set. In addition, a dictionary has **O(1)** average time complexity for inserting, removing, and looking up a key, making this set implementation very fast.

Because we're using a dictionary, the generic type `T` must conform to `Hashable`. You can put any type of object into our set, as long as it can be hashed. (This is true for Swift's own `Set` too.)

Normally, you use a dictionary to associate keys with values, but for a set we only care about the keys. That's why we use `Bool` as the dictionary's value type, even though we only ever set it to `true`, never to `false`. (We could have picked any value type here; a `Bool` is small, taking up only one byte.)

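The two properties this design leans on, unique keys and the `Hashable` requirement, can be seen directly on a plain dictionary. In this sketch, `Point` is a hypothetical element type invented for the example, and `storage` stands in for the set's private dictionary.

```swift
// Hypothetical element type. Swift synthesizes the Hashable conformance
// because both stored properties are themselves Hashable.
struct Point: Hashable {
  let x: Int
  let y: Int
}

var storage = [Point: Bool]()
storage[Point(x: 1, y: 2)] = true
storage[Point(x: 1, y: 2)] = true    // same key again, nothing new is stored
storage[Point(x: 3, y: 4)] = true

let storedCount = storage.count      // 2, because keys are unique

// Size of the placeholder value stored with each key.
let boolSize = MemoryLayout<Bool>.size   // 1 byte
let voidSize = MemoryLayout<Void>.size   // 0 bytes, the empty tuple is smaller still
```

The last two lines are an aside: a `Bool` placeholder costs one byte per entry, while an empty tuple would cost none, at the price of a less readable `insert`.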
Copy the code to a playground and add some tests:

```swift
var set = HashSet<String>()

set.insert("one")
set.insert("two")
set.insert("three")
set.allElements()      // ["one", "three", "two"]

set.insert("two")
set.allElements()      // still ["one", "three", "two"]

set.contains("one")    // true
set.remove("one")
set.contains("one")    // false
```

The `allElements()` function converts the contents of the set into an array. Note that the order of the elements in that array can be different from the order in which you added the items. As I said, a set doesn't care about the order of the elements (and neither does a dictionary).

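If you need a predictable order, for display or in tests, one option is to sort the resulting array. The sketch below works on a plain dictionary standing in for the set's storage; the names are illustrative.

```swift
var storage = [String: Bool]()
for word in ["one", "two", "three"] {
  storage[word] = true
}

// The order of the keys is unspecified and may differ between runs.
let unordered = Array(storage.keys)

// Sorting gives a stable, alphabetical order.
let ordered = unordered.sorted()     // ["one", "three", "two"]
```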
## Combining sets

A lot of the usefulness of sets is in how you can combine them. (If you've ever used a vector drawing program like Sketch or Illustrator, you'll have seen the Union, Subtract, Intersect options to combine shapes. Same thing.)

The *union* of two sets creates a new set that consists of all the elements in set A plus all the elements in set B. Of course, if there are duplicate elements, they count only once.

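The listing for `union()` is not reproduced in this section, so here is a minimal sketch of how it could be written. It is one possible implementation rather than necessarily the original one, and it repeats a condensed `HashSet` only so that the block runs on its own.

```swift
// Condensed copy of HashSet so this block is self-contained.
struct HashSet<T: Hashable> {
  fileprivate var dictionary = [T: Bool]()

  init() {}

  mutating func insert(_ element: T) {
    dictionary[element] = true
  }

  func allElements() -> [T] {
    return Array(dictionary.keys)
  }
}

extension HashSet {
  // Sketch: start from a copy of this set, then add every element of the other set.
  // Inserting an element that is already present just overwrites its entry,
  // so duplicates count only once.
  func union(_ otherSet: HashSet<T>) -> HashSet<T> {
    var combined = self
    for element in otherSet.dictionary.keys {
      combined.insert(element)
    }
    return combined
  }
}
```

Because `HashSet` is a struct, `var combined = self` makes an independent copy, so neither of the two original sets is modified.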
Example:

```swift
var setA = HashSet<Int>()
setA.insert(1)
setA.insert(2)
setA.insert(3)
setA.insert(4)

var setB = HashSet<Int>()
setB.insert(3)
setB.insert(4)
setB.insert(5)
setB.insert(6)

let union = setA.union(setB)
union.allElements()           // [5, 6, 2, 3, 1, 4]
```

As you can see, the union of the two sets now contains all of the elements. The values `3` and `4` still appear only once, even though they were in both sets.

The *intersection* of two sets contains only the elements that they have in common.

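The `intersect()` listing is not shown in full here, so the following is a hedged sketch of one way to write it, again with a condensed `HashSet` repeated so the block runs by itself. The method body is an assumption, not necessarily the original code.

```swift
// Condensed copy of HashSet so this block is self-contained.
struct HashSet<T: Hashable> {
  fileprivate var dictionary = [T: Bool]()

  init() {}

  mutating func insert(_ element: T) {
    dictionary[element] = true
  }

  func contains(_ element: T) -> Bool {
    return dictionary[element] != nil
  }

  func allElements() -> [T] {
    return Array(dictionary.keys)
  }
}

extension HashSet {
  // Sketch: keep only the elements of this set that the other set also contains.
  func intersect(_ otherSet: HashSet<T>) -> HashSet<T> {
    var common = HashSet<T>()
    for element in dictionary.keys where otherSet.contains(element) {
      common.insert(element)
    }
    return common
  }
}

var setA = HashSet<Int>()
for number in [1, 2, 3, 4] { setA.insert(number) }

var setB = HashSet<Int>()
for number in [3, 4, 5, 6] { setB.insert(number) }

let intersection = setA.intersect(setB)
let sharedElements = intersection.allElements().sorted()   // [3, 4]
```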
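For the sets A and B from the union example, the intersection is `[3, 4]` because those are the only elements from set A that are also in set B.

Finally, the *difference* between two sets removes the elements they have in common: `setA.difference(setB)` keeps only the elements of set A that are not in set B.

In that sense, it's really the opposite of `intersect()`. Try it out:
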
```swift
let difference1 = setA.difference(setB)
difference1.allElements()                // [2, 1]

let difference2 = setB.difference(setA)
difference2.allElements()                // [5, 6]
```

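The `difference()` method called above is not listed in full in this section. A minimal sketch of how it could look, under the same assumptions as before (a condensed `HashSet` is repeated so the block is self-contained, and the body is one possible implementation):

```swift
// Condensed copy of HashSet so this block is self-contained.
struct HashSet<T: Hashable> {
  fileprivate var dictionary = [T: Bool]()

  init() {}

  mutating func insert(_ element: T) {
    dictionary[element] = true
  }

  func contains(_ element: T) -> Bool {
    return dictionary[element] != nil
  }

  func allElements() -> [T] {
    return Array(dictionary.keys)
  }
}

extension HashSet {
  // Sketch: keep the elements of this set that the other set does NOT contain.
  // The only change from an intersection is the negated membership test.
  func difference(_ otherSet: HashSet<T>) -> HashSet<T> {
    var remaining = HashSet<T>()
    for element in dictionary.keys where !otherSet.contains(element) {
      remaining.insert(element)
    }
    return remaining
  }
}

var setA = HashSet<Int>()
for number in [1, 2, 3, 4] { setA.insert(number) }

var setB = HashSet<Int>()
for number in [3, 4, 5, 6] { setB.insert(number) }

let onlyInA = setA.difference(setB).allElements().sorted()   // [1, 2]
let onlyInB = setB.difference(setA).allElements().sorted()   // [5, 6]
```

Note that the difference is not symmetric: swapping the receiver and the argument gives a different result.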
## Where to go from here?

If you look at the [documentation](http://swiftdoc.org/v2.1/type/Set/) for Swift's own `Set`, you'll notice it has tons more functionality. An obvious extension would be to make `HashSet` conform to `SequenceType` (named `Sequence` since Swift 3) so that you can iterate it with a `for`...`in` loop.

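As a rough sketch of that extension in current Swift, where the protocol is called `Sequence`, the set can simply hand out an iterator over its dictionary keys. The condensed `HashSet` below is repeated so the block runs alone, and wrapping the iterator in `AnyIterator` is a choice made for this example to keep the return type short.

```swift
// Condensed copy of HashSet so this block is self-contained.
struct HashSet<T: Hashable> {
  fileprivate var dictionary = [T: Bool]()

  init() {}

  mutating func insert(_ element: T) {
    dictionary[element] = true
  }
}

extension HashSet: Sequence {
  // Sketch: iterate over the dictionary's keys, which are the set's elements.
  func makeIterator() -> AnyIterator<T> {
    return AnyIterator(dictionary.keys.makeIterator())
  }
}

var letters = HashSet<String>()
letters.insert("a")
letters.insert("b")

// The set can now be used in a for...in loop (in unspecified order).
var visited = [String]()
for letter in letters {
  visited.append(letter)
}
```

Conforming to `Sequence` also brings methods such as `map` and `filter` along for free.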
Another thing you could do is replace the `Dictionary` with an actual [hash table](../Hash%20Table), but one that just stores the keys and doesn't associate them with anything. So you wouldn't need the `Bool` values anymore.

If you often need to look up whether an element belongs to a set and perform unions, then the [union-find](../Union-Find/) data structure may be more suitable. It uses a tree structure instead of a dictionary to make the find and union operations very efficient.

> **Note:** I'd like to make `HashSet` conform to `ArrayLiteralConvertible` (renamed `ExpressibleByArrayLiteral` in Swift 3) so you can write `let setA: HashSet<Int> = [1, 2, 3, 4]`, but at the time of writing this crashes the compiler.

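For reference, this is how such a conformance can be sketched with the protocol's current name, `ExpressibleByArrayLiteral`. It assumes a recent Swift toolchain; the compiler crash mentioned in the note relates to the Swift version the article was written for. The condensed `HashSet` is repeated so the block is self-contained.

```swift
// Condensed copy of HashSet so this block is self-contained.
struct HashSet<T: Hashable> {
  fileprivate var dictionary = [T: Bool]()

  init() {}

  mutating func insert(_ element: T) {
    dictionary[element] = true
  }

  func contains(_ element: T) -> Bool {
    return dictionary[element] != nil
  }

  var count: Int {
    return dictionary.count
  }
}

extension HashSet: ExpressibleByArrayLiteral {
  // Sketch: build the set by inserting each element of the literal.
  init(arrayLiteral elements: T...) {
    self.init()
    for element in elements {
      insert(element)
    }
  }
}

// Duplicates in the literal are stored only once.
let setA: HashSet<Int> = [1, 2, 2, 3, 4]
let literalCount = setA.count    // 4
```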
*Written for Swift Algorithm Club by Matthijs Hollemans*