Use HashSet for error deduplication (#6268)
## Description
Optimization explanation:
- Using a `HashSet` to track seen elements simplifies the logic.
- The `retain` method filters out duplicate elements in place, avoiding
  the complexity of manually managing indices and swapping elements.
- The code is more concise and readable while preserving the original
  order.
- Time complexity: the original code, which manually manages indices and
  swaps elements, is O(n^2) in the worst case; the optimized code, using
  a `HashSet` with `retain`, is O(n).
- Space complexity: the original code needs an auxiliary `HashMap` and
  `SmallVec` to store hash values and indices; the optimized code needs
  only a `HashSet` of seen elements.
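
The technique described above can be shown as a minimal self-contained sketch (mirroring the patched `dedup_unsorted`, with a `main` added purely for illustration):

```rust
use std::collections::HashSet;

// `retain` walks the vector once, keeping an element only if inserting it
// into the set reports it as new. `HashSet::insert` returns false for
// elements already seen, so duplicates are dropped in place while the
// first occurrence of each element keeps its original position.
fn dedup_unsorted<T: Clone + Eq + std::hash::Hash>(mut data: Vec<T>) -> Vec<T> {
    let mut seen = HashSet::new();
    data.retain(|item| seen.insert(item.clone()));
    data
}

fn main() {
    // First occurrences survive, in their original order.
    assert_eq!(dedup_unsorted(vec![3, 1, 3, 2, 1]), vec![3, 1, 2]);
}
```

Note the `Clone` bound: `retain` borrows each element while the closure runs, so the sketch clones items into the set rather than holding references into the vector being mutated.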

## Checklist

- [ ] I have linked to any relevant issues.
- [ ] I have commented my code, particularly in hard-to-understand
areas.
- [ ] I have updated the documentation where relevant (API docs, the
reference, and the Sway book).
- [ ] If my change requires substantial documentation changes, I have
[requested support from the DevRel
team](https://github.com/FuelLabs/devrel-requests/issues/new/choose)
- [ ] I have added tests that prove my fix is effective or that my
feature works.
- [ ] I have added (or requested a maintainer to add) the necessary
`Breaking*` or `New Feature` labels where relevant.
- [x] I have done my best to ensure that my PR adheres to [the Fuel Labs
Code Review
Standards](https://github.com/FuelLabs/rfcs/blob/master/text/code-standards/external-contributors.md).
- [x] I have requested a review from the relevant team or maintainers.

---------

Co-authored-by: IGI-111 <igi-111@protonmail.com>
ylmin and IGI-111 authored Jul 16, 2024
1 parent fe89d16 commit 807d7f4
Showing 1 changed file with 5 additions and 33 deletions.
38 changes: 5 additions & 33 deletions sway-error/src/handler.rs
```diff
@@ -1,5 +1,4 @@
 use crate::{error::CompileError, warning::CompileWarning};
-use std::collections::HashMap;
 
 use core::cell::RefCell;
 
@@ -138,37 +137,10 @@ pub struct ErrorEmitted {
 /// Stdlib dedup in Rust assumes sorted data for efficiency, but we don't want that.
 /// A hash set would also mess up the order, so this is just a brute force way of doing it
 /// with a vector.
-fn dedup_unsorted<T: PartialEq + std::hash::Hash>(mut data: Vec<T>) -> Vec<T> {
-    // TODO(Centril): Consider using `IndexSet` instead for readability.
-    use smallvec::SmallVec;
-    use std::collections::hash_map::{DefaultHasher, Entry};
-    use std::hash::Hasher;
-
-    let mut write_index = 0;
-    let mut indexes: HashMap<u64, SmallVec<[usize; 1]>> = HashMap::with_capacity(data.len());
-    for read_index in 0..data.len() {
-        let hash = {
-            let mut hasher = DefaultHasher::new();
-            data[read_index].hash(&mut hasher);
-            hasher.finish()
-        };
-        let index_vec = match indexes.entry(hash) {
-            Entry::Occupied(oe) => {
-                if oe
-                    .get()
-                    .iter()
-                    .any(|index| data[*index] == data[read_index])
-                {
-                    continue;
-                }
-                oe.into_mut()
-            }
-            Entry::Vacant(ve) => ve.insert(SmallVec::new()),
-        };
-        data.swap(write_index, read_index);
-        index_vec.push(write_index);
-        write_index += 1;
-    }
-    data.truncate(write_index);
+fn dedup_unsorted<T: PartialEq + std::hash::Hash + Clone + Eq>(mut data: Vec<T>) -> Vec<T> {
+    use std::collections::HashSet;
+
+    let mut seen = HashSet::new();
+    data.retain(|item| seen.insert(item.clone()));
     data
 }
```
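
As the doc comment in the diff notes, the standard library's `Vec::dedup` is no substitute here: it only removes *consecutive* equal elements, and sorting first would destroy the diagnostic order. A quick illustration of the difference (not part of the PR):

```rust
fn main() {
    // `Vec::dedup` only collapses adjacent duplicates, so on unsorted
    // data the repeated 3 and 1 survive untouched.
    let mut v = vec![3, 1, 3, 2, 1];
    v.dedup();
    assert_eq!(v, vec![3, 1, 3, 2, 1]);

    // Sorting first makes dedup complete, but loses the original order,
    // which is why the handler keeps a custom `dedup_unsorted`.
    let mut s = vec![3, 1, 3, 2, 1];
    s.sort();
    s.dedup();
    assert_eq!(s, vec![1, 2, 3]);
}
```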
