As everybody knows, there's no built-in function to remove duplicates from an array in JavaScript. I've noticed this is also lacking in jQuery (which has a unique function for DOM selections only), and the most common snippet I found compares every element against all the elements after it (not very efficient, I think), like:
for (var i = 0; i < arr.length; i++)
    for (var j = i + 1; j < arr.length; j++)
        if (arr[i] === arr[j])
            //whatever
so I made my own:
function unique (arr) {
    var hash = {}, result = [];
    for (var i = 0; i < arr.length; i++)
        if (!(arr[i] in hash)) { //it works with objects! in FF, at least
            hash[arr[i]] = true;
            result.push(arr[i]);
        }
    return result;
}
I wonder if there's any other algorithm accepted as the best for this case (or if you see any obvious flaw that could be fixed). Otherwise, what do you do when you need this in JavaScript? (I'm aware that jQuery is not the only framework and that some others may already have this covered.)
Using the object literal is exactly what I would do. A lot of people miss this technique, opting instead for typical array walks like the original code that you showed. The only optimization would be to cache arr.length rather than look it up on every iteration. Other than that, O(n) is about as good as it gets for uniqueness and is much better than the original O(n^2) example.
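function unique(arr) {
    var hash = {}, result = [];
    for (var i = 0, l = arr.length; i < l; ++i) {
        // Property keys are coerced to strings, so this is reliable for scalar values only.
        // Going through Object.prototype keeps the check working even when
        // the array itself contains the string "hasOwnProperty".
        if (!Object.prototype.hasOwnProperty.call(hash, arr[i])) {
            hash[arr[i]] = true;
            result.push(arr[i]);
        }
    }
    return result;
}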
// * Edited to use hasOwnProperty per comments
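A quick check of how the hash-based approach treats different value types may be useful here. The sketch below restates the same logic under a hypothetical name, `uniqueByKey`, and adds a hypothetical variant, `uniqueTyped`, that prefixes each key with its `typeof` result. Both names are illustrative only, and the variant is one possible tweak under the assumption that the input holds strings, numbers and booleans, not something the original code does.

```javascript
// Hypothetical restatement of the hash-based approach, for demonstration only.
function uniqueByKey(arr) {
  var hash = {}, result = [];
  for (var i = 0, l = arr.length; i < l; ++i) {
    // The value is coerced to a string when used as a property key.
    if (!Object.prototype.hasOwnProperty.call(hash, arr[i])) {
      hash[arr[i]] = true;
      result.push(arr[i]);
    }
  }
  return result;
}

// Hypothetical variant: include the type in the key so 1 and "1" stay distinct.
// Assumes scalar input (strings, numbers, booleans); objects still collide.
function uniqueTyped(arr) {
  var hash = {}, result = [];
  for (var i = 0, l = arr.length; i < l; ++i) {
    var key = typeof arr[i] + ":" + arr[i];
    if (!Object.prototype.hasOwnProperty.call(hash, key)) {
      hash[key] = true;
      result.push(arr[i]);
    }
  }
  return result;
}

var a = { id: 1 }, b = { id: 2 };
console.log(uniqueByKey([3, 1, 3, 2, 1])); // [3, 1, 2]
console.log(uniqueByKey([1, "1"]));        // [1], because both map to the key "1"
console.log(uniqueByKey([a, b]).length);   // 1, because both map to "[object Object]"
console.log(uniqueTyped([1, "1"]));        // [1, "1"]
```

The object case is why the summary that follows marks the hash-based function as unsuitable for objects: two distinct objects share one string key, so the second is silently dropped.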
To summarize the time complexities:
function | unsorted | sorted | objects | scalars | library
____________________________________________________________
unique   | O(n)     | O(n)   | no      | yes     | n/a
original | O(n^2)   | O(n^2) | yes     | yes     | n/a
uniq     | O(n^2)   | O(n)   | yes     | yes     | Prototype
_.uniq   | O(n^2)   | O(n)   | yes     | yes     | Underscore
As with most algorithms, there are trade-offs. If you are only de-duplicating scalar values, your modifications to the original algorithm give the optimal solution. However, if you need to handle non-scalar values, then using or mimicking the uniq method of either of the libraries discussed would be your best choice.
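For the non-scalar case, one way to mimic that behavior is sketched below. `uniqSketch` and its `isSorted` flag are hypothetical names, and this is not the source of either library; it only assumes that elements are compared with strict equality and that the caller knows whether the input is already sorted.

```javascript
// Hypothetical sketch of a uniq-style function that works with objects.
// Unsorted input: O(n^2), because each element is checked against all kept values.
// Sorted input: O(n), because duplicates are adjacent.
function uniqSketch(arr, isSorted) {
  var result = [];
  for (var i = 0, l = arr.length; i < l; ++i) {
    var value = arr[i];
    var seen;
    if (isSorted) {
      // Only the most recently kept value can be a duplicate.
      seen = result.length > 0 && result[result.length - 1] === value;
    } else {
      // Linear scan using strict equality, so objects compare by identity.
      seen = result.indexOf(value) !== -1;
    }
    if (!seen) {
      result.push(value);
    }
  }
  return result;
}

var first = { id: 1 }, second = { id: 2 };
console.log(uniqSketch([first, second, first]).length); // 2
console.log(uniqSketch([1, 1, 2, 3, 3], true));         // [1, 2, 3]
console.log(uniqSketch([1, "1"]));                      // [1, "1"]
```

Because comparison is by identity, two separate objects with the same contents both survive; only repeated references to the same object are removed.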