New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Binary search for IList<T> and IReadOnlyList<T> #15804
Comments
Suggestion, inherit to a new interface and implement that.
|
@benaadams What's the point? Shouldn't binary search be the same for each implementation of |
Not if someone has implemented the Adding a new interface that derives from |
@benaadams I was assuming it would be an extension method. That way, it's easy to use and you don't need to reimplement it for every collection. |
@svick ah, yeah, that could work :D |
Probably would want a matching Sort function to get it in right state? |
Perhaps it could be useful to add a collection independent helper methods: Perhaps
where Or perhaps
|
I do like the idea of having extension methods for sorting and doing a binary search through the IList interface. All you really need for this generic algorithm is random access to the elements, which is what you get through the IList interface. One interesting challenge I see from just briefly looking at the code is the need for duplicate code. Ideally, you would only need a single implementation of an algorithm like binary search for collections that provide random access. The problem is that System.Array and System.Collections.Generic.List live in mscorlib so their binary search implementation would also need to exist within that assembly since I don't believe mscorlib can reference another assembly. With that in mind, in order to provide extension methods that use the same binary search implementation as Array and List, those extension methods would need to live in mscorlib or some other generic method would need to be publicly exposed in mscorlib. This would be very different from the extension methods for IEnumerable that live in the System.Linq assembly. If someone has a good idea so that Array and List can share the same generic implementation as an extension method for IList, I'd be interested in hearing it. Otherwise, I wonder if people would just want to bite the bullet and duplicate the code in some other assembly. |
List< T > uses the Array.BinarySearch implementation on it's internal array. There isn't a way to have the IList extension method do the same since we don't have access to any underlying array, just the item getters. Therefore, it would need to be distinct (i.e. duplicated) from the implementation in Array.BinarySearch regardless of where it lives.
I don't think it should go in mscorlib, so the next logical place would be in the System.Collections partial facade in CoreFX.
Imo this is an overdue addition. There are a ton of people who want to perform sorting on an IList but their options are limited to:
|
I did the implementation of BinarySearch for
private static int BinarySearch(int index, int count, Func<int, int> compare)
{
...
while (low <= hi)
{
int i = GetMedian(low, hi);
int c;
try
{
c = compare(i);
... //Full Binary Search Logic
}
public static int BinarySearch<T>(this IList<T> list, int index, int count, T item, IComparer<T> comparer)
{
... //Validation
return BinarySearch(index, count, i => comparer.Compare(list[i], item));
}
public static int BinarySearch(this IList list, int index, int count, object item, IComparer comparer)
{
... //Validation
return BinarySearch(index, count, i => comparer.Compare(list[i], item));
} To maintain consistency I was planning to mimic/reuse same test cases which are used to test List BinarySearch. I see there are two versions |
This is unavoidable. TrySZBinarySearch requires an array to access but from the IList interface all we have is getters - we do not have any underlying array to pass a reference to.
Why do you want to convert between them? Since there's no guarantee that a non-generic IList will contain all elements of type T, the two may not be used interchangeably.
I'm not too well-versed in how delegates get compiled, but I'd imagine there is some overhead since they're subject to change during execution and thus cannot be inlined. BinarySearch only takes like 10 lines of code so imo a little bit of duplication is preferable to adding unnecessary overhead/complication via the use of delegates to wrap the comparers.
Seems reasonable.
Those tests are ported to xunit from an old framework that was last used years ago. I do not know the difference, nor would I suggest you spend time figuring it out and replicating them as they will be removed in #14117. |
I did the implementation for binary search however I am unable to write test cases. The extension method just doesn't show up in System.Collections.Tests project. Digging further I found that the definition of
and in System.Collections.Tests it is at System.Runtime
Since both the definitions are not same, the implemented extension methods are not available in test projects. What is the workaround? |
I have intentionally not added sort logic. It seems there are multiple open issues regarding sort such as #15803 and #14800 . Isn't it better we resolve them first before duplicating the code again? |
Ditto. |
I'm partial to what I've been using: public enum BinarySearchType
{
/// <summary>
/// The standard BCL binary search. Returns the index of the first equal item found,
/// or if there are no equal items, the two's complement of index where it would be inserted.
/// </summary>
Fast = 0,
/// <summary>
/// Return the index of the first equal item, or if there are no equal items, the index where it would be inserted.
/// </summary>
PrependIndex,
/// <summary>
/// Return the index following the last equal item, or if there are no equal items, the index where it would be inserted.
/// </summary>
AppendIndex
}
public static class BinarySearchExtensions
{
private interface IInlineComparer<T>
{
int Compare(T value);
}
private static int BinarySearchInline<T, TCalculator>(this IReadOnlyList<T> sortedList, TCalculator calculator, BinarySearchType searchType)
where TCalculator : IInlineComparer<T>
{
if (calculator == null) throw new ArgumentNullException(nameof(calculator));
var high = sortedList.Count - 1;
var low = 0;
switch (searchType)
{
case BinarySearchType.Fast:
while (low <= high)
{
var mid = (high + low) >> 1;
var comparison = calculator.Compare(sortedList[mid]);
if (comparison == 0) return mid;
if (comparison < 0)
low = mid + 1;
else
high = mid - 1;
}
return ~low;
case BinarySearchType.AppendIndex:
while (low <= high)
{
var mid = (high + low) >> 1;
var comparison = calculator.Compare(sortedList[mid]);
if (comparison <= 0)
low = mid + 1;
else
high = mid - 1;
}
return low;
case BinarySearchType.PrependIndex:
while (low <= high)
{
var mid = (high + low) >> 1;
var comparison = calculator.Compare(sortedList[mid]);
if (comparison < 0)
low = mid + 1;
else
high = mid - 1;
}
return low;
default:
throw new InvalidEnumValueException(searchType);
}
} And then I've got: public static int BinarySearch<T, TComparable>(this IReadOnlyList<T> sortedList, TComparable comparableValue, BinarySearchType searchType = BinarySearchType.Fast)
where T : IComparable<TComparable>
{ /* ... */ }
public static int BinarySearch<T, TOrder>(this IReadOnlyList<T> sortedList, Func<T, TOrder> orderSelector, TOrder value, IComparer<TOrder> comparer = null, BinarySearchType searchType = BinarySearchType.Fast)
{ /* ... */ }
public static int BinarySearch<T>(this IReadOnlyList<T> sortedList, T value, IComparer<T> comparer = null, BinarySearchType searchType = BinarySearchType.Fast)
{ /* ... */ }
} Entire thing: https://gist.github.com/jnm2/9bbdc74c5f2ec835a2af2a88ca8c100d |
Triage: I would assume that efficient binary search on an |
With the new C#8 feature of providing default implementations of methods on interfaces, would it be worth considering adding a "efficient" default implementation on |
@eiriktsarpalis This means that a binary search would be O(log n * log n), which doesn't seem like the worst thing in the world? Even so, I think most people would reasonably assume that list indexing is an O(1) operation, which means that |
To me running binary search over a list implies that we care about performance (otherwise one might as well use a dedicated data structure like SortedList). Having an extension method over |
It seems that there is some demand for this API. The following StackOverflow question has 45 upvotes: |
API proposal, targeting the namespace System.Collections.Generic
{
public static class CollectionExtensions
{
public static int BinarySearch<T>(this IReadOnlyList<T> list, T item);
public static int BinarySearch<T>(this IReadOnlyList<T> list, T item, IComparer<T>? comparer);
public static int BinarySearch<T>(this IReadOnlyList<T> list, int index, int count, T item, IComparer<T>? comparer);
}
} The overloads and the arguments are the same with the Adding the same extension method to the |
Actually I am not sure if the SortedList<string, string> list = new(StringComparer.OrdinalIgnoreCase);
int index = list.Keys.BinarySearch(key, list.Comparer); |
I faced with the same issue - binary search over In .NET 9 |
We have List.BinarySearch and Array.Search for searching sorted lists and arrays. But what about other collections that implement
IList<T>
orIReadOnlyList<T>
? It would be useful to have similar functions (extension methods) to search them. Binary search is a fundamental algorithm, so it's helpful to have a fast reliable well-documented implementation in the standard library. Otherwise developers waste time reinventing the wheel.A workaround is calling
.ToList()
or.ToArray()
and then searching the new object, but that's not ideal because it copies the collection.The text was updated successfully, but these errors were encountered: