Given an array `a[]`, what would be the most efficient way to determine whether at least one index `i` satisfies the condition `a[i] == i`?

All the elements in the array are sorted and distinct, but they aren't necessarily integer types (i.e. they might be floating-point types).

For a sorted array, you can perform an interpolation search. It's similar to a binary search, but, assuming an even distribution of values, it can be faster.
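As a sketch of the idea: interpolation search probes a position estimated from the values at the range endpoints rather than always splitting at the midpoint. Note this searches for a given target value, not for `a[i] == i` directly; `interpolation_search` is an illustrative name, not from the answer above.

```cpp
#include <vector>

// Sketch: interpolation search for target value x in an ascending array.
// Returns the index of x, or -1 if x is not present.
int interpolation_search(const std::vector<double>& a, double x)
{
    int lo = 0, hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi && x >= a[lo] && x <= a[hi])
    {
        if (a[hi] == a[lo]) // flat range: avoid division by zero
            return a[lo] == x ? lo : -1;
        // Estimate the position, assuming values are evenly distributed.
        int pos = lo + static_cast<int>((hi - lo) * (x - a[lo]) / (a[hi] - a[lo]));
        if (a[pos] == x)
            return pos;
        if (a[pos] < x)
            lo = pos + 1;
        else
            hi = pos - 1;
    }
    return -1; // not found
}
```

On uniformly distributed data this takes O(log log n) probes on average, though it degrades to O(n) on adversarial distributions.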

In the worst case, you can't do any better than checking every element. (Imagine something like `a[i] = i + uniform_random(-0.25, 0.25)`.) You'll need some information on what your input looks like.

I think the main problem here is your conflicting statements:

> `a[i] == i`
>
> All the elements in the array are sorted and distinct, they need not be integer always.

If the array's value is equal to its subscript, that value must be an integer. If the elements aren't integers, and they're, say, `char`, what is considered "sorted"? ASCII order (`'A' < 'B' < 'C'`)?

If it were an array of chars, would we consider

```
a[i] == i
```

to be true if `i == 65` (decimal) and `a[i] == 'A'`?

If I were in this interview I would be grilling the interviewer with follow-up questions before answering. That said...

If all we know is what you stated, we can safely say that we can find the answer in O(n), because that is the time to make one full pass of the array. With more details we could probably reduce this to O(log n) with a binary search of the array.

Actually, I would start from the last element and do a basic sanity check first: for example, if you have 1000 elements but the highest value is 100, you know you only need to check indices 0..100. In the worst case you still need to check every element, but this quickly narrows down the region where a match may be possible. If the input is like the case stated above (`a[i] = i + uniform_random(-0.25, 0.25)`), you're out of luck and need to search every single element.
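A quick sketch of that bounding trick for an ascending array (`scan_bounded` is a hypothetical helper name): since every value is at most `a.back()`, no index above `floor(a.back())` can ever match.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Linear scan, but first shrink the upper bound of the search range:
// in an ascending array, a[i] <= a.back() for all i, so any index
// greater than floor(a.back()) can never satisfy a[i] == i.
bool scan_bounded(const std::vector<double>& a)
{
    if (a.empty())
        return false;
    int hi = std::min(static_cast<int>(a.size()) - 1,
                      static_cast<int>(std::floor(a.back())));
    for (int i = 0; i <= hi; ++i) // if a.back() < 0, hi < 0 and we skip entirely
        if (a[i] == i)
            return true;
    return false;
}
```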

Notice that all the elements in the array are **sorted and distinct**, so if we construct a new array `b` with `b[i] = a[i] - i`, **the elements of array `b` are also sorted** (non-decreasing). Note this relies on the elements being integers: then `a[i+1] >= a[i] + 1`, so `b[i+1] >= b[i]`; with floating-point values, consecutive elements can differ by less than 1, and `b` need not be sorted. What we need to find is a zero in array `b`, and binary search can solve that. You can also apply the same divide-and-conquer technique to the original array without constructing an auxiliary array. The time complexity is O(log n)!

```
Take this as an example:
a = [0, 1, 2, 4, 8]
b = [0, 0, 0, 1, 4]
What we need to find are exactly indices 0, 1, 2
```

Hope it helps!
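A minimal sketch of this binary search, assuming integer elements (with integers, `b[i] = a[i] - i` is non-decreasing, so we can search it for a zero without ever materializing `b`; `fixed_point_exists` is a name introduced here for illustration):

```cpp
#include <vector>

// Binary search on the implicit array b[i] = a[i] - i.
// Requires a to be sorted, distinct, and integer-valued, so that b is
// non-decreasing.
bool fixed_point_exists(const std::vector<int>& a)
{
    int lo = 0, hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == mid)   // b[mid] == 0: found a fixed point
            return true;
        if (a[mid] < mid)    // b[mid] < 0: a zero can only lie to the right
            lo = mid + 1;
        else                 // b[mid] > 0: a zero can only lie to the left
            hi = mid - 1;
    }
    return false;
}
```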

Several people have made claims about the relevance of “sorted”, “distinct” and “aren't necessarily integers”. In fact, proper selection of an efficient algorithm to solve this problem hinges on these characteristics. A more efficient algorithm would be possible if we knew that the values in the array were both distinct and integral, while a less efficient algorithm would be required if the values might be non-distinct, whether or not they were integral. And of course, if the array were not already sorted, you could sort it first (at average complexity O(n log n)) and then use the more efficient pre-sorted algorithm, but in the unsorted case it would be more efficient to simply leave the array unsorted and run through it directly, comparing the values in linear time (O(n)).

Note that regardless of the algorithm chosen, best-case performance is O(1) (when the first element examined contains its index value); at any point during execution of any algorithm we might come across an element where `a[i] == i`, at which point we return true. What actually matters in terms of algorithm performance in this problem is how quickly we can exclude all elements and declare that there is no element where `a[i] == i`.

The problem does not state the sort order of `a[]`, which is a pretty critical piece of missing information. If it’s ascending, the worst-case complexity will always be O(n); there’s nothing we can do to make the worst case better. But if the sort order is descending, even the worst-case complexity is O(log n): since values in the array are distinct and descending, there is only one possible index where `a[i]` could equal `i`, and basically all you have to do is a binary search to find the crossover point (where the ascending index values cross over the descending element values, if there even is such a crossover) and determine if `a[c] == c` at the crossover index `c`. Since that’s pretty trivial, I’ll proceed assuming that the sort order is ascending.

Interestingly, if the elements were integers, even in the ascending case there is a similar “crossover-like” situation (though in the ascending case there could be more than one `a[i] == i` match), so a binary search would also be applicable there, and even the worst-case performance would be O(log n) (see Interview question - Search in sorted array X for index i such that X[i] = i). But we aren’t given that luxury in this version of the problem.
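The descending-order crossover search described above might be sketched as follows (`fixed_point_descending` is a name introduced here; with distinct values in descending order, `a[i] - i` is strictly decreasing, so at most one index can match):

```cpp
#include <vector>

// Binary search for the single possible fixed point in a strictly
// DESCENDING array of distinct values: a[i] - i strictly decreases,
// so we can discard half of the range at each step.
bool fixed_point_descending(const std::vector<double>& a)
{
    int lo = 0, hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == mid)
            return true;
        if (a[mid] > mid)    // still above the index line: crossover lies to the right
            lo = mid + 1;
        else                 // already below it: crossover lies to the left
            hi = mid - 1;
    }
    return false;
}
```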

Here is how we might solve this problem:

Begin with the first element, `a[0]`. If its value is `== 0`, you’ve found an element which satisfies `a[i] == i`, so return true. If its value is `< 1`, the next element (`a[1]`) could possibly contain the value `1`, so you proceed to the next index. If, however, `a[0] >= 1`, you know (because the values are distinct) that the condition `a[1] == 1` cannot possibly be true, so you can safely skip index `1`. But you can do even better than that: for example, if `a[0] == 12`, you know (because the values are sorted in ascending order) that there cannot possibly be any elements that satisfy `a[i] == i` prior to element `a[13]`. Because the values in the array can be non-integral, we cannot make any further assumptions at this point, so the next element we can safely skip to directly is `a[13]` (e.g. `a[1]` through `a[12]` may all contain values between `12.000...` and `13.000...` such that `a[13]` could still equal exactly `13`, so we have to check it).

Continuing that process yields an algorithm as follows:

```
// Algorithm 1
#include <cmath> // for std::floor

bool algorithm1(const double* a, int len)
{
    for (int i = 0; i < len; ++i) // worst case is O(n)
    {
        if (a[i] == i)
            return true; // of course we could also return i here (as an int)...
        if (a[i] > i)
            i = static_cast<int>(std::floor(a[i])); // ++i then lands on floor(a[i]) + 1
    }
    return false; // ......in which case we’d want to return -1 here (an int)
}
```

This has pretty good performance if many of the values in `a[]`

are greater than their index value, and has excellent performance if all values in `a[]`

are greater than n (it returns false after only one iteration!), but it has dismal performance if all values are less than their index value (it will return false after n iterations). So we return to the drawing board... but all we need is a slight tweak. Consider that the algorithm could have been written to scan backwards from n down to 0 just as easily as it can scan forward from 0 to n. If we combine the logic of iterating from both ends toward the middle, we get an algorithm as follows:

```
// Algorithm 2
#include <cmath> // for std::floor and std::ceil

bool algorithm2(const double* a, int len)
{
    for (int i = 0, j = len - 1; i <= j; ++i, --j) // worst case is still O(n);
    {                                              // note i <= j so a lone middle element is checked too
        if (a[i] == i || a[j] == j)
            return true;
        if (a[i] > i)
            i = static_cast<int>(std::floor(a[i]));
        if (a[j] < j)
            j = static_cast<int>(std::ceil(a[j]));
    }
    return false;
}
```

This has excellent performance in both of the extreme cases (all values are less than 0 or greater than n), and has pretty good performance with pretty much any other distribution of values. The worst case is if all of the values in the lower half of the array are less than their index and all of the values in the upper half are greater than their index, in which case the performance degrades to the worst-case of O(n). Best case (either extreme case) is O(1), while average case is probably O(log n) but I’m deferring to someone with a math major to determine that with certainty.

Several people have suggested a “divide and conquer” approach to the problem, without specifying how the problem could be divided and what one would do with the recursively divided sub-problems. Of course such an incomplete answer would probably not satisfy the interviewer. The naïve linear algorithm and the worst case of algorithm 2 above are both O(n), while algorithm 2 improves the average-case performance to (probably) O(log n) by skipping (not examining) elements whenever it can. The divide-and-conquer approach can only outperform algorithm 2 if, in the average case, it is somehow able to skip more elements than algorithm 2 can skip.

Let’s assume we divide the problem by splitting the array into two (nearly) equal contiguous halves, recursively, and decide whether, with the resulting sub-problems, we are likely to be able to skip more elements than algorithm 2 could, especially in algorithm 2’s worst case. For the remainder of this discussion, let’s assume an input that would be worst-case for algorithm 2. After the first split, we can check both halves’ top and bottom elements for the same extreme case that results in O(1) performance for algorithm 2, yet results in O(n) performance with both halves combined. This would be the case if all elements in the bottom half are less than 0 and all elements in the upper half are greater than n-1. In these cases, we can immediately exclude the bottom and/or top half with O(1) performance for any half we can exclude. Of course the performance of any half that cannot be excluded by that test remains to be determined after recursing further, dividing that half in half again until we find any segment whose top or bottom element contains its index value. That’s a reasonably nice performance improvement over algorithm 2, but it occurs in only certain special cases of algorithm 2’s worst case.

All we’ve done with divide-and-conquer is decrease (slightly) the proportion of the problem space that evokes worst-case behavior. There are still worst-case scenarios for divide-and-conquer, and they exactly match most of the problem space that evokes worst-case behavior for algorithm 2.
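For concreteness, the split-and-exclude idea just described might be sketched like this (`dc_search`/`dc_helper` are names introduced here, not from the discussion above; a range is excluded in O(1) when its value range cannot overlap its index range):

```cpp
#include <vector>

// Recursive helper: examine the sub-range [lo, hi] of an ascending array.
static bool dc_helper(const std::vector<double>& a, int lo, int hi)
{
    if (lo > hi)
        return false;
    // Ascending order: if every value in the range lies below lo or above hi,
    // no index in [lo, hi] can hold its own value, so exclude the range in O(1).
    if (a[hi] < lo || a[lo] > hi)
        return false;
    if (a[lo] == lo || a[hi] == hi)
        return true;
    int mid = lo + (hi - lo) / 2;
    return dc_helper(a, lo, mid) || dc_helper(a, mid + 1, hi);
}

bool dc_search(const std::vector<double>& a)
{
    return dc_helper(a, 0, static_cast<int>(a.size()) - 1);
}
```

As argued in the text, this still degrades to O(n) whenever the exclusion test keeps failing, and it pays recursion overhead that algorithm 2 avoids.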

So, given that the divide-and-conquer algorithm has fewer worst-case scenarios, doesn’t it make sense to go ahead and use a divide-and-conquer approach?

In a word, no. Well, maybe. If you know up front that about half of your data is less than 0 and half is greater than n, this special case would generally fare better with the divide-and-conquer approach. Or, if your system is multicore and your ‘n’ is large, it might be helpful to split the problem evenly between all of your cores, but once it’s split between them, I maintain that the sub-problems on each core are probably best solved with algorithm 2 above, avoiding further division of the problem and certainly avoiding recursion, as I argue below....

At each recursion level of a recursive divide-and-conquer approach, the algorithm needs some way to remember the as-yet-unsolved 2nd half of the problem while it recurses into the 1st half. Often this is done by having the algorithm recursively call itself first for one half and then for the other, a design which maintains this information implicitly on the runtime stack. Another implementation might avoid recursive function calls by maintaining essentially this same information on an explicit stack. In terms of space growth, algorithm 2 is O(1), but any recursive implementation is unavoidably O(log n) due to having to maintain this information on some sort of stack. But aside from the space issue, a recursive implementation has extra runtime overhead of remembering the state of as-yet-unrecursed-into subproblem halves until such time as they can be recursed into. This runtime overhead is not free, and given the simplicity of algorithm 2’s implementation above, I posit that such overhead is proportionally significant. Therefore I suggest that algorithm 2 above will roundly spank any recursive implementation for the vast majority of cases.
