I believe there's a way to find the kth largest element in an unsorted array of length n in O(n). Or perhaps it's "expected" O(n) or something. How can we do this?

Iterate through the list once, keeping (say) the 5 largest values seen so far. If the current value is larger than the stored largest value, store it as the new largest, bump values 1-4 down a slot, and let value 5 drop off the list. If not, compare it to stored value 2 and do the same thing, and so on down through all 5 stored values. This should do it in O(n) for a fixed number of stored values.
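That one-pass idea can be sketched in Python (my own illustration, generalized to any k; a plain sorted list stands in for the five stored slots):

```python
def kth_largest_scan(arr, k):
    """One pass over arr, keeping the k largest values seen so far."""
    top = []  # kept sorted descending: top[0] largest, top[-1] smallest kept
    for value in arr:
        if len(top) < k:
            top.append(value)
            top.sort(reverse=True)
        elif value > top[-1]:
            top.append(value)
            top.sort(reverse=True)
            top.pop()  # the old smallest "drops off the list"
    return top[-1]  # the kth largest
```

Each element costs O(k) work in the worst case, so this is O(nk) overall - linear only for fixed k.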

A quick Google on that ('kth largest element array') returned this: http://discuss.joelonsoftware.com/default.asp?interview.11.509587.17

```
"Make one pass through tracking the three largest values so far." (it was specifically for 3rd largest)
```

and..

```
Build a heap/priority queue. O(n)
Pop top element. O(log n)
Pop top element. O(log n)
Pop top element. O(log n)
Total = O(n) + 3 O(log n) = O(n)
```
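The quoted heap recipe, sketched with Python's `heapq` (a min-heap, so values are negated to get max-heap behavior; `heapify` is O(n) and each pop is O(log n)):

```python
import heapq

def kth_largest_heap(arr, k):
    heap = [-x for x in arr]   # negate so the min-heap acts as a max-heap
    heapq.heapify(heap)        # O(n)
    for _ in range(k - 1):
        heapq.heappop(heap)    # O(log n) per pop
    return -heap[0]            # kth largest
```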

A Programmer's Companion to Algorithm Analysis gives a version that *is* O(n), although the author states that the constant factor is so high, you'd probably prefer the naive sort-the-list-then-select method.
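For reference, the naive sort-then-select method mentioned above is a one-liner (shown here in Python):

```python
def kth_largest_naive(arr, k):
    # O(n log n) for the sort, then O(1) to index; often fast in practice
    return sorted(arr, reverse=True)[k - 1]
```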

I answered the letter of your question :)

You can do it in O(kn) time - which is O(n) for constant k - and O(k) space, by keeping track of the k largest elements you've seen.

For each element in the array, scan the list of k largest and replace the smallest of them with the new element if the new one is bigger.

Warren's priority heap solution is neater though.

What I would do is this:

```
initialize empty doubly linked list l
for each element e in array
    if e is larger than head(l)
        make e the new head of l
        if size(l) > k
            remove last element from l
the last element of l should now be the kth largest element
```

You can simply store pointers to the first and last element in the linked list. They only change when updates to the list are made.

Update:

```
initialize empty sorted tree l
for each element e in array
    if e between head(l) and tail(l)
        insert e into l  // O(log k)
        if size(l) > k
            remove last element from l
the last element of l should now be the kth largest element
```

The keywords you are looking for are *selection algorithm*: Wikipedia lists a number of different ways of doing this.

This is called finding the k-th order statistic. There's a very simple randomized algorithm (called quickselect) taking `O(n)` average time, and a pretty complicated non-randomized algorithm taking `O(n)` worst-case time. There's some info on Wikipedia, but it's not very good.

Everything you need is in these PowerPoint slides. Just to extract the basic algorithm of the `O(n)` worst-case algorithm:

```
Select(A, n, i):
    Divide input into ⌈n/5⌉ groups of size 5.

    /* Partition on median-of-medians */
    medians = array of each group's median.
    pivot = Select(medians, ⌈n/5⌉, ⌈n/10⌉)
    Left Array L and Right Array G = partition(A, pivot)

    /* Find ith element in L, pivot, or G */
    k = |L| + 1
    If i = k, return pivot
    If i < k, return Select(L, k-1, i)
    If i > k, return Select(G, n-k, i-k)
```

It's also very nicely detailed in the Introduction to Algorithms book by Cormen et al.

Read Chapter 9, Medians and Order Statistics, from Cormen's "Introduction to Algorithms", 2nd ed. It has an expected linear-time algorithm for selection. It's not something that people would randomly come up with in a few minutes. A heap sort, by the way, won't work in O(n); it's O(n log n).

The C++ standard library has almost exactly that function, although it does modify your data. It has expected linear run-time, O(N), and it also does a partial sort.

```
#include <algorithm>  // std::nth_element

const int N = ...;
double a[N];
// ...
const int m = ...; // m < N
std::nth_element(a, a + m, a + N);
// a[m] now holds the element that would sit at (0-based) position m
// if the whole array were sorted, i.e. the (m+1)-th smallest
```

If you want a true `O(n)` algorithm, as opposed to `O(kn)` or something like that, then you should use quickselect (it's basically quicksort where you throw out the partition that you're not interested in). My prof has a great writeup, with the runtime analysis: (reference)

The QuickSelect algorithm quickly finds the k-th smallest element of an unsorted array of `n` elements. It is a RandomizedAlgorithm, so we compute the worst-case *expected* running time.

Here is the algorithm.

```
QuickSelect(A, k)
  let r be chosen uniformly at random in the range 1 to length(A)
  let pivot = A[r]
  let A1, A2 be new arrays
  # split into a pile A1 of small elements and A2 of big elements
  for i = 1 to n
    if A[i] < pivot then
      append A[i] to A1
    else if A[i] > pivot then
      append A[i] to A2
    else
      # do nothing
  end for
  if k <= length(A1):
    # it's in the pile of small elements
    return QuickSelect(A1, k)
  else if k > length(A) - length(A2)
    # it's in the pile of big elements
    return QuickSelect(A2, k - (length(A) - length(A2)))
  else
    # it's equal to the pivot
    return pivot
```

What is the running time of this algorithm? If the adversary flips coins for us, we may find that the pivot is always the largest element and `k` is always 1, giving a running time of

`T(n) = Theta(n) + T(n-1) = Theta(n^2)`

But if the choices are indeed random, the expected running time is given by

`T(n) <= cn + (1/n) ∑_{i=1 to n} T(max(i-1, n-i))`

where we are making the not entirely reasonable assumption that the recursion always lands in the larger of `A1` or `A2`.

Let's guess that `T(n) <= an` for some `a`. Then we get

```
T(n)  <=  cn + (1/n) ∑_{i=1 to n} T(max(i-1, n-i))
      <=  cn + (2/n) ∑_{i=floor(n/2) to n} T(i)
      <=  cn + (2/n) ∑_{i=floor(n/2) to n} ai
```

and now somehow we have to get the horrendous sum on the right of the plus sign to absorb the `cn` on the left. If we just bound it as `2(1/n) ∑_{i=n/2 to n} an`, we get roughly `2(1/n)(n/2)an = an`. But this is too big - there's no room to squeeze in an extra `cn`. So let's expand the sum using the arithmetic series formula:

```
∑_{i=floor(n/2) to n} i  =  ∑_{i=1 to n} i  -  ∑_{i=1 to floor(n/2)} i
                         =  n(n+1)/2 - floor(n/2)(floor(n/2)+1)/2
                         ≈  n^2/2 - (n/4)^2/2
                         =  (15/32) n^2
```

where we take advantage of n being "sufficiently large" to replace the ugly `floor(n/2)` factors with the much cleaner (and smaller) `n/4`. Now we can continue with

```
T(n)  <=  cn + (2/n) ∑_{i=floor(n/2) to n} ai
      <=  cn + (2/n) a (15/32) n^2
      =   cn + (15/16) a n
      =   n (c + (15/16) a)
      <=  an
```

provided `a >= 16c`.

This gives `T(n) = O(n)`. It's clearly `Omega(n)`, so we get `T(n) = Theta(n)`.

I would like to suggest one answer.

Take the first k elements and sort them into a sorted list of k values. Now, for each of the remaining n-k values, insert it into that sorted list (dropping the overflow element each time). Even in the worst case, the insertions cost at most k\*(n-k) comparisons, and sorting the first k values costs about k\*(k-1), so the total comes out to O(nk), which is O(n) for constant k.

Cheers
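A sketch of that insertion approach in Python (`bisect` does the binary-search insert, but the list shift still makes each insertion O(k), so the total remains O(nk); names are my own):

```python
import bisect

def kth_largest_insertion(arr, k):
    window = sorted(arr[:k])              # sort the first k elements
    for value in arr[k:]:                 # insert each remaining value...
        if value > window[0]:
            bisect.insort(window, value)  # ...into the sorted window
            window.pop(0)                 # drop the smallest of the k+1
    return window[0]                      # the kth largest
```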

Find the median of the array in linear time, then use the partition procedure, exactly as in quicksort, to divide the array in two parts: values to the left of the median less than (<) the median, and values to the right greater than (>) the median. That too can be done in linear time. Now go to the part of the array where the kth element lies. The recurrence becomes T(n) = T(n/2) + cn, which gives O(n) overall.
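Unrolling that recurrence shows where the linear bound comes from:

```
T(n) = cn + T(n/2)
     = cn + cn/2 + T(n/4)
     = cn (1 + 1/2 + 1/4 + ...)
     <= 2cn
     =  O(n)
```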

You do it like quicksort. Pick an element at random and shove everything either higher or lower. At this point you'll know which element you actually picked, and if it is the kth element you're done; otherwise you repeat with the bin (higher or lower) that the kth element would fall in. Statistically speaking, the time it takes to find the kth element grows with n: O(n) on average.

I implemented finding the kth minimum in n unsorted elements using dynamic programming, specifically the tournament method. The execution time is O(n + k log n). The mechanism used is listed as one of the methods on the Wikipedia page about selection algorithms (as indicated in one of the postings above). You can read about the algorithm and also find code (Java) on my blog page Finding Kth Minimum. In addition, the logic can do partial ordering of the list - return the first K min (or max) in O(k log n) time.

Though the code provided returns the kth minimum, similar logic can be employed to find the kth maximum in O(k log n), ignoring the pre-work done to create the tournament tree.
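The tournament-tree code itself is on the linked blog page; as a stand-in, the same O(n + k log n) partial-ordering bound can be sketched with an ordinary binary heap in Python:

```python
import heapq

def first_k_min(arr, k):
    heap = list(arr)
    heapq.heapify(heap)   # O(n)
    # k pops at O(log n) each: O(n + k log n) total, as with the tournament tree
    return [heapq.heappop(heap) for _ in range(k)]
```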

**I am not certain this is O(n), but it is sure to be between O(n) and O(n log n), and closer to O(n) than to O(n log n). The function is written in Java.**

```
public int quickSelect(ArrayList<Integer> list, int nthSmallest){
    //Choose a random pivot index in the range 0 to list size - 1
    Random random = new Random();
    int pivotIndex = random.nextInt(list.size());
    int pivot = list.get(pivotIndex);
    ArrayList<Integer> smallerNumberList = new ArrayList<Integer>();
    ArrayList<Integer> greaterNumberList = new ArrayList<Integer>();
    //Split the list into two:
    //values smaller than the pivot go to smallerNumberList,
    //values greater than the pivot go to greaterNumberList,
    //and values equal to the pivot are dropped
    for(int i = 0; i < list.size(); i++){
        if(list.get(i) < pivot){
            smallerNumberList.add(list.get(i));
        }
        else if(list.get(i) > pivot){
            greaterNumberList.add(list.get(i));
        }
    }
    //If smallerNumberList holds at least nthSmallest elements, the answer is in it
    if(nthSmallest <= smallerNumberList.size()){
        return quickSelect(smallerNumberList, nthSmallest);
    }
    //If nthSmallest is greater than [ list.size() - greaterNumberList.size() ],
    //the answer is in greaterNumberList. The step is a bit tricky:
    //[ list.size() - greaterNumberList.size() ] elements are accounted for by
    //smallerNumberList plus the copies of the pivot.
    else if(nthSmallest > (list.size() - greaterNumberList.size())){
        nthSmallest = nthSmallest - (list.size() - greaterNumberList.size());
        return quickSelect(greaterNumberList, nthSmallest);
    }
    //Otherwise the answer is the pivot itself
    else{
        return pivot;
    }
}
```

For very small values of k (i.e. when k << n), we can get it done in ~O(n) time. Otherwise, if k is comparable to n, we get it in O(n log n).

An explanation of the median-of-medians algorithm to find the k-th largest integer out of n can be found here: http://cs.indstate.edu/~spitla/presentation.pdf

An implementation in C++ is below:

```
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

int findMedian(vector<int> vec){
    // Find median of a vector (sort the local copy first; the middle
    // element of an unsorted vector is not its median)
    sort(vec.begin(), vec.end());
    return vec[vec.size() / 2];
}

int findMedianOfMedians(vector<vector<int> > values){
    vector<int> medians;
    for (int i = 0; i < values.size(); i++) {
        int m = findMedian(values[i]);
        medians.push_back(m);
    }
    return findMedian(medians);
}

void selectionByMedianOfMedians(const vector<int> values, int k){
    // Divide the list into n/5 lists of 5 elements each
    vector<vector<int> > vec2D;
    int count = 0;
    while (count != values.size()) {
        int countRow = 0;
        vector<int> row;
        while ((countRow < 5) && (count < values.size())) {
            row.push_back(values[count]);
            count++;
            countRow++;
        }
        vec2D.push_back(row);
    }

    cout<<endl<<endl<<"Printing 2D vector : "<<endl;
    for (int i = 0; i < vec2D.size(); i++) {
        for (int j = 0; j < vec2D[i].size(); j++) {
            cout<<vec2D[i][j]<<" ";
        }
        cout<<endl;
    }
    cout<<endl;

    // Calculating a new pivot for making splits
    int m = findMedianOfMedians(vec2D);
    cout<<"Median of medians is : "<<m<<endl;

    // Partition the list into elements larger than 'm' (call this sublist L1)
    // and those smaller than 'm' (call this sublist L2)
    vector<int> L1, L2;
    for (int i = 0; i < vec2D.size(); i++) {
        for (int j = 0; j < vec2D[i].size(); j++) {
            if (vec2D[i][j] > m) {
                L1.push_back(vec2D[i][j]);
            } else if (vec2D[i][j] < m){
                L2.push_back(vec2D[i][j]);
            }
        }
    }

    // Checking the splits as per the new pivot 'm'
    cout<<endl<<"Printing L1 : "<<endl;
    for (int i = 0; i < L1.size(); i++) {
        cout<<L1[i]<<" ";
    }
    cout<<endl<<endl<<"Printing L2 : "<<endl;
    for (int i = 0; i < L2.size(); i++) {
        cout<<L2[i]<<" ";
    }

    // Recursive calls
    if ((k - 1) == L1.size()) {
        cout<<endl<<endl<<"Answer :"<<m;
    } else if (k <= L1.size()) {
        return selectionByMedianOfMedians(L1, k);
    } else if (k > (L1.size() + 1)){
        return selectionByMedianOfMedians(L2, k - ((int)L1.size()) - 1);
    }
}

int main()
{
    int values[] = {2, 3, 5, 4, 1, 12, 11, 13, 16, 7, 8, 6, 10, 9, 17, 15, 19, 20, 18, 23, 21, 22, 25, 24, 14};
    vector<int> vec(values, values + 25);
    cout<<"The given array is : "<<endl;
    for (int i = 0; i < vec.size(); i++) {
        cout<<vec[i]<<" ";
    }
    selectionByMedianOfMedians(vec, 8);
    return 0;
}
```

Below is the link to a full implementation with quite an extensive explanation of how the algorithm for finding the Kth element in an unsorted array works. The basic idea is to partition the array as in QuickSort. But in order to avoid extreme cases (e.g. when the smallest element is chosen as pivot in every step, so that the algorithm degenerates into O(n^2) running time), special pivot selection is applied, called the median-of-medians algorithm. The whole solution runs in O(n) time in the worst and in the average case.

Here is a link to the full article (it is about finding the Kth *smallest* element, but the principle is the same for finding the Kth *largest*):

First we can build a BST from the unsorted array and then find the kth smallest element from it. Note, though, that building the BST takes O(n log n) time with balancing (O(n^2) worst case without), and finding the kth smallest in a plain BST is O(k) unless the tree is augmented with subtree sizes - so the overall cost is O(n log n), not O(n).

This is an implementation in JavaScript.

If you relax the constraint that you cannot modify the array, you can avoid the extra memory by using two indexes to identify the "current partition" (in classic quicksort style - http://www.nczonline.net/blog/2012/11/27/computer-science-in-javascript-quicksort/).

```
function kthMax(a, k){
    var size = a.length;
    var pivot = a[ parseInt(Math.random()*size) ]; // Another choice could have been (size / 2)

    // Create an array with all elements lower than the pivot and an array with all elements higher than the pivot
    var i, lowerArray = [], upperArray = [];
    for (i = 0; i < size; i++){
        var current = a[i];
        if (current < pivot) {
            lowerArray.push(current);
        } else if (current > pivot) {
            upperArray.push(current);
        }
    }

    // Which one should I continue with?
    if(k <= upperArray.length) {
        // Upper
        return kthMax(upperArray, k);
    } else {
        var newK = k - (size - lowerArray.length);
        if (newK > 0) {
            // Lower
            return kthMax(lowerArray, newK);
        } else {
            // None ... it's the current pivot!
            return pivot;
        }
    }
}
```

If you want to test how it performs, you can use this variation:

```
function kthMax (a, k, logging) {
    var comparisonCount = 0; // Number of comparisons that the algorithm uses
    var memoryCount = 0;     // Number of integers in memory that the algorithm uses
    var _log = logging;

    if(k < 0 || k >= a.length) {
        if (_log) console.log("k is out of range");
        return false;
    }

    function _kthmax(a, k){
        var size = a.length;
        var pivot = a[parseInt(Math.random()*size)];
        if(_log) console.log("Inputs:", a, "size="+size, "k="+k, "pivot="+pivot);

        // This should never happen. Just a nice check in this exercise
        // if you are playing with the code, to avoid never-ending recursion
        if(typeof pivot === "undefined") {
            if (_log) console.log("Ops...");
            return false;
        }

        var i, lowerArray = [], upperArray = [];
        for (i = 0; i < size; i++){
            var current = a[i];
            if (current < pivot) {
                comparisonCount += 1;
                memoryCount++;
                lowerArray.push(current);
            } else if (current > pivot) {
                comparisonCount += 2;
                memoryCount++;
                upperArray.push(current);
            }
        }
        if(_log) console.log("Pivoting:", lowerArray, "*"+pivot+"*", upperArray);

        if(k <= upperArray.length) {
            comparisonCount += 1;
            return _kthmax(upperArray, k);
        } else if (k > size - lowerArray.length) {
            comparisonCount += 2;
            return _kthmax(lowerArray, k - (size - lowerArray.length));
        } else {
            comparisonCount += 2;
            return pivot;
        }
        /*
         * BTW, this is the logic for kthMin if we want to implement that... ;-)
         *
        if(k <= lowerArray.length) {
            return kthMin(lowerArray, k);
        } else if (k > size - upperArray.length) {
            return kthMin(upperArray, k - (size - upperArray.length));
        } else
            return pivot;
        */
    }

    var result = _kthmax(a, k);
    return {result: result, iterations: comparisonCount, memory: memoryCount};
}
```

The rest of the code is just to create some playground:

```
function getRandomArray (n){
    var ar = [];
    for (var i = 0, l = n; i < l; i++) {
        ar.push(Math.round(Math.random() * l));
    }
    return ar;
}

//Create a random array of 50 numbers
var ar = getRandomArray (50);
```

Now, run your tests a few times. Because of the Math.random() it will produce different results every time:

```
kthMax(ar, 2, true);
kthMax(ar, 2);
kthMax(ar, 2);
kthMax(ar, 2);
kthMax(ar, 2);
kthMax(ar, 2);
kthMax(ar, 34, true);
kthMax(ar, 34);
kthMax(ar, 34);
kthMax(ar, 34);
kthMax(ar, 34);
kthMax(ar, 34);
```

If you test it a few times, you can see, even empirically, that the number of iterations is on average O(n) ~= constant * n, and that the value of k does not affect the running time.

Sexy quickselect in Python

```
import random

def quickselect(arr, k):
    '''k = 1 returns the first element in ascending order'''
    r = random.randrange(0, len(arr))
    a1 = [i for i in arr if i < arr[r]]  # partition into smaller...
    a2 = [i for i in arr if i > arr[r]]  # ...and bigger elements
    if k <= len(a1):
        return quickselect(a1, k)
    elif k > len(arr) - len(a2):
        return quickselect(a2, k - (len(arr) - len(a2)))
    else:
        return arr[r]
```

There is also Wirth's selection algorithm, which has a simpler implementation than QuickSelect. Wirth's selection algorithm is slower than QuickSelect, but with some improvements it becomes faster.

In more detail: using Vladimir Zabrodsky's MODIFIND optimization and median-of-3 pivot selection, and paying some attention to the final steps of the partitioning part of the algorithm, I've come up with the following algorithm (imaginably named "LefSelect"):

```
#define F_SWAP(a,b) { float temp=(a);(a)=(b);(b)=temp; }

// Note: the code needs more than 2 elements to work
float lefselect(float a[], const int n, const int k) {
    int l = 0, m = n - 1, i = l, j = m;
    float x;
    while (l < m) {
        // median-of-3 pivot selection
        if( a[k] < a[i] ) F_SWAP(a[i], a[k]);
        if( a[j] < a[i] ) F_SWAP(a[i], a[j]);
        if( a[j] < a[k] ) F_SWAP(a[k], a[j]);
        x = a[k];
        while (j > k && i < k) {
            do i++; while (a[i] < x);
            do j--; while (a[j] > x);
            F_SWAP(a[i], a[j]);
        }
        i++; j--;
        if (j < k) {
            while (a[i] < x) i++;
            l = i; j = m;
        }
        if (k < i) {
            while (x < a[j]) j--;
            m = j; i = l;
        }
    }
    return a[k];
}
```

In benchmarks that I did here, LefSelect is 20-30% faster than QuickSelect.

Haskell Solution:

```
kthElem index list = sort list !! index

withShape ~[]     []     = []
withShape ~(x:xs) (y:ys) = x : withShape xs ys

sort [] = []
sort (x:xs) = (sort ls `withShape` ls) ++ [x] ++ (sort rs `withShape` rs)
  where
    ls = filter (< x) xs
    rs = filter (>= x) xs
```

This effectively implements a lazy quickselect: the `withShape` method discovers the size (spine) of a sorted partition without actually computing its elements, so `sort list !! index` only evaluates the partitions the index falls into.

Here is a C++ implementation of Randomized QuickSelect. The idea is to randomly pick a pivot element. To implement randomized partition, we use the random function rand() to generate an index between l and r, swap the element at the randomly generated index with the last element, and finally call the standard partition process, which uses the last element as pivot.

```
#include <iostream>
#include <climits>
#include <cstdlib>
using namespace std;

int randomPartition(int arr[], int l, int r);

// This function returns k'th smallest element in arr[l..r] using
// QuickSort based method. ASSUMPTION: ALL ELEMENTS IN ARR[] ARE DISTINCT
int kthSmallest(int arr[], int l, int r, int k)
{
    // If k is smaller than number of elements in array
    if (k > 0 && k <= r - l + 1)
    {
        // Partition the array around a random element and
        // get position of pivot element in sorted array
        int pos = randomPartition(arr, l, r);

        // If position is same as k
        if (pos - l == k - 1)
            return arr[pos];
        if (pos - l > k - 1)  // If position is more, recur for left subarray
            return kthSmallest(arr, l, pos - 1, k);

        // Else recur for right subarray
        return kthSmallest(arr, pos + 1, r, k - pos + l - 1);
    }

    // If k is more than number of elements in array
    return INT_MAX;
}

void swap(int *a, int *b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

// Standard partition process of QuickSort(). It considers the last
// element as pivot and moves all smaller elements to left of it and
// greater elements to right. This function is used by randomPartition()
int partition(int arr[], int l, int r)
{
    int x = arr[r], i = l;
    for (int j = l; j <= r - 1; j++)
    {
        if (arr[j] <= x)  // arr[j] belongs in the left part, so swap it into place
        {
            swap(&arr[i], &arr[j]);
            i++;
        }
    }
    swap(&arr[i], &arr[r]);  // move the pivot into its final position
    return i;
}

// Picks a random pivot element between l and r and partitions
// arr[l..r] around the randomly picked element using partition()
int randomPartition(int arr[], int l, int r)
{
    int n = r - l + 1;
    int pivot = rand() % n;
    swap(&arr[l + pivot], &arr[r]);
    return partition(arr, l, r);
}

// Driver program to test above methods
int main()
{
    int arr[] = {12, 3, 5, 7, 4, 19, 26};
    int n = sizeof(arr)/sizeof(arr[0]), k = 3;
    cout << "K'th smallest element is " << kthSmallest(arr, 0, n-1, k);
    return 0;
}
```

The worst case time complexity of the above solution is still O(n^2): in the worst case, the randomized function may always pick a corner element. The expected time complexity of the above randomized QuickSelect, however, is Θ(n).

I came up with this algorithm, and it seems to be O(n):

Let's say k=3 and we want to find the 3rd largest item in the array. I would create three variables and compare each item of the array with the minimum of these three variables. If the array item is greater than our minimum, we replace the min variable with the item value. We continue the same way until the end of the array. The minimum of our three variables is then the 3rd largest item in the array.

```
define variables a=0, b=0, c=0
iterate through the array items
    find minimum of a, b, c
    if item > min then replace the min variable with the item value
continue until end of array
the minimum of a, b, c is our answer
```

And, to find the Kth largest item, we need K variables.

Example: (k=3)

```
[1,2,4,1,7,3,9,5,6,2,9,8]
Final variable values:
a=8 (answer)
b=9
c=9
```

Can someone please review this and let me know what I am missing?
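In answer to the review request: simulating the procedure (sketch below, names mine) gives final values 8, 9, 9 on the sample array, so the answer is 8 when duplicates count. Also note that each step scans the k variables, so the method is O(nk) - linear only for constant k.

```python
def k_vars_scan(arr, k=3):
    # k slots initialized to 0 (assumes positive inputs, as in the example);
    # replace the minimum slot whenever an item exceeds it
    slots = [0] * k
    for item in arr:
        m = min(slots)                 # O(k) scan per item
        if item > m:
            slots[slots.index(m)] = item
    return min(slots)                  # the kth largest, counting duplicates
```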

As per this paper, Finding the Kth largest item in a list of n items, the following algorithm will take `O(n)` time in the worst case.

- Divide the array into n/5 lists of 5 elements each.
- Find the median of each sub-array of 5 elements.
- Recursively find the median of all the medians; let's call it M.
- Partition the array into two sub-arrays: the 1st sub-array contains the elements larger than M (let's say this sub-array is a1), while the other sub-array contains the elements smaller than M (let's call this sub-array a2).
- If k <= |a1|, return selection(a1, k).
- If k − 1 = |a1|, return M.
- If k > |a1| + 1, return selection(a2, k − |a1| − 1).

**Analysis:** As suggested in the original paper:

We use the median to partition the list into two halves (the first half, if `k <= n/2`, and the second half otherwise). This algorithm takes `cn` time at the first level of recursion for some constant `c`, `cn/2` at the next level (since we recurse in a list of size n/2), `cn/4` at the third level, and so on. The total time taken is `cn + cn/2 + cn/4 + ... = 2cn = O(n)`.

**Why is the partition size taken as 5 and not 3?**

As mentioned in the original paper:

Dividing the list by 5 assures a worst-case split of 70-30. At least half of the medians are greater than the median-of-medians, hence at least half of the n/5 blocks have at least 3 elements, and this gives a `3n/10` split, which means the other partition is 7n/10 in the worst case. That gives `T(n) = T(n/5) + T(7n/10) + O(n)`. Since `n/5 + 7n/10 < n`, the worst-case running time is `O(n)`.

Now I have tried to implement the above algorithm as:

```
public static int findKthLargestUsingMedian(Integer[] array, int k) {
    // Step 1: Divide the list into n/5 lists of 5 elements each.
    int noOfRequiredLists = (int) Math.ceil(array.length / 5.0);
    // Step 2: Find pivotal element aka median of medians.
    int medianOfMedian = findMedianOfMedians(array, noOfRequiredLists);
    // Now we need two lists split using medianOfMedian as pivot:
    // all elements in listWithGreaterNumbers will be greater than medianOfMedian,
    // and listWithSmallerNumbers will have elements lesser than it.
    List<Integer> listWithGreaterNumbers = new ArrayList<>(); // elements greater than medianOfMedian
    List<Integer> listWithSmallerNumbers = new ArrayList<>(); // elements less than medianOfMedian
    for (Integer element : array) {
        if (element < medianOfMedian) {
            listWithSmallerNumbers.add(element);
        } else if (element > medianOfMedian) {
            listWithGreaterNumbers.add(element);
        }
    }
    // Next step.
    if (k <= listWithGreaterNumbers.size())
        return findKthLargestUsingMedian(listWithGreaterNumbers.toArray(new Integer[listWithGreaterNumbers.size()]), k);
    else if ((k - 1) == listWithGreaterNumbers.size())
        return medianOfMedian;
    else if (k > (listWithGreaterNumbers.size() + 1))
        return findKthLargestUsingMedian(listWithSmallerNumbers.toArray(new Integer[listWithSmallerNumbers.size()]), k - listWithGreaterNumbers.size() - 1);
    return -1;
}

public static int findMedianOfMedians(Integer[] mainList, int noOfRequiredLists) {
    int[] medians = new int[noOfRequiredLists];
    for (int count = 0; count < noOfRequiredLists; count++) {
        int startOfPartialArray = 5 * count;
        // The last group may have fewer than 5 elements, so clamp the end index.
        int endOfPartialArray = Math.min(startOfPartialArray + 5, mainList.length);
        Integer[] partialArray = Arrays.copyOfRange(mainList, startOfPartialArray, endOfPartialArray);
        // Step 2: Find median of each of these sublists (sort first; the middle
        // element of an unsorted group is not its median).
        Arrays.sort(partialArray);
        medians[count] = partialArray[partialArray.length / 2];
    }
    // Step 3: Find median of the medians.
    Arrays.sort(medians);
    return medians[medians.length / 2];
}
```

Just for the sake of completeness, another algorithm makes use of a Priority Queue and takes `O(n log n)` time.

```
public static int findKthLargestUsingPriorityQueue(Integer[] nums, int k) {
    int p = 0;
    int numElements = nums.length;
    // create a min-heap priority queue holding all the elements of nums
    PriorityQueue<Integer> pq = new PriorityQueue<Integer>();
    for (int n : nums) {
        pq.add(n);
    }
    // extract the kth largest element: after (numElements - k + 1) polls
    // from the min-heap, the last polled value is the kth largest
    while (numElements - k + 1 > 0) {
        p = pq.poll();
        k++;
    }
    return p;
}
```

Both of these algorithms can be tested as:

```
public static void main(String[] args) throws IOException {
    Integer[] numbers = new Integer[]{2, 3, 5, 4, 1, 12, 11, 13, 16, 7, 8, 6, 10, 9, 17, 15, 19, 20, 18, 23, 21, 22, 25, 24, 14};
    System.out.println(findKthLargestUsingMedian(numbers, 8));
    System.out.println(findKthLargestUsingPriorityQueue(numbers, 8));
}
```

As expected, the output is: `18 18`

Here is the implementation of the algorithm eladv suggested (I also put here the implementation with a random pivot):

```
import java.util.Arrays;

public class Median {

    public static void main(String[] s) {
        int[] test = {4,18,20,3,7,13,5,8,2,1,15,17,25,30,16};
        System.out.println(selectK(test, 8));

        /*
        int n = 100000000;
        int[] test = new int[n];
        for(int i=0; i<test.length; i++)
            test[i] = (int)(Math.random()*test.length);

        long start = System.currentTimeMillis();
        random_selectK(test, test.length/2);
        long end = System.currentTimeMillis();
        System.out.println(end - start);
        */
    }

    public static int random_selectK(int[] a, int k) {
        if(a.length <= 1)
            return a[0];

        int r = (int)(Math.random() * a.length);
        int p = a[r];

        int small = 0, equal = 0, big = 0;
        for(int i=0; i<a.length; i++) {
            if(a[i] < p) small++;
            else if(a[i] == p) equal++;
            else if(a[i] > p) big++;
        }

        if(k <= small) {
            int[] temp = new int[small];
            for(int i=0, j=0; i<a.length; i++)
                if(a[i] < p)
                    temp[j++] = a[i];
            return random_selectK(temp, k);
        }
        else if (k <= small+equal)
            return p;
        else {
            int[] temp = new int[big];
            for(int i=0, j=0; i<a.length; i++)
                if(a[i] > p)
                    temp[j++] = a[i];
            return random_selectK(temp, k-small-equal);
        }
    }

    public static int selectK(int[] a, int k) {
        if(a.length <= 5) {
            Arrays.sort(a);
            return a[k-1];
        }

        int p = median_of_medians(a);

        int small = 0, equal = 0, big = 0;
        for(int i=0; i<a.length; i++) {
            if(a[i] < p) small++;
            else if(a[i] == p) equal++;
            else if(a[i] > p) big++;
        }

        if(k <= small) {
            int[] temp = new int[small];
            for(int i=0, j=0; i<a.length; i++)
                if(a[i] < p)
                    temp[j++] = a[i];
            return selectK(temp, k);
        }
        else if (k <= small+equal)
            return p;
        else {
            int[] temp = new int[big];
            for(int i=0, j=0; i<a.length; i++)
                if(a[i] > p)
                    temp[j++] = a[i];
            return selectK(temp, k-small-equal);
        }
    }

    private static int median_of_medians(int[] a) {
        // Note: elements past the last full group of 5 are ignored here, which
        // is fine when a.length is a multiple of 5 (or as a pivot heuristic).
        int[] b = new int[a.length/5];
        int[] temp = new int[5];
        for(int i=0; i<b.length; i++) {
            for(int j=0; j<5; j++)
                temp[j] = a[5*i + j];
            Arrays.sort(temp);
            b[i] = temp[2];
        }
        return selectK(b, b.length/2 + 1);
    }
}
}
```

How about this kind of approach:

Maintain a `buffer of length k` and a `tmp_max`. Getting tmp_max is O(k), and it is done n times, so something like `O(kn)`.

Is it right or am I missing something?

*Although it doesn't beat the average case of quickselect or the worst case of the median-of-medians method, it's pretty easy to understand and implement.*

It is similar to the quicksort strategy, where we pick an arbitrary pivot and bring the smaller elements to its left and the larger to the right:

```
public static int kthElInUnsortedList(List<int> list, int k)
{
    if (list.Count == 1)
        return list[0];

    List<int> left = new List<int>();
    List<int> right = new List<int>();
    int pivotIndex = list.Count / 2;
    int pivot = list[pivotIndex]; //arbitrary

    for (int i = 0; i < list.Count; i++)
    {
        if (i == pivotIndex) // skip the pivot itself
            continue;
        int currentEl = list[i];
        if (currentEl < pivot)
            left.Add(currentEl);
        else
            right.Add(currentEl);
    }

    if (k == left.Count + 1)
        return pivot;

    if (left.Count < k)
        return kthElInUnsortedList(right, k - left.Count - 1);
    else
        return kthElInUnsortedList(left, k);
}
```

Go to the End of this link : ...........

- Have a Priority queue created.
- Insert all the elements into the heap.
- Call poll() k times.

```
public static int getKthLargestElement(int[] arr, int k) {
    // max-heap: reverse the natural min-ordering
    PriorityQueue<Integer> pq = new PriorityQueue<>((x, y) -> (y - x));
    // insert all the elements into the heap
    for (int ele : arr)
        pq.offer(ele);
    // call poll() k times; the last value polled is the kth largest
    int result = 0;
    for (int i = 0; i < k; i++)
        result = pq.poll();
    return result;
}
```


I am trying to find the most efficient way to sort the t smallest integers of an unsorted array of length n. I am trying to have O(n) runtime but, keep getting stuck. The best I can think of is just s

I have a question about finding the kth largest element using a min-heap. The algorithm is as follows: We take the first k elements and build a minheap Let Sk be the smallest element in S. Look at a

I have a coding challenge question that I'm struggling with. Given an array of integers, iterate through the array (only allowed once) connecting a value with the biggest value to it's right. You are

#include <iostream> //include header file using namespace std; int main () //start of main fcn { int values[ 20 ]; //delcares array and how many elements int small,big; //declares integer big=sm

Given an NxN binary matrix (containing only 0's or 1's), how can we go about finding largest rectangle containing all 0's? Example: I 0 0 0 0 1 0 0 0 1 0 0 1 II->0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0

I have an unsorted array with n*n order. How to get the largest element from each row with complexity O(n logn).

Let's say I have array foo and a positive integer y, where foo.length > y. I want to remove elements from foo so that foo.length becomes y (or very close to it). Also, I need to preserve the first

Assume I have an array of consecutive data and consecutive nulls, e.g.: 0, 3, 1, 2, null, null, null How to use binary search idea to find index of the first null element?

Possible Duplicate: Algorithm to determine if array contains n…n+m? Lets assume that M > N, and you have 2 arrays. One of length M called A and one of length N called B. Is there a faster way to

As the title says, is there any efficient way to find the second largest element in an array using recursion?

Possible Duplicate: length of array in function argument my array size is 5. ex: arrCustId[5] then how to know that how many CustomerIds are already present in my array in c? in short how to find le

Suppose I have an array {1, 2, 5, 4} and m = 3. I need to find: 1*2*5 + 1*2*4 + 1*5*4 + 2*5*4 i.e Sum of multiplication of all combination of m element from an array of n elements. One of the solutio

I know that if balanced, a BST height is O(log(n)), meaning searching is O(log(n)), but making an unbalanced tree into a balanced one would increase the run-time of inserting / deleting, since you wou

I wanted to find the largest sum continuous subarray from the given array. I know the O(n) approach of finding the largest sum continuous subarray approach using the concept of dynamic programming usi

Give a data structure that stores comparable objects and supports add() and get(k) operations [get(k) returns the kth smallest element in the data structure (1 <= k <= n)]. get(k) must be O(1) a

To find the unique array among the N arrays, consider I have 3 arrays; array_1 [ ]; array_2 [ ]; array_3 [ ]; How I approached is; I will have set a reference array ref_array[] =[values from 1 to 12]

I have this array of strings private static String[] colorsArray = { #bde876, #ff8581, #ffc472, #faed75, #a8c9e5, #999999, #e3a8e5, #dddddd, #fc603c, #ffcc00, #74e8d4, #3cd6fc

In an un sorted array, (we can pre-process this array). How can we answer the following query in O(1) time? Find the maximum from index i to j Edit: The preprocessing can take O(n) time and O(n) order

Given an array of positive integers, find a O(n² log(n)) algorithm to find all distinct numbers combinations for numbers x, y, z, u such that it satisfies x2 + y2 = z2 + u2 Basically, I see how you

I'm doing the following problem for fun / Java practice: Write a method kthSmallest that takes in a PriorityQueue of integers as input and outputs the kth smallest integer. The internal state of the

1.In a given array how to find the 2nd or 3rd,4th ,5th values. 2.Also if we use max() function in python what is the order of complexity i.e, associated with this function max() def nth_largest(li,n)

I am trying to find the last element in the array by using foreach loop. I have.. foreach ( $employees as $employee ) { $html.=$employee ->name.'and '; } I don't want to add 'and' to the last em

I want to find all consecutive sub-sequences of length n in a sequence. E.g. say n was 3 and the sequence was: [0,1,7,3,4,5,10] I want a function that would produce as output: [[0,1,7],[1,7,3],[7,3,4

I have an unsorted array of objects. I need to know how I can sort my array in descending order, according to the highest value inside the objects. I need to do this using for loops, not the easy way.

How can I find if a sorted array has an element a[j]=j in O(log n) time?(no duplicates)

I have a std::set, what's the proper way to find the largest int in this set ?