## Saturday, October 3, 2015

### LeetCode OJ - Find the Duplicate Number

Problem:

Solution:

Think of it this way, suppose we have a perfect array look like this

[1, 2, 3, 4, 5, 6, 7, 8, 9]

Now we introduce a duplicated number, let say, it is 4, then the array look like this

[1, 2, 3, 4, 5, 6, 7, 8, 9, 4]

Next the number 4 could be duplicated more than once, so we rename so the existing elements to 4

[1, 4(was 2), 3, 4, 5, 6, 4(was 7), 8, 9, 4]

The key idea is here - if we are looking at the array through a range filter (let say, I am only interested in number in the range [7, 9), then a few case could happen.

Case 1: The range contain the duplicated element, in that case, the number of elements we can found in the filter must be larger than the number of elements it should have. For example, if we look at [3, 5), we will find [1, 4(was 2), 3, 4, 5, 6, 4(was 7), 8, 9, 4], or 5 elements. If no duplication occurred, there should be only two [3, 4].

This is true because some other elements could rename into this group, but that cannot rename out. In this case, the expected number of element is [3, 4], but we have one extra 4 (as the duplicate) and then two 4 renamed in, that's why we have 5.

Case 2: The range does not contain the duplicated element, in that case, the number of elements we can find in the filter must be equal or less than the number of elements.

Similar argument applies to claim this, the element is not repeated, and it could be renamed out. For example, in the [7, 9) case, we have only [8] because 7 is renamed out.

With this key idea, we could binary search for the answer!

Code:

#include "stdafx.h"

// https://leetcode.com/problems/find-the-duplicate-number/

#include "LEET_FIND_THE_DUPLICATE_NUMBER.h"
#include <map>
#include <iostream>
#include <sstream>
#include <vector>
#include <string>

using namespace std;

namespace _LEET_FIND_THE_DUPLICATE_NUMBER
{
class Solution
{
public:
int findDuplicate(vector<int>& nums)
{
int size = nums.size();
int begin = 1;
int end = size;
while (begin < end)
{
#ifdef LOG
cout << "Answer should be in [" << begin << ", " << end << ")" << endl;
#endif
if ((end - begin) == 1)
{
return begin;
}
int mid = (begin + end) / 2;
int lCount = 0;
int rCount = 0;
for (int i = 0; i < size; i++)
{
if (begin <= nums[i] && nums[i] < mid)
{
lCount++;
}
else if (mid <= nums[i] && nums[i] < end)
{
rCount++;
}
}
#ifdef LOG
cout << "Count in [" << begin << ", " << mid << ") is " << lCount << endl;
cout << "Count in [" << mid << ", " << end << ") is " << rCount << endl;
#endif
int lCountExpected = mid - begin;
int rCountExpected = end - mid;
if (lCount > lCountExpected)
{
end = mid;
}
else if (rCount > rCountExpected)
{
begin = mid;
}
else
{
throw 1;
}
}
}
};
};

using namespace _LEET_FIND_THE_DUPLICATE_NUMBER;

int LEET_FIND_THE_DUPLICATE_NUMBER()
{
int case1[] = { 2, 2, 2, 2, 2 };
int case2[] = { 1, 2, 3, 4, 5, 6, 7, 7 };
int case3[] = { 4, 4, 16, 3, 12, 18, 11, 19, 10, 17, 4, 15, 4, 9, 4, 13, 4, 4, 1, 4 };
Solution solution;
cout << solution.findDuplicate(vector<int>(case1, case1 + _countof(case1))) << endl;
cout << solution.findDuplicate(vector<int>(case2, case2 + _countof(case2))) << endl;
cout << solution.findDuplicate(vector<int>(case3, case3 + _countof(case3))) << endl;
return 0;
}