辉太郎: 九月 2019

传送门： link

题目描述：
Implementing the class MajorityChecker, which has the following API:

MajorityChecker(int[] arr) constructs an instance of MajorityChecker with the given array arr;
int query(int left, int right, int threshold) has arguments such that:
- 0 <= left <= right < arr.length representing a subarray of arr;
- 2 * threshold > right - left + 1, ie. the threshold is always a strict majority of the length of the subarray

Each query(...) returns the element in arr[left], arr[left+1], ..., arr[right] that occurs at least threshold times, or -1 if no such element exists.

Example:

MajorityChecker majorityChecker = new MajorityChecker([1,1,2,2,1,1]);
majorityChecker.query(0,5,4); // returns 1
majorityChecker.query(0,3,3); // returns -1
majorityChecker.query(2,3,2); // returns 2

Constraints:

1 <= arr.length <= 20000
1 <= arr[i] <= 20000
For each query, 0 <= left <= right < len(arr)
For each query, 2 * threshold > right - left + 1
The number of queries is at most 10000

1. Moore Voting Solution
这个解法不是第一次出现在Leetcode，我第一次使用这个算法是在 [Majority Element](https://leetcode.com/problems/majority-element/) 中。


class MajorityChecker {
    int[] arr;
    public MajorityChecker(int[] arr) {
        this.arr = arr;
    }

    public int query(int left, int right, int threshold) {
        int cnt = 0, val = -1;
        // voting algo
        for(int i = left; i <= right; i++) {
            if(cnt == 0) {
                cnt++;
                val = arr[i];
            } else {
                if(arr[i] == val)
                    cnt++;
                else
                    cnt--;
            }
        }
        // check whether the winner is larger or equal to threshold
        cnt = 0;
        for(int i = left; i <= right; i++) {
            if(val == arr[i])
                cnt++;
        }
        if(cnt >= threshold)
            return val;
        else
            return -1;
    }
}

这个解法的复杂度是构造函数O(1)，query函数O(N)。这个复杂度不算低，根据数据集规模来看，应该有更好的解法。

2. 近似算法
假设我们要计算[l, r]中majority元素，如果这个数字存在，我们随机选择一个数字，那么这个数字就是majority元素的概率至少是1/2。如果我们重复20次，那么我们没有选择到majority元素的概率是 (1/2）^20，这个是非常小的概率。我们利用这个近似的算法来计算。那么问题变成了已知l,r且对于一个随机选择的val，怎么才能高效判断这个val是不是majority元素呢，我们可以二分搜索。代码如下：


class MajorityChecker {
    // store val -> set of idx
    Map<Integer, List<Integer>> map;
    Random random;
    int[] arr;
    public MajorityChecker(int[] arr) {
        map = new HashMap<>();
        random = new Random();
        for(int i = 0; i < arr.length; i++) {
            int val = arr[i];
            if(!map.containsKey(val)) {
                List<Integer> list = new ArrayList<>();
                map.put(val, list);
            }
            map.get(val).add(i);
        }
        this.arr = arr;
    }

    public int query(int left, int right, int threshold) {
        for(int times = 0; times < 20; times++) {
            int num = random.nextInt(right - left + 1);
            int idx = left + num;
            int val = arr[idx];
            int start = Collections.binarySearch(map.get(val), left);
            if(start < 0)   start = -(start + 1);
            int end = Collections.binarySearch(map.get(val), right);
            if(end < 0) end = -(end + 1) - 1;
            if(end - start + 1 >= threshold)
                return val;
        }
        return -1;
    }
}

算法复杂度O(20lgn)

3. segment tree
提到线段的题目，经典算法包括BIT和segment tree。 BIT多用于计算某一线段的元素之和，segment tree适用场景比较多。所以直觉上这个题目应该可以使用segment tree来解。这个解法还是用到了上面提到的二分搜索算法。


class MajorityChecker {
    int[] data, tree;
    Map<Integer, List<Integer>> map;
    public MajorityChecker(int[] arr) {
        this.data = arr;
        tree = new int[arr.length * 4];
        map = new HashMap<>();
        for(int i = 0; i < arr.length; i++) {
            if(!map.containsKey(arr[i])) {
                List<Integer> list = new ArrayList<>();
                map.put(arr[i], list);
            }
            map.get(arr[i]).add(i);
        }
        buildTree(0, arr.length-1, 0);
    }

    private void buildTree(int start, int end, int treeIdx) {
        if(start == end) {
            tree[treeIdx] = data[start];
            return;
        }

        int leftTreeIdx = treeIdx * 2 + 1, rightTreeIdx = treeIdx * 2 + 2;
        int mid = (start + end) / 2;
        buildTree(start, mid, leftTreeIdx);
        buildTree(mid+1, end, rightTreeIdx);

        if(tree[leftTreeIdx] != 0 && getOccurrence(tree[leftTreeIdx], start, end) * 2 > end - start + 1) {
            tree[treeIdx] = tree[leftTreeIdx];
        } else if(tree[rightTreeIdx] != 0 && getOccurrence(tree[rightTreeIdx], start, end) * 2 > end - start + 1) {
            tree[treeIdx] = tree[rightTreeIdx];
        }
    }

    private int[] query(int treeIdx, int left, int right, int targetL, int targetR) {
        if(left >= targetL && right <= targetR) {
            if(tree[treeIdx] == 0)
                return new int[]{0, 0};
            return new int[] {tree[treeIdx], getOccurrence(tree[treeIdx], targetL, targetR)};
        }

        int mid = (left + right) / 2, leftTreeIdx = treeIdx * 2 + 1, rightTreeIdx = treeIdx * 2 + 2;
        if(targetL > mid)
            return query(rightTreeIdx, mid+1, right, targetL, targetR);
        else if (targetR <= mid)
            return query(leftTreeIdx, left, mid, targetL, targetR);
        else {
            int[] ans1 = query(leftTreeIdx, left, mid, targetL, targetR);
            if(ans1[0] > 0 && ans1[1] * 2 > targetR - targetL + 1) return ans1;
            int[] ans2 = query(rightTreeIdx, mid+1, right, targetL, targetR);
            if(ans2[0] > 0) return ans2;
            return new int[] {0, 0};
        }
    }

    private int getOccurrence(int num, int left, int right) {
        List<Integer> list = map.get(num);
        int start = Collections.binarySearch(list, left);
        if(start < 0)   start = -(start + 1);
        int end = Collections.binarySearch(list, right);
        if(end < 0) end = -(end + 1) - 1;
        return end - start + 1;
    }

    public int query(int left, int right, int threshold) {
        int[] ans = query(0, 0, data.length-1, left, right);
        if(ans[1] >= threshold) return ans[0];
        return -1;
    }
}

这个算法初始化复杂度O(nlgn)。查询复杂度是O(lgnlgn) -> 我们会递归lgn次，每次都可能做一次binary search。

TAKE AWAY POINTS:

Moore Voting Algo
Segment Tree
Random API -> random.nextInt(100) 会生成0-99数字

辉太郎

[LEETCODE 1157] Online Majority Element In Subarray

[LEETCODE 1157] Online Majority Element In Subarray

搜索此博客