求数组中Majority元素的方法

frank-liu

浏览: 1665385 次
性别:
来自: 北京

最近访客更多访客>>

oneone1995

loginboot

45587123

liuqu11

博主相关

博客

微博

相册

留言

关于我

文章分类

社区版块

存档分类

博客分类：

java
data structures
algorithms

问题描述

在一个规模为N的数组A中，所谓的Majority元素就是指出现次数大于N/2的元素（因而最多只有一个这种元素）。比如数组

3, 3, 4, 2, 4, 4, 2, 4, 4 中间有Majority元素4。而数组3, 3, 4, 2, 4, 4, 2, 4则没有majority元素。需要一个算法，如果majority元素存在的话，就找出来，如果不存在，则给出报告。

下意识解法

通过这个问题，我们可以很快得出一个如下的方法。就是首先定义一个HashMap,里面存放数组里面的每个元素以及出现的次数。可以通过两个过程来做。

第一步是映射，将每个元素放进去，如果HashMap里面已经有元素了，则该元素对应的出现次数加一，否则新增加该元素，出现的次数设置为1.

第二步就是遍历整个HashMap，判断是否找到出现次数大于数组长度一半的。如果有，则返回该元素以及出现的次数。否则显示没有该元素。

详细的代码如下：

import java.util.HashMap;
import java.util.Map;

public class MaxFind
{
    private int maxValue;
    private int occurence;
    private int[] array;
    private HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();

    public MaxFind(int[] array)
    {
        this.array = array;
    }

    public void map()
    {
        if(array == null || array.length == 0)
            return;
        for(int i = 0; i < array.length; i++)
        {
            if(map.containsKey(array[i]))
            {
                int key = array[i];
                map.put(key, map.get(key) + 1);
            }
            else
                map.put(array[i], 1);
        }
    }

    public int find()
    {
        this.occurence = 0;
        getValueAndOccurence();
        if(this.occurence > array.length / 2)
            return this.maxValue;
        else
            return -1;
    }

    private void getValueAndOccurence()
    {
        for(Map.Entry<Integer, Integer> entry : map.entrySet())
        {
            if(entry.getValue() > this.occurence)
            {
                this.occurence = entry.getValue();
                this.maxValue = entry.getKey();
            }
        }
    }

    public static void main(String[] args)
    {
        int[] array0 = {3, 3, 4, 2, 4, 4, 2, 4, 4};
        int[] array1 = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        MaxFind finder = new MaxFind(array0);
        finder.map();
        if(finder.find() > 0)
            System.out.println(finder.find());
    }
}

分析

整体的代码实现采用一种相对OO一点的方式，因为两个方法都需要访问同一个数组和HashMap，可以将数组和HashMap作为类的成员。另外，HashMap的泛型参数必须为Integer，后面比较的时候会自动的装箱和拆箱。

这个方法的时间复杂度为O(N),因为要用到一个额外的HashMap，所以存储的空间复杂度也是O(N).

总的来说，这个方法很简单，没多少好说的。

进一步的要求

现在，我们看看是否还有进一步可以改进的地方。如果要求空间复杂度为O(1)呢？是否有方法？

分析

这个问题更苛刻的一点在于原来的办法可以用一个HashMap来存放统计结果。现在只能允许用O(1)的空间意味着只能用一个普通的数值变量来帮助解决问题。

这个时候就需要根据问题本身来进一步考虑一下。我们可以看到，如果存在一个majority元素的话，那么肯定一半多的元素都是这一个。我们考虑一种抵消的策略。如果该元素去和所有其他的元素比较，如果不同的都抵消的话，那么剩下的这个元素肯定就是majority元素。那么，具体的抵消步骤该怎么走呢？何况一开始我们也不知道哪个元素是majority元素。

进一步分析

假设我们一开始从数组的开头，碰到某个元素的时候，就设置该元素为当前元素。当前出现的次数为1.后面，如果接着碰到的元素和该元素相同，则当前次数加1，否则减1.如果当前出现的次数为0，则表示当前元素不确定。如果结合我们有majority元素这个前提的话，必然最后的结果是大于0的，而且最终获取到的值就是majority元素。

但是，这种分析还存在一个问题。我们考虑的是majority元素存在的情况。如果majority元素不存在，也有可能出现最终元素出现次数大于0的情况。比如说如下的序列： 1, 1, 1, 2, 2, 2, 3。这个序列没有majority元素，但是可以按照前面方法得到出现次数大于0的数字3.

很显然，数组中存在majority元素必然可以推出按照我们的方法得到的结果。但是按照我们的方法得到的结果却不能推出数组中存在majority元素。我们的方法只是判断的必要条件。

既然该方法不够充分，我们就还需要进一步的验证。接下来的步骤可以非常简单。我们再一次遍历这个数组，只要找按照前面我们的方法推出来的这个majority元素的次数，如果结果确实> N/2，则这个元素就是我们找到的。否则就没有找到。

至此，我们整个过程就整理出来了。概括一下的话可以分为两个步骤：

1. parse阶段

通过一个抵消的过程遍历数组，得到一个最终出现次数不为0的元素。如果出现的元素为0，则表明没有majority元素。

2. find阶段

按照前面找到的元素，到数组里面核对一下。

具体实现的代码如下：

public class Majority
{
    private static int count;
    private static int currentValue;

    public static void parse(int[] array)
    {
        if(array == null || array.length == 0)
            return;
        count = 1;
        currentValue = array[0];
        for(int i = 1; i < array.length; i++)
        {
            if(array[i] == currentValue)
                count++;
            else
                count--;
            if(count == 0)
                currentValue = array[i];
        }
    }

    public static boolean find(int[] array)
    {
        if(count == 0)
            return false;
        count = 0;
        for(int i = 0; i < array.length; i++)
        {
            if(array[i] == currentValue)
                count++;
        }
        if(count > array.length / 2)
            return true;
        else 
            return false;
    }

    public static void main(String[] args)
    {
        int[] array1 = {1, 2, 3, 4, 5};
        int[] array2 = {1, 2, 2};
        int[] array3 = {1, 2, 2, 2, 3, 4};

        // for array1 case 1
        parse(array1);
        if(find(array1))
            System.out.println("value:" + currentValue + "\tcount:" + count);
        else
            System.out.println("No majority value found.");

        // case 2
        parse(array2);
        if(find(array2))
            System.out.println("value:" + currentValue + "\tcount:" + count);
        else
            System.out.println("No majority value found.");


        // case 3
        parse(array3);
        if(find(array3))
            System.out.println("value:" + currentValue + "\tcount:" + count);
        else
            System.out.println("No majority value found.");
    }
}

该算法需要遍历数组两次，总的时间复杂度为O(N) ，空间复杂度为O(1)。

总结

寻找majority元素的方法还有其他的。比如说，先将元素排序，然后再进行判断。因为如果有majority元素的话，取数组中间点的那个元素即为所要找的那个。不过这种方法首先排序就需要O(NlogN)的时间复杂度，并不是很理想。

至于我们那个分两次遍历数组的方法，关键在于利用majority元素的特性进行抵消。

参考资料

A linear time majority vote algorithm

分享到：

连续子序列最大和与乘积问题的分析 | ant杂事整理

2012-08-05 13:33
浏览 1698
评论(0)
分类:编程语言
查看更多

发表评论

您还没有登录,请您登录后再发表评论

最近访客更多访客>>

博主相关

文章分类

社区版块

存档分类

最新评论