
How RoaringBitmap Works

Author: Internet

Background

The static factory method bitmapOf builds a RoaringBitmap from a list of int values:

public static RoaringBitmap bitmapOf(final int... dat) {
    final RoaringBitmap ans = new RoaringBitmap();
    ans.add(dat);
    return ans;
}

How it works

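  1. Initializing the bitmap
    final RoaringBitmap ans = new RoaringBitmap();
    During initialization, the no-argument constructor creates a RoaringArray object and assigns it to the member variable highLowContainer. highLowContainer has two important fields: keys, a short array holding the high-bits values, and values, a Container array holding the low-bits values. Both arrays start with a capacity of 4.
    Each int value (32 bits) is split evenly into two 16-bit halves: the high bits and the low bits. The high bits select a container, and the low bits are stored inside it.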
  public RoaringBitmap() {
    highLowContainer = new RoaringArray();
  }
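
The 16/16 split described above can be sketched in a few lines. This is a minimal illustration, not the library's code: the helper names are modeled on the Util.highbits, Util.lowbits and Util.toIntUnsigned calls that appear in the excerpts, but their bodies here are assumptions.

```java
class HighLowSplit {
    // Upper 16 bits of the value; plays the role of the key (assumed implementation).
    static short highBits(int x) {
        return (short) (x >>> 16);
    }

    // Lower 16 bits of the value; this is what a container would store (assumed implementation).
    static short lowBits(int x) {
        return (short) (x & 0xFFFF);
    }

    // Java shorts are signed, so reinterpret one as an unsigned value in [0, 65535].
    static int toIntUnsigned(short x) {
        return x & 0xFFFF;
    }

    public static void main(String[] args) {
        int value = 65541; // 0x00010005
        short high = highBits(value);
        short low = lowBits(value);
        System.out.println(toIntUnsigned(high)); // 1
        System.out.println(toIntUnsigned(low));  // 5
        // Recombining both halves gives back the original value.
        int rebuilt = (toIntUnsigned(high) << 16) | toIntUnsigned(low);
        System.out.println(rebuilt == value);    // true
    }
}
```

Because the halves are stored as signed shorts, any comparison or indexing has to go through the unsigned conversion first; that is why the excerpts search with an unsigned binary search.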

The RoaringArray class

public final class RoaringArray implements Cloneable, Externalizable {
  static final int INITIAL_CAPACITY = 4;
  short[] keys = null;
  Container[] values = null;
  protected RoaringArray() {
    this.keys = new short[INITIAL_CAPACITY];
    this.values = new Container[INITIAL_CAPACITY];
  }
}
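  2. Adding new data to the bitmap
    There are currently three kinds of Container: ArrayContainer, BitmapContainer and RunContainer. Each container belongs to a single 16-bit key, so it can hold at most 65536 (2^16) values. ArrayContainer and BitmapContainer record the number of elements they hold in the member variable cardinality.
    The container type changes as the amount of data grows: ArrayContainer (up to 4096 = 2^12 values) -> BitmapContainer (more than 4096 and up to 65536 = 2^16 values) -> RunContainer (produced by calling runOptimize).
    First element (empty bitmap):
    A new ArrayContainer is created; it is essentially a short array with an initial capacity of 4.
    The high bits of the first element are computed and stored at index 0 of highLowContainer's short array. The low bits are then computed and inserted at the proper position in the ArrayContainer (all elements are kept sorted). Finally, the ArrayContainer is stored at index 0 of highLowContainer's Container array.
    Subsequent elements:
    The high bits of the element are computed. If they equal the high bits of the previous element, the current container is reused directly. Otherwise the value is looked up in highLowContainer's short array. This lookup is optimized: if the last key equals the value, its index is returned immediately; otherwise a hybrid binary search returns the index. If the key exists, the low bits are added to the container at the same index of the Container array (that container may be any of the three kinds, because the amount of data it holds determines its type). If the key does not exist, the lookup returns a negative number that encodes the insertion point; the high bits are inserted at that position of the short array (keeping the keys sorted), and the low bits are placed in a new ArrayContainer that is inserted at the same position of the Container array.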
public class RoaringBitmap implements Cloneable, Serializable, Iterable<Integer>, Externalizable,
    ImmutableBitmapDataProvider, BitmapDataProvider {
  /**
   * Set all the specified values  to true. This can be expected to be slightly
   * faster than calling "add" repeatedly. The provided integers values don't
   * have to be in sorted order, but it may be preferable to sort them from a performance point of
   * view.
   *
   * @param dat set values
   */
  public void add(final int... dat) {
    Container currentcont = null;
    short currenthb = 0;
    int currentcontainerindex = 0;
    int j = 0;
    if(j < dat.length) {
      int val = dat[j];
      currenthb = Util.highbits(val);
      currentcontainerindex = highLowContainer.getIndex(currenthb);
      if (currentcontainerindex >= 0) {
        currentcont = highLowContainer.getContainerAtIndex(currentcontainerindex);
        Container newcont = currentcont.add(Util.lowbits(val));
        if(newcont != currentcont) {
          highLowContainer.setContainerAtIndex(currentcontainerindex, newcont);
          currentcont = newcont;
        }
      } else {
        currentcontainerindex = - currentcontainerindex - 1;
        final ArrayContainer newac = new ArrayContainer();
        currentcont = newac.add(Util.lowbits(val));
        highLowContainer.insertNewKeyValueAt(currentcontainerindex, currenthb, currentcont);
      }
      j++;
    }
    for( ; j < dat.length; ++j) {
      int val = dat[j];
      short newhb = Util.highbits(val);
      if(currenthb == newhb) {// easy case
        // this could be quite frequent
        Container newcont = currentcont.add(Util.lowbits(val));
        if(newcont != currentcont) {
          highLowContainer.setContainerAtIndex(currentcontainerindex, newcont);
          currentcont = newcont;
        }
      } else {
        currenthb = newhb;
        currentcontainerindex = highLowContainer.getIndex(currenthb);
        if (currentcontainerindex >= 0) {
          currentcont = highLowContainer.getContainerAtIndex(currentcontainerindex);
          Container newcont = currentcont.add(Util.lowbits(val));
          if(newcont != currentcont) {
            highLowContainer.setContainerAtIndex(currentcontainerindex, newcont);
            currentcont = newcont;
          }
        } else {
          currentcontainerindex = - currentcontainerindex - 1;
          final ArrayContainer newac = new ArrayContainer();
          currentcont = newac.add(Util.lowbits(val));
          highLowContainer.insertNewKeyValueAt(currentcontainerindex, currenthb, currentcont);
        }
      }
    }
  }
}
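
The lookup-then-insert step that add relies on can be sketched in isolation. This is a simplified model under stated assumptions: KeyIndexSketch and its methods are hypothetical names, a plain binary search stands in for the library's hybrid search, and the doubling growth policy is an assumption. It only tracks keys, not containers.

```java
import java.util.Arrays;

class KeyIndexSketch {
    // Sorted high-bits keys, stored as shorts but ordered by their unsigned value.
    private short[] keys = new short[4];
    private int size = 0;

    static int toIntUnsigned(short x) {
        return x & 0xFFFF;
    }

    // Returns the index of key if present, otherwise -(insertionPoint) - 1.
    int getIndex(short key) {
        // Fast path: the key equals the last (largest) key.
        if (size > 0 && keys[size - 1] == key) {
            return size - 1;
        }
        int target = toIntUnsigned(key);
        int lo = 0;
        int hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int midVal = toIntUnsigned(keys[mid]);
            if (midVal < target) {
                lo = mid + 1;
            } else if (midVal > target) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -(lo + 1);
    }

    // Inserts key at position i, shifting larger keys one slot to the right.
    void insertKeyAt(int i, short key) {
        if (size == keys.length) {
            keys = Arrays.copyOf(keys, size * 2); // assumed growth policy
        }
        System.arraycopy(keys, i, keys, i + 1, size - i);
        keys[i] = key;
        size++;
    }

    // Returns the index of key, inserting it at its sorted position if absent.
    int findOrInsert(short key) {
        int i = getIndex(key);
        if (i < 0) {
            i = -i - 1; // decode the insertion point
            insertKeyAt(i, key);
        }
        return i;
    }

    public static void main(String[] args) {
        KeyIndexSketch s = new KeyIndexSketch();
        System.out.println(s.findOrInsert((short) 5)); // 0
        System.out.println(s.findOrInsert((short) 2)); // 0, since 2 sorts before 5
        System.out.println(s.findOrInsert((short) 5)); // 1, found via the fast path
    }
}
```

Encoding "not found" as -(insertionPoint) - 1 lets one call report both the miss and where the new key belongs, which is what the `- currentcontainerindex - 1` expression in add decodes.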
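  3. Promoting an ArrayContainer to a BitmapContainer
    When a new value is added to an ArrayContainer whose cardinality has already reached the default maximum size of 4096 (2^12), the container is promoted to a BitmapContainer.
    A BitmapContainer is essentially a long array. On construction its length is fixed at 1024 (65536 / 64), which gives one bit for every possible low-bits value.
    During promotion, the elements of the ArrayContainer are added to the BitmapContainer one by one. The unsigned value divided by 64 (0, 1, 2, ...) selects the long in the array, and the value modulo 64 selects the bit inside that long, so all elements with the same quotient are OR-ed together into a single long.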
public final class ArrayContainer extends Container implements Cloneable {
	private static final int DEFAULT_INIT_SIZE = 4;
	static final int DEFAULT_MAX_SIZE = 4096;
	
	@Override
  	public Container add(final short x) {
	    int loc = Util.unsignedBinarySearch(content, 0, cardinality, x);
	    if (loc < 0) {
	      // Transform the ArrayContainer to a BitmapContainer
	      // when cardinality = DEFAULT_MAX_SIZE
	      if (cardinality >= DEFAULT_MAX_SIZE) {
	        BitmapContainer a = this.toBitmapContainer();
	        a.add(x);
	        return a;
	      }
	      if (cardinality >= this.content.length) {
	        increaseCapacity();
	      }
	      // insertion: shift the elements > x by one position to
	      // the right
	      // and put x in its appropriate place
	      System.arraycopy(content, -loc - 1, content, -loc, cardinality + loc + 1);
	      content[-loc - 1] = x;
	      ++cardinality;
	    }
	    return this;
  	}

	 @Override
	 public BitmapContainer toBitmapContainer() {
	    BitmapContainer bc = new BitmapContainer();
	    bc.loadData(this);
	    return bc;
	 }
}
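
The sorted insertion and the promotion threshold can be tried out with a stripped-down model. This is a sketch, not the library class: SortedShortArraySketch is a hypothetical name, the doubling growth policy is an assumption (the excerpt does not show increaseCapacity), and instead of converting to a bitmap the add method simply returns false at the point where the real container would be promoted.

```java
import java.util.Arrays;

class SortedShortArraySketch {
    static final int MAX_SIZE = 4096; // same value as DEFAULT_MAX_SIZE in the excerpt
    short[] content = new short[4];
    int cardinality = 0;

    // Binary search over the first `cardinality` entries, compared as unsigned values.
    // Returns the index if found, otherwise -(insertionPoint) - 1.
    int indexOf(short x) {
        int target = x & 0xFFFF;
        int lo = 0;
        int hi = cardinality - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int midVal = content[mid] & 0xFFFF;
            if (midVal < target) {
                lo = mid + 1;
            } else if (midVal > target) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -(lo + 1);
    }

    // Returns true if x is stored (or was already present), and false if the
    // array is full, which is where the real container switches to a bitmap.
    boolean add(short x) {
        int loc = indexOf(x);
        if (loc >= 0) {
            return true; // duplicates are ignored
        }
        if (cardinality >= MAX_SIZE) {
            return false; // promotion point
        }
        if (cardinality == content.length) {
            // Assumed growth policy: double, capped at MAX_SIZE.
            content = Arrays.copyOf(content, Math.min(content.length * 2, MAX_SIZE));
        }
        int pos = -loc - 1;
        // Shift the elements greater than x one slot to the right.
        System.arraycopy(content, pos, content, pos + 1, cardinality - pos);
        content[pos] = x;
        cardinality++;
        return true;
    }

    public static void main(String[] args) {
        SortedShortArraySketch a = new SortedShortArraySketch();
        for (int v = 4095; v >= 0; v--) {
            a.add((short) v);
        }
        System.out.println(a.cardinality);        // 4096
        System.out.println(a.add((short) 5000));  // false: a 4097th distinct value needs a bitmap
    }
}
```

Note that the array can hold exactly 4096 values; it is the attempt to add one more distinct value that triggers the switch.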

The BitmapContainer class

public final class BitmapContainer extends Container implements Cloneable {
	protected static final int MAX_CAPACITY = 1 << 16;
	final long[] bitmap;
	int cardinality;

	public BitmapContainer() {
    	this.cardinality = 0;
    	this.bitmap = new long[MAX_CAPACITY / 64];
    }
    // Copy the elements of an ArrayContainer into this BitmapContainer
    protected void loadData(final ArrayContainer arrayContainer) {
    	this.cardinality = arrayContainer.cardinality;
	    for (int k = 0; k < arrayContainer.cardinality; ++k) {
	      final short x = arrayContainer.content[k];
	      bitmap[Util.toIntUnsigned(x) / 64] |= (1L << x);
	    }
    }
}
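
The word and bit arithmetic used by loadData can be checked with a small standalone sketch. The class and method names below are hypothetical. The excerpt writes `1L << x` with a short x; Java uses only the low six bits of the shift distance when shifting a long, so that is equivalent to the explicit `v & 63` used here.

```java
class BitmapWordsSketch {
    static final int WORDS = (1 << 16) / 64; // 1024 longs cover all 65536 low-bits values

    // Builds the long[] representation from the first `cardinality` entries of values.
    static long[] toWords(short[] values, int cardinality) {
        long[] words = new long[WORDS];
        for (int k = 0; k < cardinality; k++) {
            int v = values[k] & 0xFFFF;        // unsigned value in [0, 65535]
            words[v >>> 6] |= 1L << (v & 63);  // word index = v / 64, bit index = v % 64
        }
        return words;
    }

    // Tests whether the bit for x is set.
    static boolean contains(long[] words, short x) {
        int v = x & 0xFFFF;
        return (words[v >>> 6] & (1L << (v & 63))) != 0;
    }

    // Counts the set bits across all words.
    static int cardinality(long[] words) {
        int count = 0;
        for (long w : words) {
            count += Long.bitCount(w);
        }
        return count;
    }

    public static void main(String[] args) {
        short[] values = {0, 63, 64, (short) 65535};
        long[] words = toWords(values, values.length);
        System.out.println(words[1] == 1L);            // true: 64 is bit 0 of word 1
        System.out.println(contains(words, (short) 63)); // true
        System.out.println(cardinality(words));        // 4
    }
}
```

The bitmap always occupies 1024 * 8 = 8192 bytes, which is also what 4096 two-byte shorts occupy; beyond that many values the sorted array would be the larger of the two, which is consistent with the 4096 threshold.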

Source: https://blog.csdn.net/md_2014/article/details/111568110