An Analysis of How RoaringBitmap Works
Author: Internet
Background
The following factory method converts a set of int values into a bitmap:
public static RoaringBitmap bitmapOf(final int... dat) {
    final RoaringBitmap ans = new RoaringBitmap();
    ans.add(dat);
    return ans;
}
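How It Works
- Initializing the bitmap
final RoaringBitmap ans = new RoaringBitmap();
During initialization, the no-argument constructor creates a RoaringArray object and assigns it to the member variable highLowContainer. highLowContainer has two important fields: an array of high-order values (keys, of type short[]) and an array of containers that hold the low-order values (values, of type Container[]). Both arrays start with a capacity of 4.
A 32-bit int is split into two 16-bit halves: the high 16 bits and the low 16 bits.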
public RoaringBitmap() {
    highLowContainer = new RoaringArray();
}
The RoaringArray class
public final class RoaringArray implements Cloneable, Externalizable {
    static final int INITIAL_CAPACITY = 4;
    short[] keys = null;
    Container[] values = null;

    protected RoaringArray() {
        this.keys = new short[INITIAL_CAPACITY];
        this.values = new Container[INITIAL_CAPACITY];
    }
}
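The 16/16 split described above can be sketched in a few lines. This is a minimal illustration, not the library's code: the class HighLowSplit and its method names are hypothetical stand-ins for the Util.highbits, Util.lowbits, and Util.toIntUnsigned helpers that appear in the excerpts.

```java
// Hypothetical helper class, written only to illustrate the 16/16 split.
final class HighLowSplit {
    // Upper 16 bits of a 32-bit int: selects the container (the key).
    static short highBits(int x) {
        return (short) (x >>> 16);
    }

    // Lower 16 bits: the value stored inside the selected container.
    static short lowBits(int x) {
        return (short) (x & 0xFFFF);
    }

    // Reads a short as an unsigned value in the range [0, 65535].
    static int toUnsigned(short s) {
        return s & 0xFFFF;
    }

    // Rebuilds the original int from its two halves.
    static int combine(short high, short low) {
        return (toUnsigned(high) << 16) | toUnsigned(low);
    }

    public static void main(String[] args) {
        int value = 131075; // 2 * 65536 + 3
        short high = highBits(value);
        short low = lowBits(value);
        System.out.println(toUnsigned(high) + " " + toUnsigned(low)); // prints "2 3"
        System.out.println(combine(high, low) == value);              // prints "true"
    }
}
```

Because Java's short is signed, both halves have to be read as unsigned whenever they are compared or used as an index, which is why the sketch masks with 0xFFFF.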
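- Adding new data to the bitmap
There are currently three kinds of Container: ArrayContainer, BitmapContainer, and RunContainer. Each has an upper bound on capacity. In the ArrayContainer and BitmapContainer excerpts shown here, the number of elements held is tracked in the cardinality field.
The container type changes as the data grows: ArrayContainer (up to 4096 elements, 2^12) -> BitmapContainer (more than 4096 and up to 65536 elements, 2^16) -> RunContainer (created when runOptimize is called)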
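One way to see why 4096 is a natural threshold is to compare payload sizes: a sorted array of 16-bit values costs 2 bytes per element, while the bitmap always costs 65536 bits, that is 8192 bytes. The sketch below only does this arithmetic. The class name ContainerSizeEstimate is hypothetical, and the figures ignore object headers and any spare array capacity.

```java
// Hypothetical helper: compares the payload size of the two representations.
final class ContainerSizeEstimate {
    // A bitmap over 65536 possible values always uses 65536 / 8 = 8192 bytes.
    static final int BITMAP_BYTES = (1 << 16) / 8;

    // Payload bytes of a short array holding `cardinality` values.
    static int arrayBytes(int cardinality) {
        return cardinality * Short.BYTES;
    }

    // True when the array representation is no larger than the bitmap.
    static boolean arrayIsNoLarger(int cardinality) {
        return arrayBytes(cardinality) <= BITMAP_BYTES;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 1000, 4096, 4097, 65536}) {
            System.out.println(n + " values: array=" + arrayBytes(n)
                    + " B, bitmap=" + BITMAP_BYTES + " B");
        }
    }
}
```

At exactly 4096 elements the two sizes are equal (8192 bytes); beyond that point the array would be the larger of the two.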
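First element:
An ArrayContainer is created; it is essentially a short array with an initial capacity of 4.
The high-order value of the first element is computed and stored at index 0 of highLowContainer's short array. The low-order value is then computed and inserted at the appropriate position in the ArrayContainer (elements are kept sorted). Finally, the ArrayContainer is stored at index 0 of highLowContainer's Container array.
Subsequent elements:
The high-order value of the element is computed and looked up in highLowContainer's short array. The lookup is optimized: if the last key in the array equals the value, its index is returned directly; otherwise the index is found with a hybrid binary search. If the key exists, the element's low-order value is added to the container at the same index of the Container array (the container may be any of the three types, since the amount of data determines the type). If the key does not exist, the search returns a negative result that encodes the insertion point as -index - 1; the high-order value is inserted into the short array at that position so the keys stay sorted, and the low-order value is placed in a new ArrayContainer that is inserted at the same position of the Container array.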
public class RoaringBitmap implements Cloneable, Serializable, Iterable<Integer>, Externalizable,
        ImmutableBitmapDataProvider, BitmapDataProvider {
    /**
     * Set all the specified values to true. This can be expected to be slightly
     * faster than calling "add" repeatedly. The provided integers values don't
     * have to be in sorted order, but it may be preferable to sort them from a performance point of
     * view.
     *
     * @param dat set values
     */
    public void add(final int... dat) {
        Container currentcont = null;
        short currenthb = 0;
        int currentcontainerindex = 0;
        int j = 0;
        if (j < dat.length) {
            int val = dat[j];
            currenthb = Util.highbits(val);
            currentcontainerindex = highLowContainer.getIndex(currenthb);
            if (currentcontainerindex >= 0) {
                currentcont = highLowContainer.getContainerAtIndex(currentcontainerindex);
                Container newcont = currentcont.add(Util.lowbits(val));
                if (newcont != currentcont) {
                    highLowContainer.setContainerAtIndex(currentcontainerindex, newcont);
                    currentcont = newcont;
                }
            } else {
                currentcontainerindex = - currentcontainerindex - 1;
                final ArrayContainer newac = new ArrayContainer();
                currentcont = newac.add(Util.lowbits(val));
                highLowContainer.insertNewKeyValueAt(currentcontainerindex, currenthb, currentcont);
            }
            j++;
        }
        for ( ; j < dat.length; ++j) {
            int val = dat[j];
            short newhb = Util.highbits(val);
            if (currenthb == newhb) {// easy case
                // this could be quite frequent
                Container newcont = currentcont.add(Util.lowbits(val));
                if (newcont != currentcont) {
                    highLowContainer.setContainerAtIndex(currentcontainerindex, newcont);
                    currentcont = newcont;
                }
            } else {
                currenthb = newhb;
                currentcontainerindex = highLowContainer.getIndex(currenthb);
                if (currentcontainerindex >= 0) {
                    currentcont = highLowContainer.getContainerAtIndex(currentcontainerindex);
                    Container newcont = currentcont.add(Util.lowbits(val));
                    if (newcont != currentcont) {
                        highLowContainer.setContainerAtIndex(currentcontainerindex, newcont);
                        currentcont = newcont;
                    }
                } else {
                    currentcontainerindex = - currentcontainerindex - 1;
                    final ArrayContainer newac = new ArrayContainer();
                    currentcont = newac.add(Util.lowbits(val));
                    highLowContainer.insertNewKeyValueAt(currentcontainerindex, currenthb, currentcont);
                }
            }
        }
    }
}
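The two-level lookup used by add can be modeled with a small, self-contained sketch. It is a simplified stand-in under stated assumptions, not the library's implementation: the class TwoLevelSketch is hypothetical, java.util.BitSet replaces the real Container types, a plain binary search replaces the hybrid search, and the doubling growth policy is assumed.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical model of the keys/values layout; BitSet stands in for Container.
final class TwoLevelSketch {
    private short[] keys = new short[4];     // high 16 bits, sorted as unsigned values
    private BitSet[] values = new BitSet[4]; // one container per key
    private int size = 0;

    // Returns the index of key, or -(insertionPoint) - 1 when the key is absent.
    private int getIndex(short key) {
        // Fast path: the key equals the last key (frequent when input is sorted).
        if (size > 0 && keys[size - 1] == key) {
            return size - 1;
        }
        int target = key & 0xFFFF;
        int lo = 0, hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int midVal = keys[mid] & 0xFFFF;
            if (midVal < target) lo = mid + 1;
            else if (midVal > target) hi = mid - 1;
            else return mid;
        }
        return -(lo + 1);
    }

    // Inserts a new key and its container at index i, keeping keys sorted.
    private void insertAt(int i, short key, BitSet container) {
        if (size == keys.length) { // assumed growth policy: double the capacity
            keys = Arrays.copyOf(keys, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        System.arraycopy(keys, i, keys, i + 1, size - i);
        System.arraycopy(values, i, values, i + 1, size - i);
        keys[i] = key;
        values[i] = container;
        size++;
    }

    void add(int x) {
        short hb = (short) (x >>> 16);
        int i = getIndex(hb);
        if (i < 0) {
            i = -i - 1; // decode the insertion point
            insertAt(i, hb, new BitSet());
        }
        values[i].set(x & 0xFFFF);
    }

    boolean contains(int x) {
        int i = getIndex((short) (x >>> 16));
        return i >= 0 && values[i].get(x & 0xFFFF);
    }

    int containerCount() {
        return size;
    }

    // Unsigned view of the key stored at index i.
    int keyAt(int i) {
        return keys[i] & 0xFFFF;
    }
}
```

Adding 1, 196613 (3 * 65536 + 5), and 65543 (65536 + 7) in that order leaves the keys as 0, 1, 3: the third value's key is inserted between the existing two rather than appended, which is the role of the -index - 1 decoding in the excerpt.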
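- Promoting an ArrayContainer to a BitmapContainer
When a new element is added to an ArrayContainer whose cardinality has already reached the default maximum size of 4096 (2^12), the container is promoted to a BitmapContainer.
A BitmapContainer is essentially a long array whose length is fixed at 1024 (65536 / 64) at construction.
During promotion, the ArrayContainer's elements are added to the BitmapContainer one by one. Elements are grouped by the quotient of their unsigned value divided by 64 (0, 1, 2, ...). The elements of each group are OR-ed bit by bit into a single long, which is stored at that index of the BitmapContainer's long array.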
public final class ArrayContainer extends Container implements Cloneable {
    private static final int DEFAULT_INIT_SIZE = 4;
    static final int DEFAULT_MAX_SIZE = 4096;

    @Override
    public Container add(final short x) {
        int loc = Util.unsignedBinarySearch(content, 0, cardinality, x);
        if (loc < 0) {
            // Transform the ArrayContainer to a BitmapContainer
            // when cardinality = DEFAULT_MAX_SIZE
            if (cardinality >= DEFAULT_MAX_SIZE) {
                BitmapContainer a = this.toBitmapContainer();
                a.add(x);
                return a;
            }
            if (cardinality >= this.content.length) {
                increaseCapacity();
            }
            // insertion : shift the elements > x by one position to
            // the right
            // and put x in it's appropriate place
            System.arraycopy(content, -loc - 1, content, -loc, cardinality + loc + 1);
            content[-loc - 1] = x;
            ++cardinality;
        }
        return this;
    }

    @Override
    public BitmapContainer toBitmapContainer() {
        BitmapContainer bc = new BitmapContainer();
        bc.loadData(this);
        return bc;
    }
}
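The sorted insertion performed by ArrayContainer.add can be tried in isolation with the sketch below. It is an illustration under assumptions: the class SortedShortArraySketch is hypothetical, its unsignedBinarySearch is a plain reimplementation rather than the library's Util method, the doubling growth policy is assumed, and promotion to a bitmap is left out.

```java
import java.util.Arrays;

// Hypothetical model of a sorted short array that rejects duplicates.
final class SortedShortArraySketch {
    short[] content = new short[4];
    int cardinality = 0;

    // Binary search over a[from, to) comparing shorts as unsigned values.
    // Returns the index of k, or -(insertionPoint) - 1 when k is absent.
    static int unsignedBinarySearch(short[] a, int from, int to, short k) {
        int target = k & 0xFFFF;
        int lo = from, hi = to - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int v = a[mid] & 0xFFFF;
            if (v < target) lo = mid + 1;
            else if (v > target) hi = mid - 1;
            else return mid;
        }
        return -(lo + 1);
    }

    // Returns true if x was newly inserted, false if it was already present.
    boolean add(short x) {
        int loc = unsignedBinarySearch(content, 0, cardinality, x);
        if (loc >= 0) {
            return false;
        }
        if (cardinality == content.length) { // assumed growth policy: double
            content = Arrays.copyOf(content, content.length * 2);
        }
        int pos = -loc - 1;
        // Shift the elements greater than x one slot to the right.
        System.arraycopy(content, pos, content, pos + 1, cardinality - pos);
        content[pos] = x;
        cardinality++;
        return true;
    }
}
```

The shift length cardinality - pos is the same quantity as cardinality + loc + 1 in the excerpt, since pos is -loc - 1.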
The BitmapContainer class
public final class BitmapContainer extends Container implements Cloneable {
    protected static final int MAX_CAPACITY = 1 << 16;
    final long[] bitmap;
    int cardinality;

    public BitmapContainer() {
        this.cardinality = 0;
        this.bitmap = new long[MAX_CAPACITY / 64];
    }

    // Load the elements of an ArrayContainer into this BitmapContainer
    protected void loadData(final ArrayContainer arrayContainer) {
        this.cardinality = arrayContainer.cardinality;
        for (int k = 0; k < arrayContainer.cardinality; ++k) {
            final short x = arrayContainer.content[k];
            bitmap[Util.toIntUnsigned(x) / 64] |= (1L << x);
        }
    }
}
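The word and bit arithmetic in loadData can be checked with a standalone sketch. The class BitmapWordSketch and its methods are hypothetical and written only for illustration. One detail worth noting: for a long shift Java uses only the low 6 bits of the shift distance, so 1L << x selects bit (x mod 64) even when the short x is negative.

```java
// Hypothetical model of a 65536-bit bitmap stored as 1024 longs.
final class BitmapWordSketch {
    final long[] bitmap = new long[(1 << 16) / 64]; // 1024 words of 64 bits
    int cardinality = 0;

    // Index of the long that holds the bit for x (unsigned value / 64).
    static int wordIndex(short x) {
        return (x & 0xFFFF) / 64;
    }

    // Mask with a single bit set at position (unsigned value % 64).
    static long bitMask(short x) {
        return 1L << x; // only the low 6 bits of x are used as the shift distance
    }

    // Sets the bit for x; returns true if it was not set before.
    boolean add(short x) {
        int w = wordIndex(x);
        long before = bitmap[w];
        long after = before | bitMask(x);
        bitmap[w] = after;
        if (after != before) {
            cardinality++;
            return true;
        }
        return false;
    }

    boolean contains(short x) {
        return (bitmap[wordIndex(x)] & bitMask(x)) != 0;
    }

    // Loads the first `count` values of an array, as a promotion would.
    void load(short[] valuesToLoad, int count) {
        for (int k = 0; k < count; k++) {
            add(valuesToLoad[k]);
        }
    }
}
```

For example, the values 130 and 131 both fall into word 2 (130 / 64 = 2) and set bits 2 and 3 of that word, so the word holds 12 after both are added.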
Source: https://blog.csdn.net/md_2014/article/details/111568110