
Apache HBase


Author: Lijb
Email: lijb1121@163.com
WeChat: ljb1121

HBase is a distributed, column-oriented, open-source database. The technology derives from the Google paper by Fay Chang, "Bigtable: A Distributed Storage System for Structured Data". Just as Bigtable builds on the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase is a subproject of the Apache Hadoop project. Unlike a typical relational database, it is suited to storing unstructured data, and it uses a column-based rather than a row-based model.

How are HBase and HDFS related?

Although the HDFS file system supports storing massive amounts of data, it is not good at managing individual records efficiently (query, update, delete, insert) and does not support random reads and writes over that data. HBase is a NoSQL database built on top of HDFS: it manages the data stored in HDFS efficiently, supports random reads and writes over massive data sets, and provides row-level data management.
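For example, with the Java client a single record can be written and then read straight back by row key, without scanning files (a minimal sketch, assuming a running HBase reachable through ZooKeeper on host centos and an existing table t_user with column family cf1; all names are illustrative):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RandomAccessDemo {
        public static void main(String[] args) throws IOException {
            Configuration config = HBaseConfiguration.create();
            config.set("hbase.zookeeper.quorum", "centos"); // illustrative host
            try (Connection conn = ConnectionFactory.createConnection(config);
                 Table table = conn.getTable(TableName.valueOf("t_user"))) {
                // row-level write: one Put addressed by row key
                Put put = new Put(Bytes.toBytes("row-1"));
                put.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("name"), Bytes.toBytes("zhangsan"));
                table.put(put);
                // row-level random read: fetch exactly that row back
                Result result = table.get(new Get(Bytes.toBytes("row-1")));
                System.out.println(Bytes.toString(result.getValue(Bytes.toBytes("cf1"), Bytes.toBytes("name"))));
            }
        }
    }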

Characteristics of the HBase Database

Row storage vs. column storage

Row storage: a row-oriented store keeps all the columns of a record physically together, which suits reading and writing whole records at a time.

Column storage: a column-oriented store groups data by column (in HBase, by column family), which suits scans and aggregations that touch only a few columns and lets sparse rows be stored cheaply.
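The column orientation shows up directly in the client API: every stored value is a cell addressed by (row key, column family, qualifier, timestamp), and absent columns simply have no cell. A small sketch that prints the coordinates of each cell in a row (assuming table is an open org.apache.hadoop.hbase.client.Table; names are illustrative):

    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.CellUtil;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    Result result = table.get(new Get(Bytes.toBytes("row-1")));
    if (!result.isEmpty()) {
        for (Cell cell : result.listCells()) {
            // each cell carries its own coordinates; sparse rows store nothing for missing columns
            System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "/"
                    + Bytes.toString(CellUtil.cloneFamily(cell)) + ":"
                    + Bytes.toString(CellUtil.cloneQualifier(cell)) + "@"
                    + cell.getTimestamp() + " = "
                    + Bytes.toString(CellUtil.cloneValue(cell)));
        }
    }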

HBase Environment Setup (single node)

    [root@CentOS ~]# tar -zxf zookeeper-3.4.6.tar.gz -C /usr/
    [root@CentOS ~]# vi /usr/zookeeper-3.4.6/conf/zoo.cfg
    tickTime=2000
    dataDir=/root/zkdata
    clientPort=2181
    [root@CentOS ~]# mkdir /root/zkdata
    [root@CentOS zookeeper-3.4.6]# ./bin/zkServer.sh start zoo.cfg
    JMX enabled by default
    Using config: /usr/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [root@CentOS zookeeper-3.4.6]# ./bin/zkServer.sh status zoo.cfg
    JMX enabled by default
    Using config: /usr/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: standalone
    [root@centos ~]# jps
    1612 SecondaryNameNode
    1348 NameNode
    1742 QuorumPeerMain //zookeeper
    1437 DataNode
    [root@centos ~]# tar -zxf hbase-1.2.4-bin.tar.gz -C /usr/
    [root@centos ~]# vi /usr/hbase-1.2.4/conf/hbase-site.xml
    <property>
                <name>hbase.rootdir</name>
                <value>hdfs://CentOS:9000/hbase</value>
    </property>
    <property>
                <name>hbase.cluster.distributed</name>
                <value>true</value>
    </property>
    <property>
                <name>hbase.zookeeper.quorum</name>
                <value>CentOS</value>
    </property>
    <property>
                <name>hbase.zookeeper.property.clientPort</name>
                <value>2181</value>
    </property>
    [root@centos ~]# vi /usr/hbase-1.2.4/conf/regionservers 
    
    centos
    
    [root@centos ~]# vi .bashrc
    
    HBASE_MANAGES_ZK=false
    HBASE_HOME=/usr/hbase-1.2.4
    HADOOP_HOME=/usr/hadoop-2.6.0
    JAVA_HOME=/usr/java/latest
    CLASSPATH=.
    PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin
    export JAVA_HOME
    export CLASSPATH
    export PATH
    export HADOOP_HOME
    export HBASE_HOME
    export HBASE_MANAGES_ZK
    
    [root@centos ~]# start-hbase.sh 
    [root@centos ~]# jps
    1612 SecondaryNameNode
    2102 HRegionServer  // handles the actual reads and writes of table data
    1348 NameNode
    2365 Jps
    1978 HMaster        // analogous to the NameNode: manages table metadata and the RegionServers
    1742 QuorumPeerMain
    1437 DataNode

The web UI is now available at http://centos:16010.

HBase Shell Commands

    [root@centos ~]# hbase shell
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/hbase-1.2.4/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    HBase Shell; enter 'help<RETURN>' for list of supported commands.
    Type "exit<RETURN>" to leave the HBase Shell
    Version 1.2.4, rUnknown, Wed Feb 15 18:58:00 CST 2017
    
    hbase(main):001:0> 
    hbase(main):001:0> status
    1 active master, 0 backup masters, 1 servers, 0 dead, 2.0000 average load
    hbase(main):006:0> version
    1.2.4, rUnknown, Wed Feb 15 18:58:00 CST 2017
    

Namespace operations (a namespace is analogous to a database)

    hbase(main):003:0> list_namespace
    NAMESPACE
    default
    hbase
    hbase(main):006:0> create_namespace 'baizhi',{'author'=>'zs'}
    0 row(s) in 0.3260 seconds
    hbase(main):004:0> list_namespace_tables 'hbase'
    TABLE                      
    meta  
    namespace
    hbase(main):008:0> describe_namespace 'baizhi'
    DESCRIPTION               
    {NAME => 'baizhi', author => 'zs'}
    1 row(s) in 0.0550 seconds
    hbase(main):010:0> alter_namespace 'baizhi',{METHOD => 'set','author'=> 'wangwu'}
    0 row(s) in 0.2520 seconds
    hbase(main):011:0> describe_namespace 'baizhi'
    DESCRIPTION            
    {NAME => 'baizhi', author => 'wangwu'}
    1 row(s) in 0.0030 seconds
    hbase(main):012:0> alter_namespace 'baizhi',{METHOD => 'unset',NAME => 'author'}
    0 row(s) in 0.0550 seconds
    hbase(main):013:0> describe_namespace 'baizhi'
    DESCRIPTION              
    {NAME => 'baizhi'}         
    1 row(s) in 0.0080 seconds
    hbase(main):020:0> drop_namespace 'baizhi'
    0 row(s) in 0.0730 seconds

HBase will not drop a namespace that still contains tables.
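So to drop a non-empty namespace, its tables must be removed first. This can be scripted through the Java Admin API (a sketch assuming an open org.apache.hadoop.hbase.client.Admin, set up as in the Java API section below; the namespace name is illustrative):

    // disable and delete every table in the namespace, then drop the namespace itself
    for (TableName t : admin.listTableNamesByNamespace("baizhi")) {
        if (!admin.isTableDisabled(t)) {
            admin.disableTable(t);
        }
        admin.deleteTable(t);
    }
    admin.deleteNamespace("baizhi");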

Table operations (DDL)

    hbase(main):023:0> create 't_user','cf1','cf2'
    0 row(s) in 1.2880 seconds
    
    => Hbase::Table - t_user
    hbase(main):024:0> create 'baizhi:t_user',{NAME=>'cf1',VERSIONS=>3},{NAME=>'cf2',TTL=>3600}
    0 row(s) in 1.2610 seconds
    
    => Hbase::Table - baizhi:t_user
    hbase(main):026:0> describe 'baizhi:t_user'
    Table baizhi:t_user is ENABLED                                                                                               
    baizhi:t_user                                                                                                                
    COLUMN FAMILIES DESCRIPTION                                                                                                  
    {NAME => 'cf1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION =
    > 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', B
    LOCKCACHE => 'true'}                                                                                                         
    {NAME => 'cf2', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION =
    > 'NONE', MIN_VERSIONS => '0', TTL => '3600 SECONDS (1 HOUR)', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY
     => 'false', BLOCKCACHE => 'true'}                                                                                           
    2 row(s) in 0.0240 seconds
    hbase(main):030:0> exists 't_user'
    Table t_user does exist                                                                                                      
    0 row(s) in 0.0250 seconds
    hbase(main):036:0> enable 't_user'
    0 row(s) in 0.0220 seconds
    
    hbase(main):037:0> is_enabled 't_user'
    true                                                                                                                   
    0 row(s) in 0.0090 seconds
    hbase(main):035:0> enable_all 't_.*'
    t_user                                                                                                                       
    Enable the above 1 tables (y/n)?
    y
    1 tables successfully enabled
    hbase(main):038:0> disable 't_user'
    0 row(s) in 2.2930 seconds
    
    hbase(main):039:0> drop 't_user'
    0 row(s) in 1.2670 seconds
    hbase(main):042:0> list 'baizhi:.*'
    TABLE                                                                                                                        
    baizhi:t_user                                                                                                                
    1 row(s) in 0.0050 seconds
    
    => ["baizhi:t_user"]
    hbase(main):043:0> list
    TABLE                                                                                                                        
    baizhi:t_user                                                                                                                
    1 row(s) in 0.0050 seconds
    hbase(main):002:0> t=get_table 'baizhi:t_user'
    0 row(s) in 0.0440 seconds
    hbase(main):008:0> alter 'baizhi:t_user',{ NAME => 'cf2', TTL => 60 }
    Updating all regions with the new schema...
    1/1 regions updated.
    Done.
    0 row(s) in 2.7000 seconds
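The same TTL change can be made from Java by replacing the column family descriptor (a sketch against the HBase 1.2 Admin API, assuming an open admin as in the Java API section below; note that modifyColumn swaps in the whole descriptor, so attributes not set here revert to their defaults):

    // equivalent of: alter 'baizhi:t_user',{ NAME => 'cf2', TTL => 60 }
    HColumnDescriptor cf2=new HColumnDescriptor("cf2");
    cf2.setTimeToLive(60); // seconds
    admin.modifyColumn(TableName.valueOf("baizhi:t_user"), cf2);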

Table DML operations

    hbase(main):010:0> put 'baizhi:t_user',1,'cf1:name','zhangsan'
    0 row(s) in 0.2060 seconds
    
    hbase(main):011:0> t = get_table 'baizhi:t_user'
    0 row(s) in 0.0010 seconds
    
    hbase(main):012:0> t.put 1,'cf1:age','18'
    0 row(s) in 0.0500 seconds
    hbase(main):017:0> get 'baizhi:t_user',1
    COLUMN                           CELL     
     cf1:age                         timestamp=1536996547967, value=21   
     cf1:name                        timestamp=1536996337398, value=zhangsan  
    2 row(s) in 0.0680 seconds
    hbase(main):019:0> get 'baizhi:t_user',1,{COLUMN =>'cf1', VERSIONS=>10}
    COLUMN                           CELL   
     cf1:age                         timestamp=1536996547967, value=21 
     cf1:age                         timestamp=1536996542980, value=20 
     cf1:age                         timestamp=1536996375890, value=18 
     cf1:name                        timestamp=1536996337398, value=zhangsan 
    4 row(s) in 0.0440 seconds
    
    hbase(main):020:0> get 'baizhi:t_user',1,{COLUMN =>'cf1:age', VERSIONS=>10}
    COLUMN                           CELL                
     cf1:age                         timestamp=1536996547967, value=21 
     cf1:age                         timestamp=1536996542980, value=20 
     cf1:age                         timestamp=1536996375890, value=18   
    3 row(s) in 0.0760 seconds
    
    hbase(main):021:0> get 'baizhi:t_user',1,{COLUMN =>'cf1:age', TIMESTAMP => 1536996542980 }
    COLUMN                           CELL 
     cf1:age                         timestamp=1536996542980, value=20  
    1 row(s) in 0.0260 seconds
    
    hbase(main):025:0> get 'baizhi:t_user',1,{TIMERANGE => [1536996375890,1536996547967]}
    COLUMN                           CELL  
     cf1:age                         timestamp=1536996542980, value=20 
    1 row(s) in 0.0480 seconds
    
    hbase(main):026:0> get 'baizhi:t_user',1,{TIMERANGE => [1536996375890,1536996547967],VERSIONS=>10}
    COLUMN                           CELL
     cf1:age                         timestamp=1536996542980, value=20 
     cf1:age                         timestamp=1536996375890, value=18 
    2 row(s) in 0.0160 seconds
    hbase(main):004:0> scan 'baizhi:t_user'
    ROW                              COLUMN+CELL   
     1                               column=cf1:age, timestamp=1536996547967, value=21
     1                               column=cf1:height, timestamp=1536997284682, value=170
     1                               column=cf1:name, timestamp=1536996337398, value=zhangsan
     1                               column=cf1:salary, timestamp=1536997158586, value=15000  
     1                               column=cf1:weight, timestamp=1536997311001, value=\x00\x00\x00\x00\x00\x00\x00\x05          
     2                               column=cf1:age, timestamp=1536997566506, value=18 
     2                               column=cf1:name, timestamp=1536997556491, value=lisi  
    2 row(s) in 0.0470 seconds
    hbase(main):009:0> scan 'baizhi:t_user', {STARTROW => '1',LIMIT=>1}
    ROW                              COLUMN+CELL 
     1                               column=cf1:age, timestamp=1536996547967, value=21
     1                               column=cf1:height, timestamp=1536997284682, value=170 
     1                               column=cf1:name, timestamp=1536996337398, value=zhangsan
     1                               column=cf1:salary, timestamp=1536997158586, value=15000
     1                               column=cf1:weight, timestamp=1536997311001, value=\x00\x00\x00\x00\x00\x00\x00\x05          
    1 row(s) in 0.0280 seconds
    
    hbase(main):011:0> scan 'baizhi:t_user', {COLUMNS=>'cf1:age',TIMESTAMP=>1536996542980}
    ROW                              COLUMN+CELL 
     1                               column=cf1:age, timestamp=1536996542980, value=20
    1 row(s) in 0.0330 seconds
    hbase(main):013:0> scan 'baizhi:t_user', {COLUMNS=>'cf1:age',VERSIONS=>3}
    ROW                              COLUMN+CELL  
     1                               column=cf1:age, timestamp=1536996547967, value=21
     1                               column=cf1:age, timestamp=1536996542980, value=20
     1                               column=cf1:age, timestamp=1536996375890, value=18 
     2                               column=cf1:age, timestamp=1536997566506, value=18                                           
    2 row(s) in 0.0150 seconds
    
    hbase(main):014:0> delete 'baizhi:t_user',1,'cf1:age',1536996542980
    0 row(s) in 0.0920 seconds
    
    hbase(main):015:0> scan 'baizhi:t_user', {COLUMNS=>'cf1:age',VERSIONS=>3}
    ROW                              COLUMN+CELL 
     1                               column=cf1:age, timestamp=1536996547967, value=21
     2                               column=cf1:age, timestamp=1536997566506, value=18                                           
    2 row(s) in 0.0140 seconds
    
    hbase(main):016:0> delete 'baizhi:t_user',1,'cf1:age'
    0 row(s) in 0.0170 seconds
    
    hbase(main):017:0> scan 'baizhi:t_user', {COLUMNS=>'cf1:age',VERSIONS=>3}
    ROW                              COLUMN+CELL     
     2                               column=cf1:age, timestamp=1536997566506, value=18                                           
    1 row(s) in 0.0170 seconds
    
    hbase(main):019:0> deleteall 'baizhi:t_user',1
    0 row(s) in 0.0200 seconds
    
    hbase(main):020:0>  get 'baizhi:t_user',1
    COLUMN                           CELL 
    0 row(s) in 0.0200 seconds
    hbase(main):022:0> truncate 'baizhi:t_user'
    Truncating 'baizhi:t_user' table (it may take a while):
     - Disabling table...
     - Truncating table...
    0 row(s) in 4.0040 seconds
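The shell's truncate can be reproduced with the Admin API; the table must be disabled first, and it comes back enabled and empty (a sketch, assuming an open admin as in the Java API section below):

    TableName tname=TableName.valueOf("baizhi:t_user");
    admin.disableTable(tname);
    admin.truncateTable(tname, false); // false: do not preserve the existing region splits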

HBase Java API

Add the client-side Maven dependencies, matching the installed HBase version:

    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>1.2.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-common</artifactId>
        <version>1.2.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-protocol</artifactId>
        <version>1.2.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-server</artifactId>
        <version>1.2.4</version>
    </dependency>
    
    // shared Connection/Admin setup used by all of the snippets below
    private Connection conn;
    private Admin admin;
    @Before
    public void before() throws IOException {
        Configuration config= HBaseConfiguration.create();
        //Both HMaster and the HRegionServers register themselves in ZooKeeper, so the client only needs the quorum address
        config.set("hbase.zookeeper.quorum","centos");
        conn= ConnectionFactory.createConnection(config);
    
        admin=conn.getAdmin();
    }
    @After
    public void after() throws IOException {
        admin.close();
        conn.close();
    }
    // create a namespace: create_namespace 'zpark',{'author'=>'zhangsan'}
    NamespaceDescriptor nd=NamespaceDescriptor.create("zpark")
        .addConfiguration("author","zhangsan")
        .build();
    admin.createNamespace(nd);
    //create 'zpark:t_user',{NAME=>'cf1',VERSIONS=>3},{NAME=>'cf2',TTL=>10}
    TableName tname=TableName.valueOf("zpark:t_user");
    // table descriptor
    HTableDescriptor t_user=new HTableDescriptor(tname);
    
    // define the column families
    HColumnDescriptor cf1=new HColumnDescriptor("cf1");
    cf1.setMaxVersions(3);
    
    HColumnDescriptor cf2=new HColumnDescriptor("cf2");
    cf2.setTimeToLive(10);
    
    // attach the column families to the table
    t_user.addFamily(cf1);
    t_user.addFamily(cf2);
    
    admin.createTable(t_user);

    // insert 1000 rows, one Put per call
    TableName tname=TableName.valueOf("zpark:t_user");
    Table t_user = conn.getTable(tname);
    
    String[] company={"www.baizhi.com","www.sina.com"};
    for(int i=0;i<1000;i++){
        String com=company[new Random().nextInt(2)];
        String rowKey=com;
        if(i<10){
            rowKey+=":00"+i;
        }else if(i<100){
            rowKey+=":0"+i;
        }else if(i<1000){
            rowKey+=":"+i;
        }
        Put put=new Put(rowKey.getBytes());
        put.addColumn("cf1".getBytes(),"name".getBytes(),("user"+i).getBytes());
        put.addColumn("cf1".getBytes(),"age".getBytes(), Bytes.toBytes(i));
        put.addColumn("cf1".getBytes(),"salary".getBytes(),Bytes.toBytes(5000+1000*i));
        put.addColumn("cf1".getBytes(),"company".getBytes(),com.getBytes());
    
        t_user.put(put);
    }
    t_user.close();

    // bulk insert via BufferedMutator (buffers Puts client-side and flushes them in batches)
    TableName tname=TableName.valueOf("zpark:t_user");
    String[] company={"www.baizhi.com","www.sina.com"};
    BufferedMutator mutator=conn.getBufferedMutator(tname);
    for(int i=0;i<1000;i++){
        String com=company[new Random().nextInt(2)];
        String rowKey=com;
        if(i<10){
            rowKey+=":00"+i;
        }else if(i<100){
            rowKey+=":0"+i;
        }else if(i<1000){
            rowKey+=":"+i;
        }
        Put put=new Put(rowKey.getBytes());
        put.addColumn("cf1".getBytes(),"name".getBytes(),("user"+i).getBytes());
        put.addColumn("cf1".getBytes(),"age".getBytes(), Bytes.toBytes(i));
        put.addColumn("cf1".getBytes(),"salary".getBytes(),Bytes.toBytes(5000+1000*i));
        put.addColumn("cf1".getBytes(),"company".getBytes(),com.getBytes());
        mutator.mutate(put);
    }
    mutator.close();

    // update a cell: a Put on an existing row key writes a new version
    TableName tname=TableName.valueOf("zpark:t_user");
    Table t_user = conn.getTable(tname);
    
    Put put=new Put("www.baizhi.com:000".getBytes());
    put.addColumn("cf1".getBytes(),"name".getBytes(),("zhangsan").getBytes());
    
    t_user.put(put);
    t_user.close();

    // read a single row with Get
    TableName tname=TableName.valueOf("zpark:t_user");
    Table t_user = conn.getTable(tname);
    
    Get get=new Get("www.sina.com:002".getBytes());
    //a Result represents one row, covering n cells
    Result result = t_user.get(get);
    
    String name = Bytes.toString(result.getValue("cf1".getBytes(), "name".getBytes()));
    Integer age = Bytes.toInt(result.getValue("cf1".getBytes(), "age".getBytes()));
    Integer salary = Bytes.toInt(result.getValue("cf1".getBytes(), "salary".getBytes()));
    System.out.println(name+","+age+","+salary);
    
    t_user.close();

    // scan with a FilterList: MUST_PASS_ONE = logical OR of the member filters
    TableName tname=TableName.valueOf("zpark:t_user");
    Table t_user = conn.getTable(tname);
    
    Scan scan=new Scan();
    // scan.setStartRow("www.baizhi.com:000".getBytes());
    // scan.setStopRow("www.taizhi.com:020".getBytes());
    Filter filter1=new PrefixFilter("www.baizhi.com:00".getBytes());
    Filter filter2=new PrefixFilter("www.sina.com:00".getBytes());
    FilterList filter=new FilterList(FilterList.Operator.MUST_PASS_ONE,filter1,filter2);
    scan.setFilter(filter);
    
    ResultScanner resultScanner = t_user.getScanner(scan);
    for (Result result : resultScanner) {
        String rowKey=Bytes.toString(result.getRow());
        String name = Bytes.toString(result.getValue("cf1".getBytes(), "name".getBytes()));
        Integer age = Bytes.toInt(result.getValue("cf1".getBytes(), "age".getBytes()));
        Integer salary = Bytes.toInt(result.getValue("cf1".getBytes(), "salary".getBytes()));
        System.out.println(rowKey+" => "+ name+","+age+","+salary);
    }
    t_user.close();

    // MUST_PASS_ALL = logical AND; no row key matches both prefixes, so this scan returns nothing
    TableName tname=TableName.valueOf("zpark:t_user");
    Table t_user = conn.getTable(tname);
    
    Scan scan=new Scan();
    // scan.setStartRow("www.baizhi.com:000".getBytes());
    // scan.setStopRow("www.taizhi.com:020".getBytes());
    Filter filter1=new PrefixFilter("www.baizhi.com:00".getBytes());
    Filter filter2=new PrefixFilter("www.sina.com:00".getBytes());
    FilterList filter=new FilterList(FilterList.Operator.MUST_PASS_ALL,filter1,filter2);
    scan.setFilter(filter);
    
    ResultScanner resultScanner = t_user.getScanner(scan);
    for (Result result : resultScanner) {
        String rowKey=Bytes.toString(result.getRow());
        String name = Bytes.toString(result.getValue("cf1".getBytes(), "name".getBytes()));
        Integer age = Bytes.toInt(result.getValue("cf1".getBytes(), "age".getBytes()));
        Integer salary = Bytes.toInt(result.getValue("cf1".getBytes(), "salary".getBytes()));
        System.out.println(rowKey+" => "+ name+","+age+","+salary);
    }
    t_user.close();
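The snippets above cover put, get, and scan but not deletes. A minimal sketch of the Delete API, mirroring the shell's delete/deleteall (assuming the same conn as above; row key and column names are illustrative):

    TableName tname=TableName.valueOf("zpark:t_user");
    Table t_user = conn.getTable(tname);

    // delete the newest version of one cell (addColumns(...) would delete all versions)
    Delete delete=new Delete("www.sina.com:002".getBytes());
    delete.addColumn("cf1".getBytes(),"age".getBytes());
    t_user.delete(delete);

    // delete an entire row, like the shell's deleteall
    t_user.delete(new Delete("www.sina.com:002".getBytes()));
    t_user.close();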

HBase and MapReduce Integration

TableMapReduceUtil wires an HBase table in as a MapReduce job's input and output. The job below scans 'zpark:t_user', aggregates per company the record count and the total/average/maximum/minimum salary, and writes one result row per company into 'zpark:t_user_count' (this output table must already exist, with column family cf1).

    public class CustomJobSubmitter extends Configured implements Tool {
        public int run(String[] args) throws Exception {
    
            //创建Job
            Configuration config = HBaseConfiguration.create(getConf());
            Job job = Job.getInstance(config);
        job.setJarByClass(CustomJobSubmitter.class);     // class that contains mapper
    
            //设置输入、输出格式
            job.setInputFormatClass(TableInputFormat.class);
            job.setOutputFormatClass(TableOutputFormat.class);
    
            TableMapReduceUtil.initTableMapperJob(
                    "zpark:t_user",
                    new Scan(),
                    UserMapper.class,
                    Text.class,
                CountWritable.class,   // must match the mapper's output value type
                    job
            );
            TableMapReduceUtil.initTableReducerJob(
                    "zpark:t_user_count",
                    UserReducer.class,
                    job
            );
            job.setCombinerClass(UserCombiner.class);
            job.waitForCompletion(true);
            
            
            return 0;
        }
    
        public static void main(String[] args) throws Exception {
        ToolRunner.run(new CustomJobSubmitter(),args);
        }
    public static class UserMapper extends TableMapper<Text, CountWritable>{
        private Text k=new Text();
        @Override
        protected void map(ImmutableBytesWritable key, Result value, Context context) throws IOException, InterruptedException {
            String company= Bytes.toString(value.getValue("cf1".getBytes(),"company".getBytes()));
            double salary= Bytes.toInt(value.getValue("cf1".getBytes(),"salary".getBytes()))*1.0;
            k.set(company);
            // one record: count=1, total=max=min=salary
            context.write(k,new CountWritable(1,salary,salary,salary));
        }
    }
    public static class UserCombiner extends Reducer<Text, CountWritable,Text, CountWritable>{
        @Override
        protected void reduce(Text key, Iterable<CountWritable> values, Context context) throws IOException, InterruptedException {
            int total=0;
            double totalSalary=0.0;
            double maxSalary=0.0;
            double minSalary=Integer.MAX_VALUE;
            for (CountWritable value : values) {
                totalSalary+=value.getTotalSalary();
                total+=value.getTotal();
                if(minSalary>value.getMinSalary()){
                    minSalary=value.getMinSalary();
                }
                if(maxSalary<value.getMaxSalary()){
                    maxSalary=value.getMaxSalary();
                }
            }
            // pre-aggregate per company to cut down the data shuffled to the reducer
            context.write(key,new CountWritable(total,totalSalary,maxSalary,minSalary));
        }
    }
    public static class UserReducer extends TableReducer<Text, CountWritable, NullWritable>{
        @Override
        protected void reduce(Text key, Iterable<CountWritable> values, Context context) throws IOException, InterruptedException {
            int total=0;
            double totalSalary=0.0;
            double avgSalary=0.0;
            double maxSalary=0.0;
            double minSalary=Integer.MAX_VALUE;
            for (CountWritable value : values) {
                totalSalary+=value.getTotalSalary();
                total+=value.getTotal();
                if(minSalary>value.getMinSalary()){
                    minSalary=value.getMinSalary();
                }
                if(maxSalary<value.getMaxSalary()){
                    maxSalary=value.getMaxSalary();
                }
            }
            avgSalary=totalSalary/total;

            // Text.getBytes() may return a padded buffer; copy exactly the key's bytes
            Put put=new Put(Bytes.toBytes(key.toString()));
            put.addColumn("cf1".getBytes(),"total".getBytes(),(total+"").getBytes());
            put.addColumn("cf1".getBytes(),"totalSalary".getBytes(),(totalSalary+"").getBytes());
            put.addColumn("cf1".getBytes(),"maxSalary".getBytes(),(maxSalary+"").getBytes());
            put.addColumn("cf1".getBytes(),"minSalary".getBytes(),(minSalary+"").getBytes());
            put.addColumn("cf1".getBytes(),"avgSalary".getBytes(),(avgSalary+"").getBytes());

            context.write(NullWritable.get(),put);
        }
    }
    }
    
    public class CountWritable implements Writable {
        int total=0;
        double totalSalary=0.0;
        double maxSalary=0.0;
        double minSalary=Integer.MAX_VALUE;

        public CountWritable(int total, double totalSalary, double maxSalary, double minSalary) {
            this.total = total;
            this.totalSalary = totalSalary;
            this.maxSalary = maxSalary;
            this.minSalary = minSalary;
        }

        public CountWritable() {
        }

        public void write(DataOutput out) throws IOException {
            out.writeInt(total);
            out.writeDouble(totalSalary);
            out.writeDouble(maxSalary);
            out.writeDouble(minSalary);
        }

        public void readFields(DataInput in) throws IOException {
            total=in.readInt();
            totalSalary=in.readDouble();
            maxSalary=in.readDouble();
            minSalary=in.readDouble();
        }
        //....
    }
        

HBase Architecture

HBase macro architecture

RegionServer architecture

Region architecture

References:
http://www.blogjava.net/DLevin/archive/2015/08/22/426877.html
http://www.blogjava.net/DLevin/archive/2015/08/22/426950.html

HBase Cluster Setup

The steps below assume an HA HDFS cluster exposing the nameservice mycluster and a ZooKeeper ensemble on CentOSA, CentOSB, and CentOSC; the prompt [root@CentOSX ~]# denotes a command run on every node.

    [root@CentOSX ~]# tar -zxf hbase-1.2.4-bin.tar.gz -C /usr/
    [root@CentOSX ~]# vi /usr/hbase-1.2.4/conf/hbase-site.xml
    
    <property>
                <name>hbase.rootdir</name>
                <value>hdfs://mycluster/hbase</value>
    </property>
    <property>
                <name>hbase.cluster.distributed</name>
                <value>true</value>
    </property>
    <property>
                <name>hbase.zookeeper.quorum</name>
                <value>CentOSA,CentOSB,CentOSC</value>
    </property>
    <property>
                <name>hbase.zookeeper.property.clientPort</name>
                <value>2181</value>
    </property>
    [root@CentOSX ~]# vi /usr/hbase-1.2.4/conf/regionservers
    CentOSA
    CentOSB
    CentOSC
    [root@CentOSX ~]# vi .bashrc
    HBASE_MANAGES_ZK=false
    HBASE_HOME=/usr/hbase-1.2.4
    HADOOP_HOME=/usr/hadoop-2.6.0
    JAVA_HOME=/usr/java/latest
    CLASSPATH=.
    PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin
    export JAVA_HOME
    export CLASSPATH
    export PATH
    export HADOOP_HOME
    export HBASE_HOME
    export HBASE_MANAGES_ZK
    
    [root@CentOSX ~]# source .bashrc
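With the configuration in place on every node, start ZooKeeper and HDFS first, then launch HBase from one node; start-hbase.sh starts a local HMaster and, assuming passwordless SSH between the nodes, a RegionServer on every host listed in regionservers:

    [root@CentOSA ~]# start-hbase.sh
    [root@CentOSX ~]# jps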
