Installing HBase and Basic Usage


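I have recently been learning HBase and worked through it once, from installation to basic operations. Before installing HBase, Hadoop and ZooKeeper should already be installed and configured (strictly speaking, only a distributed deployment needs them; the standalone mode configured below stores its data on the local filesystem and runs its own ZooKeeper). I won't go into that setup here and may cover it later.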

Installing HBase

  1. Download HBase
wget http://mirrors.hust.edu.cn/apache/hbase/stable/hbase-1.2.6.1-bin.tar.gz
tar -xzvf hbase-1.2.6.1-bin.tar.gz

Configuring HBase in standalone mode

  1. Set the JAVA_HOME environment variable in hbase-env.sh
cd /home/aozhang/Documents/company/app/hbase-1.2.6.1/conf
vim hbase-env.sh
export JAVA_HOME=/home/aozhang/Documents/company/app/jdk1.8.0_181
  1. Edit conf/hbase-site.xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///home/aozhang/Documents/company/data/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/aozhang/Documents/company/data/hbase/zookeeper</value>
  </property>
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
    <description>
      Controls whether HBase will check for stream capabilities (hflush/hsync).

      Disable this if you intend to run on LocalFileSystem, denoted by a rootdir
      with the 'file://' scheme, but be mindful of the NOTE below.

      WARNING: Setting this to false blinds you to potential data loss and
      inconsistent system state in the event of process and/or node failures. If
      HBase is complaining of an inability to use hsync or hflush it's most
      likely not a false positive.
    </description>
  </property>
</configuration>
  1. Start HBase
cd /home/aozhang/Documents/company/app/hbase-1.2.6.1/bin
./start-hbase.sh
  1. Check that the web UI loads correctly (in HBase 1.x the Master UI defaults to port 16010)

A major pitfall when setting up OpenTSDB on Ubuntu 16: the hosts file

127.0.1.1	aozhang-Latitude-5290
192.168.31.111	aozhang-Latitude-5290
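
The two entries above map the same hostname to a loopback address (127.0.1.1, which Ubuntu adds by default) and to the LAN address. A likely reading of the pitfall is that HBase ends up advertising the loopback address, so clients such as OpenTSDB cannot reach it. The sketch below reports what the local hostname resolves to. The class and method names are illustrative and not part of HBase.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class HostsCheck {
    /** Returns true when the host name or IP literal resolves to a loopback address (127.x.x.x or ::1). */
    static boolean resolvesToLoopback(String hostOrIp) {
        try {
            return InetAddress.getByName(hostOrIp).isLoopbackAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve " + hostOrIp, e);
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        // Resolve this machine's own hostname the same way a JVM-based server would.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println(local.getHostName() + " -> " + local.getHostAddress());
        if (local.isLoopbackAddress()) {
            System.out.println("Warning: the hostname resolves to a loopback address; check /etc/hosts");
        }
    }
}
```

If the warning is printed, the hostname line pointing at 127.0.1.1 is probably the one to change or remove.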

HBase Master fails to start after the server loses power

Check the Master logs and delete the data that the log reports as failing.

cd ../data/hbase/WALs/
rm -rf aozhang-latitude-5290,16201,1532424355724-splitting
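
Removing a directory under WALs discards any edits in it that had not been flushed yet, so it is worth looking at what is there before running rm -rf. The following is a cautious sketch that only lists the directories whose names end in "-splitting". The class name and the default "WALs" path are assumptions for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SplittingWalFinder {
    /** Lists the directories directly under walsDir whose names end with "-splitting". */
    static List<String> findSplittingDirs(Path walsDir) {
        List<String> names = new ArrayList<>();
        // The glob is applied to the file name of each entry in walsDir.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(walsDir, "*-splitting")) {
            for (Path entry : stream) {
                if (Files.isDirectory(entry)) {
                    names.add(entry.getFileName().toString());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Pass the WALs directory as the first argument; "WALs" is only a placeholder default.
        Path walsDir = Paths.get(args.length > 0 ? args[0] : "WALs");
        if (!Files.isDirectory(walsDir)) {
            System.out.println("Not a directory: " + walsDir);
            return;
        }
        for (String name : findSplittingDirs(walsDir)) {
            System.out.println(name);
        }
    }
}
```

Nothing is deleted by this code; it only prints candidates so they can be reviewed or moved aside first.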

Error when running the table-creation script in HBase: Compression algorithm 'lzo' previously failed test.

Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.6.1, rUnknown, Sun Jun  3 23:19:26 CDT 2018

create 'tsdb-uid',
  {NAME => 'id', COMPRESSION => 'LZO', BLOOMFILTER => 'ROW'},
  {NAME => 'name', COMPRESSION => 'LZO', BLOOMFILTER => 'ROW'}

ERROR: org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzoCodec Set hbase.table.sanity.checks to false at conf or table descriptor if you want to bypass sanity checks
	at org.apache.hadoop.hbase.master.HMaster.warnOrThrowExceptionForFailure(HMaster.java:1754)
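
As the error message suggests, the sanity check can be bypassed by setting this property in conf/hbase-site.xml:
<property>
  <name>hbase.table.sanity.checks</name>
  <value>false</value>
</property>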

Running the table-creation script for OpenTSDB (use the script that matches your OpenTSDB version)

/home/aozhang/Documents/company/app/hbase-1.2.6.1/create_table.sh

Error when starting OpenTSDB

2018-07-27 21:00:10.616  WARN 3973 --- [e I/O Worker #1] org.hbase.async.HBaseClient              : Probe Exists(table="tsdb-uid", key=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 58, 65, 115, 121, 110, 99, 72, 66, 97, 115, 101, 126, 112, 114, 111, 98, 101, 126, 60, 59, 95, 60, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 58, 65, 115, 121, 110, 99, 72, 66, 97, 115, 101, 126, 112, 114, 111, 98, 101, 126, 60, 59, 95, 60], family=null, qualifiers=null, attempt=0, region=null) failed

org.hbase.async.NonRecoverableException: Too many attempts: Exists(table="tsdb-uid", key=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 58, 65, 115, 121, 110, 99, 72, 66, 97, 115, 101, 126, 112, 114, 111, 98, 101, 126, 60, 59, 95, 60, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 58, 65, 115, 121, 110, 99, 72, 66, 97, 115, 101, 126, 112, 114, 111, 98, 101, 126, 60, 59, 95, 60], family=null, qualifiers=null, attempt=11, region=null)
hbase(main):001:0> list
TABLE                                          
tsdb                                                        
tsdb-meta                                                            
tsdb-tree                                                           
tsdb-uid 
    cd /home/aozhang/Documents/company/app/opentsdb-2.3.1
    env COMPRESSION=NONE HBASE_HOME=/home/aozhang/Documents/company/app/hbase-1.2.6.1 ./src/create_table.sh

What is HBase?

HBase storage model

HBase operations

  1. Start the HBase shell
/home/aozhang/Documents/company/app/hbase-1.2.6.1/bin/hbase shell
# List all tables
hbase(main):001:0> list
TABLE  
tsdb  
tsdb-meta 
tsdb-tree 
tsdb-uid 
4 row(s) in 0.0300 seconds
=> ["tsdb", "tsdb-meta", "tsdb-tree", "tsdb-uid"]
# Show the status of HBase, for example the number of servers.
hbase(main):002:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 6.0000 average load
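  1. General commands
list: lists the existing tables
status: returns the status of the system, including details of the servers running on it
version: returns the version of HBase in use
table_help: explains how to use the table-reference commands
whoami: returns details of the current HBase user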

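Commands that operate on tables

Data manipulation language

The HBaseAdmin class

HBaseAdmin is the class that represents the administrative interface. It belongs to the org.apache.hadoop.hbase.client package and is used to perform administrative tasks. Use Connection.getAdmin() to obtain an admin instance.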

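  1. Create a new table
void createTable(HTableDescriptor desc)
  1. Create a new table with an initial set of empty regions defined by the given split keys
void createTable(HTableDescriptor desc, byte[][] splitKeys)
  1. Delete a column from a table (table name given as bytes)
void deleteColumn(byte[] tableName, String columnName)
  1. Delete a column from a table (table name given as a String)
void deleteColumn(String tableName, String columnName)
  1. Delete a table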
void deleteTable(String tableName)

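The HTableDescriptor class

This class holds the details of an HBase table. Its constructor and main method are:

  1. Constructor: builds a table descriptor for the given TableName object.
HTableDescriptor(TableName name)
  1. Adds a column family to the descriptor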
HTableDescriptor addFamily(HColumnDescriptor family)

Creating a table in HBase

create '<table name>','<column family>'
hbase(main):002:0> create 'emp','personal data','professional data'
Row key | personal data | professional data

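Disabling and deleting tables in HBase

  1. A table must be disabled before it can be deleted
disable_all "tsdb"
  1. Delete
drop_all "tsdb"
  1. Enable a table
enable "emp"
  1. Scan the table to verify it
scan 'emp'
  1. Check whether the table is enabled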
is_enabled 'emp'

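Describing and altering a table in HBase

hbase> describe 'table name'
  1. alter is the command used to change an existing table. With it you can change the maximum number of cell versions a column family keeps, set and unset table-scope attributes, and delete a column family from a table.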
hbase> alter 't1', NAME => 'f1', VERSIONS => 5
hbase(main):009:0> alter  'emp', NAME => 'personal data', VERSIONS => 5
Updating all regions with the new schema...
1/1 regions updated.
Done.
Unknown argument ignored: VERSIONS
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 3.8320 seconds
hbase>alter 't1', READONLY(option)

In the following example, the table emp is set to read-only.

hbase(main):015:0> alter 'emp', READONLY
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 1.9000 seconds
hbase(main):016:0> 
hbase> alter 'emp', METHOD => 'table_att_unset', NAME => 'MAX_FILESIZE'
hbase> alter 'emp', 'delete' => 'personal data'

HBase Table Exists

  1. Use the exists command to verify that a table exists
hbase(main):021:0> exists 'emp'
Table emp does exist 
0 row(s) in 0.0110 seconds

Dropping a table in HBase

  1. Use the drop command to delete a table. A table must be disabled before it can be dropped.
hbase(main):018:0> disable 'emp'
0 row(s) in 1.4580 seconds


hbase(main):019:0> drop 'emp'
0 row(s) in 0.3060 seconds
hbase(main):020:0> exists 'emp'
Table emp does not exist

0 row(s) in 0.0730 seconds
  1. drop_all drops every table whose name matches the given regex. Its syntax is:
hbase> drop_all 't.*'

Note: a table must be disabled before it can be dropped.
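
Because drop_all acts on every table the regex matches, it can help to preview the match first. The sketch below applies a regex to a list of table names with java.util.regex. It assumes the name has to match the pattern as a whole, and the class name and table list are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DropAllPreview {
    /** Returns the table names that match the regex as a whole (assumed full-match semantics). */
    static List<String> matching(String regex, List<String> tableNames) {
        Pattern pattern = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (String name : tableNames) {
            if (pattern.matcher(name).matches()) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList("emp", "tsdb", "tsdb-meta", "tsdb-tree", "tsdb-uid");
        // 't.*' covers every table whose name starts with t.
        System.out.println(matching("t.*", tables)); // prints [tsdb, tsdb-meta, tsdb-tree, tsdb-uid]
    }
}
```

Under that assumption, a pattern like 't.*' would take all four OpenTSDB tables along with it, while 'emp' would stay.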

Shutting down HBase

  1. Type exit to leave the shell.
hbase(main):021:0> exit
  1. Stop HBase
./bin/stop-hbase.sh

Creating data in HBase

  1. Create the table emp; if it already exists, drop it and recreate it
hbase(main):023:0> disable 'emp'
0 row(s) in 2.2420 seconds

hbase(main):024:0> drop 'emp' 
0 row(s) in 1.2430 seconds

hbase(main):026:0> create 'emp', 'personal data', 'professional data'
0 row(s) in 1.2390 seconds

=> Hbase::Table - emp
  1. Insert data with the put command. Its syntax is:
put '<table name>','row1','<colfamily:colname>','<value>'
hbase(main):028:0> put 'emp','1','personal data:name','raju'
0 row(s) in 0.0660 seconds
hbase(main):032:0> put 'emp','1','personal data:city','hyderabad'
0 row(s) in 0.0060 seconds
hbase(main):033:0> put 'emp','1','professional data:designation','manager'
0 row(s) in 0.0070 seconds
hbase(main):034:0> put 'emp','1','professional data:salary','50000'
0 row(s) in 0.0050 seconds
hbase(main):035:0> scan 'emp'
ROW                                      COLUMN+CELL
 1                                       column=personal data:city, timestamp=1533194702585, value=hyderabad 
 1                                       column=personal data:name, timestamp=1533194618563, value=raju 
 1                                       column=professional data:designation, timestamp=1533194765859, value=manager 
 1                                       column=professional data:salary, timestamp=1533194786444, value=50000 
1 row(s) in 0.0090 seconds
    // Assumes a field `configuration`, e.g. created with HBaseConfiguration.create().
    @Test
    public void insertRowData() {
        // Insert one row. try-with-resources closes both the table and the connection.
        TableName tableName = TableName.valueOf("emp");
        try (Connection connection = ConnectionFactory.createConnection(configuration);
             Table table = connection.getTable(tableName)) {
            Put put = new Put(Bytes.toBytes("row2"));
            put.addColumn(Bytes.toBytes("personal data"), Bytes.toBytes("city"), Bytes.toBytes("ravi"));
            put.addColumn(Bytes.toBytes("personal data"), Bytes.toBytes("name"), Bytes.toBytes("chengnai"));
            put.addColumn(Bytes.toBytes("professional data"), Bytes.toBytes("designation"), Bytes.toBytes("sr.engineer"));
            put.addColumn(Bytes.toBytes("professional data"), Bytes.toBytes("salary"), Bytes.toBytes("30,000"));
            table.put(put);
        } catch (IOException e) {
            // Also covers MasterNotRunningException and ZooKeeperConnectionException,
            // which are both IOExceptions.
            e.printStackTrace();
        }
    }

Updating data in HBase

put 'table name','row','Column family:column name','new value'
hbase(main):038:0> scan 'emp'
ROW                                      COLUMN+CELL 
 1                                       column=personal data:city, timestamp=1533194702585, value=hyderabad
 1                                       column=personal data:name, timestamp=1533194618563, value=raju  
 1                                       column=professional data:designation, timestamp=1533194765859, value=manager
 1                                       column=professional data:salary, timestamp=1533194786444, value=50000 
 row2                                    column=personal data:city, timestamp=1533195621053, value=ravi 
 row2                                    column=personal data:name, timestamp=1533195621053, value=chengnai
 row2                                    column=professional data:designation, timestamp=1533195621053, value=sr.engineer 
 row2                                    column=professional data:salary, timestamp=1533195621053, value=30,000 
2 row(s) in 0.0140 seconds
hbase(main):039:0> put 'emp','1','personal data:city','Delhi'
0 row(s) in 0.0070 seconds
hbase(main):040:0> scan 'emp'
ROW                                      COLUMN+CELL                                                                                                         
 1                                       column=personal data:city, timestamp=1533196196912, value=Delhi
 1                                       column=personal data:name, timestamp=1533194618563, value=raju 
 1                                       column=professional data:designation, timestamp=1533194765859, value=manager 
 1                                       column=professional data:salary, timestamp=1533194786444, value=50000  
 row2                                    column=personal data:city, timestamp=1533195621053, value=ravi  
 row2                                    column=personal data:name, timestamp=1533195621053, value=chengnai
 row2                                    column=professional data:designation, timestamp=1533195621053, value=sr.engineer 
 row2                                    column=professional data:salary, timestamp=1533195621053, value=30,000 
2 row(s) in 0.0150 seconds

Reading data in HBase

get '<table name>','row1'
hbase(main):043:0> get 'emp' , '1'
COLUMN                                   CELL 
 personal data:city                      timestamp=1533196196912, value=Delhi
 personal data:name                      timestamp=1533194618563, value=raju 
 professional data:designation           timestamp=1533194765859, value=manager
 professional data:salary                timestamp=1533194786444, value=50000 
4 row(s) in 0.0170 seconds
hbase> get 'table name', 'rowid', {COLUMN => 'column family:column name'}
get 'emp', '1', {COLUMN=>'personal data:name'}
hbase(main):045:0> get 'emp', 'row2', {COLUMN=>'personal data:name'}
COLUMN                                   CELL
 personal data:name                      timestamp=1533195621053, value=chengnai 
1 row(s) in 0.0030 seconds
    public void getData() {
        // Read row "row2" and print every family, qualifier, and version.
        // try-with-resources closes both the table and the connection.
        TableName tableName = TableName.valueOf("emp");
        try (Connection connection = ConnectionFactory.createConnection(configuration);
             Table table = connection.getTable(tableName)) {
            Get get = new Get(Bytes.toBytes("row2"));
            Result result = table.get(get);
            // family -> qualifier -> timestamp -> value
            NavigableMap<byte[], NavigableMap<byte[], NavigableMap<Long, byte[]>>> navigableMap = result.getMap();
            if (navigableMap == null) {
                // getMap() returns null when the row does not exist
                return;
            }
            for (Map.Entry<byte[], NavigableMap<byte[], NavigableMap<Long, byte[]>>> entry : navigableMap.entrySet()) {
                System.out.println("columnFamily:" + Bytes.toString(entry.getKey()));
                NavigableMap<byte[], NavigableMap<Long, byte[]>> map = entry.getValue();
                for (Map.Entry<byte[], NavigableMap<Long, byte[]>> en : map.entrySet()) {
                    System.out.print(Bytes.toString(en.getKey()) + "##");
                    NavigableMap<Long, byte[]> nm = en.getValue();
                    for (Map.Entry<Long, byte[]> me : nm.entrySet()) {
                        System.out.println("timestamp:" + me.getKey() + " value:" + Bytes.toString(me.getValue()));
                    }
                }
            }
        } catch (IOException e) {
            // Also covers MasterNotRunningException and ZooKeeperConnectionException
            e.printStackTrace();
        }
    }
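
The three nested maps in getData() are easier to follow with a small model. The sketch below uses plain Java collections, with Strings standing in for byte[], to mimic the family -> qualifier -> timestamp -> value layout of one row. The class and method names are hypothetical and nothing here calls HBase. Timestamps are sorted newest first, which is how I understand Result.getMap() to order versions. The model keeps every version for illustration, whereas a plain Get returns only the newest one unless more are requested.

```java
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

class CellMapModel {
    // family -> qualifier -> timestamp -> value, for a single row
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> families =
            new TreeMap<>();

    /** Stores one cell version; a later timestamp shadows earlier ones on read. */
    void put(String family, String qualifier, long timestamp, String value) {
        families
                .computeIfAbsent(family, f -> new TreeMap<String, NavigableMap<Long, String>>())
                .computeIfAbsent(qualifier, q -> new TreeMap<Long, String>(Comparator.<Long>reverseOrder()))
                .put(timestamp, value);
    }

    /** Returns the newest value of family:qualifier, or null when the column is absent. */
    String latest(String family, String qualifier) {
        NavigableMap<String, NavigableMap<Long, String>> qualifiers = families.get(family);
        if (qualifiers == null) {
            return null;
        }
        NavigableMap<Long, String> versions = qualifiers.get(qualifier);
        // Newest first, so the first entry is the current value.
        return versions == null ? null : versions.firstEntry().getValue();
    }

    public static void main(String[] args) {
        CellMapModel row1 = new CellMapModel();
        row1.put("personal data", "city", 1533194702585L, "hyderabad");
        row1.put("personal data", "city", 1533196196912L, "Delhi");
        row1.put("personal data", "name", 1533194618563L, "raju");
        System.out.println(row1.latest("personal data", "city")); // prints Delhi
    }
}
```

This also shows why an update in HBase is just another put: the new cell carries a later timestamp and is the one a read returns.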

Deleting data in HBase

delete '<table name>', '<row>', '<column name >', '<time stamp>'
hbase(main):048:0> delete 'emp', '1', 'professional data:salary',1533194786444
0 row(s) in 0.0260 seconds

hbase(main):049:0> scan 'emp'
ROW                                      COLUMN+CELL 
 1                                       column=personal data:city, timestamp=1533196196912, value=Delhi
 1                                       column=personal data:name, timestamp=1533194618563, value=raju 
 1                                       column=professional data:designation, timestamp=1533194765859, value=manager 
 row2                                    column=personal data:city, timestamp=1533196466515, value=ShangHai
 row2                                    column=personal data:name, timestamp=1533195621053, value=chengnai
 row2                                    column=professional data:designation, timestamp=1533195621053, value=sr.engineer
 row2                                    column=professional data:salary, timestamp=1533195621053, value=30,000 
2 row(s) in 0.0230 seconds
hbase(main):050:0> deleteall 'emp','1'
0 row(s) in 0.0240 seconds
    @Test
    public void deleteData() {
        // Build a Delete for row "1". try-with-resources closes both the table and the connection.
        TableName tableName = TableName.valueOf("emp");
        try (Connection connection = ConnectionFactory.createConnection(configuration);
             Table table = connection.getTable(tableName)) {
            Delete delete = new Delete(Bytes.toBytes("1"));
            // Delete one column of the column family. Don't be misled by the name: this is
            // the delete API, and addColumn adds the column to be deleted to the Delete object.
            delete.addColumn(Bytes.toBytes("professional data"), Bytes.toBytes("designation"));
            // Delete the whole column family
            delete.addFamily(Bytes.toBytes("professional data"));
            // Uncomment to actually apply the delete:
            // table.delete(delete);
        } catch (IOException e) {
            // Also covers MasterNotRunningException and ZooKeeperConnectionException
            e.printStackTrace();
        }
    }

Scanning in HBase

scan '<table name>'
    @Test
    public void scanHbase() {
        // Scan two columns of the table. try-with-resources closes the scanner,
        // the table, and the connection.
        TableName tableName = TableName.valueOf("emp");
        // Instantiating the Scan class
        Scan scan = new Scan();
        scan.addColumn(Bytes.toBytes("personal data"), Bytes.toBytes("city"));
        scan.addColumn(Bytes.toBytes("personal data"), Bytes.toBytes("name"));
        try (Connection connection = ConnectionFactory.createConnection(configuration);
             Table table = connection.getTable(tableName);
             // Getting the scan result
             ResultScanner scanner = table.getScanner(scan)) {
            // Reading values from scan result
            for (Result result : scanner) {
                System.out.println("Found row : " + result);
            }
        } catch (IOException e) {
            // Also covers MasterNotRunningException and ZooKeeperConnectionException
            e.printStackTrace();
        }
    }
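
A scan returns rows in row-key order, and row keys are compared as raw bytes rather than as numbers. That is why row 1 is listed before row2 in the scan output earlier in this post. The sketch below reproduces that ordering with plain Java. The class and method names are illustrative and it does not use the HBase client.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RowKeyOrder {
    /** Compares two keys byte by byte as unsigned values; a shorter prefix sorts first. */
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /** Returns a copy of the keys sorted the way a scan would visit them. */
    static List<String> sortedAsRowKeys(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        copy.sort((l, r) -> compareKeys(
                l.getBytes(StandardCharsets.UTF_8), r.getBytes(StandardCharsets.UTF_8)));
        return copy;
    }

    public static void main(String[] args) {
        // "10" sorts before "2" because the comparison is on bytes, not on numeric value.
        System.out.println(sortedAsRowKeys(Arrays.asList("row2", "2", "10", "1"))); // prints [1, 10, 2, row2]
    }
}
```

If numeric order matters, a common approach is to zero-pad numeric row keys to a fixed width.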

Count and truncate in HBase

count '<table name>'
hbase> truncate 'table name'

HBase security

hbase> grant <user> <permissions> [<table> [<column family> [<column qualifier>]]]
1. R - read permission
2. W - write permission
3. X - execute permission
4. C - create permission
5. A - admin permission

A user can be granted zero or more privileges from the set RWXCA.

hbase(main):018:0> grant 'Tutorialspoint', 'RWXCA'
hbase> revoke <user>
hbase(main):006:0> revoke 'Tutorialspoint'
hbase> user_permission 'tablename'
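
The permission string passed to grant is simply a sequence of the letters listed above. As a minimal sketch, the following plain-Java model turns such a string into a set of named actions. The enum and class here are hypothetical illustrations and are not the HBase security API.

```java
import java.util.EnumSet;
import java.util.Set;

class PermissionLetters {
    /** One entry per letter accepted in a grant permission string. */
    enum Action {
        READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A');

        final char letter;

        Action(char letter) {
            this.letter = letter;
        }
    }

    /** Parses a string such as "RW" or "RWXCA"; an empty string yields no permissions. */
    static Set<Action> parse(String letters) {
        EnumSet<Action> result = EnumSet.noneOf(Action.class);
        for (char c : letters.toCharArray()) {
            Action found = null;
            for (Action action : Action.values()) {
                if (action.letter == c) {
                    found = action;
                }
            }
            if (found == null) {
                throw new IllegalArgumentException("Unknown permission letter: " + c);
            }
            result.add(found);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("RW")); // prints [READ, WRITE]
    }
}
```

Parsing "RWXCA" yields all five actions, which corresponds to the grant 'Tutorialspoint', 'RWXCA' example above.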