CSAPP

The Apache HBase Book: Study Notes (Part 1)


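For some time I have been using Hadoop and Hive for data warehouse development. Recently HBase has started to show its strength in real-time processing, which is something data developers have long wished for. Having seen that Apache publishes a book on HBase, I decided, perhaps overestimating my abilities, to translate it.

Apache HBase is a distributed, column-oriented database built on top of Apache Hadoop and Apache ZooKeeper. This manual is based on HBase 0.90.0 and covers only part of HBase; more information is available from the following sites: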

https://issues.apache.org/jira/browse/HBASE

http://wiki.apache.org/hadoop/Hbase

http://zookeeper.apache.org/

http://hadoop.apache.org/


Chapter 1. Getting Started

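1.1 Introduction

    The quick start helps you set up and run a standalone HBase instance that uses the local filesystem. The not-so-quick start describes how to run HBase in distributed mode on top of HDFS.

1.2 Quick Start

     This section describes how to run a standalone (single-node) HBase instance on the local filesystem. It walks you through creating a table with the HBase Shell, inserting rows into it, cleaning up, and shutting down the HBase instance. The whole exercise should take less than 10 minutes.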

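1.2.1 Download and unpack the latest stable release

     Choose a download site from http://www.apache.org/dyn/closer.cgi/hbase/ . Users in mainland China may prefer the Apache mirrors hosted by Renren or Beijing Jiaotong University. Follow the link to the HBase download page, click stable, and download the file ending in .tar.gz, for example hbase-0.90.0.tar.gz. It is best to install HBase on Linux, either natively or in a virtual machine.

Unpack the archive and change into the new directory: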

$ tar xfz hbase-0.90.0.tar.gz
$ cd hbase-0.90.0
 

 You are now almost ready to run HBase. Before starting it, edit conf/hbase-site.xml and set hbase.rootdir, the location HBase writes its data to:

 

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///DIRECTORY/hbase</value>
  </property>
</configuration>

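  Replace DIRECTORY above with the path of the directory where you want HBase to store its data. By default, hbase.rootdir is set to /tmp/hbase-${user.name}, which means you will lose all your data whenever the server reboots (most operating systems clear /tmp on restart).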
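
The configuration step above can be sketched as a small shell helper. This is only an illustration: the function name write_hbase_site and its two arguments are hypothetical and not part of HBase; only the property name hbase.rootdir and the file layout are taken from the listing above. The sketch assumes the data directory is given as an absolute path.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with HBase): writes an hbase-site.xml
# that sets hbase.rootdir to a directory on the local filesystem.
#   $1 = path of the hbase-site.xml file to write (overwritten if present)
#   $2 = absolute path of the directory that should hold the HBase data
write_hbase_site() {
  site_file=$1
  data_dir=$2
  cat > "$site_file" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://${data_dir}/hbase</value>
  </property>
</configuration>
EOF
}

# Possible usage from inside the unpacked hbase-0.90.0 directory:
#   write_hbase_site conf/hbase-site.xml "$HOME/hbase-data"
```

Choosing a directory outside /tmp is the point of the exercise: the data then survives a reboot.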

1.2.2 Start HBase

HBase is now ready to start:

$ ./bin/start-hbase.sh
starting Master, logging to logs/hbase-user-master-example.org.out

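You now have HBase running in standalone mode. In standalone mode, HBase runs all of its daemons in a single JVM, including both the HBase and ZooKeeper daemons. HBase logs can be found in the logs subdirectory; check them first if HBase runs into problems.

 Note: Is Java installed?

         All of the above assumes that Java 1.6 or later is installed on your machine and is on your path, so that typing java runs it directly. If this is not the case, HBase will not start. Install Java, then edit conf/hbase-env.sh: uncomment the JAVA_HOME line and point it at your Java installation directory. Then retry the steps above.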
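
The Java check and the JAVA_HOME edit described above might look like the following sketch. The helper names java_on_path and set_java_home are hypothetical, and the sketch assumes that conf/hbase-env.sh contains a commented line of the form "# export JAVA_HOME=..." and that the Java directory contains no "|" or "&" characters (they would confuse the sed expression).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a java executable is reachable via PATH.
java_on_path() {
  command -v java >/dev/null 2>&1
}

# Hypothetical helper: uncomments the JAVA_HOME line in an hbase-env.sh file
# and points it at the given Java installation directory.
#   $1 = path to hbase-env.sh
#   $2 = Java installation directory
set_java_home() {
  env_file=$1
  java_dir=$2
  tmp_file="${env_file}.tmp"
  # Rewrite any "export JAVA_HOME=..." line, whether commented out or not;
  # all other lines are copied unchanged.
  sed "s|^#* *export JAVA_HOME=.*|export JAVA_HOME=${java_dir}|" \
    "$env_file" > "$tmp_file" && mv "$tmp_file" "$env_file"
}

# Possible usage from inside the unpacked hbase-0.90.0 directory
# (the JDK path is only an example):
#   java_on_path || set_java_home conf/hbase-env.sh /usr/java/jdk1.6.0
```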

1.2.3 Shell Exercises

Connect to your running HBase instance with the HBase Shell:

$ ./bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version: 0.89.20100924, r1001068, Fri Sep 24 13:55:42 PDT 2010

hbase(main):001:0> 

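Type help and press Enter to see a listing of shell commands and options. Read at least the paragraphs at the end of the help output, which explain how commands and variables are entered in the HBase Shell. In particular, note that table names, rows, and columns must be quoted.

Create a table named test with a single column family named cf, verify that it exists with the list command, and then insert some values: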

hbase(main):003:0> create 'test', 'cf'
0 row(s) in 1.2200 seconds
hbase(main):003:0> list 'test'
test
1 row(s) in 0.0550 seconds
hbase(main):004:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.0560 seconds
hbase(main):005:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0370 seconds
hbase(main):006:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0450 seconds
 

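   Here we inserted three values, one at a time. The first insert is at row1, column cf:a, with the value value1. A column in HBase consists of a column family prefix (cf in this example), followed by a colon, followed by a column qualifier suffix (a in this case).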
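
The family:qualifier naming rule above can be illustrated with plain shell parameter expansion. The two function names below are hypothetical and exist only for this sketch; they are not HBase commands. The sketch splits at the first colon; a name without a colon is returned unchanged by both functions.

```shell
#!/bin/sh
# Hypothetical helper: prints the column family of a "family:qualifier" name.
column_family() {
  # Remove the first colon and everything after it.
  printf '%s\n' "${1%%:*}"
}

# Hypothetical helper: prints the qualifier of a "family:qualifier" name.
column_qualifier() {
  # Remove everything up to and including the first colon.
  printf '%s\n' "${1#*:}"
}

# Possible usage:
#   column_family 'cf:a'      # the family part of cf:a
#   column_qualifier 'cf:a'   # the qualifier part of cf:a
```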

    Verify the inserted data by running a scan of the table:

hbase(main):007:0> scan 'test'
ROW        COLUMN+CELL
row1       column=cf:a, timestamp=1288380727188, value=value1
row2       column=cf:b, timestamp=1288380738440, value=value2
row3       column=cf:c, timestamp=1288380747365, value=value3
3 row(s) in 0.0590 seconds
 

 

Get a single row:

hbase(main):008:0> get 'test', 'row1'
COLUMN      CELL
cf:a        timestamp=1288380727188, value=value1
1 row(s) in 0.0400 seconds
 

Now disable and drop the table, which cleans up everything done above:

hbase(main):012:0> disable 'test'
0 row(s) in 1.0930 seconds
hbase(main):013:0> drop 'test'
0 row(s) in 0.0770 seconds 
 

Exit the shell:

hbase(main):014:0> exit
 

1.2.4 Stop HBase

Stop your HBase instance:

$ ./bin/stop-hbase.sh
stopping hbase...............