
Hive Installation and Configuration Notes

 

Please credit the source, SpringsSpace, when reposting: http://springsfeng.iteye.com

 

1. First install MySQL and set the root account password, then run the following as the root account:

      su - root

      mysql

      GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY 'root' WITH GRANT OPTION; 

2. Create the Hive user: run the following as the root account:

      su - root

      mysql -uroot -p

      CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive'; 

      CREATE USER 'hive'@'linux-fdc.linux.com' IDENTIFIED BY 'hive'; 

      CREATE USER 'hive'@'192.168.81.251' IDENTIFIED BY 'hive'; 

 

      CREATE DATABASE metastore DEFAULT CHARACTER SET latin1 DEFAULT COLLATE latin1_swedish_ci;

     
      GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'localhost' IDENTIFIED BY 'hive' WITH GRANT OPTION; 

      GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'192.168.81.251' IDENTIFIED BY 'hive' WITH GRANT OPTION; 
      flush privileges;
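The user, database, and grant statements above can also be collected into a single SQL script and fed to mysql in one pass, which makes the setup repeatable. A minimal sketch, reusing only the localhost statements from this guide (the script path is arbitrary):

```shell
# Write the metastore bootstrap statements to a reusable script.
cat > /tmp/hive_metastore_setup.sql <<'EOF'
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
CREATE DATABASE metastore DEFAULT CHARACTER SET latin1 DEFAULT COLLATE latin1_swedish_ci;
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'localhost' IDENTIFIED BY 'hive' WITH GRANT OPTION;
FLUSH PRIVILEGES;
EOF

# Then execute it as the MySQL root user:
# mysql -uroot -p < /tmp/hive_metastore_setup.sql
```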

3. Import the MySQL schema script

     Log in with the hive account:

     mysql -uhive -p -h192.168.81.251

     mysql> use metastore;
     Database changed
     mysql> source /opt/custom/hive-0.11.0/scripts/metastore/upgrade/mysql/hive-schema-0.10.0.mysql.sql

4. Hive installation and configuration

     (1) Build: targets the current Hive-0.11.0-SNAPSHOT

     Download the latest Hive source archive, hive-trunk.zip, extract it to /home/kevin/Downloads/hive-trunk, and edit the following in build.properties:

...
hadoop-0.20.version=0.20.2
hadoop-0.20S.version=1.1.2
hadoop-0.23.version=2.0.3-alpha
...

    To change the versions of other dependencies, edit the libraries.properties file under the ivy directory; for example, to change the HBase version:

    ...

    guava-hadoop23.version=11.0.2
    hbase.version=0.94.6
    jackson.version=1.8.8

    ...

    Run the following in the current directory:

     ant tar -Dforrest.home=/usr/custom/apache-forrest-0.9

     For forrest.home, see part 1, section 3 of http://springsfeng.iteye.com/admin/blogs/1734557.

     (2) Extract: copy hive-0.11.0-SNAPSHOT.tar.gz from the build output directory to /usr/custom/ and extract it there.

     (3) Set environment variables:

     export HIVE_HOME=/usr/custom/hive-0.11.0
     export PATH=$HIVE_HOME/bin:$PATH
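A quick sanity check that the two variables took effect in the current shell (the paths are the ones used throughout this guide):

```shell
export HIVE_HOME=/usr/custom/hive-0.11.0
export PATH=$HIVE_HOME/bin:$PATH

# Verify that the Hive bin directory is now on PATH.
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) echo "HIVE_HOME/bin is on PATH" ;;
  *) echo "HIVE_HOME/bin is missing from PATH" ;;
esac
```

To make the variables permanent, append the same two export lines to your shell profile (e.g. ~/.bashrc).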

     (4) Configuration files:

     Copy the .template files under the conf directory to create the corresponding .xml / .properties files:
     cp hive-default.xml.template hive-site.xml
     cp hive-log4j.properties.template hive-log4j.properties

     (5) Configure hive-config.sh:

...
#
# processes --config option from command line
#

export JAVA_HOME=/usr/custom/jdk1.6.0_43
export HIVE_HOME=/usr/custom/hive-0.11.0
export HADOOP_HOME=/usr/custom/hadoop-2.0.3-alpha


this="$0"
while [ -h "$this" ]; do
  ls=`ls -ld "$this"`
  link=`expr "$ls" : '.*-> \(.*\)$'`
  if expr "$link" : '.*/.*' > /dev/null; then
    this="$link"
  else
    this=`dirname "$this"`/"$link"
  fi
done
...

    (6) Configure logging in hive-log4j.properties (a special fix for the 0.10.0-era config):

     Change org.apache.hadoop.metrics.jvm.EventCounter to org.apache.hadoop.log.metrics.EventCounter, which resolves this warning:

     WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. 
     Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
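The substitution is a one-line sed. Since the real config file may not exist on every machine, the sketch below demonstrates it on a scratch copy in /tmp; in practice, point the same sed command at $HIVE_HOME/conf/hive-log4j.properties.

```shell
# Demonstrate the EventCounter fix on a scratch copy of the properties file.
f=/tmp/hive-log4j.properties
printf 'log4j.appender.EventCounter=org.apache.hadoop.metrics.jvm.EventCounter\n' > "$f"

# Replace the deprecated class with the new one (dots escaped in the pattern).
sed -i 's/org\.apache\.hadoop\.metrics\.jvm\.EventCounter/org.apache.hadoop.log.metrics.EventCounter/' "$f"
cat "$f"
```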

     (7) Create the hive-site.xml file:

<configuration>

	<!-- WARNING!!! This file is provided for documentation purposes ONLY! -->
	<!-- WARNING!!! Any changes you make to this file will be ignored by Hive. -->
	<!-- WARNING!!! You must make your changes in hive-site.xml instead. -->

	<!-- Hive Execution Parameters -->
	<property>
		<name>javax.jdo.option.ConnectionURL</name>
		<value>jdbc:mysql://localhost:3306/metastore_db?createDatabaseIfNotExist=true</value>
		<description>
			JDBC connect string for a JDBC metastore.
			Note: there must be no whitespace before or after the text inside
			the value tags above; otherwise the Hive client reports:
			FAILED: Error in metadata: java.lang.RuntimeException: Unable
			to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
		</description>
	</property>
	
	<property>
		<name>javax.jdo.option.ConnectionDriverName</name>
		<value>com.mysql.jdbc.Driver</value>
		<description>Driver class name for a JDBC metastore</description>
	</property>

	<property>
		<name>javax.jdo.option.ConnectionUserName</name>
		<value>hive</value>
		<description>username to use against metastore database</description>
	</property>

	<property>
		<name>javax.jdo.option.ConnectionPassword</name>
		<value>hive</value>
		<description>password to use against metastore database</description>
	</property>

	<property>
		<name>hive.metastore.uris</name>
		<value>thrift://linux-fdc.linux.com:8888</value>
		<description>
			Thrift uri for the remote metastore. Used by metastore
			client to connect to remote metastore.
		</description>
	</property>

</configuration>

    (8) Install MySQL Connector/J

    Download mysql-connector-java-5.1.22-bin.jar and place it in /usr/custom/hive-0.11.0/lib; otherwise,
    running the show tables; command fails with an error that the ConnectionDriverName class cannot be found.

5. Startup and usage

    (1) Start
    In the bin directory, run the command: hive
   (2) List the current databases and tables
   show databases;   // the default database is: default
   show tables;
   (3) Table creation examples

   These are my own tests; the test data can be found in the attachment.

  
   CREATE TABLE cite (citing INT, cited INT)
   ROW FORMAT DELIMITED
   FIELDS TERMINATED BY ','
   STORED AS TEXTFILE;
    
   CREATE TABLE cite_count (cited INT, count INT);
    
   INSERT OVERWRITE TABLE cite_count
   SELECT cited,COUNT(citing)
   FROM cite
   GROUP BY cited;
    
   SELECT * FROM cite_count WHERE count > 10 LIMIT 10;
    
   CREATE TABLE age (name STRING, birthday INT)
   ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\t'
   LINES TERMINATED BY '\n'
   STORED AS TEXTFILE;
    
   CREATE TABLE age_out (birthday INT, birthday_count INT)
   ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\t'
   LINES TERMINATED BY '\n'
   STORED AS TEXTFILE;
   (4) View a table's schema
   describe cite;
   (5) Load data
   hive> LOAD DATA LOCAL INPATH '/home/kevin/Documents/age.txt' OVERWRITE INTO TABLE age;
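The attachment is not included here, so a hypothetical age.txt in the tab-delimited layout the age table expects (name STRING, birthday INT) can be generated as follows; the rows are made-up sample data, not the original test set:

```shell
# Create a hypothetical, tab-delimited age.txt matching the age table's
# layout (name STRING \t birthday INT); these rows are made-up samples.
printf 'alice\t1985\nbob\t1990\ncarol\t1985\n' > /tmp/age.txt

# Quick check: every row should have exactly two tab-separated fields.
awk -F'\t' 'NF != 2 { bad = 1 } END { exit bad }' /tmp/age.txt && echo "format OK"
```

Load it with the same LOAD DATA statement, substituting /tmp/age.txt for the attachment path.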

Comments
Comment 6: xchd, 2014-02-11
 [echo] Project: common
 [echo]  Writing POM to F:\workspace\branch-0.12\build\common/pom.xml
[ivy:makepom] DEPRECATED: 'ivy.conf.file' is deprecated, use 'ivy.settings.file'instead
[ivy:makepom] :: loading settings :: file = F:\workspace\branch-0.12\ivy\ivysettings.xml

create-dirs:
[echo] Project: common
init:[echo] Project: common
setup: [echo] Project: common
ivy-retrieve: [echo] Project: common
[ivy:retrieve] :: retrieving :: org.apache.hive#hive-common
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  4 artifacts copied, 2 already retrieved (508kB/112ms)
compile:[echo] Project: common

BUILD FAILED
F:\workspace\branch-0.12\build.xml:327: The following error occurred while executing this line:
F:\workspace\branch-0.12\build.xml:166: The following error occurred while executing this line:

F:\workspace\branch-0.12\common\build.xml:33: Execute failed: java.io.IOException: Cannot run program "bash" (in directory "F:\workspace\branch-0.12\common"): CreateProcess error=2, ?????????

at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)


I have not modified the common\build.xml file.
How can this problem be solved?
Comment 5: xchd, 2014-02-11
Hello,
...
hadoop-0.20.version=0.20.2
hadoop-0.20S.version=1.1.2
hadoop-0.23.version=2.0.3-alpha
...
How should these be configured?
I need this to integrate with Hadoop 2.2 and HBase 0.96.

I have already changed hbase.version=0.96.0 in libraries.properties.
Comment 4: SpringsFeng, 2013-04-15
白杨付 wrote:
[artifact:pom] [WARNING] Unable to get resource 'org.apache:apache:pom:11' from repository central (http://repo1.maven.org/maven2): GET request of: org/apache/apache/11/apache-11.pom from central failed
[artifact:pom] An error has occurred while processing the Maven artifact tasks.
[artifact:pom]  Diagnosis:
[artifact:pom]
[artifact:pom] Unable to initialize POM pom.xml: Cannot find parent: org.apache:apache for project: org.apache.hcatalog:hcatalog:pom:0.11.0-SNAPSHOT for project org.apache.hcatalog:hcatalog:pom:0.11.0-SNAPSHOT
[artifact:pom] Unable to download the artifact from any repository


The build needs to download dependency JARs from the Maven central repository, which by default is unreachable from mainland China. Edit the settings.xml file under your Maven installation directory and add a mirror such as ibiblio.org; search online for the exact mirror configuration.
Comment 3: 白杨付, 2013-04-11
[artifact:pom] [WARNING] Unable to get resource 'org.apache:apache:pom:11' from repository central (http://repo1.maven.org/maven2): GET request of: org/apache/apache/11/apache-11.pom from central failed
[artifact:pom] An error has occurred while processing the Maven artifact tasks.
[artifact:pom]  Diagnosis:
[artifact:pom]
[artifact:pom] Unable to initialize POM pom.xml: Cannot find parent: org.apache:apache for project: org.apache.hcatalog:hcatalog:pom:0.11.0-SNAPSHOT for project org.apache.hcatalog:hcatalog:pom:0.11.0-SNAPSHOT
[artifact:pom] Unable to download the artifact from any repository
Comment 2: 白杨付, 2013-04-11
Looking more closely, it appears the repository is being blocked by the firewall.
Comment 1: 白杨付, 2013-04-11
Hi, I built Hive following the steps above and got the following error:
/home/q/hive-trunk/build.xml:270: The following error occurred while executing this line:
/home/q/hive-trunk/build.xml:109: The following error occurred while executing this line:
/home/q/hive-trunk/build.xml:111: The following error occurred while executing this line:
/home/q/hive-trunk/hcatalog/build.xml:65: The following error occurred while executing this line:
/home/q/hive-trunk/hcatalog/build-support/ant/deploy.xml:53: Unable to initialize POM pom.xml: Cannot find parent: org.apache:apache for project: org.apache.hcatalog:hcatalog:pom:0.11.0-SNAPSHOT for project org.apache.hcatalog:hcatalog:pom:0.11.0-SNAPSHOT

Any pointers would be greatly appreciated.
