特別注意:hadoop.proxyuser.<伺服器用戶名>.hosts 和 hadoop.proxyuser.<伺服器用戶名>.groups這兩個屬性,伺服器用戶名是hadoop所在的機器的登錄的名字,根據自己實際的登錄名來配置。這裡我的電腦用戶名為mengday。
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Base directory Hadoop uses for temporary/working files. -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/Cellar/hadoop/3.2.1/libexec/tmp</value>
  </property>
  <!-- Default filesystem URI: the HDFS NameNode address. -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
  <!-- Proxy-user (impersonation) settings. "mengday" must be the login
       name of the user running Hadoop on this machine; "*" allows
       impersonation from any host and for any group. -->
  <property>
    <name>hadoop.proxyuser.mengday.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.mengday.groups</name>
    <value>*</value>
  </property>
</configuration>
> cd /usr/local/Cellar/hadoop/3.2.1/sbin
> ./start-all.sh
> jps
啟動成功後注意查看DataNode節點是否啟動起來, 經常遇到DataNode節點啟動不成功。
Java是通過beeline來連接Hive的。啟動beeline最重要的就是配置好hive-site.xml。
其中javax.jdo.option.ConnectionURL涉及到一個資料庫,最好重新刪掉原來的metastore資料庫然後重新創建一個並初始化一下。
mysql> create database metastore;
> cd /usr/local/Cellar/hive/3.1.2/libexec/bin
> schematool -initSchema -dbType mysql
hive-site.xml
<configuration>
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
    <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
  </property>
  <!-- FIX: "&" is not legal inside an XML value; it must be escaped as
       "&amp;", otherwise this file is not well-formed XML and Hive fails
       to load the configuration. -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?characterEncoding=UTF-8&amp;createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
  </property>
  <!-- MySQL user name -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <!-- MySQL password -->
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root123</value>
  </property>
  <!-- Directory where Hive stores map/reduce execution plans for the
       different stages, plus intermediate output. Defaults to
       /tmp/<user.name>/hive; in practice teams usually separate this
       per group under a group-owned tmp directory. -->
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/tmp/hive</value>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/tmp/hive</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/data/hive/warehouse</value>
  </property>
  <property>
    <name>hive.metastore.event.db.notification.api.auth</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.server2.active.passive.ha.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.transport.mode</name>
    <value>binary</value>
    <description>
      Expects one of [binary, http].
      Transport mode of HiveServer2.
    </description>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/tmp/hive</value>
  </property>
  <property>
    <name>hive.hwi.listen.host</name>
    <value>0.0.0.0</value>
    <description>This is the host address the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.server2.webui.host</name>
    <value>0.0.0.0</value>
    <description>The host address the HiveServer2 WebUI will listen on</description>
  </property>
</configuration>
在啟動beeline之前需要先啟動hiveserver2,而在啟動hiveserver2之前需要先啟動metastore。metastore默認的埠為9083。
> cd /usr/local/Cellar/hive/3.1.2/bin
> hive --service metastore &
啟動後一定要確認一下啟動是否成功。
> cd /usr/local/Cellar/hive/3.1.2/bin
> hive --service hiveserver2 &
hiveserver2默認的埠為10000,啟動之後一定要查看10000埠是否存在,配置有問題基本上10000埠都啟動不成功。10000埠存在不存在是啟動beeline的關鍵。
> cd /usr/local/Cellar/hive/3.1.2/bin
> beeline -u jdbc:hive2://localhost:10000/default -n mengday -p
看到0: jdbc:hive2://localhost:10000/default>就表示啟動成功了。
<dependency> <groupId>org.apache.hive</groupId> <artifactId>hive-jdbc</artifactId> <version>3.1.2</version></dependency>
/data/employee.txt
1,zhangsan,28,60.66,2020-02-01 10:00:00,true,eat#drink,k1:v1#k2:20,s1#c1#s1#12,lisi,29,60.66,2020-02-01 11:00:00,false,play#drink,k3:v3#k4:30,s2#c2#s1#2
import java.sql.*;

/**
 * Example JDBC client for HiveServer2 (jdbc:hive2://localhost:10000).
 *
 * Demonstrates creating a database/table with complex column types
 * (array, map, struct), loading a delimited local file, and querying it.
 *
 * Fixes over the original:
 * - The static initializer now fails fast instead of swallowing the
 *   connection error (which previously left conn/stmt null and caused a
 *   confusing NullPointerException on first use).
 * - The DDL uses "\\n" so Hive receives the two-character escape
 *   sequence \n, rather than a raw newline embedded inside the SQL text.
 * - main() closes resources in a finally block even when init/load fail.
 */
public class HiveJdbcClient {
    private static String url = "jdbc:hive2://localhost:10000/default";
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    private static String user = "mengday";
    private static String password = "user對應的密碼";
    private static Connection conn = null;
    private static Statement stmt = null;
    private static ResultSet rs = null;

    static {
        try {
            Class.forName(driverName);
            conn = DriverManager.getConnection(url, user, password);
            stmt = conn.createStatement();
        } catch (Exception e) {
            // Fail fast with the original cause attached; do not continue
            // with a null connection.
            throw new ExceptionInInitializerError(e);
        }
    }

    /**
     * Recreates the hive_test database and the employee table, then prints
     * the database list, table list, and table description.
     *
     * @throws Exception if any HiveQL statement fails
     */
    public static void init() throws Exception {
        stmt.execute("drop database if exists hive_test");
        stmt.execute("create database hive_test");
        rs = stmt.executeQuery("show databases");
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
        stmt.execute("drop table if exists employee");
        // Delimiters: fields ',', collection items '#', map keys ':',
        // lines '\n' — matching the layout of /data/employee.txt.
        String sql = "create table if not exists employee("
                + " id bigint, "
                + " username string, "
                + " age tinyint, "
                + " weight decimal(10, 2), "
                + " create_time timestamp, "
                + " is_test boolean, "
                + " tags array<string>, "
                + " ext map<string, string>, "
                + " address struct<street:String, city:string, state:string, zip:int> "
                + " ) "
                + " row format delimited "
                + " fields terminated by ',' "
                + " collection items terminated by '#' "
                + " map keys terminated by ':' "
                + " lines terminated by '\\n'";
        stmt.execute(sql);
        rs = stmt.executeQuery("show tables");
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
        rs = stmt.executeQuery("desc employee");
        while (rs.next()) {
            System.out.println(rs.getString(1) + "\t" + rs.getString(2));
        }
    }

    /**
     * Loads the local delimited file into the employee table (overwriting
     * existing data) and prints every row.
     *
     * @throws Exception if the load or query fails
     */
    private static void load() throws Exception {
        // 加載數據 (load data)
        String filePath = "/data/employee.txt";
        stmt.execute("load data local inpath '" + filePath + "' overwrite into table employee");
        // 查詢數據 (query data)
        rs = stmt.executeQuery("select * from employee");
        while (rs.next()) {
            System.out.println(rs.getLong("id") + "\t"
                    + rs.getString("username") + "\t"
                    + rs.getObject("tags") + "\t"
                    + rs.getObject("ext") + "\t"
                    + rs.getObject("address"));
        }
    }

    /** Closes the result set, statement, and connection, in that order. */
    private static void close() throws Exception {
        if (rs != null) {
            rs.close();
        }
        if (stmt != null) {
            stmt.close();
        }
        if (conn != null) {
            conn.close();
        }
    }

    public static void main(String[] args) throws Exception {
        try {
            init();
            load();
        } finally {
            // Always release JDBC resources, even if init/load threw.
            close();
        }
    }
}