A Detailed Study of Hive SQL
1. Prepare the data
emp.txt
7369 SMITH CLERK 7902 1980-12-17 800.00 20
7499 ALLEN SALESMAN 7698 1981-2-20 1600.00 300.00 30
7521 WARD SALESMAN 7698 1981-2-22 1250.00 500.00 30
7566 JONES MANAGER 7839 1981-4-2 2975.00 20
7654 MARTIN SALESMAN 7698 1981-9-28 1250.00 1400.00 30
7698 BLAKE MANAGER 7839 1981-5-1 2850.00 30
7782 CLARK MANAGER 7839 1981-6-9 2450.00 10
7788 SCOTT ANALYST 7566 1987-4-19 3000.00 20
7839 KING PRESIDENT 1981-11-17 5000.00 10
7844 TURNER SALESMAN 7698 1981-9-8 1500.00 0.00 30
7876 ADAMS CLERK 7788 1987-5-23 1100.00 20
7900 JAMES CLERK 7698 1981-12-3 950.00 30
7902 FORD ANALYST 7566 1981-12-3 3000.00 20
7934 MILLER CLERK 7782 1982-1-23 1300.00 10
dept.txt
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
2. Create the emp table
drop table if exists default.emp;
create table default.emp(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
deptno int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
hive> create table default.emp(
> empno int,
> ename string,
> job string,
> mgr int,
> hiredate string,
> sal double,
> deptno int
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
OK
Time taken: 3.152 seconds
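The sample rows in step 1 carry an extra commission value on the SALESMAN records (for example 300.00 for ALLEN), and the seven-column definition above has no column for it. Below is a minimal sketch of a variant, assuming emp.txt has eight tab-separated fields per row with the commission field left empty on the other rows. The table name emp_with_comm and the column name comm are illustrative and do not appear in the original session.

```sql
-- Hypothetical variant of the emp table with a commission column.
-- Assumes every row of emp.txt has 8 tab-separated fields.
drop table if exists default.emp_with_comm;
create table default.emp_with_comm(
  empno int,
  ename string,
  job string,
  mgr int,
  hiredate string,
  sal double,
  comm double,   -- assumed column: commission, empty (NULL) on most rows
  deptno int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
```

A delimited text table maps fields to columns by position, so if the file really has eight fields, leaving out one column shifts every field that follows it.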
2. Load the data
load data local inpath '/opt/hive-0.13.1/emp.txt' overwrite into table default.emp;
hive> load data local inpath '/opt/hive-0.13.1/emp.txt' overwrite into table default.emp ;
Copying data from file:/opt/hive-0.13.1/emp.txt
Copying file: file:/opt/hive-0.13.1/emp.txt
Loading data to table default.emp
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://cluster/user/hive/warehouse/emp
Table default.emp stats: [numFiles=1, numRows=0, totalSize=656, rawDataSize=0]
OK
Time taken: 7.242 seconds
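One way to check the load is sketched below with standard HiveQL. The output is omitted because these statements were not part of the original session. The stats line above reports numRows=0, most likely because the load only copies the file without scanning it; running `analyze table` is one way to have Hive compute the row count.

```sql
-- Count the loaded rows; emp.txt in step 1 lists 14 employees.
select count(*) from default.emp;
-- Look at a few rows to confirm the tab delimiter split the fields as expected.
select * from default.emp limit 5;
-- Ask Hive to scan the table and refresh numRows in the table statistics.
analyze table default.emp compute statistics;
```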
3. Create the dept table
hive> create table default.dept(
> deptno int,
> dname string,
> loc string
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
OK
Time taken: 1.566 seconds
4. Load the data
hive> load data local inpath '/opt/hive-0.13.1/dept.txt' overwrite into table default.dept;
Copying data from file:/opt/hive-0.13.1/dept.txt
Copying file: file:/opt/hive-0.13.1/dept.txt
Loading data to table default.dept
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://cluster/user/hive/warehouse/dept
Table default.dept stats: [numFiles=1, numRows=0, totalSize=79, rawDataSize=0]
OK
Time taken: 2.526 seconds
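The loads in this walkthrough use `local`, which copies a file from the local file system. The other form is sketched below, assuming the file is first uploaded to /tmp/dept.txt, a hypothetical HDFS path chosen only for illustration. Without `local`, the source is an HDFS path and Hive moves the file into the table directory instead of copying it.

```sql
-- /tmp/dept.txt is a hypothetical HDFS path, not part of the original session.
dfs -put /opt/hive-0.13.1/dept.txt /tmp/dept.txt;
-- No LOCAL: the source is read from HDFS and moved into the table directory.
-- OVERWRITE replaces the table's current files; omit it to append instead.
load data inpath '/tmp/dept.txt' overwrite into table default.dept;
```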
5. The tables created by Hive can be seen on HDFS.
hive> dfs -ls /user/hive/warehouse/
> ;
Found 8 items
drwxr-xr-x - root supergroup 0 2015-10-18 05:04 /user/hive/warehouse/db_hive.db
drwxr-xr-x - root supergroup 0 2015-10-19 05:56 /user/hive/warehouse/dept
drwxr-xr-x - root supergroup 0 2015-10-19 05:56 /user/hive/warehouse/emp
drwxr-xr-x - root supergroup 0 2015-10-17 23:50 /user/hive/warehouse/hello.db
drwxr-xr-x - root supergroup 0 2015-10-17 23:48 /user/hive/warehouse/student
drwxr-xr-x - root supergroup 0 2015-10-18 08:34 /user/hive/warehouse/weblog
drwxr-xr-x - root supergroup 0 2015-10-18 08:44 /user/hive/warehouse/weblog_20150923
drwxr-xr-x - root supergroup 0 2015-10-18 08:56 /user/hive/warehouse/weblog_comm
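Each table in the default database is a directory directly under the warehouse path, and each other database is a directory ending in `.db`. A small sketch of how to cross-check this from inside Hive, using standard commands that were not part of the original session:

```sql
-- List the tables Hive knows about in the default database.
show tables in default;
-- List the files behind one table; the loaded emp.txt should appear here.
dfs -ls /user/hive/warehouse/emp;
```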
6. The Hive metadata can be viewed in MySQL.
mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME        | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
|      1 |  1445150880 |     1 |                0 | root  |         0 |     1 | student         | MANAGED_TABLE | NULL               | NULL               |
|      2 |  1445169777 |     3 |                0 | root  |         0 |     2 | student         | MANAGED_TABLE | NULL               | NULL               |
|      6 |  1445182413 |     1 |                0 | root  |         0 |     6 | weblog          | MANAGED_TABLE | NULL               | NULL               |
|      7 |  1445183058 |     1 |                0 | root  |         0 |     7 | weblog_20150923 | MANAGED_TABLE | NULL               | NULL               |
|      8 |  1445183801 |     1 |                0 | root  |         0 |     8 | weblog_comm     | MANAGED_TABLE | NULL               | NULL               |
|      9 |  1445258345 |     1 |                0 | root  |         0 |     9 | emp             | MANAGED_TABLE | NULL               | NULL               |
|     10 |  1445258771 |     1 |                0 | root  |         0 |    10 | dept            | MANAGED_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
7 rows in set (0.00 sec)
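TBLS stores only a DB_ID, which is why `student` appears twice (once with DB_ID 1 and once with DB_ID 3). The query below is a sketch that resolves database names and storage locations, assuming the standard metastore tables DBS and SDS exist in the same MySQL schema. It was not run in the original session.

```sql
-- Run inside the MySQL database that holds the Hive metastore.
SELECT d.NAME AS db_name,
       t.TBL_NAME,
       t.TBL_TYPE,
       s.LOCATION          -- HDFS directory backing the table
FROM TBLS t
JOIN DBS d ON t.DB_ID = d.DB_ID
JOIN SDS s ON t.SD_ID = s.SD_ID
ORDER BY d.NAME, t.TBL_NAME;
```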
7. Dropping a table deletes both its data on HDFS and its metadata in MySQL.
hive> drop table if exists default.weblog_comm;
OK
Time taken: 9.869 seconds
hive> dfs -ls /user/hive/warehouse/;
Found 7 items
drwxr-xr-x - root supergroup 0 2015-10-18 05:04 /user/hive/warehouse/db_hive.db
drwxr-xr-x - root supergroup 0 2015-10-19 05:56 /user/hive/warehouse/dept
drwxr-xr-x - root supergroup 0 2015-10-19 05:56 /user/hive/warehouse/emp
drwxr-xr-x - root supergroup 0 2015-10-17 23:50 /user/hive/warehouse/hello.db
drwxr-xr-x - root supergroup 0 2015-10-17 23:48 /user/hive/warehouse/student
drwxr-xr-x - root supergroup 0 2015-10-18 08:34 /user/hive/warehouse/weblog
drwxr-xr-x - root supergroup 0 2015-10-18 08:44 /user/hive/warehouse/weblog_20150923
mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME        | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
|      1 |  1445150880 |     1 |                0 | root  |         0 |     1 | student         | MANAGED_TABLE | NULL               | NULL               |
|      2 |  1445169777 |     3 |                0 | root  |         0 |     2 | student         | MANAGED_TABLE | NULL               | NULL               |
|      6 |  1445182413 |     1 |                0 | root  |         0 |     6 | weblog          | MANAGED_TABLE | NULL               | NULL               |
|      7 |  1445183058 |     1 |                0 | root  |         0 |     7 | weblog_20150923 | MANAGED_TABLE | NULL               | NULL               |
|      9 |  1445258345 |     1 |                0 | root  |         0 |     9 | emp             | MANAGED_TABLE | NULL               | NULL               |
|     10 |  1445258771 |     1 |                0 | root  |         0 |    10 | dept            | MANAGED_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
6 rows in set (0.00 sec)
8. Table types in Hive: external tables and managed tables
All the tables created so far are managed tables. This step creates an external table.
* MANAGED_TABLE (managed table)
* EXTERNAL_TABLE (external table)
First, create a database:
hive> create database if not exists db_hive_0927;
OK
Time taken: 0.792 seconds
hive> drop table if exists db_hive_0927.dept_external ;
OK
Time taken: 0.092 seconds
hive> create EXTERNAL table db_hive_0927.dept_external(
> deptno int,
> dname string,
> loc string
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' ;
OK
Time taken: 0.866 seconds
9. Inspect the table type (external table)
hive> desc formatted db_hive_0927.dept_external;
OK
# col_name data_type comment
deptno int
dname string
loc string
# Detailed Table Information
Database: db_hive_0927
Owner: root
CreateTime: Mon Oct 19 06:16:47 PDT 2015
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://cluster/user/hive/warehouse/db_hive_0927.db/dept_external
Table Type: EXTERNAL_TABLE
Table Parameters:
EXTERNAL TRUE
transient_lastDdlTime 1445260607
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim \t
serialization.format \t
Time taken: 0.384 seconds, Fetched: 30 row(s)
10. Inspect the table type (managed table), using the table created earlier
hive> desc formatted default.dept;
OK
# col_name data_type comment
deptno int
dname string
loc string
# Detailed Table Information
Database: default
Owner: root
CreateTime: Mon Oct 19 05:46:11 PDT 2015
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://cluster/user/hive/warehouse/dept
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE true
numFiles 1
numRows 0
rawDataSize 0
totalSize 79
transient_lastDdlTime 1445259404
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim \t
serialization.format \t
Time taken: 0.383 seconds, Fetched: 34 row(s)
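The two listings differ in `Table Type` and in the `EXTERNAL TRUE` entry under Table Parameters. Hive also lets that property be set on an existing table, which switches it between the two types. The sketch below was not part of the original session, and default.emp is used only as an example target.

```sql
-- Mark an existing managed table as external. The value is written in upper
-- case to match the EXTERNAL TRUE entry that desc formatted displays.
alter table default.emp set tblproperties('EXTERNAL'='TRUE');
-- Confirm the change: Table Type should now read EXTERNAL_TABLE.
desc formatted default.emp;
-- Switch it back so the rest of the walkthrough behaves as shown.
alter table default.emp set tblproperties('EXTERNAL'='FALSE');
```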
11. What is the difference between an external table and a managed table?
Load data into the external table and query it:
hive> load data local inpath '/opt/hive-0.13.1/dept.txt' overwrite into table db_hive_0927.dept_external;
Copying data from file:/opt/hive-0.13.1/dept.txt
Copying file: file:/opt/hive-0.13.1/dept.txt
Loading data to table db_hive_0927.dept_external
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://cluster/user/hive/warehouse/db_hive_0927.db/dept_external
Table db_hive_0927.dept_external stats: [numFiles=1, numRows=0, totalSize=79, rawDataSize=0]
OK
Time taken: 2.687 seconds
hive> select * from db_hive_0927.dept_external;
OK
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
Time taken: 0.405 seconds, Fetched: 4 row(s)
hive> select * from default.dept;
OK
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
Time taken: 0.37 seconds, Fetched: 4 row(s)
Conclusion:
Dropping an external table deletes its metadata, but its data on HDFS is not deleted.
Dropping a managed table deletes both its metadata and its data on HDFS.
First, look at the metadata:
mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+----------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME        | TBL_TYPE       | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+----------------+--------------------+--------------------+
|      1 |  1445150880 |     1 |                0 | root  |         0 |     1 | student         | MANAGED_TABLE  | NULL               | NULL               |
|      2 |  1445169777 |     3 |                0 | root  |         0 |     2 | student         | MANAGED_TABLE  | NULL               | NULL               |
|      6 |  1445182413 |     1 |                0 | root  |         0 |     6 | weblog          | MANAGED_TABLE  | NULL               | NULL               |
|      7 |  1445183058 |     1 |                0 | root  |         0 |     7 | weblog_20150923 | MANAGED_TABLE  | NULL               | NULL               |
|      9 |  1445258345 |     1 |                0 | root  |         0 |     9 | emp             | MANAGED_TABLE  | NULL               | NULL               |
|     10 |  1445258771 |     1 |                0 | root  |         0 |    10 | dept            | MANAGED_TABLE  | NULL               | NULL               |
|     11 |  1445260607 |     6 |                0 | root  |         0 |    11 | dept_external   | EXTERNAL_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+----------------+--------------------+--------------------+
7 rows in set (0.00 sec)
Data on HDFS:
hive> dfs -ls /user/hive/warehouse/db_hive_0927.db;
Found 1 items
drwxr-xr-x - root supergroup 0 2015-10-19 06:34 /user/hive/warehouse/db_hive_0927.db/dept_external
hive> dfs -ls /user/hive/warehouse/dept;
Found 1 items
-rw-r--r-- 2 root supergroup 79 2015-10-19 05:56 /user/hive/warehouse/dept/dept.txt
Next, drop both tables:
hive> drop table default.dept ;
OK
Time taken: 2.27 seconds
hive> drop table db_hive_0927.dept_external ;
OK
Time taken: 0.479 seconds
Check the metadata:
mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME        | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
|      1 |  1445150880 |     1 |                0 | root  |         0 |     1 | student         | MANAGED_TABLE | NULL               | NULL               |
|      2 |  1445169777 |     3 |                0 | root  |         0 |     2 | student         | MANAGED_TABLE | NULL               | NULL               |
|      6 |  1445182413 |     1 |                0 | root  |         0 |     6 | weblog          | MANAGED_TABLE | NULL               | NULL               |
|      7 |  1445183058 |     1 |                0 | root  |         0 |     7 | weblog_20150923 | MANAGED_TABLE | NULL               | NULL               |
|      9 |  1445258345 |     1 |                0 | root  |         0 |     9 | emp             | MANAGED_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
5 rows in set (0.01 sec)
The metadata for both tables has been deleted.
Check the data on HDFS:
hive> dfs -ls /user/hive/warehouse/;
Found 7 items
drwxr-xr-x - root supergroup 0 2015-10-18 05:04 /user/hive/warehouse/db_hive.db
drwxr-xr-x - root supergroup 0 2015-10-19 06:34 /user/hive/warehouse/db_hive_0927.db
drwxr-xr-x - root supergroup 0 2015-10-19 05:56 /user/hive/warehouse/emp
drwxr-xr-x - root supergroup 0 2015-10-17 23:50 /user/hive/warehouse/hello.db
drwxr-xr-x - root supergroup 0 2015-10-17 23:48 /user/hive/warehouse/student
drwxr-xr-x - root supergroup 0 2015-10-18 08:34 /user/hive/warehouse/weblog
drwxr-xr-x - root supergroup 0 2015-10-18 08:44 /user/hive/warehouse/weblog_20150923
The managed table's data is gone.
hive> dfs -ls /user/hive/warehouse/db_hive_0927.db;
Found 1 items
drwxr-xr-x - root supergroup 0 2015-10-19 06:34 /user/hive/warehouse/db_hive_0927.db/dept_external
The external table's data is still on HDFS.
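Because the directory survives, recreating the external table with the same definition should make the old rows visible again without a new load. This is a minimal sketch that was not run in the original session. It assumes no LOCATION clause is needed because the default location resolves to the same leftover directory.

```sql
-- Recreate the external table; with no LOCATION clause it points at the
-- default directory .../db_hive_0927.db/dept_external that was left behind.
create EXTERNAL table db_hive_0927.dept_external(
  deptno int,
  dname string,
  loc string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
-- The surviving dept.txt is read as the table's data.
select * from db_hive_0927.dept_external;
```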
12. Finally, a location can be specified when creating an external table.
drop table if exists db_hive_0927.xiaoming;
create EXTERNAL table db_hive_0927.xiaoming(
deptno int,
dname string,
loc string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/hive/warehouse/db_hive_0927.db/xiaoming';
hive> drop table if exists db_hive_0927.xiaoming ;
OK
Time taken: 0.173 seconds
hive> create EXTERNAL table db_hive_0927.xiaoming(
> deptno int,
> dname string,
> loc string
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
> LOCATION '/user/hive/warehouse/db_hive_0927.db/xiaoming';
OK
Time taken: 0.396 seconds
hive> dfs -ls /user/hive/warehouse/db_hive_0927.db/;
Found 3 items
drwxr-xr-x - root supergroup 0 2015-10-19 06:58 /user/hive/warehouse/db_hive_0927.db/aa
drwxr-xr-x - root supergroup 0 2015-10-19 06:34 /user/hive/warehouse/db_hive_0927.db/dept_external
drwxr-xr-x - root supergroup 0 2015-10-19 07:00 /user/hive/warehouse/db_hive_0927.db/xiaoming
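With an explicit location, data can be placed in the directory without going through `load data`. The sketch below reuses the dept.txt path from the earlier steps. It was not part of the original session, so no output is shown.

```sql
-- Copy the local file straight into the directory named in LOCATION.
dfs -put /opt/hive-0.13.1/dept.txt /user/hive/warehouse/db_hive_0927.db/xiaoming/;
-- The external table reads whatever files are present in that directory.
select * from db_hive_0927.xiaoming;
```

This is a common reason for using an external table: the files can be produced and managed outside Hive, and dropping the table leaves them in place.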