Hive SQL in Detail

1. Prepare the data
emp.txt
7369	SMITH	CLERK	7902	1980-12-17	800.00		20
7499	ALLEN	SALESMAN	7698	1981-2-20	1600.00	300.00	30
7521	WARD	SALESMAN	7698	1981-2-22	1250.00	500.00	30
7566	JONES	MANAGER	7839	1981-4-2	2975.00		20
7654	MARTIN	SALESMAN	7698	1981-9-28	1250.00	1400.00	30
7698	BLAKE	MANAGER	7839	1981-5-1	2850.00		30
7782	CLARK	MANAGER	7839	1981-6-9	2450.00		10
7788	SCOTT	ANALYST	7566	1987-4-19	3000.00		20
7839	KING	PRESIDENT		1981-11-17	5000.00		10
7844	TURNER	SALESMAN	7698	1981-9-8	1500.00	0.00	30
7876	ADAMS	CLERK	7788	1987-5-23	1100.00		20
7900	JAMES	CLERK	7698	1981-12-3	950.00		30
7902	FORD	ANALYST	7566	1981-12-3	3000.00		20
7934	MILLER	CLERK	7782	1982-1-23	1300.00		10

dept.txt
10	ACCOUNTING	NEW YORK
20	RESEARCH	DALLAS
30	SALES	CHICAGO
40	OPERATIONS	BOSTON

2. Create table emp
drop table if exists default.emp;
create table default.emp( empno int, ename string, job string, mgr int, hiredate string, sal double, comm double, deptno int ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
hive> create table default.emp(
    > empno int,
    > ename string,
    > job string,
    > mgr int,
    > hiredate string,
    > sal double,
    > comm double,
    > deptno int
    > )
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
OK
Time taken: 3.152 seconds
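
A delimited text table maps the fields of each line to columns by position, so the column list needs to line up with the fields in emp.txt. As a rough check outside Hive, the following Python sketch splits the first row of emp.txt on tabs and counts the fields. The helper name split_fields is hypothetical and only illustrates the idea; it is not how Hive itself parses rows.

```python
# First row of emp.txt from section 1; "\t" is the field delimiter.
line = "7369\tSMITH\tCLERK\t7902\t1980-12-17\t800.00\t\t20"

def split_fields(row, delimiter="\t"):
    """Split one text row into positional fields (hypothetical helper)."""
    return row.rstrip("\n").split(delimiter)

fields = split_fields(line)
print(len(fields))  # how many columns the table definition has to cover
print(fields)       # a missing value shows up as an empty string
```

Two consecutive tabs produce an empty field, which is why rows without a commission still have the same number of fields as rows with one.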

2. Load the data
load data local inpath '/opt/hive-0.13.1/emp.txt' overwrite into table default.emp;
hive> load data local inpath '/opt/hive-0.13.1/emp.txt' overwrite into table default.emp ;
Copying data from file:/opt/hive-0.13.1/emp.txt
Copying file: file:/opt/hive-0.13.1/emp.txt
Loading data to table default.emp
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://cluster/user/hive/warehouse/emp
Table default.emp stats: [numFiles=1, numRows=0, totalSize=656, rawDataSize=0]
OK
Time taken: 7.242 seconds

3. Create table dept
hive> create table default.dept(
    > deptno int,
    > dname string,
    > loc string 
    > )
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
OK
Time taken: 1.566 seconds

4. Load the data
hive> load data local inpath '/opt/hive-0.13.1/dept.txt' overwrite into table default.dept;
Copying data from file:/opt/hive-0.13.1/dept.txt
Copying file: file:/opt/hive-0.13.1/dept.txt
Loading data to table default.dept
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://cluster/user/hive/warehouse/dept
Table default.dept stats: [numFiles=1, numRows=0, totalSize=79, rawDataSize=0]
OK
Time taken: 2.526 seconds

5. View the tables created in Hive
hive> dfs -ls /user/hive/warehouse/                                                        
    > ;
Found 8 items
drwxr-xr-x   - root supergroup          0 2015-10-18 05:04 /user/hive/warehouse/db_hive.db
drwxr-xr-x   - root supergroup          0 2015-10-19 05:56 /user/hive/warehouse/dept
drwxr-xr-x   - root supergroup          0 2015-10-19 05:56 /user/hive/warehouse/emp
drwxr-xr-x   - root supergroup          0 2015-10-17 23:50 /user/hive/warehouse/hello.db
drwxr-xr-x   - root supergroup          0 2015-10-17 23:48 /user/hive/warehouse/student
drwxr-xr-x   - root supergroup          0 2015-10-18 08:34 /user/hive/warehouse/weblog
drwxr-xr-x   - root supergroup          0 2015-10-18 08:44 /user/hive/warehouse/weblog_20150923
drwxr-xr-x   - root supergroup          0 2015-10-18 08:56 /user/hive/warehouse/weblog_comm

6. View the Hive metadata in MySQL
mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME        | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
|      1 |  1445150880 |     1 |                0 | root  |         0 |     1 | student         | MANAGED_TABLE | NULL               | NULL               |
|      2 |  1445169777 |     3 |                0 | root  |         0 |     2 | student         | MANAGED_TABLE | NULL               | NULL               |
|      6 |  1445182413 |     1 |                0 | root  |         0 |     6 | weblog          | MANAGED_TABLE | NULL               | NULL               |
|      7 |  1445183058 |     1 |                0 | root  |         0 |     7 | weblog_20150923 | MANAGED_TABLE | NULL               | NULL               |
|      8 |  1445183801 |     1 |                0 | root  |         0 |     8 | weblog_comm     | MANAGED_TABLE | NULL               | NULL               |
|      9 |  1445258345 |     1 |                0 | root  |         0 |     9 | emp             | MANAGED_TABLE | NULL               | NULL               |
|     10 |  1445258771 |     1 |                0 | root  |         0 |    10 | dept            | MANAGED_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
7 rows in set (0.00 sec)

7. Dropping a table deletes both its data on HDFS and its metadata in MySQL
hive> drop table if exists default.weblog_comm;
OK
Time taken: 9.869 seconds
hive> dfs -ls /user/hive/warehouse/;
Found 7 items
drwxr-xr-x   - root supergroup          0 2015-10-18 05:04 /user/hive/warehouse/db_hive.db
drwxr-xr-x   - root supergroup          0 2015-10-19 05:56 /user/hive/warehouse/dept
drwxr-xr-x   - root supergroup          0 2015-10-19 05:56 /user/hive/warehouse/emp
drwxr-xr-x   - root supergroup          0 2015-10-17 23:50 /user/hive/warehouse/hello.db
drwxr-xr-x   - root supergroup          0 2015-10-17 23:48 /user/hive/warehouse/student
drwxr-xr-x   - root supergroup          0 2015-10-18 08:34 /user/hive/warehouse/weblog
drwxr-xr-x   - root supergroup          0 2015-10-18 08:44 /user/hive/warehouse/weblog_20150923
mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME        | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
|      1 |  1445150880 |     1 |                0 | root  |         0 |     1 | student         | MANAGED_TABLE | NULL               | NULL               |
|      2 |  1445169777 |     3 |                0 | root  |         0 |     2 | student         | MANAGED_TABLE | NULL               | NULL               |
|      6 |  1445182413 |     1 |                0 | root  |         0 |     6 | weblog          | MANAGED_TABLE | NULL               | NULL               |
|      7 |  1445183058 |     1 |                0 | root  |         0 |     7 | weblog_20150923 | MANAGED_TABLE | NULL               | NULL               |
|      9 |  1445258345 |     1 |                0 | root  |         0 |     9 | emp             | MANAGED_TABLE | NULL               | NULL               |
|     10 |  1445258771 |     1 |                0 | root  |         0 |    10 | dept            | MANAGED_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
6 rows in set (0.00 sec)

8. Table types in Hive: external tables and managed tables
The tables created so far are all managed tables. This section creates an external table.
* MANAGED_TABLE (managed table)
* EXTERNAL_TABLE (external table)
First, create a database:
hive> create database if not exists db_hive_0927;
OK
Time taken: 0.792 seconds
hive> drop table if exists db_hive_0927.dept_external ;
OK
Time taken: 0.092 seconds

hive> create EXTERNAL table db_hive_0927.dept_external(
    > deptno int,
    > dname string,
    > loc string
    > )
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' ;
OK
Time taken: 0.866 seconds

9. View the table type (external table)
hive> desc formatted db_hive_0927.dept_external;
OK
# col_name            	data_type           	comment             
	 	 
deptno              	int                 	                    
dname               	string              	                    
loc                 	string              	                    
	 	 
# Detailed Table Information	 	 
Database:           	db_hive_0927        	 
Owner:              	root                	 
CreateTime:         	Mon Oct 19 06:16:47 PDT 2015	 
LastAccessTime:     	UNKNOWN             	 
Protect Mode:       	None                	 
Retention:          	0                   	 
Location:           	hdfs://cluster/user/hive/warehouse/db_hive_0927.db/dept_external	 
Table Type:         	EXTERNAL_TABLE      	 
Table Parameters:	 	 
	EXTERNAL            	TRUE                
	transient_lastDdlTime	1445260607          
	 	 
# Storage Information	 	 
SerDe Library:      	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe	 
InputFormat:        	org.apache.hadoop.mapred.TextInputFormat	 
OutputFormat:       	org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat	 
Compressed:         	No                  	 
Num Buckets:        	-1                  	 
Bucket Columns:     	[]                  	 
Sort Columns:       	[]                  	 
Storage Desc Params:	 	 
	field.delim         	\t                  
	serialization.format	\t                  
Time taken: 0.384 seconds, Fetched: 30 row(s)

10. View the table type (managed table), using the dept table created earlier
hive> desc formatted default.dept;              
OK
# col_name            	data_type           	comment             
	 	 
deptno              	int                 	                    
dname               	string              	                    
loc                 	string              	                    
	 	 
# Detailed Table Information	 	 
Database:           	default             	 
Owner:              	root                	 
CreateTime:         	Mon Oct 19 05:46:11 PDT 2015	 
LastAccessTime:     	UNKNOWN             	 
Protect Mode:       	None                	 
Retention:          	0                   	 
Location:           	hdfs://cluster/user/hive/warehouse/dept	 
Table Type:         	MANAGED_TABLE       	 
Table Parameters:	 	 
	COLUMN_STATS_ACCURATE	true                
	numFiles            	1                   
	numRows             	0                   
	rawDataSize         	0                   
	totalSize           	79                  
	transient_lastDdlTime	1445259404          
	 	 
# Storage Information	 	 
SerDe Library:      	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe	 
InputFormat:        	org.apache.hadoop.mapred.TextInputFormat	 
OutputFormat:       	org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat	 
Compressed:         	No                  	 
Num Buckets:        	-1                  	 
Bucket Columns:     	[]                  	 
Sort Columns:       	[]                  	 
Storage Desc Params:	 	 
	field.delim         	\t                  
	serialization.format	\t                  
Time taken: 0.383 seconds, Fetched: 34 row(s)
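
The CREATE_TIME column in the MySQL TBLS table and the CreateTime line printed by desc formatted appear to be the same instant in two forms: seconds since the Unix epoch versus a formatted local time. The Python sketch below converts the dept table's CREATE_TIME value (1445258771, from the TBLS output in section 6) for comparison. The fixed UTC-7 offset is an assumption based on the "PDT" label in the output above, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the cluster clock reports PDT, modeled here as a fixed UTC-7 offset.
PDT = timezone(timedelta(hours=-7), "PDT")

def create_time_to_local(epoch_seconds, tz=PDT):
    """Convert a TBLS.CREATE_TIME value (epoch seconds) to a local datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=tz)

dept_created = create_time_to_local(1445258771)
print(dept_created.isoformat())
```

The result can be compared with the CreateTime value shown for default.dept above.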

11. What is the difference between external tables and managed tables?
Load data into the external table and query both tables:
hive> load data local inpath '/opt/hive-0.13.1/dept.txt' overwrite into table db_hive_0927.dept_external;
Copying data from file:/opt/hive-0.13.1/dept.txt
Copying file: file:/opt/hive-0.13.1/dept.txt
Loading data to table db_hive_0927.dept_external
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://cluster/user/hive/warehouse/db_hive_0927.db/dept_external
Table db_hive_0927.dept_external stats: [numFiles=1, numRows=0, totalSize=79, rawDataSize=0]
OK
Time taken: 2.687 seconds
hive> select * from db_hive_0927.dept_external;
OK
10	ACCOUNTING	NEW YORK
20	RESEARCH	DALLAS
30	SALES	CHICAGO
40	OPERATIONS	BOSTON
Time taken: 0.405 seconds, Fetched: 4 row(s)
hive> select * from default.dept;              
OK
10	ACCOUNTING	NEW YORK
20	RESEARCH	DALLAS
30	SALES	CHICAGO
40	OPERATIONS	BOSTON
Time taken: 0.37 seconds, Fetched: 4 row(s)

Conclusion:
Dropping an external table deletes its metadata, but the data on HDFS is not deleted.
Dropping a managed table deletes both its metadata and its data on HDFS.
First, look at the metadata:
mysql> select * from TBLS;

+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+----------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME        | TBL_TYPE       | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+----------------+--------------------+--------------------+
|      1 |  1445150880 |     1 |                0 | root  |         0 |     1 | student         | MANAGED_TABLE  | NULL               | NULL               |
|      2 |  1445169777 |     3 |                0 | root  |         0 |     2 | student         | MANAGED_TABLE  | NULL               | NULL               |
|      6 |  1445182413 |     1 |                0 | root  |         0 |     6 | weblog          | MANAGED_TABLE  | NULL               | NULL               |
|      7 |  1445183058 |     1 |                0 | root  |         0 |     7 | weblog_20150923 | MANAGED_TABLE  | NULL               | NULL               |
|      9 |  1445258345 |     1 |                0 | root  |         0 |     9 | emp             | MANAGED_TABLE  | NULL               | NULL               |
|     10 |  1445258771 |     1 |                0 | root  |         0 |    10 | dept            | MANAGED_TABLE  | NULL               | NULL               |
|     11 |  1445260607 |     6 |                0 | root  |         0 |    11 | dept_external   | EXTERNAL_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+----------------+--------------------+--------------------+
7 rows in set (0.00 sec)
Data on HDFS:
hive> dfs -ls /user/hive/warehouse/db_hive_0927.db;
Found 1 items
drwxr-xr-x   - root supergroup          0 2015-10-19 06:34 /user/hive/warehouse/db_hive_0927.db/dept_external
hive> dfs -ls /user/hive/warehouse/dept;
Found 1 items
-rw-r--r--   2 root supergroup         79 2015-10-19 05:56 /user/hive/warehouse/dept/dept.txt

Next, drop both tables:
hive>  drop table default.dept ;
OK
Time taken: 2.27 seconds
hive> drop table db_hive_0927.dept_external ;
OK
Time taken: 0.479 seconds

View the metadata:
mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME        | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
|      1 |  1445150880 |     1 |                0 | root  |         0 |     1 | student         | MANAGED_TABLE | NULL               | NULL               |
|      2 |  1445169777 |     3 |                0 | root  |         0 |     2 | student         | MANAGED_TABLE | NULL               | NULL               |
|      6 |  1445182413 |     1 |                0 | root  |         0 |     6 | weblog          | MANAGED_TABLE | NULL               | NULL               |
|      7 |  1445183058 |     1 |                0 | root  |         0 |     7 | weblog_20150923 | MANAGED_TABLE | NULL               | NULL               |
|      9 |  1445258345 |     1 |                0 | root  |         0 |     9 | emp             | MANAGED_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+-------+-----------+-------+-----------------+---------------+--------------------+--------------------+
5 rows in set (0.01 sec)

The metadata for both tables has been deleted.
View the data on HDFS:
hive> dfs -ls /user/hive/warehouse/;    

Found 7 items
drwxr-xr-x   - root supergroup          0 2015-10-18 05:04 /user/hive/warehouse/db_hive.db
drwxr-xr-x   - root supergroup          0 2015-10-19 06:34 /user/hive/warehouse/db_hive_0927.db
drwxr-xr-x   - root supergroup          0 2015-10-19 05:56 /user/hive/warehouse/emp
drwxr-xr-x   - root supergroup          0 2015-10-17 23:50 /user/hive/warehouse/hello.db
drwxr-xr-x   - root supergroup          0 2015-10-17 23:48 /user/hive/warehouse/student
drwxr-xr-x   - root supergroup          0 2015-10-18 08:34 /user/hive/warehouse/weblog
drwxr-xr-x   - root supergroup          0 2015-10-18 08:44 /user/hive/warehouse/weblog_20150923
The managed table's data (the dept directory) is gone.
hive> dfs -ls /user/hive/warehouse/db_hive_0927.db;
Found 1 items
drwxr-xr-x   - root supergroup          0 2015-10-19 06:34 /user/hive/warehouse/db_hive_0927.db/dept_external

The external table's data is still on HDFS!
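
The behavior observed in this section can be summarized with a toy model in Python. This is only a sketch of the rule, not Hive's implementation: the metastore is a plain dict, HDFS is a set of paths, and drop_table is a hypothetical function.

```python
def drop_table(metastore, hdfs_paths, name):
    """Toy rule: metadata is always removed; data only for managed tables."""
    table = metastore.pop(name)
    if table["type"] == "MANAGED_TABLE":
        hdfs_paths.discard(table["location"])
    return table

# The two tables from this section, with the locations shown by desc formatted.
metastore = {
    "default.dept": {
        "type": "MANAGED_TABLE",
        "location": "/user/hive/warehouse/dept",
    },
    "db_hive_0927.dept_external": {
        "type": "EXTERNAL_TABLE",
        "location": "/user/hive/warehouse/db_hive_0927.db/dept_external",
    },
}
hdfs_paths = {t["location"] for t in metastore.values()}

drop_table(metastore, hdfs_paths, "default.dept")
drop_table(metastore, hdfs_paths, "db_hive_0927.dept_external")
print(metastore)   # metadata for both tables is gone
print(hdfs_paths)  # only the external table's directory is left
```
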
12. Finally, you can specify a location when creating an external table
drop table if exists db_hive_0927.xiaoming;
create EXTERNAL table db_hive_0927.xiaoming( deptno int, dname string, loc string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/hive/warehouse/db_hive_0927.db/xiaoming';
hive> drop table if exists db_hive_0927.xiaoming ;
OK
Time taken: 0.173 seconds
hive> create EXTERNAL table db_hive_0927.xiaoming(
    > deptno int,
    > dname string,
    > loc string
    > )
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    > LOCATION '/user/hive/warehouse/db_hive_0927.db/xiaoming';
OK
Time taken: 0.396 seconds
hive> dfs -ls /user/hive/warehouse/db_hive_0927.db/;        
Found 3 items
drwxr-xr-x   - root supergroup          0 2015-10-19 06:58 /user/hive/warehouse/db_hive_0927.db/aa
drwxr-xr-x   - root supergroup          0 2015-10-19 06:34 /user/hive/warehouse/db_hive_0927.db/dept_external
drwxr-xr-x   - root supergroup          0 2015-10-19 07:00 /user/hive/warehouse/db_hive_0927.db/xiaoming
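
For comparison, the Location values printed earlier by desc formatted suggest a naming pattern for tables created without a LOCATION clause. The Python sketch below reproduces that pattern as inferred from this walkthrough only; the function name and the WAREHOUSE_DIR constant are assumptions, and the hdfs://cluster prefix is left out.

```python
import posixpath

# Assumption: warehouse root as seen in the dfs -ls output of this walkthrough.
WAREHOUSE_DIR = "/user/hive/warehouse"

def default_table_location(database, table, warehouse_dir=WAREHOUSE_DIR):
    """Path pattern observed above for tables created without LOCATION."""
    if database == "default":
        return posixpath.join(warehouse_dir, table)
    return posixpath.join(warehouse_dir, database + ".db", table)

print(default_table_location("default", "dept"))
print(default_table_location("db_hive_0927", "dept_external"))
print(default_table_location("db_hive_0927", "xiaoming"))
```

Under this pattern, the LOCATION given for xiaoming is the same directory the table would have received by default; any other HDFS directory could be passed in the LOCATION clause instead.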
