Hive DML Syntax

  • Load a file into a table
  • LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]

  • hive> load data local inpath "/home/hadoop/data/deptn.sql" overwrite into table dept;
    Loading data to table default.dept
    Table default.dept stats: [numFiles=1, numRows=0, totalSize=80, rawDataSize=0]
    OK
    Time taken: 2.401 seconds
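    The PARTITION clause in the syntax above targets a specific partition, and dropping LOCAL reads the file from HDFS instead of the local file system (a minimal sketch; the partitioned table emp_partition and the HDFS path are assumptions for illustration):
    load data inpath '/data/emp_deptno10.txt' overwrite into table emp_partition partition (deptno=10);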
    
  • Writing query results to the file system (exporting data): INSERT OVERWRITE [LOCAL] DIRECTORY directory1 [ROW FORMAT row_format] [STORED AS file_format] SELECT ... FROM ...
  • insert overwrite local directory '/home/hadoop/data/outputemp2' 
    row format delimited fields terminated by "\t"
    select * from emp; 
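    Dropping LOCAL writes the result set to an HDFS directory instead (a sketch; the target path is an assumption):
    insert overwrite directory '/tmp/hive_output_emp'
    row format delimited fields terminated by "\t"
    select * from emp;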

    FROM from_statement INSERT OVERWRITE [LOCAL] DIRECTORY directory1 select_statement1 [INSERT OVERWRITE [LOCAL] DIRECTORY directory2 select_statement2] ...  // export to multiple directories in a single pass
    from emp
    INSERT OVERWRITE  LOCAL DIRECTORY '/home/hadoop/tmp/hivetmp1'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
    select empno, ename  
    INSERT OVERWRITE  LOCAL DIRECTORY '/home/hadoop/tmp/hivetmp2'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
    select ename;   
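    The same multi-INSERT pattern can also populate tables in a single scan of the source (a minimal sketch; the target tables emp_names and emp_salaries are assumptions, not defined above):
    from emp
    INSERT OVERWRITE TABLE emp_names select empno, ename
    INSERT INTO TABLE emp_salaries select empno, sal;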

  • Exporting data with hive -e (see the sketch below)
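    Running the query non-interactively and redirecting its standard output is a lightweight way to export (a minimal sketch; the output file path is an assumption for illustration):
    hive -e "select * from emp" > /home/hadoop/data/emp_query_result.txt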
  • Query the average salary of each department: select deptno, avg(salary) from emp group by deptno;

  • select ename, deptno, avg(salary) from emp group by deptno; fails with Expression not in GROUP BY key 'ename' // mind the mapping:
    every column in the SELECT list must either appear in the GROUP BY clause or be wrapped in an aggregate function, as in the sketch below.
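    Either fix works (a hedged illustration; the choice of max(ename) is only for demonstration):
    select deptno, max(ename), avg(salary) from emp group by deptno;      -- aggregate the extra column
    select ename, deptno, avg(salary) from emp group by ename, deptno;    -- or add it to GROUP BY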
    Maximum salary for each department and job: select deptno, job, max(salary) from emp group by deptno, job;
    Departments whose average salary is greater than 2000: select deptno, avg(salary) avg_sal from emp group by deptno having avg_sal > 2000;
    Classify query results with CASE WHEN: select ename, sal, case when sal>1 and sal<=1000 then 'lower' when sal>1000 and sal<=2000 then 'moddle' when sal>2000 and sal<=4000 then 'high' else 'highest' end from emp;
    hive> select ename,sal,
        > case
        > when sal>1 and sal<=1000 then 'lower'
        > when sal>1000 and sal <=2000 then 'moddle'
        > when sal>2000 and sal <=4000 then 'high'
        > else 'highest'
        > from emp;
    FAILED: ParseException line 7:0 missing KW_END at 'from' near ''
    The CASE expression above is missing its closing END keyword; adding it lets the query parse:
    hive> select ename,sal,
        > case
        > when sal>1 and sal<=1000 then 'lower'
        > when sal>1000 and sal <=2000 then 'moddle'
        > when sal>2000 and sal <=4000 then 'high'
        > else 'highest'
        > end
        > from emp;
    OK
    SMITH   800.0   lower
    ALLEN   1600.0  moddle
    WARD    1250.0  moddle
    JONES   2975.0  high
    MARTIN  1250.0  moddle
    BLAKE   2850.0  high
    CLARK   2450.0  high
    SCOTT   3000.0  high
    KING    5000.0  highest
    TURNER  1500.0  moddle
    ADAMS   1100.0  moddle
    JAMES   950.0   lower
    FORD    3000.0  high
    MILLER  1300.0  moddle
    Time taken: 2.191 seconds, Fetched: 14 row(s)
    

    Merging query results with UNION ALL: select count(1) from emp where empno=7566 union all select count(1) from emp where empno=7654;
    hive> select count(1) from emp where empno=7566
        > union all
        > select count(1) from emp where empno=7654;
    Query ID = hadoop_20180109015050_8d760d00-2c6e-4cc4-a99e-5573e64bfc9b
    Total jobs = 3
    Launching Job 1 out of 3
    Number of reduce tasks determined at compile time: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=
    In order to set a constant number of reducers:
      set mapreduce.job.reduces=
    Starting Job = job_1515472546059_0002, Tracking URL = http://hadoop:8088/proxy/application_1515472546059_0002/
    Kill Command = /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/bin/hadoop job  -kill job_1515472546059_0002
    Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
    2018-01-09 01:52:52,767 Stage-1 map = 0%,  reduce = 0%
    2018-01-09 01:53:15,692 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.93 sec
    2018-01-09 01:53:33,559 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 8.71 sec
    MapReduce Total cumulative CPU time: 8 seconds 710 msec
    Ended Job = job_1515472546059_0002
    Launching Job 2 out of 3
    Number of reduce tasks determined at compile time: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=
    In order to set a constant number of reducers:
      set mapreduce.job.reduces=
    Starting Job = job_1515472546059_0003, Tracking URL = http://hadoop:8088/proxy/application_1515472546059_0003/
    Kill Command = /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/bin/hadoop job  -kill job_1515472546059_0003
    Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 1
    2018-01-09 01:53:52,732 Stage-3 map = 0%,  reduce = 0%
    2018-01-09 01:54:15,733 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 7.23 sec
    2018-01-09 01:54:37,718 Stage-3 map = 100%,  reduce = 100%, Cumulative CPU 9.75 sec
    MapReduce Total cumulative CPU time: 9 seconds 750 msec
    Ended Job = job_1515472546059_0003
    Launching Job 3 out of 3
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_1515472546059_0004, Tracking URL = http://hadoop:8088/proxy/application_1515472546059_0004/
    Kill Command = /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/bin/hadoop job  -kill job_1515472546059_0004
    Hadoop job information for Stage-2: number of mappers: 2; number of reducers: 0
    2018-01-09 01:54:56,536 Stage-2 map = 0%,  reduce = 0%
    2018-01-09 01:55:21,218 Stage-2 map = 50%,  reduce = 0%, Cumulative CPU 2.49 sec
    2018-01-09 01:55:22,329 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 4.81 sec
    MapReduce Total cumulative CPU time: 4 seconds 810 msec
    Ended Job = job_1515472546059_0004
    MapReduce Jobs Launched: 
    Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 8.71 sec   HDFS Read: 7881 HDFS Write: 114 SUCCESS
    Stage-Stage-3: Map: 1  Reduce: 1   Cumulative CPU: 9.75 sec   HDFS Read: 7886 HDFS Write: 114 SUCCESS
    Stage-Stage-2: Map: 2   Cumulative CPU: 4.81 sec   HDFS Read: 5348 HDFS Write: 4 SUCCESS
    Total MapReduce CPU Time Spent: 23 seconds 270 msec
    OK
    1
    1
    Time taken: 179.507 seconds, Fetched: 2 row(s)
    
    
    

    EXPORT and IMPORT move a table's metadata together with its data, so a table can be migrated between different Hadoop clusters; this gives the exported data portability.
  • export: EXPORT TABLE tablename [PARTITION (part_column="value"[, ...])] TO 'export_target_path' [ FOR replication('eventid') ]
  • import

  • IMPORT [[EXTERNAL] TABLE new_or_original_tablename [PARTITION (part_column="value"[, ...])]] FROM 'source_path' [LOCATION 'import_target_path']
  • Example, exporting and importing a table: export table emp to '/emp/emp.sql'; import table new_emp from '/emp/emp.sql';

  • Exporting and importing a partitioned table: export table emp_dy_partition partition(deptno=30) to '/exprt'; import table new_emp_dy partition(deptno=30) from '/exprt';
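  • The [EXTERNAL] and [LOCATION] options of IMPORT restore the data as an external table at a chosen path (a minimal sketch; the table name and target location are assumptions for illustration):
    import external table new_emp_ext from '/emp/emp.sql' location '/emp/new_emp_ext';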