Sqoop operations: importing Oracle data into Hive

Importing all columns of a table

sqoop import --connect jdbc:oracle:thin:@192.168.1.107:1521:ORCL \
--username SCOTT --password tiger \
--table EMP --hive-import --create-hive-table --hive-table emp -m 1;

If you see an error like this:

ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory EMP already exists

first delete that directory from HDFS: hadoop fs -rmr /user/hadoop/EMP
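The cleanup step can be wrapped in a small helper so that the directory is removed only when it is actually present. This is a minimal sketch, assuming the `hadoop` client is on the PATH; `remove_stale_dir` is a hypothetical helper name, not part of Hadoop or Sqoop. On newer Hadoop releases `-rmr` is deprecated in favor of `-rm -r`.

```shell
# Hypothetical helper: delete a leftover import directory only if it exists.
# Assumes the `hadoop` client is on the PATH.
remove_stale_dir() {
  dir="$1"
  # `hadoop fs -test -d` exits 0 when the path is an existing directory.
  if hadoop fs -test -d "$dir"; then
    hadoop fs -rmr "$dir"
  fi
}

# Usage (path taken from the command above):
# remove_stale_dir /user/hadoop/EMP
```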
If you see an error like this:

FAILED: Error in metadata: AlreadyExistsException(message:Table emp already exists)

If you see an error like this:

hive.HiveImport: Exception in thread "main" java.lang.NoSuchMethodError: org.apache.thrift.EncodingUtils.setBit(BIZ)B

This happens when Hive and HBase are installed under the same path and the lib directories of HBase and Hive contain different Thrift versions: HBase has libthrift-0.x.0.jar and Hive has libthrift-0.x.0.jar. Delete the 0.x.0 jar under HBase and replace it with the 0.x.0 jar that Hive uses. P.S. It is not obvious why HBase is involved at all when Sqoop imports data into Hive.
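Before deleting anything, it may help to list the Thrift jars in both lib directories side by side. The following is a sketch only; it assumes `HIVE_HOME` and `HBASE_HOME` point at the two installations, and `list_thrift_jars` is a hypothetical helper name.

```shell
# Hypothetical helper: print every libthrift jar found in the given lib dirs.
list_thrift_jars() {
  for libdir in "$@"; do
    for jar in "$libdir"/libthrift-*.jar; do
      # When nothing matches, the pattern stays unexpanded; skip it.
      if [ -e "$jar" ]; then
        echo "$jar"
      fi
    done
  done
}

# Usage (HIVE_HOME and HBASE_HOME are assumptions about the environment):
# list_thrift_jars "$HIVE_HOME/lib" "$HBASE_HOME/lib"
```

If the two directories report different versions, that matches the mismatch described above.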

Note: for the AlreadyExistsException above, the Hive table already exists, so it must be dropped first.
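Dropping the table can be done from the shell before re-running the import. A minimal sketch, assuming the `hive` CLI is on the PATH; `drop_hive_table` is a hypothetical helper name.

```shell
# Hypothetical helper: drop a Hive table if it exists, so that a re-run with
# --create-hive-table does not fail with AlreadyExistsException.
drop_hive_table() {
  hive -e "DROP TABLE IF EXISTS $1;"
}

# Usage:
# drop_hive_table emp
```

Note that this discards the data already in the table, so it only suits cases where the import is meant to rebuild the table from scratch.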
Check the result:

desc emp;
empno     double
ename     string
job       string
mgr       double
hiredate  string
sal       double
comm      double
deptno    double

select * from emp;
7369.0  SMITH   CLERK     7902.0  1980-12-17 00:00:00.0   800.0   NULL    20.0
7499.0  ALLEN   SALESMAN  7698.0  1981-02-20 00:00:00.0   1600.0  300.0   30.0
7521.0  WARD    SALESMAN  7698.0  1981-02-22 00:00:00.0   1250.0  500.0   30.0
7566.0  JONES   MANAGER   7839.0  1981-04-02 00:00:00.0   2975.0  NULL    20.0
7654.0  MARTIN  SALESMAN  7698.0  1981-09-28 00:00:00.0   1250.0  1400.0  30.0
……

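Note: in general, do not use --create-hive-table to create the table, because the column types it generates do not meet our requirements (for example, empno, mgr and deptno all become double in the output above).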
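If you still want Sqoop to create the table, one option is Sqoop's `--map-column-hive` flag, which overrides the Hive type for the named columns. The sketch below assumes EMPNO, MGR and DEPTNO should become INT; `import_emp_typed` is a hypothetical wrapper name, and the mapping has not been verified against this database.

```shell
# Hypothetical wrapper: same import as above, but with explicit Hive types
# for the numeric key columns instead of the default double.
import_emp_typed() {
  sqoop import --connect jdbc:oracle:thin:@192.168.1.107:1521:ORCL \
    --username SCOTT --password tiger \
    --table EMP \
    --hive-import --create-hive-table --hive-table emp \
    --map-column-hive EMPNO=INT,MGR=INT,DEPTNO=INT \
    -m 1
}

# Usage:
# import_emp_typed
```

Creating the table by hand, as the next section does, remains the more explicit choice because every column type is then visible in the DDL.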

Importing specified columns of a table
Create the Hive table manually:

create table emp_column(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
row format delimited fields terminated by '\t' lines terminated by '\n'
stored as textfile;

sqoop import --connect jdbc:oracle:thin:@192.168.1.107:1521:ORCL \
--username SCOTT --password tiger \
--table EMP --columns "EMPNO,ENAME,JOB,SAL,COMM" \
--fields-terminated-by '\t' --lines-terminated-by '\n' \
--hive-drop-import-delims --hive-import --hive-table emp_column \
-m 3;

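Note: if you run this command again, the data is imported again; every repeated run appends the same rows to the Hive table. Adding --hive-overwrite, as in the next command, avoids this: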

sqoop import --connect jdbc:oracle:thin:@192.168.1.107:1521:ORCL \
--username SCOTT --password tiger \
--table EMP --columns "EMPNO,ENAME,JOB,SAL,COMM" \
--fields-terminated-by '\t' --lines-terminated-by '\n' \
--hive-drop-import-delims --hive-overwrite --hive-import --hive-table emp_column \
-m 3;

Note: --hive-overwrite replaces the records that already exist in the table. In 99% of cases you should use it, so that re-running the job does not produce duplicate data.
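One way to confirm that a re-run did not duplicate rows is to compare the table's row count before and after the import. A minimal sketch, assuming the `hive` CLI is on the PATH; `hive_row_count` is a hypothetical helper name, and `-S` runs the CLI in silent mode so that only the query result is printed.

```shell
# Hypothetical helper: print the number of rows in a Hive table.
hive_row_count() {
  hive -S -e "SELECT COUNT(*) FROM $1;"
}

# Usage: record the count, re-run the import, then compare.
# before=$(hive_row_count emp_column)
# ... run the sqoop import again ...
# after=$(hive_row_count emp_column)
# [ "$before" = "$after" ] && echo "no duplicates introduced"
```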

Importing specified columns of a table into a Hive partitioned table
Create the Hive partitioned table:

create table emp_partition(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
partitioned by (pt string)
row format delimited fields terminated by '\t' lines terminated by '\n'
stored as textfile;

Import into partition pt='2013-08-01':

sqoop import --connect jdbc:oracle:thin:@192.168.1.107:1521:ORCL \
--username SCOTT --password tiger \
--table EMP --columns "EMPNO,ENAME,JOB,SAL,COMM" \
--hive-overwrite --hive-import --hive-table emp_partition \
--fields-terminated-by '\t' --lines-terminated-by '\n' \
--hive-drop-import-delims --hive-partition-key 'pt' --hive-partition-value '2013-08-01' \
-m 3;

Import into partition pt='2013-08-02':

sqoop import --connect jdbc:oracle:thin:@192.168.1.107:1521:ORCL \
--username SCOTT --password tiger \
--table EMP --columns "EMPNO,ENAME,JOB,SAL,COMM" \
--hive-overwrite --hive-import --hive-table emp_partition \
--fields-terminated-by '\t' --lines-terminated-by '\n' \
--hive-drop-import-delims --hive-partition-key 'pt' --hive-partition-value '2013-08-02' \
-m 3;
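Since the two commands above differ only in the partition value, they can be driven from a loop. This is a sketch under the same connection settings as above; `import_emp_partition` and `import_emp_partitions` are hypothetical helper names. As in the commands above, every run loads the same source rows; a real job would probably restrict the rows per day, which is not shown here.

```shell
# Hypothetical helper: import EMP into one partition of emp_partition.
import_emp_partition() {
  pt_value="$1"
  sqoop import --connect jdbc:oracle:thin:@192.168.1.107:1521:ORCL \
    --username SCOTT --password tiger \
    --table EMP --columns "EMPNO,ENAME,JOB,SAL,COMM" \
    --hive-overwrite --hive-import --hive-table emp_partition \
    --fields-terminated-by '\t' --lines-terminated-by '\n' \
    --hive-drop-import-delims \
    --hive-partition-key pt --hive-partition-value "$pt_value" \
    -m 3
}

# Hypothetical helper: import several partitions, stopping at the first failure.
import_emp_partitions() {
  for day in "$@"; do
    import_emp_partition "$day" || return 1
  done
}

# Usage:
# import_emp_partitions 2013-08-01 2013-08-02
```

Because --hive-overwrite is combined with a partition key, each run is expected to replace only the partition it names, so re-running a single day should leave the other partitions alone.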

Query:

select * from emp_partition where pt='2013-08-01';

select * from emp_partition where pt='2013-08-02';
