Creating Hive tables with Spark + HWC and automatically reflecting the metadata in Atlas
Summary
On HDP 3.1.x, you can create Hive tables from Spark + HWC and have their metadata automatically reflected in Atlas.
Steps:
1) Integrate Atlas with Hive
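In HDP 3.1 this integration is delivered by the Atlas Hive hook, which publishes Hive DDL changes to Atlas through Kafka. As a minimal sketch, assuming an Ambari-managed cluster (Ambari normally sets this for you when the Atlas hook is enabled under Services > Hive > Configs), the relevant hive-site.xml property is:

hive.exec.post.hooks=org.apache.atlas.hive.hook.HiveHook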
2) Prepare HWC
Prerequisites: Hive Warehouse Connector (HWC) and low-latency analytical processing (LLAP)
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.…
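The HWC assembly jar ships with HDP. As a sketch, you can locate it on a cluster node as follows (the version suffix differs per HDP release; the session later in this article uses the 3.1.4.0-315 build):

ls /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-*.jar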
3) Integrate Spark with Hive
Add the following Spark settings:
Set the values of these properties as follows:
spark.sql.hive.hiveserver2.jdbc.url
In Ambari, copy the value from Services > Hive > Summary > HIVESERVER2 INTERACTIVE JDBC URL.
spark.datasource.hive.warehouse.metastoreUri
Copy the value from hive.metastore.uris. In Hive, at the hive> prompt, enter set hive.metastore.uris and copy the output. For example, thrift://mycluster-1.com:9083.
spark.hadoop.hive.llap.daemon.service.hosts
Copy the value from Advanced hive-interactive-site > hive.llap.daemon.service.hosts.
spark.hadoop.hive.zookeeper.quorum
Copy the value from Advanced hive-site > hive.zookeeper.quorum.
Example:
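Below is a minimal sketch of the four settings as spark-defaults.conf entries (Ambari: Spark2 > Configs). The hostnames, ports, and the LLAP application name @llap0 are illustrative assumptions, not values from the original article; the same values can also be passed per session via --conf on the spark-shell command line.

spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://host-1.example.com:2181,host-2.example.com:2181,host-3.example.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive
spark.datasource.hive.warehouse.metastoreUri thrift://host-1.example.com:9083
spark.hadoop.hive.llap.daemon.service.hosts @llap0
spark.hadoop.hive.zookeeper.quorum host-1.example.com:2181,host-2.example.com:2181,host-3.example.com:2181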
4) Verify the operation
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.…
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.…operations.html
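The spark-shell session below creates a database and a table through HWC; the new entities should then appear in Atlas.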
[centos@zzeng-hdp-1 ~/git/ops/hwx-field-cloud/hdp]$ spark-shell --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.1.4.0-315.jar
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/11/30 04:21:12 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Spark context Web UI available at http://zzeng-hdp-1.field.hortonworks.com:4041
Spark context available as 'sc' (master = yarn, app id = application_1575083036450_0018).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.2.3.1.4.0-315
      /_/
Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.
scala> import com.hortonworks.hwc.HiveWarehouseSession
import com.hortonworks.hwc.HiveWarehouseSession
scala> import com.hortonworks.hwc.HiveWarehouseSession._
import com.hortonworks.hwc.HiveWarehouseSession._
scala> val hive = HiveWarehouseSession.session(spark).build()
hive: com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl = com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl@7b88dd58
scala> hive.createDatabase("zzeng3", false);
scala> hive.setDatabase("zzeng3")
scala> hive.createTable("web_sales").ifNotExists().column("sold_time_sk", "bigint").column("ws_ship_date_sk", "bigint").create()
scala>
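To confirm the table exists, the same HWC session can query it back. A minimal sketch using the HWC API from the session above (showTables and describeTable return DataFrames):

scala> hive.showTables().show()               // should list web_sales in zzeng3
scala> hive.describeTable("web_sales").show() // the two bigint columns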
Atlas display: the newly created table appears in the Atlas UI.
Limitations:
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.…apache_spark….html
1) Only ORC tables are supported
2) The Spark Thrift Server is not supported
Reference
Original article: https://qiita.com/zzeng/items/67641dd9fb828bb51829