Create Hive tables with Spark + HWC and automatically reflect the metadata in Atlas



On HDP 3.1.x, you can create Hive tables from Spark + HWC and have their metadata reflected in Atlas automatically.


1) Integrate Atlas with Hive

2) Prepare HWC

Prerequisites: Hive Warehouse Connector (HWC) and low-latency analytical processing (LLAP)
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.…html

3) Spark - Hive integration

Add the Spark settings:
Set the values of these properties as follows:

spark.sql.hive.hiveserver2.jdbc.url
In Ambari, copy the value from Services > Hive > Summary > HIVESERVER2 INTERACTIVE JDBC URL.

spark.datasource.hive.warehouse.metastoreUri
Copy the value from hive.metastore.uris. In Hive, at the hive> prompt, enter set hive.metastore.uris and copy the output. For example, thrift://

spark.hive.llap.daemon.service.hosts
Copy the value from Advanced hive-interactive-site > hive.llap.daemon.service.hosts.

spark.hive.zookeeper.quorum
Copy the value from Advanced hive-site > hive.zookeeper.quorum.
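
As a rough illustration of how these four values might be passed to spark-shell, the following sketch renders them as `--conf` arguments. The property names are the ones used by the HDP 3.1 HWC configuration; the object name `HwcConfSketch`, the helper `toConfArgs`, and the angle-bracket placeholder values are assumptions made for this example and need to be replaced with the values copied from Ambari.

```scala
// A minimal sketch, not the author's setup: collects the HWC-related Spark
// properties and renders them as "--conf key=value" arguments.
object HwcConfSketch {
  // Placeholder values (hypothetical): substitute the values copied from Ambari.
  val hwcConf: Seq[(String, String)] = Seq(
    "spark.sql.hive.hiveserver2.jdbc.url"          -> "<HIVESERVER2 INTERACTIVE JDBC URL>",
    "spark.datasource.hive.warehouse.metastoreUri" -> "<hive.metastore.uris>",
    "spark.hive.llap.daemon.service.hosts"         -> "<hive.llap.daemon.service.hosts>",
    "spark.hive.zookeeper.quorum"                  -> "<hive.zookeeper.quorum>"
  )

  // Each pair becomes two arguments: "--conf" followed by "key=value".
  def toConfArgs(conf: Seq[(String, String)]): Seq[String] =
    conf.flatMap { case (key, value) => Seq("--conf", s"$key=$value") }

  def main(args: Array[String]): Unit =
    // Print one argument pair per line for copying into a spark-shell command.
    toConfArgs(hwcConf).grouped(2).foreach(pair => println(pair.mkString(" ")))
}
```

When pasting the result into a shell command, each value probably needs quoting, since the HiveServer2 Interactive JDBC URL normally contains `;` characters. The same properties can instead be set once in the Spark configuration in Ambari.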


4) Verify operation

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.…html
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.…html
  • Integrating Apache Hive with Apache Spark - Hive Warehouse Connector
  • [centos@zzeng-hdp-1 ~/git/ops/hwx-field-cloud/hdp]$ spark-shell --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    19/11/30 04:21:12 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
    Spark context Web UI available at
    Spark context available as 'sc' (master = yarn, app id = application_1575083036450_0018).
    Spark session available as 'spark'.
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version
    Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
    Type in expressions to have them evaluated.
    Type :help for more information.
    scala> import com.hortonworks.hwc.HiveWarehouseSession
    import com.hortonworks.hwc.HiveWarehouseSession
    scala> import com.hortonworks.hwc.HiveWarehouseSession._
    import com.hortonworks.hwc.HiveWarehouseSession._
    scala> val hive = HiveWarehouseSession.session(spark).build()
    hive: com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl = com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl@7b88dd58
    scala> hive.createDatabase("zzeng3", false);
    scala> hive.setDatabase("zzeng3")
    scala> hive.createTable("web_sales").ifNotExists().column("sold_time_sk", "bigint").column("ws_ship_date_sk", "bigint").create()
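
To confirm from the Spark side that the table was created, a short check along the following lines could be run in the same kind of spark-shell session. This is a sketch under the assumption that the shell was started with the HWC assembly jar and the settings from step 3, so that `spark` is available and the connector can reach HiveServer2 Interactive; the database and table names are the ones used in the session above.

```scala
// Run inside spark-shell started with the HWC assembly jar (as in the session above).
import com.hortonworks.hwc.HiveWarehouseSession

// Build a HiveWarehouseSession on top of the existing SparkSession `spark`.
val hive = HiveWarehouseSession.session(spark).build()

// Switch to the database created earlier.
hive.setDatabase("zzeng3")

// List the tables in the current database; web_sales should be among them.
hive.showTables().show(false)

// Show the columns and types of the new table.
hive.describeTable("web_sales").show(false)
```

Both calls return a DataFrame, so `show(false)` prints the rows without truncating long values.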

    Atlas display:
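
Besides the Atlas UI, the new table can probably also be looked up through the Atlas REST API. The sketch below only builds the lookup strings; it does not call Atlas. It assumes the Atlas v2 basic-search endpoint and the usual `db.table@clusterName` form of a Hive table's qualifiedName. The object name `AtlasLookupSketch`, the host `atlas-host.example.com`, port 21000, and the cluster name `mycluster` are hypothetical and depend on the installation.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

// A minimal sketch for locating the table in Atlas; it builds strings only.
object AtlasLookupSketch {
  // Hive tables are registered in Atlas under a qualifiedName of the form
  // db.table@clusterName (database and table names in lower case).
  def qualifiedName(db: String, table: String, cluster: String): String =
    s"${db.toLowerCase}.${table.toLowerCase}@$cluster"

  // URL for a basic search restricted to hive_table entities.
  def searchUrl(atlasBase: String, table: String): String = {
    val query = URLEncoder.encode(table, StandardCharsets.UTF_8.name())
    s"$atlasBase/api/atlas/v2/search/basic?typeName=hive_table&query=$query"
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical host, port, and cluster name: adjust to the actual environment.
    println(searchUrl("http://atlas-host.example.com:21000", "web_sales"))
    println(qualifiedName("zzeng3", "web_sales", "mycluster"))
  }
}
```

The printed URL can be opened with any HTTP client using Atlas credentials; if the Hive hook is working, a `hive_table` entity for `web_sales` should appear in the response.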

    https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.…html
    1) Only ORC tables are supported
    2) Spark Thrift Server is not supported
