Creating Hive tables with Spark + HWC and automatically reflecting the metadata in Atlas
Summary
On HDP 3.1.x, you can create Hive tables from Spark + HWC and have their metadata automatically reflected in Atlas.
Steps:
1) Integrate Atlas with Hive
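In HDP 3.1 this integration is delivered by the Atlas Hive hook, which publishes Hive DDL changes to Atlas through Kafka. As a minimal sketch, assuming an Ambari-managed cluster (Ambari normally sets this for you when the Atlas hook is enabled under Services > Hive > Configs), the relevant hive-site.xml property is:

hive.exec.post.hooks=org.apache.atlas.hive.hook.HiveHook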
2) Prepare HWC
Prerequisites: Hive Warehouse Connector (HWC) and low-latency analytical processing (LLAP)
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.…
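The HWC assembly jar ships with HDP. As a sketch, you can locate it on a cluster node as follows (the version suffix differs per HDP release; the session later in this article uses the 3.1.4.0-315 build):

ls /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-*.jar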
3) Integrate Spark with Hive
Add the following Spark settings:
Set the values of these properties as follows:
spark.sql.hive.hiveserver2.jdbc.url
In Ambari, copy the value from Services > Hive > Summary > HIVESERVER2 INTERACTIVE JDBC URL.
spark.datasource.hive.warehouse.metastoreUri
Copy the value from hive.metastore.uris. In Hive, at the hive> prompt, enter set hive.metastore.uris and copy the output. For example, thrift://mycluster-1.com:9083.
spark.hadoop.hive.llap.daemon.service.hosts
Copy the value from Advanced hive-interactive-site > hive.llap.daemon.service.hosts.
spark.hadoop.hive.zookeeper.quorum
Copy the value from Advanced hive-site > hive.zookeeper.quorum.
Example:
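Below is a minimal sketch of the four settings as spark-defaults.conf entries (Ambari: Spark2 > Configs). The hostnames, ports, and the LLAP application name @llap0 are illustrative assumptions, not values from the original article; the same values can also be passed per session via --conf on the spark-shell command line.

spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://host-1.example.com:2181,host-2.example.com:2181,host-3.example.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive
spark.datasource.hive.warehouse.metastoreUri thrift://host-1.example.com:9083
spark.hadoop.hive.llap.daemon.service.hosts @llap0
spark.hadoop.hive.zookeeper.quorum host-1.example.com:2181,host-2.example.com:2181,host-3.example.com:2181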
4) Verify the operation
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.…
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.…operations.html
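The spark-shell session below creates a database and a table through HWC; the new entities should then appear in Atlas.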
[centos@zzeng-hdp-1 ~/git/ops/hwx-field-cloud/hdp]$ spark-shell --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.1.4.0-315.jar
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/11/30 04:21:12 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Spark context Web UI available at http://zzeng-hdp-1.field.hortonworks.com:4041
Spark context available as 'sc' (master = yarn, app id = application_1575083036450_0018).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.2.3.1.4.0-315
      /_/
Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.
scala> import com.hortonworks.hwc.HiveWarehouseSession
import com.hortonworks.hwc.HiveWarehouseSession
scala> import com.hortonworks.hwc.HiveWarehouseSession._
import com.hortonworks.hwc.HiveWarehouseSession._
scala> val hive = HiveWarehouseSession.session(spark).build()
hive: com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl = com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl@7b88dd58
scala> hive.createDatabase("zzeng3", false);
scala> hive.setDatabase("zzeng3")
scala> hive.createTable("web_sales").ifNotExists().column("sold_time_sk", "bigint").column("ws_ship_date_sk", "bigint").create()
scala>
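To confirm the table exists, the same HWC session can query it back. A minimal sketch using the HWC API from the session above (showTables and describeTable return DataFrames):

scala> hive.showTables().show()               // should list web_sales in zzeng3
scala> hive.describeTable("web_sales").show() // the two bigint columns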
Atlas display: the newly created table appears in the Atlas UI.
Limitations:
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.…apache_spark….html
1) Only ORC tables are supported
2) The Spark Thrift Server is not supported
Reference
Original article: https://qiita.com/zzeng/items/67641dd9fb828bb51829