A look at the Hive and Spark SQL metastore database that Ambari configures automatically

I installed the Hive and Spark clients on an HDP cluster built with Ambari (you only need to tick the checkboxes). When I have time, I will write a post on installing Ambari completely offline and building an HDP cluster with it. Stay tuned~
First, find out where the Spark configuration files live.
[root@ws1dn3 ~]# whereis spark
spark: /etc/spark
[root@ws1dn3 ~]# cd /etc/spark/
[root@ws1dn3 spark]# ll
total 8
drwxr-xr-x 3 root root 4096 Oct  8 11:16 2.4.2.0-258
lrwxrwxrwx 1 root root   34 Oct  8 11:16 conf -> /usr/hdp/current/spark-client/conf
drwxr-xr-x 2 root root 4096 Oct  8 11:16 conf.backup
[root@ws1dn3 spark]# cd conf
[root@ws1dn3 conf]# ll
total 60
-rw-r--r-- 1 root  root   987 Apr 14 03:14 docker.properties.template
-rw-r--r-- 1 root  root  1105 Apr 14 03:14 fairscheduler.xml.template
-rw-r--r-- 1 spark spark  172 Oct  8 11:16 hive-site.xml
-rw-r--r-- 1 spark spark  621 Oct  9 09:48 log4j.properties
-rw-r--r-- 1 root  root  1734 Apr 14 03:14 log4j.properties.template
-rw-r--r-- 1 spark spark 4956 Oct  8 11:16 metrics.properties
-rw-r--r-- 1 root  root  6671 Apr 14 03:14 metrics.properties.template
-rw-r--r-- 1 root  root   865 Apr 14 03:14 slaves.template
-rw-r--r-- 1 spark spark  722 Oct  8 11:16 spark-defaults.conf
-rw-r--r-- 1 root  root  1292 Apr 14 03:14 spark-defaults.conf.template
-rw-r--r-- 1 spark spark 1788 Oct  8 11:16 spark-env.sh
-rwxr-xr-x 1 root  root  4209 Apr 14 03:14 spark-env.sh.template

Take a look at hive-site.xml here. It contains only the client-side settings.
[root@ws1dn3 conf]# cat hive-site.xml 
  <configuration>

    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://ws1dn3.wondersoft.cn:9083</value>
    </property>

  </configuration>

dn3 is the node I chose for the Hive metastore when I set up the cluster.
Next, let's look at the full configuration in Hive's own hive-site.xml.
[root@ws1dn3 conf]# whereis hive
hive: /usr/bin/hive /etc/hive
[root@ws1dn3 conf]# cd /etc/hive/conf
[root@ws1dn3 conf]# ll
total 228
-rw-r--r-- 1 root root     1139 Apr 22 08:14 beeline-log4j.properties.template
drwxr-xr-x 2 hive hadoop   4096 Oct  8 15:25 conf.server
-rw-r--r-- 1 hive hadoop 175716 Apr 25 14:47 hive-default.xml.template
-rw-r--r-- 1 hive hadoop   1759 Oct  8 11:13 hive-env.sh
-rw-r--r-- 1 hive hadoop   2378 Apr 22 08:14 hive-env.sh.template
-rw-r--r-- 1 hive hadoop   2652 Oct  8 11:13 hive-exec-log4j.properties
-rw-r--r-- 1 hive hadoop   3050 Oct  8 11:13 hive-log4j.properties
-rw-r--r-- 1 hive hadoop  19199 Oct  8 11:00 hive-site.xml
-rw-r--r-- 1 root root     1593 Apr 22 08:14 ivysettings.xml
-rw-r--r-- 1 hive hadoop   6529 Oct  8 11:13 mapred-site.xml
[root@ws1dn3 conf]# cat hive-site.xml 
  <configuration>

    <property>
      <name>ambari.hive.db.schema.name</name>
      <value>hive</value>
    </property>

    <property>
      <name>atlas.hook.hive.maxThreads</name>
      <value>1</value>
    </property>

    <property>
      <name>atlas.hook.hive.minThreads</name>
      <value>1</value>
    </property>

    <property>
      <name>datanucleus.autoCreateSchema</name>
      <value>false</value>
    </property>

    <property>
      <name>datanucleus.cache.level2.type</name>
      <value>none</value>
    </property>

    <property>
      <name>datanucleus.fixedDatastore</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.auto.convert.join</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.auto.convert.join.noconditionaltask</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.auto.convert.join.noconditionaltask.size</name>
      <value>1073741824</value>
    </property>

    <property>
      <name>hive.auto.convert.sortmerge.join</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.auto.convert.sortmerge.join.to.mapjoin</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.cbo.enable</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.cli.print.header</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.cluster.delegation.token.store.class</name>
      <value>org.apache.hadoop.hive.thrift.ZooKeeperTokenStore</value>
    </property>

    <property>
      <name>hive.cluster.delegation.token.store.zookeeper.connectString</name>
      <value>ws1dn2.wondersoft.cn:2181,ws1dn1.wondersoft.cn:2181,ws1dn3.wondersoft.cn:2181</value>
    </property>

    <property>
      <name>hive.cluster.delegation.token.store.zookeeper.znode</name>
      <value>/hive/cluster/delegation</value>
    </property>

    <property>
      <name>hive.compactor.abortedtxn.threshold</name>
      <value>1000</value>
    </property>

    <property>
      <name>hive.compactor.check.interval</name>
      <value>300L</value>
    </property>

    <property>
      <name>hive.compactor.delta.num.threshold</name>
      <value>10</value>
    </property>

    <property>
      <name>hive.compactor.delta.pct.threshold</name>
      <value>0.1f</value>
    </property>

    <property>
      <name>hive.compactor.initiator.on</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.compactor.worker.threads</name>
      <value>0</value>
    </property>

    <property>
      <name>hive.compactor.worker.timeout</name>
      <value>86400L</value>
    </property>

    <property>
      <name>hive.compute.query.using.stats</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.conf.restricted.list</name>
      <value>hive.security.authenticator.manager,hive.security.authorization.manager,hive.users.in.admin.role</value>
    </property>

    <property>
      <name>hive.convert.join.bucket.mapjoin.tez</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.default.fileformat</name>
      <value>TextFile</value>
    </property>

    <property>
      <name>hive.default.fileformat.managed</name>
      <value>TextFile</value>
    </property>

    <property>
      <name>hive.enforce.bucketing</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.enforce.sorting</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.enforce.sortmergebucketmapjoin</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.exec.compress.intermediate</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.exec.compress.output</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.exec.dynamic.partition</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.exec.dynamic.partition.mode</name>
      <value>strict</value>
    </property>

    <property>
      <name>hive.exec.failure.hooks</name>
      <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
    </property>

    <property>
      <name>hive.exec.max.created.files</name>
      <value>100000</value>
    </property>

    <property>
      <name>hive.exec.max.dynamic.partitions</name>
      <value>5000</value>
    </property>

    <property>
      <name>hive.exec.max.dynamic.partitions.pernode</name>
      <value>2000</value>
    </property>

    <property>
      <name>hive.exec.orc.compression.strategy</name>
      <value>SPEED</value>
    </property>

    <property>
      <name>hive.exec.orc.default.compress</name>
      <value>ZLIB</value>
    </property>

    <property>
      <name>hive.exec.orc.default.stripe.size</name>
      <value>67108864</value>
    </property>

    <property>
      <name>hive.exec.orc.encoding.strategy</name>
      <value>SPEED</value>
    </property>

    <property>
      <name>hive.exec.parallel</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.exec.parallel.thread.number</name>
      <value>8</value>
    </property>

    <property>
      <name>hive.exec.post.hooks</name>
      <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
    </property>

    <property>
      <name>hive.exec.pre.hooks</name>
      <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
    </property>

    <property>
      <name>hive.exec.reducers.bytes.per.reducer</name>
      <value>67108864</value>
    </property>

    <property>
      <name>hive.exec.reducers.max</name>
      <value>1009</value>
    </property>

    <property>
      <name>hive.exec.scratchdir</name>
      <value>/tmp/hive</value>
    </property>

    <property>
      <name>hive.exec.submit.local.task.via.child</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.exec.submitviachild</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.execution.engine</name>
      <value>tez</value>
    </property>

    <property>
      <name>hive.fetch.task.aggr</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.fetch.task.conversion</name>
      <value>more</value>
    </property>

    <property>
      <name>hive.fetch.task.conversion.threshold</name>
      <value>1073741824</value>
    </property>

    <property>
      <name>hive.limit.optimize.enable</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.limit.pushdown.memory.usage</name>
      <value>0.04</value>
    </property>

    <property>
      <name>hive.map.aggr</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.map.aggr.hash.force.flush.memory.threshold</name>
      <value>0.9</value>
    </property>

    <property>
      <name>hive.map.aggr.hash.min.reduction</name>
      <value>0.5</value>
    </property>

    <property>
      <name>hive.map.aggr.hash.percentmemory</name>
      <value>0.5</value>
    </property>

    <property>
      <name>hive.mapjoin.bucket.cache.size</name>
      <value>10000</value>
    </property>

    <property>
      <name>hive.mapjoin.optimized.hashtable</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.mapred.reduce.tasks.speculative.execution</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.merge.mapfiles</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.merge.mapredfiles</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.merge.orcfile.stripe.level</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.merge.rcfile.block.level</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.merge.size.per.task</name>
      <value>256000000</value>
    </property>

    <property>
      <name>hive.merge.smallfiles.avgsize</name>
      <value>16000000</value>
    </property>

    <property>
      <name>hive.merge.tezfiles</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.metastore.authorization.storage.checks</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.metastore.cache.pinobjtypes</name>
      <value>Table,Database,Type,FieldSchema,Order</value>
    </property>

    <property>
      <name>hive.metastore.client.connect.retry.delay</name>
      <value>5s</value>
    </property>

    <property>
      <name>hive.metastore.client.socket.timeout</name>
      <value>1800s</value>
    </property>

    <property>
      <name>hive.metastore.connect.retries</name>
      <value>24</value>
    </property>

    <property>
      <name>hive.metastore.execute.setugi</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.metastore.failure.retries</name>
      <value>24</value>
    </property>

    <property>
      <name>hive.metastore.kerberos.keytab.file</name>
      <value>/etc/security/keytabs/hive.service.keytab</value>
    </property>

    <property>
      <name>hive.metastore.kerberos.principal</name>
      <value>hive/[email protected]</value>
    </property>

    <property>
      <name>hive.metastore.pre.event.listeners</name>
      <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
    </property>

    <property>
      <name>hive.metastore.sasl.enabled</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.metastore.server.max.threads</name>
      <value>100000</value>
    </property>
    ###################
    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://ws1dn3.wondersoft.cn:9083</value>
    </property>

    <property>
      <name>hive.metastore.warehouse.dir</name>
      <value>/apps/hive/warehouse</value>
    </property>

    <property>
      <name>hive.optimize.bucketmapjoin</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.optimize.bucketmapjoin.sortedmerge</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.optimize.constant.propagation</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.optimize.index.filter</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.optimize.metadataonly</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.optimize.null.scan</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.optimize.reducededuplication</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.optimize.reducededuplication.min.reducer</name>
      <value>4</value>
    </property>

    <property>
      <name>hive.optimize.sort.dynamic.partition</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.orc.compute.splits.num.threads</name>
      <value>10</value>
    </property>

    <property>
      <name>hive.orc.splits.include.file.footer</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.prewarm.enabled</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.prewarm.numcontainers</name>
      <value>3</value>
    </property>

    <property>
      <name>hive.security.authenticator.manager</name>
      <value>org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator</value>
    </property>

    <property>
      <name>hive.security.authorization.enabled</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.security.authorization.manager</name>
      <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory</value>
    </property>

    <property>
      <name>hive.security.metastore.authenticator.manager</name>
      <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
    </property>

    <property>
      <name>hive.security.metastore.authorization.auth.reads</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.security.metastore.authorization.manager</name>
      <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
    </property>

    <property>
      <name>hive.server2.allow.user.substitution</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.server2.authentication</name>
      <value>NONE</value>
    </property>

    <property>
      <name>hive.server2.authentication.spnego.keytab</name>
      <value>HTTP/[email protected]</value>
    </property>

    <property>
      <name>hive.server2.authentication.spnego.principal</name>
      <value>/etc/security/keytabs/spnego.service.keytab</value>
    </property>

    <property>
      <name>hive.server2.enable.doAs</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.server2.logging.operation.enabled</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.server2.logging.operation.log.location</name>
      <value>/tmp/hive/operation_logs</value>
    </property>

    <property>
      <name>hive.server2.support.dynamic.service.discovery</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.server2.table.type.mapping</name>
      <value>CLASSIC</value>
    </property>

    <property>
      <name>hive.server2.tez.default.queues</name>
      <value>default</value>
    </property>

    <property>
      <name>hive.server2.tez.initialize.default.sessions</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.server2.tez.sessions.per.default.queue</name>
      <value>1</value>
    </property>

    <property>
      <name>hive.server2.thrift.http.path</name>
      <value>cliservice</value>
    </property>

    <property>
      <name>hive.server2.thrift.http.port</name>
      <value>10001</value>
    </property>

    <property>
      <name>hive.server2.thrift.max.worker.threads</name>
      <value>500</value>
    </property>

    <property>
      <name>hive.server2.thrift.port</name>
      <value>10000</value>
    </property>

    <property>
      <name>hive.server2.thrift.sasl.qop</name>
      <value>auth</value>
    </property>

    <property>
      <name>hive.server2.transport.mode</name>
      <value>binary</value>
    </property>

    <property>
      <name>hive.server2.use.SSL</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.server2.zookeeper.namespace</name>
      <value>hiveserver2</value>
    </property>

    <property>
      <name>hive.smbjoin.cache.rows</name>
      <value>10000</value>
    </property>

    <property>
      <name>hive.stats.autogather</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.stats.dbclass</name>
      <value>fs</value>
    </property>

    <property>
      <name>hive.stats.fetch.column.stats</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.stats.fetch.partition.stats</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.support.concurrency</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.tez.auto.reducer.parallelism</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.tez.container.size</name>
      <value>3072</value>
    </property>

    <property>
      <name>hive.tez.cpu.vcores</name>
      <value>-1</value>
    </property>

    <property>
      <name>hive.tez.dynamic.partition.pruning</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.tez.dynamic.partition.pruning.max.data.size</name>
      <value>104857600</value>
    </property>

    <property>
      <name>hive.tez.dynamic.partition.pruning.max.event.size</name>
      <value>1048576</value>
    </property>

    <property>
      <name>hive.tez.input.format</name>
      <value>org.apache.hadoop.hive.ql.io.HiveInputFormat</value>
    </property>

    <property>
      <name>hive.tez.java.opts</name>
      <value>-server -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseParallelGC -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps</value>
    </property>

    <property>
      <name>hive.tez.log.level</name>
      <value>INFO</value>
    </property>

    <property>
      <name>hive.tez.max.partition.factor</name>
      <value>2.0</value>
    </property>

    <property>
      <name>hive.tez.min.partition.factor</name>
      <value>0.25</value>
    </property>

    <property>
      <name>hive.tez.smb.number.waves</name>
      <value>0.5</value>
    </property>

    <property>
      <name>hive.txn.manager</name>
      <value>org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager</value>
    </property>

    <property>
      <name>hive.txn.max.open.batch</name>
      <value>1000</value>
    </property>

    <property>
      <name>hive.txn.timeout</name>
      <value>300</value>
    </property>

    <property>
      <name>hive.user.install.directory</name>
      <value>/user/</value>
    </property>

    <property>
      <name>hive.vectorized.execution.enabled</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.vectorized.execution.reduce.enabled</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.vectorized.groupby.checkinterval</name>
      <value>4096</value>
    </property>

    <property>
      <name>hive.vectorized.groupby.flush.percent</name>
      <value>0.1</value>
    </property>

    <property>
      <name>hive.vectorized.groupby.maxentries</name>
      <value>100000</value>
    </property>

    <property>
      <name>hive.zookeeper.client.port</name>
      <value>2181</value>
    </property>

    <property>
      <name>hive.zookeeper.namespace</name>
      <value>hive_zookeeper_namespace</value>
    </property>

    <property>
      <name>hive.zookeeper.quorum</name>
      <value>ws1dn2.wondersoft.cn:2181,ws1dn1.wondersoft.cn:2181,ws1dn3.wondersoft.cn:2181</value>
    </property>

    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
    </property>
  ###################
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://ws1m.wondersoft.cn/hive?createDatabaseIfNotExist=true</value>
    </property>

    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>root</value>
    </property>

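Note the two places I marked with ################### (hive.metastore.uris and javax.jdo.option.ConnectionURL). I am not sure my explanation is entirely right; I learned this from http://duguyiren3476.iteye.com/blog/1632868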
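To check those two properties without scrolling through the whole listing, something like the following could be used. This is only a sketch: get_prop is a hypothetical helper name, and it assumes each <name> line is directly followed by its <value> line, as in the files shown here. The paths are the ones from the sessions above.

```shell
# Print the <value> that follows a given <name> in a Hadoop-style XML config.
# Assumption: <name> and <value> sit on consecutive lines, as in the listings above.
get_prop() {
  # $1 = property name, $2 = path to the XML file
  grep -F -A1 "<name>$1</name>" "$2" | sed -n 's|.*<value>\(.*\)</value>.*|\1|p'
}

spark_conf=/etc/spark/conf/hive-site.xml
hive_conf=/etc/hive/conf/hive-site.xml

# Only run the comparison on a node where both files are readable.
if [ -r "$spark_conf" ] && [ -r "$hive_conf" ]; then
  spark_uri=$(get_prop hive.metastore.uris "$spark_conf")
  hive_uri=$(get_prop hive.metastore.uris "$hive_conf")
  echo "spark client metastore: $spark_uri"
  echo "hive metastore:         $hive_uri"
  if [ "$spark_uri" = "$hive_uri" ]; then
    echo "both clients point at the same metastore service"
  fi
  # The JDBC URL is only present in Hive's own hive-site.xml.
  echo "backing database: $(get_prop javax.jdo.option.ConnectionURL "$hive_conf")"
fi
```

If both hive.metastore.uris values match, the Spark client is talking to the same metastore service as Hive, and the JDBC URL shows which database that service stores its metadata in.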
In any case, the upshot is that Hive and Spark SQL share the same metastore database, so you can work with the tables through whichever one you like. I would go with Spark SQL, though. 0.0
hive:
[root@ws1dn2 ~]# su hdfs
[hdfs@ws1dn2 root]$ hive
WARNING: Use "yarn jar" to launch YARN applications.

Logging initialized using configuration in file:/etc/hive/2.4.2.0-258/0/hive-log4j.properties
hive> show tables;
OK
t_log_2016
test
Time taken: 0.333 seconds, Fetched: 2 row(s)
hive> select count(1) from t_log_2016;
Query ID = hdfs_20161013143029_f605d459-878f-495f-95bf-3a3960537d00
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.


Status: Running (Executing on YARN cluster with App id application_1475896673093_0012)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED     12         12        0        0       0       0
Reducer 2 ......   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 16.56 s    
--------------------------------------------------------------------------------
OK
550734
Time taken: 25.034 seconds, Fetched: 1 row(s)

spark sql:
[hdfs@ws1dn2 root]$ spark-sql 
...
SET hive.support.sql11.reserved.keywords=false
SET spark.sql.hive.version=1.2.1
SET spark.sql.hive.version=1.2.1
spark-sql> show tables;
t_log_2016  false
test    false
Time taken: 1.964 seconds, Fetched 2 row(s)
spark-sql> select count(*) from t_log_2016;
550734                                                                          
Time taken: 2.715 seconds, Fetched 1 row(s)

One takes 25 seconds, the other less than 3. So which one would you use? 0.0 On top of that, Spark SQL can be used together with Scala.
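The comparison above can also be repeated without opening either interactive prompt, since both hive and spark-sql accept a statement through -e. The snippet below is a rough sketch: time_query is a hypothetical helper, it measures whole-second wall-clock time including CLI startup, so the numbers will not match the "Time taken" lines printed by the tools themselves.

```shell
# Run one SQL statement through a CLI that accepts -e and report wall-clock seconds.
time_query() {
  # $1 = CLI name (for example hive or spark-sql), $2 = SQL statement
  start=$(date +%s)
  "$1" -e "$2"
  end=$(date +%s)
  echo "$1: $((end - start)) s"
}

query='select count(*) from t_log_2016'
for cli in hive spark-sql; do
  if command -v "$cli" >/dev/null 2>&1; then
    time_query "$cli" "$query"
  else
    echo "$cli not found on PATH, skipping"
  fi
done
```

Running the loop a couple of times gives a fairer picture, because the first Hive run may also pay for reopening a Tez session, as the log above shows.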
