I tried running Apache HDFS/Hadoop as a Single Node Cluster.
By the way, HDFS stands for "Hadoop Distributed File System".
⬛︎ Setting up the HDFS environment
$ sudo apt-get install default-jdk
$ sudo update-alternatives --list java
/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
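As a quick sanity check (optional, not in the original steps), you can confirm that the installed JDK matches the path shown above:
$ java -version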
$ cd $HOME
$ vi .profile
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export PATH=$HOME/hadoop-2.7.2/bin:$HOME/hadoop-2.7.2/sbin:$PATH
$ source .profile
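To make sure the new settings took effect in the current shell, echoing the variable should print the path set above (Hadoop itself is not downloaded yet at this point, so only JAVA_HOME can be checked):
$ echo $JAVA_HOME
/usr/lib/jvm/java-7-openjdk-amd64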
$ sudo apt-get install ssh
$ sudo apt-get install rsync
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
Confirm that ssh localhost no longer asks for a password:
$ ssh localhost
$ exit
$ wget http://www.apache.org/dist/hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz
$ tar xfz hadoop-2.7.2.tar.gz
$ cd hadoop-2.7.2/etc/hadoop/
First, edit core-site.xml.
$ vi core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Next, edit hdfs-site.xml (dfs.replication is set to 1 because a single-node cluster has only one DataNode).
$ vi hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Then edit hadoop-env.sh as well.
$ vi hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
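Before formatting the NameNode, two optional sanity checks can confirm that the hadoop binaries are on the PATH and that the filesystem URI from core-site.xml is picked up (fs.default.name is a deprecated alias that Hadoop resolves to fs.defaultFS, so the second command should print hdfs://localhost:9000):
$ hadoop version
$ hdfs getconf -confKey fs.defaultFS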
$ hdfs namenode -format
16/03/04 03:55:40 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = HDFS/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.7.2
... (snip)
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at HDFS/127.0.1.1
************************************************************/
$ start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/tsubo/hadoop-2.7.2/logs/hadoop-tsubo-namenode-HDFS.out
localhost: starting datanode, logging to /home/tsubo/hadoop-2.7.2/logs/hadoop-tsubo-datanode-HDFS.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/tsubo/hadoop-2.7.2/logs/hadoop-tsubo-secondarynamenode-HDFS.out
$ jps
12767 Jps
12656 SecondaryNameNode
12462 DataNode
12302 NameNode
Open http://localhost:50070/ in a web browser to check that Hadoop is up and running.
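The same status is also available from the command line; dfsadmin -report prints the configured capacity and the live DataNodes (on this single-node setup it should list exactly one):
$ hdfs dfsadmin -report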
⬛︎ Trying out HDFS
$ hdfs dfs -ls /
$ hdfs dfs -mkdir /foo
$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x - tsubo supergroup 0 2016-03-04 04:11 /foo
$ cat /home/tsubo/test.txt
test data
$ hdfs dfs -put /home/tsubo/test.txt /foo
$ hdfs dfs -ls /foo
Found 1 items
-rw-r--r-- 1 tsubo supergroup 10 2016-03-04 04:14 /foo/test.txt
$ hdfs dfs -cat /foo/test.txt
test data
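Copying the file back out of HDFS also works; -get is the counterpart of -put (the local destination below is just an example path):
$ hdfs dfs -get /foo/test.txt /tmp/test_copy.txt
$ cat /tmp/test_copy.txt
test data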
$ hdfs dfs -rm /foo/test.txt
16/03/04 04:20:54 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /foo/test.txt
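The trash log above means fs.trash.interval is 0, i.e. the HDFS trash is disabled, so the file is removed immediately instead of being moved to a .Trash directory. On a setup with trash enabled, the same delete could bypass the trash with the -skipTrash option:
$ hdfs dfs -rm -skipTrash /foo/test.txt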
$ hdfs dfs -rmdir /foo
$ hdfs dfs -ls /
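Besides the hdfs dfs CLI, the NameNode also serves the WebHDFS REST API on port 50070 (dfs.webhdfs.enabled defaults to true in 2.7.x), so the same directory listing can be fetched with curl as an optional extra check:
$ curl -i "http://localhost:50070/webhdfs/v1/?op=LISTSTATUS"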
$ stop-dfs.sh
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
That's all.
Reference
Original article: https://qiita.com/ttsubo/items/829ac9ef783d6fd052bf