Hadoop Data Types


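We know that Hadoop is written in Java, so it is natural to use a Java development environment to work with HDFS and to write MapReduce jobs. Hadoop, however, wraps Java's data types in types of its own, so each Hadoop data type corresponds to a Java data type. The comparison below lines them up.

1. Introduction to Hadoop data types

(1) The types in the org.apache.hadoop.io package fall into two main groups: basic types and other types.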

(2) Basic types (Hadoop : Java)

       Data type                 Hadoop type        Java type
       Boolean                   BooleanWritable    boolean
       Integer                   IntWritable        int
       Single-precision float    FloatWritable      float
       Double-precision float    DoubleWritable     double
       Byte                      ByteWritable       byte

A Java value can be converted to its Hadoop type in two ways: by calling set() on an existing object, or by creating a new object with new.

(3) Other types (partial list)

       Text             corresponds to Java's String
       ArrayWritable    corresponds to a Java array

1. Every Hadoop data type must implement the Writable interface.
2. Java types and the commonly used Hadoop basic types they map to:
       Long       LongWritable
       Integer    IntWritable
       Boolean    BooleanWritable
       String     Text
       Q: How do you convert a Java type to a Hadoop basic type?
       A: Call the Hadoop type's constructor, or call its set() method.
              new LongWritable(123L);
       Q: How do you convert a Hadoop basic type to a Java type?
       A: For Text, call toString(); for the other types, call get().
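
The two directions of conversion can be tried without a Hadoop installation. The sketch below uses LongBox, a hypothetical stand-in written for this post (it is not a Hadoop class), to show the constructor/set() path into the wrapper and the get()/toString() path back out. With Hadoop on the classpath, the corresponding real calls are new LongWritable(123L), set(long) and get() on LongWritable, and toString() on Text.

```java
// Hypothetical stand-in for a Hadoop wrapper type such as LongWritable.
// It only mirrors the constructor / set() / get() shape described above.
class LongBox {
    private long value;

    LongBox() { }

    LongBox(long value) {
        this.value = value;
    }

    void set(long value) {
        this.value = value;
    }

    long get() {
        return value;
    }

    @Override
    public String toString() {
        return Long.toString(value);
    }
}

class ConversionDemo {
    // Sums the inputs while reusing a single wrapper object.
    static long sumThroughOneBox(long[] values) {
        LongBox box = new LongBox();
        long total = 0;
        for (long v : values) {
            box.set(v);          // Java value -> wrapper, via set()
            total += box.get();  // wrapper -> Java value, via get()
        }
        return total;
    }

    public static void main(String[] args) {
        LongBox box = new LongBox(123L);   // Java value -> wrapper, via the constructor
        System.out.println(box.get());
        System.out.println(sumThroughOneBox(new long[] {1L, 2L, 3L}));
    }
}
```

Because the wrapper is mutable, one object can carry many values in turn through set(), which is why both the constructor and set() are offered.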

Custom Hadoop Data Types

There are generally two ways to define a custom data type in Hadoop. The simpler one works only for values; the other works for both keys and values.
1. Implement the Writable interface:

/* DataInput and DataOutput are interfaces in java.io */
public interface Writable {
  void readFields(DataInput in) throws IOException;
  void write(DataOutput out) throws IOException;
}

Here is a small example.

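import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class Point3D implements Writable {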
  public float x, y, z;
 
  public Point3D(float fx, float fy, float fz) {
         this.x = fx;
         this.y = fy;
         this.z = fz;
  }
 
  public Point3D() {
         this(0.0f, 0.0f, 0.0f);
  }
 
  public void readFields(DataInput in) throws IOException {
         x = in.readFloat();
         y = in.readFloat();
         z = in.readFloat();
  }
 
  public void write(DataOutput out) throws IOException {
         out.writeFloat(x);
         out.writeFloat(y);
         out.writeFloat(z);
  }
 
  public String toString() {
         return Float.toString(x) + ", "
                + Float.toString(y) + ", "
                + Float.toString(z);
  }
}
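
To see what write() and readFields() actually do, it helps to push a record through a byte buffer and read it back. The following is a minimal sketch under stated assumptions: SimpleWritable is a local stand-in with the same two methods as Hadoop's Writable, declared only so the code runs without Hadoop, and PointRecord and RoundTripDemo are hypothetical names that reuse the field layout of the Point3D class above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Local stand-in for org.apache.hadoop.io.Writable (an assumption made
// so that this sketch compiles with the JDK alone).
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Same three float fields as Point3D, under a hypothetical name.
class PointRecord implements SimpleWritable {
    float x, y, z;

    PointRecord() { }

    PointRecord(float x, float y, float z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeFloat(x);
        out.writeFloat(y);
        out.writeFloat(z);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields must be read in the same order they were written.
        x = in.readFloat();
        y = in.readFloat();
        z = in.readFloat();
    }
}

class RoundTripDemo {
    // Serializes a record into a byte array.
    static byte[] toBytes(SimpleWritable w) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            w.write(out);
        }
        return buffer.toByteArray();
    }

    // Overwrites the fields of an existing record from a byte array.
    static void fromBytes(SimpleWritable w, byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            w.readFields(in);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = toBytes(new PointRecord(1.0f, 2.0f, 3.0f));
        PointRecord copy = new PointRecord();   // created empty, then filled
        fromBytes(copy, bytes);
        System.out.println(bytes.length + " bytes: " + copy.x + ", " + copy.y + ", " + copy.z);
    }
}
```

The round trip also shows why the example class keeps a no-argument constructor: the receiving side first creates an empty object and only then fills it through readFields().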
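2. Implement the WritableComparable interface. A type that is used as a key must also be comparable, because Hadoop sorts keys. As the name suggests, WritableComparable is part Writable and part Comparable.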
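public interface WritableComparable<T> extends Writable, Comparable<T> {
  // Inherited from Writable:
  //   void readFields(DataInput in) throws IOException;
  //   void write(DataOutput out) throws IOException;
  // Inherited from Comparable<T>:
  //   int compareTo(T other);
}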
Here is an example that implements this interface.
public class Point3D implements WritableComparable<Point3D> {
  public float x, y, z;
 
  public Point3D(float fx, float fy, float fz) {
         this.x = fx;
         this.y = fy;
         this.z = fz;
  }
 
  public Point3D() {
         this(0.0f, 0.0f, 0.0f);
  }
 
  public void readFields(DataInput in) throws IOException {
         x = in.readFloat();
         y = in.readFloat();
         z = in.readFloat();
  }
 
  public void write(DataOutput out) throws IOException {
         out.writeFloat(x);
         out.writeFloat(y);
         out.writeFloat(z);
  }
 
  public String toString() {
         return Float.toString(x) + ", "
                + Float.toString(y) + ", "
                + Float.toString(z);
  }
 
  public float distanceFromOrigin() {
         return (float) Math.sqrt( x*x + y*y +z*z);
  }
 
  public int compareTo(Point3D other) {
         // Float.compare is the static two-argument comparison
         return Float.compare(
                distanceFromOrigin(),
                other.distanceFromOrigin());
  }
 
  public boolean equals(Object o) {
         if( !(o instanceof Point3D)) {
                return false;
         }
         Point3D other = (Point3D) o;
         return this.x == other.x
                && this.y == other.y
                && this.z == other.z;
  }
 
  /* Override hashCode() as well:
   * Hadoop Partitioners use it to assign keys to reducers.
   */
  public int hashCode() {
         return Float.floatToIntBits(x)
                ^ Float.floatToIntBits(y)
                ^ Float.floatToIntBits(z);
  }
}
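
The two methods that matter for keys, compareTo() and hashCode(), can be exercised with the JDK alone. The sketch below is illustrative only: OriginPoint and KeyOrderDemo are hypothetical names that copy the comparison and hashing logic of the class above, and partitionFor() reproduces the arithmetic of Hadoop's default HashPartitioner (clear the sign bit of the hash code, then take it modulo the number of reduce tasks).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical key type with the same ordering, equality and hashing
// as the Point3D example; serialization is left out to keep it short.
class OriginPoint implements Comparable<OriginPoint> {
    final float x, y, z;

    OriginPoint(float x, float y, float z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    float distanceFromOrigin() {
        return (float) Math.sqrt(x * x + y * y + z * z);
    }

    @Override
    public int compareTo(OriginPoint other) {
        return Float.compare(distanceFromOrigin(), other.distanceFromOrigin());
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof OriginPoint)) {
            return false;
        }
        OriginPoint other = (OriginPoint) o;
        return x == other.x && y == other.y && z == other.z;
    }

    @Override
    public int hashCode() {
        return Float.floatToIntBits(x)
                ^ Float.floatToIntBits(y)
                ^ Float.floatToIntBits(z);
    }
}

class KeyOrderDemo {
    // Same arithmetic as the default HashPartitioner.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Returns a copy of the keys in compareTo() order.
    static List<OriginPoint> sorted(List<OriginPoint> keys) {
        List<OriginPoint> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<OriginPoint> keys = Arrays.asList(
                new OriginPoint(3f, 4f, 0f),
                new OriginPoint(1f, 0f, 0f),
                new OriginPoint(0f, 2f, 0f));
        for (OriginPoint p : sorted(keys)) {
            System.out.println(p.distanceFromOrigin() + " -> partition " + partitionFor(p, 4));
        }
    }
}
```

Two keys that are equal must return the same hashCode(), otherwise identical keys could be sent to different reducers.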

After defining a custom data type, you must tell Hadoop about it explicitly. This is the job of JobConf: call its setOutputKeyClass() and setOutputValueClass() methods.

void setOutputKeyClass(Class theClass)

void setOutputValueClass(Class theClass)

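By default, these methods apply to the output of both the Map and the Reduce phase. When the map output types differ from the final output types, JobConf also provides dedicated setMapOutputKeyClass() and setMapOutputValueClass() methods.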
