Hadoop HDFS Overview: NameNode, Part 3 --- DatanodeDescriptor

DatanodeDescriptor is the NameNode's abstraction of a DataNode. It is an internal NameNode data structure that works together with BlocksMap and INode to record which blocks every Datanode in the file system holds, along with the corresponding INode information.
DatanodeDescriptor extends DatanodeInfo, and DatanodeInfo extends DatanodeID.
1. DatanodeID
DatanodeID has the following fields.

  public String name;      /// hostname:portNumber
  public String storageID; /// unique per cluster storageID
  protected int infoPort;  /// the port where the infoserver is running
  public int ipcPort;      /// the port where the ipc server is running

storageID is unique within the cluster, infoPort is the info server port, and ipcPort is the port used for low-level IPC communication.
2. DatanodeInfo
1. DatanodeInfo has the following fields.

  protected long capacity;
  protected long dfsUsed;
  protected long remaining;
  protected String hostName = null;
  protected long lastUpdate;
  protected int xceiverCount;
  protected String location = NetworkTopology.DEFAULT_RACK;
  protected AdminStates adminState;

hostName is supplied by the Datanode when it registers. xceiverCount is the number of connections currently open between this Datanode and clients or other Datanodes; if it grows past the limit, errors can occur. location is the node's position in the network topology, which allows a rack-aware replica placement policy to be defined.
adminState is the operating state of the Datanode: NORMAL, DECOMMISSION_INPROGRESS, or DECOMMISSIONED. It matters when a Datanode is being decommissioned, that is, taken offline. To prevent data loss, the blocks held by that Datanode must be copied to other Datanodes while it goes offline.
2. Important methods
  public String dumpDatanode() returns a string summarizing all of the fields above.
3. DatanodeDescriptor
DatanodeDescriptor abstracts every operation the NameNode performs on a DataNode. DataNodes store all of the file system's data. Data belongs to files, a file consists of several blocks, and each block has several replicas. In normal operation a client transfers data to a Datanode, and every block on that Datanode must be tracked. If data is lost, the block has to be replicated again; if an error occurs during an append or a transfer, the block has to be recovered. DatanodeDescriptor encapsulates all of this bookkeeping.
1. Important data structures
(1) Inner class BlockTargetPair

  public static class BlockTargetPair {
    public final Block block;
    public final DatanodeDescriptor[] targets;    

    BlockTargetPair(Block block, DatanodeDescriptor[] targets) {
      this.block = block;
      this.targets = targets;
    }
  }

BlockTargetPair pairs a block with the Datanodes that are to hold its replicas. Several of the data structures below build on it.
(2) Inner class private static class BlockQueue
BlockQueue wraps a queue of BlockTargetPair objects and provides methods such as enqueueing.
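To make the queue idea concrete, here is a minimal sketch of such a wrapper. It is not the Hadoop class: PairSketch, BlockQueueSketch, and the batching rule in poll are assumptions chosen for illustration, and blocks and Datanodes are reduced to a long ID and plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

/** Hypothetical stand-in for BlockTargetPair: a block ID plus target node names. */
final class PairSketch {
    final long blockId;
    final String[] targets;

    PairSketch(long blockId, String[] targets) {
        this.blockId = blockId;
        this.targets = targets;
    }
}

/** Simplified queue in the spirit of BlockQueue; not the Hadoop implementation. */
final class BlockQueueSketch {
    private final Queue<PairSketch> queue = new LinkedList<>();

    synchronized int size() {
        return queue.size();
    }

    /** Enqueues a block together with the nodes it should be sent to. */
    synchronized boolean offer(long blockId, String[] targets) {
        return queue.offer(new PairSketch(blockId, targets));
    }

    /** Dequeues pairs from the head while their combined target count fits in maxTargets. */
    synchronized List<PairSketch> poll(int maxTargets) {
        List<PairSketch> out = new ArrayList<>();
        int budget = maxTargets;
        while (!queue.isEmpty() && queue.peek().targets.length <= budget) {
            PairSketch p = queue.poll();
            budget -= p.targets.length;
            out.add(p);
        }
        return out;
    }
}
```

Bounding each poll by a target budget is one way to keep a single Datanode from being handed more transfers than it can start at once.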
(3) private volatile BlockInfo blockList = null;
Every DatanodeDescriptor records all of the blocks stored on its Datanode, and it does so through BlockInfo. The blocks are stored as triplets (see the BlocksMap analysis). blockList is the head node of the list, and every other block on the Datanode can be reached from it by following the links.
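The triplet layout can be sketched as follows. This is a simplified illustration under stated assumptions, not Hadoop's BlockInfo: BlockSketch and NodeSketch are hypothetical names, each block reserves a fixed number of slots, and slot i holds [node, previous block, next block] for the list belonging to that node.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for BlockInfo: slot i stores [node, prev, next]. */
final class BlockSketch {
    final long id;
    private final Object[] triplets;

    BlockSketch(long id, int replication) {
        this.id = id;
        this.triplets = new Object[3 * replication];
    }

    /** Returns the slot index holding this node, or -1 if absent. */
    int findNode(NodeSketch node) {
        for (int i = 0; i < triplets.length / 3; i++) {
            if (triplets[3 * i] == node) return i;
        }
        return -1;
    }

    /** Claims the first free slot for the node; false if already present or full. */
    boolean addNode(NodeSketch node) {
        if (findNode(node) >= 0) return false;
        for (int i = 0; i < triplets.length / 3; i++) {
            if (triplets[3 * i] == null) {
                triplets[3 * i] = node;
                return true;
            }
        }
        return false;
    }

    /** Next block in the given node's list (the node must already hold this block). */
    BlockSketch next(NodeSketch node) {
        return (BlockSketch) triplets[3 * findNode(node) + 2];
    }

    /** Links this block in front of head in the node's list; returns the new head. */
    BlockSketch listInsert(BlockSketch head, NodeSketch node) {
        int i = findNode(node);
        triplets[3 * i + 1] = null;
        triplets[3 * i + 2] = head;
        if (head != null) {
            head.triplets[3 * head.findNode(node) + 1] = this;
        }
        return this;
    }
}

/** Hypothetical stand-in for DatanodeDescriptor: it only keeps the list head. */
final class NodeSketch {
    private BlockSketch blockList;

    boolean addBlock(BlockSketch b) {
        if (!b.addNode(this)) return false;
        blockList = b.listInsert(blockList, this);
        return true;
    }

    /** Walks the list from the head and collects block IDs in list order. */
    List<Long> blockIds() {
        List<Long> ids = new ArrayList<>();
        for (BlockSketch b = blockList; b != null; b = b.next(this)) ids.add(b.id);
        return ids;
    }
}
```

Because the links live inside the block objects, one block can sit in the lists of several nodes at once without any per-node list cells being allocated.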
(4) Internal structures:

  /** A queue of blocks to be replicated by this datanode */
  private BlockQueue replicateBlocks = new BlockQueue();
  /** A queue of blocks to be recovered by this datanode */
  private BlockQueue recoverBlocks = new BlockQueue();
  /** A set of blocks to be invalidated by this datanode */
  private Set<Block> invalidateBlocks = new TreeSet<Block>();

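These structures hold the blocks this Datanode must copy to other Datanodes (replicateBlocks), the blocks this Datanode must recover (recoverBlocks), and the blocks that must be deleted from this Datanode (invalidateBlocks).
The first two also carry other DatanodeDescriptors, because the Datanode needs to know which Datanodes are involved in the replication or recovery. Invalidation only deletes blocks from this Datanode and involves no other Datanode.
(5) The following variables keep an approximate count of the blocks scheduled for this Datanode. The count is rolled over when a heartbeat arrives, at most once every 10 minutes (see the rollBlocksScheduled call in updateHeartbeat below).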

  private int currApproxBlocksScheduled = 0;
  private int prevApproxBlocksScheduled = 0;
  private long lastBlocksScheduledRollTime = 0;
  private static final int BLOCKS_SCHEDULED_ROLL_INTERVAL = 600*1000; //10min
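
A two-bucket counter like this one can be sketched as below. The class and method names are hypothetical and the sketch is an assumption about how such a rolling count behaves, not a copy of the Hadoop code: increments go into the current bucket, and a roll moves the current bucket into the previous one, so stale entries disappear after two rolls.

```java
/** Hypothetical rolling counter mirroring the curr/prev/lastRoll fields above. */
final class ScheduledCounterSketch {
    static final long ROLL_INTERVAL_MS = 600 * 1000L; // 10 min, as in the excerpt

    private int curr;
    private int prev;
    private long lastRollTime;

    /** Called when a block is scheduled for this node. */
    void increment() {
        curr++;
    }

    /** Called when a scheduled block has arrived; drains the older bucket first. */
    void decrement() {
        if (prev > 0) {
            prev--;
        } else if (curr > 0) {
            curr--;
        }
    }

    /** Approximate number of blocks still scheduled. */
    int approx() {
        return curr + prev;
    }

    /** Rolls the buckets if more than one interval has passed since the last roll. */
    void roll(long now) {
        if (now - lastRollTime > ROLL_INTERVAL_MS) {
            prev = curr;
            curr = 0;
            lastRollTime = now;
        }
    }
}
```

The count is only approximate by design: a block that was scheduled but never arrived is forgotten after at most two intervals instead of being tracked individually.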

2. Important methods
(1) void updateHeartbeat

  void updateHeartbeat(long capacity, long dfsUsed, long remaining,
      int xceiverCount) {
    this.capacity = capacity;
    this.dfsUsed = dfsUsed;
    this.remaining = remaining;
    this.lastUpdate = System.currentTimeMillis();
    this.xceiverCount = xceiverCount;
    rollBlocksScheduled(lastUpdate);
  }

When a DataNode sends a heartbeat to the NameNode, this method updates capacity, dfsUsed, remaining, and xceiverCount, and sets lastUpdate to the current time.
(2) boolean addBlock(BlockInfo b)

  boolean addBlock(BlockInfo b) {
    if(!b.addNode(this))
      return false;
    // add to the head of the data-node list
    blockList = b.listInsert(blockList, this);
    return true;
  }

Inserts the block at the head of this Datanode's block list.
(3) boolean removeBlock(BlockInfo b)

  boolean removeBlock(BlockInfo b) {
    blockList = b.listRemove(blockList, this);
    return b.removeNode(this);
  }

Removes the block from this Datanode's block list.
(4) void addBlockToBeReplicated

  void addBlockToBeReplicated(Block block, DatanodeDescriptor[] targets) {
    assert(block != null && targets != null && targets.length > 0);
    replicateBlocks.offer(block, targets);
  }

Adds the block to the replicateBlocks queue.
(5) void addBlockToBeRecovered

  void addBlockToBeRecovered(Block block, DatanodeDescriptor[] targets) {
    assert(block != null && targets != null && targets.length > 0);
    recoverBlocks.offer(block, targets);
  }

Adds the block to the recoverBlocks queue.
(6) void addBlocksToBeInvalidated

  void addBlocksToBeInvalidated(List<Block> blocklist) {
    assert(blocklist != null && blocklist.size() > 0);
    synchronized (invalidateBlocks) {
      for(Block blk : blocklist) {
        invalidateBlocks.add(blk);
      }
    }
  }

Adds the blocks to the invalidateBlocks set.
(7) BlockCommand getReplicationCommand(int maxTransfers)
BlockCommand getLeaseRecoveryCommand(int maxTransfers)
BlockCommand getInvalidateBlocks(int maxblocks)
These three methods take entries from the three internal structures, wrap them in a Writable BlockCommand, and send it to the corresponding Datanode. The command's action is set to DatanodeProtocol.DNA_TRANSFER, DatanodeProtocol.DNA_RECOVERBLOCK, or DatanodeProtocol.DNA_INVALIDATE respectively.
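As an illustration of the draining step, the sketch below pulls at most a given number of block IDs out of a pending set and tags the batch with an action. Everything here is an assumption made for the example: ActionSketch, CommandSketch, and InvalidateSetSketch are hypothetical names, blocks are plain long IDs, and the enum stands in for the DatanodeProtocol constants without reproducing their values.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical action tags standing in for the DatanodeProtocol.DNA_* constants. */
enum ActionSketch { TRANSFER, RECOVERBLOCK, INVALIDATE }

/** Hypothetical stand-in for BlockCommand: an action plus the blocks it applies to. */
final class CommandSketch {
    final ActionSketch action;
    final List<Long> blockIds;

    CommandSketch(ActionSketch action, List<Long> blockIds) {
        this.action = action;
        this.blockIds = blockIds;
    }
}

/** Simplified pending-invalidation set with a bounded drain. */
final class InvalidateSetSketch {
    private final Set<Long> invalidateBlocks = new TreeSet<>();

    void add(List<Long> ids) {
        synchronized (invalidateBlocks) {
            invalidateBlocks.addAll(ids);
        }
    }

    int pending() {
        synchronized (invalidateBlocks) {
            return invalidateBlocks.size();
        }
    }

    /** Removes up to maxBlocks IDs and returns them as one command, or null if none. */
    CommandSketch nextCommand(int maxBlocks) {
        synchronized (invalidateBlocks) {
            if (maxBlocks <= 0 || invalidateBlocks.isEmpty()) return null;
            List<Long> batch = new ArrayList<>();
            Iterator<Long> it = invalidateBlocks.iterator();
            while (it.hasNext() && batch.size() < maxBlocks) {
                batch.add(it.next());
                it.remove();
            }
            return new CommandSketch(ActionSketch.INVALIDATE, batch);
        }
    }
}
```

Capping the batch size means a large backlog is spread over several calls rather than delivered to the Datanode in one oversized command.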
(8) reportDiff is the most important method in DatanodeDescriptor.

  void reportDiff(BlocksMap blocksMap,
                  BlockListAsLongs newReport,
                  Collection<Block> toAdd,
                  Collection<Block> toRemove,
                  Collection<Block> toInvalidate) {
    // place a delimiter in the list which separates blocks
    // that have been reported from those that have not
    BlockInfo delimiter = new BlockInfo(new Block(), 1);
    boolean added = this.addBlock(delimiter);
    assert added : "Delimiting block cannot be present in the node";
    if(newReport == null)
      newReport = new BlockListAsLongs( new long[0]);
    // scan the report and collect newly reported blocks
    // Note we are taking special precaution to limit tmp blocks allocated
    // as part this block report - which why block list is stored as longs
    Block iblk = new Block(); // a fixed new'ed block to be reused with index i
    for (int i = 0; i < newReport.getNumberOfBlocks(); ++i) {
      iblk.set(newReport.getBlockId(i), newReport.getBlockLen(i), 
               newReport.getBlockGenStamp(i));
      BlockInfo storedBlock = blocksMap.getStoredBlock(iblk);
      if(storedBlock == null) {
        // If block is not in blocksMap it does not belong to any file
        toInvalidate.add(new Block(iblk));
        continue;
      }
      if(storedBlock.findDatanode(this) < 0) {// Known block, but not on the DN
        // if the size differs from what is in the blockmap, then return
        // the new block. addStoredBlock will then pick up the right size of this
        // block and will update the block object in the BlocksMap
        if (storedBlock.getNumBytes() != iblk.getNumBytes()) {
          toAdd.add(new Block(iblk));
        } else {
          toAdd.add(storedBlock);
        }
        continue;
      }
      // move block to the head of the list
      this.moveBlockToHead(storedBlock);
    }
    // collect blocks that have not been reported
    // all of them are next to the delimiter
    Iterator<Block> it = new BlockIterator(delimiter.getNext(0), this);
    while(it.hasNext())
      toRemove.add(it.next());
    this.removeBlock(delimiter);
  }

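A Datanode reports its blocks to the NameNode periodically. Because a block report is expensive, reports are not sent very often. For each reported block, reportDiff looks the block up in BlocksMap. If the block is not there, it belongs to no file and is added to toInvalidate so that it can be deleted. If the block is known but not yet recorded for this Datanode, it is added to toAdd. Otherwise the block is moved to the head of the list. After the scan, every block still behind the delimiter was recorded for this Datanode but not reported, so it is added to toRemove.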
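The three-way classification can be sketched with plain sets. This is a simplified illustration, not the method above: ReportDiffSketch is a hypothetical name, blocks are reduced to long IDs, a seen set replaces the delimiter trick, and block lengths and generation stamps are ignored.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical set-based version of the report comparison. */
final class ReportDiffSketch {
    final Set<Long> toAdd = new TreeSet<>();
    final Set<Long> toRemove = new TreeSet<>();
    final Set<Long> toInvalidate = new TreeSet<>();

    /**
     * @param knownBlocks    block IDs the NameNode knows about (stand-in for BlocksMap)
     * @param recordedOnNode block IDs currently recorded for this Datanode
     * @param reported       block IDs the Datanode just reported
     */
    static ReportDiffSketch diff(Set<Long> knownBlocks,
                                 Set<Long> recordedOnNode,
                                 Collection<Long> reported) {
        ReportDiffSketch result = new ReportDiffSketch();
        Set<Long> seen = new TreeSet<>();
        for (Long id : reported) {
            if (!knownBlocks.contains(id)) {
                // Unknown to the NameNode: belongs to no file, so delete it.
                result.toInvalidate.add(id);
            } else if (!recordedOnNode.contains(id)) {
                // Known block that this node was not yet recorded as holding.
                result.toAdd.add(id);
            } else {
                // Known and already recorded: nothing to change.
                seen.add(id);
            }
        }
        // Recorded for this node but missing from the report.
        result.toRemove.addAll(recordedOnNode);
        result.toRemove.removeAll(seen);
        return result;
    }
}
```

The original avoids the extra seen set by moving confirmed blocks in front of a delimiter node in the Datanode's own list, which keeps allocation low for large reports.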
