Storm: "Async loop died!" & reconnect

After a Storm supervisor was reset, the topology started reporting errors and none of the spouts were consuming anymore. The worker log showed:
 
2015-07-15T09:48:26.470+0800 b.s.util [ERROR] Async loop died!
java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.disruptor$consume_loop_STAR_$fn__1460.invoke(disruptor.clj:94) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.util$async_loop$fn__464.invoke(util.clj:463) ~[storm-core-0.9.3.jar:0.9.3]
        at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
Caused by: java.lang.RuntimeException: Client is being closed, and does not take requests any more
        at backtype.storm.messaging.netty.Client.send(Client.java:185) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.utils.TransferDrainer.send(TransferDrainer.java:54) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__3730$fn__3731.invoke(worker.clj:330) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__3730.invoke(worker.clj:328) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.disruptor$clojure_handler$reify__1447.onEvent(disruptor.clj:58) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) ~[storm-core-0.9.3.jar:0.9.3]
        ... 6 common frames omitted
2015-07-15T09:48:26.507+0800 b.s.util [ERROR] Halting process: ("Async loop died!")
java.lang.RuntimeException: ("Async loop died!")
        at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:325) [storm-core-0.9.3.jar:0.9.3]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.5.1.jar:na]
        at backtype.storm.disruptor$consume_loop_STAR_$fn__1458.invoke(disruptor.clj:92) [storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.util$async_loop$fn__464.invoke(util.clj:473) [storm-core-0.9.3.jar:0.9.3]
        at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]

Earlier in the same log there is a long run of reconnection attempts:
 Reconnect started for Netty-Client-hostxx/ip:6703... [91]
 Reconnect started for Netty-Client-hostxx/ip:6703... [92]
 Reconnect started for Netty-Client-hostxx/ip:6703... [93]
 Reconnect started for Netty-Client-hostxx/ip:6703... [94]
 Reconnect started for Netty-Client-hostxx/ip:6703... [95]
 Reconnect started for Netty-Client-hostxx/ip:6703... [96]
 Reconnect started for Netty-Client-hostxx/ip:6703... [97]
 Reconnect started for Netty-Client-hostxx/ip:6703... [98]
 Reconnect started for Netty-Client-hostxx/ip:6703... [99]
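
To see quickly which remote worker a failing client keeps retrying, the reconnect lines can be summarized per target. The following is only a minimal sketch: the function name `max_reconnect_attempts` is hypothetical, and the regular expression assumes the log lines look exactly like the ones quoted above.

```python
import re
from collections import defaultdict

# Matches lines shaped like the ones above, e.g.
#   Reconnect started for Netty-Client-hostxx/ip:6703... [91]
# Assumption: the bracketed number is the running attempt counter.
RECONNECT_RE = re.compile(
    r"Reconnect started for (Netty-Client-\S+?)\.\.\. \[(\d+)\]"
)


def max_reconnect_attempts(lines):
    """Return {client name: highest attempt counter seen in the log}."""
    highest = defaultdict(int)
    for line in lines:
        match = RECONNECT_RE.search(line)
        if match:
            name = match.group(1)
            attempt = int(match.group(2))
            highest[name] = max(highest[name], attempt)
    return dict(highest)
```

A target whose counter keeps climbing, like port 6703 here, is the worker to inspect on the remote host.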

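Logging in to that host showed two worker processes assigned to port 6703, so the worker port was already occupied. (The supervisor had not restarted the worker properly earlier; this is a bug that still needs to be resolved.)
Fix: kill the worker that duplicates the port, and the topology will automatically move on to another worker.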
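
The check for two workers on one port can be scripted. This is a rough sketch, not the method used above: it assumes each worker JVM carries a `-Dworker.port=<port>` option on its command line (verify this against `ps` output on your own cluster), and the helper names `workers_by_port` and `duplicated_ports` are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: the worker command line contains "-Dworker.port=<port>".
PORT_RE = re.compile(r"-Dworker\.port=(\d+)")


def workers_by_port(ps_lines):
    """Group PIDs by worker port, given `ps -eo pid,args`-style lines."""
    ports = defaultdict(list)
    for line in ps_lines:
        parts = line.split(None, 1)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # skip the header row and malformed lines
        match = PORT_RE.search(parts[1])
        if match:
            ports[int(match.group(1))].append(int(parts[0]))
    return dict(ports)


def duplicated_ports(ps_lines):
    """Return only the ports claimed by more than one process."""
    return {
        port: pids
        for port, pids in workers_by_port(ps_lines).items()
        if len(pids) > 1
    }
```

On the affected host the input could come from `subprocess.check_output(["ps", "-eo", "pid,args"], text=True).splitlines()`. The sketch only reports the PIDs; deciding which process is the stale one, and killing it, is left as a manual step.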
 
 
