pyspark's DecisionTreeModel cannot be called from inside RDD transformations

I trained a DecisionTreeModel and set out to validate it on an RDD:
dtModel = DecisionTree.trainClassifier(data, 2, {}, impurity="entropy", maxDepth=maxTreeDepth)

# This works: predict is called on the driver, on a whole RDD of features.
predictions = dtModel.predict(data.map(lambda lp: lp.features))


def GetDtLabel(x):
    # dtModel is captured in the closure, so predict runs on the workers here
    return 1 if dtModel.predict(x.features) > 0.5 else 0


dtTotalCorrect = data.map(lambda point: 1 if GetDtLabel(point) == point.label else 0).sum()
     
The last line fails with:

    Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.

As one would in Scala, I first tried broadcasting dtModel to the workers:

    dtModelBroadcast = sc.broadcast(dtModel)

But this raises the same exception, and its message already covers that case: it mentions referencing SparkContext "from a broadcast variable" as well.

Searching stackoverflow for this pyspark problem turned up:

http://stackoverflow.com/questions/31684842/how-to-use-java-scala-function-from-an-action-or-a-transformation

http://stackoverflow.com/questions/36838024/combining-spark-streaming-mllib
  
The docstring of pyspark's DecisionTreeModel.predict explains why:

    "In Python, predict cannot currently be used within an RDD transformation or action.
    Call predict directly on the RDD instead."
    def predict(self, x):
        """
        Predict the label of one or more examples.

        Note: In Python, predict cannot currently be used within an RDD
              transformation or action.
              Call predict directly on the RDD instead.

        :param x:  Data point (feature vector),
                   or an RDD of data points (feature vectors).
        """
        if isinstance(x, RDD):
            return self.call("predict", x.map(_convert_to_vector))
        else:
            return self.call("predict", _convert_to_vector(x))
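Following the docstring's advice, the fix is to call predict once on the whole features RDD from the driver, then zip the predictions back with the labels. A minimal runnable sketch, with plain Python lists standing in for the RDDs and made-up prediction values (the equivalent pyspark calls are shown in the comments):

```python
# With real RDDs (dtModel and data as above) this would be:
#   predictions    = dtModel.predict(data.map(lambda lp: lp.features))
#   labelsAndPreds = data.map(lambda lp: lp.label).zip(predictions)
#   dtTotalCorrect = labelsAndPreds.map(
#       lambda lp: 1 if (1 if lp[1] > 0.5 else 0) == lp[0] else 0).sum()
# Plain lists stand in for the RDDs below so the logic runs anywhere.
labels      = [1, 0, 1, 1, 0]
predictions = [0.9, 0.2, 0.4, 0.8, 0.1]  # hypothetical raw model outputs

def to_label(p):
    # Same thresholding as GetDtLabel above
    return 1 if p > 0.5 else 0

total_correct = sum(1 if to_label(p) == l else 0
                    for l, p in zip(labels, predictions))
accuracy = total_correct / len(labels)
print(total_correct, accuracy)  # -> 4 0.8
```

No worker ever touches dtModel this way: predict runs once on the driver, and only plain numbers flow through the subsequent transformations.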

predict works through call, and call uses self._sc, so the model object carries a SparkContext with it:

class JavaModelWrapper(object):
    """
    Wrapper for the model in JVM
    """
    def __init__(self, java_model):
        self._sc = SparkContext.getOrCreate()
        self._java_model = java_model

    def __del__(self):
        self._sc._gateway.detach(self._java_model)

    def call(self, name, *a):
        """Call method of java_model"""
        return callJavaFunc(self._sc, getattr(self._java_model, name), *a)

Through py4j the wrapper holds both the java_model ("org.apache.spark.mllib.tree.model.DecisionTreeModel") and a SparkContext, and a SparkContext can never be serialized into code that runs on the workers.
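This is the same restriction Python's pickle enforces on any object holding a live runtime handle. A small self-contained sketch (FakeModelWrapper is a made-up stand-in; a thread lock imitates the unpicklable SparkContext / py4j gateway):

```python
import pickle
import threading

class FakeModelWrapper:
    # Stands in for JavaModelWrapper: it holds an unpicklable runtime
    # handle, the way the real wrapper holds self._sc (a SparkContext).
    def __init__(self):
        self._sc = threading.Lock()  # imitates the SparkContext / py4j gateway

try:
    pickle.dumps(FakeModelWrapper())   # roughly what Spark does to ship a closure
    serializable = True
except TypeError:                      # "cannot pickle '_thread.lock' object"
    serializable = False

print(serializable)  # the wrapper cannot be shipped to workers
```

The real SparkContext goes further and raises the SPARK-5063 exception seen above as soon as serialization is attempted, but the underlying reason is the same.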
