Spark pipeline: getting the best model from CrossValidator

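 import org.apache.spark.ml.{Pipeline, PipelineModel}
 import org.apache.spark.ml.evaluation.RegressionEvaluator
 import org.apache.spark.ml.regression.{LinearRegression, LinearRegressionModel}
 import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

 val lr = new LinearRegression()
 val pipeline = new Pipeline().setStages(Array(lr))

 // Candidate regularization strengths to search over
 val paramGrid = new ParamGridBuilder().addGrid(lr.regParam, Array(0.0, 0.5, 1.0)).build()

 val cv = new CrossValidator()
   .setEstimator(pipeline)
   .setEvaluator(new RegressionEvaluator)
   .setEstimatorParamMaps(paramGrid)
   .setNumFolds(2)

 // data: a DataFrame with "features" and "label" columns (the defaults)
 val cvModel = cv.fit(data)

 // The estimator is a Pipeline, so bestModel is a PipelineModel
 val model = cvModel.bestModel.asInstanceOf[PipelineModel]
 // Stage 0 is the fitted LinearRegression
 val lrModel = model.stages(0).asInstanceOf[LinearRegressionModel]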
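
Once the best model has been extracted, it is usually worth checking which grid entry won and how the candidates compared. The following is a minimal, self-contained sketch of one way to do that, not a definitive recipe. The toy dataset, the object name `InspectBestModel`, and the local master setting are assumptions made up for illustration; it also assumes a Spark version where the fitted `LinearRegressionModel` carries over the estimator's params, so that `getRegParam` reports the selected value.

```scala
import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.{LinearRegression, LinearRegressionModel}
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql.SparkSession

// Hypothetical object name, used only for this illustration.
object InspectBestModel {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("inspect-best-model")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical toy data: label is about 2 * x plus a small offset.
    val data = (1 to 40)
      .map(i => (2.0 * i + (i % 3) * 0.1, Vectors.dense(i.toDouble)))
      .toDF("label", "features")

    val lr = new LinearRegression()
    val pipeline = new Pipeline().setStages(Array(lr))
    val paramGrid = new ParamGridBuilder()
      .addGrid(lr.regParam, Array(0.0, 0.5, 1.0))
      .build()
    val evaluator = new RegressionEvaluator() // default metric is RMSE

    val cv = new CrossValidator()
      .setEstimator(pipeline)
      .setEvaluator(evaluator)
      .setEstimatorParamMaps(paramGrid)
      .setNumFolds(2)

    val cvModel = cv.fit(data)

    // avgMetrics(i) is the metric averaged over folds for getEstimatorParamMaps(i).
    cvModel.getEstimatorParamMaps.zip(cvModel.avgMetrics).foreach {
      case (params, metric) =>
        println(s"avg ${evaluator.getMetricName} = $metric for $params")
    }

    // Same extraction as above: best PipelineModel, then its first stage.
    val lrModel = cvModel.bestModel.asInstanceOf[PipelineModel]
      .stages(0).asInstanceOf[LinearRegressionModel]

    println(s"selected regParam = ${lrModel.getRegParam}")
    println(s"coefficients = ${lrModel.coefficients}, intercept = ${lrModel.intercept}")

    spark.stop()
  }
}
```

With the default RMSE metric, smaller is better, so the selected `regParam` should correspond to the grid entry with the lowest average metric.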
