public class KMeansParallelJobRunner extends MapReduceJobController implements ClusteringRunner
K-means parallel job runner, following Bahmani, Kumar, Moseley, Vassilvitskii and Vattani. Scalable K-means++. VLDB Endowment Vol. 5, No. 7. 2012. A couple of things to note:
(1) The cost of each sampled point is updated as the first step within the sampling loop; the initial sample is taken outside the loop.
(2) A final cost update occurs outside the sampling loop, just before stripping off the top 'K' centers.
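The ordering described in notes (1) and (2) can be illustrated with a small in-memory sketch. This is not the MapReduce implementation behind this class: the class name `KMeansParallelSketch`, the methods `updateCost` and `selectCenters`, and the parameters `ell` (oversampling factor), `rounds` and `seed` are all hypothetical. It also assumes that "top 'K'" means the K candidates with the most points assigned to them, which the source does not confirm; the cited paper instead reclusters the weighted candidates.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Hypothetical in-memory sketch of the sampling order; not the GeoWave job runner. */
final class KMeansParallelSketch {

  /** Squared Euclidean distance between two points of equal dimension. */
  static double sqDist(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /** Index of the center nearest to p. */
  static int nearest(double[] p, List<double[]> centers) {
    int best = 0;
    for (int c = 1; c < centers.size(); c++) {
      if (sqDist(p, centers.get(c)) < sqDist(p, centers.get(best))) {
        best = c;
      }
    }
    return best;
  }

  /**
   * Cost update: stores each point's squared distance to its nearest center in d2,
   * counts the points assigned to each center in weight, and returns the total cost.
   */
  public static double updateCost(
      List<double[]> points, List<double[]> centers, double[] d2, int[] weight) {
    double total = 0.0;
    for (int i = 0; i < points.size(); i++) {
      int c = nearest(points.get(i), centers);
      d2[i] = sqDist(points.get(i), centers.get(c));
      weight[c]++;
      total += d2[i];
    }
    return total;
  }

  /** Selects up to k centers from points (assumed non-empty). */
  public static List<double[]> selectCenters(
      List<double[]> points, int k, double ell, int rounds, long seed) {
    Random rng = new Random(seed);
    List<double[]> centers = new ArrayList<>();
    // The initial sample is taken outside the loop.
    centers.add(points.get(rng.nextInt(points.size())));
    double[] d2 = new double[points.size()];
    for (int r = 0; r < rounds; r++) {
      // Note (1): the cost update is the first step of every sampling round.
      double total = updateCost(points, centers, d2, new int[centers.size()]);
      if (total == 0.0) {
        break; // every point already coincides with a center
      }
      // Oversampling: keep each point with probability ell * d^2 / total cost.
      for (int i = 0; i < points.size(); i++) {
        if (rng.nextDouble() < Math.min(1.0, ell * d2[i] / total)) {
          centers.add(points.get(i));
        }
      }
    }
    // Note (2): one final cost update after the loop, before picking the top K.
    int[] weight = new int[centers.size()];
    updateCost(points, centers, d2, weight);
    // Assumption: "top K" means the K candidates with the most assigned points.
    List<Integer> order = new ArrayList<>();
    for (int c = 0; c < centers.size(); c++) {
      order.add(c);
    }
    order.sort(Comparator.comparingInt((Integer c) -> -weight[c]));
    List<double[]> top = new ArrayList<>();
    for (int j = 0; j < Math.min(k, order.size()); j++) {
      top.add(centers.get(order.get(j)));
    }
    return top;
  }
}
```

A call such as `KMeansParallelSketch.selectCenters(points, 3, 6.0, 5, 42L)` returns at most three of the input points. Without the final cost update, the weights used for the top-K selection would not reflect the candidates added in the last round.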
MapReduceJobController.PostOperationTask
DoNothingTask
Constructor and Description |
---|
KMeansParallelJobRunner() |
Modifier and Type | Method and Description |
---|---|
Collection<ParameterEnum<?>> | getParameters() |
int | run(org.apache.hadoop.conf.Configuration configuration, PropertyManagement propertyManagement) |
void | setInputFormatConfiguration(FormatConfiguration inputFormatConfiguration) |
void | setZoomLevel(int zoomLevel) |
getConfiguration, getRunners, init, run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run
public void setZoomLevel(int zoomLevel)
setZoomLevel in interface ClusteringRunner
public void setInputFormatConfiguration(FormatConfiguration inputFormatConfiguration)
setInputFormatConfiguration in interface ClusteringRunner
public int run(org.apache.hadoop.conf.Configuration configuration, PropertyManagement propertyManagement) throws Exception
run in interface MapReduceJobRunner
run in class MapReduceJobController
Throws: Exception
public Collection<ParameterEnum<?>> getParameters()
getParameters in interface IndependentJobRunner
getParameters in class MapReduceJobController
Copyright © 2013–2022. All rights reserved.