README

flink-project: contains the source code and the JAR file that can be submitted to a Flink cluster. The project can also be run without a cluster, directly from the IDE.
results: contains the results of some runs of the algorithm (more information in the Jupyter notebook)
flink_k-means.ipynb: a Jupyter notebook that shows the data, plots and results
- can be opened with Jupyter
- requires the tabulate package

Program arguments

files:
- points: string, path of the input file containing the points
- centroids: string, path of the input file containing the centroids (if any)
- pointsout: string, path of the output file containing the points with the associated cluster
- centroidsout: string, path of the output file containing the computed centroids
- objfunout: string, path of the output file containing the value of the objective function
iteration params:
- iterations: int, max number of iterations
- custconvergence: boolean, whether the custom convergence criterion is used
centroids:
- numcentroids: int, number of centroids to generate. If specified, the centroids input file is ignored
- minc: int, min value for centroid x, y
- maxc: int, max value for centroid x, y
- recompnearest: int, number of nearest centroids that are recomputed
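
As a rough illustration of how these arguments relate to one another, the following Python sketch runs a plain k-means loop on 2-D points. It is not the Flink job from this project. The function names, the seed parameter, drawing random centroids uniformly from [minc, maxc], stopping when the centroids stop moving, and using the sum of squared distances as the objective function are all assumptions made for the example. custconvergence and recompnearest are not modelled.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def nearest_centroid(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, iterations, centroids=None, numcentroids=None,
           minc=0, maxc=1, seed=0):
    """Illustrative k-means; returns (assignment, centroids, objective)."""
    if numcentroids is not None:
        # As in the README: when numcentroids is given, the supplied
        # centroids are ignored and new ones are generated.
        # Assumption: x and y are drawn uniformly from [minc, maxc].
        rng = random.Random(seed)
        centroids = [(rng.uniform(minc, maxc), rng.uniform(minc, maxc))
                     for _ in range(numcentroids)]
    else:
        centroids = [tuple(c) for c in centroids]

    for _ in range(iterations):  # `iterations` is an upper bound
        # Assignment step: attach every point to its nearest centroid.
        assignment = [nearest_centroid(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its points.
        updated = []
        for index, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == index]
            if members:
                updated.append((sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members)))
            else:
                updated.append(old)  # an empty cluster keeps its centroid
        if updated == centroids:
            break  # assumed stopping rule: centroids no longer move
        centroids = updated

    assignment = [nearest_centroid(p, centroids) for p in points]
    # Assumed objective: sum of squared distances to the assigned centroid.
    objective = sum(squared_distance(p, centroids[a])
                    for p, a in zip(points, assignment))
    return assignment, centroids, objective


sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, final_centroids, objfun = kmeans(
    sample, iterations=10, centroids=[(0, 0), (10, 10)])
print(labels, final_centroids, objfun)
```

In this sketch the three return values play the roles of the pointsout, centroidsout and objfunout files.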

Example:

-numcentroids 8 -recompnearest 3 -iterations 10 -custconvergence false
-points "points.csv" -centroids "centroids.csv"
-pointsout "new_points.csv" -centroidsout "new_centroids.csv" -objfunout "objfun.csv"
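
When the job is launched from a script, the same arguments can be assembled programmatically. The sketch below is only an illustration: build_flink_command is a hypothetical helper, the JAR file name is a placeholder, and it assumes that the flink executable of the Flink distribution is on the PATH and that the JAR's manifest names the entry class (otherwise flink run needs the -c option with the class name).

```python
import shlex


def build_flink_command(jar_path, options, flink_bin="flink"):
    """Return the argv list for `flink run <jar> -key value ...`."""
    command = [flink_bin, "run", jar_path]
    for key, value in options.items():
        # Booleans are written in lowercase, as in the example above.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        command += [f"-{key}", text]
    return command


# Same values as the example above.
options = {
    "numcentroids": 8,
    "recompnearest": 3,
    "iterations": 10,
    "custconvergence": False,
    "points": "points.csv",
    "centroids": "centroids.csv",
    "pointsout": "new_points.csv",
    "centroidsout": "new_centroids.csv",
    "objfunout": "objfun.csv",
}

# Placeholder JAR name; use the path of the JAR built from flink-project.
command = build_flink_command("flink-project.jar", options)
print(shlex.join(command))  # shell-quoted form, ready to paste into a terminal
```

The list form can be passed directly to subprocess.run, which avoids shell quoting issues with paths that contain spaces.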

Setup Flink cluster

Download Flink here, then follow this to start the cluster and submit the JAR file.

Run from CLI

Here
