K-means algorithm

This repository contains all resources for Homework 1 of TDT4173, fall 2021. A short writeup can be found in report.pdf.

Problem Description

In this project I develop two well-known and simple (but occasionally very useful) machine learning algorithms: k-means clustering and decision trees. The interface I am implementing closely resembles the one used in scikit-learn.
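To illustrate what a scikit-learn-style interface usually means, here is a minimal sketch of the fit/predict convention. The class MajorityClassifier is a hypothetical toy example and is not part of this repository; the actual classes in the repository may differ in names and arguments.

```python
from collections import Counter


class MajorityClassifier:
    """Hypothetical toy estimator: always predicts the most common training label."""

    def fit(self, X, y):
        # Learn state from the training data; return self so calls can be chained.
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        # Return one prediction per input row.
        return [self.majority_ for _ in X]


model = MajorityClassifier().fit([[0], [1], [2]], ["a", "b", "b"])
predictions = model.predict([[3], [4]])
```

The point of the convention is that every model exposes the same two methods, so training and evaluation code can stay the same when the model is swapped.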

K-means algorithm

k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters. Each observation belongs to the cluster with the nearest mean (the cluster center, or centroid), which serves as a prototype of the cluster.
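The description above can be sketched with Lloyd's algorithm, the standard iterative procedure for k-means. This is a minimal pure-Python sketch and not necessarily the implementation in k_means.py; the function names, the random initialisation, and the toy points are assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initialise centroids as k distinct points sampled from the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster with the nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append([sum(c) / len(members) for c in zip(*members)])
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # No centroid moved, so the assignment is stable.
        centroids = new_centroids
    return centroids, labels


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = k_means(points, k=2)
```

On harder data the result depends on the initial centroids, so running the algorithm several times with different seeds and keeping the best partition is a common precaution.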

Decision tree

A decision tree is a flowchart-like structure in which each internal node represents a "test" on an attribute (e.g. whether a coin flip comes up heads or tails), each branch represents the outcome of the test, and each leaf node represents a class label (decision taken after computing all attributes). The paths from root to leaf represent classification rules.
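The structure described above can be sketched as nested dictionaries, where each internal node names the attribute it tests and each leaf is a class label. The tree, attribute names, and labels below are hypothetical and chosen only for illustration; they are not taken from the datasets in this repository, and decision_tree.py may represent trees differently.

```python
# Hypothetical toy tree: internal nodes test one attribute, leaves hold a class label.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "windy",
            "branches": {"yes": "stay in", "no": "play"},
        },
        "rainy": "stay in",
    },
}


def predict(node, sample):
    """Follow the branch matching each test outcome until a leaf is reached."""
    while isinstance(node, dict):
        outcome = sample[node["attribute"]]
        node = node["branches"][outcome]
    return node


def rules(node, path=()):
    """Yield every root-to-leaf path as (conditions, label)."""
    if not isinstance(node, dict):
        yield list(path), node
        return
    for outcome, child in node["branches"].items():
        yield from rules(child, path + ((node["attribute"], outcome),))


label = predict(tree, {"outlook": "sunny", "windy": "no"})
all_rules = list(rules(tree))
```

Each entry in all_rules pairs a list of (attribute, outcome) conditions with the class label at the end of that path, which is the classification rule the path represents.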

Overview

The <algorithm_name>.py files (e.g. k_means.py) contain the implementation of each algorithm.
For each algorithm, two datasets have been used to test it:
- The data_1.csv files contain easy problems.
- The data_2.csv files contain harder problems.
The report.pdf file contains results, plots and descriptions of the work.
The experiment.ipynb files are Jupyter notebooks with code for loading the datasets and for training and evaluating the models.
