This repository contains all resources for Homework 1 of TDT4173 fall 2021.

haavardsjef/k-means-algorithm

K-means algorithm

This repository contains all resources for Homework 1 of TDT4173 fall 2021. A short writeup can be found in report.pdf.

Problem Description

In this project I implement two well-known and simple (but occasionally very useful) machine learning algorithms. The interface strongly resembles the one used in scikit-learn.

K-means algorithm

k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters. Each observation belongs to the cluster with the nearest mean (the cluster center, or centroid), which serves as a prototype of the cluster.
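The partitioning described above is commonly computed with Lloyd's algorithm, which alternates between assigning points to their nearest centroid and moving each centroid to the mean of its members. The following is a minimal standard-library sketch of that idea, not the code in k_means.py; the function name `k_means`, its parameters, and the toy points are assumptions made for illustration.

```python
import random
from math import dist


def k_means(points, k, n_iter=100, seed=0):
    """Illustrative Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Initialise the centroids as k distinct points drawn from the data.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = k_means(points, k=2)
```

On data this well separated, the two groups end up in different clusters regardless of which points are drawn as initial centroids. Harder data can converge to a poor local optimum, which is why practical implementations often restart from several initialisations.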

illustration of results

Decision tree

A decision tree is a flowchart-like structure in which each internal node represents a "test" on an attribute (e.g. whether a coin flip comes up heads or tails), each branch represents the outcome of the test, and each leaf node represents a class label (decision taken after computing all attributes). The paths from root to leaf represent classification rules.
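To make that structure concrete, here is a small sketch of a hand-built tree stored as nested dictionaries, with one function that classifies a sample and one that lists the root-to-leaf rules. It only illustrates the data structure and is not the repository's implementation: the attribute names, labels, and the helpers `predict` and `rules` are hypothetical, and learning the tree from data is not shown.

```python
# Hypothetical hand-built tree: an internal node tests one attribute, each
# branch is one outcome of that test, and each leaf is a class label (a string).
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "windy",
            "branches": {"yes": "stay in", "no": "play"},
        },
        "rainy": "stay in",
    },
}


def predict(node, sample):
    """Follow the branch matching each test until a leaf label is reached."""
    while isinstance(node, dict):
        node = node["branches"][sample[node["attribute"]]]
    return node


def rules(node, path=()):
    """Yield (conditions, label) pairs, one per root-to-leaf path."""
    if not isinstance(node, dict):
        yield path, node
        return
    for outcome, child in node["branches"].items():
        yield from rules(child, path + ((node["attribute"], outcome),))


label = predict(tree, {"outlook": "sunny", "windy": "no"})
all_rules = list(rules(tree))
```

Each entry in `all_rules` pairs the sequence of (attribute, outcome) tests along one path with the label at its leaf, which is the sense in which paths from root to leaf represent classification rules.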

Overview

  • The <algorithm_name>.py files (e.g. k_means.py) contain the implementation of each algorithm.
  • For each algorithm, there are two datasets that have been used to test it.
    • The data_1.csv files contain easy problems.
    • The data_2.csv files contain harder problems.
  • The report.pdf file contains results, plots and descriptions of the work.
  • The experiment.ipynb files are Jupyter notebooks with code for loading the datasets and for training and evaluating the models.
