# ucb

Here are 63 public repositories matching this topic...

Bachfischer / COMP90051-StatML-Assignment-2

Source code for Assignment 2 of COMP90051 (Semester 2 2020)

ucb multi-armed-bandit mab

Updated Oct 21, 2020
Jupyter Notebook

PRAN20 / Upper-Confidence-Bound

reinforcement-learning ucb upper upper-confidence-bounds

Updated Jan 26, 2022
Jupyter Notebook

sarthakmittal92 / multi-armed-bandits

Repository for the course project for CS-747 (Foundations of Intelligent & Learning Agents) at IIT Bombay, Autumn 2022.

python thompson-sampling reinforcement-learning-algorithms ucb multi-armed-bandits bandits kl-ucb

Updated Oct 14, 2022
Python

Murtazali05 / Multi-armed-bandit

Multi Armed Bandits implementation using the Jester Dataset

thompson-sampling ucb multi-armed-bandits e-greedy

Updated Apr 5, 2021
Python

YingluDeng / cs61b_notes

CS 61B notes for Spring 2021

Updated Feb 5, 2021

salimandre / Monte-Carlo-Tree-Search

We implemented Monte Carlo Tree Search (MCTS) from scratch and successfully applied it to the game of Tic-Tac-Toe.

reinforcement-learning graphics mcts ucb monte-carlo-tree-search tic-tac-toe-game upper-confidence-bound

Updated Jul 9, 2020
Python

LittleWat / hyper-parameter-optimization-by-GMRF-GPUCB

R.I.T project

python3 ucb gaussian-processes gmrf markov-random-field gp

Updated Jul 29, 2019
Python

JoelJa835 / MAB_Algorithms

Implementation of Multi-Armed Bandit (MAB) algorithms UCB and Epsilon-Greedy. MAB is a class of problems in reinforcement learning where an agent learns to choose actions from a set of arms, each associated with an unknown reward distribution. UCB and Epsilon-Greedy are popular algorithms for solving MAB problems.

reinforcement-learning-algorithms ucb bandits mab e-greedy

Updated Mar 26, 2023
Python
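
For readers unfamiliar with the UCB approach described in the entry above, here is a minimal sketch of the standard UCB1 rule on Bernoulli arms. It is not taken from any repository listed here; the function name `ucb1`, the parameters `arm_means`, `horizon`, and `seed`, and the exploration constant 2 are assumptions chosen for illustration.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on simulated Bernoulli arms and return per-arm pull counts.

    arm_means: hypothetical true success probabilities (unknown to the agent).
    horizon:   total number of pulls.
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k    # number of times each arm was pulled
    sums = [0.0] * k    # total reward collected from each arm

    for t in range(1, horizon + 1):
        if t <= k:
            # Pull every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Pick the arm with the largest mean plus confidence bonus.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Simulated Bernoulli reward for the chosen arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    return counts


if __name__ == "__main__":
    print(ucb1([0.1, 0.5, 0.9], horizon=5000))
```

The confidence bonus shrinks as an arm is pulled more often, so rarely tried arms keep getting explored while most pulls go to the arm that looks best. An epsilon-greedy agent would instead pick a uniformly random arm with a fixed probability and the empirically best arm otherwise.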

JoelJa835 / Least-Loaded-Server

reinforcement-learning-algorithms ucb multiplicative-weights

Updated Apr 26, 2023
Python

loraalex / LoBook

Comparison of LoRa@FIIT algorithms using Jupyter notebooks

iot analysis lora ucb adr upper-confidence-bound lorafiit adaptive-data-rate

Updated Dec 10, 2023
Jupyter Notebook

Suchetaaa / CS747-Assignments

Foundations Of Intelligent Learning Agents (FILA) Assignments

reinforcement-learning monte-carlo linear-programming thompson-sampling ucb bootstrapping multi-armed-bandits bellman-equation temporal-differencing-learning howards-pi sarsa-learning kl-ucb windy-gridworld intelligent-learning-agents

Updated Nov 8, 2019
Python

SanketAgrawal / ReinforcementLearning

Chapter-wise implementation and analysis of all the algorithms in Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto

reinforcement-learning artificial-intelligence epsilon-greedy python-3 ucb k-armed-bandit gradient-bandit optimistic-inital-values

Updated Jul 18, 2020
Jupyter Notebook

paramrathour / Intelligent-and-Learning-Agents

My programs during CS747 (Foundations of Intelligent and Learning Agents) Autumn 2021-22

linear-programming thompson-sampling epsilon-greedy mountain-car sarsa ucb markov-decision-processes multi-armed-bandit policy-iteration value-iteration tile-coding kl-ucb policy-control

Updated Apr 17, 2022
Python

Retr0-code / Pong-RL

Reinforcement learning used in the game of pong

cmake reinforcement-learning cpp q-learning ucb pong-game cpp20 boost-test upper-confidence-bound the-game-of-pong

Updated May 20, 2024
C++

SarCode / ML-Code-Tutorials-Udemy

Complete Tutorial Guide with Code for learning ML

natural-language-processing random-forest svm scikit-learn artificial-neural-networks logistic-regression ucb polynomial-regression kmeans-clustering knearest-neighbor-algorithm apriori-algorithm classification-methods svr kernel-svm kernel-pca heirarchical-clustering decison-trees

Updated Apr 21, 2023
Python

BigBobAtBerkeley / CS70

CS70 Homework and Discussion Solutions

berkeley probability mathematics and uc cs discrete theory ucb 70

Updated Feb 18, 2023

JeremyTremblay2 / m1-ia-jtremblay

Train and test your AIs using these machine learning samples.

python machine-learning reinforcement-learning tensorflow linear-regression scikit-learn machine-learning-algorithms q-learning cnn dnn gradient-descent ucb polynomial-regression knn decision-tree deep-q-learning

Updated Jun 4, 2024
Jupyter Notebook

BigBobAtBerkeley / CS170

CS 170 Homework Solutions

algorithms solutions berkeley efficient and uc problems cs ucb 170 intractable

Updated Feb 18, 2023

mknbv / zamburak

Bandit algorithms in OCaml

trading ocaml ucb bandit-algorithms stochastic-bandit adversarial-bandit exp3

Updated Jul 22, 2020
OCaml

martind-hub / Thompson-Sampling-UCB

Sample code written in R that compares Thompson Sampling and UCB on three arms with rewards sampled from Bernoulli distributions.

thompson-sampling ucb

Updated Feb 5, 2021
R
