Upper Confidence Bound Algorithms

Rather than exploring by selecting an arbitrary action with a probability that stays constant, the Upper Confidence Bound (UCB) algorithm shifts its exploration-exploitation balance as it gathers more knowledge of the environment. It starts out focused on exploration, preferring the actions that have been tried the least, and moves towards exploitation, selecting the action with the highest estimated reward.
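To make that shift concrete, here is a minimal sketch assuming the UCB1 variant, since the note does not say which UCB variant is meant. Each action is scored as its estimated reward plus a bonus that shrinks the more often the action has been tried. The names `ucb1_select` and `run_bandit`, the constant `c`, and the Bernoulli (0 or 1) rewards are all illustrative assumptions, not something taken from the note.

```python
import math
import random


def ucb1_select(counts, values, t, c=2.0):
    """Pick an action index at step t (1-based) using a UCB1-style rule.

    counts[a] is how often action a has been tried so far,
    values[a] is its running mean reward. c is an assumed constant.
    """
    # Untried actions come first: their uncertainty bonus is unbounded.
    for action, n in enumerate(counts):
        if n == 0:
            return action
    # Score = estimated reward + bonus that decays as counts[a] grows.
    scores = [
        values[a] + math.sqrt(c * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(true_means, steps=2000, seed=0):
    """Simulate a Bernoulli bandit (hypothetical test environment)."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    values = [0.0] * k
    for t in range(1, steps + 1):
        a = ucb1_select(counts, values, t)
        reward = 1.0 if rng.random() < true_means[a] else 0.0
        counts[a] += 1
        # Incremental update of the running mean reward.
        values[a] += (reward - values[a]) / counts[a]
    return counts, values


if __name__ == "__main__":
    counts, values = run_bandit([0.2, 0.5, 0.8])
    print("pulls per action:", counts)
    print("estimated rewards:", [round(v, 2) for v in values])
```

Early on the bonus term dominates, so rarely tried actions win the comparison. As the counts grow the bonus fades and the choice is driven mostly by the estimated reward, which is the exploration-to-exploitation shift described above.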

This is a theoretical note, but it is pretty cool that it also applies to how I live my life: 202006071608

uid: 202006071602 tags: #algorithms #writeup

Date

February 22, 2023
