Thompson Sampling for Contextual Bandits with Linear Payoffs

Sep 23, 2025

http://proceedings.mlr.press/v28/agrawal13.pdf

The paper's main contribution is an upper bound on the regret of the Linear Thompson sampling, Contextual Bandits algorithm, which solves the Contextual Multi-arm Bandit Problem.
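To make the object of that bound concrete, here is a minimal sketch of Thompson sampling with a linear payoff model, in the form the algorithm is commonly described: keep a Gaussian posterior over the unknown parameter vector, sample from it each round, and play the arm whose context scores highest under the sample. The function name `linear_thompson_sampling`, the `reward_fn` callback, and the fixed scale `v` are illustrative assumptions, not taken from the note; the paper ties the scale to the problem parameters, which this sketch does not reproduce.

```python
import numpy as np


def linear_thompson_sampling(contexts, reward_fn, v=0.5, seed=0):
    """Sketch of Thompson sampling with linear payoffs.

    contexts:  array of shape (T, n_arms, d), one context vector per arm per round.
    reward_fn: hypothetical callback (t, arm) -> float, the observed reward.
    v:         posterior scale; treated here as a free tuning parameter (assumption).
    """
    rng = np.random.default_rng(seed)
    T, n_arms, d = contexts.shape
    B = np.eye(d)      # precision-like matrix: identity plus sum of b b^T
    f = np.zeros(d)    # sum of reward-weighted contexts
    chosen = np.empty(T, dtype=int)

    for t in range(T):
        mu_hat = np.linalg.solve(B, f)  # ridge-regression estimate B^{-1} f
        # Draw mu_tilde ~ N(mu_hat, v^2 B^{-1}) via the Cholesky factor B = L L^T:
        # L^{-T} z has covariance (L L^T)^{-1} = B^{-1} for standard normal z.
        L = np.linalg.cholesky(B)
        z = rng.standard_normal(d)
        mu_tilde = mu_hat + v * np.linalg.solve(L.T, z)

        # Play the arm that looks best under the sampled parameter.
        arm = int(np.argmax(contexts[t] @ mu_tilde))
        r = reward_fn(t, arm)

        # Update the sufficient statistics with the played context and reward.
        b = contexts[t, arm]
        B += np.outer(b, b)
        f += r * b
        chosen[t] = arm

    return chosen, np.linalg.solve(B, f)
```

The regret the paper bounds is the cumulative gap between the expected reward of the best arm in each round and that of the arm actually played; in a simulation with a known parameter vector it can be computed from the returned `chosen` array.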

