Thompson Sampling for Contextual Bandits with Linear Payoffs

Sep 23, 2025

http://proceedings.mlr.press/v28/agrawal13.pdf

The main contribution of the paper is an upper bound on the regret of the Linear Thompson sampling, Contextual Bandits algorithm, which solves the Contextual Multi-arm Bandit Problem.
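For orientation, here is a minimal sketch of how a linear Thompson sampling agent is commonly implemented: it keeps a Gaussian belief over an unknown parameter vector, samples from that belief each round, and plays the arm whose context scores highest under the sample. The class name `LinearThompsonSampling`, the helper `simulate`, the scale parameter `v`, and the synthetic data are all assumptions made for illustration; they are not taken from this note, and the paper prescribes a specific choice of `v` that is not reproduced here.

```python
import numpy as np


class LinearThompsonSampling:
    """Illustrative linear Thompson sampling agent (hypothetical names)."""

    def __init__(self, dim, v=1.0, seed=0):
        self.B = np.eye(dim)    # regularized design matrix: I + sum of b b^T
        self.f = np.zeros(dim)  # sum of reward-weighted contexts
        self.v = v              # posterior scale; a free parameter in this sketch
        self.rng = np.random.default_rng(seed)

    def select_arm(self, contexts):
        """contexts has shape (n_arms, dim); returns the index of the chosen arm."""
        B_inv = np.linalg.inv(self.B)
        mu_hat = B_inv @ self.f  # regularized least-squares estimate
        # Sample a parameter vector from N(mu_hat, v^2 * B^{-1}).
        mu_tilde = self.rng.multivariate_normal(mu_hat, self.v ** 2 * B_inv)
        return int(np.argmax(contexts @ mu_tilde))

    def update(self, context, reward):
        """Fold the played arm's context and observed reward into the statistics."""
        self.B += np.outer(context, context)
        self.f += reward * context


def simulate(n_rounds=200, dim=5, n_arms=10, noise=0.1, seed=0):
    """Run the agent on synthetic linear rewards; return cumulative regret per round."""
    rng = np.random.default_rng(seed)
    mu = rng.normal(size=dim)
    mu /= np.linalg.norm(mu)  # hidden unit-norm parameter (synthetic assumption)
    agent = LinearThompsonSampling(dim, v=0.5, seed=seed + 1)
    regret = np.zeros(n_rounds)
    total = 0.0
    for t in range(n_rounds):
        contexts = rng.normal(size=(n_arms, dim))
        contexts /= np.linalg.norm(contexts, axis=1, keepdims=True)
        arm = agent.select_arm(contexts)
        means = contexts @ mu  # expected reward of every arm this round
        reward = means[arm] + noise * rng.normal()
        agent.update(contexts[arm], reward)
        total += means.max() - means[arm]  # per-round regret is non-negative
        regret[t] = total
    return regret


if __name__ == "__main__":
    print(simulate()[-1])  # cumulative regret after the final round
```

Regret here is measured against the arm with the highest expected reward in each round, which is the quantity the paper's upper bound controls. The simulation only shows the mechanics on toy data and does not verify the bound.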



