Q* (Q-star) by OpenAI Explained (no AGI)

Published: 20 October 2024
on the channel: Discover AI

What is Q*? In this video, we demystify the enigmatic Q* (Q-star) concept by OpenAI, breaking it down into simple, understandable terms. Far from the realm of AGI or superhuman intelligence, Q* actually finds its roots in the principles of entropy in physics and the nuances of agent behavior, particularly focusing on imitation learning.

Join us as we unravel the real science behind Q*.

While I was reading about the OpenAI board drama, one of my subscribers asked me whether I know Q*. Sure, let's dive into Q and Q* (a PhD in theoretical physics helps).

Residual Q-Learning: Offline and Online Policy Customization without Value (UC Berkeley and Toyota Research Institute)
https://arxiv.org/pdf/2306.09526.pdf

First appearance of Q*:
Reinforcement Learning with Deep Energy-Based Policies (OpenAI, UC Berkeley, ..)
https://arxiv.org/pdf/1702.08165.pdf

Chapters:
00:00 What is Q?
00:40 Q function explained
01:41 Q-learning update rule Bellman
03:36 Markov Decision Process
05:05 We compute Q
06:03 Residual Q-Learning Oct 2023
08:17 Policy customization, multi tasks
12:47 Residual Soft Actor Critic
13:06 Residual Max-Entropy MC
13:24 Q* a soft Q-function Oct 2023
14:35 Q* in Max Entropy RL
16:57 Q* dev by OpenAI & Berkeley
17:30 Maximum Entropy Policies w/ Q star
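For readers who want a concrete anchor for the chapters above, here is a minimal tabular sketch of the classic Q-learning (Bellman) update covered at 01:41, alongside the soft Q-function update from maximum-entropy RL discussed from 13:24, where the hard max over actions is replaced by a temperature-weighted log-sum-exp. The tiny 2-state environment, learning rate, and temperature values are illustrative assumptions, not taken from the video or the linked papers.

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Standard Q-learning (Bellman) update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

def soft_value(Q_row, temperature=1.0):
    """Soft (maximum-entropy) state value:
    V(s) = temperature * log sum_a exp(Q(s,a) / temperature),
    a smooth surrogate for max that rewards policy entropy."""
    return temperature * np.log(np.sum(np.exp(Q_row / temperature)))

def soft_q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99, temperature=1.0):
    """Soft Q-learning update: same Bellman backup, but with the
    soft value of the next state in place of the hard max."""
    target = r + gamma * soft_value(Q[s_next], temperature)
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# Tiny illustrative example: 2 states, 2 actions, one transition.
Q = np.zeros((2, 2))
Q = q_update(Q, s=0, a=1, r=1.0, s_next=1)

Q_soft = np.zeros((2, 2))
Q_soft = soft_q_update(Q_soft, s=0, a=1, r=1.0, s_next=1)
```

As the temperature approaches zero, the soft value collapses to the ordinary max and the two updates coincide; at higher temperatures the soft backup credits states from which many actions look good, which is the entropy connection the video traces back to physics.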




#openai
#AGI
#Q*