Join us virtually this Wednesday at 6pm to kick off our monthly Paper Review series! We will be holding these sessions every 4 weeks on Zoom. The first one will be led by Josh Phillips:
This month we will explore the ongoing evolution of AI reasoning models. We will start with a short review of the history of combining reinforcement learning with tree search for reasoning (AlphaGo/MuZero). Then we will cover the real start of the reasoning race with the chain-of-thought innovations highlighted by “Let’s Verify Step by Step.” Finally, we’ll cover the most important insights from DeepSeek’s three recent papers, which build on this prior foundation:
- DeepSeek-V3 – a highly efficient Mixture-of-Experts (MoE) model (671B total parameters, only 37B active per token) powered by advanced CUDA/NCCL optimizations and low-precision (FP8) training, making huge models tractable on constrained hardware.
- DeepSeek-R1 – the first open demonstration that an LLM can learn complex reasoning purely through RL rewards, without any annotated examples (the R1-Zero variant).
- DeepSeekMath – the paper that introduces GRPO (Group Relative Policy Optimization), an alternative to PPO that simplifies training and removes the need for a separate value network; see the sketch after this list.
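If you want a quick preview of what GRPO changes relative to PPO before the session, here is a minimal Python sketch of its core idea (the helper name `group_relative_advantages` is hypothetical, and this is an illustrative simplification rather than the exact objective from the DeepSeekMath paper): rather than learning a separate value network to estimate baselines, each sampled answer's advantage is computed by normalizing its reward against the other answers drawn for the same prompt.

```python
import statistics

def group_relative_advantages(rewards):
    """Sketch of GRPO's group-relative baseline: advantages come from
    comparing each sample's reward to the mean/std of its own group,
    so no learned value network is needed (unlike PPO)."""
    mean_r = statistics.mean(rewards)
    std_r = statistics.pstdev(rewards) or 1.0  # guard against a zero-variance group
    return [(r - mean_r) / std_r for r in rewards]

# Example: rewards for 4 sampled answers to the same prompt,
# as scored by some reward function (values are made up).
rewards = [0.1, 0.9, 0.4, 0.6]
print(group_relative_advantages(rewards))
```

These per-sample advantages would then plug into a clipped policy-gradient update much like PPO's; we'll walk through the full objective on Wednesday.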