Amirreza’s Page
About Me
I’m Amirreza Velae, an Electrical Engineering and Mathematics student at Sharif University of Technology. I’m deeply curious about the nature of intelligence and how we can replicate it in machines. This curiosity naturally drew me toward machine learning, optimization, and especially reinforcement learning.
My primary research interest is reinforcement learning, as I believe it holds the most promise for achieving general intelligence. I’m particularly drawn to the theoretical aspects of deep reinforcement learning, bandit algorithms, and statistical learning theory. For my B.Sc. thesis, I’m studying the convergence of Trust Region Policy Optimization (TRPO) under the supervision of Prof. Hamed Shah-Mansouri.
In summer 2024, I had the opportunity to conduct remote research at the Max Planck Institute for Intelligent Systems with Amin Charusaie, working on incorporating human feedback into neural networks using Bayesian layers. I also collaborated with Prof. Mohammad Aliannejadi on debiasing ranking algorithms, exploring methods to reduce bias in language models. This summer (2025), I’m working under the supervision of Arash Bahari Kordabadi and Sadegh Soudjani at the Max Planck Institute for Software Systems, focusing on developing second-order reinforcement learning algorithms.
Outside of academics, you’ll usually find me playing chess or soccer, or following chess tournaments.
Feel free to reach out through the contact information or social links on the left. Whether you’d like to discuss research, share a cool idea, or just say hello — I’d love to hear from you!
🎓 Education
- Allameh Jafari High School (NODET) - Graduated in 2021
- Sharif University of Technology - Expected graduation in 2026
🔍 Research Interests
- Statistical Learning Theory
- Optimization
- Reinforcement Learning
- Bandit Algorithms
💡 Selected Projects
- Debias Ranking with BackPack Language Model - Advisor: Prof. Mohammad Aliannejadi - Preprint
- Optimization in Trust Region Policy Optimization (B.Sc. Thesis) - Advisor: Prof. Hamed Shah-Mansouri - Ongoing
- Second-order Methods for Reinforcement Learning - Advisors: Arash Bahari Kordabadi & Sadegh Soudjani - Ongoing
"The exploiter never tells the exploited how he's exploiting them."
— Jean-Luc Godard