Trustworthy Machine Reasoning
with Foundation Models

Tutorial at AAAI 2026

Singapore

Brando Miranda
Stanford

Pan Lu
Stanford

Xiang Yue

Sanmi Koyejo
Stanford

Bo Han
HKBU/RIKEN

Abstract

Recent advances in foundation models have led to remarkable progress in machine reasoning, enabling systems to solve increasingly complex tasks in mathematics, coding, science, and real-world decision making. Despite these gains, foundation model reasoning often suffers from critical trustworthiness issues, including sensitivity to noisy inputs, hallucinated or misleading reasoning traces, vulnerability to adversarial attacks, and limited interpretability.

This tutorial aims to establish a unified and systematic understanding of trustworthy machine reasoning with foundation models. Rather than treating reasoning performance, robustness, safety, and interpretability in isolation, we synthesize recent progress across prompting, test-time scaling, post-training, and agentic reasoning frameworks, highlighting how these components jointly shape trustworthy reasoning behavior.

The tutorial is organized into four main parts. First, we introduce the foundations of machine reasoning in large models, discussing its capabilities, limitations, and emerging trends. Second, we present core techniques for trustworthy reasoning in foundation models, including prompting strategies, test-time scaling methods, and post-training approaches that enhance robustness and safety. Third, we extend the discussion to foundation agents, covering tool-augmented, multi-agent, and multi-modal reasoning, along with their unique trustworthiness challenges. Finally, we examine real-world applications and open research problems, focusing on trustworthy code agents and agentic coding systems that integrate reasoning and tool use for reliable deployment. By consolidating recent advances and open challenges, this tutorial seeks to provide a foundation for future research on trustworthy machine reasoning systems.

Schedule

Time: 14:00, January 20, 2026. Location: Peridot 202

  • Part I: An Introduction to Trustworthy Machine Reasoning with Foundation Models (Bo Han, 30 mins)
  • Part II: Techniques of Trustworthy Machine Reasoning with Foundation Models (Zhanke Zhou, 50 mins)
  • Part III: Techniques of Trustworthy Machine Reasoning with Foundation Agents (Chentao Cao, 50 mins)
  • Part IV: Applications of Trustworthy Machine Reasoning with AI Coding Agents (Brando Miranda, 50 mins)
  • Closing Remarks (Zhanke Zhou, 10 mins)
  • Q&A

Organizers' Bios

Zhanke Zhou

Zhanke Zhou is a Ph.D. student in the Trustworthy Machine Learning and Reasoning (TMLR) Group at Hong Kong Baptist University, advised by Prof. Bo Han. He was a visiting student at the Stanford Trustworthy Artificial Intelligence (STAIR) Lab at Stanford University, working with Prof. Sanmi Koyejo. His research focuses on trustworthy machine reasoning with foundation models, including large language models (LLMs) and vision-language models (VLMs), to solve complex problems in mathematics and coding, and to accelerate scientific discovery and applications in fields such as biology, chemistry, and healthcare. He believes that reasoning is an essential pathway toward artificial general intelligence (AGI), and that trustworthy machine reasoning encompasses key properties including reasoning capability, robustness, safety, and explainability.

Chentao Cao

Chentao Cao is a Ph.D. student in the Trustworthy Machine Learning and Reasoning (TMLR) Group at Hong Kong Baptist University, supervised by Prof. Bo Han and collaborating closely with Prof. Zhun Zhong. His research centers on developing trustworthy machine reasoning frameworks with foundation models, including large language models (LLMs) and vision-language models (VLMs). His goal is to build robust and reliable reasoning models capable of addressing complex problems such as mathematical reasoning. By enhancing the trustworthiness of foundation models, he seeks to advance critical downstream applications, particularly in healthcare and safety, enabling more effective and safer solutions in real-world scenarios.

Brando Miranda

Brando Miranda is a Ph.D. student in the Department of Computer Science at Stanford University, advised by Prof. Sanmi Koyejo. Previously, he was a graduate student at the University of Illinois Urbana-Champaign and, before that, at the Massachusetts Institute of Technology (MIT), where he completed his Master of Engineering in Electrical Engineering and Computer Science under the supervision of Prof. Tomaso Poggio, conducting research on deep learning theory, and served as a Research Assistant at MIT's Center for Brains, Minds and Machines (CBMM). His research interests lie in meta-learning, foundation models for theorem proving, and human- and brain-inspired artificial intelligence (AI). He has received several awards, including the Most Cited Paper Certificate from the International Journal of Automation & Computing (IJAC), two Honorable Mentions from the Ford Foundation Fellowship, the Computer Science Excellence Saburo Muroga Endowed Fellowship, and the Stanford School of Engineering Fellowship, and he is currently an EDGE Scholar at Stanford University.

Pan Lu

Pan Lu is a postdoctoral researcher at Stanford University. He received his Ph.D. in Computer Science from UCLA in 2024. His research focuses on developing AI methods and systems to advance complex reasoning, mathematical intelligence, and scientific discovery. He has served as Senior Program Chair for NENLP 2025, Program Chair for SoCal NLP 2023, and Co-Chair of the MATH-AI workshops at NeurIPS (2021-2024). He is a recipient of several awards, including two Most Influential Paper Awards (NeurIPS 2022, ICLR 2024), a Best Paper Honorable Mention at ACL 2023, the Best Paper Award at the KnowledgeNLP Workshop 2025, and Ph.D. fellowships supported by Amazon, Bloomberg, and Qualcomm.

Sanmi Koyejo

Sanmi Koyejo is an Assistant Professor in the Department of Computer Science at Stanford University and a co-founder of Virtue AI. At Stanford, Koyejo leads the Stanford Trustworthy Artificial Intelligence (STAIR) Lab, which works to develop the principles and practice of trustworthy AI, with a focus on applications in science and healthcare. Koyejo has received several awards, including the Skip Ellis Early Career Award, the Presidential Early Career Award for Scientists and Engineers (PECASE), a Sloan Fellowship, a Terman Faculty Fellowship, an NSF CAREER Award, a Kavli Fellowship, and an IJCAI Early Career Spotlight. Koyejo serves on the boards of the Neural Information Processing Systems Foundation and the Association for Health Learning and Inference, and as President of the Black in AI Board.

Bo Han

Bo Han is an Associate Professor in Machine Learning and the Director of the Trustworthy Machine Learning and Reasoning Group at Hong Kong Baptist University, and a BAIHO Visiting Scientist with the Imperfect Information Learning Team at the RIKEN Center for Advanced Intelligence Project (RIKEN AIP). His research focuses on trustworthy machine learning. He has received multiple paper awards, including the Outstanding Paper Award at NeurIPS, the Most Influential Paper Award at NeurIPS, and the Outstanding Student Paper Award at a NeurIPS workshop. He is also a recipient of the RGC Early Career Scheme, the IEEE AI's 10 to Watch Award, an IJCAI Early Career Spotlight, the INNS Aharon Katzir Young Investigator Award, the RIKEN BAIHO Award, the Dean's Award for Outstanding Achievement, and the Microsoft Research StarTrack Scholars Program.