Introductory Technical AI Safety Fellowship

Every semester and summer, AISST runs an 8-week introductory reading group on AI safety, covering topics like neural network interpretability,¹ learning from human feedback,² goal misgeneralization in reinforcement learning agents,³ and eliciting latent knowledge. The fellowship meets weekly in small groups, with dinner provided and no additional work outside of meetings.

See here for the curriculum from last spring (subject to change).

Apply for the Summer 2024 iteration of the fellowship here (deadline: June 10, 2024).

For people interested in AI policy or governance, we recommend our Governance, Policy, & Strategy (GPS) Fellowship. It is possible to participate in both fellowships.

We have also hosted joint AISST and MAIA workshops, where members and intro fellows discussed AI alignment and interacted with researchers from Redwood Research, OpenAI, Anthropic, and more. Learn more here.

  • Past intro fellows have primarily been undergraduates, master's students, and PhD students (as well as postdocs) from Harvard and MIT. We also occasionally accept students from other Boston-area universities. Non-student professionals are also welcome to apply, especially during the summer fellowship.

  • Participants should be familiar with basic concepts in machine learning, such as deep neural networks, stochastic gradient descent, and reinforcement learning. We may group cohorts according to previous experience.

  • The Intro Fellowship presents AI safety from a technical perspective, often through research papers that discuss specific parts of the alignment problem. We don't discuss the governance implications of these problems, though we expect this background to be highly useful context for later work on AI policy and governance.

    We are also running the Governance, Policy, & Strategy (GPS) Fellowship, which we expect to be more directly useful for people focused on governance.

  • In the past, we’ve received over a hundred applicants each semester and accepted around half.

  • We ask for your availability in the application, and will attempt to accommodate people’s schedules when forming cohorts. Each cohort meets once a week for two hours, with dinner or lunch provided. We’ll be meeting in our office in Harvard Square.

  • We expect participants to go through the Week 0 content, which gives a basic introduction to machine learning.

  • If you've already read all the material in the curriculum, please email us at contact@haist.ai to discuss other ways of getting involved with AISST!

  • The fellowship is facilitated by AISST members with research experience in AI safety, including upperclassmen and graduate students.

  • We have run four iterations: Fall 2022, Spring 2023, Summer 2023, and Fall 2023.