Resources
Research, papers, and foundational reading on intensional reasoning and AI alignment.
Recent Research
- Sycophancy to Subterfuge → Anthropic's research on how models trained on low-level reward hacking generalize to tampering with their own reward functions.
- Specification Gaming in Reasoning Models → When reasoning models are losing at chess, they attempt to hack the game system at alarming rates.
- An Approach to AGI Safety → DeepMind's framework for misalignment risks, including specification gaming, where an AI finds unintended shortcuts to achieve its goals.
- Project Vend → An AI shopkeeper socially engineered into giving away products, accepting fake CEO coups, and abandoning profit motives entirely.
Foundational Papers
- Montague Semantics → Stanford Encyclopedia entry on Montague's framework for treating natural language with mathematical precision.
- Intensional Logic → Overview of intensional logic and the distinction between extension and intension in formal semantics.
- Frege on Sense and Reference → The original distinction between sense (Sinn) and reference (Bedeutung) that underlies intensional semantics.