News
An Empirical Review of the Animal Harm Benchmark — LessWrong
2+ hour, 25+ min ago (239+ words) Summary: The Animal Harm Benchmark (AHB) is one of only two publicly available benchmarks for measuring LLM bias against non-human animals. This work…...
Was It Owl a Dream? — LessWrong
6+ day, 16+ hour ago (827+ words) I try to apply mechanistic interpretability to understand token entanglement in subliminal learning, fail, and come to suspect subliminal learning is not caused by token entanglement. Subliminal learning is the phenomenon of transferring knowledge to a model by fine-tuning…...
Flamingos (among other things) reduce emergent misalignment — LessWrong
1+ week, 3+ day ago (329+ words) Work conducted as part of Neel Nanda's MATS 10.0 exploration phase. Emergent Misalignment (Betley et al. (2025b)) is a phenomenon in which training language models to exhibit some kind of narrow misbehavior induces a surprising degree of generalization, making the model become…...
Does focusing on animal welfare make sense if you're AI-pilled? — LessWrong
3+ week, 1+ day ago (1510+ words) As the possibility of ASI moves out of kooky thought experiments and into Q4 projections, mainstream animal welfare folks are showing increasing interest in the implications of ASI for animals and on animal welfare in the long-run future. That said, I…...
What It's Like To Be A Worm (Notes on Borderline Sentience) — LessWrong
1+ mon, 4+ hour ago (1590+ words) By Ralph Stefan Weir. Darwin's book is noteworthy not just as the first major text on bioturbation (the reworking of soils by organisms) but also for how he approaches the inner lives of earthworms: Darwin was not the first scientist…...
Would you kill a vulcan to save a shrimp? — LessWrong
1+ mon, 9+ hour ago (1648+ words) Imagine a being very much like a human, with rich conscious experience but no affective conscious states. Call such a creature a Philosophical Vulcan (or p-Vulcan for short.) p-Vulcans differ from the regular Vulcans on Star Trek who are low-affect,…...
how whales click — LessWrong
1+ mon, 2+ day ago (287+ words) After the phonic lips, sound passes through the vocal cap. The same paper notes: The phonic lips are enveloped by the "vocal cap," a morphologically complex, connective tissue structure unique to kogiids. Extensive facial muscles appear to control the position…...
Backyard cat fight shows Schelling points preexist language — LessWrong
1+ mon, 2+ week ago (927+ words) Two cats fighting for control over my backyard appear to have settled on a particular chain-link fence as the delineation between their territories. This suggests that: I don't have any pets, so my backyard is terra nullius according to Cat…...
A Study Of Instinct — LessWrong
2+ mon, 2+ week ago (1065+ words) (This is a lightly edited crosspost from my Substack.) I told the forest: I want to know you. I want to know you as an animal does, living with you, in you, as you. I'm learning the feeling of instinct....
Viewing animals as economic agents — LessWrong
2+ mon, 2+ week ago (156+ words) tl;dr: Studying how animals assign values and allocate resources contextualizes human goal-directed behavior. Here I introduce some factual claims about animals, and a suggestion for how terms from economics can be applied to their behavior. Impatience and Risk Aversion…...