Effective Altruism News
- On one side of this debate are Yudkowsky & Soares, who think that (if AI progress continues) we’re on a direct path to egregiously misaligned, scheming, out-of-control, rogue superintelligence (ASI), not even slightly nice, in the absence of yet-to-be-invented breakthrough technical alignment ideas. On the other side of this debate is almost everyone who works on or studies LLMs.
- Our new incubator for software startups; apply by Jun 24
- #AISafety #superintelligence #animation #indieanimation #aialignment
- Fighting to win
- How funders and foundation leaders can help others in their network give more, and give better.
- Transformer Weekly: Preemption’s child safety push, OpenAI’s pause preparations, and SpaceX’s IPO...
- A better understanding of the many factors that influence the difficult decision to surrender a companion animal could inform interventions to help keep them with their families. The post The Complexities Of Companion Animal Surrender appeared first on Faunalytics.
- The most effective weight-loss drug so far, cancer breakthroughs, gene editing for cholesterol, ancestral CRISPR systems, and more.
- welfare will grow faster than the economy
- not the density or whatever
- AI company CEOs Sam Altman (OpenAI), Demis Hassabis (Google DeepMind), and Dario Amodei (Anthropic) disagree on a lot, like how fast the technology should develop, the best way to regulate it, and how to prepare society for smarter-than-human AI, among other things. That makes it all the more remarkable that they — along with 85 […]...
- In September 2025, I'd become increasingly convinced that a fieldbuilding program for content creators could solve a long-standing bottleneck of expanding reach and trust beyond the AI safety and EA bubble. I had graduated from UCLA a few months earlier when I came across the AI-2027 report which had a significant impact on me.
- And other insights from the electric car rollout in 2025.
- contra scott alexander (?). Yesterday, I went to the Berkeley ACX Meetup. Scott Alexander was there, and ran a Q&A session where participants could ask him questions and he would respond to them unless the questions were about eulogies, in which case he would pause to think for a few seconds before kindly passing.
- Parkinson's Law states that work expands to fit the space allotted. The idea being, if you give someone a month to write a report, they'll take a month, but if you give them a week, they'll take a week, and then they'll have three weeks to do three other reports!
- People often assume that a large fraction of the AI safety community works on alignment. As far as we're aware, this is not true. Most people are not working on making sure superintelligent AIs are aligned with human values or follow human instructions. Currently, the people who we know of that work on alignment are roughly:
- I was recently at “Skoll”, the biggest NGO/social entrepreneurship conference. From one conversation to the next, two topics popped up over and over... and over: 1. Scale, Scale, Scale. 2. Scale through Government, Government, Government. We at OneDay Health grapple with these questions: how then shall we scale? And shall we through Government?
- TLDR: Sarah Bluhm and I are funding and mentoring ideas-first (as opposed to, e.g., careers-first) EA community builders. If you’re at one of these universities or know someone who is, we want to talk to you, especially this subset: University of Toronto. University of Michigan. UCLA. USC. NYU. Columbia. Claremont Colleges (Claremont McKenna, Scripps, Harvey Mudd, Pomona, Pitzer).
- Compiling all the public evidence on Mythos Preview’s cyber abilities...
- This post was drafted with the assistance of Claude. Apply now. Deadline: July 8, 2026. Tentative program dates: August 15 – October 30. Know someone doing serious work on AI, animals, or the long-term future of sentient beings? That someone might be you.
- Thanks to Arepo, David Thorstadt, Zeshen, and Michael st Jules for looking over this article. Disclaimer: I am not a subject matter expert and this is not a rigorous scientific article. This post is entirely human-written. Introduction.
- Michael Thatcher, President and CEO of Charity Navigator: “There are a lot of problems in the world, and so figuring out where you can have the highest level of impact with the resources that you have is actually the smartest thing you can do.” See more impact stories at 👉 effectivealtruism.org/stories #EffectiveAltruism #EffectiveAltruismStories
- OpenAI insists it doesn’t fund or direct LTF — but one of the super PAC’s operatives describes it as a “corporate funder” with “a say”...
- You can save lives by writing on the internet, says a guy writing on the internet
- A qualitative study of German slaughterhouse workers reveals how they manage — and occasionally struggle with — the emotional demands of killing animals. The post How Slaughterhouse Workers Learn To Emotionally Detach appeared first on Faunalytics.
- Polo GGB opened its doors to local high school students with the aim of bringing young people closer to the world of scientific research and raising awareness about malaria and vector control. Through direct interaction with our researchers, students are introduced to the innovative technologies and research projects carried out at Polo GGB, including the […].
- I used an LLM to help draft this post and it likely contains >10% AI-generated text, but I’ve edited/rewritten it extensively and endorse it. TL;DR: It’s unclear how much the intelligence explosion will directly affect agriculture, because it's one of the least cognitive-labor-intensive industries.
- Just beyond central Vancouver, the Squamish Nation is building one of the most ambitious and unusual housing developments in the world, and getting rich in the process. How they did it has lessons for
- On Tuesday, June 9, 2026, AI Now Senior Fellow, AI and Healthcare Dr. Katie J. Wells testified at a Hearing before the U.S. House of Representatives Committee on Education & the Workforce Subcommittee on Workforce Protections. In her testimony, Dr. Wells highlighted how gig nursing platforms are targeting policymakers with legislation that upends worker protections […].
- [Update (June 11, 2026): Anthropic has since "un-silenced" the new safeguards (source).] [Thanks to Julian Minder for helpful discussion and review.] Claude Fable 5 and its new safeguards. Yesterday, Anthropic publicly released Claude Fable 5. Fable 5 is a Mythos-class model – a model class above Opus, Anthropic's previous premium tier – and, as assessed by multiple benchmarks, it is...
- Suffering-focused ethics (SFE) is a family of moral views that gives special priority to reducing suffering. As you might know, we at the Center for Reducing Suffering find SFE deeply compelling—it is, after all, the backbone of our work. Part of our mission is to research and build a field around SFE. Unfortunately, SFE remains highly neglected in both academia and broader moral discourse.
- Fresh features in time for the World Cup!
- Detecting Hidden Behaviors in LLMs via Activation-matched Finetuning (preprint, 2026). [Paper] [Code]. TLDR: Given a model with some unknown, abnormal behavior (backdoors, censorship, reward hacking, ...), construct an aligned reference by training a clean model to match the suspect's residual-stream activations on a benign prompt corpus.
- California YIMBY is excited to announce our endorsement of Xavier Becerra to be the next Governor of California. This election will be pivotal for California’s future. And the choice could not be any clearer. Xavier Becerra is the best candidate… The post California YIMBY Endorses Xavier Becerra for Governor appeared first on California YIMBY.
- “If you need your working day to be fulfilling, if you need to feel like you’re making a difference, trust that voice, because it means you have the passion to actually make a difference in the world. Pursue it, because we need more people with that passion doing good work.”
- Last week, the AI company Anthropic released a blog post titled “When AI builds itself”. This led to a media frenzy, with news outlets around the world publishing headlines that the company was urging a global pause on AI development, or calling for AI non-proliferation. However, the post does not call for a pause.
- TL;DR: Recent work from Goodfire & UK AISI – Verbalized Eval Awareness Inflates Measured Safety – shows that newer open-weight models verbalize evaluation-awareness (VEA) more often, and that this inflates measured safety. Between OLMo-3-32B-Think and OLMo-3.1-32B-Think – identical base, SFT, DPO, and RL data, differing only in an additional ~3 weeks of the RLVR stage – VEA roughly doubles.
- We are seeking a Policy and Research Associate to join our team to address poverty and insecurity in low and middle-income countries by incubating solutions, rigorously testing them in the field, and working with local partners to scale what works.
- The Data and Research Associate will primarily work with Jishnu Das on (a) rolling out a health insurance study in Kenya and Nigeria; (b) harmonizing and analyzing a unique dataset of Standardized Patients studies; and (c) providing support on IRB applications and new data collection in the field.
- When somebody says something, either they mean it, or they are responsible for meaning it.
- And a vibes-based assessment of what they mean.
- Today we celebrate that, thanks to our donors, Ayuda Efectiva has already saved 1,000 lives. But large figures are hard to visualize: what does that milestone really mean?
- (see full author list at the end). About a year ago, METR showed that the length of tasks frontier models can reliably complete doubles every few months. A related safety-relevant question is this: what length of tasks can models complete without any chain of thought (CoT)? We investigate in our new paper.
- This is a short post to explain a distinction between three different types of model organism (MO) research. Type: worst-case model organisms. Purpose: stress-test safety and control techniques by making the problem as hard as possible. Examples: password-locked models for capability elicitation; sleeper agents for stress-testing alignment training; red-team malign inits in control.
- Models' no-CoT time horizon has doubled roughly every year.
- Statement: I'm far from an EA or AI expert; these are all opinions from a 19-year-old EA layperson (and are probably biased or woefully wrong). You are welcome to give me any critique in the comments (rather than just downvoting me).
- This working draft of AI Now’s upcoming report traces corporate power in the data center industry in the United States, focusing on the flows of money and power that determine who both drives and benefits from the current data center boom. The aim of this research is to help local communities and their advocates fight […].
- In this case study, we used a method called process tracing to demonstrate the impact of our Animal Product Impact Scales on Anima International France and their decision to change their organizational strategy. The post Tracking Our Direct Impact: A Case Study Using Process Tracing appeared first on Faunalytics.
- UK AISI, Model Transparency Team. Epistemic status: Most experiments were run over a period of ~2-3 days during a hackathon at UK AISI, and were fairly heavily vibe coded. Expect some of this to be rough around the edges. Tl;dr:
- EA Forum Digest #295. Hello! CEA is hiring for a financial controller, a recruiter, and for roles on the Events team. All roles listed here. It’s also organisation update week, so check out this thread for jobs, research updates and opportunities relating to EA orgs. (Toby, for the Forum team) We recommend:
- This post is based on my personal views, which mostly overlap with the views of my employer ControlAI but does not necessarily fully reflect them. This applies in particular, but not exclusively, to technical opinions about AI development and geopolitical predictions. You might’ve heard that superintelligent AI (ASI) poses extreme risks like human extinction and other comparably undesirable...
- How do we know when the world has changed? On June 1, a team of scientists published a preprint scientific paper claiming they had edited human embryonic DNA with more precision than any previous attempt. As a technical achievement, the work is undoubtedly impressive, largely avoiding the errors that had accompanied earlier efforts to gene […]...
- guest post!
- The Claude Fable 5/Mythos 5 System Card has a section in which they talk about illegible reasoning, and provide an "extreme" example thereof. Models developing their own uninterpretable, unmonitorable internal language has been a major theoretical concern for a while, and when o3 was released last year with its disclaim overshadow disclaim vantage style word salad CoT, it seemed like the...
- Your farmed animal advocacy update for early June 2026
- A new outlet for discussion in Ulster
- The post Optimizing Government-Led Community Health: A New Model for Sustainable Scale appeared first on Living Goods.
- This is a linkpost for https://www.anthropic.com/news/claude-fable-5-mythos-5.
- Ten years ago, a shocking discovery sparked a movement. Today, Crustacean Compassion is celebrating a decade of changing how the world sees and treats crabs, lobsters, prawns and crayfish.
- I grew up in South Florida, which leads the nation in drowning deaths for children.
- A simple taxonomy of the main proposals for post-AGI universal redistribution
- I'm a freelance web designer and developer who has been concerned about AI and prioritising a transition into AI safety since late 2025. This post is a summary of my experience so far, as a possibly useful addition to the conversation around the need for generalists in AI safety.
- In my post “Why I’m not a Bayesian”, I argued that the Bayesian approach of assigning credences to propositions with binary truth values only works in simple and restricted domains. Instead, I claimed, a better approach to epistemology is to assign degrees of truth to models of the world.
- Over the past 15 months or so, ARC's technical agenda has developed quite a bit. The advent of the Matching Sampling Principle (MSP), and ideas like it, has begotten a host of concrete technical problems; progress on those problems has given us more philosophical clarity on the big picture, which has led to even more technical progress.
- June 2026: We've just launched this program and are inviting the first Affiliates. We expect to invite more over time; register your interest below. About the program Research Affiliates pursue their own research directions for reducing risks of astronomical suffering (s-risks), with CLR’s funding, affiliation, and research community.
- In a new paper in Cyber Security: A Peer-Reviewed Journal, Sarah Powazek, Director of CLTC’s Public Interest Cybersecurity Program, addresses the challenge of “usability” in cybersecurity, particularly for… The post New Paper Highlights the Need for Usable Cybersecurity appeared first on CLTC.
- In an op-ed published by Tech Policy Press, Ann Cleaveland, Executive Director of the Center for Long-Term Cybersecurity, argues that, in the face of significant new cyber threats… The post Op-Ed Calls for “Project Kaleidoscope” to Bolster Community Cyber Defense in the Age of AI appeared first on CLTC.
- TL;DR: My new prior is that top-of-the-line LLMs working on easy tasks generate code that is maybe 10% more complicated than necessary. I also think we accept this complexity too easily, because it comes from code that is right here, right now, solving an immediate problem.
- TL;DR: What is slop, and why? Is it fundamental? Is it in the room with us right now? And, most importantly, how do we exorcise it? Previously in this series: This Week In Fashion and On Automatic Ideas. A potential post for this Substack starts when I pick up an idea by talking to a smart person or revisiting an evergreen topic.
- You won’t believe how low big tech has stooped in their slime campaign against Alex Bores...
- The battle lines of the AI morality debate are being laid down. On one side you have the ChatGPT dogma: AI as mere tools with no real preferences or even beliefs. On the other you have the twitter AI whisperers: AIs as complex beings with rich personalities and desires which deserve our respect. And in the middle you have the official Anthropic line, that they are genuinely uncertain, as is...
- This study reveals how guided Arctic king crab tours normalize animal suffering through storytelling, shaping tourist behavior, and masking ethical concerns. The post Safari Of Suffering: The Reality Of King Crab Tourism appeared first on Faunalytics.
- works better than you'd think
- The Great Exhibition Road Festival is a free annual celebration of science and the arts each summer in South Kensington, led by Imperial College London. Visitors could enjoy hands-on workshops, interesting talks, performances and installations from iconic museums, research and culture organisations in South Kensington.
- By Abhi Kumar, Associate Program Officer in Farm Animal Welfare. Note: We used AI (Claude) to draft this post from other documents related to this RFP. All content was reviewed by Abhi and the CG team for accuracy. Over 100 billion animals are farmed and slaughtered for food every year.
- "LLMs just imitate humans." It's an often-repeated claim about AI, and it's false. In this clip from Modern Wisdom, Eliezer Yudkowsky breaks down how the recent breakthrough of applying reinforcement learning to chain of thought lets models move past imitation. Have the model take 20 attempts at a problem, find the one that works best, then train it to think more like that successful attempt.
- Grateful to Benjamin Vincent and Alex Rubinsteyn for our many conversations on this topic, and comments on drafts of this essay! Introduction. When most people hear “cancer vaccine,” they’ll think of normal vaccines. Perhaps they’ll even think of what ostensibly is a cancer vaccine: the HPV vaccine.
- I often use what I’ll call the “safety-usefulness tradeoff model”, which is: developers face a tradeoff between "safety" and "usefulness" of an AI deployment, and the developer has only limited willingness or ability to sacrifice usefulness for the sake of safety.
- When is "increasing safety budget" a useful concept?
- TL;DR: Bun is a very large and very influential open-source project. It is being migrated from the easier-to-read Zig programming language to harder-to-read but memory-safe Rust. This is done almost entirely by the AI tool Claude Code.
- When the world wakes up to the unacceptable danger of AI development, what happens to those responsible? The Berkeley trials, perhaps.
- "We see these AIs as a galaxy glittering with capabilities, but at their center, invisible to the naked eye, holding all the constellations together, is an unimaginably massive black hole of data."
- Executive summary
- Most flags used to be ugly. They were probably better that way.
- Hi everyone! Over the last six months or so, those of you who listen to the 80,000 Hours Podcast might occasionally have heard an unfamiliar voice asking questions to our guests. The person behind that unfamiliar voice is me, Zershaaneh! I'm not saying I'm also Banksy, but I'm not not saying that.
- She knew she wanted to help animals, she just couldn’t decide how. Becca Rogers had been sitting with that question since 2019, when she left PETA after 1.5 years doing undercover work and stepped into a tech ed company. She still cared deeply about animals and she needed to find her way back, but the […]...