Effective Altruism News
- Statement: I'm far from an EA or AI expert; these are all opinions from a 19-year-old EA layperson (and are probably biased or woefully wrong). Feel free to critique me in the comments (rather than just downvoting me).
- In this case study, we used a method called process tracing to demonstrate the impact of our Animal Product Impact Scales on Anima International France and their decision to change their organizational strategy. The post Tracking Our Direct Impact: A Case Study Using Process Tracing appeared first on Faunalytics.
- UK AISI, Model Transparency Team. Epistemic status: Most experiments were run over a period of ~2-3 days during a hackathon at UK AISI, and were fairly heavily vibe coded. Expect some of this to be rough around the edges. Tl;dr:
- EA Forum Digest #295. Hello! CEA is hiring for a financial controller, a recruiter, and for roles on the Events team. All roles listed here. It’s also organisation update week, so check out this thread for jobs, research updates and opportunities relating to EA orgs. Toby (for the Forum team). We recommend:
- This post is based on my personal views, which mostly overlap with the views of my employer ControlAI but do not necessarily fully reflect them. This applies in particular, but not exclusively, to technical opinions about AI development and geopolitical predictions. You might’ve heard that superintelligent AI (ASI) poses extreme risks like human extinction and other comparably undesirable...
- How do we know when the world has changed? On June 1, a team of scientists published a preprint scientific paper claiming they had edited human embryonic DNA with more precision than any previous attempt. As a technical achievement, the work is undoubtedly impressive, largely avoiding the errors that had accompanied earlier efforts to gene […]...
- guest post!
- The Claude Fable 5/Mythos 5 System Card has a section in which they talk about illegible reasoning, and provide an "extreme" example thereof. Models developing their own uninterpretable, unmonitorable internal language has been a major theoretical concern for a while, and when o3 was released last year with its disclaim overshadow disclaim vantage style word salad CoT, it seemed like the...
- TL;DR: Recent work from Goodfire & UK AISI – Verbalized Eval Awareness Inflates Measured Safety – shows that newer open-weight models verbalize evaluation-awareness (VEA) more often, and that this inflates measured safety. Between OLMo-3-32B-Think and OLMo-3.1-32B-Think – identical base, SFT, DPO, and RL data, differing only in an additional ~3 weeks of the RLVR stage – VEA roughly doubles.
- Your farmed animal advocacy update for early June 2026
- A new outlet for discussion in Ulster
- The post Optimizing Government-Led Community Health: A New Model for Sustainable Scale appeared first on Living Goods.
- This is a linkpost for https://www.anthropic.com/news/claude-fable-5-mythos-5. Discuss...
- Ten years ago, a shocking discovery sparked a movement. Today, Crustacean Compassion is celebrating a decade of changing how the world sees and treats crabs, lobsters, prawns and crayfish.
- I grew up in South Florida, which leads the nation in drowning deaths for children.
- A simple taxonomy of the main proposals for post-AGI universal redistribution
- I'm a freelance web designer and developer who has been concerned about AI and prioritising a transition into AI safety since late 2025. This post is a summary of my experience so far, as a possibly useful addition to the conversation around the need for generalists in AI safety.
- In my post “ Why I’m not a Bayesian”, I argued that the Bayesian approach of assigning credences to propositions with binary truth values only works in simple and restricted domains. Instead, I claimed, a better approach to epistemology is to assign degrees of truth to models of the world.
- Over the past 15 months or so, ARC's technical agenda has developed quite a bit. The advent of the Matching Sampling Principle (MSP), and ideas like it, has begotten a host of concrete technical problems; progress on those problems has given us more philosophical clarity on the big picture, which has led to even more technical progress.
- June 2026: We've just launched this program and are inviting the first Affiliates. We expect to invite more over time; register your interest below. About the program Research Affiliates pursue their own research directions for reducing risks of astronomical suffering (s-risks), with CLR’s funding, affiliation, and research community.
- In a new paper in Cyber Security: A Peer-Reviewed Journal, Sarah Powazek, Director of CLTC’s Public Interest Cybersecurity Program, addresses the challenge of “usability” in cybersecurity, particularly for…. The post New Paper Highlights the Need for Usable Cybersecurity appeared first on CLTC.
- In an op-ed published by Tech Policy Press, Ann Cleaveland, Executive Director of the Center for Long-Term Cybersecurity, argues that, in the face of significant new cyber threats…. The post Op-Ed Calls for “Project Kaleidoscope” to Bolster Community Cyber Defense in the Age of AI appeared first on CLTC.
- TL;DR: My new prior is that top-of-the-line LLMs working on easy tasks generate code that is maybe 10% more complicated than necessary. I also think we accept this complexity too easily, because it comes from code that is right here, right now, solving an immediate problem.
- TL;DR: What is slop, and why? Is it fundamental? Is it in the room with us right now? And, most importantly, how do we exorcise it? Previously in this series: This Week In Fashion and On Automatic Ideas. A potential post for this Substack starts when I pick up an idea by talking to a smart person or revisiting an evergreen topic.
- You won’t believe how low big tech has stooped in their slime campaign against Alex Bores...
- The battle lines of the AI morality debate are being laid down. On one side you have the ChatGPT dogma: AI as mere tools with no real preferences or even beliefs. On the other you have the twitter AI whisperers: AIs as complex beings with rich personalities and desires which deserve our respect. And in the middle you have the official Anthropic line, that they are genuinely uncertain, as is...
- This study reveals how guided Arctic king crab tours normalize animal suffering through storytelling, shaping tourist behavior, and masking ethical concerns. The post Safari Of Suffering: The Reality Of King Crab Tourism appeared first on Faunalytics.
- works better than you'd think
- The Great Exhibition Road Festival is a free annual celebration of science and the arts each summer in South Kensington, led by Imperial College London. Visitors could enjoy hands-on workshops, interesting talks, performances and installations from iconic museums, research and culture organisations in South Kensington.
- By Abhi Kumar, Associate Program Officer in Farm Animal Welfare. Note: We used AI (Claude) to draft this post from other documents related to this RFP. All content was reviewed by Abhi and the CG team for accuracy. Over 100 billion animals are farmed and slaughtered for food every year.
- "LLMs just imitate humans." It's an often-repeated claim about AI, and it's false. In this clip from Modern Wisdom, Eliezer Yudkowsky breaks down how the recent breakthrough of applying reinforcement learning to chain of thought lets models move past imitation. Have the model take 20 attempts at a problem, find the one that works best, then train it to think more like that successful attempt.
- Grateful to Benjamin Vincent and Alex Rubinsteyn for our many conversations on this topic, and comments on drafts of this essay! Introduction. When most people hear “cancer vaccine,” they’ll think of normal vaccines. Perhaps they’ll even think of what ostensibly is a cancer vaccine: the HPV vaccine.
- I often use what I’ll call the “safety-usefulness tradeoff model”, which is: developers face a tradeoff between "safety" and "usefulness" of an AI deployment, and the developer has only limited willingness or ability to sacrifice usefulness for the sake of safety.
- When is "increasing safety budget" a useful concept?
- TL;DR: Bun is a very large and very influential open-source project. It is being migrated from the easier-to-read Zig programming language to harder-to-read but memory-safe Rust. This is done almost entirely by the AI tool Claude Code.
- When the world wakes up to the unacceptable danger of AI development, what happens to those responsible? The Berkeley trials, perhaps.
- "We see these AIs as a galaxy glittering with capabilities, but at their center, invisible to the naked eye, holding all the constellations together, is an unimaginably massive black hole of data."
- Executive summary
- Most flags used to be ugly. They were probably better that way.
- Hi everyone! Over the last six months or so, those of you who listen to the 80,000 Hours Podcast might occasionally have heard an unfamiliar voice asking questions to our guests. The person behind that unfamiliar voice is me, Zershaaneh! I'm not saying I'm also Banksy, but I'm not not saying that.
- She knew she wanted to help animals, she just couldn’t decide how. Becca Rogers had been sitting with that question since 2019, when she left PETA after 1.5 years doing undercover work and stepped into a tech ed company. She still cared deeply about animals and she needed to find her way back, but the […]...
- The post Our top tips for becoming a better applicant appeared first on 80,000 Hours.
- We're not on indigenous land
- What does an AI even ‘want’ anyway?
- A survey of 500 U.S. dog guardians explores how ethical beliefs about animals influence training methods, showing that human-centered views are linked to punishment while welfare-focused views favor gentler approaches. The post Ethical Beliefs Shape How People Train Their Dogs appeared first on Faunalytics.
- When will markets price the singularity?
- At Clearer Thinking, we're running a collaboration survey about the psychological challenges of various kinds related to working on high-impact problems (e.g., existential risk, AI safety, climate change, animal welfare, global health, bio/nuclear safety, and other topics), and what people find helpful in dealing with those challenges. We are interested in hearing from you whether you...
- In just 10 days over the summer of 1854, 500 people died of cholera in the Soho neighborhood of London. The city’s population had more than doubled to 2.3 million people in the first half of the 1800s, and its sewage system could not keep up. But the streams of human waste flowing into the […]...
- If by whiskey.....
- Greetings from a world where…...
- In philosophy of mind, "mental causation" means mental entities have causal effects, especially physical ones. If physicalism is true, then physical effects are explainable in terms of physical causes (or at least, fundamental physical laws), needing no recourse to causation by anything that is not in fundamental physics.
- We're looking at livelihoods research again (contribute ideas or reach out to support the team with your expertise), sharing our new methods site, and releasing updated SADs guidance. Looking at livelihoods and growth again.
- "We are approaching a runaway to superintelligence that could threaten our shared human future."
- Image credit: Jebulon. To prevent superintelligent AI from killing everyone, I would like there to be a strong international agreement banning the development of ASI until it can be proven safe. But that sort of agreement requires a lot of political buy-in and coordination. In the meantime, it may be easier to get light-touch AI safety regulations passed.
- Hive Slack Threads: May
- Instead of using static position increments (+1) per token, RoPE-based language models can learn per-token and per-layer position increments. This has no detectable effect on model performance but allows us to see what the model thinks the distance is between each position and how this varies per-layer... Example sentence with each character plotted based on per-layer learned position increments.
- Crosspost. The best charities save lives for a few thousand dollars. If you earn $100,000 per year and give away 10%, you can save about 100 people over the course of your life. I think we generally aren’t good at visualizing what’s really at stake. So here is me attempting vaguely to grok 100 lives. Ted Bundy killed around 30 people.
- What happens after we #PauseAI
- We introduce an evaluation for activation verbalizers: can they surface a target model's reasoning as it solves a math problem in a single forward pass? For open-weight NLAs, the answer seems to be: "possibly, but definitely not reliably". Lots of important capabilities currently require AI models to reason "out loud" in a natural-language chain of thought, which means that we can monitor...
- Plus some other stuff
- When I speak to you one-on-one, assuming you’re listening, we’re in parity.
- I'm leading a non-profit team building a pathogen-agnostic early-warning system. As AI systems become increasingly capable substitutes for expert human biologists, the risk that someone could engineer a pathogen to spread widely before detection is going up.
- Epistemic status: don’t know whether I actually believe all of this, but I think it’s worth considering. A “corrigible” agent, per the LW wiki, is: …one that doesn’t interfere with what we would intuitively see as attempts to ’correct’ the agent, or ’correct’ our mistakes in building it; and permits these ’corrections’ despite the apparent instrumentally convergent reasoning saying otherwise.
- Five years ago I read a post on the EA Forum arguing that "election campaign contributions might be a way in which you can have a substantial impact as a small donor". It struck me as weird but plausible: a combination that you see a lot of on the Forum. A few months later I read another post, a case for Carrick Flynn in particular.
- No one programs AI to be dangerous, but that's because they don't program it at all.
- About one year ago, I started spending most of my time organising PauseAI UK. At that time our largest protest had seen fewer than 50 attendees, no prominent politicians or scientists were associated with PauseAI, and I largely ran the UK chapter by myself.
- Recently, the Vatican and Anthropic have shown a united front on artificial intelligence. But are they actually aligned? The post The Rival Theologies of Artificial Intelligence appeared first on Palladium.
- In this video, we talk about AI scheming. In particular, we walk through a study in which OpenAI and Apollo Research tested whether AI models would take "covert actions": strategically withholding, misrepresenting, or concealing information.
- TLDR: Sequentially mixing training objectives incentivises different training dynamics depending on the distinguishability of the training environments and the amount of pressure for shared circuitry. We classify these patterns into three classes: ecological generalists, conditional policies, and strategy churn.
- On what individuals can do
- Originally intended as a quick take, but got a bit longer, so why not turn it into a post. Just sharing my observations & assumptions here about the state of software automation. Happy to hear thoughts on where you think I'm off.
- Here, I summarize the failings of democracy. *
- In their new post on recursive self-improvement, Anthropic argues that a pause in frontier AI development is needed, but unfortunately, they can't pause on their own, because of less cautious actors: We believe it would be good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance...
- In a darkened convention hall in Chicago on May 31, a Harvard oncologist named Brian Wolpin stood at a podium and in a voice that sounded as if he was reading from the phone book, recited a set of numbers that brought a roomful of cancer doctors to their feet for 42 seconds. Adam Feuerstein, […]...
- This post covers key insights for pandemic risk reduction from our Risk Analysis paper, which was awarded ‘Best Paper of 2025’. Global pandemics, such as the Black Death, the Spanish Flu, and most recently, COVID-19, have brought hardship and loss: from global famine to loss of life, from deepening inequality to overwhelming health systems, and more. An important lesson arises: the more...
- Summary: This is a write-up on preparing for warning shots to catalyze international cooperation on AGI risks, and the corollary list of projects one could pursue. We argue we must first (1) understand types of warning shots, then (2) prepare to catch them.
- ➡️ Take action on AI risks: In a few clicks, alert your elected officials and send the prepared template letter. It's automated for minimal effort: https://taap.it/TF-PauseIACampagnes ⬇️⬇️⬇️ Additional information: sources, references, links... ⬇️⬇️⬇️ Interested in this content? Subscribe and click the 🔔 Did you like this video?
- This is a description of the methodology behind the latest iteration of my Targeted Personality Test. Feel free to take it either before or after reading the article. This post can also be read at my Substack. Thanks to Justis Millis for providing feedback and proofreading on this post. In my prior post “Which personality traits are real?
- Suffering-focused ethics (SFE) is a family of moral views that gives special priority to reducing suffering. As you might know, we at CRS find SFE deeply compelling—it is, after all, the backbone of our work. Part of our mission is to research and build a field around SFE. Unfortunately, SFE remains highly neglected in both academia and […].
- Why I stopped donating to animal welfare charities but feel more motivated than ever to redirect money and talent to the cause. I have wanted to write this post for a while. It is an uncomfortable thing to bring up. Many people in the animal welfare space are working really hard, and this post might leave some feeling defeated.
- This is a summary of the work I've done and work I plan to do, and the theories of change and AI progress that motivate my work. I've been working full-time on alignment for three years and change, and thinking about brainlike AGI and its alignment increasingly often since 2004.
- Note - title edited to be more descriptive. This is a summary of the work I've done and work I plan to do, and the theories of change and AI progress that motivate my work. I've been working full-time on alignment for three years and change, and thinking about brainlike AGI and its alignment increasingly often since 2004.
- On the mass extinctions that came before, the one we were already in before generative AI, and how humanity could easily be outcompeted.
- Purveying high quality decorrelated tokens
- Join our team! We’re hiring for 3 open roles. Apply by July 5.
- FLF is running a competition to find the best workflows and methodologies for using AI to produce reliable, trustworthy knowledge bases, grounded in real-world cases. We’re open-minded on the types of submissions we receive and on how they address the problem. We’ve set aside approximately $200k for prizes.