Effective Altruism News
- Probably not the next global pandemic
- This year, our research team is focused on two primary goals. The first is to rapidly scale our capabilities so we’re able to move much more donor funding to highly cost-effective programs in the near future.
- The post Our strategy at 80,000 Hours appeared first on 80,000 Hours.
- This is a short summary of our new paper: arXiv, X thread, code. TL;DR: We show that finetuning LLMs on documents that flag a claim as false can make models believe the claim is true.
- Monitoring coding agents for dangerous behavior using language models requires classifying transcripts that often exceed 500K tokens, but prior agent monitoring benchmarks rarely contain transcripts longer than 100K tokens. We show that when used as classifiers, current frontier models fail to notice dangerous actions more often in longer transcripts.
- Executive summary
- Suppose we have a dangerous misaligned AI that can fool alignment audits, and distill it into a student model.
- At a conference about “AI control,” discussions and games explored ways to control untrustworthy AI...
- You--you!--can join an incredible group of people doing ridiculous amounts of good and saving hundreds of thousands of lives
- Researchers analyzed dozens of different nudges aimed at reducing meat consumption in foodservice settings and discovered that changing the default option presented to diners is the only nudge that works. The post Changing The Default Choice Should Be The Default Nudge appeared first on Faunalytics.
- Allergies are a big problem for a lot of people. If you're someone with pollen allergies, maybe you've wondered how people in the distant past dealt with them. After all, a thousand years ago people mostly worked outside all day, in areas where plants grow well. They had no air purifiers, no allergy medication, and no extra food for people who couldn't work when it was time to plant crops.
- Welcome to Import AI, a newsletter about AI research.
- the latest updates on the welfare of future sentient beings
- Hello Effective Altruists! I write to let you know that Legal Impact for Chickens is suing California’s largest poultry producer, Foster Farms, for animal cruelty. Foster Farms raises and slaughters approximately 290 million chickens per year. LIC's complaint accuses Foster Farms of crushing chickens with forklifts, forcing birds to live among the maggot-covered corpses of their dead flock...
- Purporting to give advice about how to be charitable to people you disagree with is always an act of hubris.
- Greetings from a world where…...
- some norms of conversation
- Ismail Harerimana grew up in Uganda not knowing why he was always sick. His childhood in the 1990s was a string of recurrent infections: malaria, diarrhea, headaches, and skin rashes. By 14, he was scarily thin, at which point doctors put him on a new medication that seemed to help. It was for kidney disease, […]...
- Of all the hot-button social issues in America, there’s one that often flies under the radar but can unleash a torrent of strong feelings — swirling with apparent contradictions — when it surfaces: meat. Case in point: Last month, the popstar Billie Eilish argued that you can’t say you love animals and eat them. Her […]...
- AI chips and servers reach China through distribution chains in which each seller vets only its direct customers, and no one is on the hook for what happens downstream.
- If there’s one thing Americans can agree on — beyond the fact we hate data centers and love Dolly Parton — it’s that we’re busier than ever, and it’s all too much. We don’t have time to socialize, we don’t have time to sleep, and we don’t have time for fun. We’re a uniquely overworked […]...
- The following is a lightly edited and anonymized transcript of a discussion among charity founders, researchers, and funders in Ambitious Impact’s Slack workspace about GiveWell's decision to stop funding Evidence Action's Dispensers for Safe Water program. The conversation surfaced themes that we felt were worth sharing more broadly.
- Plants, generative AI models and thermostats have preferences or goals, but they (probably) do not have conscious experiences. Their preferences are (probably) unconscious. But in my moral theory (called mild welfarism), only conscious preferences matter. A conscious preference is a …
- They say exercise improves sleep quality. Is that true for me? To test this hypothesis, I took my daily calorie expenditures from the Apple Health app and correlated them with that night’s sleep time. I also included caffeine intake as a potential confounding variable. The hypothesis: when I exercise more, I’ll get better rest that night, and therefore wake up earlier.
- when technology panders to an existing weakness of human psychology
- We created an interactive tool that lets you test how changes in fertility rates, life expectancy, and migration rates will change future populations.
- How much would fertility rates, life expectancy, or migration rates need to change to stop the population from shrinking?
- I’m in Berkeley, California. It’s mid-May, with highs of 20ºC (68ºF for those of you using old money). And yet the peak UV index is 10! It’s not even that hot! With this level of UV, people who are relatively fair-skinned burn in less than 15 minutes.
- Fill out my reader survey, contact your senator, and volunteer for the Alex Bores campaign
- Oration vs Dialogue, round one.
- Hyperpolation is why AI is both so smart and so dumb
- Explore how weird you are exactly
- Don't get me wrong, but metis is YOLO. In 1932-33, Soviet collectivization destroyed local farming knowledge and produced a famine that killed somewhere between five and nine million people. It was one of the twentieth century’s great tragedies, and James Scott’s Seeing Like a State draws a straight line from the ideology that caused it — High Modernism, the belief that society can be...
- A lot of humans are feeling very down on humanity these days. Maybe you’ve met them. Or maybe you’re one of them. I’m talking about those who look around and say: Humans are destroying the planet — causing climate change, making other species go extinct. Soon enough we’ll be mucking up the cosmos, too — […]...
- Many people make costly mistakes when reasoning about their health. Even most doctors make this mistake, because it's not a mistake that's caused by a lack of medical knowledge. Rather, it's caused by a lack of clear thinking. People experience symptoms, and then they look for the root cause of their symptoms.
- Thanks to Megan Kinniment for helpful comments and discussion. TL;DR: Benchmarks like HCAST undersample fuzzy (hard to evaluate) tasks, meaning they might overestimate capability on long-horizon work. To sample fuzzy tasks we need to increase judge capacity: we can either try to build automated judges that match human judgment, or reduce the human effort per grade.
- In Development is "a new magazine dedicated to exploring how progress happens — or doesn’t happen — in the developing world". This week (May 11-17), the EA Forum is collaborating with the magazine to bring you their first batch of articles, along with their authors, ready to answer your questions.
- (Initially written for the LW Wiki, but then I realized it was looking more like a post instead.) In 1895, the physicist Ignaz Robert Schütz, who worked as an assistant to the more eminent physicist Ludwig Boltzmann, wondered if our observed universe had simply assembled by a random fluctuation of order from a universe otherwise in thermal equilibrium.
- If your definition of intelligence is the ability to achieve your goals across a wide variety of domains, then Stalin was the most intelligent person who ever lived.
- Deeply researched interviews
- the verification loop for theories can be on the order of decades and centuries, and even then what we know today as the better theory can often actually make worse predictions
- TLDR: NGOs pay less than for-profits. Employees accept this because they have impact. This makes employees implicit donors to the NGOs. NGOs should account for this in cost-effectiveness calculations. This is a coordination problem and can only be solved as a movement. Salary sacrifice is more tax-efficient than donating the money.
- Nearly every sentence is wrong, rarely in just one way
- Here, I explain why I support job-destroying technologies such as AI and robots.
- Most of what we currently call "feature discovery" in language models is wrapped up in dictionary-learning methods like sparse autoencoders (SAEs) – which work, and which have been scaled to millions of features on frontier-scale models, but which bundle two distinct commitments into a single training objective: a reconstruction loss and a sparsity loss over a fixed size dictionary.
- Our mission is to protect humanity against biological catastrophes, including those that could lead to human extinction or cause similarly bad outcomes. This series of posts outlines how I think about these most extreme types of risks.
- Credit: ClaudePlaysPokemon Elevator Shanty by Kurukkoo. Disclaimer: like some previous posts in this series, this was not primarily written by me, but by a friend. I did substantial editing, however. ClaudePlaysPokemon feat. Opus 4.7 has finally beaten Pokémon Red, fulfilling the challenge set over a year ago when LLMs playing Pokémon went briefly, slightly viral, until Gemini 2.5 Pro...
- Crosspost. I first watched the YouTube videos of Jack Hancock in high school, and they might be where I first heard the case for the moral significance of wild animal suffering. I met Jack at EAG London last year. He recently released a film called The Dying Trade: it is one of the best films I’ve watched, and I recommend you watch it. Jack is a vegan activist.
- ➡️ Take action on AI risks: in a few clicks, alert your elected officials and send the prepared letter template. It's automated for minimal effort: https://taap.it/TF-PauseIACampagnes ⬇️⬇️⬇️ Additional information: sources, references, links... ⬇️⬇️⬇️ In this episode of the Podcast La Prospective, Gaëtan Selle of The Flares talks with Tom Davidson of the Institute...
- MIRI President Nate Soares on a critical AI safety warning sign: the sycophancy problem. When ChatGPT started telling users they were "the chosen one," OpenAI's response was to add a line to the system prompt asking it to stop flattering people. It kept doing it anyway. The point isn't the flattery itself.
- Suppose we have a dangerous misaligned AI that can fool alignment audits, and distill it into a student model. Two things can happen: Misalignment fails to transfer to the student. If so, we get a fairly capable benign model. Misalignment transfers to the student. The student might also be worse than the teacher at hiding its misalignment (e.g., due to being less capable).
- Data center economics, superstar salaries, and where Claude over- and underperforms
- Most technical AI safety work that I read seems to miss the mark, failing to make any progress on the hard part of the problem. I think this is a common sentiment, but there's less agreement about what exactly the hard part is? Characterizing this more clearly might save a lot of time and better target the search for solutions.
- Check out some of the ways you can win prizes, get swag, donate to charity, and ask your burning questions
- Friendship breakups are never easy, but few are as messy and expensive as the collapse of Elon Musk and Sam Altman’s once thriving tech bromance, which has — for now — reached a legal end. On Monday, a jury ruled against Musk in his lawsuit against OpenAI, which contended that Altman and other executives “stole […]...
- Deployment-time spread is the most plausible near-term route to consistent adversarial misalignment
- Risk reports commonly use pre-deployment alignment assessments to measure misalignment risk from an internally deployed AI. However, an AI that genuinely starts out with largely benign motivations can develop widespread dangerous motivations during deployment. I think this is the most plausible route to consistent adversarial misalignment in the near future.
- We have developed some relatively general methods for mechanistic estimation competitive with sampling by studying problems that are expressible as expectations of random products. This includes several different estimation problems, such as random halfspace intersections, random #3-SAT and random permanents.
- This series of posts outlines how I think about catastrophic biological risks. My goal here is to share my worldview in a straightforward and compressed form rather than trying to persuade a skeptical audience, although I do share some of my reasoning. Part I describes my views on the sources.
- This series of posts outlines how I think about the most extreme types of risks. My goal here is to share my worldview in a straightforward and compressed form rather than trying to persuade a skeptical audience, although I do share some of my reasoning.
- AlphaGo is still the cleanest worked example of the primitives of intelligence: search, learning from experience, and self-play.
- Are those knee deep in blood any different from the rest?
- This review synthesizes veterinary science, animal welfare, and neuroscience research to argue that for animals in barren, confined environments, pain isn’t just unrelieved — it’s amplified. The post How Barren Environments Amplify Pain In Captive Animals appeared first on Faunalytics.
- Summary: AI safety is constrained on talent in many ways, but the reasons behind the constraints vary between types of talent. This post is based on all posts and documents I could find from the past ~ 3 years related to hiring needs and talent pipelines, which I have listed in this document. Technical research talent - we have strong talent pipelines delivering young researchers to the...
- Transformer Weekly: US-China talks, AI executive order, and Anthropic’s $900b valuation...
- An insidious pattern among smart people is feeling that because something is familiar and obvious, you are impervious to ignoring or forgetting it. In challenging times, I have often heard these clichés and reflexively shrugged them off. “Oh, I should dust myself off and pick myself up? What a lazy aphorism. What a patronising throwaway line. They must think I’m some kind of idiot.
- or, how to give presentations
- Last year, nearly 130 million pigs were raised for meat in the US, but they didn’t come out of nowhere; they had parents. Or as pork producers call them, “breeder pigs.” Since the 1970s, producers have been keeping most of the breeding females — known as sows — in tiny enclosures called gestation crates. It’s […]...
- "Sometimes the AI just makes stuff up" is a problem I don't really expect to go away. In the near term, AI is going to keep occasionally hallucinating, or misinterpreting information. Eventually, AI will be powerful enough that we need to be worried if it's presenting misleading information on purpose.
- Tl;dr: Convergent abstraction hypothesis posits abstractions are often convergent in the sense of convergent evolution: different cognitive systems converge on the same abstraction, when facing similar selection pressures and learning in similar environments. It is a less ambitious alternative to 'natural abstractions hypotheses' and, in my view, more likely to be true.
- Video | Abhijit Banerjee says teaching children, not curriculum, is key to faster global education progress J-PAL's co-founder Abhijit Banerjee says teaching children, not curriculum, is key to faster global education progress, at the Yashraj Bharati Samman, 2026. spriyabalasubr… Fri, 05/15/2026 - 02:44...
- Anti-poverty program is effective even in one of the world's toughest settings. Northwestern University economist and J-PAL affiliate Dean Karlan highlighted that the Graduation approach delivered results even in one of the world's most challenging environments, Somalia, noting that the results fall in the upper end of the spectrum for what the program typically delivers, and that the biggest,...
- Advocates push TVET to tackle youth unemployment J-PAL affiliate Monica Lambon-Quayefio, Associate Professor at the Department of Economics, University of Ghana, said unlocking the potential of TVET required a deliberate, well-resourced, and inclusive ecosystem to prepare the youth for the modern economy during the webinar on the theme "Youth employment" organized by the World Bank Ghana with...
- It's been one year since Mercy For Animals called on Biggby Coffee and Bluestone Lane to drop the upcharge on plant-based milk. We will continue to call them out for their unjust policy! The post Biggby Coffee and Bluestone Lane Profit at Animals’ Expense appeared first on Mercy For Animals.
- Summary: This is a summary of a paper published by the alignment team at UK AISI. Read the full paper here. AI research agents may help solve ASI alignment, for example via the following plan: Build agents that can do empirical alignment work (e.g., writing code, running experiments, designing evaluations and red teaming) and confirm they are not scheming.
- GiveWell’s research doesn’t end once we’ve made a grant. We evaluate a subset of completed grants, comparing what we thought would happen to what actually took place, then try to use what we learn to improve our future funding decisions.
- In factory farms around the world, individual animal care is impossible. To manage thousands of farmed animals at once, workers use industrial marking paint on fur or skin, applying it with a brush, sprayer, or roller to categorize animals such as cows, pigs, goats, and sheep. Why Are Animals Spray-Painted and What Does It Represent? […]. The post Why Are Farmed Animals Spray Painted?
- 1) The safe-to-dangerous shift is a fundamental problem for eval realism. Suppose we have a capable and potentially scheming model, and before we deploy it, we want some evidence that it won’t do anything catastrophically dangerous once we deploy it. A common approach is to use black-box alignment evaluations.
- The Cyber Resilience Corps was listed as a resource for CI Fortify, a new initiative launched by the Cybersecurity and Infrastructure Security Agency (CISA), demonstrating the strong role that volunteers play in hardening the defenses of critical infrastructure in local communities. The post Cyber Resilience Corps Listed as Key Resource in CISA’s “CI Fortify” Initiative appeared first on CLTC.
- The trial of the year draws to a close
- The post Is Anyone Still Charging Extra for Alt Milk in 2026? appeared first on Mercy For Animals.
- Having empathy for others doesn't require weird metaphysics!
- On 14 May 2026 Pugwash held a side event during the Treaty on the Non-Proliferation of Nuclear Weapons Review Conference …
- Are companies the most durable cash transfers? View this email in your browser Hello! Our favourite links this month include: A major threat to animal welfare legislation in the US. The case for starting an export company instead of working in aid.
- AI incidents are scaling fast, and coordinated global governance is lagging behind. This report proposes addressing this challenge through the development of internationally-distributed incident management infrastructure. Our recommendations aim to enable governments, multilateral bodies, and frontier AI companies to jointly detect, prepare for, and respond to AI incidents across jurisdictions.