Antifragile: A Non-Bullshit Version
All the useful and interesting stuff
This week, I wrote about why Nassim Taleb is self-defeating. However, I still think some of his books contain some really interesting ideas which I sometimes want to refer people to without making them read his obnoxious books. So this is the “bullshit-free” version of his book Antifragile, which I think is the most philosophically interesting of his works.
I. The Fragile, Robust, and Antifragile
A stressor is a change that disrupts an environment, a departure from the average state. How does a system—an object, an organism, a business, a society—react to stressors?
Fragile
Stressors harm fragile systems. A piano is fragile: use only wears it down, and more violent stressors can break it altogether. An intricate Ocean's 11-style heist is fragile: if one component goes wrong, it can ruin the entire plan. The financial sector in 2008 was fragile: a few too many defaults could trigger a chain panic and bring the whole thing crashing down.
The scariest kind of fragile systems have nonlinear fragility. A snowman is linearly fragile to heat: as you increase the temperature, it degrades uniformly into a puddle. A Jenga tower is nonlinearly fragile. Most of the time, while you're pulling blocks, the tower doesn't react. At most, you get a little wobbling. But when you pull that final block and cross a threshold, the whole thing comes crashing down.
The term "nonlinear" here means that if you double the intensity of the stressors, you do not get double the disorder in output. If you pull one Jenga block and observe the tower wobble about 3 millimeters, and you assume linearity, you expect that pulling ten blocks will make it wobble 30 millimeters. But it reacts much more dramatically than that. Nonlinearly fragile systems are so dangerous because you don't get a warning period where you can see that the system isn't working and adapt. It looks fine… until it doesn't.
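Here's a toy sketch of the difference in Python, with all numbers invented for illustration:

```python
def snowman_melt(heat_units):
    """Linear: double the stressor, double the disorder."""
    return 3 * heat_units  # degrades proportionally

def jenga_wobble(blocks_pulled):
    """Nonlinear: barely reacts below a threshold, then collapses."""
    if blocks_pulled < 10:
        return 3 * blocks_pulled  # millimeters of wobble
    return 100_000                # past the threshold: total collapse

print(snowman_melt(1), snowman_melt(10))  # 3 30: extrapolating from one dose works
print(jenga_wobble(1), jenga_wobble(10))  # 3 100000: extrapolating fails badly
```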
Robust
Stressors don’t affect robust systems one way or the other. A steel door is robust to kicks: it will not be harmed, but it won’t be strengthened either. Of course, everything is only robust up to a point—enough force will break it. This also goes for the other system types: they are context dependent and vary based on the type & severity of stressor.
Antifragile
Stressors help antifragile systems. Immune systems are antifragile: suffering a mild form of a disease improves the system's strength once you get through it. Evolution is antifragile: randomness and environmental stressors lead to species that are able to sustain more damage.
Note that antifragile systems can be made from, and even benefit from, having fragile parts. Think about antibiotic resistance: most bacteria are so fragile to antibiotics that they die, but the evolutionary process is so antifragile that taking too many antibiotics will lead to super-bacteria that render us powerless. This is because the fragility of bacteria allows evolution to cheaply identify (in a sense) which bacteria are strong, and only continue with those organisms.
A nonlinear antifragile system is the opposite of a nonlinear fragile system. The Jenga tower looks like it's doing fine, until it suddenly catastrophically fails. The immune system, meanwhile, suffers a lot of small failures. You do still get sick a lot of the time. A vaccine or a mild strain of a new disease will make you feel like shit. But when a ten-times-stronger version of that stressor comes around, you will suffer a lot less than ten times as much, because those small failures allow the system to get stronger.
II. Determinants of system (anti)fragility
It’s all information, baby
Stressors are information. I do not mean that metaphorically, I mean it mathematically. In information theory, the information carried by an event is defined as the negative log of its probability. Here’s an awesome explainer of how that works. Low-probability stressors are how information is introduced into a system.
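If you want to see the definition in action, here's a two-line sketch (Python, with made-up probabilities):

```python
import math

def information_bits(probability):
    """Shannon information ("surprisal") of an event, in bits."""
    return -math.log2(probability)

print(information_bits(0.5))    # 1.0 bit: an everyday event tells you little
print(information_bits(0.001))  # ~9.97 bits: a rare stressor tells you a lot
```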
To gain some intuition, let’s anthropomorphize a little bit. If you are a complex system, when a stressor happens you learn two things: what kind of stressors you might encounter in the future, and how the system (you) reacts to that kind of stressor. It’s like a training run: you see what works and what doesn’t, and you can discard failures and select the components which are working. If you never face a negative disruption, you have no information about what works and what negative disruptions look like.
Now, it's a good exercise in scientific thinking to work out how this happens in systems that lack explicit intelligence. The evolutionary process behind antibiotic resistance doesn't survey the scene after an antibiotic and think "gee, those bacteria with type-X lipoproteins all died, I guess I better manufacture some bacteria with type-Y lipoproteins instead!" What happens is the fragility of the individual bacteria automatically "chooses" the type-Y bacteria, because the type-X bacteria can't reproduce anymore (due to being dead).
Conversely, a porcelain cup does not change in response to nudges and gusts of wind: it keeps the exact same crystalline structure. There is no mechanism by which weak arrangements of molecules are more likely to be discarded and strong arrangements are more likely to form. There is no "learning", so the system gets one nasty surprise and that is it.
Similarly, learning systems which don’t get enough stressors suffer. Imagine a turkey which survives 364 days without seeing turkeys getting slaughtered. It concludes that farmers pose no risk to it… right before Thanksgiving. The sad part is that the night before Thanksgiving is when the turkey is at its maximum confidence that it is safe: it was a lot more cautious when it had only observed a week or two of peace. It would have been better for the turkey if the farmers had mistreated it a little or killed other turkeys; then it could have learned it was worth the risk of attempting an escape.
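To make the turkey's growing confidence concrete, here's one way to model it; Laplace's rule of succession is my modeling choice for illustration, not Taleb's:

```python
# After n peaceful days, estimate P(peace tomorrow) = (n + 1) / (n + 2).
def p_safe_tomorrow(peaceful_days):
    return (peaceful_days + 1) / (peaceful_days + 2)

for days in (7, 30, 364):
    print(days, round(p_safe_tomorrow(days), 3))
# 7 -> 0.889, 30 -> 0.969, 364 -> 0.997: maximum confidence right before the axe
```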
Optionality
Option = asymmetry + rationality
Remember how I talked about evolution “filtering out” the weak organisms and keeping only the strong? That’s optionality. Optionality occurs when you have multiple choices available to you, some better than others (the asymmetry), and you can preferentially select the choices which are best for you (the rationality). A rational agent who gains an option for free is always better off in expectation, because if the option turns out to suck it can be turned down. A rational agent who has to pay a small, fixed amount for an option will lose a little if the option turns out to suck but may gain enormously if the option turns out to be great. This is analyzed in the next section.
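A minimal Monte Carlo sketch of why a free option always helps in expectation (the payoff distribution is invented):

```python
import random

random.seed(0)
N = 100_000

# Baseline: you must accept an uncertain outcome (mean 0, spread 10).
without_option = [random.gauss(0, 10) for _ in range(N)]

# With a free option, a rational agent declines any negative outcome.
with_option = [max(x, 0) for x in without_option]

print(sum(without_option) / N)  # ~0: the raw gamble is worth nothing on average
print(sum(with_option) / N)     # ~4: the option to say "no" adds real value
```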
Upside & downside
Consider a new anti-inflammatory drug which seems pretty effective at reducing joint pain. You suffer mild joint pain, so you decide to take it. You do find joint pain reductions… and suffer a heart attack a few years later. It turns out that this drug also has cardiac effects which were not detected in clinical testing. This really happened, and resulted in at least 88,000 cases of serious heart disease before it was withdrawn.
Assuming you’re not a genius biochemist, you’re very unlikely to be able to predict that rofecoxib (the chemical name of the drug Vioxx) can cause heart disease. But via Taleb’s framework, you can know ahead of time that taking rofecoxib is a bad idea.
Why? Consider the whole span of possible outcomes. If rofecoxib is safe, you will gain a small positive benefit: no joint pain. What about unexpected benefits? Since human bodies have evolved for hundreds of thousands of years and evolution is an antifragile system, it’s very unlikely that a drug designed for joint pain will unexpectedly provide massive benefits like doubling your lifespan: you’d expect evolution to pick up those benefits first. So the upside—the space of possible benefits you get if the drug goes well—is moderate and limited.
But if rofecoxib is unsafe, you have no idea what it would do to your body. Biochemistry is really complicated, and the history of medicine is fraught with side effects. It is possible that rofecoxib could halve your lifespan by leading to a heart attack. Or it could mess with your body in many other ways, as significant as or more significant than joint pain. So the downside is much greater than the upside.
By comparing upside and downside, you can conclude that your body is fragile to rofecoxib. Unexpected effects are more likely to hurt you than to help you, and if they hurt you, they can really hurt you. So it’s probably best not to take rofecoxib.1,2
By contrast, imagine you are deciding whether or not to invest a small amount of money in a new technology whose applications are very big if true. You think the tech is probably crap. If that’s true, you lose a small but finite amount of money: the downside is strictly limited. But if you’re wrong, you might make hundreds to thousands of times your return on investment: the upside is unlimited, or at least the limit is a lot higher. Your investment portfolio is antifragile to small but aggressive bets like this: unexpected effects are more likely to help than hurt you, and if they help, they really help. So it might be a good idea to invest.
"Isn't this just expected value calculation?" Yes and no. If you truly had perfect information, then yes, this heuristic would be a strictly worse approximation of expected value than computing it directly. But what is the probability that this particular drug is safe? What is the probability that this particular technology is successful? In nonlinear fragile systems, base rate estimations are guaranteed to be too optimistic: the Jenga tower looks fine until it's not, and then it's too late. In nonlinear antifragile systems, base rate estimations are guaranteed to be too pessimistic: the system learns by routinely failing in small amounts. In many complex systems, real probability estimates are intractable. According to antifragility theory, the upside/downside heuristic should lead to better practical results than trying to do naïve EV calculations.
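Here's a small sketch of the asymmetry, with invented payoffs; the point is what happens when your estimate of the tail probability is wrong:

```python
# A "drug-like" bet has a capped upside and a huge downside; a "venture-like"
# bet is the mirror image. We don't know the true tail probability, so sweep
# over plausible guesses.
def expected_value(tail_prob, tail_payoff, body_payoff):
    return tail_prob * tail_payoff + (1 - tail_prob) * body_payoff

for p in (0.001, 0.01, 0.05):
    drug = expected_value(p, tail_payoff=-100, body_payoff=+1)
    venture = expected_value(p, tail_payoff=+100, body_payoff=-1)
    print(f"tail p={p}: drug EV={drug:+.2f}, venture EV={venture:+.2f}")

# Misjudging the tail makes the drug-like bet far worse and the venture-like
# bet far better: estimation error cuts against the fragile side.
```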
Redundancy & “efficiency”
Antifragile systems often include some level of redundancy—multiple independent parts performing the same function—while fragile systems lack redundancy and put their eggs in one basket. Optimizing for “efficiency” can unintentionally lead to fragility. Why? Again, upside/downside analysis is a pretty good explanation. Suppose you believe your system has some redundant parts, and you think you can improve yield by 10% by eliminating them. So there’s a fixed, capped gain if you’re right. But what if you’re wrong? Or what if one of the parts fails and there’s no backup? Then yield will decrease by a lot more than 10%; your system may fail entirely.
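A back-of-the-envelope version of that trade-off, with the failure probability and yields invented:

```python
p_fail = 0.02  # chance the primary part fails this period

ev_lean = (1 - p_fail) * 1.10 + p_fail * 0.0        # no backup: failure is ruin
ev_redundant = (1 - p_fail) * 1.00 + p_fail * 1.00  # backup takes over

print(f"lean:      EV={ev_lean:.3f}, worst case = 0.0 (total failure)")
print(f"redundant: EV={ev_redundant:.3f}, worst case = 1.0")
# The lean system wins on average, which is exactly why greedy optimization
# picks it... right up until the period where it loses everything.
```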
This is a pernicious effect of greedy optimization in the face of a turkey problem. Suppose the turkeys have some weaponry they can use to defend themselves from the humans, but there is a marginal cost associated with maintaining it. A turkey that throws the weapons away will be richer than the turkey that keeps them; if they are in direct competition, the market might even eliminate the "inefficient" turkeys that keep parts which, 364 days of the year, do nothing. When Thanksgiving rolls around, the "inefficient" turkeys have suffered a small but manageable consistent loss, while the "efficient" turkeys lose everything.
But if your optimization process uses only short-term reward to decide what survives until Thanksgiving, it will get rid of exactly those parts which don't look like they're doing anything. Think of COVID supply chain problems, or naïve reformers at Chesterton's fence. Moloch problems abound.
Conversely, antifragility works via overcompensation. When an antifragile system suffers a small loss, it does more than recover to its starting state: it absorbs information that makes it do better next time, and builds up "more than it needs." This is how failures can be productive rather than just neutral.
Averages, concavity, & convexity
Researchers place two 90-year-old grandmothers in two rooms. Room A fluctuates slowly between 70° and 80° F. Room B is kept at 50° for 57 minutes and 550° for the last 3 minutes. The researchers, being dumb characters in an explanatory analogy, are mystified to find grandmother A comfortable and grandmother B dead. "I don't understand," says one. "The average temperature of both rooms was 75°!"
Alas, the average is not what matters here. If grandmothers reacted to heat linearly, both would indeed have emerged in the same comfortable state. But grandmothers are nonlinearly fragile to heat: bigger and bigger fluctuations have worse and worse effects, up until the point where the grandmother dies. This is Taleb’s guiding principle:
For the [nonlinearly] fragile, shocks bring higher harm as their intensity increases (up to a certain level). … For the [nonlinearly] antifragile, shocks bring more benefits (equivalently, less harm) as their intensity increases (up to a point).3
Here's what nonlinear antifragility and fragility look like on a graph: [figure: two curves of gains and losses as a function of an input X, a convex curve (antifragile) on top and a concave curve (fragile) on the bottom, each with a "you are here" marker at the average value of X]
The “you are here” point represents the average/typical value of the input variable X. Now ask yourself, for each system, would you prefer:
n doses of the average value,
n random doses of X, which average to the average value?
These are not the same thing! For the top graph (antifragile), n doses of the average nets you n × 0 = 0 gains on average. For random doses, some will be higher than average and some lower. But because the curve is steeper on the higher-than-average side and shallower on the lower-than-average side, the gains from the above-average doses more than outweigh the losses from the below-average ones. A function like this is called convex, and it is expected to gain from increased variability around the mean.4
Similarly, for the bottom graph, the curve is steeper on the higher-than-average side, but in the wrong direction—it's steeply negative. So you're like the grandmother: the benefits of a cool temperature for 57 minutes do not even come close to outweighing the cost of an extremely hot temperature for the last 3. A function like this is called concave, and it is expected to suffer from increased variability around the mean.
This is the mathematical underpinning of fragility and antifragility, and it is sometimes the simplest way to identify them: would your system do better with exactly average inputs, or with a variety of inputs with the same average? If the former, it is fragile; if the latter, antifragile.
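You can check the grandmother arithmetic directly; the "comfort" function below is an invented concave stand-in (0 is ideal, more negative is worse):

```python
def comfort(temp_f):
    return -(temp_f - 75) ** 2

room_b = [50] * 57 + [550] * 3        # Room B, minute by minute
avg_temp = sum(room_b) / len(room_b)  # 75.0, the same average as Room A

print(comfort(avg_temp))                     # -0.0: f(E[X]) is zero, Room A is fine
print(sum(comfort(t) for t in room_b) / 60)  # -11875.0: E[f(X)], Room B is dead
# For a concave response, E[f(X)] <= f(E[X]): same mean, very different outcomes.
```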
III. Exploiting fragility & antifragility
Focus on system type, not forecasting
Some guy wrote a book about how the real world of risk—science, economy, politics, military, discovery, etc.—does not behave like games of "risk" such as poker and blackjack. In fact, those games are about as far from real risk as possible: they have precisely calculable probabilities. The fact that the house always wins is proof: no rogue poker player can blow up the entire house in one fell swoop, but that can happen in financial systems.
Even "experts" suck at predicting things in real-world systems. In that same book, there are a ton of examples: the planning fallacy, economic "forecasts" with awful track records, poor domain transfer, etc. But even if experts were pretty good on average, the problem is that the most unexpected events are often the only ones that matter. If you are a venture capitalist, the performance of the typical startup does not matter to you when the occasional startup is thousands of times more valuable than all the rest. If you are trying to prevent a financial crisis, your job is to learn about extreme cases, and thanks to nonlinear fragility and the turkey problem, most of the day-to-day data you observe provides almost no information.
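To see why the typical case can be uninformative, here's a sketch with invented power-law returns:

```python
import random

random.seed(1)

# Pareto-distributed returns: most startups return about 1x, a rare few
# return orders of magnitude more. (The alpha is made up for illustration.)
returns = sorted(random.paretovariate(1.2) for _ in range(10_000))

median = returns[len(returns) // 2]
mean = sum(returns) / len(returns)
top_share = sum(returns[-100:]) / sum(returns)  # share held by the top 1%

print(f"median {median:.1f}x | mean {mean:.1f}x | top 1% hold {top_share:.0%}")
# The median startup tells you almost nothing about the portfolio's fate.
```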
What to do? Taleb thinks you should pretty much ignore concrete forecasting and focus entirely on whether you are in a fragile, robust, or antifragile system, and do everything in your power to move towards antifragility. You don’t have to know what the next great scientific achievement will be—if you knew, it wouldn’t be groundbreaking—but you can figure out the research practices which grant you high exposure to new ideas and cheaply test/discard the bad ones. You don’t have to know what will trigger a financial crisis, but you can look at how the banks manage their debts and see if they have redundancy and opportunity to learn from mistakes, or if mistakes are likely to compound.
It is much easier and more computationally tractable to identify system type rather than predict the future, and system type matters more than the specific future anyway. So do that!
Apply the barbell strategy
A barbell is heavy on one end, skinny in the middle, and heavy on the other. The “barbell strategy” is one way of making a system antifragile. The implementation is:
Invest 90% of resources in extremely safe assets. (Heavy left side)
Invest the remaining 10% in extremely speculative, “big if true” assets. (Heavy right side)
Do not invest in “moderately” risky assets. (Skinny middle)
How does this keep you antifragile? Upside/downside analysis! You have clipped downside: you won’t lose more than 10% of your money, which is pretty modest. But your upside is huge: if any of the risky assets pays off, you will gain an extreme amount from them.
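A minimal simulation of the clipped downside, with completely invented return numbers:

```python
import random

random.seed(2)

def barbell_return():
    safe = 0.90 * 1.02  # 90% in near-riskless assets earning ~2%
    speculative = sum(
        0.01 * (50 if random.random() < 0.05 else 0)  # ten 1% "big if true" bets
        for _ in range(10)
    )
    return safe + speculative

outcomes = [barbell_return() for _ in range(100_000)]
print(f"worst: {min(outcomes):.3f}")  # ~0.918: you can never lose more than ~8%
print(f"best:  {max(outcomes):.3f}")  # several times your money if bets hit
```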
There’s also an epistemic argument against “moderate” portfolios: it is easy to identify extremely safe assets and extremely risky ones. But it is much harder to identify what makes an asset only “moderately” safe, requiring modeling and theoretical assumptions which could be flawed. And if they are flawed, there are many more ways for an asset to become unexpectedly worthless than unexpectedly worthwhile. So putting all your money in moderate assets which may share some underlying modeling assumption puts you at much greater risk of ruin than barbelling.
Trust the Lindy effect
The Lindy effect is:
For the perishable, every additional day in its life translates into a shorter additional life expectancy. For the nonperishable, every additional day may imply a longer life expectancy.
Where perishable things are like living creatures, and nonperishable things are like technologies, institutions, books, etc.
Why is this the case? Time, like evolution, is a filtering process. When you see a brand new technology, you do not have that much information as to whether it contains insights that are worth anything. It has not suffered many stressors, and so you don’t see how it reacts to them. But the technology known as the book has been around for hundreds of years, and it is still going strong. The longer the book has survived, the more confident we are that it’s got to have some really worthwhile qualities; after all, most inventions don’t make it.
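A quick simulation of the filtering logic, with both lifetime distributions chosen by me purely for illustration:

```python
import random

random.seed(3)

def expected_remaining_life(lifetimes, age):
    survivors = [t - age for t in lifetimes if t > age]
    return sum(survivors) / len(survivors)

perishable = [random.uniform(0, 100) for _ in range(100_000)]        # creature-like
nonperishable = [random.paretovariate(1.5) for _ in range(100_000)]  # book-like

for age in (1, 10, 40):
    p = expected_remaining_life(perishable, age)
    n = expected_remaining_life(nonperishable, age)
    print(f"age {age:>2}: perishable has ~{p:.0f} left, nonperishable ~{n:.0f}")
# The perishable thing runs down with age; the power-law survivor's expected
# remaining life grows the longer it has already survived.
```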
Therefore, the longer something has stuck around in human history, the more reason you have to trust it, because it has been subject to selective pressures and survived. There might be more reason to listen to tacit knowledge and the advice of grandmothers than to experts with new and complicated methodologies, and to default to whatever the oldest strategy is in the absence of strong reasons to switch.
NB: time and positive selective pressure are both necessary for the Lindy effect to be a good guide to the useful. Parasites, diseases, and drugs have been around humans for a really long time, and those that remain are really good at being parasitic. But that doesn't mean you should eschew vaccines. Similarly, when an environment insulates ideas from colliding with reality (like certain corners of academia), the solutions that thrive in that environment may transfer very badly to the real world, but they are still quite likely to persist in that sterile environment.
Via negativa
Via negativa, or "the negative way", is based on a tradition in theology where scholars don't attempt to describe what God is, but capture what he definitively is not. The Christian God may be indescribable, in which case attempts to capture him precisely will go awry, but the Christian God is definitely not a sadistic murderer, so don't commit sadistic murders. That can be a much more tractable way to do right by God. Via negativa for us simply means that you ought to focus on removing things that are known to be negative rather than adding positive interventions.
Via negativa for Taleb is based more on asymmetries in epistemology than in payoffs. It is easier to gain knowledge of what doesn't work than of what does. A single observation of a black swan disproves "all swans are white", but a million white swans cannot prove it. Similarly, only a few cases are needed to show that a drug has an unacceptable side effect, whereas a hundred seemingly healthy (for now) people can still conceal internal damage.
But there's upside/downside analysis too. Because the new joint-pain drug is new, there is a lot of uncertainty about it; as we saw, the upside is most likely clipped whereas the downside could be huge. For interventions in your life that you've already adopted, removing them has a small known downside (you lose that one useful thing) and a larger possible upside (your life becomes significantly more streamlined).
That's the argument, anyway. I find this somewhat less convincing than the rest of the book; Taleb does not clarify which should dominate between the tinkering-with-optionality stuff (pro-intervention) and the via negativa stuff (anti-intervention). For instance, look at the redundancy/efficiency section from before: how do we know that when we subtract something there aren't hidden downsides, that it doesn't serve some redundant function whose loss makes everything collapse?
The best argument is that this is where the Lindy effect should come in: stuff that has survived for a long time despite selective pressures is probably worth something even if it seems wasteful. New stuff—new for you, new for society, new for humans—is more likely to actually be wasteful, especially if it was selected by human design and not filtered through another antifragile process. Eliminate the new and wasteful, and only add the old (and only some of the old).
IV. Conclusion
Researching and writing this post made me really happy and kind of sad. Really happy, because I feel like antifragility really clicked for me now in a way it never has before, it’s a really elegant and beautiful idea, and now I have a much easier way to share it with others. Kind of sad because Taleb had all of this beautiful work and absolutely took a shit on it with his abrasive and insecure personality. The best book ever written is Gödel, Escher, Bach. Antifragile could have been number two. ‘Twas not to be. But hopefully I’ve provided some optionality with respect to Taleb’s offerings. May you take the good stuff and forget about the rest.
1. Plus, in the hypothetical, rofecoxib is a new drug. Time and observation of what happens to people who take drugs can allow you to filter the good drugs from the bad, an antifragile process. This is why Taleb is big on things that have weathered the test of time.
2. Also, the calculus flips for medicine when you are terminally ill with a very low chance of survival. Your odds of survival can only decrease by so much, but they could get much better. Then it's worth being very aggressive with experimental treatments, according to Taleb. (This is not medical advice or opinion; I am not a doctor.)
3. Taleb does not include "nonlinearly" in the text, arguing that most instances of fragility and antifragility are nonlinear. But I think this is misleading and one of the factors that contributes to Taleb devolving into loose and sloppy metaphor. The original definition of fragility he offers is something which suffers from the introduction of stressors, which is not necessarily concave.
4. This is known as Jensen's inequality, and it holds for any convex function f and random variable X: 𝔼[f(X)] ≥ f(𝔼[X]), where 𝔼[⋅] denotes the expected value with respect to X. (For a concave function, the inequality flips.)