How An Algorithm Would Handle a Black Swan

In summary, it wouldn't.

Background

A bit of background first though.

What is a Black Swan? Black Swans are events that are estimated to be extremely improbable but carry outsized impact. The label is reserved for history-shaking events such as the emergence of the internet or 9/11. The term surged in popularity after Nassim Taleb's book of the same name (while he’s gained a bit more “notoriety” outside the literary sphere over the years, I'd highly recommend his writings).

Hedge funds like Universa Investments have even grown to impressive wealth, with reported gains of 4,000%+ in early 2020 from betting on Black Swans. More to come on that later though...

Pop Culture Perspectives

With the background covered, what does pop culture have to say about an algorithm’s ability to predict these Black Swans?

If you subscribe to a popular opinion that oft comes in waves, you would think algorithms are infallible supercomputers that can not only predict Black Swans but will also genetically engineer a real-life black swan while solving the meaning of life and brewing a new cappuccino called Cisne Negro with their infinite spare cycles.

If you subscribe to the opposing philosophy that underlies, say, Ray Dalio’s “Principles”, then you'd believe we can of course programmatically capture these events. This is because “history repeats itself over and over” ad nauseam; thus, almost no event ever qualifies as a Black Swan. This is at the core of Bridgewater’s unprecedentedly long track record of investment success, so how could it be wrong?

It goes without saying, but the former misses its mark by a wide margin. We may be in the age of algorithms, but those algorithms are still a long way from being our overlords.

As for the latter, while it's the more historically pervasive view, it’s still off target in relation to Black Swans. This is because an algorithmic prediction of a Black Swan is fundamentally challenged by the interplay of its three core components: model, data, and target variable (there are other supplementary components, but we'll focus on these for simplicity).

How do these 3 pieces drive a Black Swan prediction?

Algorithmic Components

First and foremost, the target variable (i.e., y). This is the linchpin among the components: we have to make sure that we are predicting the thing we actually want to predict. If this sounds redundant, that’s because it is, but it’s also critical. If you want to predict a Black Swan event but are actually predicting whether a photo is a cat or a dog, then the usefulness of the algorithm stops there. Predicting the wrong thing means none of the components below matter. Note that in most situations, predicting the wrong thing isn’t so unforgiving, as you can learn from the miss in future predictions. Not so with a Black Swan. You typically only get one shot at these given their infrequent nature.
Assuming the target variable is right, the next component is data (i.e., x). Data can be dicey, though the question is often oversimplified to whether the training data has enough representative samples or not. It gets dicier when a situation wasn’t exactly in the data but could be abstracted from it. Going back to our cat versus dog classifier example, a golden retriever may not have been in the data, but there were enough similar dogs for the model to figure things out. Which is a good segue into…
The model (i.e., f). This is the engine of the algorithm and can come in every color of the rainbow (random forest, linear regression, deep learning, support vector machine, optimization model, etc.). All we care about for the sake of this exercise, though, is whether the chosen form has sufficiently high accuracy and generalizes well.
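
To make the three pieces concrete, here's a minimal sketch of the cat versus dog example in Python. Everything in it is an assumption for illustration: the two features (weight and snout length), the numbers, and the choice of a nearest-centroid classifier as the model f. It's a toy, not a recommendation for how to build a real classifier.

```python
from math import dist

# Hypothetical training data: x = (weight in kg, snout length in cm), y = label.
training_data = [
    ((4.0, 2.0), "cat"),
    ((5.0, 2.5), "cat"),
    ((3.5, 1.8), "cat"),
    ((25.0, 9.0), "dog"),
    ((30.0, 10.0), "dog"),
    ((8.0, 5.0), "dog"),
]


def fit_centroids(data):
    """The model f: average the feature vectors of each label."""
    grouped = {}
    for features, label in data:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(axis) / len(rows) for axis in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the new sample."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(training_data)

# A golden retriever (made-up measurements) that never appeared in training.
golden_retriever = (32.0, 10.5)
prediction = predict(centroids, golden_retriever)
print(prediction)
```

With these made-up numbers, the golden retriever lands closest to the dog centroid even though no golden retriever was in the training data. Note also that the target variable only allows two answers, so anything you feed this model comes back as a cat or a dog.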

Verdict

So, how would an algorithm then handle a Black Swan based on the workings of these components?

In relation to choosing the target variable, almost no one saw the most recent “Black Swan”, COVID-19, coming. Because it was on so few people’s radar, the likelihood of the right target variable being selected was equally low. But if one did somehow anticipate a coming Black Swan, for argument's sake…
The algorithm would require data that accurately represents the situation. This is challenging because, given the nature of Black Swans, no existing data replicates the situation. For COVID-19 as an example, the closest cousins were the Spanish flu, SARS, and MERS, with each diverging in significant ways (e.g., availability of data, infectious characteristics, mortality rate, etc.). This translates to a need to fabricate data using very accurate assumptions (read: challenging), and/or…
The algorithm would need a truly impressive model that can abstract from the incomplete / imperfect data, assuming the fabricated data isn’t accurate (unlikely).

In other words, an algorithm can’t handle Black Swans without a highly active and accurate imagination behind it, one that leads to design decisions that anticipate these events, make the right model assumptions, and generate representative data.

That’s the nature of Black Swans though - one can’t anticipate them by definition. No algorithm/human can predict what it doesn't anticipate.
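
The data problem can be shown with a deliberately tiny sketch. The daily returns below are made up, and the "model" is just a frequency count, which is a simplification on my part rather than what a real forecasting system would stop at. The point is only that an estimator built from history assigns zero probability to anything that history doesn't contain.

```python
# Hypothetical daily returns (in percent) from a calm stretch of history.
historical_returns = [0.4, -0.3, 1.1, -0.8, 0.2, -1.5, 0.9, 0.0, -0.6, 1.3]


def empirical_probability(returns, threshold):
    """Share of past observations at or below the given threshold."""
    hits = sum(1 for r in returns if r <= threshold)
    return hits / len(returns)


# An ordinary bad day (-1% or worse) shows up in the history...
p_ordinary = empirical_probability(historical_returns, -1.0)

# ...but a crash (-10% or worse) never does, so its estimate is zero.
p_crash = empirical_probability(historical_returns, -10.0)

print(p_ordinary, p_crash)
```

A zero here doesn't mean the crash is impossible. It means the data has nothing to say about it, which is exactly the gap that fabricated data or a very clever model would have to fill.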

Workarounds

So what do people do instead?

Abstract the Prediction. This Takes Two Forms

Abstract the target variable to a higher level that encapsulates what was previously so hard to imagine. This is the equivalent of making your dartboard 50x bigger, which also makes hitting the bullseye 50x more likely. The tradeoff is giving up the chance to be one of the earliest to gain in exchange for increased certainty. I would argue this is a form of the bet that Bridgewater takes, along with a hint of the following...
Zoom in on a portion of the system that acts more predictably and operate there. This is the equivalent of learning algebra before getting into calculus. This is where most of us sit, whether we are media, business operators, investors, or politicians. None of us can understand all the ways the world is changing under a Black Swan that is flipping the rules of the game. So we try to stick to a game with rules that we do understand, which leads to a bit more certainty but also plenty of Darwinian “survival of the fittest” with a bit of luck.

Surrender and Don't Predict Black Swans

Predicting Black Swans has proven to be near impossible. But we know that they inevitably happen and that they will rampage through some systems in similar ways. If that’s the case, why not give up on predicting when they happen, and instead just bet that they will happen eventually? This was Universa Investments' approach (recall their mention in the beginning). Universa Investments gave up on predicting when a Black Swan will occur (while recognizing that one inevitably will) and instead chose to lose a small amount frequently over the last 10 years in order to occasionally win very, very big on the Black Swan volatility / shocks that rampage through the financial system. In practice, they reportedly saw 4,000+% returns in the first quarter of 2020, most of it in March, when COVID-19 fears were at their height during the first wave.
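
The shape of this bet is easy to sketch with arithmetic. The numbers below are purely hypothetical (they are not Universa's actual positions, costs, or payoffs) and are measured in arbitrary units: the strategy loses 1 unit in every calm month and gains 200 units in a crash month.

```python
def tail_hedge_net(months, crash_months, calm_loss, crash_gain):
    """Net result of bleeding a little in calm months and winning big in crashes.

    All inputs are hypothetical and expressed in the same arbitrary unit.
    """
    calm_months = months - crash_months
    return crash_months * crash_gain - calm_months * calm_loss


# Ten years (120 months) with no crash: a slow, steady loss.
no_crash = tail_hedge_net(months=120, crash_months=0, calm_loss=1, crash_gain=200)

# The same ten years with a single crash month: one win outweighs 119 losses.
one_crash = tail_hedge_net(months=120, crash_months=1, calm_loss=1, crash_gain=200)

print(no_crash, one_crash)
```

Notice that the function never asks when the crash happens, only whether one shows up within the window. Whether the bet pays off depends entirely on the crash gain being large relative to the accumulated bleed, which under these assumed numbers it is.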

Closing Words

All this is to say Black Swans will predictably, unpredictably wreak havoc on a society that is biased toward ignoring low-likelihood events. Take comfort in knowing our algorithmic counterparts would do no better than the best of us at anticipating them.