Souradip Mookerjee

The Failure of Complex Systems (and Missing Heritability)



Complex systems don't fail for just one reason

I once ran to catch a train back home from university. I ran and ran as fast as I could. I got to the platform mere seconds after the train had closed its doors and started to pull away. I had been so close, but I'd missed it. What had been the culprit? What could I blame?

Was it the old lady in front of me in the queue taking her sweet time with her coins while my train was waiting for me? Was it the dodgy internet connection that hadn't let me buy a ticket on the way there? Had I not run fast enough? Did I wake up too late? Did I spend too long helping a friend through an emotional crisis the previous night?

Missing a train is the failure of a complex system. In truth, the real reason was all of the above and yet none of them at the same time. Complex systems have a lot of redundancy built in. They can handle one or two things not being quite perfect. But when enough things go wrong at once, their combined effect is bigger than the sum of their individual effects.
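To make the idea concrete, here is a minimal sketch of a threshold model, assuming every mishap costs a couple of minutes and the journey has a fixed amount of slack. The delay values, the slack and all the names are made up for illustration; they are not measurements from that day.

```python
# Hypothetical delays in minutes; none of these numbers come from real data.
DELAYS = {
    "slow_queue": 2,
    "no_ticket_bought_online": 2,
    "slow_running": 2,
    "woke_up_late": 2,
    "late_night_before": 2,
}
SLACK_MINUTES = 7  # assumed spare time the journey can absorb


def missed_train(things_gone_wrong):
    """Return True when the combined delay exceeds the available slack."""
    total_delay = sum(DELAYS[name] for name in things_gone_wrong)
    return total_delay > SLACK_MINUTES


# No single mishap is enough to miss the train on its own.
single_failures = {name: missed_train([name]) for name in DELAYS}

# Three mishaps are still absorbed (6 minutes), but four are not (8 minutes).
names = list(DELAYS)
three_go_wrong = missed_train(names[:3])
four_go_wrong = missed_train(names[:4])

print(single_failures)
print(three_go_wrong, four_go_wrong)
```

In this toy model each factor looks harmless when examined alone, and the failure only appears once enough of them coincide.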

The Missing Heritability Problem

In genetics, there's this curious problem called the "missing heritability problem". We know from twin studies that, for example, 60% of the incidence of schizophrenia is due to genetics. But the genes we've identified only account for a small fraction of that. In the 2000s, the Wellcome Trust Case Control Consortium ran genome-wide association studies on a massive scale to try to find the genes responsible for common diseases like coronary artery disease, Crohn's disease, rheumatoid arthritis, and more. But again, they found only a few individual genes that were associated with these diseases.

The human body is a complex system

We are full of redundant systems within ourselves. We've evolved over millions of years so that we can recover from any single hit to our health. The analysis that's been done on these genome-wide studies has focused on assessing the association of individual genes with a disease, or on polygenic risk scores that add together the individual risk from each gene. I think it's likely that the missing heritability comes from somewhere else.
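For readers who haven't met one, a polygenic risk score is essentially a weighted sum. The sketch below assumes three made-up variants with made-up weights; the names `WEIGHTS` and `polygenic_risk_score` are hypothetical and not taken from any real tool or study.

```python
# Hypothetical variant names and per-allele weights, for illustration only.
WEIGHTS = {"variant_a": 0.1, "variant_b": 0.2, "variant_c": 0.05}


def polygenic_risk_score(allele_counts, weights=WEIGHTS):
    """Additive score: each weight times the number of risk alleles (0, 1 or 2)."""
    return sum(w * allele_counts.get(v, 0) for v, w in weights.items())


# A made-up person carrying two copies of variant_a and one of variant_b.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
score = polygenic_risk_score(person)
print(round(score, 3))
```

Each variant contributes the same amount no matter what else the person carries, so a score of this shape has no way to express one gene mattering only in the presence of another.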

The individual risk of me missing my train from each of the factors I mentioned earlier is minimal, even when they are added together. But they have a synergistic, superadditive effect when they all happen together. I think a similar model can be used to suggest that many of the genes responsible for these diseases are hiding, only to be brought out when the right context of simultaneous mutations exists.
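Here is a toy illustration of how a gene could hide in this way. It assumes three independent variants, each carried by 10% of people, and a disease that only appears when all three are present. These numbers and the all-or-nothing rule are assumptions chosen to make the arithmetic obvious, not a claim about any real disease.

```python
from itertools import product

CARRIER_FREQ = 0.1  # assumed chance of carrying each variant
N_VARIANTS = 3      # assumed number of variants that must coincide


def disease(genotype):
    """Toy rule: the disease appears only if every variant is present."""
    return all(genotype)


def risk_given(index, value):
    """Disease risk among people whose variant `index` equals `value` (0 or 1)."""
    affected = 0.0
    total = 0.0
    for genotype in product((0, 1), repeat=N_VARIANTS):
        if genotype[index] != value:
            continue
        # Probability of this genotype, assuming independent variants.
        prob = 1.0
        for allele in genotype:
            prob *= CARRIER_FREQ if allele else 1 - CARRIER_FREQ
        total += prob
        if disease(genotype):
            affected += prob
    return affected / total


risk_carrier = risk_given(0, 1)      # about 0.01
risk_non_carrier = risk_given(0, 0)  # exactly 0.0
risk_all_three = 1.0 if disease((1,) * N_VARIANTS) else 0.0

print(risk_carrier, risk_non_carrier, risk_all_three)
```

Studied one at a time, each variant raises risk by about one percentage point, yet in the right combination the three of them decide the outcome entirely.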

Unexplored avenues of research

I think this represents a fascinating unexplored avenue of research. With the advent of neural network architectures like transformers, originally developed for machine translation, I think this could be feasible to tackle now. Translating any word into another language requires knowledge of the context of the surrounding words in the sentence for the translation to be appropriate.

These neural networks could not only infer the combinatorial, superadditive effects that traditional statistical techniques struggle to capture; we could also run the model backwards to see, given a certain gene, which other genes are taken into consideration when deciding whether someone is at risk.

We even know the theoretical maximum accuracy for these models (the heritability), which no model could improve upon. We could even build useful clinical prognostic tests based around these, rather than just saying someone has a "4% increased risk of X" as sites like 23andMe seem to do these days.

Souradip Mookerjee Mr Souradip Mookerjee MA (Cantab) 1995-05