Why make it complicated when you can make it simple?

September 2024

In recent years, the rise of complex, non-linear approaches, particularly those powered by Deep Learning architectures, has become a dominant trend across a wide range of disciplines. Initially gaining prominence in computer science, deep convolutional, recurrent, and residual neural networks, along with transformer models, have rapidly extended their reach into fields as diverse as physics, chemistry, biology, geography, geology, mathematics, and many others. In sociology and, more recently, in linguistics, these models have started to shape the way researchers approach problems, offering new tools for old challenges. However, the allure of these sophisticated methods often overshadows a critical understanding of their limitations and of the assumptions embedded within them. While these techniques frequently offer reasonable predictive capabilities, their widespread adoption proceeds largely without questioning their suitability, or the risks of over-reliance on such intricate systems in areas where simpler methods might suffice or even be preferable.

Despite the remarkable success of these models, they carry with them a set of strong but often implicit assumptions that are rarely scrutinized. Non-linearity and high-dimensionality, for instance, are frequently assumed to be inherent qualities of the problems being addressed, and models are built accordingly, without sufficient consideration of whether a simpler, linear approach might be equally effective. These assumptions also come with the idea that a (really huge and) complex model is indeed a good representation of the reality underlying the observed data. All of this is often accepted uncritically in the Machine Learning workflow, where the focus is on optimizing performance rather than on questioning the fundamental premises on which these models are built.

These huge non-linear models are now, more than ever, employed as if they were the best or even the only approach, regardless of the nature of the problem being studied. Yet tools such as linear regression, despite their simplicity, can offer powerful insights when applied correctly, and can be far more transparent and easier to interpret than their complex counterparts. The lack of comparison between complex neural networks and these simpler, well-understood models means that researchers may be missing out on more straightforward solutions that could not only provide a clearer understanding of the underlying processes at play but also, sometimes, yield better results. This over-reliance on complexity, without sufficient justification, raises concerns about the direction in which the field is heading and whether it is losing sight of the importance of simplicity and clarity in scientific inquiry.
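
To make this concrete, here is a minimal sketch of the kind of comparison that is too rarely done. It is purely illustrative: the data are synthetic, the model choices (scikit-learn's LinearRegression and a small MLPRegressor) are my own assumptions, and the point is only that, when the underlying process happens to be linear, the simple model matches the neural network while remaining directly readable.

```python
# Illustrative sketch only: synthetic data, scikit-learn models chosen
# for the example. Compares a plain linear regression with a small
# neural network on data whose underlying process is, in fact, linear.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n, p = 500, 5
X = rng.normal(size=(n, p))
true_coef = np.array([1.5, -2.0, 0.0, 0.7, 3.0])   # the "reality" behind the data
y = X @ true_coef + rng.normal(scale=1.0, size=n)  # linear signal + noise

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear = LinearRegression().fit(X_train, y_train)
mlp = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000,
                   random_state=0).fit(X_train, y_train)

print("linear R^2:", r2_score(y_test, linear.predict(X_test)))
print("MLP    R^2:", r2_score(y_test, mlp.predict(X_test)))
print("recovered coefficients:", linear.coef_.round(2))  # directly interpretable
```

The linear model's coefficients can be read off and compared to the process that generated the data; nothing comparable can be read off the network's weights.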

The widespread adoption of complex models is often driven by a fallacious line of reasoning: the assumption that because these models perform well on complex problems, they must also be effective for simpler ones, and, furthermore, that if they work well on simple problems, they must somehow be more closely aligned with the underlying reality of the phenomena being studied.

This reasoning is, of course, flawed and misleading, as it can lead researchers to overestimate the capabilities of these models and to overlook simpler, more transparent approaches that might be more appropriate. The belief that complexity necessarily equates to accuracy or realism contradicts the idea that the simplest adequate explanation is often the best, as suggested by Ockham's razor. Furthermore, the focus on achieving high performance metrics, such as accuracy or precision, can come at the expense of other important qualities, such as interpretability, robustness, and generalizability. In many cases, the use of complex models is justified primarily by their ability to achieve high scores on benchmark datasets rather than by their ability to provide meaningful insights into the phenomena being studied.
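
Ockham's razor is, moreover, not merely a slogan: classical statistics operationalizes it through penalized model-selection criteria. The sketch below, again with synthetic data of my own choosing and using the Bayesian Information Criterion under the usual Gaussian-error assumption, shows how extra parameters must pay for themselves: on data generated by a linear process, higher-degree polynomial fits reduce the residual error slightly, yet are still rejected.

```python
# Illustrative sketch: Ockham's razor operationalized as the Bayesian
# Information Criterion (BIC), which charges a price for every parameter.
# Synthetic data whose true generating process is linear.
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(-3, 3, size=n)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=n)  # linear "reality" + noise

for degree in range(1, 6):
    coeffs = np.polyfit(x, y, deg=degree)           # least-squares polynomial fit
    residuals = y - np.polyval(coeffs, x)
    rss = np.sum(residuals ** 2)
    k = degree + 1                                  # number of fitted parameters
    bic = n * np.log(rss / n) + k * np.log(n)       # lower is better
    print(f"degree {degree}: BIC = {bic:.1f}")
# On data like this, degree 1 typically attains the lowest BIC: the extra
# parameters of the higher-degree fits do not pay for themselves.
```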

As mentioned earlier, the success of complex methods on problems that appear complex at first glance does not necessarily mean that these models are the most appropriate. In many cases the complexity of the problem is overstated, and simpler models may achieve comparable or even superior results. Moreover, the recent emphasis on performance and precision over explainability and transparency reflects a broader cultural shift that prioritizes outcomes over understanding. This shift is, of course, socio-cultural, but in computer science it has been enabled by the rapid increase in computational power, which allows ever more complex models to be trained and deployed. Such a focus on performance can lead to a neglect of the principles that should guide scientific inquiry: transparency, interpretability, and the ability to explain results in a meaningful way. The use of models that cannot, to any great extent, be explained or understood is particularly problematic in scientific research, where the goal is, in my view, not only to make accurate predictions but also, and especially, to gain a deeper understanding of the underlying processes. While a results-oriented approach may be necessary in certain applied contexts, such as engineering, it is, to my mind, less appropriate in the theoretical sciences, where the aim is to advance knowledge and understanding rather than simply to achieve practical results.

Finally, and perhaps most importantly, the widespread adoption of complex methods leads to a concerning oversight: the confusion between models and reality. It is essential to remember that models are, at their core, simplifications built from observations of reality, designed to help us understand and predict certain aspects of the world around us. They are not reality itself. The rational, mathematical, and logical approaches that underpin these models allow us to capture and describe measured phenomena, but they do not claim to represent the full complexity of reality. For centuries, scientists developed models with a deep awareness of the philosophical questions surrounding the modeling of reality. These questions, which touch on issues such as the limits of knowledge, the nature of truth, and the relationship between the observer and the observed, were considered central to the scientific enterprise. In the current era of Machine Learning, however, these concerns are often neglected, forgotten, or outright dismissed. The poverty of reflection that follows from this neglect is particularly troubling given the increasing reliance on models in decision-making across a wide range of fields, from finance to healthcare and, most importantly, to public policy and politics.

All of this is regrettable.

Of course, some problems are likely to be tackled effectively only with complex models, given the intricate nature of the data and phenomena involved. Even so, it is crucial that such models be used with parsimony, not out of scientific laziness.

The "New Perspectives in the Social Sciences" journal offers a valuable opportunity for reflection on these issues. Of course, the opinions expressed in these articles are those of their respective authors and do not necessarily reflect my own views.