How theoretically very distinct mechanisms can generate identical observations

This post was written by Joost Kruis and summarizes the paper entitled “Three representations of the Ising model”, published recently in Scientific Reports. Joost works in the Psychological Methods Department at the University of Amsterdam.

Network models have become increasingly popular in the social sciences. They reflect a growing practice among researchers of explaining associations observed between measured variables as a consequence of mutualistic relations between these variables themselves.

Examining the structure of observed associations between measured variables is an integral part of many branches of science. At face value, associations inform us about a possible relation between two variables, yet contain no information about the nature and direction of that relation. This is captured in the (infamous) phrase: correlation does not imply causation. Making causal inferences from associations requires the specification of a mechanism that explains the emergence of the associations.

In our paper we discuss three such mechanisms, which are theoretically very distinct, and their prototypical statistical models. The three mechanisms, illustrated within the context of depression, are:

[Figure: the three mechanisms illustrated for depression]

The first mechanism represents the (until recently most dominant) perspective on psychopathology where a mental disorder is viewed as the common cause of its symptoms. The common cause mechanism is statistically represented by the latent variable model, and explains the emergence of observed associations through an unobserved variable (depression) acting as a common cause with respect to the observed variables (sleep disturbances, loss of energy, concentration problems). The manifest variables (symptoms) are thus independent indicators of the latent variable (mental disorder) and reflect its current state.

The network perspective on psychopathology is captured by what we term in our paper the reciprocal effect mechanism. In this framework the associations between observed variables are explained as a consequence of mutualistic relations between these variables. The unobservable variable depression does not exist as a separate entity here; the word is merely used to describe particular collective states of a set of interacting features.

The third, common effect, mechanism explains associations between observed variables as arising from (unknowingly) conditioning on a common effect of these variables, and is statistically represented by a collider model. In this framework the observed variables act as a collective cause towards an effect. An example of this is receiving a depression diagnosis (effect) as a consequence of the occurrence of multiple symptoms (causes) that are linked by the DSM to the term depression.

Each of these mechanisms proposes a radically different explanation for the emergence of associations between a set of manifest variables, yet we demonstrate in the paper that their associated statistical models for binary data are mathematically equivalent. It follows that each of the three mechanisms is capable of generating exactly the same observations. As such, any set of associations between variables that is sufficiently described by a statistical model in one framework can be explained as emerging from the mechanism represented by any of the three theoretical frameworks.
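To make the equivalence concrete, here is a minimal numerical sketch in Python. It is not the derivation from the paper: it assumes a small toy system of three symptoms coded -1/+1, a rank-one interaction structure (the vector `a`), a particular latent density for the common cause model, and a particular effect probability for the collider model. All names and parameter values are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np

def all_states(n):
    """All 2^n configurations of n binary variables coded -1/+1."""
    return np.array(list(itertools.product([-1, 1], repeat=n)))

def reciprocal_effect(mu, a):
    """Network (Ising) model: p(x) proportional to exp(mu.x + 0.5 * (a.x)^2)."""
    x = all_states(len(mu))
    w = np.exp(x @ mu + 0.5 * (x @ a) ** 2)
    return w / w.sum()

def common_cause(mu, a, grid=None):
    """Latent variable model: items independent given theta, theta integrated out."""
    x = all_states(len(mu))
    theta = np.linspace(-12.0, 12.0, 4801) if grid is None else grid
    field = mu + np.outer(theta, a)                    # field[k, i] = mu_i + a_i * theta_k
    log_norm = np.log(2 * np.cosh(field)).sum(axis=1)  # normaliser of p(x | theta)
    log_g = -0.5 * theta ** 2 + log_norm               # assumed latent density (unnormalised)
    log_joint = field @ x.T - log_norm[:, None] + log_g[:, None]
    p = np.exp(log_joint).sum(axis=0)                  # numerical integration over theta
    return p / p.sum()

def common_effect(mu, a):
    """Collider model: independent causes, then condition on the effect E = 1."""
    x = all_states(len(mu))
    prior = np.exp(x @ mu)
    prior /= prior.sum()                               # product form: symptoms are independent
    log_p_effect = 0.5 * (x @ a) ** 2
    p_effect = np.exp(log_p_effect - log_p_effect.max())  # assumed P(E = 1 | x), at most 1
    p = prior * p_effect                               # joint P(x, E = 1)
    return p / p.sum()                                 # P(x | E = 1)

# Hypothetical parameters for three symptoms.
mu = np.array([-0.2, 0.1, 0.3])
a = np.array([0.5, 0.7, 0.4])

p_network = reciprocal_effect(mu, a)
p_latent = common_cause(mu, a)
p_collider = common_effect(mu, a)
print(np.abs(p_network - p_latent).max(), np.abs(p_network - p_collider).max())
```

Under these assumptions the three functions return the same probability for every symptom pattern (up to numerical integration error), even though the generating stories differ: direct interactions, a shared unobserved cause, or selection on a shared effect.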

Having multiple possible interpretations for the same model allows for more plausible explanations when it comes to the theoretical concepts and the causal inferences we obtain from the measurement model applied to our data. Furthermore, the historical success of theoretically very implausible models, such as the latent variable model, can in retrospect arguably be explained by the equivalence of these three models.

However, it also means that obtaining a sufficient fit for the statistical models in one of these frameworks is by no means evidence that it is the mechanism from this framework that actually generated the observations. That is, there will always exist representations from the other mechanisms that can explain our observations equally well.

We should thus not only apply a network model to our data because it gives us a pretty picture (which it does), but because we believe that the associations between the variables we have measured are explained as a consequence of mutualistic relations between these variables themselves.


Statistical models that analyse (pairwise) relations between variables encompass assumptions about the underlying mechanism that generated the associations in the observed data. In the present paper we demonstrate that three Ising model representations exist that, although each proposes a distinct theoretical explanation for the observed associations, are mathematically equivalent. This equivalence allows the researcher to interpret the results of one model in three different ways. We illustrate the ramifications of this by discussing concepts that are conceived as problematic in their traditional explanation, yet when interpreted in the context of another explanation make immediate sense.
—Kruis, J. and Maris, G. Three representations of the Ising model. Sci. Rep. 6, 34175; doi: 10.1038/srep34175 (2016).

EDIT January 27 2017: Denny Borsboom published a blog post entitled “The meaning of model equivalence: Network models, latent variables, and the theoretical space in between” that puts this post here and the paper by Kruis & Maris into a broader perspective.

How well do Network Models Predict Future Observations? Blogpost on Predictability in Network Models

Estimated network and predictability measures (blue pie charts) for the data in McNally et al. (2014)

Predictability of nodes in network models offers an additional perspective on symptom networks, which relates to the practical relevance of edges, the selection of optimal treatment and the degree to which (parts of) the network are self-determined or determined by factors outside of the network.

For a brief introduction and fully reproducible example of how to compute and visualize predictability in R, have a look at this blog post. The preprint of the paper can be found here.
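As a rough illustration of what predictability means, the sketch below computes, for each variable, the proportion of variance explained by a linear regression on all other variables. This is a simplification and not the procedure from the linked post, which works with the neighbours in an estimated network; the data are simulated and the variable names are hypothetical.

```python
import numpy as np

def node_predictability(data):
    """R^2 of each column when regressed linearly on all other columns.

    Uses the identity R^2_i = 1 - 1 / P_ii, where P is the inverse of the
    correlation matrix.
    """
    corr = np.corrcoef(data, rowvar=False)
    precision = np.linalg.inv(corr)
    return 1.0 - 1.0 / np.diag(precision)

# Simulated toy data: a chain sleep -> energy -> concentration, plus an unrelated node.
rng = np.random.default_rng(1)
n = 2000
sleep = rng.standard_normal(n)
energy = 0.7 * sleep + rng.standard_normal(n)
concentration = 0.7 * energy + rng.standard_normal(n)
appetite = rng.standard_normal(n)

data = np.column_stack([sleep, energy, concentration, appetite])
r2 = node_predictability(data)
for name, value in zip(["sleep", "energy", "concentration", "appetite"], r2):
    print(f"{name}: {value:.2f}")
```

In this toy example the middle node of the chain is the most predictable, and the unrelated node has a predictability close to zero, which is the kind of contrast the pie charts in the figure are meant to convey.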

Psychosystems Satellite Symposium at Conference on Complex Systems (CCS2016)


The Psychosystems group is hosting a satellite symposium at the Conference on Complex Systems (CCS2016) on 21 September, at the Beurs van Berlage in Amsterdam.

Title: Complexity in personalised dynamical networks for mental health

Summary: Psychopathology is recognized as a phenomenon that is important but equally difficult to understand. For instance, depression is often considered as some unknown (physical) system causing symptoms like loss of interest and insomnia. It is however difficult to determine what exactly the system is or how it influences the symptoms we can observe. Recently, a change in view has been proposed to consider symptoms and their interactions as the building blocks of a complex system in an effort to better understand the intricacies of mental health and pathologies. Such systems are inherently complex in the sense that the interactions between a plethora of processes and symptoms can result in bistable behaviour, which may explain, for instance, sudden transitions from healthy to depressed moods. The dynamical processes on such networks are now the main focus of investigation that may lead to better understanding and possibly prevention of pathologies. One of the most promising ways to investigate such complex systems is by monitoring single subjects for some period of time. This is often done by what is called experience sampling, where information through smart phones is obtained several times a day, usually at random intervals. The models and methods for the analyses of such time series are far from trivial. Borrowing from statistical physics, the Ising model, for instance, has proven useful in determining key differences between depressed and remittent patients.


A Network Approach to Psychosis

Our two new papers discussing the application of network analysis to psychiatry research are now published as advance access in the journal Schizophrenia Bulletin.

The first paper (pdf) is a brief theoretical framework describing how the network approach can be used to investigate the interplay between environmental risk factors, expression of psychosis, and symptoms of general psychopathology. We provide an example network constructed from the Early Developmental Stages of Psychopathology (EDSP) study in order to determine the network structure pertaining to three environmental risk factors (cannabis use, developmental trauma, urban environment), dimensional measures of psychopathology, and one composite measure of psychosis expression. The results suggest that network models can successfully serve to disentangle the mechanisms that underlie this interplay: Environmental factors may (1) exert main effects on specific (sets of) symptoms, which subsequently spread through the symptom network and (2) increase the strength of interactions between symptoms, leading to a more strongly connected and less resilient network structure. 


The second paper (pdf) is a detailed study concerned with potential pathways between childhood trauma and psychotic experiences, investigated from a network perspective. We used data from patients diagnosed with a psychotic disorder from the longitudinal observational study Genetic Risk and Outcome of Psychosis Project and included the five scales of the Childhood Trauma Questionnaire-Short Form and all original symptom dimensions of the Positive and Negative Syndrome Scale as nodes in our network. The results show that childhood trauma is connected to the positive and negative symptoms of psychosis only through symptoms of general psychopathology. To highlight these pathways, we further constructed shortest path networks showing the quickest routes from each trauma node to each of the positive and negative symptoms. Overall, our findings suggest that several symptoms of general psychopathology may mediate the relationship between trauma and psychosis, providing evidence for an affective pathway to psychosis and re-emphasizing questions regarding the specificity of trauma to psychosis. A poster highlighting the main results of this study can be accessed here.
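The idea of a shortest path network can be sketched as follows. This is an assumption-laden toy version, not the analysis from the paper: the edge weights are made up, the node names are hypothetical, and the length of an edge is taken to be the inverse of its absolute weight, so that strong edges count as short.

```python
import heapq

def shortest_path(edges, source, target):
    """Dijkstra's algorithm on an undirected weighted network.

    edges maps (node_a, node_b) to an edge weight; the length of an edge is
    assumed to be 1 / |weight|.
    """
    graph = {}
    for (u, v), weight in edges.items():
        if weight == 0:
            continue  # absent edge
        length = 1.0 / abs(weight)
        graph.setdefault(u, []).append((v, length))
        graph.setdefault(v, []).append((u, length))

    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, length in graph.get(node, []):
            candidate = dist + length
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))

    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1], best[target]

# Hypothetical edge weights, for illustration only.
edges = {
    ("trauma", "anxiety"): 0.4,
    ("anxiety", "hallucinations"): 0.5,
    ("trauma", "depression"): 0.2,
    ("depression", "hallucinations"): 0.3,
}
path, distance = shortest_path(edges, "trauma", "hallucinations")
print(path, distance)
```

In this made-up network there is no direct edge between the trauma node and the psychosis symptom, so every route passes through a general psychopathology node, and the route through the stronger edges is the one that gets highlighted.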


Isvoranu, A. M., Borsboom, D., van Os, J. & Guloksuz, S. (2016). A Network Approach to Environmental Impact in Psychotic Disorder: Brief Theoretical Framework. Schizophrenia Bulletin, Advance Access published May 13, 2016. 

Isvoranu, A. M., van Borkulo, C. D., Boyette, L., Wigman, J. T. W., Vinkers, C. H., Borsboom, D. (2016). A Network Approach to Psychosis: Pathways Between Childhood Trauma and Psychotic Symptoms. Schizophrenia Bulletin, Advance Access published May 10, 2016.

Mapping manuals of madness: comparing ICD-10 and DSM-IV-TR networks

Within the field of psychopathology, several diagnostic classification systems are used, of which the International Classification of Diseases and Related Health Problems (ICD) and the Diagnostic and Statistical Manual of Mental Disorders (DSM) are the current dominant frameworks. Even though both manuals classify similar mental and behavioural disorders, research shows that there are striking differences in concordance and prevalence of disorders depending on which manual (or manual-based instrument) is used. In this paper we investigated the symptom structure of the ICD-10 and DSM-IV-TR by representing individual symptoms as nodes and connecting nodes whenever the corresponding symptoms feature as diagnostic criteria for the same mental disorder. Additionally, we investigated whether this symptom network structure aligns with empirical data.
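The construction rule described above (symptoms as nodes, an edge whenever two symptoms are criteria for the same disorder) can be sketched in a few lines of Python. The two "disorders" and their criteria below are heavily simplified and hypothetical; they are not taken from either manual.

```python
from itertools import combinations

def criteria_network(manual):
    """Build a symptom network from a mapping disorder -> list of criteria.

    Two symptoms are connected if they appear together as criteria for at
    least one disorder.
    """
    nodes = sorted({symptom for criteria in manual.values() for symptom in criteria})
    edges = set()
    for criteria in manual.values():
        # Sorting gives each undirected edge one canonical orientation.
        for a, b in combinations(sorted(set(criteria)), 2):
            edges.add((a, b))
    return nodes, edges

def degree(nodes, edges):
    """Number of edges attached to each node."""
    counts = {node: 0 for node in nodes}
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts

# Hypothetical, simplified criteria lists for illustration only.
toy_manual = {
    "disorder A": ["depressed mood", "insomnia", "fatigue", "concentration problems"],
    "disorder B": ["worry", "insomnia", "fatigue", "concentration problems", "irritability"],
}
nodes, edges = criteria_network(toy_manual)
degrees = degree(nodes, edges)
print(len(nodes), len(edges), degrees["insomnia"])
```

Symptoms that feature in the criteria of several disorders, such as the shared ones in this toy example, end up with the most connections, which is why they tend to be central in networks built this way.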

Results indicate that, relative to the DSM-IV-TR network, the ICD-10 network contains (a) more nodes, (b) a lower level of clustering, and (c) a higher level of connectivity. Nevertheless, the nodes that are most central to each network are very similar. Comparison to empirical data indicates that the DSM-IV-TR network structure follows comorbidity rates more closely than the ICD-10 network structure. We conclude that, despite their apparent likeness, ICD-10 and DSM-IV-TR harbour important structural differences, and that both may be improved by matching diagnostic categories more closely to empirically determined network structures.

Quiz: why does the factor structure of depression scales change over time?

We published a paper in Psychological Assessment a few weeks ago, and I would like to take the time to explain what these results imply. You can find the full text here, the analytic code (R & Mplus) including the output of all models here (scroll down to the paper), and while I am not allowed to share the data we re-analyzed, I wrote some pointers on how to apply for the datasets here.

In contrast to other blog posts, this will be a quiz: in the paper, we find a very consistent pattern of violations of temporal measurement invariance (I will explain in a second what that means), in different datasets, but we don’t really have a good idea what causes this pattern of observations.

In contrast to other quizzes, however, there is no prize because … we don’t know what the true answer is as of yet ;).

So what did we do in the paper?

We examined two crucial psychometric assumptions that are part of nearly all contemporary depression research. We find strong evidence that these assumptions do not seem to hold in general, which impacts on the validity of depression research as a whole.

What are these psychometric assumptions? In depression research, various symptoms are routinely assessed via rating scales and added to construct sum-scores. These scores are used as a proxy for depression severity in cross-sectional research, and differences in sum-scores over time are taken to reflect changes in an underlying depression construct. For example, a sum-score of symptoms that supposedly reflects “depression severity” is often correlated to stress or gender or biomarkers to find out what the relationship between these variables and depression is; this is only valid if a sum-score of symptoms is actually a reasonable proxy for depression severity. In longitudinal research, if a sum-score decreases from 20 points to 15 points in a population, we conclude that depression improved somewhat. This is only valid if the 20 points and the 15 points reflect the same thing (if the 20 points would reflect intelligence and the 15 points neuroticism, the difference of 5 points over time would be meaningless).

To allow for such interpretations, rating scales must (a) measure a single construct, and (b) measure that construct in the same way across time. These requirements are referred to as unidimensionality and measurement invariance. In the study, we investigated these two requirements in 2 large prospective studies (combined n = 3,509) in which overall depression levels decrease, examining 4 common depression rating scales (1 self-report, 3 clinician-report) with different time intervals between assessments (between 6 weeks and 2 years).

A consistent pattern of results emerged. For all instruments, neither unidimensionality nor measurement invariance appeared remotely tenable. At least 3 factors were required to describe each scale (this means that the sum-score does not reflect 1 underlying construct, but at least 3 and sometimes up to 6), and the factor structure changed over time. Typically, the structure became less multifactorial as depression severity decreased (without however reaching unidimensionality). The decrease in the sum-scores was accompanied by an increase in the variances of the sum-scores, and increases in internal consistency.

You can see the results in the graph below. The four sections represent four different rating scales, the lines represent the first (red) and second (green) measurement point of the longitudinal datasets, PA (blue) means parallel analysis, which tells us how many factors a scale has at a given timepoint, and the x-axis represents the number of factors that have to be extracted. If our red and green data lines are above the blue PA line, it means that we should extract a factor. You can read up on all the ESEM modeling in the paper itself, but the gist is that in order to be unidimensional, a scale must only have 1 factor; and as you can see, all scales require the extraction of at least 3 factors. In order to be measurement invariant, the factor solution has to be stable across time; it’s highly evident from the graphs that this is not the case, because the lines for the 2 measurement points per scale should roughly overlap if it were (do me the favor and click on the image, I can’t embed vector graphics here). For a scale to be unidimensional and measurement invariant, the red and green lines should be very similar, and they should be above the blue line for the first factor and then drop below the blue line for the second and all subsequent factors.
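For readers who have not seen parallel analysis before, here is a minimal sketch of the idea in Python. It is not the code from the paper (that is in R and Mplus, linked above): it is a simple eigenvalue-based version that compares the eigenvalues of the observed correlation matrix with those of random data of the same size. The function name, the 95th percentile threshold and the simulated items are assumptions made for illustration.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, quantile=0.95, seed=0):
    """Suggest a number of factors by comparing observed and random eigenvalues."""
    rng = np.random.default_rng(seed)
    n, p = data.shape

    def sorted_eigenvalues(x):
        # Eigenvalues of the correlation matrix, largest first.
        return np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]

    observed = sorted_eigenvalues(data)
    simulated = np.empty((n_sims, p))
    for s in range(n_sims):
        simulated[s] = sorted_eigenvalues(rng.standard_normal((n, p)))
    threshold = np.quantile(simulated, quantile, axis=0)  # the "PA line"

    # Count leading eigenvalues that lie above the line, stopping at the first that does not.
    above = observed > threshold
    n_factors = p if above.all() else int(np.argmax(~above))
    return n_factors, observed, threshold

# Simulated toy scale: 8 items, 4 loading on each of 2 independent factors.
rng = np.random.default_rng(42)
factors = rng.standard_normal((1000, 2))
loadings = np.zeros((2, 8))
loadings[0, :4] = 0.8
loadings[1, 4:] = 0.8
items = factors @ loadings + 0.6 * rng.standard_normal((1000, 8))

n_factors, observed, threshold = parallel_analysis(items)
print(n_factors)
```

In terms of the graph, `observed` plays the role of a red or green line and `threshold` the role of the blue PA line: a unidimensional scale would have only its first eigenvalue above the line.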


These findings challenge the common interpretation of sum-scores and their changes as reflecting 1 underlying construct. In other words, summing up symptoms to a total score, and correlating this total score statistically with other variables such as risk factors or biomarkers, is very questionable if the score itself is not unidimensional and does not reflect 1 underlying construct (depression). Obviously, if you have worked with depression symptoms before you know that they are very different from each other, and the idea that they are interchangeable indicators of 1 condition (depression) is very problematic. But now we have empirical evidence that this is the case — which is consistent with many other papers that have shown similar results. The special thing about this paper is that we examined these properties across a whole range of scales, and tested the robustness of the results by varying many dimensions (datasets, clinician- vs patient-rated, and timeframes).

But what is the reason for these violations of temporal invariance? In the paper, we discuss a number of possibilities, all of which we exclude as sufficient explanations. Among these are response shift bias, regression towards the mean, selection bias, floor and ceiling effects of items, and the possibility that item responses over time may have been influenced by medication.

As we say in the paper:

Overall, these possibilities unlikely fully explain the causes of the pronounced and consistent shifts of the factorial space observed in this report, although they may each contribute somewhat. In other words, while we have provided a thorough description of the crime scene, we have no good idea who the main suspect may be.

The violations of common measurement requirements are sufficiently severe to suggest alternative interpretations of depression sum-scores as formative instead of reflective measures. A reflective sum-score is one that indicates an underlying disorder, the same way having a number of measles symptoms tells us that you have measles: the symptoms inform us about a problem because the problem caused the symptoms. A formative sum-score, on the other hand, is nothing but an index: a sum of problems. These problems are not meant to reflect or indicate an underlying problem. Still, we can learn something from such a sum-score: the more problems people have, the worse they are probably doing in their lives.

» Fried, E. I., van Borkulo, C. D., Epskamp, S., Schoevers, R. A., Tuerlinckx, F., & Borsboom, D. (2016). Measuring Depression over Time … or not? Lack of Unidimensionality and Longitudinal Measurement Invariance in Four Common Rating Scales of Depression. Psychological Assessment. Advance Online Publication. (PDF) (URL)

New network study: what are ‘good’ depression symptoms?

Our new paper “What are ‘good’ depression symptoms? Comparing the centrality of DSM and non-DSM symptoms of depression in a network analysis” was published in the Journal of Affective Disorders (PDF).

In the paper we develop a novel theoretical and empirical framework to answer the question what a “good” symptom is. Traditionally, all depression symptoms are considered somewhat interchangeable indicators of depression, and it’s not clear what a good or clinically relevant symptom is. From the perspective of depression as a network of interacting symptoms, however, important symptoms are those with a large number of strong connections to other symptoms in the dynamic system (i.e. symptoms with a high centrality).
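To give a feel for what centrality means here, the sketch below computes one common index, strength centrality, as the sum of the absolute edge weights attached to each node. The weight matrix and symptom names are hypothetical, and the paper may rely on other or additional centrality indices.

```python
import numpy as np

def strength_centrality(weights):
    """Sum of absolute edge weights per node in an undirected weighted network."""
    w = np.abs(np.asarray(weights, dtype=float))
    np.fill_diagonal(w, 0.0)  # ignore self-loops
    return w.sum(axis=0)

# Hypothetical symmetric edge weights between four symptoms.
symptoms = ["sad mood", "fatigue", "anxiety", "weight change"]
weights = [
    [0.0, 0.3, 0.2, 0.0],
    [0.3, 0.0, 0.4, 0.1],
    [0.2, 0.4, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.0],
]
strength = strength_centrality(weights)
for name, value in zip(symptoms, strength):
    print(f"{name}: {value:.1f}")
```

A symptom with many strong connections gets a high value, while a symptom that is only weakly tied to the rest of the network gets a low one.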

So we went ahead and estimated the network structure of 28 depression symptoms in about 3,500 depressed patients. We found that the 28 symptoms are intertwined with each other in complicated ways (it is not the case that all symptoms have roughly equally strong ties to each other), and symptoms differed substantially in their centrality values. Interestingly, both depression symptoms as listed in the Diagnostic and Statistical Manual of Mental Disorders (DSM) — as well as non-DSM symptoms such as anxiety — were among the most central symptoms.


When we compared the centrality of DSM and non-DSM symptoms, we found that, on average, DSM symptoms are not more central. At least from a network perspective, this raises substantial doubts about the validity of the depression symptoms featured in the DSM. Our findings suggest the value of research focusing on especially central symptoms to increase the accuracy of predicting outcomes such as the course of illness, probability of relapse, and treatment response.

Fried, E., Epskamp, S., Nesse, R. M., Tuerlinckx, F., & Borsboom, D. (in press). What are ‘good’ depression symptoms? Comparing the centrality of DSM and non-DSM symptoms of depression in a network analysis. Journal of Affective Disorders, 189, 314–320. doi:10.1016/j.jad.2015.09.005

Bereavement network paper published

Our new network paper “From Loss to Loneliness: The Relationship Between Bereavement and Depressive Symptoms” was published in the Journal of Abnormal Psychology (PDF).

In the paper we examined 2 competing explanations concerning how spousal bereavement impacts on depression symptoms: a traditional latent variable explanation, in which loss triggers depression which then leads to symptoms; and a network explanation, in which bereavement directly affects particular depression symptoms which then activate other symptoms. We re-analyzed data from the CLOC study, a prospective cohort of 515 individuals, half of whom experienced spousal loss during the course of the study (the other half was queried as a control group). We modeled the effect of partner loss on depressive symptoms either as an indirect effect through a latent variable, or as a direct effect in a network constructed through a causal search algorithm.

Overall, losing a partner impacted on very specific depression symptoms (e.g., feeling lonely and sad mood), but not on others (e.g., sleep problems). The effect of partner loss on these symptoms was not mediated by a latent variable. The network model indicated that bereavement mainly affected loneliness, which in turn activated other depressive symptoms. The findings support a growing body of literature showing that specific adverse life events differentially affect depressive symptomatology [1-3], and suggest that future studies should examine interventions that directly target such symptoms.

» Fried, E. I., Bockting, C., Arjadi, R., Borsboom, D., Tuerlinckx, F., Cramer, A., Epskamp, S., Amshoff, M., Carr, D., & Stroebe, M. (2015). From Loss to Loneliness: The Relationship Between Bereavement and Depressive Symptoms. Journal of Abnormal Psychology.

[1] Keller, M. C., Neale, M. C., & Kendler, K. S. (2007). Association of different adverse life events with distinct patterns of depressive symptoms. The American Journal of Psychiatry, 164(10), 1521–9. doi:10.1176/appi.ajp.2007.06091564
[2] Cramer, A. O. J., Borsboom, D., Aggen, S. H., & Kendler, K. S. (2013). The pathoplasticity of dysphoric episodes: differential impact of stressful life events on the pattern of depressive symptom inter-correlations. Psychological Medicine, 42(5), 957–65. doi:10.1017/S003329171100211X
[3] Fried, E. I., Nesse, R. M., Guille, C., & Sen, S. (2015). The differential influence of life stress on individual symptoms of depression. Acta Psychiatrica Scandinavica, (6), 1–7. doi:10.1111/acps.12395

Network Application 0.1

During the last couple of months, I have been working on building a web application (called NetworkApp) that enables users to upload their own data, construct networks and analyze them. Today, version 0.1 of this application has been released.

The NetworkApp has been built with Shiny: a web application framework for R. While R is used to build the web application, the user won’t see this and thus does not need to have any programming skills to use the NetworkApp.

To start using the NetworkApp, click here to access it.

New paper on depression symptom profiles

Our new study titled “Depression is not a consistent syndrome: An investigation of unique symptom patterns in the STAR*D study” was published in the Journal of Affective Disorders (PDF).

In the paper we examine the degree of heterogeneity of Major Depression (MD). The DSM-5 defines a host of depression symptoms, which are commonly added up to sum-scores that reflect overall depression severity. This implies that depression symptoms are interchangeable indicators of the same underlying condition. It further implies that all patients with MD have the same disorder, justifying the search for things like “depression risk factors” and “depression biomarkers”.

Therefore, we wanted to investigate how much depressed individuals actually differ in their symptoms. Using a conservative strategy to estimate profiles, we identified 1030 unique depression symptom profiles in 3703 depressed patients. 83.9% of these profiles were endorsed by five or fewer subjects, and 48.6% were endorsed by only one single individual.
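Counting unique symptom profiles is conceptually simple, as the sketch below shows. It does not reproduce the conservative estimation strategy from the paper; each patient is simply represented as a tuple of present (1) and absent (0) symptoms, and the eight toy patients are made up.

```python
from collections import Counter

def profile_summary(patients):
    """Summarise how many distinct symptom profiles occur and how rare they are."""
    profiles = Counter(tuple(patient) for patient in patients)
    counts = list(profiles.values())
    return {
        "n_profiles": len(profiles),
        "n_single": sum(1 for c in counts if c == 1),        # profiles seen in one patient only
        "n_five_or_fewer": sum(1 for c in counts if c <= 5),
        "most_common_share": max(counts) / len(patients),
    }

# Hypothetical patients, each with three symptoms coded present (1) or absent (0).
patients = [
    (1, 1, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1),
    (1, 1, 1), (1, 1, 0), (0, 0, 1), (1, 0, 1),
]
summary = profile_summary(patients)
print(summary)
```

With only three binary symptoms there are at most 8 possible profiles; with the larger symptom sets used in practice the number of possible combinations grows very quickly, which is part of why so many profiles in real data are endorsed by only one or a few individuals.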

The results could have shown that most patients fit into a few common patterns, but the most common symptom profile had a frequency of only 1.8%. This substantial symptom variation among individuals with the same diagnosis calls into question the status of MD as a specific consistent syndrome. It stresses the importance of investigating specific symptoms and their dynamic interactions. Sum-scores obfuscate important insights, and the analysis of individual symptoms and their causal associations that is an important part of the ‘Psychosystems Project’ described on this website offers a way forward.

Fried, E. I., & Nesse, R. M. (2015). Depression is not a consistent syndrome: An investigation of unique symptom patterns in the STAR*D study. Journal of Affective Disorders, 172, 96–102.