Some notes on the Journal of Abnormal Psychology special issue on the reproducibility of network analysis

By Denny Borsboom, Eiko Fried, Lourens Waldorp, Claudia van Borkulo, Han van der Maas, Angélique Cramer, Don Robinaugh, and Sacha Epskamp

The Journal of Abnormal Psychology has just published a paper by Forbes, Wright, Markon, and Krueger (2017a) entitled “Evidence that psychopathology symptom networks have limited replicability”, along with two commentaries and a rejoinder. One of the commentaries is from our group. In it, we identify several major statistical errors and methodological problems in Forbes et al.’s paper (Borsboom, Fried, Epskamp, Waldorp, van Borkulo, van der Maas, & Cramer, 2017). We also show that network models replicate very well when properly implemented: the correlation between network parameters in the original and replication samples exceeds .90 for both Ising models and relative importance networks. In fact, a formal statistical test cannot reject the hypothesis that the Ising networks in the two samples are exactly equal, despite high statistical power.

We are strongly committed to open science and would ideally have shared all of the data and code, so that readers could make up their own minds about whether networks replicate. Unfortunately, neither Forbes et al.’s analyses nor our own are fully reproducible, because the replication dataset used by Forbes et al. is not openly accessible. We can therefore share the analysis code we used and the original dataset, but not the replication dataset.

In their rejoinder, Forbes et al. (2017b) do not dispute the major points made in our commentary: they accept (a) that the relative importance networks they reported were incorrectly implemented; (b) that the data they used are unsuitable because the correlation matrix is distorted by the skip structure in the interview schedule; and (c) that it is incorrect to assume that different kinds of techniques should converge on the same network, as their analysis presupposed. In addition, Forbes et al. (2017b) now acknowledge that the directed acyclic graphs reported in the original paper were also incorrectly implemented. This is the version of the paper that we were invited to comment on, that was widely shared by the authors, and that we addressed in a previous blog post on this website.

Thus, relative to their evidence base at the time of writing their original publication, two out of three of Forbes et al.’s network models proved to be implemented incorrectly, and the correlations among the variables they studied turned out to be affected so strongly by the interview’s skip structure that no firm conclusions could be drawn from them.

Despite these facts, and even though virtually all the ‘evidence’ they originally based their conclusion on has dissolved, in their rejoinder Forbes et al. (2017b) repeat the claim that network models do not replicate.

Their line of argument is no longer based on their results from the target paper, but instead rests on two new pillars.

First, Forbes et al. (2017b) present a literature review of network papers related to post-traumatic stress disorder (PTSD), which are argued to report different networks, and suggest that this proves that networks do not replicate. With regard to this line of argument, we note that there are many reasons why network structures can differ: for instance, because the networks really are different (e.g., the relevant PTSD networks were estimated on different samples, with different kinds of trauma, using different methodologies) or because of sampling error (e.g., due to small sample sizes). We agree that there is heterogeneity in the results of PTSD network analysis, which in 2015 inspired a large collaborative effort to investigate the replicability of PTSD network structures in a paper that is forthcoming in the journal Clinical Psychological Science. The question of how these differences originate is important, but the existence of differences between networks does not provide evidence for the thesis that network models are “plagued with substantial flaws”, as Forbes et al. conclude, for the same reason that finding different factor models for PTSD across samples (Armour, Müllerová, & Elhai, 2015) does not imply that there is something wrong with factor analysis. Network methodology is evaluated on methodological grounds, using mathematical analysis and simulations, not on the basis of whether it shows different results in different populations.

Second, Forbes et al. (2017b) base their conclusions on a technique presented by the other team of commentators (Steinley, Hoffman, Brusco, & Sher, 2017), who purport to evaluate the statistical significance of connection parameters in the network. This is done by evaluating these parameters relative to a sampling distribution constructed by keeping the margins of the contingency table fixed (i.e., controlling for item and person effects). The test in question is not new; it has been known in psychometrics for several years as a way to investigate whether items adhere to a unidimensional latent variable model called the Rasch model (Verhelst, 2008). While we welcome new techniques to assess network models, we know from standing mathematical theory that any network connections that produce data consistent with the Rasch model will not be picked up by this testing procedure. This is problematic because we also know that fully connected networks with uniform connection weights will produce precisely such data (Epskamp, Maris, Waldorp, & Borsboom, in press). As a consequence, the proposed test is unreliable and cannot be used to evaluate the robustness of network models. People in our team are now preparing a full methodological evaluation of Steinley et al.’s (2017) technique and will post the results soon.
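To make the idea of a sampling distribution with fixed margins concrete, here is a minimal Python sketch. It is not the procedure of Steinley et al. (2017) and not the algorithm of Verhelst (2008); it swaps in a simple "checkerboard swap" sampler, and the function names and the toy data are our own assumptions for illustration. Every replicate it produces has the same person totals (row sums) and item totals (column sums) as the observed binary data.

```python
import random


def checkerboard_swap(data, rng):
    """Try one swap on a random 2x2 submatrix; return True if it was applied.

    A swap is only possible when the submatrix looks like [[1, 0], [0, 1]]
    or [[0, 1], [1, 0]]. Flipping it leaves all row and column sums intact.
    """
    r1, r2 = rng.sample(range(len(data)), 2)
    c1, c2 = rng.sample(range(len(data[0])), 2)
    a, b = data[r1][c1], data[r1][c2]
    c, d = data[r2][c1], data[r2][c2]
    if a == d and b == c and a != b:
        data[r1][c1], data[r1][c2] = b, a
        data[r2][c1], data[r2][c2] = d, c
        return True
    return False


def sample_with_fixed_margins(data, n_attempts=10000, seed=0):
    """Return a shuffled copy of a 0/1 matrix with identical margins."""
    rng = random.Random(seed)
    shuffled = [row[:] for row in data]
    for _ in range(n_attempts):
        checkerboard_swap(shuffled, rng)
    return shuffled


def margins(data):
    """Row sums (person totals) and column sums (item totals)."""
    return [sum(row) for row in data], [sum(col) for col in zip(*data)]


# Toy data (hypothetical): 5 persons (rows) by 4 symptoms (columns).
observed = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
replicate = sample_with_fixed_margins(observed, n_attempts=2000, seed=1)
print(margins(observed) == margins(replicate))
```

A statistic computed on many such replicates gives the reference distribution against which an observed connection is judged. Because every replicate shares the observed margins, any structure that is fully captured by those margins is invisible to the comparison, which is the concern raised above.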

In the conclusion of their rejoinder, Forbes et al. (2017b) call for methodologically rigorous testing of hypotheses that arise from network theory and analysis. We echo this conclusion. It is critical that we continue the hard work of advancing our analytic methods, forming causal hypotheses about the processes that generate and maintain mental disorders, and testing these hypotheses in dedicated studies. In addition, we think it is important that additional replication studies be carried out to examine the extent to which network models replicate; interested researchers can use our analysis code as a template to implement relevant models in different populations and compare the results.


Armour, C., Műllerová, J., & Elhai, J. D. (2015). A systematic literature review of PTSD’s latent structure in the Diagnostic and Statistical Manual of Mental Disorders: DSM-IV to DSM-5. Clinical Psychology Review, 44, 60–74.

Borsboom, D., Fried, E. I., Epskamp, S., Waldorp, L. J., van Borkulo, C. D., van der Maas, H. L. J., & Cramer, A. O. J. (2017). False alarm? A comprehensive reanalysis of “Evidence that psychopathology symptom networks have limited replicability” by Forbes, Wright, Markon, and Krueger. Journal of Abnormal Psychology.

Epskamp, S., & Fried, E. I. (in press). A Tutorial on Regularized Partial Correlation Networks. Psychological Methods. Preprint:

Epskamp, S., Maris, G., Waldorp, L. J., & Borsboom, D. (in press). Network psychometrics. To appear in: Irwing, P., Hughes, D., & Booth, T. (Eds.), Handbook of Psychometrics. New York: Wiley.

Epskamp, S., Waldorp, L. J., Mõttus, R., & Borsboom, D. (submitted). Discovering Psychological Dynamics: The Gaussian Graphical Model in Cross-sectional and Time-series Data. Preprint:

Forbes, M., Wright, A., Markon, K., & Krueger, R. (2017a).  Evidence that psychopathology symptom networks have limited replicability. Journal of Abnormal Psychology.

Forbes, M., Wright, A., Markon, K., & Krueger, R. (2017b).  Further evidence that psychopathology symptom networks have limited replicability and utility: Response to Borsboom et al. and Steinley et al. Journal of Abnormal Psychology. To our knowledge, the preprint for this paper is not currently available online.

Steinley, D., Hoffman, M., Brusco, M. J., & Sher, K. J. (2017). A method for making inferences in network analysis: A comment on Forbes, Wright, Markon, & Krueger (2017). Journal of Abnormal Psychology. To our knowledge, the preprint for this paper is not currently available online.

Verhelst, N. D. (2008). An Efficient MCMC Algorithm to Sample Binary Matrices with Fixed Marginals. Psychometrika, 73, 705–728.


Two new papers on attitude networks

Our paper on predicting voting decisions using network analysis was published in Scientific Reports (PDF) and our tutorial on analyzing and simulating attitude networks was published in Social Psychological and Personality Science (PDF).

In the first paper, we show that whether attitudes toward presidential candidates predict voting decisions depends on the connectivity of the attitude network. If the attitude network is strongly connected, attitudes almost perfectly predict the voting decision. Less connected attitude networks are less predictive of voting decisions. Additionally, we show that the most central attitude elements have the highest predictive value.

In the second paper, we provide a state-of-the-art tutorial on how to estimate (cross-sectional) attitude networks and how to compute common network descriptives on estimated attitude networks. We also show how one can simulate from an estimated attitude network to derive formalized hypotheses regarding attitude dynamics and attitude change.
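As a rough illustration of what simulating from an attitude network involves, the following Python sketch draws samples from a small Ising model with a Gibbs sampler. It is not the code from the tutorial. The -1/+1 coding, the temperature fixed at 1, the function name, and the toy weights are all assumptions made for this example.

```python
import math
import random


def gibbs_sample_ising(weights, thresholds, n_samples, burn_in=200, seed=0):
    """Draw states from an Ising model with nodes coded -1/+1.

    weights: symmetric matrix of pairwise connection weights (diagonal ignored)
    thresholds: one threshold per node (positive values favour +1)
    Assumes moderate parameter values; very large ones could overflow exp().
    """
    rng = random.Random(seed)
    n = len(thresholds)
    state = [rng.choice((-1, 1)) for _ in range(n)]
    samples = []
    for sweep in range(burn_in + n_samples):
        for i in range(n):
            # Input that node i receives from its threshold and its neighbours.
            field = thresholds[i] + sum(
                weights[i][j] * state[j] for j in range(n) if j != i
            )
            # Conditional probability that node i is +1 given all other nodes.
            p_plus = 1.0 / (1.0 + math.exp(-2.0 * field))
            state[i] = 1 if rng.random() < p_plus else -1
        if sweep >= burn_in:
            samples.append(state[:])
    return samples


# Toy network (hypothetical): three evaluative reactions, all positively linked.
toy_weights = [
    [0.0, 0.6, 0.4],
    [0.6, 0.0, 0.5],
    [0.4, 0.5, 0.0],
]
toy_thresholds = [0.1, 0.0, -0.1]
toy_samples = gibbs_sample_ising(toy_weights, toy_thresholds, n_samples=500)
print(len(toy_samples), toy_samples[0])
```

Changing a threshold or a weight and re-running the sampler is one simple way to turn a verbal hypothesis about attitude change into a simulated prediction.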

Psychopathology networks replicate with stunning precision

By Denny Borsboom, Eiko Fried, Sacha Epskamp, Lourens Waldorp, Claudia van Borkulo, Han van der Maas, and Angélique Cramer

Update: The pre-print of our official commentary is now online at PsyArXiv.

In a forthcoming paper in the Journal of Abnormal Psychology entitled “Evidence that psychopathology networks do not replicate”, Miriam Forbes, Aidan Wright, Kristian Markon, and Robert Krueger purport to show that network structures do not replicate across datasets. They estimate networks for symptoms of Major Depressive Episode (MDE) and Generalized Anxiety Disorder (GAD) in two large datasets – one from the National Comorbidity Survey-Replication (NCS) and one from the Australian National Survey of Mental Health and Well-Being (NSMHWB). As is evident from our published work (Fried & Cramer, in press; Fried, van Borkulo, Cramer, Boschloo, Schoevers, & Borsboom, 2017; Epskamp, Borsboom, & Fried, 2017) we see the reproducibility of network research as a top priority, and are happy to see that researchers are investigating this important issue.

The conclusion proposed by the authors is that “network analysis […] had poor replicability between […] samples”. This conclusion is not supported by the data for state-of-the-art network models, as we will argue in a commentary solicited by the Journal of Abnormal Psychology. Given the intense interest in the matter, however, we deemed it useful to post a short blog post in advance to state a fact that may not be obvious to most readers given the rhetorical style of Forbes et al. (in press). In one sentence: state-of-the-art networks don’t just replicate – they replicate with stunning precision.

Unfortunately, the authors of the paper have not yet shared their data or code with us, so we cannot fully evaluate their work. They did, however, share the parameter matrices that resulted from their analysis, and these are sufficient to establish that the authors’ conclusion does not apply to state-of-the-art network modeling techniques (e.g., networks estimated using our R-package IsingFit; Van Borkulo et al., 2014). The authors of the paper suggest as much when they say that “[t]he replicability of the edges in the Ising models was remarkably similar between and within samples”, but this conclusion is easily lost in the rhetoric of the paper’s title, abstract, and discussion. In addition, the authors insufficiently articulate just how similar the networks are; as a result, users of our techniques may be wondering whether Ising networks really live up to their reputation as highly stable and secure network estimation techniques.

Figure 1. The networks estimated from the NCS and NSMHWB samples, and the scatterplot of network parameters as estimated in both samples (r=.95). The network representations use the default settings of qgraph (Epskamp, Cramer, Waldorp, Schmittmann, & Borsboom, 2012) and a common (average) layout to optimize the comparison between datasets.

So just how replicable are Ising networks? Figure 1 pictures the situation quite clearly: the IsingFit networks are almost indistinguishable, and the network parameters display a whopping correlation of .95 for network edges and .93 for node thresholds across samples (Spearman correlations equal .88 and .85, respectively). Even centrality indices, which we usually approach with considerable caution due to their sensitivity to sampling variation (Epskamp, Borsboom, & Fried, 2017), show surprisingly good replication performance with correlations of .94 (strength), .94 (betweenness), and .76 (closeness).
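For readers who want to run this kind of comparison on their own parameter matrices, here is a minimal Python sketch. It is not our analysis code. The helper names are our own, and the two small weight matrices are made-up toy values rather than the NCS or NSMHWB estimates. The sketch takes the unique edges from two symmetric weight matrices and computes their Pearson and Spearman correlations.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Unique off-diagonal entries (i < j) of a symmetric weight matrix."""
    return [matrix[i][j] for i, j in combinations(range(len(matrix)), 2)]


def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in order[pos:end + 1]:
            result[k] = (pos + end) / 2 + 1
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy edge-weight matrices (hypothetical values) for two samples.
network_a = [
    [0.0, 0.8, 0.0, 0.3],
    [0.8, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 1.1],
    [0.3, 0.0, 1.1, 0.0],
]
network_b = [
    [0.0, 0.7, 0.1, 0.2],
    [0.7, 0.0, 0.6, 0.0],
    [0.1, 0.6, 0.0, 1.0],
    [0.2, 0.0, 1.0, 0.0],
]
edges_a = upper_triangle(network_a)
edges_b = upper_triangle(network_b)
print(round(pearson(edges_a, edges_b), 2), round(spearman(edges_a, edges_b), 2))
```

The same two functions can be applied to the vectors of node thresholds or centrality indices to obtain the other correlations reported above.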

Nobody in our group had in fact expected such an accurate replication across two entirely distinct samples. As such, we argue that the authors’ conclusion that “the unique utility of network analysis …seems limited to visualizing complex multivariate relationships…” is unwarranted. Given our re-analysis of the results of the Forbes et al. (in press) paper one will wonder: how on earth can the authors of the paper interpret this result as “evidence that psychopathology networks do not replicate”? Well, if you want to find that out, keep an eye out for the upcoming issue of the Journal of Abnormal Psychology, in which we will provide a comprehensive dissection of their methodology and argumentation. We’ll keep you posted!



Epskamp, S., Borsboom, D. & Fried, E.I. (2017). Estimating psychological networks and their accuracy: a tutorial paper. Behavior Research Methods. doi:10.3758/s13428-017-0862-1

Epskamp, S., Cramer, A. O. J., Waldorp, L. J., Schmittmann, V. D., & Borsboom, D. (2012). qgraph: Network visualizations of relationships in psychometric data. Journal of Statistical Software, 48, 1-18.

Forbes, Wright, Markon, and Krueger (in press).  Evidence that psychopathology symptom networks do not replicate. Journal of Abnormal Psychology.

Fried, E. I. & Cramer, A. O. J. (in press). Moving forward: challenges and directions for psychopathological network theory and methodology. Perspectives on Psychological Science.

Fried, E. I.*, van Borkulo, C. D.*, Cramer, A. O. J., Boschloo, L., Schoevers, R. A., & Borsboom, D. (2017). Mental disorders as networks of problems: a review of recent insights. Social Psychiatry and Psychiatric Epidemiology, 52, 1-10.

Van Borkulo, C.D., Borsboom, D., Epskamp, S., Blanken, T.F., Boschloo, L., Schoevers, R.A. & Waldorp, L.J. (2014). A new method for constructing networks from binary data. Scientific Reports, 4: 5918. doi: 10.1038/srep05918




Paper on comparing networks of two groups of patients with MDD

Our paper on comparing networks of two groups of patients with Major Depressive Disorder was published in JAMA Psychiatry (PDF).

In this paper, we investigated the association between baseline network structure of depression symptoms and the course of depression. We compared the baseline network structure of persisters (defined as patients with MDD at baseline and depressive symptomatology at 2-year follow-up) and remitters (patients with MDD at baseline without depressive symptomatology at 2-year follow-up). To compare network structures we used the first statistical test that directly compares connectivity of two networks (Network Comparison Test; NCT). While both groups have similar symptomatology at baseline, persisters have a more densely connected network compared to remitters. More specific symptom associations seem to be an important determinant of persistence of depression.
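The logic of a permutation test on network connectivity can be sketched in a few lines of Python. This is not the Network Comparison Test itself. As a simplifying assumption, connectivity is measured here as the sum of absolute pairwise correlations rather than as the strength of an estimated, regularized network, and all function names and the toy data are ours.

```python
import random
from itertools import combinations
from math import sqrt


def correlation(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def global_strength(rows):
    """Stand-in connectivity measure: sum of absolute pairwise correlations."""
    columns = list(zip(*rows))
    pairs = combinations(range(len(columns)), 2)
    return sum(abs(correlation(columns[i], columns[j])) for i, j in pairs)


def permutation_test(group_a, group_b, n_permutations=500, seed=0):
    """Return the observed connectivity difference and a permutation p-value."""
    rng = random.Random(seed)
    observed = abs(global_strength(group_a) - global_strength(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_permutations):
        # Reassign persons to groups at random, keeping the group sizes.
        rng.shuffle(pooled)
        diff = abs(global_strength(pooled[:n_a]) - global_strength(pooled[n_a:]))
        if diff >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_permutations + 1)


# Toy data (hypothetical): 30 persons per group, 4 binary symptoms each.
data_rng = random.Random(42)
toy_a = [[data_rng.randint(0, 1) for _ in range(4)] for _ in range(30)]
toy_b = [[data_rng.randint(0, 1) for _ in range(4)] for _ in range(30)]
difference, p_value = permutation_test(toy_a, toy_b, n_permutations=200)
print(round(difference, 3), round(p_value, 3))
```

The p-value is the share of random regroupings that produce a connectivity difference at least as large as the observed one, so a small value suggests that the two groups really differ in connectivity.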

A Dutch newspaper (NRC Handelsblad, November 21st, 2015) published a piece about this paper (Link).

Paper on network model of attitudes

Our paper on the Causal Attitude Network (CAN) model was published in Psychological Review (PDF).

In the paper, we introduce the CAN model, which conceptualizes attitudes as networks consisting of interacting evaluative reactions, such as beliefs (e.g., judging a presidential candidate as competent and charismatic), feelings (e.g., feeling pride and hope about the candidate), and behaviors (e.g., voting for the candidate). Interactions arise through direct causal connections between the evaluative reactions (e.g., feeling hopeful about the candidate because one judges her as competent and charismatic). The CAN model assumes that causal connections between evaluative reactions serve to heighten the consistency of the attitude, and we argue that the Ising model’s axiom of energy expenditure reduction represents a formalized account of consistency pressure. Because individuals strive not only for consistency but also for accuracy, network representations of attitudes have to deal with the tradeoff between consistency and accuracy. This tradeoff is likely to lead to a small-world structure, and we show that attitude networks indeed have a small-world structure. We also discuss the CAN model’s implications for attitude change and stability. Furthermore, we show that connectivity of attitude networks provides a formalized and parsimonious account of the dynamical differences between strong and weak attitudes.

Dalege, J., Borsboom, D., van Harreveld, F., van den Berg, H., Conner, M., & van der Maas, H. L. J. (2015). Toward a formalized account of attitudes: The Causal Attitude Network (CAN) model. Psychological Review. Advance online publication.

HRQoL Paper published

Recently, our paper “The application of a network approach to health-related quality of life (HRQoL): introducing a new method for assessing HRQoL in healthy adults and cancer patients” was published in Quality of Life Research.

The goal of this paper was to introduce a new approach to the analysis of Health-Related Quality of Life (HRQoL) data, namely a network model. To show that the network approach can aid in the analysis of these kinds of data, we constructed networks of two samples: Dutch cancer patients (N = 485) and Dutch healthy adults (N = 1742). Both completed the 36-item Short Form Health Survey (SF-36), a commonly used instrument across different disease conditions and patient groups [1]. In order to investigate the influence of diagnostic status, we added this binary variable to a third network that was constructed using both samples. The SF-36 consists of 8 sub-scales (domains). We constructed so-called “sub-scale” networks to gain more insight into the dynamics of HRQoL at the domain level.

Results showed that the global structure of the SF-36 is dominant in all networks, supporting the validity of the questionnaire’s subscales. Furthermore, we found that the network structures of the individual samples were similar with respect to the basic structure at the item level, and highly similar with respect to both the basic structure and the strength of the connections at the subscale level. Lastly, centrality analyses revealed that maintaining a daily routine despite one’s physical health predicts HRQoL levels best.

We concluded that the network approach offers an alternative view on Health-Related Quality of Life. We showed that the HRQoL network is, in its basic structure, similar across samples. Moreover, by using the network approach, we are able to identify important characteristics in the structure, which may inform treatment decisions.

Kossakowski, J. J., Epskamp, S., Kieffer, J. M., Borkulo, C. D. van, Rhemtulla, M., & Borsboom, D. (in press). The Application of a Network Approach to Health-Related Quality of Life: Introducing a New Method for Assessing HRQoL in Healthy Adults and Cancer Patients. Quality of Life Research. DOI: 10.1007/s11136-015-1127-z.

[1] Ware, J. E, Jr, & Sherbourne, C. D. (1992). The MOS 36-item short-form health survey (SF-36): I. Conceptual framework and item selection. Medical Care, 30, 473–483.