Network Application 0.1

During the last couple of months, I have been working on building a web application (called NetworkApp) that enables users to upload their own data, construct networks and analyze them. Today, version 0.1 of this application has been released.

The NetworkApp has been build with Shiny: a web application framework for R. While R is used to build the web application, the user won’t see this and thus does not need to have any programming skills to use the NetworkApp.

To start using the NetworkApp, click here to access it.

New paper on depression symptom profiles

Our new study titled “Depression is not a consistent syndrome: An investigation of unique symptom patterns in the STAR*D study” was published in the Journal of Affective Disorders (PDF).

In the paper we examine the degree of heterogeneity of Major Depression (MD). The DSM-5 defines a host of depression symptoms, which are commonly added up to sum-scores that reflect overall depression severity. This implies that depression symptoms are interchangeable indicators of the same underlying condition. It further implies that all patients with MD have the same disorder, justifying the search for things like “depression risk factors” and “depression biomarkers”.

Therefore, we wanted to investigate how much depressed individuals actually differ in their symptoms. Using a conservative strategy to estimate profiles, we identified 1030 unique depression symptom profiles in 3703 depressed patients. 83.9% of these profiles were endorsed by five or fewer subjects, and 48.6% were endorsed by only one single individual.

The results could have shown that most patients fit into a few common patterns, but the most common symptom profile had a frequency of only 1.8%. This substantial symptom variation among individuals with the same diagnosis calls into question the status of MD as a specific consistent syndrome. It stresses the importance of investigating specific symptoms and their dynamic interactions. Sum-scores obfuscate important insights, and the analysis of individual symptoms and their causal associations that is an important part of the ‘Psychosystems Project’ described on this website offers a way forward.

Reference:
Fried, E. I., & Nesse, R. M. (2015). Depression is not a consistent syndrome: An investigation of unique symptom patterns in the STAR*D study. Journal of Affective Disorders, 172, 96–102.

Network Model Selection Using qgraph 1.3

In this post I will describe some of the new functionality in the 1.3 version of qgraph. The qgraph package is aimed at visualizing correlational structures and, more recently, at visualizing the results of network analysis applied to, in particular but not limited to, psychology. The application of network analysis to psychology has been the core interest of the Psychosystems research group in which the qgraph package has been developed. In version 1.2.4 and 1.2.5 new functions were introduced to compute graph descriptives of weighted graphs and in version 1.3, the visualization techniques are extended with model selection tools for estimating the graph structure. As such, qgraph is no longer just a visualization tool but a full suite of methods for applying network analysis from data to estimating a graph structure and assessing graph descriptives. Version 1.3 is on CRAN now and will be updated on mirrors shortly. To install and check if you have version 1.3 run:

install.packages("qgraph")
packageDescription("qgraph")$Version

Alternatively, you can install the developmental version from GitHub:

library("devtools")
install_github("sachaepskamp/qgraph")

Network models

In qgraph generally two types of network models can be estimated for multivariate Gaussian data: the correlation network and the partial correlation network. In graph literature a correlation network is typically denoted with bidirectional edges whereas a partial correlation network is typically denoted with undirected edges:

 

In qgraph, bidirectional edges can be plotted but are omitted by default to make the graph clearer. Both the correlation network and the partial correlation network are saturated models if fully connected and degrees of freedom can be obtained by removing edges. Thus, a network model simply means a correlation or partial correlation network in which some edges are set to be equal to zero; model selection means that we attempt to identify which edges should be set to zero. This can be done broadly in three ways:

  • Thresholding: simply remove all edges that are weaker than some threshold, such as all edges that are not significantly different from zero.
  • Model search: Continuously estimate models in which some edges are set to zero and stepwise comparisons or an information criterium to decide which model is optimal
  • LASSO estimation: jointly estimate parameter values and the absence of edges using regularization techniques (more on this below).

Correlation networks

A correlation network is a model for marginal independence relationships; two unconnected nodes imply a marginal independence between two nodes. For instance, the graph above implies the following independence relationship:
\[
A \!\perp\!\!\!\perp C.
\]
We can use the schur complement to see the relationship between \(A\) and \(C\) after conditioning on \(B\):
\[
\mathrm{Cov}\left( A, C \mid B \right) = \mathrm{Cov}\left( A, C \right) - \mathrm{Cov}\left( A, B \right) \mathrm{Var}\left( B \right)^{-1} \mathrm{Cov}\left(B, C \right) \not= 0,
\]
Since both covariances with \(B\) are nonzero and the inverse of a variance is always nonzero for any random variable. Thus, the correlation network above does not only imply a marginal independence, it also implies a conditional dependency relationship:
\[
A \not\!\perp\!\!\!\perp C \mid B.
\]
Such an induced dependency can only occur in very specific causal structures that might not correspond to our general notion of nodes interacting with each other. In general, we expect that if \(A\) and \(B\) are correlated and \(B\) and \(C\) are correlated than \(A\) and \(C\) are also correlated. In general, we would expect correlation networks to often be saturated, especially when constructing networks based on psychometric tests that are designed to contain correlated items. As such, we generally do not apply model selection to correlation networks.

Even though correlation networks are not that interesting from a modeling perspective they are very interesting in the information they portray about the structure of a dataset. Epskamp et al. (2012) described how qgraph and a correlation network of the Big 5 can be used from a psychometric point of view to simply visualize the correlational patterns in the dataset; factors should emerge as clusters of nodes. Cramer et al. (2012) interpreted the same Big 5 network in the context of the network perspective of psychology. Furthermore, correlation networks have also been described and used by Cramer, Borsboom, Aggen and Kendler (2011), Schmittmann et al. (2013) and Ruzzano, Borsboom and Geurts (2014).

Partial correlation network

The partial correlation network is a model for conditional independence relationships. In the above undirected network there is no edge between nodes \(A\) and \(C\), which implies that:
\[
A \!\perp\!\!\!\perp C \mid B.
\]
Thus, the partial covariance between \(A\) and \(C\) after conditioning on all other nodes in the network equals zero. Now, by definition, \(\mathrm{Cov}\left( A, C \mid B \right)\) is equal to zero and thus:
\[
\mathrm{Cov}\left( A, C \right) = \mathrm{Cov}\left( A, B \right) \mathrm{Var}\left( B \right)^{-1} \mathrm{Cov}\left(B, C \right) \not= 0,
\]
which implies the following marginal relationship:
\[
A \not\!\perp\!\!\!\perp C.
\]
Therefore, the partial correlation network implies the opposite kind of independence relationships between unconnected nodes; unconnected nodes are conditionally independent but marginally still allowed to correlate. The edges in the network can be understood as pairwise interactions between two nodes This allows us to model interaction effects between pairs of nodes while still allowing all nodes to correlate. Such a network of pairwise interactions is termed a pairwise Markov Random Field (Lauritzen, 1996). If the data is assumed multivariate normally distributed then the model is called a Gaussian Graphical Model, Gaussian Random field, concentration network or partial correlation network; the network structure is then directly related to the inverse of the covariance matrix with zeroes indicating missing edges.

We will use the term partial correlation network to describe this class of networks. In a partial correlation network, the strength of edges—the partial correlation between two nodes after conditioning on all other nodes—is directly equivalent to the predictive quality between two nodes that would be obtained in multiple regression. Thus, the strength of an edge can be seen as how well two nodes predict each other; in the above network knowing \(A\) tells you something about \(B\) and in turn knowing \(B\) tells you something about \(C\). The network can also be seen as large scale mediation analysis, as in the above network the predictive quality of \(A\) on \(C\) is mediated by \(B\).

Network estimation using qgraph

We will analyze the BFI dataset from the psych package, of which the correlation network had previously been described on Joel Cadwell’s blog. This dataset contains 25 items on the five central personality traits: openness to Experience, conscientiousness, extroversion, agreeableness and neuroticism. For each trait there are five indicators, such as “make friends easily” for extroversion and “make people feel at ease” for agreeableness. We can load the data into R as follows:

library("psych")
data(bfi)

We are only interested in the interactions between items of the big five so we only subset the 25 indicators:

bfiSub <- bfi[,1:25]

These responses are measured on a five point scale, and thus the assumption of normality is violated by default. As a solution, we compute not the Pearson correlations but the polychoric correlations instead. To do this, we can use the cor_auto() function in qgraph that automatically detects if data looks ordinal and runs the lavCor() function from the lavaan package to compute Pearson, polychoric or polyserial correlations where needed:

library("qgraph")
bfiCors <- cor_auto(bfiSub)
## Variables detected as ordinal: A1; A2; A3; A4; A5; C1; C2; C3; C4; C5; E1; E2; E3; E4; E5; N1; N2; N3; N4; N5; O1; O2; O3; O4; O5

To help plot our graph we can also load more information on the meaning of each node:

Names <- scan("http://sachaepskamp.com/files/BFIitems.txt",what = "character", sep = "\n")
Groups <- rep(c('A','C','E','N','O'),each=5)

Now we can plot the (saturated) correlation network. To do so, we can simply send the correlation matrix to the qgraph() function and use some more arguments to make the plot look good. In addition, we can set graph = "cor" to tell qgraph that the input is, in fact, a correlation matrix (this doesn’t do much other than check if the matrix is positive definite):

corGraph <- qgraph(bfiCors, layout = "spring", graph = "cor",
       nodeNames = Names, groups = Groups, legend.cex = 0.3,
      cut = 0.3, maximum = 1, minimum = 0, esize = 20,
      vsize = 5, repulsion = 0.8)

In this network each node represents one of the 25 items in the BFI dataset and edges represent the strength of correlation between them; green edges represent positive correlations, red edges negative correlations and the wider and more saturated an edge the stronger the correlation. The argument layout = "spring" forced qgraph to use a weighted version of the Fruchterman and Reingold (1991) algorithm to place strongly correlated nodes together. The nodeNames, groups and legend.cex arguments controlled the coloring and the legend placed next to the graph, the vsize argument the size of the nodes and the repulsion argument the repulsion ratio used in the Fruchterman-Reingold algorithm.

The minimum argument causes all edges with a value below the minimum to not be drawn (these edges are only made invisible but still taken into account when computing graph descriptives), the maximum argument sets an upper bound for the width of edges to scale to and the cut argument splits scaling in saturation and width; edges above the cut value are fully saturated and scale in width whereas edges below the cut value are small and scale in color. By default qgraph will choose these values to optimally plot a graph (see the manual for more details); to know which values qgraph used you could use the argument details = TRUE.

The above correlation network overall nicely retrieves the five factor structure; as expected factors emerge as clusters of correlations. To compute the saturated partial correlation network we can set the argument graph = "pcor":

pcorGraph <- qgraph(bfiCors, layout = corGraph$layout, graph = "pcor",
      nodeNames = Names, groups = Groups, legend.cex = 0.3, 
      cut = 0.1, maximum = 1, minimum = 0, esize = 20,
      vsize = 5)

Now we set the layout argument to be equal to the layout of the correlation network and the cut value to be lower as partial correlations are generally weaker than correlations. We can see that the partial correlation network generally returns the same pattern of interactions but also that some edges—such as between N4 and N1—are much weaker in the partial correlation network than in the correlation network.

Thresholding

Both the networks shown above are fully saturated; even though some (partial) correlations are so weak they are practically invisible both networks are fully connected. The aim of model selection is to identify which connections are zero, as these values indicate (conditional) independence relationships. A straightforward way of doing this is by thresholding; setting all edges that are not significantly different from zero to be equal to zero.

To use thresholding in qgraph we can use the threshold argument, which works different than minimum in that it removes edges from the network, influencing the network descriptives. The threshold argument can be set to both a value or a string indicating the type of adjustment for multiple tests to be used, which corresponds to the adjust argument of corr.test function in the psych package. In addition to the options psych provides the threshold argument can be set to locfdr to use the local FDR measure provided by the fdrtool package and used in the FDRnetwork function in qgraph. For instance, we can remove all non-significant edges using the Holm correction on the partial correlation network as follows:

holmGraph <- qgraph(bfiCors, layout = corGraph$layout, graph = "pcor",
      threshold = "holm", sampleSize = nrow(bfi),
      nodeNames = Names, groups = Groups, legend.cex = 0.3, 
      cut = 0.1, maximum = 1, minimum = 0, esize = 20,
      vsize = 5)

Which makes the network much cleaner and more interpretable.

Model search

While thresholding can be insightful it is a rather crude way of model selection; we only remove the non-significant edges and do not look at how other edge values change nor at how well the model without certain edges explains the data. To accomplish this we need model selection. We are particularly interested in the question: what model explains the data appropriately while being simple (having few edges).

One way to accomplish this is by evaluating all possible models and comparing them on some information criterium—such as the BIC—that weights the likelihood against the sparsity. For a three-node network all possible models are the following:

 

The connections between models indicate nesting relationships; a model with only an edge between \(A\) and \(B\) is nested in the model with both edges between \(A\) and \(B\) and \(B\) and \(C\). The number of possible models is \(2^{p(p-1)/2}\) with \(p\) being the amount of nodes in the network. It will be clear that in the case of a three node network it is entirely tractable to estimate \(8\) models and compare the BIC values but for the case of \(25\) nodes it becomes much impossible to compare all possible models. Thus, a search is performed to navigate through the model space. We could start at the bottom with the independence model and stepwise move higher to more complicated models by only comparing directly nested models, or start at the top with the saturated model and stepwise move lower to simpler models. The first routine is called step-up estimation and the second routine step-down estimation.

The findGraph function in qgraph can be used to search for an optimal (partial) correlation network. This function can be used to do brute search of all models, step-up comparison and step-down comparison. To compare models the Extended Bayesian Information Criterium (EBIC; Foygel & Drton, 2010) is used. By default, the findGraph function will start at the thresholded graph using Holm correction and iterate between step-up and step-down search until an optimal model is found:

optGraph <- findGraph(bfiCors, nrow(bfi), type = "pcor")
optimalGraph <- qgraph(optGraph, layout = corGraph$layout,
      nodeNames = Names, groups = Groups, legend.cex = 0.3, 
      cut = 0.1, maximum = 1, minimum = 0, esize = 20,
      vsize = 5)

 

LASSO estimation

Model selection can be a slow process as the amount of variables increases, and the stepwise process might get lost in the model space and not converge to the best model. For example, stepwise up and stepwise down methods often obtain a different model. A well established method for simultaneously estimating parameters and performing model selection on a partial correlation network is to employ the least absolute shrinkage and selection operator (LASSO; Tibshirani, 1996) regularization technique. A particularly fast variant of the LASSO for these kinds of networks is the graphical LASSO which directly estimates the inverse of the covariance matrix (Friedman, Hastie and Tibshirani, 2008) and has been implemented in the R package glasso.

In short, the graphical LASSO estimates not the maximum likelihood solution but a penalized maximum likelihood solution in which the likelihood is penalized for the sum of absolute parameter estimates. Because of this, the graphical LASSO estimates a model that still explains the data well with generally weaker parameter estimates and many parameter estimates of exactly zero. The amount of penalization is specified by a tuning parameter \(\lambda\). In qgraph, the EBICglasso function can be used to estimate a network using 100 different \(\lambda\) values and return of these the network that minimizes the extended Bayesian Information Criterium (EBIC; Chen, and Chen, 2008). Such a procedure has been established to accurately recover the network structure (Foygel & Drton, 2010; van Borkulo et al., 2014).

To automatically have qgraph run EBICglasso and estimate the model we need only set graph = "glasso":

glassoGraph <- qgraph(bfiCors, layout = corGraph$layout, 
      graph = "glasso", sampleSize = nrow(bfi),
      nodeNames = Names, groups = Groups, legend.cex = 0.3, 
      cut = 0.1, maximum = 1, minimum = 0, esize = 20,
      vsize = 5)

This method runs extremely fast and returns a sparse and interpretable network structure. This network shows some interesting properties. For example, neurotocism consists of two distinct clusters, an anxiety and a depression cluster, connected by the item “N3: Have frequent mood swings”. The Conscientiousness item “C5: Waste my time” and the extroversion item “E3: Know how to captivate people” act as a bridging nodes between the different personality trait clusters. Finally, most strong connections make perfectly sense when interpreted as a pairwise interaction.

Graph descriptives using qgraph

Now that we have estimated the partial correlation network in different ways we can look at how these networks compare in terms of graph descriptives. We are particularly interested in node centrality and clustering. To compare the estimated networks, we can use the centralityPlot and clusteringPlot functions which use ggplot2 to visualize the estimates:

centralityPlot(
  list(
    saturated = pcorGraph, 
    threshold = holmGraph,
    modelSearch = optimalGraph,
    glasso = glassoGraph)
  )

We can similarly compare the clustering coefficients:

clusteringPlot(
  list(
    saturated = pcorGraph, 
    threshold = holmGraph,
    modelSearch = optimalGraph,
    glasso = glassoGraph)
  )
## Warning: Removed 25 rows containing missing values (geom_point).

We can see that even though the graphs all are highly similar they can differ on centrality and clustering coefficients. However, generally all graphs are somewhat aligned; nodes that are highly central are so across all graphs.

References

Four new Ph. D. projects on networks

We are happy to welcome four new Ph.D. researchers into the network community:

  • Marie Deserno will investigate determinants of success and problems in Autism, in the context of a project organized by the Academische Werkplaats Autisme. The project is supervised by Hilde Geurts, Sander Begeer, and Denny Borsboom.
  • Riet van Bork will investigate methodologies suited to distinguish between network and latent variable explanations of symptom covariation. The project is funded partly by a European Commission Career Integration Grant awarded to Mijke Rhemtulla, who will supervise the project together with Denny Borsboom.
  • Jonas Dalege will develop network models for the structure and dynamics of attitudes in a project that involves a collaboration between the Psychological Methods and Social Psychology groups at the University of Amsterdam. The project is supervised by Frenk van Harreveld, Han van der Maas, and Denny Borsboom.
  • Merijn Mestdagh will develop time series models for nonlinear networks that involve large number of variables at the K.U. Leuven in a Ph. D. project awarded by the Flemish Organization for Scientific Research (FWO). The project is supervised by Francis Tuerlinckx and Denny Borsboom.

Three new NWO-Veni projects on networks

A very successful round of 2014 NWO grant applications for network researchers!

Our own Angélique Cramer received a Veni grant to further develop network methodology and apply that to mood disorders:

  • Network psychometrics: methods for exposing the architecture and dynamics of mood disorders
    Dr A.O.J. (Angélique) Cramer (f), University of Amsterdam – Psychometrics
    In the network approach a psychological disorder is a consequence of interactions between symptoms (gloomy and fretting). In this proposal methods to construct and analyse these networks will be further developed. Subsequently the networks of patients with mood disorders will be investigated: for example, can we predict when the patient will experience a relapse?

Hanneke Wigman of Groningen University received a Veni grant to investigate symptom networks in the etiology of psychosis:

  • Networks of symptoms in psychosis
    Dr J.T.W. (Hanneke) Wigman (f), University Medical Centre Groningen – Psychiatry
    The researcher will study the development of psychoses (a lost sense of reality) by describing networks of early symptoms and by examining per person how these symptoms influence each other. This will help to develop personalised treatment recommendations.

Annemarie Kalis of Utrecht University received a Veni grant to investigate the role of mental content in empirical psychological research, one subproject of which is devoted to network models:

  • A role for content and mental causation in empirical psychology
    Dr A. Kalis (f), Utrecht University – Philosophy
    Many mental states have content: our convictions, desires, intentions ‘are all about something’. Empirical psychology, however, scarcely offers space for the notion of content. This project will develop a philosophical analysis that shows how mental content can and should play a role in in empirical psychological research.