Free summer school materials and videos

Last month, we hosted the first online version of our summer school on network analysis. All materials except solutions are now available on the OSF, and for a limited time (until October 31) you can get access to 50+ video lectures for free via Eventbrite.

Special thanks to everyone who helped make this possible: Sacha Epskamp, Adela Isvoranu, Ria Hoekstra, Denny Borsboom, Riet van Bork, Julian Burger, Jonas Haslbeck, Lourens Waldorp, Eiko Fried, Alessandra Mansueto, Karoline Huth, Jill de Ron, Adam Finnemann, Gaby Lunansky, Jolanda Van der Ree-Kossakowski, and Pia Tio.

The polarization within and across individuals: the hierarchical Ising opinion model

On May 7, 2020, we published the paper ‘The polarization within and across individuals: the hierarchical Ising opinion model’ in the Journal of Complex Networks.

Polarization of opinions involves psychological processes as well as group dynamics. However, the interaction between the within-individual dynamics of attitude formation and across-person polarization is rarely studied. By modelling individual attitudes as Ising networks of attitude elements, and approximating this behaviour by the cusp singularity, we developed a fundamentally new model of social dynamics.

In this hierarchical model, agents behave either discretely or continuously depending on their attention to the issue. At the individual level the model reproduces the mere thought effect and resistance to persuasion. At the social level the model implies polarization and the persuasion paradox. We propose a new intervention for escaping polarization in bounded confidence models of opinion dynamics.
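To give a flavour of the within-person part of the model, here is a minimal base-R sketch (not the paper's implementation; the coupling matrix, attention values, and update scheme are purely illustrative) of Glauber dynamics on a small, fully connected Ising attitude network, where an attention parameter scales the interaction strength:

```r
set.seed(1)
n_nodes <- 10                                      # attitude elements
W <- matrix(0.2, n_nodes, n_nodes); diag(W) <- 0   # uniform positive couplings (illustrative)

# Single-spin Glauber updates; 'attention' scales the effective coupling strength:
glauber <- function(W, attention = 1, steps = 2000) {
  s <- sample(c(-1, 1), nrow(W), replace = TRUE)   # random initial attitude elements
  for (i in seq_len(steps)) {
    j <- sample(nrow(W), 1)
    p <- 1 / (1 + exp(-2 * attention * sum(W[j, ] * s)))  # P(s_j = +1 | rest)
    s[j] <- if (runif(1) < p) 1 else -1
  }
  mean(s)  # average attitude across elements: the agent's "opinion"
}

# High attention: the network tends toward polarized states (opinion near -1 or +1)
op_high <- replicate(20, glauber(W, attention = 2))
# Low attention: behaviour is closer to continuous/neutral (opinion near 0)
op_low  <- replicate(20, glauber(W, attention = 0.05))
```

With high attention the average attitude tends to end up near the extremes, while with low attention it hovers around zero, mirroring the discrete-versus-continuous distinction described above.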

Han van der Maas, Jonas Dalege, and Lourens Waldorp

Psychological Methods, UvA

Job opportunities in Amsterdam!

This post was written by Sacha Epskamp, and cross-posted to the psychonetrics blog.

The old city center of Amsterdam – a location right next to the Institute for Advanced Studies which will partly host the PhD positions.

There are several exciting job vacancies in Amsterdam on topics very closely related to our research: one assistant professor position at the Psychological Methods Group, and several PhD positions hosted through the newly founded Center for Urban Mental Health. Note that PhD positions in the Netherlands are full-time 4-year paid jobs including the same benefits as any other job (e.g., pension buildup). Starting a PhD position in the Netherlands requires a Master’s degree. In this blog post, I will highlight some of the most relevant positions. Please consider applying if you are eligible, or forwarding these vacancies to eligible candidates you may know!

Assistant Professor of Psychological Methods

The first position I would like to point out is an assistant professor position at the Psychological Methods Group. The Psychological Methods group of the University of Amsterdam is one of the largest and most successful research and education centers in the field of psychological methods. In the past decade, several strong lines of research originated from this group, such as network modeling of psychological phenomena in the Psychosystems and the Psychonetrics lab groups, Bayesian statistical tools implemented in the open-source statistics program JASP, and adaptive testing implemented in the spin-off company Oefenweb. We have recently launched the increasingly popular Behavioral Data Science master program, which will be the focus of this assistant professorship as well.

Please note that this position does not come with a tenure track (these do not exist at the Psychological Methods Group), but does come with the prospect of a permanent position!

PhD positions at the Center for Urban Mental Health

The University of Amsterdam has recently approved the foundation of the first-ever Center for Urban Mental Health, which will take an interdisciplinary approach to tackling common mental health problems from a complexity point of view. The center will at first be housed at the renowned Institute for Advanced Studies, right in the city center of Amsterdam. The Center for Urban Mental Health will be launched with several interdisciplinary PhD projects, all aiming to start in early 2020. Below, I list the most relevant positions that are currently open for applications!

Computational modeling of psychological and social dynamics in urban mental health conditions: the case of addictive substance use.

The first PhD vacancy is a project between the Psychological Methods Group in the Department of Psychology, the Department of Computational Science in the Informatics Institute, and the Institute for Advanced Studies. This project will be supervised by me (Sacha Epskamp) and Michael Lees, and will be hosted at the Psychological Methods Group and the Institute for Advanced Studies. The aim of this project is to form computational models (e.g., differential equations, agent-based models, Ising models) that combine psychological dynamics with social dynamics. This means that we wish to form models that can simulate the intraindividual dynamics of multiple people who also interact with each other in complex ways. We are specifically looking for a candidate with a background in using such models as well as a strong affinity with psychological research.

Please note that the closing date is already this Sunday (November 24). However, if you are interested in doing a PhD on the topic of computational modeling of psychological dynamics, feel free to contact me (Sacha Epskamp) as more positions related to this topic may become available.

Network Analysis and Urban Mental Health Interventions

The second PhD vacancy is a joint project between the Department of Psychology and the Department of Communication Science. The project will be supervised by me (Sacha Epskamp), alongside Reinout Wiers, Julia van Weert, and Barbara Schouten. It will start out as a methodological project, investigating ways of estimating network models from self-report time-series data (e.g., experience sampling method; ESM), after which it will move towards a more applied clinical focus in which the candidate is expected to gather and analyze ESM data, followed by collaboration with clinicians to derive and evaluate intervention plans. We are specifically looking for a candidate who has both a strong focus on clinical research and good data-analytic skills, with an affinity with or background in methodology.

Network theory of addiction and depression

The third PhD vacancy is a joint project between the Psychological Methods Group in the Department of Psychology, the Department of Psychiatry at the Amsterdam UMC (location AMC), and the Institute for Advanced Studies. This project will be supervised by Maarten Marsman, Judy Luigjes, and Ruth van Holst. It will build upon the view that depression and addiction form a complex system of mutually interacting problems, and aims to formalize a network model of the two disorders in an urban context. The project will involve analyzing existing cross-sectional, longitudinal and clinical data from an urban population. Where necessary, new techniques and models will be developed for the analysis of these data. The candidate should have a strong affinity with (clinical) psychological research and formal statistical modelling (e.g., Bayesian hierarchical modelling or mathematical statistics), and proficiency in programming, at least in R.

Even more positions!

If this was not enough, more positions will open up shortly at the Center for Urban Mental Health! Currently there is a listing on the complexity of fear, depression and addiction among adolescents (in Dutch), and a listing on aging and mental health, and a few more positions are expected to open up. If you are interested in these, I encourage you to check the Urban Mental Health website and/or follow the Center for Urban Mental Health on Twitter. In addition, even more job opportunities may open up in the coming months, depending on various sources of funding. While I cannot go into detail on those, I highly recommend joining our very active Facebook group on Psychological Dynamics, in which job opportunities on topics related to our research are routinely posted.

Network school & SEM materials + extended registration CCS session

This blog post was written by Sacha Epskamp.

I would like to share the following three updates:

1. In January we hosted the annual Psychological Networks Amsterdam Winter School. On popular request, we have now made many of the materials publicly available on the Open Science Framework: https://osf.io/he3wj.

2. I teach two courses on Structural Equation Modeling at the University of Amsterdam, which are closely related and often touch on network modeling as well. Many materials (including video summaries) of the latest course are now available at http://sachaepskamp.com/SEM2019.

3. We have extended the deadline for our event in Singapore on complexities of adverse behavior and mental health (http://complexsystems.nl/ccs2019/) to June 16! Please share this with anyone you think might be interested!

Complexities of Adverse Behavior – Registration open!

This blog post was written by Sacha Epskamp.

I am happy to announce the full-day “Complexities of Adverse Behavior” event in Singapore on October 2! We have now opened registration for contributed talks, with a deadline of June 1. For more information and registration, please see our website. This will be part of the wider Conference on Complex Systems, the largest annual international conference on complexity science, bringing together many researchers from different fields of study. I attend this conference every year, as I think the larger field of human behavior does not stop at the borders of psychological research (plus, it is a really nice conference!)

Network Intervention Analysis

According to the network theory of mental disorders, mental disorders are developed and sustained through direct interactions between symptoms [1]. From this conceptualization it follows that treatment of mental disorders should involve changing the network of interrelated symptoms. While over the past years many studies have investigated the network structure of psychopathologies [2], the effect of psychological treatment on the network of interrelated symptoms has rarely been assessed. Moreover, network analysis techniques provide a unique opportunity to investigate treatment effects at a more detailed symptom level. In this new study we aspired to adopt and extend the network approach to investigate specific and sequential treatment effects—a technique we labelled Network Intervention Analysis (NIA) [3]. NIA involves estimating a network of interrelated symptoms and including an additional treatment indicator variable that encodes the treatment condition that a participant belongs to. Including such a variable in the network allows us to see which symptoms are conditionally dependent on this treatment variable. Since the treatment can influence the symptoms but not vice versa, this dependency indicates the symptoms that are directly affected by treatment.
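The core idea can be sketched in base R on simulated data (a toy illustration with made-up variables, not the estimation method used in the paper): include a binary treatment indicator as an extra node and inspect which symptom variables remain conditionally dependent on it, here via partial correlations obtained from the inverse covariance matrix:

```r
set.seed(42)
n <- 2000
treatment <- rbinom(n, 1, 0.5)               # 0 = control, 1 = treated
sleep <- rnorm(n, mean = -0.8 * treatment)   # directly targeted by treatment
mood  <- rnorm(n, mean =  0.6 * sleep)       # affected only via sleep
dat   <- cbind(treatment, sleep, mood)

# Partial correlations from the standardized inverse covariance matrix:
K  <- solve(cov(dat))
pc <- -cov2cor(K)

# The treatment-sleep edge is substantial, while the treatment-mood edge is
# near zero: the pattern NIA uses to conclude that treatment affects mood
# only indirectly, via sleep.
pc["treatment", "sleep"]; pc["treatment", "mood"]
```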

In our illustration of NIA we investigated the effect of cognitive behavioral therapy for insomnia (CBTI) on co-occurring insomnia and depression symptoms. More traditional analyses had shown that, after completion, the treatment had relieved both insomnia and depression symptoms [4]. It remained unclear, however, whether the effect of treatment on the depression symptoms occurred via improving the sleep problems, or whether CBTI influenced the depression symptoms directly. Using NIA across 10 measurement weeks (2 prior to treatment, 5 during treatment, and 3 after treatment), we could identify that CBTI predominantly affected the sleep problems, indicating that the improvements in depression likely occurred via CBTI-induced improvements in sleep.


Figure. Network structure before, during, and after treatment. The networks include the Insomnia Severity Index and Patient Health Questionnaire items (circles) and treatment (square). The size of the node is proportional to the difference in symptom severity between the treatment and control group, where smaller node sizes represent greater differences in favor of the treatment group. All ten networks corresponding to each of the measurement weeks are shown in the paper. An animated version can be found online.

This paper is a first illustration of how NIA can be used to investigate sequential and symptom-specific treatment induced changes over time. We hope to further develop NIA into a more sophisticated technique to investigate treatment effects over time—to ultimately better understand treatment mechanisms and reveal clues to their optimization.

[1] Borsboom, D. (2017). A network theory of mental disorders. World Psychiatry, 16, 5-13.

[2] Fried, E.I., Van Borkulo, C.D., Cramer, A.O.J., Boschloo, L., Schoevers, R.A., Borsboom, D. (2017). Mental disorders as networks of problems: A review of recent insights. Social Psychiatry and Psychiatric Epidemiology, doi.org/10.1007/s00127-016-1319-z.

[3] Blanken, T.F.*, Van der Zweerde, T.*, Van Straten, A., Van Someren, E.J.W., Borsboom, D., Lancee, J. (2019). Introducing Network Intervention Analysis to investigate sequential, symptom-specific treatment effects: A demonstration in co-occurring insomnia and depression. Psychotherapy and Psychosomatics, doi.org/10.1159/000495045.

[4] Van der Zweerde T, Van Straten A, Effting M, Kyle SD, Lancee J. (2018). Does online insomnia treatment reduce depressive symptoms? A randomized controlled trial in individuals with both insomnia and depressive symptoms. Psychological Medicine, 49, 501-599.

*shared first authors

The Psychosystems group evolves

Busy times at the Psychosystems group! In 2018, Max Hinne joined us as a postdoc, working on a position shared between Psychosystems and Eric-Jan Wagenmakers’ Bayesian research group (the people behind the fantastic JASP program, which now incorporates a network analysis module designed by Don van den Berg). Max studies ways to integrate information on network structure and network dynamics by utilizing Bayesian approaches. Also in 2018, NWO-Veni laureate Maarten Marsman, who works on network models in the context of educational measurement, was awarded a research fellowship at the Institute for Advanced Studies in Amsterdam.

In addition, Ria Hoekstra and Julian Burger successfully applied for a Research Talent grant of the Netherlands Organization for Scientific Research and are starting their Ph.D. projects this month. Ria will study methods to address heterogeneity in network structures, with Denny Borsboom acting as promotor. Julian, whose primary location is at Groningen University with Robert Schoevers acting as promotor, will develop ways to translate dynamical systems theory and network modeling into tools that are useful in clinical practice. Sacha Epskamp, who (directly after winning the 2018 Psychometric Society Dissertation Award for his thesis on network psychometrics) secured an NWO-Veni project on network modeling which also happens to start this month, will be involved in the supervision of the new Ph.D. projects, acting as co-promotor and daily supervisor, while Maarten Marsman will also be involved in Julian’s project.

Incidentally, Sacha was not the only person receiving praise for his Ph.D. thesis this year, as Claudia van Borkulo finished second in the Van Swinderen Prize 2018 for her dissertation on symptom network models in depression research. Finally, Jonas Dalege is transitioning into a postdoc position this month; he will work on Denny Borsboom’s ERC consolidator grant in a joint position with the Social Psychology department at the University of Amsterdam.

A preview of new features in bootnet 1.1

This blog post was written by Sacha Epskamp.

In the past year, we have published two tutorial papers on the bootnet package for R, which aims to provide an encompassing framework for estimating network structures and checking their accuracy and stability. The first paper, published in Behavior Research Methods, focuses on how to use the bootstrapping methods to assess the accuracy and stability of network structures. The second, published in Psychological Methods, focuses on the most commonly used network model, the Gaussian graphical model (GGM; a network of partial correlations), and discusses further utility of the bootnet package.

With more than a year of development, version 1.1 of the bootnet package marks the most ambitious series of updates to the package to date. The version is now ready for testing on Github, and will soon be released on CRAN. This version includes or fleshes out a total of 7 new default sets, including one default set aimed at time-series analysis, and offers new functionality to the default sets already included in the package. In addition, the package now fully supports directed networks as well as methods resulting in multiple network models, and supports the expected influence centrality measure. Furthermore, the plotting functions in bootnet have been updated and now show more information. Finally, the package includes a new simulation function, replicationSimulator, and greatly improves the functionality of the netSimulator function. With these extensions, bootnet now aims to provide an exhaustive simulation suite for network models in many conditions. This blog post is intended to introduce this new functionality and to encourage testing of the functionality before the package is submitted to CRAN.

The package can be installed using:

library("devtools")
install_github("sachaepskamp/bootnet")

which requires developer tools to be installed: Rtools for Windows and Xcode for Mac. Subsequently, the package can be loaded as usual:

library("bootnet")

Updates to network estimation

Bootnet now contains the following default sets:

| Default | Model | Description | Packages | Recently added |
|---|---|---|---|---|
| "EBICglasso" | GGM | Regularized GGM estimation using glasso and EBIC model selection | qgraph, glasso | |
| "pcor" | GGM | Unregularized GGM estimation, possibly thresholded using significance testing or FDR | qgraph | |
| "IsingFit" | Ising | Regularized Ising models using nodewise logistic regressions and EBIC model selection | IsingFit, glmnet | |
| "IsingSampler" | Ising | Unregularized Ising model estimation | IsingSampler | |
| "huge" | GGM | Regularized GGM estimation using glasso and EBIC model selection | huge | |
| "adalasso" | GGM | Regularized GGM estimation using adaptive LASSO and cross-validation | parcor | |
| "mgm" | Mixed graphical model | Regularized mixed graphical model estimation using EBIC or cross-validation | mgm | Revamped |
| "relimp" | Relative importance networks | Relative importance networks, possibly using another default for network structure | relaimpo | Yes |
| "cor" | Correlation networks | Correlation networks, possibly thresholded using significance testing or FDR | qgraph | Yes |
| "TMFG" | Correlation networks & GGM | Triangulated Maximally Filtered Graph | NetworkToolbox | Yes |
| "LoGo" | GGM | Local/Global Sparse Inverse Covariance Matrix | NetworkToolbox | Yes |
| "ggmModSelect" | GGM | Unregularized (E)BIC model selection using glasso and stepwise estimation | qgraph, glasso | Yes |
| "graphicalVAR" | Graphical VAR | Regularized estimation of temporal and contemporaneous networks | graphicalVAR | Yes |

As can be seen, several of these are newly added. For example, we can now estimate networks with the two more conservative GGM estimation methods recently added to qgraph, which I described in an earlier blog post. Taking my favorite dataset as example:

library("psych")
data(bfi)

The optimal unregularized network that minimizes BIC (note that tuning defaults to 0 in this case) can be obtained as follows:

net_modSelect <- estimateNetwork(bfi[,1:25], 
              default = "ggmModSelect",
              stepwise = FALSE,
              corMethod = "cor")

It is generally recommended to use the stepwise improvement of edges with stepwise = TRUE, but I set it to FALSE here to make this tutorial easier to follow. Likewise, the thresholded regularized GGM can be obtained using the new threshold argument for the EBICglasso default set:

net_thresh <- estimateNetwork(bfi[,1:25],
              tuning = 0, # EBICglasso sets tuning to 0.5 by default
              default = "EBICglasso",
              threshold = TRUE,
              corMethod = "cor")

As typical, we can plot the network:

Layout <- qgraph::averageLayout(net_modSelect, net_thresh)
layout(t(1:2))
plot(net_modSelect, layout = Layout, title = "ggmModSelect")
plot(net_thresh, layout = Layout, title = "Thresholded EBICglasso")


Principal direction and expected influence

One robust finding in psychological literature is that all variables tend to correlate positively after arbitrary rescaling of variables – the positive manifold. Likewise, it is common to expect parameters focusing on conditional associations (e.g., partial correlations) to also be positive after rescaling variables. Negative edges in such a network could indicate (a) violations of the latent variable model and (b) the presence of common cause effects in the data. To this end, bootnet now includes the ‘principalDirection’ argument in many default sets, which takes the first eigenvector of the correlation matrix and multiplies all variables by their corresponding sign. For example:

net_modSelect_rescale <- estimateNetwork(bfi[,1:25], 
              default = "ggmModSelect",
              stepwise = FALSE,
              principalDirection = TRUE)

net_thresh_rescale <- estimateNetwork(bfi[,1:25],
              tuning = 0, 
              default = "EBICglasso",
              threshold = TRUE,
              principalDirection = TRUE)

layout(t(1:2))
plot(net_modSelect_rescale, layout = Layout, 
     title = "ggmModSelect")
plot(net_thresh_rescale, layout = Layout, 
     title = "Thresholded EBICglasso")


This makes edges of an unexpected sign much more pronounced, at the cost of interpretability (as the rescaling of some variables now has to be taken into account).
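The rescaling that principalDirection performs can be mimicked in a few lines of base R (a sketch of the idea on simulated data, not bootnet's exact implementation): take the sign of each variable's loading on the first eigenvector of the correlation matrix, and flip the variables with a negative sign.

```r
set.seed(7)
n <- 500
f <- rnorm(n)  # a common factor generating the positive manifold
# 'c' is reverse-coded relative to 'a' and 'b':
X <- cbind(a = f + rnorm(n), b = f + rnorm(n), c = -f + rnorm(n))

# Sign of each variable on the first eigenvector of the correlation matrix:
signs <- sign(eigen(cor(X))$vectors[, 1])

# Multiply each column by its sign, flipping reverse-coded variables:
X_rescaled <- sweep(X, 2, signs, `*`)
all(cor(X_rescaled) > 0)  # positive manifold restored
```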

One potentially useful centrality index for graphs that mostly show positive relationships (with all variables recoded in the same direction) is the newly proposed expected influence measure, which computes node strength without taking the absolute value of edge weights. This centrality measure can now be obtained:

qgraph::centralityPlot(
  list(
    ggmModSelect = net_modSelect_rescale,
    EBICGlasso_thresh = net_thresh_rescale
  ), include = "ExpectedInfluence"
)

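The difference between strength and expected influence is easy to see on a toy weight matrix (a base-R sketch with made-up weights; in practice qgraph computes these for you):

```r
# A small symmetric weight matrix with one negative edge:
W <- matrix(c( 0.0, 0.3, -0.2,
               0.3, 0.0,  0.4,
              -0.2, 0.4,  0.0), 3, 3, byrow = TRUE)

strength           <- rowSums(abs(W))  # 0.5, 0.7, 0.6
expected_influence <- rowSums(W)       # 0.1, 0.7, 0.2
```

For node 1, the negative edge cancels part of the positive edge in expected influence (0.1) but adds to it in strength (0.5); for node 2, which has only positive edges, the two measures coincide.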

To bootstrap expected influence, it has to be requested from bootnet():

boots <- bootnet(net_thresh_rescale, statistics = "ExpectedInfluence", 
                 nBoots = 1000, nCores = 8, type = "case")

library("ggplot2")
plot(boots, statistics = "ExpectedInfluence") + 
  theme(legend.position = "none")


In addition to expected influence, thanks to the contribution of Alex Christensen, randomized shortest paths betweenness centrality (RSPBC) and hybrid centrality are now also supported, which can be called using statistics = c("rspbc", "hybrid").

Relative importance networks

Relative importance networks can be seen as a re-parameterization of the GGM in which directed edges are used to quantify the (relative) contribution in predictability of each variable on other variables. This can be useful mainly for two reasons: (1) the relative importance measures are very stable, and (2) centrality indices based on relative importance networks have more natural interpretations. For example, in-strength of non-normalized relative importance networks equals \(R^2\). Bootnet has been updated to support directed networks, which allows for support for estimating relative importance networks. The estimation may be slow with over 20 variables though. Using only the first 10 variables of the BFI dataset we can compute the relative importance network as follows:

net_relimp <- estimateNetwork(bfi[,1:10],
              default = "relimp",
              normalize = FALSE)

As the relative importance network can be seen as a re-parameterization of the GGM, it makes sense to first estimate a GGM structure and subsequently impose that structure on the relative importance network estimation. This can be done using the structureDefault argument:

net_relimp2 <- estimateNetwork(bfi[,1:10],
              default = "relimp",
              normalize = FALSE,
              structureDefault = "ggmModSelect",
              stepwise = FALSE # Sent to structureDefault function
    )
Layout <- qgraph::averageLayout(net_relimp, net_relimp2)
layout(t(1:2))
plot(net_relimp, layout = Layout, title = "Saturated")
plot(net_relimp2, layout = Layout, title = "Non-saturated")


The difference is hardly visible because edges that are removed in the GGM structure estimation are not likely to contribute a lot of predictive power.
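The statement above that in-strength of a non-normalized relative importance network equals \(R^2\) can be checked in base R for the special case of (nearly) uncorrelated predictors, where each predictor's relative contribution reduces to its squared correlation with the outcome (a toy check on simulated data, not the lmg decomposition computed by relaimpo):

```r
set.seed(123)
n  <- 10000
x1 <- rnorm(n); x2 <- rnorm(n)           # independent predictors
y  <- 0.5 * x1 + 0.3 * x2 + rnorm(n)

# Sum of incoming edge weights (each predictor's contribution to y):
in_strength <- cor(y, x1)^2 + cor(y, x2)^2

# R^2 of the full regression:
R2 <- summary(lm(y ~ x1 + x2))$r.squared

round(c(in_strength = in_strength, R2 = R2), 3)
```

With correlated predictors the decomposition is less trivial, which is exactly what the averaging-over-orderings approach in relaimpo handles.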

Time-series analysis

The LASSO regularized estimation of graphical vector auto-regression (VAR) models, as implemented in the graphicalVAR package, is now supported in bootnet! For example, we can use the data and codes supplied in the supplementary materials of our recent publication in Clinical Psychological Science to obtain the detrended data object Data. Now we can run the graphical VAR model (I use nLambda = 8 here to speed up computation, but higher values are recommended and are used in the paper):

# Variables to include:
Vars <- c("relaxed","sad","nervous","concentration","tired","rumination",
          "bodily.discomfort")

# Estimate model:
gvar <- estimateNetwork(
  Data, default = "graphicalVAR", vars = Vars,
  tuning = 0, dayvar = "date", nLambda = 8
)

We can now plot both networks:

Layout <- qgraph::averageLayout(gvar$graph$temporal, 
                                gvar$graph$contemporaneous)
layout(t(1:2))
plot(gvar, graph = "temporal", layout = Layout, 
     title = "Temporal")
plot(gvar, graph = "contemporaneous", layout = Layout, 
     title = "Contemporaneous")


The bootstrap can be performed as usual:

gvar_boot <- bootnet(gvar, nBoots = 100, nCores = 8)

To plot the results, we need to make use of the graph argument:

plot(gvar_boot, graph = "contemporaneous", plot = "interval")


Updates to bootstrapping methods

Splitting edge accuracy and model inclusion

Some of the new default sets (ggmModSelect, TMFG and LoGo) do not rely on regularization techniques to pull estimates to zero. Rather, they first select a set of edges to include, then estimate a parameter value only for the included edges. The default plotting method of edge accuracy, which has been updated to include the means of the bootstraps, will then not accurately reflect the range of parameter values. Making use of arguments plot = "interval" and split0 = TRUE, we can plot quantile intervals only for the times the parameter was not set to zero, in addition to a box indicating how often the parameter was set to zero:

# Agreeableness items only to speed things up:
net_modSelect_A <- estimateNetwork(bfi[,1:5], 
              default = "ggmModSelect",
              stepwise = TRUE,
              corMethod = "cor")

# Bootstrap:
boot_modSelect <- bootnet(net_modSelect_A, nBoots = 100, nCores = 8)

# Plot results:
plot(boot_modSelect, plot = "interval", split0 = TRUE)


This shows that the edge A1 (Am indifferent to the feelings of others) – A5 (Make people feel at ease) was always removed from the network, contradicting the factor model. The transparency of the intervals also shows how often an edge was included. For example, the edge A1 – A4 was almost never included, but when it was included, it was estimated to be negative.

Accuracy of directed networks

Accuracy plots of directed networks are now supported:

net_relimp_A <- estimateNetwork(bfi[,1:5],
              default = "relimp",
              normalize = FALSE)
boot_relimp_A <- bootnet(net_relimp_A, nBoots = 100, nCores = 8)
plot(boot_relimp_A, order = "sample")


Simulation suite

The netSimulator function and accompanying plot method have been greatly expanded. The most basic use of the function is to simulate the performance of an estimation method given some network structure:

Sim1 <- netSimulator(
  input = net_modSelect, 
  dataGenerator = ggmGenerator(),
  nCases = c(100,250,500,1000),
  nCores = 8,
  nReps = 100,
  default = "ggmModSelect",
  stepwise = FALSE)

plot(Sim1)


Instead of keeping the network structure fixed, we could also use a function as input argument to generate a different structure every time. The updated genGGM function allows for many such structures (thanks to Mark Brandt!). In addition, we can supply multiple arguments to any estimation argument used to test multiple conditions. For example, perhaps we are interested in investigating whether the stepwise model improvement is really needed in 10-node random networks in which each edge has a \(25\%\) inclusion probability (i.e., \(75\%\) sparsity):

Sim2 <- netSimulator(
  input = function() bootnet::genGGM(10, p = 0.25, graph = "random"),  
  dataGenerator = ggmGenerator(),
  nCases = c(100,250,500,1000),
  nCores = 8,
  nReps = 100,
  default = "ggmModSelect",
  stepwise = c(FALSE, TRUE))

plot(Sim2, color = "stepwise")


which shows slightly better specificity at higher sample sizes. We might wish to repeat this simulation study for 50% sparse networks:

Sim3 <- netSimulator(
  input = function()bootnet::genGGM(10, p = 0.5, graph = "random"),  
  dataGenerator = ggmGenerator(),
  nCases = c(100,250,500,1000),
  nCores = 8,
  nReps = 100,
  default = "ggmModSelect",
  stepwise = c(FALSE, TRUE))

plot(Sim3, color = "stepwise")


The results object is simply a data frame, meaning we can combine our results easily and use the plot method very flexibly:

Sim2$sparsity <- "sparsity: 0.75"
Sim3$sparsity <- "sparsity: 0.50"
Sim23 <- rbind(Sim2,Sim3)
plot(Sim23, color = "stepwise", yfacet = "sparsity")


Investigating the recovery of centrality (correlation with true centrality) is also possible:

plot(Sim23, color = "stepwise", yfacet = "sparsity",
    yvar = c("strength", "closeness", "betweenness", "ExpectedInfluence"))


Likewise, we can also change the x-axis variable:

Sim23$sparsity2 <- gsub("sparsity: ", "", Sim23$sparsity)
plot(Sim23, color = "stepwise", yfacet = "nCases",
    xvar = "sparsity2", xlab = "Sparsity")


This allows for setting up powerful simulation studies with minimal effort. For example, inspired by the work of Williams and Rast, we could compare such an unregularized method with a regularized method and significance thresholding:

Sim2b <- netSimulator(
  input = function()bootnet::genGGM(10, p = 0.25, graph = "random"),  
  dataGenerator = ggmGenerator(),
  nCases = c(100,250,500,1000),
  nCores = 8,
  nReps = 100,
  default = "EBICglasso",
  threshold = c(FALSE, TRUE))

Sim2c <- netSimulator(
  input = function()bootnet::genGGM(10, p = 0.25, graph = "random"),  
  dataGenerator = ggmGenerator(),
  nCases = c(100,250,500,1000),
  nCores = 8,
  nReps = 100,
  default = "pcor",
  threshold = "sig",
  alpha = c(0.01, 0.05))

# add a variable:
Sim2$estimator <- paste0("ggmModSelect", 
                         ifelse(Sim2$stepwise,"+step","-step"))
Sim2b$estimator <- paste0("EBICglasso",
                    ifelse(Sim2b$threshold,"+threshold","-threshold"))
Sim2b$threshold <- NULL
Sim2c$estimator <- paste0("pcor (a = ",Sim2c$alpha,")")

# Combine:
Sim2full <- dplyr::bind_rows(Sim2, Sim2b, Sim2c)

# Plot:
plot(Sim2full, color = "estimator")


Replication simulator

In response to a number of studies aiming to investigate the replicability of network models, I have implemented a function derived from netSimulator that instead simulates two datasets from a network model, treating the second as a replication dataset. This allows researchers to investigate what to expect when aiming to replicate effects. The input and usage are virtually identical to those of netSimulator:

SimRep <- replicationSimulator(
  input = function()bootnet::genGGM(10, p = 0.25, graph = "random"),  
  dataGenerator = ggmGenerator(),
  nCases = c(100,250,500,1000),
  nCores = 8,
  nReps = 100,
  default = "ggmModSelect",
  stepwise = c(FALSE, TRUE))

plot(SimRep, color = "stepwise")


Future developments

As always, I highly welcome bug reports and code suggestions on Github. I would also welcome any volunteers willing to help work on this project. Such work could range from adding new default sets to overhauling the help pages. Please contact me if you are interested!

New features in qgraph 1.5

Written by Sacha Epskamp.

While the developmental version is routinely updated, I update the stable qgraph releases on CRAN less often. The last major update (version 1.4) was 1.5 years ago. After several minor updates since, I have now completed work on a new larger version of the package, version 1.5, which is now available on CRAN. The full list of changes can be read in the NEWS file. In this blog post, I will describe some of the new functionality.

New conservative GGM estimation algorithms

Recently, there has been some debate on the specificity of EBICglasso in exploratory estimation of Gaussian graphical models (GGM). While EBIC selection of regularized glasso networks works well in retrieving network structures at low sample sizes, Donald Williams and Philippe Rast recently showed that specificity can be lower than expected in dense networks with many small edges, leading to an increase in false positives. These false edges are nearly invisible under default qgraph fading options (also due to regularization), and should not influence typical interpretations of these models. However, some lines of research focus on discovering the smallest edges (e.g., bridge symptoms or environmental edges), and there have been increasing concerns regarding the replicability of such small edges. To this end, qgraph 1.5 now includes a warning when a dense network is selected, and includes two new, more conservative estimation algorithms: thresholded EBICglasso estimation and unregularized model selection.

Thresholded EBICglasso

Based on recent work by Jankova and Van de Geer (2018), a low false positive rate is guaranteed for off-diagonal (\(i \not= j\)) precision matrix elements (proportional to partial correlation coefficients) \(\kappa_{ij}\) for which:

\[
|\kappa_{ij}| > \frac{\log\left( p (p-1) / 2 \right)}{\sqrt{n}}.
\]
The option threshold = TRUE in EBICglasso and qgraph(..., graph = "glasso") now employs this thresholding rule, setting edge weights to zero that are not larger than the threshold, both in the returned final model and in the EBIC computation of all considered models. Preliminary simulations indicate that with this thresholding rule, high specificity is guaranteed in many cases (an exception is the case in which the true model is not in the glasso path, at very high sample sizes such as \(N > 10{,}000\)). A benefit of this approach over the unregularized option described below is that edge parameters are still regularized, preventing large visual overrepresentations due to sampling error.
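To get a feel for the magnitude of this threshold, the rule can be computed directly. A minimal base-R sketch with illustrative values (the numbers below are mine, not output from qgraph itself):

```r
# Thresholding rule of Jankova and Van de Geer (2018):
# edges with |kappa_ij| <= log(p * (p - 1) / 2) / sqrt(n) are set to zero.
# Illustrative values: p = 25 nodes, n = 2800 cases (roughly the bfi example).
p <- 25
n <- 2800
threshold <- log(p * (p - 1) / 2) / sqrt(n)
round(threshold, 3)  # 0.108
```

So in a setting of this size, (partial correlation scale) edge weights below roughly 0.11 in absolute value would be removed.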

The following code showcases non-thresholded versus thresholded EBICglasso estimation:

library("qgraph")
library("psych")
data(bfi)
bfiSub <- bfi[,1:25]
layout(t(1:2))
g1 <- qgraph(cor_auto(bfiSub), graph = "glasso", sampleSize = nrow(bfi),
             layout = "spring", theme = "colorblind", title = "EBICglasso", 
             cut = 0)
g2 <- qgraph(cor_auto(bfiSub), graph = "glasso", sampleSize = nrow(bfi), 
       threshold = TRUE, layout = g1$layout, theme = "colorblind", 
       title = "Thresholded EBICglasso", cut = 0)


While the thresholded graph is much sparser, that does not mean all removed edges are false positives; many likely reflect true edges.

Unregularized Model Search

While the LASSO has mostly been studied in high-dimensional, low-sample cases, in many situations research focuses on relatively low-dimensional (e.g., 20 nodes) settings with high sample size (e.g., \(N > 1{,}000\)). It is debatable whether regularization techniques are really needed in such settings. In the particular case of GGMs, one could also use model selection on unregularized models in which some pre-defined edge weights are set to zero. It has been shown that extended Bayesian information criterion (EBIC) selection of such unregularized models selects the true model as \(N\) grows to \(\infty\) (Foygel and Drton, 2010). The new function ggmModSelect now supports model search of unregularized GGM models, using the EBIC exactly as it is computed in the lavaan package. The hypertuning parameter \(\gamma\) is set by default to \(0\) (BIC selection) rather than \(0.5\), as preliminary simulations indicate that \(\gamma = 0\) shows much better sensitivity while retaining high specificity. By default, ggmModSelect will first run the glasso algorithm for \(100\) different tuning parameters to obtain \(100\) different network structures. Next, the algorithm refits all those networks without regularization and picks the best according to EBIC. Subsequently, the algorithm adds and removes edges until the EBIC can no longer be improved. The full algorithm is:

  1. Run glasso to obtain 100 models
  2. Refit all models without regularization
  3. Choose the best according to EBIC
  4. Test all possible models in which one edge is changed (added or removed)
  5. If no edge can be added or changed to improve EBIC, stop here
  6. Change the edge that best improved EBIC, then test all other edges that would also have led to an improvement in EBIC again
  7. If no edge can be added or removed to improve EBIC, go to 4; otherwise, go to 6.
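The core of steps 4 to 7 is a greedy best-improvement search over single-edge changes. The toy base-R sketch below illustrates that idea only; the function name and the criterion are hypothetical stand-ins (an arbitrary function to be minimized takes the place of the actual EBIC computation, which is not reproduced here):

```r
# Toy greedy search over edge-inclusion vectors (illustrative sketch):
# flip one edge at a time, keep the single change that most improves
# (lowers) the criterion, and stop when no single change helps.
greedy_edge_search <- function(edges, criterion) {
  best <- criterion(edges)
  repeat {
    candidates <- sapply(seq_along(edges), function(i) {
      flipped <- edges
      flipped[i] <- !flipped[i]  # add or remove edge i
      criterion(flipped)
    })
    if (min(candidates) >= best) break  # no single change improves: stop
    i <- which.min(candidates)          # take the best single change
    edges[i] <- !edges[i]
    best <- candidates[i]
  }
  list(edges = edges, criterion = best)
}

# Usage with a toy criterion (distance to a known target structure):
target <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
res <- greedy_edge_search(rep(FALSE, 5), function(e) sum(e != target))
res$edges      # recovers the target structure
res$criterion  # 0
```

The actual ggmModSelect implementation additionally keeps a queue of previously improving edges (steps 6 and 7) so that not every edge has to be re-tested at every step.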

When stepwise = FALSE, steps 4 to 7 are skipped, and when considerPerStep = "all", all edges are considered at every step. The following code showcases the algorithm:

modSelect_0 <- ggmModSelect(cor_auto(bfiSub), nrow(bfi), gamma = 0, nCores = 8)
modSelect_0.5 <- ggmModSelect(cor_auto(bfiSub), nrow(bfi), gamma = 0.5, nCores = 8)
layout(t(1:2))
g3 <- qgraph(modSelect_0$graph, layout = g1$layout, theme = "colorblind", 
       title = "ggmModSelect (gamma = 0)", cut = 0)
g4 <- qgraph(modSelect_0.5$graph, layout = g1$layout, theme = "colorblind", 
       title = "ggmModSelect (gamma = 0.5)", cut = 0)


Note that this algorithm is very slow in higher dimensions (e.g., above 30-40 nodes), in which case only the regular EBICglasso, thresholded EBICglasso, or setting stepwise = FALSE are feasible. Of note, centrality analyses, especially of the more stable strength metric, are hardly impacted by the estimation method:

centralityPlot(
  list(EBICglasso=g1,
       EBICglasso_threshold=g2,
       ggmModSelect_0 = g3,
       ggmModSelect_0.5 = g4
       ))


Both thresholded EBICglasso and ggmModSelect are implemented in the development version of bootnet, which will be updated soon to CRAN as well. Preliminary simulations show that both guarantee high specificity, while losing sensitivity. Using ggmModSelect with \(\gamma = 0\) (BIC selection) shows better sensitivity and works well in detecting small edges, but is slow when coupled with stepwise model search, which may make bootstrapping hard. I encourage researchers to investigate these and competing methods in large-scale simulation studies.

Which estimation method to use?

Both new methods are much more conservative than the EBICglasso, leading to drops in sensitivity and possible misrepresentations of the true sparsity of the network structure. For exploratory hypothesis-generation purposes at relatively low sample sizes, the original EBICglasso is likely to be preferred. At higher sample sizes, and with a focus on identifying small edges, the conservative methods may be preferred instead. There are many more GGM estimation procedures available in other R packages, and detailed simulation studies investigating which estimator works best in which case are now being performed in multiple labs. I have also implemented simulation functions in the developmental version of bootnet to aid in studying these methods, which I will describe in an upcoming blog post.

Flow diagrams

Sometimes, researchers are interested in the connectivity of one node in particular, which can be hard to see in a Fruchterman-Reingold layout, especially when the connections to that node are weak. The new flow function, which I developed together with Adela Isvoranu, places nodes in such a way that the connections of one node are clearly visible. It places the node of interest on the left, followed by vertical levels of nodes connected to the node of interest by 1, 2, 3, etc. edges. Edges between nodes in the same level are displayed as curved edges. For example:

flow(g2, "N3", theme = "colorblind", vsize = 4)


Expected influence

The centrality index expected influence is now returned by centrality() and can be plotted using centralityPlot(), although it has to be requested via the include argument. In addition, the plots can now be ordered by one of the indices:

centralityPlot(g2, include = c("Strength","ExpectedInfluence"),
               orderBy = "ExpectedInfluence")


Note, however, that the BFI network is not an optimal network to compute expected influence on, as some variables are (arbitrarily) scored negatively. It is best to compute expected influence on a network in which higher scores on all nodes have the same interpretation (e.g., symptoms in which higher = more severe).
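The reason this matters is that expected influence, unlike strength, retains the sign of edge weights. A small base-R illustration on a made-up 3-node weights matrix (not the bfi network):

```r
# Hypothetical symmetric weights matrix with one negative edge (nodes 1-3):
W <- matrix(c( 0.0,  0.3, -0.2,
               0.3,  0.0,  0.4,
              -0.2,  0.4,  0.0), nrow = 3, byrow = TRUE)

strength <- rowSums(abs(W))            # sums absolute edge weights
expected_influence <- rowSums(W)       # keeps the sign of edge weights

strength            # 0.5 0.7 0.6
expected_influence  # 0.1 0.7 0.2
```

Nodes 1 and 3 are connected by a negative edge, so their expected influence is much lower than their strength; if the negative sign is merely an artifact of arbitrary item scoring, that drop is not meaningful.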

Future developments

As always, I highly welcome bug reports and code suggestions on Github. In addition, I will also update the bootnet package soon and write a separate blog post on its latest additions.

References

Jankova, J., & van de Geer, S. (2018). Inference for high-dimensional graphical models. In M. Drton, M. Maathuis, S. Lauritzen, & M. Wainwright (Eds.), Handbook of graphical models. Boca Raton, FL: CRC Press.

Foygel, R., & Drton, M. (2010). Extended Bayesian information criteria for Gaussian graphical models. In Advances in neural information processing systems (pp. 604-612).