On this page, we list currently open student projects (internship/thesis) for which we are looking for motivated students. If you find one of the projects interesting, please contact the researcher with a short description of your academic background, your research interests, and your motivation for the project. Please do not share this page, except with fellow students.
Applied & conceptual projects
These are projects in which you will be asked to apply novel statistical analyses and report on your findings, but not to develop new methodological tools. They are ideal for students with more conceptual research interests.
Reproducibility of network modeling results: comparing persisters and remitters
This project revolves around an ongoing treatment study (> 4000 participants) at McLean Hospital (Boston, USA) and will be carried out in close collaboration with Courtney Beard (assistant professor at Harvard Medical School and assistant director at McLean Hospital). Data were collected during routine clinical care at both admission to and discharge from treatment. The dataset contains measurements of depression (9 symptoms) and anxiety (7 symptoms). With these data, the reproducibility of one of the first high-impact empirical findings within the field of psychiatry and clinical psychology can be assessed: do those whose high symptom levels will persist have a more densely connected network of depression symptoms than those whose symptom levels will decrease? This project would be a good candidate for pre-registration, which would make it the first pre-registered network study.
Contact: Claudia van Borkulo (C.D.vanBorkulo@uva.nl).
Network literature study
With so many network papers being published, it is time to look at the bigger picture rather than at individual studies. A literature review can be performed in any of several fields in which network studies are being (and continue to be) published. This work can be extended by setting up the first network meta-analyses.
Contact: Sacha Epskamp (email@example.com), Julian Burger (depression, PTSD, bereavement; J.Burger@uva.nl), & Adela Isvoranu (schizophrenia; firstname.lastname@example.org).
Network analysis of clinical data
The Amsterdam Medical Centre is often interested in collaborating with us on different types of projects and in co-supervising students. They have a lot of data available to work on, so you can discuss the options with us if you are interested.
Contact: Adela Isvoranu (email@example.com).
Networks: How science interest depends on macro-cultural and economic factors
A recent network analysis of science interest as a multimodal construct revealed interesting differences between adolescents in The Netherlands and Colombia (Sachisthal et al., 2018). Science interest is a complex construct in which multiple factors (such as knowledge, self-efficacy, and personal value) play a role. These components interact with one another, which makes a network model an interesting perspective for analysis. The next step, and the aim of this project, is to compare a larger number of countries for which we have relevant macro-economic and cultural indices, and to relate structural characteristics of the national science interest networks to these macro-economic and cultural characteristics. Data are available via TIMSS or PISA.
Contact: Claudia van Borkulo (C.D.vanBorkulo@uva.nl), Maien Sachisthal (M.firstname.lastname@example.org) & Maartje Raijmakers (M.E.J.Raijmakers@uva.nl).
Since their introduction, network models have been used in hundreds of applications. The models are especially popular in clinical psychology and psychiatry, where they are used to chart connections between symptoms. As these applications proliferate, the literature threatens to become scattered. The goal of this project is to counter this process by creating a supernetwork: an annotated network of psychopathology symptoms (and possibly other factors) that represents each symptom as a node and connects two symptoms if they have been reported to be connected in the literature. Besides systematizing the literature, the supernetwork can also be used to assess the reproducibility and generalizability of network structures across different studies, populations, and measurement strategies.
- Setting up the supernetwork infrastructure and applying it to a subset of studies as a proof-of-principle
- Developing and programming an interface through which researchers can contribute network structures from their own research to the supernetwork
- Gathering and analyzing data in a given area of application (e.g. depression, anxiety, schizophrenia) using network models, assessing reproducibility of the solution, and contributing the study to the supernetwork
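As a rough illustration of what the supernetwork described above could look like as a data structure, the sketch below stores each symptom pair as an undirected edge annotated with the studies that reported it. The class name, method names, symptom labels, and study identifiers are all hypothetical placeholders, not part of any existing infrastructure.

```python
from collections import defaultdict


class Supernetwork:
    """Symptoms are nodes; each edge keeps a list of the reports behind it."""

    def __init__(self):
        # Maps an unordered symptom pair to the list of reporting studies.
        self.edges = defaultdict(list)

    def add_report(self, symptom_a, symptom_b, study, population):
        """Annotate the edge between two symptoms with one study report."""
        key = frozenset((symptom_a, symptom_b))
        self.edges[key].append({"study": study, "population": population})

    def replication_count(self, symptom_a, symptom_b):
        """Number of distinct studies that reported this edge."""
        reports = self.edges.get(frozenset((symptom_a, symptom_b)), [])
        return len({report["study"] for report in reports})


# Hypothetical reports; the study labels are placeholders, not real papers.
net = Supernetwork()
net.add_report("insomnia", "fatigue", study="study_A", population="clinical")
net.add_report("fatigue", "insomnia", study="study_B", population="general")
net.add_report("worry", "insomnia", study="study_A", population="clinical")

print(net.replication_count("insomnia", "fatigue"))
```

Keying edges by an unordered pair means a report is counted the same way regardless of the order in which the two symptoms are named, which is one simple way to make replication counts across studies comparable.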
Contact: Denny Borsboom (email@example.com).
Theoretical & mathematical projects
These projects take psychological theory as a starting point for building mathematical models of psychological processes. Such mathematical models are now often studied in fields as diverse as ecology, physics, biology and complexity science, but not yet often in psychology. These projects are ideal for students who are interested both in better understanding psychological processes and in working with cross-disciplinary methods and mathematical models. Programming and calculus skills are highly recommended.
An agent-based model for substance use in social settings
The onset and maintenance of substance use often lie in social situations. For example, people start drinking together with friends in high school, and may maintain a drinking habit by seeking out friends who also drink. The social dynamics underlying substance use, however, have not yet been investigated in much detail. By using complexity science, we hope to gain better insight into how the use and abuse of substances may spread on social networks and how, in turn, those social networks may adapt around the use of substances. This may offer unique insight into intervention selection and aid understanding of why many substances are so prevalent. This project may be hosted at the Institute for Advanced Studies.
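To make the two mechanisms above concrete (use spreading over social ties, and ties adapting around use), here is a minimal agent-based sketch. Every modeling choice in it is an assumption made for illustration: the update rules, the parameter names `influence` and `rewire_prob`, and their values are hypothetical and not taken from an existing model.

```python
import random


def simulate(n_agents=50, n_steps=2000, influence=0.3, rewire_prob=0.05, seed=1):
    """Toy model: use spreads via friends, and friendships drift toward similarity."""
    rng = random.Random(seed)
    # Start from a random undirected friendship network.
    friends = {i: set() for i in range(n_agents)}
    for i in range(n_agents):
        for j in rng.sample(range(n_agents), 4):
            if i != j:
                friends[i].add(j)
                friends[j].add(i)
    # A small fraction of agents uses the substance at the start (assumed 10%).
    uses = [rng.random() < 0.1 for _ in range(n_agents)]

    for _ in range(n_steps):
        i = rng.randrange(n_agents)
        # Social influence: a non-user starts with probability proportional
        # to the share of friends who use.
        if friends[i] and not uses[i]:
            share = sum(uses[j] for j in friends[i]) / len(friends[i])
            if rng.random() < influence * share:
                uses[i] = True
        # Network adaptation: occasionally swap a dissimilar friend for a similar one.
        if rng.random() < rewire_prob:
            different = [j for j in friends[i] if uses[j] != uses[i]]
            similar = [j for j in range(n_agents)
                       if j != i and uses[j] == uses[i] and j not in friends[i]]
            if different and similar:
                old, new = rng.choice(different), rng.choice(similar)
                friends[i].discard(old)
                friends[old].discard(i)
                friends[i].add(new)
                friends[new].add(i)
    return uses, friends


uses, friends = simulate()
print(sum(uses), "of", len(uses), "agents use the substance at the end")
```

This version has no quitting and no interventions; both would be natural extensions when studying intervention selection.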
Contact: Sacha Epskamp (firstname.lastname@example.org) & Rick Quax (email@example.com)
Theoretical mathematical modeling of psychological processes
In the last year, we have started translating simple psychological theories into mathematical models. With numerous conceptual theories in numerous fields of study, we can continue this work in many different directions. Each field of study could be a separate project; some examples are developmental psychopathology (schema theory and personality disorders), substance use, and mood disorders. This project may be hosted at the Institute for Advanced Studies.
Contact: Sacha Epskamp (firstname.lastname@example.org) & Julian Burger (J.Burger@uva.nl)
Methodological & psychometric projects
These projects are more methodologically oriented, with a focus ranging from running simulation studies to developing new statistical methods. They are ideal for students who are interested in programming and statistics.
Network centrality and change-scores
Recently, change-score analysis has been used to validate the role of central nodes in psychopathological networks (https://osf.io/e7k6s/). Change-scores, however, are very peculiar, and their covariance structure is far from trivial. While reviewing this paper, I discovered that the results are to be expected by chance alone.
This finding needs to be worked out and investigated further, possibly leading to a commentary or publication. The project will involve simulation studies to demonstrate the performance of the proposed test that is expected by chance.
Contact: Sacha Epskamp (email@example.com)
Network based adaptive assessment
For my VENI research, I aim to set up adaptive measurement systems on high-dimensional questionnaires. That is, can we use a network model to always ask only the most informative question? Such a system can have wide applications, such as improving dating apps, voting recommenders, and diagnosis. Numerous projects could fit this research.
Contact: Sacha Epskamp (firstname.lastname@example.org).
NCT-ESM: Comparing group-level networks based on ESM data
Currently, the Network Comparison Test (NCT) is suited and validated for cross-sectional data; one can statistically compare the network of one group with that of another group. The growing collection of ESM (intensive, longitudinal) data calls for an extension of NCT in which one can compare group-level networks that are based on the individuals’ ESM data in the respective groups. This project could involve (1) creating a new NCT in R, (2) assessing performance of the new test with a validation study, and (3) applying it to an empirical dataset.
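For orientation, the permutation logic behind a cross-sectional network comparison can be sketched as follows. This is a simplified illustration, not the implementation in the NCT R package: the networks here are plain Pearson correlation matrices instead of regularized estimates, the test statistic is the difference in summed absolute correlations, and all function names and the simulated data are hypothetical.

```python
import math
import random


def correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def global_strength(rows):
    """Sum of absolute pairwise correlations between the columns of `rows`."""
    cols = list(zip(*rows))
    p = len(cols)
    return sum(abs(correlation(cols[i], cols[j]))
               for i in range(p) for j in range(i + 1, p))


def permutation_test(group_a, group_b, n_perm=200, seed=1):
    """Compare global strength between two groups by permuting group labels."""
    rng = random.Random(seed)
    observed = abs(global_strength(group_a) - global_strength(group_b))
    pooled = group_a + group_b
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign persons to groups at random
        perm_a, perm_b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(global_strength(perm_a) - global_strength(perm_b)) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


def simulate_group(n, rho, rng, n_vars=3):
    """Simulated persons whose variables share one common factor (assumed setup)."""
    rows = []
    for _ in range(n):
        common = rng.gauss(0, 1)
        rows.append([math.sqrt(rho) * common + math.sqrt(1 - rho) * rng.gauss(0, 1)
                     for _ in range(n_vars)])
    return rows


rng = random.Random(42)
group_a = simulate_group(100, 0.6, rng)  # strongly connected group
group_b = simulate_group(100, 0.1, rng)  # weakly connected group
observed, p_value = permutation_test(group_a, group_b)
print(observed, p_value)
```

An extension to ESM data would have to permute whole persons, each with their own time series, and re-estimate the group-level network in every permutation. The sketch leaves that part out entirely.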
Contact: Claudia van Borkulo (C.D.vanBorkulo@uva.nl).
From IsingFit to OrdinalFit: extending IsingFit to handle ordinal data
Recently, many methods have become available to estimate the network structure from data. These methods are suitable for multiple types of variables: binary, continuous, nominal, and combinations thereof. Many datasets, however, contain ordinal variables. There is a clear gap for this type of data. In close collaboration with Mijke Rhemtulla (associate professor at UC Davis), you will (1) create a new estimation method for ordinal data in R, (2) assess performance with a validation study, and (3) apply it to empirical data.
Contact: Claudia van Borkulo (C.D.vanBorkulo@uva.nl).
What is lost when binarizing variables?
In most network analyses to date, categorical variables are binarized so that the binary-valued Ising model can be used to model the data. This procedure is suboptimal because binarizing variables possibly destroys important information. This project could go in several directions: (1) illustrate theoretically / with examples in which situations information is lost due to dichotomization; (2) show how to interpret interactions involving more than 2 categories in a network model, possibly by re-analyzing a published study in which variables have been dichotomized; (3) discuss the trade-offs between binarizing & fitting the Ising model and not binarizing & fitting a more complex Mixed Graphical Model (MGM); (4) think of ways to visualize the set of parameters in a categorical-categorical interaction in an (interactive) network visualization.
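A small simulation can illustrate direction (1). The sketch below assumes that ordinal responses arise from cutting a latent bivariate normal variable into four categories, which is an assumption made purely for illustration. It then compares the correlation of the latent scores, of the four-category scores, and of the binarized scores. The cutpoints, sample size, and correlation value are arbitrary choices.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def to_ordinal(value, cutpoints=(-1.0, 0.0, 1.0)):
    """Map a latent score to a category 0..3 by counting exceeded cutpoints."""
    return sum(value > c for c in cutpoints)


rng = random.Random(1)
rho, n = 0.5, 5000

# Latent bivariate normal scores with correlation rho.
latent_x, latent_y = [], []
for _ in range(n):
    a = rng.gauss(0, 1)
    b = rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
    latent_x.append(a)
    latent_y.append(b)

# Four-category ordinal responses, then a binarized version (upper two vs. lower two).
ord_x = [to_ordinal(v) for v in latent_x]
ord_y = [to_ordinal(v) for v in latent_y]
bin_x = [1 if c >= 2 else 0 for c in ord_x]
bin_y = [1 if c >= 2 else 0 for c in ord_y]

r_cont = pearson(latent_x, latent_y)
r_ord = pearson(ord_x, ord_y)
r_bin = pearson(bin_x, bin_y)
print(round(r_cont, 3), round(r_ord, 3), round(r_bin, 3))
```

Under these assumptions, a split at the median of a bivariate normal variable has a population correlation of (2/π)·arcsin(ρ), which equals 1/3 for ρ = 0.5. The binarized correlation should therefore land well below both the latent and the four-category correlations.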
Contact: Jonas Haslbeck (J.M.B.Haslbeck@uva.nl).
How well can we estimate different types of edges?
Mixed Graphical Models (MGMs) make it possible to combine variables defined on different domains (e.g. continuous, count or categorical) in one network model. From theory, we expect that different edge types (e.g. continuous-continuous vs. categorical-categorical) differ in how hard they are to estimate. However, it is unclear exactly how hard it is to estimate the different edge types with the number of observations available in typical psychological applications.
In this project, we would (1) set up a small simulation study to investigate how hard it is to recover different edge-types, and (2) discuss whether the results are in line with what we expect from theory.
Contact: Jonas Haslbeck (J.M.B.Haslbeck@uva.nl).
Missing Data Analysis
Missing data analysis is a prominent challenge in network psychometrics. Students interested in missing data analysis can pick up projects ranging from simulation studies investigating the extent of bias introduced by poor handling of missingness to technical projects aiming to solve the problem using, for example, imputation techniques or FIML estimation.
- Simulation study investigating which of the currently available methods of handling missing data in cross-sectional GGMs work best (e.g., multiple imputation, FIML estimation of the correlation matrix, pairwise estimation of the correlation matrix, Bayesian GGM estimation)
- Comparing mlVAR to Mplus estimation of multi-level graphical VAR models from zero missingness to severe missingness
- FIML estimation of polychoric correlations using OpenMx, and investigating its potential in estimating network models from datasets with skip-structure or severe missingness
Contact: Sacha Epskamp (email@example.com) & Maarten Marsman (firstname.lastname@example.org).
Skewed ordinal data and multicollinearity
One prominent issue in network psychometrics is the handling of ordinal data, especially when such data are highly skewed. Given the long history of handling ordinal data in psychometrics, it should be possible to solve this problem by extending existing models in one of several ways. Moreover, the extent to which the variables covary with each other (i.e., multicollinearity) might have an additional effect.
- Simulation study investigating the effect of skewed ordinal data using only current state-of-the-art estimation methods.
- Simulation study investigating the effect of multicollinearity using only current state-of-the-art estimation methods. To what extent do the effects of ordinal data and multicollinearity influence each other?
- Developing a method for estimating network models from skewed ordinal data
Contact: Sacha Epskamp (email@example.com) & Tessa Blanken (firstname.lastname@example.org)
Measurement invariance in network modeling
Measurement invariance testing entails testing if a model is the same for two groups (e.g., men and women or majority and minority). While steps for conducting measurement invariance have been worked out in latent variable modeling, they have not yet been worked out in network models. Assessing network invariance may be performed using a simple sequence of tests:
- Does a network model fit both groups?
- Does the same structure but with different parameters fit both groups?
- Does the same structure but with different intercepts fit both groups?
- Does the same model fit both groups?
The project is feasible if the network structure is known in both groups, but may be harder if the structure needs to be estimated as well. This project will be carried out jointly with Kees-Jan Kan from the education department.
Contact: Sacha Epskamp (email@example.com) & Maarten Marsman (firstname.lastname@example.org)
The critical dynamics of marginal IRT models
The most tractable way to simulate data from an Ising model is through Gibbs sampling, where the full-conditional distributions are the conditional distributions of one node given all others (i.e., logistic regressions). However, it is well documented that this approach suffers from increasing autocorrelation close to the critical temperature, a phenomenon known as critical slowing down. In a now seminal paper, Swendsen and Wang proposed an alternative approach that would come to be known as the first data-augmentation algorithm. It effectively breaks the aforementioned Gibbs sampler down into two steps: (1) generate a latent topological structure given the node states, and (2) generate the node states given the latent topological structure. Although this is a mouthful, it yields an extremely simple algorithm that suffers much less from critical slowing down near the critical temperature. It has become one of the main examples of an efficient data-augmentation algorithm, and forms the basis of many even more complicated approaches known as perfect sampling. Recently, we have proposed another data-augmentation representation of the Ising model, the marginal IRT model, which offers yet another algorithmic approach to generating data from it. The key question is how this approach compares to the two existing algorithms. I expect that it compares favorably, which may have major implications for all kinds of computational approaches in network models.
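The single-site Gibbs sampler that serves as the baseline here can be sketched in a few lines. The sketch assumes a -1/+1 coding of the node states and a distribution proportional to exp(beta * (sum of thresholds times states + sum over pairs of weights times state products)). The function names, the fully connected example network, and the lag-1 autocorrelation summary are illustrative choices. It implements neither the Swendsen-Wang algorithm nor the marginal IRT sampler.

```python
import math
import random


def gibbs_ising(weights, thresholds, n_sweeps, beta=1.0, seed=1):
    """Single-site Gibbs sampler for an Ising model with states in {-1, +1}."""
    rng = random.Random(seed)
    n = len(thresholds)
    state = [rng.choice((-1, 1)) for _ in range(n)]
    samples = []
    for _ in range(n_sweeps):
        for i in range(n):
            # Local field: own threshold plus weighted states of all other nodes.
            field = thresholds[i] + sum(
                weights[i][j] * state[j] for j in range(n) if j != i
            )
            # The full conditional is a logistic function of the local field.
            p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * field))
            state[i] = 1 if rng.random() < p_up else -1
        samples.append(list(state))
    return samples


def lag1_autocorrelation(series):
    """Lag-1 autocorrelation; returns 0.0 for a constant series."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    if var == 0:
        return 0.0
    cov = sum((series[t] - mean) * (series[t + 1] - mean) for t in range(n - 1))
    return cov / var


# Fully connected network with couplings 1/n (an assumed toy example).
n_nodes = 20
weights = [[0.0 if i == j else 1.0 / n_nodes for j in range(n_nodes)]
           for i in range(n_nodes)]
thresholds = [0.0] * n_nodes

for beta in (0.2, 1.0):
    samples = gibbs_ising(weights, thresholds, n_sweeps=2000, beta=beta)
    mean_state = [sum(s) / n_nodes for s in samples]
    print(beta, round(lag1_autocorrelation(mean_state), 3))
```

In this toy setup, the autocorrelation of the mean state is expected to be clearly higher at beta = 1.0, near where ordering sets in for large networks of this type, than at beta = 0.2. A comparison study could apply the same kind of summary to each of the competing samplers.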
Contact: Maarten Marsman (email@example.com).