Data Sets
Click below to learn about the data sets the EDGE Lab works with.
Collaborative Study on the Genetics of Alcoholism (COGA)
The Collaborative Study on the Genetics of Alcoholism (COGA) is a multi-site project whose goal is to identify specific genes involved in the predisposition to alcohol dependence and related disorders. The COGA sample consists of large families densely affected with alcohol dependence, who were identified through inpatient or outpatient alcohol treatment programs. All individuals were administered the Semi-Structured Assessment for the Genetics of Alcoholism (SSAGA) interview, a polydiagnostic instrument that assesses most major psychiatric disorders. Since 1991, COGA has interviewed more than 17,000 members of more than 2,200 families from across the United States, many of whom have been longitudinally assessed. COGA has used complementary strategies for gene identification and has several genotyped samples in which different types of genetic analyses are ongoing. These include (1) a family-based linkage sample; (2) a case-control GWAS sample; and (3) a child-adolescent sample. Family members, including adults, children, and adolescents, have been carefully characterized across a variety of domains, including alcohol and other substance-related phenotypes, co-occurring disorders (e.g., depression), electrophysiology, key precursor behavioral phenotypes (e.g., conduct disorder), and environmental risk factors (e.g., stress). COGA participants have also provided a blood sample that has been used to create a repository of DNA and cell lines, which are used for genetic studies.
Click here for information on how to access COGA data.
Project Alliance (PAL)
Project Alliance (PAL) comprises two existing longitudinal research data sets: Project Alliance 1 (PAL 1) and Project Alliance 2 (PAL 2). The full PAL 1 sample consisted of 998 multiethnic families, and 593 families participated in PAL 2. In both studies, two cohorts (in subsequent academic years) of adolescents and their families were recruited in the sixth grade from three public middle schools in northeast Portland, Oregon. Youths completed questionnaires about family conflict and antisocial behavior, among other topics. Youths and families were assessed at 9 time points between ages 11-12 and 23-24. This rich data set comprises a variety of data types, including youth and parent self-report surveys, teacher ratings of problem behavior, school records, criminal records, and observations of youths interacting with their families and best friends.
Spit for Science: The VCU Student Survey
Spit for Science: The VCU Student Survey (S4S) is a university-wide research registry with the goal of understanding pathways of risk (both genetic and environmental) for substance use and related mental health outcomes across the college years. From 2011 to 2015, all incoming freshmen age 18 or older were invited to participate in an online survey at the beginning of the fall semester of their first year, provide a saliva DNA sample, and complete follow-up surveys each spring semester thereafter. Five cohorts of incoming freshmen have been enrolled through the S4S pipeline thus far (N>12,000). Spit for Science has achieved consistent participation rates of 67-68% across all freshman classes, which compares quite favorably to cooperation rates of 22-40% reported in other web-based college surveys [67-70]. Sample demographics do not differ significantly from the overall university study population: 17% identify as Asian, 17% identify as African American, 6% identify as Hispanic, 51% identify as Caucasian, 6% identify as more than one race, and 3% report other/unknown; 60% are female and 40% are male. All data are entered in a registry, formerly managed by the EDGE Lab and now managed by the VCU Office of Research. All data are broadly available to investigators with research interests related to substance use and related behavioral health outcomes.
To request access to Spit for Science data, email spit4science@vcu.edu.
The Finnish Twin Studies
FinnTwin16 (FT16) and FinnTwin12 (FT12) are two population-based twin studies aimed at understanding how genetic and environmental influences impact the development of alcohol use and related behaviors across adolescence and into young adulthood. All twins were identified through Finland's Central Population Registry, permitting exhaustive and unbiased ascertainment of all twins born in the country across 10 birth cohorts, for a total of ~10,000 twins and their families. FT16 has questionnaire assessments at ages 16, 17, and 18.5, and in the mid-20s. These questionnaires contain items on alcohol use, smoking, other drug use, personality, and related health habits and environmental factors. A subset of the twins highly concordant or discordant for alcohol use in adolescence (~600 twins) also completed psychiatric interviews, DNA collection, electrophysiological measures, and neuropsychological testing at the mid-20s assessment. FT12 first assessed children at age 12, with follow-ups at ages 14 and 17 and in the early 20s. FT12 contains rich data from the twins, parents, teachers, and peers. A subset (~1,850 twins and their parents) also completed psychiatric interviews at ages 14 and 22. GWAS data for this subset are also available.
To inquire about data access, email Dr. Jaakko Kaprio at jaakko.kaprio@helsinki.fi.
The Child Development Project (CDP)
The Child Development Project (CDP) is a community-based sample in which children were recruited during kindergarten pre-registration from a variety of schools that served families from a range of socioeconomic status groups in three US cities. The original CDP sample consisted of 585 children (52% male; 81% European American, 17% African American, and 2% other ethnic groups). Data collection began the summer before the participants entered kindergarten (at ~age 5); follow-ups have been conducted annually and remain ongoing. DNA was collected from 93% of the target sample of regular CDP participants and is stored in the VCU Virginia Institute for Psychiatric and Behavioral Genetics (VIPBG) molecular genetics laboratory of Dr. Riley.
The project's guiding model of developmental process is that children's biological dispositions, cultural contexts, life experiences, and characteristic social cognitions transactionally combine to influence a variety of behavioral outcomes. The rich, longitudinal assessments of the CDP offer special advantages for advancing understanding of genetic mechanisms in behavioral development. The CDP's database contains multi-source, multi-method measures of multiple levels of social context and process, including family, school, peer, neighborhood, and child characteristics. It also contains a wide range of adjustment measures, including externalizing and internalizing behavior problems; substance use; academic, occupational, and military achievement; romantic relationships; and religious and civic involvement. The CDP database provides an unusually dense array of theoretically important phenotypic measures over a long span of development. This richness of phenotypes makes the genetic information available in the sample particularly valuable for studying how genes impact developmental pathways.
The Mobile Youth Survey (MYS)
The Mobile Youth Survey (MYS) was designed to identify the life course trajectories of adolescents (aged 10-18) living in poverty in the Mobile-Prichard inner city area of Alabama. The MYS was administered in a group format between 1998 and 2011, and examined a number of psychosocial variables, including risk behaviors (e.g., violence and aggressive behavior, alcohol and drug use, sexual behavior), family factors (e.g., family structure and parental monitoring), and neighborhood factors (e.g., support from the neighborhood). Over 12,000 youths enrolled in the MYS, producing nearly 36,000 annual data points.
In 2008, DNA and more extensive phenotypic measures were collected from a subsample of ~700 MYS participants ages 14-18. These youths were also part of a 'natural experiment' in which a random subset of families was relocated to better housing made possible by a government grant.
The MYS sample is a unique resource for extending our understanding of the risks associated with identified genes, for two reasons. First, it is a largely African-American sample, drawn from a population that is under-represented in genetic studies. Second, it is an impoverished sample, making it possible to study how extreme environmental conditions, such as poverty, may alter the importance or expression of individual genetic predispositions and/or the role of other important environmental factors, such as family and peer variables.
Avon Longitudinal Study of Parents and Children (ALSPAC)
The Avon Longitudinal Study of Parents and Children (ALSPAC) has followed a large epidemiological cohort of over 14,000 children (with DNA on 10,000), born from pregnancies with due dates between April 1991 and December 1992, and their families, for over 25 years. The project has collected comprehensive health-related information, including phenotypic outcomes, environmental factors, and DNA, with more than 85 assessments from mothers, their partners, and children, conducted from the prenatal stage through age 17 at yearly or more frequent intervals.
Click here to learn more about the questionnaires, measures, and cohort sample.
Click here if you are interested in accessing the ALSPAC data set.