Summary statistics based on:

Saunders, G.R.B., Wang, X., Chen, F. et al. Genetic diversity fuels gene discovery for tobacco and alcohol use. Nature 612, 720–724 (2022). https://doi.org/10.1038/s41586-022-05477-4

Files include summary statistics for associations with each phenotype: Drinks per week (DrnkWk), Cigarettes per day (CigDay), Smoking initiation (SmkInit), Smoking cessation (SmkCes), and Age of initiation (AgeSmk). Data were gathered from a meta-analysis of 60 genome-wide association studies (GWAS: https://genome.psych.umn.edu/index.php/GSCAN) of nicotine and substance use in up to 3.4 million participants from four major ancestries.

## ------------------------------------------
Column descriptions for files associated with ancestry-stratified GWAS summary statistics:

GSCAN_[SmkInit|AgeSmk|CigDay|SmkCes|DrnkWk]_2022_GWAS_SUMMARY_STATS_[EAS|EUR|AMR|AFR].txt.gz

Columns are:
CHR: Chromosome
POS: Position (b38)
RSID: rsID
EFFECT_ALLELE: Effect allele
OTHER_ALLELE: Non-effect allele
AF_1000G: Effect allele frequency from 1000G
BETA: Beta based on effect allele
SE: Standard error of the beta
P: P-value of the beta
N: Sum of sample size across contributing cohorts

Note: these files contain results from fixed-effect ancestry-stratified meta-analyses with 23andMe removed. Files include all variants that have passed our post-meta-analytic QC filters (see page 12 of the Supplementary Information) and were present in the 1000 Genomes reference panel (phase 4). We did not include allele frequency for data privacy reasons and instead used 1000G allele frequency within ancestry for each variant.
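As an illustration of how the columns above fit together, the following is a minimal sketch (not part of the released pipeline) that reads an ancestry-stratified file and recomputes P from BETA and SE. It assumes the files are tab-delimited and that P is a two-sided Wald p-value based on z = BETA / SE; neither is stated in this README. The helper names (to_float, two_sided_p, read_sumstats, load_sumstats) are hypothetical, and the demo rows are made up.

```python
import csv
import gzip
import io
import math


def to_float(text):
    """Convert a field to float; non-numeric entries (e.g. 'NA') become NaN."""
    try:
        return float(text)
    except ValueError:
        return math.nan


def two_sided_p(beta, se):
    """Two-sided normal p-value for the Wald statistic z = beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


def read_sumstats(handle, delimiter="\t"):
    """Yield one dict per variant from an open text handle.

    The tab delimiter is an assumption; adjust it if the files differ.
    """
    for row in csv.DictReader(handle, delimiter=delimiter):
        row["POS"] = int(row["POS"])
        for col in ("AF_1000G", "BETA", "SE", "P", "N"):
            row[col] = to_float(row[col])
        yield row


def load_sumstats(path):
    """Read a gzipped summary statistics file into a list of dicts."""
    with gzip.open(path, "rt") as handle:
        return list(read_sumstats(handle))


# Made-up rows for illustration only; these are NOT real GSCAN results.
demo_text = (
    "CHR\tPOS\tRSID\tEFFECT_ALLELE\tOTHER_ALLELE\tAF_1000G\tBETA\tSE\tP\tN\n"
    "1\t1000\trs0000001\tA\tG\t0.25\t0.0196\t0.01\t0.05\t100000\n"
    "2\t2000\trs0000002\tC\tT\t0.40\t0.0000\t0.02\t1.0\t90000\n"
)
demo_rows = list(read_sumstats(io.StringIO(demo_text)))
for row in demo_rows:
    # Compare this value with the reported P column as a sanity check.
    row["P_RECOMPUTED"] = two_sided_p(row["BETA"], row["SE"])
```

For a real file, load_sumstats("GSCAN_CigDay_2022_GWAS_SUMMARY_STATS_EUR.txt.gz") would replace the in-memory demo string. CHR is kept as text so that non-numeric chromosome labels, if present, do not break parsing.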
## ------------------------------------------
Column descriptions for files associated with multi-ancestry GWAS summary statistics:

GSCAN_[SmkInit|AgeSmk|CigDay|SmkCes|DrnkWk]_2022_GWAS_SUMMARY_STATS_MULTI.txt.gz

Columns are:
CHR: Chromosome
POS: Position (b38)
EFFECT_ALLELE: Effect allele
OTHER_ALLELE: Non-effect allele
EFFECT_ALLELE_FREQ_1000G: Allele Frequency from 1000G (see note below)
GAMMA_0MDS: Gamma estimate for intercept
GAMMA_0MDS_SE: GAMMA_0MDS standard error
GAMMA_1MDS: Gamma estimate for MDS component 1
GAMMA_1MDS_SE: GAMMA_1MDS standard error
GAMMA_2MDS: Gamma estimate for MDS component 2
GAMMA_2MDS_SE: GAMMA_2MDS standard error
GAMMA_3MDS: Gamma estimate for MDS component 3
GAMMA_3MDS_SE: GAMMA_3MDS standard error
GAMMA_4MDS: Gamma estimate for MDS component 4
GAMMA_4MDS_SE: GAMMA_4MDS standard error
TAU2: residual heterogeneity
N: Sum of sample size across contributing cohorts

Note: these files contain results from mixed-effect multi-ancestry meta-regressions with 23andMe removed. Files include all variants that have passed our post-meta-analytic QC filters (see page 12 of the Supplementary Information) and were present in the 1000 Genomes reference panel (phase 4). We did not include allele frequency for data privacy reasons and instead used 1000G allele frequencies. Allele frequencies for each variant are based on a rough approximation of expected ancestry proportions within the meta-sample, and we computed a weighted average allele frequency from an ancestry-matched subset of 1000 Genomes.

## ------------------------------------------
Column descriptions for multi-dimensional scaling (MDS) files associated with multi-ancestry GWAS summary statistics:

GSCAN_[SmkInit|AgeSmk|CigDay|SmkCes|DrnkWk]_2022_MDS_MULTI.txt.gz

Columns are:
MDS1: Multidimensional scaling (MDS) component 1
MDS2: MDS component 2
MDS3: MDS component 3
MDS4: MDS component 4

Note: these files contain MDS coordinates for all contributing studies for each phenotype.
The MDS coordinates are based on allele frequencies from all studies that contributed summary statistics to our meta-analytic effort. Study names have been removed, but these data may be combined with the multi-ancestry summary statistics to compute per-variant effect sizes in an external sample.

## ------------------------------------------
Column descriptions for files associated with PRS weights:

GSCAN_[SmkInit|AgeSmk|CigDay|SmkCes|DrnkWk]_2022_PRS_WEIGHTS_[AFR|EAS|EUR|AMR].txt.gz

Columns are:
RSID: rsID
CHR: Chromosome
POS: Position (b38)
EFFECT_ALLELE: Effect allele
OTHER_ALLELE: Non-effect allele
EFFECT_WEIGHT: Effect allele PRS weight

Note: PRS weights were generated using LDpred with ancestry-stratified GWAS summary statistics excluding data from 23andMe. LDpred was run with ancestry-matched LD reference panels (Add Health reference genotypes for EUR, AFR, and AMR ancestries; 1000 Genomes for EAS ancestry). Files include HapMap3 variants.

## ------------------------------------------
Column descriptions for files associated with ancestry-stratified GWAS summary statistics:

GSCAN_[SmkInit|AgeSmk|CigDay|SmkCes|DrnkWk]_2022_GWAS_SUMMARY_STATS_EUR_LOO_UKB_23andMe.txt.gz

Files contain results from European-ancestry stratified fixed-effect meta-analyses with the UK Biobank and 23andMe studies removed.

Columns are:
CHR: Chromosome
POS: Position (b38)
RSID: rsID
EFFECT_ALLELE: Effect allele
OTHER_ALLELE: Non-effect allele
AF_1000G: Effect allele frequency from 1000G
BETA: Beta based on effect allele
SE: Standard error of the beta
P: P-value of the beta
N: Sum of sample size across contributing cohorts

Note: these files contain results from EUR-stratified fixed-effect meta-analyses with UK Biobank and 23andMe removed. Files include all variants that have passed our post-meta-analytic QC filters (see page 12 of the Supplementary Information) and were present in the 1000 Genomes reference panel (phase 4).
We did not include allele frequency for data privacy reasons and instead used 1000G allele frequency within European ancestry for each variant.
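
The MDS note earlier in this README says that MDS coordinates may be combined with the multi-ancestry summary statistics to compute per-variant effect sizes in an external sample. The following is a minimal sketch of that combination, assuming the meta-regression is linear in the four MDS components, so that the predicted effect is GAMMA_0MDS + GAMMA_1MDS*MDS1 + ... + GAMMA_4MDS*MDS4. That functional form is an assumption to be checked against the paper's methods. The function name predicted_effect and all values are hypothetical, and obtaining MDS coordinates for an external sample is not covered here.

```python
def predicted_effect(variant, mds):
    """Return GAMMA_0MDS + sum over k of GAMMA_kMDS * MDSk for one variant.

    variant: dict with the GAMMA_* columns of a multi-ancestry file row.
    mds: dict with MDS1..MDS4 coordinates for the target sample.
    Assumes a linear meta-regression; no standard error is computed,
    because that would need covariances between the gamma estimates,
    which are not among the documented columns.
    """
    effect = float(variant["GAMMA_0MDS"])
    for k in range(1, 5):
        effect += float(variant[f"GAMMA_{k}MDS"]) * float(mds[f"MDS{k}"])
    return effect


# Made-up values for illustration only; these are NOT real GSCAN estimates.
demo_variant = {
    "GAMMA_0MDS": 0.02,
    "GAMMA_1MDS": 0.5,
    "GAMMA_2MDS": -0.2,
    "GAMMA_3MDS": 0.0,
    "GAMMA_4MDS": 0.0,
}
demo_mds = {"MDS1": 0.01, "MDS2": 0.05, "MDS3": 0.3, "MDS4": -0.1}

demo_effect = predicted_effect(demo_variant, demo_mds)
```

Setting every MDS coordinate to zero returns the intercept GAMMA_0MDS alone, which is a quick way to check that columns were matched correctly.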