Lange Symposium 2020 - GWAS for Ordinal Phenotypes with OrdinalGWAS.jl

This tutorial demonstrates how to conduct GWAS on ordinal phenotypes using OrdinalGWAS.jl. We will demonstrate various procedures that this software package can perform.

OrdinalGWAS.jl is a Julia package for performing genome-wide association studies (GWAS) for ordered categorical phenotypes using the proportional odds model or the ordered probit model. It is useful when the phenotype takes ordered discrete values, e.g., disease status (undiagnosed, pre-disease, mild, moderate, severe). This is especially common in complex diseases where a binary status may not be suitable for all individuals.

Outline

Motivation
Model
Basic Usage
- Input files
- Running a simple analysis
- Output files
Customized Analysis
- Restricting sample/snps
- Link functions
- LRT & score tests
- SNP-set analysis (with annotation file)
Additional Features in Documentation
Example (if time)

Motivation

The following is a phenotyping algorithm from Eastwood et al. (2016) for diagnosing type II diabetes in the UK Biobank population.

The labels it generates include the likelihood of diabetes and rely on several variables.

Such labels are hard to dichotomize into a binary case/control status.

Other types of ordinal phenotypes:

Disease Severity
Disease Progression
Maximum stage of a disease reached under treatment

Model

OrdinalGWAS.jl uses an ordered multinomial model. By default it fits the null model (without SNP effects) and then performs a score test for each SNP using the fitted null model.

Assume trait $Y$ takes ordinal values $j \in \{1, \ldots, J\}$.
Cumulative probabilities $\alpha_{ij} = \mathbb{P}(Y_i \le j)$ are linked to covariates $\mathbf{x}_i$ via $$ g(\alpha_{ij}) = \theta_j - \mathbf{x}_i^T \boldsymbol{\beta}, \quad j = 1,\ldots, J-1, $$ where $g$ is a strictly increasing link function.
Intercepts $\theta_1 \le \cdots \le \theta_{J-1}$ enforce the order between categories, and regression coefficients $\boldsymbol{\beta}$ reflect the effects of covariates.
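To make the model concrete, here is a minimal sketch in plain Julia that turns intercepts and coefficients into category probabilities under the logit link. The function names `logistic` and `category_probs` and the parameter values are assumptions chosen for illustration; they are not part of OrdinalGWAS.jl.

```julia
# Logistic CDF: inverse of the logit link g(a) = log(a / (1 - a)).
logistic(z) = 1 / (1 + exp(-z))

"""
Category probabilities P(Y = j), j = 1..J, under the model
g(P(Y <= j)) = theta[j] - x' * beta with logit link g.
"""
function category_probs(theta::AbstractVector, x::AbstractVector, beta::AbstractVector)
    eta = sum(x .* beta)                   # linear predictor x' * beta
    cum = logistic.(theta .- eta)          # P(Y <= j) for j = 1..J-1
    cum = vcat(cum, 1.0)                   # P(Y <= J) = 1
    return cum .- vcat(0.0, cum[1:end-1])  # successive differences give P(Y = j)
end

# Hypothetical parameter values, chosen only for illustration.
theta = [-1.5, -0.5, 0.5]   # J - 1 = 3 ordered intercepts, so J = 4 categories
beta  = [0.4]               # one covariate effect
probs = category_probs(theta, [1.0], beta)
println(probs)
```

Because the linear predictor enters with a minus sign, a positive coefficient shifts probability mass toward the higher categories.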

For reproducibility, check the machine information below. To execute a notebook command, hold down Shift and Enter within the box. This tutorial and corresponding modules have been checked with Julia versions 1.0 and 1.3.

# machine information for this tutorial
versioninfo()

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.6.0)
  CPU: Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, haswell)

# for use in this tutorial
using CSV, OrdinalGWAS, SnpArrays

Basic usage

Example data set

The data folder contains an example data set with a simulated covariate in covariate.txt.

# content of the data folder
readdir("data")

7-element Array{String,1}:
 ".DS_Store"            
 "covariate.txt"        
 "hapmap3.bed"          
 "hapmap3.bim"          
 "hapmap3.fam"          
 "hapmap3.map"          
 "hapmap_snpsetfile.txt"

Input files

ordinalgwas expects two input files: one for responses plus covariates (second argument), the other the Plink files for genotypes (third argument).

Covariate and trait file

Covariates and phenotype are provided in a csv file, e.g., covariate.txt, which has one header line for variable names. In this example, the variable hypertension, representing levels of hypertension, is the ordered categorical phenotype, coded as integers 1 to 4. We want to include the variable sex as the covariate in GWAS.

run(`head data/covariate.txt`);

famid,perid,faid,moid,sex,hypertension
2431,NA19916,0,0,1,4
2424,NA19835,0,0,2,4
2469,NA20282,0,0,2,4
2368,NA19703,0,0,1,3
2425,NA19901,0,0,2,3
2427,NA19908,0,0,1,4
2430,NA19914,0,0,2,4
2470,NA20287,0,0,2,1
2436,NA19713,0,0,2,3

Genotype data in PLINK format

We can use SnpArrays.jl to read in the raw genotype data.

s1 = SnpArray("data/hapmap3.bed")

324×13928 SnpArray:
 0x03  0x03  0x00  0x03  0x03  0x03  …  0x02  0x02  0x02  0x03  0x03  0x03
 0x03  0x02  0x02  0x02  0x03  0x03     0x03  0x03  0x03  0x03  0x02  0x03
 0x03  0x03  0x02  0x02  0x02  0x03     0x03  0x03  0x03  0x03  0x03  0x03
 0x03  0x03  0x02  0x03  0x02  0x03     0x02  0x02  0x02  0x03  0x03  0x03
 0x03  0x03  0x02  0x02  0x03  0x03     0x03  0x03  0x03  0x03  0x03  0x03
 0x03  0x02  0x03  0x03  0x00  0x03  …  0x02  0x02  0x02  0x03  0x03  0x03
 0x03  0x03  0x03  0x03  0x00  0x03     0x03  0x03  0x03  0x03  0x03  0x03
 0x03  0x03  0x00  0x03  0x03  0x03     0x03  0x03  0x03  0x03  0x03  0x03
 0x03  0x03  0x02  0x02  0x03  0x03     0x03  0x03  0x03  0x03  0x03  0x03
 0x03  0x03  0x02  0x03  0x03  0x03     0x02  0x02  0x02  0x03  0x03  0x03
 0x03  0x03  0x02  0x03  0x03  0x03  …  0x03  0x03  0x03  0x03  0x03  0x03
 0x03  0x03  0x03  0x02  0x03  0x03     0x02  0x02  0x02  0x03  0x03  0x03
 0x03  0x03  0x00  0x03  0x02  0x03     0x02  0x02  0x02  0x03  0x03  0x03
    ⋮                             ⋮  ⋱                       ⋮            
 0x03  0x03  0x02  0x03  0x00  0x03     0x02  0x02  0x02  0x03  0x03  0x03
 0x03  0x03  0x00  0x03  0x03  0x03     0x02  0x02  0x02  0x03  0x03  0x03
 0x03  0x03  0x00  0x03  0x02  0x03     0x03  0x03  0x03  0x03  0x03  0x03
 0x03  0x03  0x00  0x02  0x03  0x03  …  0x03  0x03  0x03  0x03  0x03  0x03
 0x03  0x03  0x02  0x03  0x03  0x03     0x03  0x03  0x03  0x03  0x03  0x03
 0x03  0x03  0x03  0x02  0x03  0x03     0x02  0x02  0x02  0x03  0x03  0x03
 0x03  0x03  0x02  0x02  0x03  0x03     0x02  0x02  0x02  0x03  0x03  0x03
 0x03  0x03  0x02  0x03  0x03  0x03     0x03  0x03  0x03  0x03  0x03  0x03
 0x03  0x03  0x00  0x03  0x02  0x03  …  0x02  0x02  0x02  0x03  0x03  0x03
 0x03  0x03  0x02  0x02  0x03  0x03     0x02  0x02  0x02  0x03  0x03  0x03
 0x03  0x03  0x00  0x02  0x03  0x03     0x02  0x02  0x02  0x03  0x03  0x03
 0x03  0x03  0x00  0x03  0x03  0x03     0x03  0x03  0x03  0x03  0x03  0x03

In this example, there are 324 individuals genotyped at 13,928 SNPs.
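The hexadecimal entries displayed above are 2-bit genotype codes. As a minimal sketch in plain Julia, the following decodes such codes into allele counts and computes a minor allele frequency, assuming the SnpArrays.jl convention 0x00 = A1/A1, 0x01 = missing, 0x02 = A1/A2, 0x03 = A2/A2. The helper names `a2_count` and `minor_allele_freq` and the toy data are hypothetical; in practice SnpArrays.jl provides `maf` for this purpose.

```julia
# Map a 2-bit genotype code to the number of A2 alleles (additive coding).
# Assumed convention: 0x00 = A1/A1, 0x01 = missing, 0x02 = A1/A2, 0x03 = A2/A2.
function a2_count(code::UInt8)
    code == 0x00 && return 0.0
    code == 0x02 && return 1.0
    code == 0x03 && return 2.0
    return NaN   # 0x01 (missing) or any unexpected code
end

# Minor allele frequency of one SNP column, ignoring missing genotypes.
function minor_allele_freq(codes::AbstractVector{UInt8})
    counts = filter(!isnan, a2_count.(codes))
    isempty(counts) && return NaN
    freq_a2 = sum(counts) / (2 * length(counts))
    return min(freq_a2, 1 - freq_a2)
end

# Toy column of genotype codes (made up for illustration, not read from hapmap3.bed).
toy = UInt8[0x03, 0x03, 0x02, 0x00, 0x01, 0x03]
println(minor_allele_freq(toy))
```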

Analysis

The following command performs GWAS using the proportional odds model, which is the default when no link function is specified. It returns the fitted null model.

Formula for null model

The first argument specifies the null model without SNP effects, e.g., @formula(hypertension ~ sex).

ordinalgwas(@formula(hypertension ~ sex), "data/covariate.txt", "data/hapmap3")

StatsModels.TableRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

hypertension ~ sex

Coefficients:
──────────────────────────────────────────────────────
               Estimate  Std.Error   t value  Pr(>|t|)
──────────────────────────────────────────────────────
intercept1|2  -1.48564    0.358891  -4.13952    <1e-4 
intercept2|3  -0.569479   0.341044  -1.66981    0.0959
intercept3|4   0.429815   0.339642   1.26549    0.2066
sex            0.424656   0.213914   1.98517    0.0480
──────────────────────────────────────────────────────

For documentation of the ordinalgwas function, type ?ordinalgwas in the Julia REPL.

?ordinalgwas

search: ordinalgwas ordinalgwasGxE OrdinalGWAS ordinalsnpsetgwas OrdinalRange

ordinalgwas(nullformula, covfile, plinkfile)
ordinalgwas(nullformula, df, plinkfile)
ordinalgwas(fittednullmodel, plinkfile)
ordinalgwas(fittednullmodel, bedfile, bimfile, bedn)

Compressed Plink files are supported. For example, if Plink files are hapmap3.bed.gz, hapmap3.bim.gz and hapmap3.fam.gz, the same command

ordinalgwas(@formula(hypertension ~ sex), "data/covariate.txt", "data/hapmap3")

still works. Check all supported compression formats with

SnpArrays.ALLOWED_FORMAT

6-element Array{String,1}:
 "gz"  
 "zlib"
 "zz"  
 "xz"  
 "zst" 
 "bz2"

Output files

ordinalgwas outputs two files: ordinalgwas.null.txt and ordinalgwas.pval.txt.

ordinalgwas.null.txt lists the estimated null model (without SNPs).

run(`cat ordinalgwas.null.txt`);

StatsModels.TableRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

hypertension ~ sex

Coefficients:
──────────────────────────────────────────────────────
               Estimate  Std.Error   t value  Pr(>|t|)
──────────────────────────────────────────────────────
intercept1|2  -1.48564    0.358891  -4.13952    <1e-4 
intercept2|3  -0.569479   0.341044  -1.66981    0.0959
intercept3|4   0.429815   0.339642   1.26549    0.2066
sex            0.424656   0.213914   1.98517    0.0480
──────────────────────────────────────────────────────

ordinalgwas.pval.txt lists the SNPs and their p-values.

ENV["COLUMNS"]=120 #shows up to 10 columns for dataframe displays
CSV.read("ordinalgwas.pval.txt")

#clean up files 
rm("ordinalgwas.null.txt", force=true)
rm("ordinalgwas.pval.txt", force=true)

Output file names can be changed by the nullfile and pvalfile keywords respectively. For example,

ordinalgwas(@formula(hypertension ~ sex), "data/covariate.txt", "data/hapmap3", pvalfile="ordinalgwas.pval.txt.gz")

will output the p-value file in compressed gz format.

Timing

For this moderate-sized data set (N = 324, SNPs = 13,928), ordinalgwas takes around 0.2 to 0.3 seconds.

@time(ordinalgwas(@formula(hypertension ~ sex), "data/covariate.txt", "data/hapmap3"));

  0.219184 seconds (639.63 k allocations: 32.737 MiB, 5.43% gc time)

#clean up files 
rm("ordinalgwas.null.txt", force=true)
rm("ordinalgwas.pval.txt", force=true)

We have applied this software to the COPDGene cohort (N = 5,953, Nsnps = 630,860) and the UK Biobank cohort (N = 185,565, Nsnps = 464,137), as published in Genetic Epidemiology. Runtimes were 3.5 minutes and 181 minutes, respectively.

SNP and/or sample masks

In practice, we often perform GWAS on selected SNPs and/or selected samples. They can be specified by the snpinds, covrowinds and bedrowinds keywords of the ordinalgwas function.

For example, to perform GWAS on SNPs with minor allele frequency (MAF) of at least 0.05:

# create SNP mask
snpinds = maf(SnpArray("data/hapmap3.bed")) .≥ 0.05
# GWAS on selected SNPs
@time ordinalgwas(@formula(hypertension ~ sex), "data/covariate.txt", "data/hapmap3", 
    snpinds=snpinds, nullfile="commonvariant.null.txt", pvalfile="commonvariant.pval.txt")

  0.354116 seconds (974.12 k allocations: 51.225 MiB, 6.77% gc time)

StatsModels.TableRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

hypertension ~ sex

Coefficients:
──────────────────────────────────────────────────────
               Estimate  Std.Error   t value  Pr(>|t|)
──────────────────────────────────────────────────────
intercept1|2  -1.48564    0.358891  -4.13952    <1e-4 
intercept2|3  -0.569479   0.341044  -1.66981    0.0959
intercept3|4   0.429815   0.339642   1.26549    0.2066
sex            0.424656   0.213914   1.98517    0.0480
──────────────────────────────────────────────────────

CSV.read("commonvariant.pval.txt")

# extra header line in commonvariant.pval.txt
countlines("commonvariant.pval.txt"), count(snpinds)

(12086, 12085)

# clean up
rm("commonvariant.null.txt", force=true)
rm("commonvariant.pval.txt", force=true)

The keyword covrowinds selects samples (rows) in the covariate file, and bedrowinds selects samples (rows) in the SnpArray read from the Plink bed file. Be particularly careful when using these two keywords: the selected rows in the SnpArray must exactly match the samples in the null model, otherwise the results are meaningless.
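One way to guard against a mismatch is to compare sample IDs before running the analysis. The sketch below uses a hypothetical helper `rows_aligned` and made-up in-memory ID vectors. It assumes the IDs would come from the perid column of the covariate file and the second (individual ID) column of the .fam file, and that those two columns use the same identifiers.

```julia
# Hypothetical helper: check that the selected covariate rows and the selected
# .fam rows refer to the same individuals, in the same order.
function rows_aligned(cov_ids::AbstractVector, fam_ids::AbstractVector,
                      covrowinds, bedrowinds)
    return cov_ids[covrowinds] == fam_ids[bedrowinds]
end

# Toy IDs (made up for illustration); in practice they would be read from the
# covariate file and the .fam file.
cov_ids = ["NA19916", "NA19835", "NA20282", "NA19703"]
fam_ids = ["NA19916", "NA19835", "NA20282", "NA19703"]

println(rows_aligned(cov_ids, fam_ids, 1:3, 1:3))   # same samples, same order
println(rows_aligned(cov_ids, fam_ids, 1:3, 2:4))   # shifted selection
```

The first call prints true and the second prints false, since the shifted selection pairs each covariate row with a different individual's genotypes.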

For example, to use the first 300 samples in both the covariate file and the Plink bed file:

ordinalgwas(@formula(hypertension ~ sex), "data/covariate.txt", "data/hapmap3", 
    nullfile="first300.null.txt", pvalfile="first300.pval.txt", covrowinds=1:300,
    bedrowinds=1:300)

StatsModels.TableRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

hypertension ~ sex

Coefficients:
──────────────────────────────────────────────────────
               Estimate  Std.Error   t value  Pr(>|t|)
──────────────────────────────────────────────────────
intercept1|2  -1.42902    0.371752  -3.84401    0.0001
intercept2|3  -0.590123   0.355841  -1.65839    0.0983
intercept3|4   0.407014   0.35423    1.14901    0.2515
sex            0.40068    0.221836   1.8062     0.0719
──────────────────────────────────────────────────────

CSV.read("first300.pval.txt")

# clean up
rm("first300.null.txt", force=true)
rm("first300.pval.txt", force=true)

Likelihood ratio test (LRT)

By default, ordinalgwas calculates the p-value for each SNP using a score test. The score test is fast because it doesn't require fitting the alternative model for each SNP. Users can request a likelihood ratio test (LRT) using the keyword test=:lrt. LRT is much slower but may be more powerful than the score test.
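As a rough guide to why the two tests differ in cost, their standard textbook forms can be written as follows. The notation is introduced here for illustration and is not taken from the package: $\boldsymbol{\psi} = (\boldsymbol{\theta}, \boldsymbol{\beta}, \gamma)$ collects all parameters, with $\gamma$ the SNP effect, $\ell$ the log-likelihood, $U$ the score vector (gradient of $\ell$), and $I$ the Fisher information.

```latex
\[
\begin{aligned}
\text{Score test:}\quad & S = U(\hat{\boldsymbol{\psi}}_0)^T \, I(\hat{\boldsymbol{\psi}}_0)^{-1} \, U(\hat{\boldsymbol{\psi}}_0), \\
\text{LRT:}\quad & \Lambda = 2\left[\ell(\hat{\boldsymbol{\psi}}_1) - \ell(\hat{\boldsymbol{\psi}}_0)\right].
\end{aligned}
\]
```

Here $\hat{\boldsymbol{\psi}}_0$ is the estimate under $H_0: \gamma = 0$, which comes from the null model fit and is reused for every SNP, whereas $\hat{\boldsymbol{\psi}}_1$ requires a separate model fit for each SNP. For a single SNP, both statistics are asymptotically $\chi^2_1$ under $H_0$.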

@time ordinalgwas(@formula(hypertension ~ sex), "data/covariate.txt", "data/hapmap3"; snpinds = 1:10, 
    test=:LRT, nullfile="lrt.null.txt", pvalfile="lrt.pval.txt")

  0.742670 seconds (1.82 M allocations: 92.778 MiB, 3.28% gc time)

StatsModels.TableRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

hypertension ~ sex

Coefficients:
──────────────────────────────────────────────────────
               Estimate  Std.Error   t value  Pr(>|t|)
──────────────────────────────────────────────────────
intercept1|2  -1.48564    0.358891  -4.13952    <1e-4 
intercept2|3  -0.569479   0.341044  -1.66981    0.0959
intercept3|4   0.429815   0.339642   1.26549    0.2066
sex            0.424656   0.213914   1.98517    0.0480
──────────────────────────────────────────────────────

Note the extra effect column in pvalfile, which is the effect size (regression coefficient) for each SNP.

CSV.read("lrt.pval.txt")

# clean up
rm("lrt.pval.txt", force=true)
rm("lrt.null.txt", force=true)

The LRT timing above covers only the first 10 SNPs. On the full data set, GWAS by score test takes about 0.2 seconds, while GWAS by LRT on all SNPs takes about 20 seconds, roughly a 100-fold difference in run time.

Score test for screening, LRT for power

For large data sets, a practical solution is to perform score test first, then re-do LRT for the most promising SNPs according to score test p-values.

Step 1: Perform score test GWAS, results in score.pval.txt.

@time ordinalgwas(@formula(hypertension ~ sex), "data/covariate.txt", "data/hapmap3", 
    test=:score, pvalfile="score.pval.txt");

  0.261909 seconds (752.90 k allocations: 38.855 MiB, 6.65% gc time)

CSV.read("score.pval.txt")

Step 2: Sort score test p-values and find top 10 SNPs.

scorepvals = CSV.read("score.pval.txt")[!, 6] # p-values in 6th column (pval)
tophits = sortperm(scorepvals)[1:10] # indices of 10 SNPs with smallest p-values
scorepvals[tophits] # smallest 10 p-values

10-element Array{Float64,1}:
 1.3080149099181303e-6
 6.536722765052079e-6 
 9.66474218566903e-6  
 1.2168672367668889e-5
 1.8027460018331254e-5
 2.0989542284213636e-5
 2.6844521269963608e-5
 3.108283828554874e-5 
 4.1010912875160476e-5
 4.2966265138454725e-5

Step 3: Re-do LRT on top hits.

@time ordinalgwas(@formula(hypertension ~ sex), "data/covariate.txt", "data/hapmap3", 
    snpinds=tophits, test=:LRT, pvalfile="lrt.pval.txt");

  0.278900 seconds (527.08 k allocations: 28.637 MiB, 4.83% gc time)

CSV.read("lrt.pval.txt")

# clean up
rm("ordinalgwas.null.txt", force=true)
rm("score.pval.txt", force=true)
rm("lrt.pval.txt", force=true)

SNP-set testing

In many applications, we want to test a SNP-set, i.e., whether a group of SNPs has a joint effect. The function ordinalsnpsetgwas() can be used to do this. The following example uses an annotation file in which each SNP has a gene annotation.

run(`head data/hapmap_snpsetfile.txt`);

gene1 rs10458597
gene1 rs12562034
gene1 rs2710875
gene1 rs11260566
gene1 rs1312568
gene1 rs35154105
gene1 rs16824508
gene1 rs2678939
gene1 rs7553178
gene1 rs13376356
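The listing above suggests a two-column, whitespace-separated layout: a set (gene) name followed by a SNP ID. As a sketch of how such a file groups SNPs into sets, the hypothetical helper `parse_snpsets` below collects SNP IDs by set name in plain Julia. It only illustrates the file layout and is not how OrdinalGWAS.jl reads the file internally.

```julia
# Group SNP IDs by set name from lines of the form "<set name> <SNP id>".
function parse_snpsets(lines)
    sets = Dict{String,Vector{String}}()
    for line in lines
        fields = split(line)
        length(fields) == 2 || continue      # skip blank or malformed lines
        name, snpid = String(fields[1]), String(fields[2])
        push!(get!(sets, name, String[]), snpid)
    end
    return sets
end

# First three lines of the listing above, plus one made-up line
# ("gene2 rs0000000") to show a second set.
lines = ["gene1 rs10458597", "gene1 rs12562034", "gene1 rs2710875", "gene2 rs0000000"]
sets = parse_snpsets(lines)
println(length(sets["gene1"]), " SNPs in gene1")
```

The same helper should also accept `eachline("data/hapmap_snpsetfile.txt")` as input, assuming the whole file follows the layout shown in the first ten lines.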

Now we just need to pass that file through the snpset keyword to run a SNP-set analysis.

ordinalsnpsetgwas(@formula(hypertension ~ sex), "data/covariate.txt", "data/hapmap3",
    pvalfile = "snpset.pval.txt", snpset = "data/hapmap_snpsetfile.txt")

StatsModels.TableRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

hypertension ~ sex

Coefficients:
──────────────────────────────────────────────────────
               Estimate  Std.Error   t value  Pr(>|t|)
──────────────────────────────────────────────────────
intercept1|2  -1.48564    0.358891  -4.13952    <1e-4 
intercept2|3  -0.569479   0.341044  -1.66981    0.0959
intercept3|4   0.429815   0.339642   1.26549    0.2066
sex            0.424656   0.213914   1.98517    0.0480
──────────────────────────────────────────────────────

CSV.read("snpset.pval.txt")

# clean up
rm("snpset.pval.txt", force=true)
rm("ordinalgwas.null.txt", force=true)

Additional Features Available Not Covered Today

The following features are also available in OrdinalGWAS.jl. They are covered in the OrdinalGWAS.jl documentation.

Additional SNP-set analysis options
- Sliding window
- Specific set of snps
GxE interaction analysis
Link Functions
- LogitLink(), proportional odds model (default),
- ProbitLink(), ordered probit model,
- CloglogLink(), proportional hazards model, or
- CauchyLink()
Set of multiple plink files (i.e. by chromosome)
- Running on a cluster
Running analysis with a Docker file
Plotting
- To plot the GWAS results, use the MendelPlots.jl package.

GxE example (if time allows)

GxE interactions

In many applications, the user may want to test a GxE interaction effect. This requires fitting the SNP in the null model and is considerably slower, but the command ordinalgwasGxE() can be used to test the interaction effect. To do this you must specify the environmental variable in the command, either as a symbol, such as :age, or as a string, such as "age".
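Written out in the notation of the Model section, a GxE test of this kind can be sketched as below. The symbols $s_i$ (genotype of individual $i$), $e_i$ (environmental variable), $\gamma_{s}$ and $\gamma_{se}$ are introduced here for illustration, and the exact parameterization used by the package is an assumption.

```latex
\[
g(\alpha_{ij}) = \theta_j - \mathbf{x}_i^T \boldsymbol{\beta} - \gamma_{s}\, s_i - \gamma_{se}\, (s_i \cdot e_i),
\qquad H_0: \gamma_{se} = 0 .
\]
```

Since the SNP main effect $\gamma_{s}$ stays in the model under $H_0$, the null model differs from SNP to SNP and has to be refit each time, which is why this analysis is slower than the default score test.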

For documentation of the ordinalgwasGxE function, type ?ordinalgwasGxE in the Julia REPL.

?ordinalgwasGxE

search: ordinalgwasGxE ordinalgwas OrdinalGWAS ordinalsnpsetgwas

ordinalgwasGxE(nullformula, covfile, plinkfile, e)
ordinalgwasGxE(nullformula, df, plinkfile, e)
ordinalgwasGxE(fittednullmodel, plinkfile, e)
ordinalgwasGxE(fittednullmodel, bedfile, bimfile, bedn, e)

The following can be used to test if there is an interaction between sex and the first five SNPs in the data.

ordinalgwasGxE(@formula(hypertension ~ sex), "data/covariate.txt", "data/hapmap3",
    :sex, pvalfile = "gxe_snp.pval.txt", snpinds=1:5, test=:score)

StatsModels.TableRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

hypertension ~ sex

Coefficients:
──────────────────────────────────────────────────────
               Estimate  Std.Error   t value  Pr(>|t|)
──────────────────────────────────────────────────────
intercept1|2  -1.48564    0.358891  -4.13952    <1e-4 
intercept2|3  -0.569479   0.341044  -1.66981    0.0959
intercept3|4   0.429815   0.339642   1.26549    0.2066
sex            0.424656   0.213914   1.98517    0.0480
──────────────────────────────────────────────────────

CSV.read("gxe_snp.pval.txt")

# clean up
rm("gxe_snp.pval.txt", force=true)

Now try to run a GxE interaction analysis for the first five SNPs using an LRT (which also returns the effect size estimate of the GxE interaction), saving the results to a file named gxe_lrt.pval.txt.

## Your code here:

###

Display the results

CSV.read("gxe_lrt.pval.txt")

# clean up
rm("gxe_lrt.pval.txt", force=true)

GxE interactions - testing joint effect

In some applications, the user may want to test the SNP effect and/or its interaction with other terms. The testformula keyword specifies the terms to be tested in addition to the covariates in nullformula.
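For instance, when the test terms are a SNP main effect and a SNP-by-sex interaction, the joint hypothesis can be sketched in the notation of the Model section as follows. The symbols $s_i$ (genotype), $\text{sex}_i$, $\gamma_1$ and $\gamma_2$ are introduced here for illustration, under the assumption that the test terms enter the linear predictor with the same sign convention as the covariates.

```latex
\[
g(\alpha_{ij}) = \theta_j - \beta_{\text{sex}}\, \text{sex}_i - \gamma_1\, s_i - \gamma_2\, (s_i \cdot \text{sex}_i),
\qquad H_0: \gamma_1 = \gamma_2 = 0 .
\]
```

Unlike a test of the interaction alone, this null hypothesis removes every SNP term at once, so the null model contains only the covariates in nullformula and a two-degree-of-freedom test is carried out for each SNP.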

In the following example, the keyword testformula=@formula(hypertension ~ snp + snp & sex) instructs ordinalgwas to test the joint effect of snp and the snp & sex interaction.

ordinalgwas(@formula(hypertension ~ sex), "data/covariate.txt", "data/hapmap3", 
    pvalfile="GxE.pval.txt", testformula=@formula(hypertension ~ snp + snp & sex));

CSV.read("GxE.pval.txt")

# clean up
rm("ordinalgwas.null.txt")
rm("GxE.pval.txt")

Output of CSV.read("ordinalgwas.pval.txt"):

	chr	pos	snpid	maf	hwepval	pval
	Int64	Int64	String	Float64	Float64	Float64
1	1	554484	rs10458597	0.0	1.0	1.0
2	1	758311	rs12562034	0.0776398	0.409876	0.00456531
3	1	967643	rs2710875	0.324074	4.07625e-7	3.10828e-5
4	1	1168108	rs11260566	0.191589	0.128568	1.21687e-5
5	1	1375074	rs1312568	0.441358	2.5376e-19	0.00820686
6	1	1588771	rs35154105	0.0	1.0	1.0
7	1	1789051	rs16824508	0.00462963	0.933278	0.511198
8	1	1990452	rs2678939	0.453704	5.07696e-11	0.299728
9	1	2194615	rs7553178	0.226852	0.170561	0.171333
10	1	2396747	rs13376356	0.14486	0.905308	0.532042
11	1	2623158	rs28753913	0.0	1.0	1.0
12	1	2823603	rs1563468	0.483025	4.23066e-9	0.225191
13	1	3025087	rs6690373	0.25387	9.23864e-8	0.701847
14	1	3225416	rs12043519	0.029321	0.000778802	0.00107978
15	1	3431124	rs12093117	0.109907	0.232693	0.427794
16	1	3633945	rs10910017	0.221875	0.000270888	0.913129
17	1	3895935	rs34770924	0.0246914	3.20726e-5	0.999021
18	1	4096895	rs6702633	0.475232	0.663812	0.00651631
19	1	4297388	rs684965	0.305556	0.213633	0.0951953
20	1	4498133	rs11809295	0.0993789	0.257162	0.0832435
21	1	4698713	rs578528	0.324074	0.616937	0.0692307
22	1	4899946	rs4654471	0.358025	0.181265	0.224531
23	1	5100369	rs6681148	0.131579	0.00845447	0.155667
24	1	5302730	rs10799197	0.428793	0.294742	0.669055
25	1	5502779	rs10796400	0.231481	0.46091	0.241525
26	1	5703284	rs2244632	0.394737	0.313378	0.53453
27	1	5904631	rs7549324	0.367284	0.517567	0.716133
28	1	6106513	rs2843494	0.0648148	0.740663	0.365573
29	1	6310159	rs4908880	0.0555556	0.28968	0.579307
30	1	6514524	rs932112	0.220679	0.172669	0.303095
⋮	⋮	⋮	⋮	⋮	⋮	⋮

Output of CSV.read("commonvariant.pval.txt"):

	chr	pos	snpid	maf	hwepval	pval
	Int64	Int64	String	Float64	Float64	Float64
1	1	758311	rs12562034	0.0776398	0.409876	0.00456531
2	1	967643	rs2710875	0.324074	4.07625e-7	3.10828e-5
3	1	1168108	rs11260566	0.191589	0.128568	1.21687e-5
4	1	1375074	rs1312568	0.441358	2.5376e-19	0.00820686
5	1	1990452	rs2678939	0.453704	5.07696e-11	0.299728
6	1	2194615	rs7553178	0.226852	0.170561	0.171333
7	1	2396747	rs13376356	0.14486	0.905308	0.532042
8	1	2823603	rs1563468	0.483025	4.23066e-9	0.225191
9	1	3025087	rs6690373	0.25387	9.23864e-8	0.701847
10	1	3431124	rs12093117	0.109907	0.232693	0.427794
11	1	3633945	rs10910017	0.221875	0.000270888	0.913129
12	1	4096895	rs6702633	0.475232	0.663812	0.00651631
13	1	4297388	rs684965	0.305556	0.213633	0.0951953
14	1	4498133	rs11809295	0.0993789	0.257162	0.0832435
15	1	4698713	rs578528	0.324074	0.616937	0.0692307
16	1	4899946	rs4654471	0.358025	0.181265	0.224531
17	1	5100369	rs6681148	0.131579	0.00845447	0.155667
18	1	5302730	rs10799197	0.428793	0.294742	0.669055
19	1	5502779	rs10796400	0.231481	0.46091	0.241525
20	1	5703284	rs2244632	0.394737	0.313378	0.53453
21	1	5904631	rs7549324	0.367284	0.517567	0.716133
22	1	6106513	rs2843494	0.0648148	0.740663	0.365573
23	1	6310159	rs4908880	0.0555556	0.28968	0.579307
24	1	6514524	rs932112	0.220679	0.172669	0.303095
25	1	6715827	rs441515	0.493808	0.782937	0.596866
26	1	6917805	rs12043429	0.0987654	0.600298	0.726598
27	1	7119246	rs4908600	0.156832	0.649017	0.501062
28	1	7319987	rs7553372	0.356037	5.28439e-9	0.122518
29	1	7522841	rs1193169	0.228261	0.481934	0.134259
30	1	7723190	rs4908691	0.371914	0.0875127	0.497583
⋮	⋮	⋮	⋮	⋮	⋮	⋮

Output of CSV.read("first300.pval.txt"):

	chr	pos	snpid	maf	hwepval	pval
	Int64	Int64	String	Float64	Float64	Float64
1	1	554484	rs10458597	0.0	1.0	1.0
2	1	758311	rs12562034	0.0776398	0.409876	0.00355969
3	1	967643	rs2710875	0.324074	4.07625e-7	0.000123604
4	1	1168108	rs11260566	0.191589	0.128568	5.2213e-6
5	1	1375074	rs1312568	0.441358	2.5376e-19	0.00758234
6	1	1588771	rs35154105	0.0	1.0	1.0
7	1	1789051	rs16824508	0.00462963	0.933278	0.504171
8	1	1990452	rs2678939	0.453704	5.07696e-11	0.363679
9	1	2194615	rs7553178	0.226852	0.170561	0.24109
10	1	2396747	rs13376356	0.14486	0.905308	0.453925
11	1	2623158	rs28753913	0.0	1.0	1.0
12	1	2823603	rs1563468	0.483025	4.23066e-9	0.148556
13	1	3025087	rs6690373	0.25387	9.23864e-8	0.762476
14	1	3225416	rs12043519	0.029321	0.000778802	0.00131734
15	1	3431124	rs12093117	0.109907	0.232693	0.530602
16	1	3633945	rs10910017	0.221875	0.000270888	0.835396
17	1	3895935	rs34770924	0.0246914	3.20726e-5	0.96925
18	1	4096895	rs6702633	0.475232	0.663812	0.0367962
19	1	4297388	rs684965	0.305556	0.213633	0.0813781
20	1	4498133	rs11809295	0.0993789	0.257162	0.168299
21	1	4698713	rs578528	0.324074	0.616937	0.0654792
22	1	4899946	rs4654471	0.358025	0.181265	0.232969
23	1	5100369	rs6681148	0.131579	0.00845447	0.106474
24	1	5302730	rs10799197	0.428793	0.294742	0.693596
25	1	5502779	rs10796400	0.231481	0.46091	0.0755408
26	1	5703284	rs2244632	0.394737	0.313378	0.616338
27	1	5904631	rs7549324	0.367284	0.517567	0.987062
28	1	6106513	rs2843494	0.0648148	0.740663	0.558175
29	1	6310159	rs4908880	0.0555556	0.28968	0.414671
30	1	6514524	rs932112	0.220679	0.172669	0.215976
⋮	⋮	⋮	⋮	⋮	⋮	⋮

Output of CSV.read("lrt.pval.txt") for the first 10 SNPs:

	chr	pos	snpid	maf	hwepval	effect	pval
	Int64	Int64	String	Float64	Float64	Float64	Float64
1	1	554484	rs10458597	0.0	1.0	0.0	1.0
2	1	758311	rs12562034	0.0776398	0.409876	-1.00578	0.00191858
3	1	967643	rs2710875	0.324074	4.07625e-7	-0.648856	1.80505e-5
4	1	1168108	rs11260566	0.191589	0.128568	-0.915723	5.87338e-6
5	1	1375074	rs1312568	0.441358	2.5376e-19	-0.331814	0.00808102
6	1	1588771	rs35154105	0.0	1.0	0.0	1.0
7	1	1789051	rs16824508	0.00462963	0.933278	-0.733803	0.516903
8	1	1990452	rs2678939	0.453704	5.07696e-11	-0.135865	0.299464
9	1	2194615	rs7553178	0.226852	0.170561	-0.251208	0.161511
10	1	2396747	rs13376356	0.14486	0.905308	0.129461	0.538734

Output of CSV.read("score.pval.txt"):

	chr	pos	snpid	maf	hwepval	pval
	Int64	Int64	String	Float64	Float64	Float64
1	1	554484	rs10458597	0.0	1.0	1.0
2	1	758311	rs12562034	0.0776398	0.409876	0.00456531
3	1	967643	rs2710875	0.324074	4.07625e-7	3.10828e-5
4	1	1168108	rs11260566	0.191589	0.128568	1.21687e-5
5	1	1375074	rs1312568	0.441358	2.5376e-19	0.00820686
6	1	1588771	rs35154105	0.0	1.0	1.0
7	1	1789051	rs16824508	0.00462963	0.933278	0.511198
8	1	1990452	rs2678939	0.453704	5.07696e-11	0.299728
9	1	2194615	rs7553178	0.226852	0.170561	0.171333
10	1	2396747	rs13376356	0.14486	0.905308	0.532042
11	1	2623158	rs28753913	0.0	1.0	1.0
12	1	2823603	rs1563468	0.483025	4.23066e-9	0.225191
13	1	3025087	rs6690373	0.25387	9.23864e-8	0.701847
14	1	3225416	rs12043519	0.029321	0.000778802	0.00107978
15	1	3431124	rs12093117	0.109907	0.232693	0.427794
16	1	3633945	rs10910017	0.221875	0.000270888	0.913129
17	1	3895935	rs34770924	0.0246914	3.20726e-5	0.999021
18	1	4096895	rs6702633	0.475232	0.663812	0.00651631
19	1	4297388	rs684965	0.305556	0.213633	0.0951953
20	1	4498133	rs11809295	0.0993789	0.257162	0.0832435
21	1	4698713	rs578528	0.324074	0.616937	0.0692307
22	1	4899946	rs4654471	0.358025	0.181265	0.224531
23	1	5100369	rs6681148	0.131579	0.00845447	0.155667
24	1	5302730	rs10799197	0.428793	0.294742	0.669055
25	1	5502779	rs10796400	0.231481	0.46091	0.241525
26	1	5703284	rs2244632	0.394737	0.313378	0.53453
27	1	5904631	rs7549324	0.367284	0.517567	0.716133
28	1	6106513	rs2843494	0.0648148	0.740663	0.365573
29	1	6310159	rs4908880	0.0555556	0.28968	0.579307
30	1	6514524	rs932112	0.220679	0.172669	0.303095
⋮	⋮	⋮	⋮	⋮	⋮	⋮