Mendel - Iterative Hard Thresholding

A modern approach to analyzing data from genome-wide association studies (GWAS)

Package Features

  • Built-in support for PLINK binary files via SnpArrays.jl, VCF files via VCFTools.jl, and BGEN files via BGEN.jl.
  • Out-of-the-box parallel computing routines for q-fold cross-validation.
  • Fits a variety of generalized linear models with any choice of link function.
  • Can run multivariate GWAS if given multiple continuous phenotypes.
  • Outputs proportion of phenotypic variance explained (PVE) by genetic predictors.
  • Outputs estimated covariance matrix between phenotypes (when running multivariate IHT).
  • Computation directly on raw genotype files.
  • Efficient handling of non-genetic covariates.
  • Optional acceleration (debiasing) step that can substantially improve speed.
  • Ability to explicitly incorporate weights for predictors.
  • Ability to enforce within-group and between-group sparsity.
  • Estimates the nuisance parameter for negative binomial regression using Newton's method or an MM algorithm.
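
To make the name of the method concrete, here is a minimal sketch of iterative hard thresholding for ordinary least squares: take a gradient step, then keep only the k largest coefficients in magnitude. It is written in plain Python for illustration and is not MendelIHT's implementation. The function names (`hard_threshold`, `iht_least_squares`), the fixed step size, and the toy data are all assumptions made for this example.

```python
def hard_threshold(beta, k):
    """Keep the k largest-magnitude entries of beta; set the rest to zero."""
    order = sorted(range(len(beta)), key=lambda j: abs(beta[j]), reverse=True)
    keep = set(order[:k])
    return [b if j in keep else 0.0 for j, b in enumerate(beta)]


def iht_least_squares(X, y, k, step, iters=50):
    """Projected gradient descent for least squares with at most k nonzero coefficients."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        # Residuals r = y - X beta
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        # Descent direction X^T r (negative gradient of 0.5 * ||y - X beta||^2)
        direction = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        # Gradient step followed by projection onto the set of k-sparse vectors
        beta = hard_threshold([beta[j] + step * direction[j] for j in range(p)], k)
    return beta


# Toy design with orthogonal columns (X^T X = 4 I), so a fixed step of 1/4 is safe.
X = [[1, 1, 1, 1],
     [1, -1, 1, -1],
     [1, 1, -1, -1],
     [1, -1, -1, 1]]
# Response generated from coefficients (2, 0, -3, 0) plus small noise.
y = [-0.9, -1.1, 5.2, 4.8]

beta = iht_least_squares(X, y, k=2, step=0.25)
print([round(b, 6) for b in beta])  # the two retained coefficients are close to 2 and -3
```

A fixed step is only reasonable here because the toy design is orthogonal. For a general design matrix the step would have to be chosen with more care, for example bounded by the reciprocal of the largest eigenvalue of X^T X.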

Read our paper for more detail.

MendelIHT borrows the distribution and link function implementations from GLM.jl and Distributions.jl.

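| Distribution    | Canonical Link    | Status       |
|:----------------|:------------------|:-------------|
| Normal          | IdentityLink      | $\checkmark$ |
| MvNormal        | IdentityLink      | $\checkmark$ |
| Bernoulli       | LogitLink         | $\checkmark$ |
| Poisson         | LogLink           | $\checkmark$ |
| NegativeBinomial| LogLink           | $\checkmark$ |
| Gamma           | InverseLink       | experimental |
| InverseGaussian | InverseSquareLink | experimental |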

Examples of these distributions with their default parameter values are visualized in this post.

Available link functions are:

  • CauchitLink
  • CloglogLink
  • IdentityLink
  • InverseLink
  • InverseSquareLink
  • LogitLink
  • LogLink
  • ProbitLink
  • SqrtLink

Manual Outline