VCFTools.jl

VCFTools.jl provides Julia utilities for handling VCF files.

In [1]:
# display Julia version info
versioninfo()
Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.6.0)
  CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = code

Example VCF file

The current folder contains an example VCF file for demonstration.

In [2]:
;ls -l test.vcf.gz
-rw-r--r--  1 huazhou  staff  876514 Feb 18 23:32 test.vcf.gz

Load the VCF file and display the first 35 lines

In [3]:
using VCFTools

fh = openvcf("test.vcf.gz", "r")
for l in 1:35
    println(readline(fh))
end
close(fh)
##fileformat=VCFv4.1
##INFO=<ID=LDAF,Number=1,Type=Float,Description="MLE Allele Frequency Accounting for LD">
##INFO=<ID=AVGPOST,Number=1,Type=Float,Description="Average posterior probability from MaCH/Thunder">
##INFO=<ID=RSQ,Number=1,Type=Float,Description="Genotype imputation quality from MaCH/Thunder">
##INFO=<ID=ERATE,Number=1,Type=Float,Description="Per-marker Mutation rate from MaCH/Thunder">
##INFO=<ID=THETA,Number=1,Type=Float,Description="Per-marker Transition rate from MaCH/Thunder">
##INFO=<ID=CIEND,Number=2,Type=Integer,Description="Confidence interval around END for imprecise variants">
##INFO=<ID=CIPOS,Number=2,Type=Integer,Description="Confidence interval around POS for imprecise variants">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##INFO=<ID=HOMLEN,Number=.,Type=Integer,Description="Length of base pair identical micro-homology at event breakpoints">
##INFO=<ID=HOMSEQ,Number=.,Type=String,Description="Sequence of base pair identical micro-homology at event breakpoints">
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="Difference in length between REF and ALT alleles">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=AC,Number=.,Type=Integer,Description="Alternate Allele Count">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total Allele Count">
##ALT=<ID=DEL,Description="Deletion">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DS,Number=1,Type=Float,Description="Genotype dosage from MaCH/Thunder">
##FORMAT=<ID=GL,Number=.,Type=Float,Description="Genotype Likelihoods">
##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele, ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/pilot_data/technical/reference/ancestral_alignments/README">
##INFO=<ID=AF,Number=1,Type=Float,Description="Global Allele Frequency based on AC/AN">
##INFO=<ID=AMR_AF,Number=1,Type=Float,Description="Allele Frequency for samples from AMR based on AC/AN">
##INFO=<ID=ASN_AF,Number=1,Type=Float,Description="Allele Frequency for samples from ASN based on AC/AN">
##INFO=<ID=AFR_AF,Number=1,Type=Float,Description="Allele Frequency for samples from AFR based on AC/AN">
##INFO=<ID=EUR_AF,Number=1,Type=Float,Description="Allele Frequency for samples from EUR based on AC/AN">
##INFO=<ID=VT,Number=1,Type=String,Description="indicates what type of variant the line represents">
##INFO=<ID=SNPSOURCE,Number=.,Type=String,Description="indicates if a snp was called when analysing the low coverage or exome alignment data">
##reference=GRCh37
##reference=GRCh37
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	HG00096	HG00097	HG00099	HG00100	HG00101	HG00102	HG00103	HG00104	HG00106	HG00108	HG00109	HG00110	HG00111	HG00112	HG00113	HG00114	HG00116	HG00117	HG00118	HG00119	HG00120	HG00121	HG00122	HG00123	HG00124	HG00125	HG00126	HG00127	HG00128	HG00129	HG00130	HG00131	HG00133	HG00134	HG00135	HG00136	HG00137	HG00138	HG00139	HG00140	HG00141	HG00142	HG00143	HG00146	HG00148	HG00149	HG00150	HG00151	HG00152	HG00154	HG00155	HG00156	HG00158	HG00159	HG00160	HG00171	HG00173	HG00174	HG00176	HG00177	HG00178	HG00179	HG00180	HG00182	HG00183	HG00185	HG00186	HG00187	HG00188	HG00189	HG00190	HG00231	HG00232	HG00233	HG00234	HG00235	HG00236	HG00237	HG00238	HG00239	HG00240	HG00242	HG00243	HG00244	HG00245	HG00246	HG00247	HG00249	HG00250	HG00251	HG00252	HG00253	HG00254	HG00255	HG00256	HG00257	HG00258	HG00259	HG00260	HG00261	HG00262	HG00263	HG00264	HG00265	HG00266	HG00267	HG00268	HG00269	HG00270	HG00271	HG00272	HG00273	HG00274	HG00275	HG00276	HG00277	HG00278	HG00280	HG00281	HG00282	HG00284	HG00285	HG00306	HG00309	HG00310	HG00311	HG00312	HG00313	HG00315	HG00318	HG00319	HG00320	HG00321	HG00323	HG00324	HG00325	HG00326	HG00327	HG00328	HG00329	HG00330	HG00331	HG00332	HG00334	HG00335	HG00336	HG00337	HG00338	HG00339	HG00341	HG00342	HG00343	HG00344	HG00345	HG00346	HG00349	HG00350	HG00351	HG00353	HG00355	HG00356	HG00357	HG00358	HG00359	HG00360	HG00361	HG00362	HG00364	HG00366	HG00367	HG00369	HG00372	HG00373	HG00375	HG00376	HG00377	HG00378	HG00381	HG00382	HG00383	HG00384	HG00403	HG00404	HG00406	HG00407	HG00418	HG00419	HG00421	HG00422	HG00427	HG00428
22	20000086	rs138720731	T	C	100	PASS	AC=7;RSQ=0.8454;AVGPOST=0.9983;AA=T;AN=2184;LDAF=0.0040;THETA=0.0001;VT=SNP;SNPSOURCE=LOWCOV;ERATE=0.0003;AF=0.0032;AFR_AF=0.01	GT:DS:GL	0/0:0.000:-0.03,-1.19,-5.00	0/0:0.000:-0.04,-1.05,-5.00	0/0:0.000:-0.07,-0.85,-5.00	0/0:0.000:-0.03,-1.18,-5.00	0/0:0.000:-0.06,-0.87,-5.00	0/0:0.000:-0.03,-1.14,-5.00	0/0:0.000:-0.06,-0.90,-5.00	0/0:0.000:-0.23,-0.45,-1.28	0/0:0.000:-0.03,-1.20,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.11,-0.65,-4.40	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.06,-0.91,-5.00	0/0:0.000:-0.18,-0.47,-2.54	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.06,-0.90,-5.00	0/0:0.000:-0.01,-1.74,-5.00	0/0:0.000:-0.00,-3.66,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.00,-2.53,-5.00	0/0:0.000:-0.09,-0.73,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.00,-3.11,-5.00	0/0:0.000:-0.06,-0.89,-5.00	0/0:0.000:-0.09,-0.71,-4.10	0/0:0.000:-0.11,-0.65,-4.40	0/0:0.000:-0.18,-0.47,-2.34	0/0:0.000:-0.22,-0.45,-1.32	0/0:0.000:-0.02,-1.29,-5.00	0/0:0.000:-0.03,-1.15,-5.00	0/0:0.000:-0.02,-1.45,-5.00	0/0:0.000:-0.00,-3.34,-5.00	0/0:0.000:-0.12,-0.61,-3.19	0/0:0.000:-0.11,-0.67,-4.40	0/0:0.000:-0.05,-0.99,-5.00	0/0:0.000:-0.18,-0.48,-2.15	0/0:0.000:-0.01,-1.47,-5.00	0/0:0.000:-0.10,-0.67,-3.62	0/0:0.000:-0.03,-1.14,-5.00	0/0:0.000:-0.09,-0.73,-4.40	0/0:0.000:-0.07,-0.84,-4.40	0/0:0.000:-0.18,-0.48,-2.46	0/0:0.000:-0.0292813,-1.18575,-5	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.01,-1.67,-5.00	0/0:0.000:-0.18,-0.47,-2.40	0/0:0.000:-0.03,-1.25,-5.00	0/0:0.000:-0.11,-0.66,-3.44	0/0:0.000:-0.09,-0.73,-4.70	0/0:0.000:-0.0418663,-1.03687,-4.39794	0/0:0.000:-0.08,-0.79,-3.14	0/0:0.000:-0.00,-2.30,-5.00	0/0:0.000:-0.00,-2.54,-5.00	0/0:0.000:-0.03,-1.21,-5.00	0/0:0.000:-0.06,-0.86,-5.00	0/0:0.000:-0.09,-0.71,-4.70	0/0:0.000:-0.01,-1.49,-5.00	0/0:0.000:-0.01,-1.88,-5.00	0/0:0.000:-0.09,-0.71,-4.70	0/0:0.000:-0.03,-1.19,-5.00	0/0:0.000:-0.10,-0.67,-4.40	0/0:0.000:-0.01,-1.51,-5.00	0/0:0.000:-0.02,-1.40,-5.00	
0/0:0.000:-0.03,-1.17,-5.00	0/0:0.000:-0.05,-0.93,-5.00	0/0:0.000:-0.01,-1.48,-5.00	0/0:0.000:-0.00,-2.02,-5.00	0/0:0.000:-0.03,-1.18,-5.00	0/0:0.000:-0.02,-1.46,-5.00	0/0:0.000:-0.03,-1.17,-5.00	0/0:0.050:-0.18,-0.47,-2.73	0/0:0.000:-0.17,-0.49,-2.97	0/0:0.000:-0.10,-0.68,-4.40	0/0:0.000:-0.05,-0.99,-5.00	0/0:0.000:-0.12,-0.62,-3.38	0/0:0.000:-0.00,-2.06,-5.00	0/0:0.000:-0.16,-0.51,-2.66	0/0:0.000:-0.11,-0.64,-4.22	0/0:0.000:-0.03,-1.22,-5.00	0/0:0.000:-0.01,-1.64,-5.00	0/0:0.000:-0.00,-2.85,-5.00	0/0:0.000:-0.02,-1.38,-5.00	0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.0311436,-1.15989,-5	0/0:0.000:-0.36,-0.42,-0.73	0/0:0.000:-0.01,-1.88,-5.00	0/0:0.000:-0.05,-0.92,-5.00	0/0:0.000:-0.03,-1.16,-5.00	0/0:0.000:-0.04,-1.04,-5.00	0/0:0.000:-0.13,-0.59,-5.00	0/0:0.000:-0.02,-1.36,-5.00	0/0:0.000:-0.16,-0.51,-2.36	0/0:0.000:-0.02,-1.31,-5.00	0/0:0.000:-0.03,-1.19,-5.00	0/0:0.000:-0.03,-1.17,-5.00	0/0:0.000:-0.00,-4.40,-5.00	0/0:0.000:-0.03,-1.16,-5.00	0/0:0.000:-0.09,-0.73,-3.70	0/0:0.000:-0.19,-0.47,-1.77	0/0:0.000:-0.00,-3.32,-5.00	0/0:0.000:-0.17,-0.51,-2.00	0/0:0.000:-0.00,-2.17,-5.00	0/0:0.000:-0.00,-2.91,-5.00	0/0:0.000:-0.10,-0.71,-4.10	0/0:0.000:-0.03,-1.12,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.01,-1.67,-5.00	0/0:0.000:-0.00,-2.09,-5.00	0/0:0.000:-0.04,-1.09,-5.00	0/0:0.000:-0.01,-1.48,-5.00	0/0:0.000:-0.02,-1.41,-5.00	0/0:0.000:-0.10,-0.69,-3.80	0/0:0.000:-0.01,-1.54,-5.00	0/0:0.000:-0.03,-1.16,-5.00	0/0:0.000:-0.09,-0.73,-4.70	0/0:0.000:-0.09,-0.74,-4.70	0/0:0.000:-0.06,-0.92,-5.00	0/0:0.000:-0.05,-0.97,-5.00	0/0:0.000:-0.08,-0.78,-5.00	0/0:0.000:-0.06,-0.92,-5.00	0/0:0.000:-0.10,-0.67,-4.40	0/0:0.000:-0.01,-1.71,-5.00	0/0:0.000:-0.03,-1.20,-5.00	0/0:0.000:-0.02,-1.26,-5.00	0/0:0.000:-0.04,-1.10,-5.00	0/0:0.000:-0.02,-1.27,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.01,-1.47,-5.00	0/0:0.000:-0.00,-2.00,-5.00	0/0:0.000:-0.10,-0.67,-4.22	0/0:0.050:-0.18,-0.47,-2.34	0/0:0.000:-0.05,-1.00,-5.00	0/0:0.000:-0.11,-0.65,-3.85	0/0:0.000:-0.10,-0.68,-4.70	
0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.08,-0.76,-5.00	0/0:0.000:-0.19,-0.47,-2.14	0/0:0.000:-0.00,-1.99,-5.00	0/0:0.000:-0.18,-0.47,-2.46	0/0:0.000:-0.09,-0.74,-4.40	0/0:0.450:-0.05,-0.94,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.10,-0.69,-4.70	0/0:0.000:-0.01,-1.50,-5.00	0/0:0.000:-0.18,-0.47,-2.34	0/0:0.000:-0.03,-1.17,-5.00	0/0:0.000:-0.06,-0.88,-5.00	0/0:0.000:-0.02,-1.41,-5.00	0/0:0.000:-0.06,-0.88,-5.00	0/0:0.000:-0.18,-0.47,-1.95	0/0:0.000:-0.19,-0.46,-2.17	0/0:0.000:-0.03,-1.13,-5.00	0/0:0.000:-0.03,-1.18,-5.00	0/0:0.000:-0.18,-0.48,-2.23	0/0:0.000:-0.23,-0.45,-1.31	0/0:0.000:-0.11,-0.64,-3.92	0/0:0.000:-0.03,-1.17,-5.00	0/0:0.000:-0.03,-1.22,-5.00	0/0:0.000:-0.11,-0.66,-4.22	0/0:0.000:-0.12,-0.61,-2.38	0/0:0.000:-0.03,-1.22,-5.00	0/0:0.000:-0.40,-0.45,-0.60	0/0:0.000:-0.00,-2.98,-5.00	0/0:0.000:-0.13,-0.59,-2.09	0/0:0.000:-0.02,-1.37,-5.00	0/0:0.000:-0.477139,-0.477113,-0.477113	0/0:0.000:-0.04,-1.10,-5.00	0/0:0.000:-0.03,-1.23,-5.00	0/0:0.000:-0.01,-1.51,-5.00	0/0:0.000:-0.01,-1.67,-5.00	0/0:0.000:-0.08,-0.75,-4.40	0/0:0.000:-0.03,-1.23,-5.00	0/0:0.000:-0.10,-0.69,-4.40	0/0:0.000:-0.12,-0.63,-3.92	0/0:0.000:-0.01,-1.74,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.19,-0.46,-2.60	0/0:0.000:-0.19,-0.46,-2.62	0/0:0.000:-0.11,-0.65,-4.70	0/0:0.000:-0.11,-0.66,-4.70	0/0:0.050:-0.18,-0.49,-2.04	0/0:0.050:-0.10,-0.67,-4.40	0/0:0.000:-0.01,-1.62,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.23,-0.41,-1.51	0/0:0.000:-0.18,-0.48,-2.18	0/0:0.000:-0.03,-1.17,-5.00	0/0:0.000:-0.10,-0.68,-4.10	0/0:0.000:-0.03,-1.24,-5.00	0/0:0.000:-0.18,-0.48,-2.14
22	20000146	rs73387790	G	A	100	PASS	LDAF=0.0169;RSQ=0.9482;THETA=0.0004;AA=G;AN=2184;AVGPOST=0.9972;VT=SNP;SNPSOURCE=LOWCOV;AC=36;ERATE=0.0003;AF=0.02;AFR_AF=0.07;EUR_AF=0.0013	GT:DS:GL	0/0:0.000:-0.00,-2.68,-5.00	0/0:0.000:-0.07,-0.82,-5.00	0/0:0.000:-0.13,-0.60,-3.05	0/0:0.000:-0.03,-1.24,-5.00	0/0:0.000:-0.18,-0.47,-3.08	0/0:0.000:-0.06,-0.90,-5.00	0/0:0.000:-0.01,-1.48,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.16,-0.51,-2.63	0/0:0.000:-0.01,-1.76,-5.00	0/0:0.000:-0.10,-0.67,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.11,-0.66,-4.40	0/0:0.000:-0.10,-0.69,-4.70	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.18,-0.47,-2.30	0/0:0.000:-0.00,-2.80,-5.00	0/0:0.000:-0.00,-2.02,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.01,-1.49,-5.00	0/0:0.000:-0.01,-1.76,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.02,-1.40,-5.00	0/0:0.000:-0.00,-2.09,-5.00	0/0:0.000:-0.10,-0.70,-4.10	0/0:0.000:-0.22,-0.46,-1.27	0/0:0.000:-0.18,-0.48,-2.39	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.19,-0.47,-2.27	0/0:0.000:-0.07,-0.85,-5.00	0/0:0.000:-0.00,-2.53,-5.00	0/0:0.000:-0.00,-2.83,-5.00	0/0:0.000:-0.22,-0.46,-1.24	0/0:0.000:-0.19,-0.46,-2.27	0/0:0.000:-0.10,-0.68,-4.40	0/0:0.000:-0.09,-0.73,-4.22	0/0:0.000:-0.03,-1.19,-5.00	0/0:0.000:-0.15,-0.55,-2.64	0/0:0.000:-0.05,-0.97,-5.00	0/0:0.000:-0.08,-0.76,-4.70	0/0:0.000:-0.01,-1.49,-5.00	0/0:0.000:-0.06,-0.86,-5.00	0/0:0.000:-0.029681,-1.18006,-5	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.18,-0.48,-2.33	0/0:0.000:-0.10,-0.70,-4.70	0/0:0.000:-0.00,-2.04,-5.00	0/0:0.000:-0.03,-1.24,-5.00	0/0:0.000:-0.10,-0.69,-4.40	0/0:0.000:-0.0843997,-0.755377,-3.00877	0/0:0.000:-0.21,-0.47,-1.33	0/0:0.000:-0.00,-3.22,-5.00	0/0:0.000:-0.00,-3.70,-5.00	0/0:0.000:-0.05,-0.96,-5.00	0/0:0.000:-0.03,-1.21,-5.00	0/0:0.000:-0.01,-1.47,-5.00	0/0:0.000:-0.01,-1.86,-5.00	0/0:0.000:-0.05,-0.97,-5.00	0/0:0.000:-0.06,-0.90,-5.00	0/0:0.000:-0.01,-1.51,-5.00	0/0:0.000:-0.03,-1.25,-5.00	0/0:0.000:-0.01,-1.92,-5.00	0/0:0.000:-0.00,-2.58,-5.00	
0/0:0.000:-0.06,-0.89,-5.00	0/0:0.000:-0.05,-1.00,-5.00	0/0:0.000:-0.05,-0.95,-5.00	0/0:0.000:-0.01,-1.72,-5.00	0/0:0.000:-0.00,-2.25,-5.00	0/0:0.000:-0.02,-1.43,-5.00	0/0:0.000:-0.18,-0.48,-2.03	0/0:0.000:-0.18,-0.47,-2.72	0/0:0.000:-0.09,-0.72,-4.70	0/0:0.000:-0.18,-0.47,-2.42	0/0:0.000:-0.19,-0.46,-2.20	0/0:0.000:-0.24,-0.44,-1.19	0/0:0.000:-0.01,-1.75,-5.00	0/0:0.000:-0.01,-1.48,-5.00	0/0:0.000:-0.19,-0.46,-2.27	0/0:0.000:-0.05,-0.99,-5.00	0/0:0.000:-0.06,-0.87,-5.00	0/0:0.000:-0.00,-3.40,-5.00	0/0:0.000:-0.10,-0.68,-4.70	0/0:0.000:-0.01,-1.72,-5.00	0/0:0.000:-0.0148341,-1.47392,-5	0/0:0.000:-0.05,-0.95,-5.00	0/0:0.000:-0.03,-1.16,-5.00	0/0:0.000:-0.04,-1.03,-5.00	0/0:0.000:-0.02,-1.46,-5.00	0/0:0.000:-0.09,-0.73,-4.40	0/0:0.000:-0.02,-1.46,-5.00	0/0:0.000:-0.02,-1.26,-5.00	0/0:0.000:-0.03,-1.25,-5.00	0/0:0.000:-0.19,-0.45,-2.12	0/0:0.000:-0.01,-1.50,-5.00	0/0:0.000:-0.05,-0.96,-5.00	0/0:0.000:-0.00,-4.70,-5.00	0/0:0.000:-0.13,-0.58,-3.06	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.17,-0.51,-2.09	0/0:0.000:-0.00,-2.29,-5.00	0/0:0.000:-0.38,-0.43,-0.67	0/0:0.000:-0.00,-2.81,-5.00	0/0:0.000:-0.00,-3.25,-5.00	0/0:0.000:-0.22,-0.46,-1.26	0/0:0.000:-0.03,-1.17,-5.00	0/0:0.000:-0.03,-1.21,-5.00	0/0:0.000:-0.01,-1.50,-5.00	0/0:0.000:-0.01,-1.54,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.00,-2.04,-5.00	0/0:0.000:-0.05,-0.93,-5.00	0/0:0.000:-0.10,-0.68,-3.70	0/0:0.000:-0.09,-0.72,-5.00	0/0:0.000:-0.01,-1.77,-5.00	0/0:0.000:-0.05,-0.93,-5.00	0/0:0.000:-0.01,-1.51,-5.00	0/0:0.000:-0.16,-0.51,-2.74	0/0:0.000:-0.10,-0.69,-4.40	0/0:0.000:-0.18,-0.48,-2.26	0/0:0.000:-0.18,-0.48,-2.37	0/0:0.000:-0.18,-0.48,-2.27	0/0:0.000:-0.00,-2.58,-5.00	0/0:0.000:-0.05,-0.93,-5.00	0/0:0.000:-0.03,-1.19,-5.00	0/0:0.000:-0.00,-2.34,-5.00	0/0:0.000:-0.03,-1.23,-5.00	0/0:0.000:-0.10,-0.69,-4.40	0/0:0.000:-0.08,-0.78,-4.70	0/0:0.000:-0.05,-0.95,-5.00	0/0:0.000:-0.03,-1.22,-5.00	0/0:0.000:-0.18,-0.48,-2.36	0/0:0.000:-0.01,-1.53,-5.00	0/0:0.000:-0.18,-0.48,-2.25	0/0:0.000:-0.10,-0.68,-4.70	
0/0:0.000:-0.09,-0.73,-5.00	0/0:0.000:-0.02,-1.41,-5.00	0/0:0.000:-0.05,-0.93,-5.00	0/0:0.000:-0.03,-1.22,-5.00	0/0:0.000:-0.18,-0.47,-2.41	0/0:0.000:-0.09,-0.73,-4.40	0/0:0.000:-0.00,-2.00,-5.00	0/0:0.000:-0.19,-0.46,-2.19	0/0:0.000:-0.03,-1.20,-5.00	0/0:0.000:-0.05,-0.98,-5.00	0/0:0.000:-0.18,-0.47,-2.31	0/0:0.000:-0.09,-0.73,-4.10	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.10,-0.67,-3.66	0/0:0.000:-0.12,-0.63,-4.70	0/0:0.000:-0.16,-0.51,-2.28	0/0:0.000:-0.02,-1.46,-5.00	0/0:0.000:-0.01,-1.75,-5.00	0/0:0.000:-0.05,-0.95,-5.00	0/0:0.000:-0.10,-0.68,-4.22	0/0:0.000:-0.05,-0.95,-5.00	0/0:0.000:-0.12,-0.62,-3.08	0/0:0.000:-0.19,-0.45,-2.25	0/0:0.000:-0.01,-1.77,-5.00	0/0:0.000:-0.13,-0.60,-2.15	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.06,-0.92,-5.00	0/0:0.000:-0.04,-1.05,-5.00	0/0:0.000:-0.19,-0.46,-2.30	0/0:0.000:-0.19,-0.46,-2.28	0/0:0.000:-0.01,-1.48,-5.00	0/0:0.000:-0.190265,-0.457324,-2.2321	0/0:0.000:-0.23,-0.46,-1.21	0/0:0.000:-0.01,-1.50,-5.00	0/0:0.000:-0.00,-2.44,-5.00	0/0:0.000:-0.01,-1.73,-5.00	0/0:0.000:-0.01,-1.53,-5.00	0/0:0.000:-0.05,-0.96,-5.00	0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.19,-0.46,-2.62	0/0:0.000:-0.00,-2.25,-5.00	0/0:0.000:-0.19,-0.46,-2.23	0/0:0.000:-0.11,-0.65,-4.70	0/0:0.000:-0.18,-0.46,-2.68	0/0:0.000:-0.11,-0.66,-4.40	0/0:0.000:-0.19,-0.46,-2.43	0/0:0.000:-0.18,-0.49,-2.00	0/0:0.000:-0.06,-0.92,-5.00	0/0:0.000:-0.20,-0.45,-2.05	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.03,-1.18,-5.00	0/0:0.000:-0.10,-0.68,-4.00	0/0:0.000:-0.03,-1.20,-5.00	0/0:0.000:-0.10,-0.69,-4.22	0/0:0.000:-0.11,-0.65,-4.40	0/0:0.000:-0.11,-0.67,-3.85
22	20000199	rs183293480	A	C	100	PASS	LDAF=0.0009;THETA=0.0004;AN=2184;AVGPOST=0.9990;VT=SNP;AA=A;RSQ=0.6274;SNPSOURCE=LOWCOV;AC=1;ERATE=0.0003;AF=0.0005;EUR_AF=0.0013	GT:DS:GL	0/0:0.000:-0.00,-2.04,-5.00	0/0:0.000:-0.07,-0.82,-3.47	0/0:0.000:-0.07,-0.83,-5.00	0/0:0.000:-0.03,-1.12,-5.00	0/0:0.000:-0.11,-0.64,-4.10	0/0:0.000:-0.12,-0.62,-3.85	0/0:0.000:-0.01,-1.47,-5.00	0/0:0.000:-0.01,-1.54,-5.00	0/0:0.000:-0.10,-0.70,-4.70	0/0:0.000:-0.03,-1.18,-5.00	0/0:0.000:-0.16,-0.50,-3.30	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.03,-1.20,-5.00	0/0:0.000:-0.10,-0.70,-5.00	0/0:0.000:-0.19,-0.46,-2.46	0/0:0.000:-0.16,-0.51,-2.67	0/0:0.000:-0.00,-2.57,-5.00	0/0:0.000:-0.00,-2.85,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.00,-2.55,-5.00	0/0:0.000:-0.00,-2.02,-5.00	0/0:0.050:-0.48,-0.48,-0.48	0/0:0.000:-0.23,-0.46,-1.24	0/0:0.000:-0.02,-1.45,-5.00	0/0:0.000:-0.10,-0.68,-4.40	0/0:0.000:-0.06,-0.88,-5.00	0/0:0.000:-0.13,-0.61,-2.23	0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.26,-0.43,-1.11	0/0:0.000:-0.04,-1.01,-5.00	0/0:0.000:-0.12,-0.62,-3.28	0/0:0.000:-0.00,-2.27,-5.00	0/0:0.000:-0.00,-2.42,-5.00	0/0:0.000:-0.04,-1.08,-5.00	0/0:0.000:-0.06,-0.88,-5.00	0/0:0.000:-0.04,-1.03,-5.00	0/0:0.000:-0.10,-0.70,-4.10	0/0:0.000:-0.18,-0.48,-2.22	0/0:0.000:-0.02,-1.41,-5.00	0/0:0.000:-0.03,-1.19,-5.00	0/0:0.000:-0.14,-0.57,-2.12	0/0:0.000:-0.10,-0.69,-4.70	0/0:0.000:-0.12,-0.62,-2.28	0/0:0.000:-0.0149779,-1.4698,-5	0/0:0.000:-0.10,-0.70,-4.70	0/0:0.000:-0.06,-0.89,-5.00	0/0:0.000:-0.39,-0.23,-2.25	0/0:0.000:-0.00,-3.40,-5.00	0/0:0.000:-0.07,-0.86,-4.40	0/0:0.000:-0.18,-0.48,-2.35	0/0:0.000:-0.00967891,-1.65679,-5	0/0:0.000:-0.22,-0.46,-1.24	0/0:0.000:-0.00,-2.27,-5.00	0/0:0.000:-0.00,-3.12,-5.00	0/0:0.000:-0.01,-1.73,-5.00	0/0:0.000:-0.02,-1.39,-5.00	0/0:0.000:-0.10,-0.70,-4.70	0/0:0.000:-0.03,-1.23,-5.00	0/0:0.000:-0.09,-0.72,-4.10	0/0:0.000:-0.10,-0.71,-4.70	0/0:0.000:-0.02,-1.45,-5.00	0/0:0.000:-0.04,-1.11,-5.00	0/0:0.000:-0.03,-1.14,-5.00	0/0:0.000:-0.00,-2.42,-5.00	
0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.05,-0.95,-5.00	0/0:0.000:-0.11,-0.65,-3.85	0/0:0.000:-0.05,-0.95,-5.00	0/0:0.000:-0.01,-1.72,-5.00	0/0:0.000:-0.00,-2.08,-5.00	0/0:0.000:-0.03,-1.20,-5.00	0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.16,-0.52,-2.11	0/0:0.000:-0.10,-0.69,-4.70	0/0:0.000:-0.44,-0.46,-0.54	0/0:0.000:-0.07,-0.83,-5.00	0/0:0.000:-0.16,-0.51,-2.42	0/0:0.000:-0.18,-0.48,-2.30	0/0:0.000:-0.19,-0.46,-2.27	0/0:0.000:-0.06,-0.89,-5.00	0/0:0.000:-0.06,-0.88,-5.00	0/0:0.000:-0.00,-3.15,-5.00	0/0:0.000:-0.12,-0.61,-4.10	0/0:0.000:-0.06,-0.91,-5.00	0/0:0.000:-0.00935928,-1.67121,-5	0/0:0.000:-0.18,-0.48,-2.13	0/0:0.000:-0.07,-0.85,-5.00	0/0:0.000:-0.03,-1.19,-5.00	0/0:0.000:-0.03,-1.15,-5.00	0/0:0.000:-0.09,-0.71,-4.70	0/0:0.000:-0.19,-0.46,-2.52	0/0:0.000:-0.03,-1.20,-5.00	0/0:0.000:-0.09,-0.73,-4.10	0/0:0.000:-0.11,-0.64,-4.40	0/0:0.000:-0.03,-1.21,-5.00	0/0:0.000:-0.19,-0.47,-1.83	0/0:0.000:-0.00,-2.76,-5.00	0/0:0.000:-0.09,-0.73,-3.92	0/0:0.000:-0.15,-0.55,-2.43	0/0:0.000:-0.18,-0.48,-1.89	0/0:0.000:-0.00,-3.18,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.02,-1.43,-5.00	0/0:0.000:-0.22,-0.40,-5.00	0/0:0.000:-0.07,-0.84,-4.70	0/0:0.000:-0.03,-1.22,-5.00	0/0:0.000:-0.10,-0.69,-3.85	0/0:0.000:-0.01,-1.84,-5.00	0/0:0.000:-0.00,-3.07,-5.00	0/0:0.000:-0.10,-0.69,-3.85	0/0:0.000:-0.06,-0.91,-5.00	0/0:0.000:-0.10,-0.68,-4.70	0/0:0.000:-0.10,-0.69,-3.80	0/0:0.000:-0.48,-0.18,-4.10	0/0:0.000:-0.02,-1.47,-5.00	0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.03,-1.22,-5.00	0/0:0.000:-0.18,-0.48,-2.34	0/0:0.000:-0.05,-0.95,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.18,-0.48,-2.37	0/0:0.000:-0.10,-0.70,-4.40	0/0:0.000:-0.00,-2.44,-5.00	0/0:0.000:-0.01,-1.78,-5.00	0/0:0.000:-0.00,-2.06,-5.00	0/0:0.000:-0.00,-2.28,-5.00	0/0:0.000:-0.06,-0.92,-5.00	0/0:0.000:-0.03,-1.24,-5.00	0/0:0.000:-0.01,-1.51,-5.00	0/0:0.000:-0.05,-0.95,-5.00	0/0:0.000:-0.02,-1.46,-5.00	0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.11,-0.64,-4.70	0/0:0.000:-0.19,-0.46,-2.54	0/0:0.000:-0.02,-1.38,-5.00	
0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.06,-0.92,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.02,-1.26,-5.00	0/0:0.000:-0.18,-0.47,-2.47	0/0:0.000:-0.16,-0.51,-2.43	0/0:0.000:-0.01,-1.73,-5.00	0/0:0.000:-0.19,-0.46,-2.79	0/0:0.000:-0.06,-0.91,-5.00	0/0:0.000:-0.03,-1.20,-5.00	0/0:0.000:-0.18,-0.48,-2.35	0/0:0.000:-0.03,-1.21,-5.00	0/0:0.000:-0.05,-0.93,-5.00	0/0:0.000:-0.03,-1.18,-5.00	0/0:0.000:-0.19,-0.46,-2.72	0/0:0.000:-0.02,-1.45,-5.00	0/0:0.000:-0.16,-0.51,-2.71	0/0:0.000:-0.01,-1.80,-5.00	0/0:0.000:-0.10,-0.69,-3.66	0/0:0.000:-0.10,-0.69,-4.22	0/0:0.000:-0.23,-0.44,-1.26	0/0:0.000:-0.19,-0.46,-2.12	0/0:0.000:-0.04,-1.07,-5.00	0/0:0.000:-0.03,-1.23,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.19,-0.46,-2.20	0/0:0.000:-0.05,-0.95,-5.00	0/0:0.000:-0.11,-0.65,-3.44	0/0:0.000:-0.05,-1.00,-5.00	0/0:0.000:-0.03,-1.17,-5.00	0/0:0.000:-0.01,-1.77,-5.00	0/0:0.000:-0.189687,-0.458221,-2.2426	0/0:0.000:-0.01,-1.79,-5.00	0/0:0.050:-0.24,-0.42,-1.39	0/0:0.000:-0.19,-0.45,-4.70	0/0:0.000:-0.00,-2.15,-5.00	0/0:0.000:-0.05,-0.97,-5.00	0/0:0.000:-0.18,-0.48,-2.23	0/0:0.000:-0.00,-2.11,-5.00	0/0:0.000:-0.11,-0.66,-4.70	0/0:0.000:-0.01,-1.59,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.18,-0.46,-2.63	0/0:0.000:-0.11,-0.66,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.19,-0.46,-2.43	0/0:0.000:-0.19,-0.47,-1.84	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.19,-0.46,-2.21	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.01,-1.82,-5.00	0/0:0.000:-0.05,-0.99,-5.00	0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.03,-1.22,-5.00	0/0:0.000:-0.00,-2.26,-5.00	0/0:0.000:-0.10,-0.69,-4.00
22	20000291	rs185807825	G	T	100	PASS	ERATE=0.0005;AVGPOST=0.9983;AA=G;AN=2184;LDAF=0.0015;VT=SNP;SNPSOURCE=LOWCOV;RSQ=0.5564;AC=2;THETA=0.0003;AF=0.0009;ASN_AF=0.0035	GT:DS:GL	0/0:0.000:-0.00,-2.06,-5.00	0/0:0.000:-0.07,-0.83,-5.00	0/0:0.000:-0.02,-1.27,-5.00	0/0:0.000:-0.01,-1.77,-5.00	0/0:0.000:-0.19,-0.45,-2.14	0/0:0.000:-0.11,-0.66,-5.00	0/0:0.000:-0.03,-1.17,-5.00	0/0:0.000:-0.02,-1.33,-5.00	0/0:0.000:-0.18,-0.48,-2.43	0/0:0.000:-0.05,-0.93,-5.00	0/0:0.000:-0.18,-0.46,-2.76	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.03,-1.15,-5.00	0/0:0.000:-0.07,-0.83,-4.40	0/0:0.000:-0.02,-1.33,-5.00	0/0:0.000:-0.10,-0.68,-4.40	0/0:0.000:-0.00,-2.28,-5.00	0/0:0.000:-0.01,-1.83,-5.00	0/0:0.000:-0.13,-0.61,-2.24	0/0:0.000:-0.00,-3.70,-5.00	0/0:0.000:-0.00,-2.21,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.22,-0.46,-1.25	0/0:0.000:-0.00,-2.81,-5.00	0/0:0.000:-0.02,-1.44,-5.00	0/0:0.000:-0.06,-0.92,-5.00	0/0:0.000:-0.08,-0.79,-3.21	0/0:0.000:-0.01,-1.72,-5.00	0/0:0.000:-0.21,-0.46,-1.42	0/0:0.000:-0.08,-0.79,-4.40	0/0:0.000:-0.04,-1.06,-5.00	0/0:0.000:-0.02,-1.42,-5.00	0/0:0.000:-0.00,-3.85,-5.00	0/0:0.000:-0.07,-0.83,-5.00	0/0:0.000:-0.01,-1.80,-5.00	0/0:0.000:-0.11,-0.66,-4.40	0/0:0.000:-0.01,-1.48,-5.00	0/0:0.000:-0.01,-1.47,-5.00	0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.19,-0.46,-2.54	0/0:0.000:-0.03,-1.22,-5.00	0/0:0.000:-0.05,-1.00,-5.00	0/0:0.000:-0.13,-0.60,-2.19	0/0:0.000:-0.0021071,-2.31515,-5	0/0:0.000:-0.10,-0.68,-4.40	0/0:0.000:-0.10,-0.68,-4.22	0/0:0.000:-0.05,-0.93,-5.00	0/0:0.000:-0.00,-3.38,-5.00	0/0:0.000:-0.06,-0.90,-5.00	0/0:0.000:-0.20,-0.45,-1.99	0/0:0.000:-0.0193877,-1.35992,-5	0/0:0.000:-0.02,-1.28,-5.00	0/0:0.000:-0.00,-2.46,-5.00	0/0:0.000:-0.00,-2.75,-5.00	0/0:0.000:-0.18,-0.47,-2.36	0/0:0.000:-0.10,-0.68,-4.00	0/0:0.000:-0.02,-1.34,-5.00	0/0:0.000:-0.00,-3.27,-5.00	0/0:0.250:-0.03,-1.19,-5.00	0/0:0.000:-0.07,-0.84,-5.00	0/0:0.000:-0.01,-1.52,-5.00	0/0:0.000:-0.11,-0.65,-3.40	0/0:0.000:-0.10,-0.70,-4.70	0/0:0.000:-0.03,-1.19,-5.00	
0/0:0.000:-0.03,-1.23,-5.00	0/0:0.000:-0.06,-0.92,-5.00	0/0:0.000:-0.00,-1.99,-5.00	0/0:0.000:-0.01,-1.61,-5.00	0/0:0.000:-0.01,-1.48,-5.00	0/0:0.000:-0.10,-0.69,-4.22	0/0:0.000:-0.01,-1.49,-5.00	0/0:0.000:-0.01,-1.59,-5.00	0/0:0.000:-0.10,-0.69,-3.36	0/0:0.000:-0.01,-1.48,-5.00	0/0:0.000:-0.08,-0.77,-4.10	0/0:0.000:-0.05,-1.00,-5.00	0/0:0.000:-0.10,-0.67,-3.80	0/0:0.000:-0.03,-1.19,-5.00	0/0:0.000:-0.06,-0.86,-5.00	0/0:0.000:-0.06,-0.92,-5.00	0/0:0.000:-0.11,-0.66,-4.70	0/0:0.000:-0.00,-2.68,-5.00	0/0:0.000:-0.05,-0.93,-5.00	0/0:0.000:-0.01,-1.84,-5.00	0/0:0.000:-0.0293556,-1.18469,-5	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.19,-0.46,-2.37	0/0:0.000:-0.10,-0.68,-4.10	0/0:0.000:-0.05,-0.97,-5.00	0/0:0.000:-0.03,-1.20,-5.00	0/0:0.000:-0.06,-0.91,-5.00	0/0:0.000:-0.18,-0.48,-2.06	0/0:0.000:-0.06,-0.87,-5.00	0/0:0.000:-0.03,-1.13,-5.00	0/0:0.000:-0.03,-1.24,-5.00	0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.00,-3.85,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.07,-0.83,-4.40	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.00,-2.82,-5.00	0/0:0.000:-0.08,-0.76,-3.80	0/0:0.000:-0.00,-3.47,-5.00	0/0:0.000:-0.03,-1.18,-5.00	0/0:0.000:-0.11,-0.65,-3.59	0/0:0.000:-0.03,-1.20,-5.00	0/0:0.000:-0.02,-1.43,-5.00	0/0:0.000:-0.00,-2.73,-5.00	0/0:0.000:-0.06,-0.86,-4.22	0/0:0.000:-0.14,-0.57,-2.84	0/0:0.000:-0.01,-1.50,-5.00	0/0:0.000:-0.02,-1.45,-5.00	0/0:0.000:-0.01,-1.76,-5.00	0/0:0.000:-0.02,-1.46,-5.00	0/0:0.000:-0.01,-1.79,-5.00	0/0:0.000:-0.06,-0.90,-5.00	0/0:0.000:-0.01,-1.87,-5.00	0/0:0.000:-0.01,-1.50,-5.00	0/0:0.000:-0.03,-1.21,-5.00	0/0:0.000:-0.08,-0.79,-5.00	0/0:0.000:-0.06,-0.91,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.00,-2.89,-5.00	0/0:0.000:-0.02,-1.46,-5.00	0/0:0.000:-0.20,-0.44,-2.07	0/0:0.000:-0.00,-2.65,-5.00	0/0:0.000:-0.02,-1.27,-5.00	0/0:0.000:-0.03,-1.21,-5.00	0/0:0.000:-0.01,-1.79,-5.00	0/0:0.000:-0.05,-0.95,-5.00	0/0:0.000:-0.10,-0.68,-4.70	0/0:0.000:-0.35,-0.41,-0.78	0/0:0.000:-0.09,-0.75,-3.70	0/0:0.000:-0.05,-0.98,-5.00	0/0:0.000:-0.05,-0.97,-5.00	
0/0:0.000:-0.05,-0.93,-5.00	0/0:0.000:-0.08,-0.77,-5.00	0/0:0.000:-0.04,-1.03,-5.00	0/0:0.000:-0.02,-1.47,-5.00	0/0:0.000:-0.03,-1.16,-5.00	0/0:0.000:-0.01,-1.57,-5.00	0/0:0.000:-0.06,-0.89,-5.00	0/0:0.000:-0.06,-0.90,-5.00	0/0:0.000:-0.00,-2.30,-5.00	0/0:0.000:-0.05,-0.98,-5.00	0/0:0.000:-0.10,-0.69,-4.40	0/0:0.000:-0.01,-1.50,-5.00	0/0:0.000:-0.06,-0.87,-5.00	0/0:0.000:-0.16,-0.51,-2.37	0/0:0.000:-0.19,-0.46,-2.23	0/0:0.050:-0.27,-0.34,-3.22	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.08,-0.77,-4.40	0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.05,-0.99,-5.00	0/0:0.000:-0.07,-0.82,-5.00	0/0:0.000:-0.12,-0.61,-2.91	0/0:0.000:-0.00,-1.98,-5.00	0/0:0.000:-0.03,-1.21,-5.00	0/0:0.000:-0.07,-0.82,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.01,-1.49,-5.00	0/0:0.000:-0.12,-0.61,-3.15	0/0:0.000:-0.00,-2.08,-5.00	0/0:0.000:-0.02,-1.33,-5.00	0/0:0.000:-0.01,-1.74,-5.00	0/0:0.050:-0.11227,-0.642561,-4.22185	0/0:0.000:-0.22,-0.46,-1.28	0/0:0.000:-0.10,-0.70,-4.10	0/0:0.000:-0.10,-0.67,-3.62	0/0:0.000:-0.03,-1.18,-5.00	0/0:0.000:-0.03,-1.24,-5.00	0/0:0.000:-0.10,-0.70,-4.10	0/0:0.000:-0.01,-1.80,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.02,-1.47,-5.00	0/0:0.000:-0.06,-0.88,-5.00	0/0:0.000:-0.03,-1.13,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.18,-0.46,-2.69	0/0:0.000:-0.03,-1.13,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.18,-0.48,-2.41	0/0:0.000:-0.18,-0.47,-2.85	0/0:0.050:-0.48,-0.48,-0.48	0/0:0.000:-0.08,-0.76,-4.70	0/0:0.000:-0.10,-0.69,-4.00	0/0:0.000:-0.11,-0.65,-3.62	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.01,-1.70,-5.00	0/0:0.000:-0.00,-2.57,-5.00
22	20000428	rs55902548	G	T	100	PASS	AC=323;AVGPOST=0.9983;AA=G;AN=2184;VT=SNP;RSQ=0.9949;LDAF=0.1473;SNPSOURCE=LOWCOV;ERATE=0.0003;THETA=0.0003;AF=0.15;ASN_AF=0.0017;AMR_AF=0.15;AFR_AF=0.31;EUR_AF=0.15	GT:DS:GL	1/0:1.000:-5.00,0.00,-5.00	0/0:0.000:-0.35,-0.43,-0.73	0/1:1.000:-1.81,-0.01,-2.95	0/0:0.000:-0.01,-1.79,-5.00	0/0:0.000:-0.06,-0.86,-5.00	1/0:1.000:-0.19,-0.46,-2.18	0/0:0.000:-0.10,-0.68,-5.00	0/1:1.000:-4.40,-0.03,-1.12	0/1:1.000:-5.00,-0.69,-0.10	0/0:0.000:-0.10,-0.69,-4.70	0/0:0.000:-0.48,-0.48,-0.48	0/1:1.000:-5.00,-0.01,-1.77	0/0:0.000:-0.18,-0.48,-2.57	0/0:0.000:-0.02,-1.31,-5.00	0/0:0.000:-0.11,-0.65,-4.70	0/0:0.000:-0.10,-0.68,-4.70	0/0:0.000:-0.01,-1.72,-5.00	1/0:1.000:-5.00,0.00,-5.00	1/0:1.000:-1.38,-0.02,-2.61	0/1:1.000:-5.00,-1.40,-0.02	0/0:0.000:-0.00,-2.97,-5.00	0/0:0.000:-0.19,-0.47,-2.15	0/0:0.000:-0.44,-0.46,-0.54	0/0:0.000:-0.00,-2.52,-5.00	0/0:0.000:-0.05,-0.93,-5.00	0/0:0.000:-0.01,-1.77,-5.00	0/1:0.750:-0.22,-0.46,-1.26	0/0:0.000:-0.03,-1.19,-5.00	0/0:0.000:-0.21,-0.46,-1.42	0/0:0.000:-0.19,-0.45,-2.20	0/0:0.000:-0.12,-0.63,-3.36	0/0:0.000:-0.00,-2.54,-5.00	0/0:0.000:-0.00,-2.88,-5.00	1/0:1.000:-2.10,-0.00,-4.00	1/0:1.250:-5.00,-1.62,-0.01	0/1:1.000:-3.06,-0.47,-0.18	0/1:1.000:-5.00,-0.87,-0.06	1/1:2.000:-5.00,-0.84,-0.07	0/0:0.000:-0.05,-0.95,-5.00	0/0:0.000:-0.02,-1.46,-5.00	0/0:0.000:-0.14,-0.58,-2.17	0/0:0.000:-0.01,-1.77,-5.00	0/0:0.000:-0.06,-0.91,-5.00	1/0:1.000:-5,0,-5	0/0:0.000:-0.05,-0.95,-5.00	1/0:1.000:-4.70,-0.70,-0.10	0/0:0.000:-0.09,-0.72,-4.70	0/0:0.000:-0.01,-1.54,-5.00	1/1:2.000:-5.00,-0.68,-0.10	0/0:0.000:-0.18,-0.48,-2.33	0/0:0.000:-0.00465443,-1.97224,-5	0/0:0.000:-0.48,-0.48,-0.48	1/0:1.000:-5.00,-0.93,-0.05	0/0:0.000:-0.00,-2.82,-5.00	1/1:2.000:-5.00,-1.67,-0.01	0/0:0.000:-0.05,-0.95,-5.00	1/0:1.000:-5.00,-0.87,-0.06	0/0:0.000:-0.01,-1.78,-5.00	0/0:0.000:-0.01,-1.83,-5.00	0/1:1.000:-5.00,-0.00,-3.70	0/1:0.950:-0.15,-0.53,-2.39	0/0:0.000:-0.09,-0.71,-4.70	0/0:0.000:-0.18,-0.48,-2.56	0/0:0.000:-0.00,-2.60,-5.00	
0/0:0.000:-0.01,-1.49,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.04,-1.09,-5.00	0/0:0.000:-0.00,-2.40,-5.00	0/0:0.000:-0.06,-0.92,-5.00	0/0:0.000:-0.01,-1.54,-5.00	0/0:0.000:-0.02,-1.45,-5.00	0/0:0.000:-0.01,-1.84,-5.00	0/1:1.000:-2.39,-0.30,-0.31	0/0:0.000:-0.03,-1.20,-5.00	0/0:0.000:-0.11,-0.64,-3.44	0/0:0.000:-0.02,-1.42,-5.00	0/0:0.000:-0.01,-1.76,-5.00	0/0:0.000:-0.03,-1.17,-5.00	0/1:1.000:-5.00,-0.03,-1.18	0/0:0.000:-0.03,-1.21,-5.00	0/0:0.000:-0.10,-0.70,-5.00	0/1:1.000:-5.00,0.00,-5.00	0/0:0.000:-0.01,-1.58,-5.00	0/1:1.000:-5.00,0.00,-5.00	1/0:1.000:-5,-5.21375e-05,-3.92082	0/0:0.000:-0.05,-0.92,-5.00	0/0:0.000:-0.02,-1.45,-5.00	0/1:1.000:-5.00,-0.00,-4.22	0/0:0.000:-0.09,-0.73,-4.22	0/0:0.000:-0.00,-2.03,-5.00	0/0:0.000:-0.03,-1.20,-5.00	0/0:0.000:-0.03,-1.19,-5.00	0/0:0.000:-0.18,-0.48,-2.03	0/0:0.000:-0.01,-1.59,-5.00	0/0:0.000:-0.37,-0.42,-0.71	0/0:0.000:-0.02,-1.45,-5.00	0/0:0.000:-0.00,-3.36,-5.00	1/0:1.000:-3.85,-0.05,-0.94	1/0:1.000:-5.00,-0.84,-0.07	1/0:1.000:-0.06,-0.90,-5.00	0/0:0.000:-0.00,-2.56,-5.00	0/0:0.000:-0.19,-0.47,-1.73	0/0:0.000:-0.06,-0.90,-5.00	0/0:0.000:-0.00,-3.74,-5.00	0/0:0.000:-0.22,-0.46,-1.26	1/0:1.000:-5.00,-0.00,-3.92	0/0:0.000:-0.10,-0.69,-3.85	0/0:0.000:-0.00,-2.76,-5.00	0/0:0.000:-0.01,-1.70,-5.00	0/0:0.000:-0.02,-1.46,-5.00	0/0:0.000:-0.06,-0.92,-5.00	0/0:0.000:-0.01,-1.73,-5.00	0/0:0.000:-0.05,-1.00,-5.00	1/0:1.000:-2.32,-0.01,-1.58	0/1:0.950:-0.09,-0.73,-4.70	0/1:1.000:-2.25,-0.02,-1.52	0/0:0.000:-0.01,-1.50,-5.00	0/0:0.000:-0.10,-0.70,-4.40	0/0:0.000:-0.05,-0.99,-5.00	0/0:0.000:-0.03,-1.19,-5.00	0/0:0.000:-0.03,-1.25,-5.00	0/0:0.000:-0.10,-0.67,-4.10	0/1:1.000:-5.00,-0.00,-4.22	0/0:0.000:-0.18,-0.48,-2.02	0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.00,-4.70,-5.00	0/0:0.000:-0.03,-1.23,-5.00	0/0:0.000:-0.05,-0.93,-5.00	0/0:0.000:-0.03,-1.20,-5.00	1/0:1.000:-0.37,-0.24,-2.87	0/0:0.000:-0.05,-0.94,-5.00	0/0:0.000:-0.18,-0.48,-2.36	0/0:0.000:-0.18,-0.48,-1.83	0/0:0.000:-0.18,-0.48,-2.23	0/0:0.000:-0.01,-1.78,-5.00	
0/0:0.000:-0.06,-0.90,-5.00	0/0:0.000:-0.07,-0.82,-5.00	0/0:0.000:-0.10,-0.70,-4.70	0/0:0.000:-0.01,-1.48,-5.00	0/0:0.000:-0.02,-1.46,-5.00	0/0:0.000:-0.18,-0.48,-2.11	0/0:0.000:-0.10,-0.69,-4.70	0/0:0.000:-0.18,-0.47,-2.84	0/0:0.000:-0.03,-1.21,-5.00	1/0:1.000:-4.40,-0.00,-4.70	0/1:0.800:-0.03,-1.20,-5.00	0/0:0.000:-0.05,-0.98,-5.00	0/0:0.000:-0.10,-0.68,-3.70	0/0:0.000:-0.10,-0.67,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.10,-0.68,-3.59	0/0:0.000:-0.02,-1.26,-5.00	0/0:0.000:-0.48,-0.48,-0.48	0/0:0.000:-0.01,-1.80,-5.00	1/0:1.000:-5.00,-0.66,-0.11	0/0:0.000:-0.12,-0.61,-2.49	1/1:2.000:-5.00,-1.30,-0.02	0/0:0.000:-0.04,-1.07,-5.00	0/0:0.000:-0.18,-0.48,-2.19	0/0:0.000:-0.07,-0.85,-5.00	1/0:1.000:-2.07,-0.04,-1.10	0/1:1.000:-0.09,-0.74,-4.70	0/1:1.000:-5.00,-0.00,-2.47	0/0:0.000:-0.02,-1.26,-5.00	1/0:1.000:-2.25,-0.01,-1.55	0/0:0.000:-0.00,-3.17,-5.00	0/0:0.000:-0.203967,-0.438136,-1.99396	0/0:0.000:-0.12,-0.62,-3.18	0/0:0.000:-0.03,-1.19,-5.00	0/0:0.000:-0.01,-1.66,-5.00	1/0:1.000:-5.00,-0.00,-3.74	1/0:1.000:-1.43,-0.02,-4.70	0/0:0.000:-0.05,-0.96,-5.00	0/0:0.000:-0.09,-0.74,-4.70	0/0:0.000:-0.48,-0.48,-0.48	1/0:1.000:-5.00,-0.18,-0.47	0/0:0.000:-0.17,-0.50,-2.70	0/1:1.000:-5.00,-0.00,-3.80	0/1:1.000:-2.22,-0.01,-1.97	0/1:1.000:-5.00,-0.67,-0.10	0/0:0.000:-0.21,-0.43,-2.01	0/0:0.000:-0.19,-0.47,-1.83	0/0:0.000:-0.10,-0.69,-4.40	0/0:0.000:-0.11,-0.64,-4.10	0/0:0.000:-0.18,-0.47,-2.80	0/0:0.000:-0.01,-1.47,-5.00	0/0:0.000:-0.22,-0.43,-1.61	0/0:0.000:-0.02,-1.47,-5.00	0/0:0.000:-0.01,-1.70,-5.00	0/0:0.000:-0.05,-0.97,-5.00	0/0:0.000:-0.02,-1.28,-5.00

As in typical VCF files, it has a bunch of meta-information lines, one header line, and then one line for each marker. In this VCF, each sample entry has the fields GT (genotype), DS (dosage), and GL (genotype likelihood), separated by colons.
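To make the GT:DS:GL layout concrete, here is a minimal sketch that splits one sample entry from the lines above into its three parts. The helper `parse_sample` is hypothetical and not part of VCFTools.jl; it assumes every entry carries all three fields and has no missing values.

```julia
# Hypothetical helper (not part of VCFTools.jl): split one sample entry
# with FORMAT GT:DS:GL into its parts. Assumes no missing values (".").
function parse_sample(entry::AbstractString)
    gt, ds, gl = split(entry, ':')
    # GT alleles are separated by '/' (unphased) or '|' (phased)
    alleles = parse.(Int, split(gt, r"[/|]"))
    dosage = parse(Float64, ds)
    # GL lists one likelihood per genotype: 0/0, 0/1, 1/1
    likelihoods = parse.(Float64, split(gl, ','))
    return (alleles = alleles, dosage = dosage, likelihoods = likelihoods)
end

s = parse_sample("0/1:1.000:-2.39,-0.30,-0.31")
println(s.alleles, " ", s.dosage, " ", s.likelihoods)
```

For real analyses, the conversion functions shown later in this document are the intended route; this sketch only illustrates how the text is laid out.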

Summary statistics

  • Number of records (markers) in a VCF file.
In [4]:
records = nrecords("test.vcf.gz")
Out[4]:
1356
  • Number of samples (individuals) in a VCF file.
In [5]:
samples = nsamples("test.vcf.gz")
Out[5]:
191
  • The gtstats function calculates genotype statistics for each marker with a GT field.
In [6]:
@time records, samples, lines, missings_by_sample, missings_by_record, 
    maf_by_record, minorallele_by_record = gtstats("test.vcf.gz");
  1.860589 seconds (6.14 M allocations: 401.973 MiB, 9.47% gc time)
In [7]:
# number of markers
records
Out[7]:
1356
In [8]:
# number of samples (individuals)
samples
Out[8]:
191
In [9]:
# number of markers with GT field
lines
Out[9]:
1356
In [10]:
# number of missing genotypes in each sample (individual)
missings_by_sample'
Out[10]:
1×191 LinearAlgebra.Adjoint{Int64,Array{Int64,1}}:
 0  0  0  0  0  0  0  0  0  0  0  0  0  …  0  0  0  0  0  0  0  0  0  0  0  0
In [11]:
# number of missing genotypes in each marker with GT field
missings_by_record'
Out[11]:
1×1356 LinearAlgebra.Adjoint{Int64,Array{Int64,1}}:
 0  0  0  0  0  0  0  0  0  0  0  0  0  …  0  0  0  0  0  0  0  0  0  0  0  0
In [12]:
# minor allele frequency of each marker with GT field
maf_by_record'
Out[12]:
1×1356 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}:
 0.0  0.0  0.0  0.0  0.146597  0.0  …  0.0  0.0  0.0706806  0.0706806  0.0
In [13]:
# minor allele of each marker (with GT field): true (REF) or false (ALT)
minorallele_by_record'
Out[13]:
1×1356 LinearAlgebra.Adjoint{Bool,Array{Bool,1}}:
 1  1  1  1  1  1  1  1  1  1  1  1  0  …  1  1  1  1  1  1  1  1  1  1  1  1

The optional second argument of the gtstats function specifies an output file or IO stream for the per-marker genotype statistics. Each line has the following fields:

  • 1-8: VCF fixed fields (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO)
  • 9: Missing genotype count
  • 10: Missing genotype frequency
  • 11: ALT allele count
  • 12: ALT allele frequency
  • 13: Minor allele count (REF allele vs ALT alleles)
  • 14: Minor allele frequency (REF allele vs ALT alleles)
  • 15: HWE P-value (REF allele vs ALT alleles)
In [14]:
# write genotype statistics in file gtstats.out.txt
@time gtstats("test.vcf.gz", "gtstats.out.txt");
  0.516509 seconds (1.88 M allocations: 184.184 MiB, 5.13% gc time)
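
Fields 9 to 14 can be illustrated on a single marker. The sketch below is not the gtstats implementation: `marker_stats` is a hypothetical helper, and it assumes a biallelic marker in diploid samples, coded as ALT allele counts, with frequencies taken over the non-missing genotypes.

```julia
# Hypothetical helper, not part of VCFTools.jl. Assumes a biallelic marker in
# diploid samples, coded as ALT allele counts (0, 1, 2) or `missing`.
function marker_stats(g::AbstractVector)
    observed = collect(skipmissing(g))
    nmissing = length(g) - length(observed)
    nalleles = 2 * length(observed)        # two alleles per observed genotype
    nalt = sum(observed)                   # ALT allele count
    nminor = min(nalt, nalleles - nalt)    # minor allele count
    return (missings = nmissing, missfreq = nmissing / length(g),
            nalt = nalt, altfreq = nalt / nalleles,
            nminor = nminor, maf = nminor / nalleles)
end

st = marker_stats([0, 1, missing, 2, 0])
println(st)
```

With one missing genotype out of five and three ALT alleles among eight observed alleles, the sketch reports a missing frequency of 0.2 and an ALT allele frequency of 0.375.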

The output file can be read as a DataFrame for further analysis.

In [15]:
using CSV

gstat = CSV.read("gtstats.out.txt"; 
    header = [:chr, :pos, :id, :ref, :alt, :qual, :filt, :info, :missings, :missfreq, :nalt, :altfreq, :nminor, :maf, :hwe],
    delim = '\t',
)
Out[15]:

1,356 rows × 15 columns (omitted printing of 7 columns)

Row  chr    pos       id           ref     alt     qual     filt    info
     Int64  Int64     String       String  String  Float64  String  String
1    22     20000086  rs138720731  T       C       100.0    PASS    AC=7;RSQ=0.8454;AVGPOST=0.9983;AA=T;AN=2184;LDAF=0.0040;THETA=0.0001;VT=SNP;SNPSOURCE=LOWCOV;ERATE=0.0003;AF=0.0032;AFR_AF=0.01
2    22     20000146  rs73387790   G       A       100.0    PASS    LDAF=0.0169;RSQ=0.9482;THETA=0.0004;AA=G;AN=2184;AVGPOST=0.9972;VT=SNP;SNPSOURCE=LOWCOV;AC=36;ERATE=0.0003;AF=0.02;AFR_AF=0.07;EUR_AF=0.0013
3    22     20000199  rs183293480  A       C       100.0    PASS    LDAF=0.0009;THETA=0.0004;AN=2184;AVGPOST=0.9990;VT=SNP;AA=A;RSQ=0.6274;SNPSOURCE=LOWCOV;AC=1;ERATE=0.0003;AF=0.0005;EUR_AF=0.0013
4    22     20000291  rs185807825  G       T       100.0    PASS    ERATE=0.0005;AVGPOST=0.9983;AA=G;AN=2184;LDAF=0.0015;VT=SNP;SNPSOURCE=LOWCOV;RSQ=0.5564;AC=2;THETA=0.0003;AF=0.0009;ASN_AF=0.0035
5    22     20000428  rs55902548   G       T       100.0    PASS    AC=323;AVGPOST=0.9983;AA=G;AN=2184;VT=SNP;RSQ=0.9949;LDAF=0.1473;SNPSOURCE=LOWCOV;ERATE=0.0003;THETA=0.0003;AF=0.15;ASN_AF=0.0017;AMR_AF=0.15;AFR_AF=0.31;EUR_AF=0.15
6    22     20000683  rs142720028  A       G       100.0    PASS    AVGPOST=0.9985;AN=2184;LDAF=0.0015;VT=SNP;RSQ=0.5718;AA=A;SNPSOURCE=LOWCOV;THETA=0.0007;ERATE=0.0003;AC=2;AF=0.0009;AFR_AF=0.0041
7    22     20000771  rs114690707  A       C       100.0    PASS    ERATE=0.0004;AC=28;AN=2184;RSQ=0.9857;VT=SNP;AA=A;LDAF=0.0130;SNPSOURCE=LOWCOV;AVGPOST=0.9995;THETA=0.0003;AF=0.01;AMR_AF=0.01;AFR_AF=0.05
8    22     20000793  rs189842693  T       C       100.0    PASS    ERATE=0.0004;RSQ=0.7411;AA=T;AN=2184;AVGPOST=0.9981;AC=6;VT=SNP;SNPSOURCE=LOWCOV;LDAF=0.0031;THETA=0.0003;AF=0.0027;ASN_AF=0.0035;EUR_AF=0.01
9    22     20000810  rs147349046  C       T       100.0    PASS    AA=C;AVGPOST=0.9994;AC=28;AN=2184;VT=SNP;RSQ=0.9802;SNPSOURCE=LOWCOV;ERATE=0.0003;LDAF=0.0128;THETA=0.0003;AF=0.01;AMR_AF=0.01;AFR_AF=0.05
10   22     20000814  rs183154520  T       C       100.0    PASS    ERATE=0.0004;AVGPOST=0.9985;THETA=0.0002;AA=T;AN=2184;RSQ=0.4507;VT=SNP;SNPSOURCE=LOWCOV;AC=1;LDAF=0.0012;AF=0.0005;AMR_AF=0.0028
11   22     20000864  rs187930998  G       A       100.0    PASS    AC=7;THETA=0.0014;AA=g;AN=2184;LDAF=0.0037;RSQ=0.8089;VT=SNP;SNPSOURCE=LOWCOV;ERATE=0.0006;AVGPOST=0.9982;AF=0.0032;ASN_AF=0.01
12   22     20000882  rs148068532  C       G       100.0    PASS    THETA=0.0004;AN=2184;AC=8;VT=SNP;RSQ=0.9358;AA=c;LDAF=0.0038;SNPSOURCE=LOWCOV;ERATE=0.0003;AVGPOST=0.9995;AF=0.0037;AMR_AF=0.0028;AFR_AF=0.01
13   22     20000950  rs1978233    T       G       100.0    PASS    ERATE=0.0005;AA=G;AN=2184;AC=2157;VT=SNP;SNPSOURCE=LOWCOV;RSQ=0.8667;LDAF=0.9857;AVGPOST=0.9950;THETA=0.0003;AF=0.99;ASN_AF=1.00;AMR_AF=0.98;AFR_AF=1.00;EUR_AF=0.97
14   22     20000975  rs141800233  G       A       100.0    PASS    ERATE=0.0004;AVGPOST=0.9994;AA=G;AN=2184;THETA=0.0005;VT=SNP;RSQ=0.9593;AC=15;SNPSOURCE=LOWCOV;LDAF=0.0069;AF=0.01;AMR_AF=0.0028;AFR_AF=0.03
15   22     20001001  rs192051979  T       C       100.0    PASS    LDAF=0.0027;RSQ=0.6246;AA=T;AN=2184;VT=SNP;ERATE=0.0008;AVGPOST=0.9973;SNPSOURCE=LOWCOV;AC=4;THETA=0.0003;AF=0.0018;ASN_AF=0.01;AMR_AF=0.0028
16   22     20001006  rs2079702    G       A       100.0    PASS    ERATE=0.0004;RSQ=0.9917;LDAF=0.8452;THETA=0.0004;AN=2184;AVGPOST=0.9972;VT=SNP;AA=A;AC=1847;SNPSOURCE=LOWCOV;AF=0.85;ASN_AF=0.61;AMR_AF=0.83;AFR_AF=0.99;EUR_AF=0.94
17   22     20001016  rs183256914  C       T       100.0    PASS    AA=C;AN=2184;LDAF=0.0054;AC=6;VT=SNP;ERATE=0.0011;RSQ=0.5737;SNPSOURCE=LOWCOV;THETA=0.0003;AVGPOST=0.9940;AF=0.0027;AMR_AF=0.01;EUR_AF=0.0013
18   22     20001157  rs150580380  G       A       100.0    PASS    THETA=0.0002;AA=G;AN=2184;VT=SNP;LDAF=0.0020;SNPSOURCE=LOWCOV;AC=1;AVGPOST=0.9968;ERATE=0.0006;RSQ=0.3659;AF=0.0005;ASN_AF=0.0017
19   22     20001159  rs139570132  C       T       100.0    PASS    ERATE=0.0004;AA=C;THETA=0.0002;AC=28;RSQ=0.9466;AN=2184;AVGPOST=0.9981;VT=SNP;SNPSOURCE=LOWCOV;LDAF=0.0135;AF=0.01;AMR_AF=0.01;AFR_AF=0.05
20   22     20001219  rs143369598  G       C       100.0    PASS    AC=7;AVGPOST=0.9963;THETA=0.0002;AA=G;AN=2184;LDAF=0.0040;VT=SNP;SNPSOURCE=LOWCOV;RSQ=0.6315;ERATE=0.0003;AF=0.0032;ASN_AF=0.01;EUR_AF=0.0026
21   22     20001333  rs5993894    C       T       100.0    PASS    ERATE=0.0004;AA=C;AN=2184;LDAF=0.0528;VT=SNP;AVGPOST=0.9980;SNPSOURCE=LOWCOV;RSQ=0.9854;AC=114;THETA=0.0003;AF=0.05;AMR_AF=0.02;AFR_AF=0.21
22   22     20001434  rs146344141  C       T       100.0    PASS    AA=C;THETA=0.0004;AN=2184;VT=SNP;LDAF=0.0101;RSQ=0.9081;SNPSOURCE=LOWCOV;AC=21;ERATE=0.0003;AVGPOST=0.9974;AF=0.01;ASN_AF=0.0017;AMR_AF=0.01;EUR_AF=0.02
23   22     20001455  rs188666449  G       A       100.0    PASS    AVGPOST=0.9986;THETA=0.0004;AA=G;AN=2184;VT=SNP;LDAF=0.0011;SNPSOURCE=LOWCOV;AC=1;RSQ=0.4222;ERATE=0.0003;AF=0.0005;ASN_AF=0.0017
24   22     20001521  rs139601437  C       A       100.0    PASS    ERATE=0.0005;AA=C;AN=2184;RSQ=0.5048;LDAF=0.0010;VT=SNP;THETA=0.0006;AVGPOST=0.9989;SNPSOURCE=LOWCOV;AC=1;AF=0.0005;AFR_AF=0.0020
25   22     20001587  rs71788814   CAG     C       530.0    PASS    AA=CAG;ERATE=0.0005;AN=2184;AVGPOST=0.9952;VT=INDEL;RSQ=0.9665;LDAF=0.0519;AC=114;THETA=0.0003;AF=0.05;AMR_AF=0.03;AFR_AF=0.21;EUR_AF=0.0013
26   22     20001600  rs144217522  T       A       100.0    PASS    ERATE=0.0004;AC=28;AA=T;AN=2184;RSQ=0.9560;VT=SNP;THETA=0.0011;LDAF=0.0126;SNPSOURCE=LOWCOV;AVGPOST=0.9984;AF=0.01;AMR_AF=0.01;AFR_AF=0.05
27   22     20001655  rs192606530  G       A       100.0    PASS    ERATE=0.0004;THETA=0.0002;AA=G;AN=2184;RSQ=0.5888;LDAF=0.0018;VT=SNP;SNPSOURCE=LOWCOV;AC=2;AVGPOST=0.9982;AF=0.0009;AMR_AF=0.01
28   22     20001822  rs111598545  A       G       100.0    PASS    RSQ=0.9768;THETA=0.0004;AN=2184;VT=SNP;AVGPOST=0.9976;LDAF=0.0418;AA=A;SNPSOURCE=LOWCOV;ERATE=0.0003;AC=91;AF=0.04;ASN_AF=0.01;AMR_AF=0.05;AFR_AF=0.08;EUR_AF=0.04
29   22     20002011  rs184950746  C       T       100.0    PASS    ERATE=0.0004;AA=C;AVGPOST=0.9994;AN=2184;RSQ=0.6434;THETA=0.0005;VT=SNP;SNPSOURCE=LOWCOV;AC=1;LDAF=0.0008;AF=0.0005;AMR_AF=0.0028
30   22     20002207  rs142461772  G       A       100.0    PASS    AA=G;AN=2184;AC=6;VT=SNP;AVGPOST=0.9989;LDAF=0.0029;SNPSOURCE=LOWCOV;ERATE=0.0003;THETA=0.0003;RSQ=0.8802;AF=0.0027;AFR_AF=0.01

Filter

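Sometimes we wish to subset entire VCF files, such as filtering out certain samples or records (SNPs). This is achieved via the filter function, which takes a record mask and a sample mask and writes the subset to the file named by the des keyword: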

In [16]:
# filtering by specifying indices to keep
record_mask = 1:records       # keep all records (SNPs)
sample_mask = 2:(samples - 1) # keep all but first and last sample (individual)
@time VCFTools.filter("test.vcf.gz", record_mask, sample_mask, 
    des="filtered.test.vcf.gz")
  2.356589 seconds (7.01 M allocations: 494.823 MiB, 5.98% gc time)

One can also supply bitvectors as masks:

In [17]:
# filtering with BitVector masks: true marks an entry to keep
record_mask    = trues(records)
sample_mask    = trues(samples)
record_mask[1] = record_mask[end] = false  # drop the first and last record
@time VCFTools.filter("test.vcf.gz", record_mask, sample_mask, 
    des="filtered.test.vcf.gz")
  0.712751 seconds (830.25 k allocations: 110.526 MiB, 2.39% gc time)
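
The two mask styles describe the same kind of selection. The sketch below uses only Base Julia and a hypothetical list of six sample names to show that an index range and a BitVector pick the same entries; it says nothing about how VCFTools.filter handles the masks internally.

```julia
n = 6                                   # hypothetical number of samples
sample_ids = ["s1", "s2", "s3", "s4", "s5", "s6"]

keep_idx = 2:(n - 1)                    # index form: keep all but first and last
keep_mask = trues(n)                    # BitVector form of the same selection
keep_mask[1] = keep_mask[end] = false

# findall turns a mask into indices; both forms select the same samples
println(findall(keep_mask) == collect(keep_idx))
println(sample_ids[keep_idx] == sample_ids[keep_mask])
```

Index ranges are compact when the kept entries are contiguous, while a BitVector is convenient when the mask comes from a per-entry test, for example a threshold on the minor allele frequencies returned by gtstats.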

Convert

Convert GT data in VCF file test.vcf.gz to a Matrix{Union{Missing, Int8}}. Here as_minorallele = false indicates that VCFTools.jl uses the allele codes (0s and 1s) of the file directly, without checking whether ALT or REF is the minor allele, so under the additive model each entry of A counts ALT alleles (0, 1, or 2).

In [18]:
@time A = convert_gt(Int8, "test.vcf.gz"; as_minorallele = false, 
    model = :additive, impute = false, center = false, scale = false)
  0.486844 seconds (1.79 M allocations: 139.283 MiB, 6.30% gc time)
Out[18]:
191×1356 Array{Union{Missing, Int8},2}:
 0  0  0  0  1  0  0  0  0  0  0  0  2  …  0  0  0  0  1  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  1  0  0  0  0  0  0  0  2     0  0  0  0  1  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  1  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  1  0  0  0  0  0  0  0  2  …  0  0  0  0  1  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  1  1  0
 0  0  0  0  1  0  0  0  0  0  0  0  2     0  0  0  0  1  0  0  0  0  0  0  0
 0  0  0  0  1  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2  …  0  0  1  0  1  0  0  0  0  1  1  0
 0  0  0  0  1  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  1  1  0
 ⋮              ⋮              ⋮        ⋱     ⋮              ⋮              ⋮
 0  0  0  0  1  0  0  0  0  0  0  0  2     0  0  1  0  0  0  0  0  0  1  1  0
 0  0  0  0  0  0  0  0  0  0  0  0  2  …  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2  …  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  2     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  1  0  2  …  0  0  0  0  0  0  0  0  0  0  0  0

Convert GT data in VCF file test.vcf.gz to a numeric array. This checks which of ALT/REF is the minor allele, imputes the missing genotypes according to allele frequency, centers each marker's dosages by subtracting 2MAF, and scales them by dividing by sqrt(2MAF*(1-MAF)).

In [19]:
@time A = convert_gt(Float64, "test.vcf.gz"; as_minorallele = true, 
    model = :additive, impute = true, center = true, scale = true)
  0.401206 seconds (1.58 M allocations: 130.332 MiB, 7.20% gc time)
Out[19]:
191×1356 Array{Union{Missing, Float64},2}:
 0.0  0.0  0.0  0.0   1.41301   0.0  …  0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0   1.41301   0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0   1.41301   0.0  …  0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0   2.36899    2.36899   0.0
 0.0  0.0  0.0  0.0   1.41301   0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0   1.41301   0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0  …  0.0  0.0   2.36899    2.36899   0.0
 0.0  0.0  0.0  0.0   1.41301   0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0   2.36899    2.36899   0.0
 ⋮                              ⋮    ⋱                                  ⋮  
 0.0  0.0  0.0  0.0   1.41301   0.0     0.0  0.0   2.36899    2.36899   0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0  …  0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0  …  0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0     0.0  0.0  -0.390016  -0.390016  0.0
 0.0  0.0  0.0  0.0  -0.586138  0.0  …  0.0  0.0  -0.390016  -0.390016  0.0
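
The centering and scaling can be checked by hand on marker 5, whose minor allele frequency was reported above as 0.146597. The function `standardize` below is a hypothetical one-liner written for this check, not a VCFTools.jl function.

```julia
# Hypothetical helper: center a minor allele count x by 2*MAF and
# scale by sqrt(2*MAF*(1 - MAF)), as described in the text above.
standardize(x, maf) = (x - 2maf) / sqrt(2maf * (1 - maf))

maf = 0.146597                 # MAF of marker 5, rounded as printed above
z0 = standardize(0, maf)       # no copy of the minor allele
z1 = standardize(1, maf)       # one copy of the minor allele
println(round(z0, digits = 3), " ", round(z1, digits = 3))
```

Up to rounding of the printed MAF, these agree with the values -0.586138 and 1.41301 in column 5 of the matrix above.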

Extract data marker-by-marker or window-by-window

Large VCF files easily generate numeric arrays that cannot fit into computer memory. Many analyses only need to loop over markers or sets of markers. The functions for importing genotypes/haplotypes/dosages have in-place counterparts that read one marker, or one window of markers, at a time:

  • copy_gt! loops over genotypes
  • copy_ht! loops over haplotypes
  • copy_ds! loops over dosages

For example, to loop over all genotype markers in the VCF file test.vcf.gz:

In [20]:
using GeneticVariation

# initialize VCF reader
people, snps = nsamples("test.vcf.gz"), nrecords("test.vcf.gz")
reader = VCF.Reader(openvcf("test.vcf.gz"))
# pre-allocate vector for marker data
g = zeros(Union{Missing, Float64}, people)
for j = 1:snps
    copy_gt!(g, reader; model = :additive, impute = true, center = true, scale = true)
    # do statistical analysis
end
close(reader)

To loop over markers in windows of size 25:

In [21]:
# initialize VCF reader
people, snps = nsamples("test.vcf.gz"), nrecords("test.vcf.gz")
reader = VCF.Reader(openvcf("test.vcf.gz"))
# pre-allocate matrix for marker data
windowsize = 25
g = zeros(Union{Missing, Float64}, people, windowsize)
nwindows = ceil(Int, snps / windowsize)
for j = 1:nwindows
    copy_gt!(g, reader; model = :additive, 
        impute = true, center = true, scale = true)
    # do statistical analysis
end
close(reader)
┌ Warning: Only 7 records left in reader; columns 8-25 are set to missing values
└ @ VCFTools /Users/huazhou/.julia/dev/VCFTools/src/convert.jl:67
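
As the warning indicates, the final window may be only partly filled, with the remaining columns set to missing, so the analysis for that window should use only the filled columns. The sketch below shows the index arithmetic with hypothetical sizes; `window_ranges` is an illustrative helper, not part of VCFTools.jl.

```julia
# Hypothetical helper: record index ranges of consecutive windows.
# The last range is shorter when windowsize does not divide snps.
function window_ranges(snps::Integer, windowsize::Integer)
    nwindows = ceil(Int, snps / windowsize)
    return [((j - 1) * windowsize + 1):min(j * windowsize, snps) for j in 1:nwindows]
end

ranges = window_ranges(103, 25)   # hypothetical sizes for illustration
for r in ranges
    ncols = length(r)             # number of filled columns in this window
    # an analysis would use only the first `ncols` columns,
    # for example view(g, :, 1:ncols)
    println(r, " has ", ncols, " markers")
end
```

Tracking `ncols` per window keeps the all-missing padding columns out of any downstream computation.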