GWAS in one go

At load time, GWASpi will, provided it has enough information, ask whether you want to proceed with a complete association study after loading the genotype and sample info data. Some formats (such as PLINK and Beagle) may contain sufficient data in their files to launch a complete GWAS from the get-go. Other formats will require you to add more information, usually in the form of a Sample Info file.

Before scanning for affection information in your data, GWASpi will ask whether you want to just load your data with the corresponding QA check or run a complete GWAS in one go.

Alternatively, you may launch a GWAS-In-One-Go operation under the “Analise Matrix” section.
Next, GWASpi will ascertain what missing information it may need and request it. You may leave the default answers if you want; they have been chosen to be within accepted ranges.
Some of these questions are on QA thresholds:

A short description of the above fields and other methods the GWAS will be based upon:

  • Markers detected to have more than 2 alleles will be flagged as mismatching and will be eliminated.
  • Markers having a missing ratio above a threshold (default 5%) will be ignored. Marker missing ratio is measured as the number of missing sample genotypes / number of samples for that marker.
  • By default, a Hardy-Weinberg p-value threshold will be proposed, calculated as per Bonferroni (0.05 / number of markers). You may instead choose to enter a fixed Hardy-Weinberg p-value threshold. Any marker with a HW p-value below that threshold will be discarded.
  • Any samples having a missing ratio above a threshold (default 5%) will not be considered in the study. Sample missing ratio is measured as the number of missing marker genotypes / number of markers in that sample.
  • Any samples having a heterozygosity ratio above a threshold (default 50%) will not be considered in the study. Sample heterozygosity ratio is measured as the number of heterozygous marker genotypes / number of non-missing marker genotypes in that sample.