Load genotypes, do GWAS in one go

#This is a demo file
#Usage: java -Xms1500m -Xmx2500m -jar GWASpi.jar script scriptFile [log cli.log]
data-dir=/GWASpi/data/
[script]
0.command=load_genotypes_do_gwas_in_one_go
1.study-id=1
2.format=PLINK
3.use-dummy-samples=true
4.new-matrix-name=Matrix 43
5.description=Load genotypes of batch 42, perform GWAS in one go.
6.file1-path=/GWASpi/input/Plink/mi_input.map
7.file2-path=/GWASpi/input/Plink/mi_input.ped
8.sample-info-path=no info file
9.discard-marker-by-missing-ratio=false
10.discard-marker-missing-ratio-threshold=0
11.calculate-discard-threshold-for-HW=false
12.discard-marker-with-provided-threshold=true
13.discard-marker-HW-treshold=0.0000005
14.discard-samples-by-missing-ratio=false
15.discard-samples-missing-ratio-threshold=0
16.discard-samples-by-heterozygosity-ratio=false
17.discard-samples-heterozygosity-ratio-threshold=0.5
18.perform-Allelic-Tests=true
19.perform-Genotypic-Tests=true
20.perform-Trend-Tests=true
[/script]

Line by line…

Let’s look into the example above and explain it line by line:

  • First we see two lines starting with a "#" character. These are comment lines and you can have as many as you want, as long as they start with "#".
  • Next, the "data-dir=/GWASpi/data/" line specifies where GWASpi’s database is stored. If the database doesn’t already exist, a new, empty one will be created at the specified path.
  • After these lines comes the script block itself, enclosed between "[script]" and "[/script]".
  • The first line inside the block is the call to the operation command, in this case "0.command=load_genotypes_do_gwas_in_one_go".
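
To make the structure described above concrete, here is a minimal Python sketch that reads such a file. It is not part of GWASpi: the function name parse_gwaspi_script and the returned layout are illustrative assumptions, and the only rules it applies are the ones visible in the example (comment lines start with "#", "data-dir=" sits outside the block, and every line inside "[script]…[/script]" has the form "number.name=value").

```python
import re

# Hypothetical helper, not part of GWASpi: it only mirrors the layout
# visible in the example script above.
def parse_gwaspi_script(text):
    """Return (data_dir, blocks); each block is a list of (index, name, value)."""
    data_dir = None
    blocks = []
    current = None  # None while we are outside a [script] block
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line == "[script]":
            current = []
        elif line == "[/script]":
            if current is None:
                raise ValueError("[/script] without a matching [script]")
            blocks.append(current)
            current = None
        elif current is None:
            name, _, value = line.partition("=")
            if name == "data-dir":
                data_dir = value
        else:
            match = re.match(r"(\d+)\.([^=]+)=(.*)$", line)
            if match is None:
                raise ValueError(f"unrecognised line inside [script]: {line!r}")
            current.append((int(match.group(1)), match.group(2), match.group(3)))
    return data_dir, blocks


EXAMPLE = """\
#This is a demo file
data-dir=/GWASpi/data/
[script]
0.command=load_genotypes_do_gwas_in_one_go
1.study-id=1
2.format=PLINK
[/script]
"""

data_dir, blocks = parse_gwaspi_script(EXAMPLE)
print(data_dir)
print(blocks[0])
```

The sketch keeps the numeric prefix of each line so that a caller can check the parameters appear in the expected order.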

Following the operation call you will need to list a number of parameters. Each operation takes a fixed set of parameters, which you must provide with the exact names and in the exact order shown below for the operation to run successfully:

  1. study-id= <- ID of the study under which the new genotypes will be loaded. If you set this line to "1.study-id=New Study", the genotype matrix will be loaded into a newly created study.
  2. format= <- Indicate the format/technology the data is provided in (GWASpi, PLINK…)
  3. use-dummy-samples= <- Indicate whether the samples should be generated as dummies from the provided format files instead of being read from a Sample-Info file in GWASpi format [true/false].
  4. new-matrix-name= <- The name the new matrix will show. Free text, up to 64 characters.
  5. description= <- Description of the matrix. Free text, up to 2000 characters.
  6. file1-path= <- Path to the 1st file, as specified in the first GUI field for the given format.
  7. file2-path= <- Path to the 2nd file, as specified in the second GUI field for the given format.
  8. sample-info-path= <- Path to the Sample-Info file, as specified in the third GUI field for the given format. This can be left empty if use-dummy-samples=true
  9. discard-marker-by-missing-ratio= <- Boolean to activate filter [true/false]
  10. discard-marker-missing-ratio-threshold= <- Threshold for filtering
  11. calculate-discard-threshold-for-HW= <- Boolean to toggle filter [true/false]
  12. discard-marker-with-provided-threshold= <- Boolean to toggle filter [true/false]
  13. discard-marker-HW-treshold= <- Hardy-Weinberg threshold to use when discard-marker-with-provided-threshold=true
  14. discard-samples-by-missing-ratio= <- Boolean to activate filter [true/false]
  15. discard-samples-missing-ratio-threshold= <- Threshold for filtering
  16. discard-samples-by-heterozygosity-ratio= <- Boolean to activate filter [true/false]
  17. discard-samples-heterozygosity-ratio-threshold= <- Threshold for filtering
  18. perform-Allelic-Tests= <- Boolean, perform Allelic Association test? [true/false]
  19. perform-Genotypic-Tests= <- Boolean, perform Genotypic Association test? [true/false]
  20. perform-Trend-Tests= <- Boolean, perform Trend test? [true/false]
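
Because the parameter names and their order matter, it can help to generate the script file instead of typing it by hand. The following Python sketch is one possible way to do that and is not part of GWASpi: render_script, EXPECTED_KEYS and EXAMPLE_VALUES are hypothetical names, the key list is copied from the example at the top of this page, and the values are the ones used in that example.

```python
# Parameter names in the order used by load_genotypes_do_gwas_in_one_go,
# copied from the example script on this page (index 0 is the command itself).
EXPECTED_KEYS = [
    "command", "study-id", "format", "use-dummy-samples",
    "new-matrix-name", "description",
    "file1-path", "file2-path", "sample-info-path",
    "discard-marker-by-missing-ratio",
    "discard-marker-missing-ratio-threshold",
    "calculate-discard-threshold-for-HW",
    "discard-marker-with-provided-threshold",
    "discard-marker-HW-treshold",  # spelled as in the example script
    "discard-samples-by-missing-ratio",
    "discard-samples-missing-ratio-threshold",
    "discard-samples-by-heterozygosity-ratio",
    "discard-samples-heterozygosity-ratio-threshold",
    "perform-Allelic-Tests", "perform-Genotypic-Tests", "perform-Trend-Tests",
]

# Values taken from the example script on this page.
EXAMPLE_VALUES = [
    "load_genotypes_do_gwas_in_one_go", "1", "PLINK", "true",
    "Matrix 43", "Load genotypes of batch 42, perform GWAS in one go.",
    "/GWASpi/input/Plink/mi_input.map", "/GWASpi/input/Plink/mi_input.ped",
    "no info file",
    "false", "0", "false", "true", "0.0000005",
    "false", "0", "false", "0.5",
    "true", "true", "true",
]


def render_script(data_dir, params):
    """Build the script text from an ordered list of (name, value) pairs.

    Hypothetical helper: it refuses to write anything if the names or
    their order differ from EXPECTED_KEYS.
    """
    names = [name for name, _ in params]
    if names != EXPECTED_KEYS:
        raise ValueError("parameters must match the documented names and order")
    lines = ["data-dir=" + data_dir, "[script]"]
    for index, (name, value) in enumerate(params):
        lines.append(f"{index}.{name}={value}")  # e.g. "2.format=PLINK"
    lines.append("[/script]")
    return "\n".join(lines) + "\n"


script_text = render_script("/GWASpi/data/", list(zip(EXPECTED_KEYS, EXAMPLE_VALUES)))
print(script_text)
```

The resulting text can be saved to a file and passed to GWASpi as the scriptFile argument shown in the usage comment of the example.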