Installation

This section will show you how to get up and running with GWASpi.

Installation and Running

Decompress the GWASpi archive into a suitable directory.

Inside the resulting “GWASpi” folder you will find an “app” directory, containing the necessary application, libraries and resource jars.
Next to this folder you will have several launcher files. Depending on your OS, you can use the GWASpi_Linux_launcher.sh, GWASpi_Mac_launcher.command or GWASpi_Windows_launcher.bat to start the application.

The first two launchers are pre-configured to dedicate up to 50% of your installed RAM to GWASpi. The Windows .bat launcher instead allocates a fixed amount of RAM, which you will have to adapt to your hardware configuration. Edit the .bat file manually to launch with the desired amount:

java -Xms[Nb of MB of allocated RAM]m -Xmx[Nb of MB of allocated RAM]m -jar app/GWASpi.jar

Example with 1000 MB: java -Xms1000m -Xmx1000m -jar app/GWASpi.jar

The RAM used will be displayed when you start the application, along with an estimate of the number of markers you will be able to manage with this amount. This number is only an estimate: it depends on the load format (in the case of a load operation) or on the type of operation you perform on a pre-loaded matrix.

Running GWASpi with less than 256MB of RAM is suboptimal and should only be done if you want to visualize already generated matrices and reports. You can edit the percentage of allocated RAM in the GWASpi_Linux_launcher.sh or GWASpi_Mac_launcher.command files. Keep in mind that your OS requires some RAM to function, so never allocate 100% of it to GWASpi.
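To illustrate how a percentage of installed RAM translates into the -Xms and -Xmx values, here is a minimal sketch for Linux. It is not the content of the shipped launchers: the heap_mb helper is a hypothetical name, and reading MemTotal from /proc/meminfo is an assumption about how the installed RAM could be obtained.

```shell
#!/bin/sh
# Hypothetical helper, not part of the GWASpi launchers.
# heap_mb TOTAL_KB PERCENT -> prints the heap size in MB (integer division).
heap_mb() {
    echo $(( $1 * $2 / 100 / 1024 ))
}

# On Linux, the installed RAM can be read from /proc/meminfo (value in kB).
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    mb=$(heap_mb "$total_kb" 50)
    if [ "$mb" -lt 256 ]; then
        echo "Warning: ${mb} MB is below the 256 MB lower bound mentioned above." >&2
    fi
    # Print the command instead of running it, so the sketch has no side effects.
    echo "java -Xms${mb}m -Xmx${mb}m -jar app/GWASpi.jar"
fi
```

The same arithmetic can be done by hand to pick the fixed value for the Windows .bat file: take your installed RAM in MB and multiply it by the fraction you want to dedicate to GWASpi.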

The number of accepted Samples is largely independent of your RAM (though it is limited by the available disk space). The Sample count mainly affects CPU time: for a constant number of markers, the more Samples you have, the longer the processing will take.

Select directory to store data

After starting GWASpi for the first time, you will be asked to provide a data storage directory. GWASpi needs a rather large amount of disk space to store its netCDF-3 databases of genotypes, reports and charts, as well as any data exported to other file formats.
Note: The FAT32 file system, which was quite popular on some older Windows platforms, only manages files of up to 4GB in size. Even though “large file support” has been introduced in Windows OSes, it is HIGHLY recommended that you use NTFS or one of the newer file systems that Windows offers. GWASpi has workarounds that allow it to manage big files, but staying on FAT32 may severely cripple the application, as Genome Wide Studies typically deal with huge amounts of data.
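To make the 4GB limit concrete, the following sketch checks whether a file of a given size would fit on FAT32. The exceeds_fat32 function and the 6 GB example size are purely illustrative and are not part of GWASpi.

```shell
#!/bin/sh
# Largest file FAT32 can hold: 4 GiB minus one byte.
FAT32_MAX_BYTES=4294967295

# Hypothetical helper: succeeds (exit status 0) if a file of
# SIZE_BYTES is too large to be stored on a FAT32 file system.
exceeds_fat32() {
    [ "$1" -gt "$FAT32_MAX_BYTES" ]
}

# Illustrative size only: a 6 GB genotype file.
size=6000000000
if exceeds_fat32 "$size"; then
    echo "A file of $size bytes does not fit on FAT32"
fi
```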

Directory structure

Once you have chosen the directory to store data in, the following management databases and working folders will be created:

  • datacenter
  • export
  • genotypes
  • reports

– The “datacenter” folder contains an SQL database with the relevant information on your studies that GWASpi needs to manage your data.
– The “export” folder will hold any dataset that you choose to export from within GWASpi.
– The “genotypes” directory contains the big netCDF-3 files with genotypes, QA and analysis data.
– The “reports” directory will hold the resulting human-readable reports and charts of your QAs and analyses.

As a guideline, don’t manually delete, rename or move any of these files, as GWASpi will keep accessing them to read the data they contain. The exception to this rule is the “export” folder: you may remove any file created there, as GWASpi will not try to retrieve it later. After deleting a Study in GWASpi, you may want to delete the STUDY_# folder corresponding to that Study ID, just to be sure no large files remain.
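If you want to verify that the four folders listed above are still in place, for example after moving a disk, a small check along these lines can help. The missing_gwaspi_dirs function is a hypothetical helper and is not shipped with GWASpi. It only inspects the folder names and never modifies anything.

```shell
#!/bin/sh
# Hypothetical helper: prints the name of each expected sub-folder
# that is absent from the data directory given as first argument.
missing_gwaspi_dirs() {
    for name in datacenter export genotypes reports; do
        [ -d "$1/$name" ] || echo "$name"
    done
}
```

Call it with the data directory you selected, e.g. missing_gwaspi_dirs /path/to/your/data (placeholder path). If nothing is printed, all four folders are present.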

Start loading Datasets

Once the folder structure and databases have been initialized, you may start working with GWASpi.
You may choose to do so under the “Default Study” in the left pane tree, or you may create a new study to keep your data separate and organized by whatever criteria you see fit.

Further information on how to start loading datasets is provided in our tutorial.