Sample Info & Phenotype

The Sample Info file format, which is also used for Phenotype files, is the format used to import or update the Sample information stored in the GWASpi database.
Note that when importing or updating Samples, the Sample IDs and the number of Samples should match the genotype data that you provide to GWASpi as input.
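
The ID check described above can be run before importing. The following is a minimal sketch, not part of GWASpi: the function name `compare_sample_ids` and the `genotype_ids` argument (a list of Sample IDs you have extracted yourself from your genotype data) are assumptions made for illustration, as is the assumption that fields are separated by whitespace.

```python
def compare_sample_ids(sample_info_lines, genotype_ids):
    """Compare Sample IDs in a Sample Info file with those in the genotype data.

    sample_info_lines: all lines of the file, header line first.
    genotype_ids: Sample IDs taken from the genotype data (hypothetical input).
    Returns (ids_missing_from_sample_info, ids_missing_from_genotypes).
    """
    # Column 2 (index 1) holds the Sample ID; skip the header and blank lines.
    info_ids = {ln.split()[1] for ln in sample_info_lines[1:] if ln.strip()}
    geno_ids = set(genotype_ids)
    return sorted(geno_ids - info_ids), sorted(info_ids - geno_ids)
```

Two empty lists mean that both the IDs and the number of Samples agree.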

The order of the Samples is irrelevant, but it is a good idea to group them by the categories you may later use for subsetting, as this will greatly speed up those operations. For example, Samples could be grouped by sex, affection (case/control), category, disease, population or age.
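
One way to apply such a grouping is to sort the data lines on the relevant columns before importing. This is a minimal sketch under stated assumptions: `group_samples` is a hypothetical helper, not a GWASpi function, and it assumes whitespace-separated fields with the header on the first line.

```python
def group_samples(lines, key_columns=(8, 5)):
    """Sort data lines so Samples sharing the chosen columns sit together.

    key_columns holds 0-based column indexes; the default groups by
    Population (column 9) and then by Affection (column 6).
    """
    header = lines[0]
    rows = [ln.split() for ln in lines[1:] if ln.strip()]
    # Sorting is stable, so Samples with equal keys keep their original order.
    rows.sort(key=lambda r: tuple(r[c] for c in key_columns))
    return [header] + [" ".join(r) for r in rows]
```

Passing `key_columns=(4,)` would group by sex instead.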

The Sample Info file format is based on the PLINK “fam” file format, extended with some additional fields, and is laid out as follows:

10 columns of data

  1. Family ID (max 32 characters)
  2. Sample ID (max 32 characters)
  3. Father ID (max 32 characters)
  4. Mother ID (max 32 characters)
  5. Sex (0 Undefined, 1 Male, 2 Female)
  6. Affection (0 Undefined, 1 Control, 2 Case)
  7. Category (max 32 characters, 0 Undefined)
  8. Disease (max 64 characters, 0 Undefined)
  9. Population (max 32 characters, 0 Undefined)
  10. Age (Integer, MAX 2147483647, 0 Undefined)

The header line as well as all the fields are compulsory!
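
The column rules above can be checked line by line before importing. The sketch below is an illustration only: `validate_sample_line` is a hypothetical name, it assumes fields are separated by whitespace (as in the examples on this page), and it does not reproduce the checks GWASpi itself performs.

```python
# Length limits from the column list above, keyed by 0-based column index.
MAX_LEN = {0: 32, 1: 32, 2: 32, 3: 32, 6: 32, 7: 64, 8: 32}
MAX_AGE = 2147483647


def validate_sample_line(line):
    """Return a list of problems found in one data line (empty if none)."""
    fields = line.split()
    if len(fields) != 10:
        return [f"expected 10 columns, found {len(fields)}"]
    problems = []
    for idx, limit in MAX_LEN.items():
        if len(fields[idx]) > limit:
            problems.append(f"column {idx + 1} exceeds {limit} characters")
    if fields[4] not in {"0", "1", "2"}:
        problems.append("Sex must be 0, 1 or 2")
    if fields[5] not in {"0", "1", "2"}:
        problems.append("Affection must be 0, 1 or 2")
    age = fields[9]
    # Age must be a plain non-negative integer no larger than MAX_AGE.
    if not (age.isascii() and age.isdigit()) or int(age) > MAX_AGE:
        problems.append(f"Age must be an integer between 0 and {MAX_AGE}")
    return problems
```

Run it on every line after the header; a line that yields an empty list satisfies the limits listed above.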

Sample Info example:
FamilyID SampleID FatherID MotherID Sex Affection Category Disease Population Age
0 1AC18 0 0 1 1 0 CF CEU 0
F01 1AC19 0 0 1 2 0 CF CEU 0
F01 1AC25 1AC19 1AC26 2 1 0 CF CEU 0
0 1AC26 0 0 2 2 0 0 0 0
0 1AC27 0 0 2 1 0 MS CEU 0
0 1AC30 0 0 2 2 0 MS JPT 0
0 1AC32 0 0 2 1 0 MS CEU 49
0 1AC33 0 0 2 1 0 MS CEU 35
F02 1AC34 0 0 2 2 0 MS CEU 30
F02 1AC40 0 0 1 2 0 MS CEU 29
F02 1AC41 0 0 1 1 0 MS 0 0
0 1AC49 0 0 1 1 0 0 JPT 0

Phenotype files only need to carry information in columns 2 (Sample ID) and 6 (Affection); all other columns can be set to 0 (Undefined).

Phenotype file example:
FamilyID SampleID FatherID MotherID Sex Affection Category Disease Population Age
0 1AC18 0 0 0 1 0 0 0 0
0 1AC19 0 0 0 2 0 0 0 0
0 1AC25 0 0 0 1 0 0 0 0
0 1AC26 0 0 0 2 0 0 0 0
0 1AC27 0 0 0 1 0 0 0 0
0 1AC30 0 0 0 2 0 0 0 0
0 1AC32 0 0 0 1 0 0 0 0
0 1AC33 0 0 0 1 0 0 0 0
0 1AC34 0 0 0 2 0 0 0 0
0 1AC40 0 0 0 2 0 0 0 0
0 1AC41 0 0 0 1 0 0 0 0
0 1AC49 0 0 0 1 0 0 0 0
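
A Phenotype file like the one above can be generated from a simple mapping of Sample IDs to affection codes. This is a minimal sketch, assuming space-separated fields; `phenotype_lines` is a hypothetical helper and the header is passed in by the caller, so that it can be copied from an existing Sample Info file.

```python
def phenotype_lines(header, affection_by_sample):
    """Build Phenotype file lines: only Sample ID and Affection are filled in.

    header: the header line to use, copied from an existing file.
    affection_by_sample: dict of Sample ID -> 0 (Undefined), 1 (Control), 2 (Case).
    """
    lines = [header]
    for sample_id, affection in affection_by_sample.items():
        if affection not in (0, 1, 2):
            raise ValueError(f"invalid affection {affection!r} for {sample_id}")
        # Every column other than Sample ID and Affection is left as 0 (Undefined).
        fields = ["0", sample_id, "0", "0", "0", str(affection), "0", "0", "0", "0"]
        lines.append(" ".join(fields))
    return lines
```

Writing the result with `"\n".join(...)` gives a file in the same layout as the example above.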