Illumina LGEN

Illumina uses an export format related to PLINK’s LGEN (long genotype file) format. It is different enough, however, that the resulting files cannot be loaded into PLINK as they are. GWASpi takes these differences into account and offers an out-of-the-box import method for Illumina LGEN files.

Example of an Illumina LGEN file:
The file starts with a header containing a number of experiment-specific fields:

BSGT Version 3/3/2007
Processing Date 2/17/2009 11:48:00 AM
Content HumanCNV370-Quadv3_C.bpm
Num SNPs 345111
Total SNPs 373397
Num Samples 48
Total Samples 72
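
As an illustration only, here is a minimal Python sketch of reading such a header. It assumes that each header line is a tab-separated key/value pair (the example above does not show the delimiter) and that the header ends at the first blank line. The function name parse_header is hypothetical and is not part of GWASpi.

```python
import io


def parse_header(lines):
    """Collect 'key<TAB>value' header lines until the first blank line."""
    header = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            break  # assumption: a blank line ends the header
        # Split on the first tab only, so values may contain spaces.
        key, _, value = line.partition("\t")
        header[key.strip()] = value.strip()
    return header


# Illustrative input built from the fields shown above (tab delimiter assumed).
sample = io.StringIO(
    "BSGT Version\t3/3/2007\n"
    "Processing Date\t2/17/2009 11:48:00 AM\n"
    "Num SNPs\t345111\n"
    "Num Samples\t48\n"
    "\n"
)
header = parse_header(sample)
print(header["Num SNPs"], header["Num Samples"])
```

Values are kept as strings, so a caller that needs the counts as numbers would convert them explicitly, for example with int(header["Num SNPs"]).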

The header is followed by the genotype data: a line with the field descriptions, then a long series of rows, each giving a sample, a SNP and its genotype. The data is grouped by Sample ID, so each sample appears as a long list of SNPs with their associated genotypes:


Sample Index Sample ID SNP Name Allele1 - Forward Allele2 - Forward
0 S1 rs100001 A T
0 S1 rs100002 A A
0 S1 rs100003 C T
0 S1 rs100004 T T
0 S1 rs100005 C C
0 S1 rs100006 A T
0 S2 rs100001 A A
0 S2 rs100002 A A
0 S2 rs100003 T T
0 S2 rs100004 T T
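
The layout above can be read with a few lines of Python. The following is a sketch under stated assumptions, not GWASpi's own import code: it assumes the columns are tab-separated, that the first line passed in is the field-description line, and that every data row carries the five fields shown. The function name read_genotypes is hypothetical.

```python
import csv
import io
from collections import defaultdict


def read_genotypes(handle):
    """Return {sample_id: {snp_name: (allele1, allele2)}} from tab-separated rows."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the field-description line, if present
    genotypes = defaultdict(dict)
    for row in reader:
        if len(row) < 5:
            continue  # skip blank or truncated rows
        _index, sample_id, snp, allele1, allele2 = row[:5]
        genotypes[sample_id][snp] = (allele1, allele2)
    return dict(genotypes)


# Illustrative input taken from the rows shown above (tab delimiter assumed).
data = io.StringIO(
    "Sample Index\tSample ID\tSNP Name\tAllele1 - Forward\tAllele2 - Forward\n"
    "0\tS1\trs100001\tA\tT\n"
    "0\tS1\trs100002\tA\tA\n"
    "0\tS2\trs100001\tA\tA\n"
)
genotypes = read_genotypes(data)
print(genotypes["S1"]["rs100001"])
```

Keying the result by Sample ID mirrors the grouping in the file: all SNPs of one sample end up together, whatever order the rows arrive in.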

The sample information must be provided in GWASpi’s standard Sample Info format.