Convert SNP data to 0,1,2 and 5
8.1 years ago
bingnas ▴ 10

Hi there,

I am looking to hire someone at a reasonable price.

I have BAM files for 22 human subjects, mapped with Bowtie2 against hg19.

1- I want SNP data versus the reference genome (i.e. hg19) for these samples.
2- Convert the SNP genotypes to 0, 1, 2 and 5, where 0 is recessive homozygous, 2 is dominant homozygous, 1 is heterozygous and 5 is missing.
3- Merge these 22 subjects into a matrix as follows:

Chromosome  Position  Reference  Subject1  Subject2  ...  Subject22
Chr1        335453    A          0         2         ...  0
Chr1        336565    G          1         5         ...  2
...
Chr22       3546372   C          1         0         ...  1


Thanks

SNP • 3.7k views

I assume you mean 0: homozygous reference, 1: heterozygous variant and 2: homozygous variant. Dominant and recessive don't make sense at the variant level. A variant can have a dominant or recessive effect on a phenotype, but that is not a variant state.

The job you are asking for is quite easy.
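As a minimal sketch of that encoding, assuming the genotypes come from the GT field of a VCF file (strings such as `0/1` or `1|1`), the mapping could look like the following. The function name `encode_genotype` and the choice to count every non-reference allele as "variant" are illustrative assumptions, not something specified in this thread.

```python
import re

MISSING = 5  # code requested in the question for a missing genotype


def encode_genotype(gt: str) -> int:
    """Map a VCF GT string to 0 (hom ref), 1 (het), 2 (hom alt) or 5 (missing)."""
    alleles = re.split(r"[/|]", gt)  # "/" is unphased, "|" is phased
    if len(alleles) != 2 or "." in alleles:
        return MISSING  # no-calls and non-diploid calls are treated as missing
    # Assumption: any non-reference allele counts as "variant",
    # so 1/2 at a multi-allelic site is encoded as 2.
    return sum(allele != "0" for allele in alleles)


for gt in ["0/0", "0/1", "1|0", "1/1", "./."]:
    print(gt, encode_genotype(gt))
```

The returned value is simply the number of non-reference alleles, which is why no notion of dominance is needed.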


Thank you, WouterDeCoster, for your answer! Could you please show me how to do it, or could I send you the data?

Bing


I assume this is whole-exome or whole-genome sequencing data. The GATK Best Practices are a well-documented and commonly accepted way of doing data processing and variant calling. After variant calling you will obtain VCF files, which can be converted to the numerical output you ask for (PLINK format, right?) using vcftools:

./vcftools --vcf input_data.vcf --plink --chr 1 --out output_in_plink
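For the exact matrix layout in the question (chromosome, position, reference allele, then one 0/1/2/5 column per subject), a small script over a multi-sample VCF is one possible route. The sketch below is only an illustration: `vcf_to_matrix` and `encode` are hypothetical helper names, the toy VCF contains made-up sites, and it assumes an uncompressed, tab-separated VCF with GT as the first FORMAT key.

```python
import io
import re


def encode(gt):
    """0 = hom ref, 1 = het, 2 = hom alt, 5 = missing (code from the question)."""
    alleles = re.split(r"[/|]", gt)
    if len(alleles) != 2 or "." in alleles:
        return 5
    return sum(a != "0" for a in alleles)


def vcf_to_matrix(handle):
    """Yield a header row, then one row per site: chrom, pos, ref, codes..."""
    for line in handle:
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            # Sample names start at the 10th column of the header line
            yield ["Chromosome", "Position", "Reference"] + fields[9:]
            continue
        # Assumption: GT is the first colon-separated entry of each sample column
        codes = [encode(sample.split(":")[0]) for sample in fields[9:]]
        yield [fields[0], fields[1], fields[3]] + codes


# Toy two-sample VCF (made-up sites, not real data)
toy_vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSubject1\tSubject2\n"
    "chr1\t335453\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/0:12\t1/1:9\n"
    "chr1\t336565\t.\tG\tT\t50\tPASS\t.\tGT:DP\t0/1:15\t./.:0\n"
)

for row in vcf_to_matrix(io.StringIO(toy_vcf)):
    print("\t".join(map(str, row)))
```

vcftools also has a `--012` option that writes a 0/1/2 genotype matrix directly, with missing genotypes coded as -1 rather than 5, so a script like this is mainly useful when the exact layout above is required.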


Yes, I want it in PLINK format. I see you put --chr 1; do you mean I should convert the files chromosome by chromosome? In other words, can I convert all chromosomes at once?

I will try it and let you know how it goes!

Thank you for your help


According to https://vcftools.github.io/man_latest.html (see SITE FILTERING OPTIONS), --chr is just a way to filter the file by including or excluding a certain chromosome. The command I posted is only an example that I copied from the documentation, so it is probably not a required argument.
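To illustrate what a chromosome filter of this kind does conceptually, here is a toy sketch. It is not how vcftools implements the option; `keep_chromosomes` and the record tuples are hypothetical names and values used only for the example.

```python
def keep_chromosomes(records, chroms=None):
    """Yield (chrom, pos) records; chroms=None keeps every chromosome."""
    for chrom, pos in records:
        if chroms is None or chrom in chroms:
            yield chrom, pos


records = [("1", 335453), ("1", 336565), ("22", 3546372)]
print(list(keep_chromosomes(records, chroms={"1"})))  # only chromosome 1
print(list(keep_chromosomes(records)))                # no filter: all sites
```

With no filter given, every site passes through, which matches the idea that all chromosomes can be converted in one run.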

8.0 years ago

You are trying to create an input file for PLINK, but all you need to do is perform variant calling on your samples and give the resulting VCF files directly to PLINK, since recent PLINK versions accept VCF files natively.
