How can I get the 31 chromosomes from a VCF file?
6.1 years ago
Mbillah ▴ 140

I have a VCF file. When I run the command grep -v -E '^#' variants.vcf | cut -f 1 | sort | uniq -c I get 6835 lines, but my species has no more than 31 chromosomes. Is there any way to pick out the 31 chromosomes from these 6835 lines? The lines look like:

 NC_005044.2
 NC_030808.1
 NC_030809.1
 NC_030810.1
 NC_030811.1
 NC_030812.1
 NC_030813.1

Thank you.

chromosome vcf • 1.9k views

"Is there any way to pick out the 31 chromosomes from these 6835 lines?"

What exactly does this mean? Are you concerned that there are more than 31 entries that look like chromosomes?

Yes. How can I differentiate them? There are two prefixes, "NC" and "NW". What does "NW" mean?

NC accessions are fully assembled chromosomes. NW accessions are scaffolds/contigs that are still part of the genome. They contain runs of N (NNNN) where sequence is missing. You can find a full listing here.
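
Given the NC/NW distinction, one way to see only the assembled chromosomes is to filter the first VCF column on the accession prefix. This is a minimal sketch, assuming an uncompressed VCF with RefSeq-style contig names as in the question; the function names are made up for illustration.

```shell
# Count variant records per contig, keeping only RefSeq "NC_" accessions.
count_nc_contigs() {
    # $1: path to an uncompressed VCF file
    grep -v '^#' "$1" | cut -f 1 | grep '^NC_' | sort | uniq -c
}

# Report how many distinct "NW_" scaffolds carry at least one variant.
count_nw_scaffolds() {
    # $1: path to an uncompressed VCF file
    grep -v '^#' "$1" | cut -f 1 | grep '^NW_' | sort -u | wc -l
}
```

Running count_nc_contigs variants.vcf should then list only the NC_ entries with their record counts. Note that NC_ accessions are also used for organelle genomes, so the mitochondrion may appear in that list alongside the nuclear chromosomes.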

6.1 years ago
grep '^chr' variants.vcf | cut -f 1 | sort | uniq -c

Hello Bastien, my VCF file doesn't contain the string "chr" anywhere, so your command does not work.

So you need a conversion table from your chromosome names (NC_005044.2, NC_030808.1...) to standard chromosome names (chr1, chr2...).
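
Once such a table exists, applying it could look like the sketch below. It assumes a hypothetical two-column, tab-separated file (old name, new name) and an uncompressed VCF; the function name is made up for illustration.

```shell
# Rewrite the CHROM column of a VCF using a two-column conversion table.
rename_vcf_contigs() {
    # $1: tab-separated table: old contig name, new contig name (must not be empty)
    # $2: uncompressed VCF file
    awk -F '\t' -v OFS='\t' '
        NR == FNR   { map[$1] = $2; next }   # first file: load the table
        /^#/        { print; next }          # header lines pass through unchanged
        ($1 in map) { $1 = map[$1] }         # rename contigs found in the table
                    { print }                # contigs not in the table keep their name
    ' "$1" "$2"
}
```

This only touches the data lines, so any ##contig lines in the header still carry the old names. bcftools annotate with its --rename-chrs option takes the same kind of table and is probably the more robust route for real files.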

How can I create a conversion table?

Your post is missing some essential information, such as the species you are working on, which seems to be Capra hircus. You can find the conversion table for your chromosomes here.
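
If the table is taken from an NCBI assembly report file, it could be reduced to a two-column mapping along these lines. This is a sketch under the assumption that the report uses the usual tab-separated layout (Sequence-Role in column 2, Assigned-Molecule in column 3, RefSeq-Accn in column 7, '#' comment lines); the function name and the "chr" prefix are illustrative choices.

```shell
# Build a "RefSeq accession -> chrN" table from an NCBI assembly report.
make_chrom_table() {
    # $1: path to the assembly report text file
    # Strip carriage returns first, in case the file has Windows line endings.
    tr -d '\r' < "$1" |
        awk -F '\t' '!/^#/ && $2 == "assembled-molecule" { print $7 "\tchr" $3 }'
}
```

Only rows with the role "assembled-molecule" are kept, which drops the unplaced scaffolds and leaves one line per chromosome (plus the mitochondrion, if the report lists it).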

Yes, my species is Capra hircus. Can you tell me what "NW" means? And how can I find the start and end positions of each chromosome?

You can find the assembly/annotation report for Capra hircus here. Scroll down the page and click on the Assembly statistics tab to get the start and stop positions for each chromosome. Note: there are two goat assemblies. The one linked here is ARS1.
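
As an alternative to the web page, the contig lengths are sometimes recorded in the VCF header itself. The sketch below assumes the header contains ##contig lines with a length attribute (the VCF format allows this but does not require it) and prints each contig as name, start, end, with the start taken as 1. The function name is made up for illustration.

```shell
# Print "contig <TAB> 1 <TAB> length" for every ##contig header line that has a length.
contig_ranges() {
    # $1: path to an uncompressed VCF file
    awk '
        /^##contig=/ {
            id = ""; len = ""
            if (match($0, /ID=[^,>]+/))     id  = substr($0, RSTART + 3, RLENGTH - 3)
            if (match($0, /length=[0-9]+/)) len = substr($0, RSTART + 7, RLENGTH - 7)
            if (id != "" && len != "") print id "\t1\t" len
        }
        /^#CHROM/ { exit }   # header is over, no need to read the records
    ' "$1"
}
```

Piping the result through grep '^NC_' would restrict it to the assembled chromosomes. If the header has no ##contig lines, the function prints nothing and the assembly report remains the place to look.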
