[main_samview] fail to read the header from "human_g1k_v37.annotate.fasta".
12 months ago
Chen • 0

Hi,

I tried to add the prefix "chr" to the chromosome names in a fasta file, like this:

sed 's/^>/>chr/' human_g1k_v37.fasta > human_g1k_v37.annotate.fasta

However, after that, I could not view the header of the new fasta file:

samtools view -H human_g1k_v37.annotate.fasta
>>> [main_samview] fail to read the header from "human_g1k_v37.annotate.fasta".

What might be causing this error? I would also like to know if there's any alternative way to annotate a fasta file. Thanks.

reference samtools fasta • 853 views
12 months ago

samtools view is a tool to show the content of a BAM/SAM/CRAM file. Not fasta.
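Since samtools view does not read fasta, a plain text search is one way to inspect the sequence names instead. The sketch below assumes an uncompressed fasta and uses a hypothetical toy file (toy.fasta) in place of human_g1k_v37.annotate.fasta; the record names and sequences are made up for illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
tmpdir=$(mktemp -d)

# Hypothetical two-record fasta standing in for the real reference.
printf '>chr1 toy record\nACGT\n>chr2 toy record\nGGCC\n' > "$tmpdir/toy.fasta"

# Fasta "headers" are simply the lines that start with ">",
# so grep is enough to list them.
headers=$(grep '^>' "$tmpdir/toy.fasta")
echo "$headers"
```

On the real file the same grep would be run against human_g1k_v37.annotate.fasta directly.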

if there's any alternative way to annotate a fasta file?

define "annotate"


Hi, sorry for the confusion. By "annotate" I mean renaming the chromosomes in the fasta, for example converting >1 to >chr1. Thanks.


You renamed the fasta with sed...
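To confirm that the sed command from the question already does the renaming, it can be tried on a small file first. This is only a sketch: in.fasta and out.fasta are hypothetical names, and the two records (named 1 and MT, as in the 1000 Genomes naming) carry made-up sequences.

```shell
# Scratch directory for the toy input and output.
tmpdir=$(mktemp -d)

# Toy input using the un-prefixed naming style.
printf '>1\nACGT\n>MT\nGGCC\n' > "$tmpdir/in.fasta"

# Same substitution as in the question: prefix every header line with "chr".
sed 's/^>/>chr/' "$tmpdir/in.fasta" > "$tmpdir/out.fasta"

# List the renamed headers to check the result.
renamed=$(grep '^>' "$tmpdir/out.fasta")
echo "$renamed"
```

Note that the mitochondrial record becomes chrMT, whereas UCSC-style references such as hg19 call it chrM, so a blanket prefix does not reproduce hg19 naming for every sequence.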


If you're doing that then it probably means you downloaded the wrong copy of the genome. Eg the 1000 Genomes GRCh37 reference (g1k_v37) uses ">1", while UCSC-style references such as hg19 and most GRCh38 distributions use ">chr1". Just editing the names may get you past the first hurdle, but cause vastly bigger problems downstream.

I'd go back to square one. Why do you think g1k_v37 is the correct reference? Is it actually that, or is it the almost-but-not-quite-identical hg19? If you don't know, go back to the source and ask them, or look at the metadata in the @SQ header lines. Maybe it'll give proper data provenance. (Although sadly many people consider basic things like keeping track of what they're doing to not be an integral part of science!)
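As a rough sketch of what checking the @SQ lines could look like: the header below is a hypothetical example written to a text file so the snippet is self-contained. For a real alignment file the same text would come from samtools view -H on the BAM/SAM/CRAM (sample.bam would be a placeholder name), and whether the AS (assembly) tag is present depends on how the file was produced.

```shell
# Scratch directory for the illustrative header.
tmpdir=$(mktemp -d)

# Hypothetical SAM header; fields are tab-separated as the SAM format requires.
printf '@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:1\tLN:249250621\tAS:GRCh37\n@SQ\tSN:2\tLN:243199373\tAS:GRCh37\n' > "$tmpdir/header.sam"

# Pull the sequence name (SN) out of every @SQ line.
sq_names=$(awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' "$tmpdir/header.sam")

# Pull the assembly tag (AS), if any, and collapse duplicates.
assemblies=$(awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^AS:/) print substr($i, 4) }' "$tmpdir/header.sam" | sort -u)

echo "$sq_names"
echo "$assemblies"
```

If the names here lack the "chr" prefix, the data were most likely aligned to a reference with un-prefixed names, and matching that reference is safer than renaming sequences.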
