Why Is One FASTA File Indexed Differently From the Others Using BioPerl's DB::Fasta Module?

0

11.1 years ago

RossCampbell ▴ 140

I'm trying to extract sequences from several FASTA chromosome files using BioPerl's Bio::DB::Fasta module, which works fine for all but one file. As expected, the module indexes each file first; for example, 'chr1.fa' is indexed as 'chr1.fa.index'. For one file (chr11.fa), however, it creates a file called '_db.chr11.fa.index', which throws off the rest of my pipeline, so my output has no sequence from that chromosome. Is anyone more familiar with this module who knows why this index file is named differently from the others?
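To spot which chromosome files lack the expected index, a small shell check can help. This is only a sketch: the function name `report_missing_indexes` is hypothetical, and the `<file>.index` naming is assumed from the 'chr1.fa' to 'chr1.fa.index' pattern described above, not taken from the module's documentation.

```shell
# Print every FASTA file that has no "<file>.index" next to it.
# Assumption: index files follow the chr1.fa -> chr1.fa.index pattern
# described in the post.
report_missing_indexes() {
    for fa in "$@"; do
        if [ ! -e "$fa.index" ]; then
            echo "$fa"
        fi
    done
}

# Example call (commented out so the snippet has no side effects):
# report_missing_indexes chr*.fa
```

Any file it prints is one that a downstream step expecting `<file>.index` would likely fail to find.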

bioperl fasta • 2.5k views

ADD COMMENT • link 11.1 years ago by RossCampbell ▴ 140

2

Are all the FASTA IDs unique across your files? Did you accidentally put >chr1 in your chr11.fa file? What does "grep '>' chr*.fa" say?
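If the grep output is too long to check by eye, the same idea can be narrowed to print only repeated IDs. This is a rough sketch: the function name `duplicate_fasta_ids` is hypothetical, and it assumes an ID is the first whitespace-delimited token after '>'.

```shell
# Print FASTA IDs that occur more than once across the given files.
# Assumption: the ID is the first whitespace-delimited token after '>'.
duplicate_fasta_ids() {
    # -h drops the filename prefix so identical IDs from different files
    # compare equal; uniq -d keeps one copy of each repeated ID.
    grep -h '^>' "$@" | sed 's/^>//' | awk '{print $1}' | sort | uniq -d
}

# Example call (commented out so the snippet has no side effects):
# duplicate_fasta_ids chr*.fa
```

Empty output means no ID appears twice among the files given.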

ADD REPLY • link 11.1 years ago by Torst ▴ 980

0

They are all unique. grep returned: chr10.fa:>chr10 chr11.fa:>chr11 chr12.fa:>chr12...and so on down the list. Each ID matches its filename.

ADD REPLY • link 11.1 years ago by RossCampbell ▴ 140

0

Update: This issue was resolved when I installed Ubuntu updates this morning. I don't know what fixed it, but apparently something behind the scenes wasn't working right. It's fine now.

ADD REPLY • link 11.1 years ago by RossCampbell ▴ 140
