Using samtools faidx with a large list of genomic loci
2.9 years ago
jsandler • 0

I'm using samtools faidx to generate a FASTA file containing the sequences for various lists of genomic loci I'm interested in. Each list contains roughly 100-1000 unique positions. I would like to pass the file containing the loci directly to faidx, but I can't get samtools to read it, so for now I paste the list in from the output of a SQL query. Is there a file format samtools can read that would save me the pain of cut and paste? Here is my command:

samtools faidx /<path to genome>/genome.fa loci_list.csv output.fa

The error returned is:

[W::fai_get_val] Reference loci_list.csv not found in FASTA file, returning empty sequence
[faidx] Failed to fetch sequence in loci_list

Thanks

samtools faidx alignment • 1.4k views
2.9 years ago
liorglic ★ 1.5k

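samtools faidx treats every argument after the FASTA file as a region name, which is why loci_list.csv was looked up as a reference sequence. To read the regions from a file instead, run: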

samtools faidx /<path to genome>/genome.fa -r loci_list > output.fa

Your loci_list file should contain one region per line in the format chr:start-end (1-based, inclusive coordinates), not CSV rows.
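If the SQL export is a CSV, it can probably be converted to a region file in one step. Below is a minimal sketch, assuming a hypothetical column layout of chrom,start,end with a header row and 1-based inclusive coordinates. The file names loci_list.csv, loci_list.txt and genome.fa are placeholders, so adjust them and the column numbers to match your actual export.

```shell
# Hypothetical CSV export from the SQL query (assumed layout: chrom,start,end with a header row).
cat > loci_list.csv <<'EOF'
chrom,start,end
chr1,100,200
chr2,3500,3650
EOF

# Strip any Windows carriage returns, skip the header row,
# and write one chr:start-end region per line.
tr -d '\r' < loci_list.csv \
  | awk -F',' 'NR > 1 && NF >= 3 { print $1 ":" $2 "-" $3 }' > loci_list.txt

# Fetch the sequences only if samtools and the (placeholder) genome.fa are available.
# Options are given before the FASTA path here.
if command -v samtools >/dev/null 2>&1 && [ -f genome.fa ]; then
    samtools faidx -r loci_list.txt genome.fa > output.fa
fi
```

After running this, loci_list.txt should hold one region per line, which is the form the -r option expects.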
Alternatively, use bedtools getfasta as described here.
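For the bedtools route, the loci need to be in BED format, which is tab-separated and uses 0-based, half-open coordinates, so the start of a 1-based region has to be reduced by one. Here is a rough sketch, assuming the loci are already in chr:start-end form and that no chromosome name contains ':' or '-'. The file names are placeholders.

```shell
# Hypothetical region list in chr:start-end form (1-based, inclusive).
cat > loci_list.txt <<'EOF'
chr1:100-200
chr2:3500-3650
EOF

# Split each line on ':' and '-', then emit tab-separated BED.
# BED starts are 0-based, so subtract 1 from the start; the end is unchanged.
awk -F'[:-]' 'NF == 3 { printf "%s\t%d\t%s\n", $1, $2 - 1, $3 }' loci_list.txt > loci_list.bed

# Extract the sequences only if bedtools and the (placeholder) genome.fa are available.
if command -v bedtools >/dev/null 2>&1 && [ -f genome.fa ]; then
    bedtools getfasta -fi genome.fa -bed loci_list.bed -fo output.fa
fi
```

Note that bedtools writes the FASTA headers using the 0-based BED coordinates, so they will differ by one at the start position from the headers samtools produces.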
