I am trying to use BEDtools to get sequences from genomic coordinates, but for every read in my BED file I get this error: "WARNING. chromosome (chr12) was not found in the FASTA file. Skipping."
Here are some details about what I am doing.
I just downloaded the latest version of BEDtools (I think), bedtools-2.17.0.
I have two files (both much longer than the short excerpts shown here):
A FASTA file with all the chromosome sequences:
>chr01
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
A BED file with my genomic coordinates (already sorted):
chr01 187814 190840
chr01 307073 310104
chr01 701047 704068
chr01 702941 705962
chr01 702952 705972
chr01 867716 870740
chr01 914064 917087
chr01 991080 994104
chr01 1039795 1042815
chr01 1058713 1061736
Then I run this command:
bedtools getfasta -fi all.con -bed 1-13_sorted2.bed -fo NewCandidates/Genomic_coordinates/1-13_1500.fa
The only thing I get is "WARNING. chromosome (chr01) was not found in the FASTA file. Skipping.", thousands of times...
If someone can tell me what I am doing wrong, I would be very grateful.
I had the same problem. The issue was the .fai index file generated by bedtools. The solution was to remove the bedtools-generated .fai file, run samtools faidx on the input FASTA first, and then run bedtools getfasta.
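The steps in this answer can be sketched as a small shell function. This is only a sketch: it assumes samtools and bedtools are both on the PATH and that the index sits next to the FASTA as `<fasta>.fai`. The function name `rebuild_index_and_extract` is made up for illustration.

```shell
# Hypothetical helper: drop the old index, rebuild it with samtools, then extract.
# Usage: rebuild_index_and_extract genome.fa regions.bed out.fa
rebuild_index_and_extract() {
    fasta=$1
    bed=$2
    out=$3
    # Remove the existing index file that sits next to the FASTA.
    rm -f "$fasta.fai"
    # Rebuild the index with samtools; stop here if that fails.
    samtools faidx "$fasta" || return 1
    # Extract the sequences for the BED intervals.
    bedtools getfasta -fi "$fasta" -bed "$bed" -fo "$out"
}
```

With the file from the question, the first argument would be all.con, followed by the BED file and the output path.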
Well, if you just need FASTA files for each chromosome (hg19), you can download them from the UCSC Genome Browser: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/chromosomes/
Your paste above is not in FASTA format. Did the editor eat your ">"? Each record should start with a header line such as ">chr01", followed by the sequence lines.
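To check the real file rather than the pasted excerpt, one option is to list the header names in the FASTA and compare them with the chromosome names in the BED file. The sketch below uses only standard Unix tools. The function names are made up, and it assumes the sequence name is the first whitespace-delimited word after ">", which may not be how every bedtools version reads headers.

```shell
# List sequence names in a FASTA file: header lines start with ">",
# and the name is taken to be the first word after it (an assumption).
fasta_names() {
    grep '^>' "$1" | sed 's/^>//' | awk '{print $1}' | sort -u
}

# List the chromosome names used in a BED file (first column).
bed_names() {
    awk '{print $1}' "$1" | sort -u
}

# Print BED chromosome names that have no matching FASTA header.
# Usage: missing_in_fasta genome.fa regions.bed
missing_in_fasta() {
    names=$(mktemp)
    fasta_names "$1" > "$names"
    # -F fixed strings, -x whole-line match, -v keep the non-matching names
    bed_names "$2" | grep -v -x -F -f "$names"
    rm -f "$names"
}
```

If `missing_in_fasta all.con` followed by the BED file prints nothing, the names agree and the problem is probably elsewhere (for example in the index file).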
Yes, this happens because ">" is formatted as a blockquote UNLESS you indent lines with 4 spaces. Please note that questions are auto-previewed as you type, to help avoid this kind of problem. Fixed it for you.
I don't get it. Is it chr01, or chr12, or both (all) of them?
What about this: instead of using all.con, try a FASTA file with only one chromosome in it and a BED file with only that chromosome's coordinates.
Thanks for your comment, but then I would need to split all my files, and I have 24 different libraries for 12 chromosomes... If this is the only solution, I don't think it is the best one for me...
Just to test if it's working:
-fi chr01.fa -bed chr01.bed
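The two test files can be made without splitting everything by hand. A rough sketch with awk follows. The function names are hypothetical, `regions.bed` in the commented usage is a placeholder for the real BED file, and the FASTA name is again assumed to be the first word of the header.

```shell
# Keep only the BED lines for one chromosome.
# Usage: bed_for_chrom regions.bed chr01 > chr01.bed
bed_for_chrom() {
    awk -v c="$2" '$1 == c' "$1"
}

# Keep only the FASTA record whose header name matches the chromosome.
# Usage: fasta_for_chrom all.con chr01 > chr01.fa
fasta_for_chrom() {
    # On each header line, decide whether to keep the record;
    # then print every line while "keep" is true.
    awk -v c="$2" '/^>/ { keep = (substr($1, 2) == c) } keep' "$1"
}

# Example run (commented out; file names are placeholders):
# bed_for_chrom regions.bed chr01 > chr01.bed
# fasta_for_chrom all.con chr01 > chr01.fa
# bedtools getfasta -fi chr01.fa -bed chr01.bed -fo test.fa
```

An empty chr01.fa would already show that the header name in the FASTA is not what the BED file expects.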
Hi, I tried it and I get exactly the same error: "WARNING. chromosome (chr01) was not found in the FASTA file. Skipping." Thanks again!
My command line, in case it helps: bedtools getfasta -fi Chr1.con -bed NewCandidates/Genomic_coordinates/1-13_chr01.bed -fo NewCandidates/Genomic_coordinates/1-13_1500.fa