Bedtools maskFastaFromBed does not mask on the minus strand
4.8 years ago
Biostar • 0

Hello,

I have an unmasked genome and its repeat annotation in bed/gff format. I tried to soft-mask the genome using maskFastaFromBed from bedtools (v2.26), but it doesn't take the strand information from the .bed file into account, so intervals on the minus strand are not treated any differently. There is also no -s option to switch on strandedness (a feature available in getfasta). So the masking is always applied relative to the + strand, which surprised me. Does anyone have any ideas or alternative tools to suggest? An example run and its output are below.

Thanks.

genome.fasta

>Chr1
ATTGACAGAAATATCATCACATCTATTCTTTCTCTCCCCTAGTTTAGCAAAT
>Chr2
GACATATAAATAATAGTGGGAAAGAGACCGGATGAAACCTCAACTGTGGCTTTCATTAACAGATCA

genome.bed

Chr1 0 17 for 1 +
Chr2 0 17 rev 1 -

maskFastaFromBed -soft -fi genome.fasta -bed genome.bed -fo genome_softmasked.fasta

Output:

>Chr1
attgacagaaatatcatCACATCTATTCTTTCTCTCCCCTAGTTTAGCAAAT
>Chr2
gacatataaataatagtGGGAAAGAGACCGGATGAAACCTCAACTGTGGCTTTCATTAACAGATCA
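
The behaviour shown above can be reproduced with a small Python sketch. This is not how bedtools is implemented; it only assumes that soft masking lowercases the 0-based, half-open interval [start, end) on the sequence as written in the FASTA file and never reads the strand column. The function name `soft_mask` is made up for illustration.

```python
def soft_mask(seq, start, end):
    """Lowercase seq[start:end]; the strand column is deliberately ignored."""
    return seq[:start] + seq[start:end].lower() + seq[end:]


chr1 = "ATTGACAGAAATATCATCACATCTATTCTTTCTCTCCCCTAGTTTAGCAAAT"
chr2 = "GACATATAAATAATAGTGGGAAAGAGACCGGATGAAACCTCAACTGTGGCTTTCATTAACAGATCA"

# Both BED records are "0 17"; the strand (+ or -) plays no role here.
masked_chr1 = soft_mask(chr1, 0, 17)
masked_chr2 = soft_mask(chr2, 0, 17)

print(masked_chr1)
print(masked_chr2)
```

Under that assumption both records mask the first 17 bases, which matches the output above.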

Can you provide an example of what you would like the output to look like?


Hello,

I would have expected the output for the minus-strand record to look like this (as also shown by user Fatima below). Am I getting this wrong?

>Chr2
GACATATAAATAATAGTGGGAAAGAGACCGGATGAAACCTCAACTGTGGctttcattaacagatca

Thanks


Would it work if you modified genome.bed based on the strand? I assume that for 0 17 on the negative strand you want the mask applied to the last 17 nucleotides, counting from the end of the sequence back toward the beginning.

>Chr2

GACATATAAATAATAGTGGGAAAGAGACCGGATGAAACCTCAACTGTGGctttcattaacagatca

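If that's the case, I think you can modify your genome.bed: for records on the - strand, replace start and stop with length-stop and length-start, where length is the length of the sequence. BED coordinates are 0-based and half-open, so 0 17 on the 66-nt Chr2 becomes 49 66.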
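
The coordinate flip suggested above can be sketched in Python. This is only an illustration under two assumptions: the BED interval is 0-based and half-open, and a - strand record is meant to be counted from the end of the sequence. The helper name `flip_minus_strand` is hypothetical; in a real run the sequence lengths would have to come from somewhere such as a FASTA index.

```python
def flip_minus_strand(start, end, strand, chrom_length):
    """Map an interval counted from the sequence end to forward coordinates."""
    if strand == "-":
        return chrom_length - end, chrom_length - start
    return start, end


chr2 = "GACATATAAATAATAGTGGGAAAGAGACCGGATGAAACCTCAACTGTGGCTTTCATTAACAGATCA"

# The Chr2 record from genome.bed: 0 17 on the - strand.
new_start, new_end = flip_minus_strand(0, 17, "-", len(chr2))

# Soft-mask the flipped interval by lowercasing it.
masked = chr2[:new_start] + chr2[new_start:new_end].lower() + chr2[new_end:]

print(new_start, new_end)  # 49 66
print(masked)
```

Writing the flipped coordinates back to genome.bed and running maskFastaFromBed on that file should then mask the last 17 bases of Chr2, as in the sequence shown above.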


Hi Fatima,

Thanks for the reply. Yes, there are ways to modify the .bed file, but we shouldn't have to: bedtools uses strand information in other tools (such as getfasta), but apparently not in maskFastaFromBed.
