Detect STRs in an Illumina library
15 months ago

I have paired-end whole genome sequencing data, and I would like to try finding STRs (Short Tandem Repeats) in this data. What tools should I use? There is no assembled reference genome available for my species.

repeats annotation STR • 895 views

Unfortunately, these tools only work with long reads. I have an Illumina short-read library.

14 months ago

BBMask in the BBTools package can find short repeats, depending on the length you're interested in...

bbmask.sh in=reads.fq out=masked.fq maskrepeats minkr=1 maxkr=15 minlen=40 minrepeats=4 lowercase=t masklowentropy=f

That will mask (to lowercase) sequences with STRs with repeating subunits of length between 1 and 15. Then you can filter the reads with lowercase letters in them... I don't have a program for that though.
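For the filtering step, something like the following Python sketch might do. It is not part of BBTools. It assumes the masked output is uncompressed, four-line-per-record FASTQ (no wrapped sequence lines), and it treats any lowercase base as "masked". The function names and the output file name in the usage line are made up for illustration.

```python
from typing import Iterable, Iterator, Tuple

# One FASTQ record: (header, sequence, separator, quality)
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from 4-line FASTQ input (assumption: no line wrapping)."""
    record = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line and not record:
            continue  # tolerate blank lines between records
        record.append(line)
        if len(record) == 4:
            yield (record[0], record[1], record[2], record[3])
            record = []
    if record:
        raise ValueError("truncated FASTQ record at end of input")


def has_masked_bases(seq: str) -> bool:
    """True if the sequence contains at least one lowercase (masked) base."""
    return any(base.islower() for base in seq)


def keep_masked(records: Iterable[FastqRecord]) -> Iterator[FastqRecord]:
    """Pass through only the reads that contain masked bases."""
    for rec in records:
        if has_masked_bases(rec[1]):
            yield rec


def filter_masked_fastq(in_path: str, out_path: str) -> int:
    """Write reads with masked bases to out_path; return how many were kept."""
    kept = 0
    with open(in_path) as fin, open(out_path, "w") as fout:
        for rec in keep_masked(read_fastq(fin)):
            fout.write("\n".join(rec) + "\n")
            kept += 1
    return kept
```

Calling `filter_masked_fastq("masked.fq", "str_reads.fq")` would write the candidate STR-containing reads to `str_reads.fq`. Note that with paired-end data, filtering R1 and R2 separately like this will leave the two files out of sync, so the pairs would need to be matched up again by read name afterwards.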


Thank you. I've since found that RepeatExplorer2 (https://www.nature.com/articles/s41596-020-0400-y) is the perfect tool for this purpose.
