Extract non-repetitive DNA sequence
4.2 years ago
huiyus97 • 0

Hi,

I scanned for all repetitive DNA elements with RepeatMasker and got a file that looks like this:

u1  u2  u3  u4  scaffolds   begin   end (left)  strand  repeat  class   begin   end left    id
15  3.7 3.1 6.5 contig1 2955    2986    -9613389    +   (ATTA)n Simple_repeat   1   31  0   1   
29  4.8 2.2 2.2 contig1 3772    3816    -9612559    +   (AAGGCTAAA)n    Simple_repeat   1   45  0   2   
15  19.6    0   0   contig1 6019    6047    -9610328    +   (T)n    Simple_repeat   1   29  0   3   
14  25.1    3.3 3.3 contig1 9869    9928    -9606447    +   GA-rich Low_complexity  1   60  0   4

I then reformatted this file and extracted all repetitive elements with bedtools. However, I also want to extract the non-repetitive sequences, which I assume are all DNA sequences other than the repetitive ones.

Is there any way to extract the non-repetitive sequences directly, given a file that indicates the positions of all repetitive elements?

Thank you!

bedtools • 711 views

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.

4.2 years ago
JC 13k
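  1. Create a BED file that covers each contig's full length (start 0, end = contig length).
  2. Subtract the repetitive regions from that contig BED with `bedtools subtract`.
  3. Extract the sequences for the regions left over from step 2 (for example with `bedtools getfasta`).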
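
To make the logic of the three steps above concrete, here is a minimal pure-Python sketch of the same idea. It is not what bedtools runs internally, just an illustration under some assumptions: the function names (`parse_fasta`, `non_repeat_intervals`, `extract`) and the toy contig are made up for this example, and intervals are assumed to be BED-style (0-based, half-open). RepeatMasker reports 1-based inclusive coordinates, so a repeat at `begin..end` would become `(begin - 1, end)` here.

```python
from collections import defaultdict


def parse_fasta(text):
    """Return {contig_name: sequence} from FASTA-formatted text."""
    parts, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


def non_repeat_intervals(contig_lengths, repeats):
    """Steps 1 and 2: take each contig as (0, length) and subtract the repeats.

    contig_lengths: {contig: length}
    repeats: iterable of (contig, start, end), 0-based half-open
    """
    by_contig = defaultdict(list)
    for contig, start, end in repeats:
        by_contig[contig].append((start, end))

    result = []
    for contig, length in contig_lengths.items():
        cursor = 0  # first position not yet covered by a repeat
        for start, end in sorted(by_contig.get(contig, [])):
            if start > cursor:
                result.append((contig, cursor, start))
            cursor = max(cursor, end)  # also handles overlapping repeats
        if cursor < length:
            result.append((contig, cursor, length))
    return result


def extract(seqs, intervals):
    """Step 3: pull out the sequence for each remaining interval."""
    return {f"{c}:{s}-{e}": seqs[c][s:e] for c, s, e in intervals}


# Toy data (hypothetical): one 16 bp contig with a repeat at positions 4..12.
seqs = parse_fasta(">contig1\nACGTATTAATTAGGCC\n")
lengths = {name: len(seq) for name, seq in seqs.items()}
repeats = [("contig1", 4, 12)]

kept = non_repeat_intervals(lengths, repeats)
print(extract(seqs, kept))
```

On the toy contig this keeps the two flanks around the masked `ATTAATTA` stretch, namely `ACGT` and `GGCC`. For real genomes bedtools will be much faster; the sketch is only meant to show what "contig minus repeats" means.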