How to find the number of TEs in a set of genomes?
5.7 years ago
little_more ▴ 70

I have a list of assembly IDs (GenBank) and a list of the corresponding chromosomes and plasmids. A toy example:

a = [GCA_000005845.2, GCA_000006925.2, GCA_000007405.1, GCA_000007445.1, ...]
b = [CP024720.1, CP024722.1, CP024721.1, LT601384.1, LT838196.1,...]

I'd like to find the number of IS elements in each genome. The only thing that has come to my mind: parse all CDS in each genome with BioPython and count the CDS with "IS ... transposase" in their "product" qualifier. Is there a better way to do this? Can I somehow use GO? Note that the lists are quite big, so I need an automated way.
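For reference, here is a minimal sketch of the CDS-counting idea described above. It assumes the genomes are already available as annotated GenBank files, and that IS transposases carry a product such as "IS3 family transposase"; the regular expression and the function names (`IS_PRODUCT`, `count_is_transposase_cds`, `count_in_genbank`) are illustrative assumptions, not an established convention. Note that it counts annotated transposase CDS, which is only a proxy for the number of IS elements: an element whose transposase is split over two ORFs would be counted twice, and unannotated elements would be missed.

```python
import re

# Assumption: IS transposase products look like "IS3 family transposase",
# "IS200/IS605 family transposase" or "ISL3 family transposase".
# "IS" is matched case-sensitively so that words like "isomerase" are ignored.
IS_PRODUCT = re.compile(r"\bIS(?:[A-Z0-9]\w*)?\b.*[Tt]ransposase")


def count_is_transposase_cds(records):
    """Count CDS features whose 'product' qualifier looks like an IS transposase.

    `records` is any iterable of SeqRecord-like objects, i.e. objects with a
    `.features` list whose items have `.type` and `.qualifiers`.
    """
    count = 0
    for record in records:
        for feature in record.features:
            if feature.type != "CDS":
                continue
            # Biopython stores qualifier values as lists of strings.
            products = feature.qualifiers.get("product", [])
            if any(IS_PRODUCT.search(product) for product in products):
                count += 1
    return count


def count_in_genbank(path):
    """Count IS transposase CDS in one GenBank file (requires Biopython)."""
    from Bio import SeqIO  # imported here so the counting logic has no hard dependency

    return count_is_transposase_cds(SeqIO.parse(path, "genbank"))
```

Calling `count_in_genbank` once per downloaded file (one file per assembly, or per chromosome/plasmid accession) and summing the results per assembly would give one count per genome.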

genome biopython • 980 views

I had to google this.

"TE" : Transposable Element


What do you mean by "IS"? Before performing any bioinformatic analysis, I would recommend answering these questions:

> What am I looking for?
> Is it possible to get it by in silico analysis?
> Am I able to perform this in silico analysis myself, or do I need professional help?

'IS' is a pretty common abbreviation for "insertion sequence", a type of mobile element in prokaryotic genomes. I do not see the purpose of your comment, because I am asking exactly "is there a way..." and I have already tried one of the ways.
