Hi everybody,
I am working on rotavirus. This virus does not have one long genome as usual, but 11 separate segments. Each segment corresponds to a gene, so there are 11 genes, and each viral strain has all 11. We can find these sequences in the biological databases, but often not all genes of a given strain were sequenced, and/or only partial gene sequences are available for that strain. So my question is: how can I retrieve, using a script or some other means, only the full sequences of the 11 genes for the same viral strain? For example:
Strain 1: gene 1 (full sequence), gene 2 (full sequence), gene 3 (full sequence), gene 4 (full sequence), gene 5 (full sequence), gene 6 (full sequence), gene 7 (full sequence), gene 8 (full sequence), gene 9 (full sequence), gene 10 (full sequence), gene 11 (full sequence).
Strain 2: gene 1 (full sequence), gene 2 (full sequence), gene 3 (full sequence), gene 4 (full sequence), gene 5 (full sequence), gene 6 (full sequence), gene 7 (full sequence), gene 8 (full sequence), gene 9 (full sequence), gene 10 (full sequence), gene 11 (full sequence).
...
and so on.
Thank you for your help
I worked on rotavirus during my thesis. Apart from the few classical strains (RF..), I'm afraid there is no solution to your question.
Do you think so?
I think there is a solution for that through some scripts.
I think maybe we can create a personal database that is linked to the international databases and, through a script, get what we are looking for.
As far as I remember, most sequences are partial or poorly annotated, or only a few segments have been sequenced.
No, dear,
I have collected more than 130 full genome sequences so far.
Great! So things have changed! What was your method?
Just manually, from GenBank.
For that reason I want a faster method to retrieve the information automatically.
Again, what was your method? If "manually" means you looked in the articles and picked out the accession numbers, then you cannot automate things. If you found a way (e.g. a feature in GenBank) to get all the sequences for a given strain, then we might be able to help you.
Hi
Yes, manually through articles.
I think we can retrieve the rotavirus A sequences from GenBank. Then, from these sequences, we keep only the strains whose name appears 11 times, once per gene. The name of the strain is indicated in the title of the sequence, so we can collect all gene sequences from the same strain. Then we filter the results again to keep only sequences whose title says "complete". That way we get all the complete sequences of the same strain.
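The filtering idea above could be sketched roughly as follows. This is only a minimal sketch, assuming the record titles have already been downloaded as (accession, title) pairs. It also assumes the titles follow the common pattern "... strain NAME ... GENE ... complete cds". The function name complete_genomes and the three regular expressions are hypothetical helpers, not a GenBank feature. Titles that do not follow the pattern (which, as noted in this thread, is common with poorly annotated entries) are silently skipped.

```python
import re
from collections import defaultdict

# The 11 rotavirus A segments, named by the main protein each one encodes.
GENES = {"VP1", "VP2", "VP3", "VP4", "VP6", "VP7",
         "NSP1", "NSP2", "NSP3", "NSP4", "NSP5"}

# Heuristic patterns (assumptions about how titles are written).
STRAIN_RE = re.compile(r"\b(?:strain|isolate)\s+(\S+)", re.IGNORECASE)
GENE_RE = re.compile(r"\b(VP[1-467]|NSP[1-5])\b", re.IGNORECASE)
COMPLETE_RE = re.compile(r"\bcomplete\s+(?:cds|sequence|genome)\b", re.IGNORECASE)


def complete_genomes(records):
    """Group records by strain and keep strains with all 11 complete genes.

    records: iterable of (accession, title) pairs.
    Returns {strain: {gene: accession}}.
    """
    by_strain = defaultdict(dict)
    for accession, title in records:
        strain = STRAIN_RE.search(title)
        gene = GENE_RE.search(title)
        # Skip titles without a strain name, a gene name, or "complete ...".
        if not (strain and gene and COMPLETE_RE.search(title)):
            continue
        name = strain.group(1).rstrip(",;")
        # Keep the first accession seen for each gene of this strain.
        by_strain[name].setdefault(gene.group(1).upper(), accession)
    # A strain qualifies only if every one of the 11 genes is present.
    return {s: g for s, g in by_strain.items() if set(g) == GENES}
```

Calling complete_genomes on the list of titles returns, for each strain that qualifies, a mapping from gene name to accession number. The sequences themselves can then be fetched by accession. Strain names are compared exactly as written, so the same strain spelled in two different ways would be counted as two strains.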
Does that mean you want to retrieve a sequence that has not been sequenced or not been uploaded?
No. I just want to get from the database (NCBI) only the full sequences of the 11 genes that belong to the same strain or isolate.
Thank you, I will try that.