Hi,
I have run tblastn and now have BLAST tabular output for my proteins. Each hit has the following fields:
queryId, subjectId, percIdentity, alnLength, mismatchCount, gapOpenCount, queryStart, queryEnd, subjectStart, subjectEnd, eVal, bitScore
What I am interested in is, for each protein, extracting the subject start and end coordinates of its hits and building a complete protein from them. The problem is how to handle multiple or overlapping hits while taking the e-value and percent identity into account. Is there a simple way to extract this information for each protein to build a model protein? Any tool or code would help.
If the question is very simple, please point me in the right direction.
Thanks in advance
How do you want to construct a protein sequence when your BLAST result gives you nucleotide matches? What's the whole point of this? I'm a bit confused by the question ;)
You are right :), so let me explain: the BLAST hits are nucleotide matches. I want to retrieve those hits for each query protein and then translate the nucleotide sequences to protein. Does that answer your question?
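For what it's worth, here is a rough Python sketch of one way this could be done, not a definitive method. It assumes the 12-column tab-separated layout listed in the question, 1-based inclusive coordinates, and that subjectStart > subjectEnd means a reverse-strand hit. The function names, the thresholds (`max_evalue`, `min_pident`) and the `subject_seqs` dictionary (subject ID to nucleotide sequence, loaded however you like) are all made up for illustration. Overlaps are resolved greedily on query coordinates, keeping the highest bit score, and pieces are joined per query/subject pair. It does not handle introns, frameshifts, or hits split across strands.

```python
from collections import defaultdict, namedtuple

Hit = namedtuple("Hit", "query subject pident qstart qend sstart send evalue bitscore")

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_hits(lines):
    """Parse 12-column tabular BLAST lines, skipping blanks and comments."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        hits.append(Hit(f[0], f[1], float(f[2]), int(f[6]), int(f[7]),
                        int(f[8]), int(f[9]), float(f[10]), float(f[11])))
    return hits


def select_hits(hits, max_evalue=1e-5, min_pident=30.0):
    """Filter hits, then keep non-overlapping ones per (query, subject) pair."""
    groups = defaultdict(list)
    for h in hits:
        if h.evalue <= max_evalue and h.pident >= min_pident:
            groups[(h.query, h.subject)].append(h)
    selected = {}
    for key, group in groups.items():
        kept = []
        # Best bit score first; drop anything overlapping a kept hit on the query.
        for h in sorted(group, key=lambda x: x.bitscore, reverse=True):
            if all(h.qend < k.qstart or h.qstart > k.qend for k in kept):
                kept.append(h)
        selected[key] = sorted(kept, key=lambda x: x.qstart)
    return selected


def translate_hit(hit, subject_seqs):
    """Cut the subject range for one hit and translate it in frame."""
    seq = subject_seqs[hit.subject].upper()
    lo, hi = sorted((hit.sstart, hit.send))
    segment = seq[lo - 1:hi]
    if hit.sstart > hit.send:  # reverse strand: reverse-complement first
        segment = segment.translate(COMPLEMENT)[::-1]
    # Unknown or ambiguous codons become "X".
    return "".join(CODON_TABLE.get(segment[i:i + 3], "X")
                   for i in range(0, len(segment) - 2, 3))


def build_model_proteins(hits, subject_seqs, **thresholds):
    """Join translated pieces in query order for each (query, subject) pair."""
    return {key: "".join(translate_hit(h, subject_seqs) for h in kept)
            for key, kept in select_hits(hits, **thresholds).items()}
```

You would call it roughly as `build_model_proteins(parse_hits(open("hits.tsv")), subject_seqs)`, where `hits.tsv` is a placeholder file name. If your genes have introns, a spliced aligner is probably a better fit than stitching BLAST hits together.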