I need to filter transcripts by target coverage (>80%). Which column of the blastx tabular output should I use for that? The columns are:
qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
Also, please confirm whether the command below is correct. I need to remove transcripts with significant homology to known proteins (e.g., e-value < 1e-10, target coverage > 80%, and identity > 90%).
Note: since I am not sure how to get target coverage from the blastx output, the command below leaves that criterion out for now.
Should I use the following command?
awk '!($3 > 90 && $11 + 0 < 1e-10)' blastx_output.txt > filtered.txt
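On the coverage part: as far as I can tell, the 12 default columns do not include the subject (target) length, so target coverage cannot be computed from them alone. Below is a minimal sketch of what I think the full filter could look like, assuming blastx is re-run with `-outfmt "6 std slen"` so that subject length is appended as column 13. The names `query.fasta`, `protein_db`, and the helper `filter_hits` are placeholders of mine, not anything from the tool. Coverage is taken here as the aligned subject span divided by subject length, per hit line.

```shell
# Hypothetical re-run that appends subject length (slen) as column 13;
# query.fasta and protein_db are placeholder names:
#   blastx -query query.fasta -db protein_db -outfmt "6 std slen" -out blastx_output.txt

# Print only the hit lines that do NOT meet all three removal criteria
# (identity > 90, e-value < 1e-10, target coverage > 80%).
filter_hits() {
    awk -F'\t' '{
        # aligned span on the subject, whichever way the coordinates run
        span = ($10 >= $9 ? $10 - $9 : $9 - $10) + 1
        # percent of the subject sequence covered by this alignment
        cov = 100 * span / $13
        # "+ 0" forces a numeric comparison of the e-value field
        if (!($3 > 90 && $11 + 0 < 1e-10 && cov > 80)) print
    }' "$1"
}

# Usage (assumes the 13-column file from the re-run above):
#   filter_hits blastx_output.txt > filtered.txt
```

This works per hit line rather than per transcript, so a transcript with several hits may still appear in the output if only some of its hits pass the removal criteria.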