9.2 years ago
Naresh D J
Hi,
I am pretty new to RNA-seq data analysis. About 35% of my reads did not map to the human genome, so I would like to run those unmapped reads through BLAST to check whether the data is contaminated with sequence from other genomes.
I have the unmapped reads in BAM format. How can I run BLAST on them?
Thank you,
Best Regards,
Naresh DJ
Hi Pierre,
Thank you for the reply. Could you briefly explain what each line of the code does? And how do I save the output from BLAST?
The first line extracts the unmapped reads; its output is tab-delimited text.
The second line is awk. For each read it prints '>', then the read name, then a name suffix for paired reads (/1 or /2), then a newline, then the sequence and a final newline. The result is FASTA.
The third line pipes that FASTA into blast.
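For readers who want to see those three steps written out, here is a minimal sketch. It is not necessarily the exact code from the answer: the function name `sam_to_fasta` and the file name `unmapped.bam` are made up for illustration, and it assumes the standard SAM column layout (read name in column 1, flag in column 2, sequence in column 10).

```shell
# Convert SAM records on stdin to FASTA on stdout.
# sam_to_fasta is a hypothetical helper name, not part of samtools or BLAST.
sam_to_fasta() {
  awk -F '\t' '
    /^@/ { next }                                     # skip SAM header lines
    {
      suffix = ""
      if (int($2 / 64) % 2 == 1) suffix = "/1"        # flag 0x40: first read of a pair
      else if (int($2 / 128) % 2 == 1) suffix = "/2"  # flag 0x80: second read of a pair
      printf(">%s%s\n%s\n", $1, suffix, $10)          # FASTA header line, then the sequence
    }'
}

# Assumed usage (needs samtools and blastn on the PATH; unmapped.bam is a placeholder):
#   samtools view -f 4 unmapped.bam | sam_to_fasta | blastn -db contamination
```

`samtools view -f 4` keeps only records with the "unmapped" flag set. Single-end reads have neither pair bit set, so they get no /1 or /2 suffix.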
How to save the output from blast?
blastn -db contamination -out out.blast
Thank you Pierre.
Actually, my reads are single-end. Do I need to change the awk command? Also, is "contamination" a database I need to download? It is not available in the BLAST installation on my university server.
Single end: you can just use
Yes:
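The command that followed is not preserved in this thread. As a rough sketch only: a local nucleotide database named `contamination` can be built with `makeblastdb` from a FASTA file of suspected contaminant sequences. The helper name `build_contamination_db` and the file name `contaminants.fa` are assumptions for illustration; which sequences go into that file is up to you.

```shell
# Build a local BLAST nucleotide database from a FASTA file.
# build_contamination_db is a hypothetical helper; makeblastdb ships with BLAST+.
build_contamination_db() {
  fasta="$1"                    # FASTA file of candidate contaminant sequences
  name="${2:-contamination}"    # database name later passed to blastn -db
  if [ ! -s "$fasta" ]; then
    echo "missing or empty FASTA file: $fasta" >&2
    return 1
  fi
  makeblastdb -in "$fasta" -dbtype nucl -out "$name"
}

# Assumed usage (contaminants.fa is a placeholder file name):
#   build_contamination_db contaminants.fa contamination
```

After this, `blastn -db contamination` should find the database, provided it is run from the same directory or the database location is listed in `BLASTDB`.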
Hi Pierre,
I first generated my SAM file and then tried your code like this:
I get this error message:
Do you have any idea why?
I get the same error message when I just save the awk output and then run blastn separately.
Thanks!