5.3 years ago
jaqx008
Hello everyone. I have a small RNA library, and I am trying to find out whether the small RNAs target any kind of virus. I plan to assemble these short sequences into longer reads or contigs and then see whether they map to any viral genomes. I was advised to use velvet for the assembly. I installed velvet with conda install velvet. To do the assembly, I used the following command, based on my understanding of the velvet help:
velveth output.fastq 191 -fastq sample.fastq
The problem is that I am not sure this is right, because when I looked at the output file it still contained short reads, as shown below:
output
>NS500519:44:HHHTLBGX2:3:11406:4633:13982 21398134 0
CATTGCACTCGTCCCGGCCTGA
>NS500519:44:HHHTLBGX2:3:11406:15103:13982 21398135 0
AGCACTGAGAACACTTTGGCCTTGGCAAG
>NS500519:44:HHHTLBGX2:3:11406:23757:13983 21398136 0
TCTTAGAACTCATCGGGAGGGAACATTAGC
>NS500519:44:HHHTLBGX2:3:11406:9883:13983 21398137 0
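A quick sanity check that helps here is to look at how long the input reads actually are, since the hash length (k-mer size) given to velveth has to be no longer than the reads for any k-mers to be produced, and the reads listed above are only 22 to 30 nt. Note also that the first argument to velveth is an output directory, not an output file, and that contigs are only written once velvetg is run on that directory. The sketch below summarises read lengths with awk; the function name `read_length_summary` is made up for illustration, and it assumes a plain FASTQ file with exactly four lines per record.

```shell
# Summarise read lengths in a FASTQ file (assumes 4 lines per record).
# Prints the number of reads plus the shortest, longest and mean read length.
read_length_summary() {
    awk 'NR % 4 == 2 {
             len = length($0)
             if (n == 0 || len < min) min = len
             if (len > max) max = len
             sum += len
             n++
         }
         END {
             if (n == 0) { print "reads=0"; exit }
             printf "reads=%d min=%d max=%d mean=%.1f\n", n, min, max, sum / n
         }' "$1"
}

# Example call, using the input file name from the command in the question:
# read_length_summary sample.fastq
```

If the reported maximum is far below the hash length that was used, as it would be for 191 here, the assembler has nothing to work with.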
Small RNAs are of a size that should not need assembly. If you need to see whether they map to viral genomes, you can do that with the data you have now. If the data did not originally come from an entity that was long to begin with, there is no point in trying to assemble it.
Instead, you could look into tadpole.sh from the BBMap suite, which can be used for read extension/error correction as an alternative.
I apologize for the late response. The data I have are small RNAs, mostly piRNAs and siRNAs, that can likely target viral elements (in invertebrates). Making the reads longer would help me BLAST my data against viral databases to see whether the small RNAs originate from any viral genomes, as I don't have a specific virus in mind. However, I will see if tadpole can help with the read extension. Thanks again.
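For anyone following along, a read-extension run with tadpole.sh might look roughly like the sketch below. The file names are placeholders, and the values for `k`, `el` (extend left) and `er` (extend right) are assumptions picked only because the reads in this thread are 22 to 30 nt; as far as I know, tadpole's default k-mer length of 31 would be longer than these reads, so a smaller value is needed.

```shell
# Placeholder file names; k, el and er are illustrative values, not recommendations.
# mode=extend asks tadpole to lengthen each read using k-mers seen in the data.
tadpole.sh in=sample.fastq out=extended.fastq mode=extend k=17 el=50 er=50
```

Whether reads actually get longer depends on there being enough overlapping coverage at the chosen k, so it is worth comparing read lengths before and after.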