9.9 years ago
juanma_lace
Hi all,
I'm running an analysis of small RNA and I want to use some libs from NCBI in SRA (Illumina).
The problem is that all the reads are 35 bp long (I understand they have some kind of adaptors).
I want to know how to obtain the real (adapter-free) sequences as FASTA programmatically, since I want to use them in a pipeline.
Thank you in advance
Seqs files:
I see that illumina-dump creates a lot of files. How can I get the FASTA from those files?
Here you go. But seriously, this stuff is harder to find than it needs to be:
Thank you for the sarcasm, anyway it does not answer my question. My question is about removing the adaptors, not just converting.
I'm sorry for the sarcasm - I think it's the first time I've used that site, and I agree now that it seems overly harsh. RamRS's link suggests cutadapt, which should trim any adapter sequence you specify. The hard part is finding the correct adapter sequence to trim, and FastQC might help you with that.
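One rough way to sanity-check a candidate adapter before trimming is to count how many reads contain its first few bases. The snippet below is only a sketch of that idea in plain Python: the function name `adapter_fraction`, the 8-nt seed length, and the example reads are assumptions made for illustration, and it does exact matching only (no mismatches), so it is not a replacement for FastQC or cutadapt.

```python
def adapter_fraction(reads, adapter, seed_len=8):
    """Return the fraction of reads containing the first `seed_len`
    bases of `adapter` as an exact substring.

    Hypothetical helper for illustration; sequencing errors in the
    seed are not tolerated, so the true fraction may be higher.
    """
    seed = adapter[:seed_len]
    reads = list(reads)
    if not reads:
        return 0.0
    hits = sum(1 for read in reads if seed in read)
    return hits / len(reads)


# Made-up example reads; in practice these would come from the FASTQ file.
example_reads = [
    "TGAGGTAGTAGGTTGTATAGTTAGATCGGAAGAGC",  # insert followed by adapter
    "ACGTACGTACGTACGTACGTACGTACGTACGTACG",  # no adapter
    "TTTTAGATCGGAAGAGCACACGT",              # short insert, then adapter
]

fraction = adapter_fraction(example_reads, "AGATCGGAAGAGCAC")
print(f"{fraction:.2f} of reads contain the adapter seed")
```

If a candidate turns up in most reads of a small RNA library, it is probably the right sequence to give to the trimmer. A very low fraction suggests trying a different candidate.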
cutadapt, fastqc and trimmomatic might help with trimming adapters.
EDIT: Lorena Pantano has given a much better, detailed response here: Transform smallRNA SRA (Illumina) sequences to FASTA
And yes, lmgtfy can be a bit too condescending at times. I think the cultural difference amplifies the effect, unfortunately.
Some adapter-removal tools are specific to mRNA and not to small RNA. I would use cutadapt. The adapter in small RNA is always the same, and detecting 8 nucleotides of it is enough. The adapter should be something like AGATCGGAAGAGCAC, or the same without the first A if it is the standard protocol. FastQC was not working well for this the last time I used it (1 or 2 years ago; the author admitted it was not prepared for small RNA).
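To make the idea in the post above concrete, here is a minimal pure-Python sketch of 3' adapter trimming followed by FASTA output. It is not how cutadapt works internally: it only looks for an exact match to the first 8 nt of the adapter, does not tolerate mismatches, and will miss adapter fragments shorter than 8 nt at the very end of a read. The function names, the `min_len=15` cutoff, and the example reads are assumptions for illustration. For real data, cutadapt is the safer choice.

```python
ADAPTER = "AGATCGGAAGAGCAC"  # candidate from the post above; verify it for your library
MIN_MATCH = 8                # number of adapter bases that must match exactly


def trim_adapter(read, adapter=ADAPTER, min_match=MIN_MATCH):
    """Cut the read at the first exact occurrence of the adapter's
    first `min_match` bases; return the read unchanged if not found."""
    seed = adapter[:min_match]
    pos = read.find(seed)
    if pos == -1:
        return read
    return read[:pos]


def fastq_to_trimmed_fasta(fastq_lines, min_len=15):
    """Turn 4-line FASTQ records into FASTA records after trimming.

    Reads shorter than `min_len` after trimming are dropped
    (min_len=15 is an arbitrary assumption, adjust as needed).
    """
    lines = [line.rstrip("\n") for line in fastq_lines]
    records = []
    for i in range(0, len(lines) - 3, 4):
        header, seq = lines[i], lines[i + 1]
        trimmed = trim_adapter(seq)
        if len(trimmed) >= min_len:
            # Replace the leading '@' of the FASTQ header with '>'.
            records.append(">" + header[1:] + "\n" + trimmed)
    return records


# Made-up 35 bp reads: one insert plus adapter, one adapter-only read.
example_fastq = [
    "@read1\n", "TGAGGTAGTAGGTTGTATAGTTAGATCGGAAGAGC\n", "+\n", "I" * 35 + "\n",
    "@read2\n", "AGATCGGAAGAGCACACGTCTGAACTCCAGTCACA\n", "+\n", "I" * 35 + "\n",
]

for record in fastq_to_trimmed_fasta(example_fastq):
    print(record)
```

In a pipeline, the list passed to `fastq_to_trimmed_fasta` would be replaced by an open FASTQ file handle, and the records written to a `.fa` file instead of printed.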
Thank you for the detailed response. It is better for the OP to hear from someone who shares the research domain.
Does this help? http://www.ark-genomics.org/events-online-training-eu-training-course/adapter-and-quality-trimming-illumina-data
I don't have any XP with primer/adapter trimming :(