I have to analyze a small RNA-seq dataset, and this is my first time working with small RNA. I want to do differential expression and microRNA comparison along with GO annotation, much like what mirtools does. However, mirtools takes FASTA files and I have paired-end FASTQ files. Which publicly available pipelines (judged on ease of use and installation) would you recommend for a comprehensive small RNA-seq analysis?
If you want something standalone, easy to use, and easy to install, I suggest the BBMap package - it's already compiled and will run on any platform that has Java. It does not do annotation, but it does do adapter-trimming with bbduk.sh (which is important for micro-RNA) and alignment with bbmap.sh, or alignment-free RNA quantification against a transcriptome with seal.sh. You could also use reformat.sh to convert your fastq files to fasta.
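To make the FASTQ-to-FASTA step concrete, here is a minimal Python sketch of what such a conversion does. It is not how reformat.sh is implemented; the function name is made up for illustration, and it assumes plain four-line FASTQ records with no wrapped sequence lines.

```python
import io

def fastq_to_fasta(fastq_handle, fasta_handle):
    """Copy every FASTQ record to FASTA, dropping the quality lines.

    Assumes strict 4-line records (header, sequence, '+', quality).
    Returns the number of records written.
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:            # end of file
            break
        if not header.strip():    # tolerate blank lines between records
            continue
        if not header.startswith("@"):
            raise ValueError(f"malformed FASTQ header: {header!r}")
        seq = fastq_handle.readline().rstrip("\n")
        fastq_handle.readline()   # '+' separator line
        fastq_handle.readline()   # quality string, not needed for FASTA
        fasta_handle.write(">" + header[1:].rstrip("\n") + "\n" + seq + "\n")
        count += 1
    return count

# Tiny in-memory example with one made-up 22 nt read
example_seq = "TGAGGTAGTAGGTTGTATAGTT"
fq = io.StringIO(f"@read1\n{example_seq}\n+\n{'I' * len(example_seq)}\n")
fa = io.StringIO()
n_records = fastq_to_fasta(fq, fa)
fasta_text = fa.getvalue()
print(fasta_text, end="")
```

For real files you would pass open file handles instead of the in-memory strings, and you would normally convert only after adapter trimming.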
I would have to ask my collaborator about that :( . BTW, the reads in the fastq files are 50 nt long, so I am sure they include some adapter sequence that I will have to remove.
Well, it was just a one-off analysis I had to do for my collaborator to add some results to a paper, so I don't know much about why they used paired-end sequencing (which, from what I have read on the Illumina website, doesn't help much for small RNA or miRNA).
But if your paired-end data is strand-specific (which I really hope you checked), then you can just create a single-end file out of it. mate1 is in the correct direction and mate2 is in the reverse direction. So you can take mate1.fastq and add the reverse complement of mate2.fastq. Be sure to do that after clipping the adapters!
I'm not sure, but there are probably some tools available that do the re-orientation of mate2 for you. I never needed one, so I don't know. Either way, you can use the resulting single-end file with all available small RNA-seq tools.
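The mate2 flip described above can be sketched in a few lines of Python. This is only an illustration under the assumption stated in the post (mate1 in the sense direction, mate2 in reverse); the function names are made up, and the records are assumed to be plain four-line FASTQ. Note that the quality string has to be reversed together with the sequence.

```python
# Translation table for complementing DNA bases (N stays N)
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def read_fastq(lines):
    """Yield (header, seq, plus, qual) tuples from strict 4-line FASTQ."""
    stripped = (line.rstrip("\n") for line in lines)
    # zip over the same iterator four times groups lines into records
    yield from zip(stripped, stripped, stripped, stripped)

def flip_record(record):
    """Reverse-complement the sequence and reverse the quality string."""
    header, seq, plus, qual = record
    return (header, reverse_complement(seq), plus, qual[::-1])

def merge_as_single_end(mate1_lines, mate2_lines):
    """mate1 records unchanged, followed by re-oriented mate2 records."""
    merged = list(read_fastq(mate1_lines))
    merged += [flip_record(r) for r in read_fastq(mate2_lines)]
    return merged

# Made-up example pair that covers the same short insert
mate1 = ["@r1/1", "ACGTT", "+", "IIIII"]
mate2 = ["@r1/2", "AACGT", "+", "ABCDE"]
merged = merge_as_single_end(mate1, mate2)
for rec in merged:
    print("\n".join(rec))
```

One thing to keep in mind: when both mates cover the same short insert, as in the example, every fragment ends up in the merged file twice, which matters for any read counting done afterwards.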
I have finally done the mapping etc. for the small RNA data. Since sRNAbench does not handle data without replicates (as in my case), what would be a good procedure to get the required results? Since both edgeR and DESeq handle replicate-free data for RNA-seq, do you think these techniques can be extended to small RNA as well?
The UEA small RNA Workbench is another option. It is a cross-platform GUI (graphical user interface) application that runs on both Linux and Windows. It has different modules for different analyses, such as miRProf for known miRNA analysis and miRCat for novel miRNA analysis.
How is the alignment of small RNA reads going to differ from that of general RNA-seq reads? Which bbmap options would be specific to small RNA read mapping?
Mainly BBMap is just easy to install and use, which is what you asked for. And it supports direct output of coverage and RPKM, which is often useful for RNA-seq.
What I am really asking is: which parameters should I change or specifically use when aligning small RNA reads with bbmap?
Oh, the defaults are usually fine. You could set "maxindel=20" if you want to speed it up a little since the defaults are geared toward long RNAs, and small RNAs don't contain long introns. Just be sure to adapter-trim the reads first, because small RNAs are expected to have adapter sequence, which interferes with mapping.
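As a rough sketch of the order of operations described here (trim adapters first, then map with a small maxindel), the following Python snippet builds the two command lines. The file names and helper functions are hypothetical placeholders. The bbduk.sh settings shown (ktrim=r k=23 mink=11 hdist=1 tpe tbo) are the commonly used adapter-trimming settings from the BBDuk documentation, not something tuned for this dataset, and the adapter FASTA has to contain the adapters of whatever small RNA kit was used.

```python
import shlex

def bbduk_trim_cmd(in1, in2, out1, out2, adapters):
    """Build a bbduk.sh call that trims adapters from the 3' end of both mates."""
    return [
        "bbduk.sh",
        f"in={in1}", f"in2={in2}",
        f"out={out1}", f"out2={out2}",
        f"ref={adapters}",      # FASTA of adapter sequences (assumed to exist)
        "ktrim=r",              # trim the matching k-mer and everything to its right
        "k=23", "mink=11",      # k-mer size, shorter k-mers allowed at read ends
        "hdist=1",              # allow one mismatch
        "tpe", "tbo",           # trim pairs to equal length, trim by pair overlap
    ]

def bbmap_cmd(ref, in1, in2, out_sam, maxindel=20):
    """Build a bbmap.sh call with a reduced maxindel for short RNAs."""
    return [
        "bbmap.sh",
        f"ref={ref}",
        f"in={in1}", f"in2={in2}",
        f"out={out_sam}",
        f"maxindel={maxindel}",
    ]

# Hypothetical file names, for illustration only
trim_cmd = bbduk_trim_cmd("raw_1.fq", "raw_2.fq", "trim_1.fq", "trim_2.fq", "adapters.fa")
map_cmd = bbmap_cmd("genome.fa", "trim_1.fq", "trim_2.fq", "mapped.sam")

for cmd in (trim_cmd, map_cmd):
    print(shlex.join(cmd))
    # To actually run it (requires BBTools on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```

The commands are only printed so you can check them before running anything; uncomment the subprocess line once the paths are right.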
So once I have mapped the reads, I'll have to depend on other tools for further analysis. Is there a way I can leverage other tools like sRNAbench once I have the mapped sam file, or does any functionality exist in BBTools to do so?
BBTools can output coverage across the genome, or generate RPKMs if you are mapping to a transcriptome. And it can produce various metrics on coverage, like the fraction of bases across a transcript that were covered, but it does not do most RNA-seq-specific things like differential expression analysis. Many RNA analyses use mapped sam files, but I don't personally do small-RNA analysis so I'm not sure which downstream tools are best.
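For anyone unfamiliar with the two metrics mentioned here, this is a small Python sketch of the textbook definitions: RPKM as reads per kilobase of feature per million mapped reads, and covered fraction as the share of positions with at least one read. The function names and numbers are made up, and the exact way BBTools computes its own output may differ in detail.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)

def covered_fraction(per_base_depth):
    """Fraction of positions in a feature with depth greater than zero."""
    if not per_base_depth:
        return 0.0
    covered = sum(1 for depth in per_base_depth if depth > 0)
    return covered / len(per_base_depth)

# Made-up numbers: 500 reads on a 1000 bp transcript, 2 million mapped reads
example_rpkm = rpkm(500, 1000, 2_000_000)
# Made-up per-base depths for a 4 bp toy feature
example_fraction = covered_fraction([0, 3, 5, 0])
print(example_rpkm, example_fraction)
```

Neither number is a differential expression result; they are only normalized abundance and coverage summaries for a single sample.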