Hi,
This is my first question here on BioStar, so hello everyone! :-)
We have performed a smRNA-Seq experiment using the Illumina v1.5 smRNA library prep kit.
The two groups are ticks (I. scapularis) fed either on a B. burgdorferi-infected mammalian host or on an uninfected animal. We dissected the ticks and extracted total RNA from their midgut/hemocoel tissues, which contained the mammalian blood and, in the infected group, the bacteria. RNA integrity was confirmed using an Agilent Bioanalyzer, and we sent the total RNA to our core facility for library preparation and sequencing on a GAIIx.
The core facility multiplexed the samples and demultiplexed the data after sequencing. I took over the analysis from this point, starting with the removal of the Illumina RNA adaptors and sorting of the trimmed reads into two groups:
1) miRNA-candidates 16-25bp reads
2) ncRNA-candidates >25bp reads
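A minimal Python sketch of this size sorting, assuming the adaptor-trimmed reads are in standard 4-line FASTQ format. The function names `read_fastq` and `sort_by_length` are hypothetical, and dropping reads shorter than 16 is an assumption, since the two groups above do not say what happens to them.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence, quality) from well-formed 4-line FASTQ records."""
    stripped = (line.rstrip("\n") for line in lines)
    while True:
        record = list(islice(stripped, 4))
        if len(record) < 4:
            return  # end of input (an incomplete trailing record is dropped)
        yield record[0], record[1], record[3]


def sort_by_length(
    records: Iterable[Record], mirna_min: int = 16, mirna_max: int = 25
) -> Tuple[List[Record], List[Record]]:
    """Split trimmed reads into miRNA-candidates and ncRNA-candidates by length."""
    mirna: List[Record] = []
    ncrna: List[Record] = []
    for rec in records:
        length = len(rec[1])
        if mirna_min <= length <= mirna_max:
            mirna.append(rec)   # group 1: 16-25 reads
        elif length > mirna_max:
            ncrna.append(rec)   # group 2: >25 reads
        # Assumption: reads shorter than mirna_min are discarded.
    return mirna, ncrna
```

With a real file this would be called as `sort_by_length(read_fastq(open("trimmed.fastq")))`, where `trimmed.fastq` is a placeholder name.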
For the infected group this resulted in approximately 2 million miRNA-candidate reads; the uninfected group had about 50% more.
Then I tried mapping these reads to:
1. known mature I. scapularis miRNAs (all data from the latest miRBase release)
2. known I. scapularis hairpins
3. all other mature miRNAs/hairpin RNAs
using bowtie with modified parameters (-l 15 --seedmms 1).
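As a rough illustration of how such a bowtie (version 1) call could be assembled, here is a small Python sketch. Only `-l 15 --seedmms 1` comes from the post. The FASTQ input flag `-q`, SAM output via `-S`, the optional `--un` file for unaligned reads, and all index and file names are assumptions made for the example, and the indexes would first have to be built with `bowtie-build`.

```python
from typing import List, Optional


def bowtie_command(
    index: str, reads: str, out_sam: str, unaligned: Optional[str] = None
) -> List[str]:
    """Build a bowtie 1 argument list with a 15-base seed and 1 seed mismatch."""
    cmd = ["bowtie", "-q", "-l", "15", "--seedmms", "1", "-S"]
    if unaligned:
        # Assumption: keep reads that fail to align so they can be
        # mapped against the next reference in turn.
        cmd += ["--un", unaligned]
    cmd += [index, reads, out_sam]
    return cmd


# Hypothetical index and file names, for illustration only.
cmd = bowtie_command(
    "isc_mature", "mirna_candidates.fastq", "mature.sam", "not_mature.fastq"
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once bowtie is installed and the index exists.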
To my surprise, only a tiny fraction of the miRNA-candidates mapped to known I. scapularis miRNAs:
infected: 250
uninfected: 354
The number of reads mapping to hairpins was higher, but nowhere near what I expected (one positive thing is that the mature-miRNA-mapped and hairpin-mapped reads overlap):
infected: 4346
uninfected: 4515
The numbers again improve when mapping to all known miRNAs from miRBase, going into the low 5 digit range, but still representing only a tiny fraction of the total reads.
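To put the counts above in proportion, here is a quick Python calculation. The totals are approximations taken from the text (about 2 million infected reads, and about 50% more, so roughly 3 million, for uninfected), so the percentages are only indicative.

```python
# Approximate miRNA-candidate totals from the post (not exact counts).
total_reads = {"infected": 2_000_000, "uninfected": 3_000_000}

# Mapped read counts as reported above.
mapped = {
    "mature": {"infected": 250, "uninfected": 354},
    "hairpin": {"infected": 4346, "uninfected": 4515},
}


def percent_mapped(n_mapped: int, n_total: int) -> float:
    """Return the percentage of reads that mapped."""
    return 100.0 * n_mapped / n_total


for reference, counts in mapped.items():
    for group, n in counts.items():
        pct = percent_mapped(n, total_reads[group])
        print(f"{reference:8s} {group:11s} {pct:.4f}%")
# mature:  infected 0.0125%, uninfected 0.0118%
# hairpin: infected 0.2173%, uninfected 0.1505%
```

So even the hairpin mapping accounts for well under 1% of the miRNA-candidate reads in either group.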
My next step was to map these reads to the transcriptome and genome of I. scapularis as well as the mouse (mammalian host). Roughly half of the miRNA-candidate reads in both groups now map, which still leaves me with a large share of the reads being "unknowns".
My questions for people who have done smRNA-Seq experiments before:
Do you typically see any traces of mRNA in your data?
How well does the ligation of the 3'-OH Illumina adaptor discriminate between "true" small RNAs and other molecules?
Is there any way to salvage the data from this experiment, or should I consider it as invalid due to the mapping results so far?
Thanks everyone in advance!
How many reads mapped to the B. burgdorferi genome? Or, how clean is the prep from the tick stomachs? Could some of the reads be from another organism?
Take some reads that didn't align and BLAST them against nr.
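One way to pick "some reads" without bias is to draw a random subset and write it out as FASTA for BLAST. The sketch below uses reservoir sampling so the unaligned file never has to be held in memory. The function names, the sample size, and the fixed seed are all assumptions made for illustration.

```python
import random
from typing import Iterable, List, Tuple

Read = Tuple[str, str]  # (read name, sequence)


def sample_reads(reads: Iterable[Read], k: int, seed: int = 0) -> List[Read]:
    """Return up to k reads chosen uniformly at random (reservoir sampling)."""
    rng = random.Random(seed)
    sample: List[Read] = []
    for i, read in enumerate(reads):
        if i < k:
            sample.append(read)      # fill the reservoir first
        else:
            j = rng.randint(0, i)    # inclusive on both ends
            if j < k:
                sample[j] = read     # replace with probability k / (i + 1)
    return sample


def to_fasta(reads: Iterable[Read]) -> str:
    """Format reads as FASTA text suitable as a BLAST query file."""
    return "".join(f">{name}\n{seq}\n" for name, seq in reads)
```

A few hundred sampled reads are usually enough to see which organisms dominate the unaligned fraction.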
Hi Larry:
About 10,000 reads mapped to the B. burgdorferi genome. The prep is pretty "dirty" and I expect to see RNA from other bacteria. We have some microbiome data, which I am using to put together as many bacterial genomes as possible to align the reads to.
Hi Jeremy: I am currently assembling contigs from the unaligned reads, and I am actually going to BLAST them against nt.