6.0 years ago
610225668
Hi all, this is my first time mapping RNA-seq data. The data were generated from corals. I used STAR with the default parameters to map the reads to the reference, but got a terrible result. The final mapping log was:
Started job on | Dec 12 16:02:44
Started mapping on | Dec 12 16:03:13
Finished on | Dec 12 16:12:34
Mapping speed, Million of reads per hour | 85.99
Number of input reads | 13400813
Average input read length | 150
UNIQUE READS:
Uniquely mapped reads number | 3114
Uniquely mapped reads % | 0.02%
Average mapped length | 124.34
Number of splices: Total | 41
Number of splices: Annotated (sjdb) | 2
Number of splices: GT/AG | 24
Number of splices: GC/AG | 4
Number of splices: AT/AC | 0
Number of splices: Non-canonical | 13
Mismatch rate per base, % | 4.13%
Deletion rate per base | 0.03%
Deletion average length | 1.86
Insertion rate per base | 0.01%
Insertion average length | 1.47
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 1505
% of reads mapped to multiple loci | 0.01%
Number of reads mapped to too many loci | 47
% of reads mapped to too many loci | 0.00%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 99.96%
% of reads unmapped: other | 0.00%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
Any idea why so many reads are unmapped? I don't understand what the reason 'too short' means. Can somebody explain it? Thanks!
Could you run your data through a quality-control tool such as FastQC?
What is your read length?
What command line did you use for the alignment?
% of reads unmapped: too short
With STAR, "too short" refers to the alignment, not the read: the part of the read that could be aligned was too short to be reported. Are you sure you are using the right reference?
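To make the "too short" point concrete, here is a minimal Python sketch that pulls the unmapped categories out of a Log.final.out and shows how long an alignment has to be before STAR keeps it. The function names are made up for illustration. The 0.66 fraction is what I understand to be the default of --outFilterMatchNminOverLread, so treat it as an assumption and check the manual for your STAR version. Also note that for paired-end data STAR reports the read length as the sum of both mates.

```python
# A few lines copied from the Log.final.out posted above.
SAMPLE_LOG = """\
Number of input reads | 13400813
Average input read length | 150
UNIQUE READS:
Uniquely mapped reads % | 0.02%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 99.96%
% of reads unmapped: other | 0.00%
"""


def parse_star_log(text):
    """Parse 'key | value' lines of a STAR Log.final.out into a dict."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers such as 'UNIQUE READS:'
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def percent(stats, key):
    """Return a percentage field such as '99.96%' as a float."""
    return float(stats[key].rstrip("%"))


def min_aligned_bases(read_length, min_fraction=0.66):
    """Aligned length below which a read is counted as 'too short'.

    min_fraction=0.66 is assumed to match STAR's default for
    --outFilterMatchNminOverLread; verify it for your version.
    """
    return read_length * min_fraction


stats = parse_star_log(SAMPLE_LOG)
read_length = int(stats["Average input read length"])
print("too short %:", percent(stats, "% of reads unmapped: too short"))
print("aligned bases needed:", min_aligned_bases(read_length))
```

So under that assumption, a 150-base read needs roughly 99 aligned bases to be reported. If almost nothing reaches that, the reads most likely do not match the reference at all, or only in short stretches.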
In fact, I have nine types of coral, and I chose five of them and built a reference index for each independently. Unfortunately, the results were similar.
Can you elaborate on this? Are you just using short contigs as a reference (please give stats, e.g. from BBMap's stats.sh)? Are you aligning against a single species?
Have you tried bwa-mem or minimap2 to compare their mapping rates? Have you ever tried aligning to these references before?
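If stats.sh is not at hand, a rough equivalent for the basic numbers (contig count, total length, N50) can be sketched in plain Python. This is not how BBMap computes its report, just a minimal stand-in; the function names and the tiny example FASTA are made up for illustration.

```python
def read_contig_lengths(fasta_lines):
    """Return the length of every sequence in an iterable of FASTA lines."""
    lengths = []
    current = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0  # start counting a new contig
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Contig length at which half of the total assembly length is reached."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Made-up toy reference with three contigs.
example = [">a", "ACGT", "ACGT", ">b", "AC", ">c", "ACGTAC"]
lengths = read_contig_lengths(example)
print("contigs:", len(lengths))
print("total bases:", sum(lengths))
print("N50:", n50(lengths))
```

For a real reference, pass an open file handle instead of the list, e.g. `read_contig_lengths(open("reference.fa"))`, where the file name is a placeholder. A very large contig count with a small N50 would suggest a fragmented reference made of short contigs.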
I've noticed that older STAR versions have issues with PE reads that overlap too much.
If you've got PE data, try only R1 first. Otherwise, check your FastQC reports, as Batien mentioned, for adapter contamination or overrepresented sequences indicating other contamination.
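As a quick sanity check alongside FastQC, the fraction of reads containing an adapter can be estimated with a few lines of Python. This is only a sketch: it assumes a standard four-line FASTQ layout and looks for the common Illumina adapter prefix AGATCGGAAGAGC by exact match, which may not be the adapter used in your library prep. The function name is made up for illustration.

```python
# Common prefix of Illumina TruSeq-style adapters; an assumption about
# the library prep, replace it if your kit uses something else.
ILLUMINA_ADAPTER_PREFIX = "AGATCGGAAGAGC"


def adapter_fraction(fastq_lines, adapter=ILLUMINA_ADAPTER_PREFIX):
    """Fraction of reads whose sequence contains the adapter string.

    Assumes four lines per record, with the sequence on the second line.
    """
    total = 0
    hits = 0
    for index, line in enumerate(fastq_lines):
        if index % 4 != 1:
            continue  # skip header, '+' separator and quality lines
        total += 1
        if adapter in line.strip().upper():
            hits += 1
    return hits / total if total else 0.0


# Made-up two-read example: the second read runs into the adapter.
example = [
    "@read1", "ACGTACGTACGTACGT", "+", "IIIIIIIIIIIIIIII",
    "@read2", "ACGTAGATCGGAAGAGCACA", "+", "IIIIIIIIIIIIIIIIIIII",
]
print("adapter fraction:", adapter_fraction(example))
```

For gzipped files, something like `adapter_fraction(gzip.open("R1.fastq.gz", "rt"))` should work after `import gzip`; the file name is a placeholder. A high fraction would point towards trimming before running STAR again.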