which alignment tool and which reference I should choose for miRNA-seq data analysis ?
3
0
Entering edit mode
8.4 years ago
jimmy_zeng ▴ 90

Hi, All: I am a newbie for the miRNA-seq data analysis, and I just want to follow a paper to get the same results.

I choose the paper: Aggarwal P, Turner A, Matter A, Kattman SJ et al. RNA expression profiling of human iPSC-derived cardiomyocytes in a cardiac hypertrophy model. PLoS One 2014;9(9):e108051. PMID: 25255322

The raw human miRNA sequencing data was downloaded from http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE60292 , which was clean of adapter sequences.

I first use the fastx-toolkit to filter the bad quality reads .

I download the miRBase databae by the code below :

wget ftp://mirbase.org/pub/mirbase/CURRENT/hairpin.fa.gz    ## 28645 reads
wget ftp://mirbase.org/pub/mirbase/CURRENT/mature.fa.zip        ## 35828 reads 
wget ftp://mirbase.org/pub/mirbase/CURRENT/hairpin.fa.zip  ##
wget ftp://mirbase.org/pub/mirbase/CURRENT/genomes/hsa.gff3    ##
wget ftp://mirbase.org/pub/mirbase/CURRENT/miFam.dat.zip  ##

grep sapiens mature.fa |wc  # 2588 

grep sapiens hairpin.fa |wc   #1881 

## Homo sapiens 
perl -alne '{if(/^>/){if(/Homo/){$tmp=1}else{$tmp=0}};print if $tmp==1;}' hairpin.fa  >hairpin.human.fa

perl -alne '{if(/^>/){if(/Homo/){$tmp=1}else{$tmp=0}};print if $tmp==1;}' mature.fa  > mature.human.fa  


## step5 : alignment to miRBase v21 by bowtie2 (hairpin.human.fa/mature.human.fa )
## 
mkdir  bowtie2_index &&  cd bowtie2_index
~/biosoft/bowtie/bowtie2-2.2.9/bowtie2-build ../hairpin.human.fa hairpin_human
~/biosoft/bowtie/bowtie2-2.2.9/bowtie2-build ../mature.human.fa  mature_human

ls *_clean.fq.gz | while read id ; do  ~/biosoft/bowtie/bowtie2-2.2.9/bowtie2 -x miRBase/bowtie2_index/hairpin_human -U $id   -S ${id%%.*}.hairpin.sam ; done 
## overall alignment rate:  10.20% / 5.71%/ 10.18%/ 4.36% / 10.02% / 4.95%
ls *_clean.fq.gz | while read id ; do  ~/biosoft/bowtie/bowtie2-2.2.9/bowtie2 -x miRBase/bowtie2_index/mature_human  -U $id   -S ${id%%.*}.mature.sam ; done 
## overall alignment rate:  6.67% / 3.78% / 6.70% / 2.80%/ 6.55% / 3.23%

Am I right to use bowtie2 ? Am I right to use the miRBase ???

RNA-Seq miRNA-seq alignment reference • 4.6k views
ADD COMMENT
0
Entering edit mode

Following reference alignment and read filtering of the miRNA-Seq data, between 863,000 and 1.9 million reads aligned to the custom reference set.

and below is the results from paper, as you can see ,

http://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE60292&format=file

ls *gz |while read id ; do (echo $id;zcat $id | cut -f 2 |perl -alne '{$t+=$_;}END{print $t}');done

GSM1470353_iPS_010313_Unstim_known_miRNA_counts.txt.gz 686560 GSM1470354_iPS_010313_ET1_known_miRNA_counts.txt.gz 1109891 GSM1470355_iPS_011013_Unstim_known_miRNA_counts.txt.gz 956918 GSM1470356_iPS_011013_ET1_known_miRNA_counts.txt.gz 679366 GSM1470357_iPS_012513_Unstim_known_miRNA_counts.txt.gz 1164426 GSM1470358_iPS_012513_ET1_known_miRNA_counts.txt.gz 1063609

ADD REPLY
0
Entering edit mode
8.4 years ago
jimmy_zeng ▴ 90

I also use the alignment tool the paper mentions :

step5.2 using SHRiMP to do alignment

http://compbio.cs.toronto.edu/shrimp/README

3.5 Mapping cDNA reads against a miRNA database

cd ~/biosoft/SHRiMP/SHRiMP_2_2_3 export SHRIMP_FOLDER=$PWD cd -

  We project the database with:

$SHRIMP_FOLDER/utils/project-db.py --seed 00111111001111111100,00111111110011111100,00111111111100111100,00111111111111001100,00111111111111110000 \ --h-flag --shrimp-mode ls miRBase/hairpin.human.fa

$SHRIMP_FOLDER/bin/gmapper-ls -L hairpin.human-ls SRR1542716.fastq --qv-offset 33 \ -o 1 -H -E -a -1 -q -30 -g -30 --qv-offset 33 --strata -N 8 >map.out 2>map.log

I still can get the same results as the paper.

just 20K successfully mapped reads

ADD COMMENT
0
Entering edit mode
8.4 years ago
jimmy_zeng ▴ 90

I just look annother post , and find that : miRNA mapping rate is very low.. (less than 0.03%)

Oops, I forgot to convert U to T in hairpin.fa file.. (so stupid about it..)

ADD COMMENT
0
Entering edit mode
8.4 years ago
jimmy_zeng ▴ 90

ls _clean.fq.gz | while read id ; do ~/biosoft/bowtie/bowtie2-2.2.9/bowtie2 -x miRBase/bowtie2_index/hairpin_human -U $id -S ${id%%.}.hairpin.sam ; done

overall alignment rate: 10.20% / 5.71%/ 10.18%/ 4.36% / 10.02% / 4.95% (before convert U to T )

overall alignment rate: 51.77% / 70.38%/51.45% /61.14%/ 52.20% / 65.85% (after convert U to T )

ls _clean.fq.gz | while read id ; do ~/biosoft/bowtie/bowtie2-2.2.9/bowtie2 -x miRBase/bowtie2_index/mature_human -U $id -S ${id%%.}.mature.sam ; done

overall alignment rate: 6.67% / 3.78% / 6.70% / 2.80%/ 6.55% / 3.23% (before convert U to T )

overall alignment rate: 34.94% / 46.16%/ 35.00%/ 38.50% / 35.46% /42.41%(after convert U to T )

ADD COMMENT

Login before adding your answer.

Traffic: 1698 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6