hi,
I was recently given small RNA sequencing data from the pathogenic fungi Aspergillus fumigatus and Candida albicans and asked to perform differential expression analysis of miRNAs in response to human blood infection. I had 2 replicates for each RNA-Seq experiment, for both fungi alone and after their extraction from infected human blood: namely Af_1, Af_2, Ca_1, Ca_2 and HsAf-Af_1, HsAf-Af_2, HsCa-Ca_1, HsCa-Ca_2. I was supposed to develop a workflow and prepare differential expression tables.
I did so, but I failed to impress my mentor and I don't know why :( :( :(
1- Checked the quality of the FASTQ files with FastQC, then removed Illumina Small RNA 3' adapters and reads shorter than 15 bp with bbduk.
2- Downloaded Aspergillus_fumigatus.CADRE.32.gtf.gz, Aspergillus_fumigatus.CADRE.dna.toplevel.fa.gz, Candida_albicans_sc5314.ASM18296v2.32.gtf.gz and Candida_albicans_sc5314.ASM18296v2.dna.toplevel.fa.gz from Ensembl.
3- Built the genome indexes with bowtie2-build -f genome.fa genome
4- Mapped the cleaned reads to the reference genome with tophat: tophat -p 10 -G file.gtf genome file.fq
5- Assembled transcripts with cufflinks -p 10 accepted_hits.bam
6- Created a merged transcriptome annotation with cuffmerge -g file.gtf -s genome.fa -p 10 assemblies.txt
7- Identified differentially expressed genes with cuffdiff, for example: cuffdiff -o diff_out -b genome.fa -p 10 -L Af,HsAf -u merged_asm/merged.gtf Af_1.bam,Af_2.bam HsAf-Af_1.bam,HsAf-Af_2.bam
8- Finally, extracted only the significant genes.
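Step 8 can be sketched as follows. This is a minimal example, assuming the cuffdiff gene-level table (gene_exp.diff in the output directory) is tab-delimited with a header row and a `significant` column holding `yes` or `no`. The function name `significant_rows` and the two-gene table are made up for illustration; the values are not real results.

```python
import csv
import io


def significant_rows(handle):
    """Return the rows of a cuffdiff *.diff table whose 'significant' column is 'yes'."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if row["significant"] == "yes"]


# Illustrative table using the gene_exp.diff column layout (values are invented).
HEADER = ["test_id", "gene_id", "gene", "locus", "sample_1", "sample_2", "status",
          "value_1", "value_2", "log2(fold_change)", "test_stat", "p_value",
          "q_value", "significant"]
ROWS = [
    ["XLOC_000001", "XLOC_000001", "geneA", "chr1:100-900", "Af", "HsAf", "OK",
     "10.0", "11.0", "0.1375", "0.3", "0.76", "0.9", "no"],
    ["XLOC_000002", "XLOC_000002", "geneB", "chr2:500-1500", "Af", "HsAf", "OK",
     "5.0", "40.0", "3.0", "4.2", "0.0001", "0.003", "yes"],
]
EXAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS) + "\n"

hits = significant_rows(io.StringIO(EXAMPLE))
print([row["gene_id"] for row in hits])
```

On real output, the same function would be called on an open file handle, e.g. `open("diff_out/gene_exp.diff", newline="")`, where diff_out is the directory passed to cuffdiff with -o.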
I was going to use a miRNA GTF, but miRBase has no such files for these fungi. Ensembl has a GFF3 that contains miRNAs, but when I converted it to GTF I got an error, so I used the GTFs mentioned above.
....
Firstly, I don't think that C. albicans has any miRNAs or does RNAi (as far as anyone has found). If you look in that GTF file, there are no miRNAs.
It would seem that fumigatus DOES do RNAi, but the only literature I found on miRNAs in fumigatus was somebody finding something that looked a bit like a miRNA in the genome that was differentially expressed in hypoxia (see here).
So I think the reason you are having trouble finding good miRNA annotation is that it doesn't exist. If you want to do DE miRNA analysis in fumigatus, you might need to create the annotation yourself from your sequencing data. The miRDeep2 tool is designed for this.
As for the DE pipeline once you've got the annotation .... I would recommend the bowtie2>HTSeq>edgeR/DESeq pipeline for miRNA, remembering to pool counts from duplicate copies of miRNAs (i.e. many miRNAs appear in more than one place in the genome). You might look at the Kraken/SeqImp pipeline from the Enright lab for ideas on how to build such a pipeline. Their software only supports a small number of species right now, but you should look at what they are doing and re-implement it for your data.
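The pooling step could look roughly like the sketch below. It assumes you already have per-locus counts (for instance parsed from htseq-count output, where the summary counters start with `__`) and a mapping from each genomic locus to the mature miRNA it encodes. The function name `pool_counts`, the locus IDs and the miRNA names are all hypothetical, made up for illustration.

```python
from collections import defaultdict


def pool_counts(locus_counts, locus_to_mirna):
    """Sum read counts over all genomic copies of the same miRNA.

    locus_counts:   dict of locus ID -> read count
    locus_to_mirna: dict of locus ID -> miRNA name (hypothetical annotation)
    Loci missing from the mapping keep their own ID as the name.
    """
    pooled = defaultdict(int)
    for locus, count in locus_counts.items():
        if locus.startswith("__"):
            # Skip htseq-count summary lines such as __no_feature.
            continue
        pooled[locus_to_mirna.get(locus, locus)] += count
    return dict(pooled)


# Invented example: mirA is encoded at two loci, mirB at one.
locus_counts = {"locus_1": 120, "locus_2": 30, "locus_3": 7, "__no_feature": 5000}
locus_to_mirna = {"locus_1": "mirA", "locus_2": "mirA", "locus_3": "mirB"}

pooled = pool_counts(locus_counts, locus_to_mirna)
print(pooled)
```

The pooled table, one row per miRNA and one column per sample, is what you would then hand to edgeR or DESeq.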
Thank you so much for your comprehensive explanation.