I performed microRNA sequencing, generated a count matrix, and then ran DESeq2. I had one wildtype sample and one non-targeting control. The other samples were NF2, TP53, and PTEN single knockouts and NF2-PTEN, PTEN-TP53, and NF2-TP53 double knockouts in MCF10A cell lines, which were created using CRISPR. Each sample was submitted for sequencing in triplicate.
The following pipeline was used: a) Cutadapt to trim the adapters; b) STAR to align the reads to the GENCODE reference genome; c) HTSeq to create the count matrix.
The count matrix was as follows for one sample, for example:
The counts are extremely low, even for the microRNAs and not just protein coding genes. I am not sure if that is correct or not. Is it normal to get such low counts?
Next, I ran DESeq2, filtered for microRNAs, and compared the double knockouts to the wildtype. Again, I present an example here, but the same microRNAs are up- or downregulated in all three double knockouts, although the log2FoldChange values differ. Only 45 microRNAs are up- or downregulated, and I get NA for the others. Can someone please tell me what's going on? I don't think this is right.
Please let me know if more information is required. Thank you!
miRNA libraries generally have a special adapter sequence that needs to be trimmed before alignment. The kit used will generally include instructions on which adapter to remove. It sounds like this has not been done here.
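One quick way to test whether adapter is still present is to count how many reads contain the start of the adapter sequence. Below is a minimal Python sketch of that check. The function name `adapter_fraction` and the `min_match` seed length are illustrative choices of mine, and the adapter string itself has to come from your kit's documentation (none is assumed here).

```python
def adapter_fraction(fastq_lines, adapter, min_match=10):
    """Return the fraction of reads containing the first `min_match` bases of `adapter`.

    `fastq_lines` is any iterable of FASTQ lines (4 lines per record).
    `adapter` is the 3' adapter from the kit documentation (not assumed here).
    """
    seed = adapter[:min_match].upper()
    total = 0
    hits = 0
    for i, line in enumerate(fastq_lines):
        # The sequence is the 2nd line of every 4-line FASTQ record.
        if i % 4 != 1:
            continue
        total += 1
        if seed in line.upper():
            hits += 1
    return hits / total if total else 0.0
```

If a large share of the supposedly clean reads still contain the seed, the trimming step most likely used the wrong adapter or was skipped. An exact substring match is a rough check only, since it misses adapters with sequencing errors or adapters truncated at the read end.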
Hi @GenoMax,
Thanks for the quick response. I received the clean sequences from the sequencing company, i.e., they said they used Cutadapt to remove the adapters and sent the clean data back. I used those files for the downstream pipeline. Could there be another reason for this?
miRNA reads should be ~25 bp after trimming. Is that what you have? If you have longer reads, then the data has not been trimmed properly.
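To check this on the files you received, you can tabulate the read lengths directly from the FASTQ. The following is a rough Python sketch, not a replacement for a proper QC tool. The function names and the 18-30 bp window are my own assumptions for illustration; adjust the window to whatever you consider a plausible small RNA length.

```python
import gzip
import sys
from collections import Counter
from itertools import islice


def read_length_histogram(fastq_lines, max_reads=None):
    """Count read lengths from an iterable of FASTQ lines (4 lines per record)."""
    # The sequence is the 2nd line of every 4-line record.
    seq_lines = islice(fastq_lines, 1, None, 4)
    if max_reads is not None:
        seq_lines = islice(seq_lines, max_reads)
    lengths = Counter()
    for seq in seq_lines:
        lengths[len(seq.strip())] += 1
    return lengths


def fraction_in_range(lengths, low=18, high=30):
    """Fraction of reads whose length lies in [low, high] (window is an assumption)."""
    total = sum(lengths.values())
    if total == 0:
        return 0.0
    in_range = sum(n for length, n in lengths.items() if low <= length <= high)
    return in_range / total


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python read_lengths.py sample.fastq.gz  (script name is hypothetical)
    with open_fastq(sys.argv[1]) as handle:
        hist = read_length_histogram(handle, max_reads=100000)
    for length in sorted(hist):
        print(length, hist[length], sep="\t")
    print("fraction in 18-30 bp:", round(fraction_in_range(hist), 3))
```

If nearly all reads sit at the full sequencing length (for example 50 or 75 bp) instead of peaking in the low 20s, the adapters were most likely not removed, which would also explain the very low miRNA counts after alignment.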
It seems this is miRNA data, but you have followed an mRNA protocol. Please look at the miRDeep2 protocol to get known and novel miRNA counts.