Question

How to trim miRNA reads?

0

12 months ago

Sanjukta • 0

Hi there,

I am new to bioinformatics. I am trying to prepare fasta.gz files for upload to CPSS, a web server for miRNA-seq datasets. My data is from the Gene Expression Omnibus (GEO) database. A sample of the FASTA file looks like this:

>SRR1658346.1 HISEQ1:187:D0NWFACXX:3:1101:2565:2050 length=51
ATCATACAAGGACAATTTCTTTTAACGTCGTATGCCGTCTTCTGCTTGNAA
>SRR1658346.2 HISEQ1:187:D0NWFACXX:3:1101:2654:2232 length=51
TCGAGGAGCTCACAGTCTAGTATAACGTCGTATGCCGTCTTCTGCTTGAAA
>SRR1658346.3 HISEQ1:187:D0NWFACXX:3:1101:2870:2103 length=51
TTCAAGTAATCCAGGATAGGCTAACGTCGTATGCCGTCTTCTGCTTGAAAA
>SRR1658346.4 HISEQ1:187:D0NWFACXX:3:1101:3001:2147 length=51
TAGCACCATCCGAAATCAGTTTAACGTCGTATGCCGGCTTCTGCTTGAAAA

And my clean file should be like this (an example from CPSS):

>t0000001_823508
TGAGGTAGTAGATTGTATAGTT
>t0000002_757054
TGAGGTAGTAGGTTGTATAGTT
>t0000003_252586
ACAGTAGTCTGCACATTGGTT

With my limited knowledge, I can guess that each read contains adapter sequence in addition to the miRNA itself, which is typically about 21 nt long. But I am not sure how to trim the adapters, as the terminal sequences vary in composition.

(edited) I am trying to re-analyse an miRNA dataset to discover some desirable miRNAs which are not reported in the relevant publication. Here's a link to the webtool.

mirna adapter-trimming fastq • 1.2k views

ADD COMMENT • link updated 12 months ago by GenoMax 148k • written 12 months ago by Sanjukta • 0

0

The question needs some clarification:

What is the purpose of your analysis?
Are you sure the tool is still relevant for your question? The web page I found shows an error (Bad Gateway).
[Why do you want to trim the reads to the mature miRNA?] (Sorry, I missed the point here, of course you should be trimming adapter sequences)
Did you consider miRDeep2 or similar alternatives?

ADD REPLY • link 12 months ago by Michael 55k

0

I have replied to your queries in the main post. And no, I have not tried miRDeep2 yet.

ADD REPLY • link 12 months ago by Sanjukta • 0

0

Hi Michael,

I went with Galaxy for now rather than a local miRDeep2 installation. The installation file is quite large, and temporary internet issues are making the download very slow.

I ran QC on Galaxy and, to my surprise, it could not detect an adapter. I am not sure what else to do. I am writing another post; any pointers will be appreciated.

ADD REPLY • link 12 months ago by Sanjukta • 0

0

sRNAtoolbox is also an option.

ADD REPLY • link 12 months ago by akshay ▴ 10

score 2 · Answer 1 · 2023-12-28

2

12 months ago

GenoMax 148k

It would be ideal to know the kit used, since that tells you the specific adapter that was added to the miRNAs. But looking at the reads above, you can see that each one splits into a variable insert followed by the same sequence, so TAACGTCGTATGCCGTCTTCTGC is likely a safe bet as the adapter to trim from your data:

ATCATACAAGGACAATTTCTTT TAACGTCGTATGCCGTCTTCTGCTTGNAA
TCGAGGAGCTCACAGTCTAGTA TAACGTCGTATGCCGTCTTCTGCTTGAAA
 TTCAAGTAATCCAGGATAGGC TAACGTCGTATGCCGTCTTCTGCTTGAAAA
 TAGCACCATCCGAAATCAGTT TAACGTCGTATGCCGGCTTCTGCTTGAAAA
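
To make the split above concrete, here is a minimal Python sketch of what an adapter trimmer does with that sequence. It is not how any particular tool implements trimming. The function names, the `min_overlap` and `max_mismatches` defaults, and the `t<rank>_<count>` header layout are all assumptions for illustration; the header layout in particular is only inferred from the CPSS example in the question. One mismatch is allowed because the fourth read carries a sequencing error inside the adapter (CCGG instead of CCGT).

```python
from collections import Counter

# 3' adapter suggested in the answer above; confirm it against the library kit.
ADAPTER = "TAACGTCGTATGCCGTCTTCTGC"


def trim_adapter(read, adapter=ADAPTER, min_overlap=8, max_mismatches=1):
    """Cut the read at the first position where the adapter matches.

    A partial adapter at the 3' end counts if at least `min_overlap` bases
    remain. `min_overlap` and `max_mismatches` are illustrative defaults.
    """
    for start in range(len(read)):
        window = read[start:start + len(adapter)]
        if len(window) < min_overlap:
            break
        mismatches = sum(a != b for a, b in zip(window, adapter))
        if mismatches <= max_mismatches:
            return read[:start]
    return read  # no adapter found; leave the read untouched


def collapse(sequences):
    """Collapse identical sequences into (header, sequence) pairs.

    ASSUMPTION: the CPSS header is 't' + 7-digit rank + '_' + read count,
    guessed from the example in the question, not from CPSS documentation.
    """
    counts = Counter(sequences)
    return [
        (f">t{rank:07d}_{count}", seq)
        for rank, (seq, count) in enumerate(counts.most_common(), start=1)
    ]


# The four reads quoted in the question.
reads = [
    "ATCATACAAGGACAATTTCTTTTAACGTCGTATGCCGTCTTCTGCTTGNAA",
    "TCGAGGAGCTCACAGTCTAGTATAACGTCGTATGCCGTCTTCTGCTTGAAA",
    "TTCAAGTAATCCAGGATAGGCTAACGTCGTATGCCGTCTTCTGCTTGAAAA",
    "TAGCACCATCCGAAATCAGTTTAACGTCGTATGCCGGCTTCTGCTTGAAAA",
]

trimmed = [trim_adapter(r) for r in reads]
for header, seq in collapse(trimmed):
    print(header)
    print(seq)
```

For a full dataset a dedicated trimmer is the better choice; the sketch is only meant to show why a fixed adapter sequence is enough even though the inserts differ in length.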

ADD COMMENT • link 12 months ago by GenoMax 148k

0

Thank you so much for pointing the sequence out.

ADD REPLY • link 12 months ago by Sanjukta • 0

score 1 · Answer 2 · 2023-12-28

1

12 months ago

Michael 55k

If you are looking for an all-in-one QC and adapter trimming pipeline, fastp should do well. It should also be able to detect the adapter sequence automatically, or you can supply the sequence given by GenoMax. If you use miRDeep2, it also has a built-in trimming step.
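
As a rough illustration of why automatic detection is possible at all, the Python sketch below counts which k-mers occur in the largest number of reads. In short-insert libraries the adapter is present in nearly every read, so its k-mers rise to the top. This is a simplified stand-in and not fastp's actual algorithm; the function name `kmer_read_counts` and the choice `k=12` are assumptions for illustration.

```python
from collections import Counter


def kmer_read_counts(reads, k=12):
    """Return (kmer, n_reads) pairs, most widely shared k-mers first.

    Each k-mer is counted once per read, so a count equal to the number
    of reads means the k-mer occurs in every read. Ties are broken
    alphabetically to keep the output deterministic.
    """
    counts = Counter()
    for read in reads:
        counts.update({read[i:i + k] for i in range(len(read) - k + 1)})
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# The four reads quoted in the question.
reads = [
    "ATCATACAAGGACAATTTCTTTTAACGTCGTATGCCGTCTTCTGCTTGNAA",
    "TCGAGGAGCTCACAGTCTAGTATAACGTCGTATGCCGTCTTCTGCTTGAAA",
    "TTCAAGTAATCCAGGATAGGCTAACGTCGTATGCCGTCTTCTGCTTGAAAA",
    "TAGCACCATCCGAAATCAGTTTAACGTCGTATGCCGGCTTCTGCTTGAAAA",
]

ranked = kmer_read_counts(reads)
shared_by_all = [kmer for kmer, n in ranked if n == len(reads)]
print(shared_by_all)
```

With only four reads this is a toy example; on real data you would sample many thousands of reads and then check the top candidates against known kit adapters before trusting them.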

ADD COMMENT • link 12 months ago by Michael 55k