Question

Submitting many transcriptomes to TSA - how to automate?

0

5.8 years ago

al-ash ▴ 210

Hi, is it possible to automate the submission of a large number of transcriptomes to the NCBI TSA database without having to create a new TSA submission for each transcriptome? I have many tens of transcriptomes, and submitting them one by one would be quite laborious. Also, is there a way to simulate TSA's check for matches to the UniVec vector database? With a large number of transcriptomes to submit, uploading them one by one only to get errors about presumed adapter contamination, and then having to re-upload the cleaned transcriptomes, is very cumbersome.

I'd be thankful for any advice or hints!

TSA transcriptome NCBI database submition • 1.3k views

0

You can use fastp to detect the presence of adapters.

5.8 years ago by Juke34 9.2k

0

Thanks, that looks handy. But I was actually looking for a method that would mimic the adapter detection on the NCBI website as closely as possible. I already did adapter trimming as part of the transcriptome assembly, but NCBI still reports a few adapters here and there: in total, 137 supposedly contaminated contigs in 37 of the 55 transcriptomes I uploaded. The prevalence is extremely low, but it still means that I need to re-upload 37 transcriptomes and let the website reanalyze them (which takes a long time, and the process apparently sometimes crashes).
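
One way to approximate that check locally might be to run `blastn` against a local copy of UniVec before uploading anything. The sketch below is only an illustration, not NCBI's actual pipeline. It assumes BLAST+ is installed, that a nucleotide database named `UniVec` was built with `makeblastdb`, and that the search settings NCBI documents for VecScreen are still current (worth checking against their documentation). The helper names `vecscreen_command` and `screen_all` and the `.univec.tsv` output suffix are made up for this example.

```python
import subprocess
from pathlib import Path

# Search settings as documented by NCBI for VecScreen (assumption: verify
# against the current VecScreen documentation before relying on them).
VECSCREEN_ARGS = [
    "-task", "blastn", "-reward", "1", "-penalty", "-5",
    "-gapopen", "3", "-gapextend", "3",
    "-dust", "yes", "-soft_masking", "true",
    "-evalue", "700", "-searchsp", "1750000000000",
]

# Tabular output: one hit per line, with query coordinates for later trimming.
OUTFMT = "6 qseqid sseqid pident length qstart qend sstart send evalue bitscore"


def vecscreen_command(fasta, db, out):
    """Build the blastn argument list for one transcriptome FASTA file."""
    return ["blastn", "-query", str(fasta), "-db", str(db),
            *VECSCREEN_ARGS, "-outfmt", OUTFMT, "-out", str(out)]


def screen_all(fasta_dir, db, pattern="*.fasta"):
    """Run the screen on every matching FASTA file; return the report paths."""
    reports = []
    for fasta in sorted(Path(fasta_dir).glob(pattern)):
        out = fasta.with_suffix(".univec.tsv")  # hypothetical naming scheme
        subprocess.run(vecscreen_command(fasta, db, out), check=True)
        reports.append(out)
    return reports
```

Calling `screen_all("assemblies", "UniVec")` would then produce one hit table per transcriptome. Raw BLAST hits are not the same thing as the contigs TSA flags, because VecScreen additionally grades matches by score and position, so the result is a rough pre-check rather than a guarantee that the upload will pass.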

In the end, I uploaded everything to TSA, let it analyze the transcriptomes, and used their adapter contamination report to clean my transcriptomes. Now I'm re-uploading the cleaned transcriptomes and waiting for them to be processed.
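
The cleaning step could be scripted roughly as follows. This is a hedged sketch, not the exact procedure used here. It assumes the flagged regions have already been parsed into 1-based, inclusive `(start, end)` spans per contig ID (parsing the report itself is left out, since its layout is not shown in this thread), and it assumes a minimum contig length of 200 bp, which is my understanding of the TSA requirement. The function names `trim_contig` and `clean_contigs` are hypothetical.

```python
def trim_contig(seq, spans, slack=0):
    """Trim flagged spans that lie within `slack` bases of either end.

    Returns (trimmed_sequence, internal_spans). Internal spans are returned
    untouched, in original coordinates, so they can be reviewed by hand.
    """
    left, right = 0, len(seq)  # kept region, 0-based half-open
    internal = []
    # 5' pass: extend the left cut over spans starting at or near the edge.
    for start, end in sorted(spans):
        if start - 1 <= left + slack:
            left = max(left, end)
        else:
            internal.append((start, end))
    # 3' pass: pull the right cut back over spans ending at or near the edge.
    remaining = []
    for start, end in sorted(internal, key=lambda s: s[1], reverse=True):
        if end >= right - slack:
            right = min(right, start - 1)
        else:
            remaining.append((start, end))
    if left >= right:
        return "", []
    return seq[left:right], sorted(remaining)


def clean_contigs(records, spans_by_id, min_len=200, slack=0):
    """Trim every contig; return (kept, needs_review).

    Contigs with internal hits go to needs_review instead of kept, and
    contigs shorter than min_len after trimming are dropped.
    """
    kept, needs_review = {}, {}
    for contig_id, seq in records.items():
        trimmed, internal = trim_contig(seq, spans_by_id.get(contig_id, []), slack)
        if internal:
            needs_review[contig_id] = internal
        elif len(trimmed) >= min_len:
            kept[contig_id] = trimmed
    return kept, needs_review
```

Terminal hits are trimmed automatically, while contigs with internal hits are set aside, because a match in the middle of a contig may indicate a chimeric assembly that is better split or inspected manually.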

5.8 years ago by al-ash ▴ 210

1

Did you contact NCBI? Maybe the tool they use for detecting the adapters is available on their GitHub?

5.8 years ago by Juke34 9.2k