carp reads in my bacterial small RNome
21 months ago

Hello

I have a few E. coli small RNome samples that I tried to align against both the E. coli genome and against only the non-coding E. coli RNAs. In both cases, a high percentage of reads (~90%) were unaligned.

I BLASTed a few of them and sometimes got hits to phage fragments, other E. coli strains and, curiously, Cyprinus carpio (all with a query cover of 40-60%). I also noticed that many of the unaligned reads end with some variation of a long run of AAAGGGGGGG, which does not seem to be the case for the reads that aligned. This happened for every sample.
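To put a number on the tail observation, something like the following could be run on the aligned and unaligned reads separately. It is only a minimal sketch: the regular expression and its minimum run lengths are assumptions about what "some variation of AAAGGGGGGG" means, the function names are made up for illustration, and the input is assumed to be plain 4-line FASTQ.

```python
import re

# Hypothetical pattern: a run of A's followed by a run of G's at the 3' end.
# The minimum run lengths (2 and 5) are assumptions; tune them to the data.
TAIL_RE = re.compile(r"A{2,}G{5,}$")


def has_ag_tail(seq):
    """Return True if the read ends with a run of A's followed by a run of G's."""
    return TAIL_RE.search(seq.upper()) is not None


def tail_fraction(fastq_lines):
    """Fraction of reads (4-line FASTQ records) that end with the A/G tail."""
    total = 0
    tailed = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # second line of each record holds the sequence
            total += 1
            if has_ag_tail(line.strip()):
                tailed += 1
    return tailed / total if total else 0.0
```

Any iterable of lines works, so an open file handle on an uncompressed FASTQ can be passed directly to `tail_fraction`. If the fraction is much higher in the unaligned set, trimming those tails before alignment would be the obvious thing to try.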

Has anyone experienced something similar?

rna-seq • 700 views
21 months ago
GenoMax 147k

Those carp hits are not real. Which DB were you aligning to? The most likely cause is that someone submitted contaminated sequences to NCBI without cleaning them up properly.

See: https://dgg32.medium.com/carp-in-the-soil-1168818d2191

21 months ago

The carp genome is a well-known problem, going back to 2014, I think. It should really be thrown out of the databases or replaced with a new version.

http://grahametherington.blogspot.com/2014/09/why-you-should-qc-your-reads-and-your.html
