Hi everybody.
I'm doing single-cell RNA-seq analysis.
My starting point is FASTQ files with paired-end reads from the Nextera XT platform. Read length is 75 bp.
I've trimmed my reads for both quality and adapter content with Trim Galore, then ran FastQC and MultiQC to check that everything is OK.
I found that several samples (about 60 of 350) have overrepresented sequences to varying extents, though in most cases only slightly.
I'd like to BLAST the overrepresented sequences to see what they are.
Is there a simple way to get them?
I searched online but couldn't find anything...
Hi, welcome to Biostars! Which platform was used? 10x, SMART-seq? What was the read length? Trimmed for which adapter, or for quality? Overrepresented sequences: which ones, and to what extent? Please post some details. There are several constant regions in scRNA-seq fragments depending on the platform, so overrepresentation can be normal and expected.
Sorry, I'm new here. I'll edit my question to add more details. You're right, I was too confident.
Don't worry :)
michele.tebaldi.92: When you edit your original post, it would help to include some visual information. This post explains how: How to add images to a Biostars post
michele.tebaldi.92: Maybe sequence clustering software such as CD-HIT could help you.
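On the original question of getting the sequences out: FastQC writes a tab-separated `fastqc_data.txt` inside each `*_fastqc.zip`, and the overrepresented sequences sit in the module block that starts with `>>Overrepresented sequences` and ends with `>>END_MODULE`. Below is a minimal sketch in Python that pulls that block out and turns it into FASTA for BLAST. The function names, the FASTA header layout, and the example file name are my own choices for illustration, not part of FastQC, and I am assuming the usual four columns (Sequence, Count, Percentage, Possible Source).

```python
import zipfile


def parse_overrepresented(fastqc_data_text):
    """Return (sequence, count, percentage, possible_source) tuples
    from the text content of a fastqc_data.txt file."""
    records = []
    in_module = False
    for line in fastqc_data_text.splitlines():
        if line.startswith(">>Overrepresented sequences"):
            in_module = True
            continue
        if not in_module:
            continue
        if line.startswith(">>END_MODULE"):
            break
        # Skip the "#Sequence ..." header row and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # assumption: rows that do not have 4 columns are ignored
        seq, count, pct, source = fields[:4]
        records.append((seq, int(float(count)), float(pct), source))
    return records


def read_fastqc_data(zip_path):
    """Read fastqc_data.txt out of a FastQC result archive."""
    with zipfile.ZipFile(zip_path) as zf:
        name = next(n for n in zf.namelist() if n.endswith("fastqc_data.txt"))
        return zf.read(name).decode("utf-8")


def to_fasta(records, sample):
    """Format the records as FASTA; the header layout is arbitrary."""
    lines = []
    for i, (seq, count, pct, _source) in enumerate(records, start=1):
        lines.append(f">{sample}_overrep_{i} count={count} pct={pct:.2f}")
        lines.append(seq)
    return "\n".join(lines) + ("\n" if lines else "")


if __name__ == "__main__":
    # "sample1_fastqc.zip" is a hypothetical file name; use your own archive.
    text = read_fastqc_data("sample1_fastqc.zip")
    print(to_fasta(parse_overrepresented(text), "sample1"), end="")
```

Looping this over all the `*_fastqc.zip` files and concatenating the output should give one FASTA file that can be submitted to BLAST. With around 60 affected samples, many sequences will probably repeat across samples, so it may be worth deduplicating first.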