cluster and determine frequency of reads in fastq file

8.1 years ago

mccormack ▴ 90

How can I determine the frequency of each read in a FASTQ file? I would also like to cluster the reads in that file.

sequence alignment • 2.2k views

ADD COMMENT • link updated 8.0 years ago by Biostar 20 • written 8.1 years ago by mccormack ▴ 90

Are you referring to counting "how many sequence types" are present in the dataset? What would be the purpose of clustering the reads? Deduplication?
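If the goal is simply to count how often each distinct sequence occurs, a minimal sketch in Python could look like the following. It assumes plain, unwrapped FASTQ with exactly four lines per record, and it counts exact sequence matches only. The function names `read_sequences` and `count_reads` are hypothetical, chosen here for illustration.

```python
from collections import Counter


def read_sequences(handle):
    """Yield the sequence line of each FASTQ record.

    Assumption: every record is exactly four lines
    (header, sequence, '+', qualities) with no line wrapping.
    """
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of each 4-line record
            yield line.strip()


def count_reads(handle):
    """Return a Counter mapping each distinct sequence to its read count."""
    return Counter(read_sequences(handle))
```

To use it, open the file and pass the handle, e.g. `with open("reads.fastq") as fh: counts = count_reads(fh)` (the file name is a placeholder). `counts.most_common(10)` then lists the ten most abundant sequences, and dividing a count by `sum(counts.values())` gives its relative frequency.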

ADD REPLY • link 8.1 years ago by GenoMax 148k

I am trying to follow the procedure found here: https://dnacore.mgh.harvard.edu/new-cgi-bin/site/pages/crispr_sequencing_pages/crispr_sequencing_algorithm.jsp

ADD REPLY • link 8.1 years ago by mccormack ▴ 90

Have you tried to email the person on that page to see if they have ready code that implements that procedure?

ADD REPLY • link 8.1 years ago by GenoMax 148k

Yes, I e-mailed and received a reply before posting this question. The reply was that there could be no clearer explanation than what appears on the web page.

ADD REPLY • link 8.1 years ago by mccormack ▴ 90

Hi Mccormack,

I am also interested in doing the same. I am working on miRNAs, which contain well-conserved regions. I would like to determine the frequency of each read and cluster the reads, starting from a FASTQ file.

Can you please share your inputs?
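For the clustering part, one simple option is greedy, abundance-sorted clustering: take the distinct sequences from most to least frequent and attach each one to the first existing cluster whose centroid differs by at most a few mismatches. The sketch below is only an illustration under stated assumptions. It is not the procedure from the page linked above or the method of any particular tool, it compares only sequences of equal length (Hamming distance), and the names `hamming`, `greedy_cluster` and `max_mismatches` are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Return the number of mismatching positions, or None if lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def greedy_cluster(counts, max_mismatches=1):
    """Group sequences around the most abundant ones.

    counts: mapping of sequence -> read count.
    Returns a dict mapping each centroid sequence to a list of
    (sequence, count) pairs assigned to it, centroid first.
    """
    clusters = {}
    # Most abundant first; ties broken alphabetically for reproducibility.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for seq, n in ordered:
        for centroid in clusters:
            d = hamming(seq, centroid)
            if d is not None and d <= max_mismatches:
                clusters[centroid].append((seq, n))
                break
        else:
            # No centroid was close enough, so this sequence starts a new cluster.
            clusters[seq] = [(seq, n)]
    return clusters


example = Counter({"ACGT": 5, "TTTT": 3, "ACGA": 2, "ACG": 1})
clusters = greedy_cluster(example, max_mismatches=1)
```

This compares every sequence against every centroid, so it is only practical for a modest number of distinct sequences. Reads of different lengths never merge here; handling those would need an edit distance or an alignment instead.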

ADD REPLY • link 7.8 years ago by bioinforesearchquestions ▴ 370

bump

I am also interested in this question. I am currently trying to map RNA-seq reads to a newly available reference genome. From what I read in a previous transcriptome paper done in this model, clustering the reads to unique groups seems to be useful/necessary?

Thank you!

ADD REPLY • link 7.7 years ago by nancydong20 ▴ 130
