Where can I get FASTQ sequence files with reads of length 150 or more?
7.0 years ago
AHW ▴ 90

I want to align human sequence reads to the reference genome. I need reads of length 150 or more (e.g. 500) to test an algorithm. Where can I find such reads, both single-end and paired-end? I got reads from the 1000 Genomes Project, but they are only around 100 bases long and I want longer reads.

alignment sequence • 1.9k views
7.0 years ago
Gungor Budak ▴ 270

Use the Sequence Read Archive (SRA) advanced search, setting the read length to 150 and the species to Homo sapiens. You can add more filters if you want. The query should look something like this:

(150[ReadLength]) AND "Homo sapiens"[orgn:__txid9606] 

And here is a result for paired-end sequencing data with read length 150.
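
If you would rather run the same query from a script, here is a minimal sketch that builds the request URL for NCBI's E-utilities ESearch service. Using E-utilities at all is my assumption (the answer above only mentions the web search form), and `build_sra_search_url` and its parameters are made-up names for illustration. The sketch only builds the URL; fetching it returns matching SRA record IDs.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (an assumption; the answer above uses the web form).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_sra_search_url(read_length=150, retmax=20):
    """Return an ESearch URL for human SRA records with the given read length."""
    # Same query as in the answer above, with the read length as a parameter.
    term = f'({read_length}[ReadLength]) AND "Homo sapiens"[orgn:__txid9606]'
    params = {"db": "sra", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Paste the printed URL into a browser or fetch it with any HTTP client.
    print(build_sra_search_url(150))
```

Changing `read_length` to a larger value is the only edit needed to look for longer reads, though whether any records match depends on what has been deposited.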


Agaz,

If transcriptome reads generated on PacBio interest you, you can access European Nucleotide Archive accession PRJEB3969 (https://www.ebi.ac.uk/ena/data/view/PRJEB3969)
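
To list the FASTQ files behind that accession from a script, something like the sketch below might help. It assumes the ENA Portal API "filereport" endpoint and the `run_accession` and `fastq_ftp` field names, none of which are mentioned in this thread, so please check them against the current ENA documentation. `build_ena_filereport_url` is a made-up helper name, and the sketch only builds the URL.

```python
from urllib.parse import urlencode

# ENA Portal API file report endpoint (an assumption; verify against the ENA docs).
ENA_FILEREPORT_URL = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def build_ena_filereport_url(accession="PRJEB3969"):
    """Return a URL that should yield a TSV of runs and FASTQ locations."""
    params = {
        "accession": accession,
        "result": "read_run",                   # one row per sequencing run
        "fields": "run_accession,fastq_ftp",    # assumed field names
        "format": "tsv",
    }
    return f"{ENA_FILEREPORT_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_ena_filereport_url())
```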

7.0 years ago

This is a very basic requirement, and a lot of tools are available to simulate artificial reads from the genome in question. More interestingly, you can also define the number, length and quality of the reads. One such well-documented program is ArtificialFastqGenerator.

Here is the publication link

Another very sophisticated tool is ART (courtesy: Biostars Handbook), which can mimic sequencing platforms very well.
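
To show what "simulating reads" means in the simplest case, here is a toy sketch. It is not how ArtificialFastqGenerator or ART work: it has no error model, uses a fixed fragment length and a constant quality score, and all function and parameter names are made up for illustration. It is only meant for quickly producing reads of an arbitrary length (150, 500, ...) from a reference string.

```python
import random


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def simulate_paired_reads(reference, n_pairs, read_len=150, fragment_len=400,
                          qual_char="I", seed=0):
    """Yield (read1, read2) FASTQ records sampled uniformly from `reference`.

    Toy model: no sequencing errors, fixed fragment length, constant quality.
    """
    if not read_len <= fragment_len <= len(reference):
        raise ValueError("need read_len <= fragment_len <= len(reference)")
    rng = random.Random(seed)          # seeded so runs are reproducible
    qual = qual_char * read_len        # same quality character for every base
    for i in range(n_pairs):
        start = rng.randrange(len(reference) - fragment_len + 1)
        fragment = reference[start:start + fragment_len]
        read1 = fragment[:read_len]                        # forward strand
        read2 = reverse_complement(fragment)[:read_len]    # opposite end, reverse strand
        yield (f"@sim_{i}/1\n{read1}\n+\n{qual}\n",
               f"@sim_{i}/2\n{read2}\n+\n{qual}\n")


if __name__ == "__main__":
    # Demo on a random 5 kb "reference"; replace with a real sequence in practice.
    ref = "".join(random.Random(1).choice("ACGT") for _ in range(5000))
    with open("sim_1.fastq", "w") as f1, open("sim_2.fastq", "w") as f2:
        for r1, r2 in simulate_paired_reads(ref, n_pairs=10, read_len=150):
            f1.write(r1)
            f2.write(r2)
```

For single-end data, just keep the first record of each pair. For anything beyond a quick test of an aligner, a dedicated simulator with a realistic error profile is the better choice.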
