TopHat2 RNA-Seq
7.7 years ago
dimitrischat ▴ 210

Hi all. I have a FASTQ file with 28,020,920 sequences in total, but I only want to run TopHat on a subset, for example 10,000 reads. Which option does that? I can't seem to find one in the manual.


Not sure why you want to do that, but you could use reformat.sh from the BBMap suite to sample 10,000 reads into a new file and then run TopHat on that file.

reformat.sh in=reads.fq.gz out=sampled.fq.gz samplereadstarget=10000
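
If BBMap is not installed, a similar random sample can be sketched with standard tools. This is only a rough alternative, not part of BBMap or TopHat: the function name sample_reads is hypothetical, and it assumes single-end reads, four-line FASTQ records, no tab characters in the read headers, and GNU shuf being available.

```shell
# Hypothetical helper: randomly sample N reads from a gzipped FASTQ file.
# Assumes four-line FASTQ records, no tabs inside records, and GNU shuf.
sample_reads() {
    in_fastq_gz=$1    # input file, e.g. reads.fq.gz
    n_reads=$2        # number of reads to keep, e.g. 10000
    out_fastq_gz=$3   # output file, e.g. sampled.fq.gz

    gzip -dc "$in_fastq_gz" \
        | paste - - - - \
        | shuf -n "$n_reads" \
        | tr '\t' '\n' \
        | gzip > "$out_fastq_gz"
    # paste joins each 4-line record into one tab-separated line,
    # shuf picks n_reads of those lines at random,
    # tr splits every record back into its four lines.
}
```

It would be called as: sample_reads reads.fq.gz 10000 sampled.fq.gz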

Thank you. But can't you do it using TopHat itself?


You can check the manual, but I don't think TopHat has an option to sample a fraction of the reads.


Hi dimitrischat,

It's worth noting that TopHat2 has been, essentially, deprecated by the developers, who recommend using HISAT2 instead. Unless you have a very specific reason to adopt TopHat2 in your pipeline, it's probably best to follow their advice.

7.5 years ago
Oskar ▴ 20

Hi Dimitris - I guess you want a small sample of your reads for testing and debugging purposes. If so, you can create a file containing a small number of reads. For example:

$ zcat myreads.fastq.gz | head -n 400000 | gzip > Test100k.fastq.gz

which takes the first 100k reads from “myreads” and stores them in “Test100k”. Each FASTQ read occupies four lines, which is why 100k reads correspond to 400,000 lines.
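
The same idea can be wrapped in a small function so the read count is a parameter, for instance the 10,000 reads mentioned in the question. This is a minimal sketch: the function name take_first_reads and the output file name are hypothetical, and it assumes standard four-line FASTQ records (no wrapped sequence lines).

```shell
# Hypothetical helper: keep the first N reads of a gzipped FASTQ file.
# Assumes four lines per read, so N reads correspond to N * 4 lines.
take_first_reads() {
    in_fastq_gz=$1    # input file, e.g. myreads.fastq.gz
    n_reads=$2        # number of reads to keep, e.g. 10000
    out_fastq_gz=$3   # output file, e.g. Test10k.fastq.gz

    # gzip -dc is the portable spelling of zcat
    gzip -dc "$in_fastq_gz" \
        | head -n "$((n_reads * 4))" \
        | gzip > "$out_fastq_gz"
}
```

It would be called as: take_first_reads myreads.fastq.gz 10000 Test10k.fastq.gz. Note that this keeps the reads at the top of the file rather than a random sample, which is usually fine for testing and debugging.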

Hope it helps!
