Question

How to extract the subset of FASTQ reads mapping to one specific chromosome

8.0 years ago

jolin0701-dy ▴ 100

I have paired-end RNA-seq FASTQ reads, and the files are too large.

I'd like to create a subset of reads only containing the reads mapping to chrX.

Firstly, I need to map all the reads to chrX.

Then how can I select the mapped reads?

Thanks a lot~~

rna-seq • 4.4k views

updated 8.0 years ago by Ron ★ 1.2k • written 8.0 years ago by jolin0701-dy ▴ 100

Just as a remark: mapping all your reads only to chrX is not the best solution. Reads that map suboptimally to chrX but better to, e.g., chr6 would still get mapped to chrX. An aligner doesn't necessarily find the correct position of a read, only the best match in the reference it is given.

8.0 years ago by WouterDeCoster 47k

score 0 · Answer 1 · 2016-11-23

8.0 years ago

bongok ▴ 40

1) You could map the reads using TopHat or BWA.

2) Then extract the mapped reads using samtools. See: How To Filter Mapped Reads With Samtools

Best

8.0 years ago by bongok ▴ 40

Thanks.

I know how to do it for single-end reads:

samtools view in.bam | awk '{printf "@%s\n%s\n+\n%s\n", $1,$10,$11}' > single.fastq

But how do I do it for paired-end reads?

Thanks

8.0 years ago by jolin0701-dy ▴ 100
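The single-end awk one-liner above can be extended to paired-end data by looking at the SAM FLAG field (column 2). The following is only a sketch, not a tested replacement for a dedicated tool such as `samtools fastq`. The function name `sam_to_paired_fastq` and the output prefix are hypothetical. It assumes SAM text on stdin with mates grouped by read name (for example, a name-sorted BAM), and it does not check that both mates of a pair are present. The flag bits used (0x10 reverse strand, 0x40 first in pair, 0x80 second in pair, 0x100 secondary, 0x800 supplementary) come from the SAM specification.

```shell
# Sketch: split SAM records into two FASTQ files, one per mate.
# sam_to_paired_fastq is a hypothetical helper name.
sam_to_paired_fastq() {
    prefix=$1
    awk -v r1="${prefix}_1.fastq" -v r2="${prefix}_2.fastq" '
    # Return 1 if the bit given by mask (a power of two) is set in flag.
    function bit(flag, mask) { return int(flag / mask) % 2 }
    # Reverse a string; if comp is 1, also complement the bases.
    function rev(s, comp,    i, c, out) {
        out = ""
        for (i = length(s); i >= 1; i--) {
            c = substr(s, i, 1)
            if (comp && (c in map)) c = map[c]
            out = out c
        }
        return out
    }
    BEGIN {
        FS = "\t"
        map["A"] = "T"; map["C"] = "G"; map["G"] = "C"; map["T"] = "A"
    }
    /^@/ { next }                             # skip SAM header lines
    bit($2, 256) || bit($2, 2048) { next }    # skip secondary/supplementary
    {
        seq = $10; qual = $11
        # Reverse-strand alignments store the reverse complement of the read,
        # so restore the original orientation before writing FASTQ.
        if (bit($2, 16)) { seq = rev(seq, 1); qual = rev(qual, 0) }
        if (bit($2, 64))       printf "@%s\n%s\n+\n%s\n", $1, seq, qual > r1
        else if (bit($2, 128)) printf "@%s\n%s\n+\n%s\n", $1, seq, qual > r2
    }'
}
# Example (file names are placeholders):
#   samtools view namesorted.bam | sam_to_paired_fastq subset
```

Unlike the single-end one-liner, this restores the original read orientation for reverse-strand alignments, which matters if the FASTQ files are going to be realigned.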

score 0 · Answer 2 · 2016-11-23

8.0 years ago

Ron ★ 1.2k

For splitting FASTQ per chromosome: Split Fastq File Into Different Files Only Comprising One Chromosome Each

For splitting a BAM per chromosome, something like: samtools view -b in.bam chr1:100-200 > small.bam (where in.bam is your coordinate-sorted, indexed BAM)

8.0 years ago by Ron ★ 1.2k
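As a sketch of the region-extraction step for the chrX case in the question: the helper below indexes a BAM and keeps only mapped reads on one chromosome. The function name `extract_chrom_bam` and all file names are hypothetical placeholders. It assumes the BAM is coordinate-sorted, and that the chromosome name matches the BAM header exactly (some references use `X` rather than `chrX`). In line with the remark earlier in the thread, the input would ideally come from an alignment against the whole genome rather than chrX alone.

```shell
# Sketch: keep only mapped reads on one chromosome.
# extract_chrom_bam is a hypothetical helper; file names are placeholders.
extract_chrom_bam() {
    in_bam=$1     # coordinate-sorted BAM
    chrom=$2      # must match a sequence name in the BAM header
    out_bam=$3

    # Region queries need an index next to the BAM file.
    samtools index "$in_bam" || return 1
    # -b: write BAM output; -F 4: drop records flagged as unmapped.
    samtools view -b -F 4 "$in_bam" "$chrom" > "$out_bam"
}
# Example: extract_chrom_bam in.bam chrX chrX.bam
```

Passing a bare chromosome name as the region selects the whole chromosome; a range such as `chr1:100-200` restricts it further.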

Thanks.

How do I convert the BAM to FASTQ, especially for paired-end reads?

ADD REPLY • link 8.0 years ago by jolin0701-dy ▴ 100

Here is a recent thread on this: A: Converting large (compressed and unsorted) BAM files to fastq

8.0 years ago by Ron ★ 1.2k
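For the BAM-to-FASTQ question, one commonly used route is `samtools collate` piped into `samtools fastq`. The sketch below follows the pattern in the samtools manual and assumes a reasonably recent samtools release (older versions of `collate` may require an extra temporary-file prefix argument). The function name `bam_to_paired_fastq` and the file names are hypothetical.

```shell
# Sketch: convert a paired-end BAM into two FASTQ files.
# bam_to_paired_fastq is a hypothetical helper; file names are placeholders.
bam_to_paired_fastq() {
    in_bam=$1
    prefix=$2
    # collate groups mates next to each other; -u: uncompressed, -O: to stdout.
    # fastq: -1/-2 receive first/second mates, -s receives reads whose mate
    # is missing, -0 receives reads flagged as neither or both,
    # -n leaves read names unchanged (no /1 or /2 suffix).
    samtools collate -u -O "$in_bam" |
        samtools fastq -1 "${prefix}_1.fastq" -2 "${prefix}_2.fastq" \
            -0 /dev/null -s "${prefix}_singletons.fastq" -n
}
# Example: bam_to_paired_fastq chrX.bam chrX
```

One caveat when the input BAM holds only one chromosome: a read whose mate aligned elsewhere has lost its partner, so it ends up in the singletons file rather than in the paired output.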