Question

STAR for ChIP-seq

2

7.2 years ago

varsha619 ▴ 90

Hello, is the STAR aligner recommended for ChIP-seq data? I am trying to use STAR on ChIP-seq data to recover reads that map to multiple regions of the genome while allowing mismatches, which STAR seems to handle better than Bowtie2. Only around 14% of reads map, and around 80% fall under "% of reads unmapped: too short". Following the suggestions at https://groups.google.com/forum/#!topic/rna-star/E_mKqm9jDm0, I tried the --alignIntronMax 1 option, but the results are similar. Please advise, thank you.

star ChIP-Seq alignment • 8.8k views

ADD COMMENT • link updated 7.2 years ago by predeus ★ 2.1k • written 7.2 years ago by varsha619 ▴ 90

0

around 80% in "% of reads unmapped: too short".

What is the size distribution of the reads in that pool (or in this data set in general)? If the reads are very short (under 30-40 bp after adapter scanning/trimming), it may indeed be difficult to map them.
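One way to look at the size distribution is a quick read-length histogram. This is a minimal sketch, not part of the original reply: the function name `read_length_histogram` is made up for illustration, and it assumes plain, uncompressed FASTQ with exactly four lines per record.

```shell
# Print "length<TAB>count" for every read length seen in a FASTQ file.
# Assumption: 4-line FASTQ records (sequence is always the 2nd line of each record).
# Reads from the file(s) given as arguments, or from stdin if none are given.
read_length_histogram() {
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (l in n) print l "\t" n[l] }' "$@" | sort -n
}
```

For example, `read_length_histogram in.fastq` lists each length with its read count, shortest first; for gzipped input, `gunzip -c in.fastq.gz | read_length_histogram` should work the same way.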

ADD REPLY • link 7.2 years ago by GenoMax 152k

0

@genomax, the average read size is 50-75 bp.

ADD REPLY • link 7.2 years ago by varsha619 ▴ 90

0

Then @predeus' answer may not apply. You likely have a different problem. Have you BLASTed a sample of the reads that do not map? You could have some sort of contamination in your data.
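To BLAST a handful of reads, they first need to be in FASTA format. The sketch below is one way to do that and is not from the original reply: `fastq_head_to_fasta` is a hypothetical helper name, it assumes 4-line FASTQ records, and it simply takes the first N records rather than a random sample.

```shell
# Convert the first N records (default 100) of a 4-line FASTQ file to FASTA.
# $1 = FASTQ file, $2 = number of records to keep (optional).
fastq_head_to_fasta() {
    awk -v n="${2:-100}" '
        NR % 4 == 1 { printf(">%s\n", substr($0, 2)) }  # header: drop "@", add ">"
        NR % 4 == 2 { print }                           # sequence line
        NR >= 4 * n { exit }                            # stop after n records
    ' "$1"
}
```

STAR can write the reads it fails to map to a separate file with `--outReadsUnmapped Fastx`; running something like `fastq_head_to_fasta Unmapped.out.mate1 50 > sample.fa` on that file would give a small FASTA to paste into BLAST.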

ADD REPLY • link 7.2 years ago by GenoMax 152k

1

I concur with genomax. Did you run FastQC on the FASTQ files? If STAR and Bowtie2 agree, it is likely that only about 18% of your reads are usable. Depending on what FastQC says, you may be able to rescue some more reads by adapter trimming.
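As a rough complement to FastQC, one can count how many reads contain an adapter prefix. This is only a sketch: the helper name `count_adapter_reads` is invented here, and the default sequence `AGATCGGAAGAGC` (the common start of Illumina TruSeq-style adapters) is an assumption about the library prep that may not match this data.

```shell
# Count reads that contain an adapter prefix; prints "hits total".
# $1 = 4-line FASTQ file, $2 = adapter sequence (optional).
# Assumption: the default adapter is the Illumina TruSeq-style prefix.
count_adapter_reads() {
    awk -v a="${2:-AGATCGGAAGAGC}" '
        NR % 4 == 2 { total++; if (index($0, a) > 0) hits++ }
        END { printf("%d %d\n", hits, total) }
    ' "$1"
}
```

A large fraction of hits would suggest that adapter trimming is worth trying before realigning. An exact-substring search misses adapters with sequencing errors or partial adapters at the read end, so this gives a lower bound at best.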

ADD REPLY • link 7.2 years ago by Friederike 9.0k

0

I will check this, thank you for your help!

ADD REPLY • link 7.2 years ago by varsha619 ▴ 90

0

Can you post the entire command you're using and the log file output?

ADD REPLY • link 7.2 years ago by Friederike 9.0k

0

STAR --genomeDir /genomes/dm6/Sequence/STARindex --runThreadN 8 --readFilesIn in.fastq --outSAMtype BAM SortedByCoordinate --outFileNamePrefix star_out

ADD REPLY • link 7.2 years ago by varsha619 ▴ 90

score 3 · Answer 1 · 2018-05-29

3

7.2 years ago

predeus ★ 2.1k

"Too short" is STAR's euphemism for reads that simply fail to align. What alignment rate do you get with Bowtie2? ChIP-seq is very tricky experimentally, so libraries are quite often full of adapter sequences and the like. The choice of aligner should not matter all that much, as long as you use a well-supported modern one such as bwa, Bowtie2, or STAR.

Some targets (e.g. H3K9me3) also yield many multimapping reads because these marks are enriched in heterochromatin.
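On the "too short" label: STAR reports a read this way when the best alignment it finds covers too small a fraction of the read, as controlled by `--outFilterMatchNminOverLread` and `--outFilterScoreMinOverLread` (both 0.66 by default). The sketch below only illustrates the arithmetic; `min_aligned_bases` is a hypothetical helper, and the rounding is an approximation rather than STAR's exact internal rule.

```shell
# Approximate minimum number of aligned bases needed to avoid "too short".
# $1 = read length, $2 = required fraction in percent (default 66,
# mirroring the 0.66 default; STAR's own rounding may differ slightly).
min_aligned_bases() {
    echo $(( ($1 * ${2:-66} + 99) / 100 ))
}
```

With 50-75 bp reads this comes to roughly 33-50 aligned bases, so reads carrying a lot of adapter or contaminant sequence would land in the "too short" category even though the reads themselves are not short.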

ADD COMMENT • link 7.2 years ago by predeus ★ 2.1k

0

Bowtie2 also gave me only 18% alignment, but I was confused because the file sizes are not comparable: the BAM file from Bowtie2 (1,035,494,925) is much larger than the one from STAR (275,497,682). P.S. It's the fly genome, hence the smaller sizes.

ADD REPLY • link 7.2 years ago by varsha619 ▴ 90

0

Size discrepancy could just be due to bowtie2 including unaligned reads in the BAM vs STAR not doing that.

ADD REPLY • link 7.2 years ago by GenoMax 152k

0

@genomax, does Bowtie2 output unaligned reads by default, even when I don't use the --un option?

ADD REPLY • link 7.2 years ago by varsha619 ▴ 90

2

Since you did not use --no-unal, your BAM file must contain the unaligned reads. --un writes these reads to a separate file.
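To check whether a BAM file carries unaligned records, one can count the records with the "unmapped" FLAG bit (0x4) set. This is a minimal sketch under stated assumptions: `count_unmapped_sam` is a made-up helper that reads SAM text on stdin, so a BAM file has to be piped through `samtools view` first.

```shell
# Read SAM text on stdin; print "unmapped total" (counts of records).
# A record is unmapped when bit 0x4 of the FLAG column (field 2) is set.
count_unmapped_sam() {
    awk -F '\t' '
        /^@/ { next }    # skip header lines
        { total++; if (int($2 / 4) % 2 == 1) unmapped++ }
        END { printf("%d %d\n", unmapped, total) }
    '
}
```

For example, `samtools view bowtie2_out.bam | count_unmapped_sam` (file name is a placeholder). A large first number for the Bowtie2 file and zero for the STAR file would be consistent with the size difference discussed above. The counts are per record, so secondary alignments of multimapping reads are included in the total.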