RNA-seq and ChIP-seq alignment
3 months ago

I am just getting introduced to bioinformatics, and I am having a hard time. My PI has no background in bioinformatics and she is no help... she just shares "scripts" stored in Word docs where she copies and pastes commands x1M times...

I got confused and wanted to ask about some details here, hoping someone would help me understand.

I have paired-end ChIP-seq and RNA-seq data sets.

To analyze both data sets, my PI's "scripts" jump straight to alignment, where R1 and R2 are aligned separately. The SAM files from both reads are then used to create a tag directory for the sample. Here is an example to be clearer:

## Align reads with STAR
STAR \
  --readFilesCommand zcat \
  --genomeDir /STAR_indexes/mm10/ \
  --runThreadN 24 \
  --readFilesIn Samp1_Rep1_R1_001.fastq.gz \
  --outFileNamePrefix Sample1_Rep1_R1_001_

STAR \
  --readFilesCommand zcat \
  --genomeDir /STAR_indexes/mm10/ \
  --runThreadN 24 \
  --readFilesIn Samp1_Rep1_R2_001.fastq.gz \
  --outFileNamePrefix Sample1_Rep1_R2_001_

makeTagDirectory Sample1_Rep1 Sample1_Rep1_R1_001_Aligned.out.sam Sample1_Rep1_R2_001_Aligned.out.sam

Shouldn't both reads be aligned together? Or is this way also fine?

If the reads are aligned separately as shown above, and the two SAM files from R1 and R2 are then used to create the TagDirectory, does that mess up the TagDirectory in any way?

Versus if the reads were aligned in paired-end mode and that single SAM file was used to make the TagDirectory?

Or does it make no difference either way when creating the TagDirectory?

Also... in this case my PI skips the trimming... wouldn't "non-trimmed" samples introduce some bias?

Sorry if these are stupid Q! Just hard to wrap my head around how this works.

Thanks!


Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` renders as inline code), or, for multi-line code, use either (a) the code option in the formatting bar or (b) fenced code blocks. Fenced code blocks also enable syntax highlighting. If your code has a long line containing a single command, break it into multiple lines with proper escape sequences so it's easier to read and still runs when copy-pasted. I've done it for you this time.

3 months ago
Trivas ★ 1.8k

Not trimming is ~OK because STAR can soft-clip read ends that do not align (e.g. leftover adapter sequence); see "STAR Read Clipping".
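
If you want to see how much soft-clipping is actually happening in your own alignments, a rough check is to count records whose CIGAR string (column 6 of the SAM file) contains an `S` operation. This is only a sketch: `count_softclipped` is a hypothetical helper name, and the file name in the comment is the one from the question.

```shell
# Report how many aligned records in a SAM file contain a soft-clip ("S") CIGAR operation.
# Hypothetical helper, not part of STAR or HOMER.
count_softclipped() {
  awk -F '\t' '
    /^@/      { next }       # skip SAM header lines
    $6 != "*" { total++ }    # count only records that have an alignment
    $6 ~ /S/  { clipped++ }  # CIGAR contains a soft-clip operation
    END { printf "%d of %d alignments soft-clipped\n", clipped, total }
  ' "$1"
}

# Example (file name from the question):
#   count_softclipped Sample1_Rep1_R1_001_Aligned.out.sam
```

A large soft-clipped fraction would suggest that adapters or low-quality ends are present and that the aligner is handling them; it does not by itself tell you whether trimming would change your downstream results.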

As for mapping R1 and R2 separately, that's a bit weird for RNA-seq. It's less weird for ChIP-seq, but it also depends on the average fragment size/insert length of your libraries.
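
For comparison, here is a minimal sketch of what a single paired-end run could look like, reusing the flags and file names from the question. The only change to the STAR call is that both mates are passed to `--readFilesIn` in one run, so one SAM file comes out and goes into `makeTagDirectory`. The `run` wrapper and the `DRY_RUN` variable are hypothetical additions so the commands can be printed and reviewed before anything is executed; whether extra options are appropriate for your library type is not covered here.

```shell
SAMPLE="Sample1_Rep1"            # output prefix, as in the question
R1="Samp1_Rep1_R1_001.fastq.gz"  # mate 1 (file name from the question)
R2="Samp1_Rep1_R2_001.fastq.gz"  # mate 2 (file name from the question)

# Hypothetical wrapper: print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

## Align both mates together in paired-end mode
run STAR \
  --readFilesCommand zcat \
  --genomeDir /STAR_indexes/mm10/ \
  --runThreadN 24 \
  --readFilesIn "$R1" "$R2" \
  --outFileNamePrefix "${SAMPLE}_"

## Build the tag directory from the single paired-end SAM file
run makeTagDirectory "$SAMPLE" "${SAMPLE}_Aligned.out.sam"
```

Saved as a script, this prints the two commands by default; setting `DRY_RUN=0` in the environment makes it actually run them.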
