Question

RNA-seq and ChIP-seq alignment

0


11 months ago

alphaflylizard • 0

I am just getting introduced to bioinformatics, and I am having a hard time. My PI has no background in bioinformatics and she is no help... she just shares "scripts" stored in Word docs where she copies and pastes commands x1M times...

I got confused and wanted to ask about some details here, hoping someone would help me understand.

I have paired-end ChIP-seq and RNA-seq data sets.

For both data sets, my PI's "scripts" jump straight to alignment, aligning R1 and R2 separately. They then use the SAM files from both reads to create a tag directory for the sample. Here is an example to be more clear:

## Align reads with STAR
STAR \
  --readFilesCommand zcat \
  --genomeDir /STAR_indexes/mm10/ \
  --runThreadN 24 \
  --readFilesIn Samp1_Rep1_R1_001.fastq.gz \
  --outFileNamePrefix Sample1_Rep1_R1_001_

STAR \
  --readFilesCommand zcat \
  --genomeDir /STAR_indexes/mm10/ \
  --runThreadN 24 \
  --readFilesIn Samp1_Rep1_R2_001.fastq.gz \
  --outFileNamePrefix Sample1_Rep1_R2_001_

makeTagDirectory Sample1_Rep1 Sample1_Rep1_R1_001_Aligned.out.sam Sample1_Rep1_R2_001_Aligned.out.sam

Shouldn't both reads be aligned together? Or is this way also fine?

If the reads are aligned separately, as shown above, and the two SAM files from R1 and R2 are then used to create the TagDirectory, does that mess up the TagDirectory in any way?

Versus if the reads were aligned in paired-end mode and that single SAM file was used to make the TagDirectory?

Or does it make no difference either way when creating the TagDirectory?

Also... in this case my PI skips the trimming... wouldn't "non-trimmed" samples introduce some bias?

Sorry if these are stupid Q! Just hard to wrap my head around how this works.

Thanks!

ChIP-seq RNA-seq alignment • 747 views

ADD COMMENT • link updated 11 months ago by Ram 45k • written 11 months ago by alphaflylizard • 0

0


Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or use one of (a) the option highlighted in the image below/ (b) fenced code blocks for multi-line code. Fenced code blocks are useful in syntax highlighting. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.
[Image: code_formatting]

ADD REPLY • link 11 months ago by Ram 45k

score 0 · Answer 1 · 2024-08-24

0


11 months ago

Trivas ★ 1.9k

Not trimming is ~OK because STAR can soft-clip reads during alignment; see STAR Read Clipping.
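
If you would rather clip adapters explicitly than rely on soft-clipping alone, STAR has a `--clip3pAdapterSeq` option. Here is a rough sketch based on the single-end command from the question. The adapter sequence is an assumption (a commonly used Illumina TruSeq adapter prefix), and the `run` helper and the `_clip_` output prefix are made up for illustration; check which adapter your library kit actually uses.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print a command, and execute it only when DRY_RUN=0.
run() {
  echo "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# ASSUMPTION: common Illumina TruSeq adapter prefix; replace with your kit's adapter.
adapter="AGATCGGAAGAGC"

# Same single-end call as in the question, plus explicit 3' adapter clipping.
run STAR \
  --readFilesCommand zcat \
  --genomeDir /STAR_indexes/mm10/ \
  --runThreadN 24 \
  --readFilesIn Samp1_Rep1_R1_001.fastq.gz \
  --clip3pAdapterSeq "$adapter" \
  --outFileNamePrefix Sample1_Rep1_R1_001_clip_
```

By default this only prints the command so you can review it; set `DRY_RUN=0` to actually run STAR.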

As for mapping R1 and R2 separately, that's a bit weird for RNA-seq and less weird for ChIP-seq, but it also depends on the average fragment size/insert length of your libraries.
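
For comparison, a paired-end run would look roughly like the sketch below: both mates go to `--readFilesIn` in one STAR call (R1 first, then R2), which gives a single SAM file per sample. File names and paths are copied from the question; the `run` helper and the `Sample1_Rep1_` prefix are made up for illustration. I have not checked whether `makeTagDirectory` needs extra options for paired-end input, so look at the HOMER documentation before relying on this.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print a command, and execute it only when DRY_RUN=0.
run() {
  echo "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

r1="Samp1_Rep1_R1_001.fastq.gz"
r2="Samp1_Rep1_R2_001.fastq.gz"
prefix="Sample1_Rep1_"   # assumed output prefix for the paired run

# Giving both mates to --readFilesIn makes STAR align them as pairs.
run STAR \
  --readFilesCommand zcat \
  --genomeDir /STAR_indexes/mm10/ \
  --runThreadN 24 \
  --readFilesIn "$r1" "$r2" \
  --outFileNamePrefix "$prefix"

# STAR names its default SAM output <prefix>Aligned.out.sam.
run makeTagDirectory Sample1_Rep1 "${prefix}Aligned.out.sam"
```

As written this only prints the two commands; set `DRY_RUN=0` to execute them.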

ADD COMMENT • link 11 months ago by Trivas ★ 1.9k