Entering edit mode
2.2 years ago
Dan
▴
180
My ATAC-seq samples have both pair-end 50 and pair-end 100 reads. After peak calling using macs2
, can I directly compare the counts of the regions defined by these peaks from different read lengths?
library(Rsubread)
library(GenomicRanges)
library(DESeq2)
peaks <- dir("peaks/", pattern = "*.narrowPeak",
full.names = TRUE)
myPeaks <- lapply(peaks, ChIPQC:::GetGRanges, simple = TRUE)
myGRangesList<-GRangesList(myPeaks)
reduced <- reduce(unlist(myGRangesList))
consensusIDs <- paste0("consensus_", seq(1, length(reduced)))
mcols(reduced) <- do.call(cbind, lapply(myGRangesList, function(x) (reduced %over% x) + 0))
reducedConsensus <- reduced
mcols(reducedConsensus) <- cbind(as.data.frame(mcols(reducedConsensus)), consensusIDs)
bamsToCount <- dir("bam/", full.names = TRUE, pattern = "*.\\.bam$")
regionsToCount <- data.frame(GeneID = paste("ID", seqnames(reducedConsensus),
start(reducedConsensus), end(reducedConsensus), sep = "_"), Chr = seqnames(reducedConsensus),
Start = start(reducedConsensus), End = end(reducedConsensus), Strand = strand(reducedConsensus))
fcResults <- featureCounts(bamsToCount, annot.ext = regionsToCount, isPairedEnd = TRUE,
countMultiMappingReads = FALSE, maxFragLength=1000)
myCounts <- fcResults$counts
colnames(myCounts) <- c("sample_1", "sample_2", "sample_3", "sample_4")
# [generate the metadata] to describe/group the samples
Group <- factor(c("Control", "Control", "Treat","Treat"))
metaData <- data.frame(Group, row.names = colnames(myCounts))
atacDDS <- DESeqDataSetFromMatrix(myCounts, metaData, ~Group, rowRanges = reducedConsensus)
atacDDS <- DESeq(atacDDS)
atac_Rlog <- rlog(atacDDS)
rawCounts <- counts(atacDDS, normalized = FALSE)
Thanks
Please define
compare
.I am sorry the question was not clear. I edited it. Could you tell me if the codes are reasonable or not? Thanks
I see. Basically that is correct I think, so using a dedicated stats framework like DESeq2. The impact of read length is probably minor and limited to regions that are difficult to map. Still, to be sure you can trim the reads all down to 50bp and then repeat alignment.
I used
trim_galore --illumina --paired --fastqc
to trim the adaptors, is it possible and how should I trim the reads to pair-end 50? Thanks a lot.trim_galore
has parameters that can trim specific bp from either end:--clip_R1
,--clip_R2
,--three_prime_clip_R1
,--three_prime_clip_R2
, which parameter should I use for my situation? Thanks a lot.You will need to clip on 3'-end for both reads.
I see. Thanks
I always use seqtk for these kinds of things.
Is
seqtk
better thantrim_galore
?I do not know, I do not use trim_galore. If trim_galore can do the job it is fine.