I've run macs2 callpeak, and the .narrowPeak file has many rows for chromosomes with names like "chrUn_KI270522v1". Can anyone tell me what this means? Was the peak found on an unknown chromosome? Is this due to noise, and can it be ignored? I have a feeling that if I ignore those rows I'll miss important info, but I get many errors when I try to annotate the peaks because those aren't real chromosomes.
Any insight would be appreciated.
FYI - These are human cells.
Thanks
Thank you! Can you tell me how to filter out these contigs? I'm new to all of this, so sorry if this is a basic question.
From the BAM file (alignment) or the peak file?
I guess the .bam files since you said you filter them out before finding the peaks. That makes more sense to me.
I figured out how to use bamtools filter to get chr1, but is there a way to give it a list of the chromosomes I want and remove the contigs that are a problem? Thanks
I usually use samtools. Given a sorted and indexed file you can do:
Extract the primary chromosomes from the BAM header; this eliminates everything like chrUn_*, *_random, etc.:
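One way this could look, as a rough sketch rather than a definitive recipe: pull the sequence names out of the @SQ header lines, keep only those matching a primary-chromosome pattern, and pass them to samtools view as regions. The helper name primary_chroms, the file names in.bam and primary.bam, and the UCSC-style "chr" naming (chr1..chr22, chrX, chrY, chrM) are assumptions for illustration.

```shell
# Hypothetical helper: reads SAM header text on stdin and prints only the
# primary chromosome names. Assumes UCSC-style names (chr1, chrX, chrM).
primary_chroms() {
  # Take the SN: field from each @SQ line, then drop chrUn_*, *_random, etc.
  awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' \
    | grep -E '^chr([0-9]+|X|Y|M)$'
}

# Usage (in.bam must be coordinate-sorted and indexed; file names are placeholders):
#   samtools view -H in.bam | primary_chroms | xargs samtools view -b -o primary.bam in.bam
```

The header of primary.bam still lists the extra contigs; only the reads aligned to them are dropped.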
Or just manually, listing the chromosomes to keep:
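A sketch of the manual version, assuming a human reference with UCSC-style names; keep_list is a hypothetical helper that just spells out the 25 names so they don't have to be typed by hand, and in.bam / primary.bam are placeholder file names.

```shell
# Hypothetical helper: prints chr1..chr22, chrX, chrY, chrM, one per line.
# Adjust the list if your reference uses other names (e.g. "1" or "MT").
keep_list() {
  for c in $(seq 1 22) X Y M; do
    printf 'chr%s\n' "$c"
  done
}

# Usage (in.bam must be coordinate-sorted and indexed):
#   samtools view -b -o primary.bam in.bam $(keep_list)
```

Leaving $(keep_list) unquoted is deliberate, so that each name becomes a separate region argument.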
Thanks so much