2.7 years ago
manuel.fernandez
Hi everyone,
I'm trying to extract the read pairs with BOTH ends mapped to a specific region of chr12 (6,140,000-8,460,000) from a capture Hi-C experiment. I'm using the following samtools command:
samtools view -h "$file.bam" "chr12:6140000-8460000"
It seems to work fine for the first read, but many of the second ends are mapped outside the region, some even on other chromosomes. For example, I get something like this:
ERR5035850.12056960 130 chr12 6142560 0 76P = 6142773 76 * * TC:i:1 S1:i:1 S2:i:0
ERR5035850.2480369 128 chr12 6142560 0 76P = 6144648 76 * * TC:i:1 S1:i:0 S2:i:0
ERR5035850.76070047 1152 chr12 6142560 0 76P chr16 19345512 76 * * TC:i:1 S1:i:1 S2:i:1
ERR5035850.2336828 152 chr12 6142560 0 76P = 6142785 76 * * TC:i:1 S1:i:1 S2:i:0
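The condition described here (both ends inside the region) can be checked directly on the mate columns of each SAM record: RNEXT (column 7) and PNEXT (column 8). The following is a minimal sketch, not a definitive solution. The helper name filter_both_ends is hypothetical, the input is assumed to be tab-separated SAM text as written by samtools view -h, and only the leftmost mapped positions (POS and PNEXT) are tested, so a read that starts before the region but overlaps it is dropped, and alignment ends are not checked.

```shell
# Hypothetical helper: reads SAM text on stdin and keeps header lines plus
# records whose own start (POS) and mate start (PNEXT) both lie inside
# chrom:start-end. Arguments: chromosome name, region start, region end.
filter_both_ends() {
  awk -v chrom="$1" -v start="$2" -v end="$3" '
    BEGIN { FS = "\t" }
    /^@/ { print; next }                      # pass header lines through
    {
      # RNEXT is "=" when the mate is on the same reference as the read
      mate_chrom = ($7 == "=") ? $3 : $7
      if ($3 == chrom && mate_chrom == chrom &&
          $4 + 0 >= start + 0 && $4 + 0 <= end + 0 &&
          $8 + 0 >= start + 0 && $8 + 0 <= end + 0)
        print
    }'
}

# Possible usage (assumes an indexed BAM; both_ends.bam is an illustrative name):
#   samtools view -h "$file.bam" "chr12:6140000-8460000" \
#     | filter_both_ends chr12 6140000 8460000 \
#     | samtools view -b -o both_ends.bam -
```

Because the check uses the RNEXT and PNEXT fields rather than any pairing flag, it does not depend on how the aligner defines a proper pair.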
I have also tried adding the flag -f 2, and the reads whose second end maps outside the region seem to go away, but I'm not completely sure this is the right flag to use for a Hi-C experiment.
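For context, -f 2 keeps only records whose FLAG has bit 0x2 set, which the SAM specification describes as "each segment properly aligned according to the aligner". It therefore tests the aligner's own pairing criteria, not whether the mate falls inside the requested region. A small sketch that applies this bit test to the four FLAG values shown above; the function name has_proper_pair_bit is hypothetical:

```shell
# Succeeds (exit status 0) when the proper-pair bit (0x2) is set in a SAM FLAG.
has_proper_pair_bit() {
  [ $(( $1 & 2 )) -ne 0 ]
}

# FLAG values taken from the example records above
for flag in 130 128 1152 152; do
  if has_proper_pair_bit "$flag"; then
    echo "$flag: kept by -f 2"
  else
    echo "$flag: dropped by -f 2"
  fi
done
```

Of these four values only 130 has the bit set, so -f 2 would also drop records whose mate lies inside the region, such as the ones with flags 128 and 152.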
Could anyone help me with this?
Thanks a lot!
Manuel F. Merino