3.8 years ago
kimkes25
Hello, I have a TSV file that looks like this:
I want to know what fraction of the genome the file covers. The data in it represent CRISPR-Cas9 cut sites in T cells.
I tried to follow this post, but it always gives the same answer for any input in TSV format. Command:
bc <<< "scale=10; 100 * $(awk '{sum=$3-$2}END{print sum}' /groups/itay_mayrose/kimk/targets_file_Leenay_mean_eff_coordinates.tsv) / $(awk '{sum+=$2}END{print sum}' hg38.chrom.sizes)"
Output: .0000006231
I also know there are 1556 sites of length 20 in the file, if that helps.
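For reference, the coverage calculation described above can be sketched as follows. This is a minimal illustration, not the method from the linked post: it assumes the TSV holds chromosome, start, and end in columns 1 to 3 with no header line (the file preview is not shown here), that the sites do not overlap, and it uses small made-up inputs rather than the real files. The percentage is computed in awk instead of bc. Note that the length sum uses `+=`; with a plain `=`, awk keeps only the last row's length, so files whose sites all have the same length would give the same result.

```shell
# Toy inputs (hypothetical values, not the poster's data)
tmp=$(mktemp -d)
printf '1\t100\t120\n2\t500\t520\n' > "$tmp/sites.tsv"
printf 'chr1\t1000\nchr2\t1000\n' > "$tmp/chrom.sizes"

# Total covered bases: += accumulates (end - start) over every row.
# %.0f avoids scientific notation for large totals in some awk implementations.
covered=$(awk -F'\t' '{sum += $3 - $2} END {printf "%.0f\n", sum}' "$tmp/sites.tsv")

# Total genome length: sum of column 2 of the chrom.sizes file.
genome=$(awk -F'\t' '{sum += $2} END {printf "%.0f\n", sum}' "$tmp/chrom.sizes")

# Percentage of the genome covered.
pct=$(awk -v c="$covered" -v g="$genome" 'BEGIN {printf "%.10f\n", 100 * c / g}')
echo "covered=$covered genome=$genome percent=$pct"
```

With 1556 non-overlapping sites of length 20, the covered total would be 1556 * 20 = 31120 bases.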
Could it be the chromosome identifier? Your file has bare numbers, and UCSC files typically have a "chr" prefix. See if it works if you add a "chr" prefix to your chromosome identifiers.
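Adding the prefix could look roughly like this. It is a sketch on a made-up two-row file, assuming a tab-separated layout with the chromosome in column 1 and no header line; the file names are hypothetical.

```shell
# Toy input with bare chromosome names (hypothetical values)
tmp=$(mktemp -d)
printf '1\t100\t120\nX\t500\t520\n' > "$tmp/sites.tsv"

# Prepend "chr" to column 1 unless it is already there; keep tabs on output.
awk 'BEGIN {FS = OFS = "\t"} $1 !~ /^chr/ {$1 = "chr" $1} {print}' \
    "$tmp/sites.tsv" > "$tmp/sites.chr.tsv"

cat "$tmp/sites.chr.tsv"
```

One caveat: the mitochondrial chromosome is named chrM in UCSC files but MT in Ensembl-style files, so a plain prefix would not be enough for that one.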