7.5 years ago
Kian
Hi, my input file to UCSC for repeat elements looks like this:
chrom txStart txEnd
chr1 15079913 35209257
output:
genoName genoStart genoEnd strand repName repClass repFamily
chr1 16777160 16777470 + AluSp SINE Alu
chr1 25165800 25166089 - AluY SINE Alu
chr1 33553606 33554646 + L2b LINE L2
The number of rows in the input file and the output file is not the same. The output has more rows, and their coordinates are not the same as the interval in the input file. How can I match the two files?
How can I get the repeat elements for exactly the interval in my input file, and nothing more? Thanks
Thanks for your response. I know there is more than one type of repeat in this interval. Actually, I want to know how many repeats, and which ones, exist in the interval. But I think UCSC divides my interval into sections and tells me which repeat exists in each section. That is good, but it is not my goal! I have an interval and want to know how many LINEs, how many SINEs, and so on, it contains.
It's giving you the individual entries, so count them. Alternatively, download the RepeatMasker file, convert it to BED, use
bedtools intersect
and use whatever method you prefer to finish summarizing things.
Thanks, I will try it!
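For the "count them" route, a minimal Python sketch might look like the one below. It assumes the UCSC output is whitespace-separated with the columns shown in the question (genoName, genoStart, genoEnd, strand, repName, repClass, repFamily) and that coordinates are 0-based, half-open. The function name `count_repeat_classes` and the `sample_rows` list are illustrative only; the sample rows are the three lines posted above, not a full result set.

```python
from collections import Counter


def count_repeat_classes(rows, chrom, start, end):
    """Count repClass values among rows that overlap chrom:start-end.

    Assumes each row has the seven columns from the question:
    genoName genoStart genoEnd strand repName repClass repFamily
    """
    counts = Counter()
    for line in rows:
        fields = line.split()
        # Skip the header line and anything that does not have 7 columns.
        if len(fields) < 7 or fields[0] == "genoName":
            continue
        geno_name = fields[0]
        geno_start = int(fields[1])
        geno_end = int(fields[2])
        rep_class = fields[5]
        # Half-open interval overlap test against the query interval.
        if geno_name == chrom and geno_start < end and geno_end > start:
            counts[rep_class] += 1
    return counts


# The three example rows from the question (hypothetical stand-in for a real file).
sample_rows = [
    "genoName genoStart genoEnd strand repName repClass repFamily",
    "chr1 16777160 16777470 + AluSp SINE Alu",
    "chr1 25165800 25166089 - AluY SINE Alu",
    "chr1 33553606 33554646 + L2b LINE L2",
]

counts = count_repeat_classes(sample_rows, "chr1", 15079913, 35209257)
for rep_class, n in sorted(counts.items()):
    print(rep_class, n)
```

With a real download you would pass the lines of the file instead of `sample_rows`, e.g. `count_repeat_classes(open(path), ...)`. Swapping `fields[5]` for `fields[6]` or `fields[4]` would count by repFamily or repName instead.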