length of combined transcripts
1
2
Entering edit mode
9.0 years ago
bsp017 ▴ 50

Hi all,

Is there a tool to calculate the combined length of transcripts which map to a given interval. For example as a function of bedtools or htseq? I can calculate the number of reads mapping to a particular gene using these tools but not the length of the gene segment which has been covered. What I want to know is, which genes in a gff or bed file have alignments from my sorted bam file covering more than 100 bp of their entire length.

Thanks,
James

RNA-Seq • 2.1k views
ADD COMMENT
2
Entering edit mode
9.0 years ago

BEDOPS bedmap --bases and bedmap --bases-uniq offer total and distinct counts of bases of map elements (e.g., transcripts or reads) which overlap ("map" to) a reference interval (e.g., a set of genes):

$ bedmap --echo --bases genes.bed reads.bed \
    > total_read_base_count_over_genes.bed
$ bedmap --echo --bases-uniq genes.bed reads.bed \
    > distinct_read_base_count_over_genes.bed

It's not clear which option you want, but the --bases-uniq option would effectively merge the overlapping reads into a contiguous region, and then give the length of the overlapping part of the contiguous region. The --bases option simply gives the total sum of the length of the overlapping part of each overlapping read.

Regardless of which option you pick, if you add --delim '\t', you can easily pipe to awk to quickly filter mapped results, where genes have alignments covering 100 bp or more:

$ bedmap --echo --bases --delim '\t' genes.bed reads.bed \
    | awk '$NF >= 100' - \
    > genes_with_all_reads_a_total_of_100_bases_or_more_overlap.bed

If you need to convert BAM-formatted reads to BED, you could use BEDOPS bam2bed and pipe BED-formatted reads to bedmap:

$ bam2bed < reads.bam \
    | bedmap --echo --bases --delim '\t' genes.bed - \
    | awk '$NF >= 100' - \
    > genes_with_all_reads_a_total_of_100_bases_or_more_overlap.bed
ADD COMMENT

Login before adding your answer.

Traffic: 2503 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6