Phased genome: want read counts from both alleles
6.6 years ago
kpr ▴ 80

I have a phased genome, and I am trying to generate a counts matrix that contains the number of reads that map to each allele. For example, something like this:

             Wild1  Wild2  Wild3  MT1  MT2  MT3
C1_000001        0      0      0    0    0    0
C1_000002       10      9      8    1    1    1
C1_000003        0      0      0   10   12   10
C1_000004        6      5      7    0    0    0

So far I have run Bowtie and TopHat, which gave me an accepted_hits file, aln.bam.

I have used samtools to sort the aln.bam but am having trouble with the next step. Based on what I have read so far, I think my next step is to generate a consensus sequence?

    samtools mpileup -uf ref.fa aln.bam | bcftools call -c | vcfutils.pl vcf2fq > cns.fq

I want to make sure I am understanding this step, and subsequent steps.

  1. I don't understand this portion of the above line of code, and I am having trouble finding documentation on it:

    vcfutils.pl vcf2fq > cns.fq

  2. I have read documentation saying that we can use some of these functions to generate a consensus across samples, or alleles. I want to make sure I am doing it by alleles.
  3. It looks like this will generate a consensus fastq file. Not sure what the next steps would be.
  4. Any other suggestions besides "Google Allele Specific Pipelines" would be helpful. Perhaps a link to one in particular that you would recommend.
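
For reference, the one-liner above can be read one stage at a time. The sketch below is the same pipeline wrapped in a function (the name `consensus_fastq` is made up for illustration) with a comment per stage. It assumes samtools, bcftools and vcfutils.pl are on PATH, and a samtools release that still accepts `mpileup -u`; newer releases deprecate BCF output from `samtools mpileup` in favor of `bcftools mpileup`.

```shell
# Sketch only: nothing runs until the function is called.
# Arguments: $1 = indexed reference FASTA, $2 = sorted BAM, $3 = output FASTQ.
consensus_fastq() {
    # Stage 1: pile up the reads at each reference position.
    #   -u writes uncompressed BCF for piping, -f names the reference FASTA.
    samtools mpileup -uf "$1" "$2" |
    # Stage 2: call genotypes with the original consensus caller (-c).
    bcftools call -c |
    # Stage 3: turn the calls into one consensus sequence in FASTQ format.
    vcfutils.pl vcf2fq > "$3"
}

# Example call, using the file names from the post:
#   consensus_fastq ref.fa aln.bam cns.fq
```

Because vcf2fq writes heterozygous positions as IUPAC ambiguity codes, the consensus merges the two alleles into one sequence rather than separating them, so it may not be the intermediate you need for per-allele counts.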

Update: I also don't need the whole matrix. Figuring out a way just to do it for one gene would be sufficient.

Thanks in advance!

rna-seq • 1.4k views

Point 4 is quite demanding. People help whichever way they can.


I'm not trying to be rude; I just need something a little more than that at this point.

6.6 years ago

Use ASEReadCounter from GATK.
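
A minimal sketch of what that could look like, not a tested pipeline. It assumes GATK4 is on PATH, the BAM is coordinate-sorted and indexed with read groups, and you have a VCF of biallelic heterozygous SNPs for the sample (ASEReadCounter only counts at such sites). The function names and the file names in the comments are hypothetical. `-L` restricts the run to one interval, which covers the "just one gene" case.

```shell
# Count reference- and alternate-allele reads at heterozygous sites.
# $1 = reference FASTA, $2 = BAM, $3 = sites VCF,
# $4 = interval such as chr:start-end, $5 = output table.
count_alleles() {
    gatk ASEReadCounter -R "$1" -I "$2" -V "$3" -L "$4" -O "$5"
}

# Sum the refCount and altCount columns of an ASEReadCounter table,
# giving one "ref<TAB>alt" pair for the region that was counted.
# Columns are located by header name rather than by fixed position.
sum_alleles() {
    awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        { ref += $(col["refCount"]); alt += $(col["altCount"]) }
        END { printf "%d\t%d\n", ref, alt }
    ' "$1"
}

# Example (hypothetical file names and interval):
#   count_alleles ref.fa aln.bam het_sites.vcf.gz chr1:1000-5000 wild1.table
#   sum_alleles wild1.table
```

Running this once per sample BAM gives one ref/alt pair per sample, which you can paste together into a row of the matrix described in the question.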

6.6 years ago

This whole project is very demanding. I would recommend looking at phASER: https://github.com/secastel/phaser

However, I spent a long time working on this and got very little out of it. In my opinion, Nanopore is a better option for getting phased alleles. Good luck.


Thanks for the suggestion!
