Question

Oxford Nanopore Single-Molecule Amplicon-Seq Phasing

2.0 years ago

KoppesEA ▴ 80

I am working on a research project to define novel variants in a compact gene of interest from patient samples using long-read, single-molecule Oxford Nanopore amplicon sequencing. A ~6.1 kb fragment was cleanly isolated by PCR and sent to Plasmidsaurus for amplicon sequencing. The summary files show SNPs that are clearly heterozygous, indicating two alleles for each patient. I would now like to phase the variants to determine with certainty whether each pair of SNPs lies in trans (on opposite alleles) or in cis (on the same allele).

As background, I am a molecular biologist, self-trained in command-line tools and competent with Illumina short-read RNA-Seq and WGS. However, I have no experience with single-molecule long-read sequencing.

If a knowledgeable expert in the community could direct me to an optimal pipeline that starts from raw .fastq reads and covers QC, trimming (if necessary?), alignment to the reference PCR amplicon sequence, and phasing, that would be fantastic. With some direction on which tools to use I can probably figure out the command line, although any pointers to critical command-line options are appreciated. I will post coding problems below if issues arise.

I’ve considered a partial solution: use a grep text search to extract fastq reads that a) contain the amplicon and b) can be separated based on SNPs. But I know a more elegant solution must exist. My other option is to clone each allele into a plasmid and sequence enough clones to recover each allele separately.

Thanks in advance, --EK
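
For reference, the grep-style idea from the question can be sketched with awk. This is only a rough illustration, not a recommended pipeline: the function name `split_reads_by_kmer`, the output file names, and the k-mers are all hypothetical, and the FASTQ is assumed to use exactly four lines per read. Each read is assigned to an allele if it contains the allele-specific k-mer (or its reverse complement, since amplicon reads come from both strands). Exact matching is fragile with nanopore error rates, so many reads may end up unassigned, which is one reason alignment-based phasing is usually preferred.

```bash
# Split a 4-line-per-record FASTQ by an allele-specific k-mer.
# Usage: split_reads_by_kmer reads.fastq REF_KMER ALT_KMER out_prefix
# Writes out_prefix_ref.fastq and out_prefix_alt.fastq, prints counts to stdout.
split_reads_by_kmer() {
    awk -v ref="$2" -v alt="$3" -v prefix="$4" '
        # Reverse-complement a DNA string (anything other than A/C/G/T becomes N).
        function revcomp(s,    i, c, out) {
            out = ""
            for (i = length(s); i >= 1; i--) {
                c = substr(s, i, 1)
                out = out (c == "A" ? "T" : c == "T" ? "A" : c == "C" ? "G" : c == "G" ? "C" : "N")
            }
            return out
        }
        BEGIN { ref_rc = revcomp(ref); alt_rc = revcomp(alt) }
        # Buffer the record: index 1 = header, 2 = sequence, 3 = "+", 0 = quality.
        { rec[NR % 4] = $0 }
        NR % 4 == 0 {
            seq = rec[2]
            has_ref = (index(seq, ref) > 0 || index(seq, ref_rc) > 0)
            has_alt = (index(seq, alt) > 0 || index(seq, alt_rc) > 0)
            if (has_ref && !has_alt)      { dest = prefix "_ref.fastq"; n_ref++ }
            else if (has_alt && !has_ref) { dest = prefix "_alt.fastq"; n_alt++ }
            else                          { n_un++; next }   # neither or both matched
            printf "%s\n%s\n%s\n%s\n", rec[1], rec[2], rec[3], rec[0] > dest
        }
        END { printf "ref\t%d\nalt\t%d\nunassigned\t%d\n", n_ref, n_alt, n_un }
    ' "$1"
}
```

A call would look like `split_reads_by_kmer sample.fastq AAACCCGGT AAACTCGGT sample`, where the two k-mers span one heterozygous SNP and differ only at that base. Phasing a second SNP then means repeating the split on each output file.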

SNP Long-Read Phasing Oxford-Nanopore Single-Molecule • 1.3k views

updated 21 months ago by GenoMax 145k • written 2.0 years ago by KoppesEA ▴ 80

module load minimap2/2.24
module load gcc/8.2.0
module load samtools/1.14   # note: Clair3 lists samtools 1.15 as a dependency
module load clair3/0.1-r12

inFolder=./raw_amplicon_fastq
outFolder=./phased_amplicons
refGenome=./Chr17Ref/Homo_sapiens.GRCh38.dna.chromosome.17.fa
summary=$outFolder/mygene_phaseSummary.vcf

# Start from an empty output folder on every run.
rm -rf "$outFolder"
mkdir -p "$outFolder"

# Clair3 needs a samtools-indexed reference (this creates ${refGenome}.fai).
samtools faidx "$refGenome"

for fastq_file in "$inFolder"/*.fastq
do
    samplename=$(basename "$fastq_file" .fastq)
    outDir=$outFolder/$samplename
    echo "$fastq_file -> $outDir"
    mkdir -p "$outDir"

    # Align with the ONT preset against the same FASTA that Clair3 uses,
    # and write a sorted BAM directly instead of intermediate SAM/BAM files.
    minimap2 -ax map-ont "$refGenome" "$fastq_file" |
        samtools sort -O BAM -o "$outDir/${samplename}_sorted.bam" -
    samtools index -b "$outDir/${samplename}_sorted.bam"

    # Call and phase variants; the model should match the flow cell and basecaller.
    run_clair3.sh \
        --bam_fn="$outDir/${samplename}_sorted.bam" \
        --ref_fn="$refGenome" \
        --output="$outDir" \
        --threads=4 \
        --platform=ont \
        --model_path=./r941_prom_sup_g5014/ \
        --enable_phasing

    # Append the sample name and its phased calls to one combined summary file.
    echo "$samplename" >> "$summary"
    gunzip -c "$outDir/phased_merge_output.vcf.gz" >> "$summary"
done

updated 21 months ago by GenoMax 145k • written 21 months ago by KoppesEA ▴ 80

This is what I came up with so far: mapping the PCR amplicons to a specific chromosome and then phasing the alleles.
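
Once a phased VCF exists, the cis/trans question can be read off the genotype column. The sketch below is an assumption-laden illustration, not part of the original script: the function name `cis_trans_pairs` is made up, and it assumes a single-sample, uncompressed VCF in which phased heterozygous calls are written as `0|1` or `1|0` with a `PS` (phase set) tag in the FORMAT column. For each pair of consecutive phased SNPs in the same phase set, it reports cis when the ALT alleles sit on the same haplotype and trans otherwise.

```bash
# Report cis/trans for consecutive phased heterozygous variants in a VCF.
# Usage: cis_trans_pairs phased.vcf   (use gunzip -c first for .vcf.gz)
cis_trans_pairs() {
    awk -F '\t' -v OFS='\t' '
        /^#/ { next }                       # skip header lines
        {
            # Look up GT and PS by name, since FORMAT field order can vary.
            n = split($9, key, ":"); split($10, val, ":")
            gt = ""; ps = "."
            for (i = 1; i <= n; i++) {
                if (key[i] == "GT") gt = val[i]
                if (key[i] == "PS") ps = val[i]
            }
            # Keep only phased heterozygous calls that belong to a phase set.
            if ((gt != "0|1" && gt != "1|0") || ps == ".") next
            hap = (gt == "1|0") ? 1 : 2     # haplotype that carries the ALT allele
            id = $1 ":" $2
            if (prev_id != "" && $1 == prev_chrom && ps == prev_ps)
                print prev_id, id, (hap == prev_hap ? "cis" : "trans")
            prev_id = id; prev_chrom = $1; prev_ps = ps; prev_hap = hap
        }
    ' "$1"
}
```

Only neighbouring pairs are printed; within one phase set the relationship between any two SNPs follows from chaining those pairs. Variants that fall in different phase sets are not compared, because their relative phase is unknown.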

21 months ago by KoppesEA ▴ 80

score 1 · Answer 1 · 2022-09-13

2.0 years ago

GenoMax 145k

For QC: PycoQC (LINK) (you will need the sequencing summary file from the nanopore run) and NanoPlot (LINK).
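
Before running the dedicated QC tools, a quick sanity check on read lengths can already show whether most reads span the ~6.1 kb amplicon. The following is a minimal sketch, not a substitute for PycoQC or NanoPlot: the function name `fastq_length_summary` and the tolerance argument are made up for illustration, and the FASTQ is assumed to use four lines per read.

```bash
# Summarise read lengths in a 4-line-per-record FASTQ.
# Usage: fastq_length_summary reads.fastq EXPECTED_LENGTH TOLERANCE
fastq_length_summary() {
    # Line 2 of every record is the sequence; sort lengths so the median is easy.
    awk 'NR % 4 == 2 { print length($0) }' "$1" | sort -n |
    awk -v expected="$2" -v tol="$3" '
        {
            len[NR] = $1
            total += $1
            # Count reads whose length is within +/- tol of the expected amplicon.
            if ($1 >= expected - tol && $1 <= expected + tol) near++
        }
        END {
            if (NR == 0) { print "reads\t0"; exit }
            printf "reads\t%d\n", NR
            printf "mean_length\t%.1f\n", total / NR
            printf "median_length\t%d\n", len[int((NR + 1) / 2)]   # lower median
            printf "near_expected\t%d\n", near
        }'
}
```

For the amplicon described in the question, something like `fastq_length_summary sample.fastq 6100 300` would report how many reads are close to full length; the 300-base tolerance is an arbitrary choice.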

For Alignments: minimap2 https://github.com/lh3/minimap2

If you want to pull out only the reads that contain the amplicon, bbduk.sh should work. A guide is available: https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbduk-guide/

Nanopore seems to have a phasing workflow (no personal experience): https://nanoporetech.com/resource-centre/snvs-and-phasing-workflow

2.0 years ago by GenoMax 145k

Thanks! I appreciate the guidance and I'm looking into your suggestions. Will update when I start making some progress.

Following from the Nanopore link it looks like WhatsHap (https://whatshap.readthedocs.io/en/latest/) or something like Clair3 (https://github.com/HKU-BAL/Clair3) might be necessary for phasing.
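
As a rough sketch of the WhatsHap route, assuming an unphased VCF and a sorted, indexed BAM already exist for one sample: the wrapper name `phase_with_whatshap` and all file names are hypothetical. The `--ignore-read-groups` option is included on the assumption that the BAM came straight from minimap2 without read-group tags; in that case WhatsHap cannot match reads to the VCF sample by name. Clair3 run with `--enable_phasing` already calls WhatsHap internally, so a separate step like this mainly matters when the variants were called by another tool.

```bash
# Thin wrapper around "whatshap phase" for a single-sample amplicon BAM.
# Usage: phase_with_whatshap ref.fa calls.vcf reads_sorted.bam phased.vcf
phase_with_whatshap() {
    # Fail early with a clear message if WhatsHap is not available.
    command -v whatshap >/dev/null 2>&1 || {
        echo "whatshap not found on PATH" >&2
        return 1
    }
    # $1 = reference FASTA, $2 = unphased VCF, $3 = sorted BAM, $4 = output VCF
    whatshap phase --ignore-read-groups --reference "$1" -o "$4" "$2" "$3"
}
```

The phased genotypes in the output VCF use `|` instead of `/`, which is what distinguishes cis from trans for any two heterozygous SNPs in the same phase set.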