PHG outputs the same result for different imputed samples using the same reference panel
12 months ago
sh • 0

I am testing with a RIL population. The paternal and maternal genomes, aligned with AnchorWave, are loaded into the PHG as the reference panel, and five individuals with low-coverage GBS and WGS data are the imputation targets.

I imputed a VCF file with the PHG, and the five individuals have the same imputation result at the same sites. When I modify parameters for the pangenome construction step and the steps after it, the final result stays the same. When I comment out the parameter HaplotypeGraphBuilderPlugin.taxa=Tt_1A_part1 with '#' and delete LoadHaplotypesFromGVCFPlugin.mergeRefBlocks=true, the paternal and maternal genomes have the same variants. I think all the problems come from the HaplotypeGraphBuilderPlugin step, but the help document about the config is missing and I don't know how the step works internally. Could you give me some advice?
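
One way to quantify the symptom described above is to count the sites at which the samples' genotypes are not all identical. The following is a minimal sketch and not part of the PHG: the function name `count_discordant_sites` is hypothetical, and it assumes an uncompressed VCF with sample columns starting at column 10 and GT as the first FORMAT subfield.

```shell
# Hypothetical helper (not a PHG tool): count VCF sites where the
# sample genotypes are not all identical.
# Assumes an uncompressed VCF, samples from column 10, GT first in FORMAT.
count_discordant_sites() {
    awk -F'\t' '
        /^#/ { next }                      # skip header lines
        {
            total++
            split($10, first, ":")         # GT of the first sample
            for (i = 11; i <= NF; i++) {
                split($i, cur, ":")
                if (cur[1] != first[1]) { diff++; break }
            }
        }
        END { printf "%d of %d sites differ between samples\n", diff + 0, total + 0 }
    ' "$1"
}
```

Usage with a placeholder file name: `count_discordant_sites imputed.vcf`. If it reports 0 differing sites, every sample carries the same genotype string at every site, which matches the behaviour described in this post.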

It is also weird that running MakeInitialPHGDBPipelinePlugin produces a VCF and a .tbi index file in './input/reference/', corresponding to the BED file.

mkdir liquibase_dir
cd liquibase_dir
wget -O liquibase-4.7.0.tar.gz "https://github.com/liquibase/liquibase/releases/download/v4.7.0/liquibase-4.7.0.tar.gz"
tar -xzf liquibase-4.7.0.tar.gz
rm liquibase-4.7.0.tar.gz

mkdir ./changelogs
mkdir ./changelogs/changesets

cd ..
export PATH=/media/xudong/14t1/phg8/liquibase_dir:${PATH}
#cp -r ../changelogs ./
ln -s ./liquibase_dir/liquibase ./

perl /home/xudong/download/tasseladmin-tassel-5-standalone-846381e171c8/run_pipeline.pl -debug -MakeDefaultDirectoryPlugin -workingDir /media/xudong/14t1/phg9 -endPlugin > 1.1log
# Copy the two FASTA files (reference and query assembly) into the reference and assemblies directories
# Copy the BED file into the directory

perl /home/xudong/download/tasseladmin-tassel-5-standalone-846381e171c8/run_pipeline.pl -Xmx100G -debug -configParameters config1.2.txt -MakeInitialPHGDBPipelinePlugin -endPlugin > 1.2.log

# After this step finishes, a VCF and a .tbi index file appear in './input/reference/', corresponding to the BED file


perl /home/xudong/download/tasseladmin-tassel-5-standalone-846381e171c8/run_pipeline.pl -Xmx20G -debug -configParameters config1.5.txt -LoadHaplotypesFromGVCFPlugin -endPlugin > 1.5.log


# In myconfig.txt, samDir is deleted and minTaxa is set to 1; the HaplotypeGraphBuilderPlugin settings are taken from the initial config.txt.
perl /home/xudong/download/tasseladmin-tassel-5-standalone-846381e171c8/run_pipeline.pl -Xmx20G -debug -configParameters myconfig.txt -ImputePipelinePlugin -imputeTarget pangenome -skipLiquibaseCheck true -endPlugin > pangenome.log

minimap2 -d ./outputDir/pangenome/pangenome_assembly_by_anchorwave.mmi ./outputDir/pangenome/pangenome_assembly_by_anchorwave.fa

# The help document about the config is missing: https://bitbucket.org/bucklerlab/practicalhaplotypegraph/wiki/UserInstructions/ImputeWithPHG_fastq-homozygous
perl /home/xudong/download/tasseladmin-tassel-5-standalone-846381e171c8/run_pipeline.pl -debug \
-configParameters ./myconfig.txt \
-HaplotypeGraphBuilderPlugin \
    -configFile ./myconfig.txt \
    -methods assembly_by_anchorwave \
    -includeVariantContexts true \
    -includeSequences false \
    -endPlugin \
-FastqToMappingPlugin \
    -minimap2IndexFile ./outputDir/pangenome/pangenome_assembly_by_anchorwave.mmi \
    -keyFile ./readMapping_key_file.txt \
    -fastqDir /media/xudong/14t1/phg/GBS \
    -methodName assembly_by_anchorwave \
    -methodDescription anchorwave \
    -debugDir ./ \
    -endPlugin > 1.71.log   


perl /home/xudong/download/tasseladmin-tassel-5-standalone-846381e171c8/run_pipeline.pl -debug  \
-configParameters ./myconfig.txt \
-HaplotypeGraphBuilderPlugin \
    -configFile ./myconfig.txt \
    -methods assembly_by_anchorwave \
    -includeVariantContexts true \
    -includeSequences false \
    -endPlugin \
-BestHaplotypePathPlugin \
    -keyFile ./readMapping_key_file_pathKeyFile.txt \
    -outDir ./outputDir \
    -minReads 0 \
    -readMethod assembly_by_anchorwave \
    -pathMethod assembly_by_anchorwave_PATH \
    -endPlugin > 1.72.log

perl /home/xudong/download/tasseladmin-tassel-5-standalone-846381e171c8/run_pipeline.pl -debug  \
-configParameters ./myconfig.txt \
-HaplotypeGraphBuilderPlugin \
    -configFile ./myconfig.txt \
    -methods assembly_by_anchorwave \
    -includeVariantContexts true \
    -includeSequences false \
    -endPlugin \
    -ImportDiploidPathPlugin -pathMethodName assembly_by_anchorwave_PATH -endPlugin  \
-PathsToVCFPlugin \
-outputFile ./final.v2.vcf.gz \
-referenceFasta ./genome_data/Tt_1A_part1.fasta \
    -endPlugin > 1.73.log

The last section of the log:

[main] INFO net.maizegenetics.pipeline.TasselPipeline - Tassel Pipeline Arguments: [-fork1, -HaplotypeGraphBuilderPlugin, -configFile, ./myconfig.txt, -methods, assembly_by_anchorwave, -includeVariantContexts, true, -includeSequences, false, -endPlugin, -ImportDiploidPathPlugin, -pathMethodName, assembly_by_anchorwave_PATH, -endPlugin, -PathsToVCFPlugin, -outputFile, ./final.v2.vcf.gz, -referenceFasta, ./genome_data/Tt_1A_part1.fasta, -endPlugin, -runfork1]
net.maizegenetics.pangenome.api.HaplotypeGraphBuilderPlugin
   net.maizegenetics.pangenome.hapCalling.ImportDiploidPathPlugin
      net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin - Starting net.maizegenetics.pangenome.api.HaplotypeGraphBuilderPlugin: time: Jan 7, 2024 20:13:30
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin - 
HaplotypeGraphBuilderPlugin Parameters
configFile: ./myconfig.txt
methods: assembly_by_anchorwave
includeSequences: false
includeVariantContexts: true
haplotypeIds: null
chromosomes: [1A]
taxa: null
localGVCFFolder: ./genome_data

[pool-1-thread-1] INFO net.maizegenetics.pangenome.db_loading.DBLoadingUtils - first connection: dbName from config file = ./phg_db_name.db host: localHost user: sqlite type: sqlite
[pool-1-thread-1] INFO net.maizegenetics.pangenome.db_loading.DBLoadingUtils - Database URL: jdbc:sqlite:./phg_db_name.db
[pool-1-thread-1] INFO net.maizegenetics.pangenome.db_loading.DBLoadingUtils - Connected to database:  

[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - referenceRangesAsMap: query statement: select reference_ranges.ref_range_id, chrom, range_start, range_end, methods.name from reference_ranges  INNER JOIN ref_range_ref_range_method on ref_range_ref_range_method.ref_range_id=reference_ranges.ref_range_id  INNER JOIN methods on ref_range_ref_range_method.method_id = methods.method_id  AND methods.method_type = 7 ORDER BY reference_ranges.ref_range_id
methods size: 1
[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - referenceRangesAsMap: number of reference ranges: 2515
[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - referenceRangesAsMap: time: 0.028792687 secs.
[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - taxaListMap: query statement: SELECT gamete_haplotypes.gamete_grp_id, genotypes.line_name FROM gamete_haplotypes INNER JOIN gametes ON gamete_haplotypes.gameteid = gametes.gameteid INNER JOIN genotypes on gametes.genoid = genotypes.genoid ORDER BY gamete_haplotypes.gamete_grp_id;
[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - taxaListMap: number of taxa lists: 2
[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - taxaListMap: time: 0.008242209 secs.
[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - createHaplotypeNodes: haplotype method: assembly_by_anchorwave range group method: null
[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - createHaplotypeNodes: query statement: SELECT haplotypes_id, gamete_grp_id, haplotypes.ref_range_id, asm_contig, asm_start_coordinate, asm_end_coordinate, asm_strand, genome_file_id, seq_hash, seq_len...
[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - CreateGraphUtils:addNodes - query=SELECT haplotypes_id, gamete_grp_id, haplotypes.ref_range_id, asm_contig, asm_start_coordinate, asm_end_coordinate, asm_strand, genome_file_id, seq_hash, seq_len, gvcf_file_id FROM haplotypes inner join reference_ranges on haplotypes.ref_range_id = reference_ranges.ref_range_id WHERE method_id = 5 AND chrom in ('1A');
[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - addNodes: number of nodes: 4990
[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - addNodes: number of reference ranges: 2495
[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - createHaplotypeNodes: time: 0.078513454 secs.
[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.HaplotypeGraph - Created graph edges: created when requested  number of nodes: 4990  number of reference ranges: 2495
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin - Finished net.maizegenetics.pangenome.api.HaplotypeGraphBuilderPlugin: time: Jan 7, 2024 20:13:31
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin - Starting net.maizegenetics.pangenome.hapCalling.ImportDiploidPathPlugin: time: Jan 7, 2024 20:13:31
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin - 
ImportDiploidPathPlugin Parameters
pathMethodName: assembly_by_anchorwave_PATH
taxa: null

[pool-1-thread-1] INFO net.maizegenetics.pangenome.db_loading.DBLoadingUtils - first connection: dbName from config file = ./phg_db_name.db host: localHost user: sqlite type: sqlite
[pool-1-thread-1] INFO net.maizegenetics.pangenome.db_loading.DBLoadingUtils - Database URL: jdbc:sqlite:./phg_db_name.db
[pool-1-thread-1] INFO net.maizegenetics.pangenome.db_loading.DBLoadingUtils - Connected to database:  

[pool-1-thread-1] INFO net.maizegenetics.pangenome.hapCalling.ImportDiploidPathPlugin - importPathsFromDB: query: SELECT line_name, paths_data FROM paths, genotypes, methods WHERE paths.genoid=genotypes.genoid AND methods.method_id=paths.method_id AND methods.name IN ('assembly_by_anchorwave_PATH')
[pool-1-thread-1] INFO net.maizegenetics.pangenome.hapCalling.ImportDiploidPathPlugin - importPathsFromDB: number of path list: 7
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin - Finished net.maizegenetics.pangenome.hapCalling.ImportDiploidPathPlugin: time: Jan 7, 2024 20:13:31
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin - Starting net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:13:31
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin - 
PathsToVCFPlugin Parameters
outputFile: ./final.v2.vcf.gz.vcf
refRangeFileVCF: null
referenceFasta: ./genome_data/Tt_1A_part1.fasta
makeDiploid: true
positions: null
symbolicToN: false
symbolic: false

Genome FASTA character conversion: ACGTNacgtn to ACGTNacgtn
[pool-1-thread-1] INFO net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin - PathsToVCFPlugin: processData: number of ranges: 2495
[pool-1-thread-1] INFO net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin - PathsToVCFPlugin: processData: number of taxa: 7
[DefaultDispatcher-worker-3] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:13:32: progress: 0%
[DefaultDispatcher-worker-20] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:13:34: progress: 10%
[DefaultDispatcher-worker-21] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:13:35: progress: 20%
[DefaultDispatcher-worker-14] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:13:37: progress: 30%
[DefaultDispatcher-worker-24] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:13:40: progress: 40%
[DefaultDispatcher-worker-16] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:13:42: progress: 50%
[DefaultDispatcher-worker-24] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:13:45: progress: 60%
[DefaultDispatcher-worker-20] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:13:48: progress: 70%
[DefaultDispatcher-worker-19] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:13:51: progress: 80%
[DefaultDispatcher-worker-11] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:14:1: progress: 90%
[DefaultDispatcher-worker-12] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:14:4: progress: 100%
[DefaultDispatcher-worker-12] INFO net.maizegenetics.plugindef.AbstractPlugin - net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin  Citation: Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. (2007) TASSEL: Software for association mapping of complex traits in diverse samples. Bioinformatics 23:2633-2635.
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin - Finished net.maizegenetics.pangenome.hapCalling.PathsToVCFPlugin: time: Jan 7, 2024 20:14:4
[pool-1-thread-1] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.hapCalling.ImportDiploidPathPlugin: time: Jan 7, 2024 20:14:4: progress: 100%
[pool-1-thread-1] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.pangenome.api.HaplotypeGraphBuilderPlugin: time: Jan 7, 2024 20:14:4: progress: 100%
PHG • 1.2k views
12 months ago
pjb39 ▴ 220

I did not see any obvious problem in the log files that you sent to lcj. I think everything ran okay, including building of the HaplotypeGraph. Looking at the read mapping files (...ReadMapping.txt), the reads that mapped to only a single genome seemed to be distributed across the entire chromosome for both parent assemblies. Most reads mapped to both, which is uninformative for imputation, so I only paid attention to reads mapping to a single haplotype. It looks like you are mapping reads against chromosome 1A only but that the reads come from the entire genome. This potentially results in a lot of mismapped reads, since reads from other parts of the genome could best match to someplace on 1A when the rest of the genome is not available as a target. You did not say what species you are working with, but if it is hexaploid like wheat, this would be especially problematic as you could get a lot of mismapping of genome B and D reads. Briefly, if the reads come from the whole genome, they should be mapped against the whole genome.


[Image: screenshot of the VCF result]


I used FASTQ files containing reads from only one chromosome, extracted from a BAM file produced by bwa mapping, and the result did not change. This confuses me. The screenshot shows the VCF result: all individuals, including the reference, have the same result.


Please use ADD COMMENT/REPLY to keep answers and comments organized.

12 months ago
sh • 0

Hi Professor Peter, could you help me with three more questions? Thanks. The pipeline is the same as before. I changed my gVCF files: one is produced by AnchorWave and the other by MakeInitialPHGDBPipelinePlugin.

First question: in LoadHaplotypesFromGVCFPlugin, the following warning shows up when loading the gVCF file produced by AnchorWave:

WARNING: ConvertVariantContextToVariantInfo:determineASMINfo has empty variant list, creating default ASMVariantInfo object
WARNING: ConvertVariantContextToVariantInfo:determineASMINfo has empty variant list, creating default ASMVariantInfo object
WARNING: ConvertVariantContextToVariantInfo:determineASMINfo has empty variant list, creating default ASMVariantInfo object
WARNING: ConvertVariantContextToVariantInfo:determineASMINfo has empty variant list, creating default ASMVariantInfo object
WARNING: ConvertVariantContextToVariantInfo:determineASMINfo has empty variant list, creating default ASMVariantInfo object
WARNING: ConvertVariantContextToVariantInfo:determineASMINfo has empty variant list, creating default ASMVariantInfo object
WARNING: ConvertVariantContextToVariantInfo:determineASMINfo has empty variant list, creating default ASMVariantInfo object
WARNING: ConvertVariantContextToVariantInfo:determineASMINfo has empty variant list, creating default ASMVariantInfo object

Second question: in the imputed VCF file, the variants do not match the AnchorWave gVCF and include only SNPs. Indels are missing from the imputed VCF, and I cannot find a parameter that controls this.

The imputed VCF file:

1A      1198311 .       T       A       .       .       .       GT      0/0     1/1     ./.
1A      1198313 .       C       T       .       .       .       GT      0/0     1/1     ./.
1A      1398320 .       A       C       .       .       .       GT      0/0     1/1     0/0
1A      1398322 .       G       C       .       .       .       GT      0/0     1/1     0/0
1A      1398330 .       A       T       .       .       .       GT      0/0     1/1     0/0
1A      1398339 .       A       T       .       .       .       GT      0/0     1/1     0/0
1A      1398340 .       G       A       .       .       .       GT      0/0     1/1     0/0
1A      1398349 .       G       T       .       .       .       GT      0/0     1/1     0/0

The AnchorWave gVCF:

1A      1198311 .       T       A,<NON_REF>     .       .       ASM_Chr=1A;ASM_End=569953;ASM_Start=569953;ASM_Strand=+ GT:AD:DP:PL     1:0,30,0:30:90,90,0
1A      1198312 .       G       <NON_REF>       .       .       ASM_Chr=1A;ASM_End=569954;ASM_Start=569954;ASM_Strand=+;END=1198312     GT:AD:DP:PL     0:30,0:30:0,90,90
1A      1198313 .       C       T,<NON_REF>     .       .       ASM_Chr=1A;ASM_End=569955;ASM_Start=569955;ASM_Strand=+ GT:AD:DP:PL     1:0,30,0:30:90,90,0
1A      1198314 .       TCGTTGCGGGCGGTGAATACGACGACGCCGGTGTCGTCGCCGACGATGCACTCGGCGATGCGCATGTTGCCGCCGGCCCTGCCGCCCTGCGGCCTGTGGAGGACGACGGGCTTGGAGCTGAGCACCCGGAGCTGGAGGTTGTGGCCGTAGGTGCCCGGCCGCAGCTCCTCCACCTTGTCGAACGTCATCGCTGCCCTTTGCCTGGACACGGGAAACCGATGGACGGGATCGTGATCAGAGGAATCGGGCATCGAATCAGGGCGGTGGGGATCGGGATCTTGAAGAAAGAAAGAA
ACAAGATCTGGTCCGATCCGATCTGAGCTCCGAGTCAGATAGCGAGAGCTCAGATCGATCTTCCAGGGGAGGGGACCGAGTGCGGACGTACCTGGTTTGGTTGTGGACGGTGGAGCGAGCGAGGGAGCGCGTCTCGTCTCCTGCTTGCGCTGCTTGGATGTGATCTGAAGATGAGATCCGCCGGAGGAAGAAGAAGAAGAAGAAGAAGTTTGTCTCTTGGTTCGTTTGGTTTTGTCCGAGAGGAAGGAGGAAACCGCCGGGGGACCCAAGTTGTGGTATGAATCCAAACCCAATTCAATCCCCTTTGTCTTTGTTTGT
TTCCCCCCTCTCTCTTTTCTCTCTTCTCGTGTATGGTGTATGAGGTGTGTCACCAAACAACAAACTTGCTTCTACTCTGACCCGATGTGAAATTCACCCGCCGGCAATCTTGGTTTCCCCCCAAAATTCATTTCCGGCGTTGTATATGCTTCATTTCCGGCGTCGCGTCGCCTTGATCCTACTAGTTTGAGCACGTCGCATGTCAAAATTGGGGTCGTGTCGCCTTGATTCTGCTAGTTTGAGCACGTCACATGGCAAAATTGGGGTTACTATGTATGCGTTGGCGTGCATACAGTTCTTCGAAATAAAACGTATGTA
TGCGTTAGCGTGCATATAGTTCTTCGAAATAAAACGTATGCAATGACTTAATCCATTGCACACAGTCCGTTTTGAGAGGTCGTGGGCTAAACCTGGTTGGGGTCGAGTTATGGAATTGTGTGTGATGATAACAACACTTCACATAAAACTTCAGTAGTTACTCCGCATGCAATACAATTCACACAATTATTAGATAGCAATTGTGTGTGATGTTTGAAGTTTCACAAACAGCTATTTTGCATAGAACATCTGTACAGATCATGCCACCTCATGTGTTTTTTTAGCCCAAATCACAGATAGTTAACTCTTCGAAAACAT
GC......
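
The difference between the two excerpts above can be summarized by counting records per type in each file. This is a rough sketch with awk and not a PHG command: the function name `count_variant_types` is hypothetical, symbolic alleles such as `<NON_REF>` are ignored when classifying, and any record with a REF or ALT allele longer than one base is counted as an indel (which also lumps in multi-nucleotide substitutions).

```shell
# Hypothetical helper (not a PHG tool): count SNP, indel and
# reference-block records in an uncompressed VCF or gVCF.
count_variant_types() {
    awk -F'\t' '
        /^#/ { next }                              # skip header lines
        {
            n = split($5, alts, ",")
            type = "ref_block"                     # default: only symbolic ALT alleles
            for (i = 1; i <= n; i++) {
                if (alts[i] ~ /^</ || alts[i] == ".") continue
                if (length($4) == 1 && length(alts[i]) == 1) {
                    if (type == "ref_block") type = "snp"
                } else {
                    type = "indel"                 # REF or ALT longer than one base
                }
            }
            count[type]++
        }
        END {
            printf "snp=%d indel=%d ref_block=%d\n", count["snp"], count["indel"], count["ref_block"]
        }
    ' "$1"
}
```

Running it on both the imputed VCF and the AnchorWave gVCF (file names are up to the user) shows at a glance whether the indel class is present in one and absent in the other.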

Last question: when the gVCF produced by MakeInitialPHGDBPipelinePlugin is not included, the BestHaplotypePathPlugin log shows:

[DefaultDispatcher-worker-6] INFO net.maizegenetics.pangenome.hapCalling.PathFinderForSingleTaxonNodes - Finished processing reads for a sample: 2495 ranges discarded out of 2495.
[DefaultDispatcher-worker-6] INFO net.maizegenetics.pangenome.hapCalling.PathFinderForSingleTaxonNodes - 873 ranges had too few reads; 0 ranges had too many reads; 1622 ranges had all reads equal

When that gVCF is added:

[DefaultDispatcher-worker-4] INFO net.maizegenetics.pangenome.hapCalling.PathFinderForSingleTaxonNodes - Finished processing reads for a sample: 887 ranges discarded out of 2515.
[DefaultDispatcher-worker-4] INFO net.maizegenetics.pangenome.hapCalling.PathFinderForSingleTaxonNodes - 568 ranges had too few reads; 0 ranges had too many reads; 319 ranges had all reads equal
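
To compare runs like the two above, the per-sample "ranges discarded" lines can be pulled out of a log and turned into percentages. A minimal sketch, assuming the log lines have exactly the wording shown above; the function name `summarize_discarded_ranges` is hypothetical and not part of the PHG.

```shell
# Hypothetical helper (not a PHG tool): summarize the
# "N ranges discarded out of M." lines of a BestHaplotypePathPlugin log.
summarize_discarded_ranges() {
    awk '
        /ranges discarded out of/ {
            for (i = 3; i <= NF - 3; i++) {
                if ($i == "discarded") {
                    d = $(i - 2)                   # number of discarded ranges
                    t = $(i + 3)                   # total number of ranges
                    sub(/\.$/, "", t)              # drop the trailing period
                    pct = (t > 0) ? 100 * d / t : 0
                    printf "%d/%d ranges discarded (%.1f%%)\n", d, t, pct
                }
            }
        }
    ' "$1"
}
```

One line is printed per sample, so a run in which every sample shows 100% discarded stands out immediately.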

In the imputed VCF file there are still sites that are not imputed (./.), about 2% of the total. I expected all sites to be imputed and indels to be included.
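
The share of unimputed calls can be measured directly from the VCF rather than estimated by eye. A minimal sketch, assuming an uncompressed VCF with sample columns from column 10 onward and GT as the first FORMAT subfield; the function name `missing_rate` is hypothetical.

```shell
# Hypothetical helper (not a PHG tool): report the fraction of
# missing genotype calls (./. , .|. or .) across all samples and sites.
missing_rate() {
    awk -F'\t' '
        /^#/ { next }                              # skip header lines
        {
            for (i = 10; i <= NF; i++) {
                split($i, f, ":")
                calls++
                if (f[1] == "./." || f[1] == ".|." || f[1] == ".") missing++
            }
        }
        END {
            if (calls == 0) { print "no genotype calls found"; exit }
            printf "%d of %d calls missing (%.2f%%)\n", missing, calls, 100 * missing / calls
        }
    ' "$1"
}
```

Usage with a placeholder file name: `missing_rate imputed.vcf`.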


sh : Please post a new question if this is not related to the original question in this thread. If this is a follow-up then this needs to go under the answer you received above.

Adding new questions as answers breaks the logical flow of threads and makes it very difficult for future visitors to figure out what is going on.


When I ran the same pipeline on all chromosomes, the final VCF file contained only the header lines and no error was reported, which confuses me.


The PHG VCF file export only exports SNPs. There is no option to export indels. The reason is that reconciling overlapping indels across samples can be quite challenging and we decided not to put the time and effort into that.

If I understand the last question, no reads were mapped when you only used one assembly for mapping. Check the value of minTaxa. I assume that is being set in the config file. If minTaxa (for BestHaplotypePathPlugin) is greater than the number of taxa being used to build the graph, then none of the reference ranges will be used and nothing will be imputed. Only reference ranges with taxa >= minTaxa are imputed.
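
The minTaxa check can be scripted. The sketch below is an assumption-laden helper, not part of the PHG: it assumes the config file consists of `key=value` lines with `#` marking comments, it matches any key containing "minTaxa" (the exact key name in a given config may differ), and the function name `check_min_taxa` is hypothetical. The second argument is the number of taxa used to build the graph.

```shell
# Hypothetical helper (not a PHG tool): list active minTaxa settings in a
# key=value config file and flag values larger than the number of taxa.
# Usage: check_min_taxa CONFIG_FILE NUMBER_OF_TAXA
check_min_taxa() {
    awk -F'=' -v n="$2" '
        /^[[:space:]]*#/ { next }                  # skip commented-out lines
        tolower($1) ~ /mintaxa/ {
            key = $1; val = $2
            gsub(/[[:space:]]/, "", key)           # tolerate "minTaxa = 1"
            gsub(/[[:space:]]/, "", val)
            status = (val + 0 > n + 0) ? "TOO HIGH" : "ok"
            printf "%s=%s (%s for %d taxa)\n", key, val, status, n
        }
    ' "$1"
}
```

For example, `check_min_taxa myconfig.txt 2` would flag a setting of 3 as too high for a graph built from two taxa, which is the situation in which no reference ranges are imputed.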
