A Workflow Of Population Genomic Operations/Analysis For Newcomers
14.0 years ago
Jianfengmao ▴ 320

Dear BioStars,

I have just moved from classical population genetics to genomics/population genomics, and I need to set up a platform and build the skills for handling genomic data. I have used R for statistics for 3 years, so Bioconductor would be preferable to me.

In my current study, we sequenced the genomes of tens of accessions of a plant on an Illumina next-generation sequencer. The reads have now been aligned to the reference genome.

I have no experience with genomic analysis. To begin, I checked all the Bioconductor packages available for sequence analysis and read their manuals. I also surveyed the courses on the Bioconductor website. However, I still cannot put together a complete and effective workflow for population genomic analysis, although I have seen many excellent genomic tools implemented in Bioconductor.

I need hints, tips, suggestions, and advice on building an explicit and effective workflow for the following analyses, using Bioconductor or other tools:

  1. Mutation types, e.g. CG -> AT, CG -> TA, etc., polarized against the genomes of related species
  2. Polymorphism along chromosomes (or scaffolds)
  3. Polymorphism by type (intergenic, CDS, etc.) and polymorphism by metabolic network
  4. LD and recombination
  5. Drastic mutations, e.g. stop codons, summarized by gene family and Gene Ontology
  6. Population structure using STRUCTURE
  7. Fst among groups
  8. Association studies
workflow population comparative • 5.6k views
14.0 years ago

I am unaware of a workflow system that meets all of your requirements. However, two widely used systems for high-throughput population genetic analysis (i.e. population genomics) are VariScan and libsequence. You would need your reference genome annotation and some wrapper scripts to run either of these systems, but both provide functionality for requirements 1-4.
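
To make requirements 1 and 2 concrete, here is a minimal Python sketch of what they involve computationally. It is not how the tools named above implement these statistics. The function names, the `(position, alt_count, n_chrom)` input layout, and the assumption that every site in a window was callable are all illustrative choices, not taken from this thread.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mutation_class(ancestral, derived):
    """Fold a polarized SNP into one of six strand-symmetric classes.

    Changes are reported from the pyrimidine (C or T) strand, so G>A and
    C>T both become 'C:G>T:A'.
    """
    anc, der = ancestral.upper(), derived.upper()
    if anc == der:
        raise ValueError("ancestral and derived alleles are identical")
    if anc in "AG":  # purine: switch to the complementary strand
        anc, der = COMPLEMENT[anc], COMPLEMENT[der]
    return f"{anc}:{COMPLEMENT[anc]}>{der}:{COMPLEMENT[der]}"


def site_pi(alt_count, n_chrom):
    """Unbiased per-site nucleotide diversity for a biallelic SNP."""
    return 2.0 * alt_count * (n_chrom - alt_count) / (n_chrom * (n_chrom - 1))


def windowed_pi(snps, chrom_length, window):
    """Average pi per base in non-overlapping windows.

    snps: iterable of (pos_1based, alt_count, n_chrom).
    Assumption: every base in a window was callable; with real data the
    denominator should be the number of callable sites instead.
    """
    n_windows = (chrom_length + window - 1) // window
    totals = [0.0] * n_windows
    for pos, alt_count, n_chrom in snps:
        totals[(pos - 1) // window] += site_pi(alt_count, n_chrom)
    result = []
    for i, total in enumerate(totals):
        start = i * window + 1
        end = min((i + 1) * window, chrom_length)
        result.append((start, end, total / (end - start + 1)))
    return result
```

With real data, the SNP tuples would come from a variant caller's output, and the ancestral state from an outgroup alignment. Both steps are outside this sketch.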

14.0 years ago
Rm 8.3k

Try the PoPoolation toolkit. I haven't used it myself, but you can test it and see how well it fits your requirements.

It has a collection of tools to facilitate population genetic studies of next generation sequencing data.

14.0 years ago

As Casey suggested, I don't know of a published workflow that suits your task. But most of these steps are part of GWAS analysis and are routinely performed in computational genomics labs. IMHO, your requirements fall into two analysis tracks:

  1. Genetic / genomic analysis using raw or simulated data
  2. Annotation and interpretation of the results from step 1 (genomics studies).

You may use PLINK (directly or via its R plugin) to implement most of your genetics / genomics tasks (for example LD, recombination, Fst).
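
For orientation, the two simplest statistics mentioned here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not PLINK's implementation. It uses Hudson's Fst estimator combined across SNPs as a ratio of sums, which may differ from the estimator a given tool reports, and it measures LD as the squared correlation between unphased genotype dosages (0/1/2). The function names and input layouts are hypothetical.

```python
def hudson_fst(sites):
    """Hudson's Fst between two groups, combined as a ratio of sums.

    sites: iterable of (p1, n1, p2, n2), where p is the allele frequency
    and n the number of sampled chromosomes in each group.
    """
    numerator = 0.0
    denominator = 0.0
    for p1, n1, p2, n2 in sites:
        numerator += ((p1 - p2) ** 2
                      - p1 * (1 - p1) / (n1 - 1)
                      - p2 * (1 - p2) / (n2 - 1))
        denominator += p1 * (1 - p2) + p2 * (1 - p1)
    return numerator / denominator if denominator > 0 else float("nan")


def genotype_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two SNPs' genotype dosages.

    Both lists hold 0/1/2 alternate-allele counts for the same individuals
    in the same order (assumption: no missing genotypes).
    """
    n = len(dosages_a)
    if n != len(dosages_b) or n == 0:
        raise ValueError("dosage vectors must be non-empty and equal length")
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a monomorphic SNP has no defined correlation
    return cov * cov / (var_a * var_b)
```

Estimates for small samples can be slightly negative because of the sampling correction terms. That is expected behaviour for this estimator.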

Once you have identified your mutations / polymorphisms, you can move on to annotation. For annotation you can use a variety of tools that have been discussed here on Biostar in several previous posts (see: SNP effects on amino acids, variation databases, SNPs of unknown significance, etc.). By integrating PLINK with some of the annotation resources discussed in those questions, you can develop such a workflow.
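
As a small illustration of the annotation step, the following Python sketch classifies the effect of a single-base change within one codon, which is the core of detecting drastic mutations such as premature stop codons. It assumes the standard genetic code and a codon already given on the coding strand. The function name and effect labels are hypothetical and do not correspond to any particular annotation tool.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def snp_effect(codon, offset, alt_base):
    """Classify a SNP at position `offset` (0, 1 or 2) of a reference codon.

    Assumption: `codon` and `alt_base` are on the coding strand; genes on
    the reverse strand must be reverse-complemented before calling this.
    """
    codon = codon.upper()
    if offset not in (0, 1, 2):
        raise ValueError("offset must be 0, 1 or 2")
    mutated = codon[:offset] + alt_base.upper() + codon[offset + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutated]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense"
```

Counting these labels per gene, and then grouping genes by family or Gene Ontology term, would give the kind of summary asked for in requirement 5.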
