Sinteny analysis using BAM file
1
0
Entering edit mode
4.1 years ago

I have some genomes assamblies (BAM file) from Illumina sequencing. I would like to evaluate sinteny breaks in the chromosome 1. However, I have found software that make sinteny analysis, like Nucmer, but they use fasta file as a input and reference genome in fasta format as well. I would to know how could I make similar analysis or how could be the correct pipeline to transform my bam files to fasta and then make the analysis.

Thanks in advance,

assembly genome gene sequence • 1.1k views
ADD COMMENT
0
Entering edit mode

assuming your BAMs are already mapped to the reference genome, just create a consensus sequence for each sample

ADD REPLY
0
Entering edit mode

You can export a concensus from your BAM file as fasta (see: Generating consensus sequence from bam file ) and then use the software you mention above.

ADD REPLY
0
Entering edit mode

Could I use the compiled VCF with 50 genomes?, That was fiiltered by SNPs. Is that the correct file?

ADD REPLY
0
Entering edit mode
4.1 years ago

If you have just genomic co-ordinates, you can use Ensembl REST to automate a synteny lookup analysis. Here, I do it between Human and Mouse via LASTZ_NET alignments. Please edit as you wish:

library(httr)
library(jsonlite)
library(xml2)

server <- 'http://rest.ensembl.org'

lookup <- c(
  '1:100000-200000',
  '1:50000000:50100000')

for (i in 1:length(lookup)) {
  ext <- paste0(
    '/alignment/region/homo_sapiens/',
    lookup[i],
    ':1?species_set=homo_sapiens;species_set=mus_musculus;method=LASTZ_NET')
    #':1?species_set=homo_sapiens;species_set=mus_musculus;method=EPO')

  message('--Looking up region: ', lookup[i])

  response <- GET(paste(server, ext, sep = ""), content_type("application/json"))

  if (any(grepl('no alignment available for this region', content(response)[[1]]))) {
      message('----', NA)
  } else {
    stop_for_status(response)

    human_coord <- paste0(
      content(response)[[1]]$alignments[[1]]$seq_region, ':',
      content(response)[[1]]$alignments[[1]]$start, '-',
      content(response)[[1]]$alignments[[1]]$end)
    human_seq <- content(response)[[1]]$alignments[[1]]$seq

    mouse_coord <- paste0(
      content(response)[[1]]$alignments[[2]]$seq_region, ':',
      content(response)[[1]]$alignments[[2]]$start, '-',
      content(response)[[1]]$alignments[[2]]$end)
    mouse_seq <- content(response)[[1]]$alignments[[2]]$seq

    message('\tHuman region:\t\t\t', human_coord)
    message('\tHuman sequence:\t\t\t', human_seq)

    message('\tSyntenic mouse region:\t\t', mouse_coord)
    message('\tSyntenic mouse sequence:\t', mouse_seq)

    Sys.sleep(5)

    message('\n')
  }
}

--Looking up region: 1:100000-200000
    Human region:           1:100000-100208
    Human sequence:         CACTAAGCACACAGAGAATAATGTCTAGAATCTGAGTGCCATGTTATCAAATTGTACTGAGACTCT-TGCAGTCACACAGGCTGACATGTAAGCATCG---CCATGCCTAGTACAGACTCTCCCTGCAGATGAAATTATATGGGATGCTAAATTATAATGAGAACAATGTTTGGTGAGCCAAAACTACAACAAGGGAAGCTAATTGGATGAAT
    Syntenic mouse region:      1:176696703-176696893
    Syntenic mouse sequence:    -------CACACTG-----GATTTGTAGAAGCAGGGTAATGCGTTGTC--------CTGGGACTTTCTGTAGTCACTGGAAGTGAACTGTTATTATTGGTATTATACATTGAACACTCTTTAAATACAGATGTACTTATATGGGATGCTGATTTATAATCAGAATCATGTTTGGTGAATGATGA--ACCAGGAGAAAATTTGATTGGATAAAT


--Looking up region: 1:50000000:50100000
    Human region:           1:50002687-50002969
    Human sequence:         TGGCCATAGAGAACTTCAACAGAGGAATT-AACTGTGGAGGACTG----GTGTGTAAGAGACAGAAAAGACTACCAGAGGCCTTTCACCTTAATGCTACTATCTTTTTTATTACTAG---GATATGCATTACAGGATGCTAAGAGCCTCAAATTTCCTATGTTCCTACTCTCTGTATG-GATGCCAAGTTTAAATTTTAAAATCTTATTAT--AATT-CAAAAAT----------CACAAAAAAGTAAAGTAAAAATAACAGAA-----GTTAAATTAACAAAATAAAAGTTAAT---------AGAAGTTAATCAAAG
    Syntenic mouse region:      4:110403744-110404052
    Syntenic mouse sequence:    TGGCCACAAAGACCATCCAGAGGGGGCTTTAACAATGATGGACTGCAAAATGCATG-GACATGGGAAAGACTAGTCCAGTTCTTTTAACTTCCTGCTCCTACTTTTGTTTTAACTAAAATGATATACACTAAAGAA-----AGAGCCTTGTATTGTTTATGTTCCTACTTTCTTTTTATAATTCTAAATTTA----TTGATAGCTCAATGTGCAATTGTAAAGATATGGCTTGGACACTGGGAAGGTAGCTAGATATATTTGAGTAATGATCATTTTAGTTAAAAAAAACTTTCTCTTCTTTCCAGAATTCAGACTGAG
ADD COMMENT

Login before adding your answer.

Traffic: 2550 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6