How to align an intron from a complex


2.4 years ago

Bioinfo ▴ 20

I have the intron sequence of a complex and I would like to align and annotate it. Is there any platform or workflow that you could share?

As an example, the FASTQ file looks like this:

I need to get the exons and then be able to translate them to protein. I would appreciate any ideas.

genomic • 1.3k views

Written 2.4 years ago by Bioinfo ▴ 20 • updated 2.4 years ago by WouterDeCoster 47k

"intron seq of a complex"

What does this mean? The link you provided is for a bacterial sequence, so there should be no introns.

2.4 years ago by GenoMax 147k

@GenoMax It is just an example. I have an intron and I am trying to align it to the human reference genome, then get the exons and translate them to protein.

2.4 years ago by Bioinfo ▴ 20

Are you simply looking to see which exons flank that sequence?

If you have a single sequence then using blat would likely be the fastest way to do this.
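Once an aligner such as blat reports where the sequence lands, finding the flanking or overlapping exons is an interval lookup against a gene annotation. A minimal sketch, assuming the annotation is a GTF file (for example from GENCODE or Ensembl) with a `gene_name` attribute; the function name `exons_near` and its `flank` parameter are made up for illustration and are not part of any tool:

```python
import re


def exons_near(gtf_lines, chrom, start, end, flank=0):
    """Return exons overlapping chrom:start-end (1-based, inclusive),
    with the region widened by `flank` bases on each side."""
    lo, hi = start - flank, end + flank
    hits = []
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip GTF header/comment lines
        fields = line.rstrip("\n").split("\t")
        # GTF columns: seqname, source, feature, start, end,
        # score, strand, frame, attributes
        if len(fields) < 9 or fields[2] != "exon" or fields[0] != chrom:
            continue
        ex_start, ex_end = int(fields[3]), int(fields[4])
        if ex_start <= hi and ex_end >= lo:  # closed-interval overlap test
            gene = re.search(r'gene_name "([^"]+)"', fields[8])
            name = gene.group(1) if gene else None
            hits.append((name, ex_start, ex_end, fields[6]))
    return sorted(hits, key=lambda h: h[1])
```

Usage would be along the lines of `exons_near(open("annotation.gtf"), "chr7", 1001, 202750, flank=5000)`, where the file name and coordinates are placeholders. Two things to watch: blat (PSL) start coordinates are 0-based, so add 1 to the start before comparing with GTF, and chromosome names must match the annotation (`chr7` versus `7`).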

Note: Please remove the link from the original post, since it has no connection with this question.

2.4 years ago by GenoMax 147k

@GenoMax Yes, but the sequence is longer than what BLAT supports. Here is what I am trying to do: find the exons and translate them to protein.

2.4 years ago by Bioinfo ▴ 20

If the sequence is larger than an intron, then I am going to assume that you are referring to PacBio or nanopore long-read sequences in FASTQ format.

In that case your best option is likely to use an aligner like minimap2.
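A minimal sketch of how that could look from Python, assuming `minimap2` is installed and on the PATH. The reference and query file names are placeholders, and the `asm5` preset is my assumption for a single long genomic sequence; for raw PacBio or nanopore reads, `map-pb` or `map-ont` would be the usual presets. Without `-a`, minimap2 writes PAF, and the parser below reads its first 12 columns:

```python
import subprocess


def run_minimap2(reference, query, preset="asm5"):
    """Run minimap2 and return its PAF output as text.
    Requires the minimap2 binary on the PATH."""
    cmd = ["minimap2", "-x", preset, reference, query]
    done = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return done.stdout


def parse_paf_line(line):
    """Parse the 12 mandatory PAF columns (coordinates are 0-based)."""
    f = line.rstrip("\n").split("\t")
    return {
        "query": f[0], "query_len": int(f[1]),
        "query_start": int(f[2]), "query_end": int(f[3]),
        "strand": f[4],
        "target": f[5], "target_len": int(f[6]),
        "target_start": int(f[7]), "target_end": int(f[8]),
        "matches": int(f[9]), "block_len": int(f[10]),
        "mapq": int(f[11]),
    }


def best_hit(paf_text):
    """Return the record with the most matching bases, or None."""
    hits = [parse_paf_line(l) for l in paf_text.splitlines() if l.strip()]
    return max(hits, key=lambda h: h["matches"], default=None)
```

Something like `best_hit(run_minimap2("GRCh38.fa", "query.fa"))` would then give the chromosome and interval where the sequence aligns best (both file names are hypothetical).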

Still a little unclear as to what part you want to translate to protein. Are you looking for mutations affecting coding sequences?
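If the goal is simply to translate the coding exons once they are identified, that step itself is small. A minimal sketch, assuming the exon sequences have already been extracted, are ordered 5' to 3' on the coding strand, and start in frame. It uses the standard genetic code, and all function names are illustrative:

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Reverse-complement a DNA string (for genes on the minus strand)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(cds):
    """Translate a coding sequence; unknown codons become 'X',
    and trailing bases that do not fill a codon are ignored."""
    cds = cds.upper()
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X")
        for i in range(0, len(cds) - 2, 3)
    )


def translate_exons(exons):
    """Join exon sequences in order, then translate the result."""
    return translate("".join(exons))
```

Note that this only makes sense for the coding parts of exons; untranslated regions and the reading frame have to come from the annotation.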

2.4 years ago by GenoMax 147k

@GenoMax Here is what I am trying to do: I have an intron of over 200,000 nt. I am trying to align it to the human genome, check which genes are located in that region, and then find where those genes are expressed.

2.4 years ago by Bioinfo ▴ 20

Can you tell us what your definition of an intron is and how you obtained that file?

2.4 years ago by WouterDeCoster 47k

"Yes but the seq is larger than what Blat is supporting."

What is the size of 'seq'?

Otherwise, use lastz.

2.4 years ago by Pierre Lindenbaum 164k

@Pierre Lindenbaum It is about 201750 bases, while BLAT only supports 75000. I can use https://blast.ncbi.nlm.nih.gov/, but the issue is how to extract the exons, etc.

2.4 years ago by Bioinfo ▴ 20

You can use command line blat.
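A minimal sketch of that route, assuming the standalone `blat` binary is on the PATH; the database and query file names used in the usage note are placeholders. The basic invocation is `blat database query output.psl`, and `-noHead` drops the PSL header so that every line is a record. Each PSL record lists its aligned blocks (`blockSizes`, `tStarts`), which give the genomic intervals the sequence covers:

```python
import subprocess


def run_blat(database, query, out_psl):
    """Align `query` (FASTA) against `database` (e.g. a .2bit or FASTA
    file) and write headerless PSL to `out_psl`. Requires blat on PATH."""
    subprocess.run(["blat", "-noHead", database, query, out_psl], check=True)


def parse_psl_line(line):
    """Parse one headerless PSL record (21 tab-separated columns)."""
    f = line.rstrip("\n").split("\t")
    sizes = [int(x) for x in f[18].split(",") if x]     # blockSizes
    t_starts = [int(x) for x in f[20].split(",") if x]  # tStarts
    return {
        "matches": int(f[0]),
        "strand": f[8],
        "query": f[9],
        "query_size": int(f[10]),
        "target": f[13],
        "target_start": int(f[15]),  # 0-based
        "target_end": int(f[16]),
        # Aligned blocks in target coordinates, 0-based half-open.
        "blocks": [(s, s + n) for s, n in zip(t_starts, sizes)],
    }


def read_psl(path):
    """Read all records from a headerless PSL file."""
    with open(path) as handle:
        return [parse_psl_line(l) for l in handle if l.strip()]
```

For example, `run_blat("hg38.2bit", "query.fa", "out.psl")` followed by `read_psl("out.psl")` would list the hits; sorting them by `matches` is a simple way to pick the best one.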

2.4 years ago by GenoMax 147k

@GenoMax Do you have an example or some sort of workflow that I could use?

2.4 years ago by Bioinfo ▴ 20

We are more than 10 comments into this thread, but we still don't know what kind of data you have.

Is it DNA-seq or RNA-seq? You said it is FASTQ, but is it short-read or long-read data? How is this data related to the said "intron" sequence? Is there more than one of these? What kind of genes are you looking for in the "intron", since by definition there should be none?

2.4 years ago by GenoMax 147k