Question

Download from ENA according to XML

0

3.8 years ago

harmadikemil • 0

Dear All!

I don't know if this is possible, which is why I'm asking. We usually download a lot of sequences from ENA to build a database for phylogenetic analysis, but often only .bam files are available, not fastq. My question is: is it possible to download the XML for a chosen study and keep only the samples that have raw sequences as fastq files, while raising an error for the samples that only have bam files?

Thank you in advance.

ENA Genome XML Python • 1.2k views

ADD COMMENT • link updated 2.0 years ago by Polina ▴ 10 • written 3.8 years ago by harmadikemil • 0

0

That sounds like it should be possible, yes.

However, I think you're far better off (and it's much easier) downloading the bam files and then converting them to fastq files, e.g. with bamtofastq (https://bedtools.readthedocs.io/en/latest/content/tools/bamtofastq.html).
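
A minimal sketch of that conversion, driven from Python to match the thread's tags. It assumes `bedtools` is installed and on the PATH; the file names and the helper names `bamtofastq_cmd` and `convert` are hypothetical. For paired-end data the BAM generally needs to be sorted by read name first (for example with `samtools sort -n`).

```python
import subprocess


def bamtofastq_cmd(bam, fq1, fq2=None):
    """Build the argument list for `bedtools bamtofastq` (hypothetical helper)."""
    cmd = ["bedtools", "bamtofastq", "-i", bam, "-fq", fq1]
    if fq2 is not None:
        # -fq2 sends the second mate of each pair to its own file
        cmd += ["-fq2", fq2]
    return cmd


def convert(bam, fq1, fq2=None):
    """Run the conversion; check=True raises CalledProcessError on failure."""
    subprocess.run(bamtofastq_cmd(bam, fq1, fq2), check=True)


# Show the command that would be run for a paired-end BAM (placeholder file names)
print(" ".join(bamtofastq_cmd("sample.bam", "sample_1.fq", "sample_2.fq")))
```

Calling `convert("sample.bam", "sample_1.fq", "sample_2.fq")` would then run the command. Note that this only recovers reads that are actually present in the BAM.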

ADD REPLY • link 3.8 years ago by lieven.sterck 15k

0

Thank you! I know it is possible to convert bam to fastq, but usually the bam file is filtered and doesn't contain the reads that are useful to us (for example, mitochondrial reads). Maybe you know how I should do the XML filtering? :)

ADD REPLY • link 3.8 years ago by harmadikemil • 0

0

I see, it depends a bit on the bam then. You are allowed to submit your raw reads as bam as well, so those files should contain all the data, but this might be hard to spot without processing them.

On the other hand, if it's raw data it should contain everything, no matter whether it's a bam or a fastq file.

Perhaps contact the ENA helpdesk about this (and perhaps report back here if you can resolve it).

ADD REPLY • link 3.8 years ago by lieven.sterck 15k

score 1 · Accepted Answer · 2022-11-14

1

2.0 years ago

Polina ▴ 10

I've created a Python tool, ENATool, which downloads XML from the ENA browser and parses it into CSV format. You can filter the CSV based on your preferences (in this case, fastq files) and then download the raw data.
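
For readers who want to see what the filtering step might look like, here is a rough sketch that is independent of ENATool (it does not use that tool's interface). Instead of the XML, it works on a tab-separated file report, and it assumes the ENA Portal API `filereport` endpoint with `result=read_run` and the columns `run_accession`, `fastq_ftp` and `submitted_ftp`. The accession `PRJEB0000`, the sample rows and the function name `split_runs` are placeholders made up for illustration.

```python
import csv
import io

# Assumed URL pattern for the ENA Portal API file report;
# PRJEB0000 is a placeholder study accession.
REPORT_URL = (
    "https://www.ebi.ac.uk/ena/portal/api/filereport"
    "?accession=PRJEB0000&result=read_run"
    "&fields=run_accession,fastq_ftp,submitted_ftp&format=tsv"
)


def split_runs(tsv_text):
    """Split a file report into runs with FASTQ files and runs without.

    Returns (fastq_runs, other_runs): fastq_runs maps each run accession to
    its list of FASTQ URLs; other_runs lists accessions with an empty
    fastq_ftp column (for example BAM-only submissions).
    """
    fastq_runs, other_runs = {}, []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        # Multiple files for one run are separated by semicolons
        urls = [u for u in (row.get("fastq_ftp") or "").split(";") if u]
        if urls:
            fastq_runs[row["run_accession"]] = urls
        else:
            other_runs.append(row["run_accession"])
    return fastq_runs, other_runs


# Made-up rows for illustration only
SAMPLE = (
    "run_accession\tfastq_ftp\tsubmitted_ftp\n"
    "ERR0000001\tftp.example.org/ERR0000001_1.fastq.gz;"
    "ftp.example.org/ERR0000001_2.fastq.gz\t\n"
    "ERR0000002\t\tftp.example.org/sample2.bam\n"
)

fastq_runs, other_runs = split_runs(SAMPLE)
for acc in other_runs:
    print(f"WARNING: {acc} has no FASTQ files, skipping")
```

To try it on a real study, the report text could be fetched with `urllib.request.urlopen(REPORT_URL).read().decode()` and passed to `split_runs`; the warning could be turned into a raised exception if a hard error is preferred for BAM-only runs.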

ADD COMMENT • link 2.0 years ago by Polina ▴ 10