Use ART illumina to create miRNA simulated data
6.2 years ago
dzisis1986 ▴ 70

Hi, I am trying to use ART Illumina to create simulated miRNA data sets, but I am having trouble understanding the manual. I managed to create an ART profile from a FASTQ sample, and I suppose I have to combine it with a reference file in FASTA format to get FASTQ output. I ran ART as follows:

art_illumina -i testmature.fa -l 10 -sp -f 10 -o simulated_SRAnew

and the output is a FASTQ file that is either empty or contains only headers. I also tried:

art_illumina -i testmature.fa -1 miRNAseq_SRR1035644.txt -l 10 -sp -f 10 -o simulated_SRAnew

where miRNAseq_SRR1035644.txt is the file I created with art_profiler, but again I get an empty FASTQ file.

What kind of FASTA reference should I use? My testmature.fa file looks like this:

>cel-let-7-5p MIMAT0000001 Caenorhabditis elegans let-7-5p
UGAGGUAGUAGGUUGUAUAGUU
>cel-let-7-3p MIMAT0015091 Caenorhabditis elegans let-7-3p
CUAUGCAAUUUUCUACCUUACC
>cel-lin-4-5p MIMAT0000002 Caenorhabditis elegans lin-4-5p
UCCCUGAGACCUCAAGUGUGA
>cel-lin-4-3p MIMAT0015092 Caenorhabditis elegans lin-4-3p
ACACCUGGGCUCUCCGGGUACC
>cel-miR-1-5p MIMAT0020301 Caenorhabditis elegans miR-1-5p
CAUACUUCCUUACAUGCCCAUA
>cel-miR-1-3p MIMAT0000003 Caenorhabditis elegans miR-1-3p
UGGAAUGUAAAGAAGUAUGUA

Do you have any idea what the problem might be, or whether there is another way to create simulated miRNA-seq data? What kind of reference and parameters should I use?

Thank you in advance

miRNA art_illumina simulate • 1.8k views

Sequencers don't report uracil, so you should replace those U's with T's. I also assume ART expects the reference to be a genome (and not small RNAs like the ones you have), so that may be one reason why this is not working.
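
The U-to-T replacement suggested here could be done with a small script. This is only a minimal sketch: the function name `rna_to_dna_fasta` is made up for illustration, and it assumes a plain FASTA layout where header lines start with `>` and every other line is sequence.

```python
def rna_to_dna_fasta(fasta_text: str) -> str:
    """Return FASTA text with U/u replaced by T/t in sequence lines only."""
    table = str.maketrans("Uu", "Tt")
    out = []
    for line in fasta_text.splitlines():
        # Header lines start with '>' and are left untouched.
        out.append(line if line.startswith(">") else line.translate(table))
    return "\n".join(out) + "\n"


# One record copied from the testmature.fa excerpt in the question.
example = (
    ">cel-let-7-5p MIMAT0000001 Caenorhabditis elegans let-7-5p\n"
    "UGAGGUAGUAGGUUGUAUAGUU\n"
)
converted = rna_to_dna_fasta(example)
print(converted, end="")
```

For a real file, the text would be read from testmature.fa and the result written to a new FASTA file that is then passed to `art_illumina -i`.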


Even if I replace the U's with T's, the result is the same. The output of ART is a FASTQ file containing only headers, like this:

@cel-let-7-5p-10

+

@cel-let-7-5p-9

+

@cel-let-7-5p-8

+

@cel-let-7-5p-7

+

@cel-let-7-5p-6

+
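
To confirm that every record really has an empty sequence (rather than the file being truncated), a quick check along these lines may help. It is a sketch that assumes the standard four-lines-per-record FASTQ layout; `count_empty_reads` is a hypothetical helper name, and the sample input mirrors the records shown above.

```python
import io


def count_empty_reads(handle):
    """Return (total_records, empty_records) for a 4-line-per-record FASTQ stream."""
    total = empty = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line
        total += 1
        if not seq:
            empty += 1
    return total, empty


# Two header-only records, as in the ART output pasted above.
sample = io.StringIO("@cel-let-7-5p-10\n\n+\n\n@cel-let-7-5p-9\n\n+\n\n")
print(count_empty_reads(sample))
```

On a real file, the handle would come from `open("simulated_SRAnew.fq")`, assuming ART wrote its output under that name.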

So you suggest that I use a genome, let's say the human genome, create profiles from my miRNA FASTQ files, and give those two as input to ART? And what about adapters? My first problem is getting ART to create a simple data set at all; after that I need to understand how to use it to create miRNA-seq data with adapters. That is the simulated data I need!


Do you absolutely need to simulate the reads? Can you not start with one of the available datasets out there? The miRNA data I worked with had a specific adapter on the 3' end and 4 extra bases at the beginning of each read (due to the kit used). miRNA reads were thus identified by the presence of that adapter sequence; removing it and the 4 bases at the beginning of the read left a 22-25 bp final product. So it is not a straightforward thing to simulate.
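
The read layout described here (4 extra bases, then the miRNA, then a 3' adapter) can be mimicked without ART for a first pipeline test. The sketch below is not ART and has no sequencing error model: qualities are a constant placeholder, the 4 leading bases are drawn uniformly at random, and the adapter sequence and the names `ADAPTER_3P`, `simulate_read` and `to_fastq` are assumptions chosen for illustration, since the thread does not name the kit or its adapter.

```python
import random

# ASSUMPTION: example adapter for illustration only; substitute the real
# 3' adapter of the library kit being modelled.
ADAPTER_3P = "TGGAATTCTCGGGTGCCAAGG"


def simulate_read(mirna, read_len=50, n_random=4, adapter=ADAPTER_3P, rng=random):
    """Build one error-free read: random prefix + miRNA (as DNA) + 3' adapter."""
    prefix = "".join(rng.choice("ACGT") for _ in range(n_random))
    read = prefix + mirna.upper().replace("U", "T") + adapter
    # Truncate to the sequencer read length; shorter constructs are not padded.
    return read[:read_len]


def to_fastq(name, seq, qual_char="I"):
    """Format one FASTQ record with a constant placeholder quality string."""
    return f"@{name}\n{seq}\n+\n{qual_char * len(seq)}\n"


rng = random.Random(0)  # fixed seed so runs are repeatable
mirnas = {"cel-let-7-5p": "UGAGGUAGUAGGUUGUAUAGUU"}
records = []
for name, seq in mirnas.items():
    for i in range(3):
        records.append(to_fastq(f"{name}-{i + 1}", simulate_read(seq, rng=rng)))
print("".join(records), end="")
```

Because the reads are error-free, this only tests whether a pipeline trims the adapter and leading bases correctly; it says nothing about robustness to sequencing errors.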


I would like to create simulated miRNA reads in order to test a pipeline that I have already run on 400 miRNA data sets. I found ART to be the most efficient tool for simulating reads, and there is also a modified version available from the skewer team ( https://sourceforge.net/projects/skewer/files/Simulator/ ). In this version (art_illumina_src151-adapter-enabled.tar.gz) they changed the code a bit in order to create miRNA reads with adapters, but this is all I could find; there is no further documentation. All I could do was follow the ART instructions, which, as you can see above, was unsuccessful.

6.1 years ago
dzisis1986 ▴ 70

So the conclusion is that no one understands how ART works for simulated miRNA data? Has no one used this specific version ( https://sourceforge.net/projects/skewer/files/Simulator/ ) of ART to produce reads that contain adapter contaminants?


Is that a question or a comment? I don't think the ART author claims anywhere that ART can simulate miRNA sequence data. As we have discussed in the past, this is a rather special use case.

