Choosing a random set of 500 FASTA files from a folder containing a large number of FASTA files in Python or Jupyter Notebook
3.9 years ago

I was hoping to randomly select 500 FASTA sequence files from a directory containing all the split FASTA files, and then put the randomly selected files in a separate folder. I want random sampling without replacement! Please help. It would also help to include the first thing to write down when opening Python, as I have minimal coding experience.

fasta python • 1.4k views

Lay out your logic first, then look for ways to implement that logic. For example, you need a list of file names and a way to pick a 500-size sample without replacement from them. Google "python random sample without replacement" and see where that takes you. Plug in your array of file names and you'll have the solution.

First thing to write down: the core of the problem in plain English.


I'll give you a hint:

You'll want the os module to get a list of files in the directory (alternatively you can use the glob module if needed).
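As a rough sketch of that first step, something like the following could collect the file names. The function name `list_fasta_files` and the extensions `.fasta`, `.fa` and `.fna` are my own assumptions for illustration; adjust them to whatever your split files are actually called.

```python
import os


def list_fasta_files(folder, extensions=(".fasta", ".fa", ".fna")):
    """Return the sorted names of FASTA files found directly inside `folder`.

    `extensions` is an assumed set of suffixes; change it to match your files.
    """
    names = []
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        # Keep regular files only (skip subfolders) with a matching suffix.
        if os.path.isfile(path) and name.lower().endswith(extensions):
            names.append(name)
    # Sorting gives a stable order, since os.listdir order is arbitrary.
    return sorted(names)
```

Calling `list_fasta_files("path/to/your/folder")` (a placeholder path) returns a plain Python list of file names that you can then sample from.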

Then you'll want the random or numpy modules to make the random selection. Be careful: random.choices (Python 3.6 or higher) samples with replacement, so for sampling without replacement use random.sample, or numpy.random.choice with replace=False. This is easy to look up, so I suggest you start there.
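Putting the pieces together, here is a minimal sketch of one way to do it with only the standard library. The function name `copy_random_sample`, its parameters, and the `.fasta`/`.fa`/`.fna` extensions are assumptions of mine, not anything fixed by your data. It uses random.sample, which never picks the same item twice, and shutil.copy2 to copy (not move) the chosen files.

```python
import os
import random
import shutil


def copy_random_sample(src_dir, dst_dir, n=500, seed=None,
                       extensions=(".fasta", ".fa", ".fna")):
    """Copy `n` randomly chosen FASTA files from `src_dir` into `dst_dir`.

    Sampling is without replacement. `extensions` is an assumed set of
    suffixes; pass `seed` if you want a reproducible selection.
    """
    # Collect candidate file names in a stable (sorted) order.
    files = sorted(
        name for name in os.listdir(src_dir)
        if os.path.isfile(os.path.join(src_dir, name))
        and name.lower().endswith(extensions)
    )
    if n > len(files):
        raise ValueError(f"asked for {n} files but only found {len(files)}")

    # random.sample draws n distinct items, i.e. without replacement.
    rng = random.Random(seed)
    chosen = rng.sample(files, n)

    # Create the destination folder if needed, then copy each chosen file.
    os.makedirs(dst_dir, exist_ok=True)
    for name in chosen:
        shutil.copy2(os.path.join(src_dir, name), os.path.join(dst_dir, name))
    return chosen
```

In a Jupyter cell you would then run something like `copy_random_sample("all_fastas", "sampled_fastas", n=500)`, where both folder names are placeholders for your own paths. The originals stay where they are; swap shutil.copy2 for shutil.move if you would rather move them.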
