I just tried the same command with the latest version (11.8.1) of datasets and it worked without any issue. Maybe upgrade the datasets application and try again? That said, please read on for a bit more info.
It is possible that you are downloading a lot more data than you really want. You see, the command `datasets download genome taxon "homo sapiens"` will download _all_ human assemblies (~800 of them). I am assuming that you don't want to download the genome sequence for nearly 800 assemblies.
You should add the `--refseq` flag to your command to restrict the results to the latest RefSeq assemblies; there are two of them: GRCh38 (aka hg38) and GRCh37 (aka hg19). You may not even need both. Very likely, you need just the one _latest reference assembly_, which is GRCh38/hg38. For that, you should run the command with the `--refseq` and `--reference` flags, like so: `datasets download genome taxon "homo sapiens" --refseq --reference`
How do you know how many assemblies will end up in your data package _before_ you spend the time to download it? There are two ways to figure this out.
Use the `datasets summary` command. The command `datasets summary genome taxon "homo sapiens" --limit NONE` will return the total count of assemblies for your query: 791. If you drop the `--limit NONE` flag, you will instead download metadata for all of those assemblies in JSON. From that you can pick the assemblies of interest, copy their accessions, and use them with the `datasets download` command (look for the `--inputfile` option of the `datasets download genome accession` command).
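As a rough illustration of the pick-and-choose step, here is a minimal Python sketch that pulls accessions out of the summary JSON and writes them to a file. The JSON layout it expects (an `assemblies` list whose entries hold an `assembly` object with an `assembly_accession` key) and the one-accession-per-line file format are assumptions on my part, as are the function names; check them against the output of your own datasets version before relying on this.

```python
import json


def pick_accessions(summary_text, keep=lambda assembly: True):
    """Return accessions of the assemblies for which keep(assembly) is true.

    ASSUMED layout (not confirmed for every datasets version):
    {"assemblies": [{"assembly": {"assembly_accession": "..."}}, ...]}
    """
    summary = json.loads(summary_text)
    accessions = []
    for entry in summary.get("assemblies", []):
        assembly = entry.get("assembly", {})
        accession = assembly.get("assembly_accession")
        # Skip entries without an accession instead of failing on them.
        if accession and keep(assembly):
            accessions.append(accession)
    return accessions


def write_input_file(accessions, path):
    """Write one accession per line (ASSUMED format for --inputfile)."""
    with open(path, "w") as handle:
        for accession in accessions:
            handle.write(accession + "\n")
```

You would save the summary output to a file first, read it into a string, pass it to `pick_accessions` (optionally with a `keep` filter on whatever metadata fields you care about), and hand the file written by `write_input_file` to the `--inputfile` option.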
Download a dehydrated package. Running the `datasets download` command with the `--dehydrated` flag is like doing a dry run of the download: you get a data package that contains the metadata but not the sequence data. You can read more about it here. Once you run this command and download the zip file (which should be very small), unzip it and look inside the file `ncbi_dataset/data/assembly_data_report.jsonl` to see a report of all assemblies that will end up in a real package. If everything appears as it should, you can either 'rehydrate' the package that you have already downloaded or run the same command without the `--dehydrated` flag to download the actual data.
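If you would rather not eyeball the report, a small Python sketch like the one below can count the records in the unzipped `assembly_data_report.jsonl` and print one field per assembly. The helper names are made up for illustration, and the default field path `assemblyInfo.assemblyAccession` is only my guess at where the accession lives; open the file once and adjust the path to whatever your version actually writes.

```python
import json


def read_jsonl(lines):
    """Parse JSON Lines input: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def get_path(record, dotted_path, default=None):
    """Follow a dotted key path such as 'a.b' through nested dicts."""
    current = record
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def summarize_report(path, field="assemblyInfo.assemblyAccession"):
    """Print the record count and one field per assembly.

    The default field path is an ASSUMPTION about the report layout.
    """
    with open(path) as handle:
        records = read_jsonl(handle)
    print(len(records), "assemblies in the package")
    for record in records:
        print(get_path(record, field, "?"))
```

Calling `summarize_report("ncbi_dataset/data/assembly_data_report.jsonl")` from the directory where you unzipped the package should tell you whether you are about to fetch one assembly or several hundred.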
vkkodali It would be great to have this information available under the Documentation section on the Datasets web page, perhaps in a "Read this first" bullet. Is there a plan to have an online NCBI notebook/manual for Datasets?
Good point.
As the tool matures, I believe the extent of the documentation will increase as well. Four Jupyter notebooks are currently available here.