To download your data using the Dewey Data Python Client, follow these steps in your command line or terminal:
- Install the client
pip install deweypy
- Make a directory for your downloads
mkdir dewey-downloads/
- Locate FOLDER_ID
FOLDER_ID is the final segment of the API URL, the part that follows data/
- Run the client
python -m deweypy --api-key <YOUR_API_KEY> speedy-download <FOLDER_ID>
- For now, please use the CLI to download data. This method is well tested; notebook support will be available soon.
- Increasing the number of workers for multi-threaded downloads yields diminishing returns, as API requests are limited both by our bucket’s rate limits and your own. We recommend the default of 8 workers, but you can override this by adding the following after speedy-download <FOLDER_ID>:
--num-workers <INT>
- If your dataset is date-partitioned, you can limit the data processed by specifying partition boundaries at the end of your command:
--partition-key-before YYYY-MM-DD --partition-key-after YYYY-MM-DD
--partition-key-before includes all partitions up to and including the given date.
--partition-key-after includes all partitions from and including the given date onward.
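The steps above can also be assembled from a script. The following is a minimal sketch and is not part of deweypy: `extract_folder_id` and `build_download_command` are hypothetical helper names, and the URL in the example is a placeholder rather than a real Dewey endpoint. It assumes only what the steps state, namely that FOLDER_ID follows `data/` in the API URL and that the optional flags go after `speedy-download <FOLDER_ID>`.

```python
import shlex
from datetime import datetime
from typing import List, Optional


def extract_folder_id(api_url: str) -> str:
    """Return the path segment that follows the last 'data/' in an API URL."""
    _, found, tail = api_url.rpartition("data/")
    # Drop any query string, fragment, or trailing path after the ID.
    for sep in ("?", "#", "/"):
        tail = tail.split(sep, 1)[0]
    if not found or not tail:
        raise ValueError("no folder ID found after 'data/' in the URL")
    return tail


def _check_date(value: str) -> str:
    # Fail early on values that do not parse as a year-month-day date.
    datetime.strptime(value, "%Y-%m-%d")
    return value


def build_download_command(
    api_key: str,
    folder_id: str,
    num_workers: Optional[int] = None,
    partition_key_before: Optional[str] = None,
    partition_key_after: Optional[str] = None,
) -> List[str]:
    """Assemble the documented CLI call as an argument list (hypothetical helper)."""
    cmd = ["python", "-m", "deweypy", "--api-key", api_key,
           "speedy-download", folder_id]
    if num_workers is not None:
        if num_workers < 1:
            raise ValueError("num_workers must be a positive integer")
        cmd += ["--num-workers", str(num_workers)]
    if partition_key_before is not None:
        cmd += ["--partition-key-before", _check_date(partition_key_before)]
    if partition_key_after is not None:
        cmd += ["--partition-key-after", _check_date(partition_key_after)]
    return cmd


if __name__ == "__main__":
    # Placeholder URL and key, for illustration only.
    folder_id = extract_folder_id("https://api.example.com/v1/data/FOLDER_ID_HERE")
    cmd = build_download_command(
        "<YOUR_API_KEY>",
        folder_id,
        partition_key_before="2024-03-31",
        partition_key_after="2024-01-01",
    )
    # Print the command for review; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Because both boundaries are inclusive, combining `--partition-key-after 2024-01-01` with `--partition-key-before 2024-03-31` selects the partitions from January 1 through March 31, 2024. Leaving `num_workers` unset keeps the recommended default of 8 workers.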
For guidance on analyzing your downloaded data, check out the provided notebook tutorial. It demonstrates how to work with Polars, Pandas, and DuckDB, and includes methods for exporting data to Parquet format for more efficient downstream analysis.