Releases: huggingface/huggingface_hub

v0.5.1: Patch release

07 Apr 19:10

This is a patch release fixing a breaking backward compatibility issue.

Linked PR: #822

v0.5.0: Reference documentation, Keras improvements, stabilizing the API

07 Apr 19:09

Documentation

Version v0.5.0 is the first version which features an API reference. It is still a work in progress with features lacking, some images not rendering, and a documentation reorg coming up, but should already provide significantly simpler access to the huggingface_hub API.

The documentation is visible here.

Model & datasets list improvements

The list_models and list_datasets methods have been improved in several ways.

List private models

These two methods now accept the token keyword to specify your token. Specifying the token will include your private models and datasets in the returned list.

  • Support list_models and list_datasets with token arg by @muellerzr in #638

Modelcard metadata

These two methods now accept the cardData boolean argument. If set to True, the modelcard metadata will also be returned when using these two methods.

  • Include cardData in list_models and list_datasets by @muellerzr in #639

Filtering by carbon emissions

The list_models method now also accepts an emissions_thresholds parameter to filter by carbon emissions.
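The snippet below is a minimal sketch of what threshold-based filtering means, assuming the thresholds are a (minimum, maximum) pair and that each model record carries a numeric emissions value. The helper `filter_by_emissions`, the `co2_eq_emissions` key, and the sample records are hypothetical; the sketch does not call the Hub and is not the library's implementation.

```python
from typing import Dict, List, Optional, Tuple


def filter_by_emissions(
    models: List[Dict],
    thresholds: Tuple[Optional[float], Optional[float]],
) -> List[Dict]:
    """Keep models whose reported emissions fall within (minimum, maximum).

    Either bound may be None to leave that side open. Models that report
    no emissions value are dropped, since they cannot be compared.
    """
    low, high = thresholds
    kept = []
    for model in models:
        emissions = model.get("co2_eq_emissions")  # hypothetical field name
        if emissions is None:
            continue
        if low is not None and emissions < low:
            continue
        if high is not None and emissions > high:
            continue
        kept.append(model)
    return kept


# Hypothetical records standing in for a listing result.
models = [
    {"id": "user/model-a", "co2_eq_emissions": 12.5},
    {"id": "user/model-b", "co2_eq_emissions": 250.0},
    {"id": "user/model-c"},  # no emissions reported
]

selected = filter_by_emissions(models, (None, 100))
print([m["id"] for m in selected])
```

Leaving one bound as `None` gives an open-ended range, which is convenient when only a maximum footprint matters.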

Keras improvements

The Keras serialization and upload methods have been worked on to provide better support for models:

  • All parameters are now included in the saved model when using push_to_hub_keras
  • log_dir parameter for TensorBoard logs, which will automatically spawn a TensorBoard instance on the Hub.
  • Automatic model card

Contributing guide

A contributing guide is now available for the huggingface_hub repository. For any and all information related to contributing to the repository, please check it out!

Read more about it here: CONTRIBUTING.md.

Pre-commit hooks

The huggingface_hub GitHub repository has several checks to ensure that the code respects code quality standards. Opt-in pre-commit hooks have been added in order to make it simpler for contributors to leverage them.

Read more about it in the aforementioned CONTRIBUTING guide.

Renaming and transferring repositories

Repositories can now be renamed and transferred programmatically using move_repo.

  • Allow renaming and transferring repos programmatically by @osanseviero in #704

Breaking changes & deprecation

⛔ The following methods have now been removed following a deprecation cycle

list_repos_objs

The list_repos_objs and the accompanying CLI utility huggingface-cli repo ls-files have been removed.
The same can be done using the model_info and dataset_info methods.

  • Remove deprecated list_repos_objs and huggingface-cli repo ls-files by @julien-c in #702

Python 3.6

Python 3.6 has reached end of life and is no longer supported. Installing huggingface_hub on Python 3.6 will result in version v0.4.0 being installed.

⚠️ Items below are now deprecated and will be removed in a future version

  • API deprecate positional args in file_download and hf_api by @adrinjalali in #745
  • MNT deprecate name and organization in favor of repo_id by @adrinjalali in #733

Full Changelog: v0.4.0...v0.5.0

v0.4.0: Tag listing, Namespace Objects, Model Filter

26 Jan 18:30

Tag listing

This PR introduces the ability to fetch all available tags for models or datasets and returns them as a nested namespace object, for example:

>>> from huggingface_hub import HfApi

>>> api = HfApi() 
>>> tags = api.get_model_tags()
>>> print(tags)
Available Attributes:
 * benchmark
 * language_creators
 * languages
 * licenses
 * multilinguality
 * size_categories
 * task_categories
 * task_ids

>>> print(tags.benchmark)
Available Attributes:
 * raft
 * superb
 * test

Namespace objects

With a goal of adding more tab-completion to the library, this PR introduces two objects:

  • DatasetSearchArguments
  • ModelSearchArguments

These two AttributeDictionary objects expose all the valid information we can extract from a model (or dataset) as tab-completable attributes. They also include author_or_organization and model_name (or dataset_name), obtained through careful string splitting.
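To illustrate the idea, here is a minimal sketch of an attribute-style dictionary and of splitting a repository identifier into its two parts. The class `AttrDict` and the helper `split_repo_id` are hypothetical stand-ins written for this example; they are not the library's AttributeDictionary or its splitting logic.

```python
class AttrDict(dict):
    """Dictionary whose keys can also be read as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __dir__(self):
        # Listing the keys here is what makes tab completion work in a REPL.
        return sorted(set(super().__dir__()) | set(self.keys()))


def split_repo_id(repo_id: str) -> AttrDict:
    """Split 'author/name' into its parts; the author is None when absent."""
    author, sep, name = repo_id.rpartition("/")
    return AttrDict(
        author_or_organization=author if sep else None,
        model_name=name,
    )


tags = AttrDict(benchmark=AttrDict(raft="benchmark:raft"))
info = split_repo_id("microsoft/wavlm-base-sd")
print(tags.benchmark.raft, info.author_or_organization, info.model_name)
```

Overriding `__dir__` is the design choice that matters here: interactive shells use it to decide which names to offer on tab completion.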

Model Filter

This PR introduces a new way to search the hub: the ModelFilter class.

To the user it first looks like a simple Enum: it lets them specify what they want to search for, such as:

f = ModelFilter(author="microsoft", model_name="wavlm-base-sd", framework="pytorch")

From there, the filter can be passed to the list_models method of HfApi to search the Hub:

models = api.list_models(filter=f)

The API may then be used for complex queries:

args = ModelSearchArguments()
f = ModelFilter(framework=[args.library.pytorch, args.library.TensorFlow], model_name="bert", tasks=[args.pipeline_tag.Summarization, args.pipeline_tag.TokenClassification])

api.list_models(filter=f)

Ignoring filenames in snapshot_download

This PR introduces a way to limit the files fetched by snapshot_download. This is useful when you want to download and cache an entire repository without using git while skipping some files based on their filenames.
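As a rough illustration of filename-based skipping, the sketch below filters a list of repository filenames against regular expressions before anything would be downloaded. The helper `filter_filenames`, the choice of regular expressions as the pattern syntax, and the sample filenames are assumptions made for this example; the release note above does not specify the parameter name or the matching rules that snapshot_download uses.

```python
import re
from typing import Iterable, List


def filter_filenames(filenames: Iterable[str], ignore_patterns: Iterable[str]) -> List[str]:
    """Return the filenames that match none of the given regular expressions."""
    compiled = [re.compile(pattern) for pattern in ignore_patterns]
    return [
        name
        for name in filenames
        if not any(regex.search(name) for regex in compiled)
    ]


# Hypothetical repository listing.
repo_files = ["config.json", "pytorch_model.bin", "tf_model.h5", "README.md"]

# Skip the large weight files and keep only the lightweight ones.
to_download = filter_filenames(repo_files, [r"\.h5$", r"\.bin$"])
print(to_download)
```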

Full Changelog: v0.2.1...v0.4.0

v0.2.1: Patch release

26 Jan 18:18

This is a patch release fixing an issue with the notebook login.

5e2da9b#diff-fb1696cbcf008dd89dde5e8c1da9d4be5a8f7d809bc32f07d4453caba40df15f

v0.2.0: Access tokens, skip large files, local files only

26 Jan 18:17

Access tokens

Version v0.2.0 adds support for Hub access tokens. Access tokens are now the main login method; logging in with a username and password remains possible by pressing [Ctrl/CMD]+C at the login prompt.

The notebook login is adapted to work with the access tokens.

Skipping large files

The Repository class now has an additional parameter, skip_lfs_files, which allows cloning the repository while skipping the large file download.

#472

Local files only for snapshot_download

The snapshot_download method can now take local_files_only as a parameter to enable leveraging previously downloaded files.

#505

v0.1.2: Patch release

09 Nov 17:46

Full Changelog: v0.1.1...v0.1.2

v0.1.1: Patch release

05 Nov 18:39

What's Changed

  • Fix typing-extensions minimum version by @lhoestq in #453
  • Fix argument order in create_repo for Repository.clone_from by @sgugger in #459

Full Changelog: v0.1.0...v0.1.1

v0.1.0: Optional token, `HfApi` begone, git prune

02 Nov 22:42

What's Changed

Version v0.1.0 is the first minor release of the huggingface_hub package, which promises better stability for the incoming versions. This update comes with big quality of life improvements.

Make token optional in all HfApi methods. by @sgugger in #379

Previously, most methods of the HfApi class required the token to be explicitly passed. This is changed in this version, where it defaults to the token stored in the cache. This results in a re-ordering of arguments, but backward compatibility is preserved in most cases. Where it is not preserved, an explicit error is thrown.
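The fallback behaviour described above can be sketched as follows. This is an illustration only: the helper `resolve_token` and the token file location are hypothetical, and the sketch uses a temporary directory rather than the library's real cache.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_token(token: Optional[str], token_file: Path) -> Optional[str]:
    """Prefer an explicit token; otherwise fall back to the stored one."""
    if token is not None:
        return token
    if token_file.is_file():
        # An empty file is treated the same as no stored token.
        return token_file.read_text().strip() or None
    return None


with tempfile.TemporaryDirectory() as tmp:
    token_file = Path(tmp) / "token"  # hypothetical cache location

    missing = resolve_token(None, token_file)       # nothing stored yet
    token_file.write_text("stored-token\n")
    from_cache = resolve_token(None, token_file)    # falls back to the file
    explicit = resolve_token("explicit-token", token_file)  # explicit wins

print(missing, from_cache, explicit)
```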

Root methods instead of HfApi by @LysandreJik in #388

The HfApi class now exposes its methods through the hf_api file, reducing the friction to access these helpers. See the example below:

# Previously
from huggingface_hub import HfApi

api = HfApi()
user = api.whoami()

# Now
from huggingface_hub.hf_api import whoami

user = whoami()

The HfApi can still be imported and works as before for backward compatibility.

Add list_repo_files util by @sgugger in #395

Adds a list_repo_files method that lists the files in a repository. It supports both model repositories and dataset repositories.

Add helper to generate an eval result model-index, with proper typing by @julien-c in #382

Adds a metadata_eval_result helper that generates a YAML block to put in model cards according to evaluation results.
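To make the output concrete, here is a minimal sketch of the kind of model-index structure such a helper could produce. The function `build_model_index`, its parameters, the sample values, and the exact field layout are assumptions for illustration, not the signature or output of metadata_eval_result. The result is built as a plain dictionary and printed as JSON, so YAML serialization is left out.

```python
import json
from typing import Dict


def build_model_index(
    model_name: str,
    task_type: str,
    dataset_type: str,
    dataset_name: str,
    metric_type: str,
    metric_value: float,
) -> Dict:
    """Assemble one evaluation result in a model-index style layout (assumed)."""
    return {
        "model-index": [
            {
                "name": model_name,
                "results": [
                    {
                        "task": {"type": task_type},
                        "dataset": {"type": dataset_type, "name": dataset_name},
                        "metrics": [{"type": metric_type, "value": metric_value}],
                    }
                ],
            }
        ]
    }


# Hypothetical evaluation result.
card_data = build_model_index(
    model_name="my-model",
    task_type="text-classification",
    dataset_type="imdb",
    dataset_name="IMDb",
    metric_type="accuracy",
    metric_value=0.91,
)
print(json.dumps(card_data, indent=2))
```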

Add metrics to API by @mariosasko in #429

Adds a list_metrics method to HfApi!

Git prune by @LysandreJik in #450

Adds a git_prune method to the Repository class. It prunes local files that are no longer needed because they have already been pushed to a remote.
It also adds the auto_lfs_prune argument to git_push and to the commit context manager for simpler handling.

Full Changelog: v0.0.19...v0.1.0

v0.0.18: Repo metadata, git tags, Keras mixin

04 Oct 21:10

Repository metadata (@julien-c)

The version v0.0.18 of the huggingface_hub includes tools to manage repository metadata. The following example reads metadata from a repository:

from huggingface_hub import Repository

repo = Repository("xxx", clone_from="yyy")
data = repo.repocard_metadata_load()

The following example completes that metadata before writing it to the repository locally.

data["license"] = "apache-2.0"
repo.repocard_metadata_save(data)

Git tags (@AngledLuffa)

Tag management is now available! Add, check, delete tags locally or remotely directly from the Repository utility.

Revisited Keras support (@nateraw)

The Keras mixin has been revisited:

  • It now saves models as SavedModel objects rather than .h5 files.
  • It now offers methods that can be leveraged simply as a functional API, instead of having to use the Mixin as an actual mixin.

v0.0.17: Non-blocking git push, notebook login

04 Oct 21:00

Non-blocking git-push

The pushing methods now accept a blocking boolean parameter that indicates whether the push should block until it completes; pass blocking=False to push asynchronously.

To check whether the push has finished, or to read its status code (to spot a failure), use the command_queue property of the Repository object.

For example:

from huggingface_hub import Repository

repo = Repository("<local_folder>", clone_from="<user>/<model_name>")

with repo.commit("Commit message", blocking=False):
    pass  # Save data here

last_command = repo.command_queue[-1]

# Status of the push command
last_command.status  
# Will return the status code
#     -> -1 will indicate the push is still ongoing
#     -> 0 will indicate the push has completed successfully
#     -> non-zero code indicates the error code if there was an error

# if there was an error, the stderr may be inspected
last_command.stderr

# Whether the command finished or if it is still ongoing
last_command.is_done

# Whether the command errored-out.
last_command.failed

When using blocking=False, the commands are tracked and your script exits only once all pushes are done, even if other errors happen in your script (a failed push counts as done).
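The status semantics described above can be reproduced with a small stand-alone sketch built on `subprocess`. The class `TrackedCommand` is hypothetical and only mirrors the attribute names listed earlier (status, stderr, is_done, failed); it is not the object stored in command_queue, and it runs short Python child processes instead of a real git push.

```python
import subprocess
import sys
from typing import List, Optional


class TrackedCommand:
    """Run a command in the background and expose its progress."""

    def __init__(self, args: List[str]) -> None:
        self._process = subprocess.Popen(
            args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
        )
        self.stderr: Optional[str] = None  # filled in by wait()

    @property
    def status(self) -> int:
        # -1 while the command is still running, its exit code afterwards.
        code = self._process.poll()
        return -1 if code is None else code

    @property
    def is_done(self) -> bool:
        return self._process.poll() is not None

    @property
    def failed(self) -> bool:
        code = self._process.poll()
        return code is not None and code != 0

    def wait(self) -> int:
        # Block until the command exits and capture what it wrote to stderr.
        _, self.stderr = self._process.communicate()
        return self.status


# Two stand-in "pushes": one that succeeds and one that exits with code 3.
ok = TrackedCommand([sys.executable, "-c", "print('pushed')"])
bad = TrackedCommand(
    [sys.executable, "-c", "import sys; sys.stderr.write('rejected'); sys.exit(3)"]
)

for command in (ok, bad):
    command.wait()  # a script would typically do this before exiting

print(ok.status, ok.failed, bad.status, bad.failed, bad.stderr)
```

Waiting on every tracked command before exiting is what guarantees that no background push is cut short, and a failed command still counts as finished.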

Notebook login (@sgugger)

The huggingface_hub library now has a notebook_login method which can be used to log in from notebooks that have no access to the shell. In a notebook, log in with the following:

from huggingface_hub import notebook_login

notebook_login()
