Doc page: git vs http (#1245)
* HTTP vs GIT page

* rename

* rewording

* broken url

* few tweaks

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <julien@huggingface.co>

Wauplin and julien-c authored Dec 8, 2022
1 parent c11ce25 commit f6f524c
Showing 3 changed files with 73 additions and 8 deletions.
4 changes: 4 additions & 0 deletions docs/source/_toctree.yml
@@ -26,6 +26,10 @@
title: Manage the Cache
- local: how-to-model-cards
title: Create and Share Model Cards
- title: "Conceptual guides"
sections:
- local: concepts/git_vs_http
title: Git vs HTTP paradigm
- title: "Reference"
sections:
- local: package_reference/overview
59 changes: 59 additions & 0 deletions docs/source/concepts/git_vs_http.mdx
@@ -0,0 +1,59 @@
# Git vs HTTP paradigm

The `huggingface_hub` library lets you interact with the 🤗 Hub, which is a
collection of git-based repositories (models, datasets or spaces). There are two main
ways to access the Hub using `huggingface_hub`.

The first approach, the so-called "git-based" approach, is led by the [`Repository`] class.
This method uses a wrapper around the `git` command with additional functions specifically
designed to interact with the Hub. The second option, called the "HTTP-based" approach,
involves making HTTP requests using the [`HfApi`] client. Let's examine the pros and cons
of each approach.

## Repository: the historical git-based approach

At first, `huggingface_hub` was mostly built around the [`Repository`] class. It provides
Python wrappers for common `git` commands such as `"git add"`, `"git commit"`, `"git push"`,
`"git tag"`, `"git checkout"`, etc.
The library also helps with setting credentials and tracking large files, which are often
used in machine learning repositories. Additionally, the library allows you to execute its
methods in the background, making it useful for uploading data during training.
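
As a minimal sketch of the git-based workflow described above, the helper below stages,
commits and pushes through a [`Repository`]-like object. The helper name `commit_and_push`
is hypothetical and not part of the library; the methods it calls (`git_add`, `git_commit`,
`git_push`) are those of [`Repository`], though exact signatures may differ between versions.

```python
def commit_and_push(repo, message, blocking=True):
    """Stage all changes, commit them and push them to the Hub.

    `repo` is assumed to behave like `huggingface_hub.Repository`.
    """
    # Stage everything in the working tree and let large files be tracked with git-lfs.
    repo.git_add(pattern=".", auto_lfs_track=True)
    repo.git_commit(commit_message=message)
    # With blocking=False the push is started in the background,
    # which is what makes this usable during training.
    return repo.git_push(blocking=blocking)
```

With a real clone this could be called as
`commit_and_push(Repository(local_dir="my-model", clone_from="<user>/<repo>"), "Add weights", blocking=False)`,
where the directory and the repo id are placeholders.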

The main advantage of using a [`Repository`] is that it lets you maintain a local
copy of the entire repository on your machine. This can also be a disadvantage, as
you have to keep that local copy up to date. This is similar to traditional software
development, where each developer maintains their own local copy and pushes changes
when working on a feature. In the context of machine learning, however, this is not
always necessary: users may only need to download weights for inference, or convert
weights from one format to another, without cloning the entire repository.

## HfApi: a flexible and convenient HTTP client

The [`HfApi`] class was developed to provide an alternative to local git repositories, which
can be cumbersome to maintain, especially when dealing with large models or datasets. The
[`HfApi`] class offers most of the functionality of the git-based approach, such as downloading
and pushing files and creating branches and tags, but without the need for a local folder
to sync.
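
As a rough sketch of the HTTP-based workflow, the helper below uploads a single file and
optionally tags the resulting revision, without any local clone. The helper name
`publish_file` is hypothetical; `upload_file` and `create_tag` are [`HfApi`] methods, and
the arguments shown are only the ones this sketch assumes you need.

```python
def publish_file(api, repo_id, local_path, path_in_repo, tag=None):
    """Upload one file over HTTP and optionally tag the repo afterwards.

    `api` is assumed to behave like `huggingface_hub.HfApi`.
    """
    # The file is sent directly to the Hub as a single commit; no local folder is synced.
    result = api.upload_file(
        path_or_fileobj=local_path,
        path_in_repo=path_in_repo,
        repo_id=repo_id,
    )
    if tag is not None:
        # Tag the current state of the default branch.
        api.create_tag(repo_id, tag=tag)
    return result
```

With the real client this could be called as
`publish_file(HfApi(), "<user>/<repo>", "model.safetensors", "model.safetensors", tag="v1.0")`,
where the repo id, file names and tag are placeholders.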

In addition to the functionality already provided by `git`, the [`HfApi`] class offers
further features, such as the ability to manage repos, download files using caching for
efficient reuse, search the Hub for repos and metadata, access community features such as
discussions, PRs, and comments, and configure HF Spaces for hardware and secrets.

## What should I use? And when?

Overall, the **HTTP-based approach is the recommended way to use** `huggingface_hub`
in most cases. However, there are a few situations where maintaining a local git clone
(using [`Repository`]) may be more beneficial:
- If you are training a model on your machine, it may be more efficient to use a traditional
git-based workflow, pushing regular updates. [`Repository`] is optimized for this type of
situation with its ability to work in the background.
- If you need to manually edit large files, `git` is the best option as it only sends the
diff to the server. With the [`HfApi`] client, the entire file is uploaded with each edit.
But keep in mind that most large files are binary and so do not benefit from git diffs anyway.
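
To give a feel for the diff-versus-full-upload trade-off in the last point, here is a toy
illustration using Python's standard `difflib`. It is not how `git` computes or transfers
deltas; it only compares the size of a textual diff with the size of the whole file for a
one-line edit. The function name and the sample data are made up for this sketch.

```python
import difflib


def diff_vs_full_size(old_text, new_text):
    """Return (diff_bytes, full_bytes) for an edit of a text file."""
    diff = "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
        )
    )
    return len(diff.encode("utf-8")), len(new_text.encode("utf-8"))


# A large text file in which a single line is edited.
old = "".join(f"row {i}\n" for i in range(10_000))
new = old.replace("row 5000\n", "row 5000 (edited)\n")

diff_bytes, full_bytes = diff_vs_full_size(old, new)
print(f"diff: {diff_bytes} bytes, full file: {full_bytes} bytes")
```

For a text file like this the diff is a small fraction of the full file. For binary files
such as model weights, a small logical change often rewrites most of the bytes, which is why
the advantage tends to disappear there.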

Not all git commands are available through [`HfApi`]. Some may never be implemented, but
we are always trying to improve and close the gap. If you don't see your use case covered,
please open [an issue on GitHub](https://github.com/huggingface/huggingface_hub)! We
welcome feedback to help build the 🤗 ecosystem with and for our users.
18 changes: 10 additions & 8 deletions docs/source/index.mdx
@@ -16,26 +16,28 @@ the Inference API.
<div class="mt-10">
<div class="w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5">

<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./guides/overview"
><div class="w-full text-center bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">How-to guides</div>
<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./guides/overview">
<div class="w-full text-center bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">How-to guides</div>
<p class="text-gray-700">Practical guides to help you achieve a specific goal. Take a look at these guides to learn how to use huggingface_hub to solve real-world problems.</p>
</a>

<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./package_reference/overview"
><div class="w-full text-center bg-gradient-to-br from-purple-400 to-purple-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">Reference</div>
<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./package_reference/overview">
<div class="w-full text-center bg-gradient-to-br from-purple-400 to-purple-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">Reference</div>
<p class="text-gray-700">Exhaustive and technical description of huggingface_hub classes and methods.</p>
</a>

<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./concepts/git_vs_http">
<div class="w-full text-center bg-gradient-to-br from-pink-400 to-pink-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">Conceptual guides</div>
<p class="text-gray-700">High-level explanations for building a better understanding of huggingface_hub philosophy.</p>
</a>

</div>
</div>

<!--
<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./tutorials/overview"
><div class="w-full text-center bg-gradient-to-br from-blue-400 to-blue-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">Tutorials</div>
<p class="text-gray-700">Learn the basics and become familiar with using huggingface_hub to programmatically interact with the 🤗 Hub!</p>
</a>
<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./concepts/overview"
><div class="w-full text-center bg-gradient-to-br from-pink-400 to-pink-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">Conceptual guides</div>
<p class="text-gray-700">High-level explanations for building a better understanding of important topics such as huggingface_hub philosophy, the git-based vs http-based paradigm or the cache system internals.</p>
</a> -->

## Contribute