@padeoe
Last active April 8, 2026 05:40
CLI tool for downloading Hugging Face models and datasets with aria2/wget: hfd

🤗Huggingface Model Downloader

Note

(2025-01-08) Add feature for 🏷️Tag(Revision) Selection, contributed by @Bamboo-D.
(2024-12-17) Add feature for ⚡Quick Startup and ⏭️Fast Resume, enabling skipping of downloaded files, while removing the git clone dependency to accelerate file list retrieval.

Given the lack of multi-threaded download support in the official huggingface-cli and the inadequate error handling in hf_transfer, this command-line tool leverages curl and aria2c for fast and robust downloading of models and datasets.

Features

  • ⏯️ Resume from breakpoint: you can interrupt with Ctrl+C and re-run at any time.
  • 🚀 Multi-threaded Download: utilize multiple threads to speed up the download process.
  • 🚫 File Exclusion: use --exclude or --include to skip or select files, saving time for models published in duplicate formats (e.g., *.bin and *.safetensors).
  • 🔐 Auth Support: for gated models that require a Huggingface login, use --hf_username and --hf_token to authenticate.
  • 🪞 Mirror Site Support: set up with the HF_ENDPOINT environment variable.
  • 🌍 Proxy Support: set up with the https_proxy environment variable.
  • 📦 Simple: minimal dependencies; requires only curl and wget, while aria2 and jq are optional for better performance.
  • 🏷️ Tag Selection: download a specific model/dataset revision using --revision.
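
Mirror and proxy support are configured purely through environment variables. A minimal sketch (hf-mirror.com is a community mirror shown for illustration; the proxy address is a placeholder):

```shell
# Point hfd at a mirror endpoint (example mirror; substitute your own)
export HF_ENDPOINT=https://hf-mirror.com

# Or route traffic through an HTTP proxy (placeholder address)
export https_proxy=http://127.0.0.1:7890

# Subsequent invocations pick these up automatically, e.g.:
# ./hfd.sh gpt2
```

Set one or the other depending on whether the mirror or the proxy is reachable from your network.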

Usage

First, download hfd.sh or clone this repo, then grant the script execute permission.

chmod a+x hfd.sh

You can create an alias for convenience:

alias hfd="$PWD/hfd.sh"

Usage Instructions

$ ./hfd.sh --help
Usage:
  hfd <REPO_ID> [--include include_pattern1 include_pattern2 ...] [--exclude exclude_pattern1 exclude_pattern2 ...] [--hf_username username] [--hf_token token] [--tool aria2c|wget] [-x threads] [-j jobs] [--dataset] [--local-dir path] [--revision rev]

Description:
  Downloads a model or dataset from Hugging Face using the provided repo ID.

Arguments:
  REPO_ID         The Hugging Face repo ID (Required)
                  Format: 'org_name/repo_name' or legacy format (e.g., gpt2)
Options:
  include/exclude_pattern Patterns to match against file paths; wildcard characters are supported.
                  e.g., '--exclude *.safetensor *.md', '--include vae/*'.
  --include       (Optional) Patterns to include files for downloading (supports multiple patterns).
  --exclude       (Optional) Patterns to exclude files from downloading (supports multiple patterns).
  --hf_username   (Optional) Hugging Face username for authentication (not email).
  --hf_token      (Optional) Hugging Face token for authentication.
  --tool          (Optional) Download tool to use: aria2c (default) or wget.
  -x              (Optional) Number of download threads for aria2c (default: 4).
  -j              (Optional) Number of concurrent downloads for aria2c (default: 5).
  --dataset       (Optional) Flag to indicate downloading a dataset.
  --local-dir     (Optional) Directory path to store the downloaded data.
                             Defaults to the current directory with a subdirectory named 'repo_name'
                             if REPO_ID is composed of 'org_name/repo_name'.
  --revision      (Optional) Model/Dataset revision to download (default: main).

Example:
  hfd gpt2
  hfd bigscience/bloom-560m --exclude *.bin *.msgpack onnx/*
  hfd meta-llama/Llama-2-7b --hf_username myuser --hf_token mytoken -x 4
  hfd lavita/medical-qa-shared-task-v1-toy --dataset
  hfd bartowski/Phi-3.5-mini-instruct-exl2 --revision 5_0

Download a model

hfd bigscience/bloom-560m

Download a model that requires login

Get a Hugging Face token from https://huggingface.co/settings/tokens, then:

hfd meta-llama/Llama-2-7b --hf_username YOUR_HF_USERNAME_NOT_EMAIL --hf_token YOUR_HF_TOKEN

Download a model and exclude certain files (e.g., .safetensors)

hfd bigscience/bloom-560m --exclude *.bin *.msgpack onnx/*

You can also pass multiple --exclude flags:

hfd bigscience/bloom-560m --exclude *.bin --exclude *.msgpack --exclude onnx/*

Download specific files using include patterns

hfd Qwen/Qwen2.5-Coder-32B-Instruct-GGUF --include *q2_k*.gguf

Download a dataset

hfd lavita/medical-qa-shared-task-v1-toy --dataset

Download a specific revision of a model

hfd bartowski/Phi-3.5-mini-instruct-exl2 --revision 5_0

Multi-threading and Parallel Downloads

The script supports two types of parallelism when using aria2c:

  • Threads per File (-x): Controls connections per file, usage: hfd gpt2 -x 8, recommended: 4-8, default: 4 threads.

  • Concurrent Files (-j): Controls simultaneous file downloads, usage: hfd gpt2 -j 3, recommended: 3-8, default: 5 files.

Combined usage:

hfd gpt2 -x 8 -j 3  # 8 threads per file, 3 files at once
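
Under the hood, the script writes an aria2c input file (.hfd/aria2c_urls.txt) that pairs each URL with indented per-download options. A sketch of that format (the URL and token value are illustrative):

```shell
# Each entry is a URL followed by indented per-download options,
# matching the format hfd generates in .hfd/aria2c_urls.txt
cat > aria2c_urls.txt <<'EOF'
https://huggingface.co/gpt2/resolve/main/model.safetensors
  dir=.
  out=model.safetensors
  header=Authorization: Bearer hf_XXXX
EOF

# aria2c would consume it with: aria2c -x 4 -j 5 -c -i aria2c_urls.txt
```

The dir=/out= pair reconstructs the repo's directory layout locally, and the header= line attaches the auth token only when one was supplied.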
#!/usr/bin/env bash
# Color definitions
RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; NC='\033[0m' # No Color
trap 'printf "${YELLOW}\nDownload interrupted. You can resume by re-running the command.\n${NC}"; exit 1' INT
display_help() {
cat << EOF
Usage:
hfd <REPO_ID> [--include include_pattern1 include_pattern2 ...] [--exclude exclude_pattern1 exclude_pattern2 ...] [--hf_username username] [--hf_token token] [--tool aria2c|wget] [-x threads] [-j jobs] [--dataset] [--local-dir path] [--revision rev]
Description:
Downloads a model or dataset from Hugging Face using the provided repo ID.
Arguments:
REPO_ID The Hugging Face repo ID (Required)
Format: 'org_name/repo_name' or legacy format (e.g., gpt2)
Options:
include/exclude_pattern Patterns to match against file paths; wildcard characters are supported.
e.g., '--exclude *.safetensor *.md', '--include vae/*'.
--include (Optional) Patterns to include files for downloading (supports multiple patterns).
--exclude (Optional) Patterns to exclude files from downloading (supports multiple patterns).
--hf_username (Optional) Hugging Face username for authentication (not email).
--hf_token (Optional) Hugging Face token for authentication.
--tool (Optional) Download tool to use: aria2c (default) or wget.
-x (Optional) Number of download threads for aria2c (default: 4).
-j (Optional) Number of concurrent downloads for aria2c (default: 5).
--dataset (Optional) Flag to indicate downloading a dataset.
--local-dir (Optional) Directory path to store the downloaded data.
Defaults to the current directory with a subdirectory named 'repo_name'
if REPO_ID is composed of 'org_name/repo_name'.
--revision (Optional) Model/Dataset revision to download (default: main).
Example:
hfd gpt2
hfd bigscience/bloom-560m --exclude *.safetensors
hfd meta-llama/Llama-2-7b --hf_username myuser --hf_token mytoken -x 4
hfd lavita/medical-qa-shared-task-v1-toy --dataset
hfd bartowski/Phi-3.5-mini-instruct-exl2 --revision 5_0
EOF
exit 1
}
[[ -z "$1" || "$1" =~ ^-h || "$1" =~ ^--help ]] && display_help
REPO_ID=$1
shift
# Default values
TOOL="aria2c"
THREADS=4
CONCURRENT=5
HF_ENDPOINT=${HF_ENDPOINT:-"https://huggingface.co"}
INCLUDE_PATTERNS=()
EXCLUDE_PATTERNS=()
REVISION="main"
validate_number() {
[[ "$2" =~ ^[1-9][0-9]*$ && "$2" -le "$3" ]] || { printf "${RED}[Error] $1 must be 1-$3${NC}\n"; exit 1; }
}
# Argument parsing
while [[ $# -gt 0 ]]; do
case $1 in
--include) shift; while [[ $# -gt 0 && ! ($1 =~ ^--) && ! ($1 =~ ^-[^-]) ]]; do INCLUDE_PATTERNS+=("$1"); shift; done ;;
--exclude) shift; while [[ $# -gt 0 && ! ($1 =~ ^--) && ! ($1 =~ ^-[^-]) ]]; do EXCLUDE_PATTERNS+=("$1"); shift; done ;;
--hf_username) HF_USERNAME="$2"; shift 2 ;;
--hf_token) HF_TOKEN="$2"; shift 2 ;;
--tool)
case $2 in
aria2c|wget)
TOOL="$2"
;;
*)
printf "%b[Error] Invalid tool. Use 'aria2c' or 'wget'.%b\n" "$RED" "$NC"
exit 1
;;
esac
shift 2
;;
-x) validate_number "threads (-x)" "$2" 10; THREADS="$2"; shift 2 ;;
-j) validate_number "concurrent downloads (-j)" "$2" 10; CONCURRENT="$2"; shift 2 ;;
--dataset) DATASET=1; shift ;;
--local-dir) LOCAL_DIR="$2"; shift 2 ;;
--revision) REVISION="$2"; shift 2 ;;
*) display_help ;;
esac
done
# Generate current command string
generate_command_string() {
local cmd_string="REPO_ID=$REPO_ID"
cmd_string+=" TOOL=$TOOL"
cmd_string+=" INCLUDE_PATTERNS=${INCLUDE_PATTERNS[*]}"
cmd_string+=" EXCLUDE_PATTERNS=${EXCLUDE_PATTERNS[*]}"
cmd_string+=" DATASET=${DATASET:-0}"
cmd_string+=" HF_USERNAME=${HF_USERNAME:-}"
cmd_string+=" HF_TOKEN=${HF_TOKEN:-}"
cmd_string+=" HF_ENDPOINT=${HF_ENDPOINT:-}"
cmd_string+=" REVISION=$REVISION"
echo "$cmd_string"
}
# Check if aria2, wget, curl are installed
check_command() {
if ! command -v "$1" &>/dev/null; then
printf "%b%s is not installed. Please install it first.%b\n" "$RED" "$1" "$NC"
exit 1
fi
}
check_command curl; check_command "$TOOL"
LOCAL_DIR="${LOCAL_DIR:-${REPO_ID#*/}}"
mkdir -p "$LOCAL_DIR/.hfd"
if [[ "$DATASET" == 1 ]]; then
METADATA_API_PATH="datasets/$REPO_ID"
DOWNLOAD_API_PATH="datasets/$REPO_ID"
CUT_DIRS=5
else
METADATA_API_PATH="models/$REPO_ID"
DOWNLOAD_API_PATH="$REPO_ID"
CUT_DIRS=4
fi
# Modify API URL, construct based on revision
if [[ "$REVISION" != "main" ]]; then
METADATA_API_PATH="$METADATA_API_PATH/revision/$REVISION"
fi
API_URL="$HF_ENDPOINT/api/$METADATA_API_PATH"
METADATA_FILE="$LOCAL_DIR/.hfd/repo_metadata.json"
# Fetch and save metadata
fetch_and_save_metadata() {
status_code=$(curl -L -s -w "%{http_code}" -o "$METADATA_FILE" ${HF_TOKEN:+-H "Authorization: Bearer $HF_TOKEN"} "$API_URL")
RESPONSE=$(cat "$METADATA_FILE")
if [ "$status_code" -eq 200 ]; then
printf "%s\n" "$RESPONSE"
else
printf "%b[Error] Failed to fetch metadata from $API_URL. HTTP status code: $status_code.%b\n$RESPONSE\n" "${RED}" "${NC}" >&2
rm -f "$METADATA_FILE"
exit 1
fi
}
check_authentication() {
local response="$1"
if command -v jq &>/dev/null; then
local gated
gated=$(echo "$response" | jq -r '.gated // false')
if [[ "$gated" != "false" && ( -z "$HF_TOKEN" || -z "$HF_USERNAME" ) ]]; then
printf "${RED}The repository requires authentication, but --hf_username and --hf_token are not provided. Please get a token from https://huggingface.co/settings/tokens.\nExiting.\n${NC}"
exit 1
fi
else
if echo "$response" | grep -q '"gated":[^f]' && [[ -z "$HF_TOKEN" || -z "$HF_USERNAME" ]]; then
printf "${RED}The repository requires authentication, but --hf_username and --hf_token are not provided. Please get a token from https://huggingface.co/settings/tokens.\nExiting.\n${NC}"
exit 1
fi
fi
}
if [[ ! -f "$METADATA_FILE" ]]; then
printf "%bFetching repo metadata...%b\n" "$YELLOW" "$NC"
RESPONSE=$(fetch_and_save_metadata) || exit 1
check_authentication "$RESPONSE"
else
printf "%bUsing cached metadata: $METADATA_FILE%b\n" "$GREEN" "$NC"
RESPONSE=$(cat "$METADATA_FILE")
check_authentication "$RESPONSE"
fi
should_regenerate_filelist() {
local command_file="$LOCAL_DIR/.hfd/last_download_command"
local current_command=$(generate_command_string)
# If file list doesn't exist, regenerate
if [[ ! -f "$LOCAL_DIR/$fileslist_file" ]]; then
echo "$current_command" > "$command_file"
return 0
fi
# If command file doesn't exist, regenerate
if [[ ! -f "$command_file" ]]; then
echo "$current_command" > "$command_file"
return 0
fi
# Compare current command with saved command
local saved_command=$(cat "$command_file")
if [[ "$current_command" != "$saved_command" ]]; then
echo "$current_command" > "$command_file"
return 0
fi
return 1
}
fileslist_file=".hfd/${TOOL}_urls.txt"
if should_regenerate_filelist; then
# Remove existing file list if it exists
[[ -f "$LOCAL_DIR/$fileslist_file" ]] && rm "$LOCAL_DIR/$fileslist_file"
printf "%bGenerating file list...%b\n" "$YELLOW" "$NC"
# Convert include and exclude patterns to regex
INCLUDE_REGEX=""
EXCLUDE_REGEX=""
if ((${#INCLUDE_PATTERNS[@]})); then
INCLUDE_REGEX=$(printf '%s\n' "${INCLUDE_PATTERNS[@]}" | sed 's/\./\\./g; s/\*/.*/g' | paste -sd '|' -)
fi
if ((${#EXCLUDE_PATTERNS[@]})); then
EXCLUDE_REGEX=$(printf '%s\n' "${EXCLUDE_PATTERNS[@]}" | sed 's/\./\\./g; s/\*/.*/g' | paste -sd '|' -)
fi
# Check if jq is available
if command -v jq &>/dev/null; then
process_with_jq() {
if [[ "$TOOL" == "aria2c" ]]; then
printf "%s" "$RESPONSE" | jq -r \
--arg endpoint "$HF_ENDPOINT" \
--arg repo_id "$DOWNLOAD_API_PATH" \
--arg token "$HF_TOKEN" \
--arg include_regex "$INCLUDE_REGEX" \
--arg exclude_regex "$EXCLUDE_REGEX" \
--arg revision "$REVISION" \
'
.siblings[]
| select(
.rfilename != null
and ($include_regex == "" or (.rfilename | test($include_regex)))
and ($exclude_regex == "" or (.rfilename | test($exclude_regex) | not))
)
| [
($endpoint + "/" + $repo_id + "/resolve/" + $revision + "/" + .rfilename),
" dir=" + (.rfilename | split("/")[:-1] | join("/")),
" out=" + (.rfilename | split("/")[-1]),
if $token != "" then " header=Authorization: Bearer " + $token else empty end,
""
]
| join("\n")
'
else
printf "%s" "$RESPONSE" | jq -r \
--arg endpoint "$HF_ENDPOINT" \
--arg repo_id "$DOWNLOAD_API_PATH" \
--arg include_regex "$INCLUDE_REGEX" \
--arg exclude_regex "$EXCLUDE_REGEX" \
--arg revision "$REVISION" \
'
.siblings[]
| select(
.rfilename != null
and ($include_regex == "" or (.rfilename | test($include_regex)))
and ($exclude_regex == "" or (.rfilename | test($exclude_regex) | not))
)
| ($endpoint + "/" + $repo_id + "/resolve/" + $revision + "/" + .rfilename)
'
fi
}
result=$(process_with_jq)
printf "%s\n" "$result" > "$LOCAL_DIR/$fileslist_file"
else
printf "%b[Warning] jq not installed, using grep/awk for metadata json parsing (slower). Consider installing jq for better parsing performance.%b\n" "$YELLOW" "$NC"
process_with_grep_awk() {
local include_pattern=""
local exclude_pattern=""
local output=""
if ((${#INCLUDE_PATTERNS[@]})); then
include_pattern=$(printf '%s\n' "${INCLUDE_PATTERNS[@]}" | sed 's/\./\\./g; s/\*/.*/g' | paste -sd '|' -)
fi
if ((${#EXCLUDE_PATTERNS[@]})); then
exclude_pattern=$(printf '%s\n' "${EXCLUDE_PATTERNS[@]}" | sed 's/\./\\./g; s/\*/.*/g' | paste -sd '|' -)
fi
local files=$(printf '%s' "$RESPONSE" | grep -o '"rfilename":"[^"]*"' | awk -F'"' '{print $4}')
if [[ -n "$include_pattern" ]]; then
files=$(printf '%s\n' "$files" | grep -E "$include_pattern")
fi
if [[ -n "$exclude_pattern" ]]; then
files=$(printf '%s\n' "$files" | grep -vE "$exclude_pattern")
fi
while IFS= read -r file; do
if [[ -n "$file" ]]; then
if [[ "$TOOL" == "aria2c" ]]; then
output+="$HF_ENDPOINT/$DOWNLOAD_API_PATH/resolve/$REVISION/$file"$'\n'
output+=" dir=$(dirname "$file")"$'\n'
output+=" out=$(basename "$file")"$'\n'
[[ -n "$HF_TOKEN" ]] && output+=" header=Authorization: Bearer $HF_TOKEN"$'\n'
output+=$'\n'
else
output+="$HF_ENDPOINT/$DOWNLOAD_API_PATH/resolve/$REVISION/$file"$'\n'
fi
fi
done <<< "$files"
printf '%s' "$output"
}
result=$(process_with_grep_awk)
printf "%s\n" "$result" > "$LOCAL_DIR/$fileslist_file"
fi
else
printf "%bResume from file list: $LOCAL_DIR/$fileslist_file%b\n" "$GREEN" "$NC"
fi
# Perform download
printf "${YELLOW}Starting download with $TOOL to $LOCAL_DIR...\n${NC}"
cd "$LOCAL_DIR" || exit 1
if [[ "$TOOL" == "aria2c" ]]; then
aria2c --console-log-level=error --file-allocation=none -x "$THREADS" -j "$CONCURRENT" -s "$THREADS" -k 1M -c -i "$fileslist_file" --save-session="$fileslist_file"
elif [[ "$TOOL" == "wget" ]]; then
wget -x -nH --cut-dirs="$CUT_DIRS" ${HF_TOKEN:+--header="Authorization: Bearer $HF_TOKEN"} --input-file="$fileslist_file" --continue
fi
if [[ $? -eq 0 ]]; then
printf "${GREEN}Download completed successfully. Repo directory: $PWD\n${NC}"
else
printf "${RED}Download encountered errors.\n${NC}"
exit 1
fi
@Skrill2001

What is this problem, and how can it be solved? [DL:19MiB][#90b682 7.5GiB/10GiB(74%)][#2f1d57 1.6GiB/22GiB(7%)] 03/24 00:08:38 [ERROR] CUID#13 - Download aborted. URI=xxx Exception: [AbstractCommand.cc:351] errorCode=22 URI=xxx -> [HttpSkipResponseCommand.cc:239] errorCode=22 The response status is not successful. status=429

You were rate-limited for requesting too frequently.

Can this be mitigated by adjusting any command-line parameters? I have a lot of files to download, and I'm currently using the default settings.

@Guncuke

Guncuke commented Mar 26, 2025

Hello, I've tried three different endpoints and still can't pull the model. Why might that be?

./hfd.sh deepseek-ai/DeepSeek-V3-Base --local-dir ./DeepSeek-V3-Base
Fetching repo metadata...
cat: ./DeepSeek-V3-Base/.hfd/repo_metadata.json: No such file or directory
[Error] Failed to fetch metadata from https://hf-mirror.com/api/models/deepseek-ai/DeepSeek-V3-Base. HTTP status code: 000.

rm: cannot remove './DeepSeek-V3-Base/.hfd/repo_metadata.json': No such file or directory

@padeoe
Author

padeoe commented Mar 26, 2025

Delete the ./DeepSeek-V3-Base/.hfd/ directory and retry.

@falcon-xu

falcon-xu commented May 21, 2025

@padeoe Hello, the script currently fails to download any model: the target path contains only a .hfd folder and no model files. The output is:

$ ./hfd.sh microsoft/swin-base-patch4-window7-224-in22k  --local-dir ./swin-base-patch4-window7-224-in22k

Fetching repo metadata...
jq: error while loading shared libraries: libonig.so.5: cannot open shared object file: No such file or directory
Generating file list...
jq: error while loading shared libraries: libonig.so.5: cannot open shared object file: No such file or directory
Starting download with aria2c to ./swin-base-patch4-window7-224-in22k...
No files to download.
Download completed successfully. Repo directory: /data1/xuguanyu/code/checkpoint/swin-base-patch4-window7-224-in22k

I have tried the following, none of which helped:

  • Switching models
  • Switching to the backup endpoint
  • Deleting the .hfd folder
  • Switching to wget

How should I deal with this? Thanks!

@falcon-xu

Resolved after updating jq.

@Seas0

Seas0 commented Jun 19, 2025

The generate_command_string function seems to incorrectly save the HF_ENDPOINT environment variable as another HF_TOKEN, as shown at https://gist.github.com/padeoe/697678ab8e528b85a2a7bddafea1fa4f#file-hfd-sh-L99

@padeoe
Author

padeoe commented Jun 19, 2025

Thanks, fixed!

@zigerZZZ

(deepseek) shiny@cadd-cdob01-computer1:~/deepseek_local$ hfd.sh deepseek-ai/DeepSeek-R1-Distill-Qwen-32B --local-dir ./ds-Qwen
Fetching repo metadata...
cat: ./ds-Qwen/.hfd/repo_metadata.json: No such file or directory
[Error] Failed to fetch metadata from https://hf-mirror.com/api/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B. HTTP status code: 000.

rm: cannot remove './ds-Qwen/.hfd/repo_metadata.json': No such file or directory
(deepseek) shiny@cadd-cdob01-computer1:~/deepseek_local$ rm -rf ./ds-Qwen/.hfd/
(deepseek) shiny@cadd-cdob01-computer1:~/deepseek_local$ hfd.sh deepseek-ai/DeepSeek-R1-Distill-Qwen-32B --local-dir ./ds-Qwen
Fetching repo metadata...
cat: ./ds-Qwen/.hfd/repo_metadata.json: No such file or directory
[Error] Failed to fetch metadata from https://hf-mirror.com/api/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B. HTTP status code: 000.

rm: cannot remove './ds-Qwen/.hfd/repo_metadata.json': No such file or directory

Hello, I ran into this problem too; deleting the ./ds-Qwen/.hfd/ directory does not help.

@Harperrrr111

errorCode=1 SSL/TLS handshake failure: self-signed certificate in certificate chain. How can I fix this?

@Seas0

Seas0 commented Aug 13, 2025

Check your system certificates, or check whether your network requires authentication; alternatively, pass your downloader a flag that skips SSL certificate verification.
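
For example, both downloaders have flags that disable certificate verification (an invocation sketch; only do this on networks you trust, since it turns off TLS validation):

```shell
# aria2c: skip certificate checking when resuming from the generated list
aria2c --check-certificate=false -c -i .hfd/aria2c_urls.txt

# wget: the equivalent flag
wget --no-check-certificate --input-file=.hfd/wget_urls.txt --continue
```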

@Harperrrr111

Thanks! That solved it.

@xiotong0112

Regarding the 429 rate-limit error above: did you find a solution? I'm running into the same problem.

@padeoe
Author

padeoe commented Oct 18, 2025

@xiotong0112 The recent 429 errors are caused by stricter rate limiting upstream at HF. You can log in by passing --hf_username and --hf_token, which raises your rate-limit quota.
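
A minimal authenticated invocation looks like this (the username and token are placeholders; the token comes from https://huggingface.co/settings/tokens):

```shell
# Authenticate to raise the rate-limit quota; lowering -j also
# reduces concurrent request pressure (all values are placeholders)
./hfd.sh deepseek-ai/DeepSeek-R1-Distill-Qwen-32B \
  --hf_username YOUR_HF_USERNAME \
  --hf_token hf_XXXX \
  -x 4 -j 2
```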

@xiotong0112

Thanks for the reply. One more question: when a repository contains a very large number of files, the file list the repo API returns is incomplete. How can I download the complete set of files in that case?

@padeoe
Author

padeoe commented Oct 21, 2025 via email

@eternity123-null

eternity123-null commented Nov 3, 2025

Hello, I used hfd to download a dataset, but some files were never downloaded even though the command line reported the download as complete. Is there a fix? I found that repo_metadata.json itself is missing the entries for those files, and re-downloading in a different location produced the same incomplete repo_metadata.json. The dataset is behavior-1k/2025-challenge-demos.

@Leslie-Luo

The WeChat group QR code has expired; please update it.

@padeoe
Author

padeoe commented Nov 26, 2025

@Leslie-Luo Updated, thanks for the reminder.

@Gengsheng-Li

Where can I find the QR code?

@padeoe
Author

padeoe commented Dec 1, 2025

At the bottom of the hf-mirror.com homepage.

@Gengsheng-Li

One more question: when running ./hfd.sh Alibaba-NLP/Tongyi-DeepResearch-30B-A3B --tool aria2c -x 10 --local-dir ./models/Tongyi-DeepResearch-30B-A3B, several of the model's safetensors files failed with the error below and the download ultimately failed, although a few files did succeed. Re-running the same command to resume hit the same error. I couldn't find a solution anywhere. What is going on here?

[DL:318KiB][#902d79 668MiB/3.7GiB(17%)][#8aa93b 666MiB/3.7GiB(17%)][#2ae6fc 1.0GiB/3.7GiB(29%)][#aa131c 2.1GiB/3.7GiB(57%)][#162f45 2.4GiB/3.7GiB(66%)]                                                                                                  
12/01 00:51:15 [ERROR] CUID#147 - Download aborted. URI=https://hf-mirror.com/Alibaba-NLP/Tongyi-DeepResearch-30B-A3B/resolve/main/model-00007-of-00016.safetensors                                                                                     
Exception: [AbstractCommand.cc:351] errorCode=8 URI=https://cas-bridge.xethub.hf-mirror.com/xet-bridge-us/68c90a317d4233e576219173/f0d0c0d0c2329cad49af3662709be5646d23c0630360214a4490b67f01553ec7?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha25$=UNSIGNED-PAYLOAD&X-Amz-Credential=cas%2F20251201%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20251201T055114Z&X-Amz-Expires=3600&X-Amz-Signature=98cafa0ad9bf003ab5ef096aaef618c6795554779d7491c222c73cd1f6eb8df4&X-Amz-SignedHeaders=host&X-Xet-Cas-Uid$62171e3b6a99db28e0b3159d&response-content-disposition=inline%3B+filename*%3DUTF-8%27%27model-00007-of-00016.safetensors%3B+filename%3D%22model-00007-of-00016.safetensors%22%3B&x-id=GetObject&Expires=1764571874&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGl$biI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTc2NDU3MTg3NH19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2FzLWJyaWRnZS54ZXRodWIuaGYuY28veGV0LWJyaWRnZS11cy82OGM5MGEzMTdkNDIzM2U1NzYyMTkxNzMvZjBkMGMwZDBjMjMyOWNhZDQ5YWYzNjYyNzA5YmU1NjQ2ZDIzYzA2MzAzNjAyMTRhNDQ5MGI2$2YwMTU1M2VjNyoifV19&Signature=JvGp1IQ9VFbOIjh%7EEljxKbpIj5jl8yWWAPr4o5%7EywsxBtas5L2EgZi6cNWMAP5g2GFCZs9wPcNByLq5Xp0VMoYd%7EOc7x21sfBVLqYQZ-NszYC9WunpMzFDqebFg-MRUWCNu-CP9-F6-jj6GKinm5J3rCB3CacNMinb1jL47jAs7f0i2cI8vwcK1zoC1MMn7zDoqAFJTpxljdQlCbJDk8$InlIlx8QavuLYi3Kf15VIMMEl576gfr%7Er1LusJwlGx4EIBx%7EbuKdCo52TrEjy9FX1jazFcBE76NB2SpfDzknD4olrbqOR-1CV41z-uO1DwbDNM1BEP%7EJaT8OUuyIiVEow__&Key-Pair-Id=K2L8F4GPSG1IFC                                                                                    
  -> [HttpResponse.cc:86] errorCode=8 Invalid range header. Request: 3041079195-3501195263/3999975416, Response: 0-3999975415/3999975416

@Gengsheng-Li

Thanks! That was a quick reply, much appreciated.

@ranyev5

ranyev5 commented Dec 8, 2025

The QR code has expired.

@Littleor

Littleor commented Jan 4, 2026

While using hf-mirror recently, I ran into an IP limitation:

We had to rate limit your IP (162.159.*.*). To continue using our service, create a HF account or login to your existing account, and make sure you pass a HF_TOKEN if you're using the API.

Configuring HF_TOKEN resolves the rate limiting, but hfd.sh requires passing the corresponding arguments on every invocation, and hard-coding them in environment variables is not suitable in some setups. I therefore updated the script to persist HF_TOKEN and HF_USERNAME in a local file: https://gist.github.com/Littleor/a159f06d5a5ba533b7c84a83ddd69dc0/revisions

Usage:

$ ./hfd.sh login
Enter Hugging Face Username: ***
Enter Hugging Face Token: ***
Credentials saved to ~/.hfd_config

After that, models and datasets can be downloaded without extra arguments.

@dichen-cd

I ran into a similar situation: for some repos that use Xet, the domain becomes xethub.hf.co instead of huggingface.co. My guess is that hf-mirror doesn't cover this case yet.

@sonder718

Regarding the incomplete file list above: did you find a solution? I'm running into the same problem.

@luojiyin1987

Lines 87-88 of hfd.sh currently read:

cmd_string+=" HF_TOKEN=${HF_TOKEN:-}"
cmd_string+=" HF_TOKEN=${HF_ENDPOINT:-}"

Shouldn't they be:

cmd_string+=" HF_TOKEN=${HF_TOKEN:-}"
cmd_string+=" HF_ENDPOINT=${HF_ENDPOINT:-}"

@padeoe
Author

padeoe commented Jan 13, 2026

@luojiyin1987 Thanks for the reminder. The copy at https://hf-mirror.com/hfd/hfd.sh hadn't been updated to the latest version and does contain that typo; I'll update it today.

@RenovZ

RenovZ commented Mar 23, 2026

So this is effectively a proxy rather than a true mirror, right?

@padeoe
Author

padeoe commented Mar 24, 2026

This project is a download-acceleration script for Hugging Face.
