Where Does llama-cli -hf Save Hugging Face Models by Default

A quick note on where llama-cli -hf stores GGUF models downloaded from Hugging Face, and how to change the cache directory with LLAMA_CACHE or Hugging Face cache variables.

If you use llama-cli to download and run a model directly from Hugging Face, for example:

llama-cli -hf unsloth/gemma-4-E4B-it-GGUF

this uses the Hugging Face download support built into llama.cpp. Recent llama.cpp builds store models downloaded with -hf in the standard Hugging Face Hub cache directory.

Default cache locations

The cache location used by llama-cli -hf is controlled first by the LLAMA_CACHE environment variable. If LLAMA_CACHE is not set, llama.cpp falls back to the Hugging Face cache variables HF_HUB_CACHE, HUGGINGFACE_HUB_CACHE, and HF_HOME.
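The lookup order described above can be sketched as a small shell function. This is an illustration of the documented priority, not llama.cpp's actual code; the function name is made up for this example:

```shell
# Sketch of the cache-directory lookup order: LLAMA_CACHE wins,
# then the Hugging Face cache variables, then the hard-coded default.
resolve_cache_dir() {
    if [ -n "$LLAMA_CACHE" ]; then
        echo "$LLAMA_CACHE"
    elif [ -n "$HF_HUB_CACHE" ]; then
        echo "$HF_HUB_CACHE"
    elif [ -n "$HUGGINGFACE_HUB_CACHE" ]; then
        echo "$HUGGINGFACE_HUB_CACHE"
    elif [ -n "$HF_HOME" ]; then
        echo "$HF_HOME/hub"       # HF convention: Hub cache lives under $HF_HOME/hub
    else
        echo "$HOME/.cache/huggingface/hub"
    fi
}
```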

If none of those variables are set, common default paths are:

| System  | Default cache directory              |
|---------|--------------------------------------|
| Linux   | ~/.cache/huggingface/hub             |
| macOS   | ~/.cache/huggingface/hub             |
| Windows | %USERPROFILE%\.cache\huggingface\hub |

On Windows, %USERPROFILE% usually expands to:

C:\Users\<username>

So the default cache directory is roughly:

C:\Users\<username>\.cache\huggingface\hub
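To check what is already in the cache, you can point a couple of standard commands at that directory (Linux/macOS paths shown; `mkdir -p` is only there so the commands don't fail before the first download):

```shell
# Resolve the cache directory, honoring LLAMA_CACHE if it is set
CACHE_DIR="${LLAMA_CACHE:-$HOME/.cache/huggingface/hub}"
mkdir -p "$CACHE_DIR"             # no-op if it already exists
du -sh "$CACHE_DIR"               # total size of the cache
find "$CACHE_DIR" -name '*.gguf'  # list downloaded GGUF model files
```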

How to change the llama-cli cache directory

Set LLAMA_CACHE if you want to store the downloaded models on a specific disk or in a specific folder. You can also follow the Hugging Face convention and set HF_HOME; in that case, the Hub cache directory will be $HF_HOME/hub.
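For example, using the Hugging Face convention (the path here is illustrative):

```shell
# Set HF_HOME; the Hub cache directory then becomes $HF_HOME/hub
export HF_HOME=/data/hf
echo "$HF_HOME/hub"   # prints /data/hf/hub
```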

Temporary Windows CMD example:

set LLAMA_CACHE=D:\models\llama-cache
llama-cli -hf unsloth/gemma-4-E4B-it-GGUF

Temporary PowerShell example:

$env:LLAMA_CACHE="D:\models\llama-cache"
llama-cli -hf unsloth/gemma-4-E4B-it-GGUF

Temporary Linux / macOS example:

export LLAMA_CACHE=/data/models/llama-cache
llama-cli -hf unsloth/gemma-4-E4B-it-GGUF
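The examples above only last for the current session. To make the setting persistent, write it to your environment instead (a sketch; adjust the path and profile file to your setup):

```shell
# Linux/macOS: append the export to your shell profile
# (~/.bashrc shown; use ~/.zshrc if your shell is zsh)
echo 'export LLAMA_CACHE=/data/models/llama-cache' >> ~/.bashrc

# Windows (CMD): setx stores the variable for future sessions:
#   setx LLAMA_CACHE D:\models\llama-cache
```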

Summary

  • llama-cli -hf ... uses llama.cpp's built-in downloader; recent builds default to the Hugging Face Hub cache.
  • Linux / macOS default: ~/.cache/huggingface/hub
  • Windows default: %USERPROFILE%\.cache\huggingface\hub
  • To change the location, set LLAMA_CACHE, or set HF_HOME / HF_HUB_CACHE