When you try out a lot of LLMs on your local machine, you inevitably start running out of disk space.
Just thought I'd write down a bash one-liner I made to list all the models in my llama.cpp cache directory, along with their sizes in GiB, sorted by size.
Figuring out which models are taking up the space can be fairly annoying, since larger models are split into multiple parts and every file for every model lives in the same flat directory.
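For example, a single large model can show up as several files like these (illustrative part counts; the flat names follow llama.cpp's org_repo_file download scheme, as in the sample output below):

ggml-org_gpt-oss-120b-GGUF_gpt-oss-120b-mxfp4-00001-of-00003.gguf
ggml-org_gpt-oss-120b-GGUF_gpt-oss-120b-mxfp4-00002-of-00003.gguf
ggml-org_gpt-oss-120b-GGUF_gpt-oss-120b-mxfp4-00003-of-00003.gguf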
Here's the script:
for i in $(ls ~/.cache/llama.cpp/*.gguf | sed -re 's/(.*)(-[0-9]+-of-[0-9]+)/\1/' | sed -re 's/\.gguf$//' | uniq); do echo "$(du -sk "${i}"* | awk '{total += $1} END {print total / 1024 / 1024}') GiB: $(basename "${i}")"; done | sort -n
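If the one-liner is hard to parse, here's the same pipeline spread out with comments (functionally equivalent; assumes GNU sed and du, like the one-liner itself):

#!/usr/bin/env bash
# List llama.cpp cached models by total size, smallest first.
for i in $(
  ls ~/.cache/llama.cpp/*.gguf |
    # drop the "-00001-of-00003" suffix that split models carry
    sed -re 's/(.*)(-[0-9]+-of-[0-9]+)/\1/' |
    # drop the extension
    sed -re 's/\.gguf$//' |
    # the parts of one model are now identical adjacent lines
    # (ls sorts its output), so uniq keeps one prefix per model
    uniq
); do
  # du -sk prints each matching part's size in 1K blocks;
  # awk sums them and converts KiB -> GiB
  size=$(du -sk "${i}"* | awk '{total += $1} END {print total / 1024 / 1024}')
  echo "${size} GiB: $(basename "${i}")"
done | sort -n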
Here's some sample output on my system:
0.255459 GiB: nomic-ai_nomic-embed-text-v1.5-GGUF_nomic-embed-text-v1.5.f16
0.369465 GiB: unsloth_Qwen3-0.6B-GGUF_Qwen3-0.6B-Q4_K_M
0.595261 GiB: DevQuasar_Qwen.Qwen3-Reranker-0.6B-GGUF_Qwen.Qwen3-Reranker-0.6B.Q8_0
0.595261 GiB: Qwen_Qwen3-Embedding-0.6B-GGUF_Qwen3-Embedding-0.6B-Q8_0
0.79884 GiB: unsloth_gemma-3-27b-it-qat-GGUF_mmproj-F16
0.817757 GiB: unsloth_Magistral-Small-2509-GGUF_mmproj-F16
1.70845 GiB: mradermacher_zerank-1-small-GGUF_zerank-1-small.Q8_0
1.76443 GiB: Qwen_Qwen2.5-Coder-1.5B-Instruct-GGUF_qwen2.5-coder-1.5b-instruct-q8_0
3.36775 GiB: Qwen_Qwen2.5-Coder-3B-Instruct-GGUF_qwen2.5-coder-3b-instruct-q8_0
7.54235 GiB: Qwen_Qwen2.5-Coder-7B-Instruct-GGUF_qwen2.5-coder-7b-instruct-q8_0
11.2779 GiB: ggml-org_gpt-oss-20b-GGUF_gpt-oss-20b-mxfp4
13.3495 GiB: unsloth_Magistral-Small-2509-GGUF_Magistral-Small-2509-Q4_K_M
14.5454 GiB: unsloth_gemma-3-27b-it-qat-GGUF_gemma-3-27b-it-qat-Q4_0
20.2676 GiB: mradermacher_Seed-OSS-36B-Instruct-abliterated-GGUF_Seed-OSS-36B-Instruct-abliterated.Q4_K_M
23.3341 GiB: mistralai_Devstral-Small-2507_gguf_Devstral-Small-2507-Q8_0
26.4785 GiB: unsloth_GLM-4.5-Air-GGUF_Q5_K_S_GLM-4.5-Air-Q5_K_S
30.253 GiB: unsloth_Qwen3-30B-A3B-Instruct-2507-GGUF_Qwen3-30B-A3B-Instruct-2507-Q8_0
30.253 GiB: unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF_Qwen3-Coder-30B-A3B-Instruct-Q8_0
31.9117 GiB: unsloth_granite-4.0-h-small-GGUF_granite-4.0-h-small-Q8_0
48.5862 GiB: lovedheart_GLM-4.5-Air-GGUF-IQ1_M_MXFP4_S_GLM-4.5-Air-MXFP4_MOE_S
58.0141 GiB: lovedheart_GLM-4.5-Air-GGUF-IQ1_M_MXFP4_Max_GLM-4.5-Air-MXFP4_MOE_Max
58.7334 GiB: bartowski_ArliAI_GLM-4.5-Air-Derestricted-GGUF_ArliAI_GLM-4.5-Air-Derestricted-IQ4_NL_ArliAI_GLM-4.5-Air-Derestricted-IQ4_NL
59.0341 GiB: ggml-org_gpt-oss-120b-GGUF_gpt-oss-120b-mxfp4