@davidmezzetti
Last active June 8, 2026 20:07
Results from txtai's benchmark script. All vectors were generated using all-MiniLM-L6-v2.

Index and search times are similar for all methods. For larger sources (like FiQA), turbovec index time is lower because no IVF training is required.
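As a rough sanity check on the disk sizes reported here, the raw vector payload can be estimated from the bit width alone. This is a minimal sketch, not part of the benchmark script: it assumes 384-dimensional vectors (the output size of all-MiniLM-L6-v2), treats 1 MB as 10^6 bytes, and ignores index overhead such as ids, IVF centroids, or quantization metadata. The function name `estimated_size_mb` and the corpus size are hypothetical.

```python
def estimated_size_mb(count: int, dimensions: int = 384, bits: int = 32) -> float:
    """Raw vector payload in MB (10^6 bytes), ignoring any index overhead."""
    return count * dimensions * bits / 8 / 1_000_000


if __name__ == "__main__":
    count = 10_000  # hypothetical corpus size, not taken from the benchmark
    for bits in (32, 4, 2):
        print(f"{bits:>2} bit: {estimated_size_mb(count, bits=bits):.2f} MB")
```

Actual on-disk sizes will differ from these estimates because each index format stores additional structures alongside the vectors.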

ArguAna

| Method | Disk (MB) | NDCG_10 |
| --- | --- | --- |
| Faiss IVF | 13.7 | 0.4761 |
| Faiss IVF SQ4 | 2.3 | 0.4739 |
| turbovec 4 bit | 2.0 | 0.4789 |
| turbovec 2 bit | 1.2 | 0.4809 |

FiQA

| Method | Disk (MB) | Index Time (s) | NDCG_10 |
| --- | --- | --- | --- |
| Faiss IVF | 88.7 | 46.0 | 0.3522 |
| Faiss IVF SQ4 | 13.1 | 45.7 | 0.3536 |
| turbovec 4 bit | 11.9 | 41.8 | 0.3595 |
| turbovec 2 bit | 6.5 | 42.9 | 0.3499 |

NFCorpus

| Method | Disk (MB) | NDCG_10 |
| --- | --- | --- |
| Faiss IVF | 5.5 | 0.3089 |
| Faiss IVF SQ4 | 0.7 | 0.3103 |
| turbovec 4 bit | 0.7 | 0.3083 |
| turbovec 2 bit | 0.4 | 0.3045 |

SciDocs

| Method | Disk (MB) | NDCG_10 |
| --- | --- | --- |
| Faiss IVF | 40.7 | 0.2158 |
| Faiss IVF SQ4 | 7.0 | 0.2150 |
| turbovec 4 bit | 6.2 | 0.2186 |
| turbovec 2 bit | 3.8 | 0.2104 |

SciFact

| Method | Disk (MB) | NDCG_10 |
| --- | --- | --- |
| Faiss IVF | 8.1 | 0.5740 |
| Faiss IVF SQ4 | 1.3 | 0.6465 |
| turbovec 4 bit | 1.1 | 0.6392 |
| turbovec 2 bit | 0.6 | 0.5685 |