Set Hugging Face/sentence-transformers model path

When I'm running sessions at conferences, I usually like to keep everything in the cloud on Google Colab so we don't have to worry about downloading models over awful conference wifi. But sometimes we can't even trust the internet enough for Colab!

The solution

The code below will put everything downloaded from Hugging Face into a folder called model-data inside the current working directory.

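import os

# Set these before importing any Hugging Face library,
# so the cache paths are picked up when the library loads.
os.environ['HF_HOME'] = os.path.join(os.getcwd(), "model-data", "hub")
os.environ['SENTENCE_TRANSFORMERS_HOME'] = os.path.join(os.getcwd(), "model-data", "sentence-transformers")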

Sometimes people suggest just loading the model directly from transformers: don't fall for it! I personally find that a lot more fragile than the alternative, which is setting HF_HOME and having absolutely everything Hugging Face-related live in the same directory. That way datasets, models, and everything else fall into line.

If you're using sentence-transformers, HF_HOME won't have any effect. You'll need to set SENTENCE_TRANSFORMERS_HOME instead. Chances are pretty good that you'll be using both, so I just copy and paste both lines each time.
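If pasting both lines into every notebook gets old, they can be wrapped in a small helper. This is only a sketch: the function name use_local_model_dir and its base parameter are made up for illustration, and the folder layout simply mirrors the two lines above.

```python
import os


def use_local_model_dir(base="model-data"):
    """Point both cache variables at subfolders of `base` in the working directory.

    Hypothetical helper (not part of any library): it only sets environment
    variables and returns the paths it chose.
    """
    root = os.path.join(os.getcwd(), base)
    paths = {
        "HF_HOME": os.path.join(root, "hub"),
        "SENTENCE_TRANSFORMERS_HOME": os.path.join(root, "sentence-transformers"),
    }
    for name, path in paths.items():
        os.environ[name] = path
    return paths


# Call this first, then import transformers / sentence_transformers afterwards.
paths = use_local_model_dir()
for name, path in paths.items():
    print(f"{name} -> {path}")
```

The one thing to watch is ordering: call the helper at the very top of the notebook, before any Hugging Face import, just like the pasted lines.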