Set Hugging Face/sentence-transformers model path
When I'm running sessions at conferences, I usually like to keep everything on the cloud in Google Colab so we don't have to worry about downloading models over awful conference wifi. But sometimes we can't even trust the internet enough for Colab!
The solution
The code below will put everything downloaded from Hugging Face into a folder called model-data.
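import os

# Set these before importing transformers or sentence_transformers,
# since the cache location is read when those libraries are imported.
os.environ['HF_HOME'] = os.path.join(os.getcwd(), "model-data", "hub")
os.environ['SENTENCE_TRANSFORMERS_HOME'] = os.path.join(os.getcwd(), "model-data", "sentence-transformers")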
Sometimes people suggest just loading the model directly from transformers: don't fall for it!!! I personally feel it's a lot more fragile than the alternative: setting HF_HOME and having absolutely everything Hugging Face-related live in the same directory. That way datasets, models, everything will fall into line.
If you're using sentence-transformers, HF_HOME won't have any effect. You'll need to set SENTENCE_TRANSFORMERS_HOME instead. The chances are pretty good that you'll be using both, so I just cut-and-paste both lines each time.
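If cutting and pasting gets old, the two lines can be wrapped in a small helper. This is only a sketch: use_local_model_dir is a made-up name, not part of any library, and the base_dir parameter and folder creation are my own additions. It assumes you call it before importing transformers or sentence_transformers.

```python
import os
from pathlib import Path


def use_local_model_dir(base_dir="model-data"):
    """Point the Hugging Face and sentence-transformers caches at base_dir.

    Hypothetical helper (not a library function). Call it before importing
    transformers or sentence_transformers so the variables are already set.
    """
    base = Path(base_dir).resolve()
    paths = {
        "HF_HOME": base / "hub",
        "SENTENCE_TRANSFORMERS_HOME": base / "sentence-transformers",
    }
    for name, path in paths.items():
        # Create the folder up front so it is easy to spot and copy around
        path.mkdir(parents=True, exist_ok=True)
        os.environ[name] = str(path)
    return paths


# Same layout as the two lines above: ./model-data/hub and
# ./model-data/sentence-transformers
paths = use_local_model_dir()
for name, path in paths.items():
    print(f"{name} -> {path}")
```

Printing the paths is a quick sanity check that downloads will land in the folder you expect before you are stuck on conference wifi.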