Using Pretrained Transformers Models

Installation

python3 -m venv env

source env/bin/activate

git clone https://github.com/huggingface/transformers.git

cd transformers

pip install -e .

Testing

python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I hate you'))"

This should print something like [{'label': 'NEGATIVE', 'score': 0.9991129040718079}]
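
The same check can be kept as a small script instead of a one-liner. This is only a sketch: the function names format_prediction and run_smoke_test are made up for illustration, and the first call downloads the default sentiment model, so it needs network access.

```python
def format_prediction(result):
    """Turn one pipeline result dict into a short 'LABEL (score)' string."""
    return f"{result['label']} ({result['score']:.4f})"


def run_smoke_test(text="I hate you"):
    """Run the default sentiment-analysis pipeline on one sentence and print the result."""
    # Imported here so the helper above can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")
    for result in classifier(text):
        print(format_prediction(result))
```

Calling run_smoke_test() from a Python session inside the activated environment runs the same test as the command above.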

Errors

If it reports that neither PyTorch nor TensorFlow is installed, run pip3 install torch torchvision
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on. This is a network problem.
Exception has occurred: OSError Unable to load weights from pytorch checkpoint file. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True. Fix: delete all files under ~/.cache/torch/transformers, then retry.
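
The first and third items above can also be checked from Python. This is a minimal sketch using only the standard library: installed_backends and clear_cache are illustrative names, and the default cache path is the one quoted above, which may differ in other transformers releases.

```python
import importlib.util
import shutil
from pathlib import Path


def installed_backends():
    """Return which of the two backends (torch, tensorflow) can be imported."""
    return [name for name in ("torch", "tensorflow")
            if importlib.util.find_spec(name) is not None]


def clear_cache(cache_dir="~/.cache/torch/transformers"):
    """Delete everything inside cache_dir; return the number of top-level entries removed."""
    cache_dir = Path(cache_dir).expanduser()
    if not cache_dir.is_dir():
        return 0  # nothing cached at this location
    removed = 0
    for entry in cache_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1
    return removed
```

An empty list from installed_backends() corresponds to the first error. After clear_cache(), the next model load downloads the files again.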

For network problems: download manually and load the pretrained weights locally

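https://huggingface.co/models lists every model that transformers supports. Select the model you want to download; bert-base-uncased is used as the example here. Open "Files and versions" to see the file list, find config.json, pytorch_model.bin and the vocab file, and click the download icon next to each one to save it into a folder named bert-base-uncased. config.json is the model configuration file, pytorch_model.bin holds the pretrained weights, and the vocab file is the vocabulary.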
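
Once the files are in place, passing the folder path to from_pretrained loads them without touching the network. The sketch below checks that the folder is complete first. It assumes the vocab file is named vocab.txt and that the folder sits in the current directory. missing_files and load_local_bert are illustrative names, not part of transformers.

```python
from pathlib import Path

# Assumed file names; vocab.txt in particular is an assumption about this model's file list.
REQUIRED_FILES = ("config.json", "pytorch_model.bin", "vocab.txt")


def missing_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in required if not (model_dir / name).is_file()]


def load_local_bert(model_dir="./bert-base-uncased"):
    """Load the tokenizer and model from a local folder instead of downloading them."""
    missing = missing_files(model_dir)
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")
    # Imported here so the folder check works even without transformers installed.
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(str(model_dir))
    model = BertModel.from_pretrained(str(model_dir))
    return tokenizer, model
```

Checking the folder up front gives a clearer message than the connection error above when a file was skipped during the manual download.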

How to run Jupyter Notebook inside a virtual environment

Reference: https://www.jianshu.com/p/0432155d1bef

Install the ipykernel package in the virtual environment

pip3 install ipykernel

Activate the environment, then register it as a Jupyter kernel

python -m ipykernel install --name env
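
The registration step can also be scripted, for example when setting up several environments. This is only a sketch: kernel_install_command and register_kernel are illustrative names, and adding --user (which installs the kernel for the current user only) is an assumption that is not part of the command above.

```python
import subprocess
import sys


def kernel_install_command(name, display_name=None, user=True):
    """Build the 'python -m ipykernel install' command for the running interpreter."""
    cmd = [sys.executable, "-m", "ipykernel", "install", "--name", name]
    if display_name:
        cmd += ["--display-name", display_name]
    if user:
        # Assumption: a per-user install avoids needing write access to system directories.
        cmd.append("--user")
    return cmd


def register_kernel(name="env", display_name=None):
    """Register the current environment as a Jupyter kernel; raises if the command fails."""
    subprocess.run(kernel_install_command(name, display_name), check=True)
```

Run register_kernel() with the virtual environment's Python so that sys.executable points at the environment being registered.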