Using Transformers Pretrained Models

Installation

python3 -m venv env

source env/bin/activate

  • To leave the virtual environment, run deactivate

  • Installing from source:

git clone https://github.com/huggingface/transformers.git

cd transformers

pip install -e .

Test

python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I hate you'))"

This should print [{'label': 'NEGATIVE', 'score': 0.9991129040718079}]
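The same check can be written as a small script, which also shows the shape of the result: a list with one dict per input, each holding a label and a score. This is a minimal sketch, not part of the library. The helper names summarize and run_check are hypothetical, and run_check assumes that transformers and a backend such as PyTorch are installed and that the default model can be downloaded.

```python
def summarize(results):
    """Turn pipeline output (a list of dicts) into 'LABEL (score)' strings."""
    return [f"{item['label']} ({item['score']:.4f})" for item in results]


def run_check(text="I hate you"):
    """Run the sentiment pipeline on one sentence (hypothetical helper)."""
    # Imported lazily so summarize() can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")
    return summarize(classifier(text))


# The sample below is the output quoted in the text, not a fresh model run.
sample = [{"label": "NEGATIVE", "score": 0.9991129040718079}]
print(summarize(sample))  # prints ['NEGATIVE (0.9991)']
```

Calling run_check() inside the activated environment performs the real test; the first call downloads the default model, so it needs a working network connection.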

Errors

  • If it reports that neither PyTorch nor TensorFlow is installed, run pip3 install torch torchvision
  • ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on. This is a network problem: the model files could not be downloaded.
  • Exception has occurred: OSError Unable to load weights from pytorch checkpoint file. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True. Fix: delete all files under ~/.cache/torch/transformers and try again.
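Two of the fixes above can be scripted. The following is a rough sketch under stated assumptions: installed_backends and clear_cache are hypothetical helper names, and the default cache path is the one quoted in the last bullet. Other transformers releases may keep their cache elsewhere, which is why the path is a parameter.

```python
import importlib.util
import shutil
from pathlib import Path


def installed_backends():
    """Report whether torch and tensorflow can be found, without importing them."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in ("torch", "tensorflow")
    }


def clear_cache(cache_dir="~/.cache/torch/transformers"):
    """Delete everything inside cache_dir and return how many entries were removed."""
    folder = Path(cache_dir).expanduser()
    if not folder.is_dir():
        return 0  # nothing is cached at this location
    removed = 0
    for entry in folder.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a sub-folder and its contents
        else:
            entry.unlink()  # remove a plain file or a symlink
        removed += 1
    return removed


print(installed_backends())
# clear_cache() is deliberately not called here: run it only when the
# "Unable to load weights" OSError appears, then retry the download.
```

Deleting the cache forces the next from_pretrained or pipeline call to download the files again, so it only helps when the cached copy is incomplete or corrupted.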

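Working around network problems: download manually and load the pretrained weights locally

The site https://huggingface.co/models lists all the models that transformers supports; pick the one you want to download. Take bert-base-uncased as an example. Open Files and versions to see the file list, find config.json, pytorch_model.bin and the vocab file (vocab.txt for this model), click the download icon next to each, and save them into a folder named bert-base-uncased. config.json is the model configuration file, pytorch_model.bin holds the pretrained weights, and the vocab file is the vocabulary.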
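Once the files are in place, the folder path can be passed to from_pretrained instead of the model name. The following is a minimal sketch: missing_files and load_local_bert are hypothetical helpers, the folder name bert-base-uncased follows the example above, and the vocabulary file is assumed to be named vocab.txt.

```python
from pathlib import Path

# Files the text says to download; vocab.txt is the assumed vocabulary file name.
REQUIRED_FILES = ("config.json", "pytorch_model.bin", "vocab.txt")


def missing_files(model_dir):
    """Return the required files that are absent from model_dir."""
    folder = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


def load_local_bert(model_dir="bert-base-uncased"):
    """Load tokenizer and model from a local folder instead of the network."""
    missing = missing_files(model_dir)
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")
    # Imported lazily so the file check works even without transformers installed.
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_dir)
    model = BertModel.from_pretrained(model_dir)
    return tokenizer, model


# Lists whatever still needs to be downloaded into the folder.
print(missing_files("bert-base-uncased"))
```

When the printed list is empty, tokenizer, model = load_local_bert() should load everything from disk without contacting the network.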

How to run Jupyter Notebook inside the virtual environment

Reference: https://www.jianshu.com/p/0432155d1bef

Install the ipykernel package in the virtual environment

pip3 install ipykernel

Register the environment as a Jupyter kernel

python -m ipykernel install --name env
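After selecting the env kernel in a notebook, it can be worth confirming that the cell really runs on the virtual environment's interpreter. The following is a small sketch using only the standard library; in_virtualenv is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
```

If the second line reports False, the notebook is probably still using the default kernel rather than the one registered above.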