Deploying gpt3_based_demo on a server
Environment
Check the current environment with uname --all:

```
Linux 164 4.15.0-142-generic #146~16.04.1-Ubuntu SMP Tue Apr 13 09:27:15 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
```
Steps
Install conda

```bash
wget [Anaconda download link]
```
Run conda from the command line
It did not work, so add Anaconda to PATH and persist it in ~/.bashrc:

```bash
echo 'export PATH="$HOME/anaconda3/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
```
Finding a usable open-source demo
Intro
OPT: Open Pre-trained Transformer Language Models
Quoting the first two paragraphs of the official paper:

Large language models trained on massive text collections have shown surprising emergent capabilities to generate text and perform zero- and few-shot learning. While in some cases the public can interact with these models through paid APIs, full model access is currently limited to only a few highly resourced labs. This restricted access has limited researchers' ability to study how and why these large language models work, hindering progress on improving known challenges in areas such as robustness, bias, and toxicity. We present Open Pretrained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters, which we aim to fully and responsibly share with interested researchers.

We train the OPT models to roughly match the performance and sizes of the GPT-3 class of models, while also applying the latest best practices in data collection and efficient training. Our aim in developing this suite of OPT models is to enable reproducible and responsible research at scale, and to bring more voices to the table in studying the impact of these LLMs. Definitions of risk, harm, bias, and toxicity, etc., should be articulated by the collective research community as a whole, which is only possible when models are available for study.

OPT is available as a module in the Transformers library; the linked page contains example code.
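The example code from that page was not copied into these notes; as a rough sketch, loading OPT through the Transformers pipeline API looks roughly like the following. The checkpoint name facebook/opt-125m (the smallest OPT), the prompt, and max_length are illustrative choices, not taken from the original page.

```python
from transformers import pipeline

# Text generation with the smallest OPT checkpoint (illustrative choice;
# larger checkpoints such as facebook/opt-1.3b follow the same pattern).
generator = pipeline("text-generation", model="facebook/opt-125m")
print(generator("Large language models are", max_length=30)[0]["generated_text"])
```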
Intro
Megatron-GPT 1.3B is a transformer-based language model. GPT refers to a class of transformer decoder-only models similar to GPT-2 and 3 while 1.3B refers to the total trainable parameter count (1.3 Billion) [1, 2]. It has Tensor Parallelism (TP) of 1, Pipeline Parallelism (PP) of 1 and should fit on a single NVIDIA GPU.
This model was trained with NeMo Megatron.
Intro
No longer maintained since August 2021.
The newer version is GPT-NeoX.
The new framework is based on NVIDIA Megatron-LM and has been augmented with techniques from DeepSpeed as well as some novel optimizations. (It seems training has to be done from scratch? See the bold text in the README.)
Info
An introductory article: GLM-130B, an open-source bilingual pre-trained model
GLM-130B is a bilingual (Chinese and English) bidirectional language model with 130 billion parameters. Its underlying architecture is the General Language Model (GLM), which differs from GPT-3, and it was pre-trained on over 400 billion text tokens. GLM-130B uses autoregressive blank infilling as its main pre-training objective: taking the example sentence in that article's Figure 4, it masks random contiguous text spans (e.g., "complete unknown") and predicts them autoregressively.
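To make the blank-infilling objective concrete, here is a small self-contained Python sketch of the idea (a conceptual illustration only, not GLM code): a random contiguous span is cut out of the input, replaced by a [MASK] placeholder, and the removed tokens become the autoregressive prediction target.

```python
import random

def blank_infill_example(tokens, span_len=2, seed=0):
    """Toy illustration of GLM-style autoregressive blank infilling."""
    rng = random.Random(seed)
    start = rng.randrange(0, len(tokens) - span_len + 1)
    span = tokens[start:start + span_len]                    # the blank to fill
    corrupted = tokens[:start] + ["[MASK]"] + tokens[start + span_len:]
    # Training objective: predict the span tokens left to right,
    # conditioned on the corrupted context.
    return corrupted, span

corrupted, target = blank_infill_example(
    "like a complete unknown , like a rolling stone".split()
)
print("input :", corrupted)   # the sentence with one span replaced by [MASK]
print("target:", target)      # the tokens the model must generate
```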
Summary
Overall, OPT looks the most similar to GPT; further digging shows that the Transformers library already includes the OPT models.
NVIDIA's nemo-megatron-gpt-1.3B has fairly detailed deployment documentation, but it is still not clear what role Megatron plays.
Check whether the local machine meets the demo's requirements (a combined check is sketched after the items below)
Check server memory
Check the Python version
Check the GPU and driver
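The exact commands used for these checks were not recorded here; a small Python sketch that performs equivalent checks (total memory from /proc/meminfo, the interpreter version, and nvidia-smi for the GPU and driver) might look like the following. It assumes a Linux host with nvidia-smi on the PATH.

```python
import subprocess
import sys

# Interpreter version (Transformers requires Python 3.6+).
print("Python:", sys.version.split()[0])

# Total server memory, read from /proc/meminfo (Linux only).
with open("/proc/meminfo") as f:
    for line in f:
        if line.startswith("MemTotal"):
            print(line.strip())
            break

# GPU and driver information via nvidia-smi (assumes the NVIDIA driver is installed).
print(subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout)
```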
Install PyTorch
According to the Transformers installation docs:
Transformers is tested on Python 3.6+, PyTorch 1.1.0+, TensorFlow 2.0+, and Flax.
Install the PyTorch build that matches the CUDA version:

```
CUDA 10.2
```
Verify PyTorch:

```python
import torch
print(torch.cuda.is_available())  # typical check that the CUDA build sees the GPU
```

Output:
Install Transformers

```bash
git clone https://github.com/huggingface/transformers.git
pip install -e ./transformers  # editable install from the cloned source
```
Run the model
Copy the Transformers example, add a print of the generated output, move the model and the input tensors onto the GPU, and save it as 0_test.py (a fuller sketch follows below):

```python
from transformers import AutoTokenizer, OPTForCausalLM
```
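Only the first line of 0_test.py was captured above. A minimal sketch of what the full script could look like, following the Transformers OPT example plus the changes described (move the model and inputs to the GPU, print the output); the checkpoint name facebook/opt-1.3b, the prompt, and max_length are illustrative assumptions, not copied from the actual script:

```python
# 0_test.py -- sketch; the checkpoint, prompt, and max_length are assumed values
import torch
from transformers import AutoTokenizer, OPTForCausalLM

model_name = "facebook/opt-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = OPTForCausalLM.from_pretrained(model_name)

# Move the model and the input tensors onto the GPU (fall back to CPU if unavailable).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

prompt = "Hey, are you conscious? Can you talk to me?"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Generate a continuation and print the decoded text.
generated_ids = model.generate(inputs.input_ids, max_length=30)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])
```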
Run it:

```bash
python 0_test.py
```

Output: