uol-llama2

This repository provides a simple way to deploy llama-2 models with Docker:

First, make sure Docker is installed correctly. You can check by running:

docker run hello-world

If this fails, please refer to the official Docker installation tutorial.
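Since running the models needs GPU access (see the Hints below), it can also help to confirm that the NVIDIA driver and GPUs are visible on the host, for example:

nvidia-smi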

Clone the repository and play:

git clone --recursive git@github.com:YD-19/uol-llama2.git
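Then change into the repository root (the directory containing the Makefile, assuming it sits at the top level of the clone):

cd uol-llama2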

Simply run the following two commands:

make build

make start

Then, start chatting.
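For orientation, these two targets presumably wrap ordinary Docker commands along the following lines. This is only a sketch: the image name and flags are assumptions for illustration, not taken from the repository's Makefile.

# Hypothetical equivalents of the two targets (image name and flags assumed)
docker build -t uol-llama2 .
docker run --rm -it --gpus all uol-llama2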

Hints:

  1. You must have at least 1 GPU on your machine for 7b models, 2 GPUs for 13b models, and 8 GPUs for 30b models.
  2. If it is your first time running the model, download the model weights and checklist from Meta with the following command:

make build Download=true

  3. If you want to change the model, simply rebuild with the following command; for example, to use the 13b model:

make build model=13b-chat

  4. If CUDA runs out of memory, you can reduce the maximum sequence length with the following command:

make build seq_len=128

If using an RTX 4090, you can set seq_len=4096 for the 7b-chat model. These options can also be combined, as shown below.
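For example (a hypothetical invocation, assuming the Makefile accepts several variables in one call):

make build Download=true model=13b-chat seq_len=128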
