Llama2 Transformer Model in Rust

This repository contains a Rust implementation of the Llama2 Transformer model, with a focus on performance and correctness. It covers model loading, tokenization, and the core operations of the transformer's forward pass, such as matrix multiplication and softmax.
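
As an illustration of the softmax step mentioned above, here is a minimal sketch of a numerically stable, in-place softmax. The function name `softmax` and its signature are assumptions made for this example and are not taken from the repository's source.

```rust
/// Numerically stable softmax, computed in place.
/// Hypothetical helper for illustration; not the repository's own function.
pub fn softmax(x: &mut [f32]) {
    if x.is_empty() {
        return;
    }
    // Subtract the maximum before exponentiating so exp() cannot overflow.
    let max = x.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let mut sum = 0.0f32;
    for v in x.iter_mut() {
        *v = (*v - max).exp();
        sum += *v;
    }
    // Normalize so the entries sum to 1.
    for v in x.iter_mut() {
        *v /= sum;
    }
}
```

Subtracting the maximum does not change the result mathematically, but it keeps large logits from producing infinities in f32.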

Features

Fast matrix multiplication with quantized tensors.
Efficient softmax implementation.
Custom tokenizer compatible with pre-trained models.
Memory-efficient operation utilizing memory mapping.
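
To make the first feature more concrete, the following is a rough sketch of matrix-vector multiplication over int8-quantized tensors. It assumes symmetric per-group quantization (one f32 scale per group of values), in the style of the quantized path in llama2.c. The names `QuantizedTensor`, `quantize`, and `matmul_q8` are hypothetical, and the repository's actual layout and kernels may differ.

```rust
/// Int8 values plus one f32 scale per group (assumed layout, for illustration).
pub struct QuantizedTensor {
    pub q: Vec<i8>,
    pub scales: Vec<f32>,
    pub group_size: usize,
}

/// Symmetric quantization: each group is scaled so its largest
/// absolute value maps to 127.
pub fn quantize(x: &[f32], group_size: usize) -> QuantizedTensor {
    assert!(group_size > 0 && x.len() % group_size == 0);
    let mut q = Vec::with_capacity(x.len());
    let mut scales = Vec::with_capacity(x.len() / group_size);
    for group in x.chunks(group_size) {
        let max_abs = group.iter().fold(0.0f32, |m, v| m.max(v.abs()));
        // Avoid dividing by zero for an all-zero group.
        let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
        scales.push(scale);
        q.extend(group.iter().map(|v| (v / scale).round() as i8));
    }
    QuantizedTensor { q, scales, group_size }
}

/// Computes out = W * x, where W is (rows x cols) in row-major order
/// and x has `cols` entries. Both operands are quantized.
pub fn matmul_q8(
    w: &QuantizedTensor,
    x: &QuantizedTensor,
    rows: usize,
    cols: usize,
) -> Vec<f32> {
    assert_eq!(w.group_size, x.group_size);
    assert_eq!(w.q.len(), rows * cols);
    assert_eq!(x.q.len(), cols);
    let gs = w.group_size;
    assert!(cols % gs == 0);
    let mut out = vec![0.0f32; rows];
    for r in 0..rows {
        let mut acc = 0.0f32;
        for g in 0..cols / gs {
            let w_off = r * cols + g * gs;
            let x_off = g * gs;
            // Accumulate the group's dot product in integer arithmetic.
            let mut isum: i32 = 0;
            for k in 0..gs {
                isum += w.q[w_off + k] as i32 * x.q[x_off + k] as i32;
            }
            // Rescale once per group using both operands' scales.
            acc += isum as f32 * w.scales[w_off / gs] * x.scales[g];
        }
        out[r] = acc;
    }
    out
}
```

The inner loop works purely on integers and applies the scales once per group, which is where most of the speedup over f32 arithmetic would come from. The result is an approximation of the f32 product, with error bounded by the quantization step.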

Getting Started

To get started with the Llama2 Transformer in Rust, clone the repository and build the project using Cargo.

Prerequisites

Rust programming language
Cargo package manager
A Llama 2 model converted to .bin format, following the llama2.c instructions at https://github.com/karpathy/llama2.c/tree/master?tab=readme-ov-file#metas-llama-2-models
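
For orientation, a checkpoint produced by the legacy float32 export of llama2.c starts with a header of seven little-endian 32-bit integers. The sketch below reads that header. It assumes the legacy layout (other export versions, including the quantized one, use a different header), and the names `Config` and `read_config` are chosen for this example rather than taken from this repository.

```rust
use std::io::{self, Read};

/// Header fields of a legacy llama2.c checkpoint (assumed layout).
#[derive(Debug, PartialEq)]
pub struct Config {
    pub dim: i32,
    pub hidden_dim: i32,
    pub n_layers: i32,
    pub n_heads: i32,
    pub n_kv_heads: i32,
    pub vocab_size: i32,
    pub seq_len: i32,
}

/// Reads the 28-byte header (7 x i32, little-endian) from any reader.
/// Returns an error if fewer than 28 bytes are available.
pub fn read_config<R: Read>(reader: &mut R) -> io::Result<Config> {
    let mut buf = [0u8; 28];
    reader.read_exact(&mut buf)?;
    // Decode the i-th 4-byte field as a little-endian i32.
    let field = |i: usize| {
        i32::from_le_bytes([buf[4 * i], buf[4 * i + 1], buf[4 * i + 2], buf[4 * i + 3]])
    };
    Ok(Config {
        dim: field(0),
        hidden_dim: field(1),
        n_layers: field(2),
        n_heads: field(3),
        n_kv_heads: field(4),
        vocab_size: field(5),
        seq_len: field(6),
    })
}
```

Passing a `std::fs::File` opened on the .bin file to `read_config` is a quick way to confirm that a conversion produced the expected model dimensions before running inference.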

Installation

Clone the repository, build it in release mode, and run the binary on a converted model:

git clone https://github.com/your-github-username/llama2-rs.git
cd llama2-rs
cargo build --release
./target/release/llama2_rs llama2.bin -n 100 -m "Once upon a time"

maderix/llama2.rs
