The RWKV Language Model (and my LM tricks)

RWKV: Parallelizable RNN with Transformer-level LLM Performance (pronounced as "RwaKuv", from 4 major params: R W K V)

RWKV is an RNN with Transformer-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). You only need the hidden state at position t to compute the state at position t+1. You can use the "GPT" mode to quickly compute the hidden state for the "RNN" mode. So it's combining the best of RNN and transformer: great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding (using the final hidden state).

Raven 14B (finetuned on Alpaca+ShareGPT+...) Demo:

RWKV-4-World is the best model: generation & chat & code in 100+ world languages, with the best English zero-shot & in-context learning ability too.

Quick start with the rwkv pip package (the token IDs passed to forward() are example inputs for 20B_tokenizer.json):

```python
import os
os.environ["RWKV_CUDA_ON"] = '0' # if '1' then use CUDA kernel for seq mode (much faster)
from rwkv.model import RWKV # pip install rwkv

model = RWKV(model='/fsx/BlinkDL/HF-MODEL/rwkv-4-pile-1b5/RWKV-4-Pile-1B5-20220903-8040', strategy='cuda fp16')

out, state = model.forward([187, 510, 1563, 310, 247], None) # use 20B_tokenizer.json
print(out.detach().cpu().numpy()) # get logits

out, state = model.forward([187, 510], None)
out, state = model.forward([1563], state) # RNN has state (use deepcopy if you want to clone it)
out, state = model.forward([310, 247], state)
print(out.detach().cpu().numpy()) # same result as above
```

Cool Community RWKV Projects (check them!):
- Fast CPU/cuBLAS/CLBlast inference: int4/int8/fp16/fp32:
- Fastest GPU inference API with vulkan (good for nvidia/amd/intel):
- RWKV in 150 lines (model, inference, text generation):
- RWKV introduction, and in 100 lines of numpy:
- A cool paper (Spiking Neural Network) using RWKV:
- More RWKV projects:
- Join Our Discord: (lots of developers)

You are welcome to join the RWKV discord to build upon it.
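The state-carrying trick in the snippet above (split a sequence, pass the state along, get the same logits) can be sketched with a toy recurrence in plain numpy. This is only an illustration of the RNN-mode idea, not the actual RWKV time-mix/channel-mix equations; the decay constant and tanh nonlinearity here are invented for the demo:

```python
import numpy as np

# Toy recurrence: the output at step t depends only on the input x_t and the
# carried state from step t-1. (Illustrative sketch, not RWKV's real update.)
def step(x, state, decay=0.9):
    state = decay * state + x          # fold the new input into the state
    return np.tanh(state), state       # (output, new state)

def run(xs, state=0.0):
    outs = []
    for x in xs:
        out, state = step(x, state)
        outs.append(out)
    return outs, state

xs = [0.5, -1.0, 2.0, 0.25]

# Process the whole sequence in one call...
full, _ = run(xs)

# ...or split it and carry the state across the calls: same outputs.
a, state = run(xs[:2])
b, _ = run(xs[2:], state)
assert np.allclose(full, a + b)
```

Because each step reads only the carried state, inference cost per token is constant regardless of how much context came before, which is the point of the "infinite" ctx_len claim.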