# So you want to run GPT-J-6B using HuggingFace+FastAPI on a local rig (3090 or TITAN) ... tricky.
# Special help from the Kolob Colab server https://colab.research.google.com/drive/1VFh5DOkCJjWIrQ6eB82lxGKKPgXmsO5D?usp=sharing#scrollTo=iCHgJvfL4alW
# Conversion to HF format (12.6 GB tar image) found at https://drive.google.com/u/0/uc?id=1NXP75l1Xa5s9K18yf3qLoZcR6p4Wced1&export=download
# Uses gdown to fetch the image.
# You will need 26 GB of free space: 12+ GB for the tar and 12+ GB expanded (you can nuke the tar after expansion).
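# A minimal sketch of the download-and-expand step described above. Assumptions:
# the Google Drive file id from the link above is still live (it wasn't for me,
# see the note below), gdown is installed (pip install gdown), and the output
# filenames are illustrative.

```python
import os
import tarfile

DRIVE_ID = "1NXP75l1Xa5s9K18yf3qLoZcR6p4Wced1"  # file id from the Drive link above

def download_checkpoint(output="gpt-j-6b-hf.tar"):
    # gdown handles the Google Drive confirmation step for large files.
    import gdown  # pip install gdown
    gdown.download(id=DRIVE_ID, output=output)
    return output

def expand_checkpoint(tar_path, dest="gpt-j-6b-hf"):
    # Unpacks the ~12.6 GB tar, then removes it to free the ~12 GB it occupies.
    with tarfile.open(tar_path) as tar:
        tar.extractall(dest)
    os.remove(tar_path)  # nuke the tar after expansion
    return dest
```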
# HPPH: Not sure where you'll find this file; the links I found didn't work and gdown was returning unauthorized errors. Maybe I'll make it a torrent.
# HPPH: I also dumped the Kobold endpoint, and added one for getting token counts so you can prune your prompt if necessary.
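# The pruning the token-count endpoint enables might look like this client-side
# helper: drop the oldest lines until the prompt fits the context window. The
# function name, the count_tokens callable (backed by the token-count endpoint
# or a local tokenizer), and the 2048 default are illustrative assumptions.

```python
def prune_prompt(prompt, count_tokens, max_tokens=2048):
    # Drop the oldest lines (front of the prompt) until it fits.
    lines = prompt.split("\n")
    while len(lines) > 1 and count_tokens("\n".join(lines)) > max_tokens:
        lines.pop(0)
    return "\n".join(lines)
```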
# HPPH: And finally... Now the prompt goes in the POST body, which simplifies matters significantly. | |
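# Since the prompt now travels in the POST body, a client call can be sketched
# with nothing but the standard library. The URL, route name, and JSON field
# name are assumptions, not the script's actual values.

```python
import json
import urllib.request

def build_generate_request(prompt, url="http://localhost:8000/generate"):
    # The prompt rides in the JSON body, so no URL-encoding or query-string
    # length limits to worry about.
    return urllib.request.Request(
        url,
        data=json.dumps({"prompt": prompt}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt, url="http://localhost:8000/generate"):
    # Sends the request and decodes the JSON reply from the server.
    with urllib.request.urlopen(build_generate_request(prompt, url)) as resp:
        return json.loads(resp.read())
```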
# Near-simplest language model API, with room to expand!