@cynthia
Created March 10, 2025 14:32
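from safetensors import safe_open
from torch.nn import Embedding

# Load the token-embedding matrix from the first checkpoint shard.
with safe_open("model-00001-of-000163.safetensors", framework="pt") as f:
    # from_pretrained freezes the weights by default (freeze=True).
    embeddings = Embedding.from_pretrained(f.get_tensor('model.embed_tokens.weight'))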

cynthia commented Mar 10, 2025

Notable weird tokens

  • "................................................................ ........................................................",
  • "---------------+ ---------------+",
  • "================================================================ ========",

and many more with similar patterns (long runs of punctuation separated by whitespace).
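The comment does not say how these tokens were found. As a rough sketch, one way to surface tokens "with similar patterns" is to scan the vocabulary strings with a regular expression. The names `FILLER_RE`, `is_filler_token`, `find_filler_tokens`, the punctuation set, and the `min_len` threshold are all hypothetical choices for illustration, and the sample vocabulary uses short stand-in strings rather than the model's real tokenizer output.

```python
import re

# Hypothetical heuristic: a "filler-like" token is made only of runs of
# punctuation characters, optionally separated by whitespace.
FILLER_RE = re.compile(r"[.\-=+_*#~|]+(?:\s+[.\-=+_*#~|]+)*")


def is_filler_token(token: str, min_len: int = 8) -> bool:
    """Return True if `token` is long enough and is punctuation runs only."""
    return len(token) >= min_len and FILLER_RE.fullmatch(token) is not None


def find_filler_tokens(vocab, min_len: int = 8):
    """Return the tokens in `vocab` that look like filler, in input order."""
    return [tok for tok in vocab if is_filler_token(tok, min_len)]


if __name__ == "__main__":
    # Stand-in vocabulary; a real run would iterate over the tokenizer's
    # vocabulary strings instead.
    sample_vocab = [
        "hello",
        "." * 16 + " " + "." * 8,
        "---------------+ ---------------+",
        "=" * 16 + " " + "=" * 8,
        "...",  # too short to be flagged with the default min_len
    ]
    for tok in find_filler_tokens(sample_vocab):
        print(repr(tok))
```

The length threshold is there so that ordinary short punctuation tokens such as `...` or `--` are not flagged; it would need tuning against the actual vocabulary.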
