@jszym
Created January 23, 2025 17:59
Accessing UniProt embeddings from H5 files.
# First download 'per-protein.h5' from https://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/embeddings/uniprot_sprot/
# We'll also need PyTables (pip install tables)
import tables
h5file = tables.open_file("per-protein.h5", mode="r")
# `embedding` now holds the 1024-dimensional embedding
# for the cathelicidin antimicrobial peptide,
# UniProt accession (AC) "P49913", as a NumPy array
embedding = h5file.root['P49913'].read()
# Close the file once the embedding has been read
h5file.close()
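
To read several accessions at once, the same lookup can be wrapped in a small helper. This is only a sketch: `collect_embeddings` and `load_embeddings` are hypothetical names that are not part of PyTables, and it assumes each accession is stored as a top-level node under the file root, as in the snippet above. Accessions that are not in the file are reported rather than raising an error.

```python
def collect_embeddings(root, accessions):
    """Split accessions into those found under `root` and those missing.

    `root` is assumed to behave like the PyTables root group used above:
    it supports `in`, indexing by name, and nodes with a `.read()` method.
    """
    found, missing = {}, []
    for ac in accessions:
        if ac in root:
            found[ac] = root[ac].read()
        else:
            missing.append(ac)
    return found, missing


def load_embeddings(path, accessions):
    """Open the H5 file read-only and collect the requested embeddings."""
    # Imported here so collect_embeddings can be used without PyTables installed
    import tables

    # The context manager closes the file even if a lookup fails
    with tables.open_file(path, mode="r") as h5file:
        return collect_embeddings(h5file.root, accessions)
```

For example, `found, missing = load_embeddings("per-protein.h5", ["P49913"])` should leave the embedding in `found["P49913"]`, provided the file has been downloaded to the working directory.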