Kexin Wang kwang2049

kwang2049 / index-dpr_single_nq-faiss_ivf_sq.py
Last active February 28, 2022 00:34
Example script that shows how to index the DPR single-nq (https://github.com/facebookresearch/DPR) embeddings with a Faiss IndexIVFScalarQuantizer index.
import pickle
import os
import json
import faiss
import tqdm
import numpy as np
import pytrec_eval
import time
from typing import List, Tuple
from collections import defaultdict
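
The "ScalarQuantizer" part of the index name refers to compressing each embedding dimension independently. The sketch below illustrates that idea in plain Python, assuming uniform 8-bit quantization with per-dimension min/max ranges learned from training vectors. It is not Faiss code and not part of the gist; `train_ranges`, `encode`, and `decode` are hypothetical helper names introduced for illustration only.

```python
from typing import List, Tuple

Vector = List[float]
Ranges = List[Tuple[float, float]]


def train_ranges(vectors: List[Vector]) -> Ranges:
    """Learn the (min, max) of every dimension from the training vectors."""
    dims = len(vectors[0])
    return [
        (min(v[d] for v in vectors), max(v[d] for v in vectors))
        for d in range(dims)
    ]


def encode(vector: Vector, ranges: Ranges) -> bytes:
    """Map each float to one byte (0..255) within its dimension's range."""
    codes = []
    for value, (lo, hi) in zip(vector, ranges):
        span = (hi - lo) or 1.0  # avoid division by zero for constant dimensions
        scaled = min(1.0, max(0.0, (value - lo) / span))  # clip out-of-range values
        codes.append(round(scaled * 255))
    return bytes(codes)


def decode(code: bytes, ranges: Ranges) -> Vector:
    """Reconstruct an approximate float vector from its byte codes."""
    return [
        lo + (c / 255) * ((hi - lo) or 1.0)
        for c, (lo, hi) in zip(code, ranges)
    ]


# Toy data standing in for real embeddings (an assumption for this sketch).
training = [[0.0, -1.0], [1.0, 1.0], [0.5, 0.0]]
ranges = train_ranges(training)
code = encode([0.5, 0.0], ranges)
approx = decode(code, ranges)
```

Each vector is stored in one byte per dimension instead of a four-byte float, at the cost of a reconstruction error of at most half a quantization step per dimension.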
kwang2049 / matplotlib-font.py
Created December 16, 2021 13:04
Shows how to change the default font when using matplotlib from Python.
from matplotlib import pyplot as plt
plt.rcParams['font.family'] = 'serif'
plt.rcParams['font.serif'] = ['Times New Roman'] + plt.rcParams['font.serif']
plt.rcParams['font.size'] = 20
x = [1, 2, 3]
y1 = [1, 2, 3]
y2 = [2, 3, 4]
plt.plot(x, y1, label='y1')
kwang2049 / msmarco_1m.py
Last active June 30, 2021 07:52
Generate a 1M-document version of MS MARCO by keeping the dev/test qrels and randomly sampling other documents as negatives. The msmarco-1m.zip archive is available at https://public.ukp.informatik.tu-darmstadt.de/kwang/datasets/ir/msmarco-1m.zip.
from beir import util, LoggingHandler
from beir.retrieval import models
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES
import logging
import pathlib, os
import random
import json
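
The sampling strategy in the description can be sketched in plain Python as follows. This is one reading of it, not the gist's actual code: every document judged in the qrels is kept, and the rest of the budget is filled by uniform random sampling. The function name `subsample_corpus`, the `target_size` and `seed` parameters, and the simplified corpus type (document ID to text) are assumptions made for this illustration. The qrels layout (query ID to a mapping of document ID to relevance) follows the BEIR convention.

```python
import random
from typing import Dict


def subsample_corpus(
    corpus: Dict[str, str],
    qrels: Dict[str, Dict[str, int]],
    target_size: int,
    seed: int = 0,
) -> Dict[str, str]:
    """Keep every judged document, then fill up with randomly sampled others."""
    # Documents that appear in the qrels must survive so evaluation stays valid.
    keep = {
        doc_id
        for judged in qrels.values()
        for doc_id in judged
        if doc_id in corpus
    }
    # Sort the remaining IDs so that the sample is reproducible for a given seed.
    others = sorted(doc_id for doc_id in corpus if doc_id not in keep)
    n_extra = max(0, min(target_size - len(keep), len(others)))
    rng = random.Random(seed)
    keep.update(rng.sample(others, n_extra))
    # Preserve the original corpus order in the result.
    return {doc_id: text for doc_id, text in corpus.items() if doc_id in keep}


# Toy data standing in for MS MARCO (an assumption for this sketch).
corpus = {f"d{i}": f"text of document {i}" for i in range(10)}
qrels = {"q1": {"d3": 1}, "q2": {"d7": 1, "d3": 0}}
small = subsample_corpus(corpus, qrels, target_size=5, seed=0)
```

Using a seeded `random.Random` instance rather than the module-level functions keeps the subsample reproducible without touching global random state.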
kwang2049 / modeling_distilbert.py
Created May 28, 2021 08:13
DistilBERT modeling code with LM head support. Download it and import modeling_distilbert to use DistilBERT for decoding, e.g. in TSDAE: https://github.com/UKPLab/sentence-transformers/blob/master/examples/unsupervised_learning/TSDAE/train_tsdae_from_file.py
# coding=utf-8
# Copyright 2019-present, the HuggingFace Inc. team, The Google AI Language Team and Facebook, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software