Justin Chu (justinchuby)

@Chillee
Chillee / flex_attention_tutorial.py
Last active April 25, 2025 04:34
flex_attention_tutorial.py
from functools import lru_cache, partial

import torch
import torch.nn.functional as F
from torch.nn.attention._flex_attention import (
    _create_block_mask,
    _create_mask,
    _flex_attention,
)
from triton.testing import do_bench

torch.set_default_device('cuda')
# Example usage
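# A hedged sketch (not from the gist itself) of the kind of score_mod callable
# this tutorial builds on, assuming the prototype entry point
# _flex_attention(query, key, value, score_mod) imported above; the private
# API may change between PyTorch nightlies.
def causal(score, b, h, q_idx, kv_idx):
    # Keep the score where the query may attend to the key; mask out future
    # positions with -inf before the softmax.
    return torch.where(q_idx >= kv_idx, score, float("-inf"))

B, H, S, D = 2, 8, 1024, 64
q, k, v = (torch.randn(B, H, S, D, dtype=torch.float16) for _ in range(3))
out = _flex_attention(q, k, v, score_mod=causal)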
/*
Program to demonstrate using one session and multiple threads to call Run on that session.
g++ -std=c++17 -o test_ort_one_session_multiple_threads test_ort_one_session_multiple_threads.cc -I onnxruntime-linux-x64-1.15.1/include/ -lonnxruntime -Lonnxruntime-linux-x64-1.15.1/lib/ -lpthread -Wl,-rpath,/home/pranav/onnxruntime-linux-x64-1.15.1/lib/
Author: GitHub id: pranavsharma
*/
#include <onnxruntime_cxx_api.h>
#include <vector>
#include <string>
#include <iostream>
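As a minimal Python analogue of the same idea (my sketch, not part of the gist): one onnxruntime InferenceSession shared by several threads, each calling run on it concurrently. The model path and input shape are placeholders.

import threading

import numpy as np
import onnxruntime as ort

# One session, created once; Run on a single session is safe to call from
# multiple threads concurrently.
session = ort.InferenceSession("model.onnx")            # placeholder model path
input_name = session.get_inputs()[0].name

def worker():
    x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder shape
    session.run(None, {input_name: x})

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()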
@hollance
hollance / alignment-heads.md
Last active April 11, 2025 22:50
Alignment heads for Whisper word-level timestamps with Hugging Face Transformers

To allow the Hugging Face version of Whisper to predict word-level timestamps, a new property alignment_heads must be added to the GenerationConfig object. This is a list of [layer, head] pairs that select the cross-attention heads that are highly correlated to word-level timing.

If your Whisper checkpoint does not have the alignment_heads property yet, it can be added in two possible ways.

Method 1. Change the model.generation_config property:

from transformers import WhisperForConditionalGeneration

# load the model
model = WhisperForConditionalGeneration.from_pretrained("your_checkpoint")
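Continuing from the loaded model above, a hedged sketch of the assignment itself; the [layer, head] pairs below are placeholders, not the real alignment heads for any particular checkpoint.

# placeholder [layer, head] pairs; the correct values depend on the checkpoint
model.generation_config.alignment_heads = [[2, 2], [3, 0], [3, 7]]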
@justinchuby
justinchuby / import_in_python.md
Last active June 4, 2025 23:49
Best practice for importing in Python

Best practice for importing in Python

1. Import at top level only

Allow imports at the module top level only, unless (1) the module is too expensive to load, or (2) the module may not be available. (A short sketch follows the bullet points below.)

  • It is clear what a module needs when all imports are grouped in a single place, which makes refactoring easier.
  • Otherwise, import errors are raised only when the code executes, so an erroneous import may go undetected until that code path is hit at runtime.
  • It also reduces duplication and improves consistency, since the same import lines are not repeated throughout the file.
  • Pylint has a rule that checks for this: https://pylint.readthedocs.io/en/latest/user_guide/messages/convention/import-outside-toplevel.html
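A minimal sketch of the rule and both exceptions (the module names are only illustrative):

# Preferred: every import grouped at the top of the module.
import json
import logging

logger = logging.getLogger(__name__)

# Exception 1: a module that is too expensive to load eagerly can be imported
# inside the function that actually needs it.
def render_plot(values):
    import matplotlib.pyplot as plt  # heavy dependency, loaded on first use
    plt.plot(values)
    plt.show()

# Exception 2: a module that may not be installed can be guarded at the top.
try:
    import ujson as json_impl  # optional faster drop-in
except ImportError:
    json_impl = json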
@dylech30th
dylech30th / Unification.scala
Last active July 12, 2023 20:48
An implementation of a unification-based type inference algorithm in the pure simply typed lambda calculus with Let Polymorphism
/**
* An implementation of a unification-based type inference algorithm in the simply typed lambda calculus with Let-Polymorphism
*
* Unification-based type inference is used in a wide variety of programming languages; the most famous instance is the
* Hindley-Milner type system (a.k.a. the Damas-Milner type system) of the ML family, which lets the programmer omit almost
* all type annotations. The algorithm is based on two concepts: the constraint set and the unifier.
*
* A constraint set consists of several constraints; a constraint is basically a type equation, e.g., X = T, where both X and
* T are types.
* A unifier is a set of type substitutions [X -> T1, Y -> T2, ...]; it replaces every type variable in its domain with the type it maps to.
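To make the two concepts concrete, here is a small unification sketch in Python (my own encoding, not the gist's Scala one): type variables are plain strings, and constructed types are tuples whose first element is the constructor name, e.g. ("Int",) or ("->", t1, t2).

def occurs(var, ty):
    # True if the type variable `var` appears anywhere inside type `ty`.
    if ty == var:
        return True
    if isinstance(ty, tuple):
        return any(occurs(var, arg) for arg in ty[1:])
    return False

def substitute(subst, ty):
    # Apply a substitution (a dict mapping variable -> type) to a type.
    if isinstance(ty, str):
        return substitute(subst, subst[ty]) if ty in subst else ty
    return (ty[0],) + tuple(substitute(subst, arg) for arg in ty[1:])

def unify(constraints):
    # Solve a list of type equations (s, t); return a unifier or raise.
    subst = {}
    while constraints:
        s, t = constraints.pop()
        s, t = substitute(subst, s), substitute(subst, t)
        if s == t:
            continue
        if isinstance(s, str) and not occurs(s, t):
            subst[s] = t
        elif isinstance(t, str) and not occurs(t, s):
            subst[t] = s
        elif isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] and len(s) == len(t):
            constraints.extend(zip(s[1:], t[1:]))
        else:
            raise TypeError(f"cannot unify {s} with {t}")
    return {v: substitute(subst, ty) for v, ty in subst.items()}

# Unifying X -> Bool against Int -> Y yields the unifier {X: Int, Y: Bool}.
print(unify([(("->", "X", ("Bool",)), ("->", ("Int",), "Y"))]))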
@kongdd
kongdd / Debugging Mixed Python C++ code in Visual Studio Code
Created December 29, 2019 22:34 — forked from asroy/Debugging Mixed Python C++ code in Visual Studio Code
Debugging Mixed Python/C++ code in Visual Studio Code
I've tested this on Fedora 23 and Ubuntu 16.04, using gcc 5.3.1, Python 3.4, and VS Code 1.14.0.
You can debug mixed Python/C++ in the same GUI. It also works for MPI applications. You can switch between the debuggers and corresponding call stacks.
1. Packages needed
1) Visual Studio Code
2) Extensions for VS Code:
"Python" from Don Jayamanne (I'm using 0.6.7)
This allows VS Code to act as the front end for debugging Python.
This gives VS Code the ability to attach to a Python script that uses the "ptvsd" module (a short sketch follows below).
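As a hedged illustration of the ptvsd side of this workflow (ptvsd 4.x style; older 3.x releases also required a secret string as the first argument to enable_attach):

import ptvsd

# Listen for the VS Code Python debugger, then block until it attaches.
ptvsd.enable_attach(address=("0.0.0.0", 3000))
ptvsd.wait_for_attach()

def main():
    values = [x * x for x in range(10)]  # set a breakpoint here from VS Code
    print(values)

main()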
@jeffomatic
jeffomatic / rustfmt-skip-children.sh
Last active February 5, 2023 04:57
rustfmt --skip-children emulation script
#!/bin/bash
#
# This script filters rustfmt output for format-on-save workflows in text
# editors.
#
# Usage:
#
# rustfmt-skip-children /path/to/source
#
# In particular, it:
@mikhailov-work
mikhailov-work / turbo_colormap.py
Created August 8, 2019 23:31
Turbo Colormap Look-up Table
# Copyright 2019 Google LLC.
# SPDX-License-Identifier: Apache-2.0
# Author: Anton Mikhailov
turbo_colormap_data = [[0.18995,0.07176,0.23217],[0.19483,0.08339,0.26149],[0.19956,0.09498,0.29024],[0.20415,0.10652,0.31844],[0.20860,0.11802,0.34607],[0.21291,0.12947,0.37314],[0.21708,0.14087,0.39964],[0.22111,0.15223,0.42558],[0.22500,0.16354,0.45096],[0.22875,0.17481,0.47578],[0.23236,0.18603,0.50004],[0.23582,0.19720,0.52373],[0.23915,0.20833,0.54686],[0.24234,0.21941,0.56942],[0.24539,0.23044,0.59142],[0.24830,0.24143,0.61286],[0.25107,0.25237,0.63374],[0.25369,0.26327,0.65406],[0.25618,0.27412,0.67381],[0.25853,0.28492,0.69300],[0.26074,0.29568,0.71162],[0.26280,0.30639,0.72968],[0.26473,0.31706,0.74718],[0.26652,0.32768,0.76412],[0.26816,0.33825,0.78050],[0.26967,0.34878,0.79631],[0.27103,0.35926,0.81156],[0.27226,0.36970,0.82624],[0.27334,0.38008,0.84037],[0.27429,0.39043,0.85393],[0.27509,0.40072,0.86692],[0.27576,0.41097,0.87936],[0.27628,0.42118,0.89123],[0.27667,0.43134,0.90254],[0.27691,0.44145,0.913
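A minimal usage sketch (my addition, assuming turbo_colormap_data is the full 256-entry table from the gist): map a scalar in [0, 1] to an RGB triple by linear interpolation between adjacent table rows.

def turbo_lookup(x, table=turbo_colormap_data):
    # Clamp to [0, 1], then interpolate between the two nearest table entries.
    x = min(max(float(x), 0.0), 1.0)
    pos = x * (len(table) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(table) - 1)
    frac = pos - lo
    return [a + (b - a) * frac for a, b in zip(table[lo], table[hi])]

print(turbo_lookup(0.5))  # an [r, g, b] triple of floats in [0, 1]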
@jaimelr
jaimelr / the_effective_engineer.md
Created June 27, 2019 23:21
The Effective Engineer notes

Part 1: Adopt the Right Mindsets

Focus on High-Leverage Activities

How should we decide what to work on in order to achieve our goals most effectively? Assess each activity's leverage:

Leverage = (Impact Produced) / (Time Invested)

Another way of thinking about leverage is the commonly mentioned Pareto principle, or 80–20 rule: the notion that for many activities, 80% of the impact comes from 20% of the work.

// ==UserScript==
// @name Inline Math for Notion.so
// @homepageURL https://www.notion.so/evertheylen/Notion-Inline-Math-9c5047a4e7c84643848b3630db8d5a5e
// @version 0.2.1
// @match https://www.notion.so/*
// @grant GM_addStyle
// @require https://cdn.jsdelivr.net/npm/[email protected]/dist/katex.js
// ==/UserScript==
// Instructions for use: