Hao Zhang hbcbh1999

  • New York, New York
| Name | Purpose | Author (Publication Date) | Category |
| --- | --- | --- | --- |
| Andy - an artificial human | A slang term for "android" - an artificially created humanoid being. | Philip K. Dick (1968) | ai |
| Autobutle | An automated servant. | Frank Herbert (1972) | ai |
| Automaton Chessplayer - the first chess-playing computer | The first chess-playing computer. | Ambrose Bierce (1910) | ai |
| Automonk | A robot with an AI trained on an individual monk. | Ray Naylor (2022) | ai |
| Ava - she wants to be taught | A piece of learning software. | Amitav Ghosh (1995) | ai |
| Bard | A machine that invents randomized stories and can read them out loud or animate them for viewing. | Isaac Asimov (1956) | ai |
| Bendix Anxiety Reducer | Machine-based psychotherapy. | Robert Sheckley (1956) | ai |
| Big Computer - wide-screen Jehovah | Just like it says; this computer knows it all. | John Varley (1983) | ai |
| Big Noodle | A vast artificial intelligence system used to process all of Earth's information. | Philip K. Dick (1981) | ai |

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology, "instruction fine-tuning", learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

hbcbh1999 / LLM.md
Created March 29, 2023 01:24 — forked from rain-1/LLM.md
LLM Introduction: Learn Language Models

Purpose

Bootstrap knowledge of LLMs ASAP. With a bias/focus to GPT.

Avoid being a link dump. Try to provide only valuable, well-tuned information.

Prelude

Neural network links before starting with transformers.

from __future__ import annotations
from contextlib import contextmanager
from typing import NamedTuple, Callable, Optional, Any, List
import numpy as np
Array = Any
class Node(NamedTuple):
    vjp: Optional[Callable]
    parents: List[Node]
hbcbh1999 / webmssdk.js
Created December 24, 2022 16:18 — forked from mhasbini/webmssdk.js
Deobfuscated version of https://sf16-website-login.neutral.ttwstatic.com/obj/tiktok_web_login_static/webmssdk/1.0.0.1/webmssdk.js
!(function (arg1, arg2) {
  if ("object" == typeof exports && "undefined" != typeof module) {
    arg2(exports);
  } else if ("function" == typeof define && define.amd) {
    define(["exports"], arg2);
  } else {
    arg2(
      ((arg1 = "undefined" != typeof globalThis ? globalThis : arg1 || self).byted_acrawler =
        {})
    );
hbcbh1999 / check_for_ffast_math.py
Created September 27, 2022 03:09 — forked from moyix/check_for_ffast_math.py
Hacky script to check for the set_fast_math constructor in an executable/shared library using objdump
#!/usr/bin/env python
import subprocess
import re
import sys
def get_init_array(filename):
    # Call objdump -s -j .init_array <filename> to get the contents of the .init_array section
    try:
        objdump_output = subprocess.check_output(['objdump', '-s', '-j', '.init_array', filename], stderr=subprocess.STDOUT)
hbcbh1999 / bitcoin-pay.rb
Created August 18, 2022 12:00 — forked from Sjors/bitcoin-pay.rb
This script demonstrates how a bitcoin transaction is created and signed. Just pass in your own address and private key and it will prepare a transaction for you. You can then copy & paste that transaction into a webservice like Blockchain to send it. I wrote this mostly to understand better how it works. I sometimes had to "cheat" and look at t…
#!/usr/bin/env ruby
require 'open-uri'
require 'json'
require 'digest/sha2'
require 'pry'
require 'bigdecimal'
require 'bitcoin' # Because I need to cheat every now and then
# Usage:
# gem install pry json ffi ruby-bitcoin
hbcbh1999 / julia_nim_cpp_r_sir.md
Created July 2, 2022 14:15 — forked from sdwfrost/julia_nim_cpp_r_sir.md
Comparing simple simulations in Julia, Nim, C++ and R

This gist compares the performance of Julia, Nim, C++ and R (the latter using either POMP or LibBi) in a simple simulation of an SIR epidemiological model. In addition to keeping track of susceptibles, infecteds and recovereds, I also store the cumulative number of infections. Time moves in discrete steps, and the algorithm avoids language-specific syntax features to make the comparison as fair as possible, including using the same algorithm for generating binomial random numbers and the same random number generator. The exceptions are the R versions: POMP uses the standard R Mersenne Twister for the random number generator, and I'm not sure what LibBi uses. The algorithm for generating random binomial numbers is only really suitable for small np (number of trials times success probability).
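To make the model concrete, here is a minimal Python sketch of a discrete-time stochastic SIR simulation that tracks S, I, R and the cumulative number of infections. It is an illustration under stated assumptions, not the code benchmarked in the gist: the function names (`binomial`, `sir_step`, `simulate`), the parameter names (`beta`, `gamma`, `dt`) and the conversion of rates to per-step probabilities are all assumed here, and the binomial draw is a naive sum of Bernoulli trials rather than the generator the gist uses.

```python
import math
import random


def binomial(n, p, rng):
    """Draw from Binomial(n, p) by summing n Bernoulli trials (naive, O(n))."""
    return sum(1 for _ in range(n) if rng.random() < p)


def sir_step(state, beta, gamma, n_total, dt, rng):
    """Advance (S, I, R, cumulative infections) by one time step of length dt."""
    s, i, r, y = state
    # Assumed form: convert the infection and recovery rates into
    # per-step probabilities over an interval of length dt.
    p_infect = 1.0 - math.exp(-beta * i / n_total * dt)
    p_recover = 1.0 - math.exp(-gamma * dt)
    infections = binomial(s, p_infect, rng)
    recoveries = binomial(i, p_recover, rng)
    return (s - infections, i + infections - recoveries,
            r + recoveries, y + infections)


def simulate(s0, i0, r0, beta, gamma, dt, n_steps, seed=0):
    """Run the model for n_steps and return the list of states visited."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    n_total = s0 + i0 + r0
    state = (s0, i0, r0, 0)
    history = [state]
    for _ in range(n_steps):
        state = sir_step(state, beta, gamma, n_total, dt, rng)
        history.append(state)
    return history


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    trajectory = simulate(s0=990, i0=10, r0=0, beta=0.5, gamma=0.25,
                          dt=0.1, n_steps=100, seed=1)
    print(trajectory[-1])
```

Because each step only moves individuals between compartments, S + I + R stays constant, which is a cheap sanity check when porting the same loop to another language.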

Benchmarks were run on a Mac Pro (Late 2013), with 3 GHz 8-core Intel Xeon E3, 64GB 1866 MHz RAM, running OS X v10.11.3 (El Capitan

hbcbh1999 / Working GDB on macOS 11.md
Created May 29, 2022 13:46 — forked from mike-myers-tob/Working GDB on macOS 11.md
Steps to get GDB actually working in April 2021 on macOS

Debug with GDB on macOS 11

The big reason to do this is that LLDB has no equivalent of GDB's "follow-fork-mode child". In other words, a multi-process target that doesn't have a single-process mode (or a bug that only manifests in multi-process mode) is going to be difficult or impossible to debug, especially if you have to run the target over and over to make the bug manifest. If you have a repeatable bug, no big deal: break on the fork in the parent process and attach to the child in a second lldb instance. Otherwise, read on.

Install GDB

Don't make the mistake of thinking you can just brew install gdb. Currently this installs version 10.2, which is mostly broken, with at least two annoying bugs as of April 29th 2021. The big one is https://sourceware.org/bugzilla/show_bug.cgi?id=24069

$ xcode-select --install  # install the Xcode command-line tools
hbcbh1999 / hn_seach.js
Created November 1, 2018 18:30 — forked from kristopolous/hn_seach.js
hn job query search
function query() {
var
// HN is done with very unsemantic classes.
job_list = Array.prototype.slice.call(document.querySelectorAll('.c5a,.cae,.c00,.c9c,.cdd,.c73,.c88')),
query_list = Array.prototype.slice.call(arguments),
shown = 0, total = job_list.length;
// Traverses up the dom stack trying to find a match of a specific class
function up_to(node, klass) {
if (node.classList.contains(klass)) {