Noah Syrkis syrkis

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Pre-Transformer Models

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology, "instruction fine-tuning", learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

How to use Neovim or VIM Editor as an IDE for R

Note: This tutorial is written for Linux-based systems.

Requirements

R >= 3.0.0

To install the latest version of R, please follow the download and install instructions at https://cloud.r-project.org/

Neovim >= 0.2.0

Neovim (nvim) is a continuation and extension of the Vim editor that aims to keep the good parts of Vim and add more features. In this tutorial I will be using Neovim (nvim); however, most of the steps apply equally to Vim. Please follow the download and installation instructions on nvim's GitHub wiki: https://github.com/neovim/neovim/wiki/Installing-Neovim.

How to get notifications of 'end of training' on your mobile phone

I often train machine learning/deep learning models, and training takes a very long time to finish. Even a single epoch of a moderately complex model takes close to half an hour. So I constantly need to check on (babysit) the training process.

To reduce the pain, I wanted a way to be notified of the training metrics. The idea is to send the training metrics (as messages) to a mobile phone as notifications, using PyTorch callbacks.

I have written some Python code snippets that help me send my training metrics log as mobile push notifications using the Pushover service. Pushover has a limit of 7500 requests per month per user, which is fine for my use case.

If you would like something like this, you can grab these little hacky scripts.
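
As a rough illustration of the idea above (not the author's scripts), a minimal sketch using only the Python standard library could look like this. It assumes the Pushover message endpoint with its `token`, `user`, `message` and `title` form fields; the helper names `format_metrics`, `build_payload` and `notify`, the default title, and the metric names are hypothetical.

```python
import urllib.parse
import urllib.request

# Pushover's message endpoint; credentials are supplied by the caller.
PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def format_metrics(epoch, metrics):
    """Render a dict of metric name -> float as one short line of text."""
    parts = [f"{name}={value:.4f}" for name, value in sorted(metrics.items())]
    return f"epoch {epoch}: " + ", ".join(parts)


def build_payload(app_token, user_key, message, title="Training update"):
    """Form-encode the fields of a Pushover message request."""
    fields = {
        "token": app_token,   # application API token
        "user": user_key,     # user (or group) key
        "message": message,
        "title": title,
    }
    return urllib.parse.urlencode(fields).encode("utf-8")


def notify(app_token, user_key, message, timeout=10):
    """POST the message to Pushover; returns True on an HTTP 200 response."""
    request = urllib.request.Request(
        PUSHOVER_URL,
        data=build_payload(app_token, user_key, message),
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200


# Hypothetical use at the end of each epoch of a training loop:
#     notify(APP_TOKEN, USER_KEY, format_metrics(epoch, {"loss": loss}))
```

Sending one message per epoch, rather than per batch, keeps the request count well under the monthly limit mentioned above.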

	{-# LANGUAGE TypeSynonymInstances #-}
	data Dual d = D Float d deriving Show
	type Float' = Float

	diff :: (Dual Float' -> Dual Float') -> Float -> Float'
	diff f x = y'
	  where D y y' = f (D x 1)

	class VectorSpace v where
	  zero :: v
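
The Haskell preview above is cut off, but the part that is visible (a `Dual` value carrying a number and its derivative, and `diff` seeding the derivative with 1) is the core of forward-mode automatic differentiation. A minimal Python sketch of that same idea, assuming only addition and multiplication are needed, might be as follows; the class layout and the `_lift` helper are hypothetical and not a translation of the rest of the gist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative, like `D Float d` above."""
    value: float
    deriv: float

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (a * b)' = a * b' + a' * b
        return Dual(
            self.value * other.value,
            self.value * other.deriv + self.deriv * other.value,
        )

    __rmul__ = __mul__


def _lift(x):
    """Treat a plain number as a constant (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def diff(f, x):
    """Derivative of f at x: evaluate f on a Dual seeded with derivative 1."""
    return f(Dual(float(x), 1.0)).deriv
```

For example, `diff(lambda x: x * x + 3 * x, 2.0)` evaluates the derivative `2x + 3` at `x = 2`.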

	// start with this
	let widget = new ListWidget()
	//

	// Set colors
	let bidenColor = new Color("1a68ff")
	let trumpColor = new Color("ff4a43")
	let bidenColor2 = new Color("1a68ff", 0.8)
	let trumpColor2 = new Color("ff4a43", 0.8)

	// Variables used by Scriptable.
	// These must be at the very top of the file. Do not edit.
	// icon-color: deep-green; icon-glyph: user-md;
	// change "country" to a value from https://coronavirus-19-api.herokuapp.com/countries/
	const country = "Israel"
	const url = `https://coronavirus-19-api.herokuapp.com/countries/${country}`
	const req = new Request(url)
	const res = await req.loadJSON()

	if (config.runsInWidget) {

	"""
	A bare-bones example of optimizing a black-box function (f) using
	Natural Evolution Strategies (NES), where the parameter distribution is a
	Gaussian of fixed standard deviation.
	"""

	import numpy as np
	np.random.seed(0)

	# the function we want to optimize
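
The preview stops before the objective and the update loop. As a hedged sketch of what the docstring describes (a search distribution that is a Gaussian with fixed standard deviation, whose mean is moved along a reward-weighted average of the sampled noise), something like the following could be used. The function name `nes_maximize`, the quadratic objective, the target vector, and all hyperparameter values are assumptions chosen for illustration.

```python
import numpy as np


def nes_maximize(f, w0, npop=50, sigma=0.1, alpha=0.001, iters=300, seed=0):
    """Maximize a black-box f with NES using a fixed-sigma Gaussian."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(iters):
        # Sample npop perturbations of the current mean.
        noise = rng.standard_normal((npop, w.size))
        rewards = np.array([f(w + sigma * n) for n in noise])
        # Standardize rewards so the step size does not depend on their scale.
        advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
        # Move the mean along the reward-weighted average of the noise.
        w = w + alpha / (npop * sigma) * (noise.T @ advantages)
    return w


# Hypothetical objective: negative squared distance to a fixed target.
target = np.array([0.5, 0.1, -0.3])


def reward(w):
    return -np.sum((w - target) ** 2)


solution = nes_maximize(reward, np.zeros(3))
```

Only evaluations of `reward` are used, never its gradient, which is what makes the method applicable to black-box functions.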

	/**
	* Get the date and days within a week from the week number.
	* e.g. the date range for the 8th week of 2013 is 17th Feb to 23rd Feb. This
	* code snippet will give you that range.
	*
	* It is not completely my code; it is a slight modification of something
	* I found on the net. I can't find it anymore, so I am keeping a backup.
	*
	* @param {[Integer]} weekNo [From week 1 to Week 52/53 based on the system date setting]
	* @return {[Date]} [description]