Overview:
- Problem: models are large, and it is hard to change individual facts they have learned.
- In a nutshell: MEND generates a weight delta that is decomposed into a low-rank matrix, similar in spirit to LoRA.
- "Local, reliable, and general": "local" means unrelated outputs are not changed; "reliable" means the model actually takes the desired correction; "general" means variations of similar questions that would also need the correction are corrected as well.
- Works even on very large models.

Differences to Prior Art:
- ENN encodes editability into the parameters of the model itself; MEND provides editability through an independent model. (ENN is closer to fine-tuning? MEND closer to LoRA?)
- KE uses the raw edit example as input and produces a single rank-1 mask and rank-1 offset over the fine-tuning gradient. MEND instead maps the model's own gradients into model edits.

Open Questions:
- Are they generating a different weight edit on a per-token basis? Why?
- How are they applying the rank-1 model edit for each token in the input sequence?
- ~~If they're producing a layer's parameter edit as an output, why do they care about producing the low-rank embedding of the deltas? Accumulate and apply them.~~
- Are they only applying the deltas at test time so that they can keep the low-rank deltas? And are they only keeping the low-rank deltas to avoid having to do parameter updates for all the weights?

Formulation:
- A base model is a differentiable function that maps an input x and parameters θ to an output y.
- A MEND model maps the base model's parameters, an edit input x_e, an edit label y_e, a loss function ℓ_e, and optional parameters to a new set of model parameters.
- The input to a MEND network g is the fine-tuning gradient ∇_{W_l} ℓ_e(x_e, y_e, θ) at layer l, and the output is that layer's parameter edit, written ∇̃W_l.
- "MEND outputs a rank-1 model edit for each token in the input and output sequence." (A rough code sketch of this formulation appears at the end of these notes.)

Aside:
- Inputs to MEND can differ by orders of magnitude, so normalization is required (see the normalization sketch at the end).
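
Sketch of the formulation: the following is a minimal PyTorch illustration under my own assumptions — a single linear layer standing in for one transformer weight, a toy editor MLP named `edit_mlp`, and made-up dimensions; the paper's actual editor architecture and hyperparameters differ. It shows how a linear layer's fine-tuning gradient decomposes into per-token rank-1 outer products, how an editor network can map those factors to edited factors, and how the accumulated low-rank delta is applied to the weights.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d_in, d_out, seq_len = 8, 4, 5

layer = nn.Linear(d_in, d_out, bias=False)       # weights W_l, shape (d_out, d_in)
x_e = torch.randn(seq_len, d_in)                 # edit input, one row per token
y_e = torch.randint(0, d_out, (seq_len,))        # edit labels

# Fine-tuning gradient of the edit loss at this layer.
logits = layer(x_e)
loss = nn.functional.cross_entropy(logits, y_e)
delta = torch.autograd.grad(loss, logits)[0]     # dL/d(pre-activations), (seq_len, d_out)
u = x_e                                          # layer inputs, (seq_len, d_in)

# The weight gradient decomposes into a sum of per-token rank-1 outer products:
#   dL/dW_l = sum_i delta_i u_i^T
grad_W = delta.T @ u
per_token = sum(torch.outer(delta[i], u[i]) for i in range(seq_len))
assert torch.allclose(grad_W, per_token, atol=1e-6)

# Toy editor network g: maps each token's (u_i, delta_i) factors to edited
# factors (u~_i, delta~_i). This plain MLP is a stand-in for the paper's editor.
edit_mlp = nn.Sequential(
    nn.Linear(d_in + d_out, 32), nn.ReLU(), nn.Linear(32, d_in + d_out)
)
edited = edit_mlp(torch.cat([u, delta], dim=-1))
u_tilde, delta_tilde = edited[:, :d_in], edited[:, d_in:]

# Accumulate the per-token rank-1 edits into one low-rank delta and apply it.
delta_W = delta_tilde.T @ u_tilde                # rank <= seq_len, same shape as W_l
with torch.no_grad():
    layer.weight -= 1e-2 * delta_W               # small edit step
```

Keeping only the (ũ, δ̃) factors rather than the dense delta is what makes the edit low-rank, which is part of why the open question about accumulating versus storing the decomposition comes up.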
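
The Aside's normalization point as a tiny sketch: the gradient factors feeding the editor can differ by orders of magnitude, so they are normalized before the editor sees them. Using `nn.LayerNorm` here is my assumption for illustration, not necessarily the paper's exact normalization scheme.

```python
import torch
import torch.nn as nn

d_in, d_out, seq_len = 8, 4, 5
u = torch.randn(seq_len, d_in) * 1e3         # activations at one scale
delta = torch.randn(seq_len, d_out) * 1e-4   # loss gradients at a very different scale

# Normalize each factor before concatenating them into the editor's input.
norm_u, norm_delta = nn.LayerNorm(d_in), nn.LayerNorm(d_out)
editor_input = torch.cat([norm_u(u), norm_delta(delta)], dim=-1)
print(editor_input.std())                    # roughly unit scale despite the raw magnitude gap
```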