JW's accumulated tips for using LLMs as a dev

These came out of various workshops, talks, and conversations at the AI World's Fair (SF, 2024). For example, the GitHub Copilot folks gave multiple sessions, and there were other sessions dedicated just to developer productivity.

Use good prompts. Include:

  • Context: what the task is.
  • Intent: the goal and purpose you have in mind.
  • Clarity: ambiguous language that can be interpreted many ways will generate misses; clearly define the desired result.
  • Specificity: be as specific as possible and state all expectations, known constraints, requirements, use cases, etc.
  • Role statements sometimes help: "act as a Python programmer whose job is to do X, who thinks about Y, etc." (A sketch of one way to assemble these elements follows this list.)
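
For illustration, a minimal sketch of one way to assemble those elements into a single prompt string. The helper and its wording are assumptions, not a canonical template:

```python
# Hypothetical helper: combine role, context, intent, and explicit
# constraints into one prompt. Structure and wording are illustrative.
def build_prompt(role, context, intent, requirements, examples):
    parts = [
        f"Act as {role}.",
        f"Context: {context}",
        f"Goal: {intent}",
        "Requirements:",
        *[f"- {r}" for r in requirements],
        "Examples of expected behavior:",
        *[f"- {e}" for e in examples],
    ]
    return "\n".join(parts)

print(build_prompt(
    role="a Python programmer whose job is to write data-validation utilities",
    context="we ingest CSV exports from a billing system",
    intent="reject malformed rows before they reach the database",
    requirements=["pure function, no I/O", "return (valid_rows, errors)"],
    examples=['"2024-01-31,42.50,USD" -> valid', '"n/a,42.50" -> error: bad date'],
))
```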

  • For code, treat the LLM like a junior engineer who can reuse and remix what it has already seen, not a senior programmer with general-purpose creative reasoning. Don't mistake the ability to remix examples in the training set for seniority. For novel coding challenges published after 2021 that aren't in ChatGPT's training dataset, research shows its performance drops massively. (link)

Regeneration:

  • Setting temperature to zero can be very useful if the prompt is really well written, but sometimes you actually want to see variation.
  • For a nonzero temperature, regenerate often. A second or third result might do the trick the first didn't, or show you where your prompt is ambiguous. (A minimal API sketch follows this list.)
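
As a sketch of both settings, assuming the OpenAI Python SDK (v1+) with OPENAI_API_KEY set in the environment; the model name and prompt are placeholders:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
question = [{"role": "user", "content": "Write a regex matching ISO 8601 dates."}]

# temperature=0: near-deterministic output, useful when the prompt is tight
deterministic = client.chat.completions.create(
    model="gpt-4o", temperature=0, messages=question,
)

# nonzero temperature with n=3: three candidates in one call; comparing them
# often reveals where the prompt is underspecified
variants = client.chat.completions.create(
    model="gpt-4o", temperature=0.8, n=3, messages=question,
)
for choice in variants.choices:
    print(choice.message.content)
```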

Good coding practices lead to better results from AI:

  • Use good names and natural language in the code. The variable name "annual_revenue" is better than "rev" because it invokes the LLM's finance domain knowledge (see the sketch after this list).
  • Use functional programming. Smaller units of code with no side effects are easier not just for humans but also for LLMs to reason about, write tests for, debug, etc., because they don't rely on, or mutate state in, distant unseen places.
  • Separate concerns, e.g. config from logic, presentation from control code, etc., letting the LLM focus on a single problem domain when possible.
  • Be consistent. Generated code tries to follow your existing code style.
  • Docs and comments, even if AI-generated, provide context to future prompts over that code, not just to other devs. If the code is fully or mostly generated from a prompt, include that prompt as a comment or docstring.
  • Code comments that give examples of input/output and use cases are very helpful.
  • Generate test cases by asking the LLM to use an adversarial mindset. See "role statements" above: have it act as an adversary, explicitly identify edge cases, and write tests from that point of view.
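
A small sketch of several of these practices at once: a descriptive name, a pure function with no side effects, and a docstring with input/output examples that future prompts over this code can pick up. The function itself is just an illustration:

```python
def annual_revenue(monthly_revenues: list[float]) -> float:
    """Sum monthly revenue figures into annual revenue.

    Examples:
        >>> annual_revenue([100.0] * 12)
        1200.0
        >>> annual_revenue([])
        0.0
    """
    # Pure function: no I/O, no mutation of distant state, trivially testable.
    return sum(monthly_revenues)
```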

LLMs don't have to focus on just code in order to be useful for dev:

  • Represent problems and code in intermediate states like DSLs, config YAML, pseudocode, etc., so that LLM I/O happens on higher-level representations of your problem space (see the sketch after this list).
  • How can you model a problem as one of language translation instead of code generation? LLMs excel at translation.
  • On the other hand, sometimes producing code output when you otherwise weren't planning to can give good results. You can even ask it to write code that writes code, which might expose interesting aspects of the problem space. Code is one language among many, LLMs are good translators, and solving a problem is sometimes best done in a different language than the final result.
  • Use roleplay in the form of a world simulation: ask the LLM to "act as" your program or solution architecture itself and report on its internal activity, state changes, logic flows, etc. given inputs and events.
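
For example (an assumed pipeline DSL, made up for this sketch, not a real tool), you might keep a small YAML description as the source of truth and ask the LLM to translate it, rather than prompting for code from scratch:

```python
import textwrap

# Illustrative YAML DSL: the schema here is invented for the example.
pipeline_yaml = textwrap.dedent("""\
    source: s3://bucket/raw/events.jsonl
    steps:
      - filter: event_type == "purchase"
      - select: [user_id, amount, timestamp]
      - aggregate: {group_by: user_id, sum: amount}
    sink: postgres://analytics/purchases_by_user
""")

prompt = (
    "Translate the following pipeline description into a Python script "
    "using pandas. Preserve step order and field names exactly.\n\n"
    + pipeline_yaml
)
# Send `prompt` to your LLM of choice. The YAML stays the source of truth,
# so the code can be regenerated whenever the pipeline definition changes.
```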

Build small utilities. Collect what works into personal toolkits that generate compound interest:

  • If a task can reasonably be scripted, generate, run, and throw away lots of small scripts; keep and iterate on the ones that are useful more than once.
  • Roll prompts you use often into scripts that take command-line arguments.
  • Write scripts to extract context from your codebase for use in prompts (e.g. all your classes and function names with docstrings, with arguments to filter to particular domains) so that the whole code structure, or a particular domain, can be included in a prompt; a stdlib sketch follows this list. Code copilots are doing this now: Aider, for example, is an OSS copilot that uses tree-sitter to analyze the codebase and provide a "map". You can use Aider for that purpose alone.
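
A minimal sketch of such an extraction script using only the standard library: walk a directory of Python files and print a compact map of classes and functions with signatures and first docstring lines (the "src" path is an example):

```python
import ast
from pathlib import Path

def code_map(root: str) -> str:
    """Return one line per class/function: path, signature, docstring summary."""
    lines = []
    for path in sorted(Path(root).rglob("*.py")):
        tree = ast.parse(path.read_text())
        for node in ast.walk(tree):
            if isinstance(node, ast.ClassDef):
                sig = f"class {node.name}"
            elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                args = ", ".join(a.arg for a in node.args.args)
                sig = f"def {node.name}({args})"
            else:
                continue
            doc = ast.get_docstring(node) or ""
            summary = doc.splitlines()[0] if doc else ""
            lines.append(f"{path}: {sig}  # {summary}")
    return "\n".join(lines)

if __name__ == "__main__":
    print(code_map("src"))  # paste the output into a prompt as context
```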

Specific tips for GitHub Copilot:

  • Autocomplete is designed to be fast to generate but has less context: just the current file and open tabs. Inline chat, by contrast, pulls from the whole workspace into a larger context window. They might use different models, and these change over time.
  • Learn the chat commands, including /explain, @github, and @workspace, and use them often.
