@chuyqa
chuyqa / PG_Rag_Benchmarks.ipynb
Last active July 2, 2024 23:16
PGVector Local LLM
@simonw
simonw / crontab.txt
Created September 10, 2020 16:09
Dogsheep crontab.txt as of 10th September 2020
# Fetch latest configuration:
*/5 * * * * cd /home/ubuntu/dogsheep-config && python3 git_pull_and_run_scripts.py . && sudo python3 ensure_symlinks.py files-to-symlink
# Goodreads
46 * * * * cd /home/ubuntu && /home/ubuntu/datasette-venv/bin/goodreads-to-sqlite books goodreads.db -a auth.json
# Twitter
1,11,21,31,41,51 * * * * /home/ubuntu/datasette-venv/bin/twitter-to-sqlite user-timeline /home/ubuntu/twitter.db -a /home/ubuntu/auth.json --since
2,7,12,17,22,27,32,37,42,47,52,57 * * * * run-one /home/ubuntu/datasette-venv/bin/twitter-to-sqlite home-timeline /home/ubuntu/timeline.db -a /home/ubuntu/auth.json --since
4,14,24,34,44,54 * * * * run-one /home/ubuntu/datasette-venv/bin/twitter-to-sqlite mentions-timeline /home/ubuntu/twitter.db -a /home/ubuntu/auth.json --since
@rvaidya
rvaidya / database_to_parquet.py
Last active February 24, 2025 11:24
Dump a database table to a parquet file using sqlalchemy and fastparquet. Useful for loading large tables into pandas / Dask, since read_sql_table will hammer the server with queries if the number of partitions/chunks is high. With this approach you write a temporary parquet file, then use read_parquet to load the data into a DataFrame.
import pandas as pd
import numpy as np
import fastparquet
from sqlalchemy import create_engine, schema, Table
# Copied from pandas with modifications
def __get_dtype(column, sqltype):
    import sqlalchemy.dialects as sqld
@nepsilon
nepsilon / how-to-delete-lines-containing-a-given-string.md
Created January 31, 2017 10:43
How to delete lines containing a given string? — First published in fullweb.io issue #85

How to delete lines containing a given string?

Just like last week, when we wanted to replace a string, we can use sed for this task:

sed '/pouet/d' file.txt

This prints file.txt to stdout without the lines containing pouet. The file itself is left unchanged.
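
For comparison, here is a minimal Python sketch of the same filter. It assumes a plain substring match, which is equivalent to the sed pattern only when the string contains no regex metacharacters; the helper name drop_lines_containing and the sample lines are made up for illustration.

```python
def drop_lines_containing(lines, needle):
    """Return the lines that do not contain needle (roughly sed '/needle/d')."""
    return [line for line in lines if needle not in line]


# Hypothetical sample input standing in for the contents of file.txt.
sample = ["keep me\n", "pouet here\n", "another line\n"]
filtered = drop_lines_containing(sample, "pouet")

# Like sed without -i, this only prints the result; nothing is modified on disk.
print("".join(filtered), end="")
```

To filter a real file, the same function can be fed an open file object, since iterating over a file yields its lines.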

@nepsilon
nepsilon / how-to-batch-convert-jpg-images-to-progressive-jpg-images.md
Last active May 28, 2024 22:27
How to batch convert JPG images to progressive JPG images? — First published in fullweb.io issue #82

How to batch convert JPG images to progressive JPG images?

Progressive JPG images, as opposed to baseline JPGs, display right away in the browser and then load in successive passes, rendering from blurry to sharp.

Progressive rendering is known to provide a better user experience because it prevents the “fax loading” effect, where the image is displayed at full quality but sequentially from top to bottom.

The imagemagick package installs the convert command, which you can run to convert a JPG to progressive:

convert -strip -interlace Plane -quality 80 input-file.jpg output-file.jpg
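
The command above handles a single file. One possible way to apply it to a whole directory is sketched below in Python; the helper names build_convert_command and batch_convert, and the choice to write results into a separate output directory under the same file names, are assumptions for illustration rather than part of the original tip. The sketch assumes convert is available on the PATH.

```python
import subprocess
from pathlib import Path


def build_convert_command(src, dst):
    """Return the argument list for the convert invocation shown above."""
    return ["convert", "-strip", "-interlace", "Plane",
            "-quality", "80", str(src), str(dst)]


def batch_convert(src_dir, dst_dir, runner=subprocess.run):
    """Run convert on every .jpg in src_dir, writing the results to dst_dir.

    runner is injectable so the loop can be exercised without ImageMagick.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for src in sorted(src_dir.glob("*.jpg")):
        dst = dst_dir / src.name
        # check=True raises CalledProcessError if convert exits non-zero.
        runner(build_convert_command(src, dst), check=True)
        converted.append(dst)
    return converted
```

Calling batch_convert("photos", "photos-progressive") would then process each JPG in photos; both directory names are placeholders.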
@nepsilon
nepsilon / python-how-to-print-the-full-traceback-without-exiting-the-program.md
Created January 3, 2017 06:24
Python: How to print the full traceback without exiting the program? — First published in fullweb.io issue #81

Python: How to print the full traceback without exiting the program?

The exception handling block except Exception as ex: print(ex) will only print the exception message and not its traceback.

That’s good to know, but we need more information than this to debug properly, namely the line that raised the exception together with its stack trace.

The traceback module, part of the stdlib, will help us with this:
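
A minimal sketch of how this can look, assuming a made-up failing function (risky_division) and a hypothetical wrapper (run_and_report); only traceback.format_exc() comes from the standard library.

```python
import traceback


def risky_division(a, b):
    # Raises ZeroDivisionError when b is 0.
    return a / b


def run_and_report(func, *args):
    """Call func(*args); on failure print the full traceback and carry on."""
    try:
        return func(*args), None
    except Exception:
        # format_exc() returns the text the interpreter would normally print:
        # the call stack, the offending line, and the exception message.
        tb_text = traceback.format_exc()
        print(tb_text)
        return None, tb_text


value, tb_text = run_and_report(risky_division, 1, 0)
print("Program is still running")
```

traceback.print_exc() does the printing in a single call when the text itself is not needed.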

@nepsilon
nepsilon / simple-cache-busting-with-nginx.md
Created September 20, 2016 03:39
Simple cache busting with Nginx — First published in fullweb.io issue #66

Simple cache busting with Nginx

You update your app.js or styles.css, but with a cache lifetime of 30 days none of your clients will get the latest version? 😟

While the best approach would be to use a build mechanism that generates new filenames on the server, here is how to ensure clients get your latest updates:

1. Change the name of the files in the HTML, for example styles.css to styles.123.css

2. Add this cache busting snippet in your nginx conf:
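
The idea behind step 2 is that the server maps a versioned name such as styles.123.css back to the real file styles.css, so the browser sees a new URL while the file on disk keeps its name. The Python sketch below only illustrates that mapping and is not the nginx configuration itself; the pattern (a purely numeric token before a .css or .js extension) and the function name strip_version are assumptions.

```python
import re

# Assumed naming scheme: <base>.<digits>.<css|js>, e.g. styles.123.css
VERSIONED = re.compile(r"^(?P<base>.+)\.(?P<version>\d+)\.(?P<ext>css|js)$")


def strip_version(path):
    """Map a cache-busted path back to the file that exists on disk."""
    match = VERSIONED.match(path)
    if match is None:
        # No version token present: serve the path unchanged.
        return path
    return f"{match.group('base')}.{match.group('ext')}"


print(strip_version("/static/styles.123.css"))
print(strip_version("/static/app.js"))
```

Bumping the number in the HTML (styles.124.css, and so on) then forces a fresh download without renaming anything on the server.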

@nepsilon
nepsilon / what-you-should-know-about-tabindex.md
Created September 13, 2016 09:31
What you should know about tabindex — First published in fullweb.io issue #65

What you should know about tabindex

tabindex is an HTML core global attribute.

With it you can control the order in which elements receive focus when the user presses the TAB key. You can also prevent an element from gaining focus through the TAB key.

Typical example:

  1. We have a login form
@nepsilon
nepsilon / speed-is-a-feature-10-tips-faster-browser-networking.md
Created September 6, 2016 05:31
Speed is a feature: 10 tips for faster browser networking — First published in fullweb.io issue #64

Speed is a feature: 10 tips for faster browser networking

1. Minimize TCP connections

Ensure the web server uses Keep-Alive headers.

2. Reduce DNS look-ups

DNS look-ups are the first thing blocking your HTTP requests.