@zapalote
zapalote / extract-gbooks-terms.py
Last active April 2, 2024 11:31
Example of multi-threading and memory mapped file processing.
# extraction pattern: ngram TAB year TAB match_count TAB volume_count NEWLINE
# out: unique_ngram TAB sum(match_count) NEWLINE
import re
import os, sys, mmap
from pathlib import Path
from tqdm import tqdm
from concurrent.futures import ThreadPoolExecutor
abv = re.compile(r'^(([A-Z]\.){1,})(_|[^\w])') # A.B.C.
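The combination named in the description (memory-mapped reading plus a thread pool) can be sketched roughly as follows. This is a minimal sketch, not the gist's actual code: the function names `sum_match_counts` and `sum_many`, the one-task-per-file split, and the `workers` parameter are assumptions made for illustration. Only the tab-separated input layout and the summed output come from the comments above.

```python
import mmap
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def sum_match_counts(path):
    """Sum match_count per ngram for one file (hypothetical helper).

    Expects lines of: ngram TAB year TAB match_count TAB volume_count.
    Note: mmap cannot map an empty file, so empty inputs raise ValueError.
    """
    totals = Counter()
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # readline() returns b"" at end of file, which stops the iterator.
            for raw in iter(mm.readline, b""):
                fields = raw.rstrip(b"\r\n").split(b"\t")
                if len(fields) != 4:
                    continue  # skip lines that do not match the pattern
                ngram, _year, match_count, _volume_count = fields
                totals[ngram.decode("utf-8")] += int(match_count)
    return totals


def sum_many(paths, workers=4):
    """Process several files in a thread pool and merge the per-file sums."""
    merged = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for totals in pool.map(sum_match_counts, paths):
            merged.update(totals)  # Counter.update adds counts together
    return merged
```

Splitting the work per file keeps each thread's `Counter` private, so no locking is needed until the final merge on the calling thread.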
@hynekcer
hynekcer / maxsubstring.py
Created January 12, 2018 02:30
fast longest common substring - by suffix array
#!/usr/bin/env python
"""Find the longest repeated substring.
"Efficient way to find longest duplicate string for Python (From Programming Pearls)"
http://stackoverflow.com/questions/13560037/
The algorithm is based on "Prefix doubling".
The worst time complexity is O(n (log n)^2). Memory requirements are linear.
"""
import time
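The prefix-doubling idea mentioned in the docstring can be illustrated with a short sketch. This is not the gist's implementation: the names `suffix_array` and `longest_repeated_substring` are chosen here for illustration, each round uses a plain comparison sort (giving O(n (log n)^2) for the array), and the final scan compares neighbouring suffixes character by character, which is simple but can be quadratic on highly repetitive input.

```python
def suffix_array(s):
    """Build a suffix array by prefix doubling (illustrative sketch)."""
    n = len(s)
    if n == 0:
        return []
    rank = [ord(c) for c in s]  # initial rank: the first character
    sa = list(range(n))
    k = 1
    while True:
        # Rank of the first 2k characters = (rank of first k, rank of next k).
        def key(i):
            return (rank[i], rank[i + k] if i + k < n else -1)

        sa.sort(key=key)
        new_rank = [0] * n
        for j in range(1, n):
            bump = 1 if key(sa[j]) != key(sa[j - 1]) else 0
            new_rank[sa[j]] = new_rank[sa[j - 1]] + bump
        rank = new_rank
        if rank[sa[-1]] == n - 1:  # all ranks distinct: fully sorted
            return sa
        k *= 2


def longest_repeated_substring(s):
    """Return the longest substring occurring at least twice in s."""
    sa = suffix_array(s)
    best_start, best_len = 0, 0
    # The longest repeat is a common prefix of two adjacent sorted suffixes.
    for a, b in zip(sa, sa[1:]):
        length = 0
        while (a + length < len(s) and b + length < len(s)
               and s[a + length] == s[b + length]):
            length += 1
        if length > best_len:
            best_start, best_len = a, length
    return s[best_start:best_start + best_len]
```

For example, `longest_repeated_substring("banana")` returns `"ana"`, found as the shared prefix of the adjacent suffixes `ana` and `anana`.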
@ahmed-musallam
ahmed-musallam / compress_pdf.md
Last active August 17, 2025 20:46
How to compress PDF with ghostscript

How to compress PDF using ghostscript

As a developer, it bothers me when someone sends me a PDF file that is large relative to its number of pages. Recently, I received a 12MB scanned document for just one letter-sized page... so I got to googling, like I usually do, and found ghostscript!

To learn more about ghostscript (gs): https://www.ghostscript.com/

What we are interested in is the gs command line tool, which provides many options for manipulating PDFs; here we want to compress those large PDFs into small yet legible documents.

Credit goes to this answer on the askubuntu forum: https://askubuntu.com/questions/3382/reduce-filesize-of-a-scanned-pdf/3387#3387?newreg=bceddef8bc334e5b88bbfd17a6e7c4f9
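A typical way to drive gs for this purpose is through its `pdfwrite` device with a `-dPDFSETTINGS` preset. The sketch below wraps such an invocation in Python. The helper names `build_gs_command` and `compress_pdf`, the default `ebook` preset, and the file names are assumptions for illustration, and the exact flag set may differ from the one in the linked answer.

```python
import subprocess

# Quality presets accepted by ghostscript's -dPDFSETTINGS option.
PDF_SETTINGS = ("screen", "ebook", "printer", "prepress", "default")


def build_gs_command(src, dst, quality="ebook"):
    """Return the gs argument list for compressing src into dst."""
    if quality not in PDF_SETTINGS:
        raise ValueError(f"unknown quality preset: {quality!r}")
    return [
        "gs",
        "-sDEVICE=pdfwrite",          # write a new PDF
        "-dCompatibilityLevel=1.4",
        f"-dPDFSETTINGS=/{quality}",  # lower presets downsample images more
        "-dNOPAUSE",
        "-dQUIET",
        "-dBATCH",                    # exit when done instead of prompting
        f"-sOutputFile={dst}",
        src,
    ]


def compress_pdf(src, dst, quality="ebook"):
    """Run gs; requires the gs binary to be installed and on PATH."""
    subprocess.run(build_gs_command(src, dst, quality), check=True)
```

Calling `compress_pdf("scan.pdf", "scan-small.pdf")` would run the command. Trying `screen` first and moving up to `ebook` if the text becomes hard to read is a reasonable way to find the smallest legible result.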

@cvan
cvan / qs.js
Last active February 21, 2024 13:44
get query-string parameters (alternative to `URLSearchParams`)
var queryParams = window.location.search.substr(1).split('&').reduce(function (qs, query) {
  var chunks = query.split('=');
  var key = chunks[0];
  var value = decodeURIComponent(chunks[1] || '');
  var valueLower = value.trim().toLowerCase();
  if (valueLower === 'true' || valueLower === 'false') {
    // Boolean('false') is true, so compare against the literal instead.
    value = valueLower === 'true';
  } else if (value.trim() !== '' && !isNaN(Number(value))) {
    // Skip empty strings: Number('') is 0, which would turn "?a=" into 0.
    value = Number(value);
  }
package main
import (
"database/sql"
"strconv"
"log"
"net/http"
"fmt"
"bytes"
"gopkg.in/gin-gonic/gin.v1"
@justjanne
justjanne / Price Breakdown.md
Last active April 5, 2025 08:10 — forked from kylemanna/price.txt
Server Price Breakdown: DigitalOcean, Amazon AWS LightSail, Vultr, Linode, OVH, Hetzner, Scaleway/Online.net:

Permalink: git.io/vps

$5/mo

Provider Type RAM Cores Storage Transfer Network Price
@spaze
spaze / opera-vpn.md
Last active December 22, 2024 15:50
Opera VPN behind the curtains is just a proxy, here's how it works

2023 update

ℹ️ Please note this research is from 2016, when Opera first added their browser "VPN", even before the "Chinese deal" was closed. They have since introduced some real VPN apps, but what follows is not about them.

🕵️ Some folks also like to use this article as proof that the Opera browser is spyware, or that Opera sells all your data to 3rd parties, or something like that. This article doesn't say anything like that.


During setup (that is, immediately after the user enables it in settings), Opera VPN sends a few API requests to https://api.surfeasy.com to obtain credentials and proxy IPs; see below, and also see The Oprah Proxy.

The browser then talks to a proxy, de0.opera-proxy.net (when the VPN location is set to Germany). Its IP address can only be resolved from within Opera when the VPN is on; it's 185.108.219.42 (or similar, see below). It's an HTTP/S proxy which requires auth.
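For readers unfamiliar with authenticated HTTP proxies, here is a minimal sketch of how a client attaches proxy credentials to a request. It assumes the HTTP Basic scheme, which this excerpt does not confirm for Opera's proxy. The function name `proxy_auth_header` and the `user`/`pass` values are placeholders; the real credentials would come from the API requests described above.

```python
import base64


def proxy_auth_header(username, password):
    """Build a Proxy-Authorization header (Basic scheme assumed)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Proxy-Authorization": f"Basic {token}"}


# Placeholder credentials, not real ones.
headers = proxy_auth_header("user", "pass")
```

A proxy that requires auth typically answers unauthenticated requests with status 407 (Proxy Authentication Required), and the client then retries with a header like this one.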

@prasoon2211
prasoon2211 / suffix_array.py
Last active November 12, 2019 08:15
Python suffix array
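from collections import defaultdict

def sort_bucket(s, bucket, order):
    # Group suffix start positions by their first `order` characters.
    d = defaultdict(list)
    for i in bucket:
        key = s[i:i+order]
        d[key].append(i)
    result = []
    for k, v in sorted(d.items()):  # items() replaces Python 2's iteritems()
        if len(v) > 1:
            result += sort_bucket(s, v, order*2)
        else: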
@pa4373
pa4373 / bcc.py
Last active October 19, 2019 15:50
Listen to BCC Pop Network (中廣流行網) without opening the hichannel web page
#!/usr/bin/env python
import time
import base64
import hashlib
import urllib
import urlparse
import urllib2
import subprocess
from collections import OrderedDict
@asfktz
asfktz / how to make desktop version of google docs.md
Last active January 20, 2023 00:18
google docs - desktop version

steps:

  • you need to have node.js installed

  • copy main.js & package.json to a new folder

  • in the terminal:

    • to build it, run: npm install and then npm start

    • to pack it like a regular app, use 'electron-packager'. install it globally by running: npm install electron-packager -g