mdsumner’s gists

mdsumner / parquet_column.md

Created September 12, 2025 00:57

In R just do

arrow::read_parquet("https://github.com/mdsumner/dryrun/raw/refs/heads/main/data-raw/noaa_oi_025_degree_daily_sst_avhrr.parquet")$url

In python, I understand GDAL

from osgeo import ogr

mdsumner / generic_gdal_config.md

Created September 11, 2025 06:33

import os
import rasterio

dsn = "/vsis3/idea-10.7289-v5sq8xb5/www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/198109/oisst-avhrr-v02r01.19810901.nc"
os.environ["AWS_S3_ENDPOINT"] = "projects.pawsey.org.au"
os.environ["AWS_NO_SIGN_REQUEST"] = "YES"
os.environ["AWS_VIRTUAL_HOSTING"] = "YES"

mdsumner / virtualize_bluelink_salt.md

Last active September 17, 2025 03:52

All working from here https://virtualizarr.readthedocs.io/en/stable/usage.html (with HTTP tab)

## on the docker image
## reticulate::use_python("/workenv/bin/python3.12")


library(bowerbird)
bl <- bb_source(

mdsumner / sooty_seaice.md

Last active August 28, 2025 00:31

Install sooty

#install.packages("remotes")
remotes::install_cran("sooty")

Read the latest ice data we have for antarctica-amsr2-asi-s3125.

mdsumner / oisst_dir.md

Created August 24, 2025 23:08

See how the target doesn't have the "/%Y%m/" component:

Sun Aug 24 22:59:11 2025
Synchronizing dataset: NOAA OI 1/4 Degree Daily SST AVHRR
Source URL https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/
--------------------------------------------------------------------------------------------

 this dataset path is: /perm_storage/home/data/r_tmp/Rtmp7EG26o/bowerbird_files/www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr
 visiting https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/ ...

mdsumner / something_for_the_cntrys.md

Last active August 24, 2025 22:43

cnts <- rnaturalearth::ne_countries()
n <- nrow(cnts)
sf::sf_use_s2(FALSE)
pfun <-function(x) {
  x <- x[1, ]
  #expecting a sf cnt
  op <- options(warn = -1)
 on.exit(options(op))

mdsumner / hack_georef_tasmap.md

Last active August 19, 2025 23:03

From this URL, we remove the sidebar, and full-screen screenshot to this local PNG file, then georef from the url

https://tasmap.org/#9/-42.8830/147.0026

The file read here with rast() can be read directly with this, or itis attached below.

r <- rast("/vsicurl/https://private-user-images.githubusercontent.com/4107631/479547609-2127f648-d117-457b-ba01-6b181cdd5843.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NTU2MTI0NzIsIm5iZiI6MTc1NTYxMjE3MiwicGF0aCI6Ii80MTA3NjMxLzQ3OTU0NzYwOS0yMTI3ZjY0OC1kMTE3LTQ1N2ItYmEwMS02YjE4MWNkZDU4NDMucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDgxOSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTA4MTlUMTQwMjUyWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9MTQ0NWJmNDk0M2NhNDUzMGI3Njg3YWQ3OWZiYmExMGYzNjY4Y2E4ZjM5OTU0YTNhNTk3YmMxZGE2NTZiMmQyYSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.0TjLYya2nBjCNiO9po3uq

mdsumner / parcels_hobart.md

Last active August 11, 2025 06:57

parcels <- function(address, distance = 200) {
  parcel_dsn <- "/vsizip//vsicurl/listdata.thelist.tas.gov.au/opendata/data/LIST_PARCELS_HOBART.zip/list_parcels_hobart.shp"
  pt <- tidygeocoder::geo(address, quiet = TRUE)
  parcel_ds <- new(gdalraster::GDALVector, parcel_dsn)
  on.exit(parcel_ds$close(), add = TRUE)
  prj <- parcel_ds$getSpatialRef()
  pp <- gdalraster::transform_xy(cbind(pt$long, pt$lat), srs_to = prj, srs_from = "EPSG:4326")
  ex <- rep(pp, each = 2) + c(-1, 1, -1, 1) * distance
 sf &lt;- gdalraster::bbox_to_wkt(ex[c(1, 3, 2, 4)])

mdsumner / internal_boundaries.md

Last active August 9, 2025 00:43

In this thread it's actually not well defined, essentially if a vertex is used twice it shouldn't be in the output. (But should we normalize on edge or vertex, or full shared boundaries?). Should islands stay in the set? (I don't think so)

With silicate, simplest case is

library(silicate)
sc <- SC(polygon)
library(dplyr)
sc$object_link_edge <- sc$object_link_edge |> 
 group_by(edge_) |&gt;

mdsumner / overlapping_zarr.md

Created August 3, 2025 02:45

do you need 500 slightly overlapping Zarr datasets?

src <- "/vsicurl/https://projects.pawsey.org.au/idea-gebco-tif/GEBCO_2024.tif"
src <- "https://projects.pawsey.org.au/idea-gebco-tif/GEBCO_2024.tif"
library(purrr) ## purrr CRAN
library(mirai) ## mirai CRAN
if (!file.exists(basename(src))) {
  curl::curl_download(src, basename(src))  ## curl CRAN
}

Michael Sumner mdsumner