@Gcav66
Created October 27, 2021 15:49
Snippet: Running dask_cudf on a Coiled GPU Dask Cluster
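%%time
# Assumes an earlier cell imported afar (e.g. `import afar` and `from afar import remotely`)
# and created a Dask Client connected to the Coiled GPU cluster.
with afar.run, remotely:
    # This block executes on the cluster, so dask_cudf is imported there rather than locally.
    import dask_cudf

    # Read the 2019 yellow-taxi CSVs from the public S3 bucket and persist them in cluster memory.
    df = dask_cudf.read_csv(
        "s3://nyc-tlc/trip data/yellow_tripdata_2019-*.csv",
        parse_dates=["tpep_pickup_datetime", "tpep_dropoff_datetime"],
        storage_options={"anon": True},
        assume_missing=True,
    ).persist()

    # Mean tip per passenger count, converted to pandas so the result loads without a local GPU.
    res = (
        df.groupby("passenger_count").tip_amount.mean().compute().to_pandas()
    )

# res is a Dask Future; .result() copies the pandas Series into the local session.
res.result()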
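
For anyone who wants to check the aggregation without a GPU cluster, here is a minimal local sketch of the same groupby/mean step in plain pandas. The rows in `sample` are made-up toy values, not taxi data, and `mean_tip_by_passengers` is a hypothetical helper name. `passenger_count` is stored as floats because `assume_missing=True` makes Dask read unspecified integer columns as floats.

```python
import pandas as pd

# Hypothetical toy rows standing in for the taxi data; values are invented for illustration.
sample = pd.DataFrame(
    {
        "passenger_count": [1.0, 1.0, 2.0, 2.0, 3.0],
        "tip_amount": [1.0, 3.0, 2.0, 4.0, 5.0],
    }
)


def mean_tip_by_passengers(df):
    """Same aggregation as the remote snippet: mean tip_amount per passenger_count."""
    return df.groupby("passenger_count").tip_amount.mean()


local_res = mean_tip_by_passengers(sample)
print(local_res)
```

The remote version should return a pandas Series of the same shape (indexed by `passenger_count`), computed over the full 2019 dataset.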