You got your hands on some data that was leaked from a social network, and you want to help the affected users.
Luckily, you know of a government service that automatically blocks a list of credit cards.
The service is a little old-school though, and you have to upload a CSV file in exactly the right format. The upload fails if the CSV file contains invalid data.
The CSV file should have two columns, Name and Credit Card. It must also be named according to the following pattern:
YYYYMMDD.csv.
The leaked data doesn't have credit card details for every user and you need to pick only the affected users.
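To make the requirements concrete, here is a minimal Haskell sketch of the formatting part only: keep the users that have a credit card, render the two columns, and build the YYYYMMDD.csv file name. It is a sketch under assumptions, not a reference solution. The `User` type, its field names, the sample users and the sample date are all made up for illustration, and the real leaked data may be shaped differently. It only needs `base` and the `time` package.

```haskell
import Data.List (intercalate)
import Data.Time (Day, defaultTimeLocale, formatTime, fromGregorian)

-- Hypothetical record shape; the field names are assumptions,
-- not taken from the leaked data.
data User = User
  { userName :: String
  , userCard :: Maybe String
  }

-- Keep only users that actually have a (non-empty) credit card.
affectedRows :: [User] -> [(String, String)]
affectedRows users =
  [ (userName u, card)
  | u <- users
  , Just card <- [userCard u]
  , not (null card)
  ]

-- Quote a field only when it contains a comma, a quote or a line break;
-- inner quotes are doubled.
csvField :: String -> String
csvField s
  | any (`elem` ",\"\r\n") s = "\"" ++ concatMap escape s ++ "\""
  | otherwise = s
  where
    escape '"' = "\"\""
    escape c = [c]

-- Render the header line plus one line per affected user.
renderCsv :: [(String, String)] -> String
renderCsv rows = unlines (map renderRow (("Name", "Credit Card") : rows))
  where
    renderRow (n, c) = intercalate "," [csvField n, csvField c]

-- Build the required file name, e.g. 20151001.csv.
csvFileName :: Day -> String
csvFileName = formatTime defaultTimeLocale "%Y%m%d.csv"

main :: IO ()
main = do
  -- Made-up sample data, for illustration only.
  let users =
        [ User "Alice Example" (Just "1234-5678-9012-3456")
        , User "Bob Example" Nothing
        , User "Carol Example" (Just "")
        ]
      day = fromGregorian 2015 10 1
  putStrLn (csvFileName day)
  putStr (renderCsv (affectedRows users))
```

With the sample data above, only Alice ends up in the output, because Bob has no card and Carol's is empty. Fetching and parsing the real data is left out on purpose, since that is the part where the tools differ the most.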
The data was published here:
You don't have much time to act.
What tools would you use to get the data, format it correctly and save it in the CSV file?
Do you have a crazy vim configuration that allows you to do all of this inside your editor? Are you a shell power user who would write this as a one-liner? How would you solve this in your favorite programming language?
Show me your solution in the comments below!
Thank you all for participating!
I never thought so many people would be willing to submit a solution. This is exactly the overview of different technologies and ways of thinking I had hoped to get.
We have solutions without any coding, solutions in one line of code and solutions with over a hundred lines.
I hope everyone else also learned something new by looking at these different styles!
Make sure to also check out the solutions on Hacker News, Reddit (and /r/haskell) and dev.to!
Cheers, Jorin
Here's my Haskell solution. By default it uses the supplied URL as the data source, but it can combine the data from any number of URLs and/or files given as command line arguments.
```haskell
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DeriveGeneric #-}

module Main (main) where

import qualified System.Environment as E
import qualified Network.HTTP.Conduit as N
import qualified Data.Aeson as A
import qualified Data.Text as T
import qualified Data.ByteString.Lazy as B
import qualified Data.Maybe as DM
import qualified Data.Csv as C
import Data.Csv ((.=))
import qualified Data.Vector as V
import GHC.Generics
import qualified Data.Time as DT
import qualified Data.List as DL
import qualified Network.URI as U
import qualified Control.Monad as M

-- One user record; records that do not decode into both fields are skipped.
data Person = Person
  { name :: T.Text
  , creditcard :: T.Text
  } deriving (Show, Generic)

instance A.FromJSON Person

instance C.ToNamedRecord Person where
  toNamedRecord (Person name creditcard) =
    C.namedRecord ["Name" .= name, "Credit Card" .= creditcard]

defaultURL :: String
defaultURL = "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"

main :: IO ()
main = do
  args <- E.getArgs
  -- Arguments that parse as URIs are downloaded; everything else is read as a file.
  let (urls, files) = DL.partition U.isURI (if null args then [defaultURL] else args)
  M.when (null args) (putStrLn "No arguments provided. Downloading from default URL. Any number of files and/or URLs can be provided as arguments.")
  M.unless (null files) (putStrLn "Reading the following files:" >> mapM_ putStrLn files)
  M.unless (null urls) (putStrLn "Downloading from the following URLs:" >> mapM_ putStrLn urls)
  contents <- M.liftM2 (++) (mapM N.simpleHttp urls) (mapM B.readFile files)
  -- Split on newlines (byte 10) and drop the last byte of each line before
  -- decoding it as JSON. Empty lines are skipped because B.init throws on
  -- an empty ByteString.
  let lines = concatMap (map B.init . filter (not . B.null) . B.split 10) contents
      parse = DM.mapMaybe A.decode lines :: [Person]
      csv = C.encodeByName (V.fromList ["Name", "Credit Card"]) parse
  now <- DT.getCurrentTime
  -- Name the output file after the current date, e.g. YYYYMMDD.csv.
  let fileName = DT.formatTime DT.defaultTimeLocale "%Y%m%d.csv" now
  putStrLn $ "Writing to " ++ fileName
  B.writeFile fileName csv
```