@pgroudas
Created May 1, 2012 19:41
Simple test.pig
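-- Register the jsonStorage jar and define a short alias for the JsonStorage store function.
REGISTER 's3n://intentmedia-hawk-output/paul/jars/jsonStorage.jar';
DEFINE JsonStorage org.apache.pig.builtin.JsonStorage();
-- Load the Million Song Dataset from S3 as tab-separated text (see data spec at: http://bit.ly/vOBKPe)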
songs = LOAD 's3n://tbmmsd/T.tsv.j' USING PigStorage('\t') AS (
track_id:chararray, analysis_sample_rate:chararray, artist_7digitalid:chararray,
artist_familiarity:chararray, artist_hotness:double, artist_id:chararray, artist_latitude:chararray,
artist_location:chararray, artist_longitude:chararray, artist_mbid:chararray, artist_mbtags:chararray,
artist_mbtags_count:chararray, artist_name:chararray, artist_playmeid:chararray, artist_terms:chararray,
artist_terms_freq:chararray, artist_terms_weight:chararray, audio_md5:chararray, bars_confidence:chararray,
bars_start:chararray, beats_confidence:chararray, beats_start:chararray, danceability:double,
duration:float, end_of_fade_in:chararray, energy:chararray, key:chararray, key_confidence:chararray,
loudness:chararray, mode:chararray, mode_confidence:chararray, release:chararray,
release_7digitalid:chararray, sections_confidence:chararray, sections_start:chararray,
segments_confidence:chararray, segments_loudness_max:chararray, segments_loudness_max_time:chararray,
segments_loudness_max_start:chararray, segments_pitches:chararray, segments_start:chararray,
segments_timbre:chararray, similar_artists:chararray, song_hotness:chararray, song_id:chararray,
start_of_fade_out:chararray, tatums_confidence:chararray, tatums_start:chararray, tempo:double,
time_signature:chararray, time_signature_confidence:chararray, title:chararray, track_7digitalid:chararray,
year:int );
-- Write the loaded records back to S3 as JSON.
STORE songs INTO 's3n://intentmedia-hawk-output/paul/jsonStorage/songs' USING JsonStorage();