Forked from BrentPalmer/YouTube Data Checker.rb
Last active November 17, 2015 19:25
YouTube Data Checker - Parses two CSV files and outputs the emails of the rows with discrepancies. *Note* I did not know whether I was allowed to ask questions about the challenge. I noticed that "UC" was prepended to some channel_ownership strings. I did not know whether this was a data entry error, so I processed it as not being one, BUT added the necessary code to take…
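The "UC" question in the note can be made concrete with a minimal sketch, assuming channel values arrive either as a full URL or as a bare ID. The helper name `normalize_channel`, its `strip_uc` flag, and the sample URL are hypothetical and not part of the gist; they only illustrate how the two interpretations differ.

```ruby
# Hypothetical helper (not part of the gist): reduces a channel URL or ID
# to its last path segment, optionally dropping a leading "UC".
def normalize_channel(value, strip_uc: false)
  id = value.to_s.strip.split('/').last.to_s
  strip_uc ? id.sub(/\AUC/, '') : id
end

url = 'http://www.youtube.com/channel/UCabc123' # made-up sample value
puts normalize_channel(url)                 # treats "UC" as part of the ID
puts normalize_channel(url, strip_uc: true) # treats "UC" as an input error
```

Keeping the prefix is the conservative default: a genuine "UC" in one file and its absence in the other is then reported as a discrepancy rather than silently hidden.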
require 'csv'

class YouTubeDataParser
  def initialize(args)
    raise "Missing 'file1.csv'" if args[0].nil?
    raise "Missing 'file2.csv'" if args[1].nil?

    @file1 = CSV.read(args[0], headers: true)
    @file2 = CSV.read(args[1], headers: true)
    @concern = args[2]
    yt_data_checker(@file1, @file2, @concern)
  end

  # Checks the concern and runs the matching comparison(s).
  # Any other value (including none) compares both fields.
  def yt_data_checker(file1, file2, concern)
    if concern == "channel_ownership"
      sanitize_channels(file1, file2)
      calculate_differences(@file_1_yt_channels, @file_2_yt_channels)
      print_emails(@total_difference)
    elsif concern == "subscriber_count"
      sanitize_subscriber_count(file1, file2)
      calculate_differences(@file_1_subscriber_count, @file_2_subscriber_count)
      print_emails(@total_difference)
    else
      sanitize_channels(file1, file2)
      sanitize_subscriber_count(file1, file2)
      calculate_differences(@file_1_yt_channels, @file_2_yt_channels)
      calculate_differences(@file_1_subscriber_count, @file_2_subscriber_count)
      print_emails(@total_difference)
    end
  end

  # Normalizes channels: maps each email (column 0) to the last path
  # segment of its channel value (column 1).
  def sanitize_channels(file1, file2)
    @file_1_yt_channels = {}
    @file_2_yt_channels = {}
    file1.each do |row|
      @file_1_yt_channels[row[0]] = row[1].split('/').last #.gsub(/^UC/, "") -> Insert if UC is an error in the input
    end
    file2.each do |row|
      @file_2_yt_channels[row[0]] = row[1].split('/').last #.gsub(/^UC/, "") -> Insert if UC is an error in the input
    end
  end

  # Normalizes subscriber counts: maps each email (column 0) to its count
  # (column 2) with non-word characters such as commas removed.
  def sanitize_subscriber_count(file1, file2)
    @file_1_subscriber_count = {}
    @file_2_subscriber_count = {}
    file1.each do |row|
      @file_1_subscriber_count[row[0]] = row[2].gsub(/\W/, "")
    end
    file2.each do |row|
      @file_2_subscriber_count[row[0]] = row[2].gsub(/\W/, "")
    end
  end

  # Accumulates the differences for the supplied channel_ownership data,
  # subscriber_count data, or both. Only pairs present in data_set1 but
  # not in data_set2 are collected.
  def calculate_differences(data_set1, data_set2)
    @differences ||= []
    @differences += data_set1.to_a - data_set2.to_a
    @total_difference = @differences
  end

  # Iterates through the differences, collects the emails and prints them.
  def print_emails(differences)
    emails = differences.map { |difference| difference[0] }
    puts "-------Emails With Discrepancies------"
    puts emails.uniq
    puts "--------------------------------------"
  end
end

YouTubeDataParser.new(ARGV)
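
The comparison in `calculate_differences` relies on array subtraction over `[email, value]` pairs. A minimal sketch with made-up in-memory data (the emails and values below are illustrative, not taken from the challenge files) shows what that catches and what it misses: pairs in the first set that are absent or different in the second are reported, while emails that appear only in the second set are not. The symmetric variant is an assumption about what might be wanted, not something the gist does.

```ruby
# Made-up sample data standing in for the two normalized CSV files.
file1 = { 'a@example.com' => 'UC123', 'b@example.com' => 'UC456', 'c@example.com' => 'UC789' }
file2 = { 'a@example.com' => 'UC123', 'b@example.com' => 'UC999', 'd@example.com' => 'UC000' }

# One-directional difference, as in calculate_differences:
# [email, value] pairs found in file1 but not in file2.
one_way = (file1.to_a - file2.to_a).map(&:first).uniq

# Symmetric variant (hypothetical extension): also reports emails
# that exist only in file2.
both_ways = ((file1.to_a - file2.to_a) + (file2.to_a - file1.to_a)).map(&:first).uniq

puts one_way.inspect
puts both_ways.inspect
```

Whether the one-directional result is sufficient depends on whether the second file may contain rows that the first lacks.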