Created September 1, 2014 00:57
S3 directory upload in ruby. Switched http://avi.io/blog/2013/12/03/upload-folder-to-s3-recursively to use the official aws ruby sdk.
#!/usr/bin/env ruby

require 'rubygems'
require 'aws-sdk'

class S3FolderUpload
  attr_reader :folder_path, :total_files, :s3_bucket
  attr_accessor :files

  # Initialize the upload class
  #
  # folder_path - path to the folder that you want to upload
  # bucket      - the bucket you want to upload to
  # aws_key     - your key generated by AWS, defaults to the environment setting AWS_KEY_ID
  # aws_secret  - the secret generated by AWS, defaults to the environment setting AWS_SECRET
  #
  # Examples
  #   => uploader = S3FolderUpload.new("some_route/test_folder", 'your_bucket_name')
  #
  def initialize(folder_path, bucket, aws_key = ENV['AWS_KEY_ID'], aws_secret = ENV['AWS_SECRET'])
    AWS.config(access_key_id: aws_key, secret_access_key: aws_secret, region: 'us-west-2')
    @folder_path = folder_path
    @files       = Dir.glob("#{folder_path}/**/*")
    @total_files = files.length
    @connection  = AWS::S3.new
    @s3_bucket   = @connection.buckets[bucket]
  end

  # Public: Upload files from the folder to S3
  #
  # thread_count - how many threads you want to use (defaults to 5)
  #
  # Examples
  #   => uploader.upload!(20)
  #     true
  #   => uploader.upload!
  #     true
  #
  # Returns true when the process has finished
  def upload!(thread_count = 5)
    file_number = 0
    mutex       = Mutex.new
    threads     = []

    thread_count.times do |i|
      threads[i] = Thread.new {
        until files.empty?
          mutex.synchronize do
            file_number += 1
            Thread.current["file_number"] = file_number
          end

          file = files.pop rescue nil
          next unless file

          # I had some more manipulation here figuring out the git sha
          # For the sake of the example, we'll leave it simple
          path = file

          puts "[#{Thread.current["file_number"]}/#{total_files}] uploading..."

          data = File.open(file)

          if File.directory?(data)
            data.close
            next
          else
            obj = s3_bucket.objects[path]
            obj.write(data, { acl: :public_read })
            data.close
          end
        end
      }
    end

    threads.each { |t| t.join }
  end
end

uploader = S3FolderUpload.new('test', 'miles-media-library', AWS_KEY, AWS_SECRET)
uploader.upload!(1)
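The heart of upload! is a fixed pool of threads popping work items off a shared array, with a Mutex guarding the shared counter. A minimal, self-contained sketch of that pattern (my own illustration, not part of the gist; note that, unlike the gist's bare `files.pop rescue nil`, the pop here is also done inside the mutex, which avoids relying on Array#pop being atomic):

```ruby
require 'thread'

# Process each item in `items` using a fixed pool of `thread_count`
# threads. Each thread repeatedly takes one item off the shared array
# under the mutex and "processes" it (here: just upcases it, standing
# in for the per-file upload work in the gist).
def process_concurrently(items, thread_count = 5)
  mutex   = Mutex.new
  results = []

  threads = Array.new(thread_count) do
    Thread.new do
      loop do
        item = mutex.synchronize { items.pop }
        break unless item
        processed = item.upcase
        mutex.synchronize { results << processed }
      end
    end
  end

  threads.each(&:join)
  results
end
```

Usage: `process_concurrently(%w[a b c], 2).sort` returns `["A", "B", "C"]` regardless of which thread handled which item.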
I tried checking whether the file is writable before writing it to S3, but sub-directories still returned true.
# Using this code as reference. I have a bit of code that checks whether I want
# the folder_path (the parent folder of the files) to be part of the S3 path.
include_folder = false
if include_folder
  path = file
else
  path = file.sub(/^#{folder_path}\//, '')
end
# Then I look at the path. If it's a directory, I skip the file processing.
if File.directory?(path)
  next
else # what I'm looking at is a file
  data = File.open(file)
  obj = s3_bucket.objects[path]
  obj.write(data, { acl: :"public-read" })
  data.close
end
The thing is, when I set include_folder = true the code works like a champ and sub-folders are skipped. But when I don't want to include the parent folder, the first sub-folder this code reaches is not flagged as a directory, so my code attempts to open the current file in the loop and throws an error because it is in fact a directory (Is a directory @ rb_sysopen - foo/bar/bin (Errno::EISDIR)).
I've tried other conditionals in the else block to force the processing (e.g. !File.directory?(path)), but this piece of code keeps throwing the error. Any ideas?
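One plausible reading of that error (an assumption on my part, not a confirmed diagnosis): once the parent folder is stripped, `path` is a relative key like "parent/sub" that no longer resolves from the script's working directory, so File.directory?(path) returns false, the directory slips past the check, and File.open then raises Errno::EISDIR. Checking the original on-disk path and stripping only to build the S3 key would behave differently. A minimal sketch demonstrating the distinction, using throwaway directory names:

```ruby
require 'fileutils'
require 'tmpdir'

# Build a parent/sub directory tree in a temp location.
root = Dir.mktmpdir
FileUtils.mkdir_p(File.join(root, 'parent', 'sub'))

full_path = File.join(root, 'parent', 'sub')               # a real directory on disk
key       = full_path.sub(/^#{Regexp.escape(root)}\//, '') # "parent/sub" - the stripped S3 key

# Run from an unrelated empty directory, as a deployed script might:
Dir.chdir(Dir.mktmpdir) do
  File.directory?(key)       # false - the stripped key doesn't exist here
  File.directory?(full_path) # true  - the absolute path does
end
```

So checking File.directory?(file) (the full path) while still using the stripped value only as the S3 object key keeps the directory test reliable.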