@mbuckbee
Created January 31, 2011 07:31
Bayesian classifier for training BoingBoing post recognition
require 'rubygems'
require 'classifier'
require 'madeleine'
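# Restore the classifier from the latest snapshot in bayes_data,
# or build a fresh one if no snapshot exists yet.
m = SnapshotMadeleine.new("#{RAILS_ROOT}/bayes_data") {
  Classifier::Bayes.new 'Boingable', 'Unboingable'
}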
# Boing Boing titles, one per line.
# File.foreach closes the file itself when the block finishes.
File.foreach("#{RAILS_ROOT}/boingboing_titles.txt") do |line|
  puts "BoingBoing: " + line
  m.system.train_boingable line
end

# Reddit titles, one per line.
File.foreach("#{RAILS_ROOT}/reddit_posts.txt") do |line|
  puts "Reddit: " + line
  m.system.train_unboingable line
end

# Persist the trained classifier to disk.
m.take_snapshot
puts "Bacon"
puts m.system.classify "Bacon" # returns 'Unboingable' because Reddit loves bacon.
puts "Cory"
puts m.system.classify "Cory" # returns 'Boingable' because people like to mention themselves.
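
To make the two results above less of a black box, here is a minimal sketch of how a naive Bayes classifier can score a title. It is not the `classifier` gem's implementation: `TinyBayes`, its tokenizer, the add-one smoothing, and the two training titles are all hypothetical and chosen only for illustration, and equal prior probability is assumed for both categories.

```ruby
# Illustrative naive Bayes sketch. NOT the classifier gem's code:
# the class name, tokenizer, and smoothing are assumptions.
class TinyBayes
  def initialize(*categories)
    # category name => { word => number of times seen in training }
    @counts = categories.map { |c| [c, Hash.new(0)] }.to_h
  end

  # Count every word of the text under the given category.
  def train(category, text)
    tokenize(text).each { |word| @counts.fetch(category)[word] += 1 }
  end

  # Return the category whose word counts give the text the highest
  # log-likelihood, using add-one smoothing for unseen words.
  def classify(text)
    words = tokenize(text)
    vocab = @counts.values.flat_map(&:keys).uniq.size
    best = @counts.max_by do |_category, word_counts|
      total = word_counts.values.sum
      words.sum { |w| Math.log((word_counts[w] + 1.0) / (total + vocab + 1)) }
    end
    best.first
  end

  private

  # Lowercase the text and split it into simple word tokens.
  def tokenize(text)
    text.downcase.scan(/[a-z0-9']+/)
  end
end

# Made-up training titles, one per category.
tiny = TinyBayes.new('Boingable', 'Unboingable')
tiny.train('Boingable', 'Cory Doctorow on steampunk keyboards')
tiny.train('Unboingable', 'I love bacon so much')

puts tiny.classify('Bacon') # => Unboingable
puts tiny.classify('Cory')  # => Boingable
```

A word seen only in one category's training titles pulls the score toward that category, which is the effect the "Bacon" and "Cory" checks in the script rely on.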