Veesa Norman veesa

Responsible Data Challenges In The PeARS Project

The part of PeARS development that I am responsible for is processing URLs into a format that is useful for semantic processing. I am also responsible for the user experience of blacklisting domains so that they are excluded from the search results. At this time, the script only works on modern Linux distributions (tested on Ubuntu and Arch) that use Firefox as their browser.
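As a rough illustration of the blacklisting idea, the Python 3 sketch below drops any URL whose host is a blacklisted domain or a subdomain of one. The function names `is_blacklisted` and `filter_urls` and the example domains are hypothetical and are not taken from the PeARS code base; how PeARS actually stores and applies its blacklist may differ.

```python
from urllib.parse import urlparse


def is_blacklisted(url, blacklist):
    """Return True if the URL's host is a blacklisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in blacklist)


def filter_urls(urls, blacklist):
    """Keep only the URLs whose host is not covered by the blacklist."""
    blocked = {domain.lower() for domain in blacklist}
    return [url for url in urls if not is_blacklisted(url, blocked)]
```

Matching on the parsed host name rather than on the raw URL string avoids blocking unrelated sites that merely contain a blacklisted name somewhere in their address.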

How It Works

At this time, running the script takes the user's Firefox history, retrieves the links, extracts the body text from each document, and stores it in a SQLite database called history.db. This database is located on the user's hard drive, outside the PeARS directory, so it will not be accidentally "pushed" to the PeARS repository. There are two reasons for this. The technical one is that you don't want to version a large binary file from each user. The privacy one is that users will not want to see their own history end up in a shared repository.
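The pipeline described above can be sketched in Python 3 using only the standard library. This is a minimal sketch under stated assumptions, not the PeARS script itself: the helper names, the `history` table, and its `url`/`body` columns are hypothetical, and the step that downloads each page is left out. The one detail taken from Firefox is that visited URLs live in the `moz_places` table of the profile's `places.sqlite` file.

```python
import sqlite3
from html.parser import HTMLParser


class BodyTextExtractor(HTMLParser):
    """Collect the text found inside <body>, skipping scripts and styles."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in ("script", "style") and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_body_text(html):
    """Return the visible body text of an HTML document as one string."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def read_history_urls(places_db_path):
    """Return the URLs recorded in a Firefox places.sqlite file."""
    conn = sqlite3.connect(places_db_path)
    try:
        rows = conn.execute("SELECT url FROM moz_places").fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]


def store_page(conn, url, body):
    """Insert or update one page in the (hypothetical) history table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history (url TEXT PRIMARY KEY, body TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO history (url, body) VALUES (?, ?)", (url, body)
    )
    conn.commit()
```

A caller would open history.db with `sqlite3.connect`, loop over `read_history_urls(...)`, fetch each page, and pass the extracted text to `store_page`. Firefox may hold a lock on `places.sqlite` while it is running, so reading from a copy of the file is a reasonable precaution.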

Install PeARS To Virtual Machine via Vagrant

Why Should I Do This?

By the end of this process you will have an installation of PeARS up and running on your network, no matter which operating system you run. Vagrant creates a virtual machine that is bridged onto your network through your local machine's network device. It is currently configured with the static address 192.168.1.25 and is reachable at that IP address on port 8080.

You will need to change this IP address in the Vagrantfile to suit your own network.

	FROM debian:latest
	MAINTAINER Veesa Norman <[email protected]>
	# Install the system packages in a single layer
	RUN apt-get update && apt-get install -y apt-utils git python-pip wget
	# Build steps must be RUN instructions: only the last CMD in a Dockerfile takes effect
	RUN git clone -b development https://github.com/PeARSearch/PeARS.git
	# The clone lands in the image's default working directory (/)
	WORKDIR /PeARS
	RUN pip install -r requirements.txt
	RUN wget http://clic.cimec.unitn.it/~aurelie.herbelot/openvectors.dump.bz2
	RUN ./uncompress_db openvectors.dump.bz2

	#!/bin/bash
	#
	# DESCRIPTION:
	#
	# Set the bash prompt according to:
	# * the active virtualenv
	# * the branch/status of the current git repository
	# * the return value of the previous command
	#
	# USAGE:

	import math

	EARTH_DIAMETER = 2 * 6378.2  # kilometres
	PI = math.pi
	RAD_CONVERT = PI / 180.0  # multiply degrees by this to get radians

	postcodes = open('geoserve_postcodes.csv', 'r')

	given_postcode = 'CB24 6AJ'
	given_distance = 3000.0 # The distance is given in metres

	import argparse
	import os
	import sys
	import re
	from collections import Counter

	log_list = []
	counter = 0
	offsite = 0
	isOffsite = False  # was bound to the bool type itself rather than a boolean value

	import sqlite3
	import uuid
	from flask import Flask, g, jsonify, request

	DATABASE = '/tmp/database.db'

	app = Flask(__name__)

	package controllers;

	import java.io.IOException;
	import java.io.PrintWriter;
	import java.sql.Connection;
	import java.sql.ResultSet;
	import java.sql.SQLException;
	import java.sql.Statement;
	import java.text.ParseException;
	import java.text.SimpleDateFormat;

	#include <iostream>
	#include <ctime>
	#include <cstdlib>
	#include <string>
	#include <vector>

	using namespace std;

	class Boat
	{