The part of PeARS development that I am responsible for is processing URLs so that they are in a useful format for semantic processing. I am also responsible for the user experience of blacklisting domains, so that pages from those domains are not included in the search results. At this time the script only works on modern Linux distributions (tested on Ubuntu and Arch) that use Firefox as their browser.
Running the script at this time will take the user's Firefox history, retrieve the links, extract the body text from each document and store it in a SQLite database called history.db. This SQLite database is located on the user's hard drive, not in the PeARS directory, so it will not be accidentally "pushed" to the PeARS repository. The reasons for this are both technical (you don't want to try to version a large binary file from each user) and privacy related (users will not want to see their own history pushed to a shared repository).
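The steps described above (read the history, skip blacklisted domains, extract the body text, store it in SQLite) could be sketched roughly as follows. This is only an illustration, not the actual PeARS script: the function names, the `pages` table layout and the `fetch` callback are assumptions made for this example. The one detail taken from Firefox itself is that history URLs live in the `moz_places` table of the profile's `places.sqlite` file. Firefox usually locks that file while it is running, so the sketch assumes it is given a copy.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urlparse


class BodyTextExtractor(HTMLParser):
    """Collect visible text inside <body>, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_body_text(html):
    """Return the text content of the document body as one string."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def is_blacklisted(url, blacklist):
    """True if the URL's host is a blacklisted domain or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in blacklist)


def read_history_urls(places_path):
    """Read all URLs from a copy of Firefox's places.sqlite."""
    conn = sqlite3.connect(places_path)
    try:
        return [row[0] for row in conn.execute("SELECT url FROM moz_places")]
    finally:
        conn.close()


def store_pages(db_path, urls, fetch, blacklist=()):
    """Fetch each non-blacklisted URL and store its body text.

    `fetch` is a hypothetical callback taking a URL and returning HTML;
    the `pages` table layout is an assumption made for this sketch.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, body TEXT)"
            )
            for url in urls:
                if is_blacklisted(url, blacklist):
                    continue
                body = extract_body_text(fetch(url))
                conn.execute(
                    "INSERT OR REPLACE INTO pages VALUES (?, ?)", (url, body)
                )
    finally:
        conn.close()
```

Passing the downloader in as `fetch` keeps the sketch independent of any particular HTTP library and makes it easy to try out with canned HTML. In real use, `db_path` would point to a history.db somewhere outside the PeARS checkout, in line with the reasons given above.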