@drjwbaker
Last active June 9, 2016 11:37
'Born digital big data and approaches for history and the humanities' Workshop 1, School of Advanced Study (University of London), 8 June 2016

Live notes, so an incomplete, partial record of what actually happened.

Tags: AHRCborndigital

Site: https://www.eventbrite.co.uk/e/born-digital-big-data-and-approaches-for-history-and-the-humanities-tickets-25259497838

My asides in {}


### Talks


10.40 – Introduction to the aims and goals of the research network (Tobias Blanke and Jane Winters)

Previous work: Jisc and AHRC funding; a struggle to get contemporary historians involved, but there were light-bulb moments; access to HPCs was limited, so the focus can be on what people are unable to do; need to work in publication forms that academics recognise.

.@jfwinters giving a good rundown of BUDDAH.. amazing what they did with 18 months: Shine, index, OA monograph, etc. etc. #AHRCborndigital

— Ian Milligan (@ianmilligan1) June 8, 2016

Aim of the day to focus on areas & themes to pursue, based on audience input.


11.20 – Analysing the past through born digital data

Simon Demissie, How will institutions preserving digital data have to adapt to meet the challenges posed in the digital age?

Some of the challenges are out of the hands of institutions, eg legal. The 20-year closure period before transfer of documents to places like the National Archives could well endanger the survival of born-digital documents: preservation that is consistent and ongoing is critical .. From the researcher perspective, is the disruption to collecting institutions caused by adapting to born-digital worth it? .. User experience: what are the expectations? How will staff keep up with those in order to assist? Will we lose the ability to provide nuanced, contextual advice? .. How will those underrepresented by the archive still be findable? Will the digital archive change the character of those who access the archive? (both those fearful of the physical and those fearful of the digital)

Q&A: all this costs lots of money! .. How does write-back work? Tension between the curated entity of the catalogue that a place puts its name against and the fact that researchers often know more about the stuff than the institutions that hold it.

James Baker, Hard disks as archives of everyday life

My deck and notes.

‘An inventory of your hard disk will kill Excel’ Some problems of ‘HD as archives of everyday life’ by @j_w_baker #AHRCborndigital

— Matt Shaw (@_MattShaw) June 8, 2016

We are being made to feel old by @j_w_baker, whose students very much see 1998 as history #AHRCborndigital

— Matt Shaw (@_MattShaw) June 8, 2016

As we begin preserving hard drives at scale, I worry that the individual emulation model may not scale. But early steps! 😀 #AHRCBornDigital

— Ian Milligan (@ianmilligan1) June 8, 2016

Roundtable discussion

Institutional challenges .. information overload and unrepresented people .. new sources .. who's the author .. everyday life, snapshots of everyday life .. understanding operating systems, learning from how systems are organised .. skills .. archives that weren't meant to be archives .. simple tools .. interdisciplinarity .. this reiterated the classic distinctions between personal papers, official archives, published stuff .. tension between longstanding practice with personal papers (don't keep them in order, put them in a chronological/thematic structure; different to corporate archives, where relating to original order is crucial) and the personal digital archive, where the original arrangement of the personal papers ends up alongside the curatorial weeding and reorganisation.

Comment from floor: historians traditionally bad on understanding provenance of sources, digital issues not new. hmm. #AHRCborndigital

— Matt Shaw (@_MattShaw) June 8, 2016

Or, to be more precise, archival work and process elided #AHRCborndigital

— Matt Shaw (@_MattShaw) June 8, 2016

Archival hierarchy & context of documents is lost through "search" because users want something that look likes google! #AHRCBornDigital

— Sharon Webb (@wsharon145) June 8, 2016

Interesting debate on fear of depositing to digital archives-need for greater understanding of uses and public engagement #AHRCBornDigital

— Siobhan Morris (@MorrisSiobhan9) June 8, 2016

13.20 – Analysing the present through born digital data

Anne Alexander, The Arab Revolutions through the lens of born digital data

There has been a technologically deterministic view of these revolutions. Nevertheless, so much of the born-digital activity has been captured .. archives as weapons of counter narratives and counter histories .. opening up state security archives as part of the Arab revolutions .. vulnerability to attrition of non-western, non-standard creation

Mark Coté, The (mobile) born digital data generated by everyday life

Datafication as pre-cognitive production of knowledge about us and our place in the world .. Gilbert Simondon (On the Mode of Existence of Technical Objects); relationships between people and technical objects .. Our Data Ourselves: what can we hack about ourselves that might otherwise be hacked by others?

Mark Coté discussing a hackathon they ran – figuring out what they could learn from unlocked Android phones. Woah. #AHRCBornDigital

— Ian Milligan (@ianmilligan1) June 8, 2016

Hackathons: troubling how leaky with our data free apps can be. Opening up technical objects to look at permissions in the code in comparison to permissions written up in user agreements.

Now @markcote on “Empowering Data Citizens” - cool GitHub repo here: https://t.co/rD31xpoMPz. #AHRCBornDigital

— Ian Milligan (@ianmilligan1) June 8, 2016

Data spectrum from personal to open. Personanondata http://artisopensource.net/persona-non-data/ - are you your data? Techno-cultural approach to the digital human. Increasing cultural awareness of work of data in society.

Roundtable discussion

Tension between not wanting our data to be collected and wanting as historians to have access to rich data about dead people .. as historians we know that the state creates data about people, so we should be aware that large-scale data collection will become valuable to historians .. explosion of media theory work around the Arab Spring .. importance of working with born-digital archives alongside physical stuff .. connect 'automation' of data about us being exteriorised with historical examples of data about 'us' being pushed out: officialdom.


14.45 – Analysing the past and present

Web archives: Ian Milligan, Jason Webber and Peter Webster

Are web archives unique and if so how?

Heterogeneous in form. WARC one of few ISO archival standards. False sense of things being consistent. A container for loads of stuff .. potential for completeness .. not dealing with original items .. unstructured, and yet scholarship tends to focus on the structured bits that surround an otherwise unstructured form .. cat and mouse game between web archivists and web creators: eg, Heritrix only recently able to deal with infinite scroll .. undeveloped ecosystem around web archives .. includes so many types of people
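
{A rough illustration of the 'container for loads of stuff' point, not something shown on the day. This is a minimal sketch that assumes an uncompressed WARC stream whose records each carry a `Content-Length` header; it walks the records and tallies what kinds of things sit inside. The helper names (`make_record`, `iter_warc_records`, `tally_content_types`) are made up for illustration, and the sample records are simplified: real records also carry mandatory fields such as `WARC-Record-ID` and `WARC-Date`, and real files are usually gzipped. For `response` records the record-level `Content-Type` is `application/http`, with the actual media type inside the HTTP headers of the payload, which this sketch does not parse.}

```python
from collections import Counter
import io


def make_record(warc_type, content_type, body):
    """Build a simplified WARC record (illustrative only; omits mandatory fields)."""
    head = (
        "WARC/1.0\r\n"
        f"WARC-Type: {warc_type}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("utf-8")
    # Each record is followed by two CRLFs.
    return head + body + b"\r\n\r\n"


def iter_warc_records(stream):
    """Yield (headers, payload) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # skip the blank lines that separate records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"expected a WARC version line, got {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # a blank line ends the header block
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Read exactly Content-Length bytes, so payloads may contain newlines.
        payload = stream.read(int(headers["content-length"]))
        yield headers, payload


def tally_content_types(stream):
    """Count records by (WARC-Type, Content-Type) pair."""
    counts = Counter()
    for headers, _ in iter_warc_records(stream):
        counts[(headers.get("warc-type"), headers.get("content-type"))] += 1
    return counts


sample = (
    make_record("warcinfo", "application/warc-fields", b"software: example\r\n")
    + make_record("resource", "text/html", b"<p>hi</p>")
    + make_record("resource", "image/png", b"\x89PNG\r\n\x1a\n")
)
for key, n in sorted(tally_content_types(io.BytesIO(sample)).items()):
    print(key, n)
```

{Even three toy records give three different kinds of content, which is the 'false sense of things being consistent' problem in miniature.}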

What is a real challenge facing web archives?

Skills of scholars .. understanding and provenance .. money (or at least, more expensive than historians are used to being) .. tools aren't there .. relationship between mainstream history and the digital turn is complex .. access! Reading room access is harder than working with a normal archive .. that said, legal deposit in other countries is only an aspiration ..

Good use cases

Baked-in partnerships between archivists and researchers in BUDDAH et al .. work on 2015 federal election: moving focus away from 'letters to the editor' as barometer of public opinion .. Geocities: seedlist

Q&A

Duplication of collection not necessarily a bad thing: big crawls and curated collections make useful joins .. we need to give ourselves a break, we need to be honest and collaborate and set good examples undergraduates can be inspired by.


Some admin...

Creative Commons Licence
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Exceptions: embeds to and from external sources, and direct quotations from speakers
