An Incomplete #dh2014 Twitter Archive (Conference Days Only)
This .XLS file contains a dataset of Tweets tagged with #dh2014 (case-insensitive).
If you use or refer to this data in any way please cite and link back using the following citation information:
Priego, Ernesto (2014): An Incomplete #dh2014 Twitter Archive (Conference Days Only). figshare. http://dx.doi.org/10.6084/m9.figshare.1102950
This file was created and shared by Ernesto Priego (Centre for Information Science, City University London) under a Creative Commons Attribution license (CC-BY) for academic research and educational use.
The complete archive contains 16,154 Tweets published publicly and tagged with #dh2014 between Monday 07/07/2014 00:03:00 (CEST) and Saturday 12/07/2014 23:48:00 (CEST).
The Tweets contained in this file were collected using Martin Hawksey’s TAGS 5.1. Due to the volume of Tweets, nine Google Spreadsheets were created during the event; these were later refined to four. The data was then organised manually into the sheets included here.
Sheet 0. A 'Cite Me' sheet, including the provenance of this file, citation information, information about its contents, the methods employed and some context.
Sheet 1. Monday 7 July 2014 (1,052 Tweets; gap between Monday 07/07/2014 10:19 and Monday 07/07/2014 11:20)
Sheet 2. Tuesday 8 July 2014 (3,605 Tweets)
Sheet 3. Wednesday 9 July 2014 (4,372 Tweets)
Sheet 4. Thursday 10 July 2014 (2,879 Tweets; significant gap between 10/07/2014 01:51 and 10/07/2014 10:10)
Sheet 5. Friday 11 July 2014 (3,843 Tweets)
Sheet 6. Saturday 12 July 2014 (403 Tweets)
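As a quick consistency check, the per-sheet counts listed above can be summed and compared with the stated archive total of 16,154 Tweets. The following is a minimal sketch; the dictionary name and keys are illustrative and are not taken from the file itself.

```python
# Tweet counts per sheet, copied from the sheet list above.
# The variable names are illustrative, not part of the dataset.
sheet_counts = {
    "Monday 7 July 2014": 1052,
    "Tuesday 8 July 2014": 3605,
    "Wednesday 9 July 2014": 4372,
    "Thursday 10 July 2014": 2879,
    "Friday 11 July 2014": 3843,
    "Saturday 12 July 2014": 403,
}

# Sum the daily counts to compare against the stated archive total.
total = sum(sheet_counts.values())
print(total)  # 16154
```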
Timestamps were recorded in local Lausanne, Switzerland time (CEST). Times in GMT are also included.
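For readers who want to check the local timestamps against the GMT ones, the conversion can be sketched as below. This assumes CEST is a fixed offset of UTC+2 (true for the conference dates) and that timestamps follow the day/month/year format used in this description; the function name `cest_to_gmt` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CEST is treated as a fixed UTC+2 offset for July 2014.
CEST = timezone(timedelta(hours=2), name="CEST")

def cest_to_gmt(stamp, fmt="%d/%m/%Y %H:%M:%S"):
    """Parse a local Lausanne (CEST) timestamp and return it in GMT/UTC."""
    local = datetime.strptime(stamp, fmt).replace(tzinfo=CEST)
    return local.astimezone(timezone.utc)

# The first timestamp of the archive period, as stated in this description.
first = cest_to_gmt("07/07/2014 00:03:00")
print(first.strftime("%d/%m/%Y %H:%M:%S"))  # 06/07/2014 22:03:00
```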
Only users with at least 2 followers were included in the archive. Retweets have been included. Data might require deduplication.
Unfortunately, the metadata in the sheets for Monday to Thursday is incomplete (the lack of ISO language metadata in these sheets is particularly disappointing, as it would have provided interesting insights); Friday and Saturday do contain the standard metadata available from TAGS.
Some work was done to check that the chronology was complete; I have highlighted a gap in the Tweets on Monday 7 July 2014 between 07/07/2014 10:19 and 07/07/2014 11:20, and another on Thursday 10 July 2014 between 10/07/2014 01:51 and 10/07/2014 10:10.
I was not able to recover these Tweets. Yannick Rochat and Martin Grandjean’s archive has the complete set (available at http://goo.gl/6W3dol; last accessed Tuesday 15 July 2014 11:55 BST). Cf. Rochat, Yannick, “The DH 2014 Conference in Lausanne – A feedback”, 2014/07/14, http://yro.ch/?p=417, accessed 15 July 2014; and Grandjean, Martin, “[DataViz] The digital humanities network on Twitter (#DH2014)”, 2014/07/14, http://www.martingrandjean.ch/dataviz-digital-humanities-twitter-dh2014/, accessed 15 July 2014.
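Gaps like the ones described above can be located by sorting the timestamps and flagging any pair of consecutive Tweets separated by more than a chosen threshold. The sketch below is one possible approach, not the method used to prepare this file; the 30-minute threshold, the function names and all sample timestamps other than the two gap boundaries are illustrative assumptions.

```python
from datetime import datetime, timedelta

def parse(stamp):
    """Parse a day/month/year timestamp in the format used in this description."""
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M")

def find_gaps(timestamps, threshold=timedelta(minutes=30)):
    """Return (start, end) pairs of consecutive timestamps further apart than threshold."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]

# Illustrative sample: only the 10:19 and 11:20 boundaries come from the description.
sample = [parse(s) for s in [
    "07/07/2014 10:17",
    "07/07/2014 10:19",
    "07/07/2014 11:20",
    "07/07/2014 11:21",
]]

gaps = find_gaps(sample)
for start, end in gaps:
    print(start, "->", end)
```

The threshold is a judgment call: a quiet overnight period may be a genuine lull rather than a collection failure, so flagged intervals still need to be checked by hand.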
Please note that both research and experience show that the Twitter search API isn't 100% reliable. Large tweet volumes affect the search collection process. The API might "over-represent the more central users", not offering "an accurate picture of peripheral activity" (González-Bailón, Sandra, et al. 2012). The Tweet volume was higher than what the available collecting methods allowed so data is likely to be incomplete. It is not guaranteed this file contains each and every Tweet tagged with #dh2014 during the indicated period, and is shared for comparative and indicative educational and research purposes only.
Please note the data in this file is likely to require further refining and even deduplication. The data is shared as is. This dataset is shared to encourage open research into scholarly activity on Twitter. If you use or refer to this data in any way please cite and link back using the citation information above.
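A minimal deduplication sketch follows, assuming each row has been loaded as a dictionary and carries a unique Tweet identifier in a column named `id_str`. That column name is an assumption and should be checked against the headers in each sheet (the Monday to Thursday sheets have incomplete metadata). The sample rows are invented for illustration. Retweets have their own identifiers, so this removes only repeated rows and does not filter out retweets.

```python
def deduplicate(rows, key="id_str"):
    """Keep the first occurrence of each Tweet, identified by the given key."""
    seen = set()
    unique = []
    for row in rows:
        ident = row[key]
        if ident not in seen:
            seen.add(ident)
            unique.append(row)
    return unique

# Invented sample rows; real rows would come from the sheets in this file.
rows = [
    {"id_str": "1", "text": "First example tweet #dh2014"},
    {"id_str": "2", "text": "Second example tweet #dh2014"},
    {"id_str": "1", "text": "First example tweet #dh2014"},
]

unique_rows = deduplicate(rows)
print(len(rows), "->", len(unique_rows))  # 3 -> 2
```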