City, University of London
from 25073877 Thu Feb 25 16_35_12 _0000 2016 Mon Apr 03 12_51_01 _0000 2017.csv (376.5 kB)

3805 Tweet IDs from User 25073877 [Thu Feb 25 16:35:12 +0000 2016 to Mon Apr 03 12:51:01 +0000 2017]

dataset
posted on 2017-04-03, 14:34 authored by Ernesto Priego
This is a CSV file containing the Tweet IDs of 3,805 Tweets from user ID 25073877, posted publicly between Thursday February 25 2016 16:35:12 +0000 and Monday April 03 2017 12:51:01 +0000.

This file includes neither the text of the Tweets nor URLs.

Columns in the file are:

id_str
from_user_id_str
created_at   
time  
source   
user_followers_count  
user_friends_count   
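
As a rough illustration of how a file with the columns listed above might be read, the following minimal sketch uses Python's standard csv module. The function name read_tweet_rows and the sample values are hypothetical and are not taken from the deposited file; the sketch also assumes the file has a header row carrying exactly these column names. Values are kept as strings so that long Tweet IDs are not altered by numeric conversion.

```python
import csv
import io

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "id_str",
    "from_user_id_str",
    "created_at",
    "time",
    "source",
    "user_followers_count",
    "user_friends_count",
]


def read_tweet_rows(handle):
    """Read rows from an open CSV handle, keeping every value as a string.

    Hypothetical helper: raises ValueError if an expected column is missing.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    return list(reader)


# Invented placeholder data for illustration only (not real Tweet IDs or counts).
SAMPLE_CSV = (
    "id_str,from_user_id_str,created_at,time,source,"
    "user_followers_count,user_friends_count\n"
    "1000000000000000001,25073877,Thu Feb 25 16:35:12 +0000 2016,"
    "placeholder,example-client,100,10\n"
    "1000000000000000002,25073877,Mon Apr 03 12:51:01 +0000 2017,"
    "placeholder,example-client,200,10\n"
)

rows = read_tweet_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), rows[0]["id_str"])
```

To read the deposited file itself, the same function could be given a handle from open(path, newline="", encoding="utf-8"); the encoding is an assumption and may need adjusting.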
 
Motivations to Share this Data

Archived Tweets can provide interesting insights for the study of the contemporary history of media, politics, diplomacy, etc. The queried account is a public account widely agreed to be of exceptional national and international public interest. Though they provide public access to tweeted content in real time, Twitter's Web and mobile clients are not suited to Tweet corpus analysis. For anyone researching social media, access to the data is essential in order to perform, review and reproduce studies.

Archiving Tweets of public interest due to their historic significance is a means both to preserve and to enable reproducible study of this form of rapid online communication, which otherwise is very likely to become unretrievable as time passes. Due to Twitter's current business model and API limits, collecting in real time is to date the only relatively reliable method of archiving Tweets at a small scale.

Methodology and Limitations

The Tweets contained in this file were collected by Ernesto Priego using a Python script. The data collection search query was from:realdonaldtrump. A trigger was scheduled to collect automatically every hour.
 
The original data harvest was refined to delete duplicates, to comply with Twitter's Terms and Conditions, and to sort the data in chronological order.

Duplication of data due to the automated collection is still possible, so further data refining might be required.
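
For readers who wish to refine the data further, the following is a minimal sketch of one way to remove duplicate rows and sort the remainder chronologically. It is not the script used to prepare this file. The function name dedupe_and_sort and the sample rows are hypothetical, and the sketch assumes that created_at values follow the same timestamp format that appears in the dataset title (for example "Thu Feb 25 16:35:12 +0000 2016").

```python
from datetime import datetime

# Assumed timestamp layout, matching the dates shown in the dataset title.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def dedupe_and_sort(rows):
    """Keep the first row seen for each id_str, then sort by created_at.

    Hypothetical helper: rows is a list of dicts with at least the keys
    "id_str" and "created_at".
    """
    seen = set()
    unique = []
    for row in rows:
        if row["id_str"] in seen:
            continue  # duplicate from an overlapping hourly collection
        seen.add(row["id_str"])
        unique.append(row)
    # Parse the timestamp so ordering is by time, not by string comparison.
    unique.sort(
        key=lambda r: datetime.strptime(r["created_at"], TWITTER_TIME_FORMAT)
    )
    return unique


# Invented placeholder rows for illustration only (not real Tweet IDs).
SAMPLE_ROWS = [
    {"id_str": "1000000000000000002", "created_at": "Mon Apr 03 12:51:01 +0000 2017"},
    {"id_str": "1000000000000000001", "created_at": "Thu Feb 25 16:35:12 +0000 2016"},
    {"id_str": "1000000000000000002", "created_at": "Mon Apr 03 12:51:01 +0000 2017"},
]

cleaned = dedupe_and_sort(SAMPLE_ROWS)
print([row["id_str"] for row in cleaned])
```

Deduplicating on id_str rather than on whole rows is a design choice: follower and friend counts may differ between two collections of the same Tweet, so whole-row comparison could miss such duplicates.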

The file may not contain data from Tweets deleted by the queried user account immediately after original publication.   

Both research and experience show that the Twitter search API is not 100% reliable (Gonzalez-Bailon, Sandra, et al. 2012).

Apart from the filters and limitations already declared, it cannot be guaranteed that this file contains each and every Tweet posted by the queried account during the indicated period. This dataset is shared for archival, comparative and indicative educational research purposes only.

The content included is from a public Twitter account and was obtained from the Twitter Search API. The shared data is also publicly available to all Twitter users via the Twitter Search API and available to anyone with an Internet connection via the Twitter and Twitter Search web client and mobile apps without the need of a Twitter account.

The original Tweets, their contents and associated metadata were published openly on the Web from the queried public account and are the responsibility of the original authors. Original Tweets are likely to be the copyright of their individual authors, but please check individually.

No private personal information is shared in this dataset. As indicated above this dataset does not contain the text of the Tweets. The collection and sharing of this dataset is enabled and allowed by Twitter's Privacy Policy. The sharing of this dataset complies with Twitter's Developer Rules of the Road.

This dataset is shared to archive, document and encourage open educational research into political activity on Twitter.

Other Considerations

All Twitter users agree to Twitter's Privacy and data sharing policies. Social media research remains in its infancy, and though work has been done to develop best practices, there is as yet no agreement on a series of grey areas relating to research methodologies, including ad hoc social-media-specific research ethics guidelines for reproducible research.

Though these datasets have limitations and are not thoroughly systematic, it is hoped they can contribute to developing new insights into the discipline's presence on Twitter over time. Reproducibility is considered here a key value for robust and trustworthy research. 

Scholarly and professional associations such as the Modern Language Association recognise Tweets, datasets and other online and digital resources as citable scholarly outputs.

The data contained in the deposited file is also available elsewhere through different methods.
