This is a CSV file containing Tweet IDs of 3,805 Tweets from user ID 25073877 posted publicly between Thursday February 25 2016 16:35:12 +0000 and Monday April 03 2017 12:51:01 +0000.
This file includes neither the Tweets' texts nor their URLs.
Columns in the file are:
id_str, from_user_id_str, created_at, time, source, user_followers_count, user_friends_count
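As a minimal sketch of how the file might be loaded, the following reads the listed columns with Python's standard csv module and keeps every field as a string, so that long Tweet IDs are not altered by numeric conversion. The function name read_tweet_rows and the sample row are hypothetical. The sample values, including the date formats, are placeholders and are not taken from the deposited file.

```python
import csv
import io

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "id_str", "from_user_id_str", "created_at", "time",
    "source", "user_followers_count", "user_friends_count",
]


def read_tweet_rows(handle):
    """Read rows from an open CSV handle, keeping every field as a string."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


# Placeholder data for illustration only; not a real row from the file.
sample = io.StringIO(
    "id_str,from_user_id_str,created_at,time,source,"
    "user_followers_count,user_friends_count\n"
    "1000000000000000001,25073877,Thu Feb 25 16:35:12 +0000 2016,"
    "25/02/2016 16:35:12,web,100,10\n"
)
rows = read_tweet_rows(sample)
print(rows[0]["id_str"])
```

To read the actual deposit, an opened file object (for example open(path, newline="", encoding="utf-8"), where path is wherever the CSV was saved) can be passed in place of the in-memory sample.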
Motivations to Share this Data
Archived Tweets can provide interesting insights for the study of contemporary history of media, politics, diplomacy, etc. The queried account is a public account widely agreed to be of exceptional national and international public interest. Though they provide public access to tweeted content in real time, Twitter Web and mobile clients are not suited for appropriate Tweet corpus analysis. For anyone researching social media, access to the data is absolutely essential in order to perform, review and reproduce studies.
Archiving Tweets of public interest due to their historic significance is a means to both preserve and enable reproducible study of this form of rapid online communication that otherwise can very likely become unretrievable as time passes. Due to Twitter's current business model and API limits, to date collecting in real time is the only relatively reliable method to archive Tweets at a small scale.
Methodology and Limitations
The Tweets contained in this file were collected by Ernesto Priego using a Python script. The data collection search query was from:realdonaldtrump. A trigger was scheduled to collect automatically every hour.
The originally harvested data was refined to delete duplications, to comply with Twitter's Terms and Conditions, and to sort the data in chronological order.
Duplication of data due to the automated collection is possible so further data refining might be required.
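The kind of further refining mentioned above could be sketched as follows. This is an illustration under stated assumptions, not the script used to prepare the file: the function name dedupe_and_sort is hypothetical, duplicates are assumed to share the same id_str, and created_at is assumed to follow the Twitter API date format (for example "Thu Feb 25 16:35:12 +0000 2016").

```python
from datetime import datetime

# Assumed format of the created_at column (Twitter API style).
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def dedupe_and_sort(rows):
    """Drop rows whose id_str was already seen, then sort by created_at."""
    seen = set()
    unique = []
    for row in rows:
        if row["id_str"] not in seen:
            seen.add(row["id_str"])
            unique.append(row)
    return sorted(
        unique,
        key=lambda row: datetime.strptime(row["created_at"], CREATED_AT_FORMAT),
    )


# Placeholder rows for illustration only; the IDs are not real Tweet IDs.
rows = [
    {"id_str": "3", "created_at": "Mon Apr 03 12:51:01 +0000 2017"},
    {"id_str": "1", "created_at": "Thu Feb 25 16:35:12 +0000 2016"},
    {"id_str": "3", "created_at": "Mon Apr 03 12:51:01 +0000 2017"},
]
cleaned = dedupe_and_sort(rows)
print([row["id_str"] for row in cleaned])
```

If created_at in the file uses a different format, CREATED_AT_FORMAT would need to be adjusted accordingly.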
The file may not contain data from Tweets deleted by the queried user account immediately after original publication.
Both research and experience show that the Twitter search API is not 100% reliable (Gonzalez-Bailon, Sandra, et al. 2012).
Apart from the filters and limitations already declared, it cannot be guaranteed that this file contains each and every Tweet posted by the queried account during the indicated period. This dataset is shared for archival, comparative and indicative educational research purposes only.
The content included is from a public Twitter account and was obtained from the Twitter Search API. The shared data is also publicly available to all Twitter users via the Twitter Search API and available to anyone with an Internet connection via the Twitter and Twitter Search web client and mobile apps without the need for a Twitter account.
The original Tweets, their contents and associated metadata were published openly on the Web from the queried public account and are the responsibility of the original authors. Original Tweets are likely to be copyright their individual authors but please check individually.
No private personal information is
shared in this dataset. As indicated above this dataset does not contain the text of the Tweets. The collection and sharing of this dataset is
enabled and allowed by Twitter's Privacy Policy. The sharing of this
dataset complies with Twitter's Developer Rules of the Road.
This dataset is shared to archive, document and encourage open educational research into political activity on Twitter.
Other Considerations
All Twitter users agree to Twitter's Privacy and data sharing policies. Social media research remains in its infancy and though work has been done to develop best practices there is as yet no agreement on a series of grey areas relating to research methodologies, including ad hoc social media specific research ethics guidelines for reproducible research.
Though
these datasets have limitations and are not thoroughly systematic, it
is hoped they can contribute to developing new insights into the
discipline's presence on Twitter over time. Reproducibility is
considered here a key value for robust and trustworthy research.
Different
scholarly professional associations like the Modern Language
Association recognise Tweets, datasets and other online and digital resources as citeable scholarly outputs.
The data contained in the deposited file is otherwise available elsewhere through different methods.