Simple Collection and Powerful Analysis of Twitter Data
{tidytags} coordinates the simplicity of collecting tweets over time with a Twitter Archiving Google Sheet (TAGS) and the utility of the {rtweet} package for processing and preparing additional Twitter metadata. {tidytags} also introduces functions developed to facilitate systematic yet flexible analyses of data from Twitter.
You can install the development version of {tidytags} from GitHub:
install.packages("devtools")
devtools::install_github("bretsw/tidytags")
Soon, you will be able to install the released version of {tidytags} from CRAN with:
install.packages("tidytags")
For help with initial {tidytags} setup, see the Getting started with tidytags vignette. That guide offers help for four pain points, several of which are referenced below.
At its most basic level, {tidytags} allows you to import data from a Twitter Archiving Google Sheet (TAGS) into R. This is done with the {googlesheets4} package. One requirement for using the {googlesheets4} package is that your TAGS tracker has been “published to the web.” See the Getting started with tidytags vignette, Pain Point #1, if you need help with this.
Once a TAGS tracker has been published to the web, you can import the TAGS archive into R using read_tags(). See the Getting started with tidytags vignette, Pain Point #2, to set up API access to Google Sheets like the TAGS tracker.
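The import step might look like the following minimal sketch. It assumes {tidytags} is installed, that Google Sheets access is already configured, and that read_tags() takes the sheet identifier as its first argument. The sheet ID is a hypothetical placeholder, not a real tracker, so the import is skipped until you replace it.

```r
# Hypothetical placeholder: swap in the ID of your own published TAGS tracker
# (the long string between "/d/" and "/edit" in the Google Sheets URL).
tags_id <- "YOUR_TAGS_SHEET_ID"

# Only attempt the import once a real ID is supplied and {tidytags} is available.
can_import <- tags_id != "YOUR_TAGS_SHEET_ID" &&
  requireNamespace("tidytags", quietly = TRUE)

if (can_import) {
  tags_archive <- tidytags::read_tags(tags_id)
  # Inspect the size of the imported archive (one row per archived tweet).
  print(dim(tags_archive))
}
```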
With a TAGS archive imported into R, {tidytags} allows you to gather quite a bit more information related to the collected tweets with the pull_tweet_data() function. This function uses the {rtweet} package (via rtweet::lookup_statuses()) to query the Twitter API. This process requires Twitter API keys associated with an approved Twitter developer account. See the Getting started with tidytags vignette, Pain Point #3, if you need help with this.
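As a rough sketch of this step, the example below pulls status IDs out of tweet URLs and passes them to pull_tweet_data(). Several pieces are assumptions for illustration: the stand-in data frame and its made-up URLs, the status_url column name, the have_twitter_keys flag, and the id_vector argument name, which may differ in your installed version. The API call is skipped until the flag is set to TRUE.

```r
# Stand-in for an imported TAGS archive. These URLs are made up for
# illustration; in practice, use the data frame returned by read_tags().
tags_archive <- data.frame(
  status_url = c(
    "http://twitter.com/example_user/statuses/1234567890123456789",
    "http://twitter.com/another_user/statuses/1234567890123456790"
  ),
  stringsAsFactors = FALSE
)

# Status lookups are keyed on tweet IDs: the trailing digits of each tweet URL.
# Keep them as character strings, since they are too long for R to store
# exactly as numbers.
status_ids <- sub("^.*/([0-9]+)$", "\\1", tags_archive$status_url)

# Hypothetical flag: set to TRUE once your Twitter API keys are configured
# for {rtweet}.
have_twitter_keys <- FALSE

if (have_twitter_keys && requireNamespace("tidytags", quietly = TRUE)) {
  # Assumed argument name; check ?pull_tweet_data for your installed version.
  tweet_data <- tidytags::pull_tweet_data(id_vector = status_ids)
  print(dim(tweet_data))
}
```

Because the example IDs are invented, replace the stand-in data frame with your own archive before enabling the lookup.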
For a walkthrough of numerous additional {tidytags} functions, see the Using tidytags with a conference hashtag vignette.
{tidytags} is still a work in progress, so we fully expect that there are still some bugs to work out and functions to document better. If you find an issue, have a question, or think of something that you really wish {tidytags} would do for you, don’t hesitate to email Bret or reach out on Twitter: @bretsw and @jrosenberg6432.
You can also submit an issue on GitHub.
You may also wish to try some general troubleshooting strategies.
{tidytags} should be used in strict accordance with Twitter’s developer terms.
Although most Institutional Review Boards (IRBs) do not necessarily consider work with the Twitter data that {tidytags} analyzes to be human subjects research, there remain ethical considerations pertaining to the use of the {tidytags} package that should be discussed.
Even if {tidytags} use is not for research purposes (or if an IRB determines that a study is not human subjects research), “the release of personally identifiable or sensitive data is potentially harmful,” as noted in the rOpenSci Packages guide. Therefore, although you can collect Twitter data (and you can use {tidytags} to analyze it), we urge care and thoughtfulness regarding how you analyze the data and communicate the results. In short, please remember that most (if not all) of the data you collect may be about people, and those people may not like the idea of their data being analyzed or included in research.
We recommend the Association of Internet Researchers’ (AoIR) resources related to conducting analyses in ethical ways when working with data about people. AoIR’s ethical guidelines may be especially helpful for navigating tensions related to collecting, analyzing, and sharing social media data.
If you encounter an obvious bug for which there is not already an active issue, please create a new issue on GitHub with all code used (preferably a reproducible example).
If you would like to become a more involved contributor, please read the Contributing Guide. All contributors, from those fixing typos to adding new functionality, must adhere to the Code of Conduct.
The {tidytags} package is licensed under a GNU General Public License v3.0, or GPL-3. For background on why we chose this license, read this chapter on R package licensing.