The Stanford Libraries (SL) offer a number of datasets for social science research. This document details whether and how they can be accessed from R. I am also adding packages to access social science data beyond Stanford. It is a work in progress. (Download as pdf.)
read.csv(url("http:..."))
to load CSV from their data portal.
tidyverse- and sf-ready data frames
ipumsr
to import census, survey and geographic data provided by IPUMS into R. Great vignettes, including ones on value labels, the Current Population Survey (CPS), geographic data, and use of NHGIS.
RSocrata
(GitHub)
WDI
(GitHub) - (Tutorial)
wbstats
rWBData
rWBclimate
for the World Bank climate data
US Bureau of Labor Statistics (BLS):
citecorp
: Client for the Open Citations Corpus
tradestatistics
: R package to use the Trade Statistics API
rdryad
is a package to interface with the Dryad data repository API.
rplos
: Interface to the Search API for PLoS (Public Library of Science) Journals
Medicare public files: medicare
Social Media for Network Analysis: SocialMediaLab
United States Treasury
Rtreasuryio
: a single, simple function for submitting SQL queries to treasury.io
enigma
: a client to interact with the Enigma API, including getting the data and metadata for datasets as well as collecting statistics on datasets. (Note that there is another site, Enigma Public, “the world’s broadest collection of public data”, which also provides API access. I am not sure how the two are related with regard to this package.)
pdfetch
: Economic and financial time series from public sources, including the St Louis Fed’s FRED system, Yahoo Finance, the US Bureau of Labor Statistics, the US Energy Information Administration, the World Bank, Eurostat, the European Central Bank, the Bank of England, the UK’s Office for National Statistics, Deutsche Bundesbank, and INSEE.
fredr
: An R client for the Federal Reserve Economic Data (FRED) API.
eechidna
: Data from the 2013 and 2016 Australian Federal Elections (House of Representatives) and the 2011 Australian Census
rtimes
: Interface to Congress, Campaign Finance, Article Search, and Geographic APIs from the New York Times and ProPublica. Covers only a subset of the APIs.
crminer
and rcrossref
: Text mining client for Crossref. Includes functions for getting links to the full text of articles, fetching full-text articles from those links or from Digital Object Identifiers (DOIs), and extracting text from PDFs. rcrossref is for metadata.
rdpla
: Client for the Digital Public Library of America (DPLA), using its REST API
internetarchive
: Search the Internet Archive, retrieve metadata, and download files.
patentsview
: An R Client for PatentsView with functions to simplify the PatentsView API query language and parse the data that comes back.
pleiades
: Interface to the Pleiades Archeological Database
USAboundaries
: Historical boundaries of the United States. Map the United States (or the colonies that became the United States) on any date from 1629 to 2000. Contains both county and state/territory level polygons.
rdatacite
: Client for the web service methods provided by DataCite
roadoi
: Find Free Versions of Scholarly Publications via Unpaywall
data.world
: High-level tools for working with data.world data sets.
jstor
: Import journal data from DfR (JSTOR)
fulltext
: A single interface to many sources of full-text ‘scholarly’ data, including ‘Biomed Central’, Public Library of Science, ‘Pubmed Central’, ‘eLife’, and more. (Manual)
essurvey
: Download data from the European Social Survey.
RefManageR
: Import and work with bibliographic references. Works with BibTeX and BibLaTeX references, interfaces with NCBI Entrez, CrossRef, and Zotero, extracts references from locally stored PDFs, and generates bibliographies for RMarkdown.
nomisr
: UK official statistics from the ‘Nomis’ database. Includes Census, Labour Force Survey, DWP benefit statistics and other economic and demographic data from the Office for National Statistics.
Quandl
: Access to financial, economic, and alternative datasets from Quandl. (Documentation)
quantmod
: Quantitative Financial Modelling & Trading Framework. (Documentation)
tidyquant
: A wrapper to various ‘xts’, ‘zoo’, ‘quantmod’, ‘TTR’ and ‘PerformanceAnalytics’ package functions that returns the objects in the tidy ‘tibble’ format.
rdhs
: Management and analysis of Demographic and Health Survey (DHS) data.
refnet
: Read, organize, geocode, analyze, and visualize reference data files in Clarivate Web of Knowledge/Web of Science format for scientometric, social network, and Science of Science analyses. Not on CRAN.
suppdata
: Downloading Supplementary Data from Published Manuscripts
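As a minimal sketch of the `read.csv(url(...))` approach mentioned near the top of this list: the helper name `load_csv_from_url` is hypothetical, and because the portal address is not given here, the demonstration reads a small temporary file through a `file://` URL as an offline stand-in. The toy rows are made up for illustration only.

```r
# Hypothetical helper: read a CSV straight from a URL into a data frame.
load_csv_from_url <- function(csv_url) {
  # read.csv() opens the connection, reads it, and closes it again.
  read.csv(url(csv_url), stringsAsFactors = FALSE)
}

# Offline stand-in for a portal download: write a tiny CSV to a temp file
# and address it with a file:// URL (a Unix-style absolute path is assumed).
tmp <- tempfile(fileext = ".csv")
writeLines(c("county,population", "Alpha,100", "Beta,250"), tmp)
demo_url <- paste0("file://", normalizePath(tmp, winslash = "/"))

counties <- load_csv_from_url(demo_url)
str(counties)
```

With a real portal, the same helper would presumably be called with the CSV download link in place of `demo_url`.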
geospatial data:
bikedata
: Data from public hire bicycle systems, including London, New York, Chicago, Washington DC, Boston, Los Angeles, and Philadelphia.
FedData
: Download geospatial data from federated data sources, including the National Elevation Dataset digital elevation models, the Global Historical Climatology Network, the National Land Cover Database, and more.
getlandsat
: Get Landsat 8 data from Amazon Public Data Sets
geonames
: Interface to the “Geonames” Spatial Query Web Service
MODIStsp
: Automates the creation of time series of rasters derived from MODIS Land Products data
stats19
: Open Road Traffic Casualty Data from Great Britain
rdataretriever
: Provides an R interface to the Data Retriever via the Data Retriever’s command line interface. The Data Retriever automates the tasks of finding, downloading, and cleaning public datasets, and then stores them in a local database.
data wrangling
naniar
: explore missing values
visdat
: visualise a dataframe (http://visdat.njtierney.com)
(A number of packages are from https://ropensci.org/. You may want to check there for new ones.)
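For the two data-wrangling packages above, here is a short sketch of a first look at missing values, using the `airquality` data set that ships with R. The base R count runs everywhere. The `naniar` and `visdat` calls are guarded, on the assumption that the packages may not be installed.

```r
# Base R: number of missing values in each column of airquality.
na_counts <- colSums(is.na(airquality))
print(na_counts)

# naniar: the same information as a tidy per-variable summary.
if (requireNamespace("naniar", quietly = TRUE)) {
  print(naniar::miss_var_summary(airquality))
}

# visdat: build a plot of which cells are missing (stored, not drawn).
if (requireNamespace("visdat", quietly = TRUE)) {
  missing_plot <- visdat::vis_miss(airquality)
}
```

Printing `missing_plot` in an interactive session draws the figure. `visdat::vis_dat()` is the companion function that also colours cells by column type.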