-
Notifications
You must be signed in to change notification settings - Fork 5
Expand file tree
/
Copy pathdata.Rmd
More file actions
93 lines (56 loc) · 2.93 KB
/
data.Rmd
File metadata and controls
93 lines (56 loc) · 2.93 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
---
title: "Data"
---
Throughout the course we will make use of he [`experimentdatar`](https://itamarcaspi.github.io/experimentdatar/) data package that contains publicly available datasets that were used in Susan Athey and Guido Imbens' course ["Machine Learning and Econometrics"](https://www.aeaweb.org/conference/cont-ed/2018-webcasts) (AEA continuing Education, 2018). The datasets are conveniently packed for R users.
### Installation
You can install the **development** version from
[GitHub](https://github.com/itamarcaspi/experimentdatar/)
```r
# install.packages("devtools")
devtools::install_github("itamarcaspi/experimentdatar")
```
### Usage
Load the __experimentdatar__ package
```{r load_package, message=FALSE, warning=FALSE}
library(experimentdatar)
```
Load the `social` dataset
```{r load_data}
data(social)
```
(You can find information about variable definitions using `?social`.)
Use `dataDetails()` function to open the paper associated with the `social` dataset
```{r details, eval=FALSE}
dataDetails("social")
```
Load the __dplyr__ and display the response (`outcome_vote`) and treatment variable (`treat_neighbors`)
```{r tibble, message=FALSE, warning=FALSE}
library(dplyr) # for data manipulations
social %>%
select(outcome_voted, treat_neighbors) %>%
head()
```
Use __ggplot2__ to plot the distribution of the median income of survey participants
```{r plot, message=FALSE, warning=FALSE}
library(ggplot2) # for generating plots
social %>%
ggplot(aes(median_age)) +
geom_histogram(binwidth = 1)
```
### List of available datasets
* `charitable`: Data used for the paper "Does Price matter in charitable giving? Evidence from a large-Scale Natural Field experiment"
by Karlan and List (2007).
* `IVdataset`: Data used for the paper "Does compulsory school attendance affect schooling and earnings?"
by Angrist and Krueger (1991) and related papers.
* `mobilization`: Data for the paper "Comparing Experimental and Matching Methods Using a Large-Scale Voter Mobilization Experiment"
by Arceneaux, Gerber, and Green (2006).
* `vouchers`: Data for the paper "Vouchers for Private Schooling in Colombia: Evidence from a Randomized Natural Experiment"
by Angrist, Bettinger, Bloom, King, and Kremer (2002).
* `secrecy`: Data for the paper "Ballot Secrecy Concerns and Voter Mobilization: New Experimental Evidence about Message Source, Context, and the Duration of Mobilization Effects"
by Gerber, Hubers, Biggers, and Hendry (2014).
* `social`: Data for the paper "Social Pressure and Voter Turnout: Evidence from a Large-Scale Field Experiment"
by Gerber, Green, and Larimer (2008).
* `welfare`: Data for the paper "Modeling heterogeneous treatment effects in survey experiments with Bayesian Additive Regression Trees"
by Green and Kern (2012).
### Resources
* The [ExperimentData repository](https://github.com/gsbDBI/ExperimentData) where you can find the original as well as new datasets and more documentations.