---
title: Semester Project Computational Movement Analysis
subtitle: Instructions on working with Quarto and GitHub
format:
  html:
    toc: true
references:
  - type: online
    id: press
    author:
      - family: Press
        given: Gil
    title: Cleaning Big Data Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says
    url: https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/
    issued:
      date-parts:
        - - 2016
  - type: online
    id: cma
    author:
      - family: Laube
        given: Patrick
      - family: Ratnaweera
        given: Nils
    title: "Computational Movement Analysis: Patterns and Trends in Environmental Data"
    url: https://computationalmovementanalysis.github.io
    issued:
      date-parts:
        - - 2026
---
These instructions support your semester project in *Computational Movement Analysis* (*Patterns and Trends in Environmental Data*).
## Prerequisites
To follow these instructions, you need:
- RStudio (or a similar IDE)
- Quarto [see *Week 3 - Preparation* in @cma]
- Git [see *Week 3 - Exercise A: Git* in @cma]
- A GitHub account [see *Week 3 - Exercise B: GitHub* in @cma]
:::{.callout-note title="The role of Git and GitHub in your semester project"}
Git and GitHub are required [only for submitting your research proposal]{.underline}. When working on your [semester project]{.underline}, using Git / GitHub is optional.
:::
## Setting up GitHub
Instructions for setting up your repository on GitHub:
1. One team member copies the [Template Repo](https://github.com/ComputationalMovementAnalysis/Project-template) to their GitHub account by clicking on *Use this template* → *Create a new repository* (see @fig-use-template).
2. Give the repo a reasonable name and, optionally, a description. Make sure the visibility is set to *Public*, not *Private*. Confirm by clicking on *Create repository*.
3. Add your team member as a collaborator on GitHub by clicking on *Settings* → *Collaborators* → *Add People*. Enter the username of your team member and confirm by clicking on the button. Your team member should then accept the invitation, which is sent by email.
Now your repository is correctly set up on GitHub.
{#fig-use-template width="80%"}
## Set up your project locally
Now, **both** team members can set up RStudio to work with the project locally. In RStudio, click on *File* → *New Project* → *Version Control* → *Git*. In the field *Repository URL*, copy and paste the URL of your GitHub repo. Choose a reasonable parent directory by clicking on *Browse* and then click on *Create project*.
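If you prefer the command line, the RStudio dialog above corresponds roughly to a `git clone`. The following is a minimal sketch, not part of the official instructions: `clone_project` is a hypothetical helper name, and the URL and parent directory in the example call are placeholders for your own values.

```bash
# Clone a repository into a parent directory of your choice.
# $1: repository URL (copied from GitHub)
# $2: parent directory (created if it does not exist yet)
clone_project() {
  repo_url="$1"
  parent_dir="$2"
  mkdir -p "$parent_dir"
  # git clone creates a subfolder named after the repository
  (cd "$parent_dir" && git clone "$repo_url")
}

# Example call with placeholder values (replace them with your own):
# clone_project "https://github.com/<username>/<reponame>.git" "$HOME/projects"
```

Afterwards, open the cloned folder as a project in RStudio (*File* → *New Project* → *Existing Directory*).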
## Work on the research *proposal* in RStudio
The file *Readme.md* is the file you will use to submit your research proposal. Open this file in RStudio and fill out this markdown document by replacing the placeholders with the information about your project. Once you are done, render the markdown file to pdf by running the following command:
```{.bash}
quarto render Readme.md --to pdf
```
Now, *add*, *commit* and push your changes to GitHub. You can use the RStudio *Git Panel* for this or the following bash commands:
```{.bash}
git add Readme.md
git add Readme.pdf
git commit -m "add research proposal"
git push
```
Now, the other team member has to pull the changes that were pushed in the last step. Use the pull button for this or the following git command:
```{.bash}
git pull
```
## Work on your research *project* in RStudio
You will use the Quarto-file *index.qmd* to write your project report. You can develop your R-Code in the first code chunk of this file (the chunk is labelled `preprocessing`).
Since you will spend most of your time on getting your data into the right form [see @press], the bulk of your R-code will go here. Please add inline comments to this code, and explain what your steps are *on a meta level* in prose in your document. If you have computationally heavy preprocessing to do, read the section on [Heavy Computation](#sec-heavy-computation).
To write prose (English or German text), use the corresponding subsections in *index.qmd*. To add tables, figures or maps to your report, add code chunks in the appropriate section (see [Maps, Plots, Tables](#sec-maps-plots-tables)).
:::{.callout-note}
Until now, you have used Git / GitHub to work on your research *proposal*. To work on your research *project*, using Git / GitHub is optional. You will need to decide whether you will use these tools for version control and collaboration with your teammate.
If you decide against using Git / GitHub for collaboration, we recommend you work with a cloud folder with local sync (e.g. SWITCHdrive, Dropbox, or OneDrive). Make sure both team members sync the folder locally so RStudio can access the files directly, and coordinate with your team mate so only one person edits the files at a time to avoid conflicting versions.
:::
### Big and/or sensitive Datasets and GitHub
[This section only applies if you are using Git and GitHub to manage your project files. If you are not, you can skip it.]{.underline}
Some of the files in the repository should not be tracked via Git. This can be due to a big file size or due to the sensitive nature of the data (e.g. your movement data). We recommend that you create a folder called `data` and move all your datasets into this folder. To exclude this folder and its files from being tracked by Git, add it to the `.gitignore` file. You should already have such a file in your project's root folder. If this is not the case, you can create one by clicking on File → New File → Text File and then saving this file in the root directory of your project with the name `.gitignore` (note the period!).
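Here are some examples of how you can exclude folders and files in `.gitignore`. Note that comments must be on their own line, since Git treats everything on a pattern line as part of the pattern:
```{.txt}
# ignores the folder data in the project root
/data
# ignores all files that end with .csv
*.csv
# ignores the specific file called garmin-export.csv
garmin-export.csv
```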
For more information on `.gitignore` have a look [here](https://git-scm.com/docs/gitignore#_pattern_format).
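To check that a pattern actually works, you can ask Git directly with `git check-ignore`. The following is a minimal sketch that runs in a throwaway repository, so your real project is not touched. The folder and file names are only examples.

```bash
# Work in a temporary demo repository
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet

# Create a data folder with an (empty) example dataset and ignore the folder
mkdir data
touch data/garmin-export.csv
echo "/data" >> .gitignore

# check-ignore prints the path if (and only if) it is ignored
git check-ignore data/garmin-export.csv

# git status now lists .gitignore as untracked, but not the data folder
git status --short
```

In your own project, you only need the `git check-ignore <path>` line. If it prints nothing, the path is not ignored.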
If you have already committed a file to Git that you did not want to commit, you can go through [these instructions](https://sentry.io/answers/delete-a-file-from-a-git-repository/) to completely remove it from Git and GitHub again. If you need help with this, you can get in contact with us.
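As a first step, you can stop tracking such a file while keeping it on your disk. The sketch below shows this with `git rm --cached` in a throwaway repository, where the file name and the demo identity are assumptions for illustration. Note that this does *not* erase the file from earlier commits. For a complete removal from the history, follow the linked instructions.

```bash
# Work in a temporary demo repository
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
# Identity used only inside this demo repository
git config user.name "Demo User"
git config user.email "demo@example.com"

# Simulate the mistake: a dataset gets committed
echo "lat,lon" > garmin-export.csv
git add garmin-export.csv
git commit --quiet -m "accidentally add data"

# Stop tracking the file but keep the local copy, then ignore it
git rm --cached --quiet garmin-export.csv
echo "garmin-export.csv" >> .gitignore
git add .gitignore
git commit --quiet -m "stop tracking garmin-export.csv"

# Only .gitignore is tracked now; the csv file still exists locally
git ls-files
```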
### YAML Header
Note that index.qmd has default [chunk options](https://quarto.org/docs/reference/cells/cells-knitr.html#cell-output) set in the [YAML header](https://quarto.org/docs/get-started/hello/rstudio.html#yaml-header) (first few lines in between the two `---`). Please don't change these specific options (except for the `lang` option). If for some reason you want to change this behavior for a specific chunk, you can override these options by setting the chunk options within the chunk (more information [here](https://quarto.org/docs/get-started/hello/rstudio.html#code-chunks)).
```{.yaml}
---
format:
  html:
    code-fold: true  # makes the code in the output collapsible
execute:
  warning: false     # hides warnings from the generated output
  message: false     # hides messages from the generated output
lang: en             # sets the document language to English. Switch to "de" if you write in German
---
```
### Heavy computation {#sec-heavy-computation}
Including *all* your code in index.qmd and rendering it each time you want to preview your report makes your report less error-prone and more reproducible, but this workflow can be cumbersome when the code takes a long time to execute, because it prevents you from iterating quickly while writing up your report. We suggest the following method to solve this:
Outsource your preprocessing steps (and especially the heavy computation) into a separate R-Script called `preprocessing.R`. Iteratively develop this script to import, clean and prepare your data. Then save your result to a file (e.g. as a geopackage, rda or csv file).
```{.r filename="preprocessing.R"}
library(sf)
library(dplyr) # provides filter()
# import the raw tracking data
my_tracks <- read_sf("my_tracking_data.gpx")
# imagine some complex preprocessing logic here:
my_tracks <- filter(my_tracks, segment_id == "one")
# in the end, export your file under a new name
# (append = FALSE overwrites the layer if the file already exists):
st_write(my_tracks, "my_tracks.gpkg", append = FALSE)
```
Now, you can import the results from `preprocessing.R` in `index.qmd`:
```{.r filename="index.qmd"}
library(sf)
# import the cleaned file from preprocessing.R
my_tracks <- st_read("my_tracks.gpkg")
```
:::{.callout-important}
To "prove" that this script runs on your machine from top to bottom, in a new session and without any errors, use the function `quarto::quarto_render("preprocessing.R")` from the R package `quarto`. Do this once you are sure the whole script runs smoothly without any errors.
Publish the resulting file (`preprocessing.html`) and provide a link in your report (**this is a hard requirement**).
:::
### Maps, Plots, Tables {#sec-maps-plots-tables}
Interactive maps are only reasonable for small amounts of data (e.g. a few locations of a single wild boar). If you want to show a large amount of data, consider making a static map instead (using `tmap_mode("plot")`).
If you want to visualize a dataframe as a table, you can use the function `knitr::kable(df)` (replace `df` with the name of your `data.frame`). For more complex tables, consider [`kableExtra`](https://haozhu233.github.io/kableExtra/) or [`gt`](https://gt.rstudio.com/).
To add a caption to a figure, use the `#| fig-cap:` option, as in the example below. Similarly, a caption for a table is added via `#| tbl-cap`. More information for [figures](https://quarto.org/docs/reference/cells/cells-knitr.html#figures) and [tables](https://quarto.org/docs/reference/cells/cells-knitr.html#tables).
````{.markdown}
```{{r}}
#| fig-cap: "A visualisation of the wild boar data"
ggplot(wildboar) +
  geom_sf()
```
````
To reference a figure in your text, first add a label to the specific chunk using `label`. You can then reference that figure using the label you specified.
````{.markdown}
The wildboar move about, see @fig-wildboar.
```{{r}}
#| fig-cap: "A visualisation of the wild boar data"
#| label: fig-wildboar
ggplot(wildboar) +
  geom_sf()
```
````
Note that for figures, the label must begin with `fig-`. For more information, see [here](https://quarto.org/docs/authoring/cross-references.html#computations).
Similarly, to reference a table in your text, use a label starting with `tbl-` (e.g. `tbl-something`) and reference it with `@tbl-something`, as shown below. For more information, read [this](https://quarto.org/docs/authoring/cross-references.html#computations-1).
````{.markdown}
@tbl-wildboar-summary shows a summary of the wildboar data.
```{{r}}
#| tbl-cap: "A summary of the wildboar data"
#| label: tbl-wildboar-summary
knitr::kable(wildboar_df)
```
````
### Counting Words
To count the number of words in your report, install the R package [`wordcountaddin`](https://github.com/benmarwick/wordcountaddin#how-to-install).
```{.r}
# install.packages("pacman")
library("pacman")
p_install_gh("benmarwick/wordcountaddin")
```
Then, add the following code chunk to your report:
```{.r}
wordcountaddin::word_count("index.qmd")
```
### Publishing with GitHub
[This section only applies if you are using Git and GitHub to manage your project files. If you are not, you can skip it.]{.underline}
To publish your report follow these steps:
1. Render your file to HTML. If your file is called index.qmd, this will create a file called index.html.
2. Push this file (and the corresponding folder `index_files/`) to GitHub. Make sure this was successful by looking for these files on GitHub. Alternatively, you can use the methods described [here](https://quarto.org/docs/publishing/github-pages.html).
3. Activate GitHub Pages on your repo: *Settings* → *Pages* → Choose your branch (usually *main*) → Select Folder "Root".
4. Once your page is ready, the URL to your website is: username.github.io/reponame/
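If you are unsure what your website URL will be, you can derive it from the URL of your GitHub repo. The helper below is a hypothetical sketch (`pages_url` is not part of Git or Quarto) that assumes a standard `github.com` repository URL in HTTPS or SSH form. The user and repo names in the example call are made up.

```bash
# Derive the GitHub Pages URL from a GitHub repository URL.
# $1: repository URL, e.g. the output of `git remote get-url origin`
pages_url() {
  # Strip the host prefix and an optional .git suffix -> "username/reponame"
  slug="$(printf '%s\n' "$1" | sed -E 's#^(https://github\.com/|git@github\.com:)##; s#\.git$##')"
  # Part before the slash is the username, part after it is the repo name
  printf 'https://%s.github.io/%s/\n' "${slug%%/*}" "${slug#*/}"
}

# Example with made-up names:
pages_url "https://github.com/alice/cma-project.git"
```

Inside your project folder, you can call it as `pages_url "$(git remote get-url origin)"`.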
### Citations
To cite a paper:
1. Export the BibTeX entry of the paper you want to cite, e.g. using Google Scholar (see @fig-scholar).
2. Copy the BibTeX entry to `bibliography.bib` (see @lst-bibtex). You can change the citation key if you want (e.g. `laube2011`). You can add as many BibTeX entries to this file as you like.
3. Since `bibliography.bib` is referenced in the YAML header of `index.qmd` under the entry `bibliography:`, you can now cite this paper anywhere in your text using `@laube2011`. For more citation methods, see <https://quarto.org/docs/authoring/citations.html>.
:::{#fig-scholar layout-nrow=1}
{#fig-scholar-1}
{#fig-scholar-2}
Export a BibTeX entry from Google Scholar
:::
:::{#lst-bibtex}
```{.bibtex filename="bibliography.bib" }
@article{laube2011,
  title={How fast is a cow? Cross-scale analysis of movement data},
  author={Laube, Patrick and Purves, Ross S},
  journal={Transactions in GIS},
  volume={15},
  number={3},
  pages={401--418},
  year={2011},
  publisher={Wiley Online Library}
}
```
:::
### Templates
If you are up for some advanced aspects of writing reports with Quarto, check out either the [Journal Article Template](https://quarto.org/docs/journals/formats.html) or the [ZHAW Thesis Template](https://zhaw-lsfm.github.io/quarto-thesis-docs/).
## References
:::{#refs}
:::