7 changes: 4 additions & 3 deletions R/misc.R
@@ -29,7 +29,7 @@ is.int_vector <- function(x) {
x == as.integer(x)
}

# load system file ---------------------------------------------------------------
# load system file ------------------------------------------------------------
#' @title Load system file
#'
#' @description Load system file
@@ -40,10 +40,11 @@ is.int_vector <- function(x) {
#'
#' @export
#' @examples
#' # TODO
#' dat <- mipmapper_file("dummy_data.csv")

mipmapper_file <- function(name) {
name_full <- system.file("extdata/", name, package='mipmapper', mustWork = TRUE)
name_full <- system.file("extdata/", name, package='mipmapper',
mustWork = TRUE)
ret <- fast_read(name_full)

return(ret)
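For readers unfamiliar with `mustWork`, here is a minimal base-R sketch of the lookup behaviour that `mipmapper_file()` appears to rely on. It uses the `stats` package and a made-up file name purely for illustration; nothing here comes from mipmapper itself.

```r
# By default system.file() returns "" when the file cannot be found...
missing_path <- system.file("extdata", "no_such_file.csv", package = "stats")

# ...whereas mustWork = TRUE turns a missing file into an error.
# "no_such_file.csv" is a hypothetical name chosen so that the lookup fails.
result <- tryCatch(
  system.file("extdata", "no_such_file.csv", package = "stats", mustWork = TRUE),
  error = function(e) "error raised"
)

# A file that does exist resolves to its full installed path.
existing <- system.file("DESCRIPTION", package = "stats", mustWork = TRUE)
```

Failing loudly at lookup time means a mistyped file name is reported before `fast_read()` is ever called.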
1 change: 0 additions & 1 deletion README.Rmd
@@ -1,6 +1,5 @@
---
output: github_document
always_allow_html: yes
---
```{r, echo = FALSE}
knitr::opts_chunk$set(
50 changes: 20 additions & 30 deletions README.md
@@ -1,60 +1,56 @@

mipmapper
=========

[![Travis build status](https://travis-ci.org/mrc-ide/mipmapper.svg?branch=master)](https://travis-ci.org/mrc-ide/mipmapper) [![AppVeyor build status](https://ci.appveyor.com/api/projects/status/github/mrc-ide/mipmapper?branch=master&svg=true)](https://ci.appveyor.com/project/mrc-ide/mipmapper) [![Coverage status](https://codecov.io/gh/mrc-ide/mipmapper/branch/master/graph/badge.svg)](https://codecov.io/github/mrc-ide/mipmapper?branch=master) [![Documentation](https://github.com/OJWatson/rdhs/raw/master/tools/pkgdownshield.png)](https://mrc-ide.github.io/mipmapper/)
# mipmapper
[![Travis build status](https://travis-ci.org/mrc-ide/mipmapper.svg?branch=master)](https://travis-ci.org/mrc-ide/mipmapper)
[![AppVeyor build status](https://ci.appveyor.com/api/projects/status/github/mrc-ide/mipmapper?branch=master&svg=true)](https://ci.appveyor.com/project/mrc-ide/mipmapper)
[![Coverage status](https://codecov.io/gh/mrc-ide/mipmapper/branch/master/graph/badge.svg)](https://codecov.io/github/mrc-ide/mipmapper?branch=master)
[![Documentation](https://github.com/OJWatson/rdhs/raw/master/tools/pkgdownshield.png)](https://mrc-ide.github.io/mipmapper/)

The R package *mipmapper* contains a series of functions for analysing and visualising Molecular Inversion Probe (MIP) data. **This package is in early stages of development**, but will eventually include a range of methods for carrying out population genetic analyses. Full documentation can be found [here](https://mrc-ide.github.io/mipmapper/).

### Installation

In R, ensure that you have the devtools package installed by running

``` r
```r
install.packages("devtools", repos='http://cran.us.r-project.org')
```
Then we can simply install the *mipmapper* package directly from GitHub by running

Then we can simply install the *mipmapper* package directly from GitHub by running

``` r
```r
devtools::install_github("mrc-ide/mipmapper")
```

And we can load the package by running

``` r
```r
library(mipmapper)
```

### Data loading and filtering

Load raw data from .csv file. You will need to change the file path to where you have stored the data. An example of this is shown below, but commented out.

``` r
```r
# if loading your own data, uncomment this line and change path to your data
# dat0 <- fast_read("path_to_your_data/NeutralSNPs_AheroYombo.csv")

# here we will use in-built example data
dat0 <- mipmapper_file("dummy_data.csv")
```

Some miscellaneous filtering. Subset to SNPs only (i.e. no more complex mutations), group all alternative alleles together as a single "non-reference" allele, and drop irregular loci (for example non-integer barcode counts).

``` r
```r
dat1 <- filter_misc(dat0, SNP_only = TRUE, group_Alt = TRUE, drop_irregular = TRUE)
```

Next we want to filter based on coverage, throwing away any loci that are below a minimum coverage level. We can visualise how much data will be left at different thresholds using the following function:

``` r
```r
plot_coverage(dat1)
```

![](tools/README-plot_coverage-1.png)
![plot of chunk plot_coverage](README-plot_coverage-1.png)

Choose a threshold that strikes a balance between data quantity and quality. Once you have chosen a threshold, apply the filtering as follows:

``` r
```r
my_threshold <- 6
dat3 <- filter_coverage(dat1, min_coverage = my_threshold)
```
@@ -63,42 +59,36 @@ dat3 <- filter_coverage(dat1, min_coverage = my_threshold)

Before carrying out PCA we will convert our filtered dataset into a wide format, where each row is a unique sample, with new columns for each locus. This can be achieved as follows:

``` r
```r
dat4 <- melt_mip_data(dat3)
```

This can then be used to impute any missing values:

``` r
```r
dat5 <- impute_mip_data(dat4)
```

The imputed data set can then be analysed using principal component analysis:

``` r
```r
pca <- pca_mip_data(dat5)
```

We can view the variance explained by each component graphically using:

``` r
```r
plot_pca_variance(pca)
```

![](tools/pca_var.png)

And lastly we can plot the principal components themselves, to see how the analysis has clustered our data:

``` r
```r
plot_pca(pca, num_components = 2, meta_var = "Country")
```

![](tools/pca_2var.png)

We can control whether we want to visualise the first 2 or 3 components with the `num_components` argument:

``` r
```r
plot_pca(pca, num_components = 3, meta_var = "Country")
```

![](tools/pca_3var.png)
3 changes: 2 additions & 1 deletion TODO.md
@@ -13,4 +13,5 @@ samples, countries etc. At the moment the data is nonsense, and has no structure
or PCA utility, so it might be nice to come up with improvements for faking
large clustered genetic data.
5. Shiny interface for loading a dataset, sliders for filtering, and then
displayed plots.
displayed plots.
6. Fst permutation test
Binary file modified docs/README-plot_coverage-1.png
1 change: 1 addition & 0 deletions docs/TODO.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions docs/index.html


6 changes: 3 additions & 3 deletions docs/pkgdown.yml
@@ -1,5 +1,5 @@
pandoc: 1.17.2
pkgdown: 1.0.0
pkgdown_sha: ~
pandoc: 1.19.2.1
pkgdown: 1.0.0.9000
pkgdown_sha: d7c658122bfbd143552cd28585f867dd344302ad
articles: []

3 changes: 1 addition & 2 deletions docs/reference/mipmapper_file.html


Binary file modified docs/reference/plot_coverage-1.png
Binary file modified docs/tools/README-plot_coverage-1.png
2 changes: 1 addition & 1 deletion man/mipmapper_file.Rd


14 changes: 12 additions & 2 deletions pkgdown_link_formats.R
@@ -1,8 +1,14 @@
fix_pkgdown <- function(){

# run this after pkgdown::build_site to correct image links
# first knit the Rmd
knitr::knit("README.Rmd")

# then build the site
pkgdown::build_site()

# change lines in README.md to make link point to tools
lines <- readLines("README.md")
lines <- lines[-c(1:(which(grepl("# mipmapper",lines))-1))]
lines[grep("(.*png)",lines)] <- gsub("!\\[\\]\\(R","!\\[\\](tools/R",lines[grep("(.*png)",lines)])
writeLines(lines, "README.md")

@@ -23,8 +29,12 @@ l <- readLines("docs/index.html")
fun <- grep(".png",l,value=TRUE, fixed=TRUE)
files <- strsplit(fun,"/|\"") %>% lapply(function(x) grep("png",x,value=TRUE)) %>% unlist
for(i in seq_along(fun)) {
files[i] <- gsub("\"(.*png)\"",paste0("\"","tools/",files[i],"\""),fun[i])

files[i] <- gsub("\"(.*png)\"",paste0("\"","tools/",files[i],"\""),fun[i])

}
l[grepl(".png",l, fixed=TRUE)] <- files
writeLines(l,"docs/index.html")


}
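The link rewriting in `fix_pkgdown()` can be tried out in isolation. The following is a rough sketch rather than the package's actual approach: the toy lines and the `pattern` variable are assumptions made for illustration. It swaps in a single Perl-style regular expression that skips links already pointing at `tools/` and tolerates alt text such as the `plot of chunk plot_coverage` form seen in the README diff.

```r
# Toy README lines; the contents are made up for illustration.
lines <- c(
  "# mipmapper",
  "![plot of chunk plot_coverage](README-plot_coverage-1.png)",
  "Some text mentioning a .png file without an image link.",
  "![](tools/pca_var.png)"
)

# Match markdown image links to .png files whose target does not
# already start with "tools/" (negative lookahead needs perl = TRUE).
pattern <- "!\\[(.*?)\\]\\((?!tools/)([^)]*\\.png)\\)"

# Keep the alt text (\\1) and prefix the target (\\2) with "tools/".
fixed <- gsub(pattern, "![\\1](tools/\\2)", lines, perl = TRUE)

# seq_along() iterates zero times when nothing matches,
# whereas 1:length(x) would yield c(1, 0) for an empty x.
hits <- grep(pattern, lines, perl = TRUE)
for (i in seq_along(hits)) {
  message("rewrote line ", hits[i])
}
```

Because links that already start with `tools/` are left alone, running the rewrite twice on the same file should not stack a second `tools/` prefix.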
Binary file removed tests/testthat/Rplots.pdf
Binary file not shown.
6 changes: 6 additions & 0 deletions tests/testthat/test-misc.R
@@ -35,3 +35,9 @@ test_that("fast_read works", {
expect_equal(dat, dat2)
unlink("data.csv")
})

test_that("mipmapper_file() works", {
dat <- mipmapper_file("dummy_data.csv")
expect_equal(dim(dat), c(6336, 13))
})

Binary file modified tools/README-plot_coverage-1.png