Commit 82c2f04

Merge pull request #41 from CMA-Lab/review
Big review of the MTP-DB
2 parents c1b7d0c + 6ac26aa commit 82c2f04

22 files changed

Lines changed: 699 additions & 150 deletions

.vscode/settings.json

Lines changed: 3 additions & 14 deletions
@@ -9,13 +9,15 @@
 "aquaporins",
 "argparser",
 "assertthat",
+"biomart",
 "BIOMART",
 "CEPT",
 "colnames",
 "CPUS",
 "dataframes",
 "docstrings",
 "ensembl",
+"Ensembl",
 "ensg",
 "ensp",
 "entrezgene",
@@ -38,6 +40,7 @@
 "iuphar",
 "lmap",
 "Lysylphosphatidylglycerol",
+"mazeinspector",
 "mrna",
 "pbar",
 "permeabilities",
@@ -54,20 +57,6 @@
 "transportome",
 "treelib",
 "usefixtures",
-"wrapattr",
-"biomart",
-"colnames",
-"CPUS",
-"Ensembl",
-"executescript",
-"hardpoint",
-"hardpoints",
-"iuphar",
-"lmap",
-"pbar",
-"pqdm",
-"refseq",
-"tcdb",
 "wrapattr"
 ],
 }

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ and this project adheres to [Calendar Versioning](https://calver.org/) with the
 ### Changes
 - **BREAKING**: Changed table name from `atp_driven_transporters` to `pumps`.
 - Update run_rebuilder to be compatible with the new CLI;
+- Added tables `structure`, `function` and `origin` with information about the structure, physiological function and tissue of expression of genes.
 - Write better READMEs.

 ## [0.23.15-beta] - First release
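
Review note: code that queries a database built before this change will break on the rename from `atp_driven_transporters` to `pumps`. A minimal sketch of how a consumer could cope with both layouts follows. It assumes the database is a SQLite file opened with the standard `sqlite3` module, and the helper name `resolve_pumps_table` is hypothetical, not part of Daedalus.

```python
import sqlite3

OLD_NAME = "atp_driven_transporters"  # table name before this change
NEW_NAME = "pumps"  # table name after this change


def resolve_pumps_table(conn: sqlite3.Connection) -> str:
    """Return whichever of the two table names exists in this database.

    Hypothetical helper for illustration only; it prefers the new name
    when both tables are present.
    """
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name IN (?, ?)",
        (OLD_NAME, NEW_NAME),
    ).fetchall()
    names = {row[0] for row in rows}
    if NEW_NAME in names:
        return NEW_NAME
    if OLD_NAME in names:
        return OLD_NAME
    raise LookupError(f"neither {NEW_NAME!r} nor {OLD_NAME!r} is present")
```

Because the returned value can only be one of the two constants, it is safe to interpolate into a query string; arbitrary user input should never be used that way.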

CONTRIBUTING.md

Lines changed: 10 additions & 1 deletion
@@ -12,7 +12,16 @@ If you find a problem, want to point out an error, or have a suggestion or other
 We also use issues to keep track of features that we want in the DB, or things we want to change. Please refer to the [ROADMAP.md](ROADMAP.md) file for the project's roadmap. Feel free to comment on roadmap issues or start working on an item in the roadmap by [creating a fork of the project](https://docs.github.com/en/get-started/quickstart/fork-a-repo).

 ## Code style
-We strive to keep a unified code style. For this reason, we use [`pre-commit`](https://pre-commit.com/) to unify the style of our Python code. If you want to contribute code, please make sure you setup the pre-commit hooks with [`pre-commit`](https://pre-commit.com/).
+We strive to keep a unified code style. For this reason, we use [`pre-commit`](https://pre-commit.com/) to unify the style of our Python code. If you want to contribute code, please make sure you set up the pre-commit hooks with [`pre-commit`](https://pre-commit.com/):
+
+```bash
+# For all hooks to be installed, **both** commands must be run.
+pre-commit install
+pre-commit install -t commit-msg
+```
+
+## Versioning
+We follow [CalVer](https://calver.org/) `MAJOR.YY.0W[_MINOR][-Modifier]`. It's a bit weird but the idea is to warn the user immediately if the major version changes.

 ## Commit Messages
 We follow the [`conventional commits`](https://www.conventionalcommits.org/en/v1.0.0/) specification.
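
Review note: the versioning scheme added here can be checked mechanically. The sketch below parses strings such as `1.23.25-beta` into their parts. The regular expression is an assumption inferred from the `MAJOR.YY.0W[_MINOR][-Modifier]` description; it accepts either `_` or `.` before the minor part, because the two descriptions in this commit differ on that separator. The names `CalVer` and `parse_calver` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed grammar: MAJOR.YY.WW, optional minor after "_" or ".",
# optional modifier after "-".
CALVER_PATTERN = re.compile(
    r"^(?P<major>\d+)\.(?P<year>\d{2})\.(?P<week>\d{1,2})"
    r"(?:[._](?P<minor>\d+))?"
    r"(?:-(?P<modifier>[0-9A-Za-z.]+))?$"
)


class CalVer(NamedTuple):
    major: int
    year: int  # two-digit year
    week: int  # week of the year, 1 to 53
    minor: Optional[int]
    modifier: Optional[str]


def parse_calver(version: str) -> CalVer:
    """Split a version string into its CalVer components."""
    match = CALVER_PATTERN.match(version)
    if match is None:
        raise ValueError(f"not a recognised version string: {version!r}")
    week = int(match["week"])
    if not 1 <= week <= 53:
        raise ValueError(f"week out of range in {version!r}")
    minor = match["minor"]
    return CalVer(
        major=int(match["major"]),
        year=int(match["year"]),
        week=week,
        minor=int(minor) if minor is not None else None,
        modifier=match["modifier"],
    )
```

Comparing the `major` field of two parsed versions is then enough to flag the kind of breaking change the scheme is meant to signal.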

README.md

Lines changed: 18 additions & 0 deletions
@@ -79,3 +79,21 @@ Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/d
 <!-- ALL-CONTRIBUTORS-LIST:END -->

 This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome!
+
+
+## Licensing
+
+Daedalus itself and all associated software in this repository are licensed under the terms of the [`COPYING`](COPYING) file. We are not sponsored or licensed by any of the data sources that we use.
+
+The data downloaded and parsed by Daedalus is licensed under:
+- The IUPHAR database: [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/) - Harding SD, Armstrong JF, Faccenda E, Southan C, Alexander SPH, Davenport AP, Pawson AJ, Spedding M, Davies JA; NC-IUPHAR. (2021) The IUPHAR/BPS guide to PHARMACOLOGY in 2022: curating pharmacology for COVID-19, malaria and antibacterials. Nucl. Acids Res. 2022. 50(D1) D1282-D1294. doi: 10.1093/nar/gkab1010. [Full text](https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkab1010/6414576). PMID: 34718737.
+- The COSMIC database: [Custom License](https://cancer.sanger.ac.uk/license/COSMIC_Non_Commercial_Use_Terms_October2021.pdf). Data is remixed and parsed, so the content of the COSMIC database is not redistributed as-is (which would be forbidden).
+- The Gene Ontology: [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/). Current Zenodo DOI: https://dx.doi.org/10.5281/zenodo.7504797
+- The TCDB: [CC BY 3.0](https://creativecommons.org/licenses/by/3.0/) - Latest publication: Saier MH, Reddy VS, Moreno-Hagelsieb G, Hendargo KJ, Zhang Y, Iddamsetty V, Lam KJK, Tian N, Russum S, Wang J, Medrano-Soto A. (2021). The Transporter Classification Database (TCDB): 2021 update. Nucleic Acids Res. 49(D1):D461-7 [33170213](https://pubmed.ncbi.nlm.nih.gov/33170213/)
+- Ensembl places its data in the Public Domain ([more info](https://www.ensembl.org/info/about/legal/disclaimer.html)).
+- The HUGO Gene Nomenclature Committee places its data in the Public Domain ([CC0](https://creativecommons.org/share-your-work/public-domain/cc0/), [more info](https://www.genenames.org/about/license/))
+- The Human Protein Atlas: [CC BY 3.0](https://creativecommons.org/licenses/by/3.0/) - Uhlén M et al., Tissue-based map of the human proteome. Science (2015)
+PubMed: 25613900 DOI: 10.1126/science.1260419 ([more info](https://www.proteinatlas.org/about/licence))
+
+
+(Last updated on the 22nd of June, 2023)

src/daedalus/__init__.py

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,8 @@
 from colorama import Back, Fore, Style

 __all__ = ["DB_NAME", "SCHEMA"]
-__version__ = "0.23.17-beta"
+# This follows CalVer MAJOR.YEAR.WEEK[.minor][-tag]
+__version__ = "1.23.25-beta"


 class ColorFormatter(logging.Formatter):

src/daedalus/constants/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -10,6 +10,7 @@
 HUGO,
 IUPHAR_COMPILED,
 IUPHAR_DB,
+PROTEIN_ATLAS,
 SLC_TABLES,
 TCDB,
 )
@@ -30,6 +31,7 @@
 "CACHE_NAME",
 "THESAURUS_FILE",
 "GO",
+"PROTEIN_ATLAS",
 ]

 ## TODO: It could be beneficial to bundle all of these constants into

src/daedalus/constants/url_hardpoints.py

Lines changed: 7 additions & 0 deletions
@@ -148,3 +148,10 @@
 "sodium_ion_channels": "GO:0005272",
 },
 }
+
+PROTEIN_ATLAS = {
+# At the time of writing the site is v23, and we can pinpoint the version by going to
+# http://v23.proteinatlas.org/ if we want to use just this version.
+"normal_tissue_expression": "https://www.proteinatlas.org/download/normal_tissue.tsv.zip",
+"subcellular_location": "https://www.proteinatlas.org/download/subcellular_location.tsv.zip",
+}
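
Review note: both new `PROTEIN_ATLAS` entries point at zipped TSV files. The sketch below shows one way such a file could be fetched and read using only the standard library. It is not the parser shipped in this commit: the function names `download` and `read_zipped_tsv` are hypothetical, and it assumes each archive holds exactly one UTF-8 encoded `.tsv` member with a header row.

```python
import csv
import io
import zipfile
from urllib.request import urlopen


def download(url: str, timeout: float = 60.0) -> bytes:
    """Fetch the raw bytes behind a URL."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def read_zipped_tsv(payload: bytes) -> list[dict[str, str]]:
    """Read the single .tsv member of a zip archive into a list of rows.

    Assumes exactly one UTF-8 .tsv file whose first line is the header.
    """
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        members = [name for name in archive.namelist() if name.endswith(".tsv")]
        if len(members) != 1:
            raise ValueError(f"expected one .tsv member, found {members}")
        with archive.open(members[0]) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text, delimiter="\t"))
```

With the constant from this diff, a call such as `read_zipped_tsv(download(PROTEIN_ATLAS["normal_tissue_expression"]))` would return one dictionary per row, keyed by whatever column names the downloaded file declares. Pinning the host to `v23.proteinatlas.org`, as the code comment suggests, would keep the result stable across releases of the site.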
