FIS Import Documentation
Jerónimo Arenas-García edited this page Feb 10, 2020
·
2 revisions
Initial checklist:
- Copy config.cf.default into config.cf, and update the necessary sections of the file
- Make sure Selenium is installed in your Python distribution, along with the driver for Firefox (Geckodriver)
Recommended sequential construction of the database: The following actions can be run all in sequence, or one at a time.
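The checklist above can be verified with a short pre-flight script. This is only a sketch and not part of the repository: the function name `preflight` is hypothetical, and it assumes that Geckodriver is expected to be found on the system PATH, which may not match how your Selenium version locates its driver.

```python
import importlib.util
import shutil
from pathlib import Path


def preflight(config_path="config.cf", driver_name="geckodriver"):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # The checklist asks for config.cf to be created from config.cf.default.
    if not Path(config_path).is_file():
        problems.append(f"{config_path} not found; copy config.cf.default and edit it")
    # find_spec returns None when the top-level package is not installed.
    if importlib.util.find_spec("selenium") is None:
        problems.append("selenium is not importable in this Python environment")
    # Assumption: the Firefox driver is expected to be an executable on PATH.
    if shutil.which(driver_name) is None:
        problems.append(f"{driver_name} was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("MISSING:", problem)
```

Running the script from the folder that contains config.cf prints one line per missing prerequisite, and prints nothing when everything is in place.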
- Create the database manually, and make sure the database name, server details, and credentials provided in file config.cf are correct.
- Download all project HTML pages from the FIS portal. First, all project URLs are retrieved, and then all project files are downloaded and saved locally. The FIS portal uses JavaScript, so we crawl the site using Selenium. The sleep time between URL retrievals can be modified in the configuration file. Longer delays make the full download slower, but also make it more likely to complete correctly.
>> python importFIS.py --download
- Drop all tables and create the database schema. The schema can be modified manually at the end of file FISmanager.py. Note that only autoincremental indexes are created initially, which speeds up the database creation process. After the data has been imported, you can create indexes and foreign keys based on the queries that you expect to run.
>> python importFIS.py --resetDB
- Import project data. This goes through all downloaded files and inserts all project information into table FISprojects. During import, each project is assigned a unique identifier (autoincremental projectID field).
>> python importFIS.py --importData
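The advice to create indexes only after the data has been imported can be illustrated with a small self-contained sketch. It makes several assumptions: this page does not name the database engine, so Python's built-in sqlite3 module is used here purely so that the example runs without a server; the columns title and startYear and the index name idx_startYear are hypothetical. Only the table name FISprojects and the autoincremental projectID field come from the steps above.

```python
import sqlite3


def build_demo(n_rows=1000):
    """Create a toy FISprojects table, bulk-insert rows, then add an index."""
    conn = sqlite3.connect(":memory:")  # stand-in for the real database server
    conn.execute(
        "CREATE TABLE FISprojects ("
        " projectID INTEGER PRIMARY KEY AUTOINCREMENT,"
        " title TEXT,"         # hypothetical column
        " startYear INTEGER)"  # hypothetical column
    )
    # Bulk insert while the table carries only its autoincremental key.
    rows = [(f"Project {i}", 2000 + i % 20) for i in range(n_rows)]
    conn.executemany(
        "INSERT INTO FISprojects (title, startYear) VALUES (?, ?)", rows
    )
    # Add the secondary index afterwards, chosen for the queries you expect.
    conn.execute("CREATE INDEX idx_startYear ON FISprojects (startYear)")
    conn.commit()
    return conn


def uses_index(conn, query, params=()):
    """Report whether SQLite's query plan mentions the demo index."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
    return any("idx_startYear" in row[-1] for row in plan)


if __name__ == "__main__":
    conn = build_demo()
    query = "SELECT title FROM FISprojects WHERE startYear = ?"
    print("index used:", uses_index(conn, query, (2005,)))
```

On a real server the equivalent CREATE INDEX and foreign key statements would be issued once after the import step, with columns picked to match your own query patterns.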