FIS Import Documentation

Jerónimo Arenas-García edited this page Feb 10, 2020 · 2 revisions

Initial checklist:

  • Copy config.cf.default into config.cf, and update the necessary sections of the file
  • Make sure Selenium is installed in your Python distribution, along with the driver for Firefox (geckodriver)
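
The checklist above can be verified with a few lines of Python. This is only a sketch, not part of the repository: the helper name `check_setup` is hypothetical, and it assumes that `config.cf.default` sits in the working directory and that geckodriver is expected to be found on the `PATH`.

```python
import importlib.util
import shutil
from pathlib import Path


def check_setup(workdir="."):
    """Return a dict describing which checklist items are satisfied."""
    workdir = Path(workdir)
    default_cfg = workdir / "config.cf.default"
    cfg = workdir / "config.cf"

    # Create config.cf from the template only if it does not exist yet,
    # so an already-edited configuration is never overwritten.
    if not cfg.exists() and default_cfg.exists():
        shutil.copy(default_cfg, cfg)

    return {
        "config.cf present": cfg.exists(),
        "selenium importable": importlib.util.find_spec("selenium") is not None,
        "geckodriver on PATH": shutil.which("geckodriver") is not None,
    }


if __name__ == "__main__":
    for item, ok in check_setup().items():
        print(f"{'OK     ' if ok else 'MISSING'}  {item}")
```

The script only copies the template; the sections of config.cf still have to be edited by hand.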

Recommended sequential construction of the database: The following actions can be run all in sequence or individually, one at a time.

  1. Create the database manually, and make sure the database name, server details, and credentials provided in file config.cf are correct.

  2. Download all project HTML pages from the FIS portal. First, all project URLs are collected, and then each project page is retrieved and saved locally. The FIS portal uses JavaScript, so the site is crawled with Selenium. The sleep time between URL retrievals can be changed in the configuration file. Longer delays make the full download take longer, but make it more likely to complete correctly.

>> python importFIS.py --download
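
The trade-off between delay and reliability in the download step can be illustrated with a small throttled loop. This is a sketch of the general idea, not the code in importFIS.py: the function `download_pages`, its parameters, and the output file naming are all hypothetical.

```python
import time
from pathlib import Path


def download_pages(urls, fetch, out_dir, delay=2.0, sleep=time.sleep):
    """Fetch each URL in turn, save the HTML locally, and pause between requests.

    `fetch` is any callable mapping a URL to an HTML string; `delay` is the
    pause in seconds between consecutive retrievals.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, url in enumerate(urls):
        target = out_dir / f"project_{i:06d}.html"
        if not target.exists():          # skip pages saved by an earlier run
            target.write_text(fetch(url), encoding="utf-8")
            sleep(delay)                 # throttle only after a real retrieval
        saved.append(target)
    return saved
```

With Selenium, `fetch` could be a small function that calls `driver.get(url)` on a `selenium.webdriver.Firefox()` instance and returns `driver.page_source`. Skipping files that already exist means an interrupted run can be resumed without downloading everything again.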
  3. Drop all tables and create the database schema. The schema can be modified manually at the end of file FISmanager.py. Note that only autoincremental indexes are created initially, which speeds up the database creation process. After the data has been imported, you can create indexes and foreign keys based on the queries that you expect to run.
>> python importFIS.py --resetDB
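
The "load first, index later" approach described in this step can be sketched as follows. The example uses Python's built-in `sqlite3` module as a stand-in for whichever database server config.cf points to, and the columns `title` and `year` as well as the index name are hypothetical; only `FISprojects` and `projectID` come from this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1: create the table with only an autoincremental primary key.
conn.execute(
    "CREATE TABLE FISprojects ("
    " projectID INTEGER PRIMARY KEY AUTOINCREMENT,"
    " title TEXT,"      # hypothetical column
    " year INTEGER)"    # hypothetical column
)

# Step 2: bulk-load rows while no secondary index has to be maintained.
rows = [("Project A", 2018), ("Project B", 2019), ("Project C", 2019)]
conn.executemany("INSERT INTO FISprojects (title, year) VALUES (?, ?)", rows)
conn.commit()

# Step 3: once the data is in, add indexes that match the expected queries.
conn.execute("CREATE INDEX idx_projects_year ON FISprojects (year)")

count_2019 = conn.execute(
    "SELECT COUNT(*) FROM FISprojects WHERE year = ?", (2019,)
).fetchone()[0]
print(count_2019)
```

The exact DDL syntax differs between database engines, so the statements would need to be adapted to the server actually in use.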
  4. Import project data. This goes through all downloaded files and inserts the project information into table FISprojects. During import, each project is assigned a unique identifier (the autoincremental projectID field).
>> python importFIS.py --importData
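
How the import assigns identifiers can be pictured with a minimal loop over the saved files. This is a sketch under assumptions, not the repository's import code: it uses `sqlite3` as a stand-in database, stores the raw HTML instead of parsed project fields, and the function `import_projects` and the columns `sourceFile` and `rawHtml` are hypothetical.

```python
import sqlite3
from pathlib import Path


def import_projects(conn, html_dir):
    """Insert one row per saved HTML file and return {filename: projectID}."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS FISprojects ("
        " projectID INTEGER PRIMARY KEY AUTOINCREMENT,"
        " sourceFile TEXT,"   # hypothetical column
        " rawHtml TEXT)"      # hypothetical column
    )
    ids = {}
    for path in sorted(Path(html_dir).glob("*.html")):
        cur = conn.execute(
            "INSERT INTO FISprojects (sourceFile, rawHtml) VALUES (?, ?)",
            (path.name, path.read_text(encoding="utf-8")),
        )
        ids[path.name] = cur.lastrowid   # identifier assigned by the database
    conn.commit()
    return ids
```

Because the database generates `projectID`, the import code never has to track or compute identifiers itself; it only reads back the value assigned to each inserted row.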