This repository includes all the code I have used to develop FondoAdvisor, a website to help prospective or current investors with their financial decisions.
The tool consists of a website built with Streamlit, backed by a pipeline that extracts public information about investment funds from the CNMV, processes it and saves it.
The pipeline relies on two packages: p_acquisition and p_wrangling. Each package contains a module (script) with the functions used by the main script.
- In the acquisition package, the information is extracted from the CNMV using the requests library and BeautifulSoup. The extracted data is saved as CSV files in the 'data' folder:
  - General fund information (e.g. fund investment group, depositary, depositary rating) is stored in 'data/csv_gen_info'.
  - Historical financial information (e.g. profitability, risk, rotation, fees) is stored in 'data/csv'.
- In the wrangling package, the dataframe is built, cleaned and processed. New data is also generated to improve the visualization (stored in 'data/created_data').
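As a rough illustration of the acquisition step described above, the sketch below turns an HTML table into CSV text. It is not the code in p_acquisition: the sample markup and the names `TableRows` and `table_to_csv` are hypothetical, and it uses the standard library's `html.parser` in place of requests and BeautifulSoup so that it runs without a network connection or extra installs.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row being built, or None when outside a <tr>
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Hypothetical markup, not taken from the CNMV site.
SAMPLE = (
    "<table><tr><th>fund</th><th>fee</th></tr>"
    "<tr><td>Fondo A</td><td>1.5</td></tr></table>"
)
csv_text = table_to_csv(SAMPLE)
```

In the real pipeline the HTML would come from an HTTP response and the CSV text would be written to a file under 'data'; both steps are left out here to keep the sketch offline.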
The front end is a website created using Streamlit and hosted on Amazon Web Services. The main Streamlit script is supported by separate scripts stored in the 'front_end' folder. For AWS hosting, I followed a tutorial, but there are many others.
The folder 'notebooks' contains the Jupyter notebooks used during the development phase; they are no longer in use.
The script 'playgrounds' is a light version of the pipeline for when only a subset of the funds is required. The funds to download must be written as a list in the variable 'fondos'.
If you want to download data from CNMV, you just need to run the main.py script with your Python launcher.
There are some optional arguments that can be used:
-d or --delete: when given the value 'Y' or 'y', all the files in the data folder are deleted so that the download starts clean.*
*If an unsupported value is provided, a ValueError is raised, except for the delete argument, which ignores any other input and deletes no files.
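The behaviour of the delete argument can be sketched with argparse as follows. This is a minimal sketch, not the code in main.py: the function names `build_parser`, `should_delete`, `clear_folder` and `main`, and the default value 'n', are assumptions made for illustration.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser exposing only the -d/--delete option described above."""
    parser = argparse.ArgumentParser(description="Download fund data from the CNMV.")
    parser.add_argument(
        "-d", "--delete",
        default="n",  # assumed default: keep existing files
        help="pass 'Y' or 'y' to empty the data folder before downloading",
    )
    return parser


def should_delete(value):
    """Only 'Y' or 'y' trigger a deletion; any other input is ignored."""
    return value in ("Y", "y")


def clear_folder(folder):
    """Delete every file under `folder`, keeping the directories themselves."""
    removed = 0
    for path in Path(folder).rglob("*"):
        if path.is_file():
            path.unlink()
            removed += 1
    return removed


def main(argv=None):
    """Parse the arguments and clear the 'data' folder only when asked to."""
    args = build_parser().parse_args(argv)
    if should_delete(args.delete):
        clear_folder("data")
```

With this sketch, `main(["-d", "y"])` would empty the 'data' folder, while `main(["-d", "maybe"])` or `main([])` would leave it untouched.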
The pipeline includes a Telegram alert sent by a bot, but you will need to create your own bot because the bot keys are not included in the repository. I followed a tutorial, but there are many others.
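Once you have your own bot, an alert can be sent through the `sendMessage` method of the Telegram Bot API. The sketch below is one possible way to do it and is not necessarily how the pipeline sends its alert: the function names are hypothetical, the token and chat id are placeholders you must supply, and it uses the standard library's urllib so that it has no extra dependencies.

```python
import json
import urllib.request

API_ROOT = "https://api.telegram.org"


def build_alert_request(token, chat_id, text):
    """Prepare (without sending) a sendMessage request for the Bot API."""
    url = f"{API_ROOT}/bot{token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_alert(token, chat_id, text, timeout=10):
    """Send the message and return the 'ok' flag from Telegram's JSON reply."""
    request = build_alert_request(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response).get("ok", False)
```

A call such as `send_alert(my_token, my_chat_id, "Pipeline finished")` at the end of the main script would notify you when the download stops; keep the token out of version control.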
To run the Streamlit server, run main_streamlit.py with your Python launcher. The website includes 3 tools:
- Comparador de fondos (fund comparison): allows easy comparison of different investment funds and their main variables.
- ¿Cuánto habría ganado si...? (How much would I have earned if...?): provides the current value of an investment that you could have made in a specific fund in the past.
- Top 5 fondos (top 5 funds): provides the top 5 funds, ranked by profitability in a specific period. Please be patient, since it needs to load the whole database. It will only take 2-3 minutes.
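The arithmetic behind the last two tools can be sketched as follows. This is only an illustration under stated assumptions, not the code in 'front_end': it assumes that each fund's profitability is available as a list of periodic returns in percent, and the function names `current_value` and `top_funds` are hypothetical.

```python
from math import prod


def current_value(amount, period_returns_pct):
    """Compound `amount` through consecutive periodic returns given in percent."""
    return amount * prod(1 + r / 100 for r in period_returns_pct)


def top_funds(returns_by_fund, n=5):
    """Return the names of the `n` funds with the highest cumulative return."""
    cumulative = {
        fund: current_value(1.0, returns) - 1.0
        for fund, returns in returns_by_fund.items()
    }
    return sorted(cumulative, key=cumulative.get, reverse=True)[:n]


# Made-up figures, for illustration only.
example_returns = {"A": [1, 1], "B": [5, -1], "C": [-2, 0]}
best_two = top_funds(example_returns, n=2)
```

Returns are compounded rather than added, so a gain of 10% followed by a loss of 10% leaves slightly less than the initial amount.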
The project relies on the following libraries:
- argparse: for passing arguments on the command line
- os: for interaction with the operating system: directories and files.
- Pandas: for data treatment
- requests: for API and HTML queries
- BeautifulSoup: for scraping the HTML code
- datetime: to work with date values
- tqdm: for progress visualization
- Streamlit: to create the website
- Altair: for some special visualizations in Streamlit.
- Telegram: to get an alert when the script has stopped