This project simulates realistic financial advisor and fund data for training and evaluating ML models such as Sequential Attention Networks and graph neural networks (GNNs). Each feature's distribution is chosen to mimic patterns observed in real asset management and financial product recommendation data.
- Advisor assets under management (AUM) are simulated using a log-normal distribution.
- Reflects that most advisors manage less than $10M, while a small number manage very large portfolios.
- Log-scaled histogram shows long-tail behavior common in wealth distribution.
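The log-normal draw described above could be sketched as follows. The parameters here (a median of $2M and `sigma = 1.5`) and the function name `simulate_aum` are illustrative assumptions, not the project's actual settings; they are chosen only so that most samples fall below $10M while a few are far larger.

```python
import numpy as np

def simulate_aum(n, median=2_000_000, sigma=1.5, seed=42):
    """Draw n assets-under-management values from a log-normal distribution.

    median and sigma are hypothetical values chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    # For a log-normal, the median equals exp(mu), so mu = ln(median).
    mu = np.log(median)
    return rng.lognormal(mean=mu, sigma=sigma, size=n)

aum = simulate_aum(3000)
share_under_10m = np.mean(aum < 10_000_000)  # fraction of advisors below $10M
log_aum = np.log10(aum)                      # log scale makes the long tail visible
```

Plotting a histogram of `log_aum` rather than `aum` is what makes the long right tail readable, since the raw values span several orders of magnitude.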
- Engagement scores are sampled from a Beta(2, 5) distribution.
- This gives a realistic right-skewed distribution (mean 2/7 ≈ 0.29): most users are lightly engaged, and a few power users are highly active.
- Matches real-world marketing/funnel dynamics.
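A minimal sketch of the Beta(2, 5) sampling, assuming the score lives on the unit interval. The function name `simulate_engagement` is hypothetical.

```python
import numpy as np

def simulate_engagement(n, seed=0):
    """Draw n engagement scores in (0, 1) from Beta(2, 5)."""
    rng = np.random.default_rng(seed)
    return rng.beta(2.0, 5.0, size=n)

scores = simulate_engagement(3000)
# The theoretical mean of Beta(a, b) is a / (a + b) = 2 / 7.
mean_score = scores.mean()
# Share of lightly engaged users (score below 0.5).
share_low = np.mean(scores < 0.5)
```

Because the second shape parameter is larger than the first, the mass sits near zero and the tail stretches toward one, which is the "few power users" pattern.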
- Regions are assigned by weighted sampling that emphasizes North America (50%), with diminishing representation in Europe, Asia, the Middle East, and South America.
- Mirrors actual global distribution of financial advisors and fund clients.
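The weighted region sampling might look like the sketch below. Only the 50% weight for North America comes from the description above; the remaining weights are assumptions that simply decrease in the listed order and sum to 1.

```python
import numpy as np

REGIONS = ["North America", "Europe", "Asia", "Middle East", "South America"]
# 0.50 is from the text; the other four weights are illustrative assumptions.
WEIGHTS = [0.50, 0.25, 0.15, 0.06, 0.04]

def sample_regions(n, seed=0):
    """Assign a region to each of n advisors by weighted sampling."""
    rng = np.random.default_rng(seed)
    return rng.choice(REGIONS, size=n, p=WEIGHTS)

regions = sample_regions(3000)
# Empirical share of each region, for checking against the target weights.
shares = {r: float(np.mean(regions == r)) for r in REGIONS}
```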
- Fund types are biased toward Growth and Value funds, which are more common and widely marketed.
- Categories such as Balanced and Aggressive are deliberately underrepresented, matching real fund catalogs.
- The number of products held per investor follows a Gaussian (bell-curve) distribution centered around ~10.
- Reflects typical investor behavior: building a diversified portfolio of 8–15 products.
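One way to turn a Gaussian draw into product counts is to round and clip it, as in this sketch. The standard deviation of 2.5 and the lower bound of 1 are assumptions; the text only states the center (~10) and the typical 8–15 range.

```python
import numpy as np

def sample_portfolio_sizes(n, mean=10.0, std=2.5, seed=0):
    """Draw n integer portfolio sizes from a rounded, clipped Gaussian.

    std=2.5 and the minimum of 1 product are hypothetical choices.
    """
    rng = np.random.default_rng(seed)
    raw = rng.normal(loc=mean, scale=std, size=n)
    # Round to whole products and make sure nobody holds fewer than one.
    return np.clip(np.rint(raw), 1, None).astype(int)

sizes = sample_portfolio_sizes(3000)
share_8_to_15 = np.mean((sizes >= 8) & (sizes <= 15))
```

Clipping slightly distorts the lower tail, but with the mean several standard deviations above 1 the effect on the overall shape is negligible.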
- `data/investors.csv`: 3000 advisors with detailed profiles
- `data/funds.csv`: 500 financial products
- `data/investor_product_edges.csv`: ~12,000 historical investor-fund links
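A small loading sketch for the three files, using only the standard library. It assumes each CSV has a header row; the column names are not documented here, so the code does not rely on any of them. The helper `load_tables` is hypothetical and not part of the project.

```python
import csv
from pathlib import Path

# File names as listed above, relative to the data directory.
FILES = {
    "investors": "investors.csv",
    "funds": "funds.csv",
    "edges": "investor_product_edges.csv",
}

def load_tables(data_dir="data"):
    """Read each CSV into a list of dicts keyed by its header row."""
    tables = {}
    for name, filename in FILES.items():
        path = Path(data_dir) / filename
        with open(path, newline="", encoding="utf-8") as f:
            tables[name] = list(csv.DictReader(f))
    return tables
```

Calling `load_tables()` from the repository root and checking `len(tables["investors"])`, `len(tables["funds"])`, and `len(tables["edges"])` is a quick way to confirm the row counts match the figures listed above.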