…ctive modeling for ML practitioners
REDESIGN_2026.md: Full redesign document with three goals:
- Goal 1: Product separation from BP2C (retained/removed/modified fields,
tech stack role-first architecture, existing question improvements)
- Goal 2: Policy-driving artifact via AMITI (6 policy blocks + advocacy brief)
- Goal 3: BP2C enrollment hook (eNPS, leave reason, job search)
- 62 active items (12 retained, 12 redesigned, 13 tech stack, 25 new)
- ~165 items removed (benefits, COVID, checkbox matrices)
question_inventory_2026.csv: Excel-friendly question list with bilingual
questions, types, options, skip logic, goal mapping, and section references.
Includes 18 removed-field rows documenting what was dropped and why.
Monte Carlo simulation (n=6000, seed=2026) comparing the old ~130-item checkbox-heavy design vs the new 62-item structured design.
Key findings:
- R² improves from 0.34 to 0.49 (+14.9 pp)
- seniority_level alone adds +12.4 pp (biggest gap closed)
- Information efficiency (R²/min) triples (+208%)
- Coefficient stability improves 87% (bootstrap CV)
- 629 more usable responses from higher completion rate
- 62% more effective information (R² × N)
Files:
- simulation_old_vs_new.py: reproducible simulation script
- SIMULATION_FINDINGS.md: interpreted findings document
- simulation_results/*.csv: raw numeric outputs
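For readers unfamiliar with the metrics named above, the following is a minimal sketch of how an old-vs-new design comparison of this kind could be set up. It is not the contents of simulation_old_vs_new.py. The data-generating process, the coefficient values, and the completion times (`minutes_old`, `minutes_new`) are hypothetical and chosen only for illustration. Only `n=6000`, `seed=2026`, and the idea that the new design captures a seniority variable come from the description above.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R² of an ordinary least squares fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(2026)  # seed taken from the description
n = 6000                           # sample size taken from the description

# Hypothetical respondent attributes (assumed distributions, not survey data).
experience = rng.gamma(shape=2.0, scale=4.0, size=n)
seniority = np.clip(np.round(experience / 4 + rng.normal(0, 1, n)), 0, 4)
english = rng.integers(0, 5, size=n)

# Hypothetical salary model: seniority matters even after experience.
log_salary = (10 + 0.03 * experience + 0.25 * seniority
              + 0.10 * english + rng.normal(0, 0.5, n))

# "Old" design lacks seniority; "new" design records it.
X_old = np.column_stack([experience, english])
X_new = np.column_stack([experience, english, seniority])

r2_old = r_squared(X_old, log_salary)
r2_new = r_squared(X_new, log_salary)

# Information efficiency: R² per minute of respondent time (minutes assumed).
minutes_old, minutes_new = 25.0, 12.0
eff_old = r2_old / minutes_old
eff_new = r2_new / minutes_new

print(f"R² old={r2_old:.3f} new={r2_new:.3f}")
print(f"R²/min old={eff_old:.4f} new={eff_new:.4f}")
```

Because the two models are nested, the in-sample R² of the new design can never be lower than the old one; the size of the gap depends entirely on the assumed seniority effect, so the numbers this sketch prints should not be compared with the findings listed above.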
16 slides covering: rationale, 3 goals, removals, additions, seniority_level impact, tech stack redesign, policy blocks, BP2C hook, simulation evidence, information efficiency, respondent experience, roadmap, and decisions needed.
Source: slides_redesign_2026.md (Pandoc Markdown)
Output: slides_redesign_2026.html (reveal.js, CDN-loaded)
Directory layout:
- data/ — raw survey CSVs + options/ lookup tables
- notebooks/ — salarios.ipynb, causal_analysis.ipynb
- output/ — figures/, simulation_results/, model outputs
- docs/ — writeups, ig_scripts/
- redesign-2026/ — REDESIGN_2026.md, question inventory, simulation,
slidev-deck/ (18-slide presentation in Mexican Spanish)
Cleanup:
- Remove voila-demo.ipynb (unused)
- Remove slides_redesign_2026.html/md (superseded by Slidev)
- Remove tablas/ (empty, outputs now in output/)
- Remove README_original.md
- Update notebook paths to match new layout
- Update .gitignore to track slidev-deck source, ignore node_modules/dist
Redesigns salary survey for 2026 with improved structure and analytics
This pull request adds comprehensive documentation, analysis, and communication materials for a causal inference study on Mexican IT salary survey data. The main contributions include a detailed execution summary mapping the project to its specification, a formal specification document, and a set of social media scripts designed to communicate key findings. Additionally, the original README is updated to clarify data usage permissions.
Documentation and Analysis:
- Adds CAUSAL_ANALYSIS_SUMMARY.md, providing a detailed mapping of how each item in the project specification was addressed, including methodology, confounder controls, key findings, limitations, and actionable insights.
- Adds SPECIFICATION.md, outlining the requirements and constraints for the causal analysis, including data sources, periods, exclusions, and the main analytical goal.
Communication Materials:
- Adds social media scripts (ig_scripts/01_intro_estudio.md, 02_experiencia_vale.md, 03_ingles_48k.md, 04_brecha_genero.md) to communicate the study’s main findings, methodology, and societal implications in an accessible, engaging way. Each script includes a storyboard, required assets, music suggestions, hashtags, and captions.
Repository Information:
- Updates README_original.md to clarify the academic/research-only license and provide guidance for commercial use requests.