Python code for calculating sampling rates in stratified samples. The current version only includes code for calculating sampling rates for an online panel.

uscensusbureau/optimal-stratified-sampling-with-python

optimal-stratified-sampling-with-python

Description

Optimal Stratified Sampling with Python

This package calculates optimal sampling rates for a stratified sample. The current version only has code for calculating sampling rates for an online panel: a Python file containing the main function and a test script. Code for other survey settings may be added at a later date.

The package uses numerical optimization to calculate sampling rates, building on ideas from Valliant et al. (2018). The setting is a survey with less than a 100% response rate, with a setup similar to Mendelson and Elliott (2024). For the online panel code, full details of the model and computation will appear in a forthcoming working paper. The function takes key parameters, such as response rates and cost estimates, as inputs. The algorithm then uses minimization routines from the SciPy package to find the number of sampled cases that yields the lowest average standard error of the estimates, subject to the budget constraint.
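To make the idea concrete, here is a minimal sketch of a budget-constrained allocation using scipy.optimize.minimize. It is not the package's function or model: the name allocate_sample, its parameters, and the simplified standard error formula (stratum standard deviation divided by the square root of expected respondents, with no finite population correction) are all assumptions chosen for illustration. The sketch optimizes over budget shares rather than raw counts to keep the problem well scaled, then converts the shares back to sampled cases.

```python
import numpy as np
from scipy.optimize import minimize


def allocate_sample(std_dev, response_rate, cost_per_case, budget):
    """Hypothetical helper: sampled cases per stratum under a fixed budget.

    Assumes SE_h = std_dev_h / sqrt(n_h * response_rate_h), which is a
    simplification and not the model used by the package.
    """
    std_dev = np.asarray(std_dev, dtype=float)
    response_rate = np.asarray(response_rate, dtype=float)
    cost_per_case = np.asarray(cost_per_case, dtype=float)
    n_strata = std_dev.size

    def sampled_cases(shares):
        # Convert the share of the budget spent in each stratum to a count.
        return shares * budget / cost_per_case

    def stratum_se(shares):
        respondents = sampled_cases(shares) * response_rate
        return std_dev / np.sqrt(respondents)

    def average_se(shares):
        return np.mean(stratum_se(shares))

    def average_se_grad(shares):
        # d/dw of mean(k_h * w_h**-0.5) is -0.5 * SE_h / (H * w_h).
        return -0.5 * stratum_se(shares) / (n_strata * shares)

    result = minimize(
        average_se,
        x0=np.full(n_strata, 1.0 / n_strata),  # start from equal shares
        jac=average_se_grad,
        method="SLSQP",
        bounds=[(1e-6, 1.0)] * n_strata,
        # More cases always lower the SE, so the whole budget is spent.
        constraints=[{"type": "eq", "fun": lambda shares: shares.sum() - 1.0}],
        options={"ftol": 1e-10, "maxiter": 500},
    )
    if not result.success:
        raise RuntimeError(result.message)
    return sampled_cases(result.x)


# Illustrative inputs only; these values are made up.
cases = allocate_sample(
    std_dev=[1.0, 2.0, 1.5],
    response_rate=[0.5, 0.3, 0.4],
    cost_per_case=[10.0, 12.0, 8.0],
    budget=50_000.0,
)
print(np.round(cases))
```

Under this simplified model, strata with higher variability, lower response rates, or lower per-case costs receive more sampled cases. Dividing each stratum's count by its frame size would turn the counts into sampling rates.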

Requirements

  • Python 3.11.5+
  • Required packages (tested versions are listed; most other versions should also work):
    • numpy (1.24.3)
    • scipy (1.11.1)

Installation

Once the release .whl file is downloaded and stored in <wheel location>, running pip install <wheel location> should install the package.

Issues

This code is from current Census Bureau research and is still being tested and refined. We appreciate any feedback you would like to provide us; please post any questions that you may have in the GitHub issues section.

Citation Information

Please cite this package in any work where it proves useful.

@software{Eggleston_Optimal_Sampling_2025,
author = {Eggleston, Jonathan},
title = {{Optimal Stratified Sampling with Python}},
url = {https://github.com/uscensusbureau/optimal_stratified_sampling/},
version = {1.0.0},
year = {2025}
}

Disclaimers

U.S. Census Bureau code is provided on an ‘as is’ basis and the user assumes responsibility for its use. The Census Bureau has relinquished control of the information and no longer has responsibility to protect the integrity, confidentiality, or availability of the information. Any claims against the Census Bureau stemming from the use of its GitHub project will be governed by all applicable Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation, or favoring by the Census Bureau. The Census Bureau seal and logo shall not be used in any manner to imply endorsement of any commercial product or activity by the Census Bureau or the United States Government.

Any opinions and conclusions expressed herein are those of the author and do not represent the views of the U.S. Census Bureau.

References

Mendelson, Jonathan, and Michael R. Elliott. 2024. "Optimal Allocation Under Anticipated Nonresponse." Journal of Survey Statistics and Methodology 12 (5): 1405–29.

Valliant, Richard, Jill A. Dever, and Frauke Kreuter. 2018. Practical Tools for Designing and Weighting Survey Samples. 2nd ed. New York: Springer.
