| Title | Plan of work |
|---|---|
| Intern Name | Andrew Mckenna-Foster |
| Sponsor Name | Washington State Library (Kathleen Sullivan) |
| Date | June 2019 |

The Washington State open data portal currently hosts 492 datasets (as of 10 June 2019), dating back to 2012. As part of its work to encourage open data best practices and use, the Washington State Library would like to investigate the feasibility of curating these datasets for the state and to assess how the portal can be improved. To inform this work, University of Washington Open Data Literacy (ODL) intern Andrew Mckenna-Foster will assess datasets for data and metadata quality, interview users of the data portal, and identify factors that may account for poor circulation or high use. The key stakeholders are the Washington State Library and the State of Washington data managers.
17 June - 16 August 2019
What is this project supposed to achieve, and why?
- Provide a literature review based on a bibliography of relevant and recent references.
- Hold one-on-one meetings with State Library staff to learn about library operations
- Identify datasets with poor data quality and poor metadata quality
- Understand how the portal is perceived and used by users (depositors)
- Identify factors behind poor circulation or high circulation of datasets
- Produce a final report with recommendations
Any editing, remediation, publishing, or curation of datasets.
- A site visit to one of the libraries for which the State Library is providing open data consulting services. This may involve too much travel but could be useful for informing recommendations for the state data portal.
- Investigate how to manage the transition from active datasets to archived datasets.
- Create a Jupyter Notebook (or similar) to visualize Socrata asset summary statistics
What will this project produce? This should include items like reports, best practices, software, data, metadata schemas, models or figures, and documentation. See the two types of deliverables below:
- Literature review
- Assessment of data and metadata quality
- Draft survey/interview questions for depositors and users
- Results from the survey/interviews
- List of factors behind the poor circulation or high circulation of datasets
- Two blog posts
- Report presenting information to help future decision making around the WA state data portal
- Public presentation on 16 August 2019
Describe how you will manage deliverables during the project and the plan for handing over and sustaining deliverables over time. We require that all internships create open documentation and update their plans for sustainability regularly.
I will be in regular contact with Kathleen Sullivan and ODL. I will store all files created for the project on GitHub. The GitHub repository for this project will be available for future use and will be findable through ODL.
Create a general timeline for completing each of the deliverables that you listed above. After you have settled on a timeline with feedback from your mentor, you should enter these as Milestones in GitHub's Issues tracker. Each task that you perform or plan to perform can then be filed as an issue that is attached to a specific milestone.
- Literature Review 21 June
- Identify variables for data and metadata quality assessment 28 June
- Assess at least 25% of datasets 26 July
- Review data published in other formats (maps, images
- Summarize assessment data 9 August
- Draft interview questions 28 June
- Contact interviewees 1 July
- Complete interviews 26 July
- Analyze and summarize interview data 2 August
- Propose factors behind circulation rates based on literature review 26 June
- Assess datasets for those factors 12 July
- Analyze and summarize assessment data 19 July
- Write report 16 August
- Create presentation 16 August
Document date and change of work plan here. July 15: added a bullet to 'Time Allowing'.
- Update the ODL GitHub repository at least weekly and more as needed, so the internship’s documentation is current and thorough.
- Notify Kathleen Sullivan and others as needed about updated reports.
- Meet with Kathleen either in person or through Zoom once a week or as needed.
- Respond to Kathleen Sullivan or ODL team within 24 business hours.
- Use Zoom, email, and Slack for communication with Kathleen Sullivan and the ODL team.
- Include any travel, online meetings, and/or site visits planned...
