You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
# Tutorial: Working with Shapefiles and NetCDF in Python
2
+
# ========
3
+
# This tutorial introduces key geospatial analysis workflows using Python:
4
+
# - Reading shapefiles and converting to `shapely` geometries
5
+
# - Testing if coordinates fall inside polygons
6
+
# - Visualising spatial data and geographic boundaries
7
+
# - Reading and analysing gridded NetCDF rainfall datasets
8
+
# - Creating spatial masks and integrating values over regions
9
+
#
10
+
# We’ll focus on the Murray-Darling Basin (MDB) as a case study.
11
+
12
+
# + [markdown]
13
+
# ### Import Required Libraries
14
+
#
15
+
# We begin by importing essential Python libraries:
16
+
#
17
+
# - `numpy` for numerical operations
18
+
# - `netCDF4` for handling NetCDF files
19
+
# - `matplotlib.pyplot` for plotting
20
+
# - `pyshp` (via `shapefile.Reader`) for shapefiles
21
+
# - `shapely.geometry` to convert and operate on vector shapes
22
+
# + tags=["empty-cell"]
23
+
# Import numpy, netCDF4, matplotlib.pyplot, shapefile Reader, and shapely.geometry Point and shape
24
+
# -
25
+
# + tags=["solution"]
26
+
importnumpyasnp
27
+
importnetCDF4asnc
28
+
importmatplotlib.pyplotasplt
29
+
fromshapefileimportReader
30
+
fromshapely.geometryimportPoint, shape
31
+
# -
32
+
# + [markdown]
33
+
# ### Load and Convert Shapefiles to Geometry
34
+
# For this tutorial, we will use the MDB boundaries shapefile. Download the shapefiles from <a href="https://data.gadopt.org/water-course/MDB_boundaries.zip" download>here</a> and unzip it in the current directory. Use `Reader()` to load two shapefiles for the MDB: one for the **north** and one for the **south**.
35
+
# Convert each to a `shapely` geometry object so we can do spatial queries.
36
+
#
37
+
# This enables operations like containment tests (`Point.within(polygon)`, i.e., if a point is inside a polygon).
38
+
#
39
+
# + tags=["empty-cell"]
40
+
# Load north and south MDB shapefiles and convert each to a shapely shape
41
+
# -
42
+
# + tags=["solution"]
43
+
# Load the north and south MDB shapefiles and convert each to a shapely shape
# Check if the point (151°E, 29°S) lies in either basin.
54
+
# `Point().within(polygon)` returns `True` if the point lies *inside* the shape.
55
+
# -
56
+
# + tags=["empty-cell"]
57
+
# Use Point.within() to test if (151, -29) is in the north or south MDB
58
+
# -
59
+
# + tags=["solution"]
60
+
print(Point([151, -29]).within(NMDB_shape))
61
+
print(Point([151, -29]).within(SMDB_shape))
62
+
# -
63
+
# + [markdown]
64
+
# ### Plot Basin Boundaries and Test Coordinate
65
+
# Extract the coordinate arrays from the shapefiles and plot them.
66
+
# Overlay the test point to confirm visually.
67
+
68
+
# + tags=["empty-cell"]
69
+
# Extract shapefile coordinates, plot north and south boundaries, and add test point
70
+
# -
71
+
# + tags=["solution"]
72
+
# Extract the coordinate arrays from the shapefiles and plot them.
73
+
NMDB_coords=np.array(NMDB.shape().points)
74
+
SMDB_coords=np.array(SMDB.shape().points)
75
+
# Plot the north and south boundaries
76
+
plt.figure(figsize=(5, 5))
77
+
plt.plot(SMDB_coords[:, 0], SMDB_coords[:, 1])
78
+
plt.plot(NMDB_coords[:, 0], NMDB_coords[:, 1])
79
+
plt.scatter(151, -29, c='red')
80
+
plt.title("North and South MDB Boundaries with Test Point")
81
+
plt.show()
82
+
# -
83
+
# + [markdown]
84
+
# ### Load and Explore Rainfall Data
85
+
# Download the `rain_day_2025.nc`, which is the amount of rain on 27th of July 2025 as downloaded from BoM, from <a href="https://data.gadopt.org/water-course/rain_day_2025.nc" download>here</a>. This file is in NetCDF format, which is a common format for gridded data. If you want to learn more about NetCDF, you can read the <a href="https://pro.arcgis.com/en/pro-app/latest/help/data/multidimensional/what-is-netcdf-data.htm" target="_blank">NetCDF Quick Start Guide</a>.
86
+
# Open the `rain_day_2025.nc` NetCDF file and extract:
87
+
# - Rainfall values
88
+
# - Latitude/longitude grids
89
+
# - Time axis (convert to years)
90
+
#
91
+
# NetCDF is ideal for gridded data like daily rainfall.
92
+
# -
93
+
# + tags=["empty-cell"]
94
+
# Load NetCDF file and extract rain_day, lats, lons, and converted time
95
+
# -
96
+
# + tags=["solution"]
97
+
# Load the NetCDF file and extract the rainfall data, latitude, longitude, and time
98
+
data=nc.Dataset("rain_day_2025.nc")
99
+
# Extract the rainfall data, latitude, longitude, and time
100
+
rain_day=data["rain_day"][:]
101
+
# Extract the latitude and longitude data
102
+
lats=data["latitude"][:]
103
+
lons=data["longitude"][:]
104
+
time=data["time"][:] /365.25+1900# convert from days since 1900
105
+
# -
106
+
# + [markdown]
107
+
# ### Create Spatial Masks for Each Basin
108
+
# We want to isolate grid cells that fall within the MDB.
109
+
# Loop through the lat/lon grid and use `Point.within()` to assign `True` to cells inside the north or south basin.
110
+
#
111
+
# + tags=["empty-cell"]
112
+
# Build boolean masks for NMDB and SMDB based on which grid points fall inside each
113
+
# -
114
+
# + tags=["solution"]
115
+
# Build boolean masks for NMDB and SMDB based on which grid points fall inside each
116
+
NMDB_mask=np.zeros((len(lats), len(lons)))
117
+
SMDB_mask=np.zeros((len(lats), len(lons)))
118
+
# Loop through the lat/lon grid and use `Point.within()` to assign `True` to cells inside the north or south basin.
119
+
forilatinrange(len(lats)):
120
+
foriloninrange(len(lons)):
121
+
pt=Point([lons[ilon], lats[ilat]])
122
+
ifpt.within(NMDB_shape):
123
+
NMDB_mask[ilat, ilon] =True
124
+
elifpt.within(SMDB_shape):
125
+
SMDB_mask[ilat, ilon] =True
126
+
# -
127
+
# + [markdown]
128
+
# ### Plot the Mask to Verify Coverage
129
+
# Combine the north and south masks, and plot to confirm the shape of the coverage.
130
+
#
131
+
# + tags=["empty-cell"]
132
+
# Combine north/south masks and plot the result
133
+
# -
134
+
# + tags=["solution"]
135
+
# Combine the north and south masks, and plot to confirm the shape of the coverage.
136
+
MDB_mask=NMDB_mask+SMDB_mask
137
+
plt.figure(figsize=(5, 5))
138
+
plt.imshow(MDB_mask)
139
+
plt.title("Spatial Mask for North + South MDB")
140
+
plt.show()
141
+
# -
142
+
# + [markdown]
143
+
# ### Visualise Rainfall for a Single Day
144
+
# Mask the rainfall data at one time step (e.g. day 0) using the MDB mask.
145
+
# This reveals only the rainfall within the basin.
146
+
# -
147
+
# + tags=["empty-cell"]
148
+
# Apply spatial mask to rain_day[0] and plot the result
149
+
# -
150
+
# + tags=["solution"]
151
+
plt.figure(figsize=(5, 5))
152
+
plt.imshow(rain_day[0] *MDB_mask)
153
+
plt.title("Masked Rainfall at Time = 0")
154
+
plt.show()
155
+
# -
156
+
# + [markdown]
157
+
# ### Integrate Rainfall Over Time
158
+
# Calculate the spatially averaged rainfall over the MDB at each time step.
159
+
# This gives a time series of total rainfall.
160
+
# + tags=["empty-cell"]
161
+
# Loop over time to compute daily average rainfall over MDB
0 commit comments