Skip to content

Commit 335b5a9

Browse files
committed
Adding basics
1 parent 7dbe1f8 commit 335b5a9

1 file changed

Lines changed: 190 additions & 0 deletions

File tree

basics_01.py

Lines changed: 190 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,190 @@
1+
# Tutorial: Working with Shapefiles and NetCDF in Python
2+
# ========
3+
# This tutorial introduces key geospatial analysis workflows using Python:
4+
# - Reading shapefiles and converting to `shapely` geometries
5+
# - Testing if coordinates fall inside polygons
6+
# - Visualising spatial data and geographic boundaries
7+
# - Reading and analysing gridded NetCDF rainfall datasets
8+
# - Creating spatial masks and integrating values over regions
9+
#
10+
# We’ll focus on the Murray-Darling Basin (MDB) as a case study.
11+
12+
# + [markdown]
13+
# ### Import Required Libraries
14+
#
15+
# We begin by importing essential Python libraries:
16+
#
17+
# - `numpy` for numerical operations
18+
# - `netCDF4` for handling NetCDF files
19+
# - `matplotlib.pyplot` for plotting
20+
# - `pyshp` (via `shapefile.Reader`) for shapefiles
21+
# - `shapely.geometry` to convert and operate on vector shapes
22+
# + tags=["empty-cell"]
23+
# Import numpy, netCDF4, matplotlib.pyplot, shapefile Reader, and shapely.geometry Point and shape
24+
# -
25+
# + tags=["solution"]
26+
import numpy as np
27+
import netCDF4 as nc
28+
import matplotlib.pyplot as plt
29+
from shapefile import Reader
30+
from shapely.geometry import Point, shape
31+
# -
32+
# + [markdown]
33+
# ### Load and Convert Shapefiles to Geometry
34+
# For this tutorial, we will use the MDB boundaries shapefile. Download the shapefiles from <a href="https://data.gadopt.org/water-course/MDB_boundaries.zip" download>here</a> and unzip it in the current directory. Use `Reader()` to load two shapefiles for the MDB: one for the **north** and one for the **south**.
35+
# Convert each to a `shapely` geometry object so we can do spatial queries.
36+
#
37+
# This enables operations like containment tests (`Point.within(polygon)`, i.e., if a point is inside a polygon).
38+
#
39+
# + tags=["empty-cell"]
40+
# Load north and south MDB shapefiles and convert each to a shapely shape
41+
# -
42+
# + tags=["solution"]
43+
# Load the north and south MDB shapefiles and convert each to a shapely shape
44+
NMDB = Reader("MDB_boundaries/MDB_north_boundary.shp")
45+
SMDB = Reader("MDB_boundaries/MDB_south_boundary.shp")
46+
# Convert each to a shapely shape
47+
NMDB_shape = shape(NMDB.shape())
48+
SMDB_shape = shape(SMDB.shape())
49+
# -
50+
# + [markdown]
51+
# ### Check if a Point Lies Within the Basin
52+
#
53+
# Check if the point (151°E, 29°S) lies in either basin.
54+
# `Point().within(polygon)` returns `True` if the point lies *inside* the shape.
55+
# -
56+
# + tags=["empty-cell"]
57+
# Use Point.within() to test if (151, -29) is in the north or south MDB
58+
# -
59+
# + tags=["solution"]
60+
print(Point([151, -29]).within(NMDB_shape))
61+
print(Point([151, -29]).within(SMDB_shape))
62+
# -
63+
# + [markdown]
64+
# ### Plot Basin Boundaries and Test Coordinate
65+
# Extract the coordinate arrays from the shapefiles and plot them.
66+
# Overlay the test point to confirm visually.
67+
68+
# + tags=["empty-cell"]
69+
# Extract shapefile coordinates, plot north and south boundaries, and add test point
70+
# -
71+
# + tags=["solution"]
72+
# Extract the coordinate arrays from the shapefiles and plot them.
73+
NMDB_coords = np.array(NMDB.shape().points)
74+
SMDB_coords = np.array(SMDB.shape().points)
75+
# Plot the north and south boundaries
76+
plt.figure(figsize=(5, 5))
77+
plt.plot(SMDB_coords[:, 0], SMDB_coords[:, 1])
78+
plt.plot(NMDB_coords[:, 0], NMDB_coords[:, 1])
79+
plt.scatter(151, -29, c='red')
80+
plt.title("North and South MDB Boundaries with Test Point")
81+
plt.show()
82+
# -
83+
# + [markdown]
84+
# ### Load and Explore Rainfall Data
85+
# Download the `rain_day_2025.nc`, which is the amount of rain on 27th of July 2025 as downloaded from BoM, from <a href="https://data.gadopt.org/water-course/rain_day_2025.nc" download>here</a>. This file is in NetCDF format, which is a common format for gridded data. If you want to learn more about NetCDF, you can read the <a href="https://pro.arcgis.com/en/pro-app/latest/help/data/multidimensional/what-is-netcdf-data.htm" target="_blank">NetCDF Quick Start Guide</a>.
86+
# Open the `rain_day_2025.nc` NetCDF file and extract:
87+
# - Rainfall values
88+
# - Latitude/longitude grids
89+
# - Time axis (convert to years)
90+
#
91+
# NetCDF is ideal for gridded data like daily rainfall.
92+
# -
93+
# + tags=["empty-cell"]
94+
# Load NetCDF file and extract rain_day, lats, lons, and converted time
95+
# -
96+
# + tags=["solution"]
97+
# Load the NetCDF file and extract the rainfall data, latitude, longitude, and time
98+
data = nc.Dataset("rain_day_2025.nc")
99+
# Extract the rainfall data, latitude, longitude, and time
100+
rain_day = data["rain_day"][:]
101+
# Extract the latitude and longitude data
102+
lats = data["latitude"][:]
103+
lons = data["longitude"][:]
104+
time = data["time"][:] / 365.25 + 1900 # convert from days since 1900
105+
# -
106+
# + [markdown]
107+
# ### Create Spatial Masks for Each Basin
108+
# We want to isolate grid cells that fall within the MDB.
109+
# Loop through the lat/lon grid and use `Point.within()` to assign `True` to cells inside the north or south basin.
110+
#
111+
# + tags=["empty-cell"]
112+
# Build boolean masks for NMDB and SMDB based on which grid points fall inside each
113+
# -
114+
# + tags=["solution"]
115+
# Build boolean masks for NMDB and SMDB based on which grid points fall inside each
116+
NMDB_mask = np.zeros((len(lats), len(lons)))
117+
SMDB_mask = np.zeros((len(lats), len(lons)))
118+
# Loop through the lat/lon grid and use `Point.within()` to assign `True` to cells inside the north or south basin.
119+
for ilat in range(len(lats)):
120+
for ilon in range(len(lons)):
121+
pt = Point([lons[ilon], lats[ilat]])
122+
if pt.within(NMDB_shape):
123+
NMDB_mask[ilat, ilon] = True
124+
elif pt.within(SMDB_shape):
125+
SMDB_mask[ilat, ilon] = True
126+
# -
127+
# + [markdown]
128+
# ### Plot the Mask to Verify Coverage
129+
# Combine the north and south masks, and plot to confirm the shape of the coverage.
130+
#
131+
# + tags=["empty-cell"]
132+
# Combine north/south masks and plot the result
133+
# -
134+
# + tags=["solution"]
135+
# Combine the north and south masks, and plot to confirm the shape of the coverage.
136+
MDB_mask = NMDB_mask + SMDB_mask
137+
plt.figure(figsize=(5, 5))
138+
plt.imshow(MDB_mask)
139+
plt.title("Spatial Mask for North + South MDB")
140+
plt.show()
141+
# -
142+
# + [markdown]
143+
# ### Visualise Rainfall for a Single Day
144+
# Mask the rainfall data at one time step (e.g. day 0) using the MDB mask.
145+
# This reveals only the rainfall within the basin.
146+
# -
147+
# + tags=["empty-cell"]
148+
# Apply spatial mask to rain_day[0] and plot the result
149+
# -
150+
# + tags=["solution"]
151+
plt.figure(figsize=(5, 5))
152+
plt.imshow(rain_day[0] * MDB_mask)
153+
plt.title("Masked Rainfall at Time = 0")
154+
plt.show()
155+
# -
156+
# + [markdown]
157+
# ### Integrate Rainfall Over Time
158+
# Calculate the spatially averaged rainfall over the MDB at each time step.
159+
# This gives a time series of total rainfall.
160+
# + tags=["empty-cell"]
161+
# Loop over time to compute daily average rainfall over MDB
162+
# -
163+
# + tags=["solution"]
164+
MDB_total_rain = np.zeros(len(time))
165+
for i in range(len(time)):
166+
MDB_total_rain[i] = np.sum(MDB_mask * rain_day[i]) / np.sum(MDB_mask)
167+
# -
168+
# + [markdown]
169+
# ### Plot Time Series of Integrated Rainfall
170+
# Finally, plot the time series of total MDB rainfall to observe temporal patterns.
171+
# -
172+
# + tags=["empty-cell"]
173+
# Plot rainfall time series vs time
174+
# -
175+
# + tags=["solution"]
176+
plt.figure(figsize=(6, 2))
177+
# Plot the time series of total MDB rainfall
178+
plt.plot(time, MDB_total_rain, linewidth=1)
179+
# Set the x and y labels
180+
plt.xlabel("Year")
181+
plt.ylabel("Rainfall (mm)")
182+
plt.title("Total Rainfall in MDB over Time")
183+
plt.show()
184+
# -
185+
# + [markdown]
186+
# ### ✅ Extensions
187+
# To go further, try:
188+
# - Integrating rainfall separately for NMDB and SMDB
189+
# - Comparing trends or seasonal cycles between them
190+
# - Exporting the rainfall time series to CSV for reporting

0 commit comments

Comments
 (0)