diff --git a/docs/design/supply/bike-logsums.md b/docs/design/supply/bike-logsums.md
index c2d63842b..a948287da 100644
--- a/docs/design/supply/bike-logsums.md
+++ b/docs/design/supply/bike-logsums.md
@@ -11,23 +11,25 @@ This process generates the output\bikeMgraLogsum.csv and output\bikeTazLogsum.cs
-**Inputs:** The AT network is read from input\SANDAG_Bike_Net.dbf and input\SANDAG_Bike_Node.dbf. Coefficients and other settings are read from sandag_abm.properties.
+**Inputs:** The AT network is read from input\SANDAG_Bike_Net.dbf and input\SANDAG_Bike_Node.dbf. The bike model settings and utility files are found in src\asim\scripts\bike_route_choice.
**Network Construction:** Node and edge attributes are processed from the network files. Traversals are derived for edge pairs, using angle to determine turn type.
**Path Sampling:**
- Doubly stochastic, coefficients are randomized and resulting edge cost is randomized
-- Dijkstra's algorithm finds path each origin to destination with minimum cost
+- Dijkstra's algorithm finds the minimum-cost path from each origin to each destination, using combined edge and traversal costs
- Add paths to path alternatives list and calculate path sizes from overlap of alternatives
-- Repeat until minimum number of alternatives and minimum path size is met for each origin/destination pair
-
-**Path Resampling:** Resample paths from alternatives list until minimum path size is met.
+- Repeat for a preset number of iterations (default 10)
**Path Choice Utility Calculation:** Calculate bike logsum values for each origin/destination pair from utility expression on each path alternative.
-**Outputs:** Bike logsums and times are written to output\bikeMgraLogsum.csv and output\bikeTazLogsum.csv. During ActivitySim preprocessing, TAZ values are added to BIKE_LOGSUM and BIKE_TIME matrices of output\skims\traffic_skims_AM.omx and MGRA values are written to output\skims\maz_maz_bike.csv.
+**Outputs:** Bike logsums and times are written to output\bikeMgraLogsum.csv and output\bikeTazLogsum.csv. Log and trace files are written to output\bike. During ActivitySim preprocessing, TAZ values are added to BIKE_LOGSUM and BIKE_TIME matrices of output\skims\traffic_skims_AM.omx and MGRA values are written to output\skims\maz_maz_bike.csv.
## Further reading
-[Active Transportation Improvements Report (2015)](https://github.com/SANDAG/ABM/wiki/files/at.pdf)
\ No newline at end of file
+[Bike Route Choice README](../../../src/asim/scripts/bike_route_choice/README.md)
+
+[ABM3 Bike Model Report (2025)](../../pdf_reports/SANDAG_ABM3_Bike_Model_Report.pdf)
+
+[Active Transportation Improvements Report (2015)](https://github.com/SANDAG/ABM/wiki/files/at.pdf) (Prior bike logsum implementation in Java)
\ No newline at end of file
diff --git a/docs/images/design/bike_logsum_design.png b/docs/images/design/bike_logsum_design.png
index ffcfc895e..48a3c4454 100644
Binary files a/docs/images/design/bike_logsum_design.png and b/docs/images/design/bike_logsum_design.png differ
diff --git a/docs/pdf_reports/SANDAG_ABM3_Bike_Model_Report.pdf b/docs/pdf_reports/SANDAG_ABM3_Bike_Model_Report.pdf
new file mode 100644
index 000000000..d42d37b54
Binary files /dev/null and b/docs/pdf_reports/SANDAG_ABM3_Bike_Model_Report.pdf differ
diff --git a/src/asim/scripts/bike_route_choice/README.md b/src/asim/scripts/bike_route_choice/README.md
new file mode 100644
index 000000000..4e70c0262
--- /dev/null
+++ b/src/asim/scripts/bike_route_choice/README.md
@@ -0,0 +1,136 @@
+# Active Transport Bike Model
+
+This directory contains a Python-based implementation of a bike route
+choice model first developed in Java. The bike route choice model is a
+multinomial logit discrete choice model that estimates the probability of an
+individual choosing each of multiple alternative routes between a given
+origin and destination. This discrete choice model forms the basis of both
+the estimation of the level of service between OD pairs used by the demand
+models and the estimation of the number of cyclists assigned to network links.
+The level of service is the expected maximum utility, or logsum, from the model,
+and network link assignments are made with individual route probabilities.
+
+## Setup and Running
+The uv environment used to run ActivitySim (asim_140) is also used to run the bike model.
+The bike model is launched from the `bike_route_choice.py` file and accepts a
+single (optional) command-line argument specifying the settings YAML file to
+read. The default value for this argument is `bike_route_choice_settings.yaml`.
+
+### Settings
+In the specified settings YAML file, several options are available for configuration:
+
+**Network**
+- `data_dir`: path to the directory in which the model inputs are stored
+- `node_file`: name of the input shapefile containing bike network nodes
+- `link_file`: name of the input shapefile containing bike network links
+- `zone_level`: zone level on which path choice should be performed. Allowed values: `taz`, `mgra`
+- `zone_subset`: subset of zones to use for model testing
+
+**Utilities**
+- `edge_util_file`: name of the file containing the utility function variable specification for link costs
+- `traversal_util_file`: name of the file containing the utility function variable specification for traversal costs
+- `random_scale_coef`: scaling coefficient to use for alternative utilities
+- `random_scale_link`: scaling coefficient to use for randomized edge costs
+
+**Model Hyperparameters**
+- `number_of_iterations`: maximum number of path sampling iterations to perform before terminating
+- `number_of_batches`: number of batches into which origin centroids should be divided for sequential processing
+- `number_of_processors`: number of processors to use for processing each batch
+- `max_dijkstra_utility`: cutoff threshold utility (positive) for early termination of the shortest-paths search
+- `min_iterations`: the minimum number of paths found - zone pairs with fewer paths will be discarded
+
+**Output and Caching**
+- `output_path`: path to the directory in which model outputs should be written
+- `output_file_path`: path to the final bike logsum file (bikeMgraLogsum.csv, bikeTazLogsum.csv)
+- `save_bike_net`: whether to write the derived network to a set of CSVs
+- `read_cached_bike_net`: whether to read the derived network from a cache instead of re-deriving from the original network
+- `bike_speed`: the speed in mph to use for calculating bike travel times in output
+
+**Tracing**
+- `trace_bike_utilities`: whether to output the chosen path sets from a specified list of origin-destination pairs
+- `trace_origins`: ordered list of origins whose paths will be output in tracing
+- `trace_destinations`: ordered list of destinations corresponding to the origins for tracing
+- `generate_shapefile`: whether to output the trace paths as a shapefile in addition to CSV
+- `crs`: network coordinate reference system to attach to the output shapefiles
+
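As a concrete illustration, a minimal settings file combining the options above might look like the following sketch (all file names and values here are illustrative, not shipped defaults):

```yaml
# hypothetical bike_route_choice settings sketch; values are illustrative only
data_dir: input
node_file: SANDAG_Bike_Node.dbf
link_file: SANDAG_Bike_Net.dbf
zone_level: taz

edge_util_file: bike_edge_utils.csv
traversal_util_file: bike_traversal_utils.csv

number_of_iterations: 10
number_of_batches: 4
number_of_processors: 8

output_path: output/bike
save_bike_net: true
read_cached_bike_net: false
bike_speed: 12
```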
+## Utility Calculation
+The variable specifications used in the utility function for edges and traversals
+are stored in `bike_edge_utils.csv` and `bike_traversal_utils.csv`, respectively,
+including the expressions and their associated coefficients. The utility accounts
+for the distance on different types of facilities, the gain in elevation, turns,
+signal delay, and navigating un-signalized intersections with high-volume facilities.
+To account for the correlation in the random utilities of overlapping routes, the
+utility function in the model includes a “path size” measure, described in the
+following section.
+
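Conceptually, each row of a utility file contributes its evaluated expression times its coefficient, summed over rows. A minimal sketch of that evaluation, assuming `@`-prefixed pandas expressions as in `bike_edge_utils.csv` (the `evaluate_utilities` helper below is illustrative, not the model's actual evaluator):

```python
import pandas as pd

def evaluate_utilities(df: pd.DataFrame, spec: pd.DataFrame) -> pd.Series:
    """Sum coefficient-weighted expression values for each row of df."""
    total = pd.Series(0.0, index=df.index)
    for _, row in spec.iterrows():
        expr = row["Expression"]
        # '@'-prefixed expressions are evaluated as Python against df;
        # bare expressions name a single column
        values = eval(expr[1:]) if expr.startswith("@") else df[expr]
        total += row["Coefficient"] * values
    return total

# toy edge table and two rows adapted from the edge utility spec
edges = pd.DataFrame({"distance": [0.5, 1.0], "gain": [0.0, 200.0]})
spec = pd.DataFrame({
    "Expression": [
        "distance",
        "@df.distance * (df.gain / (df.distance * 5280) > 0.035).astype(int)",
    ],
    "Coefficient": [-10.93, -18.19],
})
utils = evaluate_utilities(edges, spec)
```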
+## Path Size
+The size of path alternative $n_i$ in alternative set $\textbf{n}$ is calculated using the
+formula:
+```math
+size(n_i)=\sum_{l\in n_i}\frac{edgeLength(l)}{pathLength(n_i)\cdot pathsUsing(\textbf{n},l)}
+```
+where $pathsUsing$ is the integer number of paths in the alternative set $\textbf{n}$
+which contain link $l$. Its use derives from the theory of aggregate and elemental
+alternatives, where a link is an aggregation of all paths that use the link. If multiple
+routes overlap, their "size" is less than one. If two routes overlap completely, their
+size will be one-half.
+
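As a minimal sketch, the formula can be computed directly from link lengths and the count of alternatives using each link (paths are represented here as lists of hypothetical link IDs, not the model's data structures):

```python
from collections import Counter

def path_size(paths: list[list[int]], lengths: dict[int, float]) -> list[float]:
    """Path size of each alternative: sum over its links of
    link_length / (path_length * number_of_alternatives_using_link)."""
    # count how many alternatives use each link
    using = Counter(link for path in paths for link in set(path))
    sizes = []
    for path in paths:
        total = sum(lengths[l] for l in path)
        sizes.append(sum(lengths[l] / (total * using[l]) for l in path))
    return sizes

lengths = {1: 1.0, 2: 2.0}
# two completely overlapping routes -> each has size one-half
overlapping = path_size([[1, 2], [1, 2]], lengths)
# a unique route -> size one
unique = path_size([[1, 2]], lengths)
```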
+## Path Sampling
+Given an origin, paths are sampled to all relevant destinations in the network by repeatedly
+applying Dijkstra’s algorithm to search for the paths that minimize a stochastic link
+generalized cost with a mean given by the additive inverse of the path utility. For each path
+sampling iteration, a random coefficient vector is first sampled from a non-negative
+multivariate uniform distribution with zero covariance and mean equal to the link
+generalized cost coefficients corresponding to the path choice utility function. As the path
+search extends outward from the origin, the random cost coefficients do not vary over links,
+but only over iterations of path sampling. In the path search, the cost of each subsequent
+link is calculated by summing the product of the random coefficients with their respective
+link attributes, and then multiplying the result by a non-negative discrete random edge cost
+multiplier.
+
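The sampling loop described above can be sketched as follows. This is an illustrative toy, not the model's implementation: the graph structures and names are hypothetical, and the uniform coefficient range and discrete multiplier values stand in for the configured ones:

```python
import heapq
import random

def sample_paths(adj, attrs, mean_coefs, origin, iterations=10, seed=0):
    """Doubly stochastic path sampling sketch: per iteration, draw one
    coefficient vector from a non-negative uniform distribution with mean
    mean_coefs, randomize each edge cost with a discrete multiplier, then
    run Dijkstra from the origin."""
    rng = random.Random(seed)
    alternatives = {}  # destination -> set of paths (tuples of nodes after origin)
    for _ in range(iterations):
        # coefficients are constant within an iteration, varying only across iterations
        coefs = [rng.uniform(0.0, 2.0 * c) for c in mean_coefs]
        # edge cost = coefficients . attributes, times a discrete random multiplier
        costs = {e: sum(c * a for c, a in zip(coefs, attrs[e])) * rng.choice([1, 2])
                 for e in attrs}
        dist, prev = {origin: 0.0}, {}
        heap = [(0.0, origin)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue
            for v in adj.get(u, ()):
                nd = d + costs[(u, v)]
                if nd < dist.get(v, float("inf")):
                    dist[v], prev[v] = nd, u
                    heapq.heappush(heap, (nd, v))
        # reconstruct and store the sampled path to each reached destination
        for dest in dist:
            if dest == origin:
                continue
            path, node = [], dest
            while node != origin:
                path.append(node)
                node = prev[node]
            alternatives.setdefault(dest, set()).add(tuple(reversed(path)))
    return alternatives
```

Repeated iterations with different random draws populate the alternative set from which path sizes and logsums are computed.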
+## Threshold Utility and Distance
+To improve runtime, the Dijkstra's algorithm implementation allows for early termination once a
+predetermined utility threshold has been reached in its search; no paths are found for zone pairs
+whose utility lies beyond the threshold. Path utility, however, is not the most user-friendly
+metric for a cutoff, and it is reasonable to want to define the cutoff by distance instead. Because
+the underlying Dijkstra implementation is only aware of utilities, not distances, and the two do
+not correspond directly, a distance cutoff cannot be applied within the search itself.
+
+To address this, the `bike_threshold_calculator` script has been built to provide a handy mechanism
+to search for a utility threshold which approximates a desired distance cutoff. Given a valid
+configuration YAML file (as described above), the script iteratively performs a binary search,
+using the YAML file's `max_dijkstra_utility` setting as a starting threshold utility and modifying
+its value on subsequent iterations until the target distance is within the allowed margin of error
+(or the maximum number of bike model iterations has been completed). The calling signature of the
+script is shown below and requires a minimum of two command-line arguments: the settings filepath
+and the target distance. The remaining parameters are optional, and each optional argument
+requires those listed before it in sequence:
+
+~~~
+Usage:
+ python bike_threshold_calculator.py <settings filepath> <target distance> [target_margin [percentile [max_iterations]]]
+
+ parameters:
+ settings filepath: path to YAML file containing bike model settings
+ target distance: the distance for which the search should aim (in miles)
+ target margin: the margin of error (< 1) allowed before termination (optional, default: 0.1)
+ percentile: the percentile of distance to compare against the target (optional, default: 0.99)
+ max iterations: the most bike model iterations that can be performed in the search (optional, default: 20)
+
+ examples:
+
+ python bike_threshold_calculator.py bike_route_choice_settings_taz.yaml 20
+ # the resulting 99th %ile distance must be w/in 10% of the 20-mile target distance
+ # equivalent to:
+ python bike_threshold_calculator.py bike_route_choice_settings_taz.yaml 20 0.1 0.99 20
+
+ python bike_threshold_calculator.py bike_route_choice_settings_mgra.yaml 3 0.05
+ # the resulting 99th %ile distance must be w/in 5% of the three-mile target distance
+~~~
+
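The search the script performs can be sketched as a simple bisection over the utility threshold. Here `run_bike_model` is a hypothetical stand-in for a full bike model run that returns the chosen percentile of path distance at a given threshold:

```python
def find_threshold(run_bike_model, start, target_distance,
                   target_margin=0.1, max_iterations=20):
    """Bisection sketch: adjust the utility threshold until the observed
    percentile distance is within the margin of the target distance."""
    lo, hi, threshold = 0.0, None, start
    for _ in range(max_iterations):
        observed = run_bike_model(threshold)  # e.g. 99th %ile of path distances
        if abs(observed - target_distance) <= target_margin * target_distance:
            return threshold
        if observed < target_distance:       # threshold too tight: raise it
            lo = threshold
            threshold = threshold * 2 if hi is None else (lo + hi) / 2
        else:                                # threshold too loose: lower it
            hi = threshold
            threshold = (lo + hi) / 2
    return threshold
```

For a monotone relationship between threshold and observed distance, this converges to a threshold whose resulting percentile distance falls within the requested margin.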
+With the default parameters used, the script will search until either 20 iterations have elapsed or
+the 99th percentile of distances found is within 10% of the specified target distance. At termination,
+a CSV named `threshold_results.csv` will be written to the output directory (set in the `output_path`
+setting in the YAML file) with the input utility thresholds, the iteration runtime, and the chosen
+percentile of distance. These are also displayed as a scatter plot, but note that
+the plot is not saved automatically to the output directory.
\ No newline at end of file
diff --git a/src/asim/scripts/bike_route_choice/bike_edge_utils.csv b/src/asim/scripts/bike_route_choice/bike_edge_utils.csv
new file mode 100644
index 000000000..46532d275
--- /dev/null
+++ b/src/asim/scripts/bike_route_choice/bike_edge_utils.csv
@@ -0,0 +1,51 @@
+Label,Description,Expression,Coefficient
+##################################################################################
+# The below coefficients are adapted from Lukawska et al. (2023),,,
+util_distance_general,Edge length (distance) in miles,distance,-10.93
+util_medium_upslope,Edge slope in range [1%-3.5%],"@df.distance * ((df.gain / (df.distance * 5280) >= 0.01) & (df.gain / (df.distance * 5280) <= 0.035)).astype(int)",-9.012
+util_high_upslope,Edge slope > 3.5%,"@df.distance * (df.gain / (df.distance * 5280) > 0.035).astype(int)",-18.19
+# per class penalties - large roadways,,,
+util_lg_protected,Large roadway w/ protected lanes,"@df.distance * ( df.functionalClass.isin([3,4]) & df.cycleTrack ).astype(int)",0.1748
+util_lg_painted,Large roadway w/ painted lanes & no protected lanes,"@df.distance * ( df.functionalClass.isin([3,4]) & (~df.cycleTrack) & (df.bikeClass == 2) ).astype(int)",-3.158
+util_lg_nofacils,Large roadway w/o bike lanes,"@df.distance * ( df.functionalClass.isin([3,4]) & (~df.cycleTrack) & ~(df.bikeClass == 2) ).astype(int)",-2.513
+# per-class penalties - medium roadways,,,
+# util_med_protected,Medium roadway w/ protected lanes,"@df.distance * ( df.functionalClass.isin([5,6]) & df.cycleTrack ).astype(int)",0.000
+util_med_painted,Medium roadway w/ painted lanes & no protected lanes,"@df.distance * ( df.functionalClass.isin([5,6]) & (~df.cycleTrack) & (df.bikeClass == 2) ).astype(int)",-0.5464
+util_med_nofacils,Medium roadway w/o bike lanes,"@df.distance * ( df.functionalClass.isin([5,6]) & (~df.cycleTrack) & ~(df.bikeClass == 2) ).astype(int)",-1.235
+# per class penalties - residential roadways,,,
+util_res_protected,Residential roadway w/ protected lanes,"@df.distance * ( (df.functionalClass == 7) & df.cycleTrack ).astype(int)",-0.9835
+util_res_painted,Residential roadway w/ painted lanes & no protected lanes,"@df.distance * ( (df.functionalClass == 7) & (~df.cycleTrack) & (df.bikeClass == 2) ).astype(int)",0.9288
+util_res_nofacils,Residential roadway w/o bike lanes,"@df.distance * ( (df.functionalClass == 7) & (~df.cycleTrack) & ~(df.bikeClass == 2) ).astype(int)",-1.901
+# per class penalties - non-vehicle routes,,,
+util_cycleway,Cycleway,"@df.distance * ( (~df.functionalClass.isin([3,4,5,6,7,10])) & (df.bikeClass == 2) ).astype(int)",0.4152
+util_sup,Shared-Use Path,"@df.distance * ( (~df.functionalClass.isin([3,4,5,6,7,10])) & (df.bikeClass == 1) ).astype(int)",-1.705
+util_pedzone,Pedestrian Zone,"@df.distance * ( (~df.functionalClass.isin([3,4,5,6,7,10])) & (~df.bikeClass.isin([1,2])) ).astype(int)",-4.021
+# FIXME better way to check wrong way?,,,
+util_wrong_way,Wrong Way - no lanes in direction of travel and no bike lane,"@df.distance * ( (~df.cycleTrack) & (~(df.bikeClass == 2)) & (df.lanes == 0) ).astype(int)",-3.202
+# The above coefficients are adapted from Lukawska et al. (2023),,,
+
+
+##################################################################################
+# The below coefficients are from the ABM2 model and have dubious sourcing (see ABM2 documentation),,,
+# util_class_0_bike_lane_dist,Distance without bike lanes,"@df.distance * np.where((df.bikeClass < 1) | (df.bikeClass > 3), 1, 0)",-0.858
+# util_class_I_bike_lane_dist,Distance on class I bike lanes,"@df.distance * np.where(df.bikeClass == 1, 1, 0)",-0.348
+# util_class_II_bike_lane_dist,Distance on class II bike lanes,"@df.distance * np.where((df.bikeClass == 2) & (~df.cycleTrack), 1, 0)",-0.544
+# util_class_III_bike_lane_dist,Distance on class III bike lanes,"@df.distance * np.where((df.bikeClass == 3) & (~df.bikeBlvd), 1, 0)",-0.858
+# util_art_no_bike_lane,Distance on arterial without bike lanes,"@df.distance * np.where((df.arterial) & (df.bikeClass != 2) & (df.bikeClass != 1), 1, 0)",-1.050
+# util_dist_cycle_track_class_II,Distance on cycle track class II bike lanes,"@df.distance * df.cycleTrack",-0.424
+# util_dist_cycle_track_class_III,Distance on boulevard class III bike routes,"@df.distance * df.bikeBlvd",-0.343
+# FIXME better way to check wrong way distance?,,,
+# util_dist_wrong_way,Distance wrong way -- no lanes in direction of travel and no bike lane means wrong way,"@df.distance * np.where((df.bikeClass != 1) & (df.lanes == 0), 1, 0)",-3.445
+# util_elevation_gain_ft,Cumulative gain in elevation ignoring declines in feet,@df.gain,-0.015
+# util_access_of_highway,"Unallowed access onto interstate, freeway, or expressway","@(df.functionalClass == 1) | (df.functionalClass == 2)",-999.9
+# The above coefficients are from the ABM2 model and have dubious sourcing (see ABM2 documentation),,,
+##################################################################################,,,
+# The below coefficients are from Meister et al. (2024) converted from km to miles,,,
+# util_distance,Total Distance,@df.distance,-0.944
+# util_bike_path,Distance on Bike Path (Class I),"@df.distance * np.where(df.bikeClass == 1, 1, 0)",1.969
+# util_bike_lane,Distance on Bike Lane (Class II),"@df.distance * np.where(df.bikeClass == 2, 1, 0)",1.616
+# util_speed_limit,Distance where speed limit is <= 25 mph,"@df.distance * np.where(df.speedLimit <= 25, 1, 0)",0.081
+# util_low_incline,Distance where slope is > 2% and <= 6%,"@df.distance * np.where((df.gain / (df.distance * 5280) > 0.02) & (df.gain / (df.distance * 5280) <= 0.06), 1, 0)",-1.727
+# util_med_incline,Distance where slope is > 6% and <= 10%,"@df.distance * np.where((df.gain / (df.distance * 5280) > 0.06) & (df.gain / (df.distance * 5280) <= 0.1), 1, 0)",-9.171
+# util_steep_incline,Distance where slope is > 10%,"@df.distance * np.where(df.gain / (df.distance * 5280) > 0.1, 1, 0)",-12.92
+# The above coefficients are from Meister et al. (2024) converted from km to mi,,,
\ No newline at end of file
diff --git a/src/asim/scripts/bike_route_choice/bike_net_reader.py b/src/asim/scripts/bike_route_choice/bike_net_reader.py
new file mode 100644
index 000000000..d911ae791
--- /dev/null
+++ b/src/asim/scripts/bike_route_choice/bike_net_reader.py
@@ -0,0 +1,1006 @@
+import pandas as pd
+import numpy as np
+import os
+import math
+import logging
+
+# import tqdm
+from bike_route_utilities import BikeRouteChoiceSettings, read_file
+
+# turn type encodings
+TURN_NONE = 0
+TURN_LEFT = 1
+TURN_RIGHT = 2
+TURN_REVERSE = 3
+
+BLUE_STEEL = TURN_RIGHT
+LE_TIGRE = TURN_RIGHT
+FERRARI = TURN_RIGHT
+MAGNUM = TURN_LEFT
+
+
+def create_and_attribute_edges(
+ settings: BikeRouteChoiceSettings,
+ node: pd.DataFrame,
+ link: pd.DataFrame,
+ logger: logging.Logger,
+) -> tuple[pd.DataFrame, pd.DataFrame]:
+ """
+ Create and attribute edges from the provided node and link dataframes.
+ This function creates a directional edge dataframe from the links,
+ creates reverse direction edges, combines them, and attributes them.
+ It also manipulates the node dataframe to the output format and calculates
+ edge headings and arterial status.
+
+ Parameters:
+ settings: BikeRouteChoiceSettings - Settings for the bike route choice model
+ node: pd.DataFrame - Node dataframe
+ link: pd.DataFrame - Link dataframe
+
+ Returns:
+ edges: pd.DataFrame - Edge dataframe with derived attributes
+ nodes: pd.DataFrame - Node dataframe with derived attributes
+
+ """
+
+ # create directional edge dataframe from links
+ logger.info("Creating directional edges from links")
+ abEdges = (
+ link.rename(
+ columns={
+ "A": "fromNode",
+ "B": "toNode",
+ "AB_Gain": "gain",
+ "ABBikeClas": "bikeClass",
+ "AB_Lanes": "lanes",
+ "Func_Class": "functionalClass",
+ "Bike2Sep": "cycleTrack",
+ "Bike3Blvd": "bikeBlvd",
+ "SPEED": "speedLimit",
+ }
+ )
+ .copy()
+ .assign(
+ distance=link.Shape_Leng / 5280.0, # convert feet to miles
+ autosPermitted=link.Func_Class.isin(range(1, 8)),
+ centroidConnector=link.Func_Class == 10,
+ )[
+ [
+ "fromNode",
+ "toNode",
+ "bikeClass",
+ "lanes",
+ "functionalClass",
+ "centroidConnector",
+ "autosPermitted",
+ "cycleTrack",
+ "bikeBlvd",
+ "distance",
+ "gain",
+ "speedLimit",
+ "geometry",
+ ]
+ ]
+ )
+
+ # create reverse direction edges
+ baEdges = (
+ link.rename(
+ columns={
+ "B": "fromNode",
+ "A": "toNode",
+ "BA_Gain": "gain",
+ "BABikeClas": "bikeClass",
+ "BA_Lanes": "lanes",
+ "Func_Class": "functionalClass",
+ "Bike2Sep": "cycleTrack",
+ "Bike3Blvd": "bikeBlvd",
+ "SPEED": "speedLimit",
+ }
+ )
+ .copy()
+ .assign(
+ distance=link.Shape_Leng / 5280.0, # convert feet to miles
+ autosPermitted=link.Func_Class.isin(range(1, 8)),
+ centroidConnector=link.Func_Class == 10,
+ geometry=link.geometry.reverse(),
+ )[
+ [
+ "fromNode",
+ "toNode",
+ "bikeClass",
+ "lanes",
+ "functionalClass",
+ "centroidConnector",
+ "autosPermitted",
+ "cycleTrack",
+ "bikeBlvd",
+ "distance",
+ "gain",
+ "speedLimit",
+ "geometry",
+ ]
+ ]
+ )
+
+ # combine bidirectional edges
+ edges = pd.concat([abEdges, baEdges]).set_index(["fromNode", "toNode"])
+
+ # manipulate node dataframe to output format
+ nodes = (
+ node.assign(centroid=(node.MGRA > 0) | (node.TAZ > 0))
+ .rename(
+ columns={
+ "NodeLev_ID": "nodeID",
+ "XCOORD": "x",
+ "YCOORD": "y",
+ "MGRA": "mgra",
+ "TAZ": "taz",
+ "Signal": "signalized",
+ }
+ )
+ .drop(columns=["ZCOORD", "geometry"])
+ .set_index("nodeID")
+ )
+
+ # calculate edge headings
+ # Merge coordinates for fromNode and toNode
+ edges_with_coords = (
+ edges.reset_index()
+ .merge(
+ nodes[["x", "y"]],
+ how="left",
+ left_on="fromNode",
+ right_index=True,
+ suffixes=("", "_from"),
+ )
+ .merge(
+ nodes[["x", "y"]],
+ how="left",
+ left_on="toNode",
+ right_index=True,
+ suffixes=("_from", "_to"),
+ )
+ )
+
+ # Calculate angle in radians
+ edges_with_coords["angle"] = np.arctan2(
+ edges_with_coords["y_to"] - edges_with_coords["y_from"],
+ edges_with_coords["x_to"] - edges_with_coords["x_from"],
+ )
+
+ # Set index and assign angle back to edges
+ edges_with_coords = edges_with_coords.set_index(["fromNode", "toNode"]).reindex(
+ edges.index
+ )
+ assert edges_with_coords.index.equals(
+ edges.index
+ ), "Index mismatch between edges and coordinates"
+ edges["angle"] = edges_with_coords["angle"]
+
+ # attribute (major) arterial status
+ edges["majorArterial"] = (
+ (edges.functionalClass <= 3)
+ & (edges.functionalClass > 0)
+ & (edges.bikeClass != 1)
+ )
+ edges["arterial"] = (
+ (edges.functionalClass <= 4)
+ & (edges.functionalClass > 0)
+ & (edges.bikeClass != 1)
+ )
+
+ # keep track of how many duplicate edges are (major) arterials
+ dupMajArts = edges.groupby(edges.index).majorArterial.sum()
+ dupMajArts.index = pd.MultiIndex.from_tuples(dupMajArts.index)
+ edges["dupMajArts"] = dupMajArts.reindex(edges.index)
+
+ dupArts = edges.groupby(edges.index).arterial.sum()
+ dupArts.index = pd.MultiIndex.from_tuples(dupArts.index)
+ edges["dupArts"] = dupArts.reindex(edges.index)
+
+ # keep track of how many edges emanating from a node cross (major) arterials
+ nodes["majorArtXings"] = (
+ edges.groupby(edges.index.get_level_values(0))
+ .majorArterial.sum()
+ .reindex(nodes.index)
+ )
+ nodes["artXings"] = (
+ edges.groupby(edges.index.get_level_values(0))
+ .arterial.sum()
+ .reindex(nodes.index)
+ )
+ return edges, nodes
+
+
+# def recreate_java_attributes(
+# traversals: pd.DataFrame,
+# ):
+# """
+# Recreate Java attributes for traversals.
+# WARNING: This function contains bugs that we think exist in the Java implementation.
+# This is merely an optional function used for potential backwards compatibility.
+
+# Loop and vectorized outputs should be the same. Loop code closely mimics the Java implementation,
+# while vectorized code is more efficient for python implementation.
+
+# Parameters:
+# traversals: pd.DataFrame - Traversal dataframe with derived attributes
+
+# Returns:
+# pd.DataFrame - Traversal dataframe with recreated Java attributes
+
+# """
+# def isThruJunc(trav, turn_type, last_all):
+# # this method is buggy because it only considers the last traversal
+# # instead of checking for any right turn
+# if trav.centroid_start or trav.centroid_thru or trav.centroid_end:
+# return False
+
+# if trav.turnType != TURN_NONE:
+# return False
+
+# if (
+# # if there are no sibling traversals
+# len(
+# traversals[
+# (traversals.start == trav.start)
+# & (traversals.thru == trav.thru)
+# & (traversals.end != trav.start)
+# & (traversals.end != trav.end)
+# ]
+# )
+# == 0
+# ):
+# return True
+
+
+# if last_all == "last":
+# # if the last sibling traversal is of the target turn type
+# if (
+# traversals[
+# (traversals.start == trav.start)
+# & (traversals.thru == trav.thru)
+# & (traversals.end != trav.start)
+# & (traversals.end != trav.end)
+# ]
+# .iloc[-1]
+# .turnType
+# == turn_type
+# ):
+# return False
+# else:
+# return True
+
+# elif last_all == "any":
+# # if any sibling traversal is of the target turn type
+# return not (
+# traversals[
+# (traversals.start == trav.start)
+# & (traversals.thru == trav.thru)
+# & (traversals.end != trav.start)
+# & (traversals.end != trav.end)
+# ].turnType
+# == turn_type
+# ).any()
+
+# # keep track of whether traversal is "thru junction"
+# traversals["ThruJunction_anynonevec"] = (
+# ~(
+# traversals.centroid_start
+# | traversals.centroid_thru
+# | traversals.centroid_end
+# )
+# & (traversals.turnType == TURN_NONE)
+# & (traversals.none_turns - traversals.dupNoneTurns_toEdge == 0)
+# )
+
+# # create columns to populate in the slow loop
+# traversals["ThruJunction_lastnoneloop"] = False
+# traversals["ThruJunction_lastrtloop"] = False
+# traversals["ThruJunction_anynoneloop"] = False
+# traversals["ThruJunction_anyrtloop"] = False
+
+# logger.info("Beginning slow loop recreating Java attributes")
+
+# # the slow loop through all the traversals (could this be parallelized?)
+# for idx, trav in tqdm.tqdm(traversals.iterrows(), total=len(traversals)):
+
+# # check all four possible parameter combos
+# traversals.loc[idx, "ThruJunction_lastnoneloop"] = isThruJunc(
+# trav, TURN_NONE, "last"
+# )
+# traversals.loc[idx, "ThruJunction_lastrtloop"] = isThruJunc(
+# trav, TURN_RIGHT, "last"
+# )
+# traversals.loc[idx, "ThruJunction_anynoneloop"] = isThruJunc(
+# trav, TURN_NONE, "any"
+# )
+# traversals.loc[idx, "ThruJunction_anyrtloop"] = isThruJunc(
+# trav, TURN_RIGHT, "any"
+# )
+
+# # the below is buggy because it counts none-turns instead of right turns
+
+# # index: all last traversals of all input groups
+# # values: whether the index traversal is a none turn
+# is_none = (
+# traversals.groupby(["start", "thru"])
+# .last()
+# .reset_index()
+# .set_index(["start", "thru", "end"])
+# .turnType
+# == TURN_NONE
+# )
+
+# last_travs = is_none.index
+
+# # index: all last and penultimate traversals
+# # values: whether the index traversal is a none turn
+# is_none = pd.concat(
+# [
+# is_none,
+# traversals[
+# # don't allow index of penultimates to match last - we want two different candidates
+# ~traversals.set_index(["start", "thru", "end"]).index.isin(last_travs)
+# ]
+# .groupby(["start", "thru"])
+# .last()
+# .reset_index()
+# .set_index(["start", "thru", "end"])
+# .turnType
+# == TURN_NONE,
+# ]
+# )
+
+# # create an index of all penultimate sibling turn movements
+# penult_travs = is_none.index[~is_none.index.isin(last_travs)]
+
+# # get the penultimate sibling turn movments whose turn type is none
+# penult_is_none = is_none[penult_travs].reset_index(2).turnType.rename("penultimate_is_none")
+
+# # get the last sibling turn movements whose turn type is none
+# last_is_none = is_none[last_travs].reset_index(2).turnType.rename("last_is_none")
+
+# # tack on the two new columns
+# traversals = traversals.merge(
+# last_is_none, left_on=["start", "thru"], right_index=True, how="left"
+# ).merge(penult_is_none, left_on=["start", "thru"], right_index=True, how="left")
+
+# # for all traversals that match the last traversal, use the penultimate value
+# traversals.loc[
+# traversals.index.isin(last_is_none.index), "buggyRTE_none"
+# ] = traversals.loc[traversals.index.isin(last_is_none.index), "penultimate_is_none"]
+# # for all other traversals, use the last value
+# traversals.loc[
+# ~traversals.index.isin(last_is_none.index), "buggyRTE_none"
+# ] = traversals.loc[~traversals.index.isin(last_is_none.index), "last_is_none"]
+
+# # index: all last traversals of all input groups
+# # values: whether the index traversal is a right turn
+# last_is_rt = (
+# traversals.groupby(["start", "thru"])
+# .last()
+# .reset_index()
+# .set_index(["start", "thru", "end"])
+# .turnType
+# == TURN_RIGHT
+# )
+# # index: all penultimate traversals of input groups w/ >1 out link
+# # values: whether the index traversal is a right turn
+# penultimate_is_rt = (
+# traversals[
+# ~traversals.set_index(["start", "thru", "end"]).index.isin(last_is_rt.index)
+# ]
+# .groupby(["start", "thru"])
+# .last()
+# .reset_index()
+# .set_index(["start", "thru", "end"])
+# .turnType
+# == TURN_RIGHT
+# )
+
+# # drop the end column
+# last_is_rt = (
+# last_is_rt.reset_index()
+# .set_index(["start", "thru"])
+# .turnType.rename("last_is_rt")
+# )
+# # drop the end column and assume false for input groups w/ 1 out link
+# penultimate_is_rt = (
+# penultimate_is_rt.reset_index()
+# .set_index(["start", "thru"])
+# .turnType.reindex(last_is_rt.index, fill_value=False)
+# .rename("penultimate_is_rt")
+# )
+
+# # tack on two new columns (last_is_rt and penultimate_is_rt)
+# traversals = traversals.merge(
+# last_is_rt, left_on=["start", "thru"], right_index=True, how="left"
+# ).merge(penultimate_is_rt, left_on=["start", "thru"], right_index=True, how="left")
+
+# # define the right turn exists parameter for all positions,
+# # starting with those which are last-siblings
+# traversals.loc[
+# traversals.index.isin(last_is_rt.index), "buggyRTE_rt"
+# ] = traversals.loc[traversals.index.isin(last_is_rt.index), "penultimate_is_rt"]
+
+# # then moving on to all others
+# traversals.loc[
+# ~traversals.index.isin(last_is_rt.index), "buggyRTE_rt"
+# ] = traversals.loc[~traversals.index.isin(last_is_rt.index), "last_is_rt"]
+
+# # define whether each entry is a thru-junction based on its underlying
+# # right-turn-exists variables
+# traversals["ThruJunction_lastnonevec"] = (
+# ~(
+# traversals.centroid_start
+# | traversals.centroid_thru
+# | traversals.centroid_end
+# )
+# & (traversals.turnType == TURN_NONE)
+# & ~(traversals.buggyRTE_none)
+# )
+# traversals["ThruJunction_lastrtvec"] = (
+# ~(
+# traversals.centroid_start
+# | traversals.centroid_thru
+# | traversals.centroid_end
+# )
+# & (traversals.turnType == TURN_NONE)
+# & ~(traversals.buggyRTE_rt)
+# )
+
+# logger.info("Finished calculating java attributes")
+
+# java_cols = [
+# "ThruJunction_anynonevec",
+# "ThruJunction_lastnoneloop",
+# "ThruJunction_lastrtloop",
+# "ThruJunction_anynoneloop",
+# "ThruJunction_anyrtloop",
+# "ThruJunction_anyrtvec",
+# "ThruJunction_lastnonevec",
+# "ThruJunction_lastrtvec",
+# ]
+
+# return traversals, java_cols
+
+
+def calculate_signalExclRight_alternatives(
+ traversals: pd.DataFrame, logger: logging.Logger
+) -> tuple[pd.DataFrame, list]:
+ """
+ Calculate alternative signalized right turn exclusion columns.
+ This was originally used to test backwards compatibility with the Java implementation.
+ It is not used in the current implementation, but is left here for reference.
+
+ "signalExclRight_lastnoneloop", # current Java implementation
+ "signalExclRight_lastrtloop", # fixes rt-vs-none
+ "signalExclRight_lastnonevec", # vector implementation of current Java
+ "signalExclRight_lastrtvec", # same as above but also fixes rt-vs-none
+ "signalExclRight_anynoneloop", # fixes any-vs-last check
+ "signalExclRight_anyrtloop", # fixes any-vs-last and rt-vs-none (allegedly correct)
+ "signalExclRight_anynonevec", # vector implementation of any-vs-last fix
+ "signalExclRight_anyrtvec", # allegedly correct vector implementation
+
+    Parameters:
+        traversals: pd.DataFrame - Traversal dataframe with attributes
+        logger: logging.Logger - Logger for progress messages
+
+    Returns:
+        tuple[pd.DataFrame, list[str]] - Traversal dataframe with the alternative
+        signalized right turn exclusion columns, plus the list of added column names
+    """
+ # the rest can be ditched (ALLEGEDLY)
+ traversals = traversals.assign(
+ signalExclRight_anynonevec=(
+ traversals.signalized
+ & (traversals.turnType != TURN_RIGHT)
+ & (~traversals.ThruJunction_anynonevec)
+ ),
+ signalExclRight_anyrtloop=(
+ traversals.signalized
+ & (traversals.turnType != TURN_RIGHT)
+ & (~traversals.ThruJunction_anyrtloop)
+ ),
+ signalExclRight_anynoneloop=(
+ traversals.signalized
+ & (traversals.turnType != TURN_RIGHT)
+ & (~traversals.ThruJunction_anynoneloop)
+ ),
+ signalExclRight_lastnonevec=(
+ traversals.signalized
+ & (traversals.turnType != TURN_RIGHT)
+ & (~traversals.ThruJunction_lastnonevec)
+ ),
+ signalExclRight_lastrtvec=(
+ traversals.signalized
+ & (traversals.turnType != TURN_RIGHT)
+ & (~traversals.ThruJunction_lastrtvec)
+ ),
+ signalExclRight_lastnoneloop=(
+ traversals.signalized
+ & (traversals.turnType != TURN_RIGHT)
+ & (~traversals.ThruJunction_lastnoneloop)
+ ),
+ signalExclRight_lastrtloop=(
+ traversals.signalized
+ & (traversals.turnType != TURN_RIGHT)
+ & (~traversals.ThruJunction_lastrtloop)
+ ),
+ )
+
+ java_attributes = [
+ "signalExclRight_lastnoneloop",
+ "signalExclRight_lastrtloop",
+ "signalExclRight_lastnonevec",
+ "signalExclRight_lastrtvec",
+ "signalExclRight_anynoneloop",
+ "signalExclRight_anyrtloop",
+ "signalExclRight_anynonevec",
+ # "signalExclRight_anyrtvec",
+ ]
+
+ return traversals, java_attributes
+
+
+def create_and_attribute_traversals(
+ settings: BikeRouteChoiceSettings,
+ edges: pd.DataFrame,
+ nodes: pd.DataFrame,
+ logger: logging.Logger,
+) -> pd.DataFrame:
+ """
+ Create and attribute traversals from edges and nodes.
+
+ This function creates a traversal table from the edges, calculates traversal attributes,
+ and merges node information to provide a comprehensive traversal dataset.
+ It calculates angles, turn types, and various derived attributes for each traversal.
+
+    Parameters:
+        settings: BikeRouteChoiceSettings - settings for the bike route choice model
+        edges: pd.DataFrame - Edge dataframe with attributes
+        nodes: pd.DataFrame - Node dataframe with attributes
+        logger: logging.Logger - Logger for progress messages
+
+ Returns:
+ pd.DataFrame - Traversal dataframe with derived attributes
+ """
+
+ # create initial traversal table from edges
+ logger.info("Creating traversal table from edges")
+ traversals = (
+ edges.reset_index()
+ .merge(
+ edges.reset_index(),
+ left_on="toNode",
+ right_on="fromNode",
+ suffixes=["_fromEdge", "_toEdge"],
+ how="outer",
+ )
+ .rename(
+ columns={
+ "fromNode_fromEdge": "start",
+ "toNode_fromEdge": "thru",
+ "toNode_toEdge": "end",
+ }
+ )
+ .merge(nodes, left_on="thru", right_index=True, how="outer")
+ )
+
+ # drop U-turns
+ traversals = traversals[traversals.start != traversals.end]
+
+ # calculate traversal angles
+ traversals["angle"] = traversals.angle_toEdge - traversals.angle_fromEdge
+ traversals.loc[traversals.angle > math.pi, "angle"] = (
+ traversals.loc[traversals.angle > math.pi, "angle"] - 2 * math.pi
+ )
+ traversals.loc[traversals.angle < -math.pi, "angle"] = (
+ traversals.loc[traversals.angle < -math.pi, "angle"] + 2 * math.pi
+ )
+
+ # keep track of the absolute value of the traversal angle
+ traversals["absAngle"] = traversals.angle.abs()
+
+ # attach component nodes' centroid statuses
+ traversals = (
+ traversals.merge(
+ nodes.centroid,
+ left_on="start",
+ right_index=True,
+ suffixes=("_thru", "_start"),
+ )
+ .merge(
+ nodes.centroid,
+ left_on="end",
+ right_index=True,
+ )
+ .rename(columns={"centroid": "centroid_end"})
+ )
+
+ logger.info("Calculating derived traversal attributes")
+
+ # calculate traversal angle attributes for thru node
+ # maximum turning angle
+ max_angles = (
+ traversals[
+ traversals.autosPermitted_toEdge & (traversals.start != traversals.end)
+ ]
+ .groupby(["start", "thru"])
+ .angle.max()
+ .fillna(-math.pi)
+ .rename("max_angle")
+ )
+
+ # minimum turning angle (most negative)
+ min_angles = (
+ traversals[
+ traversals.autosPermitted_toEdge & (traversals.start != traversals.end)
+ ]
+ .groupby(["start", "thru"])
+ .angle.min()
+ .fillna(math.pi)
+ .rename("min_angle")
+ )
+
+ # minimum absolute turning angle
+ min_abs_angles = (
+ traversals[
+ traversals.autosPermitted_toEdge & (traversals.start != traversals.end)
+ ]
+ .groupby(["start", "thru"])
+ .absAngle.min()
+ .fillna(math.pi)
+ .rename("min_abs_angle")
+ )
+
+ # number of outbound legs
+ leg_count = (
+ traversals[
+ traversals.autosPermitted_toEdge & (traversals.start != traversals.end)
+ ]
+ .groupby(["start", "thru"])
+ .size()
+ .rename("leg_count")
+ + 1
+ )
+
+ # consolidate into a dataframe for merging and reindex to full traversal table
+ turn_atts = (
+ pd.concat(
+ [
+ max_angles,
+ min_angles,
+ min_abs_angles,
+ leg_count,
+ ],
+ axis=1,
+ )
+ .reindex([traversals.start, traversals.thru])
+ .fillna(
+ {
+ "max_angle": -math.pi,
+ "min_angle": math.pi,
+ "min_abs_angle": math.pi,
+ "leg_count": 1,
+ }
+ )
+ )
+
+ # attach to full table
+ traversals = pd.concat(
+ [traversals.set_index(["start", "thru"]), turn_atts], axis=1
+ ).reset_index()
+
+ # calculate turn type
+ # turn type is determined by the angle and leg count
+ # the options are no turns, left, right, and reverse
+ traversals.loc[~traversals.leg_count.isna(), "turnType"] = TURN_NONE
+
+ # label right turns
+ traversals.loc[
+ (traversals.leg_count == 3)
+ & (traversals.angle <= traversals.min_angle)
+ & (traversals.angle.abs() > math.pi / 6),
+ "turnType",
+ ] = TURN_RIGHT
+
+ # label left turns
+ traversals.loc[
+ (traversals.leg_count == 3)
+ & (traversals.angle >= traversals.max_angle)
+ & (traversals.angle.abs() > math.pi / 6),
+ "turnType",
+ ] = TURN_LEFT
+
+ # more right turns
+ traversals.loc[
+ (traversals.leg_count > 3)
+ & (traversals.angle.abs() > traversals.min_abs_angle)
+ & (
+ (traversals.angle.abs() >= math.pi / 6)
+ | (traversals.angle <= traversals.min_angle)
+ | (traversals.angle >= traversals.max_angle)
+ )
+ & (traversals.angle < 0),
+ "turnType",
+ ] = TURN_RIGHT
+
+ # more left turns
+ traversals.loc[
+ (traversals.leg_count > 3)
+ & (traversals.angle.abs() > traversals.min_abs_angle)
+ & (
+ (traversals.angle.abs() >= math.pi / 6)
+ | (traversals.angle <= traversals.min_angle)
+ | (traversals.angle >= traversals.max_angle)
+ )
+ & (traversals.angle >= 0),
+ "turnType",
+ ] = TURN_LEFT
+
+ # reversals by angle
+ traversals.loc[
+ (traversals.angle < -math.pi * 5 / 6) | (traversals.angle > math.pi * 5 / 6),
+ "turnType",
+ ] = TURN_REVERSE
+
+ # reversals by start/end node
+ traversals.loc[traversals.start == traversals.end, "turnType"] = TURN_REVERSE
+
+ # all centroid connector traversals are NONE type
+ traversals.loc[
+ traversals.centroid_start | traversals.centroid_thru | traversals.centroid_end,
+ "turnType",
+ ] = TURN_NONE
+
+ traversals.turnType = traversals.turnType.astype(int)
+
+ # keep track of the number of outgoing turns w/ turn type == none
+ # FIXME this should almost certainly get removed
+ traversals = traversals.merge(
+ (traversals.set_index(["start", "thru"]).turnType == TURN_NONE)
+ .reset_index()
+ .groupby(["start", "thru"])
+ .sum()
+ .rename(columns={"turnType": "none_turns"}),
+ left_on=["start", "thru"],
+ right_index=True,
+ how="left",
+ )
+
+ # do the same but with right turns
+ # FIXME this should be the actual usage, not the above, but we're
+ # copying from the java implementation
+ traversals = traversals.merge(
+ (traversals.set_index(["start", "thru"]).turnType == TURN_RIGHT)
+ .reset_index()
+ .groupby(["start", "thru"])
+ .sum()
+ .rename(columns={"turnType": "rt_turns"}),
+ left_on=["start", "thru"],
+ right_index=True,
+ how="left",
+ )
+
+ # keep track of how many duplicate traversals have turn type == none
+ traversals = traversals.merge(
+ (traversals.set_index(["start", "thru", "end"]).turnType == TURN_NONE)
+ .reset_index()
+ .groupby(["start", "thru", "end"])
+ .sum()
+ .rename(columns={"turnType": "dupNoneTurns_toEdge"}),
+ left_on=["start", "thru", "end"],
+ right_index=True,
+ how="left",
+ )
+
+ # keep track of how many duplicate traversals have turn type == right
+ traversals = traversals.merge(
+ (traversals.set_index(["start", "thru", "end"]).turnType == TURN_RIGHT)
+ .reset_index()
+ .groupby(["start", "thru", "end"])
+ .sum()
+ .rename(columns={"turnType": "dupRtTurns_toEdge"}),
+ left_on=["start", "thru", "end"],
+ right_index=True,
+ how="left",
+ )
+
+ # for replicating the buggy Java implementation
+ # if settings.recreate_java_attributes:
+ # traversals, java_cols = recreate_java_attributes(traversals)
+ # traversals, java_attributes = calculate_signalExclRight_alternatives(traversals)
+ # java_cols += java_attributes
+
+ # this is the correct implementation
+ traversals["ThruJunction_anyrtvec"] = (
+ ~(
+ traversals.centroid_start
+ | traversals.centroid_thru
+ | traversals.centroid_end
+ )
+ & (traversals.turnType == TURN_NONE)
+ & (traversals.rt_turns - traversals.dupRtTurns_toEdge == 0)
+ )
+
+ # populate derived traversal attributes
+
+ traversals = traversals.assign(
+ thruCentroid=(
+ traversals.centroidConnector_fromEdge & traversals.centroidConnector_toEdge
+ ),
+ # this one is allegedly the one to keep
+ # taken from signalExclRight_anyrtvec
+ signalExclRight=(
+ traversals.signalized
+ & (traversals.turnType != TURN_RIGHT)
+ & (~traversals.ThruJunction_anyrtvec)
+ ),
+ # unlfrma: unsignalized left from major arterial
+ unlfrma=(
+ (~traversals.signalized)
+ & (traversals.functionalClass_fromEdge <= 3)
+ & (traversals.functionalClass_fromEdge > 0)
+ & (traversals.bikeClass_fromEdge != 1)
+ & (traversals.turnType == TURN_LEFT)
+ ),
+ # unlfrmi: unsignalized left from minor arterial
+ unlfrmi=(
+ (~traversals.signalized)
+ & (traversals.functionalClass_fromEdge == 4)
+ & (traversals.bikeClass_fromEdge != 1)
+ & (traversals.turnType == TURN_LEFT)
+ ),
+ # unxma: unsignalized cross major arterial
+ unxma=(
+ ~(
+ traversals.centroid_start
+ | traversals.centroid_thru
+ | traversals.centroid_end
+ )
+ & (
+ (
+ (traversals.turnType == TURN_NONE)
+ & (
+ traversals.majorArtXings
+ - traversals.dupMajArts_fromEdge
+ - traversals.dupMajArts_toEdge
+ )
+ >= 2
+ )
+ | (
+ (traversals.turnType == TURN_LEFT)
+ & (traversals.functionalClass_toEdge <= 3)
+ & (traversals.functionalClass_toEdge > 0)
+ & (traversals.bikeClass_toEdge != 1)
+ )
+ )
+ ),
+ # unxmi: unsignalized cross minor arterial
+ unxmi=(
+ ~(
+ traversals.centroid_start
+ | traversals.centroid_thru
+ | traversals.centroid_end
+ )
+ & (
+ (
+ (traversals.turnType == TURN_NONE)
+ & (
+ traversals.artXings
+ - traversals.dupArts_fromEdge
+ - traversals.dupArts_toEdge
+ )
+ >= 2
+ )
+ | (
+ (traversals.turnType == TURN_LEFT)
+ & (traversals.functionalClass_toEdge == 4)
+ & (traversals.functionalClass_toEdge > 0)
+ & (traversals.bikeClass_toEdge != 1)
+ )
+ )
+ ),
+ roadDowngrade=(
+ traversals.functionalClass_fromEdge.isin([3, 4, 5, 6])
+ & traversals.functionalClass_toEdge.isin([4, 5, 6, 7])
+ & (traversals.functionalClass_fromEdge < traversals.functionalClass_toEdge)
+ ),
+ )
+
+ output_cols = [
+ "start",
+ "thru",
+ "end",
+ "turnType",
+ "thruCentroid",
+ "signalExclRight",
+ "unlfrma",
+ "unlfrmi",
+ "unxma",
+ "unxmi",
+ "roadDowngrade",
+ ]
+ # if settings.recreate_java_attributes:
+ # # include the java attributes if they were recreated
+ # output_cols += java_attributes
+
+ # keep only the relevant columns
+ traversals = traversals.set_index(["edgeID_fromEdge", "edgeID_toEdge"])[output_cols]
+
+ logger.info("Finished calculating derived traversal attributes")
+
+ return traversals
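The traversal-angle arithmetic above wraps the difference of two edge bearings into (-π, π], so negative angles read as right turns and positive angles as left turns. A minimal standalone sketch of that normalization (the function name here is illustrative, not part of the module):

```python
import math

def wrap_angle(angle_to_edge: float, angle_from_edge: float) -> float:
    """Return the turn angle wrapped into the interval (-pi, pi]."""
    angle = angle_to_edge - angle_from_edge
    if angle > math.pi:
        angle -= 2 * math.pi  # e.g. 1.5*pi becomes -0.5*pi (a right turn)
    elif angle < -math.pi:
        angle += 2 * math.pi  # e.g. -1.5*pi becomes 0.5*pi (a left turn)
    return angle

print(wrap_angle(0.75 * math.pi, -0.75 * math.pi))  # -> -pi/2, a right turn
```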
+
+
+def create_bike_net(settings: BikeRouteChoiceSettings, logger: logging.Logger) -> tuple[
+ pd.DataFrame, # node dataframe
+ pd.DataFrame, # edge dataframe
+ pd.DataFrame, # traversal dataframe
+]:
+ """
+ Read bike network from supplied shapefiles and derive attributes
+
+ This method reads in two shapefiles detailing the nodes and edges
+ of a bike network, manipulates the data tables to match the expected
+    output format, and derives additional attributes, some of which are
+ solely used internally while others are appended to the edge and node
+ tables. Additionally, a new table is developed of all traversals, that
+ is, all possible combinations of edges leading to and from a node which
+ are not reversals.
+
+    Parameters:
+        settings: BikeRouteChoiceSettings - settings including the node and link shapefile paths
+        logger: logging.Logger - Logger for progress messages
+
+ Returns:
+ nodes: pd.DataFrame - node dataframe with derived attributes in expected format
+ edges: pd.DataFrame - edge dataframe with derived attributes in expected format
+ traversals: pd.DataFrame - traversal dataframe with derived attributes in expected format
+ """
+ if settings.read_cached_bike_net:
+ logger.info("Reading cached bike network from CSV files")
+ edges = pd.read_csv(
+ os.path.join(os.path.expanduser(settings.output_path), "edges.csv"),
+ index_col=[0],
+ )
+ nodes = pd.read_csv(
+ os.path.join(os.path.expanduser(settings.output_path), "nodes.csv"),
+ index_col=0,
+ )
+        traversals = pd.read_csv(
+            os.path.join(os.path.expanduser(settings.output_path), "traversals.csv"),
+            index_col=[0, 1],
+        )
+ return nodes, edges, traversals
+
+ # read shapefiles
+ logger.info("Reading link and node shapefiles")
+ node = read_file(settings, settings.node_file)
+ link = read_file(settings, settings.link_file)
+
+ # create and attribute edges
+ edges, nodes = create_and_attribute_edges(settings, node, link, logger)
+
+    # generate authoritative positional index
+ edges = edges.reset_index()
+ edges.index.name = "edgeID"
+
+ traversals = create_and_attribute_traversals(settings, edges, nodes, logger)
+
+ # save edges, nodes, and traversals to csv files if specified
+ if settings.save_bike_net:
+ logger.info("Saving bike network to CSV files")
+ edges.to_csv(
+ os.path.join(os.path.expanduser(settings.output_path), "edges.csv")
+ )
+ nodes.to_csv(
+ os.path.join(os.path.expanduser(settings.output_path), "nodes.csv")
+ )
+ traversals.to_csv(
+ os.path.join(os.path.expanduser(settings.output_path), "traversals.csv")
+ )
+
+ return nodes, edges, traversals
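The traversal attribution in `create_and_attribute_traversals` repeatedly counts, for each (start, thru) approach, how many outgoing traversals have a given turn type by summing a boolean comparison within groups. A toy sketch of that pattern (data values are made up):

```python
import pandas as pd

TURN_NONE = 0  # assumed to match the module-level constant

trav = pd.DataFrame(
    {"start": [1, 1, 1, 2], "thru": [5, 5, 5, 5], "turnType": [0, 0, 1, 0]}
)

# boolean comparison summed within (start, thru) groups yields a per-group count
none_turns = (
    (trav.set_index(["start", "thru"]).turnType == TURN_NONE)
    .groupby(["start", "thru"])
    .sum()
    .rename("none_turns")
)

# broadcast the count back onto every traversal row
trav = trav.merge(none_turns, left_on=["start", "thru"], right_index=True, how="left")
print(trav.none_turns.tolist())  # -> [2, 2, 2, 1]
```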
diff --git a/src/asim/scripts/bike_route_choice/bike_route_calculator.py b/src/asim/scripts/bike_route_choice/bike_route_calculator.py
new file mode 100644
index 000000000..b8e7455cd
--- /dev/null
+++ b/src/asim/scripts/bike_route_choice/bike_route_calculator.py
@@ -0,0 +1,161 @@
+import os
+import logging
+import pandas as pd
+import numpy as np
+import warnings
+
+from bike_route_utilities import BikeRouteChoiceSettings, load_settings, read_file
+
+def calculate_utilities(
+ settings: BikeRouteChoiceSettings,
+ choosers: pd.DataFrame,
+ spec: pd.DataFrame,
+ trace_label: str,
+ logger: logging.Logger,
+) -> pd.DataFrame:
+ """
+ Calculate utilities for choosers using the provided specifications.
+ Modeled after ActivitySim's core.simulate.eval_utilities.
+
+ Parameters:
+ settings: BikeRouteChoiceSettings - settings for the bike route choice model
+ choosers: pd.DataFrame - DataFrame of choosers (edges or traversals)
+        spec: pd.Series - Series with utility expressions as index and coefficients as values
+        trace_label: str - label for tracing
+        logger: logging.Logger - Logger for progress and warning messages
+ Returns:
+ pd.DataFrame - DataFrame of calculated utilities with same index as choosers
+ """
+
+ assert isinstance(
+ spec, pd.Series
+ ), "Spec must be a pandas Series with utility expressions as index and coefficients as values"
+
+ globals_dict = {}
+ locals_dict = {
+ "np": np,
+ "pd": pd,
+ "df": choosers,
+ }
+
+ expression_values = np.empty((spec.shape[0], choosers.shape[0]))
+
+ for i, expr in enumerate(spec.index):
+ try:
+ with warnings.catch_warnings(record=True) as w:
+ # Cause all warnings to always be triggered.
+ warnings.simplefilter("always")
+ if expr.startswith("@"):
+ expression_value = eval(expr[1:], globals_dict, locals_dict)
+ else:
+ expression_value = choosers.eval(expr)
+
+ if len(w) > 0:
+ for wrn in w:
+ logger.warning(
+ f"{trace_label} - {type(wrn).__name__} ({wrn.message}) evaluating: {str(expr)}"
+ )
+
+ except Exception as err:
+ logger.exception(
+ f"{trace_label} - {type(err).__name__} ({str(err)}) evaluating: {str(expr)}"
+ )
+ raise err
+
+ expression_values[i] = expression_value
+
+ # - compute_utilities
+ utilities = np.dot(expression_values.transpose(), spec.astype(np.float64).values)
+ utilities = pd.DataFrame(utilities, index=choosers.index, columns=["utility"])
+
+ if settings.trace_bike_utilities:
+ # trace entire utility calculation
+ logger.info(f"Saving utility calculations to {trace_label}.csv")
+ expressions = pd.DataFrame(
+ data=expression_values.transpose(), index=choosers.index, columns=spec.index
+ )
+ expressions = pd.concat([choosers, expressions], axis=1)
+ expressions["utility"] = utilities["utility"]
+ expressions.to_csv(os.path.join(os.path.expanduser(settings.output_path), f"{trace_label}.csv"))
+
+ assert utilities.index.equals(
+ choosers.index
+ ), "Index mismatch between utilities and choosers"
+
+ return utilities
+
+
+def calculate_utilities_from_spec(
+ settings: BikeRouteChoiceSettings,
+ choosers: pd.DataFrame,
+ spec_file: str,
+ trace_label: str,
+ logger: logging.Logger,
+ randomize: bool = False,
+) -> pd.DataFrame:
+ """
+ Calculate utilities from a specification file (edge or traversal).
+
+ Args:
+ settings: BikeRouteChoiceSettings
+ choosers: pd.DataFrame (edges or traversals)
+ spec_file: str (path to the specification file)
+        trace_label: str (label for tracing/logging)
+        logger: logging.Logger (logger for progress messages)
+        randomize: bool (whether to randomize coefficients)
+
+ Returns:
+ pd.DataFrame with a 'utility' column
+ """
+ # read specification file
+ spec = read_file(settings, spec_file)
+
+ # Optionally randomize coefficients
+ if randomize:
+        # scale each coefficient by a random factor of 1 +/- settings.random_scale_coef, except for unavailability (-999)
+ spec["Coefficient"] = spec["Coefficient"].apply(
+ lambda x: x
+ * (
+ 1
+ + np.random.uniform(
+ 0 - settings.random_scale_coef, settings.random_scale_coef
+ )
+ )
+ if x > -990
+ else x
+ )
+
+ # calculate utilities
+ utilities = calculate_utilities(
+ settings=settings,
+ choosers=choosers,
+ spec=spec.set_index("Expression")["Coefficient"],
+ trace_label=trace_label,
+ logger=logger,
+ )
+
+    # Optionally also add a random component across all utilities
+    if randomize:
+        available = utilities.utility > -999
+        utilities.loc[available, "utility"] = utilities.loc[available, "utility"] * (
+            np.random.choice(
+                [(1 - settings.random_scale_link), (1 + settings.random_scale_link)],
+                size=available.sum(),
+            )
+        )
+
+ # check that all utilities are less than or equal to zero
+ if (utilities["utility"] > 0).any():
+ logger.warning(
+ f"{trace_label} utilities contain positive values, which may indicate an issue with the utility calculation."
+ )
+ logger.info(f"Utilities: {utilities['utility'].describe()}")
+ logger.info(
+ f"Percentage of positive utilities: {(utilities['utility'] > 0).mean() * 100:.2f}%"
+ )
+ logger.info(
+ "Capping the positive utilities to zero to avoid issues in Dijkstra's algorithm."
+ )
+ utilities.loc[utilities["utility"] > 0, "utility"] = 0
+ assert (
+ ~utilities.utility.isna()
+ ).all(), f"{trace_label} utilities should not contain any NaN values"
+
+ return utilities
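The core of `calculate_utilities` is an ActivitySim-style spec evaluation: each spec row is an expression evaluated against the chooser table, and the utility is the coefficient-weighted sum of the evaluated rows. A minimal sketch with hypothetical column names and coefficients (not the model's real spec):

```python
import numpy as np
import pandas as pd

choosers = pd.DataFrame({"distance": [1.0, 2.0], "bikeClass": [1, 0]})
spec = pd.Series({"distance": -0.5, "bikeClass == 1": 1.0})  # expression -> coefficient

# evaluate each expression against the chooser table
values = np.empty((len(spec), len(choosers)))
for i, expr in enumerate(spec.index):
    values[i] = choosers.eval(expr)  # '@'-prefixed expressions would use eval() instead

# utility = coefficient-weighted sum across expressions
utility = values.T @ spec.to_numpy()  # -0.5*distance + 1.0*(bikeClass == 1)
```

Here the first chooser gets -0.5 + 1.0 = 0.5 and the second gets -1.0 + 0.0 = -1.0.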
diff --git a/src/asim/scripts/bike_route_choice/bike_route_choice.py b/src/asim/scripts/bike_route_choice/bike_route_choice.py
new file mode 100644
index 000000000..a7d2512d4
--- /dev/null
+++ b/src/asim/scripts/bike_route_choice/bike_route_choice.py
@@ -0,0 +1,937 @@
+import sys, os
+import numpy as np
+import pandas as pd
+import geopandas as gpd
+import logging
+import warnings
+from scipy.sparse import csr_matrix
+from scipy.sparse import csr_array, coo_array
+from scipy.sparse.csgraph import dijkstra
+from multiprocessing import Pool
+
+import bike_net_reader
+import bike_route_calculator
+from bike_route_utilities import BikeRouteChoiceSettings, load_settings, read_file
+
+import time
+
+
+def process_paths_new(centroids, predecessors, logger):
+ """
+ Converts the predecessors array from Dijkstra's algorithm into paths.
+
+ Args:
+ centroids (list): List of centroid indices.
+ predecessors (np.ndarray): Predecessors array from Dijkstra's algorithm.
+
+ Returns:
+ tuple: A tuple containing:
+ - paths_orig (np.ndarray): Origin indices for paths.
+ - paths_dest (np.ndarray): Destination indices for paths.
+ - paths_from_node (np.ndarray): From-node indices for paths.
+ - paths_to_node (np.ndarray): To-node indices for paths.
+ """
+ logger.debug("Processing paths without numba...")
+
+ # Add self-referential column to predecessor table to indicate end of path
+ predecessors_null = np.hstack(
+ (predecessors, np.full((predecessors.shape[0], 1), -1))
+ )
+ predecessors_null[predecessors_null == -9999] = -1
+ # Get starting indices for OD pairs with path found
+ notnull = (predecessors_null >= 0).nonzero()
+ notnull = tuple(
+ i.astype(np.int32) for i in notnull
+ ) # force int32 to save memory (defaults to int64)
+ notnull_dest = np.isin(notnull[1], centroids).nonzero()
+ origin_indices, dest_indices = (notnull[0][notnull_dest], notnull[1][notnull_dest])
+
+ # Iterate through predecessors
+ node_indices = dest_indices
+ paths = [node_indices]
+ while np.any(node_indices >= 0):
+ node_indices = predecessors_null[origin_indices, node_indices]
+ paths.append(node_indices)
+
+ stack_paths = np.vstack(paths).T
+ stack_paths = stack_paths[:, ::-1] # Reverse order to get origin -> destination
+ stack_paths_from = stack_paths[:, :-1]
+ stack_paths_to = stack_paths[:, 1:] # Offset by 1 to get to-node
+
+ # Remove null edges
+ od_index, path_num = (stack_paths_from >= 0).nonzero()
+ # path_num_actual = path_num - np.argmax(stack_paths_from >= 0, axis=1)[od_index] # 0-index
+ paths_from_node = stack_paths_from[od_index, path_num]
+ paths_to_node = stack_paths_to[od_index, path_num]
+
+ paths_orig = origin_indices[od_index] # centroids index
+ paths_dest = dest_indices[od_index] # mapped node id
+
+ return (paths_orig, paths_dest, paths_from_node, paths_to_node)
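`process_paths_new` vectorizes, over all OD pairs at once, the predecessor walk that SciPy's Dijkstra enables. A minimal single-path sketch of the same idea on a tiny assumed graph (a 4-node chain), using SciPy's -9999 "no predecessor" sentinel:

```python
import numpy as np
from scipy.sparse import csr_array
from scipy.sparse.csgraph import dijkstra

# directed 4-node chain: 0 -> 1 -> 2 -> 3, unit edge costs
graph = csr_array((np.ones(3), ([0, 1, 2], [1, 2, 3])), shape=(4, 4))
dist, predecessors = dijkstra(graph, indices=[0], return_predecessors=True)

# walk backwards from the destination; entries < 0 (-9999) mark "no predecessor"
node, path = 3, [3]
while predecessors[0, node] >= 0:
    node = predecessors[0, node]
    path.append(node)
path.reverse()
print(path)  # -> [0, 1, 2, 3]
```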
+
+
+def calculate_final_logsums_batch_traversals(
+ settings,
+ nodes,
+ edges,
+ traversals,
+ origin_centroids,
+ dest_centroids,
+ all_paths_orig,
+ all_paths_dest,
+ all_paths_from_edge,
+ all_paths_to_edge,
+ all_paths_iteration,
+ logger,
+ trace_origins=[],
+ trace_dests=[],
+):
+ """
+ Calculate the final logsums for the bike network using pre-computed paths and traversals.
+ Includes path size calculation.
+
+ Args:
+ settings: BikeRouteChoiceSettings
+ nodes (pd.DataFrame): DataFrame containing node information.
+ edges (pd.DataFrame): DataFrame containing edge information.
+ traversals (pd.DataFrame): DataFrame containing traversal information.
+ origin_centroids (list): List of origin centroids.
+ dest_centroids (list): List of destination centroids.
+ all_paths_orig (np.ndarray): Array of origin indices for all paths.
+ all_paths_dest (np.ndarray): Array of destination indices for all paths.
+ all_paths_from_edge (np.ndarray): Array of from-edge indices for all paths.
+ all_paths_to_edge (np.ndarray): Array of to-edge indices for all paths.
+ all_paths_iteration (np.ndarray): Array of iteration indices for all paths.
+ trace_origins (list, optional): List of origins to trace. Defaults to [].
+ trace_dests (list, optional): List of destinations to trace. Defaults to [].
+
+ Returns:
+ tuple: A tuple containing:
+ - paths_od_orig_mapped (np.ndarray): Mapped origin indices for paths.
+ - paths_od_dest_mapped (np.ndarray): Mapped destination indices for paths.
+ - paths_od_logsum (np.ndarray): Logsum values for each OD pair.
+ - trace_paths_orig_mapped (np.ndarray): Mapped origins for traced paths.
+ - trace_paths_dest_mapped (np.ndarray): Mapped destinations for traced paths.
+ - trace_paths_iteration (np.ndarray): Iteration indices for traced paths.
+ - trace_paths_prev_node (np.ndarray): Previous node indices for traced paths.
+ - trace_paths_from_node (np.ndarray): From-node indices for traced paths.
+ - trace_paths_to_node (np.ndarray): To-node indices for traced paths.
+ - trace_paths_path_size (np.ndarray): Path size for traced paths.
+ - trace_paths_edge_cost (np.ndarray): Edge cost for traced paths.
+ - trace_paths_trav_cost (np.ndarray): Traversal cost for traced paths.
+ """
+ logger.info("Calculating logsums...")
+
+ if all_paths_orig.size == 0:
+ logger.info("No paths found in this batch")
+ return (
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ np.empty((0)),
+ )
+
+ # Mapped node id to centroids index
+ dest_centroids_rev_map = np.zeros(int(max(dest_centroids)) + 1, dtype=np.int32)
+ dest_centroids_rev_map[dest_centroids] = range(len(dest_centroids))
+ all_paths_dest_rev = dest_centroids_rev_map[all_paths_dest]
+
+ node_mapping = {node_id: idx for idx, node_id in enumerate(nodes.index)}
+
+ edges = edges.reset_index()
+ edge_from_node = edges.fromNode.map(node_mapping).to_numpy()
+ edge_to_node = edges.toNode.map(node_mapping).to_numpy()
+ edge_cost = edges.edge_utility.to_numpy()
+ edge_length = edges.distance.to_numpy()
+ edge_ids = edges.edgeID.to_numpy()
+
+ all_paths_from_node = edge_from_node[all_paths_to_edge]
+ all_paths_to_node = edge_to_node[all_paths_to_edge]
+ all_paths_edge_cost = edge_cost[all_paths_to_edge]
+ all_paths_edge_length = edge_length[all_paths_to_edge]
+ all_paths_edge_id = edge_ids[all_paths_to_edge]
+
+ if len(trace_origins) > 0:
+ all_paths_prev_node = edge_from_node[all_paths_from_edge]
+
+ num_edges = len(edges)
+
+ # node mapping needs to start at 0 in order to create adjacency matrix
+    # using edges instead of nodes since the edges are treated as nodes in Dijkstra's algorithm
+ # constructing edge_mapping with columns [index, fromNode, toNode]
+ edge_mapping = edges[["fromNode", "toNode"]].reset_index()
+
+ traversals_mapped = (
+ traversals.reset_index()
+ .merge(
+ edge_mapping,
+ how="left",
+ left_on=["start", "thru"],
+ right_on=["fromNode", "toNode"],
+ )
+ .merge(
+ edge_mapping,
+ how="left",
+ left_on=["thru", "end"],
+ right_on=["fromNode", "toNode"],
+ suffixes=("FromEdge", "ToEdge"),
+ )
+ )
+
+ row = traversals_mapped.indexFromEdge.to_numpy()
+ col = traversals_mapped.indexToEdge.to_numpy()
+ data = traversals_mapped.traversal_utility.to_numpy()
+ trav_costs = csr_array((data, (row, col)), shape=(num_edges, num_edges))
+
+ all_paths_trav_cost = trav_costs[all_paths_from_edge, all_paths_to_edge]
+
+ # Add origin connectors
+ orig_connectors_indices = np.isin(all_paths_from_edge, origin_centroids).nonzero()[
+ 0
+ ]
+ all_paths_from_node = np.concatenate(
+ (
+ all_paths_from_node,
+ edge_from_node[all_paths_from_edge][orig_connectors_indices],
+ )
+ )
+ all_paths_to_node = np.concatenate(
+ (all_paths_to_node, edge_to_node[all_paths_from_edge][orig_connectors_indices])
+ )
+ all_paths_edge_cost = np.concatenate(
+ (all_paths_edge_cost, edge_cost[all_paths_from_edge][orig_connectors_indices])
+ )
+ all_paths_edge_id = np.concatenate(
+ (all_paths_edge_id, edge_ids[all_paths_from_edge][orig_connectors_indices])
+ )
+ all_paths_edge_length = np.concatenate(
+ (
+ all_paths_edge_length,
+ edge_length[all_paths_from_edge][orig_connectors_indices],
+ )
+ )
+ all_paths_trav_cost = np.concatenate(
+ (all_paths_trav_cost, np.zeros_like(orig_connectors_indices))
+ )
+ all_paths_orig_new = np.concatenate(
+ (all_paths_orig, all_paths_orig[orig_connectors_indices])
+ )
+ all_paths_dest_rev_new = np.concatenate(
+ (all_paths_dest_rev, all_paths_dest_rev[orig_connectors_indices])
+ )
+ all_paths_iteration_new = np.concatenate(
+ (all_paths_iteration, all_paths_iteration[orig_connectors_indices])
+ )
+ all_paths_cost = all_paths_edge_cost + all_paths_trav_cost
+
+    # SciPy sparse arrays only support 2D arrays, so flatten OD and link indices
+ paths_od_ravel = np.ravel_multi_index(
+ (all_paths_orig_new, all_paths_dest_rev_new),
+ (len(origin_centroids), len(dest_centroids)),
+ )
+ paths_link_ravel = np.ravel_multi_index(
+ (all_paths_from_node, all_paths_to_node), (len(nodes), len(nodes))
+ )
+
+ # extract paths for OD pairs that are being traced
+ if len(trace_origins) > 0:
+ # trace_indices = (np.isin(all_paths_orig_new, trace_origins) & np.isin(all_paths_dest_rev_new, trace_dests)).nonzero()[0]
+ all_paths_prev_node = np.concatenate(
+ (all_paths_prev_node, np.full_like(orig_connectors_indices, -1))
+ )
+ trace_od_ravel = np.ravel_multi_index(
+ (trace_origins, trace_dests), (len(origin_centroids), len(dest_centroids))
+ )
+ trace_indices = (np.isin(paths_od_ravel, trace_od_ravel)).nonzero()[0]
+ trace_paths_orig = all_paths_orig_new[trace_indices]
+ trace_paths_dest_rev = all_paths_dest_rev_new[trace_indices]
+ trace_paths_iteration = all_paths_iteration_new[trace_indices]
+ trace_paths_prev_node = all_paths_prev_node[trace_indices]
+ trace_paths_from_node = all_paths_from_node[trace_indices]
+ trace_paths_to_node = all_paths_to_node[trace_indices]
+ trace_paths_edge_cost = all_paths_edge_cost[trace_indices]
+ trace_paths_trav_cost = all_paths_trav_cost[trace_indices]
+ trace_paths_edge_id = all_paths_edge_id[trace_indices]
+
+    # SciPy COO arrays sum duplicate entries together upon conversion to a CSR array.
+    # Insert ones for each path link to count the number of paths for each OD/link pair.
+    # Likely not an optimal solution, but np.unique took far longer; consider other alternatives.
+ ones_arr = np.ones(len(paths_od_ravel), dtype=np.uint8)
+ link_num_paths = coo_array(
+ (ones_arr, (paths_od_ravel, paths_link_ravel)),
+ shape=(len(origin_centroids) * len(dest_centroids), len(nodes) ** 2),
+ )
+ link_num_paths = csr_array(link_num_paths)
+
+ paths_num_paths = link_num_paths[paths_od_ravel, paths_link_ravel]
+
+ # path size = sum( ( la / Li ) * ( 1 / Man ) ) = sum( la / Man ) / Li
+ all_paths_size_component = all_paths_edge_length / paths_num_paths # la / Man
+ # Flatten OD and iteration indices to sum cost, length, path size
+ # Should check if COO array is faster than bincount
+ paths_od_iter_ravel = np.ravel_multi_index(
+ (all_paths_orig_new, all_paths_dest_rev_new, all_paths_iteration_new),
+ (len(origin_centroids), len(dest_centroids), settings.number_of_iterations),
+ )
+ od_iter_length_total = np.bincount(paths_od_iter_ravel, all_paths_edge_length) # Li
+
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore")
+ od_iter_path_size = (
+ np.bincount(paths_od_iter_ravel, all_paths_size_component)
+ / od_iter_length_total
+ ) # sum( la / Man ) / Li
+
+ od_iter_cost = np.bincount(paths_od_iter_ravel, all_paths_cost)
+
+ # extract path sizes for OD pairs that are being traced
+ if len(trace_origins) > 0:
+ trace_paths_od_iter_ravel = np.ravel_multi_index(
+ (trace_paths_orig, trace_paths_dest_rev, trace_paths_iteration),
+ (len(origin_centroids), len(dest_centroids), settings.number_of_iterations),
+ )
+ trace_paths_path_size = od_iter_path_size[trace_paths_od_iter_ravel]
+
+ # Unflatten OD and iteration indices, no longer need individual links
+ od_iter_indices = (od_iter_length_total > 0).nonzero()[0]
+ paths_od_iter_orig, paths_od_iter_dest, paths_od_iter_iter = np.unravel_index(
+ od_iter_indices,
+ (len(origin_centroids), len(dest_centroids), settings.number_of_iterations),
+ )
+ paths_od_iter_cost = od_iter_cost[od_iter_indices]
+ paths_od_iter_path_size = od_iter_path_size[od_iter_indices]
+ paths_od_iter_length_total = od_iter_length_total[od_iter_indices]
+
+ # Normalize path size, need to sum path size by OD
+ paths_od_ravel = np.ravel_multi_index(
+ (paths_od_iter_orig, paths_od_iter_dest),
+ (len(origin_centroids), len(dest_centroids)),
+ )
+ od_count_iter = np.bincount(paths_od_ravel)
+ od_path_size_sum = np.bincount(paths_od_ravel, paths_od_iter_path_size)
+ paths_od_iter_path_size_sum = od_path_size_sum[paths_od_ravel]
+ paths_od_iter_path_size_normalized = (
+ paths_od_iter_path_size / paths_od_iter_path_size_sum
+ )
+
+ # Path utility is path cost plus the log of the normalized path size correction
+ paths_od_iter_utility = paths_od_iter_cost + np.log(
+ paths_od_iter_path_size_normalized
+ )
+ paths_od_iter_utility_exp = np.exp(paths_od_iter_utility)
+
+ od_utility_exp_sum = np.bincount(paths_od_ravel, paths_od_iter_utility_exp)
+ paths_od_iter_utility_exp_sum = od_utility_exp_sum[paths_od_ravel]
+ paths_od_iter_prob = paths_od_iter_utility_exp / paths_od_iter_utility_exp_sum
+
+ paths_od_iter_length_weighted = paths_od_iter_length_total * paths_od_iter_prob
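The block above is a path-size logit: each alternative's utility is its cost plus the log of its normalized path size, and per-OD probabilities come from exponentiated utilities normalized by their sum. A toy sketch with hypothetical values for one OD pair:

```python
import numpy as np

# Hypothetical costs and normalized path sizes for three alternatives of one OD pair
cost = np.array([-1.0, -1.2, -1.5])
path_size = np.array([0.5, 0.3, 0.2])   # already normalized to sum to 1

v = cost + np.log(path_size)        # path-size-corrected utility
p = np.exp(v) / np.exp(v).sum()     # choice probabilities
logsum = np.log(np.exp(v).sum())    # OD-level accessibility measure
assert np.isclose(p.sum(), 1.0)
```

Overlapping paths get a smaller path size, hence a lower utility and probability, which keeps the logsum from double-counting near-duplicate routes.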
+
+ # Unflatten OD indices, no longer need iterations
+ od_indices = (od_path_size_sum > 0).nonzero()[0]
+ paths_od_orig, paths_od_dest = np.unravel_index(
+ od_indices, (len(origin_centroids), len(dest_centroids))
+ )
+
+ paths_od_count_iter = od_count_iter[od_indices]
+ paths_od_path_size_sum = od_path_size_sum[od_indices]
+
+ # Average path length
+ od_dist = np.bincount(paths_od_ravel, paths_od_iter_length_weighted)
+ paths_od_dist = od_dist[od_indices]
+
+ # Logsum calculation
+ od_logsum = np.bincount(paths_od_ravel, paths_od_iter_utility_exp)
+ paths_od_logsum = np.log(od_logsum[od_indices])
+
+ # Centroids index to mapped node id
+ origin_centroids_np = np.array(origin_centroids)
+ paths_od_orig_mapped = origin_centroids_np[paths_od_orig]
+ dest_centroids_np = np.array(dest_centroids)
+ paths_od_dest_mapped = dest_centroids_np[paths_od_dest]
+
+ # edge mapping is used to map origins and destinations back to their original node ids
+ # this awkward mapping is necessary because the edges are treated as nodes in the dijkstra's algorithm
+ fromNode_map = edge_mapping.set_index("index")["fromNode"].to_dict()
+ toNode_map = edge_mapping.set_index("index")["toNode"].to_dict()
+
+ paths_od_orig_mapped = np.array([fromNode_map[i] for i in paths_od_orig_mapped])
+ paths_od_dest_mapped = np.array([toNode_map[i] for i in paths_od_dest_mapped])
+
+ if len(trace_origins) > 0:
+ # remapping traced origins and destinations from 0 index to edge index, then edge index to node id
+ trace_paths_orig_mapped = origin_centroids_np[trace_paths_orig]
+ trace_paths_dest_mapped = dest_centroids_np[trace_paths_dest_rev]
+ trace_paths_orig_mapped = np.array(
+ [fromNode_map[i] for i in trace_paths_orig_mapped]
+ )
+ trace_paths_dest_mapped = np.array(
+ [toNode_map[i] for i in trace_paths_dest_mapped]
+ )
+
+ # paths themselves just need to be mapped back from 0-index to node id
+ rev_node_mapping = {v: k for k, v in node_mapping.items()}
+ rev_node_mapping[-1] = -1 # map no path -1 to itself
+ trace_paths_prev_node = np.array(
+ [rev_node_mapping[i] for i in trace_paths_prev_node]
+ )
+ trace_paths_from_node = np.array(
+ [rev_node_mapping[i] for i in trace_paths_from_node]
+ )
+ trace_paths_to_node = np.array(
+ [rev_node_mapping[i] for i in trace_paths_to_node]
+ )
+
+ return (
+ paths_od_orig_mapped,
+ paths_od_dest_mapped,
+ paths_od_dist,
+ paths_od_logsum,
+ paths_od_path_size_sum,
+ paths_od_count_iter,
+ trace_paths_orig_mapped,
+ trace_paths_dest_mapped,
+ trace_paths_iteration,
+ trace_paths_prev_node,
+ trace_paths_from_node,
+ trace_paths_to_node,
+ trace_paths_path_size,
+ trace_paths_edge_cost,
+ trace_paths_trav_cost,
+ trace_paths_edge_id,
+ )
+ else:
+ return (
+ paths_od_orig_mapped,
+ paths_od_dest_mapped,
+ paths_od_dist,
+ paths_od_logsum,
+ paths_od_path_size_sum,
+ paths_od_count_iter,
+ np.empty(0),
+ np.empty(0),
+ np.empty(0),
+ np.empty(0),
+ np.empty(0),
+ np.empty(0),
+ np.empty(0),
+ np.empty(0),
+ np.empty(0),
+ np.empty(0),
+ )
+
+
+def _perform_dijkstra(centroids, adjacency_matrix, logger, limit=3):
+ """Perform Dijkstra's algorithm for a batch of centroids."""
+ logger.info(
+ f"Processing Dijkstra's on {len(centroids)} centroids with limit={limit}..."
+ )
+ distances, predecessors = dijkstra(
+ adjacency_matrix,
+ directed=True,
+ indices=centroids,
+ return_predecessors=True,
+ limit=limit,
+ )
+
+ return (distances, predecessors)
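SciPy's `dijkstra` with a `limit` prunes the search once accumulated cost exceeds the threshold; unreached nodes come back with an infinite distance and the `-9999` predecessor sentinel. A small runnable sketch:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

# 3-node chain: 0 -> 1 (cost 2), 1 -> 2 (cost 2)
graph = csr_matrix(
    (np.array([2.0, 2.0]), (np.array([0, 1]), np.array([1, 2]))), shape=(3, 3)
)

dist, pred = dijkstra(
    graph, directed=True, indices=[0], return_predecessors=True, limit=3
)
# Node 2 lies at cost 4, beyond the limit of 3, so it is treated as unreachable
print(dist)   # [[ 0.  2. inf]]
print(pred)   # [[-9999     0 -9999]]
```

The `limit` is what `max_dijkstra_utility` controls in the settings: larger values reach more zones at the cost of longer runtimes.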
+
+
+def perform_dijkstras_algorithm_batch_traversals(
+ edges, traversals, origin_centroids, limit, logger
+):
+ """Perform Dijkstra's algorithm for centroids using SciPy's sparse graph solver with batched parallel processing."""
+ num_edges = len(edges)
+
+ # node mapping needs to start at 0 in order to create adjacency matrix
+ # fromNode and toNode index is saved as columns in edges
+ # then reindex again to get index of the edge
+ edge_mapping = edges["edge_utility"]
+
+ traversals_mapped = (
+ traversals.reset_index()
+ .merge(
+ edge_mapping,
+ how="left",
+ left_on="edgeID_fromEdge",
+ right_index=True
+ )
+ .merge(
+ edge_mapping,
+ how="left",
+ left_on="edgeID_toEdge",
+ right_index=True,
+ suffixes=("FromEdge", "ToEdge"),
+ )
+ )
+ # Total bike cost is edge cost (after traversal) plus traversal cost
+ # Note origin zone connector edge cost is not included with this approach, but this has no impact on shortest path selection
+ traversals_mapped["total_utility"] = (
+ traversals_mapped.edge_utilityToEdge + traversals_mapped.traversal_utility
+ )
+ traversals_mapped.loc[
+ traversals_mapped.edgeID_fromEdge.isin(origin_centroids), "total_utility"
+ ] += traversals_mapped.edge_utilityFromEdge
+
+ # Convert from negative utility to positive cost for Dijkstra's
+ traversals_mapped.total_utility *= -1
+
+ # Create a sparse adjacency matrix
+ row = traversals_mapped.edgeID_fromEdge.to_numpy()
+ col = traversals_mapped.edgeID_toEdge.to_numpy()
+ data = traversals_mapped.total_utility.to_numpy()
+ adjacency_matrix = csr_matrix((data, (row, col)), shape=(num_edges, num_edges))
+
+ logger.debug(f"Need to calculate Dijkstra's on {len(origin_centroids)} centroids")
+
+ # Perform Dijkstra's algorithm for all centroids
+ shortest_paths = _perform_dijkstra(origin_centroids, adjacency_matrix, logger, limit)
+ return shortest_paths
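The function above builds a line graph: edges act as vertices and traversals as arcs, so turn penalties can be encoded directly in the arc weights that Dijkstra's algorithm sees. A toy sketch of the adjacency construction (column names are illustrative, not the exact ones used above):

```python
import pandas as pd
from scipy.sparse import csr_matrix

# Toy network: edge 0 (A->B) and edge 1 (B->C) form a single traversal 0 -> 1.
# Arc weight = cost of the to-edge plus the turn penalty of the traversal.
traversals = pd.DataFrame({
    "fromEdge": [0],
    "toEdge": [1],
    "cost": [1.5],
})

adj = csr_matrix(
    (traversals.cost, (traversals.fromEdge, traversals.toEdge)), shape=(2, 2)
)
print(adj.toarray())
# [[0.  1.5]
#  [0.  0. ]]
```

Because each arc carries its to-edge's cost, origin connector edge costs never appear on any arc, which is the caveat noted in the comment above.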
+
+
+def run_iterations_batch_traversals(
+ settings: BikeRouteChoiceSettings,
+ edges: pd.DataFrame,
+ traversals: pd.DataFrame,
+ origin_centroids: list,
+ dest_centroids: list,
+ logger: logging.Logger,
+):
+ """
+ Run multiple iterations of Dijkstra's algorithm using traversals as "links" and edges as "vertices".
+ For each iteration, it calculates utilities for edges and traversals, then performs Dijkstra's algorithm to find paths.
+
+ Args:
+ settings: BikeRouteChoiceSettings
+ edges: pd.DataFrame
+ traversals: pd.DataFrame
+ origin_centroids: list
+ dest_centroids: list
+
+ Returns:
+ tuple: A tuple containing arrays of path information.
+ """
+
+ all_paths = []
+
+ for i in range(settings.number_of_iterations):
+ logger.info(f"Running iteration {i + 1} of {settings.number_of_iterations}")
+ # calculating utilities with randomness
+ edges["edge_utility"] = bike_route_calculator.calculate_utilities_from_spec(
+ settings,
+ choosers=edges,
+ spec_file=settings.edge_util_file,
+ trace_label=f"bike_edge_utilities_iteration_{i}",
+ randomize=True,
+ logger=logger,
+ )
+
+ logger.debug("Overriding utility of centroid connectors")
+ edges.loc[edges.centroidConnector, "edge_utility"] = -np.finfo(edges.edge_utility.dtype).eps
+
+ traversals[
+ "traversal_utility"
+ ] = bike_route_calculator.calculate_utilities_from_spec(
+ settings,
+ choosers=traversals,
+ spec_file=settings.traversal_util_file,
+ trace_label=f"bike_traversal_utilities_iteration_{i}",
+ randomize=False,
+ logger=logger,
+ )
+
+ # run dijkstra's
+ distances, predecessors = perform_dijkstras_algorithm_batch_traversals(
+ edges, traversals, origin_centroids, limit=settings.max_dijkstra_utility, logger=logger,
+ )
+
+ # process paths
+ paths = process_paths_new(dest_centroids, predecessors, logger)
+ all_paths.append(paths + (np.full_like(paths[0], i, dtype=np.uint8),))
+
+ all_paths_concat = map(np.concatenate, zip(*all_paths))
+ (
+ all_paths_orig,
+ all_paths_dest,
+ all_paths_from_edge,
+ all_paths_to_edge,
+ all_paths_iteration,
+ ) = all_paths_concat
+ return (
+ all_paths_orig,
+ all_paths_dest,
+ all_paths_from_edge,
+ all_paths_to_edge,
+ all_paths_iteration,
+ )
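Each iteration yields a tuple of parallel arrays, and `map(np.concatenate, zip(*all_paths))` transposes the list of tuples into one concatenated array per field. A toy sketch of that idiom:

```python
import numpy as np

# Two iterations, each returning (origins, link_ids) as parallel arrays
iter1 = (np.array([0, 0]), np.array([5, 7]))
iter2 = (np.array([1]), np.array([9]))

# zip(*) groups arrays by field; np.concatenate flattens each group
origs, links = map(np.concatenate, zip(*[iter1, iter2]))
print(origs)  # [0 0 1]
print(links)  # [5 7 9]
```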
+
+
+def run_batch_traversals(
+ settings,
+ nodes,
+ edges,
+ traversals,
+ origin_centroids,
+ dest_centroids,
+ trace_origin_edgepos,
+ trace_dest_edgepos,
+ logger,
+ proc_id
+):
+ """
+ Run batch traversals for the bike route choice model.
+
+ """
+ if logger is None:
+ logger = logging.getLogger(f"subprocess_{proc_id}")
+ log_file_location = os.path.join(os.path.expanduser(settings.output_path), "bike_model.log")
+ logging.basicConfig(
+ level=logging.INFO,
+ format="%(asctime)s - %(name)s %(levelname)s - %(message)s",
+ handlers=[logging.FileHandler(log_file_location), logging.StreamHandler()],
+ )
+ (
+ all_paths_orig,
+ all_paths_dest,
+ all_paths_from_edge,
+ all_paths_to_edge,
+ all_paths_iteration,
+ ) = run_iterations_batch_traversals(
+ settings,
+ edges,
+ traversals,
+ origin_centroids,
+ dest_centroids,
+ logger,
+ )
+ trace_origins_rev = []
+ trace_dests_rev = []
+
+ if len(settings.trace_origins) > 0:
+
+ filtered_trace_origs = trace_origin_edgepos[np.isin(trace_origin_edgepos,origin_centroids)]
+
+ trace_origins_rev = pd.DataFrame(
+ enumerate(origin_centroids),
+ columns=['subset_pos','edge_pos']
+ ).set_index(
+ 'edge_pos'
+ ).loc[
+ filtered_trace_origs
+ ].subset_pos.values
+
+ filtered_trace_dests = trace_dest_edgepos[np.isin(trace_dest_edgepos,dest_centroids)]
+
+ trace_dests_rev = pd.DataFrame(
+ enumerate(dest_centroids),
+ columns=['subset_pos','edge_pos']
+ ).set_index(
+ 'edge_pos'
+ ).loc[
+ filtered_trace_dests
+ ].subset_pos.values
+
+ # calculate non-randomized utilities for edges and traversals to use in final logsum calculation
+ edges["edge_utility"] = bike_route_calculator.calculate_utilities_from_spec(
+ settings,
+ choosers=edges,
+ spec_file=settings.edge_util_file,
+ trace_label="bike_edge_utilities_final",
+ randomize=False,
+ logger=logger,
+ )
+
+ logger.debug("Overriding utility of centroid connectors")
+ edges.loc[edges.centroidConnector, "edge_utility"] = -np.finfo(edges.edge_utility.dtype).eps
+
+ traversals[
+ "traversal_utility"
+ ] = bike_route_calculator.calculate_utilities_from_spec(
+ settings,
+ choosers=traversals,
+ spec_file=settings.traversal_util_file,
+ trace_label="bike_traversal_utilities_final",
+ randomize=False,
+ logger=logger,
+ )
+
+ final_paths = calculate_final_logsums_batch_traversals(
+ settings,
+ nodes,
+ edges,
+ traversals,
+ origin_centroids,
+ dest_centroids,
+ all_paths_orig,
+ all_paths_dest,
+ all_paths_from_edge,
+ all_paths_to_edge,
+ all_paths_iteration,
+ logger,
+ trace_origins_rev,
+ trace_dests_rev,
+ )
+ return final_paths
+
+
+def get_centroid_connectors(
+ settings: BikeRouteChoiceSettings, nodes: pd.DataFrame, edges: pd.DataFrame, logger: logging.Logger,
+):
+ """
+ Generate centroids for the bike route choice model.
+ This function is a placeholder and should be replaced with a more intelligent centroid selection method.
+
+ Returns:
+ Index of edge IDs of corresponding connectors
+ """
+ # node mapping needs to start at 0 in order to create adjacency matrix
+ # constructing edge_mapping with columns [index, fromNode, toNode]
+ edge_mapping = edges[["fromNode", "toNode"]]
+
+ # Get mgra centroids (nodes with 'centroid' flag True and 'mgra' greater than zero)
+ # Centroid connectors
+ origin_centroid_connectors = nodes[
+ nodes["centroid"] & (nodes[settings.zone_level] > 0)
+ ].merge(edge_mapping, how="left", left_index=True, right_on="fromNode")
+
+ dest_centroid_connectors = nodes[nodes["centroid"] & (nodes[settings.zone_level] > 0)].merge(
+ edge_mapping, how="left", left_index=True, right_on="toNode"
+ )
+
+ if isinstance(settings.zone_subset, list):
+ # filter centroids based on zone_subset if it is a list
+ origin_centroid_connectors = origin_centroid_connectors[
+ origin_centroid_connectors[settings.zone_level].isin(settings.zone_subset)
+ ]
+ dest_centroid_connectors = dest_centroid_connectors[dest_centroid_connectors[settings.zone_level].isin(settings.zone_subset)]
+ elif isinstance(settings.zone_subset, int):
+ # take the first N centroids if zone_subset is an integer
+ origin_centroid_connectors = origin_centroid_connectors[: settings.zone_subset]
+ dest_centroid_connectors = dest_centroid_connectors[: settings.zone_subset]
+
+ def _clean_centroid_connectors(df, label, edge_mapping):
+ null_rows = df[df.isnull().any(axis=1)]
+ if not null_rows.empty:
+ logger.warning(
+ f"Null columns found in {label} centroids dataframe! Dropping:\n {null_rows}"
+ )
+ df = df.dropna()
+ df.index = df.index.astype(np.int32)
+ return df
+
+ origin_centroid_connectors = _clean_centroid_connectors(origin_centroid_connectors, "origin", edge_mapping)
+ dest_centroid_connectors = _clean_centroid_connectors(dest_centroid_connectors, "destination", edge_mapping)
+
+ origin_centroid_connectors = origin_centroid_connectors[settings.zone_level]
+ dest_centroid_connectors = dest_centroid_connectors[settings.zone_level]
+
+ return origin_centroid_connectors, dest_centroid_connectors
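Origin connectors are edges leaving a centroid node and destination connectors are edges entering one, found by merging the filtered centroid nodes onto the edge table. A toy sketch of the origin-side merge (column names are illustrative):

```python
import pandas as pd

# Node 100 is a centroid in TAZ 7; node 101 is an ordinary network node.
nodes = pd.DataFrame({"centroid": [True, False], "taz": [7, 0]}, index=[100, 101])
edges = pd.DataFrame({"fromNode": [100, 101], "toNode": [101, 100]}, index=[0, 1])

# Keep only zone centroids, then find edges that leave them
origin_conn = nodes[nodes.centroid & (nodes.taz > 0)].merge(
    edges, how="left", left_index=True, right_on="fromNode"
)
print(origin_conn["fromNode"].tolist())  # [100] -> edge 0 is the connector
```

The merged result carries the edge table's index, which is why the callers above can use `.index` directly as connector edge IDs.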
+
+
+def run_bike_route_choice(settings, logger):
+ """Main function to run the bike route choice model."""
+
+ # create bike network
+ nodes, edges, traversals = bike_net_reader.create_bike_net(settings, logger)
+
+ # Define centroids
+ origin_centroid_connectors, dest_centroid_connectors = get_centroid_connectors(settings, nodes, edges, logger)
+
+ trace_origins_edgepos = np.array(origin_centroid_connectors[origin_centroid_connectors.isin(settings.trace_origins)].index)
+ trace_dests_edgepos = np.array(dest_centroid_connectors[dest_centroid_connectors.isin(settings.trace_destinations)].index)
+
+ # drop zone IDs
+ origin_centroid_connectors = origin_centroid_connectors.index
+ dest_centroid_connectors = dest_centroid_connectors.index
+
+
+ logger.info(
+ f"Splitting {len(origin_centroid_connectors)} origins into {settings.number_of_batches} batches"
+ )
+ origin_centroid_batches = np.array_split(
+ origin_centroid_connectors, settings.number_of_batches
+ )
+
+ # run the bike route choice model in either single or multi-process mode
+ if settings.number_of_processors > 1:
+ # Split origin centroids into batches
+ final_paths = []
+ for i, origin_centroid_batch in enumerate(origin_centroid_batches):
+ logger.info(
+ f"Splitting batch {i} of {len(origin_centroid_batch)} origins into {settings.number_of_processors} processes"
+ )
+ origin_centroid_sub_batches = np.array_split(
+ origin_centroid_batch, settings.number_of_processors
+ )
+ with Pool(processes=settings.number_of_processors) as pool:
+ results = pool.starmap(
+ run_batch_traversals,
+ [
+ (
+ settings,
+ nodes,
+ edges.drop(columns='geometry'),
+ traversals,
+ origin_centroid_sub_batch,
+ dest_centroid_connectors,
+ trace_origins_edgepos,
+ trace_dests_edgepos,
+ None,
+ proc_id
+ )
+ for proc_id, origin_centroid_sub_batch in enumerate(origin_centroid_sub_batches)
+ ],
+ )
+ final_paths.extend(results)
+
+ else:
+ final_paths = []
+ for i, origin_centroid_batch in enumerate(origin_centroid_batches):
+ logger.info(
+ f"Processing batch {i+1} of {len(origin_centroid_batch)} origins"
+ )
+ results = run_batch_traversals(
+ settings=settings,
+ nodes=nodes,
+ edges=edges.drop(columns='geometry'),
+ traversals=traversals,
+ origin_centroids=origin_centroid_batch,
+ dest_centroids=dest_centroid_connectors,
+ trace_origin_edgepos=trace_origins_edgepos,
+ trace_dest_edgepos=trace_dests_edgepos,
+ logger=logger,
+ proc_id=0
+ )
+ final_paths.append(results)
+
+ final_paths_concat = tuple(map(np.concatenate,zip(*final_paths)))
+
+ logsums = pd.DataFrame(
+ {
+ "origin": final_paths_concat[0],
+ "destination": final_paths_concat[1],
+ "distance": final_paths_concat[2],
+ "logsum": final_paths_concat[3],
+ "path_size_sum": final_paths_concat[4],
+ "iterations": final_paths_concat[5]
+ }
+ )
+
+
+ # calculate replacement intrazonal distance values
+ diag_dists = logsums[logsums.origin!=logsums.destination].groupby('origin').distance.min() * 0.5
+ diags = diag_dists.to_frame()
+
+ # calculate replacement intrazonal logsum values
+
+ # FIXME this is repetitive - find a way to retrieve spec only once (already read in run_batch_traversals)
+ spec = read_file(settings, settings.edge_util_file).set_index('Label')
+
+ intrazonal_coeff = settings.intrazonal_coefficient
+
+ assert intrazonal_coeff in spec.index, f"Intrazonal logsum coefficient {intrazonal_coeff} missing from edge spec"
+
+ diags['logsum'] = diags.distance * spec.loc[intrazonal_coeff,"Coefficient"]
+
+ # indexing work
+ diags = diags.reset_index()
+ diags['destination'] = diags.origin
+ diags = diags.set_index(['origin','destination'])
+
+ diags['path_size_sum'] = 0
+ diags['iterations'] = 0
+
+ # replace the values in the logsums table
+ logsums = logsums.set_index(['origin','destination'])
+ logsums = diags.combine_first(logsums)
+ # logsums.loc[diags.index,'distance'] = diags.distance
+ # logsums.loc[diags.index,'logsum'] = diags.logsum
+ # logsums.loc[diags.index,'path_size_sum'] = np.nan
+
+ logsums.to_csv(f"{settings.output_path}/bike_route_choice_logsums.csv")
+
+ logsums['time'] = (logsums['distance'] / settings.bike_speed) * 60
+
+ logsums = logsums.merge(nodes[[settings.zone_level]], how='left', left_on='origin', right_index=True)
+ logsums = logsums.rename(columns={settings.zone_level: 'i'})
+ logsums = logsums.merge(nodes[[settings.zone_level]], how='left', left_on='destination', right_index=True)
+ logsums = logsums.rename(columns={settings.zone_level: 'j'})
+
+ if settings.zone_level == 'mgra':
+ logsums = logsums[(logsums.iterations == 0) | (logsums.iterations >= settings.min_iterations)]
+
+ logsums = logsums[['i','j','logsum','time']]
+
+ logsums.to_csv(settings.output_file_path,index=False)
+
+ # Save the final paths to a CSV file
+ if len(settings.trace_origins) > 0:
+ trace_paths = pd.DataFrame(
+ {
+ "origin": final_paths_concat[6],
+ "destination": final_paths_concat[7],
+ "iteration": final_paths_concat[8],
+ "prev_node": final_paths_concat[9],
+ "from_node": final_paths_concat[10],
+ "to_node": final_paths_concat[11],
+ "path_size": final_paths_concat[12],
+ "edge_cost": final_paths_concat[13],
+ "traversal_cost": final_paths_concat[14],
+ "edgeID":final_paths_concat[15],
+ }
+ )
+ trace_paths.to_csv(
+ f"{settings.output_path}/bike_route_choice_trace.csv", index=False
+ )
+
+ if settings.generate_shapefile:
+ logger.info("Generating shapefile...")
+
+
+ # attach edges to the paths
+ trace_paths = gpd.GeoDataFrame(
+ trace_paths.merge(
+ edges['geometry'],
+ left_on=['edgeID'],
+ right_index=True,
+ how='left'),
+ crs=settings.crs
+ )
+
+ logger.info("Writing shapefile...")
+ trace_paths.to_file(f"{settings.output_path}/bike_route_choice_trace.shp")
+
+
+ logger.info("Bike route choice model completed.")
+ return logsums
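The intrazonal replacement above uses `DataFrame.combine_first`, which keeps values from the calling frame and fills the remaining rows from the argument. A toy sketch on a two-pair MultiIndex:

```python
import pandas as pd

# Computed logsums for two OD pairs, including the intrazonal (1, 1) pair
logsums = pd.DataFrame(
    {"logsum": [-1.0, -2.0]},
    index=pd.MultiIndex.from_tuples([(1, 1), (1, 2)], names=["origin", "destination"]),
)
# Replacement intrazonal value derived from half the nearest interzonal distance
diags = pd.DataFrame(
    {"logsum": [-0.1]},
    index=pd.MultiIndex.from_tuples([(1, 1)], names=["origin", "destination"]),
)

# diags wins wherever it has a value; everything else falls through from logsums
merged = diags.combine_first(logsums)
print(merged.loc[(1, 1), "logsum"])  # -0.1
print(merged.loc[(1, 2), "logsum"])  # -2.0
```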
+
+
+if __name__ == "__main__":
+
+ start_time = time.time()
+
+ logger = logging.getLogger(__name__)
+ # can pass settings file as command line argument
+ if len(sys.argv) > 1:
+ settings_file = sys.argv[1]
+ else:
+ settings_file = "bike_route_choice_settings.yaml"
+ # load settings
+ settings = load_settings(logger,settings_file)
+
+ run_bike_route_choice(settings, logger)
+
+ end_time = time.time()
+ elapsed_time = end_time - start_time
+ logger.info(f"Total time taken: {elapsed_time:.2f} seconds")
diff --git a/src/asim/scripts/bike_route_choice/bike_route_choice_settings_mgra.yaml b/src/asim/scripts/bike_route_choice/bike_route_choice_settings_mgra.yaml
new file mode 100644
index 000000000..9dbc2c1de
--- /dev/null
+++ b/src/asim/scripts/bike_route_choice/bike_route_choice_settings_mgra.yaml
@@ -0,0 +1,54 @@
+# Path to bike network shapefiles
+node_file: SANDAG_Bike_Node.shp
+link_file: SANDAG_Bike_Net.shp
+
+# Data directory, optional additional place to look for data
+data_dir: ${path}\input
+
+# Path to bike route choice model output
+output_path: ${path}\output\bike\mgra
+output_file_path: ${path}\output\bikeMgraLogsum.csv
+
+# Edge utility specification file
+edge_util_file: bike_edge_utils.csv
+
+# Traversal utility specification file
+traversal_util_file: bike_traversal_utils.csv
+
+# Bike speed to determine bike times
+bike_speed: {mode-nonmotorized-bike-speed:}
+
+# can define a subset of zones to use for the model
+# this is useful for testing or if you only want to run the model for a specific area
+# zone_subset: [90, 5310, 5873, 8647, 5647, 3032]
+
+# whether to treat "mgra" or "taz" as the centroid zones
+zone_level: mgra
+max_dijkstra_utility: 25
+
+# how many different paths to build for each origin-destination pair
+# this is the number of times dijkstra's algorithm will be run
+number_of_iterations: 10
+
+# minimum number of iterations/paths. Zone pairs with fewer paths will be discarded
+min_iterations: 7
+
+# runtime settings
+number_of_batches: 16
+number_of_processors: 47
+
+generate_shapefile: True
+crs: epsg:2230
+
+# randomization settings
+random_scale_coef: 0.1
+random_scale_link: 0.2
+
+# whether to trace the edge utilities
+trace_bike_utilities: False
+
+# trace_origins: [90, 5310, 5873]
+# trace_destinations: [8647, 5647, 3032]
+
+read_cached_bike_net: False
+save_bike_net: False
\ No newline at end of file
diff --git a/src/asim/scripts/bike_route_choice/bike_route_choice_settings_taz.yaml b/src/asim/scripts/bike_route_choice/bike_route_choice_settings_taz.yaml
new file mode 100644
index 000000000..1cf20ed88
--- /dev/null
+++ b/src/asim/scripts/bike_route_choice/bike_route_choice_settings_taz.yaml
@@ -0,0 +1,54 @@
+# Path to bike network shapefiles
+node_file: SANDAG_Bike_Node.shp
+link_file: SANDAG_Bike_Net.shp
+
+# Data directory, optional additional place to look for data
+data_dir: ${path}\input
+
+# Path to bike route choice model output
+output_path: ${path}\output\bike\taz
+output_file_path: ${path}\output\bikeTazLogsum.csv
+
+# Edge utility specification file
+edge_util_file: bike_edge_utils.csv
+
+# Traversal utility specification file
+traversal_util_file: bike_traversal_utils.csv
+
+# Bike speed to determine bike times
+bike_speed: {mode-nonmotorized-bike-speed:}
+
+# can define a subset of zones to use for the model
+# this is useful for testing or if you only want to run the model for a specific area
+# zone_subset: [90, 5310, 5873, 8647, 5647, 3032]
+
+# whether to treat "mgra" or "taz" as the centroid zones
+zone_level: taz
+max_dijkstra_utility: 120
+
+# how many different paths to build for each origin-destination pair
+# this is the number of times dijkstra's algorithm will be run
+number_of_iterations: 10
+
+# minimum number of iterations/paths. Zone pairs with fewer paths will be discarded
+min_iterations: 0
+
+# runtime settings
+number_of_batches: 16
+number_of_processors: 47
+
+generate_shapefile: True
+crs: epsg:2230
+
+# randomization settings
+random_scale_coef: 0.1
+random_scale_link: 0.2
+
+# whether to trace the edge utilities
+trace_bike_utilities: False
+
+# trace_origins: [90, 5310, 5873]
+# trace_destinations: [8647, 5647, 3032]
+
+read_cached_bike_net: False
+save_bike_net: False
\ No newline at end of file
diff --git a/src/asim/scripts/bike_route_choice/bike_route_utilities.py b/src/asim/scripts/bike_route_choice/bike_route_utilities.py
new file mode 100644
index 000000000..140775165
--- /dev/null
+++ b/src/asim/scripts/bike_route_choice/bike_route_utilities.py
@@ -0,0 +1,286 @@
+import os
+import logging
+import yaml
+from pydantic import BaseModel, ValidationError
+import geopandas as gpd
+import pandas as pd
+import numpy as np
+import networkx as nx
+import matplotlib.pyplot as plt
+from typing import Literal
+
+
+class BikeRouteChoiceSettings(BaseModel):
+ """
+ Bike route choice settings
+ """
+
+ # path to bike network shapefiles
+ node_file: str = "SANDAG_Bike_Node.shp"
+ link_file: str = "SANDAG_Bike_Net.shp"
+
+ # data directory, optional additional place to look for data
+ data_dir: str = ""
+
+ # edge utility specification file
+ edge_util_file: str = "bike_edge_utils.csv"
+
+ # traversal utility specification file
+ traversal_util_file: str = "bike_traversal_utils.csv"
+
+ # coefficient to multiply intrazonal distance by to get logsum
+ intrazonal_coefficient: str = "util_res_nofacils"
+
+ bike_speed: float = 0
+
+ # path to bike route choice model output
+ output_path: str = "output"
+ output_file_path: str = r"output\bikeTazLogsum.csv"
+
+ # whether to trace edge and traversal utility calculations
+ trace_bike_utilities: bool = False
+
+ # runtime settings
+ number_of_batches: int = 1
+ number_of_processors: int = 1
+
+ # randomization settings
+ random_seed: int = 42
+ random_scale_coef: float = 0.5
+ random_scale_link: float = 0.7
+
+ # trace origin and destination pairs
+ trace_origins: list = []
+ trace_destinations: list = []
+
+ # whether to recreate Java attributes -- not needed, but here for backwards compatibility tests
+ recreate_java_attributes: bool = False
+
+ # maximum distance in miles for Dijkstra's algorithm to search for shortest paths
+ max_dijkstra_utility: float = 10
+
+ # caching options for bike network creation to save time
+ read_cached_bike_net: bool = False # will crash if network does not exist
+ save_bike_net: bool = True
+
+ # shapefile generation properties
+ generate_shapefile: bool = False
+ crs: str | None = None
+
+ # can define a subset of zones to use for the model
+ # this is useful for testing or if you only want to run the model for a specific area
+ zone_subset: int | list | None = None
+
+ # whether to treat mazs or tazs as the centroid zones
+ zone_level: Literal["taz", "mgra"] = "taz"
+
+ # how many different paths to build for each origin-destination pair
+ # this is the number of times dijkstra's algorithm will be run
+ number_of_iterations: int = 10
+
+ # minimum number of iterations/paths. Zone pairs with fewer paths will be discarded
+ min_iterations: int = 0
+
+
+def load_settings(
+ logger: logging.Logger, yaml_path: str = "bike_route_choice_settings.yaml",
+) -> BikeRouteChoiceSettings:
+ with open(yaml_path, "r") as f:
+ data = yaml.safe_load(f)
+ try:
+ settings = BikeRouteChoiceSettings(**data)
+ except ValidationError as e:
+ print("Settings validation error:", e)
+ raise
+
+ # ensure output path exists
+ if not os.path.exists(os.path.expanduser(settings.output_path)):
+ os.makedirs(os.path.expanduser(settings.output_path))
+ logger.info(f"Created output directory: {settings.output_path}")
+
+ # setup logger
+ log_file_location = os.path.join(os.path.expanduser(settings.output_path), "bike_model.log")
+ logging.basicConfig(
+ level=logging.INFO,
+ format="%(asctime)s - %(levelname)s - %(message)s",
+ handlers=[logging.FileHandler(log_file_location), logging.StreamHandler()],
+ )
+
+ # set random seed for reproducibility
+ np.random.seed(settings.random_seed)
+
+ return settings
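`load_settings` relies on pydantic to validate the YAML contents and coerce field types. A minimal sketch of the pattern with a hypothetical two-field model:

```python
import yaml
from pydantic import BaseModel

class DemoSettings(BaseModel):
    # Hypothetical fields mirroring the settings model's style
    bike_speed: float = 12.0
    number_of_iterations: int = 10

raw = yaml.safe_load("bike_speed: 15\nnumber_of_iterations: 5\n")
settings = DemoSettings(**raw)
assert settings.bike_speed == 15.0  # YAML int coerced to float by pydantic
```

Unknown or mistyped keys raise a `ValidationError` at load time rather than failing deep inside the model run.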
+
+
+def read_file(settings, file_path: str) -> gpd.GeoDataFrame | pd.DataFrame:
+ """
+ Read a shapefile or CSV file and return a GeoDataFrame or DataFrame
+
+ Looks for the file in a few places:
+ 1. The current working directory
+ 2. The directory of the script
+ 3. The data directory specified in the settings file
+ """
+
+ def return_file(path: str) -> gpd.GeoDataFrame | pd.DataFrame:
+ if path.endswith(".shp"):
+ return gpd.read_file(path)
+ elif path.endswith(".csv"):
+ return pd.read_csv(path, comment="#")
+ else:
+ raise ValueError(f"Unsupported file type: {path}")
+
+ # 1. Try current working directory
+ if os.path.exists(file_path):
+ return return_file(file_path)
+
+ # 2. Try directory of the script
+ script_dir = os.path.dirname(os.path.abspath(__file__))
+ script_path = os.path.join(script_dir, file_path)
+ if os.path.exists(script_path):
+ return return_file(script_path)
+
+ # 3. Try data directory
+ data_path = os.path.join(os.path.expanduser(settings.data_dir), file_path)
+ if os.path.exists(data_path):
+ return return_file(data_path)
+
+ raise FileNotFoundError(
+ f"File '{file_path}' not found in current directory, script directory, or data directory."
+ )
+
+
+def plot_shortest_path_with_results_buffered(
+ nodes, edges, shortest_path_df, origin, destination, iteration, buffer_size=None
+):
+ """Plot the shortest path between two nodes with additional path information within a square buffer around the origin node."""
+ print("Plotting the shortest path...")
+ path_data = shortest_path_df[
+ (shortest_path_df.origin == origin)
+ & (shortest_path_df.destination == destination)
+ & (shortest_path_df.iteration == iteration)
+ ]
+ if path_data.empty:
+ print(
+ f"No path found between {origin} and {destination} for iteration {iteration}"
+ )
+ return
+
+ # Get the coordinates of the origin node
+ origin_node = nodes[nodes["id"] == origin].iloc[0]
+ origin_x, origin_y = origin_node["x"], origin_node["y"]
+
+ if buffer_size:
+ # Define the buffer boundaries
+ min_x, max_x = origin_x - buffer_size, origin_x + buffer_size
+ min_y, max_y = origin_y - buffer_size, origin_y + buffer_size
+
+ # Filter nodes within the buffer
+ filtered_nodes = nodes[
+ (nodes["x"] >= min_x)
+ & (nodes["x"] <= max_x)
+ & (nodes["y"] >= min_y)
+ & (nodes["y"] <= max_y)
+ ]
+
+ # Filter edges to include only those with both nodes within the buffer
+ filtered_edges = edges[
+ edges["fromNode"].isin(filtered_nodes["id"])
+ & edges["toNode"].isin(filtered_nodes["id"])
+ ]
+
+ # check to make sure destination node is also in the buffer
+ if destination not in filtered_nodes["id"].values:
+ print(
+ f"Destination node {destination} is not in the buffer size of {buffer_size}"
+ )
+
+ else:
+ filtered_nodes = nodes
+ filtered_edges = edges
+
+ # Create a graph from the filtered nodes and edges
+ G = nx.Graph()
+ G.add_nodes_from(
+ [
+ (node["id"], {"pos": (node["x"], node["y"])})
+ for _, node in filtered_nodes.iterrows()
+ ]
+ )
+ G.add_edges_from(
+ [
+ (edge.fromNode, edge.toNode, {"weight": edge.distance})
+ for _, edge in filtered_edges.iterrows()
+ ]
+ )
+
+ # Extract positions for plotting
+ pos = nx.get_node_attributes(G, "pos")
+
+ # Plot the network
+ plt.figure(figsize=(10, 10))
+ nx.draw(
+ G, pos, node_size=10, with_labels=False, edge_color="gray", alpha=0.5, width=0.5
+ )
+
+ # Highlight the shortest path
+ path_edges = [
+ (path_data.from_node.iloc[i], path_data.to_node.iloc[i])
+ for i in range(len(path_data))
+ ]
+ path_nodes = set(path_data.from_node).union(set(path_data.to_node))
+ nx.draw_networkx_nodes(
+ G, pos, nodelist=path_nodes, node_color="red", node_size=50, label="Path Nodes"
+ )
+ nx.draw_networkx_edges(
+ G, pos, edgelist=path_edges, edge_color="blue", width=2, label="Shortest Path"
+ )
+
+ # Highlight the origin and destination nodes
+ nx.draw_networkx_nodes(
+ G, pos, nodelist=[origin], node_color="green", node_size=100, label="Origin"
+ )
+ nx.draw_networkx_nodes(
+ G,
+ pos,
+ nodelist=[destination],
+ node_color="purple",
+ node_size=100,
+ label="Destination",
+ )
+
+ # Calculate path information
+ num_edges = len(path_edges)
+ total_distance = path_data["distance"].sum()
+
+ # total_cost = path_data["cost_total"].sum()
+ # turns = path_data[path_data.turnType > 0]["turnType"].count()
+ path_size = path_data.iloc[0]["path_size"]
+ # Add path information to the plot
+ info_text = (
+ f"Origin: {origin}\n"
+ f"Destination: {destination}\n"
+ f"Iteration: {iteration}\n"
+ f"Number of Edges: {num_edges}\n"
+ f"Total Distance: {total_distance:.2f} units\n"
+ # f"Turns: {turns}\n"
+ # f"Total Cost: {total_cost:.2f}\n"
+ f"Path Size: {path_size:.2f}"
+ )
+
+ plt.text(
+ 0.05,
+ 0.95,
+ info_text,
+ transform=plt.gca().transAxes,
+ fontsize=12,
+ verticalalignment="top",
+ bbox=dict(boxstyle="round,pad=0.3", edgecolor="black", facecolor="white"),
+ )
+
+ # Add a legend
+ plt.legend(loc="upper right")
+ plt.title(
+ f"Shortest Path from Node {origin} to Node {destination} for iteration {iteration}"
+ )
+ plt.show(block=True)
diff --git a/src/asim/scripts/bike_route_choice/bike_threshold_calculator.py b/src/asim/scripts/bike_route_choice/bike_threshold_calculator.py
new file mode 100644
index 000000000..c958b08e6
--- /dev/null
+++ b/src/asim/scripts/bike_route_choice/bike_threshold_calculator.py
@@ -0,0 +1,148 @@
+import logging
+import sys
+import time
+import pandas as pd
+import matplotlib.pyplot as plt
+from bike_route_choice import load_settings, run_bike_route_choice
+
+def plot_results(results):
+ fig, ax1 = plt.subplots()
+ ax2 = ax1.twinx()
+    ax1.grid(True, which='both', axis='x', linestyle='-', color='#48484a', alpha=1, zorder=0)
+    ax1.grid(True, which='both', axis='y', linestyle='-', color='#006fa1', alpha=1, zorder=0)
+    ax2.grid(True, which='both', axis='y', linestyle='--', color='#f68b1f', alpha=0.8, zorder=0)
+    ax1.set_facecolor('#dcddde')
+    ax1.set_ylabel('99th %ile Distance (mi)', color='#006fa1')
+    ax1.tick_params(axis='y', labelcolor='#006fa1')
+
+    ax2.set_ylabel('Runtime (sec)', color='#f68b1f', rotation=-90, labelpad=15)
+    ax2.tick_params(axis='y', labelcolor='#f68b1f')
+
+    ax1.set_xlabel('Utility Threshold (utiles)')
+    dist_plot = ax1.scatter(results.index, results.dist, color='#006fa1', label='99th %ile Distance', zorder=3)
+    runtime_plot = ax2.scatter(results.index, results.runtime, color='#f68b1f', label='Runtime', zorder=3, marker='s', alpha=0.8)
+    ymin, ymax = ax2.get_ylim()
+    padding = 0.1*(ymax-ymin)
+    ax2.set_ylim(ymin-padding, ymax+padding)
+    legend = ax2.legend(handles=[dist_plot, runtime_plot], labels=['Distance', 'Runtime'], loc='upper left')
+ legend.get_frame().set_facecolor('white')
+ legend.get_frame().set_alpha(1)
+ legend.get_frame().set_edgecolor('#48484a')
+
+ plt.show()
+
+if __name__ == "__main__":
+    assert len(sys.argv) > 2, """Usage:
+    python bike_threshold_calculator.py settings_filepath target_distance [target_margin [percentile [max_iterations]]]
+
+ parameters:
+ settings filepath: path to YAML file containing bike model settings
+ target distance: the distance for which the search should aim (in miles)
+ target margin: the margin of error (< 1) allowed before termination (optional, default: 0.1)
+ percentile: the percentile of distance to compare against the target (optional, default: 0.99)
+ max iterations: the most bike model iterations that can be performed in the search (optional, default: 20)
+
+ examples:
+
+ python bike_threshold_calculator.py bike_route_choice_settings_taz.yaml 20
+    # the resulting 99th %ile distance must be w/in 10% of the 20-mile target distance
+ # equivalent to:
+ python bike_threshold_calculator.py bike_route_choice_settings_taz.yaml 20 0.1 0.99 20
+
+ python bike_threshold_calculator.py bike_route_choice_settings_mgra.yaml 3 0.05
+    # the resulting 99th %ile distance must be w/in 5% of the three-mile target distance
+ """
+
+ # pass settings file as command line argument
+
+ settings_file = sys.argv[1]
+
+ target_distance = float(sys.argv[2])
+
+ if len(sys.argv) > 3:
+ target_margin = float(sys.argv[3])
+ else:
+ target_margin = 10/100 # 10%
+
+ if len(sys.argv) > 4:
+ pctile = float(sys.argv[4])
+ else:
+ pctile = 0.99
+
+ if len(sys.argv) > 5:
+ max_iterations = int(sys.argv[5])
+ else:
+ max_iterations = 20
+
+ logger = logging.getLogger(__name__)
+
+ # load settings
+ settings = load_settings(logger, settings_file)
+
+ # first x data point
+ cur_threshold = settings.max_dijkstra_utility
+ upper_bound = None
+ lower_bound = None
+
+ distances = {}
+ times = {}
+
+ iteration = 0
+ while True:
+ logger.info(f"Running w/ threshold {cur_threshold}")
+ start_time = time.time()
+ output = run_bike_route_choice(settings, logger)
+ end_time = time.time()
+ elapsed_time = end_time - start_time
+
+ # record runtime
+ times[cur_threshold] = elapsed_time
+
+        # calculate the target percentile of distances
+        cur_distance = output.distance.quantile(pctile)
+        logger.info(f"{pctile:.0%} distance: {cur_distance}")
+
+ distances[cur_threshold] = cur_distance
+
+ # termination condition
+ if abs(cur_distance - target_distance)/target_distance < target_margin or iteration >= max_iterations:
+ break
+
+ else:
+ if cur_distance > target_distance:
+ # distance is too high, so reduce threshold
+
+ upper_bound = cur_threshold
+
+ # if we don't have a lower bound yet
+ if lower_bound is None:
+ # just halve the current threshold
+ cur_threshold = cur_threshold / 2
+
+ else:
+ # move halfway to the lower bound
+ cur_threshold = cur_threshold - (cur_threshold - lower_bound)/2
+
+
+ else: # cur_distance < target_distance
+ # distance is too low, so increase threshold
+
+ lower_bound = cur_threshold
+
+ # if we don't have an upper bound yet
+ if upper_bound is None:
+ # just double the current threshold
+ cur_threshold = cur_threshold * 2
+
+ else:
+ # move halfway to the upper bound
+ cur_threshold = cur_threshold + (upper_bound - cur_threshold)/2
+
+ settings.max_dijkstra_utility = cur_threshold
+
+    iteration += 1
+
+ results = pd.DataFrame({'dist':distances,'runtime':times})
+ results.index.name = 'threshold'
+
+ plot_results(results)
+
+ results.to_csv(settings.output_path+"/threshold_results.csv")
\ No newline at end of file
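Review note: the search loop in bike_threshold_calculator.py is a doubling/halving bisection on the utility threshold until the chosen distance percentile lands within the target margin. A self-contained sketch of just that search logic, with the expensive bike-model run replaced by a caller-supplied `evaluate` function (an illustrative stand-in, not the real `run_bike_route_choice`):

```python
def search_threshold(evaluate, initial, target, margin=0.1, max_iterations=20):
    """Find a threshold whose evaluated distance is within margin of target.

    evaluate(threshold) -> distance; assumed monotone increasing in threshold.
    """
    cur = initial
    lower = upper = None
    for _ in range(max_iterations):
        dist = evaluate(cur)
        if abs(dist - target) / target < margin:
            break  # close enough to the target distance
        if dist > target:
            # distance too high: tighten from above
            upper = cur
            cur = cur / 2 if lower is None else cur - (cur - lower) / 2
        else:
            # distance too low: loosen from below
            lower = cur
            cur = cur * 2 if upper is None else cur + (upper - cur) / 2
    return cur
```

With `evaluate = lambda t: t`, `search_threshold(evaluate, 64, 20)` walks 64 → 32 → 16 → 24 → 20 and stops.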
diff --git a/src/asim/scripts/bike_route_choice/bike_traversal_utils.csv b/src/asim/scripts/bike_route_choice/bike_traversal_utils.csv
new file mode 100644
index 000000000..d625e8160
--- /dev/null
+++ b/src/asim/scripts/bike_route_choice/bike_traversal_utils.csv
@@ -0,0 +1,23 @@
+Label,Description,Expression,Coefficient
+util_thru_centroid,Prevent taking shortcuts via centroid connectors,thruCentroid,-999.9
+# The below coefficients are adapted from Lukawska et al. (2023),,,
+util_yield,Biker must yield at junction,"@(df.signalExclRight | df.unlfrma | df.unlfrmi | df.unxma | df.unxmi)",-0.301
+util_downgrade,Biker has ROW but downgraded functional class,"@df.roadDowngrade & (~(df.signalExclRight | df.unlfrma | df.unlfrmi | df.unxma | df.unxmi))",-0.164
+# FIXME: do we really want to incentivize ALL nodes like this?
+# The above coefficients are adapted from Lukawska et al. (2023),,,
+##################################################################################
+# The below coefficients are from the ABM2 model and have dubious sourcing (see ABM2 documentation),,,
+# util_turns,Total turns,(turnType != 0),-0.083
+# util_traffic_signal,Traffic signals excluding right turns and T junctions,signalExclRight,-0.040
+# util_unsig_left_from_principal,Unsignalized left turns from principal arterials,unlfrma,-0.360
+# util_unsig_left_from_minor,Unsignalized left turns from minor arterials,unlfrmi,-0.150
+# util_unsig_cross_or_left_onto_principal,Unsignalized cross or left onto principal arterials,unxma,-0.48
+# util_unsig_cross_or_left_onto_minor,Unsignalized cross or left onto minor arterials,unxmi,-0.1
+# can't calculate path size here since we don't know the path yet,,
+# path size calculation is done in the code when calculating the logsum,,
+# util_log_path_size,log of path size,logPathSize,1.0
+# The above coefficients are from the ABM2 model and have dubious sourcing (see ABM2 documentation),,,
+##################################################################################
+# The below traffic signal coefficient is from Meister et al. (2024)
+# util_traffic_signal,Traffic signals excluding right turns and T junctions,signalExclRight,-0.03
+# The above traffic signal coefficient is from Meister et al. (2024)
\ No newline at end of file
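Review note: bike_traversal_utils.csv mixes active rows with `#`-prefixed comment rows that do not all share the same field count. A hypothetical sketch of how such a spec could be consumed — comment rows skipped, each remaining row contributing Expression value times Coefficient. The real loader lives in bike_route_choice.py and may differ (e.g. it presumably evaluates `@`-prefixed expressions against a DataFrame; this sketch does simple column lookup only).

```python
import csv
import io

def total_utility(spec_text, row):
    """Sum Expression * Coefficient over non-comment rows of a utility spec."""
    util = 0.0
    for rec in csv.DictReader(io.StringIO(spec_text)):
        if rec["Label"].startswith("#"):
            continue  # comment row; field count may be irregular, safely skipped
        value = row.get(rec["Expression"], 0)  # simple column lookup only
        util += float(value) * float(rec["Coefficient"])
    return util
```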
diff --git a/src/main/emme/toolbox/master_run.py b/src/main/emme/toolbox/master_run.py
index 5b6071a66..64bdea0c3 100644
--- a/src/main/emme/toolbox/master_run.py
+++ b/src/main/emme/toolbox/master_run.py
@@ -251,6 +251,7 @@ def __call__(self, main_directory, scenario_id, scenario_title, emmebank_title,
manage_settings(_join(main_directory, "conf", "sandag_abm.properties"))
manage_settings(_join(main_directory, "src", "asim", "configs"))
manage_settings(_join(main_directory, "src", "asim-cvm", "configs"))
+ manage_settings(_join(main_directory, "src", "asim", "scripts", "bike_route_choice"))
props = load_properties(_join(main_directory, "conf", "sandag_abm.properties"))
props.set_year_specific_properties(_join(main_directory, "input", "parametersByYears.csv"))
@@ -528,8 +529,14 @@ def __call__(self, main_directory, scenario_id, scenario_title, emmebank_title,
import_demand(omx_file, "TRUCK", period, base_scenario)
if not skipBikeLogsums:
+            for file_name in ['bike_route_choice_settings_mgra.yaml', 'bike_route_choice_settings_taz.yaml']:
+                with open(_join(self._path, r'src\asim\scripts\bike_route_choice', file_name), 'r') as file:
+                    settings = file.read()
+                settings = settings.replace(r"${path}", self._path)
+                with open(_join(self._path, r'src\asim\scripts\bike_route_choice', file_name), 'w') as file:
+                    file.write(settings)
self.run_proc("runSandagBikeLogsums.cmd", [drive, path_forward_slash],
- "Bike - create AT logsums and impedances")
+ "Bike - create AT logsums and impedances", capture_output=True)
# Copy updated logsums to scenario input to avoid overwriting
self.copy_files(["bikeMgraLogsum.csv", "bikeTazLogsum.csv"], output_dir, input_dir)
elif not os.path.exists(_join(output_dir, "bikeMgraLogsum.csv")):
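Review note: the hunk above rewrites the bike settings YAML files in place, substituting `${path}` with the project path via `str.replace`. An equivalent stdlib sketch using `string.Template`, which only touches well-formed `${...}` placeholders (this assumes `${path}` is the only placeholder in those files):

```python
from string import Template

def fill_path(text, project_path):
    # safe_substitute leaves any unknown placeholders untouched
    return Template(text).safe_substitute(path=project_path)
```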
diff --git a/src/main/resources/runSandagBikeLogsums.cmd b/src/main/resources/runSandagBikeLogsums.cmd
index da7e82dc3..282e4d66f 100644
--- a/src/main/resources/runSandagBikeLogsums.cmd
+++ b/src/main/resources/runSandagBikeLogsums.cmd
@@ -1,45 +1,13 @@
-rem @echo off
+@ECHO OFF
set PROJECT_DRIVE=%1
set PROJECT_DIRECTORY=%2
%PROJECT_DRIVE%
-cd %PROJECT_DRIVE%%PROJECT_DIRECTORY%
-call %PROJECT_DIRECTORY%\bin\CTRampEnv.bat
+cd /d %PROJECT_DIRECTORY%
-rem ### First save the JAVA_PATH environment variable so it s value can be restored at the end.
-set OLDJAVAPATH=%JAVA_PATH%
-
-rem ### Set the directory of the jdk version desired for this model run
-set JAVA_PATH=%JAVA_64_PATH%
-
-rem ### Name the project directory. This directory will hava data and runtime subdirectories
-set RUNTIME=%PROJECT_DIRECTORY%
-set CONFIG=%RUNTIME%/conf
-
-set JAR_LOCATION=%PROJECT_DIRECTORY%/application
-set LIB_JAR_PATH=%JAR_LOCATION%\*
-
-rem ### Define the CLASSPATH environment variable for the classpath needed in this model run.
-set CLASSPATH=%CONFIG%;%RUNTIME%;%LIB_JAR_PATH%;
-
-rem ### Save the name of the PATH environment variable, so it can be restored at the end of the model run.
-set OLDPATH=%PATH%
-
-rem ### Change the PATH environment variable so that JAVA_HOME is listed first in the PATH.
-rem ### Doing this ensures that the JAVA_HOME path we defined above is the on that gets used in case other java paths are in PATH.
-set PATH=%JAVA_PATH%\bin;%OLDPATH%
-
-%PROJECT_DRIVE%
-cd %PROJECT_DRIVE%%PROJECT_DIRECTORY%
-
-rem rem build bike logsums
-%JAVA_64_PATH%\bin\java -showversion -server -Xmx%MEMORY_BIKELOGSUM_MAX% -cp "%CLASSPATH%" -Dlog4j.configuration=log4j.xml -Dproject.folder=%PROJECT_DIRECTORY% org.sandag.abm.active.sandag.SandagBikePathChoiceLogsumMatrixApplication %PROPERTIES_NAME%
-if ERRORLEVEL 1 goto DONE
-
-:done
-rem ### restore saved environment variable values, and change back to original current directory
-set JAVA_PATH=%OLDJAVAPATH%
-set PATH=%OLDPATH%
+CALL %activate_uv_asim%
+python src/asim/scripts/bike_route_choice/bike_route_choice.py src/asim/scripts/bike_route_choice/bike_route_choice_settings_mgra.yaml || exit /b 2
+python src/asim/scripts/bike_route_choice/bike_route_choice.py src/asim/scripts/bike_route_choice/bike_route_choice_settings_taz.yaml || exit /b 2
\ No newline at end of file