correction and split agreement into 2 functions #94
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -122,7 +122,7 @@ def histo( | |
| offset = i * bar_width.total_seconds() / 86400 | ||
|
|
||
| bar_kwargs = { | ||
| "width": bar_width.total_seconds() / 86400, | ||
| "width": (bar_width.total_seconds() / 86400), | ||
|
Collaborator
Why the parentheses here?
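To illustrate the point of this comment: wrapping a single division in parentheses does not change its value, so the added parentheses are purely cosmetic. The sketch below uses the standard library `timedelta` and a made-up 6-hour `bar_width` as stand-ins; the PR itself works with a pandas `Timedelta`.

```python
from datetime import timedelta

# Hypothetical bar width, chosen only for illustration.
bar_width = timedelta(hours=6)

# Width expressed in days, without and with the extra parentheses.
plain = bar_width.total_seconds() / 86400
wrapped = (bar_width.total_seconds() / 86400)

# Both expressions evaluate to the same float.
print(plain == wrapped)  # True
print(plain)  # 0.25
```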
||
| "align": "edge", | ||
| "edgecolor": "black", | ||
| "color": color[i], | ||
|
|
@@ -469,16 +469,11 @@ def wrap_text(text: str) -> str: | |
| ax.set_xticklabels(new_labels, rotation=0) | ||
|
|
||
|
|
||
| def agreement( | ||
| def count_detections_within_timeframe( | ||
| df: DataFrame, | ||
| bin_size: Timedelta | BaseOffset, | ||
| ax: plt.Axes, | ||
| ) -> None: | ||
| """Compute and visualise agreement between two annotators. | ||
|
|
||
| This function compares annotation timestamps from two annotators over a time range. | ||
| It also fits and plots a linear regression line and displays the coefficient | ||
| of determination (R²) on the plot. | ||
| ) -> DataFrame: | ||
| """Counts the number of detections in df within bin_size timeframe. | ||
|
Collaborator
Suggested first line: `"""Count the number of detections in an APLOSE dataframe within bin_size time bin.` Ruff D401: the first line of a docstring should be in the imperative mood.
||
|
|
||
| Parameters | ||
| ---------- | ||
|
|
@@ -489,8 +484,10 @@ def agreement( | |
| bin_size : Timedelta | BaseOffset | ||
| The size of each time bin for aggregating annotation timestamps. | ||
|
|
||
| ax : matplotlib.axes.Axes | ||
| Matplotlib axes object where the scatterplot and regression line will be drawn. | ||
| Returns | ||
| ------- | ||
| df_hist: Dataframe with columns = annotators and lines = number of detections | ||
|
||
| within the timebin defined by bin_size | ||
|
|
||
| """ | ||
| labels, annotators = get_labels_and_annotators(df) | ||
|
|
@@ -505,28 +502,55 @@ def agreement( | |
| ] | ||
|
|
||
| # scatter plot | ||
| n_annot_max = bin_size.total_seconds() / df["end_time"].iloc[0] | ||
|
|
||
|
Collaborator
Delete this empty line.
||
| freq = ( | ||
| bin_size if isinstance(bin_size, Timedelta) else str(bin_size.n) + bin_size.name | ||
| ) | ||
|
|
||
| bins = date_range( | ||
| start=df["start_datetime"].min().floor(bin_size), | ||
| end=df["start_datetime"].max().ceil(bin_size), | ||
| end=df["end_datetime"].max().ceil(bin_size), | ||
| freq=freq, | ||
| ) | ||
|
|
||
| df_hist = ( | ||
| return ( | ||
| DataFrame( | ||
| { | ||
| annotators[0]: histogram(datetimes[0], bins=bins)[0], | ||
| annotators[1]: histogram(datetimes[1], bins=bins)[0], | ||
| }, | ||
| ) | ||
| / n_annot_max | ||
| ) | ||
|
|
||
|
|
||
| def plot_agreement( | ||
| df: DataFrame, | ||
| bin_size: Timedelta | BaseOffset, | ||
| ax: plt.Axes, | ||
| ) -> None: | ||
| """Compute and visualise agreement between two annotators. | ||
|
|
||
| This function compares annotation timestamps from two annotators over a time range. | ||
| It also fits and plots a linear regression line and displays the coefficient | ||
| of determination (R²) on the plot. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| df : DataFrame | ||
| APLOSE-formatted DataFrame. | ||
| It must contain the annotations of two annotators. | ||
|
|
||
| bin_size : Timedelta | BaseOffset | ||
| The size of each time bin for aggregating annotation timestamps. | ||
|
|
||
| ax : matplotlib.axes.Axes | ||
| Matplotlib axes object where the scatterplot and regression line will be drawn. | ||
|
|
||
|
|
||
|
|
||
| """ | ||
| labels, annotators = get_labels_and_annotators(df) | ||
| df_hist = count_detections_within_timeframe(df, bin_size) | ||
| scatterplot(data=df_hist, x=annotators[0], y=annotators[1], ax=ax) | ||
|
|
||
| coefficients = polyfit(df_hist[annotators[0]], df_hist[annotators[1]], 1) | ||
|
|
||
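The split in this PR separates counting from plotting: one function returns per-bin counts, the other consumes them. As a rough, dependency-free sketch of the counting half only (not the PR's implementation, which relies on `date_range` and `histogram`), the function name `count_per_bin`, the `origin` argument, and the sample timestamps below are all hypothetical:

```python
from collections import Counter
from datetime import datetime, timedelta


def count_per_bin(timestamps, bin_size, origin):
    """Count timestamps per bin of width bin_size, with bin 0 starting at origin."""
    counts = Counter()
    for ts in timestamps:
        # Integer division of two timedeltas gives the bin index.
        counts[(ts - origin) // bin_size] += 1
    return counts


origin = datetime(2024, 1, 1)
bin_size = timedelta(hours=1)

# Made-up detection start times for two annotators.
annotator_a = [datetime(2024, 1, 1, 0, 10), datetime(2024, 1, 1, 0, 40), datetime(2024, 1, 1, 1, 5)]
annotator_b = [datetime(2024, 1, 1, 0, 20), datetime(2024, 1, 1, 2, 30)]

counts_a = count_per_bin(annotator_a, bin_size, origin)
counts_b = count_per_bin(annotator_b, bin_size, origin)

# One (a, b) pair per bin; a Counter returns 0 for bins with no detection.
n_bins = 3
pairs = [(counts_a[i], counts_b[i]) for i in range(n_bins)]
print(pairs)  # [(2, 1), (1, 0), (0, 1)]
```

A plotting function can then take `pairs` (or the equivalent DataFrame) as input, which keeps the counting step testable without any figure.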
Make sure you delete all notebook outputs before you push a notebook to the repo.
Also, I think this file has nothing to do with your edits, so you should take it out of your PR.
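For the notebook outputs, one option is `jupyter nbconvert --clear-output --inplace <notebook>`. If a scripted approach is preferred, here is a minimal sketch that relies only on the fact that an `.ipynb` file is JSON; the function names and the sample notebook dict are hypothetical, not part of this repo.

```python
import json
from pathlib import Path


def clear_outputs(notebook):
    """Empty the outputs and execution counts of every code cell, in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_outputs_in_file(path):
    """Load a notebook file, clear its outputs, and write it back."""
    path = Path(path)
    notebook = json.loads(path.read_text(encoding="utf-8"))
    path.write_text(json.dumps(clear_outputs(notebook), indent=1) + "\n", encoding="utf-8")


# In-memory example with one executed code cell and one markdown cell.
sample = {
    "cells": [
        {"cell_type": "code", "source": "1 + 1", "outputs": [{"output_type": "execute_result"}], "execution_count": 3},
        {"cell_type": "markdown", "source": "Some notes"},
    ]
}
cleaned = clear_outputs(sample)
print(cleaned["cells"][0]["outputs"])  # []
```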