
[SYNPY-1749] Allow quote, apostrophe and ellipsis in store_row_async #1316

Open

danlu1 wants to merge 14 commits into develop from SYNPY-1749-allow-quote-apostrophe-in-store-rows

Conversation

@danlu1 (Contributor) commented Feb 9, 2026

Problem:

A JSON serialization issue occurs when a DataFrame passed to store_row_async contains a list or dictionary with strings that include both double quotes and apostrophes.
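A minimal illustration of the failure mode (the data here is hypothetical, not taken from the PR): when a list mixes both quote characters, Python's default str() rendering uses mixed quote delimiters, which is not valid JSON.

```python
import json

# Hypothetical cell value: strings containing both double quotes and apostrophes.
value = ['she said "hi"', "it's fine"]

# str() renders the list with mixed ' and " delimiters -- not parseable as JSON.
rendered = str(value)

try:
    json.loads(rendered)
except json.JSONDecodeError as err:
    print("not valid JSON:", err)
```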

Solution:

Restrict to_csv_kwargs in get_partial_dataframe_chunk, which is the final step before uploading the data to Synapse to obtain the file handle.
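As a rough sketch of the effect (the settings mirror the diff shown later in this review; the sample data is hypothetical): with doublequote=False, escapechar="\\", and quoting=0 (QUOTE_MINIMAL), a cell that mixes both quote characters survives a CSV round trip intact.

```python
import io

import pandas as pd

# Hypothetical DataFrame whose cell mixes double quotes and apostrophes.
df = pd.DataFrame({"col": [['she said "hi"', "it's fine"]]})

# Write with the restricted settings from this PR's diff: the quote
# character is escaped with a backslash instead of being doubled.
buffer = io.StringIO()
df.to_csv(
    buffer,
    index=False,
    doublequote=False,
    escapechar="\\",
    quoting=0,
)

# Reading back with the matching settings recovers the original cell text.
buffer.seek(0)
round_trip = pd.read_csv(buffer, doublequote=False, escapechar="\\")
assert round_trip["col"][0] == str(df["col"][0])
```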

Testing:

Unit test and integration test have been added.

@danlu1 danlu1 requested a review from a team as a code owner February 9, 2026 20:01
@andrewelamb (Contributor) commented:

@danlu1 Is this still WIP, or are you looking for reviews?

@danlu1 (Contributor, Author) commented Feb 10, 2026

@andrewelamb sorry, I should have marked this as a draft.

@danlu1 danlu1 marked this pull request as draft February 10, 2026 00:48
@danlu1 danlu1 marked this pull request as ready for review February 18, 2026 18:36
@danlu1 (Contributor, Author) commented Feb 18, 2026

The integration test failures are in the recordset and submission modules and do not appear to be related to my changes.

@linglp (Contributor) left a comment:

Overall I think it looks good. A few suggestions: the tests can be consolidated to cover all the edge cases in fewer integration tests, which should improve performance; the docstring should be updated to reflect the new state of the code now that json.dumps() has been removed; the redundant check where sample_values is created but never actually used can be simplified away; and the function name could be more descriptive of what it actually does now (sanitizing special values rather than just converting dtypes).

@danlu1 danlu1 requested a review from linglp February 24, 2026 00:15

import pandas as pd
import pytest
from pandas.api.types import is_object_dtype

Contributor commented:

I think if you merge develop into this branch, you'll see that we are no longer adding tests to the synchronous folder.

Member commented:

That's right. Sorry for the merge conflicts you'll need to deal with, @danlu1.

@@ -141,6 +141,7 @@ def row_labels_from_rows(rows: List[Row]) -> List[Row]:
def convert_dtypes_to_json_serializable(df):

Contributor commented:

Nit: add return type hints

to_csv_kwargs: Additional arguments to pass to the `pd.DataFrame.to_csv`
function when writing the data to a CSV file.
"""
# Serializes dict/list values to JSON strings

Contributor commented:

I think this comment is misleading: the function no longer converts dict/list values to JSON strings. In your function, the dict/list values remain dict/list objects.
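To illustrate the point above (with hypothetical data, not the PR's test fixtures): dict/list cells stay Python objects in the DataFrame, so what reaches the CSV is their str() form, not what json.dumps() would have produced.

```python
import json

import pandas as pd

# Hypothetical object-dtype column holding a dict cell.
df = pd.DataFrame({"col": [{"a": 1, "b": [2, 3]}]})

cell = df["col"][0]
assert isinstance(cell, dict)  # still a dict, not a pre-serialized string

# json.dumps() would emit double-quoted keys; str() emits Python repr.
assert json.dumps(cell) == '{"a": 1, "b": [2, 3]}'
assert str(cell) == "{'a': 1, 'b': [2, 3]}"
```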

Comment on lines +67 to +75
df.iloc[offset_start:end].to_csv(
buffer,
mode="a",
header=False,
index=False,
float_format="%.12g",
doublequote=False,
escapechar="\\",
quoting=0,

Member commented:

Should this instead be a passthrough for the doublequote and escapechar settings that the user defined when they called the row upsert or store method(s)? I see that they are currently used to go from CSV -> DataFrame, but we are not using them to go from DataFrame -> CSV (and we probably should, in some capacity). We could then default them as we do today if they are not passed in, but it would allow a user to specify them.

@danlu1 (Contributor, Author) replied:

I tried implementing a passthrough when calling the store method, but it didn't work. The likely root cause is a bug: to_csv_kwargs isn't being passed into _stream_and_update_from_df, so downstream steps never get a chance to use it. I added the parameters here because Synapse only allows a limited set of options, similar to how we handled float_format. That said, the ideal solution is to support to_csv_kwargs directly for DataFrames, so I'm going to add that support.
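A minimal sketch of what such a passthrough could look like: the helper name, the allowlist contents, and the defaults below are all hypothetical illustrations, not the PR's actual implementation.

```python
# Hypothetical allowlist of to_csv options Synapse can accept;
# everything here (names and defaults) is illustrative only.
ALLOWED_TO_CSV_KWARGS = {"doublequote", "escapechar", "quoting", "float_format"}

DEFAULT_TO_CSV_KWARGS = {
    "float_format": "%.12g",
    "doublequote": False,
    "escapechar": "\\",
    "quoting": 0,
}


def merge_to_csv_kwargs(user_kwargs: dict) -> dict:
    """Drop unsupported keys, then let user values override the defaults."""
    filtered = {k: v for k, v in user_kwargs.items() if k in ALLOWED_TO_CSV_KWARGS}
    return {**DEFAULT_TO_CSV_KWARGS, **filtered}


# A user-supplied escapechar survives; an unsupported key is dropped.
merged = merge_to_csv_kwargs({"escapechar": "|", "sep": "\t"})
assert merged["escapechar"] == "|"
assert "sep" not in merged
assert merged["quoting"] == 0
```

The allowlist-with-defaults shape keeps today's behavior when nothing is passed in while still letting callers override the escaping settings.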

@BryanFauble (Member) left a comment:

This is looking great, once we get the last few items handled (and develop merged in), I can approve!
