Schema is not used for a DataFrame created from a pandas DataFrame #4142

@teja0909


Please answer these questions before submitting your issue. Thanks!

  1. What version of Python are you using?

Python 3.10

  2. What operating system and processor architecture are you using?

    Windows-10-10.0.26200-SP0

  3. What are the component versions in the environment (pip freeze)?

snowflake-snowpark-python==1.47.0

  4. What did you do?

    import pandas as pd

    # `session` is an existing snowflake.snowpark.Session
    pdf = pd.DataFrame({
        "id": [1, 2, 3],
        "name": ["Alice", "Bob", "Charlie"],
        "age": [25, 30, 28],
    })

    df4 = session.create_dataframe(pdf)
    df4_1 = session.create_dataframe(pdf, schema=['uid', 'uname', 'uage'])
    df4_2 = session.create_dataframe(pdf, schema=['uid', 'uname', 'Agg'])

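
One way to make the observed behavior concrete is to compare the resulting column names against the requested schema. The sketch below is only an illustration: `schema_mismatches` and `normalize` are hypothetical helper names, and stripping double quotes and ignoring case is an assumption about how Snowpark may report identifiers (for example `"id"` versus `ID`). `DataFrame.columns` is the existing Snowpark property used to read the names.

```python
from typing import List, Sequence, Tuple


def normalize(name: str) -> str:
    # Assumption: identifiers may come back quoted and/or upper-cased,
    # so drop surrounding double quotes and compare case-insensitively.
    return name.strip('"').casefold()


def schema_mismatches(
    actual: Sequence[str], expected: Sequence[str]
) -> List[Tuple[int, str, str]]:
    """Return (position, actual, expected) for every column whose name differs."""
    if len(actual) != len(expected):
        raise ValueError(
            f"column count differs: {len(actual)} actual vs {len(expected)} expected"
        )
    return [
        (i, a, e)
        for i, (a, e) in enumerate(zip(actual, expected))
        if normalize(a) != normalize(e)
    ]


# Usage with the objects from this report (needs a live session):
#   print(schema_mismatches(df4_1.columns, ["uid", "uname", "uage"]))
#   print(schema_mismatches(df4_2.columns, ["uid", "uname", "Agg"]))
```

An empty list would mean the schema was applied; any entries identify the columns that kept their pandas names.
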
  5. What did you expect to see?
    For df4_1, I expected the column names to be uid, uname, and uage (likewise uid, uname, and Agg for df4_2). However, the resulting DataFrame retains the original column names from the pandas DataFrame (pdf). This indicates that the provided schema is not applied when creating a Snowpark DataFrame from a pandas DataFrame.
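
Until the behavior is clarified, a possible workaround is to rename the pandas columns before handing the frame to Snowpark. This is a minimal sketch and not a fix for the underlying issue; `build_rename_map` is a hypothetical helper, and the approach assumes the schema names are meant to map onto the pandas columns by position. It relies only on the existing pandas `DataFrame.rename(columns=...)` call.

```python
from typing import Dict, Sequence


def build_rename_map(current: Sequence[str], desired: Sequence[str]) -> Dict[str, str]:
    """Map existing column names to the desired names, matched by position."""
    if len(current) != len(desired):
        raise ValueError(
            f"expected {len(current)} column names, got {len(desired)}"
        )
    return dict(zip(current, desired))


# Usage with the objects from this report (needs pandas and a live session):
#   mapping = build_rename_map(list(pdf.columns), ["uid", "uname", "uage"])
#   df4_1 = session.create_dataframe(pdf.rename(columns=mapping))
```

Raising on a length mismatch avoids silently leaving trailing columns unrenamed, which `zip` alone would do.
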

  6. Can you set logging to DEBUG and collect the logs?

    import logging
    
    for logger_name in ('snowflake.snowpark', 'snowflake.connector'):
        logger = logging.getLogger(logger_name)
        logger.setLevel(logging.DEBUG)
        ch = logging.StreamHandler()
        ch.setLevel(logging.DEBUG)
        ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
        logger.addHandler(ch)
    

Labels: status-triage_done (Initial triage done, will be further handled by the driver team)