read_from_stream function should use input_df from the parameters #38

@KrisSimon

The following function in cmd10:

def read_from_stream(input_df: DataFrame) -> DataFrame:
    ### YOUR CODE HERE
    raw_stream_data = (
        spark.readStream.format("rate")
        .option("rowsPerSecond", 10)
        .load()
    )
    ###


    # This is just data setup, not part of the exercise
    return raw_stream_data.\
        join(mock_data_df, raw_stream_data.value == mock_data_df.index, 'left').\
        drop("timestamp").\
        drop("index")


df = read_from_stream(mock_data_df)

The function should use its input_df parameter instead of the global mock_data_df.
However, if the input parameter is used, the tests fail.
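
For reference, here is a minimal sketch of the change being requested. It assumes that spark is the notebook's global SparkSession (as in the original cell) and that input_df has an index column like mock_data_df does. It only shows the join going through the parameter; it does not explain or fix the test failures mentioned above.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported for type hints only, so the sketch does not need pyspark at import time.
    from pyspark.sql import DataFrame


def read_from_stream(input_df: "DataFrame") -> "DataFrame":
    # Assumption: `spark` is the SparkSession global provided by the notebook.
    raw_stream_data = (
        spark.readStream.format("rate")
        .option("rowsPerSecond", 10)
        .load()
    )

    # Data setup: join against the DataFrame passed in, not the global mock_data_df.
    return (
        raw_stream_data
        .join(input_df, raw_stream_data.value == input_df.index, "left")
        .drop("timestamp")
        .drop("index")
    )
```

The call site stays the same: df = read_from_stream(mock_data_df).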
