Fix the shred logic in parquet-variant #10145

Description

@klion26

Describe the bug

Currently, shred_variant and variant_get share the same VariantToArrowRowBuilder, but shredding and getting should not always follow the same logic, even now that the cast behavior has been aligned with arrow-cast. For example:

  • bool -> int must not be cast in the shred code path
  • bool -> int can be cast in the get code path

To Reproduce

The following test should pass

// Assumes the imports already available in the parquet-variant-compute test module.
#[test]
fn test_variant_shred() {
    let mut array_builder = VariantArrayBuilder::new(30);

    array_builder.append_value(Variant::Int8(1));
    array_builder.append_value(Variant::BooleanTrue);

    let array = array_builder.build();
    let int8_shred_array = shred_variant(&array, &DataType::Int8).unwrap();
    let value = int8_shred_array.inner().column_by_name("value").unwrap();
    let typed_value = int8_shred_array.inner().column_by_name("typed_value").unwrap();

    // Variant::BooleanTrue (row 1) should not be shredded:
    // it stays in `value` and leaves `typed_value` null.
    assert!(value.is_valid(1));
    assert!(typed_value.is_null(1));
}

Expected behavior

Shredding must not apply cast logic (such as bool -> int*), but it may widen within a type family (such as int8 -> int16).
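
To make the distinction concrete, here is a minimal, self-contained sketch of the two policies. It does not use the parquet-variant API: the `Value` enum and the functions `get_as_i16` and `shred_as_i16` are hypothetical stand-ins, and the real fix may be structured quite differently.

```rust
/// Hypothetical stand-in for a few Variant cases; not the parquet-variant type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Value {
    Bool(bool),
    Int8(i8),
    Int16(i16),
}

/// Get-like path: cast-style conversions are acceptable, so bool -> int works.
fn get_as_i16(v: Value) -> Option<i16> {
    match v {
        Value::Bool(b) => Some(i16::from(b)),
        Value::Int8(i) => Some(i16::from(i)),
        Value::Int16(i) => Some(i),
    }
}

/// Shred-like path: only lossless widening within the integer family.
/// `None` means the row is not shredded and stays in the untyped `value` column.
fn shred_as_i16(v: Value) -> Option<i16> {
    match v {
        Value::Int8(i) => Some(i16::from(i)),
        Value::Int16(i) => Some(i),
        Value::Bool(_) => None,
    }
}

fn main() {
    let rows = [Value::Int8(1), Value::Bool(true), Value::Int16(300)];
    for v in rows {
        println!("{:?}: get={:?} shred={:?}", v, get_as_i16(v), shred_as_i16(v));
    }
}
```

Under this sketch, a boolean row is readable as an integer through the get-like path but is rejected by the shred-like path, while int8 is widened to int16 in both.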

Additional context

No response
