Bug: Arkouda-backed Series creates NumPy RangeIndex #5410

@ajpotts

Summary

When constructing a pandas Series backed by an Arkouda
ExtensionArray, pandas automatically creates a default RangeIndex
backed by NumPy.

This silently materializes the index on the client, breaking scalability
for very large arrays.

Problem

Calling:

pd.Series(arkouda_extension_array)

creates a NumPy-backed RangeIndex when index=None.

For large Arkouda arrays, creating a large NumPy index:

  • Uses client memory
  • Breaks distributed semantics
  • May be impossible for very large datasets
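The default-index behavior can be seen without an Arkouda server. The snippet below is only an illustration: it uses a pandas nullable `Int64` array as a stand-in for the Arkouda-backed ExtensionArray, on the assumption that any ExtensionArray takes the same `index=None` path in the Series constructor.

```python
import numpy as np
import pandas as pd

# Stand-in for an Arkouda-backed ExtensionArray (assumption: the Series
# constructor treats every ExtensionArray the same way when index=None).
values = pd.array([10, 20, 30], dtype="Int64")

s = pd.Series(values)  # no index supplied

# pandas supplies a client-side RangeIndex as the default index.
print(isinstance(s.index, pd.RangeIndex))

# Asking the index for its values yields a NumPy array in client memory.
print(isinstance(s.index.to_numpy(), np.ndarray))
```

Both checks print `True`. For three elements this is harmless, but the index lives on the client, and any operation that needs its values produces a NumPy array of the same length as the server-side data.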

Expected Behavior

If no index is provided, the default index should be constructed on the
Arkouda server (e.g., using ak.arange(n)), ensuring the entire Series
remains Arkouda-backed.

Fix

Construct the default index using Arkouda and wrap it in an
ArkoudaExtensionArray instead of relying on pandas' default
RangeIndex.
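A minimal sketch of the idea, assuming the fix amounts to passing an explicit, extension-backed index instead of leaving `index=None`. `series_with_default_index` and `make_default_index` are hypothetical names, not pandas or Arkouda API, and a pandas `Int64` array again stands in for the Arkouda-backed array. In Arkouda the factory would presumably wrap `ak.arange(n)` in an ArkoudaExtensionArray; that part is not shown because it needs a running server.

```python
import pandas as pd


def series_with_default_index(values, make_default_index):
    """Build a Series whose default index comes from make_default_index(n).

    Hypothetical helper: make_default_index takes the length n and returns
    an ExtensionArray holding 0..n-1 (for Arkouda, something built from
    ak.arange(n) on the server).
    """
    index = pd.Index(make_default_index(len(values)))
    return pd.Series(values, index=index)


# Stand-ins for the Arkouda-backed values and index factory.
values = pd.array([10, 20, 30], dtype="Int64")
s = series_with_default_index(
    values, lambda n: pd.array(list(range(n)), dtype="Int64")
)

# The index is no longer a RangeIndex; it keeps the extension dtype.
print(isinstance(s.index, pd.RangeIndex))
print(s.index.dtype)
```

Recent pandas versions can hold an ExtensionArray inside a plain `Index`, which is what lets the index stay in the same backing store as the values. Whether later index operations also stay on the server depends on what the ExtensionArray itself implements.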

Metadata

Labels

bug (Something isn't working)
