Add Support for Additional DataFrame Functionality #58

@grundprinzip

The DataFrame is still missing many functions. This is the umbrella issue for tracking their implementation.

Most of them should be relatively mechanical to implement, but some require additional work.
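
To illustrate what "mechanical" can mean here, below is a minimal sketch, not this project's actual API. It assumes a toy `DataFrame` type that only records operation names; the type, its `with` helper, and the string-based plan are all hypothetical. It relies on one fact from PySpark, where `where` is documented as an alias of `filter` and `unionAll` as an alias of `union`, so such methods can be thin delegations to methods that already exist.

```go
package main

import (
	"fmt"
	"strings"
)

// DataFrame is a hypothetical stand-in that records operations as strings.
// It is NOT the real client type; it only illustrates alias delegation.
type DataFrame struct {
	ops []string
}

// with returns a new DataFrame with op appended, leaving the receiver unchanged.
func (df DataFrame) with(op string) DataFrame {
	ops := make([]string, len(df.ops), len(df.ops)+1)
	copy(ops, df.ops)
	return DataFrame{ops: append(ops, op)}
}

// Filter is the "real" operation in this sketch.
func (df DataFrame) Filter(cond string) DataFrame {
	return df.with("filter(" + cond + ")")
}

// Where is a mechanical alias: it simply delegates to Filter.
func (df DataFrame) Where(cond string) DataFrame {
	return df.Filter(cond)
}

// Union is the "real" operation in this sketch.
func (df DataFrame) Union(other DataFrame) DataFrame {
	return df.with("union(" + other.String() + ")")
}

// UnionAll is a mechanical alias: it simply delegates to Union.
func (df DataFrame) UnionAll(other DataFrame) DataFrame {
	return df.Union(other)
}

// String renders the recorded operations in order.
func (df DataFrame) String() string {
	return strings.Join(df.ops, " -> ")
}

func main() {
	base := DataFrame{}.with("read(t)")
	fmt.Println(base.Where("id > 1").String())
	fmt.Println(base.UnionAll(DataFrame{}.with("read(u)")).String())
}
```

Methods marked as needing native UDFs in the list below would not fit this pattern, which is presumably part of the "additional work" mentioned above.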

Method Support
agg
alias
approxQuantile
cache
checkpoint
coalesce
colRegex ✅ (OfDFWithRegex)
collect
columns
corr
count
cov
createGlobalTempView
createOrReplaceGlobalTempView
createOrReplaceTempView
createTempView
crossJoin
crosstab
cube
describe
distinct
drop
dropDuplicates
dropDuplicatesWithinWatermark
dropna
dtypes
exceptAll
executionInfo
explain
fillna
filter
first
foreach ➖ (needs native UDFs)
foreachPartition ➖ (needs native UDFs)
freqItems
groupBy ✅ (some restrictions apply)
head
hint
inputFiles
intersect
intersectAll
isEmpty
isLocal
isStreaming
is_cached
join
limit
localCheckpoint
mapInArrow ➖ (needs native UDFs)
mapInPandas ➖ (needs native UDFs)
melt
mergeInto
na
observe
offset
orderBy
pandas_api
persist
printSchema
randomSplit
rdd
registerTempTable
repartition
repartitionByRange
replace
rollup
sameSemantics
sample
sampleBy
schema
select
selectExpr
semanticHash
show
sort
sortWithinPartitions
sparkSession
stat
storageLevel
subtract
summary
tail
take
to
toArrow
toDF
toJSON
toLocalIterator
toPandas
transform
union
unionAll
unionByName
unpersist
unpivot
where
withColumn
withColumnRenamed
withColumns
withColumnsRenamed
withMetadata
withWatermark
write
writeStream
writeTo

Project status: In progress
