[SPARK-57527][SQL] Add the `unix_nanos` function returning nanoseconds since the epoch for timestamps by MaxGekk · Pull Request #56602 · apache/spark

MaxGekk · 2026-06-18T21:47:32Z

What changes were proposed in this pull request?

This PR adds a new built-in function unix_nanos(expr) that returns the number of nanoseconds since 1970-01-01 00:00:00 UTC for a nanosecond-precision timestamp.

Concretely:

- Adds a UnixNanos expression in datetimeExpressions.scala that accepts only the nanosecond-precision timestamp types TIMESTAMP_LTZ(p) / TIMESTAMP_NTZ(p) (p in [7, 9], i.e. AnyTimestampNanoType) and returns a lossless DECIMAL(21, 0).
- Computes epochMicros * 1000 + nanosWithinMicro via BigInteger in both the interpreted (eval) and codegen (doGenCode) paths. A BIGINT return type was rejected because epochMicros * 1000 overflows 64 bits across the full [0001..9999] calendar range; DECIMAL(21, 0) is wide enough for every value (~2.5e20 max) and stays lossless.
- Registers unix_nanos in FunctionRegistry and adds the Scala functions.unix_nanos.
- Adds catalyst unit tests (interpreted + codegen), Scala/SQL end-to-end tests, and SQL golden-file coverage for TIMESTAMP_NTZ(p) / TIMESTAMP_LTZ(p).
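
The width argument for DECIMAL(21, 0) can be checked with a few lines of arithmetic. The sketch below is not the PR's Scala code: it uses Python, whose integers are unbounded, as a stand-in for BigInteger, and the helper names (epoch_micros, unix_nanos, fits_int64) are hypothetical. It assumes the proleptic Gregorian calendar and range bounds of 0001-01-01 00:00:00 through 9999-12-31 23:59:59.999999999.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def epoch_micros(dt: datetime) -> int:
    """Microseconds since the epoch for a naive datetime read as UTC."""
    return (dt - EPOCH) // timedelta(microseconds=1)


def unix_nanos(micros: int, nanos_within_micro: int) -> int:
    """epochMicros * 1000 + nanosWithinMicro, computed without overflow."""
    if not 0 <= nanos_within_micro < 1000:
        raise ValueError("nanos_within_micro must be in [0, 999]")
    return micros * 1000 + nanos_within_micro


def fits_int64(value: int) -> bool:
    """True if the value is representable as a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


# Assumed bounds of the supported calendar range.
lowest = unix_nanos(epoch_micros(datetime(1, 1, 1)), 0)
highest = unix_nanos(
    epoch_micros(datetime(9999, 12, 31, 23, 59, 59, 999999)), 999
)

print(lowest, fits_int64(lowest))     # -62135596800000000000 False
print(highest, fits_int64(highest))   # 253402300799999999999 False
print(len(str(highest)))              # 21 digits, so precision 21 suffices
```

Under these assumptions both ends of the range fall outside the signed 64-bit range, while the largest magnitude has exactly 21 decimal digits, which is consistent with the choice of DECIMAL(21, 0) over BIGINT.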

The microsecond TimestampType input and the PySpark / Spark Connect / R surfaces are out of scope here and tracked as follow-ups; unix_nanos is recorded in the PySpark function-parity allowlist in the meantime.

Why are the changes needed?

Part of the SPARK-56822 umbrella (timestamps with nanosecond precision, https://issues.apache.org/jira/browse/SPARK-56822). Spark has unix_seconds / unix_millis / unix_micros but no nanosecond counterpart. unix_nanos fills that gap and is the natural inverse of nanosecond timestamp construction.

Does this PR introduce any user-facing change?

Yes. A new unix_nanos(timeExp) function is available in SQL and the Scala API. It accepts TIMESTAMP_LTZ(p) / TIMESTAMP_NTZ(p) and returns DECIMAL(21, 0). This is a change only within the unreleased nanosecond-timestamp preview.

Example:

SELECT unix_nanos(TIMESTAMP_NTZ '2008-12-25 15:30:00.123456789');
-- 1230219000123456789
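
The expected value in the example can be reproduced by hand. The following is a minimal Python sketch, not Spark code, and it assumes the TIMESTAMP_NTZ wall-clock value is counted from the epoch as if it were UTC. The variable names are illustrative only.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# Wall-clock value from the example, truncated to microsecond precision
# because Python's datetime cannot hold nanoseconds.
wall_clock = datetime(2008, 12, 25, 15, 30, 0, 123456)

# The last three fractional digits of .123456789 are the sub-microsecond part.
nanos_within_micro = 789

epoch_micros = (wall_clock - EPOCH) // timedelta(microseconds=1)
result = epoch_micros * 1000 + nanos_within_micro

print(epoch_micros)  # 1230219000123456
print(result)        # 1230219000123456789
```

This particular value would still fit in a 64-bit integer; the result is typed DECIMAL(21, 0) so that the same function stays lossless at the extremes of the calendar range.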

How was this patch tested?

build/sbt 'catalyst/testOnly org.apache.spark.sql.catalyst.expressions.DateExpressionsSuite'
build/sbt 'sql/testOnly org.apache.spark.sql.TimestampNanosFunctionsAnsiOnSuite org.apache.spark.sql.TimestampNanosFunctionsAnsiOffSuite'
build/sbt 'sql/testOnly org.apache.spark.sql.expressions.ExpressionInfoSuite org.apache.spark.sql.ExpressionsSchemaSuite'
SPARK_GENERATE_GOLDEN_FILES=1 build/sbt 'sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z "nanos"'
./dev/scalastyle

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor

MaxGekk wants to merge 1 commit into apache:master from MaxGekk:unix_nanos
