Cloudflare logpush pipeline improvements internal by brijesh-elastic · Pull Request #4 · brijesh-elastic/integrations

brijesh-elastic · 2026-04-12T09:50:17Z

Checklist

I have reviewed tips for building integrations and this pull request is aligned with them.
I have verified that all data streams collect metrics or logs.
I have added an entry to my package's changelog.yml file.
I have verified that Kibana version constraints are current according to guidelines.
I have verified that any added dashboard complies with Kibana's Dashboard good practices.

Proposed commit message

Update format_version to 3.3.2 and ECS dependency to git@v9.3.0 in manifest.yml and build.yml. Update ecs.version to 9.3.0 in all 21 data stream ingest pipelines.

Update ECS field definitions by replacing agent.yml with beats.yml and modernizing base-fields.yml across all 21 data streams. Sort all fields.yml entries alphabetically for better maintainability. Fix swapped field descriptions for firewall_event (origin.ray.id/origin.response.status) and http_request (cache.status/cache.response.status).

Add support for new fields across 9 data streams, with their corresponding ingest pipeline processors: device_posture (RegistrationID); firewall_event (FraudUserID); gateway_dns (12 fields including InternalDNS*, QueryApplication*, RequestContext*); gateway_http (AppControlInfo, ApplicationStatuses, RedirectTargetURI, RegistrationID); gateway_network (RegistrationID); http_request (11 fields including Fraud*, WebAssets*, WorkerScriptName); network_analytics (DNSQueryName, DNSQueryType, PFPCustomTag); network_session (InitialOriginIP, RegistrationID, ResolvedFQDN, SNI); workers_trace (CPUTimeMs, WallTimeMs).

In the email_security_alerts timestamp normalization, correct the Painless script to reference ctx.json.Timestamp (PascalCase) instead of ctx.json.timestamp, matching the actual field name from the Cloudflare API and the guard condition.

Fix the grok guard condition that used an incorrect path (ctx.json?.cloudflare_logpush) instead of (ctx.cloudflare_logpush) and a tautological || operator instead of &&. Also correct the remove processor to reference action instead of event_action. Update test data to use a valid disconnect timestamp.

In network_analytics, correct the split processor condition to reference ctx.json.TCPSackBlocks consistently instead of mixing TCPSACKBlocks and TCPSackBlocks casing.

In audit, correct the rename condition to use ctx.json?.Interface (PascalCase), matching the actual Cloudflare API field name, instead of the lowercase form.

Replace rename processors with convert processors (type: string) for fields documented as integers or arrays of integers but mapped as keyword type in fields.yml. Affected fields: gateway_dns (CNAMECategoryIDs, EDEErrors, InitialCategoryIDs, MatchedIndicatorFeedIDs, ResolvedIPCategoryIDs), gateway_http (ApplicationIDs), gateway_network (ApplicationIDs, CategoryIDs).

In http_request, change the singular header to the plural headers in the RequestHeaders and ResponseHeaders target fields to match the fields.yml definitions (request.headers and response.headers).

Add IANA keyword representation scripts for dns.response_code and dns.question.type in both dns and dns_firewall data streams. Numeric DNS response codes are now mapped to human-readable names (e.g., 0 -> NoError, 3 -> NXDomain) and query types are mapped to their IANA names (e.g., 1 -> A, 28 -> AAAA, 15 -> MX).
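The mapping described above can be sketched as a simple table lookup. The sketch below is written in Python rather than Painless, the tables are only partial excerpts of the IANA DNS parameter registries, and the function names and the fallback for unknown codes (returning the number as a string) are assumptions made for illustration, not the behavior of the actual pipeline scripts.

```python
# Illustrative sketch only: partial excerpts of the IANA DNS registries.
DNS_RCODES = {
    0: "NoError",
    1: "FormErr",
    2: "ServFail",
    3: "NXDomain",
    4: "NotImp",
    5: "Refused",
}

DNS_QTYPES = {
    1: "A",
    2: "NS",
    5: "CNAME",
    6: "SOA",
    12: "PTR",
    15: "MX",
    16: "TXT",
    28: "AAAA",
    33: "SRV",
}


def rcode_name(code):
    """Map a numeric DNS response code to its IANA name.

    Accepts an int or a numeric string. Unknown codes fall back to the
    number rendered as a string (an assumption for this sketch).
    """
    number = int(code)
    return DNS_RCODES.get(number, str(number))


def qtype_name(code):
    """Map a numeric DNS query type to its IANA name, with the same fallback."""
    number = int(code)
    return DNS_QTYPES.get(number, str(number))
```

For example, rcode_name(3) yields "NXDomain" and qtype_name(28) yields "AAAA", matching the examples in the paragraph above.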

Remove ignore_failure: true from the first JSON processor in all data stream pipelines that had it. Parsing failures should surface as errors rather than silently producing partial documents.

Replace grok processors with dissect for simple delimiter-based pattern matching in firewall_event, http_request, and spectrum_event data streams. Dissect is more performant than grok for fixed patterns like protocol/version splitting.
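To illustrate why a fixed-delimiter split is sufficient here, the following Python sketch splits a protocol/version string on its first "/" without any regular expression, which is roughly what a dissect pattern does. The input value "HTTP/1.1", the function name, and the output keys are hypothetical and are not taken from the pipelines in this PR.

```python
def split_protocol(value):
    """Split 'NAME/VERSION' on the first '/' using plain string operations.

    Returns a dict with 'name' and, when a delimiter is present, 'version'.
    The key names are hypothetical and chosen only for this sketch.
    """
    name, separator, version = value.partition("/")
    if not separator:
        # No delimiter found: keep the whole value as the name.
        return {"name": name}
    return {"name": name, "version": version}
```

Because the delimiter position is fixed, no backtracking or pattern compilation is involved, which is the usual reason dissect is preferred over grok for inputs of this shape.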

Replace multiple timestamp normalization scripts (which handled both String and Number types with try/catch) with a single, efficient script that only handles Number type. The new script directly converts timestamps to Unix milliseconds by dividing nanosecond values or multiplying second values. Update test input log files to use numeric timestamps to match the simplified script expectations.
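A minimal sketch of the number-only normalization, in Python rather than Painless. The magnitude thresholds used to tell seconds, milliseconds, and nanoseconds apart are assumptions for illustration and are not confirmed by the PR; microsecond inputs are not handled in this sketch.

```python
def to_unix_millis(ts):
    """Convert a numeric Unix timestamp to milliseconds.

    Assumed thresholds (illustrative, not taken from the actual script):
      ts >= 10**17 -> nanoseconds, divided by 1_000_000
      ts <  10**11 -> seconds, multiplied by 1000
      otherwise    -> already milliseconds
    """
    # Only numbers are accepted, mirroring the simplified script's scope.
    if isinstance(ts, bool) or not isinstance(ts, (int, float)):
        raise TypeError("timestamp must be a number")
    if ts >= 10**17:
        return int(ts) // 1_000_000
    if ts < 10**11:
        return int(ts * 1000)
    return int(ts)
```

Strings are rejected on purpose in this sketch: any string-to-number conversion is expected to happen in an earlier step, before the normalization runs.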

Replace rename processors with typed convert processors for fields declared as ip, long, boolean, or double in fields.yml to ensure correct type casting. Add in-place convert processors for timestamp fields to handle string-to-number conversion before the normalization script. Affected data streams: access_request, device_posture, dns, dns_firewall, gateway_dns, gateway_http, gateway_network, http_request, magic_ids, network_session, sinkhole_http, spectrum_event. Also add timestamp converts for all 20 data streams with numeric timestamp handling.

Add a null/empty field removal script at the end of all 21 data stream pipelines. The script recursively removes fields with null values, empty strings, empty maps, and empty lists. Standardize all on_failure error.message values to use the full format: "Processor {type} with tag {tag} in pipeline {pipeline} failed with message: {message}" for consistent debugging output.
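The recursive cleanup can be sketched as follows, in Python rather than Painless. It works depth-first so that a map or list that becomes empty after its children are cleaned is removed as well. Whether the actual script also drops empty elements inside lists is an assumption of this sketch, as are the function names.

```python
def is_empty(value):
    """True for None, empty strings, empty maps, and empty lists.

    Values such as 0 and False are not considered empty.
    """
    return value is None or value == "" or value == {} or value == []


def drop_empty(value):
    """Return a copy of value with empty fields removed recursively."""
    if isinstance(value, dict):
        # Clean children first, then drop keys whose cleaned value is empty.
        cleaned = {key: drop_empty(child) for key, child in value.items()}
        return {key: child for key, child in cleaned.items() if not is_empty(child)}
    if isinstance(value, list):
        # Assumption: empty elements inside lists are removed as well.
        cleaned = [drop_empty(child) for child in value]
        return [child for child in cleaned if not is_empty(child)]
    return value
```

Running such a step last in a pipeline means it also catches empty containers left behind by earlier rename and remove processors.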

Add a unique tag key to every processor in all 21 ingest pipelines for easier debugging and tracing of pipeline failures. Tags follow the pattern: {processor_type}_{field_description}_{hash}.

brijesh-elastic added 17 commits April 12, 2026 14:22

update ECS version to 9.3.0 and format_version to 3.3.2

7d80ec4

fix email_security_alerts timestamp normalization

fbbf048

fix network_analytics split processor case-sensitivity for TCPSackBlocks

21254bd

fix audit rename condition to use PascalCase for Interface field

995dab7

align http_request header naming between pipeline and fields.yml

2da7287

remove ignore_failure from initial JSON processor

2f512e8

add tags for every processor across all data streams

30ad2be

add changelog entry for cloudflare_logpush pipeline improvements

7f48fe7

run pipeline and system tests

d0a94ad

brijesh-elastic wants to merge 17 commits into main from cloudflare_logpush-pipeline-improvements-internal