# Confluent Cloud Flink SQL Documentation

> Comprehensive documentation for the Apache Flink SQL dialect used in Confluent Cloud. This includes SQL syntax, functions, operators, and best practices for stream processing with Flink SQL in Confluent Cloud.

## Overview

This documentation covers:
- Flink SQL syntax and semantics
- Built-in functions and operators
- Stream processing concepts
- Confluent Cloud-specific features
- Best practices and examples

## Core Documentation

- [Flink SQL Autopilot in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/autopilot.html): Autopilot scales up and scales down the compute resources that SQL statements use in Confluent Cloud for Apache Flink®. Autopilot assigns resources efficiently...
- [Batch and Stream Processing in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/batch-and-stream-processing.html): Confluent Cloud for Apache Flink® supports both batch and stream processing, which enables you to process data in either finite (bounde...
- [Comparing Apache Flink with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/comparison-with-apache-flink.html): Confluent Cloud for Apache Flink® supports many of the capabilities of Apache Flink® and provides additional features. Also, Confluent Clo...
- [Compute Pools in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/compute-pools.html): A compute pool in Confluent Cloud for Apache Flink® represents a set of compute resources bound to a region that is used to run your SQL statements.
- [Delivery Guarantees and Latency in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/delivery-guarantees.html): Confluent Cloud for Apache Flink® provides exactly-once semantics end-to-end by default, which means that every input message is ref...
- [Determinism with continuous Flink SQL queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/determinism.html): This topic answers the following questions about determinism in Confluent Cloud for Apache Flink®: What is determinism? Is all ba...
- [Tables and Topics in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/dynamic-tables.html): Apache Flink® and the Table API use the concept of dynamic tables to facilitate the manipulation and processing of streaming data. Dynamic tables...
- [Billing in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/flink-billing.html): Confluent Cloud for Apache Flink® is a serverless stream-processing platform with usage-based pricing, where you are charged only for the duration that you...
- [Private Networking with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/flink-private-networking.html): Confluent Cloud for Apache Flink® supports private networking on AWS, Azure, and Google Cloud. This feature enables Flink to securely read and...
- [Stream Processing Concepts in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/overview.html): Apache Flink® SQL, a high-level API powered by Confluent Cloud for Apache Flink, offers a simple and easy way to leverage the power of s...
- [Schema and Statement Evolution with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/schema-statement-evolution.html): Confluent Cloud for Apache Flink® enables evolving your statements over time as your schemas change. This topic describes these co...
- [Snapshot Queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/snapshot-queries.html): In Confluent Cloud for Apache Flink®, a snapshot query is a query that reads data from a table at a specific point in time. In contrast with a str...
- [Statement CFU Metrics in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/statement-cfu-metrics.html): Confluent Cloud for Apache Flink® provides detailed metrics to help you understand and manage your resource utilization. One critical aspect...
- [Flink SQL Statements in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/statements.html): In Confluent Cloud for Apache Flink®, a statement represents a high-level resource that’s created when you enter a SQL query. Each statement h...
- [Time and Watermarks in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/timely-stream-processing.html): Timely stream processing is an extension of stateful stream processing that incorporates time into the computation. It’s commonly used for time...
- [User-defined Functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/concepts/user-defined-functions.html): Confluent Cloud for Apache Flink® supports user-defined functions (UDFs), which are extension points for running custom logic that you can’t...
- [FAQ for Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/flink-faq.html): This topic provides answers to frequently asked questions about Confluent Cloud for Apache Flink®. What is Confluent Cloud for Apache F...
- [Get Help with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/get-help.html): You can request support in the Confluent Support Portal. You can access the portal directly, or you can navigate to it from the Confluent Cloud Console...
- [Get Started with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/get-started/overview.html): Welcome to Confluent Cloud for Apache Flink®. This section guides you through the steps to get your queries running using the Confluent Cloud Console...
- [Flink SQL Quick Start on Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/get-started/quick-start-cloud-console.html): This quick start gets you up and running with Confluent Cloud for Apache Flink®. The following steps show how to create a workspace for running SQL...
- [Java Table API Quick Start on Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/get-started/quick-start-java-table-api.html): Confluent Cloud for Apache Flink® supports programming applications with the Table API. Confluent provides a plugin for running applicat...
- [Python Table API Quick Start on Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/get-started/quick-start-python-table-api.html): Confluent Cloud for Apache Flink® supports programming applications with the Table API. Confluent provides a plugin for running applic...
- [SQL Shell Quick Start on Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/get-started/quick-start-shell.html): This quick start walks you through the following steps to get you up and running with Confluent Cloud for Apache Flink®. Step 1: Log in...
- [Aggregate a Data Stream in a Tumbling Window with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/aggregate-tumbling-window.html): Aggregation over windows is central to processing streaming data. Confluent Cloud for Apache Flink® supports Windowing Ta...
- [Combine Streams and Track Most Recent Records with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/combine-and-track-most-recent-records.html): When working with streaming data, it’s common to need to combine information from multiple sources while tracking t...
- [Compare Current and Previous Values in a Data Stream with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/compare-current-and-previous-values.html): Confluent Cloud for Apache Flink® provides a LAG function, which is a built-in function that enables you to...
- [Convert the Serialization Format of a Topic with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/convert-serialization-format.html): This guide shows how to use Confluent Cloud for Apache Flink® to transform a topic serialized in Avro Schema Registry...
- [Create a User-Defined Function with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/create-udf.html): A user-defined function (UDF) extends the capabilities of Confluent Cloud for Apache Flink® and enables you to implement custom lo...
- [Deduplicate Rows in a Table with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/deduplicate-rows.html): Confluent Cloud for Apache Flink® enables generating a table that contains only unique records from an input table with only a few cl...
- [Log Debug Messages in a User Defined Function with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/enable-udf-logging.html): When you create a user-defined function (UDF) with Confluent Cloud for Apache Flink®, you have the option of logging...
- [Mask Fields in a Table with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/mask-fields.html): Confluent Cloud for Apache Flink® enables generating a topic that contains masked fields from an input topic with only a few clicks. In th...
- [Handle Multiple Event Types In Tables in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/multiple-event-types.html): Confluent Cloud for Apache Flink® provides several ways to work with Kafka topics containing multiple event types. This guide explain...
- [How-to Guides for Developing Flink Applications on Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/overview.html): Discover how Confluent Cloud for Apache Flink® can help you accomplish common processing tasks such as joins and aggregations. This section provides...
- [Process schemaless events with Flink SQL in Confluent Cloud | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/process-schemaless-events.html): This guide explains how to use Confluent Cloud for Apache Flink to handle and process events in Apache Kafka® topics that don’t use serial...
- [Profile a Query with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/profile-query.html): Confluent Cloud for Apache Flink® enables you to profile the performance of your queries. The Query Profiler provides enhanced visibility into ho...
- [Resolve Statement Issues in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/resolve-common-query-problems.html): Inefficient Flink SQL queries in Confluent Cloud for Apache Flink® can cause performance issues that impact your data processing pipeline....
- [Run a Snapshot Query with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/run-snapshot-query.html): Confluent Cloud for Apache Flink® supports snapshot queries that read data from a table at a specific point in time. In contrast with a stre...
- [Scan and Summarize Flink Tables in Confluent Cloud | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/scan-and-summarize-tables.html): Confluent Cloud for Apache Flink® provides graphical tools in your workspaces that enable scanning and summarizing data visually in Fli...
- [Transform a Topic with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/transform-topic.html): Confluent Cloud for Apache Flink® enables generating a transformed topic from an input topic’s properties, like partition count, key, serializa...
- [View Time Series Data in Confluent Cloud | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/how-to-guides/view-time-series-data.html): Confluent Cloud for Apache Flink® enables visualizing time-series data in real time. The output of certain SQL statements renders as time-se...
- [Best Practices for Moving SQL Statements to Production in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/operate-and-deploy/best-practices.html): When you move your Flink SQL statements to production in Confluent Cloud for Apache Flink®, consider the following recommendation...
- [Carry-over Offsets in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/operate-and-deploy/carry-over-offsets.html): Confluent Cloud for Apache Flink® supports carry-over offsets, which means that you can use the topic offsets from one statement to start a new...
- [Manage Flink Compute Pools in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/operate-and-deploy/create-compute-pool.html): A compute pool represents the compute resources that are used to run your SQL statements. The resources provided by a compute pool are shared...
- [Deploy a Flink SQL Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/operate-and-deploy/deploy-flink-sql-statement.html): GitHub Actions is a powerful feature on GitHub that enables automating your software development workflows. If your sourc...
- [Grant Role-Based Access for Flink SQL Statements in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/operate-and-deploy/flink-rbac.html): When deploying Flink SQL statements in production, you must configure appropriate access controls for different types of users and workload...
- [Flink REST API in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/operate-and-deploy/flink-rest-api.html): Confluent Cloud for Apache Flink® provides a REST API for managing your Flink SQL statements, compute pools, and connections programmatically.
- [Generate an API key for Programmatic Access to Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/operate-and-deploy/generate-api-key-for-flink.html): To manage Flink workloads programmatically in Confluent Cloud for Apache Flink®, you need an API key that’s specific to Flink. You c...
- [Manage Flink Connections in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/operate-and-deploy/manage-connections.html): A connection in Confluent Cloud for Apache Flink® represents an external service that is used in your Flink statements. Connections are used to...
- [Monitor and Manage Flink SQL Statements in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/operate-and-deploy/monitor-statements.html): You start a stream-processing app on Confluent Cloud for Apache Flink® by running a SQL statement. Once a statement is runn...
- [Operate and Deploy Flink SQL Statements with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/operate-and-deploy/overview.html): Confluent provides tools for operating Confluent Cloud for Apache Flink® in the Cloud Console, the Confluent CLI, the Conflue...
- [Enable Private Networking with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/operate-and-deploy/private-networking.html): You have these options for using private networking with Confluent Cloud for Apache Flink®. PrivateLink Attachment: Works with any type...
- [Flink SQL Query Profiler in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/operate-and-deploy/query-profiler.html): The Query Profiler is a tool in Confluent Cloud for Apache Flink® that provides enhanced visibility into how a Flink SQL statement is proc...
- [Stream Processing with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/overview.html): Apache Flink® is a powerful, scalable stream processing framework for running complex, stateful, low-latency streaming applications on large vo...
- [Supported Cloud Regions for Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/cloud-regions.html): Confluent Cloud for Apache Flink® is available on AWS, Azure, and Google Cloud. Flink is supported in the following regions. AWS supported...
- [Flink SQL Data Types in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/datatypes.html): Confluent Cloud for Apache Flink® has a rich set of native data types that you can use in SQL statements and queries. The query planner supports the fol...
- [Example Data Streams in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/example-data.html): Confluent Cloud for Apache Flink® provides an Examples catalog that has mock data streams you can use for experimenting with Flink SQL queries...
- [Confluent CLI commands with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/flink-sql-cli.html): Manage Flink SQL statements and compute pools in Confluent Cloud for Apache Flink® by using the confluent flink commands in the Confluent...
- [SQL Information Schema in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/flink-sql-information-schema.html): An information schema, or data dictionary, is a standard SQL schema with a collection of predefined views that enable accessing metadata about o...
- [SQL aggregate functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/aggregate-functions.html): Confluent Cloud for Apache Flink® provides these built-in functions to aggregate rows in Flink SQL queries: AVG, COLLECT, COUNT, CUME_DIST, DENSE_R...
- [SQL Collection Functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/collection-functions.html): Confluent Cloud for Apache Flink® provides these built-in collection functions to use in Flink SQL queries: ARRAY, ARRAY_AGG, ARRAY_APPEND, ARRAY...
- [SQL comparison functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/comparison-functions.html): Confluent Cloud for Apache Flink® provides these built-in comparison functions to use in SQL queries: Equality operations, Logical operations, C...
- [SQL conditional functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/conditional-functions.html): Confluent Cloud for Apache Flink® provides these built-in functions for controlling execution flow in SQL queries: CASE, CASE WHEN CONDITION, C...
- [SQL Datetime Functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/datetime-functions.html): Confluent Cloud for Apache Flink® provides these built-in functions for handling date and time logic in SQL queries: Date, Time, Timestamp, Utility...
- [SQL hash functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/hash-functions.html): Confluent Cloud for Apache Flink® provides these built-in functions to generate hash codes in SQL queries: MD5, SHA1, SHA2, SHA224, SHA256, SHA384, SHA512...
- [SQL JSON functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/json-functions.html): Confluent Cloud for Apache Flink® provides these built-in functions to help with JSON in SQL queries: IS JSON, JSON_ARRAY, JSON_ARRAYAGG, JSON_EXISTS, J...
- [Machine-Learning Preprocessing Functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/ml-preprocessing-functions.html): The following built-in functions are available for ML preprocessing in Confluent Cloud for Apache Flink®. These functions...
- [AI Model Inference and Machine Learning Functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/model-inference-functions.html): Confluent Cloud for Apache Flink® provides built-in functions for invoking remote AI/ML models in Flink SQL queri...
- [SQL numeric functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/numeric-functions.html): Confluent Cloud for Apache Flink® provides these built-in numeric functions to use in SQL queries: Numeric, Trigonometry, Random number generators
- [SQL Functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/overview.html): Confluent Cloud for Apache Flink® enables you to do data transformations and other operations with the following built-in functions. Aggregate...
- [SQL string functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/string-functions.html): Confluent Cloud for Apache Flink® provides these built-in string functions to use in SQL queries: ASCII, BTRIM, string1 || string2, CHARACTER_LENGTH
- [Table API functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/functions/table-api-functions.html): Confluent Cloud for Apache Flink® supports programming applications with the Table API. For more information, see the Table API Overview. To get started...
- [Flink SQL Keywords in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/keywords.html): Keywords are words that have significance in Confluent Cloud for Apache Flink®. Some keywords, like AND, CHAR, and SELECT, are reserved...
- [Flink SQL and Table API Reference in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/overview.html): This section describes the SQL language support in Confluent Cloud for Apache Flink®, including Data Definition Language (DDL) st...
- [SQL Deduplication Queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/deduplication.html): Confluent Cloud for Apache Flink® enables removing duplicate rows over a set of columns in a Flink SQL table. Syntax: SELECT [column_list] FR...
- [SQL Group Aggregation Queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/group-aggregation.html): Confluent Cloud for Apache Flink® enables computing a single result from multiple input rows in a Flink SQL table. Description: Compute a...
- [SQL INSERT INTO FROM SELECT Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/insert-into-from-select.html): Confluent Cloud for Apache Flink® enables inserting SELECT query results directly into a Flink SQL table. Syntax: [EXECUTE] INSER...
- [SQL INSERT VALUES Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/insert-values.html): Confluent Cloud for Apache Flink® enables inserting data directly into a Flink SQL table. Syntax: [EXECUTE] INSERT { INTO | OVERWRITE } [ca...
- [SQL Join Queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/joins.html): Confluent Cloud for Apache Flink® enables joining data streams over Flink SQL dynamic tables. Description: Flink supports complex and flexible join opera...
- [SQL LIMIT clause in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/limit.html): Confluent Cloud for Apache Flink® enables constraining the number of rows returned by a SELECT statement. Description: The LIMIT clause constrains the...
- [SQL Pattern Recognition Queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/match_recognize.html): Confluent Cloud for Apache Flink® enables pattern detection in event streams. Syntax: SELECT T.aid, T.bid, T.cid FROM MyTable MATCH_REC...
- [SQL ORDER BY Clause in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/orderby.html): Confluent Cloud for Apache Flink® enables sorting rows from a SELECT statement. Description: The ORDER BY clause causes the result rows to be sorte...
- [SQL OVER Aggregation Queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/over-aggregation.html): Confluent Cloud for Apache Flink® enables computing an aggregated value for every row over a range of ordered rows. Syntax: SELECT agg_fun...
- [SQL Queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/overview.html): In Confluent Cloud for Apache Flink®, Data Manipulation Language (DML) statements, also known as queries, are declarative verbs that read and mod...
- [SQL SELECT statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/select.html): Confluent Cloud for Apache Flink® enables querying the content of your tables by using familiar SELECT syntax. Syntax: SELECT [DISTINCT] select_li...
- [SQL Set Logic in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/set-logic.html): Confluent Cloud for Apache Flink® enables set logic operations on tables in SQL statements: EXCEPT, EXISTS, IN, INTERSECT, UNION. Example data: The following...
- [SQL Statement Sets in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/statement-set.html): Confluent Cloud for Apache Flink® enables executing multiple SQL statements as a single, optimized statement by using statement sets. Syntax: ...
- [SQL Top-N queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/topn.html): Confluent Cloud for Apache Flink® enables finding the smallest or largest values, ordered by columns, in a table. Syntax: SELECT [column_list] FROM (...
- [SQL Window Aggregation Queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/window-aggregation.html): Confluent Cloud for Apache Flink® enables aggregating data over windows in a table. Syntax: SELECT ... FROM <windowed_table> -- relation...
- [SQL Window Deduplication Queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/window-deduplication.html): Confluent Cloud for Apache Flink® enables removing duplicate rows over a set of columns in a windowed table. Syntax: SELECT [column_li...
- [SQL Window Join Queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/window-join.html): Window Join Queries in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables joining data over time windows in dynamic tables. Syntax¶ The following shows the syntax of the INNER...
- [SQL Window Top-N Queries in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/window-topn.html): Window Top-N Queries in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables Window Top-N queries in dynamic tables. Syntax¶ SELECT [column_list] FROM ( SELECT [column_list], RO...
- [SQL Windowing Table-Valued Functions in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/window-tvf.html): Windowing Table-Valued Functions (Windowing TVFs) in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® provides several window table-valued functions (TVFs) for dividing the elements...
- [SQL WITH Clause in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/queries/with.html): WITH Clause in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables writing auxiliary statements to use in larger SQL queries. Syntax¶ WITH <with_item_definition> [ , ... ] SELE...
- [Data Type Mappings with Flink SQL Statements in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/serialization.html): Data Type Mappings in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® supports records in the Avro Schema Registry, JSON_SR, and Protobuf Schema Registry formats. Avro schemas JSON...
- [Flink SQL Examples in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/sql-examples.html): Flink SQL Examples in Confluent Cloud for Apache Flink¶ The following code examples show common Flink SQL use cases with Confluent Cloud for Apache Flink®. CREATE TABLE Inferred tables ALTER TABLE SEL...
- [Flink SQL Syntax in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/sql-syntax.html): Flink SQL Syntax in Confluent Cloud for Apache Flink¶ SQL is a domain-specific language for managing and manipulating data. It’s used primarily to work with structured data, where the types and relati...
- [SQL ALTER CONNECTION Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/alter-connection.html): ALTER CONNECTION Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® supports creating secure connections to external services and data sources. You can use these connecti...
- [SQL ALTER MODEL Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/alter-model.html): ALTER MODEL Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables real-time inference and prediction with AI models. Use the CREATE MODEL statement to register an AI...
- [SQL ALTER TABLE Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/alter-table.html): ALTER TABLE Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables changing properties of an existing table. Syntax¶ ALTER TABLE [catalog_name.][db_name.]table_name {...
- [SQL ALTER VIEW Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/alter-view.html): ALTER VIEW Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables modifying properties of an existing view. Syntax¶ ALTER VIEW [catalog_name.][db_name.]view_name RENA...
- [SQL CREATE CONNECTION Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/create-connection.html): CREATE CONNECTION Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® supports creating secure connections to external services and data sources. You can use these connect...
- [Flink SQL CREATE FUNCTION Statement in Confluent Cloud | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/create-function.html): CREATE FUNCTION Statement¶ Confluent Cloud for Apache Flink® enables registering custom user-defined functions (UDFs) by using the CREATE FUNCTION statement. When your UDFs are registered in a Flink...
- [SQL CREATE MODEL Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/create-model.html): CREATE MODEL Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables real-time inference and prediction with AI and ML models. The Flink SQL interface is available in
- [SQL CREATE TABLE Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/create-table.html): CREATE TABLE Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables creating tables backed by Apache Kafka® topics by using the CREATE TABLE statement. With Flink tab...
- [SQL CREATE VIEW Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/create-view.html): CREATE VIEW Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables creating views based on statement expressions by using the CREATE VIEW statement. With Flink views,...
- [SQL DESCRIBE Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/describe.html): DESCRIBE Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables viewing the schema of an Apache Kafka® topic. Also, you can view details of an AI model, function, or
- [SQL DROP CONNECTION Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/drop-connection.html): DROP CONNECTION Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® supports creating secure connections to external services and data sources. You can use these connectio...
- [SQL DROP MODEL Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/drop-model.html): DROP MODEL Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables real-time inference and prediction with AI models. Use the CREATE MODEL statement to register an AI
- [SQL DROP TABLE Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/drop-table.html): DROP TABLE Statement in Confluent Cloud for Apache Flink¶ The DROP TABLE statement removes a table definition from Confluent Cloud for Apache Flink® and, depending on the table type, will also delete
- [SQL DROP VIEW Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/drop-view.html): DROP VIEW Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables dropping views using the DROP VIEW statement. When a view is dropped, its definition is removed from
- [SQL EXPLAIN Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/explain.html): EXPLAIN Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables viewing and analyzing the query plans of Flink SQL statements. Syntax¶ EXPLAIN { <query_statement> | <i...
- [SQL HINTS in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/hints.html): Dynamic Table Options in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® supports dynamic table options, or SQL hints, which enable you to specify or override table options dynamic...
- [SQL Statements in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/overview.html): DDL Statements in Confluent Cloud for Apache Flink¶ In Confluent Cloud for Apache Flink®, a statement is a high-level resource that’s created when you enter a SQL query. Data Definition Language (DDL)...
- [SQL RESET Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/reset.html): RESET Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables resetting Flink SQL shell properties to default values. Syntax¶ RESET 'key'; Description¶ Reset the Flink...
- [SQL SET Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/set.html): SET Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables setting Flink SQL shell properties to different values. Syntax¶ SET 'key' = 'value'; Description¶ Modify or...
- [SQL SHOW Statements in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/show.html): SHOW Statements in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables listing catalogs, which map to Confluent Cloud environments, databases, which map to Apache Kafka® cluste...
- [SQL USE CATALOG Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/use-catalog.html): USE CATALOG Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables setting the active environment with the SQL USE statement. Syntax¶ USE CATALOG catalog_name; Descri...
- [SQL USE database_name Statement in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/statements/use-database.html): USE <database_name> Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables setting the current Apache Kafka® cluster with the USE <database_name> statement. Syntax¶ U...
- [Table API on Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/table-api.html): Table API on Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® supports programming applications with the Table API in Java and Python. Confluent provides a plugin for running applic...
- [SQL Timezone Types in Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/flink/reference/timezone.html): Timezone Types in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® provides rich data types for date and time, including these: DATE TIME TIMESTAMP TIMESTAMP_LTZ INTERVAL YEAR TO MO...
- [Flink authentication and authorization event methods (Confluent Cloud audit logs) | Confluent Documentation](https://docs.confluent.io/cloud/current/monitoring/audit-logging/event-methods/flink-authn-authz.html): Flink Authentication and Authorization Auditable Event Methods on Confluent Cloud¶ Confluent Cloud audit logs contain records of auditable events for authen...
- [Auditable event methods for Apache Flink (Confluent Cloud) | Confluent Documentation](https://docs.confluent.io/cloud/current/monitoring/audit-logging/event-methods/flink.html): Auditable Event Methods for Apache Flink on Confluent Cloud¶ Auditable event methods for Confluent Cloud for Apache Flink are triggered by operations on Apache Flink® in Confluent Cloud and send event...
- [Query Encrypted Data with Flink & Confluent Cloud | Confluent Documentation](https://docs.confluent.io/cloud/current/security/encrypt/csfle/flink-integration.html): Secure Stream Processing: Query Encrypted Data with Flink on Confluent Cloud¶ Processing sensitive data like personally identifiable information (PII) or financial records in real-time data streams pr...
- [Query Tableflow Tables with Confluent Cloud for Apache Flink | Confluent Documentation](https://docs.confluent.io/cloud/current/topics/tableflow/how-to-guides/query-engines/query-with-flink.html): Query Tableflow Tables with Flink in Confluent Cloud for Apache Flink®¶ Confluent Cloud for Apache Flink® supports snapshot queries that read data from a Tableflow-enabled topic at a specific point in...
- [confluent flink application create | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/confluent_flink_application_create.html): confluent flink application create Description Create a Flink application. confluent flink application create <resourceFilePath> [flags] Flags --environment string REQUIRED: Name of the Flink envir...
- [confluent flink application delete | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/confluent_flink_application_delete.html): confluent flink application delete Description Delete one or more Flink applications. confluent flink application delete <name-1> [name-2] ... [name-n] [flags] Flags --environment string REQUIRED:
- [confluent flink application describe | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/confluent_flink_application_describe.html): confluent flink application describe Description Describe a Flink application. confluent flink application describe <name> [flags] Flags --environment string REQUIRED: Name of the Flink environment...
- [confluent flink application list | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/confluent_flink_application_list.html): confluent flink application list Description List Flink applications. confluent flink application list [flags] Flags --environment string REQUIRED: Name of the Flink environment. --url string Base
- [confluent flink application update | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/confluent_flink_application_update.html): confluent flink application update Description Update a Flink application. confluent flink application update <resourceFilePath> [flags] Flags --environment string REQUIRED: Name of the Flink envir...
- [confluent flink application web-ui-forward | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/confluent_flink_application_web-ui-forward.html): confluent flink application web-ui-forward Description Forward the web UI of a Flink application. confluent flink application web-ui-forward <name> [flags] Flags --environment string REQUIRED: Name...
- [confluent flink application | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/index.html): confluent flink application Aliases application, app Description Manage Flink applications. Subcommands Command Description confluent flink application create Create a Flink application. confluent...
- [confluent flink artifact create | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/artifact/confluent_flink_artifact_create.html): confluent flink artifact create Description Create a Flink UDF artifact. confluent flink artifact create <unique-name> [flags] Flags --artifact-file string REQUIRED: Flink artifact JAR file or ZIP
- [confluent flink artifact delete | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/artifact/confluent_flink_artifact_delete.html): confluent flink artifact delete Description Delete one or more Flink UDF artifacts. confluent flink artifact delete <id-1> [id-2] ... [id-n] [flags] Flags --cloud string REQUIRED: Specify the cloud...
- [confluent flink artifact describe | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/artifact/confluent_flink_artifact_describe.html): confluent flink artifact describe Description Describe a Flink UDF artifact. confluent flink artifact describe <id> [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azur...
- [confluent flink artifact list | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/artifact/confluent_flink_artifact_list.html): confluent flink artifact list Description List Flink UDF artifacts. confluent flink artifact list [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --re...
- [confluent flink artifact | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/artifact/index.html): confluent flink artifact Description Manage Flink UDF artifacts. Subcommands Command Description confluent flink artifact create Create a Flink UDF artifact. confluent flink artifact delete Delete
- [confluent flink catalog create | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/catalog/confluent_flink_catalog_create.html): confluent flink catalog create Description Create a Flink catalog in Confluent Platform that provides metadata about tables and other database objects such as views and functions. confluent flink ca...
- [confluent flink catalog delete | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/catalog/confluent_flink_catalog_delete.html): confluent flink catalog delete Description Delete one or more Flink catalogs in Confluent Platform. confluent flink catalog delete <name-1> [name-2] ... [name-n] [flags] Flags --url string Base URL...
- [confluent flink catalog describe | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/catalog/confluent_flink_catalog_describe.html): confluent flink catalog describe Description Describe a Flink catalog in Confluent Platform. confluent flink catalog describe <name> [flags] Flags --url string Base URL of the Confluent Manager for...
- [confluent flink catalog list | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/catalog/confluent_flink_catalog_list.html): confluent flink catalog list Description List Flink catalogs in Confluent Platform. confluent flink catalog list [flags] Flags --url string Base URL of the Confluent Manager for Apache Flink (CMF)....
- [confluent flink catalog | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/catalog/index.html): confluent flink catalog Description Manage Flink catalogs in Confluent Platform. Subcommands Command Description confluent flink catalog create Create a Flink catalog. confluent flink catalog delet...
- [confluent flink compute-pool create | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_create.html): confluent flink compute-pool create Description Cloud Create a Flink compute pool. confluent flink compute-pool create <name> [flags] On-Premises Create a Flink compute pool in Confluent Platform. c...
- [confluent flink compute-pool delete | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_delete.html): confluent flink compute-pool delete Description Cloud Delete one or more Flink compute pools. confluent flink compute-pool delete <id-1> [id-2] ... [id-n] [flags] On-Premises Delete one or more Flin...
- [confluent flink compute-pool describe | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_describe.html): confluent flink compute-pool describe Description Cloud Describe a Flink compute pool. confluent flink compute-pool describe [id] [flags] On-Premises Describe a Flink compute pool in Confluent Platf...
- [confluent flink compute-pool list | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_list.html): confluent flink compute-pool list Description Cloud List Flink compute pools. confluent flink compute-pool list [flags] On-Premises List Flink compute pools in Confluent Platform. confluent flink co...
- [confluent flink compute-pool unset | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_unset.html): confluent flink compute-pool unset Description Unset the current Flink compute pool that was set with the use command. confluent flink compute-pool unset [flags] Flags -o, --output string Specify t...
- [confluent flink compute-pool update | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_update.html): confluent flink compute-pool update Description Update a Flink compute pool. confluent flink compute-pool update [id] [flags] Flags --name string Name of the compute pool. --max-cfu int32 Maximum n...
- [confluent flink compute-pool use | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_use.html): confluent flink compute-pool use Description Choose a Flink compute pool to be used in subsequent commands which support passing a compute pool with the --compute-pool flag. confluent flink compute-...
- [confluent flink compute-pool | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/index.html): confluent flink compute-pool Description Manage Flink compute pools. Subcommands Cloud Command Description confluent flink compute-pool create Create a Flink compute pool. confluent flink compute-p...
- [confluent flink shell | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/confluent_flink_shell.html): confluent flink shell Description Start Flink interactive SQL client. confluent flink shell [flags] Flags --compute-pool string Flink compute pool ID. --service-account string Service account ID. -...
- [confluent flink connection create | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/connection/confluent_flink_connection_create.html): confluent flink connection create Description Create a Flink connection. confluent flink connection create <name> [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azure"...
- [confluent flink connection delete | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/connection/confluent_flink_connection_delete.html): confluent flink connection delete Description Delete one or more Flink connections. confluent flink connection delete <name-1> [name-2] ... [name-n] [flags] Flags --cloud string REQUIRED: Specify t...
- [confluent flink connection describe | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/connection/confluent_flink_connection_describe.html): confluent flink connection describe Description Describe a Flink connection. confluent flink connection describe <name> [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "...
- [confluent flink connection list | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/connection/confluent_flink_connection_list.html): confluent flink connection list Description List Flink connections. confluent flink connection list [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --...
- [confluent flink connection update | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/connection/confluent_flink_connection_update.html): confluent flink connection update Description Update a Flink connection. Only secret can be updated. confluent flink connection update <name> [flags] Flags --cloud string REQUIRED: Specify the clou...
- [confluent flink connection | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/connection/index.html): confluent flink connection Description Manage Flink connections. Subcommands Command Description confluent flink connection create Create a Flink connection. confluent flink connection delete Delet...
- [confluent flink connectivity-type use | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/connectivity-type/confluent_flink_connectivity-type_use.html): confluent flink connectivity-type use Description Select a Flink connectivity type for the current environment as “public” or “private”. If unspecified, the CLI will default to public connectivity t...
- [confluent flink connectivity-type | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/connectivity-type/index.html): confluent flink connectivity-type Description Manage Flink connectivity type. Subcommands Command Description confluent flink connectivity-type use Select a Flink connectivity type.
- [confluent flink endpoint list | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/endpoint/confluent_flink_endpoint_list.html): confluent flink endpoint list Description List Flink endpoint. confluent flink endpoint list [flags] Flags --context string CLI context name. -o, --output string Specify the output format as "human...
- [confluent flink endpoint unset | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/endpoint/confluent_flink_endpoint_unset.html): confluent flink endpoint unset Description Unset the current Flink endpoint that was previously set with the use command. confluent flink endpoint unset [flags] Global Flags -h, --help Show help fo...
- [confluent flink endpoint use | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/endpoint/confluent_flink_endpoint_use.html): confluent flink endpoint use Description Use a Flink endpoint as active endpoint for all subsequent Flink dataplane commands in current environment, such as flink connection, flink statement and fli...
- [confluent flink endpoint | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/endpoint/index.html): confluent flink endpoint Description Manage Flink endpoint. Subcommands Command Description confluent flink endpoint list List Flink endpoint. confluent flink endpoint unset Unset the current Flink...
- [confluent flink environment create | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/environment/confluent_flink_environment_create.html): confluent flink environment create Description Create a Flink environment. confluent flink environment create <name> [flags] Flags --kubernetes-namespace string REQUIRED: Kubernetes namespace to de...
- [confluent flink environment delete | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/environment/confluent_flink_environment_delete.html): confluent flink environment delete Description Delete one or more Flink environments. confluent flink environment delete <name-1> [name-2] ... [name-n] [flags] Flags --url string Base URL of the Co...
- [confluent flink environment describe | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/environment/confluent_flink_environment_describe.html): confluent flink environment describe Description Describe a Flink environment. confluent flink environment describe <name> [flags] Flags --url string Base URL of the Confluent Manager for Apache Fl...
- [confluent flink environment list | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/environment/confluent_flink_environment_list.html): confluent flink environment list Description List Flink environments. confluent flink environment list [flags] Flags --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environme...
- [confluent flink environment update | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/environment/confluent_flink_environment_update.html): confluent flink environment update Description Update a Flink environment. confluent flink environment update <name> [flags] Flags --url string Base URL of the Confluent Manager for Apache Flink (C...
- [confluent flink environment | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/environment/index.html): confluent flink environment Aliases environment, env Description Manage Flink environments. Subcommands Command Description confluent flink environment create Create a Flink environment. confluent...
- [confluent flink | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/index.html): confluent flink Description Manage Apache Flink. Subcommands Cloud Command Description confluent flink artifact Manage Flink UDF artifacts. confluent flink compute-pool Manage Flink compute pools.
- [confluent flink region list | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/region/confluent_flink_region_list.html): confluent flink region list Description List Flink regions. confluent flink region list [flags] Flags --cloud string Specify the cloud provider as "aws", "azure", or "gcp". --context string CLI con...
- [confluent flink region unset | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/region/confluent_flink_region_unset.html): confluent flink region unset Description Unset the current Flink cloud and region that was set with the use command. confluent flink region unset [flags] Global Flags -h, --help Show help for this
- [confluent flink region use | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/region/confluent_flink_region_use.html): confluent flink region use Description Choose a Flink region to be used in subsequent commands which support passing a region with the --region flag. confluent flink region use [flags] Flags --clou...
- [confluent flink region | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/region/index.html): confluent flink region Description Manage Flink regions. Subcommands Command Description confluent flink region list List Flink regions. confluent flink region unset Unset the current Flink cloud a...
- [confluent flink statement create | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_create.html): confluent flink statement create Description Cloud Create a Flink SQL statement. confluent flink statement create [name] [flags] On-Premises Create a Flink SQL statement in Confluent Platform. confl...
- [confluent flink statement delete | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_delete.html): confluent flink statement delete Description Cloud Delete one or more Flink SQL statements. confluent flink statement delete <name-1> [name-2] ... [name-n] [flags] On-Premises Delete one or more Fli...
- [confluent flink statement describe | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_describe.html): confluent flink statement describe Description Cloud Describe a Flink SQL statement. confluent flink statement describe <name> [flags] On-Premises Describe a Flink SQL statement in Confluent Platfor...
- [confluent flink statement list | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_list.html): confluent flink statement list Description Cloud List Flink SQL statements. confluent flink statement list [flags] On-Premises List Flink SQL statements in Confluent Platform. confluent flink statem...
- [confluent flink statement rescale | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_rescale.html): confluent flink statement rescale Description Rescale a Flink SQL statement in Confluent Platform. confluent flink statement rescale <statement-name> [flags] Flags --environment string REQUIRED: Na...
- [confluent flink statement resume | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_resume.html): confluent flink statement resume Description Cloud Resume a Flink SQL statement. confluent flink statement resume <name> [flags] On-Premises Resume a Flink SQL statement in Confluent Platform. confl...
- [confluent flink statement stop | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_stop.html): confluent flink statement stop Description Cloud Stop a Flink SQL statement. confluent flink statement stop <name> [flags] On-Premises Stop a Flink SQL statement in Confluent Platform. confluent fli...
- [confluent flink statement update | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_update.html): confluent flink statement update Description Update a Flink SQL statement. confluent flink statement update <name> [flags] Flags --principal string A user or service account the statement runs as.
- [confluent flink statement web-ui-forward | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_web-ui-forward.html): confluent flink statement web-ui-forward Description Forward the web UI of a Flink statement in Confluent Platform. confluent flink statement web-ui-forward <name> [flags] Flags --environment strin...
- [confluent flink statement exception list | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/exception/confluent_flink_statement_exception_list.html): confluent flink statement exception list Description Cloud List exceptions for a Flink SQL statement. confluent flink statement exception list <statement-name> [flags] On-Premises List exceptions fo...
- [confluent flink statement exception | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/exception/index.html): confluent flink statement exception Description Manage Flink SQL statement exceptions. Subcommands Command Description confluent flink statement exception list List exceptions for a Flink SQL state...
- [confluent flink statement | Confluent Documentation](https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/index.html): confluent flink statement Description Manage Flink SQL statements. Subcommands Cloud Command Description confluent flink statement create Create a Flink SQL statement. confluent flink statement del...
- [Manage Confluent Platform for Apache Flink Applications Using Confluent for Kubernetes | Confluent Documentation](https://docs.confluent.io/operator/current/co-manage-flink.html): Manage Flink Applications Using Confluent for Kubernetes Apache Flink® is a powerful, scalable, and secure stream processing framework for running complex, stateful, low-latency streaming application...

## Full Documentation Content

### Flink SQL Autopilot in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/autopilot.html

Autopilot scales up and scales down the compute resources that SQL statements use in Confluent Cloud for Apache Flink®. Autopilot assigns resources efficiently to SQL statements submitted in Confluent Cloud and provides elastic autoscaling for the entire time the job is running.

One of the biggest benefits of using Confluent Cloud for Apache Flink is the built-in Autopilot capability. Autopilot takes care of all the work required to scale up or scale down the compute resources that a SQL statement consumes. Resources are scaled up when a SQL statement has an increased need for resources and scaled down when resources are not being used. This is all done automatically, and no manual work is required to monitor or adjust resources. This removes the complexity of managing your own infrastructure, removes the need for over-provisioning, and ensures that you never have to pay more than needed.

The autoscaling process is based on parallelism, which is the number of parallel operations that occur when the SQL statement is running. A SQL statement performs at its best when it has the optimal resources for its required parallelism.

#### Scaling status

The scaling status in the SQL workspace shows you how the statement resources are scaling. These are the possible scaling statuses.

| Scaling Status | Description |
|---|---|
| Fine | The SQL statement has enough resources to run at the required parallelism. |
| Pending Scale Down | The SQL statement has more resources than required and will be scaled down. |
| Pending Scale Up | The SQL statement doesn’t have enough resources and will be scaled up. |
| Compute Pool Exhausted | There aren’t enough resources in the compute pool for the statement to run with the required parallelism. |

#### Compute Pool Exhausted

The compute pool has run out of resources. SQL statements may run with a reduced parallelism, which could affect the overall performance of the statement, or a statement may not be able to run at all, because all resources in the compute pool are in use. There are two ways to resolve this situation:

- You can add more resources by increasing the CFU limit on the compute pool.
- You can stop some running statements to free up existing resources.

#### Messages Behind

Messages Behind is another indicator of how the statement is performing. The overall goal of Autopilot is to ensure that the SQL statement keeps up with the throughput of the source tables and topics, and to keep Messages Behind as close to zero as possible. In Apache Kafka® terms, Messages Behind is the Consumer Lag. A low or decreasing Messages Behind value indicates that Autopilot is doing its job successfully.

The following table describes scenarios in which Autopilot is scaling resources correctly or where it may be struggling.

| Messages Behind and Scaling Status | Description |
|---|---|
| Messages Behind is increasing; Scaling status = “Pending Scale Up” | Autopilot has identified a need for scaling up and will increase the Statement resources. Once resources have been scaled up, the Messages Behind should start decreasing. |
| Messages Behind is increasing; Scaling status = “Fine” | There is likely a problem. Reach out to Confluent Support. For more information, see Get Help with Confluent Cloud for Apache Flink. |
| Messages Behind is not increasing; Compute Pool is Exhausted | The statement resources can keep up with throughput but Autopilot needs to assign more resources to improve performance capacity. You can either add more resources by increasing the CFU limit on the compute pool or stop some running statements to free up existing resources. |

#### Related content

- Compute Pools
- DDL Statements
- Determinism in Continuous Queries

Note: This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

---

### Batch and Stream Processing in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/batch-and-stream-processing.html

Confluent Cloud for Apache Flink® supports both batch and stream processing, which enables you to process data in either finite (bounded) or infinite (unbounded) modes. Understanding the differences between these modes is crucial for designing efficient data pipelines and analytics solutions.

#### Overview

Flink is a distributed processing engine that excels at both batch and stream processing. While both modes share the same underlying engine and APIs, they have distinct characteristics, optimizations, and use cases. In Confluent Cloud for Apache Flink, batch mode is available by using snapshot queries.

#### Batch processing

Batch processing in Flink operates on bounded datasets, which are finite, static collections of data. This processing mode has the following key characteristics.

- It processes complete, finite datasets, like files or database snapshots.
- Batch jobs run to completion and then terminate.
- It is optimized for throughput, focusing on processing large volumes of data efficiently.
- Batch processing can sort, aggregate, and join across the entire dataset.
- The system can drop state as soon as it is no longer needed.

Use cases:

- Historical data analysis
- ETL (Extract, Transform, Load) operations
- Report generation
- Data warehousing

#### Stream processing

Stream processing in Flink handles unbounded data streams, which have data that arrives continuously and might never end. This processing mode has the following key characteristics.

- It processes infinite, continuous data streams, such as Kafka topics or sensor feeds.
- Stream processing jobs run indefinitely, processing data as it arrives.
- It focuses on processing data with minimal delay for low latency.
- It produces incremental results as new data arrives.
- The system must retain state to handle late or out-of-order events.

Use cases:

- Real-time analytics
- Fraud detection
- IoT data processing
- Live dashboards

#### Bounded and unbounded tables comparison

In Flink, tables can be either bounded (batch) or unbounded (streaming). The following table compares the key differences between these two modes.

| Aspect | Bounded Mode (Batch) | Unbounded Mode (Streaming) |
|---|---|---|
| Data Size | Finite (static) | Infinite (dynamic, continuous) |
| Processing Style | Batch processing | Real-time/continuous processing |
| Query Semantics | All data available at once | Data arrives over time |
| State Management | Minimal, can drop state when done | Must retain state for late/out-of-order data |
| Use Cases | ETL, reporting, historical analytics | Real-time analytics, monitoring, alerting |

#### Differences between batch and stream processing

The following table compares the important differences between batch and stream processing.

| Aspect | Batch Processing | Stream Processing |
|---|---|---|
| Data Model | Processes complete, finite datasets. | Processes infinite, continuous data streams. |
| Execution Model | Jobs run to completion. | Jobs run continuously. |
| Latency vs. Throughput | Optimized for high throughput. | Optimized for low latency. |
| State Management | Minimal state, which is dropped when no longer needed. | Robust state, which is retained for late or out-of-order data. |
| Fault Tolerance | Can restart from the beginning. | Requires checkpointing for fault recovery. |
| Query Semantics | All data is available at once, so global operations are possible. | Data arrives over time, so results are incremental. |
| SQL/API Differences | ORDER BY: You can use any sort order. Windowing: Supports time-based windows on static data. Deduplication: Deduplication is global. | ORDER BY: The primary sort must be on a time attribute. Windowing: Uses windows to scope aggregations over unbounded data. Deduplication: Deduplication is incremental and often uses windows. Session Windows: Supported. |

#### Unified processing model

One important advantage of Flink is its unified processing model. This means that the same runtime engine handles both batch and streaming. The engine treats batch processing as a special case of stream processing. The same APIs and operators work for both modes. You can use the same code for both batch and streaming applications.

This unified approach enables you to:

- Build applications that process both historical and real-time data.
- Seamlessly transition between batch and streaming modes.
- Maintain consistent semantics across processing modes.
- Leverage the same tools and libraries for both paradigms.

#### Time and watermarks

Time and watermarks are important concepts in Flink that help you process data correctly.

- Batch mode: Time is fixed. All data is available, so event time and processing time are equivalent.
- Streaming mode: Time is dynamic. Streaming mode uses watermarks to track event time progress and handle out-of-order data.
- Windowing: In streaming, you use windows (tumbling, hopping, cumulative, session) to group data for aggregation. In batch, windows apply to static data.

For more information, see Time and Watermarks.

#### Determinism

Determinism is a key concept in Flink that helps you ensure that your queries always produce the same results.

- Batch: Re-running a batch job on the same data yields the same result, except for non-deterministic functions like `UUID()`.
- Streaming: Results can vary due to timing, order of arrival, and late data. Determinism is harder to guarantee.

For more information, see Determinism in Continuous Queries.

#### Snapshot queries and batch mode

In Confluent Cloud for Apache Flink, batch mode is available by using snapshot queries.

- Snapshot queries: These are batch queries that automatically bound the input sources as of the current time.
- Batch optimizations: Batch mode enables optimizations like global sorting, blocking operators, and efficient joins. Snapshot queries benefit from these optimizations.
- Resource usage: Batch jobs, which are snapshot queries in Confluent Cloud for Apache Flink, release resources when finished. Streaming jobs hold resources as long as they run.

For more information, see Snapshot Queries.

#### Examples

A batch query that counts all orders in a bounded table: `SELECT COUNT(*) FROM orders;`

A streaming query that counts orders per minute in an unbounded stream: `SELECT window_start, window_end, COUNT(*) FROM TABLE( TUMBLE(TABLE orders, DESCRIPTOR(order_time), INTERVAL '1' MINUTE)) GROUP BY window_start, window_end;`

#### Related content

- Deduplication
- Determinism in Continuous Queries
- ORDER BY Clause
- Snapshot Queries
- Time and Watermarks
- Window Aggregation
- Window Deduplication

#### Code Examples

```sql
-- Count all orders in a bounded table
SELECT COUNT(*) FROM orders;
```
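
In Confluent Cloud, a batch query like the one above runs as a snapshot query. The following is a minimal sketch, not a definitive recipe: it assumes the `sql.snapshot.mode` session option described in the Snapshot Queries topic and reuses the `orders` table from the example above.

```sql
-- Assumption: snapshot (batch) mode is enabled per session with this option.
SET 'sql.snapshot.mode' = 'now';

-- Runs over the data available as of the current time,
-- produces one final count, and then terminates.
SELECT COUNT(*) AS order_count FROM orders;
```

Because the statement terminates when it finishes, it releases its compute pool resources instead of holding them as a streaming statement would.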

```sql
-- Count orders per minute in an unbounded stream.
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
  TUMBLE(TABLE orders, DESCRIPTOR(order_time), INTERVAL '1' MINUTE))
GROUP BY window_start, window_end;
```
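
The statement that streaming deduplication is incremental can be illustrated with the usual `ROW_NUMBER()` pattern. This is a sketch under assumptions: the `order_id`, `customer_id`, and `price` columns are hypothetical, and the ordering uses the `$rowtime` system column because the primary sort in streaming mode must be on a time attribute.

```sql
-- Keep only the first record seen for each order_id.
-- Ordering by the time attribute `$rowtime` lets Flink emit results
-- incrementally instead of waiting for the full (unbounded) input.
SELECT order_id, customer_id, price
FROM (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY `$rowtime` ASC) AS row_num
  FROM orders
)
WHERE row_num = 1;
```

Ordering by `$rowtime ASC` keeps the earliest record per key; switching to `DESC` would keep the latest record and turn the result into an updating stream.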

---

### Comparing Apache Flink with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/comparison-with-apache-flink.html

Confluent Cloud for Apache Flink® supports many of the capabilities of Apache Flink® and provides additional features. Also, Confluent Cloud for Apache Flink has some different behaviors and limitations relative to Apache Flink. This topic describes the key differences between Confluent Cloud for Apache Flink and Apache Flink.

#### Additional features

The following list shows features provided by Confluent Cloud for Apache Flink that go beyond what Apache Flink offers.

**Auto-inference of environments, clusters, topics, and schemas.** In Apache Flink, you must define and configure your tables and their schemas, including authentication and authorization to Apache Kafka®. Confluent Cloud for Apache Flink maps environments, clusters, topics, and schemas automatically from Confluent Cloud to the corresponding Apache Flink concepts of catalogs, databases, tables, and table schemas.

**Autoscaling.** Autopilot scales up and scales down the compute resources that SQL statements use in Confluent Cloud. The autoscaling process is based on parallelism, which is the number of parallel operations that occur when the SQL statement is running. A SQL statement performs at its best when it has the optimal resources for its required parallelism.

**Default system column implementation.** Confluent Cloud for Apache Flink has a default implementation for a system column named `$rowtime`. This column is mapped to the Kafka record timestamp, which can be either `LogAppendTime` or `CreateTime`.

**Default watermark strategy.** Flink requires a watermark strategy for a variety of features, such as windowing and temporal joins. Confluent Cloud for Apache Flink has a default watermark strategy applied on all tables/topics, which is based on the `$rowtime` system column. Apache Flink requires you to define a watermark strategy manually. For more information, see Event Time and Watermarks. Because the default strategy is defined for general usage, there are cases that require a custom strategy, for example, when delays in record arrival of longer than 7 days occur in your streams. You can override the default strategy with a custom strategy by using the ALTER TABLE statement.

**Schema Registry support for JSON_SR and Protobuf.** Confluent Cloud for Apache Flink has support for Schema Registry formats AVRO, JSON_SR, and Protobuf, while Apache Flink currently supports only Schema Registry AVRO.

**INFORMATION_SCHEMA support.** Confluent Cloud for Apache Flink has an implementation for INFORMATION_SCHEMA, which is a system view that provides insights on catalogs, databases, tables, and schemas. This doesn’t exist in Apache Flink.

#### Behavioral differences

The following list shows differences in behavior between Confluent Cloud for Apache Flink and Apache Flink.

**Configuration options.** Apache Flink supports various optimization configuration options on different levels, like Execution Options, Optimizer Options, Table Options, and SQL Client Options. Confluent Cloud for Apache Flink supports only the necessary subset of these options. Some of these options have different names in Confluent Cloud for Apache Flink, as shown in the following table. A dash (–) means that there is no Apache Flink counterpart.

| Confluent Cloud for Apache Flink | Apache Flink |
|---|---|
| client.results-timeout | table.exec.async-lookup.timeout |
| client.statement-name | – |
| sql.current-catalog | table.builtin-catalog-name |
| sql.current-database | table.builtin-database-name |
| sql.dry-run | – |
| sql.inline-result | – |
| sql.local-time-zone | table.local-time-zone |
| sql.state-ttl | table.exec.state.ttl |
| sql.tables.scan.bounded.timestamp-millis | scan.bounded.timestamp-millis |
| sql.tables.scan.bounded.mode | scan.bounded.mode |
| sql.tables.scan.idle-timeout | table.exec.source.idle-timeout |
| sql.tables.scan.startup.timestamp-millis | scan.startup.timestamp-millis |
| sql.tables.scan.startup.mode | scan.startup.mode |
| sql.tables.scan.watermark-alignment.max-allowed-drift | scan.watermark.alignment.max-drift |

**CREATE statements provision underlying resources.** When you run a CREATE TABLE statement in Confluent Cloud for Apache Flink, it creates the underlying Kafka topic and a Schema Registry schema in Confluent Cloud. In Apache Flink, a CREATE TABLE statement only registers the object in the Apache Flink catalog and doesn’t create an underlying resource. This also means that temporary tables are not supported in Confluent Cloud for Apache Flink, while they are in Apache Flink.

**One Kafka connector and only Confluent Cloud support.** Apache Flink contains a Kafka connector and an Upsert-Kafka connector, which, combined with the format, defines whether the source/sink is treated as an append-stream or update stream. Confluent Cloud for Apache Flink has only one Kafka connector and determines if the source/sink is an append-stream or update stream by examining the `changelog.mode` connector option. Confluent Cloud for Apache Flink only supports reading from and writing to Kafka topics that are located on Confluent Cloud. Apache Flink supports other connectors, like Kinesis, Pulsar, JDBC, etc., and also other Kafka environments, like on-premises and different cloud service providers.
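
The renamed session options and the single-connector behavior described above can be sketched together. This is an illustrative example only: the `customer_totals` table and its columns are hypothetical, and the option values (`'1 h'`, `'UTC'`) are assumed to be accepted duration and time-zone formats.

```sql
-- Session options use the Confluent Cloud names from the option mapping above.
SET 'sql.state-ttl' = '1 h';        -- Apache Flink name: table.exec.state.ttl
SET 'sql.local-time-zone' = 'UTC';  -- Apache Flink name: table.local-time-zone

-- In Confluent Cloud this statement also provisions the backing Kafka topic
-- and a Schema Registry schema; it does more than register a catalog entry.
CREATE TABLE customer_totals (
  customer_id STRING NOT NULL,
  total_spent DECIMAL(10, 2),
  PRIMARY KEY (customer_id) NOT ENFORCED
) WITH (
  -- One Kafka connector: this option decides append vs. update stream.
  'changelog.mode' = 'upsert'
);
```

Setting `'changelog.mode' = 'append'` instead would treat the same topic as an append-only stream, which is the role the separate Kafka connector plays in Apache Flink.
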
#### Limitations

The following list shows limitations of Confluent Cloud for Apache Flink compared with Apache Flink.

**Windowing functions syntax.** Confluent Cloud for Apache Flink supports the TUMBLE, HOP, SESSION, and CUMULATE windowing functions only through the table-valued function (TVF) syntax. Apache Flink also supports these windowing functions through the outdated Group Window Aggregation syntax.

**Unsupported statements and features.** Confluent Cloud for Apache Flink does not support the following statements and features.

- ANALYZE statements
- CALL statements
- CATALOG commands other than SHOW (no CREATE/DROP/ALTER)
- DATABASE commands other than SHOW (no CREATE/DROP/ALTER)
- DELETE statements
- DROP CATALOG and DROP DATABASE
- JAR statements
- LOAD / UNLOAD statements
- TRUNCATE statements
- UPDATE statements
- Processing time operations, like `PROCTIME()`, `TUMBLE_PROCTIME`, `HOP_PROCTIME`, `SESSION_PROCTIME`, and `CUMULATE_PROCTIME`

**Limited support for ALTER.** Confluent Cloud for Apache Flink has limited support for ALTER TABLE compared with Apache Flink. In Confluent Cloud for Apache Flink, you can use ALTER TABLE only to change the watermark strategy, add a metadata column, or change a parameter value.

#### Related content

- Flink SQL Autopilot
- Compute Pools
- DDL Statements in Confluent Cloud for Apache Flink
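
Two of the points above, the ALTER TABLE watermark override and the TVF-only window syntax, can be sketched as follows. The `orders` table and the 10-second delay are assumptions chosen for illustration, not recommended values.

```sql
-- Override the default watermark strategy on the assumed `orders` table.
ALTER TABLE orders
  MODIFY WATERMARK FOR `$rowtime` AS `$rowtime` - INTERVAL '10' SECOND;

-- Windowed aggregation written in table-valued function (TVF) form:
-- a 5-minute window that slides every minute.
SELECT window_start, window_end, COUNT(*) AS order_count
FROM TABLE(
  HOP(TABLE orders, DESCRIPTOR(`$rowtime`), INTERVAL '1' MINUTE, INTERVAL '5' MINUTE))
GROUP BY window_start, window_end;

-- Remove the custom strategy so the default watermark strategy applies again.
ALTER TABLE orders DROP WATERMARK;
```

The window is defined on `$rowtime` (event time) because processing-time operations such as `PROCTIME()` are not supported.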

---

### Compute Pools in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/compute-pools.html

A compute pool in Confluent Cloud for Apache Flink® represents a set of compute resources bound to a region that is used to run your SQL statements. The resources provided by a compute pool are shared between all statements that use it.

The capacity of a compute pool is measured in CFUs. Compute pools expand and shrink automatically based on the resources required by the statements using them. A compute pool without any running statements scales down to zero. The maximum size of a compute pool is configured during creation.

A compute pool is provisioned in a specific region. The statements using a compute pool can only read and write Apache Kafka® topics in the same region as the compute pool.

Compute pools fulfill two roles:

- Workload Isolation: Statements in different compute pools are isolated from each other.
- Budgeting: Statements within a compute pool can’t use more than the configured maximum number of CFUs.

#### Compute pools and isolation

All statements using the same compute pool compete for resources. Although Confluent Cloud’s Autopilot aims to provide each statement with the resources it needs, this might not always be possible, in particular when the maximum resources of the compute pool are exhausted.

To avoid situations in which statements with different latency and availability requirements compete for resources, Confluent recommends using separate compute pools for different use cases, for example, ad-hoc exploration vs. mission-critical, long-running queries. Because statements may affect each other, Confluent recommends sharing compute pools only between statements with comparable requirements.

#### Manage compute pools

You can use these Confluent tools to create and manage compute pools.

- Cloud Console
- Confluent CLI
- REST API
- Confluent Terraform Provider

#### Authorization

You must be authorized to create, update, or delete (FlinkAdmin) or to use (FlinkDeveloper) a compute pool. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink.

#### Move statements between compute pools

You can move a statement from one compute pool to another. This can be useful if you’re close to maxing out the resources in one pool. To move a running statement, you must stop the statement, change its compute pool, then restart the statement.

#### Related content

- Billing on Confluent Cloud for Apache Flink
- DDL Statements

---

### Delivery Guarantees and Latency in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/delivery-guarantees.html

Confluent Cloud for Apache Flink® provides exactly-once semantics end-to-end by default, which means that every input message is reflected exactly once in the output of a statement and every output message is delivered exactly once. To achieve this, Confluent Cloud for Apache Flink relies on Apache Flink®’s checkpointing mechanism and Kafka transactions.

While checkpointing and fault tolerance are Confluent’s responsibility, it is important to understand the implications of how Flink reads from and writes to Kafka:

- Flink statements write to Kafka by using transactions. Transactions are committed periodically, approximately every minute.
- Flink by default only reads committed messages from Kafka. For more information, see `isolation.level`.

This implies that, depending on the delivery guarantees required by your use case, you can currently achieve different end-to-end latencies with Flink.

- Exactly-Once: If you require exactly-once, the latency is roughly one minute and is dominated by the interval at which Kafka transactions are committed. In this case, ensure that all consumers of the output topics of Flink statements use `isolation.level: read_committed`, or set the Flink table option `'kafka.consumer.isolation-level' = 'read-committed'`.
- At-Least-Once: If at-least-once is sufficient for your use case, you can read from the output topics with `isolation.level: read_uncommitted`, which is the default in Kafka, or set the Flink table option `'kafka.consumer.isolation-level' = 'read-uncommitted'`. With this configuration, depending on the logic of your query, you can achieve an end-to-end latency below 100 ms, but you may see some output messages twice. This happens when Flink needs to abort a transaction that your consumer has already read. You won’t see incorrect results, but you may see the same correct result multiple times.

Note: Confluent is actively working on reducing the latency under exactly-once semantics. If your use case requires a lower latency, reach out to Support or your account manager.

#### Related content

- Statements
- Determinism
- Flink SQL Queries
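
The table option mentioned above can be set with ALTER TABLE, which Confluent Cloud for Apache Flink allows for changing a parameter value. This is a minimal sketch; the `orders_enriched` table name is hypothetical and stands for any table that a downstream Flink statement reads.

```sql
-- At-least-once reads: lower latency, but records from transactions that are
-- later aborted may be seen, so the same result can appear more than once.
ALTER TABLE orders_enriched SET ('kafka.consumer.isolation-level' = 'read-uncommitted');

-- Exactly-once reads: only committed transactions are visible, so latency is
-- dominated by the transaction commit interval (roughly one minute).
ALTER TABLE orders_enriched SET ('kafka.consumer.isolation-level' = 'read-committed');
```

Consumers outside Flink are not affected by this table option and need the Kafka consumer setting `isolation.level` instead.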

---

### Determinism with continuous Flink SQL queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/determinism.html

This topic answers the following questions about determinism in Confluent Cloud for Apache Flink®:

- What is determinism?
- Is all batch processing deterministic?
- Two examples of batch queries with non-deterministic results
- Non-determinism in batch processing
- Determinism in stream processing
- Non-determinism in streaming
- Non-deterministic update in streaming

#### What is determinism?

Paraphrasing the SQL standard’s description of determinism, an operation is deterministic if it reliably computes identical results when repeated with identical input values.

#### Determinism for regular batch queries

In a classic batch scenario, repeated execution of the same query over a given bounded data set yields consistent results, which is the most intuitive understanding of determinism. In practice, however, the same query does not always return consistent results in batch processing either, as the following example queries show.

#### Two examples of batch queries with non-deterministic results

For example, consider a newly created website click log table: `CREATE TABLE clicks ( uid VARCHAR(128), cTime TIMESTAMP(3), url VARCHAR(256) )`

Some new records are written to the table:

| uid | cTime | url |
|------|---------------------|------------|
| Mary | 2023-08-22 12:00:01 | /home |
| Bob | 2023-08-22 12:00:01 | /home |
| Mary | 2023-08-22 12:00:05 | /prod?id=1 |
| Liz | 2023-08-22 12:01:00 | /home |
| Mary | 2023-08-22 12:01:30 | /cart |
| Bob | 2023-08-22 12:01:35 | /prod?id=3 |

The following query applies a time filter to the click log table to return the last two minutes of click records: `SELECT * FROM clicks WHERE cTime BETWEEN TIMESTAMPADD(MINUTE, -2, CURRENT_TIMESTAMP) AND CURRENT_TIMESTAMP;`

When the query was submitted at “2023-08-22 12:02:00”, it returned all 6 rows in the table. When it was executed again a minute later, at “2023-08-22 12:03:00”, it returned only 3 rows:

| uid | cTime | url |
|------|---------------------|------------|
| Liz | 2023-08-22 12:01:00 | /home |
| Mary | 2023-08-22 12:01:30 | /cart |
| Bob | 2023-08-22 12:01:35 | /prod?id=3 |

Another query adds a unique identifier to each returned record, because the clicks table doesn’t have a primary key: `SELECT UUID() AS uuid, * FROM clicks LIMIT 3;`

Executing this query twice in a row generates a different uuid identifier for each row.

First execution:

| uuid | uid | cTime | url |
|--------------------------------|------|---------------------|------------|
| aaaa4894-16d4-44d0-a763-03f... | Mary | 2023-08-22 12:00:01 | /home |
| ed26fd46-960e-4228-aaf2-0aa... | Bob | 2023-08-22 12:00:01 | /home |
| 1886afc7-dfc6-4b20-a881-b0e... | Mary | 2023-08-22 12:00:05 | /prod?id=1 |

Second execution:

| uuid | uid | cTime | url |
|--------------------------------|------|---------------------|------------|
| 95f7301f-bcf2-4b6f-9cf3-1ea... | Mary | 2023-08-22 12:00:01 | /home |
| 63301e2d-d180-4089-876f-683... | Bob | 2023-08-22 12:00:01 | /home |
| f24456d3-e942-43d1-a00f-fdb... | Mary | 2023-08-22 12:00:05 | /prod?id=1 |

#### Non-determinism in batch processing

Non-determinism in batch processing is caused mainly by non-deterministic functions, as shown in the previous query examples, where the built-in functions `CURRENT_TIMESTAMP` and `UUID()` actually behave differently in batch processing.
Compare with the following query: `SELECT CURRENT_TIMESTAMP, * FROM clicks;`

`CURRENT_TIMESTAMP` has the same value on all returned records:

| CURRENT_TIMESTAMP | uid | cTime | url |
|-------------------------|------|---------------------|------------|
| 2023-08-23 17:25:46.831 | Mary | 2023-08-22 12:00:01 | /home |
| 2023-08-23 17:25:46.831 | Bob | 2023-08-22 12:00:01 | /home |
| 2023-08-23 17:25:46.831 | Mary | 2023-08-22 12:00:05 | /prod?id=1 |
| 2023-08-23 17:25:46.831 | Liz | 2023-08-22 12:01:00 | /home |
| 2023-08-23 17:25:46.831 | Mary | 2023-08-22 12:01:30 | /cart |
| 2023-08-23 17:25:46.831 | Bob | 2023-08-22 12:01:35 | /prod?id=3 |

This difference exists because Flink SQL inherits the definition of functions from Apache Calcite, which has two types of functions besides deterministic functions: non-deterministic functions and dynamic functions.

- Non-deterministic functions are executed at runtime in clusters and evaluated per record.
- Dynamic functions determine their values only when the query plan is generated. They’re not executed at runtime, so different values are obtained at different times, but the same values are obtained during the same execution. Built-in dynamic functions are mainly temporal functions.

#### Determinism in stream processing

A core difference between streaming and batch is the unboundedness of the data. Flink SQL abstracts stream processing as a continuous query on dynamic tables. So the dynamic function in the batch query example is equivalent to a non-deterministic function in stream processing, where logically every change in the base table triggers the query to be executed.
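The distinction between dynamic and non-deterministic functions can be imitated outside of Flink. The following Python sketch is only an illustration, not Flink's implementation: `fake_now` is a hypothetical clock that returns a new tick on every call, and the three helper functions are invented names that model plan-time evaluation, per-record evaluation, and the streaming case where every change re-evaluates the function.

```python
import itertools
import uuid

# Hypothetical stand-in for the clicks table from the example.
clicks = [("Mary", "/home"), ("Bob", "/home"), ("Mary", "/prod?id=1")]

# Hypothetical clock: every call returns the next tick, which keeps the demo reproducible.
_ticks = itertools.count(start=1)


def fake_now():
    return next(_ticks)


def batch_with_dynamic_function(rows):
    """Resolve the clock once, as if at plan time, and reuse it for every row."""
    planned = fake_now()
    return [(planned, *row) for row in rows]


def batch_with_non_deterministic_function(rows):
    """Call the function once per record, as if at runtime."""
    return [(str(uuid.uuid4()), *row) for row in rows]


def streaming_with_dynamic_function(rows):
    """Re-evaluate the clock for every incoming change, as a continuous query logically does."""
    return [(fake_now(), *row) for row in rows]


first_run = batch_with_dynamic_function(clicks)
second_run = batch_with_dynamic_function(clicks)
uuid_run = batch_with_non_deterministic_function(clicks)
stream_run = streaming_with_dynamic_function(clicks)

print({row[0] for row in first_run})      # one value shared by all rows of the run
print({row[0] for row in second_run})     # a different value, again shared by all rows
print(len({row[0] for row in uuid_run}))  # one distinct identifier per row
print([row[0] for row in stream_run])     # a different clock value per row
```

Within one batch run the clock value is shared by every row, while two runs disagree with each other. In the streaming variant each row sees its own value, which is the behavior the text describes for `CURRENT_TIMESTAMP` in a continuous query.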
If the clicks log table in the previous example comes from an Apache Kafka® topic that’s continuously written, the same query in stream mode returns a `CURRENT_TIMESTAMP` that changes over time: `SELECT CURRENT_TIMESTAMP, * FROM clicks;`

For example:

| CURRENT_TIMESTAMP | uid | cTime | url |
|-------------------------|------|---------------------|------------|
| 2023-08-23 17:25:46.831 | Mary | 2023-08-22 12:00:01 | /home |
| 2023-08-23 17:25:47.001 | Bob | 2023-08-22 12:00:01 | /home |
| 2023-08-23 17:25:47.310 | Mary | 2023-08-22 12:00:05 | /prod?id=1 |

#### Non-determinism in streaming

In addition to non-deterministic functions, these other factors may generate non-determinism:

- Non-deterministic back read of a source connector.
- Querying based on processing time. Processing time is not supported in Confluent Cloud for Apache Flink.
- Clearing internal state data based on TTL.

##### Non-deterministic back read of source connector

For Flink SQL, the determinism provided is limited to the computation only, because it doesn’t store user data itself. Here, it’s necessary to distinguish between the managed internal state in streaming mode and the user data itself. If the source connector’s implementation can’t provide deterministic back read, it brings non-determinism to the input data, which may produce non-deterministic results. Common examples are inconsistent data for multiple reads from the same offset, or requests for data that no longer exists because of the retention time, for example, when the requested data is beyond the configured TTL of a Kafka topic.

##### Clear internal state data based on TTL

Because of the unbounded nature of stream processing, the internal state data maintained by long-running streaming queries in operations like regular join and group aggregation (non-windowed aggregation) may continuously increase.
Setting a state TTL to clean up internal state data is often a necessary compromise, but it may make the computation results non-deterministic.

The impact of non-determinism differs from query to query. For some queries it only produces non-deterministic results, which means that the query works, but multiple runs fail to produce consistent results. Other queries can suffer more serious effects, like incorrect results or runtime errors. The main cause of these failures is “non-deterministic update”.

#### Non-deterministic update in streaming

Flink implements a complete incremental update mechanism based on the abstraction of continuous queries on dynamic tables. All operations that need to generate incremental messages maintain complete internal state data. The operation of the entire query pipeline, including the complete DAG from source to sink operators, relies on the correct delivery of update messages between operators. Non-determinism can break this guarantee, which leads to errors.

What is a “non-deterministic update” (NDU)? Update messages (the changelog) may contain these kinds of message types:

- Insert (I)
- Delete (D)
- Update_Before (UB)
- Update_After (UA)

In an insert-only changelog pipeline, there’s no NDU problem. When the changelog contains at least one D, UB, or UA message in addition to I, the update key of the messages, which can be regarded as the primary key of the changelog, is deduced from the query.

- When the update key can be deduced, the operators in the pipeline maintain the internal state by the update key.
- When the update key can’t be deduced, it’s possible that the primary key isn’t defined in the CDC source table or sink table, or that the key can’t be deduced from the semantics of the query for some operations. In this case, all operators maintaining internal state can only process update (D/UB/UA) messages through complete rows. Sink nodes work in retract mode when no primary key is defined, and delete operations are performed by complete rows.
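To make the complete-row case concrete, here is a minimal Python sketch of state that can only be matched by the whole row. It is an illustration under stated assumptions, not Flink's operator code: `apply_by_complete_row` and `replay` are hypothetical helpers, and the third column of each row stands in for a value produced by a non-deterministic function such as `UUID()`.

```python
from collections import Counter


def apply_by_complete_row(state, kind, row):
    """Apply one changelog message to state that is matched by the complete row."""
    if kind in ("I", "UA"):
        state[row] += 1
    elif kind in ("D", "UB"):
        if state[row] == 0:
            raise LookupError(f"no stored row matches {row!r}")
        state[row] -= 1
        if state[row] == 0:
            del state[row]
    else:
        raise ValueError(f"unknown message kind: {kind}")


def replay(messages):
    """Replay a changelog into fresh state and return the surviving rows."""
    state = Counter()
    for kind, row in messages:
        apply_by_complete_row(state, kind, row)
    return dict(state)


# Stable column values: the UB message repeats the stored row exactly.
stable = [
    ("I", ("Mary", "/home", "id-1")),
    ("UB", ("Mary", "/home", "id-1")),
    ("UA", ("Mary", "/cart", "id-1")),
]

# Unstable column values: the third column was recomputed, so UB no longer matches.
unstable = [
    ("I", ("Mary", "/home", "id-1")),
    ("UB", ("Mary", "/home", "id-2")),
    ("UA", ("Mary", "/cart", "id-3")),
]

print(replay(stable))
try:
    replay(unstable)
except LookupError as err:
    print("retraction failed:", err)
```

With an update key, the retraction could be matched on the key alone and the changed third column would not matter. Without one, a single recomputed column value is enough to make the retraction miss its target.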
This means that in update-by-row mode, the update messages received by the operators that maintain state must not be affected by non-deterministic column values. Otherwise, NDU problems occur and result in computation errors.

For a query pipeline with update messages that can’t derive the update key, the following are the most important sources of NDU problems:

- Non-deterministic functions, including scalar, table, and aggregate functions, whether built-in or custom
- LookupJoin on an evolving source
- A CDC source that carries metadata fields, like system columns, that don’t belong to the row entity itself

Exceptions caused by cleaning internal state data based on TTL are discussed separately as a runtime fault-tolerant handling strategy. For more information, see FLINK-24666.

#### Related content

- Flink SQL Queries

Note: This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
CREATE TABLE clicks (
    uid VARCHAR(128),
    cTime TIMESTAMP(3),
    url VARCHAR(256)
);
```

```sql
+------+---------------------+------------+
|  uid |               cTime |        url |
+------+---------------------+------------+
| Mary | 2023-08-22 12:00:01 |      /home |
|  Bob | 2023-08-22 12:00:01 |      /home |
| Mary | 2023-08-22 12:00:05 | /prod?id=1 |
|  Liz | 2023-08-22 12:01:00 |      /home |
| Mary | 2023-08-22 12:01:30 |      /cart |
|  Bob | 2023-08-22 12:01:35 | /prod?id=3 |
+------+---------------------+------------+
```

```sql
SELECT * FROM clicks
WHERE cTime BETWEEN TIMESTAMPADD(MINUTE, -2, CURRENT_TIMESTAMP) AND CURRENT_TIMESTAMP;
```

```sql
+------+---------------------+------------+
|  uid |               cTime |        url |
+------+---------------------+------------+
|  Liz | 2023-08-22 12:01:00 |      /home |
| Mary | 2023-08-22 12:01:30 |      /cart |
|  Bob | 2023-08-22 12:01:35 | /prod?id=3 |
+------+---------------------+------------+
```

```sql
SELECT UUID() AS uuid, * FROM clicks LIMIT 3;
```

```sql
-- first execution
+--------------------------------+------+---------------------+------------+
|                           uuid |  uid |               cTime |        url |
+--------------------------------+------+---------------------+------------+
| aaaa4894-16d4-44d0-a763-03f... | Mary | 2023-08-22 12:00:01 |      /home |
| ed26fd46-960e-4228-aaf2-0aa... |  Bob | 2023-08-22 12:00:01 |      /home |
| 1886afc7-dfc6-4b20-a881-b0e... | Mary | 2023-08-22 12:00:05 | /prod?id=1 |
+--------------------------------+------+---------------------+------------+

-- second execution
+--------------------------------+------+---------------------+------------+
|                           uuid |  uid |               cTime |        url |
+--------------------------------+------+---------------------+------------+
| 95f7301f-bcf2-4b6f-9cf3-1ea... | Mary | 2023-08-22 12:00:01 |      /home |
| 63301e2d-d180-4089-876f-683... |  Bob | 2023-08-22 12:00:01 |      /home |
| f24456d3-e942-43d1-a00f-fdb... | Mary | 2023-08-22 12:00:05 | /prod?id=1 |
+--------------------------------+------+---------------------+------------+
```

```sql
SELECT CURRENT_TIMESTAMP, * FROM clicks;
```

```sql
+-------------------------+------+---------------------+------------+
|       CURRENT_TIMESTAMP |  uid |               cTime |        url |
+-------------------------+------+---------------------+------------+
| 2023-08-23 17:25:46.831 | Mary | 2023-08-22 12:00:01 |      /home |
| 2023-08-23 17:25:46.831 |  Bob | 2023-08-22 12:00:01 |      /home |
| 2023-08-23 17:25:46.831 | Mary | 2023-08-22 12:00:05 | /prod?id=1 |
| 2023-08-23 17:25:46.831 |  Liz | 2023-08-22 12:01:00 |      /home |
| 2023-08-23 17:25:46.831 | Mary | 2023-08-22 12:01:30 |      /cart |
| 2023-08-23 17:25:46.831 |  Bob | 2023-08-22 12:01:35 | /prod?id=3 |
+-------------------------+------+---------------------+------------+
```

```sql
SELECT CURRENT_TIMESTAMP, * FROM clicks;
```

```sql
+-------------------------+------+---------------------+------------+
|       CURRENT_TIMESTAMP |  uid |               cTime |        url |
+-------------------------+------+---------------------+------------+
| 2023-08-23 17:25:46.831 | Mary | 2023-08-22 12:00:01 |      /home |
| 2023-08-23 17:25:47.001 |  Bob | 2023-08-22 12:00:01 |      /home |
| 2023-08-23 17:25:47.310 | Mary | 2023-08-22 12:00:05 | /prod?id=1 |
+-------------------------+------+---------------------+------------+
```

---

### Tables and Topics in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/dynamic-tables.html

Apache Flink® and the Table API use the concept of dynamic tables to facilitate the manipulation and processing of streaming data. Dynamic tables represent an abstraction for working with both batch and streaming data in a unified manner, offering a flexible and expressive way to define, modify, and query structured data. In contrast to the static tables that represent batch data, dynamic tables change over time. But like static batch tables, systems can execute queries over dynamic tables.

Confluent Cloud for Apache Flink® implements ANSI-Standard SQL and has the familiar concepts of catalogs, databases, and tables. Confluent Cloud maps a Flink catalog to an environment and vice-versa. Similarly, Flink databases and tables are mapped to Apache Kafka® clusters and topics. For more information, see Metadata mapping between Kafka cluster, topics, schemas, and Flink.

#### Dynamic tables and continuous queries

Every table in Flink is equivalent to a stream of events describing the changes that are being made to that table. A stream of changes like this is called a changelog stream. Essentially, a stream is the changelog of a table, and every table is backed by a stream. This is also the case for regular database tables.

Querying a dynamic table yields a continuous query. A continuous query never terminates and produces dynamic results: another dynamic table. The query continuously updates its dynamic result table to reflect changes on its dynamic input tables. Essentially, a continuous query on a dynamic table is similar to a query that defines a materialized view. The output of a continuous query is always equivalent to the result of the same query executed in batch mode on a snapshot of the input tables.

##### Append-only table

Figure: Stream-table duality for an append-only table.

In this animation, the only changes happening to the Orders table are the new orders being appended to the end of the table.
The corresponding changelog stream is just a stream of INSERT events. Adding another order to the table is the same as adding another INSERT statement to the stream, as shown below the table. This is an example of an append-only or insert-only table.

##### Updating table

Not all tables are append-only tables. Tables can also contain events that modify or delete existing rows. The changelog stream used by Flink SQL contains three additional event types to accommodate different ways that tables can be updated. Besides the regular Insertion event, Update Before and Update After are a pair of events that work together to update an earlier result. The Delete event has the effect you would expect, removing a record from the table.

Figure: Stream-table duality for an updating table.

This animation has the same starting point as the previous example that showed the append-only table. But this time, an order has been cancelled, and the item in that order hasn’t been sold. The result of this event is that the Bestsellers table is updated, rather than receiving another insert. The update starts with appending another order to the append-only/insert-only Orders table, which is registered as an INSERT event in the changelog stream. Because the SQL statement is doing grouping, the result is an updating table instead of an append-only/insert-only table.

In this example, an order for 15 hats is cancelled. To process the event with the 15-hat order cancellation, the query produces two update events:

- The first is an UPDATE_BEFORE event that retracts the current result that showed 50 hats as the bestselling item.
- The second is an UPDATE_AFTER event that replaces the old entry with a new one that shows 35 hats.

Conceptually, the UPDATE_BEFORE event is processed first, which removes the old entry from the Bestsellers table. Then, the sink processes the UPDATE_AFTER event, which inserts the updated results.
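The hats example can be traced with a small Python sketch of a grouped running sum that emits changelog entries. This is a simplified model rather than Flink's aggregation operator: the function names are hypothetical, the cancellation is modeled as an order with a negative quantity, and the `mugs` row is invented to show that other groups are untouched.

```python
def aggregate_changelog(orders):
    """Yield changelog entries for a running SUM(quantity) grouped by item."""
    totals = {}
    for item, quantity in orders:
        if item in totals:
            # Retract the previously emitted result, then emit the new one.
            yield ("-U", item, totals[item])
            totals[item] += quantity
            yield ("+U", item, totals[item])
        else:
            totals[item] = quantity
            yield ("+I", item, quantity)


def materialize(changelog):
    """Apply changelog entries in order to rebuild the result table."""
    table = {}
    for kind, item, total in changelog:
        if kind in ("+I", "+U"):
            table[item] = total
        else:  # "-U" or "-D" removes the old entry
            table.pop(item, None)
    return table


# Append-only input: 50 hats sold, 20 mugs sold, then the 15-hat cancellation.
orders = [("hats", 50), ("mugs", 20), ("hats", -15)]
changelog = list(aggregate_changelog(orders))

for entry in changelog:
    print(entry)
print(materialize(changelog))
```

The input stream contains only inserts, yet the grouped result is an updating table: the last order produces an Update Before entry for 50 hats followed by an Update After entry for 35 hats.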
The following figure visualizes the relationship of streams, dynamic tables, and continuous queries:

- A stream is converted into a dynamic table.
- A continuous query is evaluated on the dynamic table, yielding a new dynamic table.
- The resulting dynamic table is converted back into a stream.

Dynamic tables are a logical concept. The only state that is actually materialized by the Flink SQL runtime is whatever is strictly necessary to produce correct results for the specific query being executed. For example, the previous diagram shows a query executing a simple filter. This requires no state, so nothing is materialized.

#### Changelog entries

Flink provides four different types of changelog entries:

| Short name | Long name | Semantics |
|------------|-----------|-----------|
| +I | Insertion | Records only the insertions that occur. |
| -U | Update Before | Retracts a previously emitted result. Update Before is an update operation with the previous content of the updated row. This kind occurs together with Update After (+U) for modeling an update that must retract the previous row first. It is useful in cases of a non-idempotent update, which is an update of a row that is not uniquely identifiable by a key. |
| +U | Update After | Updates a previously emitted result. Update After is an update operation with new content for the updated row. This kind can occur together with Update Before (-U) for modeling an update that must retract the previous row first, or it can describe an idempotent update, which is an update of a row that is uniquely identifiable by a key. |
| -D | Delete | Deletes the last result. |

The - character always means that a row is being removed.

If the downstream system supports upserting, you should use a primary key in Confluent Cloud for Apache Flink to avoid the need to use Update Before.

Depending on the combination of source, sink, and business logic applied, you can end up with the following types of changelog streams.
| Changelog stream types | Stream category | Changelog entry types |
|------------------------|-----------------|-----------------------|
| Appending stream | Append stream | Contains only +I |
| Upserting streams | Update stream | +I, +U, -D (never contains -U but can contain +U and/or -D) |
| Retracting stream | Update stream | +I, +U, -U, -D (contains +I and can contain -U and/or -D) |

All streams can have +I / inserts. Both retract and upsert streams can have -D / deletes and +U / upserts (upsert afters). Only retract streams can have -U.

#### Related content

- Flink SQL Queries
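The rules for changelog stream types can be expressed as a short Python sketch. It classifies a stream only from the entry types that were observed in it, which is an assumption of this example: `classify_changelog_stream` is a hypothetical helper, and a stream declared as upserting that happens to contain only inserts would be reported here as appending.

```python
KNOWN_ENTRY_TYPES = {"+I", "+U", "-U", "-D"}


def classify_changelog_stream(entry_types):
    """Return the narrowest stream type that can contain the observed entry types."""
    kinds = set(entry_types)
    unknown = kinds - KNOWN_ENTRY_TYPES
    if unknown:
        raise ValueError(f"unknown changelog entry types: {sorted(unknown)}")
    if "-U" in kinds:
        # Only retracting streams can contain Update Before entries.
        return "retracting"
    if kinds & {"+U", "-D"}:
        # Updates or deletes without -U fit an upserting stream.
        return "upserting"
    return "appending"


print(classify_changelog_stream(["+I", "+I"]))
print(classify_changelog_stream(["+I", "+U", "-D"]))
print(classify_changelog_stream(["+I", "-U", "+U"]))
```

The order of the checks mirrors the table: the presence of -U decides first, then the presence of +U or -D, and an insert-only stream is the fallback.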

---

### Billing in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/flink-billing.html

Confluent Cloud for Apache Flink® is a serverless stream-processing platform with usage-based pricing, where you are charged only for the duration that your queries are running. You configure Flink by creating a Flink compute pool. You are charged for the CFUs consumed when you run statements within a compute pool. While the compute pool itself scales elastically to provide the necessary resources, your cost is determined by the actual CFUs used per minute, not the provisioned size of the pool. You can configure the maximum size of a compute pool to limit your spending.

#### CFUs

A CFU is a logical unit of processing power that is used to measure the resources consumed by Confluent Cloud for Apache Flink. Each Flink statement consumes a minimum of 1 CFU-minute but may consume more depending on the needs of the workload.

#### CFU billing

You are billed for the total number of CFUs consumed inside a compute pool per minute. Usage is stated in hours in order to apply hourly pricing to minute-by-minute use. For example, 30 CFU-minutes is 0.5 CFU-hours.

CFU pricing: $0.21/CFU-hour, calculated by the minute ($0.0035/CFU-minute). Prices vary by cloud region.

#### Networking fees

Using Flink to read and write data from Apache Kafka® doesn’t add any new Flink-specific networking fees, but you’re still responsible for the Confluent Cloud networking rates for data read from and written to your Kafka clusters. These are existing Kafka costs, not new charges created by Flink.

#### Cost Management

You can’t define the number of CFUs required for individual statements. CFUs are counted by Confluent Cloud for Apache Flink. You can configure the maximum size of a compute pool to limit your spending by setting a parameter named MAX_CFU, which sets an upper limit on the hourly spend on the compute pool. If the size of the workload in a pool exceeds MAX_CFU, new statements are rejected. Existing workloads continue running but may experience increased latency.
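The unit conversion and the spending cap can be checked with a few lines of Python. This is a rough sketch, not a billing tool: the function names are hypothetical, the $0.21 rate is the example rate quoted above and varies by region, and the hourly ceiling assumes that a pool running at MAX_CFU for a full hour consumes exactly MAX_CFU CFU-hours.

```python
from decimal import Decimal

# Example rate quoted in the text; actual prices vary by cloud region.
PRICE_PER_CFU_HOUR = Decimal("0.21")


def cfu_minutes_to_hours(cfu_minutes):
    """Convert metered CFU-minutes into the CFU-hours used for pricing."""
    return Decimal(cfu_minutes) / Decimal(60)


def charge(cfu_minutes, price_per_cfu_hour=PRICE_PER_CFU_HOUR):
    """Charge for a number of CFU-minutes at an hourly CFU rate."""
    return cfu_minutes_to_hours(cfu_minutes) * price_per_cfu_hour


def max_hourly_spend(max_cfu, price_per_cfu_hour=PRICE_PER_CFU_HOUR):
    """Upper bound on hourly spend for a pool capped at max_cfu (assumed model)."""
    return Decimal(max_cfu) * price_per_cfu_hour


print(cfu_minutes_to_hours(30))          # CFU-hours for 30 CFU-minutes
print(PRICE_PER_CFU_HOUR / Decimal(60))  # price per CFU-minute
print(charge(30))                        # charge for 30 CFU-minutes
print(max_hourly_spend(10))              # ceiling for a pool with MAX_CFU = 10
```

`Decimal` is used instead of floating point so that small per-minute amounts are not distorted by binary rounding.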
Note: You can increase the MAX_CFU value after you create a compute pool, but decreasing the initial MAX_CFU value is not supported. For more information, see Update a compute pool.

For more information on CFU prices, see Confluent Cloud Pricing.

#### Pricing examples

Data streaming is a real-time business, and data streams oscillate on a minute-by-minute basis, creating peaks and troughs of utilization. You don’t want to allocate and overpay for processing capacity that you aren’t using. With Confluent Cloud for Apache Flink, you pay only for the processing power that you actually use. The following examples provide additional detail on how pricing works when processing streams using Confluent Cloud for Apache Flink.

##### Data exploration and discovery

Most SQL queries are short-running, interactive queries that help software and data engineers understand the streams they have access to. Querying the streams directly is an important and necessary step in the iterative development of apps and pipelines.

In the following example, one user executes five different queries. Unlike other Flink offerings, Confluent Cloud for Apache Flink’s serverless architecture charges you only for the five minutes when these queries are executing, with all users able to share the resources of a single compute pool. It doesn’t matter if these queries are executed by the same person, by five different people at the same time or, as shown below, at different points in the hour.

Example pricing calculation:

- Number of queries executed = 5
- Total CFU-minutes consumed = 5
- Total charge: 5 CFU-minutes x $0.0035/CFU-minute = $0.0175

Note: The charge appears on the invoice as “0.083 CFU-hours x $0.21/CFU-hour”.

##### Many data streaming apps and statements

Data streaming architectures are composed of many applications, each with their own workload requirements. An architecture can be a mix of interactive, terminating statements and continuous, streaming statements.
Confluent Cloud for Apache Flink automatically scales the processing power of the Flink compute pool up and down in real-time to ensure your apps have the processing power they need, while charging only for the minutes needed.

In the following example, five statements are running in a single compute pool. The data streams are oscillating, and you can see spikes of utilization for short periods within the hour. Each statement attracts a minimum price of 1 CFU-minute ($0.0035 in this example) and is automatically scaled up and down as needed on a per-minute basis.

| Statement | CFU-minutes | Statement Type |
|-----------|-------------|----------------|
| Q1 | 5 | Interactive |
| Q2 | 60 | Streaming |
| Q3 | 110 | Streaming |
| Q4 | 10 | Interactive |
| Q5 | 124 | Streaming |
| Total | 309 | |

Example pricing calculation:

- Number of statements executing = 5
- Total CFU-minutes consumed = 309
- Total charge: 309 CFU-minutes x $0.0035/CFU-minute = $1.0815

Note: The charge appears on the invoice as “5.15 CFU-hours x $0.21/CFU-hour”.

#### Related content

- Compute Pools
- Confluent Cloud Pricing
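Both pricing examples on this page can be reproduced with a short Python sketch. The per-statement CFU-minutes are the values from the example table, and the rate is the example rate of $0.0035 per CFU-minute. The helper name `summarize` and the rounding of CFU-hours to three decimal places are assumptions made for this illustration, not a description of how invoices are generated.

```python
from decimal import Decimal

PRICE_PER_CFU_MINUTE = Decimal("0.0035")  # example rate; varies by cloud region

# CFU-minutes per statement, copied from the example table.
statements = {"Q1": 5, "Q2": 60, "Q3": 110, "Q4": 10, "Q5": 124}


def summarize(cfu_minutes):
    """Return (CFU-minutes, CFU-hours rounded to 3 places, total charge)."""
    minutes = Decimal(cfu_minutes)
    hours = (minutes / Decimal(60)).quantize(Decimal("0.001"))
    return minutes, hours, minutes * PRICE_PER_CFU_MINUTE


# Data exploration example: five queries that consume one CFU-minute each.
print(summarize(5))

# Many-statements example: sum of the table column.
print(summarize(sum(statements.values())))
```

The second call sums the table column before pricing it, which matches the way the example states a single total for the compute pool rather than a charge per statement.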

---

### Private Networking with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/flink-private-networking.html

Confluent Cloud for Apache Flink® supports private networking on AWS, Azure, and Google Cloud. This feature enables Flink to securely read and write data stored in Confluent Cloud clusters that are located in private networking, with no data flowing to the public internet. With private networking, you can use Flink and Apache Kafka® together for stream processing in Confluent Cloud, even in the most stringent regulatory environments.

Confluent Cloud for Apache Flink supports private networking for AWS and Azure in all regions where Flink is supported. Google Cloud supports private networking in most regions where Flink is supported. For the regions that support Flink private networking, see Supported Cloud Regions.

#### Connectivity options

There are a number of ways to access Flink with private networking. In all cases, they allow access to all types of private clusters (Enterprise, Dedicated, Freight), with all types of connectivity (VNET/VPC, Peering, Transit Gateway, PNI).

- PrivateLink Attachment: Works with any type of cluster and is available on AWS and Azure.
- Confluent Cloud network (CCN): Available on AWS and Azure for all types of networks. If you already have an existing Confluent Cloud network, this is the easiest way to get started, but it works only on AWS when a Confluent Cloud network is already configured. If you need to create a new Confluent Cloud network, follow the steps in Create Confluent Cloud Network on AWS.

##### PrivateLink Attachment

A PrivateLink Attachment is a resource that enables you to connect to Confluent serverless products, like Enterprise clusters and Flink. For Flink, the new PrivateLink Attachment is used only to establish a connection between your clients (like Cloud Console UI, Confluent CLI, Terraform, apps using the Confluent REST API) and Flink. Flink-to-Kafka is routed internally within Confluent Cloud.
As a result, this PrivateLink Attachment (PLATT) is used only for submitting statements and fetching results from the client.

- For Dedicated clusters, regardless of the Kafka cluster connection type (Private Link, Peering, or Transit Gateway), Flink requires that you define a PLATT in the same region as the cluster, even if a private link exists for the Dedicated cluster.
- For Enterprise clusters, you can reuse the same PLATT used by your Enterprise clusters.

By creating a PrivateLink Attachment to a Confluent Cloud environment in a region, you are enabling Flink statements created in that environment to securely access data in any of the Flink clusters in the same region, regardless of their environment. Access to the Flink clusters is governed by RBAC.

Also, a PrivateLink Attachment enables your data-movement components in Confluent Cloud, including Flink statements and cluster links, to move data between all of the private networks in the organization, including the Confluent Cloud networks associated with any Dedicated Kafka clusters.

For more information, see Enable private networking with PrivateLink Attachment.

##### Confluent Cloud network (CCN)

If you have an existing Confluent Cloud network, this is the easiest way to get set up, but it works only on AWS and Azure when a Confluent Cloud network is configured already and at least one Kafka Dedicated cluster exists in the environment and region where you need to use Flink.

For existing Kafka Dedicated users, this option requires no effort to configure, if everything is already configured for Kafka. If a reverse proxy is not set up, this requires setup for Flink or the use of a VM within the VPC to access Flink.

To create a Confluent Cloud network, follow the steps in Create Confluent Cloud Network on AWS. For more information, see Enable private networking with Confluent Cloud Network.
#### Protect resources with IP Filtering

With IP Filtering, you can enhance security for your Flink resources (statements and workspaces) based on trusted source IP addresses. IP Filtering is an authorization feature that allows you to create IP filters for your Confluent Cloud organization that permit inbound requests only from specified IP groups. All incoming API requests that originate from IP addresses not included in your IP filters are denied.

For Flink resources, you can implement the following access controls:

- No public networks: Select the predefined No Public Networks group (ipg-none) to block all public network access, allowing access only from private network connections. This IP group cannot be combined with other IP groups in the same filter.
- Public: The default option if no IP filters are set. Flink statements and workspaces are accessible from all source networks when connecting over the public internet. While SQL queries are visible, private cluster data remains protected, and you can’t issue statements accessing private clusters.
- Public with restricted IP list: Create custom IP groups containing specific CIDR blocks to allow access only from trusted networks while maintaining the same protection for private cluster data.

IP Filtering applies only to requests made over public networks and doesn’t limit requests made over private network connections. When creating IP filters for Flink resources, select the Flink operation group to control access to all operations related to Apache Flink data. For more information on setting IP filters, see IP Filtering and Manage IP Filters.

The IP Filtering feature replaces the previous distinction between public and private Flink statements and workspaces. Administrators can modify access controls at any time by updating IP filters.

For data protection in Kafka clusters, access is governed by network settings of the cluster:

- You can always read public data regardless of the connectivity, whether public or private.
- To read or write data in a private cluster, the cluster must use private connectivity.
- To prevent data exfiltration, you can’t write to public clusters when using private connectivity.

#### Available endpoints for an environment and region

The following section shows the endpoints that are available for connecting to Flink. While the public endpoint is always present, others may require some effort to create.

- Public endpoint
- PrivateLink Attachment
- Private connectivity through Confluent Cloud network

The following table shows how to get the endpoint value by using different Confluent interfaces.

| Interface | Location | Endpoint |
|-----------|----------|----------|
| Cloud Console | Flink Endpoints page | Full FQDN shown for each network connection |
| Confluent CLI | `confluent flink endpoint list` | Full FQDN shown for each network connection |
| Network UI/API/CLI | Network management details page in Environment overview; `GET /network/`; `confluent network describe` | Read the `endpoint_suffix` attribute, for example, `<service-identifier>-abc1de.us-east-1.aws.glb.confluent.cloud`. Replace `<service-identifier>` with the relevant value, for example, `flink` for Flink or `flinkpls` for Language Service. Assign in interface (UI/CLI/Terraform). |

The following table shows the endpoint patterns for different DNS and cluster type combinations.
Networking DNS Cluster Type Endpoints PrivateLink Private Enterprise (PrivateLink Attachment) flink.$region.$cloud.private.confluent.cloud flinkpls.$region.$cloud.private.confluent.cloud Dedicated flink.dom$id.$region.$cloud.private.confluent.cloud flinkpls.dom$id.$region.$cloud.private.confluent.cloud Public Dedicated flink-$nid.$region.$cloud.glb.confluent.cloud flinkpls-$nid.$region.$cloud.glb.confluent.cloud VPC Peering / Transit Gateway w/ /16 CIDR Public Dedicated flink-$nid.$region.$cloud.confluent.cloud flinkpls-$nid.$region.$cloud.confluent.cloud VPC Peering / Transit Gateway w/ /27 CIDRs Public Dedicated flink-$nid.$region.$cloud.glb.confluent.cloud flinkpls-$nid.$region.$cloud.glb.confluent.cloud Public endpoint¶ Source: Always present. Considerations: Can’t access Kafka private data. Kafka data access and scope: Can access public cluster data (read/write) in cloud region for this organization. Access to Flink statement and workspace: Configurable with IP Filtering. Endpoints: flink.<region>.<cloud>.confluent.cloud, for example: flink.us-east-2.aws.confluent.cloud. PrivateLink Attachment¶ Source: Must create a Private Link Attachment for the environment/region. Considerations: A single VPC can’t have private link connections to multiple Confluent Cloud environments. Available on AWS and Azure. Can access private cluster data (read/write) in Enterprise, Dedicated or Freight clusters for the cloud region for the organization of the endpoint. Can access public cluster data (read only). Access all Flink resources in the same environment and region of the endpoint Endpoints: flink.<region>.<cloud>.private.confluent.cloud, for example: flink.us-east-2.aws.private.confluent.cloud Private connectivity through Confluent Cloud network¶ Source: Created with Kafka Dedicated clusters. Considerations: Easiest way to use Flink when the network is created already for Dedicated clusters. Available on AWS and Azure for all types of Confluent Cloud network. 
Can access private cluster data (read/write) in Enterprise, Dedicated or Freight clusters for the organization of the region. Can access public cluster data (read only). Access all Flink resources in the same environment and region of the endpoint To find the endpoints from the Cloud Console or Confluent CLI, see Available endpoints for an environment and region. Access private networking with the Confluent CLI¶ Run the confluent flink region --cloud <cloud-provider> --region <region> command to select a cloud provider and region. Run the confluent flink endpoint list command to list all endpoints, both public and private. Run the confluent flink endpoint use to select an endpoint. In addition to the main Flink endpoint listed here, you must have access to flinkpls.<network>.<region>.<cloud>.private.confluent.cloud (for private DNS resolution) or flinkpls-<network>.<region>.<cloud>.private.confluent.cloud (for public DNS resolution) to access the language service for autocompletion in the Flink SQL shell. In the case of public DNS resolution, routing is done transparently, but if you use private DNS resolution, you must make sure to route this endpoint from your client. For more information, see private DNS resolution. Access private networking with the Cloud Console¶ By default, public networking is used, which won’t work if IP Filtering is set, and/or the cluster is private. You can set defaults for each cloud region in an environment. For this, use the Flink Endpoints page. The default is per-user. When a default is set, it is used for all pages that access Flink, for example, the statement list, workspace list, and workspaces. If no default is set, the public endpoint is used. Related content¶ Video: Flink Queries on Dedicated PrivateLink Kafka Clusters in Confluent Cloud Use Confluent Cloud with Private Networking Flink Compute Pools Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.
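As a rough illustration of the endpoint patterns listed in this section, the following Python sketch assembles the public and PrivateLink Attachment host names from a region and a cloud. The function names are hypothetical; only the patterns `flink.<region>.<cloud>.confluent.cloud`, `flink.<region>.<cloud>.private.confluent.cloud`, and the `flinkpls` language-service prefix come from this page. In practice, the FQDN reported by `confluent flink endpoint list` should be treated as authoritative.

```python
def public_flink_endpoint(region: str, cloud: str) -> str:
    """Public pattern from this page: flink.<region>.<cloud>.confluent.cloud."""
    return f"flink.{region}.{cloud}.confluent.cloud"


def private_link_attachment_endpoints(region: str, cloud: str) -> dict:
    """PrivateLink Attachment patterns: Flink plus the language service (flinkpls)."""
    suffix = f"{region}.{cloud}.private.confluent.cloud"
    return {
        "flink": f"flink.{suffix}",
        "language_service": f"flinkpls.{suffix}",
    }


print(public_flink_endpoint("us-east-2", "aws"))
print(private_link_attachment_endpoints("us-east-2", "aws"))
```

The Dedicated-cluster patterns (`dom$id`, `$nid`) are left out on purpose, because they depend on identifiers that only the network details expose.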

#### Code Examples

```shell
confluent flink region --cloud <cloud-provider> --region <region>
confluent flink endpoint list
confluent flink endpoint use
confluent network describe
```

```text
<service-identifier>-abc1de.us-east-1.aws.glb.confluent.cloud
flink.$region.$cloud.private.confluent.cloud
flinkpls.$region.$cloud.private.confluent.cloud
flink.dom$id.$region.$cloud.private.confluent.cloud
flinkpls.dom$id.$region.$cloud.private.confluent.cloud
flink-$nid.$region.$cloud.glb.confluent.cloud
flinkpls-$nid.$region.$cloud.glb.confluent.cloud
flink-$nid.$region.$cloud.confluent.cloud
flinkpls-$nid.$region.$cloud.confluent.cloud
flink.<region>.<cloud>.confluent.cloud
flink.us-east-2.aws.confluent.cloud
flink.<region>.<cloud>.private.confluent.cloud
flink.us-east-2.aws.private.confluent.cloud
flinkpls.<network>.<region>.<cloud>.private.confluent.cloud
flinkpls-<network>.<region>.<cloud>.private.confluent.cloud
```

---

### Stream Processing Concepts in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/overview.html

Apache Flink® SQL, a high-level API powered by Confluent Cloud for Apache Flink, offers a simple and easy way to leverage the power of stream processing. With support for a wide variety of built-in functions, queries, and statements, Flink SQL provides real-time insights into streaming data.

Time is a critical element in stream processing, and Flink SQL makes it easy to process data as it arrives, avoiding delays. By using SQL syntax, you can declare expressions that filter, aggregate, route, and mutate streams of data, simplifying your data processing workflows.

#### Stream processing

Streams are the de facto way to create data. Whether the data comprises events from web servers, trades from a stock exchange, or sensor readings from a machine on a factory floor, data is created as part of a stream.

When you analyze data, you can organize your processing around either bounded or unbounded streams, and which of these paradigms you choose has significant consequences.

Batch processing is the paradigm at work when you process a bounded data stream. In this mode of operation, you can choose to ingest the entire dataset before producing any results, which means that it’s possible, for example, to sort the data, compute global statistics, or produce a final report that summarizes all of the input. Snapshot queries are a type of batch processing query that enables you to process a subset of data from a Kafka topic.

Stream processing, on the other hand, involves unbounded data streams. Conceptually, at least, the input may never end, and so you must process the data continuously as it arrives.

#### Bounded and unbounded tables

In the context of a Flink table, bounded mode refers to processing data that is finite, which means that the dataset has a clear beginning and end and does not grow continuously or update over time. This is in contrast to unbounded mode, where data arrives as a continuous stream, potentially with no end.

The `scan.bounded.mode` property controls how Flink consumes data from a Kafka topic. A table can be bounded by committed offsets in Kafka brokers of a specific consumer group, by latest offsets, or by a user-supplied timestamp.

##### Key characteristics of bounded mode

- Finite data: The table represents a static dataset, similar to a traditional table in a relational database or a file in a data lake. Once all records are read, there is no more data to process.
- Batch processing: Operations on bounded tables are executed in batch mode. This means Flink processes all the available data, computes the results, and then the job finishes. This is suitable for use cases like ETL, reporting, and historical analysis.
- Optimized execution: Since the system knows the data is finite, it can apply optimizations that are not possible with unbounded (streaming) data. For example, it can sort by any column, perform global aggregations, and use blocking operators.
- No need for state retention: Unlike streaming mode, where Flink must keep state around to handle late or out-of-order events, batch mode can drop state as soon as it is no longer needed, reducing resource usage.

The following table compares the characteristics of bounded and unbounded tables.

| Aspect | Bounded Mode (Batch) | Unbounded Mode (Streaming) |
|---|---|---|
| Data Size | Finite (static) | Infinite (dynamic, continuous) |
| Processing Style | Batch processing | Real-time/continuous processing |
| Query Semantics | All data available at once | Data arrives over time |
| State Management | Minimal, can drop state when done | Must retain state for late/out-of-order data |
| Use Cases | ETL, reporting, historical analytics | Real-time analytics, monitoring, alerting |

#### Parallel dataflows

Programs in Flink are inherently parallel and distributed. During execution, a stream has one or more stream partitions, and each operator has one or more operator subtasks. The operator subtasks are independent of one another, and execute in different threads and possibly on different machines or containers.

The number of operator subtasks is the parallelism of that particular operator. Different operators of the same program may have different levels of parallelism.

Figure: A parallel dataflow in Flink with condensed view (above) and parallelized view (below).

Streams can transport data between two operators in a one-to-one (or forwarding) pattern, or in a redistributing pattern:

- One-to-one streams (for example between the Source and the `map()` operators in the figure above) preserve the partitioning and ordering of the elements. That means that subtask[1] of the `map()` operator will see the same elements in the same order as they were produced by subtask[1] of the Source operator.
- Redistributing streams (as between `map()` and keyBy/window above, as well as between keyBy/window and Sink) change the partitioning of streams. Each operator subtask sends data to different target subtasks, depending on the selected transformation. Examples are `keyBy()` (which re-partitions by hashing the key), `broadcast()`, or `rebalance()` (which re-partitions randomly). In a redistributing exchange the ordering among the elements is only preserved within each pair of sending and receiving subtasks (for example, subtask[1] of `map()` and subtask[2] of keyBy/window). So, for example, the redistribution between the keyBy/window and the Sink operators shown above introduces non-determinism regarding the order in which the aggregated results for different keys arrive at the Sink.

#### Timely stream processing

For most streaming applications it is very valuable to be able to re-process historic data with the same code that is used to process live data, and to produce deterministic, consistent results regardless. It can also be crucial to pay attention to the order in which events occurred, rather than the order in which they are delivered for processing, and to be able to reason about when a set of events is (or should be) complete. For example, consider the set of events involved in an e-commerce transaction, or financial trade.

These requirements for timely stream processing can be met by using event time timestamps that are recorded in the data stream, rather than using the clocks of the machines processing the data.

#### Stateful stream processing

Flink operations can be stateful. This means that how one event is handled can depend on the accumulated effect of all the events that came before it. State may be used for something simple, such as counting events per minute to display on a dashboard, or for something more complex, such as computing features for a fraud detection model.

A Flink application is run in parallel on a distributed cluster. The various parallel instances of a given operator will execute independently, in separate threads, and in general will be running on different machines.

The set of parallel instances of a stateful operator is effectively a sharded key-value store. Each parallel instance is responsible for handling events for a specific group of keys, and the state for those keys is kept locally.

The following diagram shows a job running with a parallelism of two across the first three operators in the job graph, terminating in a sink that has a parallelism of one. The third operator is stateful, and a fully-connected network shuffle is occurring between the second and third operators. This is being done to partition the stream by some key, so that all of the events that need to be processed together will be.

Figure: A Flink job running with a parallelism of two.

State is always accessed locally, which helps Flink applications achieve high throughput and low latency.
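The idea of a stateful operator as a sharded key-value store can be sketched in plain Python. This is a conceptual toy, not Flink's implementation: the parallelism of two mirrors the diagram described above, while the class name and the CRC32-based routing function are assumptions chosen only to keep the example deterministic.

```python
import zlib
from collections import defaultdict


class KeyedCounter:
    """Toy stateful operator: each parallel instance keeps local state for its keys."""

    def __init__(self, parallelism: int = 2):
        self.parallelism = parallelism
        # One local key-value store per parallel instance (subtask).
        self.state = [defaultdict(int) for _ in range(parallelism)]

    def subtask_for(self, key: str) -> int:
        # Assumption: route by a stable hash so a key always lands on the same subtask.
        return zlib.crc32(key.encode("utf-8")) % self.parallelism

    def process(self, key: str) -> int:
        local = self.state[self.subtask_for(key)]  # state is only touched locally
        local[key] += 1
        return local[key]


op = KeyedCounter(parallelism=2)
for user in ["alice", "bob", "alice", "carol", "alice"]:
    op.process(user)

# Every key lives in exactly one shard, and its count reflects all prior events.
owner = op.subtask_for("alice")
print(owner, op.state[owner]["alice"])
```

Because routing depends only on the key, all events for one key meet the same local state, which is the property the network shuffle in the diagram is there to guarantee.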
#### State management

##### Fault tolerance via state snapshots

Flink is able to provide fault-tolerant, exactly-once semantics through a combination of state snapshots and stream replay. These snapshots capture the entire state of the distributed pipeline, recording offsets into the input queues as well as the state throughout the job graph that has resulted from having ingested the data up to that point. When a failure occurs, the sources are rewound, the state is restored, and processing is resumed. As depicted above, these state snapshots are captured asynchronously, without impeding the ongoing processing.

Table programs that run in streaming mode leverage all capabilities of Flink as a stateful stream processor. In particular, a table program can be configured with a state backend and various checkpointing options for handling different requirements regarding state size and fault tolerance.

##### State usage

Due to the declarative nature of Table API and SQL programs, it’s not always obvious where and how much state is used within a pipeline. The planner decides whether state is necessary to compute a correct result. A pipeline is optimized to claim as little state as possible given the current set of optimizer rules.

Conceptually, source tables are never kept entirely in state. An implementer deals with logical tables, named dynamic tables. Their state requirements depend on the operations that are in use.

Queries such as `SELECT ... FROM ... WHERE` which consist only of field projections or filters are usually stateless pipelines. But operations like joins, aggregations, or deduplications require keeping intermediate results in fault-tolerant storage, for which Flink state abstractions are used.

Refer to the individual operator documentation for more details about how much state is required and how to limit a potentially ever-growing state size.

For example, a regular SQL join of two tables requires the operator to keep both input tables in state entirely. For correct SQL semantics, the runtime needs to assume that a match could occur at any point in time from both sides of the join. Flink provides optimized window and interval joins that aim to keep the state size small by exploiting the concept of watermark strategies.

Another example is the following query that computes the number of clicks per session.

`SELECT sessionId, COUNT(*) FROM clicks GROUP BY sessionId;`

The `sessionId` attribute is used as a grouping key and the continuous query maintains a count for each `sessionId` it observes. The `sessionId` attribute is evolving over time and `sessionId` values are only active until the session ends, that is, for a limited period of time. However, the continuous query cannot know about this property of `sessionId` and expects that every `sessionId` value can occur at any point in time. It maintains a count for each observed `sessionId` value. Consequently, the total state size of the query is continuously growing as more and more `sessionId` values are observed.

#### Dataflow Model

Flink implements many techniques from the Dataflow Model. The following articles provide a good introduction to event time and watermark strategies.

- Blog post: Streaming 101 by Tyler Akidau
- Dataflow Model

#### Related content

- Autopilot
- Comparison with Apache Flink
- Compute Pools
- Dynamic Tables
- Statements
- Time and Watermarks
- Time attributes
- Joins in Continuous Queries
- Determinism in Continuous Queries
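Returning to the `GROUP BY sessionId` query from the state usage discussion, a small Python simulation can make the state-growth argument concrete. It is a sketch only: the click list is invented, and the `Counter` stands in for the state that the continuous query maintains.

```python
from collections import Counter


def run_count_per_session(events):
    """Simulate SELECT sessionId, COUNT(*) ... GROUP BY sessionId as a continuous query."""
    state = Counter()           # one entry per sessionId ever observed
    state_sizes = []
    for session_id in events:
        state[session_id] += 1  # update the running count for this key
        state_sizes.append(len(state))
    return state, state_sizes


# Hypothetical click stream: each session eventually ends, but the query cannot know that.
clicks = ["s1", "s1", "s2", "s1", "s3", "s2", "s4"]
state, sizes = run_count_per_session(clicks)
print(dict(state))
print(sizes)
```

The recorded state size never shrinks, because nothing tells the query that a session is over. A retention limit, such as the `sql.state-ttl` property that appears elsewhere in this documentation, is one way to bound it.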

#### Code Examples

```sql
SELECT ... FROM ... WHERE
```

```sql
SELECT sessionId, COUNT(*) FROM clicks GROUP BY sessionId;
```

---

### Schema and Statement Evolution with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/schema-statement-evolution.html

Confluent Cloud for Apache Flink® enables evolving your statements over time as your schemas change. This topic describes these concepts:

- How you can evolve your statements and the tables they maintain over time.
- How statements behave when the schema of their source tables changes.

#### Example

Throughout this topic, the following statement is used as a running example.

`` SET 'sql.state-ttl' = '1h'; SET 'client.statement-name' = 'orders-with-customers-v1-1'; CREATE FUNCTION to_minor_currency AS 'io.confluent.flink.demo.toMinorCurrency' USING JAR 'confluent-artifact://ccp-lzj320/ver-4y0qw7'; CREATE TABLE v_orders AS SELECT `order`.* FROM sales_lifecycle_events WHERE `order` IS NOT NULL; CREATE TABLE orders_with_customers_v1 PRIMARY KEY (v_orders.order_id) DISTRIBUTED INTO 10 BUCKETS AS SELECT v_orders.order_id, v_orders.product, to_minor_currency(v_orders.price), customers.* FROM v_orders JOIN customers FOR SYSTEM_TIME AS OF v_orders.`$rowtime` ON v_orders.customer_id = customers.id; ``

The `orders_with_customers_v1` table uses a user-defined function named `to_minor_currency` and joins a table named `v_orders` with the up-to-date customer information from the `customers` table.

#### Fundamentals

##### Mutability of statements and tables

A statement has the following components:

- An immutable query, for example: `` SELECT v_orders.product, to_minor_currency(v_orders.price), customers.* FROM v_orders JOIN customers FOR SYSTEM_TIME AS OF v_orders.`$rowtime` ON v_orders.customer_id = customers.id; ``
- Immutable statement properties, for example: `'sql.state-ttl' = '1h'`
- A mutable principal, that is, the user or service account under which this statement runs.

The principal and compute pool are mutable when stopping and resuming the statement. Note that stopping and resuming the statement results in a temporarily higher materialization delay and latency.

The query and options of a statement (`SELECT ...`) are immutable, which means that you can’t change them after the statement has been created.

Note: If your use case requires a lower latency, reach out to Confluent Support or your account manager.

The table that the statement writes to has these components:

- An immutable name, for example: `orders_with_customers_v1`.
- Mutable constraints, for example: `PRIMARY KEY (v_orders.order_id)`.
- A mutable watermark definition.
- A mutable column definition.
- Partially mutable table options.

The name of a table is immutable, because it maps one-to-one to the underlying topic, which you can’t rename.

The watermark strategy is mutable by using the `ALTER TABLE ... MODIFY/DROP WATERMARK ...;` statement. For more information, see ALTER TABLE Statement in Confluent Cloud for Apache Flink.

The table options of the table are mutable by using the `ALTER TABLE SET (...);` statement. For more information, see ALTER TABLE Statement in Confluent Cloud for Apache Flink.

The constraints are partially mutable by using the `ALTER TABLE ADD/DROP PRIMARY KEY` statement.

##### Statements take a snapshot of their dependencies

A statement almost always references other catalog objects such as tables and functions. In the current example, the `orders_with_customers_v1` table references these objects:

- A table named `customers`.
- A table named `v_orders`.
- A user-defined function named `to_minor_currency`.

When a statement is created, it takes a snapshot of the configuration of all the catalog objects that it depends on. Changes to these objects, or their deletion from the catalog, are not propagated to existing statements, which means that:

- A change to the watermark strategy of a source table is not picked up by existing statements that reference the table.
- A change to a table option of a source table is not picked up by existing statements that reference the table.
- A change to the implementation of a user-defined function is not picked up by existing statements that reference the function.
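The snapshot behavior can be pictured with a short Python analogy. Nothing here is a Confluent API: the `catalog` dictionary, the `create_statement` helper, and the watermark delay values are all hypothetical, and they only model the rule that a statement copies the configuration of its dependencies at creation time.

```python
import copy

# Hypothetical catalog with one source table and its configuration.
catalog = {
    "customers": {
        "watermark_delay": "5 seconds",
        "options": {"scan.startup.mode": "earliest-offset"},
    },
}


def create_statement(name, depends_on):
    """Model statement creation: deep-copy the config of every referenced object."""
    snapshot = {obj: copy.deepcopy(catalog[obj]) for obj in depends_on}
    return {"name": name, "snapshot": snapshot}


stmt_v1 = create_statement("orders-with-customers-v1-1", ["customers"])

# A later catalog change is not propagated to the existing statement.
catalog["customers"]["watermark_delay"] = "30 seconds"
print(stmt_v1["snapshot"]["customers"]["watermark_delay"])

# Only a statement created afterwards picks up the new configuration.
stmt_v2 = create_statement("orders-with-customers-v1-2", ["customers"])
print(stmt_v2["snapshot"]["customers"]["watermark_delay"])
```

The deep copy is the point of the analogy: the first statement keeps seeing the configuration it was created with, and only a replacement statement observes the change.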
If an underlying physical resource that statements require at runtime is deleted, like the topic, the statements transition into the FAILED, STOPPED, or RECOVERING state, depending on which resource was deleted.

##### Schema compatibility modes

When a statement is created, it must be bootstrapped from its source tables. For this, Flink must be able to read the source tables from the beginning (or any other specified offsets). As mentioned previously, statements use the latest schema version, at the time of statement creation, for each source table as the read schema.

You have these options for handling changes to base schemas:

- Compatibility Mode FULL or FULL_TRANSITIVE
- BACKWARD_TRANSITIVE compatibility mode and upgrade consumers first
- Compatibility groups and migration rules

To maximize compatibility with Flink, you should use FULL_TRANSITIVE or FULL as the schema compatibility mode, which eases migrations. Note that in Confluent Cloud, the default compatibility mode is BACKWARD.

Sometimes, you may need to make changes beyond what the FULL_TRANSITIVE and FULL modes enable, so Confluent Cloud for Apache Flink gives you the additional options of BACKWARD_TRANSITIVE compatibility mode and compatibility groups and migration rules for handling changes to base schemas.

###### Compatibility Mode FULL or FULL_TRANSITIVE

If you use the FULL or FULL_TRANSITIVE compatibility mode, the order in which you upgrade your statements doesn’t matter. FULL limits the changes that you can make to your tables to adding and removing optional fields. You can make any compatible changes to the source tables, and none of the statements that reference them will break.

###### BACKWARD_TRANSITIVE compatibility mode and upgrade consumers first

BACKWARD_TRANSITIVE mandates that consumers are upgraded prior to producers. This means that if you evolve your schema according to the BACKWARD_TRANSITIVE rules (delete fields, add optional fields), you always need to upgrade all statements that are reading from the corresponding source tables before producing any records to the table that uses the next schema version, as described in Query Evolution.

###### Compatibility groups and migration rules

If you need to make a non-compatible change to a table, whether you use FULL or BACKWARD_TRANSITIVE, Confluent Cloud for Apache Flink also supports compatibility groups and migration rules. For more information, see Data Contracts for Schema Registry on Confluent Cloud.

Note: If you need to make changes to your schemas that aren’t possible under schema compatibility mode FULL, use compatibility mode FULL for all topics and rely on compatibility groups and migration rules.

#### Statements and schema evolution

When you follow the practices in the previous section, statements won’t fail when fields are added to, or optional fields are removed from, their source tables, but new fields aren’t picked up or forwarded to the sink tables. They are ignored by any previously created statements, and the `*` operators are not evaluated dynamically when the schema changes.

Note: If you’re interested in providing feedback about configuring statements to pick up schema changes of source tables dynamically, reach out to Confluent Support or your account manager.

#### Query evolution

As stated previously, the query in a statement is immutable. But you may encounter situations in which you want to change the logic of a long-running statement:

- You may have to fix a bug in your query. For example, you may have to handle an arithmetic error that occurs only when the statement has already existed for a long time by adding another branch in a CASE clause.
- You may want to evolve the logic of your statement.
- You want your statement to pick up configuration updates to any of the catalog objects that it references, like tables or functions.
The general strategy for query evolution is to replace the existing statement and the corresponding tables it maintains with a new statement and new tables, as shown in the following steps:

1. Use `CREATE TABLE ... AS ...` to create a new version of the table, `orders_with_customers_v2`.
2. Wait until the new statement has caught up with the latest messages of its source tables, which means that the “Messages Behind” metric is close to zero. Note that Confluent Cloud Autopilot automatically configures the statement to catch up as quickly as the compute resources provided by the assigned compute pool allow.
3. Migrate all consumers to the new tables. The best way to find all downstream consumers of a table topic in Confluent Cloud is to use Stream Lineage.
4. Stop the `orders-with-customers-v1-1` statement.

This base strategy has these features:

- It works for any type of statement.
- It requires that all relevant input messages are retained in the source tables.
- It requires existing consumers to switch to different topics manually, and to read the …v2 table from the earliest or any manually specified offset.

You can adjust the base strategy in multiple ways, depending on your circumstances.

##### Limit reprocessing to a partial history

Compared to the base strategy, this strategy limits the messages that are reprocessed to a subset of the messages retained in the source tables.

You may not want to reprocess the full history of messages that’s retained in all source tables, but instead specify a different starting offset. For this, you can override the `scan.startup.mode` that is defined for the table, which by default is earliest, using dynamic table option hints.

`SET 'sql.state-ttl' = '1h'; SET 'client.statement-name' = 'orders-with-customers-v2-1'; CREATE TABLE orders_with_customers_v2 PRIMARY KEY (orders.order_id) DISTRIBUTED INTO 10 BUCKETS AS SELECT orders.order_id, orders.product, to_minor_currency(orders.price), customers.* FROM orders /*+ OPTIONS('scan.startup.mode' = 'timestamp', 'scan.startup.timestamp-millis' = '1717763226336') */ JOIN customers /*+ OPTIONS('scan.startup.mode' = 'timestamp', 'scan.startup.timestamp-millis' = '1717763226336') */ ON orders.customer_id = customers.id;`

Alternatively, you can set this by using statement properties, like `sql.tables.scan.startup.mode`, and the SET statement. While dynamic table option hints enable you to configure the starting offset for each table independently, the statement properties affect the starting offset for all tables that this statement reads from.

When reprocessing a partial history of the source tables, and depending on your query, you may want to add an additional filter predicate to your tables, to avoid incorrect results. For example, if your query performs windowed aggregations on ten-minute tumbling windows, you may want to start reading from exactly the beginning of a window to avoid an incomplete window at the start. This could be achieved by adding a `WHERE event_time > '<timestamp>'` clause to the respective source tables, where `event_time` is the name of the column that is used for windowing, and `<timestamp>` lies within the history of messages that are reprocessed and aligns with the start of one of the ten-minute windows, for example, `2024-06-11 15:40:00`.
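As a sketch of how such an aligned `<timestamp>` could be derived, the following Python helper rounds an epoch-millisecond value up to the start of the next ten-minute tumbling window, so that the result lies inside the reprocessed history. The helper names are hypothetical, and the sketch assumes windows aligned to the epoch with no offset and event times close to the record timestamps used by `scan.startup.timestamp-millis`. The input is the value from the hint in the example above.

```python
from datetime import datetime, timezone

WINDOW_MS = 10 * 60 * 1000  # ten-minute tumbling windows


def next_window_start_millis(ts_millis: int, window_ms: int = WINDOW_MS) -> int:
    """Round up to the first window boundary at or after ts_millis."""
    return ((ts_millis + window_ms - 1) // window_ms) * window_ms


def as_sql_timestamp(ts_millis: int) -> str:
    """Render epoch milliseconds as a UTC 'YYYY-MM-DD HH:MM:SS' string."""
    moment = datetime.fromtimestamp(ts_millis / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


startup = 1717763226336  # scan.startup.timestamp-millis from the example
first_full_window = next_window_start_millis(startup)
print(first_full_window, as_sql_timestamp(first_full_window))
```

Rounding up rather than down is deliberate: the window that contains the startup timestamp began before any reprocessed message, so it would be the incomplete one.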
###### Special case: Carrying over offsets of previous statements

When a statement is stopped, `status.latest_offsets` contains the latest offset for each partition of each of the source tables:

- `status`
  - `latestOffsets`
    - `topic1`: `partition:0,offset:23;partition:1,offset:65`
    - `topic2`: `partition:0,offset:53;partition:1,offset:56`
  - `latestOffsetsTimestamp`

You can use these offsets to specify the starting offsets of a new statement by using dynamic table option hints, so the new statement continues exactly where the previous statement left off.

This strategy enables you to evolve statements arbitrarily with exactly-once semantics across the update, if and only if the statement is “stateless”, which means that every output message is affected by a single input message. The following statements are common examples of “stateless” statements:

- Filters: `INSERT INTO shipped_orders SELECT * FROM orders WHERE status = 'shipped';`
- Routers: `EXECUTE STATEMENT SET BEGIN INSERT INTO shipped_orders SELECT * FROM orders WHERE status = 'shipped'; INSERT INTO cancelled_orders SELECT * FROM orders WHERE status = 'cancelled'; INSERT INTO returned_orders SELECT * FROM orders WHERE status = 'returned'; INSERT INTO other_orders SELECT * FROM orders WHERE status NOT IN ('returned', 'shipped', 'cancelled'); END;`
- Per-row transformations, including UDFs and array expansions: `` INSERT INTO ordered_products SELECT o.*, order_products.* FROM orders AS o CROSS JOIN UNNEST(o.products) AS `order_products` (product_id, category, quantity, unit_price, net_price); ``

For more information, see Carry-over Offsets.

##### In-place upgrade

Compared to the base strategy, the in-place upgrade strategy has these features:

- It works only for tables that have a primary key, so that the new statement updates all rows written by the old statement.
- It works only for compatible changes, both semantically and in terms of the schema.
- It doesn’t require consumers to switch manually to new topics, but it does require consumers to be able to handle out-of-order, late, bulk updates to all keys.

Instead of creating a new results table, you can also replace the original `CREATE TABLE ... AS ...` statement with an INSERT INTO statement that produces updates into the same table as before. The upgrade procedure then looks like this:

1. Stop the old `orders-with-customers-v1-1` statement.
2. Once the old statement is stopped, create the new statement, `orders-with-customers-v1-2`.

This strategy can and often will be combined with limited reprocessing to a partial history. Specifically, in the case of an exactly-once upgrade of a stateless statement, it makes sense to continue publishing messages to the same topic, provided this was a compatible change.

#### Related content

- Data Contracts
- Stream Lineage
- HINTS
- CREATE TABLE
- SET
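For the carry-over case described earlier in this topic, the per-topic offset strings reported for a stopped statement have the form `partition:0,offset:23;partition:1,offset:65`. The Python sketch below parses such a string and renders a dynamic table option hint from it. The helper names are hypothetical, and the option names `'scan.startup.mode' = 'specific-offsets'` and `'scan.startup.specific-offsets'` are an assumption based on the naming of the Flink Kafka connector; check them, and whether the reported offset is the last one read or the next one to read, against the Carry-over Offsets documentation before relying on this.

```python
def parse_offsets(spec: str) -> dict:
    """Parse 'partition:0,offset:23;partition:1,offset:65' into {0: 23, 1: 65}."""
    offsets = {}
    for entry in spec.split(";"):
        fields = dict(part.split(":", 1) for part in entry.split(","))
        offsets[int(fields["partition"])] = int(fields["offset"])
    return offsets


def offsets_hint(offsets: dict) -> str:
    """Render a table hint; the option names are assumed, not confirmed by this page."""
    spec = ";".join(f"partition:{p},offset:{o}" for p, o in sorted(offsets.items()))
    return (
        "/*+ OPTIONS('scan.startup.mode' = 'specific-offsets', "
        f"'scan.startup.specific-offsets' = '{spec}') */"
    )


# Values copied from the status example for topic1.
latest = {"topic1": "partition:0,offset:23;partition:1,offset:65"}
print(offsets_hint(parse_offsets(latest["topic1"])))
```

Parsing into a dictionary first, instead of pasting the string through, makes it easy to validate that every partition of the source topic has an offset before the new statement is created.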

#### Code Examples

```sql
SET 'sql.state-ttl' = '1h';
SET 'client.statement-name' = 'orders-with-customers-v1-1';

CREATE FUNCTION to_minor_currency
AS 'io.confluent.flink.demo.toMinorCurrency'
USING JAR 'confluent-artifact://ccp-lzj320/ver-4y0qw7';

-- `order` is a reserved keyword, so it is quoted; NULL checks need IS NOT NULL.
CREATE TABLE v_orders AS SELECT `order`.* FROM sales_lifecycle_events WHERE `order` IS NOT NULL;

CREATE TABLE orders_with_customers_v1
PRIMARY KEY (v_orders.order_id)
DISTRIBUTED INTO 10 BUCKETS
AS
SELECT
  v_orders.order_id,
  v_orders.product,
  to_minor_currency(v_orders.price),
  customers.*
FROM v_orders
JOIN customers FOR SYSTEM_TIME AS OF v_orders.`$rowtime`
ON v_orders.customer_id = customers.id;
```

```sql
orders_with_customers_v1
```

```sql
to_minor_currency
```

```sql
SELECT
  v_orders.product,
  to_minor_currency(v_orders.price),
  customers.*
FROM v_orders
JOIN customers FOR SYSTEM_TIME AS OF v_orders.`$rowtime`
ON v_orders.customer_id = customers.id;
```

```sql
'sql.state-ttl' = '1h'
```

```sql
(SELECT ...)
```

```sql
PRIMARY KEY (v_orders.order_id)
```

```sql
ALTER TABLE ... MODIFY/DROP WATERMARK ...;
```

```sql
ALTER TABLE SET (...);
```

```sql
ALTER TABLE ADD/DROP PRIMARY KEY
```

```sql
FULL_TRANSITIVE
```

```sql
BACKWARD_TRANSITIVE
```

```sql
CREATE TABLE ... AS ...
```

```sql
orders_with_customers_v2
```

```sql
orders-with-customers-v1-1
```

```sql
scan.startup.mode
```

```sql
SET 'sql.state-ttl' = '1h';
SET 'client.statement-name' = 'orders-with-customers-v2-1';

CREATE TABLE orders_with_customers_v2
PRIMARY KEY (orders.order_id)
DISTRIBUTED INTO 10 BUCKETS
AS
SELECT
  orders.order_id,
  orders.product,
  to_minor_currency(orders.price),
  customers.*
-- Both sources start from the same timestamp instead of the earliest offset.
FROM orders /*+ OPTIONS('scan.startup.mode' = 'timestamp', 'scan.startup.timestamp-millis' = '1717763226336') */
JOIN customers /*+ OPTIONS('scan.startup.mode' = 'timestamp', 'scan.startup.timestamp-millis' = '1717763226336') */
ON orders.customer_id = customers.id;
```

```sql
sql.tables.scan.startup.mode
```

```sql
WHERE event_time > '<timestamp>'
```

```sql
<timestamp>
```

```sql
2024-06-11 15:40:00
```

```sql
status.latest_offsets
```

```yaml
status:
    latestOffsets:
        topic1: partition:0,offset:23;partition:1,offset:65
        topic2: partition:0,offset:53;partition:1,offset:56
    latestOffsetsTimestamp:
```

```sql
INSERT INTO shipped_orders
SELECT *
FROM orders
WHERE status = 'shipped';
```

```sql
EXECUTE STATEMENT SET
BEGIN
  INSERT INTO shipped_orders SELECT * FROM orders WHERE status = 'shipped';
  INSERT INTO cancelled_orders SELECT * FROM orders WHERE status = 'cancelled';
  INSERT INTO returned_orders SELECT * FROM orders WHERE status = 'returned';
  INSERT INTO other_orders SELECT * FROM orders WHERE status NOT IN ('returned', 'shipped', 'cancelled');
END;
```

```sql
INSERT INTO ordered_products
SELECT
   o.*,
   order_products.*
FROM orders AS o
CROSS JOIN UNNEST(o.products) AS `order_products` (product_id, category, quantity, unit_price, net_price);
```

```sql
CREATE TABLE ... AS ...
```

```sql
orders-with-customers-v1-1
```

```sql
orders-with-customers-v1-2
```

---

### Snapshot Queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/snapshot-queries.html

In Confluent Cloud for Apache Flink®, a snapshot query is a query that reads data from a table at a specific point in time. In contrast with a streaming query, which runs continuously and returns results incrementally, a snapshot query runs, returns results, and then exits. Snapshot queries are also known as point-in-time or pull queries. You can query Kafka topics as well as Apache Iceberg™ tables by using Confluent Tableflow.

Note: Snapshot query is an Early Access Program feature in Confluent Cloud for Apache Flink. An Early Access feature is a component of Confluent Cloud introduced to gain feedback. This feature should be used only for evaluation and non-production testing purposes or to provide feedback to Confluent, particularly as it becomes more widely available in follow-on preview editions. Early Access Program features are intended for evaluation use in development and testing environments only, and not for production use. Early Access Program features are provided: (a) without support; (b) “AS IS”; and (c) without indemnification, warranty, or condition of any kind. No service level commitment will apply to Early Access Program features. Early Access Program features are considered to be a Proof of Concept as defined in the Confluent Cloud Terms of Service. Confluent may discontinue providing preview releases of the Early Access Program features at any time in Confluent’s sole discretion.

#### Snapshot query uses

A snapshot query returns a consistent view of your data at the current point in time, similar to taking a photograph of your data at that moment.
This is particularly useful when you need to:

- Generate reports that reflect your data’s state at a specific time
- Analyze historical data for auditing or compliance purposes
- Compare data states across different points in time
- Debug or investigate issues by examining past data states

For example, if you want to know the total number of orders in your system at the current time, you can use a snapshot query.

#### Snapshot mode

A snapshot query is an ordinary Flink SQL statement that has one additional property, named `sql.snapshot.mode`. To enable snapshot queries, set the `sql.snapshot.mode` property to `now`. You can set this property in the following ways:

- SQL Workspace: Toggle the Mode dropdown to Snapshot.
- Flink SQL: Prepend your query with `SET 'sql.snapshot.mode' = 'now';`.
- Table API: In the `Cloud.Properties` project file, add `sql.snapshot.mode = now`.
- REST API: In the statement’s `spec.properties` map, add `"sql.snapshot.mode": "now"`.
- Terraform: In the statement properties, add `"sql.snapshot.mode" = "now"`.

Snapshot queries use Flink’s batch execution mode, which enables you to run batch processing jobs beside your existing stream processing workloads, within the same Confluent Cloud environment. Also, Confluent Cloud for Apache Flink bounds all sources, which means that Flink processes only a finite set of records up to a specific point in time, rather than continuously processing an infinite stream of incoming data.

#### How snapshot queries work

When you execute a snapshot query, Flink performs the following steps:

1. Determines the Kafka offsets corresponding to your current timestamp across all partitions
2. Reads data from the source topics up to these offsets
3. Processes the records to build the state of your tables at this point in time
4. Returns the query results based on this state

The query execution is optimized to use Kafka’s time index for efficient offset lookup, to leverage parallel processing across partitions, and to minimize the amount of data that needs to be processed.
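
As a minimal sketch of the snapshot mode described above, the following statements count the rows of a table as of the current point in time. The `orders` table and the `total_orders` alias are hypothetical names used only for illustration; the `sql.snapshot.mode` property and its `now` value come from the text above.

```sql
-- Switch to snapshot mode: sources are bounded at the current point in time,
-- and the query exits after it returns its results.
SET 'sql.snapshot.mode' = 'now';

-- `orders` is a hypothetical table; substitute one of your own tables.
SELECT COUNT(*) AS total_orders
FROM orders;
```

Because the sources are bounded, a query like this returns its result and then exits, instead of emitting updated counts as new records arrive.
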
#### Snapshot queries and Tableflow

If Tableflow is enabled on a topic, snapshot queries on the topic run in a hybrid mode.

- If Tableflow is not enabled on a topic, the query reads from Kafka.
- If Tableflow is enabled on a topic, the query reads from both Kafka and Parquet, for Confluent Managed Storage and custom storage (BYOS).

#### Run a snapshot query

To run a snapshot query, in a Flink workspace or the Flink SQL shell, prepend your query with the following SET statement: `SET 'sql.snapshot.mode' = 'now';` Also, in a Flink workspace, you can change the Mode dropdown setting to Snapshot. For more information, see Run a Snapshot Query.

#### Technical Details

- Timestamp Resolution: Timestamps are processed with millisecond precision
- State Handling: For tables with state (like aggregations), Flink reconstructs the state by processing all relevant records up to the specified timestamp
- Parallelism: Queries are automatically parallelized across available compute resources
- Resource Optimization: Flink uses Kafka’s time index to quickly locate the relevant offsets, minimizing unnecessary data scanning

#### Relationship to Batch Mode

Snapshot queries are closely related to Flink’s batch processing mode. When you execute a snapshot query:

- Flink automatically switches to batch mode processing
- The query processes a finite, bounded dataset up to the current timestamp
- The computation benefits from batch optimizations like sort-merge joins
- Resources are released once the query completes
- Results are deterministic and reproducible

This behavior contrasts with streaming queries which:

- Process continuous, unbounded data streams
- Maintain persistent state and resources
- Produce incremental, real-time results
- May give different results when rerun due to new data

#### Billing

Snapshot queries are billed in CFUs, in the same way that streaming queries are. For more information, see Flink Billing.
#### Related content

- Run a Snapshot Query
- Query Tableflow Tables with Flink Statements

Note: This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
sql.snapshot.mode
```

```sql
SET 'sql.snapshot.mode' = 'now';
```

```sql
Cloud.Properties
```

```sql
sql.snapshot.mode = now
```

```sql
spec.properties
```

```sql
"sql.snapshot.mode": "now"
```

```sql
"sql.snapshot.mode" = "now"
```

---

### Statement CFU Metrics in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/statement-cfu-metrics.html

Confluent Cloud for Apache Flink® provides detailed metrics to help you understand and manage your resource utilization. One critical aspect of this is statement CFU metrics.

#### How to use statement CFU metrics

The statement CFU metrics give you insights into the resources consumed by individual statements running inside your compute pools. Specifically, the statement CFU metrics enable you to:

- Monitor individual statement usage: Accurately measure the number of CFUs each statement consumes over time. This metric is available for all types of statements submitted in Confluent Cloud for Apache Flink.
- Track resource distribution: Understand how the resources in a compute pool are being distributed among the statements running in the compute pool.
- Identify high-consumption statements: Pinpoint which statements are consuming the most CFUs, enabling you to optimize the statement’s Flink SQL code or adjust the resources available to this statement.

By monitoring statement-level CFU consumption, you can make informed decisions about your Flink application’s cost efficiency and resource utilization. You can’t set minimum or maximum CFU limits on individual statements, but maximum CFU limits are configurable at the compute-pool level.

#### Where to view statement CFU metrics?

The statement CFU consumption metrics are available to view in the statements summary table and in the statement side panel.

- Statements summary table: Get an overview of CFU consumption for all your statements directly within the statements summary table. This provides a quick way to identify the most resource intensive statements.
- Statement side panel: For a deeper dive into a statement’s resource usage, open the statement side panel. Here, you’ll find the current CFU consumption and a time-series chart that visualizes how the statement’s CFU consumption has evolved over time.
#### How UDF resource consumption is represented

The statement CFU metric shows the resources consumed by your SQL statements and the resources consumed by any UDF instances the statement might invoke. Resources consumed by individual UDF instances will sometimes appear as fractional CFU values. This is because multiple UDF instances can be consolidated, or “rolled into,” a single CFU of resources. Up to three instances of a UDF can be combined into one CFU.

The distribution of CFUs amongst UDFs in a compute pool is flexible. Three UDF instances across different statements can be rolled into a single CFU, as long as the statements are in the same compute pool. Also, different UDF functions and their instances can be consolidated into a single CFU, as long as they are in the same compute pool.

#### Understanding differences between CFUs for compute pool and statement

When monitoring resource consumption in Confluent Cloud for Apache Flink, you might observe minor differences between your compute pool CFU metrics and the aggregated sum of your statement CFU metrics. These discrepancies are expected and are caused by rounding.

- If your statements use UDFs, you may see a maximum discrepancy of 2 CFUs between the compute pool CFU metrics and the total sum of your statement CFU metrics.
- For statements not utilizing UDFs, the maximum expected discrepancy between the compute pool CFU metrics and the total sum of your statement CFU metrics is 1 CFU.

Note: You are billed based on the compute pool CFU metrics, not on the summed total of individual statement CFU metrics.

#### Related content

- Billing
- Compute Pools
- Flink SQL Statements
- Confluent Cloud Pricing

---

### Flink SQL Statements in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/statements.html

In Confluent Cloud for Apache Flink®, a statement represents a high-level resource that’s created when you enter a SQL query. Each statement has a property that holds the SQL query that you entered. Based on the SQL query, the statement may be one of these kinds:

- A metadata operation, or DDL statement
- A background statement, which writes data back to a table/topic while running in the background
- A foreground statement, which writes data back to the UI or a client.

In all of these cases, the statement represents any SQL statement for Data Definition Language (DDL), Data Manipulation Language (DML), and Data Query Language (DQL).

When you submit a SQL query, Confluent Cloud creates a statement resource. You can create a statement resource from any Confluent-supported interface, including the SQL shell, Confluent CLI, Cloud Console, the REST API, and Terraform.

The SQL query within a statement is immutable, which means that you can’t make changes to the SQL query once it’s been submitted. If you need to edit a statement, stop the running statement and create a new statement.

You can change the security principal for the statement. If a statement is running under a user account, you can change it to run under a service account by using the Confluent Cloud Console, Confluent CLI, the REST API, or the Terraform provider. Running a statement under a service account provides better security and stability, ensuring that your statements aren’t affected by changes in user status or authorization.

Also, you can change the compute pool that runs a statement. This can be useful if you’re close to maxing out the resources in one pool. You must stop the statement before changing the principal or compute pool, then restart the statement after the change.

Confluent Cloud for Apache Flink enforces a 30-day retention for statements in terminal states.
For example, once a statement transitions to the STOPPED state, it no longer consumes compute and is deleted after 30 days. If there is no consumer for the results of a foreground statement for five minutes or longer, Confluent Cloud moves the statement to the STOPPED state.

#### Limit on query text size

Confluent Cloud for Apache Flink has a limit of 4 MB on the size of query text. This limit includes string and binary literals that are part of the query. The maximum length of a statement name is 72 characters. If you combine multiple SQL statements into a single semicolon-separated string, the length limit applies to the entire string.

If the query size is greater than the 4 MB limit, you receive the following error: This query is too large to process (exceeds 4194304 bytes). This can happen due to: \* Complex query structure. \* Too many columns selected or expanded due to \* usage. \* Multiple table joins. \* Large number of conditions. Try simplifying your query or breaking it into smaller parts.

#### Lifecycle operations for statements

These are the supported lifecycle operations for a statement. Statements have a lifecycle that includes the following states:

- Pending: The statement has been submitted and Flink is preparing to start running the statement.
- Running: Flink is actively running the statement.
- Completed: The statement has completed all of its work.
- Deleting: The statement is being deleted.
- Failed: The statement has encountered an error and is no longer running.
- Degraded: The statement appears unhealthy, for example, no transactions have been committed for a long time, or the statement has frequently restarted recently.
- Stopping: The statement is about to be stopped.
- Stopped: The statement has been stopped and is no longer running.
##### Submit a statement

- SQL shell
- Cloud Console
- REST API statements endpoint

##### List running statements

- SQL shell SHOW JOBS statement
- Confluent CLI
- Cloud Console
- REST API statements endpoint

##### Describe a statement

- Confluent CLI
- Cloud Console
- REST API statement endpoint

##### Delete a statement

- Confluent CLI
- Cloud Console
- REST API DELETE request

##### List statement exceptions

- Confluent CLI
- Cloud Console

##### Stop and resume a statement

- Confluent CLI
- REST API UPDATE request
- Cloud Console

#### Queries in Flink

Flink enables issuing queries with ANSI-standard SQL on data at rest (batch) and data in motion (streams). These are the queries that are possible with Flink SQL.

- Metadata queries: CRUD on catalogs, databases, tables, etc. Because Flink implements ANSI-Standard SQL, Flink uses a database analogy, and similar to a database, it uses the concepts of catalogs, databases and tables. In Apache Kafka®, these concepts map to environments, Kafka clusters, and topics, respectively.
- Ad-hoc / exploratory queries: You can issue queries on a topic and see the results immediately. A query can be a batch query (“show me what happened up to now”), or a transient streaming query (“show me what happened up to now and give me updates for the near future”). In this case, when the query or the session is ended, no more compute is needed.
- Streaming queries: These queries run continuously and read data from one or more tables/topics and write results of the queries to one table/topic.

In general, Flink supports both batch and stream processing, but the exact subset of allowed operations differs slightly depending on the type of query. For more information, see Flink SQL Queries. All queries are executed in streaming execution mode, whether the sources are bounded or unbounded.

#### Data lifecycle

Broadly speaking, the Flink SQL lifecycle is:

- Data is read into a Flink table from Kafka via the Flink connector for Kafka.
- Data is processed using SQL statements.
- Data is processed using Flink task managers (managed by Confluent and not exposed to users), which are part of the Flink runtime. Some data may be stored temporarily as state in Flink while it’s being processed.
- Data is returned to the user as a result-set.
  - The result-set may be bounded, in which case the query terminates.
  - The result-set may be unbounded, in which case the query runs until canceled manually.
- OR Data is written back out to one or more tables.
  - Data is stored in Kafka topics.
  - Schema for the table is stored in Flink Metastore and synchronized out to Schema Registry.

#### Flink SQL Data Definition Language (DDL) statements

Data Definition Language (DDL) statements are imperative verbs that define metadata in Flink SQL by adding, changing, or deleting tables. Data Definition Language statements modify metadata only and don’t operate on data. Use these statements with declarative Flink SQL Queries to create your Flink SQL applications.

Flink SQL makes it simple to develop streaming applications using standard SQL. It’s easy to learn Flink SQL if you’ve ever worked with a database or SQL-like system that’s ANSI-SQL 2011 compliant.

##### Available DDL statements

These are the available DDL statements in Confluent Cloud for Flink SQL.
- ALTER
  - ALTER MODEL Statement in Confluent Cloud for Apache Flink
  - ALTER TABLE Statement in Confluent Cloud for Apache Flink
  - ALTER VIEW Statement in Confluent Cloud for Apache Flink
- CREATE
  - CREATE FUNCTION Statement
  - CREATE MODEL Statement in Confluent Cloud for Apache Flink
  - CREATE TABLE Statement in Confluent Cloud for Apache Flink
  - CREATE VIEW Statement in Confluent Cloud for Apache Flink
- DESCRIBE
  - DESCRIBE Statement in Confluent Cloud for Apache Flink
- DROP
  - DROP MODEL Statement in Confluent Cloud for Apache Flink
  - DROP TABLE Statement in Confluent Cloud for Apache Flink
  - DROP VIEW Statement in Confluent Cloud for Apache Flink
- EXPLAIN
  - EXPLAIN Statement in Confluent Cloud for Apache Flink
- RESET
  - RESET Statement in Confluent Cloud for Apache Flink
- SET
  - SET Statement in Confluent Cloud for Apache Flink
- SHOW
  - SHOW Statements in Confluent Cloud for Apache Flink
- USE
  - USE CATALOG Statement in Confluent Cloud for Apache Flink
  - USE `<database_name>` Statement in Confluent Cloud for Apache Flink

#### Related content

- Flink SQL Queries
- Stream Processing Concepts
- Built-in Functions

#### Code Examples

```text
This query is too large to process (exceeds 4194304 bytes).

This can happen due to:

* Complex query structure.
* Too many columns selected or expanded due to * usage.
* Multiple table joins.
* Large number of conditions.

Try simplifying your query or breaking it into smaller parts.
```
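
To make the three statement kinds described above more concrete, here is a hedged sketch with one statement of each kind. The table and column names (`orders`, `shipped_orders`, `order_id`, `status`) are hypothetical and used only for illustration; the sketch assumes an `orders` table already exists with matching columns.

```sql
-- Metadata operation (DDL): defines a table and changes metadata only.
CREATE TABLE shipped_orders (
  order_id STRING,
  status STRING
);

-- Background statement: keeps running and writes its results to a table/topic.
INSERT INTO shipped_orders
SELECT order_id, status
FROM orders
WHERE status = 'shipped';

-- Foreground statement: returns its results to the UI or client that submitted it.
SELECT order_id, status
FROM shipped_orders;
```

Each statement above would typically be submitted on its own, so that each one becomes a separate statement resource with its own lifecycle.
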

---

### Time and Watermarks in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/timely-stream-processing.html

Time and Watermarks in Confluent Cloud for Apache Flink¶ Timely stream processing is an extension of stateful stream processing that incorporates time into the computation. It’s commonly used for time series analysis, aggregations based on windows, and event processing where the time of occurrence is important. If you’re working with timely Apache Flink® applications on Confluent Cloud, it’s important to consider certain factors to ensure optimal performance. Learn more about these considerations in the following sections. Notions of time: Event Time and Processing Time¶ When referring to time in a streaming program, like when you define windows, different notions of time may apply. Processing time¶ Processing time refers to the system time of the machine that’s executing the operation. When a streaming program runs on processing time, all time-based operations, like time windows, use the system clock of the machines that run the operator. An hourly processing time window includes all records that arrived at a specific operator between the times when the system clock indicated the full hour. For example, if an application begins running at 9:15 AM, the first hourly processing time window includes events processed between 9:15 AM and 10:00 AM, the next window includes events processed between 10:00 AM and 11:00 AM, and so on. Processing time is the simplest notion of time and requires no coordination between streams and machines. It provides the best performance and the lowest latency. But in distributed and asynchronous environments, processing time doesn’t provide determinism, because it’s susceptible to the speed at which records arrive in the system, like from a message queue, to the speed at which records flow between operators inside the system, and to outages (scheduled, or otherwise). Event time¶ Event time is the time that each individual event occurred on its producing device. 
This time is typically embedded within the records before they enter Flink, and this event timestamp can be extracted from each record. In event time, the progress of time depends on the data, not on any wall clocks. Event-time programs must specify how to generate event-time watermarks, which is the mechanism that signals progress in event time. This watermarking mechanism is described in the Event Time and Watermarks section. In a perfect world, event-time processing would yield completely consistent and deterministic results, regardless of when events arrive, or their ordering. But unless the events are known to arrive in-order (by timestamp), event-time processing incurs some latency while waiting for out-of-order events. Because it’s only possible to wait for a finite period of time, this places a limit on how deterministic event-time applications can be. Assuming all of the data has arrived, event-time operations behave as expected, and produce correct and consistent results even when working with out-of-order or late events, or when reprocessing historic data. For example, an hourly event-time window contains all records that carry an event timestamp that falls into that hour, regardless of the order in which they arrive, or when they’re processed. For more information, see Lateness. Sometimes when an event-time program is processing live data in real-time, it uses some processing time operations in order to guarantee that they are progressing in a timely fashion. Event Time and Processing Time¶ Event Time and Watermarks¶ Event time¶ A stream processor that supports event time needs a way to measure the progress of event time. For example, a window operator that builds hourly windows needs to be notified when event time has passed beyond the end of an hour, so that the operator can close the window in progress. Event time can progress independently of processing time, as measured by wall clocks. 
For example, in one program, the current event time of an operator may trail slightly behind the processing time, accounting for a delay in receiving the events, while both proceed at the same speed. But another streaming program might progress through weeks of event time with only a few seconds of processing, by fast-forwarding through some historic data already buffered in an Apache Kafka® topic. Watermarks¶ The mechanism in Flink to measure progress in event time is watermarks. Watermarks determine when to make progress during processing or wait for more records. Certain SQL operations, like windows, interval joins, time-versioned joins, and MATCH_RECOGNIZE require watermarks. Without watermarks, they don’t produce output. By default, every table has a watermark strategy applied. A watermark means, “I have seen all records until this point in time”. It’s a long value that usually represents epoch milliseconds. The watermark of an operator is the minimum of received watermarks over all partitions of all inputs. It triggers the execution of time-based operations within this operator before sending the watermark downstream. Watermarks can be emitted for every record, or they can be computed and emitted on a wall-clock interval. By default, Flink emits them every 200 ms. The built-in function, CURRENT_WATERMARK, enables printing the current watermark for the executing operator. Providing a timestamp is a prerequisite for providing a default watermark. Without providing some timestamp, neither a watermark nor a time attribute is possible. In Flink SQL, only time attributes can be used for time-based operations. A time attribute must be of type TIMESTAMP(p) or TIMESTAMP_LTZ(p), with 0 <= p <= 3. Defining a watermark over a timestamp makes it a time attribute. This is shown as a ROWTIME in a DESCRIBE statement. Watermarks and timestamps¶ Every Kafka record has a message timestamp which is part of the message format, and not in the payload or headers. 
Timestamp semantics can be CreateTime (default) or LogAppendTime. The timestamp is overwritten by the broker only if LogAppendTime is configured. Otherwise, it depends on the producer, which means that the timestamp can be user-defined, or it is set using the client’s clock if not defined by the user. In most cases, a Kafka record’s timestamp is expressed in epoch milliseconds in UTC. Watermarks flow as part of the data stream and carry a timestamp t. A Watermark(t) declares that event time has reached time t in that stream, meaning that there should be no more elements from the stream with a timestamp t’ <= t, that is, events with timestamps older or equal to the watermark. The following diagram shows a stream of events with logical timestamps and watermarks flowing inline. In this example, the events are in order with respect to their timestamps, meaning that the watermarks are simply periodic markers in the stream. A data stream with in-order events and watermarks¶ Watermarks are crucial for out-of-order streams, as shown in the following diagram, where the events are not ordered by their timestamps. In general, a watermark declares that by this point in the stream, all events up to a certain timestamp should have arrived. Once a watermark reaches an operator, the operator can advance its internal event time clock to the value of the watermark. A data stream with out-of-order events and watermarks¶ Event time is inherited by a freshly created stream element (or elements) from either the event that produced them or from the watermark that triggered creation of these elements. Watermarks in parallel streams¶ Watermarks are generated at, or directly after, source functions. Each parallel subtask of a source function usually generates its watermarks independently. These watermarks define the event time at that particular parallel source. As the watermarks flow through the streaming program, they advance the event time at the operators where they arrive. 
Whenever an operator advances its event time, it generates a new watermark downstream for its successor operators. Some operators consume multiple input streams. For example, a union, or operators following a keyBy(…) or partition(…) function consume multiple input streams. Such an operator’s current event time is the minimum of its input streams’ event times. As its input streams update their event times, so does the operator. The following diagram shows an example of events and watermarks flowing through parallel streams, and operators tracking event time. Parallel data streams and operators with events and watermarks¶ Lateness¶ It’s possible that certain elements violate the watermark condition, meaning that even after the Watermark(t) has occurred, more elements with timestamp t’ <= t occur. In many real-world systems, certain elements can be delayed for arbitrary lengths of time, making it impossible to specify a time by which all elements of a certain event timestamp will have occurred. Furthermore, even if the lateness can be bounded, delaying the watermarks by too much is often not desirable, because it causes too much delay in the evaluation of event-time windows. For this reason, streaming programs may explicitly expect some late elements. Late elements are elements that arrive after the system’s event time clock, as signaled by the watermarks, has already passed the time of the late element’s timestamp. Currently, Flink does not support late events or allowed lateness. Windowing¶ Aggregating events, for example in counts and sums, works differently with streams than in batch processing. For example, it’s impossible to count all elements in a stream, because streams are, in general, infinite (unbounded). Instead, aggregates on streams, like counts and sums, are scoped by windows, such as “count over the last 5 minutes”, or “sum of the last 100 elements”.
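
As an illustration of a time-driven window, the following sketch counts rows per 5-minute tumbling window. It assumes a hypothetical `orders` table whose event-time attribute is the `$rowtime` system column; the `order_count` alias is also made up for this example.

```sql
-- Count the rows that fall into each non-overlapping 5-minute window.
-- `orders` is a hypothetical table; `$rowtime` is assumed to be its time attribute.
SELECT
  window_start,
  window_end,
  COUNT(*) AS order_count
FROM TABLE(
  TUMBLE(TABLE orders, DESCRIPTOR(`$rowtime`), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;
```

A window's result is produced only once the watermark passes the end of that window, which is why a windowed query like this depends on watermarks making progress.
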
Time windows and count windows on a data stream¶ Windows can be time driven, for example, “every 30 seconds”, or data driven, for example, “every 100 elements”. There are different types of windows, for example: Tumbling windows: no overlap Sliding windows: with overlap Session windows: punctuated by a gap of inactivity For more information, see: Window Aggregation Queries in Confluent Cloud for Apache Flink Window Deduplication Queries in Confluent Cloud for Apache Flink Window Join Queries in Confluent Cloud for Apache Flink Window Top-N Queries in Confluent Cloud for Apache Flink Windowing Table-Valued Functions (Windowing TVFs) in Confluent Cloud for Apache Flink Watermarks and windows¶ In the following example, the source is a Kafka topic with 4 partitions. The Flink job is running with a parallelism of 2, and each instance of the Kafka source reads from 2 partitions. Each event has a key, shown as a letter from A to D, and a timestamp. The events shown in bold text have already been read. The events in gray, to the left of the read position, will be read next. The events that have already been read are shuffled by key into the window operators, where the events are counted by key for each hour. Example Flink job graph with windows and watermarks.¶ Because the hour from 1 to 2 o’clock hasn’t been finalized yet, the windows keep track of the counters for that hour. There have been two events for key A for that hour, one event for key B, and so on. Because events for the following hour have already begun to appear, these windows also maintain counters for the hour from 2 o’clock to 3 o’clock. These windows wait for watermarks to trigger them to produce their results. The watermarks come from the watermark generators in the Kafka source operators. For each Kafka partition, the watermark generator keeps track of the largest timestamp seen so far, and subtracts from that an estimate of the expected out-of-orderness. 
For example, for Partition 1, the largest timestamp is 1:30. Assuming that the events are at most 1 minute out of order, then the watermark for Partition 1 is 1:29. A similar computation for Partition 3 yields a watermark of 1:30, and so on for the remaining partitions. Each of the two Kafka source instances takes as its watermark the minimum of these per-partition watermarks. From the point of view of the uppermost Kafka source operator, the watermark it produces should include a timestamp that reflects how complete the stream is that it is producing. This stream from Kafka Source 1 includes events from both Partition 1 and Partition 3, so it can be no more complete than the furthest behind of these two partitions, which is Partition 1. Although Partition 1 has seen an event with a timestamp as late as 1:30, it reports its watermark as 1:29, because it allows for its events to be up to one minute out-of-order. This same reasoning is applied as the watermarks flow downstream through the job graph. Each instance of the window operator has received watermarks from the two Kafka source instances. The current watermark at both of the window operators is 1:17, because this is the furthest behind of the watermarks coming into the windows from the Kafka sources. The furthest behind of all four Kafka partitions determines the overall progress of the windows. Watermark alignment¶ Watermark alignment enables you to specify how tightly synchronized your streams should be, preventing any of the sources from getting too far ahead of the others. It addresses the problem of temporal joins between streams with progressively diverging timestamps. When performing temporal joins between two streams, if one stream is significantly ahead of the other, data from the leading stream must be buffered while waiting for the watermark of the lagging stream to advance.
As timestamps diverge further, the buffering requirements grow, potentially causing performance degradation and operational issues, like checkpointing failures. Watermark alignment enables you to pause reading from streams that are too far ahead, enabling lagging streams to catch up and preventing the situation from worsening. This feature is particularly valuable when joining streams that have naturally diverging timestamps, such as when one data source produces events more frequently or with different timing characteristics than another. Watermark alignment provides these benefits: it reduces memory buffering requirements; it improves performance by preventing excessive data buffering; it prevents operational problems like checkpointing failures; and it provides control over stream synchronization. In Confluent Cloud for Apache Flink, watermark alignment is enabled by default. Set the sql.tables.scan.watermark-alignment.max-allowed-drift session option to change the maximum allowed deviation, or watermark drift. The default maximum watermark drift is 5 minutes. This value is chosen to match the default maximum idleness detection timeout, which is also 5 minutes. Otherwise, watermark alignment would occur while Flink waits for a partition to switch to idle, potentially wasting CPU resources. Only increase the watermark alignment’s maximum allowed drift to match the idleness timeout when you increase the idleness timeout. Decreasing the watermark alignment’s maximum allowed drift may be justified if record throughput, expressed as records per minute of event time, is too large for windowed/temporal operators to buffer the default 5 minutes of data and the window’s length is shorter than 5 minutes. Time attributes¶ Confluent Cloud for Apache Flink can process data based on different notions of time. Event time refers to stream processing based on timestamps that are attached to each row. The timestamps can encode when an event happened. 
Processing time refers to the machine’s system time that’s executing the operation. Processing time is also known as “epoch time”, for example, Java’s System.currentTimeMillis(). Processing time is not supported in Confluent Cloud for Apache Flink. Time attributes can be part of every table schema. They are defined when creating a table from a CREATE TABLE DDL statement. Once a time attribute is defined, it can be referenced as a field and used in time-based operations. As long as a time attribute is not modified and is simply forwarded from one part of a query to another, it remains a valid time attribute. Time attributes behave like regular timestamps, and are accessible for calculations. When used in calculations, time attributes are materialized and act as standard timestamps, but ordinary timestamps can’t be used in place of, or converted to, time attributes. Event time¶ Event time enables a table program to produce results based on timestamps in every record, which allows for consistent results despite out-of-order or late events. Event time also ensures the replayability of the results of the table program when reading records from persistent storage. Also, event time enables unified syntax for table programs in both batch and streaming environments. A time attribute in a streaming environment can be a regular column of a row in a batch environment. To handle out-of-order events and to distinguish between on-time and late events in streaming, Flink must know the timestamp for each row, and it also needs regular indications of how far along in event time the processing has progressed so far, by using watermarks. You can define event-time attributes in CREATE TABLE statements. Defining in DDL¶ The event-time attribute is defined by using a WATERMARK clause in a CREATE TABLE DDL statement. A watermark statement defines a watermark generation expression on an existing event-time field, which marks the event-time field as the event-time attribute. 
For more information about watermark strategies, see Watermark clause. Flink SQL supports defining an event-time attribute on TIMESTAMP and TIMESTAMP_LTZ columns. If the timestamp data in the source is represented as year-month-day-hour-minute-second, usually a string value without time-zone information, for example, 2020-04-15 20:13:40.564, it’s recommended to define the event-time attribute as a TIMESTAMP column. CREATE TABLE user_actions ( user_name STRING, data STRING, user_action_time TIMESTAMP(3), -- Declare the user_action_time column as an event-time attribute -- and use a 5-seconds-delayed watermark strategy. WATERMARK FOR user_action_time AS user_action_time - INTERVAL '5' SECOND ) WITH ( ... ); SELECT TUMBLE_START(user_action_time, INTERVAL '10' MINUTE), COUNT(DISTINCT user_name) FROM user_actions GROUP BY TUMBLE(user_action_time, INTERVAL '10' MINUTE); If the timestamp data in the source is represented as epoch time, which is usually a LONG value like 1618989564564, consider defining the event-time attribute as a TIMESTAMP_LTZ column. CREATE TABLE user_actions ( user_name STRING, data STRING, ts BIGINT, time_ltz AS TO_TIMESTAMP_LTZ(ts, 3), -- Declare the time_ltz column as an event-time attribute -- and use a 5-seconds-delayed watermark strategy. WATERMARK FOR time_ltz AS time_ltz - INTERVAL '5' SECOND ) WITH ( ... ); SELECT TUMBLE_START(time_ltz, INTERVAL '10' MINUTE), COUNT(DISTINCT user_name) FROM user_actions GROUP BY TUMBLE(time_ltz, INTERVAL '10' MINUTE); Processing time¶ Processing time enables a table program to produce results based on the time of the local machine. It’s the simplest notion of time, but it generates non-deterministic results. Processing time doesn’t require timestamp extraction or watermark generation. Processing time is not supported in Confluent Cloud for Apache Flink. Related content¶ Flink implements many techniques from the Dataflow Model. For a good introduction to event time and watermarks, have a look at these articles. 
Course: Watermarks Demystified Video: Watermark Alignment Explained in 2 Minutes Blog post: Introducing Stream Windows in Apache Flink Streaming 101 (O’Reilly online learning) by Tyler Akidau Dataflow Model CREATE TABLE Statement in Confluent Cloud for Apache Flink Flink SQL Queries Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.
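
The watermark arithmetic in the "Watermarks and windows" walkthrough can be traced with a few lines of plain Java. This is only a sketch of the reasoning, not Flink's implementation: the class and method names are hypothetical, times are expressed as minutes since midnight, and the 1:17 value for the second source is taken from the operator watermark stated in the text.

```java
import java.util.Arrays;

// Plain-Java illustration only; none of these names come from Flink.
class WatermarkSketch {

  // Watermark for one partition: the largest timestamp seen so far minus
  // the assumed bound on out-of-orderness (both in minutes since midnight).
  static long partitionWatermark(long maxTimestampSeen, long maxOutOfOrderness) {
    return maxTimestampSeen - maxOutOfOrderness;
  }

  // An operator with several inputs is only as complete as its slowest
  // input, so it takes the minimum of the incoming watermarks.
  static long combine(long... inputWatermarks) {
    return Arrays.stream(inputWatermarks).min().orElse(Long.MIN_VALUE);
  }

  public static void main(String[] args) {
    long partition1 = partitionWatermark(90, 1); // largest timestamp 1:30, watermark 1:29
    long partition3 = 90;                        // watermark 1:30, as stated in the text
    long source1 = combine(partition1, partition3);
    long source2 = 77;                           // 1:17, the watermark that holds the windows back
    System.out.println(source1);                   // 89, that is, 1:29
    System.out.println(combine(source1, source2)); // 77, that is, 1:17
  }
}
```

The same `combine` step applies at every level of the job graph: partitions into a source, and sources into a window operator.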

#### Code Examples
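
The idea behind watermark alignment, pausing sources that run too far ahead of the slowest one, can be sketched as a simple drift check. This is an illustration under stated assumptions, not Confluent's or Flink's code: the class name, method name, and source names are hypothetical, and the rule "pause when the watermark exceeds the slowest watermark by more than the allowed drift" is a simplified reading of the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; it models the drift rule only, not Flink's runtime.
class AlignmentSketch {

  // Returns the names of sources whose watermark is more than maxDriftMillis
  // ahead of the slowest source, in alphabetical order.
  static List<String> sourcesToPause(Map<String, Long> watermarks, long maxDriftMillis) {
    long slowest = watermarks.values().stream()
        .mapToLong(Long::longValue).min().orElse(Long.MIN_VALUE);
    List<String> paused = new ArrayList<>();
    for (Map.Entry<String, Long> entry : new TreeMap<>(watermarks).entrySet()) {
      if (entry.getValue() - slowest > maxDriftMillis) {
        paused.add(entry.getKey());
      }
    }
    return paused;
  }

  public static void main(String[] args) {
    Map<String, Long> watermarks = new HashMap<>();
    watermarks.put("orders", 10 * 60_000L);  // 10 minutes of event time
    watermarks.put("payments", 2 * 60_000L); // 2 minutes of event time
    long fiveMinutes = 5 * 60_000L;          // the default drift named in the text
    System.out.println(sourcesToPause(watermarks, fiveMinutes)); // [orders]
  }
}
```

With a larger allowed drift, the leading source keeps reading and downstream operators must buffer more of its data.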

```sql
CREATE TABLE user_actions (
  user_name STRING,
  data STRING,
  user_action_time TIMESTAMP(3),
  -- Declare the user_action_time column as an event-time attribute
  -- and use a 5-seconds-delayed watermark strategy.
  WATERMARK FOR user_action_time AS user_action_time - INTERVAL '5' SECOND
) WITH (
  ...
);

SELECT TUMBLE_START(user_action_time, INTERVAL '10' MINUTE), COUNT(DISTINCT user_name)
FROM user_actions
GROUP BY TUMBLE(user_action_time, INTERVAL '10' MINUTE);
```
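
To make the grouping in the query above concrete, the following plain-Java sketch assigns timestamps to 10-minute tumbling windows and counts distinct users per window. It is a batch-style approximation, assuming windows are aligned to the epoch with no offset. It doesn't model watermarks or late data, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustration of tumbling-window assignment; not Flink code.
class TumblingWindowSketch {

  // Start of the fixed-size window that contains the timestamp.
  static long windowStart(long timestampMillis, long sizeMillis) {
    return timestampMillis - Math.floorMod(timestampMillis, sizeMillis);
  }

  // Mirrors COUNT(DISTINCT user_name) ... GROUP BY TUMBLE(...):
  // maps each window start to the number of distinct users seen in it.
  static Map<Long, Integer> distinctUsersPerWindow(
      String[] users, long[] timestamps, long sizeMillis) {
    Map<Long, Set<String>> usersByWindow = new TreeMap<>();
    for (int i = 0; i < users.length; i++) {
      usersByWindow
          .computeIfAbsent(windowStart(timestamps[i], sizeMillis), k -> new TreeSet<>())
          .add(users[i]);
    }
    Map<Long, Integer> counts = new TreeMap<>();
    usersByWindow.forEach((start, names) -> counts.put(start, names.size()));
    return counts;
  }

  public static void main(String[] args) {
    long tenMinutes = 10 * 60_000L;
    String[] users = {"alice", "alice", "bob", "carol"};
    long[] timestamps = {0L, 1_000L, 599_999L, 600_000L};
    System.out.println(distinctUsersPerWindow(users, timestamps, tenMinutes));
    // {0=2, 600000=1}
  }
}
```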

```sql
CREATE TABLE user_actions (
  user_name STRING,
  data STRING,
  ts BIGINT,
  time_ltz AS TO_TIMESTAMP_LTZ(ts, 3),
  -- Declare the time_ltz column as an event-time attribute
  -- and use a 5-seconds-delayed watermark strategy.
  WATERMARK FOR time_ltz AS time_ltz - INTERVAL '5' SECOND
) WITH (
  ...
);

SELECT TUMBLE_START(time_ltz, INTERVAL '10' MINUTE), COUNT(DISTINCT user_name)
FROM user_actions
GROUP BY TUMBLE(time_ltz, INTERVAL '10' MINUTE);
```
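
The computed column and watermark expression in the table definition above can be mirrored in plain Java to see which values they produce for a single row. This is a sketch with hypothetical names: it uses `java.time.Instant` as a stand-in for `TIMESTAMP_LTZ(3)` and treats the `BIGINT` column as epoch milliseconds, which is what a precision of 3 in `TO_TIMESTAMP_LTZ(ts, 3)` denotes.

```java
import java.time.Duration;
import java.time.Instant;

// Illustration only; Instant stands in for a TIMESTAMP_LTZ(3) value.
class EpochWatermarkSketch {

  // Mirrors TO_TIMESTAMP_LTZ(ts, 3): interpret ts as epoch milliseconds.
  static Instant toTimestampLtz(long epochMillis) {
    return Instant.ofEpochMilli(epochMillis);
  }

  // Mirrors `time_ltz - INTERVAL '5' SECOND` for one row.
  static Instant watermarkFor(Instant rowTime, Duration delay) {
    return rowTime.minus(delay);
  }

  public static void main(String[] args) {
    Instant rowTime = toTimestampLtz(1618989564564L);
    System.out.println(rowTime);                                        // 2021-04-21T07:19:24.564Z
    System.out.println(watermarkFor(rowTime, Duration.ofSeconds(5)));   // 2021-04-21T07:19:19.564Z
  }
}
```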

---

### User-defined Functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/concepts/user-defined-functions.html

User-defined Functions in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® supports user-defined functions (UDFs), which are extension points for running custom logic that you can’t express in the system-provided Flink SQL queries or with the Table API. You can implement user-defined functions in Java, and you can use third-party libraries within a UDF. Confluent Cloud for Apache Flink supports scalar functions (UDFs), which map scalar values to a new scalar value, and table functions (UDTFs), which map multiple scalar values to multiple output rows. Create an example UDF: Create a User Defined Function Add logging to your UDFs: Enable Logging in a User Defined Function Availability: UDF regional availability Limitations: UDF limitations Example code: Flink UDF Java Examples Artifacts¶ Artifacts are Java packages, or JAR files, that contain user-defined functions and all of the required dependencies. Artifacts are uploaded to Confluent Cloud and scoped to a specific region in a Confluent Cloud environment. To be used for UDFs, artifacts must follow a few common implementation principles, which are described in the following sections. To use a UDF, you must register one or more functions that reference the artifact. Functions¶ Functions are SQL objects that reference a class in an artifact and can be used in any SQL statement or Table API program. Once an artifact is uploaded, you register a function by using the CREATE FUNCTION statement. Once a function is registered, you can invoke it from any SQL statement or Table API program. The following example shows how to register a TShirtSizingIsSmaller function and invoke it in a SQL statement. -- Register the function. CREATE FUNCTION is_smaller AS 'com.example.my.TShirtSizingIsSmaller' USING JAR 'confluent-artifact://<artifact-id>/<version-id>'; -- Invoke the function. 
SELECT IS_SMALLER ('L', 'M'); To build and upload a UDF to Confluent Cloud for Apache Flink for use in Flink SQL or the Table API, see Create a UDF. RBAC¶ To upload artifacts, register functions, and invoke functions, you must have the FlinkDeveloper role or higher. For more information, see Grant Role-Based Access. Shared responsibility¶ Confluent supports the UDF infrastructure in Confluent Cloud only. It is your responsibility to troubleshoot custom UDF issues for functions you build or that are provided to you by others. The following provides additional details about shared support responsibilities. Customer Managed: You are responsible for function logic. Confluent does not provide any support for debugging services and features within UDFs. Confluent Managed: Confluent is responsible for managing the Flink services and custom compute platform, and provides support for these. Scalar functions¶ A user-defined scalar function maps zero, one, or multiple scalar values to a new scalar value. You can use any data type listed in Data Types as a parameter or return type of an evaluation method. To define a scalar function, extend the ScalarFunction base class in org.apache.flink.table.functions and implement one or more evaluation methods named eval(...). The following code example shows how to define your own hash code function. import org.apache.flink.table.annotation.InputGroup; import org.apache.flink.table.api.*; import org.apache.flink.table.functions.ScalarFunction; import static org.apache.flink.table.api.Expressions.*; public static class HashFunction extends ScalarFunction { // take any data type and return INT public int eval(@DataTypeHint(inputGroup = InputGroup.ANY) Object o) { return o.hashCode(); } } The following example shows how to call the HashFunction UDF in a Flink SQL statement. SELECT HashFunction(myField) FROM MyTable; To build and upload a UDF to Confluent Cloud for Apache Flink for use in Flink SQL, see Create a User Defined Function. 
Table functions¶ Confluent Cloud for Apache Flink also supports user-defined table functions (UDTFs), which take multiple scalar values as input arguments and return multiple rows as output, instead of a single value. To create a user-defined table function, extend the TableFunction base class in org.apache.flink.table.functions and implement one or more of the evaluation methods, which are named eval(...). Input and output data types are inferred automatically by using reflection, including the generic argument T of the class, for determining the output data type. Unlike scalar functions, the evaluation method itself doesn’t have a return type. Instead, a table function provides a collect(T) method that’s called within every evaluation method to emit zero, one, or more records. In the Table API, a table function is used with the .joinLateral(...) or .leftOuterJoinLateral(...) operators. The joinLateral operator cross-joins each row from the outer table (the table on the left of the operator) with all rows produced by the table-valued function (on the right side of the operator). The leftOuterJoinLateral operator joins each row from the outer table with all rows produced by the table-valued function and preserves outer rows, for which the table function returns an empty table. Note User-defined table functions are distinct from the Table API but can be used in Table API code. In SQL, use LATERAL TABLE(<TableFunction>) with JOIN or LEFT JOIN with an ON TRUE join condition. The following code example shows how to implement a simple string splitting function. 
import org.apache.flink.table.annotation.DataTypeHint; import org.apache.flink.table.annotation.FunctionHint; import org.apache.flink.table.api.*; import org.apache.flink.table.functions.TableFunction; import org.apache.flink.types.Row; import static org.apache.flink.table.api.Expressions.*; @FunctionHint(output = @DataTypeHint("ROW<word STRING, length INT>")) public static class SplitFunction extends TableFunction<Row> { public void eval(String str) { for (String s : str.split(" ")) { // use collect(...) to emit a row collect(Row.of(s, s.length())); } } } The following example shows how to call the SplitFunction UDTF in a Flink SQL statement. SELECT myField, word, length FROM MyTable LEFT JOIN LATERAL TABLE(SplitFunction(myField)) ON TRUE; To build and upload a user-defined table function to Confluent Cloud for Apache Flink for use in Flink SQL, see Create a User Defined Table Function. Implementation considerations¶ All UDFs adhere to a few common implementation principles, which are described in the following sections. Function class Evaluation methods Type inference Named parameters Scalar functions Table functions The following code example shows how to implement a simple scalar function and how to call it in Flink SQL. For the Table API, you can register the function in code and invoke it. For SQL queries, your UDF must be registered by using the CREATE FUNCTION statement. For more information, see Create a User-defined Function. import org.apache.flink.table.api.*; import org.apache.flink.table.functions.ScalarFunction; import static org.apache.flink.table.api.Expressions.*; // define function logic public static class SubstringFunction extends ScalarFunction { public String eval(String s, Integer begin, Integer end) { return s.substring(begin, end); } } The following example shows how to call the SubstringFunction UDF in a Flink SQL statement. 
SELECT SubstringFunction('test string', 2, 5); Function class¶ Your implementation class must extend one of the system-provided base classes. Scalar functions extend the org.apache.flink.table.functions.ScalarFunction class. Table functions extend the org.apache.flink.table.functions.TableFunction class. The class must be declared public, not abstract, and must be accessible globally. Non-static inner or anonymous classes are not supported. Evaluation methods¶ You define the behavior of a scalar function by implementing a custom evaluation method, named eval, which must be declared public. You can overload evaluation methods by implementing multiple methods named eval. The evaluation method is called by code-generated operators during runtime. Regular JVM method-calling semantics apply, so these implementation options are available: You can implement overloaded methods, like eval(Integer) and eval(LocalDateTime). You can use var-args, like eval(Integer...). You can use object inheritance, like eval(Object) that takes both LocalDateTime and Integer. You can use combinations of these, like eval(Object...) that takes all kinds of arguments. The ScalarFunction base class provides a set of optional methods that you can override, open(), close(), isDeterministic(), and supportsConstantFolding(). You can use the open() method for initialization work and the close() method for cleanup work. Internally, Table API and SQL code generation works with primitive values where possible. To reduce overhead during runtime, a user-defined scalar function should declare parameters and result types as primitive types instead of their boxed classes. For example, DATE/TIME is equal to int, and TIMESTAMP is equal to long. The following code example shows a user-defined function that has overloaded eval methods. 
import org.apache.flink.table.functions.ScalarFunction; // function with overloaded evaluation methods public static class SumFunction extends ScalarFunction { public Integer eval(Integer a, Integer b) { return a + b; } public Integer eval(String a, String b) { return Integer.valueOf(a) + Integer.valueOf(b); } public Integer eval(Double... d) { double result = 0; for (double value : d) result += value; return (int) result; } } Type inference¶ The Table API is strongly typed, so both function parameters and return types must be mapped to a data type. The Flink planner needs information about expected types, precision, and scale. Also it needs information about how internal data structures are represented as JVM objects when calling a user-defined function. Type inference is the process of validating input arguments and deriving data types for both the parameters and the result of a function. User-defined functions in Flink implement automatic type-inference extraction that derives data types from the function’s class and its evaluation methods by using reflection. If this implicit extraction approach with reflection fails, you can help the extraction process by annotating affected parameters, classes, or methods with @DataTypeHint and @FunctionHint. Automatic type inference¶ Automatic type inference inspects the function’s class and evaluation methods to derive data types for the arguments and return value of a function. The @DataTypeHint and @FunctionHint annotations support automatic extraction. For a list of classes that implicitly map to a data type, see Data type extraction. Data type hints¶ In some situations, you may need to support automatic extraction inline for parameters and return types of a function. In these cases you can use data type hints and the @DataTypeHint annotation to define data types. The following code example shows how to use data type hints. 
import org.apache.flink.table.annotation.DataTypeHint; import org.apache.flink.table.annotation.InputGroup; import org.apache.flink.table.functions.ScalarFunction; import org.apache.flink.types.Row; // user-defined function that has overloaded evaluation methods. public static class OverloadedFunction extends ScalarFunction { // No hint required for type inference. public Long eval(long a, long b) { return a + b; } // Define the precision and scale of a decimal. public @DataTypeHint("DECIMAL(12, 3)") BigDecimal eval(double a, double b) { return BigDecimal.valueOf(a + b); } // Define a nested data type. @DataTypeHint("ROW<s STRING, t TIMESTAMP_LTZ(3)>") public Row eval(int i) { return Row.of(String.valueOf(i), Instant.ofEpochSecond(i)); } // Enable wildcard input and custom serialized output. @DataTypeHint(value = "RAW", bridgedTo = ByteBuffer.class) public ByteBuffer eval(@DataTypeHint(inputGroup = InputGroup.ANY) Object o) { return MyUtils.serializeToByteBuffer(o); } } Function hints¶ In some situations, you may want one evaluation method to handle multiple different data types, or you may have overloaded evaluation methods with a common result type that should be declared only once. The @FunctionHint annotation provides a mapping from argument data types to a result data type. It enables annotating entire function classes or evaluation methods for input, accumulator, and result data types. You can declare one or more annotations on a class or individually for each evaluation method for overloading function signatures. All hint parameters are optional. If a parameter is not defined, the default reflection-based extraction is used. Hint parameters defined on a function class are inherited by all evaluation methods. The following code example shows how to use function hints. 
import org.apache.flink.table.annotation.DataTypeHint; import org.apache.flink.table.annotation.FunctionHint; import org.apache.flink.table.functions.TableFunction; import org.apache.flink.types.Row; // User-defined function with overloaded evaluation methods // but globally defined output type. @FunctionHint(output = @DataTypeHint("ROW<s STRING, i INT>")) public static class OverloadedFunction extends TableFunction<Row> { public void eval(int a, int b) { collect(Row.of("Sum", a + b)); } // Overloading arguments is still possible. public void eval() { collect(Row.of("Empty args", -1)); } } // Decouples the type inference from evaluation methods. // The type inference is entirely determined by the function hints. @FunctionHint( input = {@DataTypeHint("INT"), @DataTypeHint("INT")}, output = @DataTypeHint("INT") ) @FunctionHint( input = {@DataTypeHint("BIGINT"), @DataTypeHint("BIGINT")}, output = @DataTypeHint("BIGINT") ) @FunctionHint( input = {}, output = @DataTypeHint("BOOLEAN") ) public static class OverloadedFunction extends TableFunction<Object> { // Ensure a method exists that the JVM can call. public void eval(Object... o) { if (o.length == 0) { collect(false); } else { collect(o[0]); } } } Named parameters¶ When you call a user-defined function, you can use parameter names to specify the values of the parameters. Named parameters enable passing both the parameter name and value to a function. This approach avoids confusion caused by incorrect parameter order, and it improves code readability and maintainability. Also, named parameters can omit optional parameters, which are filled with null by default. Use the @ArgumentHint annotation to specify the name, type, and whether a parameter is required or not. The following code examples demonstrate how to use @ArgumentHint in different scopes. 
Use the @ArgumentHint annotation on the parameters of the eval method of the function: import org.apache.flink.table.annotation.ArgumentHint; import org.apache.flink.table.annotation.DataTypeHint; import org.apache.flink.table.functions.ScalarFunction; public static class NamedParameterClass extends ScalarFunction { // Use the @ArgumentHint annotation to specify the name, type, and whether a parameter is required. public String eval(@ArgumentHint(name = "param1", isOptional = false, type = @DataTypeHint("STRING")) String s1, @ArgumentHint(name = "param2", isOptional = true, type = @DataTypeHint("INT")) Integer s2) { return s1 + ", " + s2; } } Use the @ArgumentHint annotation on the eval method of the function. import org.apache.flink.table.annotation.ArgumentHint; import org.apache.flink.table.annotation.DataTypeHint; import org.apache.flink.table.annotation.FunctionHint; import org.apache.flink.table.functions.ScalarFunction; public static class NamedParameterClass extends ScalarFunction { // Use the @ArgumentHint annotation to specify the name, type, and whether a parameter is required. @FunctionHint( argument = {@ArgumentHint(name = "param1", isOptional = false, type = @DataTypeHint("STRING")), @ArgumentHint(name = "param2", isOptional = true, type = @DataTypeHint("INTEGER"))} ) public String eval(String s1, Integer s2) { return s1 + ", " + s2; } } Use the @ArgumentHint annotation on the class of the function. import org.apache.flink.table.annotation.ArgumentHint; import org.apache.flink.table.annotation.DataTypeHint; import org.apache.flink.table.annotation.FunctionHint; import org.apache.flink.table.functions.ScalarFunction; // Use the @ArgumentHint annotation to specify the name, type, and whether a parameter is required. @FunctionHint( argument = {@ArgumentHint(name = "param1", isOptional = false, type = @DataTypeHint("STRING")), @ArgumentHint(name = "param2", isOptional = true, type = @DataTypeHint("INTEGER"))} ) public static class NamedParameterClass extends ScalarFunction { public String eval(String s1, Integer s2) { return s1 + ", " + s2; } } The @ArgumentHint annotation already contains the @DataTypeHint annotation, so you can’t use it with @DataTypeHint in @FunctionHint. 
When applied to function parameters, @ArgumentHint can’t be used with @DataTypeHint at the same time, so you should use @ArgumentHint instead. Named parameters take effect only when the corresponding class doesn’t contain overloaded functions and variable parameter functions, otherwise using named parameters causes an error. Determinism¶ Every user-defined function class can declare whether it produces deterministic results or not by overriding the isDeterministic() method. If the function is not purely functional, like random(), date(), or now(), the method must return false. By default, isDeterministic() returns true. Also, the isDeterministic() method may influence the runtime behavior. A runtime implementation might be called at two different stages. During planning¶ During planning, in the so-called pre-flight phase, if a function is called with constant expressions, or if constant expressions can be derived from the given statement, a function is pre-evaluated for constant expression reduction and might not be executed on the cluster. In these cases, you can use the isDeterministic() method to disable constant expression reduction. For example, the following calls to ABS are executed during planning: SELECT ABS(-1) FROM t; SELECT ABS(field) FROM t WHERE field = -1; But the following call to ABS is not executed during planning: SELECT ABS(field) FROM t; During runtime¶ If a function is called with non-constant expressions or isDeterministic() returns false, the function is executed on the cluster. System function determinism¶ The determinism of system (built-in) functions is immutable. According to Apache Calcite’s SqlOperator definition, there are two kinds of functions which are not deterministic: dynamic functions and non-deterministic functions. /** * Returns whether a call to this operator is guaranteed to always return * the same result given the same operands; true is assumed by default. 
*/ public boolean isDeterministic() { return true; } /** * Returns whether it is unsafe to cache query plans referencing this * operator; false is assumed by default. */ public boolean isDynamicFunction() { return false; } If isDeterministic() returns false, the function is non-deterministic and is evaluated per-record during runtime. If isDynamicFunction() returns true, the function can be evaluated only at query start. A dynamic function is pre-evaluated during planning only in batch mode. In streaming mode, it is equivalent to a non-deterministic function, because the query is executed continuously under the abstraction of a continuous query over dynamic tables, so dynamic functions are also re-evaluated for each query execution, which is equivalent to per-record in the current implementation. The isDynamicFunction() method applies only to system functions. The following system functions are always non-deterministic, which means they are evaluated per-record during runtime, both in batch and streaming mode. CURRENT_ROW_TIMESTAMP RAND RAND_INTEGER UNIX_TIMESTAMP UUID The following system temporal functions are dynamic and are pre-evaluated during planning (query-start) for batch mode and evaluated per-record for streaming mode. CURRENT_DATE CURRENT_TIME CURRENT_TIMESTAMP LOCALTIME LOCALTIMESTAMP NOW UDF regional availability¶ Flink UDFs are available in the following AWS regions. ap-east-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-west-1 eu-west-2 me-south-1 sa-east-1 us-east-1 us-east-2 us-west-2 Flink UDFs are available in the following Azure regions. australiaeast brazilsouth centralindia centralus eastus eastus2 francecentral northeurope southcentralus southeastasia spaincentral uaenorth uksouth westeurope westus2 westus3 UDF limitations¶ User-defined functions have the following limitations. Confluent CLI version 4.13.0 or later is required. 
External network calls from UDFs are not supported. JDK 17 is the latest supported Java version for uploaded JAR files. Each Flink statement can have no more than 10 UDFs. Each organization/cloud/region/environment can have no more than 100 Flink artifacts. The size limit of each artifact is 100 MB. Aggregates are not supported. Table aggregates are not supported. Temporary functions are not supported. The ALTER FUNCTION statement is not supported. UDFs can’t be used in combination with MATCH_RECOGNIZE. Vararg functions are not supported. User-defined structured types are not supported. Python is not supported. Both inputs and outputs of the UDF have a row-size limit of 4MB. Custom type inference is not supported. Constant expression reduction is not supported. The UDF feature is optimized for streaming processing, so the initial query may be slow, but after the initial query, a UDF runs with low latency. File system access limitations¶ The file system is read-only in the runtime environment. UDFs can’t create, write, or modify files on the file system. This includes temporary files, model files, or any other file operations. Libraries that require file system write access, like those using JNI/native binaries that extract files from JARs, are not supported. JNI and native binary limitations¶ Libraries that use Java Native Interface (JNI) or require native binaries are not supported due to filesystem restrictions and potential architecture compatibility issues. UDF logging limitations¶ Log4j logging only: External UDF loggers can be composed only with the Apache Log4j logging framework. Burst rate to 1000/s: UDF logging supports up to 1000 log events per second for each UDF during a short burst of high activity. This helps to optimize performance and to reduce noise in logs. Events that exceed the maximum rate are dropped. Related content¶ CREATE FUNCTION Create a User-defined Function. 
Flink SQL Queries Flink UDF Java Examples Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
-- Register the function.
CREATE FUNCTION is_smaller
  AS 'com.example.my.TShirtSizingIsSmaller'
  USING JAR 'confluent-artifact://<artifact-id>/<version-id>';

-- Invoke the function.
SELECT IS_SMALLER ('L', 'M');
```

```java
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.annotation.InputGroup;
import org.apache.flink.table.api.*;
import org.apache.flink.table.functions.ScalarFunction;
import static org.apache.flink.table.api.Expressions.*;

public static class HashFunction extends ScalarFunction {

  // take any data type and return INT
  public int eval(@DataTypeHint(inputGroup = InputGroup.ANY) Object o) {
    return o.hashCode();
  }
}
```

```sql
SELECT HashFunction(myField) FROM MyTable;
```

```java
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.annotation.FunctionHint;
import org.apache.flink.table.api.*;
import org.apache.flink.table.functions.TableFunction;
import org.apache.flink.types.Row;
import static org.apache.flink.table.api.Expressions.*;

@FunctionHint(output = @DataTypeHint("ROW<word STRING, length INT>"))
public static class SplitFunction extends TableFunction<Row> {

  public void eval(String str) {
    for (String s : str.split(" ")) {
      // use collect(...) to emit a row
      collect(Row.of(s, s.length()));
    }
  }
}
```

```sql
SplitFunction
```

```sql
SELECT myField, word, length
FROM MyTable
LEFT JOIN LATERAL TABLE(SplitFunction(myField)) ON TRUE;
```

```java
import org.apache.flink.table.api.*;
import org.apache.flink.table.functions.ScalarFunction;
import static org.apache.flink.table.api.Expressions.*;

// define function logic
public static class SubstringFunction extends ScalarFunction {
  public String eval(String s, Integer begin, Integer end) {
    return s.substring(begin, end);
  }
}
```

```sql
SubstringFunction
```

```sql
SELECT SubstringFunction('test string', 2, 5);
```
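Because `SubstringFunction` delegates to `String.substring`, its indices follow Java conventions: the begin index is zero-based and the end index is exclusive. The following plain-Java sketch reproduces only the `eval` logic, without the Flink base class, to show what the query above would be expected to return. The class name `SubstringDemo` is an illustration, not part of the original example.

```java
// Stand-alone copy of the eval logic, without the Flink ScalarFunction base class.
final class SubstringDemo {

    // Same body as SubstringFunction.eval: zero-based begin, exclusive end.
    static String eval(String s, Integer begin, Integer end) {
        return s.substring(begin, end);
    }

    public static void main(String[] args) {
        // Brackets make the trailing space in the result visible.
        System.out.println("[" + eval("test string", 2, 5) + "]"); // prints [st ]
    }
}
```

Note that a null argument or an out-of-range index makes `substring` throw, so a production UDF would likely need to validate its inputs first.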

```sql
org.apache.flink.table.functions.ScalarFunction
```

```sql
org.apache.flink.table.functions.TableFunction
```

```sql
eval(Integer)
```

```sql
eval(LocalDateTime)
```

```sql
eval(Integer...)
```

```sql
eval(Object)
```

```sql
LocalDateTime
```

```sql
eval(Object...)
```

```sql
ScalarFunction
```

```sql
isDeterministic()
```

```sql
supportsConstantFolding()
```

```java
import org.apache.flink.table.functions.ScalarFunction;

// function with overloaded evaluation methods
public static class SumFunction extends ScalarFunction {

  public Integer eval(Integer a, Integer b) {
    return a + b;
  }

  public Integer eval(String a, String b) {
    return Integer.valueOf(a) + Integer.valueOf(b);
  }

  public Integer eval(Double... d) {
    double result = 0;
    for (double value : d)
      result += value;
    return (int) result;
  }
}
```

```sql
@DataTypeHint
```

```sql
@FunctionHint
```

```java
import java.math.BigDecimal;
import java.nio.ByteBuffer;
import java.time.Instant;

import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.annotation.InputGroup;
import org.apache.flink.table.functions.ScalarFunction;
import org.apache.flink.types.Row;

// user-defined function that has overloaded evaluation methods.
public static class OverloadedFunction extends ScalarFunction {

  // No hint required for type inference.
  public Long eval(long a, long b) {
    return a + b;
  }

  // Define the precision and scale of a decimal.
  public @DataTypeHint("DECIMAL(12, 3)") BigDecimal eval(double a, double b) {
    return BigDecimal.valueOf(a + b);
  }

  // Define a nested data type.
  @DataTypeHint("ROW<s STRING, t TIMESTAMP_LTZ(3)>")
  public Row eval(int i) {
    return Row.of(String.valueOf(i), Instant.ofEpochSecond(i));
  }

  // Enable wildcard input and custom serialized output.
  // MyUtils is a placeholder for your own serialization helper; it is not shown here.
  @DataTypeHint(value = "RAW", bridgedTo = ByteBuffer.class)
  public ByteBuffer eval(@DataTypeHint(inputGroup = InputGroup.ANY) Object o) {
    return MyUtils.serializeToByteBuffer(o);
  }
}
```

```sql
@FunctionHint
```

```java
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.annotation.FunctionHint;
import org.apache.flink.table.functions.TableFunction;
import org.apache.flink.types.Row;

// User-defined table function with overloaded evaluation methods
// but globally defined output type.
@FunctionHint(output = @DataTypeHint("ROW<s STRING, i INT>"))
public static class OverloadedFunction extends TableFunction<Row> {

  public void eval(int a, int b) {
    collect(Row.of("Sum", a + b));
  }

  // Overloading arguments is still possible.
  public void eval() {
    collect(Row.of("Empty args", -1));
  }
}

// Decouples the type inference from evaluation methods.
// The type inference is entirely determined by the function hints.
@FunctionHint(
  input = {@DataTypeHint("INT"), @DataTypeHint("INT")},
  output = @DataTypeHint("INT")
)
@FunctionHint(
  input = {@DataTypeHint("BIGINT"), @DataTypeHint("BIGINT")},
  output = @DataTypeHint("BIGINT")
)
@FunctionHint(
  input = {},
  output = @DataTypeHint("BOOLEAN")
)
public static class OverloadedFunction extends TableFunction<Object> {

  // Ensure a method exists that the JVM can call.
  public void eval(Object... o) {
    if (o.length == 0) {
      collect(false);
      // Return early so that o[0] is not read from an empty array.
      return;
    }
    collect(o[0]);
  }
}
```

```sql
@ArgumentHint
```

```java
import org.apache.flink.table.annotation.ArgumentHint;
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.functions.ScalarFunction;

public static class NamedParameterClass extends ScalarFunction {

    // Use the @ArgumentHint annotation to specify the name, type, and whether a parameter is required.
    public String eval(@ArgumentHint(name = "param1", isOptional = false, type = @DataTypeHint("STRING")) String s1,
                      @ArgumentHint(name = "param2", isOptional = true, type = @DataTypeHint("INT")) Integer s2) {
        return s1 + ", " + s2;
    }
}
```

```sql
@ArgumentHint
```

```java
import org.apache.flink.table.annotation.ArgumentHint;
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.annotation.FunctionHint;
import org.apache.flink.table.functions.ScalarFunction;

public static class NamedParameterClass extends ScalarFunction {

  // Use the @ArgumentHint annotation to specify the name, type, and whether a parameter is required.
  @FunctionHint(
          argument = {@ArgumentHint(name = "param1", isOptional = false, type = @DataTypeHint("STRING")),
                  @ArgumentHint(name = "param2", isOptional = true, type = @DataTypeHint("INTEGER"))}
  )
  public String eval(String s1, Integer s2) {
    return s1 + ", " + s2;
  }
}
```

```sql
@ArgumentHint
```

```java
import org.apache.flink.table.annotation.ArgumentHint;
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.annotation.FunctionHint;
import org.apache.flink.table.functions.ScalarFunction;

// Use the @ArgumentHint annotation to specify the name, type, and whether a parameter is required.
@FunctionHint(
        argument = {@ArgumentHint(name = "param1", isOptional = false, type = @DataTypeHint("STRING")),
                @ArgumentHint(name = "param2", isOptional = true, type = @DataTypeHint("INTEGER"))}
)
public static class NamedParameterClass extends ScalarFunction {

  public String eval(String s1, Integer s2) {
    return s1 + ", " + s2;
  }
}
```

```sql
@ArgumentHint
```

```sql
@DataTypeHint
```

```sql
@DataTypeHint
```

```sql
@FunctionHint
```

```sql
@ArgumentHint
```

```sql
@DataTypeHint
```

```sql
@ArgumentHint
```

```sql
isDeterministic()
```

```sql
SELECT ABS(-1) FROM t;
SELECT ABS(field) FROM t WHERE field = -1;
```

```sql
SELECT ABS(field) FROM t;
```

```sql
isDeterministic()
```

```sql
SqlOperator
```

```java
/**
 * Returns whether a call to this operator is guaranteed to always return
 * the same result given the same operands; true is assumed by default.
 */
public boolean isDeterministic() {
  return true;
}

/**
 * Returns whether it is unsafe to cache query plans referencing this
 * operator; false is assumed by default.
 */
public boolean isDynamicFunction() {
  return false;
}
```

```sql
isDeterministic()
```

```sql
isDynamicFunction()
```

```sql
isDynamicFunction
```

---

### FAQ for Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/flink-faq.html

This topic provides answers to frequently asked questions about Confluent Cloud for Apache Flink®.

What is Confluent Cloud for Apache Flink? Confluent Cloud for Apache Flink is a fully managed, cloud-native service for stream processing using Flink SQL. It enables you to process, analyze, and transform data in real time directly on your Confluent Cloud-managed Kafka clusters.

How do I get started with Confluent Cloud for Apache Flink? Get started by clicking SQL Workspaces in the Confluent Cloud Console. For more information, see Flink SQL Quick Start with Confluent Cloud Console. Also, you can run the confluent flink shell command to start the Flink SQL shell. For more information, see Flink SQL Shell Quick Start.

What is a compute pool? A compute pool is a dedicated set of resources, measured in CFUs, that runs your Flink SQL statements. You must create a compute pool before running statements. Multiple statements can share a compute pool, and you can scale pools up or down as needed. For more information, see Compute Pools.

How is Confluent Cloud for Apache Flink billed? Billing is based on the number of CFUs provisioned in your compute pools and the duration for which they are running. You are charged for the resources allocated, not per statement. For more information, see Billing.

What are the prerequisites for using Confluent Cloud for Apache Flink? You need a Confluent Cloud account and an environment with Stream Governance enabled. You must have the appropriate roles and permissions, for example, the FlinkDeveloper role to run statements. You need access to at least one compute pool.

What sources and sinks are supported? Confluent Cloud for Apache Flink supports reading from and writing to Kafka topics in your Confluent Cloud environment. In addition, you can use Confluent's AI/ML features to perform searches on external tables, and you can use Confluent Tableflow to materialize streams to external tables.

How do I monitor my Flink SQL statements? You can monitor statements using the Cloud Console, which provides status, metrics, and logs. For advanced monitoring, use the Metrics API and Notifications for Confluent Cloud to set up alerts for failures, lag, and resource utilization. For more information, see Best practices for alerting.

What happens if my statement fails? If a statement fails, you see an error message in the Cloud Console. You can view logs and metrics to diagnose the issue. Statements can be restarted after you resolve the underlying problem.

Can I use Flink SQL to join multiple topics? Yes, you can use Flink SQL to join multiple Kafka topics and to perform aggregations, windowing, filtering, and more. For more information, see the Flink SQL statements.

How do I manage schema evolution? Flink SQL integrates with Confluent's Schema Registry. When reading from or writing to topics with Avro, Protobuf, or JSON Schema, Flink SQL uses the registered schemas and handles compatible schema evolution.

How do I control access to Flink resources? Access to Flink resources is managed using Role-Based Access Control (RBAC) in Confluent Cloud. Assign users and service accounts the appropriate roles, such as FlinkAdmin or FlinkDeveloper, to control what actions they can perform. For more information, see Grant Role-Based Access.

How do I secure my Flink SQL jobs and data? Confluent Cloud for Apache Flink uses the same security model as the rest of Confluent Cloud, including RBAC, API keys, and network controls. Make sure to assign the minimum required permissions to users and service accounts. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink.

How do I move my SQL statements to production? To move your Flink SQL statements to production, follow best practices such as using service accounts, applying least-privilege permissions, and thoroughly testing your statements in a development environment before deploying them to production compute pools. For detailed guidance, see Best Practices for Moving SQL Statements to Production. You can use GitHub Actions and Terraform to deploy your Flink SQL statements to production. For more information, see Deploy a Flink SQL Statement Using CI/CD.

Where can I get help or support? If you have questions or need support, you can use the in-product help in the Confluent Cloud Console, visit the Flink documentation, or reach out through the established channels. You can also ask questions in the Confluent Community forums or contact Confluent Support if you have a support plan.

Related content:

- Flink SQL Quick Start with Confluent Cloud Console
- Flink SQL Shell Quick Start
- Stream Processing Concepts

#### Code Examples

```shell
confluent flink shell
```

---

### Get Help with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/get-help.html

You can request support in the Confluent Support Portal. You can access the portal directly, or you can navigate to it from the Confluent Cloud Console by selecting the Support menu, identified by the help icon in the upper-right corner, and choosing Support portal. For more information, see Confluent Support for Confluent Cloud.

Confluent Community Slack: There is a dedicated #flink channel in the Confluent Community Slack. Join the channel to ask questions, provide feedback, and engage with other users.

Troubleshoot Flink in Confluent Cloud Console: If issues occur while running Flink in Cloud Console, consider generating a HAR file and uploading it to the Confluent Community Slack channel or sending it to the Support Portal. For more information, see Generate a HAR file for Troubleshooting on Confluent Cloud.

---

### Get Started with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/get-started/overview.html

Welcome to Confluent Cloud for Apache Flink®. This section guides you through the steps to get your queries running using the Confluent Cloud Console (browser-based) and the Flink SQL shell (CLI-based).

Get started for free: sign up for a Confluent Cloud trial and get $400 of free credit.

If you currently use Confluent Cloud in a region that doesn't yet support Flink, you can't run Flink on the data in your existing Apache Kafka® topics. You can still try out Flink SQL by using sample data generators or the Example catalog, which are used in the quick starts and How-to Guides for Confluent Cloud for Apache Flink.

Choose one of the following quick starts to get started with Flink SQL on Confluent Cloud:

- Flink SQL Quick Start with Confluent Cloud Console
- Flink SQL Shell Quick Start

Also, you can access Flink by using the REST API and the Confluent Terraform Provider:

- REST API-based data streams
- Sample Project for Confluent Terraform Provider

If you get stuck, have a question, or want to provide feedback or feature requests, don't hesitate to reach out. Check out Get Help with Confluent Cloud for Apache Flink for our support channels.

Next steps:

- Flink SQL Quick Start with Confluent Cloud Console
- Flink SQL Shell Quick Start

Related content:

- Stream Processing Concepts
- Flink SQL Queries

---

### Flink SQL Quick Start on Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/get-started/quick-start-cloud-console.html

This quick start gets you up and running with Confluent Cloud for Apache Flink®. The following steps show how to create a workspace for running SQL statements on streaming data.

In this quick start guide, you perform the following steps:

- Step 1: Create a workspace
- Step 2: Run SQL statements
- Step 3: Query streaming data
- Step 4: Query existing topics (optional)

Prerequisites

- Access to Confluent Cloud.

Step 1: Create a workspace

Workspaces provide an intuitive, flexible UI for dynamically exploring and interacting with all of your data on Confluent Cloud using Flink SQL. In a workspace, you can save your queries, run multiple queries simultaneously in a single view, and browse your catalogs, databases, and tables.

1. Log in to Confluent Cloud Console at https://confluent.cloud/login.
2. In the navigation menu, click Stream processing to open the Stream processing page.
3. In the dropdown, select the environment where you want to run Flink SQL, or use the default environment. If you have Kafka topics that you want to run SQL queries on, choose the environment that has these topics.
4. Click Create new workspace, and in the dialog, select the cloud provider and region. If you have Kafka topics that you want to run SQL queries on, select the region that has your Kafka cluster.
5. Click Create workspace. A new workspace opens with an example query in the code editor, or cell.

Under the hood, Confluent Cloud for Apache Flink is creating a compute pool, which represents the compute resources that are used to run your SQL statements. The resources provided by the compute pool are shared among all statements that use it. It enables you to limit or guarantee resources as your use cases require. A compute pool is bound to a region. There is no cost for creating compute pools. It may take a minute or two for the compute pool to be provisioned.

You can change the compute pool where a workspace runs by clicking the workspace settings icon and choosing from the Compute pool selection dropdown.

Step 2: Run SQL statements

When the compute pool status changes from Provisioning to Running, it's ready to run queries. In the cell of the new workspace, you can start running SQL statements.

1. Click Run. The example statement is submitted, and information about the statement is displayed, including its status and a unique identifier.
2. Click the Statement name link to open the statement details view, which displays the statement status and other information. Click X to dismiss the details view.
3. After an initialization period, the query results display beneath the cell. Your output should resemble: EXPR$0 0 1 2
4. Copy the following SQL and paste it into the cell. The statement runs the CURRENT_TIMESTAMP function, which is one of many built-in functions provided by Confluent Cloud for Apache Flink. SELECT CURRENT_TIMESTAMP;
5. Click Run. The result from the statement is displayed beneath the cell. Your output should resemble: CURRENT_TIMESTAMP 2024-03-15 16:23:18.912

Step 3: Query streaming data

Flink SQL enables using familiar SQL syntax to query streaming data. Confluent Cloud for Apache Flink provides example data streams that you can experiment with. In this step, you query the orders table from the marketplace database in the examples catalog.

In Flink SQL, catalog objects, like tables, are scoped by catalog and database. A catalog is a collection of databases that share the same namespace. A database is a collection of tables that share the same namespace. In Confluent Cloud, an environment is mapped to a Flink catalog, and a Kafka cluster is mapped to a Flink database. You can always use three-part identifiers for your tables, like catalog.database.table, but it's more convenient to set a default.

1. Set the default catalog and database by using the Use catalog and Use database dropdown menus in the top-right corner of the workspace. Select examples for the catalog, and marketplace for the database.
2. Click to create a new cell, and run the following statement to list all the tables in the marketplace database. SHOW TABLES; Your output should resemble: table name clicks customers orders products
3. Run the following statement to inspect the orders data stream. SELECT * FROM orders; Your output should resemble: order_id customer_id product_id price 36d77b21-e68f-4123-b87a-cc19ac1f36ac 3137 1305 65.71 7fd3cd2a-392b-4f8f-b953-0bfa1d331354 3063 1327 17.75 1a223c61-38a5-4b8c-8465-2a6b359bf05e 3064 1166 14.95 ...
4. Click Stop to end the query.

Step 4: Query existing topics (optional)

If you've created the workspace in a region where you already have Kafka clusters and topics, you can explore this data with Flink SQL. Confluent Cloud for Apache Flink automatically registers Flink tables on your topics, so you can run statements on your streaming data.

1. Set the default catalog and database by using the Use catalog and Use database dropdown menus. You can find your catalogs and databases in the navigation menu on the left side of the workspace.
2. Click to create a new cell, and run the following statement to list all the tables in the database that you selected as the default. SHOW TABLES;
3. You can browse any of your tables by running a SELECT statement. SELECT * FROM <table_name>;

Next steps:

- How-to Guides for Confluent Cloud for Apache Flink
- Flink SQL Shell Quick Start

Related content:

- DDL Statements
- Stream Processing Concepts
- Built-in Functions
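The three-part identifier form, catalog.database.table, can also be built programmatically when statements are generated from code. The following is a minimal sketch in plain Java. The class name `TableIdentifier` is hypothetical, and the sketch assumes that identifiers are quoted with backticks and that an embedded backtick is escaped by doubling it.

```java
// Hypothetical helper for building fully qualified Flink SQL table names.
final class TableIdentifier {

    // Assumption: backticks quote an identifier, and a literal backtick is doubled.
    static String quote(String part) {
        return "`" + part.replace("`", "``") + "`";
    }

    // Joins catalog, database, and table into catalog.database.table form.
    static String qualified(String catalog, String database, String table) {
        return quote(catalog) + "." + quote(database) + "." + quote(table);
    }

    public static void main(String[] args) {
        // Uses the example catalog, database, and table from this quick start.
        String sql = "SELECT * FROM " + qualified("examples", "marketplace", "orders") + ";";
        System.out.println(sql); // prints SELECT * FROM `examples`.`marketplace`.`orders`;
    }
}
```

Quoting every part is a conservative choice: it avoids surprises when an environment or cluster name contains characters such as hyphens.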

#### Code Examples

```sql
EXPR$0
0
1
2
```

```sql
SELECT CURRENT_TIMESTAMP;
```

```sql
CURRENT_TIMESTAMP
2024-03-15 16:23:18.912
```

```sql
marketplace
```

```sql
catalog.database.table
```

```sql
SHOW TABLES;
```

```sql
table name
clicks
customers
orders
products
```

```sql
SELECT * FROM orders;
```

```sql
order_id                             customer_id product_id price
36d77b21-e68f-4123-b87a-cc19ac1f36ac 3137        1305       65.71
7fd3cd2a-392b-4f8f-b953-0bfa1d331354 3063        1327       17.75
1a223c61-38a5-4b8c-8465-2a6b359bf05e 3064        1166       14.95
...
```

```sql
SHOW TABLES;
```

```sql
SELECT * FROM <table_name>;
```

---

### Java Table API Quick Start on Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/get-started/quick-start-java-table-api.html

Confluent Cloud for Apache Flink® supports programming applications with the Table API. Confluent provides a plugin for running applications that use the Table API on Confluent Cloud. For more information, see Table API. For code examples, see Java Examples for Table API on Confluent Cloud. For a Confluent Developer course, see Apache Flink Table API: Processing Data Streams in Java.

Note: The Flink Table API is available for preview. A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features. Confluent may discontinue providing preview releases of the Preview features at any time in Confluent's sole discretion. Comments, questions, and suggestions related to the Table API are encouraged and can be submitted through the established channels.

Prerequisites

- Access to Confluent Cloud
- A compute pool in Confluent Cloud
- An Apache Kafka® cluster, if you want to run examples that store data in Kafka
- Java version 11 or later
- Maven (see Installing Apache Maven)

To run Table API and Flink SQL programs, you must generate an API key that's specific to the Flink environment. Also, you need Confluent Cloud account details, like your organization and environment identifiers.

- Flink API Key: Follow the steps in Generate a Flink API key. For convenience, assign your Flink key and secret to the FLINK_API_KEY and FLINK_API_SECRET environment variables.
- Organization ID: The identifier of your organization, for example, b0b21724-4586-4a07-b787-d0bb5aacbf87. For convenience, assign your organization identifier to the ORG_ID environment variable.
- Environment ID: The identifier of the environment where your Flink SQL statements run, for example, env-z3y2x1. For convenience, assign your environment identifier to the ENV_ID environment variable.
- Cloud provider name: The name of the cloud provider where your cluster runs, for example, aws. To see the available providers, run the confluent flink region list command. For convenience, assign your cloud provider to the CLOUD_PROVIDER environment variable.
- Cloud region: The name of the region where your cluster runs, for example, us-east-1. To see the available regions, run the confluent flink region list command. For convenience, assign your cloud region to the CLOUD_REGION environment variable.

export CLOUD_PROVIDER="aws" export CLOUD_REGION="us-east-1" export FLINK_API_KEY="<your-flink-api-key>" export FLINK_API_SECRET="<your-flink-api-secret>" export ORG_ID="<your-organization-id>" export ENV_ID="<your-environment-id>" export COMPUTE_POOL_ID="<your-compute-pool-id>"

Compile and run a Table API program

The following code example shows how to run a “Hello World” statement and how to query an example data stream. Copy the following project object model (POM) into a file named pom.xml.
pom.xml <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>example</groupId> <artifactId>flink-table-api-java-hello-world</artifactId> <version>1.0</version> <packaging>jar</packaging> <name>Apache Flink® Table API Java Hello World Example on Confluent Cloud</name> <properties> <flink.version>2.1.0</flink.version> <confluent-plugin.version>2.1-8</confluent-plugin.version> <target.java.version>11</target.java.version> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> <maven.compiler.source>${target.java.version}</maven.compiler.source> <maven.compiler.target>${target.java.version}</maven.compiler.target> <log4j.version>2.17.1</log4j.version> </properties> <repositories> <repository> <id>confluent</id> <url>https://packages.confluent.io/maven/</url> </repository> <repository> <id>apache.snapshots</id> <name>Apache Development Snapshot Repository</name> <url>https://repository.apache.org/content/repositories/snapshots/</url> <releases> <enabled>false</enabled> </releases> <snapshots> <enabled>true</enabled> </snapshots> </repository> </repositories> <dependencies> <!-- Apache Flink dependencies --> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-table-api-java</artifactId> <version>${flink.version}</version> </dependency> <!-- Confluent Flink Table API Java plugin --> <dependency> <groupId>io.confluent.flink</groupId> <artifactId>confluent-flink-table-api-java-plugin</artifactId> <version>${confluent-plugin.version}</version> </dependency> <!-- Add logging framework, to produce console output when running in the IDE. --> <!-- These dependencies are excluded from the application JAR by default. 
--> <dependency> <groupId>org.apache.logging.log4j</groupId> <artifactId>log4j-slf4j-impl</artifactId> <version>${log4j.version}</version> <scope>runtime</scope> </dependency> <dependency> <groupId>org.apache.logging.log4j</groupId> <artifactId>log4j-api</artifactId> <version>${log4j.version}</version> <scope>runtime</scope> </dependency> <dependency> <groupId>org.apache.logging.log4j</groupId> <artifactId>log4j-core</artifactId> <version>${log4j.version}</version> <scope>runtime</scope> </dependency> </dependencies> <build> <sourceDirectory>./example</sourceDirectory> <plugins> <!-- Java Compiler --> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-compiler-plugin</artifactId> <version>3.10.1</version> <configuration> <source>${target.java.version}</source> <target>${target.java.version}</target> </configuration> </plugin> <!-- We use the maven-shade plugin to create a fat jar that contains all necessary dependencies. --> <!-- Change the value of <mainClass>...</mainClass> if your program entry point changes. --> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin</artifactId> <version>3.4.1</version> <executions> <!-- Run shade goal on package phase --> <execution> <phase>package</phase> <goals> <goal>shade</goal> </goals> <configuration> <artifactSet> <excludes> <exclude>org.apache.flink:flink-shaded-force-shading</exclude> <exclude>com.google.code.findbugs:jsr305</exclude> </excludes> </artifactSet> <filters> <filter> <!-- Do not copy the signatures in the META-INF folder. Otherwise, this might cause SecurityExceptions when using the JAR. 
--> <artifact>*:*</artifact> <excludes> <exclude>META-INF/*.SF</exclude> <exclude>META-INF/*.DSA</exclude> <exclude>META-INF/*.RSA</exclude> </excludes> </filter> </filters> <transformers> <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/> <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer"> <mainClass>example.hello_table_api</mainClass> </transformer> </transformers> </configuration> </execution> </executions> </plugin> </plugins> <pluginManagement> <plugins> <!-- This improves the out-of-the-box experience in Eclipse by resolving some warnings. --> <plugin> <groupId>org.eclipse.m2e</groupId> <artifactId>lifecycle-mapping</artifactId> <version>1.0.0</version> <configuration> <lifecycleMappingMetadata> <pluginExecutions> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin</artifactId> <versionRange>[3.1.1,)</versionRange> <goals> <goal>shade</goal> </goals> </pluginExecutionFilter> <action> <ignore/> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-compiler-plugin</artifactId> <versionRange>[3.1,)</versionRange> <goals> <goal>testCompile</goal> <goal>compile</goal> </goals> </pluginExecutionFilter> <action> <ignore/> </action> </pluginExecution> </pluginExecutions> </lifecycleMappingMetadata> </configuration> </plugin> </plugins> </pluginManagement> </build> </project> Create a directory named “example”. mkdir example Create a file named hello_table_api.java in the example directory. touch example/hello_table_api.java Copy the following code into hello_table_api.java. 
package example; import io.confluent.flink.plugin.ConfluentSettings; import io.confluent.flink.plugin.ConfluentTools; import org.apache.flink.table.api.EnvironmentSettings; import org.apache.flink.table.api.Table; import org.apache.flink.table.api.TableEnvironment; import org.apache.flink.types.Row; import java.util.List; /** * A table program example to get started with the Apache Flink® Table API. * * <p>It executes two foreground statements in Confluent Cloud. The results of both statements are * printed to the console. */ public class hello_table_api { // All logic is defined in a main() method. It can run both in an IDE or CI/CD system. public static void main(String[] args) { // Set up connection properties to Confluent Cloud. // Use the fromGlobalVariables() method if you assigned environment variables. // EnvironmentSettings settings = ConfluentSettings.fromGlobalVariables(); // Use the fromArgs(args) method if you want to run with command-line arguments. EnvironmentSettings settings = ConfluentSettings.fromArgs(args); // Initialize the session context to get started. TableEnvironment env = TableEnvironment.create(settings); System.out.println("Running with printing..."); // The Table API centers on 'Table' objects, which help in defining data pipelines // fluently. You can define pipelines fully programmatically. Table table = env.fromValues("Hello world!"); // Also, You can define pipelines with embedded Flink SQL. // Table table = env.sqlQuery("SELECT 'Hello world!'"); // Once the pipeline is defined, execute it on Confluent Cloud. // If no target table has been defined, results are streamed back and can be printed // locally. This can be useful for development and debugging. table.execute().print(); System.out.println("Running with collecting..."); // Results can be collected locally and accessed individually. // This can be useful for testing. 
Table moreHellos = env.fromValues("Hello Bob", "Hello Alice", "Hello Peter").as("greeting"); List<Row> rows = ConfluentTools.collectChangelog(moreHellos, 10); rows.forEach( r -> { String column = r.getFieldAs("greeting"); System.out.println("Greeting: " + column); }); } }

Run the following command to build the jar file. mvn clean package

Run the jar. If you assigned your cloud configuration to the environment variables specified in the Prerequisites section, and you used the fromGlobalVariables method in the hello_table_api code, you don’t need to provide the command-line options. java -jar target/flink-table-api-java-hello-world-1.0.jar \ --cloud aws \ --region us-east-1 \ --flink-api-key key \ --flink-api-secret secret \ --organization-id b0b21724-4586-4a07-b787-d0bb5aacbf87 \ --environment-id env-z3y2x1 \ --compute-pool-id lfcp-8m03rm

Your output should resemble: Running with printing... +----+--------------------------------+ | op | f0 | +----+--------------------------------+ | +I | Hello world! | +----+--------------------------------+ 1 row in set Running with collecting... Greeting: Hello Bob Greeting: Hello Alice Greeting: Hello Peter

Next steps:

- Python Table API Quick Start
- How-to Guides for Confluent Cloud for Apache Flink

Related content:

- Course: Apache Flink® Table API: Processing Data Streams in Java
- GitHub repo: Java Examples for Table API on Confluent Cloud
- GitHub repo: Python Examples for Table API on Confluent Cloud
- Built-in Functions
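If you launch the jar from a wrapper program rather than typing the options by hand, the command-line options shown above can be assembled from the environment variables described in the Prerequisites. The following is a minimal sketch in plain Java. The class name `CloudArgs` is hypothetical, and the pairing of each flag with an environment variable is an assumption inferred from their names, not something the plugin defines.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that turns environment variables into command-line options.
final class CloudArgs {

    // Assumed flag-to-variable pairing, kept in the order used by the example command.
    private static final Map<String, String> FLAG_TO_ENV = new LinkedHashMap<>();
    static {
        FLAG_TO_ENV.put("--cloud", "CLOUD_PROVIDER");
        FLAG_TO_ENV.put("--region", "CLOUD_REGION");
        FLAG_TO_ENV.put("--flink-api-key", "FLINK_API_KEY");
        FLAG_TO_ENV.put("--flink-api-secret", "FLINK_API_SECRET");
        FLAG_TO_ENV.put("--organization-id", "ORG_ID");
        FLAG_TO_ENV.put("--environment-id", "ENV_ID");
        FLAG_TO_ENV.put("--compute-pool-id", "COMPUTE_POOL_ID");
    }

    // Builds [flag, value, flag, value, ...]; fails fast on a missing or blank variable.
    static List<String> fromEnvironment(Map<String, String> env) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> entry : FLAG_TO_ENV.entrySet()) {
            String value = env.get(entry.getValue());
            if (value == null || value.isBlank()) {
                throw new IllegalStateException("Missing environment variable: " + entry.getValue());
            }
            args.add(entry.getKey());
            args.add(value);
        }
        return args;
    }

    public static void main(String[] unused) {
        try {
            // Print only the count, so that the API secret never reaches the console.
            System.out.println("Built " + fromEnvironment(System.getenv()).size() + " arguments");
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```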

#### Code Examples

```text
b0b21724-4586-4a07-b787-d0bb5aacbf87
```

```shell
confluent flink region list
```

```shell
export CLOUD_PROVIDER="aws"
export CLOUD_REGION="us-east-1"
export FLINK_API_KEY="<your-flink-api-key>"
export FLINK_API_SECRET="<your-flink-api-secret>"
export ORG_ID="<your-organization-id>"
export ENV_ID="<your-environment-id>"
export COMPUTE_POOL_ID="<your-compute-pool-id>"
```

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>example</groupId>
    <artifactId>flink-table-api-java-hello-world</artifactId>
    <version>1.0</version>
    <packaging>jar</packaging>

    <name>Apache Flink® Table API Java Hello World Example on Confluent Cloud</name>

    <properties>
        <flink.version>2.1.0</flink.version>
        <confluent-plugin.version>2.1-8</confluent-plugin.version>
        <target.java.version>11</target.java.version>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>${target.java.version}</maven.compiler.source>
        <maven.compiler.target>${target.java.version}</maven.compiler.target>
        <log4j.version>2.17.1</log4j.version>
    </properties>

    <repositories>
        <repository>
            <id>confluent</id>
            <url>https://packages.confluent.io/maven/</url>
        </repository>
        <repository>
            <id>apache.snapshots</id>
            <name>Apache Development Snapshot Repository</name>
            <url>https://repository.apache.org/content/repositories/snapshots/</url>
            <releases>
                <enabled>false</enabled>
            </releases>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
        </repository>
    </repositories>

    <dependencies>
        <!-- Apache Flink dependencies -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-api-java</artifactId>
            <version>${flink.version}</version>
        </dependency>

        <!-- Confluent Flink Table API Java plugin -->
        <dependency>
            <groupId>io.confluent.flink</groupId>
            <artifactId>confluent-flink-table-api-java-plugin</artifactId>
            <version>${confluent-plugin.version}</version>
        </dependency>

        <!-- Add logging framework, to produce console output when running in the IDE. -->
        <!-- These dependencies are excluded from the application JAR by default. -->
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j-impl</artifactId>
            <version>${log4j.version}</version>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>${log4j.version}</version>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>${log4j.version}</version>
            <scope>runtime</scope>
        </dependency>
    </dependencies>

    <build>
        <sourceDirectory>./example</sourceDirectory>
        <plugins>

            <!-- Java Compiler -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.10.1</version>
                <configuration>
                    <source>${target.java.version}</source>
                    <target>${target.java.version}</target>
                </configuration>
            </plugin>

            <!-- We use the maven-shade plugin to create a fat jar that contains all necessary dependencies. -->
            <!-- Change the value of <mainClass>...</mainClass> if your program entry point changes. -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.4.1</version>
                <executions>
                    <!-- Run shade goal on package phase -->
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <artifactSet>
                                <excludes>
                                    <exclude>org.apache.flink:flink-shaded-force-shading</exclude>
                                    <exclude>com.google.code.findbugs:jsr305</exclude>
                                </excludes>
                            </artifactSet>
                            <filters>
                                <filter>
                                    <!-- Do not copy the signatures in the META-INF folder.
                                    Otherwise, this might cause SecurityExceptions when using the JAR. -->
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <transformers>
                                <transformer
                                        implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                                <transformer
                                        implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>example.hello_table_api</mainClass>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>

        <pluginManagement>
            <plugins>

                <!-- This improves the out-of-the-box experience in Eclipse by resolving some warnings. -->
                <plugin>
                    <groupId>org.eclipse.m2e</groupId>
                    <artifactId>lifecycle-mapping</artifactId>
                    <version>1.0.0</version>
                    <configuration>
                        <lifecycleMappingMetadata>
                            <pluginExecutions>
                                <pluginExecution>
                                    <pluginExecutionFilter>
                                        <groupId>org.apache.maven.plugins</groupId>
                                        <artifactId>maven-shade-plugin</artifactId>
                                        <versionRange>[3.1.1,)</versionRange>
                                        <goals>
                                            <goal>shade</goal>
                                        </goals>
                                    </pluginExecutionFilter>
                                    <action>
                                        <ignore/>
                                    </action>
                                </pluginExecution>
                                <pluginExecution>
                                    <pluginExecutionFilter>
                                        <groupId>org.apache.maven.plugins</groupId>
                                        <artifactId>maven-compiler-plugin</artifactId>
                                        <versionRange>[3.1,)</versionRange>
                                        <goals>
                                            <goal>testCompile</goal>
                                            <goal>compile</goal>
                                        </goals>
                                    </pluginExecutionFilter>
                                    <action>
                                        <ignore/>
                                    </action>
                                </pluginExecution>
                            </pluginExecutions>
                        </lifecycleMappingMetadata>
                    </configuration>
                </plugin>
            </plugins>
        </pluginManagement>
    </build>
</project>
```

```shell
mkdir example
```

```text
hello_table_api.java
```

```shell
touch example/hello_table_api.java
```

```java
package example;
import io.confluent.flink.plugin.ConfluentSettings;
import io.confluent.flink.plugin.ConfluentTools;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.types.Row;
import java.util.List;

/**
 * A table program example to get started with the Apache Flink® Table API.
 *
 * <p>It executes two foreground statements in Confluent Cloud. The results of both statements are
 * printed to the console.
 */
public class hello_table_api {

    // All logic is defined in a main() method. It can run in either an IDE or a CI/CD system.
    public static void main(String[] args) {

        // Set up connection properties to Confluent Cloud.
        // Use the fromGlobalVariables() method if you assigned environment variables.
        // EnvironmentSettings settings = ConfluentSettings.fromGlobalVariables();

        // Use the fromArgs(args) method if you want to run with command-line arguments.
        EnvironmentSettings settings = ConfluentSettings.fromArgs(args);

        // Initialize the session context to get started.
        TableEnvironment env = TableEnvironment.create(settings);

        System.out.println("Running with printing...");

        // The Table API centers on 'Table' objects, which help in defining data pipelines
        // fluently. You can define pipelines fully programmatically.
        Table table = env.fromValues("Hello world!");

        // Also, you can define pipelines with embedded Flink SQL.
        // Table table = env.sqlQuery("SELECT 'Hello world!'");

        // Once the pipeline is defined, execute it on Confluent Cloud.
        // If no target table has been defined, results are streamed back and can be printed
        // locally. This can be useful for development and debugging.
        table.execute().print();

        System.out.println("Running with collecting...");

        // Results can be collected locally and accessed individually.
        // This can be useful for testing.
        Table moreHellos = env.fromValues("Hello Bob", "Hello Alice", "Hello Peter").as("greeting");
        List<Row> rows = ConfluentTools.collectChangelog(moreHellos, 10);
        rows.forEach(
                r -> {
                    String column = r.getFieldAs("greeting");
                    System.out.println("Greeting: " + column);
                });
    }
}
```

```shell
mvn clean package
```

```text
fromGlobalVariables
```

```text
hello_table_api
```

```shell
java -jar target/flink-table-api-java-hello-world-1.0.jar \
  --cloud aws \
  --region us-east-1 \
  --flink-api-key key \
  --flink-api-secret secret \
  --organization-id b0b21724-4586-4a07-b787-d0bb5aacbf87 \
  --environment-id env-z3y2x1 \
  --compute-pool-id lfcp-8m03rm
```
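
The command above repeats values that the Prerequisites section suggests storing in environment variables. As a minimal sketch, the following Python helper assembles the same `java -jar` argument list from those variables. The pairing of each variable with a flag is inferred from the names used in this guide, and `build_command` and `ENV_TO_FLAG` are hypothetical names introduced only for illustration.

```python
import os
import shlex

# Assumed pairing of the environment variables from the Prerequisites section
# with the command-line options shown in the java -jar command.
ENV_TO_FLAG = {
    "CLOUD_PROVIDER": "--cloud",
    "CLOUD_REGION": "--region",
    "FLINK_API_KEY": "--flink-api-key",
    "FLINK_API_SECRET": "--flink-api-secret",
    "ORG_ID": "--organization-id",
    "ENV_ID": "--environment-id",
    "COMPUTE_POOL_ID": "--compute-pool-id",
}

JAR_PATH = "target/flink-table-api-java-hello-world-1.0.jar"


def build_command(env=None):
    """Return the java -jar argument list, failing early on missing values."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_FLAG if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    command = ["java", "-jar", JAR_PATH]
    for name, flag in ENV_TO_FLAG.items():
        command += [flag, env[name]]
    return command


if __name__ == "__main__":
    example_env = {
        "CLOUD_PROVIDER": "aws",
        "CLOUD_REGION": "us-east-1",
        "FLINK_API_KEY": "key",
        "FLINK_API_SECRET": "secret",
        "ORG_ID": "b0b21724-4586-4a07-b787-d0bb5aacbf87",
        "ENV_ID": "env-z3y2x1",
        "COMPUTE_POOL_ID": "lfcp-8m03rm",
    }
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(build_command(example_env)))
```

Passing the list to `subprocess.run` instead of printing it avoids shell quoting problems when a secret contains special characters.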

```text
Running with printing...
+----+--------------------------------+
| op |                             f0 |
+----+--------------------------------+
| +I |                   Hello world! |
+----+--------------------------------+
1 row in set
Running with collecting...
Greeting: Hello Bob
Greeting: Hello Alice
Greeting: Hello Peter
```

---

### Python Table API Quick Start on Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/get-started/quick-start-python-table-api.html

Confluent Cloud for Apache Flink® supports programming applications with the Table API. Confluent provides a plugin for running applications that use the Table API on Confluent Cloud. For more information, see Table API. For code examples, see Python Examples for Table API on Confluent Cloud.

Note: The Flink Table API is available for preview. A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features. Confluent may discontinue providing preview releases of the Preview features at any time in Confluent’s sole discretion. Comments, questions, and suggestions related to the Table API are encouraged and can be submitted through the established channels.

Prerequisites

- Access to Confluent Cloud
- A compute pool in Confluent Cloud
- An Apache Kafka® cluster, if you want to run examples that store data in Kafka
- Java version 11 or later
- Environment variables as defined in Environment variables
- The uv package manager to manage your Python versions and environments. Only Python versions 3.9 to 3.12 are supported.

To run Table API and Flink SQL programs, you must generate an API key that’s specific to the Flink environment. Also, you need Confluent Cloud account details, like your organization and environment identifiers.

- Flink API Key: Follow the steps in Generate a Flink API key. For convenience, assign your Flink key and secret to the `FLINK_API_KEY` and `FLINK_API_SECRET` environment variables.
- Organization ID: The identifier of your organization, for example, `b0b21724-4586-4a07-b787-d0bb5aacbf87`. For convenience, assign your organization identifier to the `ORG_ID` environment variable.
- Environment ID: The identifier of the environment where your Flink SQL statements run, for example, `env-z3y2x1`. For convenience, assign your environment identifier to the `ENV_ID` environment variable.
- Cloud provider name: The name of the cloud provider where your cluster runs, for example, `aws`. To see the available providers, run the `confluent flink region list` command. For convenience, assign your cloud provider to the `CLOUD_PROVIDER` environment variable.
- Cloud region: The name of the region where your cluster runs, for example, `us-east-1`. To see the available regions, run the `confluent flink region list` command. For convenience, assign your cloud region to the `CLOUD_REGION` environment variable.

`export CLOUD_PROVIDER="aws" export CLOUD_REGION="us-east-1" export FLINK_API_KEY="<your-flink-api-key>" export FLINK_API_SECRET="<your-flink-api-secret>" export ORG_ID="<your-organization-id>" export ENV_ID="<your-environment-id>" export COMPUTE_POOL_ID="<your-compute-pool-id>"`

Note: The Flink Python API communicates with a Java process. You must have at least Java 11 installed. Check that your `JAVA_HOME` environment variable is set correctly: `echo $JAVA_HOME`. Checking only `java -version` might not be sufficient. If required, install openjdk and export the `JAVA_HOME` variable: `brew install openjdk && export JAVA_HOME=$(/usr/libexec/java_home) && echo $JAVA_HOME`

Set up your environment and run a Table API program

Use uv to create a virtual environment that contains all required dependencies and project files.

Use one of the following commands to install uv: `curl -LsSf https://astral.sh/uv/install.sh | sh`, or `brew install uv`, or `pip install uv`.

Create a new virtual environment: `uv venv --python 3.11`

Copy the following code into a file named `hello_table_api.py`.

`# /// script # requires-python = ">=3.9,<3.12" # dependencies = [ # "confluent-flink-table-api-python-plugin>=2.1-8", # ] # /// from pyflink.table.confluent import ConfluentSettings, ConfluentTools from pyflink.table import TableEnvironment, Row from pyflink.table.expressions import col, row def run(): # Set up the connection to Confluent Cloud settings = ConfluentSettings.from_global_variables() env = TableEnvironment.create(settings) # Run your first Flink statement in Table API env.from_elements([row("Hello world!")]).execute().print() # Or use SQL env.sql_query("SELECT 'Hello world!'").execute().print() # Structure your code with Table objects - the main ingredient of Table API. table = env.from_path("examples.marketplace.clicks") \ .filter(col("user_agent").like("Mozilla%")) \ .select(col("click_id"), col("user_id")) table.print_schema() print(table.explain()) # Use the provided tools to test on a subset of the streaming data expected = ConfluentTools.collect_materialized_limit(table, 50) actual = [Row(42, 500)] if expected != actual: print("Results don't match!") if __name__ == "__main__": run()`

Run the following command to execute the Table API program from the directory where you created `hello_table_api.py`: `uv run hello_table_api.py`

Related content

- Filter Kafka messages in Python using Flink’s Table API
- GitHub repo: Java Examples for Table API on Confluent Cloud
- GitHub repo: Python Examples for Table API on Confluent Cloud
- Java Table API Quick Start
- How-to Guides for Confluent Cloud for Apache Flink
- Built-in Functions

#### Code Examples

```text
b0b21724-4586-4a07-b787-d0bb5aacbf87
```

```shell
confluent flink region list
```

```shell
export CLOUD_PROVIDER="aws"
export CLOUD_REGION="us-east-1"
export FLINK_API_KEY="<your-flink-api-key>"
export FLINK_API_SECRET="<your-flink-api-secret>"
export ORG_ID="<your-organization-id>"
export ENV_ID="<your-environment-id>"
export COMPUTE_POOL_ID="<your-compute-pool-id>"
```
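
Before running a program, it can help to sanity-check the identifiers exported above. The sketch below assumes that an organization ID is a UUID-style string (8-4-4-4-12 hexadecimal groups) and that environment and compute pool IDs start with `env-` and `lfcp-`. These patterns are inferred from the example values in this guide rather than from a documented format, and the function names are hypothetical.

```python
import re

# Assumed formats, inferred from the example values in this guide.
UUID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)
ASSUMED_PREFIXES = {"ENV_ID": "env-", "COMPUTE_POOL_ID": "lfcp-"}


def looks_like_org_id(value):
    """Return True if the value has the 8-4-4-4-12 hexadecimal UUID shape."""
    return UUID_PATTERN.fullmatch(value) is not None


def suspicious_settings(settings):
    """Return names of settings that are missing or do not match the assumed formats."""
    problems = []
    if not looks_like_org_id(settings.get("ORG_ID", "")):
        problems.append("ORG_ID")
    for name, prefix in ASSUMED_PREFIXES.items():
        if not settings.get(name, "").startswith(prefix):
            problems.append(name)
    return problems
```

Calling `suspicious_settings(dict(os.environ))` after `import os` checks the current shell environment. A non-empty result is only a hint to double-check the value, since the formats are assumptions.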

```shell
java -version
```

```shell
echo $JAVA_HOME
```

```shell
brew install openjdk && export JAVA_HOME=$(/usr/libexec/java_home) && echo $JAVA_HOME
```

```shell
curl -LsSf https://astral.sh/uv/install.sh | sh
# or
brew install uv
# or
pip install uv
```

```shell
uv venv --python 3.11
```

```text
hello_table_api.py
```

```python
# /// script
# requires-python = ">=3.9,<3.12"
# dependencies = [
#   "confluent-flink-table-api-python-plugin>=2.1-8",
# ]
# ///

from pyflink.table.confluent import ConfluentSettings, ConfluentTools
from pyflink.table import TableEnvironment, Row
from pyflink.table.expressions import col, row

def run():
    # Set up the connection to Confluent Cloud
    settings = ConfluentSettings.from_global_variables()
    env = TableEnvironment.create(settings)

    # Run your first Flink statement in Table API
    env.from_elements([row("Hello world!")]).execute().print()

    # Or use SQL
    env.sql_query("SELECT 'Hello world!'").execute().print()

    # Structure your code with Table objects - the main ingredient of Table API.
    table = env.from_path("examples.marketplace.clicks") \
        .filter(col("user_agent").like("Mozilla%")) \
        .select(col("click_id"), col("user_id"))

    table.print_schema()
    print(table.explain())

    # Use the provided tools to test on a subset of the streaming data
    expected = ConfluentTools.collect_materialized_limit(table, 50)
    actual = [Row(42, 500)]
    if expected != actual:
        print("Results don't match!")

if __name__ == "__main__":
    run()
```
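
The script header above declares `requires-python = ">=3.9,<3.12"`. As a small illustration of what that range means, the following sketch checks an interpreter version against the same bounds. It follows the header literally, so 3.12 is rejected here; the constant and function names are hypothetical.

```python
import sys

MIN_VERSION = (3, 9)   # inclusive lower bound, taken from the script header
MAX_VERSION = (3, 12)  # exclusive upper bound, taken from the script header


def is_supported(version_info=None):
    """Return True if the (major, minor) version falls in [MIN_VERSION, MAX_VERSION)."""
    major_minor = tuple((version_info or sys.version_info)[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION


if __name__ == "__main__":
    status = "supported" if is_supported() else "outside the declared range"
    print("Python %d.%d is %s" % (sys.version_info[0], sys.version_info[1], status))
```

When the script is started with `uv run`, uv reads the header itself, so a check like this is mainly useful when the file is launched with a plain `python` command.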

```shell
uv run hello_table_api.py
```

---

### SQL Shell Quick Start on Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/get-started/quick-start-shell.html

This quick start walks you through the following steps to get you up and running with Confluent Cloud for Apache Flink®.

- Step 1: Log in to Confluent Cloud with the Confluent CLI
- Step 2: Start the Flink SQL shell
- Step 3: Submit a SQL statement
- Step 4: Create and populate a table
- Step 5: Query streaming data

Prerequisites

You need the following prerequisites to use Confluent Cloud for Apache Flink.

- Access to Confluent Cloud.
- The organization ID, environment ID, and compute pool ID for your organization.
- The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, reach out to your OrganizationAdmin or EnvironmentAdmin.
- The Confluent CLI. To use the Flink SQL shell, update to the latest version of the Confluent CLI by running the following command: `confluent update --yes`. If you used homebrew to install the Confluent CLI, update the CLI by using the `brew upgrade` command instead of `confluent update`. For more information, see Confluent CLI.

Step 1: Log in to Confluent Cloud with the Confluent CLI

Run the following CLI command to log in to Confluent Cloud: `confluent login --save --organization ${ORG_ID}`

Your output should resemble: `Assuming https protocol. Logged in as "<your-email>" for organization "<your-org-id>" ("<your-org-name>").`

Step 2: Start the Flink SQL shell

Start the Flink SQL shell by running the `confluent flink shell` command. The shell connects with Confluent Cloud.

Important: This guide focuses on ad-hoc statements. To run statements in long-running jobs, you should provide the `--service-account` option in the `confluent flink shell` command. When you start the shell without this option, statements run with your user account. For more information, see Service Accounts on Confluent Cloud.

Run the following CLI command to start the Flink SQL shell: `confluent flink shell --compute-pool ${COMPUTE_POOL_ID} --environment ${ENV_ID}`

Your output should resemble: `Welcome! To exit, press Ctrl-Q or type "exit". [Ctrl-Q] Quit [Ctrl-S] Toggle Smart Completion >`

You’re ready to start processing data by submitting statements to Flink SQL.

Step 3: Submit a SQL statement

In the SQL shell, run the following statement to see Flink SQL in action. The CURRENT_TIMESTAMP function returns the local date and time. `SELECT CURRENT_TIMESTAMP;`

Your output should resemble: `Statement name: ab12345c-6e11-7bcd-9 Statement successfully submitted. Fetching results... +-------------------------+ | CURRENT_TIMESTAMP | +-------------------------+ | 2023-07-05 18:57:53.867 | +-------------------------+`

For all functions and statements supported by Flink SQL, see Flink SQL Reference.

Step 4: Create and populate a table

The following steps show how to create a table, populate it with a few records, and query it to view the records it contains.

Run the following statement to create a table that will hold pseudorandom float values. `CREATE TABLE random_float_table( ts TIMESTAMP_LTZ(3), random_value FLOAT);`

Run the following INSERT VALUES statement to populate `random_float_table` with records that have a timestamp field and a float field. Timestamp values are generated by the CURRENT_TIMESTAMP function, and float values are generated by the RAND_INTEGER(INT) function multiplied by a float. `INSERT INTO random_float_table VALUES (CURRENT_TIMESTAMP, RAND_INTEGER(100)*0.02), (CURRENT_TIMESTAMP, RAND_INTEGER(1000)*0.05), (CURRENT_TIMESTAMP, RAND_INTEGER(10000)*0.20), (CURRENT_TIMESTAMP, RAND_INTEGER(100000)*0.22), (CURRENT_TIMESTAMP, RAND_INTEGER(1000000)*0.7);`

Press ENTER to return to the SQL shell. Because INSERT INTO VALUES is a point-in-time statement, it exits after it completes inserting records.

Run the following statement to query `random_float_table` for all of its records. `SELECT * FROM random_float_table;`

Your output should resemble: `ts random_value 2023-09-07 20:24:19.366 0.46 2023-09-07 20:24:19.276 28.75 2023-09-07 20:24:19.367 1467.2 2023-09-07 20:24:19.368 7953.88 2023-09-07 20:24:19.465 685883.1`

Press Q to exit the results view and stop the statement.

Run the SHOW JOBS statement to get the status of statements in your SQL environment. `SHOW JOBS;`

Your output should resemble: `Statement name: dbdb79f8-7e6e-4b03 Statement successfully submitted. Waiting for statement to be ready. Statement phase is PENDING. Statement phase is COMPLETED. +--------------------+-----------+--------------------------------+--------------+------------------+ | Name | Phase | Statement | Compute Pool | Creation Time | +--------------------+-----------+--------------------------------+--------------+------------------+ | f8f118e1-bd79-40c1 | COMPLETED | CREATE TABLE random_float_t... | lfcp-xxxxxx | 2023-09-07 20... | | a30f8a59-af67-4bf6 | COMPLETED | INSERT INTO random_float_ta... | lfcp-xxxxxx | 2023-09-07 20... | +--------------------+-----------+--------------------------------+--------------+------------------+`

Step 5: Query streaming data

Flink SQL enables using familiar SQL syntax to query streaming data. Confluent Cloud for Apache Flink provides example data streams that you can experiment with. In this step, you query the `orders` table from the `marketplace` database in the `examples` catalog.

In Flink SQL, catalog objects, like tables, are scoped by catalog and database. A catalog is a collection of databases that share the same namespace. A database is a collection of tables that share the same namespace. In Confluent Cloud, an environment is mapped to a Flink catalog, and a Kafka cluster is mapped to a Flink database.

You can always use three-part identifiers for your tables, like `catalog.database.table`, but it’s more convenient to set a default.

Run the following statement to set the default catalog. ``USE CATALOG `examples`;``

Your output should resemble: `+---------------------+----------+ | Key | Value | +---------------------+----------+ | sql.current-catalog | examples | +---------------------+----------+`

Run the following statement to set the default database. ``USE `marketplace`;``

Your output should resemble: `+----------------------+-------------+ | Key | Value | +----------------------+-------------+ | sql.current-database | marketplace | +----------------------+-------------+`

Run the following statement to see the list of available tables. `SHOW TABLES;`

Your output should resemble: `+------------+ | Table Name | +------------+ | clicks | | customers | | orders | | products | +------------+`

Run the following statement to inspect the `orders` data stream. `SELECT * FROM orders;`

Your output should resemble: `order_id customer_id product_id price 36d77b21-e68f-4123-b87a-cc19ac1f36ac 3137 1305 65.71 7fd3cd2a-392b-4f8f-b953-0bfa1d331354 3063 1327 17.75 1a223c61-38a5-4b8c-8465-2a6b359bf05e 3064 1166 14.95 ...`

Press Q to exit the results view and stop the statement.

Congratulations, you have run your first Flink SQL statements on Confluent Cloud using the SQL Shell.

Next steps

- How-to Guides for Confluent Cloud for Apache Flink

Related content

- Course: Apache Flink 101
- Course: Building Flink Applications in Java
- DDL Statements
- Stream Processing Concepts
- Built-in Functions

#### Code Examples

```shell
confluent update --yes
```

```shell
brew upgrade
```

```shell
confluent update
```

```shell
confluent login --save --organization ${ORG_ID}
```

```text
Assuming https protocol.
Logged in as "<your-email>" for organization "<your-org-id>" ("<your-org-name>").
```

```shell
confluent flink shell
```

```text
--service-account
```

```shell
confluent flink shell --compute-pool ${COMPUTE_POOL_ID} --environment ${ENV_ID}
```

```text
Welcome!
To exit, press Ctrl-Q or type "exit".

[Ctrl-Q] Quit [Ctrl-S] Toggle Smart Completion
>
```

```sql
SELECT CURRENT_TIMESTAMP;
```

```text
Statement name: ab12345c-6e11-7bcd-9
Statement successfully submitted.
Fetching results...
+-------------------------+
|    CURRENT_TIMESTAMP    |
+-------------------------+
| 2023-07-05 18:57:53.867 |
+-------------------------+
```

```sql
CREATE TABLE random_float_table(
  ts TIMESTAMP_LTZ(3),
  random_value FLOAT);
```

```sql
INSERT INTO random_float_table VALUES
  (CURRENT_TIMESTAMP, RAND_INTEGER(100)*0.02),
  (CURRENT_TIMESTAMP, RAND_INTEGER(1000)*0.05),
  (CURRENT_TIMESTAMP, RAND_INTEGER(10000)*0.20),
  (CURRENT_TIMESTAMP, RAND_INTEGER(100000)*0.22),
  (CURRENT_TIMESTAMP, RAND_INTEGER(1000000)*0.7);
```
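
To make the arithmetic in the INSERT statement concrete, the following plain-Python sketch mimics how each `random_value` is produced. It relies on `RAND_INTEGER(n)` returning a pseudorandom integer in the range [0, n). This is a local simulation, not Flink code, and `rand_integer`, `ROW_SPECS`, and `generate_values` are hypothetical names.

```python
import random

# (upper bound passed to RAND_INTEGER, float multiplier), one pair per inserted row.
ROW_SPECS = [
    (100, 0.02),
    (1000, 0.05),
    (10000, 0.20),
    (100000, 0.22),
    (1000000, 0.7),
]


def rand_integer(bound, rng):
    """Plain-Python stand-in for RAND_INTEGER(bound): an integer in [0, bound)."""
    return rng.randrange(bound)


def generate_values(seed=None):
    """Return one simulated random_value per row of the INSERT statement."""
    rng = random.Random(seed)
    return [rand_integer(bound, rng) * factor for bound, factor in ROW_SPECS]


if __name__ == "__main__":
    for value, (bound, factor) in zip(generate_values(), ROW_SPECS):
        print("RAND_INTEGER(%d) * %s -> %s" % (bound, factor, value))
```

Each simulated value stays below `bound * factor`, which explains why the values in the query result grow by roughly an order of magnitude from one row to the next.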

```text
random_float_table
```

```sql
SELECT * FROM random_float_table;
```

```text
ts                      random_value
2023-09-07 20:24:19.366 0.46
2023-09-07 20:24:19.276 28.75
2023-09-07 20:24:19.367 1467.2
2023-09-07 20:24:19.368 7953.88
2023-09-07 20:24:19.465 685883.1
```

```text
Statement name: dbdb79f8-7e6e-4b03
Statement successfully submitted.
Waiting for statement to be ready. Statement phase is PENDING.
Statement phase is COMPLETED.
+--------------------+-----------+--------------------------------+--------------+------------------+
|        Name        |   Phase   |           Statement            | Compute Pool |  Creation Time   |
+--------------------+-----------+--------------------------------+--------------+------------------+
| f8f118e1-bd79-40c1 | COMPLETED | CREATE TABLE random_float_t... | lfcp-xxxxxx  | 2023-09-07 20... |
| a30f8a59-af67-4bf6 | COMPLETED | INSERT INTO random_float_ta... | lfcp-xxxxxx  | 2023-09-07 20... |
+--------------------+-----------+--------------------------------+--------------+------------------+
```

```text
marketplace
```

```text
catalog.database.table
```

```sql
USE CATALOG `examples`;
```

```text
+---------------------+----------+
|         Key         |  Value   |
+---------------------+----------+
| sql.current-catalog | examples |
+---------------------+----------+
```

```sql
USE `marketplace`;
```

```text
+----------------------+-------------+
|         Key          |    Value    |
+----------------------+-------------+
| sql.current-database | marketplace |
+----------------------+-------------+
```
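
After the default catalog and database are set, a short table name resolves to a full three-part identifier. The following sketch illustrates that resolution in plain Python. It is a simplification that does not handle dots inside backtick-quoted names, and `qualify` is a hypothetical helper, not part of any Flink or Confluent API.

```python
def qualify(name, current_catalog, current_database):
    """Expand a one-, two-, or three-part table name to (catalog, database, table)."""
    parts = [part.strip("`") for part in name.split(".")]
    if len(parts) == 1:
        # Bare table name: use the default catalog and the default database.
        return (current_catalog, current_database, parts[0])
    if len(parts) == 2:
        # database.table: use the default catalog only.
        return (current_catalog, parts[0], parts[1])
    if len(parts) == 3:
        # Already fully qualified: the defaults are ignored.
        return tuple(parts)
    raise ValueError("Expected at most three name parts: " + name)


if __name__ == "__main__":
    print(".".join(qualify("orders", "examples", "marketplace")))
```

With `examples` and `marketplace` as defaults, `orders` and `examples.marketplace.orders` refer to the same table, which is why the later query can use the short name.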

```sql
SHOW TABLES;
```

```text
+------------+
| Table Name |
+------------+
| clicks     |
| customers  |
| orders     |
| products   |
+------------+
```

```sql
SELECT * FROM orders;
```

```text
order_id                             customer_id product_id price
36d77b21-e68f-4123-b87a-cc19ac1f36ac 3137        1305       65.71
7fd3cd2a-392b-4f8f-b953-0bfa1d331354 3063        1327       17.75
1a223c61-38a5-4b8c-8465-2a6b359bf05e 3064        1166       14.95
...
```

---

### Aggregate a Data Stream in a Tumbling Window with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/aggregate-tumbling-window.html

Aggregation over windows is central to processing streaming data. Confluent Cloud for Apache Flink® supports Windowing Table-Valued Functions (Windowing TVFs), a SQL-standard syntax for splitting an infinite stream into windows of finite size and computing aggregations within each window. Windowing TVFs are often used to find the minimum, maximum, or average within a group, to find the first or last record, or to calculate totals.

In this guide, you run a Flink SQL statement that identifies the minimum and maximum order values in a continuous stream of orders data. This topic shows the following steps:

- Step 1: Inspect the example stream
- Step 2: View aggregated results in a tumbling window

Prerequisites

- Access to Confluent Cloud.
- The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, contact your OrganizationAdmin or EnvironmentAdmin. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink.
- A provisioned Flink compute pool.

Step 1: Inspect the example stream

In this step, you query the read-only `orders` table in the `examples.marketplace` database to inspect the stream for fields that you can aggregate.

1. Log in to Confluent Cloud and navigate to your Flink workspace.
2. In the Use catalog dropdown, select your environment.
3. In the Use database dropdown, select your Kafka cluster.
4. Run the following statement to inspect the example `orders` stream. `SELECT * FROM examples.marketplace.orders;`

Your output should resemble: `order_id customer_id product_id price 68362284-34df-41a3-87fb-50b79647b786 3195 1267 47.48 6e03663e-d20b-4a23-848a-aec959d794e3 3094 1412 50.92 84217b5d-7dcb-46d1-9600-675a3734a3ed 3038 1094 83.56 ...`

Step 2: View aggregated results in a tumbling window

Run the following statement to start a windowed query on the `orders` data. `SELECT window_start, window_end, MIN(price) as minimum_order_value, MAX(price) as maximum_order_value FROM TABLE(TUMBLE(TABLE examples.marketplace.orders, DESCRIPTOR($rowtime), INTERVAL '10' SECOND)) GROUP BY window_start, window_end;`

Your output should resemble: `window_start window_end minimum_order_value maximum_order_value 2023-09-12 08:54:20.000 2023-09-12 08:54:30.000 10.05 99.75 2023-09-12 08:54:30.000 2023-09-12 08:54:40.000 10.22 99.88 2023-09-12 08:54:40.000 2023-09-12 08:54:50.000 10.09 150.45 ...`

The Flink statement created with this query identifies the minimum and maximum order value in each 10-second window.

Related content

- Compare Current and Previous Values in a Data Stream
- Windowing Table-Valued Functions
- Window Aggregation Queries
- Window Deduplication Queries
- Built-in Functions

#### Code Examples

```sql
SELECT * FROM examples.marketplace.orders;
```

```sql
order_id                                customer_id   product_id  price
68362284-34df-41a3-87fb-50b79647b786    3195          1267        47.48
6e03663e-d20b-4a23-848a-aec959d794e3    3094          1412        50.92
84217b5d-7dcb-46d1-9600-675a3734a3ed    3038          1094        83.56
...
```

```sql
SELECT
  window_start,
  window_end,
  MIN(price) as minimum_order_value,
  MAX(price) as maximum_order_value
FROM TABLE(TUMBLE(TABLE examples.marketplace.orders, DESCRIPTOR($rowtime), INTERVAL '10' SECOND))
GROUP BY window_start, window_end;
```

```sql
window_start            window_end              minimum_order_value maximum_order_value
2023-09-12 08:54:20.000 2023-09-12 08:54:30.000 10.05               99.75
2023-09-12 08:54:30.000 2023-09-12 08:54:40.000 10.22               99.88
2023-09-12 08:54:40.000 2023-09-12 08:54:50.000 10.09               150.45
...
```
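
The introduction to this guide mentions averages and totals as other common windowed aggregates. As a minimal sketch, assuming the same `examples.marketplace.orders` table, the following query adds a count, an average, and a total per window. The one-minute window size and the column aliases are illustrative assumptions, not part of the original guide.

```sql
-- Sketch only: window size and aliases are chosen for illustration.
SELECT
  window_start,
  window_end,
  COUNT(*)   AS order_count,          -- number of orders in the window
  AVG(price) AS average_order_value,  -- mean order price in the window
  SUM(price) AS total_order_value     -- sum of order prices in the window
FROM TABLE(
  TUMBLE(TABLE examples.marketplace.orders, DESCRIPTOR($rowtime), INTERVAL '1' MINUTE))
GROUP BY window_start, window_end;
```

Because tumbling windows do not overlap, each order contributes to exactly one output row.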

---

### Combine Streams and Track Most Recent Records with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/combine-and-track-most-recent-records.html

When working with streaming data, it’s common to need to combine information from multiple sources while tracking the most recent data for each record. Confluent Cloud for Apache Flink® lets you merge streams and maintain up-to-date information for each record, regardless of which stream it originated from.

In this guide, you learn how to run a Flink SQL statement that combines multiple data streams and keeps track of the most recent information for each record by using window functions. While this example uses order and clickstream data, the pattern can be applied to any number of streams that share a common identifier.

This topic shows the following steps:

- Step 1: Inspect the example source streams
- Step 2: Create a unified view with most recent records

#### Prerequisites

- Access to Confluent Cloud.
- The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, contact your OrganizationAdmin or EnvironmentAdmin. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink.
- A provisioned Flink compute pool.

#### Step 1: Inspect the example source streams

In this step, you examine the read-only orders and clicks tables in the examples.marketplace database to identify:

- The common identifier field that links the streams
- The unique fields from each stream that you want to track

1. Log in to Confluent Cloud and navigate to your Flink workspace.
2. Examine your source streams. The following example includes orders and clicks:

```sql
-- First stream
SELECT * FROM `examples`.`marketplace`.`orders`;

-- Second stream
SELECT * FROM `examples`.`marketplace`.`clicks`;
```

Your output from orders should resemble:

| order_id | customer_id | product_id | price |
| --- | --- | --- | --- |
| be396ae5-d7d9-4454-99d7-9b1c155d51d4 | 3243 | 1304 | 99.55 |
| 79e295d3-5a0b-4127-9337-9a483794e7d4 | 3132 | 1201 | 21.43 |
| 9b59d319-c37a-4088-a803-350d43bc5382 | 3099 | 1271 | 66.70 |
| 8aaa9d8e-d8f7-4bb5-9d59-ce4d0cfc9a92 | 3181 | 1028 | 76.23 |
| e681fa67-3a1e-4e99-ba03-da9fb5d12845 | 3186 | 1212 | 69.67 |
| 89ba7186-f927-462b-860a-68b8c9d51a06 | 3238 | 1336 | 76.89 |
| ebfec6c6-3294-444b-82e5-5a66e7dc5cd5 | 3233 | 1223 | 23.69 |

Your output from clicks should resemble:

| click_id | user_id | url | user_agent | view_time |
| --- | --- | --- | --- | --- |
| a5c31d8b-cc93-4a48-a7d9-c1d389c83f4a | 3099 | https://www.acme.com/product/foxmh | Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0 | 79 |
| b7d42e6f-85a1-4f7b-b1c2-d3e456789abc | 3262 | https://www.acme.com/product/lruuv | Mozilla/5.0 (iPhone; CPU OS 9_3_5 like Mac OS X) AppleWebKit/601.1.46 | 108 |
| c8e53f7a-96b2-4a8c-c2d3-e4f567890def | 3181 | https://www.acme.com/product/vfzsy | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) | 33 |
| d9f64g8b-a7c3-4b9d-d3e4-f5g678901hij | 4882 | https://www.acme.com/product/zkxun | Opera/9.80 (Windows NT 6.0) Presto/2.12.388 Version/12.14 | 99 |
| e74441b6-09da-4113-b8f9-db12cee90c77 | 3500 | https://www.acme.com/product/lruuv | Mozilla/5.0 (iPhone; CPU iPhone OS 11_4_1 like Mac OS X) AppleWebKit/6... | 116 |
| f39236ac-2646-4e5d-bab2-cd4445630529 | 4360 | https://www.acme.com/product/vfzsy | Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5) | 52 |
| 3f3b06df-aa2b-417e-833e-ccc232536c4a | 4171 | https://www.acme.com/product/foxmh | Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) C... | 82 |
| ee9fe475-5420-410d-90ae-47987eba32d5 | 4095 | https://www.acme.com/product/ifgcb | Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/1... | 119 |
| e75faa6f-78d3-45e0-817e-1338381f53a2 | 4904 | https://www.acme.com/product/ffnsl | Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like ... | 36 |
| 77c6acbb-eb71-4a49-96e5-714f8b024c98 | 4681 | https://www.acme.com/product/zkxun | Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.11) Gecko GranParadiso/3... | 67 |

#### Step 2: Create a unified view with most recent records

Run the following statement to combine multiple streams while tracking the most recent information for each record:

```sql
-- This query combines order and click data, tracking the latest values
-- for each customer's interactions across both datasets

-- First, combine order data and clickstream data into a single structure
-- Note: Fields not present in one source are filled with NULL
WITH combined_data AS (
  -- Orders data with empty click-related fields
  SELECT
    customer_id,
    order_id,
    product_id,
    price,
    CAST(NULL AS STRING) AS url,        -- Click-specific fields set to NULL
    CAST(NULL AS STRING) AS user_agent, -- for order records
    CAST(NULL AS INT) AS view_time,
    $rowtime
  FROM `examples`.`marketplace`.`orders`
  UNION ALL
  -- Click data with empty order-related fields
  SELECT
    user_id AS customer_id,             -- Normalize user_id to match customer_id
    CAST(NULL AS STRING) AS order_id,   -- Order-specific fields set to NULL
    CAST(NULL AS STRING) AS product_id, -- for click records
    CAST(NULL AS DOUBLE) AS price,
    url,
    user_agent,
    view_time,
    $rowtime
  FROM `examples`.`marketplace`.`clicks`
)
-- For each customer, maintain the latest value for each field
-- using window functions over the combined dataset
SELECT
  LAST_VALUE(customer_id) OVER w AS customer_id,
  LAST_VALUE(order_id) OVER w AS order_id,
  LAST_VALUE(product_id) OVER w AS product_id,
  LAST_VALUE(price) OVER w AS price,
  LAST_VALUE(url) OVER w AS url,
  LAST_VALUE(user_agent) OVER w AS user_agent,
  LAST_VALUE(view_time) OVER w AS view_time,
  MAX($rowtime) OVER w AS rowtime      -- Track the latest event timestamp
FROM combined_data
-- Define window for tracking latest values per customer
WINDOW w AS (
  PARTITION BY customer_id             -- Group all events by customer
  ORDER BY $rowtime                    -- Order by event timestamp
  ROWS BETWEEN UNBOUNDED PRECEDING     -- Consider all previous events
    AND CURRENT ROW                    -- up to the current one
)
```

Your output should resemble:

| customer_id | order_id | product_id | price | url | user_agent | view_time | rowtime |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 3243 | be396ae5-d7d9-4454-99d7-9b1c155d51d4 | 1304 | 99.55 | NULL | NULL | NULL | 2024-10-22T08:21:07.620Z |
| 3132 | 79e295d3-5a0b-4127-9337-9a483794e7d4 | 1201 | 21.43 | NULL | NULL | NULL | 2024-10-22T08:21:07.640Z |
| 3099 | 9b59d319-c37a-4088-a803-350d43bc5382 | 1271 | 66.7 | https://www.acme.com/product/foxmh | Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0 | 79 | 2024-10-22T08:21:07.600Z |
| 3262 | NULL | NULL | NULL | https://www.acme.com/product/lruuv | Mozilla/5.0 (iPhone; CPU OS 9_3_5 like Mac OS X) AppleWebKit/601.1.46 | 108 | 2024-10-22T08:21:07.637Z |
| 3181 | 8aaa9d8e-d8f7-4bb5-9d59-ce4d0cfc9a92 | 1028 | 76.23 | https://www.acme.com/product/vfzsy | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) | 33 | 2024-10-22T08:21:07.656Z |
| 3186 | e681fa67-3a1e-4e99-ba03-da9fb5d12845 | 1212 | 69.67 | NULL | NULL | NULL | 2024-10-22T08:21:07.660Z |
| 4882 | NULL | NULL | NULL | https://www.acme.com/product/zkxun | Opera/9.80 (Windows NT 6.0) Presto/2.12.388 Version/12.14 | 99 | 2024-10-22T08:21:07.676Z |
| 3238 | 89ba7186-f927-462b-860a-68b8c9d51a06 | 1336 | 76.89 | NULL | NULL | NULL | 2024-10-22T08:21:07.679Z |
| 3233 | ebfec6c6-3294-444b-82e5-5a66e7dc5cd5 | 1223 | 23.69 | NULL | NULL | NULL | 2024-10-22T08:21:07.699Z |

This pattern works by:

- Using a Common Table Expression (CTE) to combine all streams
- Setting fields not present in each stream to NULL
- Using window functions to track the most recent data for each field. As the row for customer 3099 shows, the NULL placeholders do not overwrite earlier non-NULL values, so order fields and click fields accumulate in the same row.
- Partitioning by the common identifier to group related records
- Ordering by the event timestamp ($rowtime) to ensure proper temporal sequencing

You can adapt this pattern by:

- Adding more streams to the UNION ALL
- Changing the common identifier field in the PARTITION BY clause
- Modifying the selected fields based on your needs
- Using a custom watermark strategy

#### Key considerations

When applying this pattern, consider:

- All streams must have a common identifier field
- Timestamp fields should be consistent across streams
- NULL handling may need adjustment based on your use case

#### Why UNION ALL vs. JOIN?

While it might seem natural to use a JOIN to combine data from multiple streams, the UNION ALL approach shown in this pattern offers several important advantages for streaming use cases. Consider what would happen with a join-based approach:

```sql
SELECT
  COALESCE(o.customer_id, c.user_id) as customer_id,
  o.order_id,
  o.product_id,
  o.price,
  c.url,
  c.user_agent,
  c.view_time
FROM orders o
FULL OUTER JOIN clicks c
ON o.customer_id = c.user_id
```

This join would need to maintain state for both streams to match records, leading to several challenges in a streaming context.

##### State management and performance

When using a join, Flink must maintain state for both sides of the join operation to match records. This state grows over time as new records arrive, consuming more resources. In contrast, the UNION ALL pattern simply combines records as they arrive, without needing to maintain state for matching.

##### Handling late-arriving data

With a join, if a click record arrives late, Flink would need to match it against all historical order records for that customer. Similarly, a late order would need to be matched against historical clicks. This can lead to reprocessing of historical data and potential out-of-order results. The UNION ALL pattern handles each record independently, making late-arriving data much simpler to process.

##### Append-only output

The combination of UNION ALL with window functions produces an append-only output stream, where each record contains the complete latest state for a customer at the time of each event. When materializing these results, you can:

- Use an append-only table to maintain the history of how each customer’s state changed over time
- Use an upsert table to maintain only the current state for each customer

For example, when new events arrive for customer 3099 (first an order, then a click):

| customer_id | order_id | product_id | price | url | user_agent | view_time | rowtime |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 3099 | e681fa67-3a1e-4e99-ba03-da9fb5d12845 | 1424 | 89.99 | NULL | NULL | NULL | 2024-10-22T08:21:08.620Z |
| 3099 | e681fa67-3a1e-4e99-ba03-da9fb5d12845 | 1424 | 89.99 | https://www.acme.com/product/vfzsy | Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0 | 45 | 2024-10-22T08:21:09.620Z |

Each event produces a new output record with the complete latest state for that customer. In contrast, a join produces a changelog output where existing records may be updated, requiring downstream systems to handle inserts, updates, and deletions.

#### Related content

- Compare Current and Previous Values in a Data Stream
- Window Aggregation Queries
- Handle Multiple Event Types

#### Code Examples

```sql
-- First stream
SELECT * FROM `examples`.`marketplace`.`orders`;

-- Second stream
SELECT * FROM `examples`.`marketplace`.`clicks`;
```

```sql
order_id                                customer_id   product_id  price
be396ae5-d7d9-4454-99d7-9b1c155d51d4    3243          1304        99.55
79e295d3-5a0b-4127-9337-9a483794e7d4    3132          1201        21.43
9b59d319-c37a-4088-a803-350d43bc5382    3099          1271        66.70
8aaa9d8e-d8f7-4bb5-9d59-ce4d0cfc9a92    3181          1028        76.23
e681fa67-3a1e-4e99-ba03-da9fb5d12845    3186          1212        69.67
89ba7186-f927-462b-860a-68b8c9d51a06    3238          1336        76.89
ebfec6c6-3294-444b-82e5-5a66e7dc5cd5    3233          1223        23.69
```

```sql
click_id                             user_id url                                user_agent                                                                      view_time
a5c31d8b-cc93-4a48-a7d9-c1d389c83f4a 3099    https://www.acme.com/product/foxmh Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0 79
b7d42e6f-85a1-4f7b-b1c2-d3e456789abc 3262    https://www.acme.com/product/lruuv Mozilla/5.0 (iPhone; CPU OS 9_3_5 like Mac OS X) AppleWebKit/601.1.46           108
c8e53f7a-96b2-4a8c-c2d3-e4f567890def 3181    https://www.acme.com/product/vfzsy Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)                              33
d9f64g8b-a7c3-4b9d-d3e4-f5g678901hij 4882    https://www.acme.com/product/zkxun Opera/9.80 (Windows NT 6.0) Presto/2.12.388 Version/12.14                       99
e74441b6-09da-4113-b8f9-db12cee90c77 3500    https://www.acme.com/product/lruuv Mozilla/5.0 (iPhone; CPU iPhone OS 11_4_1 like Mac OS X) AppleWebKit/6...       116
f39236ac-2646-4e5d-bab2-cd4445630529 4360    https://www.acme.com/product/vfzsy Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)                       52
3f3b06df-aa2b-417e-833e-ccc232536c4a 4171    https://www.acme.com/product/foxmh Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) C...        82
ee9fe475-5420-410d-90ae-47987eba32d5 4095    https://www.acme.com/product/ifgcb Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/1...        119
e75faa6f-78d3-45e0-817e-1338381f53a2 4904    https://www.acme.com/product/ffnsl Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like ...         36
77c6acbb-eb71-4a49-96e5-714f8b024c98 4681    https://www.acme.com/product/zkxun Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.11) Gecko GranParadiso/3...    67
```

```sql
-- This query combines order and click data, tracking the latest values
-- for each customer's interactions across both datasets

-- First, combine order data and clickstream data into a single structure
-- Note: Fields not present in one source are filled with NULL
WITH combined_data AS (
  -- Orders data with empty click-related fields
  SELECT
    customer_id,
    order_id,
    product_id,
    price,
    CAST(NULL AS STRING) AS url,        -- Click-specific fields set to NULL
    CAST(NULL AS STRING) AS user_agent, -- for order records
    CAST(NULL AS INT) AS view_time,
    $rowtime
  FROM `examples`.`marketplace`.`orders`
  UNION ALL
  -- Click data with empty order-related fields
  SELECT
    user_id AS customer_id,             -- Normalize user_id to match customer_id
    CAST(NULL AS STRING) AS order_id,   -- Order-specific fields set to NULL
    CAST(NULL AS STRING) AS product_id, -- for click records
    CAST(NULL AS DOUBLE) AS price,
    url,
    user_agent,
    view_time,
    $rowtime
  FROM `examples`.`marketplace`.`clicks`
)
-- For each customer, maintain the latest value for each field
-- using window functions over the combined dataset
SELECT
  LAST_VALUE(customer_id) OVER w AS customer_id,
  LAST_VALUE(order_id) OVER w AS order_id,
  LAST_VALUE(product_id) OVER w AS product_id,
  LAST_VALUE(price) OVER w AS price,
  LAST_VALUE(url) OVER w AS url,
  LAST_VALUE(user_agent) OVER w AS user_agent,
  LAST_VALUE(view_time) OVER w AS view_time,
  MAX($rowtime) OVER w AS rowtime      -- Track the latest event timestamp
FROM combined_data
-- Define window for tracking latest values per customer
WINDOW w AS (
  PARTITION BY customer_id             -- Group all events by customer
  ORDER BY $rowtime                    -- Order by event timestamp
  ROWS BETWEEN UNBOUNDED PRECEDING     -- Consider all previous events
    AND CURRENT ROW                    -- up to the current one
)
```

```sql
customer_id  order_id                               product_id  price    url                                user_agent                                                                         view_time rowtime
3243         be396ae5-d7d9-4454-99d7-9b1c155d51d4   1304        99.55    NULL                               NULL                                                                               NULL      2024-10-22T08:21:07.620Z
3132         79e295d3-5a0b-4127-9337-9a483794e7d4   1201        21.43    NULL                               NULL                                                                               NULL      2024-10-22T08:21:07.640Z
3099         9b59d319-c37a-4088-a803-350d43bc5382   1271        66.7     https://www.acme.com/product/foxmh Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0    79        2024-10-22T08:21:07.600Z
3262         NULL                                   NULL        NULL     https://www.acme.com/product/lruuv Mozilla/5.0 (iPhone; CPU OS 9_3_5 like Mac OS X) AppleWebKit/601.1.46              108       2024-10-22T08:21:07.637Z
3181         8aaa9d8e-d8f7-4bb5-9d59-ce4d0cfc9a92   1028        76.23    https://www.acme.com/product/vfzsy Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)                                 33        2024-10-22T08:21:07.656Z
3186         e681fa67-3a1e-4e99-ba03-da9fb5d12845   1212        69.67    NULL                               NULL                                                                               NULL      2024-10-22T08:21:07.660Z
4882         NULL                                   NULL        NULL     https://www.acme.com/product/zkxun Opera/9.80 (Windows NT 6.0) Presto/2.12.388 Version/12.14                          99        2024-10-22T08:21:07.676Z
3238         89ba7186-f927-462b-860a-68b8c9d51a06   1336        76.89    NULL                               NULL                                                                               NULL      2024-10-22T08:21:07.679Z
3233         ebfec6c6-3294-444b-82e5-5a66e7dc5cd5   1223        23.69    NULL                               NULL                                                                               NULL      2024-10-22T08:21:07.699Z
```
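
One of the adaptations listed in this guide is modifying the selected fields. As a minimal sketch, assuming you only need each customer's latest order price and latest viewed URL, the same UNION ALL and `LAST_VALUE` pattern can be trimmed as follows. The aliases `latest_price` and `latest_url` are hypothetical names introduced for illustration.

```sql
-- Sketch only: a reduced version of the Step 2 query that tracks two fields.
WITH combined_data AS (
  -- Orders contribute a price; the click-only field is a NULL placeholder
  SELECT
    customer_id,
    price,
    CAST(NULL AS STRING) AS url,
    $rowtime
  FROM `examples`.`marketplace`.`orders`
  UNION ALL
  -- Clicks contribute a URL; the order-only field is a NULL placeholder
  SELECT
    user_id AS customer_id,
    CAST(NULL AS DOUBLE) AS price,
    url,
    $rowtime
  FROM `examples`.`marketplace`.`clicks`
)
SELECT
  LAST_VALUE(customer_id) OVER w AS customer_id,
  LAST_VALUE(price) OVER w AS latest_price,  -- hypothetical alias
  LAST_VALUE(url) OVER w AS latest_url,      -- hypothetical alias
  MAX($rowtime) OVER w AS rowtime
FROM combined_data
WINDOW w AS (
  PARTITION BY customer_id
  ORDER BY $rowtime
  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
);
```

Both branches of the UNION ALL must still produce the same columns in the same order with compatible types, which is why the NULL placeholders are cast explicitly.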

```sql
SELECT
  COALESCE(o.customer_id, c.user_id) as customer_id,
  o.order_id,
  o.product_id,
  o.price,
  c.url,
  c.user_agent,
  c.view_time
FROM orders o
FULL OUTER JOIN clicks c
ON o.customer_id = c.user_id
```

```sql
customer_id  order_id                                product_id  price    url                                        user_agent                                                                          view_time  rowtime
3099         e681fa67-3a1e-4e99-ba03-da9fb5d12845    1424        89.99    NULL                                       NULL                                                                                NULL       2024-10-22T08:21:08.620Z
3099         e681fa67-3a1e-4e99-ba03-da9fb5d12845    1424        89.99    https://www.acme.com/product/vfzsy         Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0     45         2024-10-22T08:21:09.620Z
```
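
The guide notes that an upsert table can hold only the current state for each customer. The following is a rough sketch of what such a sink table might look like. The table name `customer_latest_state` is hypothetical, the column types are assumptions inferred from the CASTs in the Step 2 query (with `customer_id` assumed to be `INT`), and the bucket count simply mirrors other examples in this documentation.

```sql
-- Sketch only: hypothetical upsert sink keyed by customer_id.
-- Column types are assumptions; check them against your source tables.
CREATE TABLE customer_latest_state (
  customer_id INT NOT NULL,
  order_id STRING,
  product_id STRING,
  price DOUBLE,
  url STRING,
  user_agent STRING,
  view_time INT,
  PRIMARY KEY (customer_id) NOT ENFORCED  -- one row of state per customer
) DISTRIBUTED BY HASH(customer_id) INTO 6 BUCKETS
WITH (
  'changelog.mode' = 'upsert'             -- later rows replace earlier ones per key
);
```

A statement that inserts the first seven columns of the Step 2 query into this table would then keep one current row per customer, while the same query written to an append-mode table would keep the full history.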

---

### Compare Current and Previous Values in a Data Stream with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/compare-current-and-previous-values.html

Confluent Cloud for Apache Flink® provides a built-in LAG function that lets you access data from a previous row alongside the current row, without the need for a self-join. With it, you can analyze the differences between consecutive rows or build more complex calculations based on previous events. This is particularly useful in scenarios such as comparing daily sales values.

In this guide, you learn how to run a Flink SQL statement that uses the LAG function to compare current and previous order values in a continuous stream of orders data.

This topic shows the following steps:

- Step 1: Inspect the example stream
- Step 2: View aggregated results

#### Prerequisites

- Access to Confluent Cloud.
- The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, contact your OrganizationAdmin or EnvironmentAdmin. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink.
- A provisioned Flink compute pool.

#### Step 1: Inspect the example stream

In this step, you query the read-only orders table in the examples.marketplace database to see which fields are available to compare.

1. Log in to Confluent Cloud and navigate to your Flink workspace.
2. In the Use catalog dropdown, select your environment.
3. In the Use database dropdown, select your Kafka cluster.
4. Run the following statement to inspect the example orders stream.

```sql
SELECT * FROM examples.marketplace.orders;
```

Your output should resemble:

| order_id | customer_id | product_id | price |
| --- | --- | --- | --- |
| 68362284-34df-41a3-87fb-50b79647b786 | 3195 | 1267 | 47.48 |
| 6e03663e-d20b-4a23-848a-aec959d794e3 | 3094 | 1412 | 50.92 |
| 84217b5d-7dcb-46d1-9600-675a3734a3ed | 3038 | 1094 | 83.56 |
| ... | | | |

#### Step 2: View aggregated results

Run the following statement to start a query on the orders data that uses the LAG function to return current and previous order data for each customer.

```sql
SELECT $rowtime AS row_time
      , customer_id
      , order_id
      , price
      , LAG(order_id, 1) OVER (PARTITION BY customer_id ORDER BY $rowtime) previous_order_id
      , LAG(price, 1) OVER (PARTITION BY customer_id ORDER BY $rowtime) previous_order_price
  FROM examples.marketplace.orders;
```

Your output should resemble:

| row_time | customer_id | order_id | price | previous_order_id | previous_order_price |
| --- | --- | --- | --- | --- | --- |
| 2024-01-11 15:42:00.557 | 3213 | 821f81d4-d912-4e0f-ab8b-88fe8d9af397 | 89.34 | 2c26a03b-4cd5-4df6-90d0-0b11916533d2 | 57.89 |
| 2024-01-11 15:42:01.079 | 3090 | 57b20b43-3f52-49d8-b8bc-3a55d0440482 | 50.22 | c913ea7b-a7dc-4b22-b966-8df3f28e8e5e | 66.12 |
| 2024-01-11 15:42:01.391 | 3142 | 8a536722-3e4f-4920-bd33-2b981179b8f8 | 10.77 | NULL | NULL |
| 2024-01-11 15:42:01.482 | 3006 | cabf50e8-129d-4b71-b253-894526a571c1 | 113.12 | NULL | NULL |
| 2024-01-11 15:42:01.681 | 3009 | fd96d839-f06b-43ef-a23f-38e4ca6849b4 | 78.01 | d5cdafb2-ddf1-4161-8843-48ae5f46f524 | 102.34 |
| 2024-01-11 15:42:01.910 | 3158 | 16165e84-d1d6-49b9-afaf-1856c4f2a751 | 354.11 | NULL | NULL |
| ... | | | | | |

Some rows have NULL values for previous_order_id and previous_order_price. For these customers, the current order is the first order they have made, so there is no previous order data to return.

#### Related content

- Aggregate a Stream in a Tumbling Window
- Aggregate Functions
- Time Attributes
- DDL Statements

#### Code Examples

```sql
SELECT * FROM examples.marketplace.orders;
```

```sql
order_id                                customer_id   product_id  price
68362284-34df-41a3-87fb-50b79647b786    3195          1267        47.48
6e03663e-d20b-4a23-848a-aec959d794e3    3094          1412        50.92
84217b5d-7dcb-46d1-9600-675a3734a3ed    3038          1094        83.56
...
```

```sql
SELECT $rowtime AS row_time
      , customer_id
      , order_id
      , price
      , LAG(order_id, 1) OVER (PARTITION BY customer_id ORDER BY $rowtime) previous_order_id
      , LAG(price, 1) OVER (PARTITION BY customer_id ORDER BY $rowtime) previous_order_price
  FROM examples.marketplace.orders;
```

```sql
row_time                 customer_id  order_id                               price    previous_order_id                       previous_order_price
2024-01-11 15:42:00.557  3213         821f81d4-d912-4e0f-ab8b-88fe8d9af397   89.34    2c26a03b-4cd5-4df6-90d0-0b11916533d2    57.89
2024-01-11 15:42:01.079  3090         57b20b43-3f52-49d8-b8bc-3a55d0440482   50.22    c913ea7b-a7dc-4b22-b966-8df3f28e8e5e    66.12
2024-01-11 15:42:01.391  3142         8a536722-3e4f-4920-bd33-2b981179b8f8   10.77    NULL                                    NULL
2024-01-11 15:42:01.482  3006         cabf50e8-129d-4b71-b253-894526a571c1   113.12   NULL                                    NULL
2024-01-11 15:42:01.681  3009         fd96d839-f06b-43ef-a23f-38e4ca6849b4   78.01    d5cdafb2-ddf1-4161-8843-48ae5f46f524    102.34
2024-01-11 15:42:01.910  3158         16165e84-d1d6-49b9-afaf-1856c4f2a751   354.11   NULL                                    NULL
...
```
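
The guide describes LAG as a way to analyze differences between consecutive rows. As a minimal sketch of that idea, assuming the same `examples.marketplace.orders` table, the following query subtracts the previous order price from the current one for each customer. The alias `price_change` is a hypothetical name introduced for illustration.

```sql
-- Sketch only: difference between each order's price and the customer's previous order.
SELECT
  $rowtime AS row_time,
  customer_id,
  order_id,
  price,
  LAG(price, 1) OVER (PARTITION BY customer_id ORDER BY $rowtime) AS previous_order_price,
  -- NULL for a customer's first order, because there is no previous row
  price - LAG(price, 1) OVER (PARTITION BY customer_id ORDER BY $rowtime) AS price_change
FROM examples.marketplace.orders;
```

A positive `price_change` means the customer spent more than on their previous order, and a negative value means they spent less.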

---

### Convert the Serialization Format of a Topic with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/convert-serialization-format.html

This guide shows how to use Confluent Cloud for Apache Flink® to transform a topic serialized in Avro Schema Registry format to a topic serialized in JSON Schema Registry format. The Apache Flink® type system is used to map the data types between these two wire formats.

This topic shows the following steps:

- Step 1: Create a streaming data source using Avro
- Step 2: Inspect the source data
- Step 3: Convert the serialization format to JSON
- Step 4: Delete the long-running statement

#### Prerequisites

You need the following prerequisites to use Confluent Cloud for Apache Flink.

- Access to Confluent Cloud.
- The organization ID, environment ID, and compute pool ID for your organization.
- The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, reach out to your OrganizationAdmin or EnvironmentAdmin.
- The Confluent CLI. To use the Flink SQL shell, update to the latest version of the Confluent CLI by running `confluent update --yes`. If you used homebrew to install the Confluent CLI, update the CLI by using the `brew upgrade` command instead of `confluent update`. For more information, see Confluent CLI.

#### Step 1: Create a streaming data source using Avro

The streaming data for this topic is produced by a Datagen Source Connector that’s configured with the Gaming player activity template. It produces mock data to an Apache Kafka® topic named gaming_player_activity_source. The connector produces player score records that are randomly generated from the gaming_player_activity.avro file.

1. Log in to the Confluent Cloud Console and navigate to the environment that hosts Flink SQL.
2. In the navigation menu, select Connectors. The Connectors page opens.
3. Click Add Connector. The Connector Plugins page opens.
4. In the Search connectors box, enter “datagen”.
5. From the search results, click the Sample Data connector. If the Launch Sample Data dialog opens, click Advanced settings.
6. In the Add Datagen Source Connector page, complete the following steps.

##### 1: Create a topic

1. Click Add new topic, and in the Topic name field, enter “gaming_player_activity_source”.
2. Click Create with defaults. Confluent Cloud creates the Kafka topic that the connector produces records to.
3. In the Topics list, select gaming_player_activity_source and click Continue.

Note: When you’re in a Confluent Cloud environment that has Flink SQL, a SQL table is created automatically when you create a Kafka topic.

##### 2: Kafka credentials

Select the way you want to provide Kafka Cluster credentials. You can choose one of the following options:

- My account: This setting allows your connector to globally access everything that you have access to. With a user account, the connector uses an API key and secret to access the Kafka cluster. This option is not recommended for production.
- Service account: This setting limits the access for your connector by using a service account. This option is recommended for production.
- Use an existing API key: This setting allows you to specify an API key and a secret pair. You can use an existing pair or create a new one. This method is not recommended for production environments.

Note: Freight clusters support only service accounts for Kafka authentication.

1. In the Kafka credentials pane, leave Global access selected, and click Generate API key & download. This creates an API key and secret that allows the connector to access your cluster, and downloads the key and secret to your computer.
2. Click Continue.

##### 3: Configuration

1. On the Configuration page, select AVRO for the output record value format. Selecting AVRO configures the connector to associate a schema with the gaming_player_activity_source topic and register it with Schema Registry.
2. In the Select a template section, click Show more options, and click the Gaming player activity tile.
3. Click Show advanced configurations, and in the Max interval between messages (ms) textbox, enter 10.
4. Click Continue.

##### 4: Sizing

For Connector sizing, leave the slider at the default of 1 task and click Continue.

##### 5: Review and Launch

1. In the Connector name box, select the text and replace it with “gaming_player_activity_source_connector”.
2. Click Continue to start the connector.

The status of your new connector reads Provisioning, which lasts for a few seconds. When the status of the new connector changes from Provisioning to Running, you have a producer sending an event stream to your topic in the Confluent Cloud cluster.

#### Step 2: Inspect the source data

In Cloud Console, navigate to your environment’s Flink workspace, or open a Flink SQL shell by using the Confluent CLI.

If you use the workspace in Cloud Console, set the Use catalog and Use database controls to your environment and Kafka cluster.

If you use the Flink SQL shell, run the following statements to set the current environment and Kafka cluster.

```sql
USE CATALOG <your-environment-name>;
USE DATABASE <your-cluster-name>;
```

Run the following statement to see the data flowing into the gaming_player_activity_source table.

```sql
SELECT * FROM gaming_player_activity_source;
```

Your output should resemble:

| key | player_id | game_room_id | points | coordinates |
| --- | --- | --- | --- | --- |
| x'31303833' | 1083 | 4634 | 85 | [30,39] |
| x'31303731' | 1071 | 3406 | 432 | [91,61] |
| x'31303239' | 1029 | 3078 | 359 | [63,04] |
| x'31303736' | 1076 | 4501 | 256 | [73,12] |
| x'31303437' | 1047 | 3644 | 375 | [24,55] |
| ... | | | | |

If you add $rowtime to the SELECT statement, you can see the Kafka timestamp for each record.

```sql
SELECT $rowtime, * FROM gaming_player_activity_source;
```

Your output should resemble:

| $rowtime | key | player_id | game_room_id | points | coordinates |
| --- | --- | --- | --- | --- | --- |
| 2023-11-08 14:27:27.647 | x'31303838' | 1088 | 4198 | 22 | [02,86] |
| 2023-11-08 14:27:27.695 | x'31303638' | 1068 | 1446 | 132 | [80,86] |
| 2023-11-08 14:27:27.729 | x'31303536' | 1056 | 4839 | 125 | [35,74] |
| 2023-11-08 14:27:27.732 | x'31303530' | 1050 | 4517 | 221 | [11,69] |
| 2023-11-08 14:27:27.746 | x'31303438' | 1048 | 3337 | 339 | [91,10] |
| ... | | | | | |

#### Step 3: Convert the serialization format to JSON

Run the following statement to confirm that the current format of this table is Avro Schema Registry.

```sql
SHOW CREATE TABLE gaming_player_activity_source;
```

Your output should resemble:

```
+-------------------------------------------------------------+
|                      SHOW CREATE TABLE                      |
+-------------------------------------------------------------+
| CREATE TABLE `env`.`clus`.`gaming_player_activity_source` ( |
|   `key` VARBINARY(2147483647),                              |
|   `player_id` INT NOT NULL,                                 |
|   `game_room_id` INT NOT NULL,                              |
|   `points` INT NOT NULL,                                    |
|   `coordinates` VARCHAR(2147483647) NOT NULL,               |
| ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS                 |
| WITH (                                                      |
|   'changelog.mode' = 'append',                              |
|   'connector' = 'confluent',                                |
|   'kafka.cleanup-policy' = 'delete',                        |
|   'kafka.max-message-size' = '2097164 bytes',               |
|   'kafka.partitions' = '6',                                 |
|   'kafka.retention.size' = '0 bytes',                       |
|   'kafka.retention.time' = '604800000 ms',                  |
|   'key.format' = 'raw',                                     |
|   'scan.bounded.mode' = 'unbounded',                        |
|   'scan.startup.mode' = 'earliest-offset',                  |
|   'value.format' = 'avro-registry'                          |
| )                                                           |
|                                                             |
+-------------------------------------------------------------+
```

Run the following statement to create a second table that has the same schema but is configured with the value format set to JSON with Schema Registry. The key format is unchanged.

```sql
CREATE TABLE gaming_player_activity_source_json (
  `key` VARBINARY(2147483647),
  `player_id` INT NOT NULL,
  `game_room_id` INT NOT NULL,
  `points` INT NOT NULL,
  `coordinates` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'value.format' = 'json-registry',
  'key.format' = 'raw'
);
```

This statement creates a corresponding Kafka topic and a Schema Registry subject named gaming_player_activity_source_json-value for the value.

Run the following SQL to create a long-running statement that continuously transforms gaming_player_activity_source records into gaming_player_activity_source_json records.

```sql
INSERT INTO gaming_player_activity_source_json
SELECT * FROM gaming_player_activity_source;
```

Run the following statement to confirm that records are continuously appended to the target table:

```sql
SELECT * FROM gaming_player_activity_source_json;
```

Your output should resemble:

| key | player_id | game_room_id | points | coordinates |
| --- | --- | --- | --- | --- |
| x'31303834' | 1084 | 3583 | 211 | [51,93] |
| x'31303037' | 1007 | 2268 | 55 | [98,72] |
| x'31303230' | 1020 | 1625 | 431 | [01,08] |
| x'31303934' | 1094 | 4760 | 43 | [80,71] |
| x'31303539' | 1059 | 2822 | 390 | [33,74] |
| ... | | | | |

Tip: Run the SHOW JOBS; statement to see the phase of statements that you’ve started in your workspace or Flink SQL shell.

Run the following statement to confirm that the format of the gaming_player_activity_source_json table is JSON.

```sql
SHOW CREATE TABLE gaming_player_activity_source_json;
```

Your output should resemble:

```
+--------------------------------------------------------------------------------------+
|                                  SHOW CREATE TABLE                                   |
+--------------------------------------------------------------------------------------+
| CREATE TABLE `jim-flink-test-env`.`cluster_0`.`gaming_player_activity_source_json` ( |
|   `key` VARBINARY(2147483647),                                                       |
|   `player_id` INT NOT NULL,                                                          |
|   `game_room_id` INT NOT NULL,                                                       |
|   `points` INT NOT NULL,                                                             |
|   `coordinates` VARCHAR(2147483647) NOT NULL                                         |
| ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS                                          |
| WITH (                                                                               |
|   'changelog.mode' = 'append',                                                       |
|   'connector' = 'confluent',                                                         |
|   'kafka.cleanup-policy' = 'delete',                                                 |
|   'kafka.max-message-size' = '2097164 bytes',                                        |
|   'kafka.partitions' = '6',                                                          |
|   'kafka.retention.size' = '0 bytes',                                                |
|   'kafka.retention.time' = '604800000 ms',                                           |
|   'key.format' = 'raw',                                                              |
|   'scan.bounded.mode' = 'unbounded',                                                 |
|   'scan.startup.mode' = 'earliest-offset',                                           |
|   'value.format' = 'json-registry'                                                   |
| )                                                                                    |
|                                                                                      |
+--------------------------------------------------------------------------------------+
```

#### Step 4: Delete the long-running statement

Your INSERT INTO statement is converting records in the Avro format to the JSON format continuously. When you’re done with this guide, free resources in your compute pool by deleting the long-running statement.

1. In Cloud Console, navigate to the Flink page in your environment and click Flink statements.
2. In the statements list, find the statement that has a status of Running.
3. In the Actions column, click … and select Delete statement.
4. In the Confirm statement deletion dialog, copy and paste the statement name and click Confirm.

#### Related content

- Data Type Mappings
- WITH options

#### Code Examples

```shell
confluent update --yes
```

```shell
brew upgrade
```

```shell
confluent update
```

```sql
USE CATALOG <your-environment-name>;
USE DATABASE <your-cluster-name>;
```

```sql
SELECT * FROM gaming_player_activity_source;
```

```text
key         player_id game_room_id points coordinates
x'31303833' 1083      4634         85     [30,39]
x'31303731' 1071      3406         432    [91,61]
x'31303239' 1029      3078         359    [63,04]
x'31303736' 1076      4501         256    [73,12]
x'31303437' 1047      3644         375    [24,55]
...
```
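
In the sample output above, each `key` value is shown as a hex literal, and its bytes appear to be the ASCII digits of `player_id` (for example, `x'31303833'` next to `1083`). The following is a minimal sketch in plain Java for checking that reading. `RawKeyDecoder` and its methods are hypothetical names, not part of any Confluent or Flink API, and treating the key bytes as UTF-8 text is an assumption based only on the sample rows.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative helper (hypothetical, not a Confluent API): decodes a hex literal such as x'31303833'. */
final class RawKeyDecoder {

    /** Converts a hex string (two characters per byte) into the bytes it represents. */
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return bytes;
    }

    /** Interprets the bytes as UTF-8 text. This is an assumption about how the sample keys were written. */
    static String decodeAsText(String hex) {
        return new String(hexToBytes(hex), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Key values copied from the first three rows of the sample output.
        for (String hex : new String[] {"31303833", "31303731", "31303239"}) {
            System.out.println("x'" + hex + "' -> " + decodeAsText(hex));
        }
    }
}
```

For the three sample keys, the decoded text matches the `player_id` column in the same rows (1083, 1071, and 1029).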

```sql
SELECT $rowtime, * FROM gaming_player_activity_source;
```

```text
$rowtime                key         player_id game_room_id points coordinates
2023-11-08 14:27:27.647 x'31303838' 1088      4198         22     [02,86]
2023-11-08 14:27:27.695 x'31303638' 1068      1446         132    [80,86]
2023-11-08 14:27:27.729 x'31303536' 1056      4839         125    [35,74]
2023-11-08 14:27:27.732 x'31303530' 1050      4517         221    [11,69]
2023-11-08 14:27:27.746 x'31303438' 1048      3337         339    [91,10]
...
```

```sql
SHOW CREATE TABLE gaming_player_activity_source;
```

```text
+-------------------------------------------------------------+
|                      SHOW CREATE TABLE                      |
+-------------------------------------------------------------+
| CREATE TABLE `env`.`clus`.`gaming_player_activity_source` ( |
|   `key` VARBINARY(2147483647),                              |
|   `player_id` INT NOT NULL,                                 |
|   `game_room_id` INT NOT NULL,                              |
|   `points` INT NOT NULL,                                    |
|   `coordinates` VARCHAR(2147483647) NOT NULL                |
| ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS                 |
| WITH (                                                      |
|   'changelog.mode' = 'append',                              |
|   'connector' = 'confluent',                                |
|   'kafka.cleanup-policy' = 'delete',                        |
|   'kafka.max-message-size' = '2097164 bytes',               |
|   'kafka.partitions' = '6',                                 |
|   'kafka.retention.size' = '0 bytes',                       |
|   'kafka.retention.time' = '604800000 ms',                  |
|   'key.format' = 'raw',                                     |
|   'scan.bounded.mode' = 'unbounded',                        |
|   'scan.startup.mode' = 'earliest-offset',                  |
|   'value.format' = 'avro-registry'                          |
| )                                                           |
|                                                             |
+-------------------------------------------------------------+
```
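
The option values in the SHOW CREATE TABLE output above carry their units as strings (`'604800000 ms'`, `'2097164 bytes'`). As a rough aid for reading them, the sketch below converts the two values to more familiar units. `TopicOptionUnits` and `leadingNumber` are hypothetical names used only for illustration, and the parsing assumes the "number, space, unit" shape seen in this output.

```java
import java.time.Duration;

/** Illustrative helper (hypothetical): converts option values from the sample output to familiar units. */
final class TopicOptionUnits {

    /** Parses the leading number of an option value such as "604800000 ms" or "2097164 bytes". */
    static long leadingNumber(String optionValue) {
        return Long.parseLong(optionValue.trim().split("\\s+")[0]);
    }

    public static void main(String[] args) {
        // 'kafka.retention.time' = '604800000 ms'
        Duration retention = Duration.ofMillis(leadingNumber("604800000 ms"));
        System.out.println("retention days: " + retention.toDays());

        // 'kafka.max-message-size' = '2097164 bytes'
        long maxBytes = leadingNumber("2097164 bytes");
        long twoMiB = 2L * 1024 * 1024;
        System.out.println("max message size: 2 MiB + " + (maxBytes - twoMiB) + " bytes");
    }
}
```

The retention time works out to 7 days, and the maximum message size is 12 bytes more than 2 MiB.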

```sql
CREATE TABLE gaming_player_activity_source_json (
  `key` VARBINARY(2147483647),
  `player_id` INT NOT NULL,
  `game_room_id` INT NOT NULL,
  `points` INT NOT NULL,
  `coordinates` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'value.format' = 'json-registry',
  'key.format' = 'raw'
);
```

```sql
INSERT INTO gaming_player_activity_source_json
SELECT
  *
FROM gaming_player_activity_source;
```

```sql
SELECT * FROM gaming_player_activity_source_json;
```

```text
key         player_id game_room_id points coordinates
x'31303834' 1084      3583         211    [51,93]
x'31303037' 1007      2268         55     [98,72]
x'31303230' 1020      1625         431    [01,08]
x'31303934' 1094      4760         43     [80,71]
x'31303539' 1059      2822         390    [33,74]
...
```

```sql
SHOW CREATE TABLE gaming_player_activity_source_json;
```

```text
+--------------------------------------------------------------------------------------+
|                                  SHOW CREATE TABLE                                   |
+--------------------------------------------------------------------------------------+
| CREATE TABLE `jim-flink-test-env`.`cluster_0`.`gaming_player_activity_source_json` ( |
|   `key` VARBINARY(2147483647),                                                       |
|   `player_id` INT NOT NULL,                                                          |
|   `game_room_id` INT NOT NULL,                                                       |
|   `points` INT NOT NULL,                                                             |
|   `coordinates` VARCHAR(2147483647) NOT NULL                                         |
| ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS                                          |
| WITH (                                                                               |
|   'changelog.mode' = 'append',                                                       |
|   'connector' = 'confluent',                                                         |
|   'kafka.cleanup-policy' = 'delete',                                                 |
|   'kafka.max-message-size' = '2097164 bytes',                                        |
|   'kafka.partitions' = '6',                                                          |
|   'kafka.retention.size' = '0 bytes',                                                |
|   'kafka.retention.time' = '604800000 ms',                                           |
|   'key.format' = 'raw',                                                              |
|   'scan.bounded.mode' = 'unbounded',                                                 |
|   'scan.startup.mode' = 'earliest-offset',                                           |
|   'value.format' = 'json-registry'                                                   |
| )                                                                                    |
|                                                                                      |
+--------------------------------------------------------------------------------------+
```

---

### Create a User-Defined Function with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/create-udf.html

Create a User-Defined Function with Confluent Cloud for Apache Flink¶ A user-defined function (UDF) extends the capabilities of Confluent Cloud for Apache Flink® and enables you to implement custom logic beyond what is supported by SQL. For example, you can implement functions like encoding and decoding a string, performing geospatial calculations, encrypting and decrypting fields, or reusing an existing library or code from a third-party supplier. Confluent Cloud for Apache Flink supports UDFs written in Java. Package your custom function and its dependencies into a JAR file and upload it as an artifact to Confluent Cloud. Register the function in a Flink database by using the CREATE FUNCTION statement, and invoke your UDF in Flink SQL or the Table API. Confluent Cloud provides the infrastructure to run your code. For a list of cloud service providers and regions that support UDFs, see UDF regional availability. The following steps show how to implement a simple user-defined scalar function, upload it to Confluent Cloud, and use it in a Flink SQL statement. Step 1: Build the uber jar Step 2: Upload the jar as a Flink artifact Step 3: Register the UDF Step 4: Use the UDF in a Flink SQL query Step 5: Implement UDF logging (optional) Step 6: Delete the UDF After you build and run the scalar function, try building a table function. For more code examples, see Flink UDF Java Examples. Permanent and in-line UDFs¶ Starting with Confluent Table API plugin version 2.1-8, you can simplify the process of creating and managing UDFs. Permanent UDFs are registered automatically and can be used in any Flink SQL or Table API program. The Table API creates a temporary JAR file containing all transitive classes required to run the function, uploads it to Confluent Cloud, and registers the function using the previously uploaded artifact. In-line UDFs are defined and used in the same Table API program. Note Permanent and in-line UDFs are an Open Preview feature in Confluent Cloud. 
A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features. Confluent may discontinue providing preview releases of the Preview features at any time in Confluent’s sole discretion.

The following example shows how to create and call a permanent UDF and an in-line UDF. For the full code listing, see Example_09_Functions.java in the flink-table-api-java-examples repository.

Implement a permanent and in-line UDF

package io.confluent.flink.examples.table; import io.confluent.flink.plugin.ConfluentSettings; import org.apache.flink.table.api.EnvironmentSettings; import org.apache.flink.table.api.TableEnvironment; import org.apache.flink.table.functions.ScalarFunction; import org.apache.flink.table.functions.TableFunction; import java.util.List; import static org.apache.flink.table.api.Expressions.$; import static org.apache.flink.table.api.Expressions.array; import static org.apache.flink.table.api.Expressions.call; import static org.apache.flink.table.api.Expressions.row; /** * A table program example showing how to use User-Defined Functions * (UDFs) in the Flink Table API. * * <p>The Flink Table API simplifies the process of creating and managing UDFs. * * <ul> * <li>It helps creating a JAR file containing all required dependencies for a given UDF. * <li>Uploads the JAR to Confluent artifact API. * <li>Creates SQL functions for given artifacts. * </ul> */ public class Example_09_Functions { // Fill this with an environment you have write access to static final String TARGET_CATALOG = ""; // Fill this with a Kafka cluster you have write access to static final String TARGET_DATABASE = ""; // All logic is defined in a main() method. It can run both in an IDE or CI/CD system.
public static void main(String[] args) { // Setup connection properties to Confluent Cloud EnvironmentSettings settings = ConfluentSettings.fromResource("/cloud.properties"); // Initialize the session context to get started TableEnvironment env = TableEnvironment.create(settings); // Set default catalog and database env.useCatalog(TARGET_CATALOG); env.useDatabase(TARGET_DATABASE); System.out.println("Registering a scalar function..."); // The Table API underneath creates a temporary JAR file containing all transitive classes // required to run the function, uploads it to Confluent Cloud, and registers the function // using the previously uploaded artifact. env.createFunction("CustomTax", CustomTax.class, true); // As of now, Scalar and Table functions are supported. System.out.println("Registering a table function..."); env.createFunction("Explode", Explode.class, true); // Once registered, the functions can be used in Table API and SQL queries. System.out.println("Executing registered UDFs..."); env.fromValues(row("Apple", "USA", 2), row("Apple", "EU", 3)) .select( $("f0").as("product"), $("f1").as("location"), $("f2").times(call("CustomTax", $("f1"))).as("tax")) .execute() .print(); env.fromValues( row(1L, "Ann", array("Apples", "Bananas")), row(2L, "Peter", array("Apples", "Pears"))) .joinLateral(call("Explode", $("f2")).as("fruit")) .select($("f0").as("id"), $("f1").as("name"), $("fruit")) .execute() .print(); // Instead of registering functions permanently, you can embed UDFs directly into queries // without registering them first. This will upload all the functions of the query as a // single artifact to Confluent Cloud. Moreover, the functions lifecycle will be bound to // the lifecycle of the query. 
System.out.println("Executing inline UDFs..."); env.fromValues(row("Apple", "USA", 2), row("Apple", "EU", 3)) .select( $("f0").as("product"), $("f1").as("location"), $("f2").times(call(CustomTax.class, $("f1"))).as("tax")) .execute() .print(); env.fromValues( row(1L, "Ann", array("Apples", "Bananas")), row(2L, "Peter", array("Apples", "Pears"))) .joinLateral(call(Explode.class, $("f2")).as("fruit")) .select($("f0").as("id"), $("f1").as("name"), $("fruit")) .execute() .print(); } /** A scalar function that calculates a custom tax based on the provided location. */ public static class CustomTax extends ScalarFunction { public int eval(String location) { if (location.equals("USA")) { return 10; } if (location.equals("EU")) { return 5; } return 0; } } /** A table function that explodes an array of string into multiple rows. */ public static class Explode extends TableFunction<String> { public void eval(List<String> arr) { for (String i : arr) { collect(i); } } } } Prerequisites¶ You need the following prerequisites to use Confluent Cloud for Apache Flink. Access to Confluent Cloud. The organization ID, environment ID, and compute pool ID for your organization. The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, reach out to your OrganizationAdmin or EnvironmentAdmin. The Confluent CLI. To use the Flink SQL shell, update to the latest version of the Confluent CLI by running the following command: confluent update --yes If you used homebrew to install the Confluent CLI, update the CLI by using the brew upgrade command, instead of confluent update. For more information, see Confluent CLI. A provisioned Flink compute pool in Confluent Cloud. Apache Maven software project management tool (see Installing Apache Maven) Java 11 to Java 17 Sufficient permissions to upload and invoke UDFs in Confluent Cloud. 
For more information, see Flink RBAC. If using the Table API only, Flink versions 1.18.x and 1.19.x of flink-table-api-java are supported. Step 1: Build the uber jar¶ In this section, you compile a simple Java class, named TShirtSizingIsSmaller into a jar file. The project is based on the ScalarFunction class in the Flink Table API. The TShirtSizingIsSmaller.java class has an eval function that compares two T-shirt sizes and returns the smaller size. Copy the following project object model into a file named pom.xml. Important You can’t use your own Flink-related jars. If you package Flink core dependencies as part of the jar, you may break the dependency. Also, this example shows how to capture all dependencies greedily, possibly including more than needed. As an alternative, you can optimize on artifact size by listing all dependencies and including their transitive dependencies. pom.xml <?xml version="1.0" encoding="UTF-8"?> <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>example</groupId> <artifactId>udf_example</artifactId> <version>1.0</version> <properties> <maven.compiler.source>11</maven.compiler.source> <maven.compiler.target>11</maven.compiler.target> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> </properties> <dependencies> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-table-api-java</artifactId> <version>2.1.0</version> <scope>provided</scope> </dependency> <!-- Dependencies --> </dependencies> <build> <sourceDirectory>./example</sourceDirectory> <plugins> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin</artifactId> <version>3.6.0</version> <configuration> <artifactSet> <includes> <!-- Include all UDF dependencies and their transitive dependencies here. 
--> <!-- This example shows how to capture all of them greedily. --> <include>*:*</include> </includes> </artifactSet> <filters> <filter> <artifact>*</artifact> <excludes> <!-- Do not copy the signatures in the META-INF folder. Otherwise, this might cause SecurityExceptions when using the JAR. --> <exclude>META-INF/*.SF</exclude> <exclude>META-INF/*.DSA</exclude> <exclude>META-INF/*.RSA</exclude> </excludes> </filter> </filters> </configuration> <executions> <execution> <phase>package</phase> <goals> <goal>shade</goal> </goals> </execution> </executions> </plugin> </plugins> </build> </project> Create a directory named “example”. mkdir example In the example directory, create a file named TShirtSizingIsSmaller.java. touch example/TShirtSizingIsSmaller.java Copy the following code into TShirtSizingIsSmaller.java. package com.example.my; import org.apache.flink.table.functions.ScalarFunction; import java.util.Arrays; import java.util.List; import java.util.stream.IntStream; /** TShirt sizing function for demo. 
*/ public class TShirtSizingIsSmaller extends ScalarFunction { public static final String NAME = "IS_SMALLER"; private static final List<Size> ORDERED_SIZES = Arrays.asList( new Size("X-Small", "XS"), new Size("Small", "S"), new Size("Medium", "M"), new Size("Large", "L"), new Size("X-Large", "XL"), new Size("XX-Large", "XXL")); public boolean eval(String shirt1, String shirt2) { int size1 = findSize(shirt1); int size2 = findSize(shirt2); // If either can't be found just say false rather than throw an error if (size1 == -1 || size2 == -1) { return false; } return size1 < size2; } private int findSize(String shirt) { return IntStream.range(0, ORDERED_SIZES.size()) .filter( i -> { Size s = ORDERED_SIZES.get(i); return s.name.equalsIgnoreCase(shirt) || s.abbreviation.equalsIgnoreCase(shirt); }) .findFirst() .orElse(-1); } private static class Size { private final String name; private final String abbreviation; public Size(String name, String abbreviation) { this.name = name; this.abbreviation = abbreviation; } } } Run the following command to build the jar file. mvn clean package Run the following command to check the contents of your jar. jar -tf target/udf_example-1.0.jar | grep -i TShirtSizingIsSmaller Your output should resemble: com/example/my/TShirtSizingIsSmaller$Size.class com/example/my/TShirtSizingIsSmaller.class Step 2: Upload the jar as a Flink artifact¶ You can use the Confluent Cloud Console, the Confluent CLI, or the REST API to upload your UDF. Confluent Cloud ConsoleConfluent CLIREST API Log in to Confluent Cloud and navigate to your Flink workspace. Navigate to the environment where you want to run the UDF. Click Flink, in the Flink page, click Artifacts. Click Upload artifact to open the upload pane. In the Cloud provider dropdown, select AWS, and in the Region dropdown, select the cloud region. Click Upload your JAR file and navigate to the location of your JAR file, which in the current example is target/udf_example-1.0.jar. 
When your JAR file is uploaded, it appears in the Artifacts list. In the list, click the row for your UDF artifact to open the details pane.

To use the Confluent CLI instead, log in to Confluent Cloud. confluent login --organization-id ${ORG_ID} --prompt

Run the following command to upload the jar to Confluent Cloud. confluent flink artifact create udf_example \ --artifact-file target/udf_example-1.0.jar \ --cloud ${CLOUD_PROVIDER} \ --region ${CLOUD_REGION} \ --environment ${ENV_ID} Your output should resemble: +--------------------+-------------+ | ID | cfa-ldxmro | | Name | udf_example | | Version | ver-81vxm5 | | Cloud | aws | | Region | us-east-1 | | Environment | env-z3q9rd | | Content Format | JAR | | Description | | | Documentation Link | | +--------------------+-------------+

Note the artifact ID and version of your UDF, which in this example are cfa-ldxmro and ver-81vxm5, because you use them later to register the UDF in Flink SQL and to manage it.

Run the following command to view all of the available UDFs. confluent flink artifact list \ --cloud ${CLOUD_PROVIDER} \ --region ${CLOUD_REGION} Your output should resemble: ID | Name | Cloud | Region | Environment -------------+-------------+-------+-----------+-------------- cfa-ldxmro | udf_example | AWS | us-east-1 | env-z3q9rd

Run the following command to view the details of your UDF. You can use the artifact ID from the previous step or the artifact name to specify your UDF.
Use the artifact ID: confluent flink artifact describe \ cfa-ldxmro \ --cloud ${CLOUD_PROVIDER} \ --region ${CLOUD_REGION} Or use the artifact name: confluent flink artifact describe \ udf_example \ --cloud ${CLOUD_PROVIDER} \ --region ${CLOUD_REGION} Your output should resemble: +--------------------+-------------+ | ID | cfa-ldxmro | | Name | udf_example | | Version | ver-81vxm5 | | Cloud | aws | | Region | us-east-1 | | Environment | env-z3q9rd | | Content Format | JAR | | Description | | | Documentation Link | | +--------------------+-------------+

To use the REST API instead, upload your JAR file by requesting a presigned upload URL, and then upload the file by using the presigned URL information. For more information, see Create a Flink artifact.

Step 3: Register the UDF

UDFs are registered inside a Flink database, which means that you must specify the Confluent Cloud environment (Flink catalog) and Kafka cluster (Flink database) where you want to use the UDF. You can use the Confluent Cloud Console, the Confluent CLI, the Confluent Terraform provider, or the REST API to register your UDF.

If you use Cloud Console: In the Flink page, click Compute pools. In the tile for the compute pool where you want to run the UDF, click Open SQL workspace. In the Use catalog dropdown, select the environment where you want to run the UDF. In the Use database dropdown, select the Kafka cluster where you want to run the UDF.

If you use the Confluent CLI: Run the following command to start the Flink shell. confluent flink shell --environment ${ENV_ID} --compute-pool ${COMPUTE_POOL_ID} Run the following statements to specify the catalog and database. -- Specify your catalog. This example uses the default. USE CATALOG default; Your output should resemble: +---------------------+---------+ | Key | Value | +---------------------+---------+ | sql.current-catalog | default | +---------------------+---------+ Specify the database you want to use, for example, cluster_0. -- Specify your database.
This example uses cluster_0. USE cluster_0; Your output should resemble: +----------------------+-----------+ | Key | Value | +----------------------+-----------+ | sql.current-database | cluster_0 | +----------------------+-----------+

You can register a previously uploaded UDF by using the Confluent Terraform provider. For more information, see confluent_flink_artifact Resource. You can register a UDF by sending a POST request to the Create Artifact endpoint. For more information, see Create a Flink artifact.

In Cloud Console or the Confluent CLI, run the CREATE FUNCTION statement to register your UDF in the current catalog and database. Substitute your artifact’s ID for <artifact-id>. CREATE FUNCTION is_smaller AS 'com.example.my.TShirtSizingIsSmaller' USING JAR 'confluent-artifact://<artifact-id>'; Your output should resemble: Function 'is_smaller' created.

Step 4: Use the UDF in a Flink SQL query

Once it is registered, your UDF is available to use in queries. Run the following statement to view the UDFs in the current database. SHOW USER FUNCTIONS; Your output should resemble: +---------------+ | function name | +---------------+ | is_smaller | +---------------+

Run the following statement to create a sizes table. CREATE TABLE sizes ( `size_1` STRING, `size_2` STRING );

Run the following statement to populate the sizes table with values. INSERT INTO sizes VALUES ('XL', 'L'), ('small', 'L'), ('M', 'L'), ('XXL', 'XL');

Run the following statement to view the rows in the sizes table. SELECT * FROM sizes; Your output should resemble: size_1 size_2 XL L small L M L XXL XL

Run the following statement to execute the is_smaller function on the data in the sizes table. SELECT size_1, size_2, is_smaller (size_1, size_2) AS is_smaller FROM sizes; Your output should resemble: size_1 size_2 is_smaller XL L FALSE small L TRUE M L TRUE XXL XL FALSE

Step 5: Implement UDF logging (optional)

If you want to log UDF status messages, follow the steps in Log Debug Messages in UDFs.
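
The comparison that `is_smaller` performs in Step 4 can be checked locally without a Flink dependency. The following is a minimal sketch in plain Java. `IsSmallerSketch` and its methods are hypothetical names that mirror the `eval` logic of `TShirtSizingIsSmaller` but do not extend `ScalarFunction`, so this class is only a local stand-in and not something you can register as a UDF.

```java
import java.util.Arrays;
import java.util.List;

/** Local stand-in (hypothetical) for the eval logic of TShirtSizingIsSmaller. */
final class IsSmallerSketch {

    // Sizes ordered from smallest to largest, each as {name, abbreviation}.
    private static final List<String[]> ORDERED_SIZES = Arrays.asList(
            new String[] {"X-Small", "XS"},
            new String[] {"Small", "S"},
            new String[] {"Medium", "M"},
            new String[] {"Large", "L"},
            new String[] {"X-Large", "XL"},
            new String[] {"XX-Large", "XXL"});

    /** Returns the position of the size in the ordering, or -1 if it is not recognized. */
    static int findSize(String shirt) {
        for (int i = 0; i < ORDERED_SIZES.size(); i++) {
            String[] size = ORDERED_SIZES.get(i);
            if (size[0].equalsIgnoreCase(shirt) || size[1].equalsIgnoreCase(shirt)) {
                return i;
            }
        }
        return -1;
    }

    /** Returns true only when both sizes are recognized and the first is smaller than the second. */
    static boolean isSmaller(String shirt1, String shirt2) {
        int size1 = findSize(shirt1);
        int size2 = findSize(shirt2);
        if (size1 == -1 || size2 == -1) {
            return false;
        }
        return size1 < size2;
    }

    public static void main(String[] args) {
        // The same four rows that the guide inserts into the sizes table.
        String[][] rows = {{"XL", "L"}, {"small", "L"}, {"M", "L"}, {"XXL", "XL"}};
        for (String[] row : rows) {
            System.out.println(row[0] + " " + row[1] + " " + isSmaller(row[0], row[1]));
        }
    }
}
```

For the four sample rows, the sketch yields false, true, true, and false, which agrees with the `is_smaller` column in the Step 4 output. An unrecognized size yields false rather than an error, as in the original class.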
Step 6: Delete the UDF

When you’re finished using the UDF, you can delete it from the current database. You can use the Confluent Cloud Console, the Confluent CLI, the Confluent Terraform provider, or the REST API to delete your UDF.

Drop the function

Run the following statement to remove the is_smaller function from the current database. DROP FUNCTION is_smaller; Your output should resemble: Function 'is_smaller' dropped. Currently running statements are not affected and continue running. Exit the Flink shell. exit;

Delete the JAR artifact

If you use Cloud Console: Navigate to the environment where your UDF is registered. Click Flink, and in the Flink page, click Artifacts. In the artifacts list, find the UDF you want to delete. In the Actions column, click the icon, and in the context menu, select Delete artifact. In the confirmation dialog, type “udf_example”, and click Confirm. The “Artifact deleted successfully” message appears.

If you use the Confluent CLI: Run the following command to delete the artifact from the environment. confluent flink artifact delete \ <artifact-id> \ --cloud ${CLOUD_PROVIDER} \ --region ${CLOUD_REGION} You receive a warning about breaking Flink statements that use the artifact. Type “y” when you’re prompted to proceed. Your output should resemble: Deleted Flink artifact "<artifact-id>".

You can delete a UDF by using the Confluent Terraform provider. For more information, see confluent_flink_artifact Resource. You can delete a UDF by sending a DELETE request to the Delete Artifact endpoint. For more information, see Delete an artifact.

Implement a user-defined table function

In the previous steps, you implemented a UDF with a simple scalar function. Confluent Cloud for Apache Flink also supports user-defined table functions (UDTFs), which take multiple scalar values as input arguments and return multiple rows as output, instead of a single value.
The following steps show how to implement a simple UDTF, upload it to Confluent Cloud, and use it in a Flink SQL statement.

- Step 1: Build the uber jar
- Step 2: Upload the UDTF jar as a Flink artifact
- Step 3: Register the UDTF
- Step 4: Use the UDTF in a Flink SQL query

Step 1: Build the uber jar

In this section, you compile a simple Java class named SplitFunction into a jar file, similar to the previous section. The class is based on the TableFunction class in the Flink Table API. The SplitFunction.java class has an eval function that uses the Java split method to break up a string into words and emits each word as a separate row with a single column named word.

In the example directory, create a file named SplitFunction.java: `touch example/SplitFunction.java`

Copy the following code into SplitFunction.java: `package com.example.my; import org.apache.flink.table.annotation.DataTypeHint; import org.apache.flink.table.annotation.FunctionHint; import org.apache.flink.table.api.*; import org.apache.flink.table.functions.TableFunction; import org.apache.flink.types.Row; import static org.apache.flink.table.api.Expressions.*; @FunctionHint(output = @DataTypeHint("ROW<word STRING>")) public class SplitFunction extends TableFunction<Row> { public void eval(String str, String delimiter) { for (String s : str.split(delimiter)) { /* use collect(...) to emit a row */ collect(Row.of(s)); } } }`

Run the following command to build the jar file. You can use the POM file from the previous section: `mvn clean package`

Run the following command to check the contents of your jar: `jar -tf target/udf_example-1.0.jar | grep -i SplitFunction`

Your output should resemble: `com/example/my/SplitFunction.class`

Step 2: Upload the UDTF jar as a Flink artifact

You can upload the jar by using either the Confluent Cloud Console or the Confluent CLI.

Confluent Cloud Console:

- Log in to Confluent Cloud and navigate to your Flink workspace.
- Navigate to the environment where you want to run the UDF.
- Click Flink, and in the Flink page, click Artifacts.
- Click Upload artifact to open the upload pane.
- In the Cloud provider dropdown, select AWS, and in the Region dropdown, select the cloud region.
- Click Upload your JAR file and navigate to the location of your JAR file, which in the current example is target/udf_example-1.0.jar.
- When your JAR file is uploaded, it appears in the Artifacts list. In the list, click the row for your UDF artifact to open the details pane.

Confluent CLI:

- Log in to Confluent Cloud: `confluent login --organization-id ${ORG_ID} --prompt`
- Run the following command to upload the jar to Confluent Cloud: `confluent flink artifact create udf_table_example --artifact-file target/udf_example-1.0.jar --cloud ${CLOUD_PROVIDER} --region ${CLOUD_REGION} --environment ${ENV_ID}`

Your output should resemble:

- ID: cfa-l5xp82
- Name: udf_table_example
- Version: ver-0x37m2
- Cloud: aws
- Region: us-east-1
- Environment: env-z3q9rd
- Content Format: JAR
- Description: (empty)
- Documentation Link: (empty)

Note the artifact ID and version of your UDTF, which in this example are cfa-l5xp82 and ver-0x37m2, because you use them later to register the UDTF in Flink SQL and to manage it.

Step 3: Register the UDTF

In the Flink shell or the Cloud Console, specify the catalog and database (environment and cluster) where you want to use the UDTF, as you did in the previous section.

Run the CREATE FUNCTION statement to register your UDTF in the current catalog and database. Substitute your UDTF’s value for <artifact-id>: `CREATE FUNCTION split_string AS 'com.example.my.SplitFunction' USING JAR 'confluent-artifact://<artifact-id>';`

Your output should resemble: `Function 'split_string' created.`

Step 4: Use the UDTF in a Flink SQL query

Once it is registered, your UDTF is available to use in queries.

Run the following statement to view the UDFs in the current database: `SHOW USER FUNCTIONS;`

Your output should list a single function name, split_string.

Run the following statement to execute the split_string function: `SELECT * FROM (VALUES 'A;B', 'C;D;E;F') as T(f), LATERAL TABLE(split_string(f, ';'));`

Your output should resemble:

| f | word |
|---|---|
| A;B | A |
| A;B | B |
| C;D;E;F | C |
| C;D;E;F | D |
| C;D;E;F | E |
| C;D;E;F | F |

When you’re done with the example UDTF, drop the function and delete the JAR artifact as you did in Step 6: Delete the UDF.

Related content

- Enable UDF Logging
- confluent flink artifact create
- CREATE FUNCTION Statement
- Artifacts endpoints
- Flink UDF Java Examples

Note: This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```java
package io.confluent.flink.examples.table;

import io.confluent.flink.plugin.ConfluentSettings;

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.functions.ScalarFunction;
import org.apache.flink.table.functions.TableFunction;

import java.util.List;

import static org.apache.flink.table.api.Expressions.$;
import static org.apache.flink.table.api.Expressions.array;
import static org.apache.flink.table.api.Expressions.call;
import static org.apache.flink.table.api.Expressions.row;

/**
* A table program example showing how to use User-Defined Functions
* (UDFs) in the Flink Table API.
*
* <p>The Flink Table API simplifies the process of creating and managing UDFs.
*
* <ul>
*   <li>It helps creating a JAR file containing all required dependencies for a given UDF.
*   <li>Uploads the JAR to Confluent artifact API.
*   <li>Creates SQL functions for given artifacts.
* </ul>
*/
public class Example_09_Functions {

   // Fill this with an environment you have write access to
   static final String TARGET_CATALOG = "";

   // Fill this with a Kafka cluster you have write access to
   static final String TARGET_DATABASE = "";

   // All logic is defined in a main() method. It can run both in an IDE or CI/CD system.
   public static void main(String[] args) {
      // Setup connection properties to Confluent Cloud
      EnvironmentSettings settings = ConfluentSettings.fromResource("/cloud.properties");

      // Initialize the session context to get started
      TableEnvironment env = TableEnvironment.create(settings);

      // Set default catalog and database
      env.useCatalog(TARGET_CATALOG);
      env.useDatabase(TARGET_DATABASE);

      System.out.println("Registering a scalar function...");
      // The Table API underneath creates a temporary JAR file containing all transitive classes
      // required to run the function, uploads it to Confluent Cloud, and registers the function
      // using the previously uploaded artifact.
      env.createFunction("CustomTax", CustomTax.class, true);

      // As of now, Scalar and Table functions are supported.
      System.out.println("Registering a table function...");
      env.createFunction("Explode", Explode.class, true);

      // Once registered, the functions can be used in Table API and SQL queries.
      System.out.println("Executing registered UDFs...");
      env.fromValues(row("Apple", "USA", 2), row("Apple", "EU", 3))
               .select(
                        $("f0").as("product"),
                        $("f1").as("location"),
                        $("f2").times(call("CustomTax", $("f1"))).as("tax"))
               .execute()
               .print();

      env.fromValues(
                        row(1L, "Ann", array("Apples", "Bananas")),
                        row(2L, "Peter", array("Apples", "Pears")))
               .joinLateral(call("Explode", $("f2")).as("fruit"))
               .select($("f0").as("id"), $("f1").as("name"), $("fruit"))
               .execute()
               .print();

      // Instead of registering functions permanently, you can embed UDFs directly into queries
      // without registering them first. This will upload all the functions of the query as a
      // single artifact to Confluent Cloud. Moreover, the functions lifecycle will be bound to
      // the lifecycle of the query.
      System.out.println("Executing inline UDFs...");
      env.fromValues(row("Apple", "USA", 2), row("Apple", "EU", 3))
               .select(
                        $("f0").as("product"),
                        $("f1").as("location"),
                        $("f2").times(call(CustomTax.class, $("f1"))).as("tax"))
               .execute()
               .print();

      env.fromValues(
                        row(1L, "Ann", array("Apples", "Bananas")),
                        row(2L, "Peter", array("Apples", "Pears")))
               .joinLateral(call(Explode.class, $("f2")).as("fruit"))
               .select($("f0").as("id"), $("f1").as("name"), $("fruit"))
               .execute()
               .print();
   }

   /** A scalar function that calculates a custom tax based on the provided location. */
   public static class CustomTax extends ScalarFunction {
      public int eval(String location) {
            if (location.equals("USA")) {
               return 10;
            }
            if (location.equals("EU")) {
               return 5;
            }
            return 0;
      }
   }

   /** A table function that explodes an array of string into multiple rows. */
   public static class Explode extends TableFunction<String> {
      public void eval(List<String> arr) {
            for (String i : arr) {
               collect(i);
            }
      }
   }
}
```
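The values that the two queries in the listing above compute can be checked without a Confluent Cloud connection. The following plain-Java sketch is illustrative only: `UdfLogicSketch` and its methods are hypothetical names that mirror the `eval` bodies of `CustomTax` and `Explode`, and nothing in it calls the Flink Table API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone mirror of the CustomTax and Explode eval logic. */
final class UdfLogicSketch {

    /** Same branches as CustomTax.eval; constant-first equals also tolerates null. */
    static int customTax(String location) {
        if ("USA".equals(location)) {
            return 10;
        }
        if ("EU".equals(location)) {
            return 5;
        }
        return 0;
    }

    /** Value of the "tax" column: quantity times the tax factor of the location. */
    static int tax(int quantity, String location) {
        return quantity * customTax(location);
    }

    /** Like joinLateral with Explode: one "id,name,fruit" row per array element. */
    static List<String> explode(long id, String name, List<String> fruits) {
        List<String> rows = new ArrayList<>();
        for (String fruit : fruits) {
            rows.add(id + "," + name + "," + fruit);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(tax(2, "USA"));
        System.out.println(tax(3, "EU"));
        explode(1L, "Ann", List.of("Apples", "Bananas")).forEach(System.out::println);
    }
}
```

For the sample rows, the tax column works out to 2 × 10 = 20 for USA and 3 × 5 = 15 for EU, and each input row of the second query yields one output row per fruit.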

```shell
confluent update --yes
```

```shell
brew upgrade
```

```shell
confluent update
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>example</groupId>
    <artifactId>udf_example</artifactId>
    <version>1.0</version>

    <properties>
        <maven.compiler.source>11</maven.compiler.source>
        <maven.compiler.target>11</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-api-java</artifactId>
            <version>2.1.0</version>
            <scope>provided</scope>
        </dependency>

        <!-- Dependencies -->

    </dependencies>

    <build>
        <sourceDirectory>./example</sourceDirectory>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.6.0</version>
                <configuration>
                    <artifactSet>
                        <includes>
                            <!-- Include all UDF dependencies and their transitive dependencies here. -->
                            <!-- This example shows how to capture all of them greedily. -->
                            <include>*:*</include>
                        </includes>
                    </artifactSet>
                    <filters>
                        <filter>
                            <artifact>*</artifact>
                            <excludes>
                                <!-- Do not copy the signatures in the META-INF folder.
                                Otherwise, this might cause SecurityExceptions when using the JAR. -->
                                <exclude>META-INF/*.SF</exclude>
                                <exclude>META-INF/*.DSA</exclude>
                                <exclude>META-INF/*.RSA</exclude>
                            </excludes>
                        </filter>
                    </filters>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
```

```shell
mkdir example
```

```shell
touch example/TShirtSizingIsSmaller.java
```

```java
package com.example.my;

import org.apache.flink.table.functions.ScalarFunction;

import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

/** TShirt sizing function for demo. */
public class TShirtSizingIsSmaller extends ScalarFunction {
   public static final String NAME = "IS_SMALLER";

   private static final List<Size> ORDERED_SIZES =
            Arrays.asList(
                  new Size("X-Small", "XS"),
                  new Size("Small", "S"),
                  new Size("Medium", "M"),
                  new Size("Large", "L"),
                  new Size("X-Large", "XL"),
                  new Size("XX-Large", "XXL"));

   public boolean eval(String shirt1, String shirt2) {
      int size1 = findSize(shirt1);
      int size2 = findSize(shirt2);
      // If either can't be found just say false rather than throw an error
      if (size1 == -1 || size2 == -1) {
            return false;
      }
      return size1 < size2;
   }

   private int findSize(String shirt) {
      return IntStream.range(0, ORDERED_SIZES.size())
               .filter(
                        i -> {
                           Size s = ORDERED_SIZES.get(i);
                           return s.name.equalsIgnoreCase(shirt)
                                    || s.abbreviation.equalsIgnoreCase(shirt);
                        })
               .findFirst()
               .orElse(-1);
   }

   private static class Size {
      private final String name;
      private final String abbreviation;

      public Size(String name, String abbreviation) {
            this.name = name;
            this.abbreviation = abbreviation;
      }
   }
}
```

```shell
mvn clean package
```

```shell
jar -tf target/udf_example-1.0.jar | grep -i TShirtSizingIsSmaller
```

```text
com/example/my/TShirtSizingIsSmaller$Size.class
com/example/my/TShirtSizingIsSmaller.class
```

```shell
confluent login --organization-id ${ORG_ID} --prompt
```

```shell
confluent flink artifact create udf_example \
--artifact-file target/udf_example-1.0.jar \
--cloud ${CLOUD_PROVIDER} \
--region ${CLOUD_REGION} \
--environment ${ENV_ID}
```

```text
+--------------------+-------------+
| ID                 | cfa-ldxmro  |
| Name               | udf_example |
| Version            | ver-81vxm5  |
| Cloud              | aws         |
| Region             | us-east-1   |
| Environment        | env-z3q9rd  |
| Content Format     | JAR         |
| Description        |             |
| Documentation Link |             |
+--------------------+-------------+
```

```shell
confluent flink artifact list \
--cloud ${CLOUD_PROVIDER} \
--region ${CLOUD_REGION}
```

```text
     ID     |    Name     | Cloud |  Region   | Environment
------------+-------------+-------+-----------+--------------
 cfa-ldxmro | udf_example | AWS   | us-east-1 | env-z3q9rd
```

```shell
# use the artifact ID
confluent flink artifact describe \
cfa-ldxmro \
--cloud ${CLOUD_PROVIDER} \
--region ${CLOUD_REGION}

# use the artifact name
confluent flink artifact describe \
udf_example \
--cloud ${CLOUD_PROVIDER} \
--region ${CLOUD_REGION}
```

```text
+--------------------+-------------+
| ID                 | cfa-ldxmro  |
| Name               | udf_example |
| Version            | ver-81vxm5  |
| Cloud              | aws         |
| Region             | us-east-1   |
| Environment        | env-z3q9rd  |
| Content Format     | JAR         |
| Description        |             |
| Documentation Link |             |
+--------------------+-------------+
```

```shell
confluent flink shell --environment ${ENV_ID} --compute-pool ${COMPUTE_POOL_ID}
```

```sql
-- Specify your catalog. This example uses the default.
USE CATALOG default;
```

```text
+---------------------+---------+
|         Key         |  Value  |
+---------------------+---------+
| sql.current-catalog | default |
+---------------------+---------+
```

```sql
-- Specify your database. This example uses cluster_0.
USE cluster_0;
```

```text
+----------------------+-----------+
|         Key          |   Value   |
+----------------------+-----------+
| sql.current-database | cluster_0 |
+----------------------+-----------+
```

```sql
CREATE FUNCTION is_smaller
  AS 'com.example.my.TShirtSizingIsSmaller'
  USING JAR 'confluent-artifact://<artifact-id>';
```

```text
Function 'is_smaller' created.
```

```sql
SHOW USER FUNCTIONS;
```

```text
+---------------+
| function name |
+---------------+
| is_smaller    |
+---------------+
```

```sql
CREATE TABLE sizes (
  `size_1` STRING,
  `size_2` STRING
);
```

```sql
INSERT INTO sizes VALUES
  ('XL', 'L'),
  ('small', 'L'),
  ('M', 'L'),
  ('XXL', 'XL');
```

```sql
SELECT * FROM sizes;
```

```text
size_1 size_2
XL     L
small  L
M      L
XXL    XL
```

```sql
SELECT size_1, size_2, is_smaller (size_1, size_2)
  AS is_smaller
  FROM sizes;
```

```text
size_1 size_2 is_smaller
XL     L      FALSE
small  L      TRUE
M      L      TRUE
XXL    XL     FALSE
```
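The TRUE and FALSE values in the output above follow from each size's position in the ordered size list. As a rough check that runs without Flink, the sketch below reimplements the comparison in plain Java; `SizeOrderSketch` is a hypothetical name, and the logic is a simplified mirror of `TShirtSizingIsSmaller`, not the class that is uploaded as an artifact.

```java
import java.util.List;

/** Hypothetical stand-alone mirror of the TShirtSizingIsSmaller comparison. */
final class SizeOrderSketch {

    // Sizes in ascending order as {name, abbreviation}, matching ORDERED_SIZES.
    private static final List<String[]> ORDERED_SIZES = List.of(
            new String[] {"X-Small", "XS"},
            new String[] {"Small", "S"},
            new String[] {"Medium", "M"},
            new String[] {"Large", "L"},
            new String[] {"X-Large", "XL"},
            new String[] {"XX-Large", "XXL"});

    /** Returns the position of the size in the ordering, or -1 if it is unknown. */
    static int findSize(String shirt) {
        for (int i = 0; i < ORDERED_SIZES.size(); i++) {
            String[] size = ORDERED_SIZES.get(i);
            if (size[0].equalsIgnoreCase(shirt) || size[1].equalsIgnoreCase(shirt)) {
                return i;
            }
        }
        return -1;
    }

    /** True only when both sizes are known and the first sorts before the second. */
    static boolean isSmaller(String shirt1, String shirt2) {
        int size1 = findSize(shirt1);
        int size2 = findSize(shirt2);
        // Unknown sizes compare as "not smaller" instead of raising an error.
        return size1 != -1 && size2 != -1 && size1 < size2;
    }

    public static void main(String[] args) {
        String[][] rows = {{"XL", "L"}, {"small", "L"}, {"M", "L"}, {"XXL", "XL"}};
        for (String[] row : rows) {
            System.out.println(row[0] + " " + row[1] + " " + isSmaller(row[0], row[1]));
        }
    }
}
```

Matching is case-insensitive and accepts either the full name or the abbreviation, which is why the row with `small` is recognized.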

```sql
DROP FUNCTION is_smaller;
```

```text
Function 'is_smaller' dropped.
```

```shell
confluent flink artifact delete \
<artifact-id> \
--cloud ${CLOUD_PROVIDER} \
--region ${CLOUD_REGION}
```

```text
Deleted Flink artifact "<artifact-id>".
```

```shell
touch example/SplitFunction.java
```

```java
package com.example.my;

import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.annotation.FunctionHint;
import org.apache.flink.table.api.*;
import org.apache.flink.table.functions.TableFunction;
import org.apache.flink.types.Row;
import static org.apache.flink.table.api.Expressions.*;

@FunctionHint(output = @DataTypeHint("ROW<word STRING>"))
public class SplitFunction extends TableFunction<Row> {

   public void eval(String str, String delimiter) {
      for (String s : str.split(delimiter)) {
         // use collect(...) to emit a row
         collect(Row.of(s));
      }
   }
}
```

```shell
mvn clean package
```

```shell
jar -tf target/udf_example-1.0.jar | grep -i SplitFunction
```

```text
com/example/my/SplitFunction.class
```

```shell
confluent login --organization-id ${ORG_ID} --prompt
```

```shell
confluent flink artifact create udf_table_example \
--artifact-file target/udf_example-1.0.jar \
--cloud ${CLOUD_PROVIDER} \
--region ${CLOUD_REGION} \
--environment ${ENV_ID}
```

```text
+--------------------+-------------------+
| ID                 | cfa-l5xp82        |
| Name               | udf_table_example |
| Version            | ver-0x37m2        |
| Cloud              | aws               |
| Region             | us-east-1         |
| Environment        | env-z3q9rd        |
| Content Format     | JAR               |
| Description        |                   |
| Documentation Link |                   |
+--------------------+-------------------+
```

```sql
CREATE FUNCTION split_string
  AS 'com.example.my.SplitFunction'
  USING JAR 'confluent-artifact://<artifact-id>';
```

```text
Function 'split_string' created.
```

```sql
SHOW USER FUNCTIONS;
```

```text
+---------------+
| Function Name |
+---------------+
| split_string  |
+---------------+
```

```sql
SELECT * FROM (VALUES 'A;B', 'C;D;E;F') as T(f), LATERAL TABLE(split_string(f, ';'));
```

```text
f        word
A;B      A
A;B      B
C;D;E;F  C
C;D;E;F  D
C;D;E;F  E
C;D;E;F  F
```
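The output above pairs each input string with every word that the UDTF emits for it. A minimal plain-Java sketch of that row expansion follows; `SplitSketch` and its methods are hypothetical names, and the code does not depend on Flink. One deliberate difference from `SplitFunction`: Java's `String.split` treats its argument as a regular expression, so the sketch wraps the delimiter in `Pattern.quote` to match it literally. That wrapping is an assumption about the desired behavior and is not part of the original UDTF.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical plain-Java mirror of LATERAL TABLE(split_string(f, ';')). */
final class SplitSketch {

    /** Mirrors SplitFunction.eval: one output word per piece of the input string. */
    static List<String> split(String str, String delimiter) {
        List<String> words = new ArrayList<>();
        // String.split interprets its argument as a regular expression.
        // Pattern.quote (an addition in this sketch) makes delimiters such as
        // "|" or "." match literally.
        for (String s : str.split(Pattern.quote(delimiter))) {
            words.add(s);
        }
        return words;
    }

    /** Pairs every input value with each of its words, like the lateral join does. */
    static List<String[]> lateralJoin(List<String> inputs, String delimiter) {
        List<String[]> rows = new ArrayList<>();
        for (String f : inputs) {
            for (String word : split(f, delimiter)) {
                rows.add(new String[] {f, word});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : lateralJoin(List.of("A;B", "C;D;E;F"), ";")) {
            System.out.println(row[0] + "  " + row[1]);
        }
    }
}
```

With the original `str.split(delimiter)`, a delimiter like `|` would be read as regex alternation and split between every character, so quoting may be worth considering if callers pass arbitrary delimiters.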

---

### Deduplicate Rows in a Table with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/deduplicate-rows.html

Confluent Cloud for Apache Flink® lets you generate, with only a few clicks, a table that contains only the unique records from an input table. In this guide, you create a Flink table and apply the Deduplicate Rows action to generate a topic that has only unique records, by using a deduplication statement. The Deduplicate Rows action creates a Flink SQL statement for you, but no knowledge of Flink SQL is required to use it.

This guide shows the following steps:

- Step 1: Create a users table
- Step 2: Apply the Deduplicate Topic action
- Step 3: Inspect the output table

Prerequisites

- Access to Confluent Cloud.
- The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, contact your OrganizationAdmin or EnvironmentAdmin. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink.
- A provisioned Flink compute pool.

Step 1: Create a users table

Before you can deduplicate rows, you need a table with sample data that contains duplicates. In this step, you create a simple users table and populate it with mock records, some of which are duplicated intentionally.

Log in to Confluent Cloud and navigate to your Flink workspace.

Run the following statement to create a users table: `CREATE TABLE users ( user_id STRING NOT NULL, registertime BIGINT, gender STRING, regionid STRING );`

Insert rows with mock data into the users table: `INSERT INTO users VALUES ('Thomas A. Anderson', 1677260724, 'male', 'Region_4'), ('Thomas A. Anderson', 1677260724, 'male', 'Region_4'), ('Trinity', 1677260733, 'female', 'Region_4'), ('Trinity', 1677260733, 'female', 'Region_4'), ('Morpheus', 1677260742, 'male', 'Region_8'), ('Morpheus', 1677260742, 'male', 'Region_8'), ('Dozer', 1677260823, 'male', 'Region_1'), ('Agent Smith', 1677260955, 'male', 'Region_0'), ('Persephone', 1677260901, 'female', 'Region_2'), ('Niobe', 1677260921, 'female', 'Region_3'), ('Niobe', 1677260921, 'female', 'Region_3'), ('Niobe', 1677260921, 'female', 'Region_3'), ('Zee', 1677260922, 'female', 'Region_5');`

Inspect the inserted rows: `SELECT * FROM users;`

Your output should resemble:

| user_id | registertime | gender | regionid |
|---|---|---|---|
| Thomas A. Anderson | 1677260724 | male | Region_4 |
| Thomas A. Anderson | 1677260724 | male | Region_4 |
| Trinity | 1677260733 | female | Region_4 |
| Trinity | 1677260733 | female | Region_4 |
| Morpheus | 1677260742 | male | Region_8 |
| Morpheus | 1677260742 | male | Region_8 |
| Dozer | 1677260823 | male | Region_1 |
| Agent Smith | 1677260955 | male | Region_0 |
| Persephone | 1677260901 | female | Region_2 |
| Niobe | 1677260921 | female | Region_3 |
| Niobe | 1677260921 | female | Region_3 |
| Niobe | 1677260921 | female | Region_3 |
| Zee | 1677260922 | female | Region_5 |

Step 2: Apply the Deduplicate Topic action

In the previous step, you created a Flink table that had duplicate rows. In this step, you apply the Deduplicate Topic action to create an output table that has only unique rows.

- In the navigation menu, click Data portal.
- In the Data portal page, click the Environment dropdown menu and select the environment for your workspace.
- In the Recently created section, find your users topic and click it to open the details pane.
- Click Actions, and in the Actions list, click Deduplicate topic to open the Deduplicate topic dialog.
- In the Fields to deduplicate dropdown, select user_id. Flink uses the deduplication field as the output message key. This means that the output topic’s row key may be different from the input topic’s row key, because the deduplication statement’s DISTRIBUTED BY clause determines the output topic’s key. For this example, the output message key is the user_id field.
- In the Compute pool dropdown, select the compute pool you want to use.
- (Optional) In the Runtime configuration section, select Run with a service account to run the deduplicate query with a service account principal. Use this option for production queries. Note: The service account you select must have the DeveloperManage and DeveloperWrite roles to create topics and schemas and to run Flink statements. For more information, see Grant Role-Based Access.
- Click the Show SQL toggle to view the statement that the action will run. For this example, the deduplication query depends on the registertime field, so you must modify the generated statement to sort on the registertime field.
- Click Open SQL editor to modify the statement. A Flink workspace opens with the generated statement in the cell.
- In the cell, replace $rowtime with registertime in the ORDER BY clause: `` CREATE TABLE `<your-environment>`.`<your-kafka-cluster>`.`users_deduplicate` ( PRIMARY KEY (`user_id`) NOT ENFORCED ) DISTRIBUTED BY HASH( `user_id` ) WITH ( 'changelog.mode' = 'upsert', 'value.format'='avro-registry', 'key.format'='avro-registry' ) AS SELECT `user_id`, `registertime`, `gender`, `regionid` FROM ( SELECT *, ROW_NUMBER() OVER (PARTITION BY `user_id` ORDER BY registertime ASC) AS row_num FROM `<your-environment>`.`<your-kafka-cluster>`.`users`) WHERE row_num = 1; ``
- Click Run to execute the deduplication query. The CREATE TABLE AS SELECT statement creates the users_deduplicate table and populates it with rows from the users table using a deduplication query.

When the Statement status changes to Running, you can query the users_deduplicate table.

Step 3: Inspect the output table

The statement generated by the Deduplicate Topic action created an output table named users_deduplicate. In this step, you query the output table to see the deduplicated rows.

Run the following statement to inspect the users_deduplicate output table: `SELECT * FROM users_deduplicate;`

Your output should resemble:

| user_id | registertime | gender | regionid |
|---|---|---|---|
| Thomas A. Anderson | 1677260724 | male | Region_4 |
| Trinity | 1677260733 | female | Region_4 |
| Morpheus | 1677260742 | male | Region_8 |
| Dozer | 1677260823 | male | Region_1 |
| Agent Smith | 1677260955 | male | Region_0 |
| Persephone | 1677260901 | female | Region_2 |
| Niobe | 1677260921 | female | Region_3 |
| Zee | 1677260922 | female | Region_5 |

Related content

- Flink action: Mask Fields in a Table
- Flink action: Transform a Topic
- Flink action: Create an Embedding
- Aggregate a Stream in a Tumbling Window
- Compare Current and Previous Values in a Data Stream
- Convert the Serialization Format of a Topic

#### Code Examples

```sql
CREATE TABLE users (
  user_id STRING NOT NULL,
  registertime BIGINT,
  gender STRING,
  regionid STRING
);
```

```sql
INSERT INTO users VALUES
  ('Thomas A. Anderson', 1677260724, 'male', 'Region_4'),
  ('Thomas A. Anderson', 1677260724, 'male', 'Region_4'),
  ('Trinity', 1677260733, 'female', 'Region_4'),
  ('Trinity', 1677260733, 'female', 'Region_4'),
  ('Morpheus', 1677260742, 'male', 'Region_8'),
  ('Morpheus', 1677260742, 'male', 'Region_8'),
  ('Dozer', 1677260823, 'male', 'Region_1'),
  ('Agent Smith', 1677260955, 'male', 'Region_0'),
  ('Persephone', 1677260901, 'female', 'Region_2'),
  ('Niobe', 1677260921, 'female', 'Region_3'),
  ('Niobe', 1677260921, 'female', 'Region_3'),
  ('Niobe', 1677260921, 'female', 'Region_3'),
  ('Zee', 1677260922, 'female', 'Region_5');
```

```sql
SELECT * FROM users;
```

```text
user_id            registertime gender regionid
Thomas A. Anderson 1677260724   male   Region_4
Thomas A. Anderson 1677260724   male   Region_4
Trinity            1677260733   female Region_4
Trinity            1677260733   female Region_4
Morpheus           1677260742   male   Region_8
Morpheus           1677260742   male   Region_8
Dozer              1677260823   male   Region_1
Agent Smith        1677260955   male   Region_0
Persephone         1677260901   female Region_2
Niobe              1677260921   female Region_3
Niobe              1677260921   female Region_3
Niobe              1677260921   female Region_3
Zee                1677260922   female Region_5
```

```sql
CREATE TABLE `<your-environment>`.`<your-kafka-cluster>`.`users_deduplicate` (
       PRIMARY KEY (`user_id`) NOT ENFORCED
) DISTRIBUTED BY HASH(
       `user_id`
) WITH (
       'changelog.mode' = 'upsert',
       'value.format'='avro-registry',
       'key.format'='avro-registry'
) AS SELECT `user_id`, `registertime`, `gender`, `regionid` FROM (
       SELECT *,
              ROW_NUMBER() OVER (PARTITION BY `user_id` ORDER BY registertime ASC) AS row_num
       FROM `<your-environment>`.`<your-kafka-cluster>`.`users`) WHERE row_num = 1;
```
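The statement above keeps, for each `user_id`, the row that `ROW_NUMBER()` ranks first when ordered by `registertime` ascending. The following plain-Java sketch shows the same selection over an in-memory list. It is an illustration under stated assumptions: `DeduplicateSketch` and `User` are hypothetical names, the input is a finite batch rather than a stream, and on equal `registertime` values the sketch keeps the row it saw first.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical batch mirror of "ROW_NUMBER() ... ORDER BY registertime ASC) = 1". */
final class DeduplicateSketch {

    /** Minimal stand-in for a row of the users table. */
    static final class User {
        final String userId;
        final long registertime;
        final String gender;
        final String regionid;

        User(String userId, long registertime, String gender, String regionid) {
            this.userId = userId;
            this.registertime = registertime;
            this.gender = gender;
            this.regionid = regionid;
        }
    }

    /** Keeps one row per userId: the one with the smallest registertime. */
    static List<User> deduplicate(List<User> rows) {
        // LinkedHashMap keeps keys in first-seen order, so output order is stable.
        Map<String, User> firstPerKey = new LinkedHashMap<>();
        for (User row : rows) {
            User kept = firstPerKey.get(row.userId);
            // Strict "<" means that on a tie the earlier-seen row stays.
            if (kept == null || row.registertime < kept.registertime) {
                firstPerKey.put(row.userId, row);
            }
        }
        return new ArrayList<>(firstPerKey.values());
    }

    public static void main(String[] args) {
        List<User> input = List.of(
                new User("Trinity", 1677260733L, "female", "Region_4"),
                new User("Trinity", 1677260733L, "female", "Region_4"),
                new User("Dozer", 1677260823L, "male", "Region_1"));
        for (User user : deduplicate(input)) {
            System.out.println(user.userId + " " + user.registertime);
        }
    }
}
```

In the streaming statement the result table uses upsert changelog mode, so a later row for the same key can replace an earlier result; the batch sketch only reflects the final state.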

```sql
SELECT * FROM users_deduplicate;
```

```text
user_id            registertime gender regionid
Thomas A. Anderson 1677260724   male   Region_4
Trinity            1677260733   female Region_4
Morpheus           1677260742   male   Region_8
Dozer              1677260823   male   Region_1
Agent Smith        1677260955   male   Region_0
Persephone         1677260901   female Region_2
Niobe              1677260921   female Region_3
Zee                1677260922   female Region_5
```

---

### Log Debug Messages in a User Defined Function with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/enable-udf-logging.html

When you create a user-defined function (UDF) with Confluent Cloud for Apache Flink®, you can log events to help with monitoring and debugging. Your log messages appear in the Confluent Cloud Console’s statement log page. For more information on creating UDFs, see Create a User Defined Function.

Limitations

UDF logging has these limitations:

- Log4j logging only: External UDF loggers can be composed only with the Apache Log4j logging framework.
- Burst rate limited to 1000 events per second: UDF logging supports up to 1000 log events per second for each UDF during a short burst of high activity. This helps to optimize performance and to reduce noise in logs. Events that exceed the maximum rate are dropped.

Implement logging code

In your UDF project, import the org.apache.logging.log4j.LogManager and org.apache.logging.log4j.Logger classes. Get the Logger instance by calling the LogManager.getLogger() method.

`package your.package.namespace; import org.apache.flink.table.functions.ScalarFunction; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import java.util.Date; /* This class is a SumScalar function that logs messages at different levels */ public class LogSumScalarFunction extends ScalarFunction { private static final Logger LOGGER = LogManager.getLogger(); public int eval(int a, int b) { String value = String.format("SumScalar of %d and %d", a, b); Date now = new java.util.Date(); /* You can choose the logging level for log messages. */ LOGGER.info(value + " info log messages by log4j logger --- " + now); LOGGER.error(value + " error log messages by log4j logger --- " + now); LOGGER.warn(value + " warn log messages by log4j logger --- " + now); LOGGER.debug(value + " debug log messages by log4j logger --- " + now); return a + b; } }`

The following log levels are supported:

- OFF
- FATAL
- ERROR
- WARN
- INFO
- DEBUG
- TRACE
- ALL

View logged events

After the instrumented UDF statements run, you can view logged events in the Confluent Cloud Console’s event logging page.

Related content

- User-defined Functions
- Create a User-defined Function
- confluent flink artifact create
- CREATE FUNCTION Statement

#### Code Examples

```java
package your.package.namespace;

import org.apache.flink.table.functions.ScalarFunction;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import java.util.Date;

/* This class is a SumScalar function that logs messages at different levels */
public class LogSumScalarFunction extends ScalarFunction {

   private static final Logger LOGGER = LogManager.getLogger();

   public int eval(int a, int b) {
      String value = String.format("SumScalar of %d and %d", a, b);
      Date now = new Date();

      // You can choose the logging level for log messages.
      LOGGER.info(value + " info log messages by log4j logger --- " + now);
      LOGGER.error(value + " error log messages by log4j logger --- " + now);
      LOGGER.warn(value + " warn log messages by log4j logger --- " + now);
      LOGGER.debug(value + " debug log messages by log4j logger --- " + now);
      return a + b;
   }
}
```
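The function above writes four log events per `eval` call, so at roughly 250 calls per second it would already reach the documented burst limit of 1000 events per second, after which events are dropped. The sketch below illustrates what "dropped above a per-second limit" can look like with a simple fixed one-second window. It is purely hypothetical: `BurstLimiterSketch`, its method names, and the fixed-window strategy are assumptions made for illustration, not a description of how Confluent Cloud enforces the limit.

```java
/** Hypothetical fixed-window limiter that drops events beyond a per-second budget. */
final class BurstLimiterSketch {

    private final int maxEventsPerSecond;
    private long windowStartMillis = -1; // -1 means no window has started yet
    private int eventsInWindow;

    BurstLimiterSketch(int maxEventsPerSecond) {
        this.maxEventsPerSecond = maxEventsPerSecond;
    }

    /**
     * Returns true if an event at the given time fits in the current window,
     * false if it would exceed the budget and is therefore dropped.
     */
    synchronized boolean tryAcquire(long nowMillis) {
        // Start a fresh window on the first call or once a full second has passed.
        if (windowStartMillis < 0 || nowMillis - windowStartMillis >= 1000) {
            windowStartMillis = nowMillis;
            eventsInWindow = 0;
        }
        if (eventsInWindow < maxEventsPerSecond) {
            eventsInWindow++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        BurstLimiterSketch limiter = new BurstLimiterSketch(1000);
        int accepted = 0;
        int dropped = 0;
        // Simulate 1500 log events that all arrive within the same millisecond.
        for (int i = 0; i < 1500; i++) {
            if (limiter.tryAcquire(0L)) {
                accepted++;
            } else {
                dropped++;
            }
        }
        System.out.println("accepted=" + accepted + " dropped=" + dropped);
    }
}
```

A practical consequence for UDF authors is to log selectively, for example only on error paths or at a sampled rate, rather than on every invocation.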

---

### Mask Fields in a Table with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/mask-fields.html

Confluent Cloud for Apache Flink® lets you generate, with only a few clicks, a topic that contains masked fields from an input topic. In this guide, you create a Flink table and apply the Mask Fields action to generate a topic that has user names masked out, by using a preconfigured regular expression. The Mask Fields action creates a Flink SQL statement for you, but no knowledge of Flink SQL is required to use it.

This guide shows the following steps:

- Step 1: Inspect the example stream
- Step 2: Create a source table
- Step 3: Apply the Mask Fields action
- Step 4: Inspect the output table
- Step 5: Stop the persistent query

Prerequisites

- Access to Confluent Cloud.
- The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, contact your OrganizationAdmin or EnvironmentAdmin. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink.
- A provisioned Flink compute pool.

Step 1: Inspect the example stream

In this step, you query the read-only customers table in the examples.marketplace database to inspect the stream for fields that you can mask.

- Log in to Confluent Cloud and navigate to your Flink workspace.
- In the Use catalog dropdown, select your environment.
- In the Use database dropdown, select your Kafka cluster.
- Run the following statement to inspect the example customers stream: `SELECT * FROM examples.marketplace.customers;`

Your output should resemble:

| customer_id | name | address | postcode | city | email |
|---|---|---|---|---|---|
| 3134 | Dr. Andrew Terry | 45488 Eileen Walk | 78690 | Latoyiaberg | romaine.lynch@hotmail.com |
| 3243 | Miss Shelby Lueilwitz | 199 Bernardina Brook | 79991 | Johnburgh | dominick.oconner@hotmail.c… |
| 3027 | Korey Hand | 655 Murray Turnpike | 08917 | Port Sukshire | karlyn.ziemann@yahoo.com |
| ... | | | | | |

Step 2: Create a source table

In this step, you create a customers_source table for the data from the example customers stream. You use the INSERT INTO FROM SELECT statement to populate the table with streaming data.

Run the following statement to register the customers_source table. Confluent Cloud for Apache Flink automatically creates a backing Kafka topic that has the same name: `` CREATE TABLE customers_source ( customer_id INT NOT NULL, name STRING, address STRING, postcode STRING, city STRING, email STRING, PRIMARY KEY(`customer_id`) NOT ENFORCED ); ``

Run the following statement to populate the customers_source table with data from the example customers stream. This is a persistent query that streams data from the customers example stream to the customers_source table: `INSERT INTO customers_source( customer_id, name, address, postcode, city, email ) SELECT customer_id, name, address, postcode, city, email FROM examples.marketplace.customers;`

Run the following statement to inspect the customers_source table: `SELECT * FROM customers_source;`

Your output should resemble:

| customer_id | name | address | postcode | city | email |
|---|---|---|---|---|---|
| 3088 | Phil Grimes | 07738 Zieme Court | 84845 | Port Dillontown | garnett.abernathy@hotmail.com |
| 3022 | Jeana Gaylord | 021 Morgan Drives | 35160 | West Celena | emile.daniel@gmail.com |
| 3097 | Lily Ryan | 671 Logan Throughway | 58261 | Dickinsonburgh | ivory.lockman@gmail.com |
| ... | | | | | |

Step 3: Apply the Mask Fields action

In the previous step, you created a Flink table that had rows with customer names, which might be confidential data. In this step, you apply the Mask Fields action to create an output table that has the contents of the name field masked.

- Navigate to the Environments page, and in the navigation menu, click Data portal.
- In the Data portal page, click the dropdown menu and select the environment for your workspace.
- In the Recently created section, find your customers_source topic and click it to open the details pane.
- Click Actions, and in the Actions list, click Mask fields to open the Mask fields dialog.
- In the Field to mask dropdown, select name.
- In the Regex for name dropdown, select Word characters.
- In the Runtime configuration section, either select an existing service account or create a new service account for the current action. Note: The service account you select must have the EnvironmentAdmin role to create topics and schemas and to run Flink statements.
- Optionally, click the Show SQL toggle to view the statements that the action will run. The code resembles the following two statements. The first registers the output table: `` CREATE TABLE `<your-environment>`.`<your-kafka-cluster>`.`customers_source_mask` LIKE `<your-environment>`.`<your-kafka-cluster>`.`customers_source` `` The second populates it: `` INSERT INTO `<your-environment>`.`<your-kafka-cluster>`.`customers_source_mask` SELECT `customer_id`, REGEXP_REPLACE(`name`, '(\w)', '*') as `name`, address, postcode, city, email FROM `<your-environment>`.`<your-kafka-cluster>`.`customers_source`; ``
- Click Confirm. The action runs the CREATE TABLE and INSERT INTO statements. These statements register the customers_source_mask table and populate it with rows from the customers_source table. The strings in the name column are masked by the REGEXP_REPLACE function.

Step 4: Inspect the output table

The statements that were generated by the Mask Fields action created an output table named customers_source_mask. In this step, you query the output table to see the masked field values.

Return to your workspace and run the following statement to inspect the customers_source_mask output table: `SELECT * FROM customers_source_mask;`

Your output should resemble:

| customer_id | name | address | postcode | city | email |
|---|---|---|---|---|---|
| 3104 | `**** *** ******` | 342 Odis Hollow | 27615 | West Florentino | bryce.hodkiewicz@hotmail.c… |
| 3058 | `**** ******* ******` | 33569 Turner Glens | 14107 | Schummchester | sarah.roob@yahoo.com |
| 3138 | `**** ****** ********` | 944 Elden Walks | 39293 | New Ernestbury | velvet.volkman@gmail.com |
| ... | | | | | |
Step 5: Stop the persistent query¶ The INSERT INTO statement that was created by the Mask Fields action runs continuously until you stop it manually. Free resources in your compute pool by deleting the long-running statement. Navigate to the Flink page in your environment and click Flink statements. In the statements list, find the statement that has a status of Running. In the Actions column, click … and select Delete statement. In the Confirm statement deletion dialog, copy and paste the statement name and click Confirm. Related content¶ Flink action: Deduplicate Rows in a Table Flink action: Transform a Topic Flink action: Create an Embedding Aggregate a Stream in a Tumbling Window Compare Current and Previous Values in a Data Stream Convert the Serialization Format of a Topic Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
SELECT * FROM examples.marketplace.customers;
```

```text
customer_id name                  address                  postcode city              email
3134        Dr. Andrew Terry      45488 Eileen Walk        78690    Latoyiaberg       romaine.lynch@hotmail.com
3243        Miss Shelby Lueilwitz 199 Bernardina Brook     79991    Johnburgh         dominick.oconner@hotmail.c…
3027        Korey Hand            655 Murray Turnpike      08917    Port Sukshire     karlyn.ziemann@yahoo.com
...
```

```sql
-- Register a customers source table.
CREATE TABLE customers_source (
  customer_id INT NOT NULL,
  name STRING,
  address STRING,
  postcode STRING,
  city STRING,
  email STRING,
  PRIMARY KEY(`customer_id`) NOT ENFORCED
);
```

```sql
-- Persistent query to stream data from
-- the customers example stream to the
-- customers_source table.
INSERT INTO customers_source(
  customer_id,
  name,
  address,
  postcode,
  city,
  email
  )
SELECT customer_id, name, address, postcode, city, email FROM examples.marketplace.customers;
```

```sql
SELECT * FROM customers_source;
```

```text
customer_id name                 address                  postcode city                email
3088        Phil Grimes          07738 Zieme Court        84845    Port Dillontown     garnett.abernathy@hotmail.com
3022        Jeana Gaylord        021 Morgan Drives        35160    West Celena         emile.daniel@gmail.com
3097        Lily Ryan            671 Logan Throughway     58261    Dickinsonburgh      ivory.lockman@gmail.com
...
```

```sql
CREATE TABLE `<your-environment>`.`<your-kafka-cluster>`.`customers_source_mask`
  LIKE `<your-environment>`.`<your-kafka-cluster>`.`customers_source`;

INSERT INTO `<your-environment>`.`<your-kafka-cluster>`.`customers_source_mask` SELECT
  `customer_id`,
  REGEXP_REPLACE(`name`, '(\w)', '*') as `name`,
  address,
  postcode,
  city,
  email
FROM `<your-environment>`.`<your-kafka-cluster>`.`customers_source`;
```
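
The same `REGEXP_REPLACE` technique extends to other columns. The following ad hoc query is a minimal sketch, not something the Mask Fields action generates: the `email` and `postcode` patterns are illustrative assumptions, and the query only previews results without creating a table or a persistent statement.

```sql
-- Preview alternative masks on the source table (illustrative patterns only).
SELECT
  customer_id,
  -- Same idea as the "Word characters" option: each letter, digit, or
  -- underscore becomes an asterisk; spaces and punctuation are kept.
  REGEXP_REPLACE(name, '\w', '*') AS name,
  -- Assumed pattern: hide everything before the '@' sign.
  REGEXP_REPLACE(email, '^[^@]+', '****') AS email,
  -- Assumed pattern: replace every digit with '#'.
  REGEXP_REPLACE(postcode, '[0-9]', '#') AS postcode
FROM customers_source;
```

With the sample data, an address such as `romaine.lynch@hotmail.com` would be shown as `****@hotmail.com`, because only the part before the `@` sign matches the assumed pattern.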

```sql
SELECT * FROM customers_source_mask;
```

```text
customer_id name                 address                postcode city              email
3104        **** *** ******      342 Odis Hollow        27615    West Florentino   bryce.hodkiewicz@hotmail.c…
3058        **** ******* ******  33569 Turner Glens     14107    Schummchester     sarah.roob@yahoo.com
3138        **** ****** ******** 944 Elden Walks        39293    New Ernestbury    velvet.volkman@gmail.com
...
```
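
One way to check the masking is to count rows whose masked name still contains a word character. This is a minimal sketch that assumes the `customers_source_mask` table from this guide and the "Word characters" pattern; the column aliases are illustrative. `REGEXP` is the Flink SQL built-in function that returns TRUE when any substring of its first argument matches the pattern.

```sql
-- Compare the total row count with the number of rows in which a letter,
-- digit, or underscore survived masking.
SELECT
  COUNT(*) AS total_rows,
  -- CASE without ELSE yields NULL, and COUNT ignores NULL values.
  COUNT(CASE WHEN REGEXP(name, '\w') THEN 1 END) AS unmasked_rows
FROM customers_source_mask;
```

If the action masked every word character, `unmasked_rows` should stay at zero while `total_rows` keeps growing as new records arrive.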

---

### Handle Multiple Event Types In Tables in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/multiple-event-types.html

Confluent Cloud for Apache Flink® provides several ways to work with Kafka topics that contain multiple event types. This guide explains how Flink automatically infers and handles different event type patterns, so that you can query and process mixed event streams effectively.

#### Overview

When working with Kafka topics containing multiple event types, Flink automatically infers table schemas based on the Schema Registry configuration and schema format. The following sections describe the supported approaches in order of recommendation.

#### Using Schema References

Schema references provide the most robust way to handle multiple event types in a single topic. With this approach, you define a main schema that references other schemas, which allows for modular schema management and independent evolution of event types.

For example, consider a topic that combines purchase and pageview events.

Schema for purchase events:

- Avro: `{ "type":"record", "namespace": "io.confluent.developer.avro", "name":"Purchase", "fields": [ {"name": "item", "type":"string"}, {"name": "amount", "type": "double"}, {"name": "customer_id", "type": "string"} ] }`
- JSON Schema: `{ "$schema": "http://json-schema.org/draft-07/schema#", "title": "Purchase", "type": "object", "properties": { "item": { "type": "string" }, "amount": { "type": "number" }, "customer_id": { "type": "string" } }, "required": ["item", "amount", "customer_id"] }`
- Protobuf: `syntax = "proto3"; package io.confluent.developer.proto; message Purchase { string item = 1; double amount = 2; string customer_id = 3; }`

Schema for pageview events:

- Avro: `{ "type":"record", "namespace": "io.confluent.developer.avro", "name":"Pageview", "fields": [ {"name": "url", "type":"string"}, {"name": "is_special", "type": "boolean"}, {"name": "customer_id", "type": "string"} ] }`
- JSON Schema: `{ "$schema": "http://json-schema.org/draft-07/schema#", "title": "Pageview", "type": "object", "properties": { "url": { "type": "string" }, "is_special": { "type": "boolean" }, "customer_id": { "type": "string" } }, "required": ["url", "is_special", "customer_id"] }`
- Protobuf: `syntax = "proto3"; package io.confluent.developer.proto; message Pageview { string url = 1; bool is_special = 2; string customer_id = 3; }`

Combined schema that references both event types:

- Avro: `[ "io.confluent.developer.avro.Purchase", "io.confluent.developer.avro.Pageview" ]`
- JSON Schema: `{ "$schema": "http://json-schema.org/draft-07/schema#", "title": "CustomerEvent", "type": "object", "oneOf": [ { "$ref": "io.confluent.developer.json.Purchase" }, { "$ref": "io.confluent.developer.json.Pageview" } ] }`
- Protobuf: `syntax = "proto3"; package io.confluent.developer.proto; import "purchase.proto"; import "pageview.proto"; message CustomerEvent { oneof action { Purchase purchase = 1; Pageview pageview = 2; } }`

When these schemas are registered in Schema Registry and used with the default TopicNameStrategy, Flink automatically infers the table structure. You can see this structure by running ``SHOW CREATE TABLE `customer-events`;``

The output shows a table structure that includes columns for both event types:

- Avro: ``CREATE TABLE `customer-events` ( `key` VARBINARY(2147483647), `Purchase` ROW<`item` VARCHAR(2147483647), `amount` DOUBLE, `customer_id` VARCHAR(2147483647)>, `Pageview` ROW<`url` VARCHAR(2147483647), `is_special` BOOLEAN, `customer_id` VARCHAR(2147483647)> )``
- JSON Schema: ``CREATE TABLE `customer-events` ( `key` VARBINARY(2147483647), `connect_union_field_0` ROW<`amount` DOUBLE NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL, `item` VARCHAR(2147483647) NOT NULL>, `connect_union_field_1` ROW<`customer_id` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `url` VARCHAR(2147483647) NOT NULL> )``
- Protobuf: ``CREATE TABLE `customer-events` ( `key` VARBINARY(2147483647), `action` ROW<`purchase` ROW<`item` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL>, `pageview` ROW<`url` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL>> )``

You can query specific event types by using standard SQL. The exact syntax depends on your schema format:

- Avro, purchase events: ``SELECT Purchase.* FROM `customer-events` WHERE Purchase IS NOT NULL;``
- Avro, pageview events: ``SELECT Pageview.* FROM `customer-events` WHERE Pageview IS NOT NULL;``
- JSON Schema, purchase events: ``SELECT connect_union_field_0.* FROM `customer-events` WHERE connect_union_field_0 IS NOT NULL;``
- JSON Schema, pageview events: ``SELECT connect_union_field_1.* FROM `customer-events` WHERE connect_union_field_1 IS NOT NULL;``
- Protobuf, purchase events: ``SELECT action.purchase.* FROM `customer-events` WHERE action.purchase IS NOT NULL;``
- Protobuf, pageview events: ``SELECT action.pageview.* FROM `customer-events` WHERE action.pageview IS NOT NULL;``

#### Using Union Types

Flink automatically handles union types across different schema formats. With this approach, all event types are defined within a single schema by using the format’s native union type mechanism:

- Avro unions
- JSON Schema oneOf
- Protocol Buffer oneOf

For example, consider a schema that combines order and shipment events:

- Avro: `{ "type": "record", "namespace": "io.confluent.examples.avro", "name": "AllTypes", "fields": [ { "name": "event_type", "type": [ { "type": "record", "name": "Order", "fields": [ {"name": "order_id", "type": "string"}, {"name": "amount", "type": "double"} ] }, { "type": "record", "name": "Shipment", "fields": [ {"name": "tracking_id", "type": "string"}, {"name": "status", "type": "string"} ] } ] } ] }`
- JSON Schema: `{ "$schema": "http://json-schema.org/draft-07/schema#", "title": "AllTypes", "type": "object", "oneOf": [ { "type": "object", "title": "Order", "properties": { "order_id": { "type": "string" }, "amount": { "type": "number" } }, "required": ["order_id", "amount"] }, { "type": "object", "title": "Shipment", "properties": { "tracking_id": { "type": "string" }, "status": { "type": "string" } }, "required": ["tracking_id", "status"] } ] }`
- Protobuf: `syntax = "proto3"; package io.confluent.examples.proto; message Order { string order_id = 1; double amount = 2; } message Shipment { string tracking_id = 1; string status = 2; } message AllTypes { oneof event_type { Order order = 1; Shipment shipment = 2; } }`

When you use these union types with TopicNameStrategy, Flink automatically creates a table structure based on your schema format. You can see this structure by running ``SHOW CREATE TABLE `events`;``

The output shows a table structure that reflects how each format handles unions. Because ORDER is a reserved keyword in Flink SQL, the `Order` and `order` field names are quoted with backticks in the queries below.

Avro:

- Table: ``CREATE TABLE `events` ( `key` VARBINARY(2147483647), `event_type` ROW<`Order` ROW<`order_id` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL>, `Shipment` ROW<`tracking_id` VARCHAR(2147483647) NOT NULL, `status` VARCHAR(2147483647) NOT NULL>> NOT NULL )``
- Query orders: ``SELECT event_type.`Order`.* FROM `events` WHERE event_type.`Order` IS NOT NULL;``
- Query shipments: ``SELECT event_type.Shipment.* FROM `events` WHERE event_type.Shipment IS NOT NULL;``

JSON Schema:

- Table: ``CREATE TABLE `events` ( `key` VARBINARY(2147483647), `connect_union_field_0` ROW<`amount` DOUBLE NOT NULL, `order_id` VARCHAR(2147483647) NOT NULL>, `connect_union_field_1` ROW<`status` VARCHAR(2147483647) NOT NULL, `tracking_id` VARCHAR(2147483647) NOT NULL> )``
- Query orders: ``SELECT connect_union_field_0.* FROM `events` WHERE connect_union_field_0 IS NOT NULL;``
- Query shipments: ``SELECT connect_union_field_1.* FROM `events` WHERE connect_union_field_1 IS NOT NULL;``

Protobuf:

- Table: ``CREATE TABLE `events` ( `key` VARBINARY(2147483647), `AllTypes` ROW<`event_type` ROW<`order` ROW<`order_id` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL>, `shipment` ROW<`tracking_id` VARCHAR(2147483647) NOT NULL, `status` VARCHAR(2147483647) NOT NULL>>> )``
- Query orders: ``SELECT AllTypes.event_type.`order`.* FROM `events` WHERE AllTypes.event_type.`order` IS NOT NULL;``
- Query shipments: ``SELECT AllTypes.event_type.shipment.* FROM `events` WHERE AllTypes.event_type.shipment IS NOT NULL;``

#### Using RecordNameStrategy or TopicRecordNameStrategy

For topics that use RecordNameStrategy or TopicRecordNameStrategy, Flink initially infers a raw binary table:

``CREATE TABLE `events` ( `key` VARBINARY(2147483647), `value` VARBINARY(2147483647) )``

To work with these events, you must manually configure the table with the appropriate subject names:

`ALTER TABLE events SET ( 'value.format' = 'avro-registry', 'value.avro-registry.subject-names' = 'com.example.events.OrderEvent;com.example.events.ShipmentEvent' );`

If your topic uses keyed messages, you may also need to configure the key format:

`ALTER TABLE events SET ( 'key.format' = 'avro-registry', 'key.avro-registry.subject-names' = 'com.example.events.OrderKey' );`

Replace `avro-registry` with `json-registry` or `proto-registry` based on your schema format.

#### Best Practices

- Use schema references with TopicNameStrategy when possible, because this provides the best balance of flexibility and manageability.
- If schema references aren’t suitable, use union types for a simpler schema management approach.
- Configure alternative subject name strategies only when working with existing systems that require them.

#### Related Content

- Schema References
- Flink SQL Data Type Mappings
- Subject Name Strategy

#### Code Examples

```json
{
   "type":"record",
   "namespace": "io.confluent.developer.avro",
   "name":"Purchase",
   "fields": [
      {"name": "item", "type":"string"},
      {"name": "amount", "type": "double"},
      {"name": "customer_id", "type": "string"}
   ]
}
```

```json
{
   "$schema": "http://json-schema.org/draft-07/schema#",
   "title": "Purchase",
   "type": "object",
   "properties": {
      "item": {
         "type": "string"
      },
      "amount": {
         "type": "number"
      },
      "customer_id": {
         "type": "string"
      }
   },
   "required": ["item", "amount", "customer_id"]
}
```

```protobuf
syntax = "proto3";

package io.confluent.developer.proto;

message Purchase {
   string item = 1;
   double amount = 2;
   string customer_id = 3;
}
```

```json
{
   "type":"record",
   "namespace": "io.confluent.developer.avro",
   "name":"Pageview",
   "fields": [
      {"name": "url", "type":"string"},
      {"name": "is_special", "type": "boolean"},
      {"name": "customer_id", "type":  "string"}
   ]
}
```

```json
{
   "$schema": "http://json-schema.org/draft-07/schema#",
   "title": "Pageview",
   "type": "object",
   "properties": {
      "url": {
         "type": "string"
      },
      "is_special": {
         "type": "boolean"
      },
      "customer_id": {
         "type": "string"
      }
   },
   "required": ["url", "is_special", "customer_id"]
}
```

```protobuf
syntax = "proto3";

package io.confluent.developer.proto;

message Pageview {
   string url = 1;
   bool is_special = 2;
   string customer_id = 3;
}
```

```json
[
   "io.confluent.developer.avro.Purchase",
   "io.confluent.developer.avro.Pageview"
]
```

```json
{
   "$schema": "http://json-schema.org/draft-07/schema#",
   "title": "CustomerEvent",
   "type": "object",
   "oneOf": [
      { "$ref": "io.confluent.developer.json.Purchase" },
      { "$ref": "io.confluent.developer.json.Pageview" }
   ]
}
```

```protobuf
syntax = "proto3";

package io.confluent.developer.proto;

import "purchase.proto";
import "pageview.proto";

message CustomerEvent {
   oneof action {
      Purchase purchase = 1;
      Pageview pageview = 2;
   }
}
```

```sql
SHOW CREATE TABLE `customer-events`;
```

```sql
CREATE TABLE `customer-events` (
  `key` VARBINARY(2147483647),
  `Purchase` ROW<`item` VARCHAR(2147483647), `amount` DOUBLE, `customer_id` VARCHAR(2147483647)>,
  `Pageview` ROW<`url` VARCHAR(2147483647), `is_special` BOOLEAN, `customer_id` VARCHAR(2147483647)>
)
```

```sql
CREATE TABLE `customer-events` (
  `key` VARBINARY(2147483647),
  `connect_union_field_0` ROW<`amount` DOUBLE NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL, `item` VARCHAR(2147483647) NOT NULL>,
  `connect_union_field_1` ROW<`customer_id` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `url` VARCHAR(2147483647) NOT NULL>
)
```

```sql
CREATE TABLE `customer-events` (
  `key` VARBINARY(2147483647),
  `action` ROW<
    `purchase` ROW<`item` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL>,
    `pageview` ROW<`url` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL>
  >
)
```

```sql
-- Query purchase events
SELECT Purchase.* FROM `customer-events` WHERE Purchase IS NOT NULL;

-- Query pageview events
SELECT Pageview.* FROM `customer-events` WHERE Pageview IS NOT NULL;
```
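
Because both event types carry a `customer_id`, the two nested columns can also be read in a single pass. The following is a minimal sketch for the Avro table layout of `customer-events`. The `event_kind`, `purchase_amount`, and `pageview_url` aliases are illustrative names, and the query assumes that exactly one of the two ROW columns is populated per record. The JSON Schema and Protobuf layouts would need their own column paths.

```sql
-- One row per event, tagged with its kind (Avro layout of `customer-events`).
SELECT
  CASE
    WHEN `Purchase` IS NOT NULL THEN 'purchase'
    WHEN `Pageview` IS NOT NULL THEN 'pageview'
  END AS event_kind,
  -- Take customer_id from whichever event type is present.
  COALESCE(`Purchase`.`customer_id`, `Pageview`.`customer_id`) AS customer_id,
  `Purchase`.`amount` AS purchase_amount,  -- NULL for pageview events
  `Pageview`.`url` AS pageview_url         -- NULL for purchase events
FROM `customer-events`;
```

This form can be convenient when downstream logic needs a per-customer view across both event types, at the cost of wider rows with NULL columns.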

```sql
-- Query purchase events
SELECT connect_union_field_0.* FROM `customer-events` WHERE connect_union_field_0 IS NOT NULL;

-- Query pageview events
SELECT connect_union_field_1.* FROM `customer-events` WHERE connect_union_field_1 IS NOT NULL;
```

```sql
-- Query purchase events
SELECT action.purchase.* FROM `customer-events` WHERE action.purchase IS NOT NULL;

-- Query pageview events
SELECT action.pageview.* FROM `customer-events` WHERE action.pageview IS NOT NULL;
```

```json
{
   "type": "record",
   "namespace": "io.confluent.examples.avro",
   "name": "AllTypes",
   "fields": [
      {
         "name": "event_type",
         "type": [
            {
               "type": "record",
               "name": "Order",
               "fields": [
                  {"name": "order_id", "type": "string"},
                  {"name": "amount", "type": "double"}
               ]
            },
            {
               "type": "record",
               "name": "Shipment",
               "fields": [
                  {"name": "tracking_id", "type": "string"},
                  {"name": "status", "type": "string"}
               ]
            }
         ]
      }
   ]
}
```

```json
{
   "$schema": "http://json-schema.org/draft-07/schema#",
   "title": "AllTypes",
   "type": "object",
   "oneOf": [
      {
         "type": "object",
         "title": "Order",
         "properties": {
            "order_id": { "type": "string" },
            "amount": { "type": "number" }
         },
         "required": ["order_id", "amount"]
      },
      {
         "type": "object",
         "title": "Shipment",
         "properties": {
            "tracking_id": { "type": "string" },
            "status": { "type": "string" }
         },
         "required": ["tracking_id", "status"]
      }
   ]
}
```

```protobuf
syntax = "proto3";

package io.confluent.examples.proto;

message Order {
   string order_id = 1;
   double amount = 2;
}

message Shipment {
   string tracking_id = 1;
   string status = 2;
}

message AllTypes {
   oneof event_type {
      Order order = 1;
      Shipment shipment = 2;
   }
}
```

```sql
SHOW CREATE TABLE `events`;
```

```sql
CREATE TABLE `events` (
  `key` VARBINARY(2147483647),
  `event_type` ROW<
    `Order` ROW<`order_id` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL>,
    `Shipment` ROW<`tracking_id` VARCHAR(2147483647) NOT NULL, `status` VARCHAR(2147483647) NOT NULL>
  > NOT NULL
)
```

```sql
-- Query orders (ORDER is a reserved keyword, so the field name is quoted)
SELECT event_type.`Order`.* FROM `events` WHERE event_type.`Order` IS NOT NULL;

-- Query shipments
SELECT event_type.Shipment.* FROM `events` WHERE event_type.Shipment IS NOT NULL;
```
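
A common follow-up is to route one branch of the union into its own table. The following is a minimal sketch for the Avro layout of `events`. The `orders_only` table name is hypothetical, and its column types are assumptions derived from the inferred `Order` row.

```sql
-- Hypothetical target table; name and column types are assumptions
-- based on the inferred Avro layout of `events`.
CREATE TABLE orders_only (
  order_id STRING,
  amount DOUBLE
);

-- Persistent statement: copy only the Order branch of the union.
-- `Order` is quoted because ORDER is a reserved keyword.
INSERT INTO orders_only
SELECT
  event_type.`Order`.order_id,
  event_type.`Order`.amount
FROM `events`
WHERE event_type.`Order` IS NOT NULL;
```

Like other INSERT INTO statements, this one keeps running until it is stopped, so it continues to use compute pool resources while it is active.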

```sql
CREATE TABLE `events` (
  `key` VARBINARY(2147483647),
  `connect_union_field_0` ROW<`amount` DOUBLE NOT NULL, `order_id` VARCHAR(2147483647) NOT NULL>,
  `connect_union_field_1` ROW<`status` VARCHAR(2147483647) NOT NULL, `tracking_id` VARCHAR(2147483647) NOT NULL>
)
```

```sql
-- Query orders
SELECT connect_union_field_0.* FROM `events` WHERE connect_union_field_0 IS NOT NULL;

-- Query shipments
SELECT connect_union_field_1.* FROM `events` WHERE connect_union_field_1 IS NOT NULL;
```

```sql
CREATE TABLE `events` (
  `key` VARBINARY(2147483647),
  `AllTypes` ROW<
    `event_type` ROW<
      `order` ROW<`order_id` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL>,
      `shipment` ROW<`tracking_id` VARCHAR(2147483647) NOT NULL, `status` VARCHAR(2147483647) NOT NULL>
    >
  >
)
```

```sql
-- Query orders (ORDER is a reserved keyword, so the field name is quoted)
SELECT AllTypes.event_type.`order`.* FROM `events` WHERE AllTypes.event_type.`order` IS NOT NULL;

-- Query shipments
SELECT AllTypes.event_type.shipment.* FROM `events` WHERE AllTypes.event_type.shipment IS NOT NULL;
```

```sql
CREATE TABLE `events` (
  `key` VARBINARY(2147483647),
  `value` VARBINARY(2147483647)
)
```

```sql
ALTER TABLE events SET (
  'value.format' = 'avro-registry',
  'value.avro-registry.subject-names' = 'com.example.events.OrderEvent;com.example.events.ShipmentEvent'
);
```

```sql
ALTER TABLE events SET (
  'key.format' = 'avro-registry',
  'key.avro-registry.subject-names' = 'com.example.events.OrderKey'
);
```
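
For a JSON Schema topic, the same configuration would use the `json-registry` format. This sketch assumes that the option key follows the same `value.<format>.subject-names` pattern as the Avro example, and it reuses the placeholder subject names from this guide.

```sql
-- Same configuration as the Avro example, with the format swapped.
-- The subject names are placeholders; use the subjects registered for your topic.
ALTER TABLE events SET (
  'value.format' = 'json-registry',
  'value.json-registry.subject-names' = 'com.example.events.OrderEvent;com.example.events.ShipmentEvent'
);

-- Check that the value is no longer exposed as a single raw VARBINARY column.
SHOW CREATE TABLE `events`;
```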

---

### How-to Guides for Developing Flink Applications on Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/overview.html

Discover how Confluent Cloud for Apache Flink® can help you accomplish common processing tasks such as joins and aggregations. This section provides step-by-step guidance on how to use Flink to process your data efficiently and effectively.

- Aggregate a Stream in a Tumbling Window
- Combine Streams and Track Most Recent Records
- Compare Current and Previous Values in a Data Stream
- Convert the Serialization Format of a Topic
- Create a User Defined Function
- Handle Multiple Event Types
- Process Schemaless Events
- Resolve Statement Issues
- Run a Snapshot Query
- Scan and Summarize Tables
- View Time Series Data

#### Flink actions

Confluent Cloud for Apache Flink provides Flink Actions that enable you to perform specific data-processing tasks on topics with minimal configuration. These actions simplify common workloads by providing a user-friendly interface to configure and execute them.

- Create an Embedding: Convert data in a topic’s column into a vector embedding for AI model inference.
- Deduplicate Rows in a Table: Remove duplicate records from a topic based on specified fields, ensuring that only unique records are retained in the output topic.
- Mask Fields in a Table: Mask sensitive data in specified fields of a topic by replacing the original data with a static value.
- Transform a Topic: Change a topic’s properties by applying custom Flink SQL transformations.

#### Related content

- Video: How to Set Idle Timeouts
- Video: How to Analyze Data from a REST API with Flink SQL
- Video: How To Use Streaming Joins with Apache Flink
- Video: How to Visualize Real-Time Data from Apache Kafka using Apache Flink SQL and Streamlit
- Use Flink SQL with Kafka, Streamlit, and the Alpaca API

---

### Process schemaless events with Flink SQL in Confluent Cloud | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/process-schemaless-events.html

This guide explains how to use Confluent Cloud for Apache Flink to process events in Apache Kafka® topics that don’t use serializers compatible with Schema Registry, while still leveraging Schema Registry for data processing with Flink SQL.

#### Overview

When working with Kafka topics that contain events that aren’t serialized with Schema Registry-compatible serializers, you can still use Flink SQL to process your data. This approach enables you to handle “schemaless” events by defining a schema separately in Schema Registry.

#### Prerequisites

- Access to Confluent Cloud
- A Kafka topic containing events you want to process
- Appropriate permissions to access Schema Registry in Confluent Cloud

#### Step 1: Submit your schema to Schema Registry

1. Log in to the Confluent Cloud Console.
2. Navigate to the Topics Overview page.
3. Locate your topic and click it to open the topic details page.
4. Click Set a schema.
5. Submit your schema in Avro, Protobuf, or JSON format.

Note: With JSON, you can define a partial schema, which means that not all fields that can exist in the payload need to be defined in the schema at first. Flink ignores fields that aren’t defined. Also, the order of these fields doesn’t matter for JSON. This differs from Avro and Protobuf, where you must define all fields in the right order. If some fields don’t appear in every event, you can mark these fields as optional.

The following example schemas show how sensor data might be represented in full JSON, partial JSON, Avro, and Protobuf formats:

- Full JSON: `{ "$schema": "http://json-schema.org/draft-07/schema#", "additionalProperties": false, "properties": { "humidity": { "description": "The humidity reading as a percentage", "type": "number" }, "id": { "description": "The unique identifier for the event", "type": "string" }, "temperature": { "description": "The temperature reading in Celsius", "type": "number" }, "timestamp": { "description": "The timestamp of the event in milliseconds since the epoch", "type": "integer" } }, "required": [ "id" ], "title": "DynamicEvent", "type": "object" }`
- Partial JSON: `{ "$schema": "http://json-schema.org/draft-07/schema#", "additionalProperties": false, "properties": { "id": { "description": "The unique identifier for the event", "type": "string" } }, "required": [ "id" ], "title": "DynamicEvent", "type": "object" }`
- Avro: `{ "fields": [ { "name": "id", "type": "string" }, { "default": null, "name": "timestamp", "type": [ "null", "long" ] }, { "default": null, "name": "temperature", "type": [ "null", "float" ] }, { "default": null, "name": "humidity", "type": [ "null", "float" ] } ], "name": "DynamicEvent", "type": "record" }`
- Protobuf: `syntax = "proto3"; package example; message DynamicEvent { string id = 1; optional int64 timestamp = 2; optional float temperature = 3; optional float humidity = 4; }`

#### Step 2: Query your table

After you submit the schema, you can start querying your topic immediately by using Flink SQL. The defined schema is used to interpret the data, even if the events themselves don’t contain schema information. Flink first tries to deserialize the data as if it was serialized with Schema Registry serializers. If that doesn’t apply, it treats the incoming bytes as Avro, Protobuf, or JSON.

#### Important considerations

- When possible, always use the Schema Registry serializers to gain the benefits of properly governing your data streams.
- This method works even if your events don’t include schema version information in their byte stream.
- You can submit a partial schema only for JSON. Flink processes the defined fields and ignores the rest.
- With this approach, automatic schema evolution within the stream is not supported. If you want to evolve the schema, you must evolve it manually and consider the impact as described in Schema Evolution and Compatibility for Schema Registry on Confluent Cloud.

#### Related content

- Handle Multiple Event Types

#### Code Examples

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "additionalProperties": false,
  "properties": {
    "humidity": {
      "description": "The humidity reading as a percentage",
      "type": "number"
    },
    "id": {
      "description": "The unique identifier for the event",
      "type": "string"
    },
    "temperature": {
      "description": "The temperature reading in Celsius",
      "type": "number"
    },
    "timestamp": {
      "description": "The timestamp of the event in milliseconds since the epoch",
      "type": "integer"
    }
  },
  "required": [
    "id"
  ],
  "title": "DynamicEvent",
  "type": "object"
}
```

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "additionalProperties": false,
  "properties": {
    "id": {
      "description": "The unique identifier for the event",
      "type": "string"
    }
  },
  "required": [
    "id"
  ],
  "title": "DynamicEvent",
  "type": "object"
}
```

```json
{
  "fields": [
    {
      "name": "id",
      "type": "string"
    },
    {
      "default": null,
      "name": "timestamp",
      "type": [
        "null",
        "long"
      ]
    },
    {
      "default": null,
      "name": "temperature",
      "type": [
        "null",
        "float"
      ]
    },
    {
      "default": null,
      "name": "humidity",
      "type": [
        "null",
        "float"
      ]
    }
  ],
  "name": "DynamicEvent",
  "type": "record"
}
```

```protobuf
syntax = "proto3";
package example;

message DynamicEvent {
  string id = 1;
  optional int64 timestamp = 2;
  optional float temperature = 3;
  optional float humidity = 4;
}
```
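
After the schema is set, the topic can be queried like any other table. The following is a minimal sketch that assumes a topic named `sensor-events` (a hypothetical name) registered with one of the full schemas listed here, so that all four fields are available. `TO_TIMESTAMP_LTZ` converts the epoch-millisecond `timestamp` field into a timestamp value.

```sql
-- `sensor-events` is a placeholder; use your own topic name.
-- `timestamp` is quoted because TIMESTAMP is a reserved keyword.
SELECT
  id,
  TO_TIMESTAMP_LTZ(`timestamp`, 3) AS event_time,  -- 3 = millisecond precision
  temperature,
  humidity
FROM `sensor-events`
-- Optional fields can be absent, so skip events without a temperature reading.
WHERE temperature IS NOT NULL;
```

With the partial JSON schema, only `id` is defined, so a query like this one would have to be limited to that column.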

---

### Profile a Query with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/profile-query.html

Confluent Cloud for Apache Flink® enables you to profile the performance of your queries. The Query Profiler provides enhanced visibility into how a Flink statement is processing data, which enables rapid identification of bottlenecks, data skew issues, and other performance issues. The profiler updates metrics in near real-time, so that you can monitor query performance as data flows through your pipeline. For more information about the Query Profiler, see Flink SQL Query Profiler.

#### Prerequisites

- Access to Confluent Cloud.
- The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, contact your OrganizationAdmin or EnvironmentAdmin. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink.
- A provisioned Flink compute pool.

#### Step 1: Analyze and run a statement

In this step, you use the EXPLAIN statement to perform a static analysis on a query and then start the query. The query is a temporal join between the `orders` and `customers` tables.

1. Log in to Confluent Cloud and navigate to your Flink workspace.
2. Run the following EXPLAIN statement to view a static analysis of a query.

   ``EXPLAIN SELECT o.order_id, o.`$rowtime`, c.customer_id, c.name, o.price FROM examples.marketplace.orders o JOIN examples.marketplace.customers FOR SYSTEM_TIME AS OF o.`$rowtime` c ON o.customer_id = c.customer_id WHERE o.`$rowtime` >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR;``

   Your output should resemble the following physical plan, shown here as a nested list in which each operator's inputs are indented beneath it:

   - StreamSink [12]
     - StreamCalc [11]
       - StreamTemporalJoin [10]
         - StreamExchange [3]
           - StreamCalc [2]
             - StreamTableSourceScan [1]
         - StreamExchange [9]
           - StreamCalc [8]
             - StreamChangelogNormalize [7]
               - StreamExchange [6]
                 - StreamCalc [5]
                   - StreamTableSourceScan [4]
   - ...

3. Run the statement.

   ``SELECT o.order_id, o.`$rowtime`, c.customer_id, c.name, o.price FROM examples.marketplace.orders o JOIN examples.marketplace.customers FOR SYSTEM_TIME AS OF o.`$rowtime` c ON o.customer_id = c.customer_id WHERE o.`$rowtime` >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR;``

#### Step 2: Profile the query

In this step, you use the Query Profiler to monitor the performance of the query. The Query Profiler helps identify performance bottlenecks by showing where records are flowing slowly or backing up in the pipeline.

1. Navigate to your environment’s overview page. In the navigation menu, click Flink, and in the overview page, click Flink statements.
2. In the statement list, click your statement to open the statement details page.
3. Click Query profiler to view the profiler graph.

The Query Profiler opens and shows a graph of the Flink tasks that are running. The graph shows the physical execution plan of your query, with each operator represented as a node. The nodes are connected by arrows that show the flow of data between operators.

For each operator node, you can see:

- The operator name and ID
- Metrics like Idleness and Backpressure
- Resource utilization like CPU and memory usage

Key operators in the current temporal join query include:

- StreamTableSourceScan nodes [1] and [4], which read from the `orders` and `customers` tables
- StreamCalc nodes [2], [5], [8], and [11], which perform filtering and projection
- StreamExchange nodes [3], [6], and [9], which handle data redistribution between tasks
- StreamChangelogNormalize [7], which processes changelog records from the versioned `customers` table
- StreamTemporalJoin [10], which joins the orders with customer versions based on event time
- StreamSink [12], which writes results to the output

4. Click the title bar of the TemporalJoin node to open the operator details pane.
5. Click Operator to view details about the operators in the task.
6. Expand State Size to view the amount of data currently stored by the task.
7. In the graph, click other operator nodes to see metrics about their performance.
Related content¶ Flink SQL Query Profiler EXPLAIN Statement Flink SQL Statements Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
EXPLAIN
SELECT
  o.order_id,
  o.`$rowtime`,
  c.customer_id,
  c.name,
  o.price
FROM examples.marketplace.orders o
JOIN examples.marketplace.customers FOR SYSTEM_TIME AS OF o.`$rowtime` c
ON o.customer_id = c.customer_id
WHERE o.`$rowtime` >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR;
```

```text
== Physical Plan ==

StreamSink [12]
  +- StreamCalc [11]
    +- StreamTemporalJoin [10]
      +- StreamExchange [3]
      :  +- StreamCalc [2]
      :    +- StreamTableSourceScan [1]
      +- StreamExchange [9]
        +- StreamCalc [8]
          +- StreamChangelogNormalize [7]
            +- StreamExchange [6]
              +- StreamCalc [5]
                +- StreamTableSourceScan [4]
...
```

```sql
SELECT
  o.order_id,
  o.`$rowtime`,
  c.customer_id,
  c.name,
  o.price
FROM examples.marketplace.orders o
JOIN examples.marketplace.customers FOR SYSTEM_TIME AS OF o.`$rowtime` c
ON o.customer_id = c.customer_id
WHERE o.`$rowtime` >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR;
```

---

### Resolve Statement Issues in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/resolve-common-query-problems.html

Resolve Statement Issues in Confluent Cloud for Apache Flink¶ Inefficient Flink SQL queries in Confluent Cloud for Apache Flink® can cause performance issues that impact your data processing pipeline. These inefficiencies can be identified early through warnings when you submit your query, or they can become apparent later when your statement enters a DEGRADED state during execution. This page explains how to identify and resolve query inefficiencies and how to troubleshoot statement performance problems. Statement enters DEGRADED state¶ When a Flink statement is unable to make consistent progress, it may enter a DEGRADED state. This typically occurs due to performance bottlenecks or resource constraints. You may see the following error message: Your Flink statement has entered a Degraded state because it is unable to make consistent progress. This can be caused by inefficient query logic or insufficient compute resources. Please review your statement for performance bottlenecks. If the issue persists, consider scaling your compute pool or contacting Confluent support for assistance. To resolve a DEGRADED state, follow these steps: (1) Check for Statement Advisor warnings: Review and resolve any warnings that were returned during query submission. If you’re unsure whether warnings were shown, run your query with the EXPLAIN statement to see if warnings are generated. (2) Profile your query: Use the Query Profiler to identify performance bottlenecks and data flow issues in your statement. (3) Review compute resources: Check if your compute pool has reached its maximum CFU limit. If so, consider increasing the maximum CFU limit for your compute pool, moving the statement to a dedicated compute pool with more CFU capacity, or optimizing your query to reduce resource consumption. (4) Optimize query logic: Based on the warnings and profiling results, implement the specific optimizations described in the following warning sections. 
Primary key differs from derived upsert key¶ [Warning] The primary key "<pk_column>" does not match the upsert key "<upsert_key_column>" that is derived from the query. If the primary key and upsert key don't match, the system needs to add a state-intensive operation for correction, which can result in a DEGRADED statement and higher CFU consumption. If possible, revisit the table declaration with the primary key or change your query. For more information, see https://cnfl.io/primary_vs_upsert_key. This warning occurs when you insert data into a table whose defined PRIMARY KEY doesn’t align with the key columns derived from the grouping or source of the INSERT INTO ... SELECT or CREATE TABLE ... AS SELECT query. When the keys don't match, Flink must introduce an expensive internal operator (UpsertMaterialize) to ensure correctness, which consumes more state and resources. The following example illustrates a query that triggers this warning: -- Create a table to store customer total orders CREATE TABLE customer_orders ( total_orders INT PRIMARY KEY NOT ENFORCED, -- Primary Key is total_orders customer_name STRING ); -- Insert aggregated order counts per customer INSERT INTO customer_orders SELECT SUM(order_count), customer_name -- Upsert key derived from GROUP BY is customer_name FROM ( VALUES ('Bob', 2), -- Bob placed 2 orders ('Alice', 1), -- Alice placed 1 order ('Bob', 2) -- Bob placed 2 more orders ) AS OrderData(customer_name, order_count) GROUP BY customer_name; To resolve this warning: Align Primary Key: Modify the PRIMARY KEY definition in your CREATE TABLE statement to match the columns used to uniquely identify rows in your INSERT query (often the GROUP BY columns). In the example above, changing the primary key to customer_name resolves the warning. Modify Query: Adjust your INSERT INTO ... SELECT query so the selected columns or grouping align with the existing primary key definition. 
This might involve changing the GROUP BY clause or the columns being selected. Check for warnings: If you’re unsure whether your query produces this warning, run it with the EXPLAIN statement to see if warnings are generated. High state operator without state TTL¶ [Warning] Your query includes one or more highly state-intensive operators but does not set a time-to-live (TTL) value, which means that the system potentially needs to store an infinite amount of state. This can result in a DEGRADED statement and higher CFU consumption. If possible, change your query to use a different operator, or set a time-to-live (TTL) value. For more information, see https://cnfl.io/high_state_intensive_operators. Certain SQL operations, like joins on unbounded streams or aggregations without windowing, require Flink to maintain internal state. If this state isn’t configured to expire (using a time-to-live, or TTL, setting), it can grow indefinitely, leading to excessive memory usage, performance degradation, and higher costs. The following example illustrates a query that triggers this warning: -- Joining two unbounded streams without TTL SELECT c.*, o.* FROM `examples`.`marketplace`.`clicks` c INNER JOIN `examples`.`marketplace`.`orders` o ON c.user_id = o.customer_id; To resolve this warning: Set State TTL: Configure a state time-to-live (TTL) for the table(s) involved in the stateful operation. This ensures that state older than the specified duration is automatically cleared. This can be done for the full statement by using the SET 'sql.state-ttl' option, or for individual tables by using State TTL Hints. Use Windowed Operations: If applicable, rewrite your query to use windowed operations, like windowed joins or windowed aggregations, instead of unbounded operations. Windows inherently limit the amount of state required. Refactor Query: Analyze whether the stateful operation is necessary or whether the query logic can be changed to avoid large state requirements. 
Check for warnings: If you’re unsure whether your query produces this warning, run it with the EXPLAIN statement to see if warnings are generated. Missing window_start or window_end in GROUP BY for window aggregation¶ [Warning] Your query contains only "window_end" in the GROUP BY clause, with no corresponding "window_start". This means that the query is considered a regular aggregation query and not a windowed aggregation, which can result in unexpected, continuously updating output and higher CFU consumption. if you want a windowed aggregation in your query, ensure that you include both "window_start" and "window_end" in the GROUP BY clause. For more information, see https://cnfl.io/regular_vs_window_aggregation. A similar warning appears if only window_start is included without window_end. When performing windowed aggregations with functions like TUMBLE, HOP, CUMULATE, or SESSION, you typically group by the window boundaries (window_start and window_end) along with any other grouping keys. If you include only one of the window boundary columns, either window_start or window_end, in the GROUP BY clause, Flink interprets this as a regular, non-windowed aggregation. This leads to continuously updating results for each input row rather than a single result per window, which is usually not the intended behavior and can consume more resources. The following example illustrates a query that triggers this warning: -- Incorrect GROUP BY for TUMBLE window SELECT window_end, SUM(price) as `sum` FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES) ) GROUP BY window_end; -- Missing window_start To resolve this warning: Include both window boundaries: When performing windowed aggregations, ensure that your GROUP BY clause includes both window_start and window_end. 
Check for warnings: If you’re unsure whether your query produces this warning, run it with the EXPLAIN statement to see if warnings are generated. The following example shows the revised query that resolves this warning: -- Correct GROUP BY for TUMBLE window SELECT window_start, window_end, SUM(price) as `sum` FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES) ) GROUP BY window_start, window_end; -- Includes both window boundaries Session window without a PARTITION BY key¶ [Warning] Your query uses a SESSION window without a PARTITION BY clause. This results in all data being processed by a single, non-parallel task, which can create a significant bottleneck, leading to poor performance and high resource consumption. To improve performance and enable parallel execution, specify a PARTITION BY key in your SESSION window. For more information, see https://cnfl.io/session_without_partioning. When using a SESSION window, data is grouped into sessions based on periods of activity, which are separated by a specified gap of inactivity. If you don’t include a PARTITION BY clause, all data is sent to a single, non-parallel task to correctly identify these sessions. This creates a significant performance bottleneck and prevents the query from scaling. The following example shows a query that triggers this warning: -- This query uses a SESSION window without a PARTITION BY key SELECT * FROM SESSION( TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES ); To resolve this warning: Add a PARTITION BY key: Modify your SESSION window definition to include a PARTITION BY clause. This partitions the data by the specified key(s), allowing the sessionization to be performed independently and in parallel for each partition. This is important for performance and scalability. 
Check for warnings: If you’re unsure whether your query produces this warning, run it with the EXPLAIN statement to see if warnings are generated. The following example shows the revised query that resolves the warning: -- Corrected query with PARTITION BY to enable parallel execution SELECT * FROM SESSION( TABLE `examples`.`marketplace`.`orders` PARTITION BY customer_id, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES ); Related content¶ Profile a Query EXPLAIN Statement SET HINTS Window Aggregation Window TopN Window Join Window Deduplication Interval join Temporal join

#### Code Examples

```text
Your Flink statement has entered a Degraded state because it is unable to make consistent progress. This can be caused by inefficient query logic or insufficient compute resources. Please review your statement for performance bottlenecks. If the issue persists, consider scaling your compute pool or contacting Confluent support for assistance.
```

```text
[Warning] The primary key "<pk_column>" does not match the upsert key "<upsert_key_column>" that is derived from the query. If the primary key and upsert key don't match, the system needs to add a state-intensive operation for correction, which can result in a DEGRADED statement and higher CFU consumption. If possible, revisit the table declaration with the primary key or change your query. For more information, see https://cnfl.io/primary_vs_upsert_key.
```

```sql
-- Create a table to store customer total orders
CREATE TABLE customer_orders (
    total_orders INT PRIMARY KEY NOT ENFORCED, -- Primary Key is total_orders
    customer_name STRING
);

-- Insert aggregated order counts per customer
INSERT INTO customer_orders
SELECT
    SUM(order_count), customer_name -- Upsert key derived from GROUP BY is customer_name
FROM ( VALUES
    ('Bob', 2),      -- Bob placed 2 orders
    ('Alice', 1),    -- Alice placed 1 order
    ('Bob', 2)       -- Bob placed 2 more orders
) AS OrderData(customer_name, order_count)
GROUP BY customer_name;
```
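
The warning goes away once the declared primary key matches the key derived from the `GROUP BY`. The following is a minimal sketch of the "align primary key" resolution, not the documented fix verbatim. The table name `customer_orders_by_name` is hypothetical, chosen so it doesn't clash with the table created above.

```sql
-- Hypothetical table name; the primary key is now the GROUP BY column.
CREATE TABLE customer_orders_by_name (
    customer_name STRING PRIMARY KEY NOT ENFORCED, -- Primary key is customer_name
    total_orders INT
);

-- The upsert key derived from GROUP BY (customer_name) now matches the primary key.
INSERT INTO customer_orders_by_name
SELECT
    customer_name, SUM(order_count)
FROM ( VALUES
    ('Bob', 2),
    ('Alice', 1),
    ('Bob', 2)
) AS OrderData(customer_name, order_count)
GROUP BY customer_name;
```

To confirm the change, prefix the `INSERT INTO` statement with `EXPLAIN` and check that no primary key warning is reported.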

```text
[Warning] Your query includes one or more highly state-intensive operators but does not set a time-to-live (TTL) value, which means that the system potentially needs to store an infinite amount of state. This can result in a DEGRADED statement and higher CFU consumption. If possible, change your query to use a different operator, or set a time-to-live (TTL) value. For more information, see https://cnfl.io/high_state_intensive_operators.
```

```sql
-- Joining two unbounded streams without TTL
SELECT c.*, o.*
FROM `examples`.`marketplace`.`clicks` c
INNER JOIN `examples`.`marketplace`.`orders` o
ON c.user_id = o.customer_id;
```
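
One way to bound the state of the join above is to set a statement-wide TTL with the `sql.state-ttl` option before running it. This is a sketch under assumptions: the one-hour value is purely illustrative, and the right TTL depends on how late matching rows can arrive in your data.

```sql
-- Assumed value: discard join state that has not been updated for 1 hour.
SET 'sql.state-ttl' = '1 h';

SELECT c.*, o.*
FROM `examples`.`marketplace`.`clicks` c
INNER JOIN `examples`.`marketplace`.`orders` o
ON c.user_id = o.customer_id;
```

A per-table alternative is a State TTL hint. The sketch below assumes the `STATE_TTL` hint syntax from open-source Apache Flink, keyed by table alias; the durations are again illustrative.

```sql
-- Assumed hint syntax and values: keep click state for 1 hour and order state for 6 hours.
SELECT /*+ STATE_TTL('c' = '1h', 'o' = '6h') */ c.*, o.*
FROM `examples`.`marketplace`.`clicks` c
INNER JOIN `examples`.`marketplace`.`orders` o
ON c.user_id = o.customer_id;
```

Rows whose state has expired can no longer be matched, so a TTL trades completeness of late joins for bounded state.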

```text
[Warning] Your query contains only "window_end" in the GROUP BY clause, with no corresponding "window_start". This means that the query is considered a regular aggregation query and not a windowed aggregation, which can result in unexpected, continuously updating output and higher CFU consumption. if you want a windowed aggregation in your query, ensure that you include both "window_start" and "window_end" in the GROUP BY clause. For more information, see https://cnfl.io/regular_vs_window_aggregation.
```

```sql
-- Incorrect GROUP BY for TUMBLE window
SELECT window_end, SUM(price) as `sum`
FROM TABLE(
    TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)
)
GROUP BY window_end; -- Missing window_start
```

```sql
-- Correct GROUP BY for TUMBLE window
SELECT window_start, window_end, SUM(price) as `sum`
FROM TABLE(
    TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)
)
GROUP BY window_start, window_end; -- Includes both window boundaries
```
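
The same rule applies when the aggregation has additional grouping keys: both window boundaries stay in the `GROUP BY` clause next to the other keys. The following sketch groups by `customer_id` as an illustration; the alias `total_price` is an arbitrary choice.

```sql
-- Windowed aggregation per customer: both window boundaries plus the extra key.
SELECT window_start, window_end, customer_id, SUM(price) AS total_price
FROM TABLE(
    TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)
)
GROUP BY window_start, window_end, customer_id;
```

This should emit one row per customer per 10-minute window rather than a continuously updating result.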

```text
[Warning] Your query uses a SESSION window without a PARTITION BY clause. This results in all data being processed by a single, non-parallel task, which can create a significant bottleneck, leading to poor performance and high resource consumption. To improve performance and enable parallel execution, specify a PARTITION BY key in your SESSION window. For more information, see https://cnfl.io/session_without_partioning.
```

```sql
-- This query uses a SESSION window without a PARTITION BY key
SELECT *
FROM SESSION(
    TABLE `examples`.`marketplace`.`orders`,
    DESCRIPTOR($rowtime),
    INTERVAL '5' MINUTES
);
```

```sql
-- Corrected query with PARTITION BY to enable parallel execution
SELECT *
FROM SESSION(
    TABLE `examples`.`marketplace`.`orders` PARTITION BY customer_id,
    DESCRIPTOR($rowtime),
    INTERVAL '5' MINUTES
);
```
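
A partitioned session window is often followed by an aggregation per session. The sketch below counts orders per customer session. It is an illustration rather than part of the original guide: it wraps the call in `TABLE(...)`, following the open-source Flink window aggregation syntax and the TUMBLE examples above, and the alias `order_count` is arbitrary.

```sql
-- Orders per customer session, with sessions separated by a 5-minute gap.
SELECT customer_id, window_start, window_end, COUNT(*) AS order_count
FROM TABLE(
    SESSION(
        TABLE `examples`.`marketplace`.`orders` PARTITION BY customer_id,
        DESCRIPTOR($rowtime),
        INTERVAL '5' MINUTES
    )
)
GROUP BY customer_id, window_start, window_end;
```

The partition key appears in the `GROUP BY` clause together with both window boundaries, so the result remains a windowed aggregation.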

---

### Run a Snapshot Query with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/run-snapshot-query.html

Run a Snapshot Query with Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® supports snapshot queries that read data from a table at a specific point in time. In contrast with a streaming query, which runs continuously and returns results incrementally, a snapshot query runs, returns results, and then exits. This guide shows how to run a snapshot query on a Flink table. Step 1: Create an example data stream Step 2: Run a snapshot query on the topic Step 3: Set the snapshot mode in SQL Note Snapshot query is an Early Access Program feature in Confluent Cloud for Apache Flink. An Early Access feature is a component of Confluent Cloud introduced to gain feedback. This feature should be used only for evaluation and non-production testing purposes or to provide feedback to Confluent, particularly as it becomes more widely available in follow-on preview editions. Early Access Program features are intended for evaluation use in development and testing environments only, and not for production use. Early Access Program features are provided: (a) without support; (b) “AS IS”; and (c) without indemnification, warranty, or condition of any kind. No service level commitment will apply to Early Access Program features. Early Access Program features are considered to be a Proof of Concept as defined in the Confluent Cloud Terms of Service. Confluent may discontinue providing preview releases of the Early Access Program features at any time in Confluent’s sole discretion. Prerequisites¶ Access to Confluent Cloud. The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, contact your OrganizationAdmin or EnvironmentAdmin. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink. A provisioned Flink compute pool. 
Step 1: Create an example data stream¶ In this step, you create a Datagen source connector that produces a stream of data. If you have a topic with data, you can skip this step and proceed to Step 2: Run a snapshot query on the topic. In the Confluent Cloud UI, go to the Environments page. Select the environment where you want to create the connector. In the Overview page, click the cluster that you want to use. In the navigation menu, click Connectors. Click Add connector, and in the Connector Plugins page, click Sample Data. In the Launch Sample Data dialog, click Users, and click Launch. It may take a few minutes to create the connector. Step 2: Run a snapshot query on the topic¶ In the navigation menu, click Topics. In the topics list, find the topic you want to query. If you created a Datagen source connector, the topic is named sample_data_users. Click the topic name to open the topic details page. Click Query with Flink. A Flink workspace opens with a SQL editor that you can use to run a snapshot query. In the cell, find the Mode dropdown, which defaults to Streaming. Change the mode to Snapshot and click Run. The query runs and returns all of the messages that have been produced to the topic at the current point in time. Step 3: Set the snapshot mode in SQL¶ You can set the snapshot mode in SQL by using the SET statement to assign the sql.snapshot.mode configuration option. In the cell, prepend the SELECT statement with the following SET statement: SET 'sql.snapshot.mode' = 'now'; SELECT * FROM `<your-env>`.`<your-cluster>`.`sample_data_users`; Click Run. The query runs and returns all of the messages that have been produced to the topic at the current point in time. Related content¶ Snapshot Queries Query Tableflow Tables with Flink Statements

#### Code Examples

```sql
SET 'sql.snapshot.mode' = 'now';
SELECT * FROM `<your-env>`.`<your-cluster>`.`sample_data_users`;
```
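
Because a snapshot query runs over a bounded view of the topic and then exits, an aggregation returns a final result instead of a continuously updating one. The following sketch assumes the Users sample data includes a `gender` column; the alias `user_count` is arbitrary.

```sql
SET 'sql.snapshot.mode' = 'now';

-- Assumes the sample Users data has a gender column.
SELECT gender, COUNT(*) AS user_count
FROM `<your-env>`.`<your-cluster>`.`sample_data_users`
GROUP BY gender;
```

As in the statement above, replace the `<your-env>` and `<your-cluster>` placeholders with your own environment and cluster names.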

---

### Scan and Summarize Flink Tables in Confluent Cloud | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/scan-and-summarize-tables.html

Scan and Summarize Tables with Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® provides graphical tools in your workspaces that enable scanning and summarizing data visually in Flink tables. Distributions of values for each column in a table are shown in embedded charts, or sparklines. You can highlight values in one chart to filter corresponding values in all columns, revealing connections and relationships in your data. Overview¶ When you explore data in a table, you frequently want to find a row (scan), or you may want to understand the shape of the data (summarize). Scan¶ Cloud Console workspaces provide a search box that enables scanning the data for particular rows. For example, if you’re interested in the orders that are placed by a particular customer, you can enter the customer’s ID in the search box to scan the table for relevant rows. Summarize¶ In a Cloud Console workspace, when you run a Flink SQL statement that returns a table, sparklines are displayed automatically and show the distribution of distinct values in each column. These charts update automatically as new rows arrive from the data stream. The workspace enables filtering rows by interacting with these charts. For example, in an orders table, you can apply a filter that shows only rows for low-price items and compare these results with another filter that shows high-price items to see if there’s a different distribution of items between the price ranges. Explore example data¶ Log in to the Confluent Cloud Console and navigate to an environment that hosts Flink SQL. In the navigation menu, click Stream processing to open the Stream processing page. If you have a workspace set up already, click its tile, or click Create workspace to create a new one. In the workspace, use the Catalog and Database dropdown controls to select the examples catalog and the marketplace database. Run the following statement to query the orders stream for all rows. 
SELECT * FROM orders; At the top of each column in the results, a chart is displayed. The charts update as new rows stream into the query results. Each chart shows the distribution of distinct values in the column, for strings, booleans, numbers, and categories. An icon displays the data type of the column. Also, the arrow icon enables sorting rows by the column values. At the bottom of each column, aggregated values are displayed that summarize aspects of the data in the column, like the count of rows and the number of distinct values, or cardinality. For columns with numerical values, you can see statistics, like the average, minimum, and maximum values. The number of rows displayed is limited to 5000 or to the LIMIT value you specify in your query. For example, the following statement limits the query result to 50 rows. SELECT * FROM orders LIMIT 50; At the bottom of the price column, change the dropdown control from Count to Average. The average value of the most recent prices displays and updates as new rows arrive. Select other statistics for prices, like Max and Min. Search for values¶ The search box enables finding values across all columns in the currently displayed result set. The search box doesn’t filter the data. It’s useful for scanning for a particular row or narrowing the results down to a particular row. In the search box, type “3000”. All rows that have a customer_id value of 3000 are displayed, which enables viewing all orders from this customer. Click x in the search box to clear it. In the search box, type “1000”. All rows that have a product_id value of 1000 are displayed, which enables viewing all orders for this product. Click x in the search box to clear it. In the search box, type “3050”, and in the price column, click the double-arrow icon. All rows for customer 3050 are displayed, and the rows are sorted by price, from lowest to highest. In the price column, click the arrow icon. 
All rows for customer 3050 are displayed, and the rows are sorted by price, from highest to lowest. Click x in the search box to clear it, and click the arrow icon in the price column to reset the rows to unsorted. Apply a filter¶ Any column that has numerical or datetime data is filterable. Filters apply across all columns in the table. Filters apply only in the graphical display and don’t affect the underlying data stream. Hover over the leftmost bar in the price chart. The cursor changes to a + target, and a summary of the rows represented by the bar appears in a popup. Click and drag, or brush, the cursor over the first three bars in the price chart. A filter is applied to the price data, so only the rows with prices that fall within the selected range are displayed. This filter shows the orders for the least expensive products. When a filter is applied, the unfiltered data is shown in gray. Above the charts, the current filter is displayed. Click it to view and adjust its settings. You can apply more than one filter. In the customer_id chart, brush the first three bars. A filter is applied to the customer data. In conjunction with the filter you applied already to the price data, the displayed rows show the least expensive products ordered by customers with IDs between 3000 and 3029, inclusive. Click x in the filters to clear them. View changes over time¶ If your data contains a datetime column, each numerical column can show its average value over time, in addition to its distribution. If the data is filtered, the unfiltered average value is also shown for context. Hover over the chart to see exact values. 
Related content¶ View Time Series Data Aggregate a Stream in a Tumbling Window Compare Current and Previous Values in a Data Stream Convert the Serialization Format of a Topic Flink action: Mask Fields in a Table

#### Code Examples

```sql
SELECT * FROM orders;
```

```sql
SELECT * FROM orders LIMIT 50;
```
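
The chart filters described above apply only to the graphical display. To restrict the rows that the statement itself returns, the same range can be expressed in SQL. The sketch below mirrors the customer ID range from the filter walkthrough and assumes `customer_id` is a numeric column in the `orders` table.

```sql
-- Return only orders from customers with IDs 3000 through 3029, capped at 50 rows.
SELECT *
FROM orders
WHERE customer_id BETWEEN 3000 AND 3029
LIMIT 50;
```

Unlike a chart filter, this predicate changes the statement's result set, so the sparklines summarize only the matching rows.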

---

### Transform a Topic with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/transform-topic.html

Transform a Topic with Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables generating a transformed topic from an input topic’s properties, like partition count, key, serialization format, and field names, with only a few clicks. In this guide, you create a Flink table and apply a transformation that creates an output topic with these changes: Rename a field Specify a bucket key Change the key and value serialization format Specify a different partition count The Transform Topic action creates a Flink SQL statement for you, but no knowledge of Flink SQL is required to use it. This guide shows the following steps: Step 1: Create a users table Step 2: Apply the Transform Topic action Step 3: Inspect the transformed topic Prerequisites¶ Access to Confluent Cloud. The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, contact your OrganizationAdmin or EnvironmentAdmin. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink. A provisioned Flink compute pool. Step 1: Create a users table¶ Log in to Confluent Cloud and navigate to your Flink workspace. Run the following statement to create a users table. -- Create a users table. CREATE TABLE users ( user_id STRING, registertime BIGINT, gender STRING, regionid STRING ); Insert rows with mock data into the users table. -- Populate the table with mock users data. INSERT INTO users VALUES ('Thomas A. Anderson', 1677260724, 'male', 'Region_4'), ('Trinity', 1677260733, 'female', 'Region_4'), ('Morpheus', 1677260742, 'male', 'Region_8'), ('Dozer', 1677260823, 'male', 'Region_1'), ('Agent Smith', 1677260955, 'male', 'Region_0'), ('Persephone', 1677260901, 'female', 'Region_2'), ('Niobe', 1677260921, 'female', 'Region_3'), ('Zee', 1677260922, 'female', 'Region_5'); Inspect the inserted rows. 
SELECT * FROM users; Your output should resemble:

| user_id | registertime | gender | regionid |
|---|---|---|---|
| Thomas A. Anderson | 1677260724 | male | Region_4 |
| Trinity | 1677260733 | female | Region_4 |
| Morpheus | 1677260742 | male | Region_8 |
| Dozer | 1677260823 | male | Region_1 |
| Agent Smith | 1677260955 | male | Region_0 |
| Persephone | 1677260901 | female | Region_2 |
| Niobe | 1677260921 | female | Region_3 |
| Zee | 1677260922 | female | Region_5 |

Step 2: Apply the Transform Topic action¶ In the previous step, you created a Flink table and populated it with a few rows. In this step, you apply the Transform Topic action to create a transformed output table. Navigate to the Environments page, and in the navigation menu, click Data portal. In the Data portal page, click the dropdown menu and select the environment for your workspace. In the Recently created section, find your users topic and click it to open the details pane. In the details pane, click Actions, and in the Actions list, click Transform topic to open the dialog. In the Action details section, set up the transformation. user_id field: select the Key field checkbox. registertime field: enter registration_time. Partition count property: enter 3. Serialization format property: select JSON Schema. By default, the name of the transformed topic is users_transform, and you can change this as desired. In the Runtime configuration section, configure how the transformation statement will run. (Optional) Select the Flink compute pool to run the transformation statement. The current compute pool is selected as the default. (Optional) Select Run with a service account for production jobs. The service account you select must have the EnvironmentAdmin role to create topics, schemas, and run Flink statements. (Optional) Select Show SQL to view the Flink statement that does the transformation work. 
Your Flink SQL should resemble: CREATE TABLE `your-env`.`your-cluster`.`users_transform` DISTRIBUTED BY HASH ( `user_id` ) INTO 3 BUCKETS WITH ( 'value.format' = 'json-registry', 'key.format' = 'json-registry' ) AS SELECT `user_id`, `registertime` as `registration_time`, `gender`, `regionid` FROM `your-env`.`your-cluster`.`users`; Click Confirm and run to run the transformation statement. A Summary page displays the result of the job submission, showing the statement name and other details. Step 3: Inspect the transformed topic¶ In the Summary page, click the Output topic link for the users_transform topic, and in the topic’s details pane, click Query to open a Flink workspace. Run the following statement to view the rows in the users_transform table. Note the renamed registration_time column. SELECT * FROM `users_transform`; Click Stop to end the statement. Run the following command to confirm that the user_id field in the transformed table is a key field. DESCRIBE `users_transform`; Your output should resemble: +-------------------+-----------+----------+------------+ | Column Name | Data Type | Nullable | Extras | +-------------------+-----------+----------+------------+ | user_id | STRING | NULL | BUCKET KEY | | registration_time | BIGINT | NULL | | | gender | STRING | NULL | | | regionid | STRING | NULL | | +-------------------+-----------+----------+------------+ Run the following command to confirm the serialization format and partition count on the transformed topic. 
SHOW CREATE TABLE `users_transform`; Your output should resemble: CREATE TABLE `your-env`.`your-cluster`.`users_transform` ( `user_id` VARCHAR(2147483647), `registration_time` BIGINT, `gender` VARCHAR(2147483647), `regionid` VARCHAR(2147483647) ) DISTRIBUTED BY HASH(`user_id`) INTO 3 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'kafka.cleanup-policy' = 'delete', 'kafka.max-message-size' = '2097164 bytes', 'kafka.retention.size' = '0 bytes', 'kafka.retention.time' = '7 d', 'key.format' = 'json-registry', 'scan.bounded.mode' = 'unbounded', 'scan.startup.mode' = 'earliest-offset', 'value.format' = 'json-registry' ) Related content¶ Flink action: Deduplicate Rows in a Table Flink action: Mask Fields in a Table Flink action: Create an Embedding Aggregate a Stream in a Tumbling Window Compare Current and Previous Values in a Data Stream Convert the Serialization Format of a Topic
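
As an optional follow-up to Step 3, the renamed column can be made easier to read. This is a minimal sketch, assuming the `registertime` values in the mock data are seconds since the Unix epoch (they look like it, but the guide does not say so). The `registered_at` alias is illustrative and not part of the guide.

```sql
-- Sketch: show the renamed column next to a human-readable timestamp.
-- Assumption: registration_time holds epoch seconds, so the precision argument is 0.
SELECT
  `user_id`,
  `registration_time`,
  TO_TIMESTAMP_LTZ(`registration_time`, 0) AS `registered_at`
FROM `users_transform`;
```

If the values were milliseconds instead, the second argument would be 3.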

#### Code Examples

```sql
-- Create a users table.
CREATE TABLE users (
  user_id STRING,
  registertime BIGINT,
  gender STRING,
  regionid STRING
);
```

```sql
-- Populate the table with mock users data.
INSERT INTO users VALUES
  ('Thomas A. Anderson', 1677260724, 'male', 'Region_4'),
  ('Trinity', 1677260733, 'female', 'Region_4'),
  ('Morpheus', 1677260742, 'male', 'Region_8'),
  ('Dozer', 1677260823, 'male', 'Region_1'),
  ('Agent Smith', 1677260955, 'male', 'Region_0'),
  ('Persephone', 1677260901, 'female', 'Region_2'),
  ('Niobe', 1677260921, 'female', 'Region_3'),
  ('Zee', 1677260922, 'female', 'Region_5');
```

```sql
SELECT * FROM users;
```

```text
user_id            registertime gender regionid
Thomas A. Anderson 1677260724   male   Region_4
Trinity            1677260733   female Region_4
Morpheus           1677260742   male   Region_8
Dozer              1677260823   male   Region_1
Agent Smith        1677260955   male   Region_0
Persephone         1677260901   female Region_2
Niobe              1677260921   female Region_3
Zee                1677260922   female Region_5
```

```sql
CREATE TABLE `your-env`.`your-cluster`.`users_transform`
DISTRIBUTED BY HASH (
    `user_id`
) INTO 3 BUCKETS WITH (
    'value.format' = 'json-registry',
    'key.format' = 'json-registry'
) AS SELECT
    `user_id`,
    `registertime` as `registration_time`,
    `gender`,
    `regionid`
FROM `your-env`.`your-cluster`.`users`;
```

```sql
SELECT * FROM `users_transform`;
```

```sql
DESCRIBE `users_transform`;
```

```text
+-------------------+-----------+----------+------------+
|    Column Name    | Data Type | Nullable |   Extras   |
+-------------------+-----------+----------+------------+
| user_id           | STRING    | NULL     | BUCKET KEY |
| registration_time | BIGINT    | NULL     |            |
| gender            | STRING    | NULL     |            |
| regionid          | STRING    | NULL     |            |
+-------------------+-----------+----------+------------+
```

```sql
SHOW CREATE TABLE `users_transform`;
```

```sql
CREATE TABLE `your-env`.`your-cluster`.`users_transform` (
  `user_id` VARCHAR(2147483647),
  `registration_time` BIGINT,
  `gender` VARCHAR(2147483647),
  `regionid` VARCHAR(2147483647)
)
DISTRIBUTED BY HASH(`user_id`) INTO 3 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'kafka.cleanup-policy' = 'delete',
  'kafka.max-message-size' = '2097164 bytes',
  'kafka.retention.size' = '0 bytes',
  'kafka.retention.time' = '7 d',
  'key.format' = 'json-registry',
  'scan.bounded.mode' = 'unbounded',
  'scan.startup.mode' = 'earliest-offset',
  'value.format' = 'json-registry'
)
```

---

### View Time Series Data in Confluent Cloud | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/how-to-guides/view-time-series-data.html

View Time Series Data with Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables visualizing time-series data in real time. The output of certain SQL statements renders as time-series charts. Whenever a statement’s output has at least one time column and at least one numeric column, it is charted automatically in a time-series graph when you toggle to chart mode. You can customize charts further: choose a different x-axis column, add multiple series, change the chart’s time granularity, and filter the overall time range. Prerequisites¶ Access to Confluent Cloud. The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, contact your OrganizationAdmin or EnvironmentAdmin. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink. A provisioned Flink compute pool. Step 1: Open a workspace¶ Log in to Confluent Cloud Console at https://confluent.cloud/login. Open a Flink workspace. Use the Catalog and Database dropdown controls to select the examples catalog and the marketplace database. Step 2: Generate time-series data¶ Run the following statement to generate three time-series signals. SELECT $rowtime AS row_timestamp, RAND() * 0.10 * SIN(0.10 * UNIX_TIMESTAMP() + 0) AS series1, RAND() * 0.10 * SIN(0.10 * UNIX_TIMESTAMP() + 1.1e3) AS series2, RAND() * 0.03 * SIN(0.10 * UNIX_TIMESTAMP() + 1.2e3) AS series3 FROM orders; Step 3: View time-series data¶ Click the time-series toggle to open the time-series visualizer. Your output should resemble: The upper pane shows the series1 signal. The lower pane enables scrolling through the data as it streams through the visualizer. On the right side of the lower pane, click and drag it to the left. On the left side of the lower pane, click and drag it to the right. 
These gestures define the width of the view window that displays in the upper pane. Click and drag it to the right. The view in the upper pane adjusts to display the data within the window. As data continues to stream, the window in the lower pane moves to the left, while the display in the upper pane remains centered on the data selected in the window. Double-click to reset the view. Click Add Column, and in the context menu, select series2 and series3 to display the other signals. Click to download the current visualization as a PNG file. Click the time-series toggle to close the visualizer. Related content¶ Scan and Summarize Tables Compare Current and Previous Values in a Data Stream Convert the Serialization Format of a Topic
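
The charting rule described above (at least one time column and at least one numeric column) also applies to aggregated output. The following is a minimal sketch of a windowed query that produces such columns. It assumes the `orders` table in the `examples.marketplace` database has a numeric `price` column, and the 10-second window size and the `order_count` and `total_price` aliases are arbitrary choices for illustration.

```sql
-- Sketch: one time column (window_start) plus two numeric columns.
-- Assumption: orders has a numeric price column.
SELECT
  window_start,
  COUNT(*) AS order_count,
  SUM(price) AS total_price
FROM TABLE(
  TUMBLE(TABLE orders, DESCRIPTOR($rowtime), INTERVAL '10' SECOND))
GROUP BY window_start, window_end;
```

Compared with the per-row signals in Step 2, a windowed aggregate emits one row per window, so the resulting chart would be coarser but steadier.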

#### Code Examples

```sql
SELECT $rowtime AS row_timestamp,
   RAND() * 0.10 * SIN(0.10 * UNIX_TIMESTAMP() + 0) AS series1,
   RAND() * 0.10 * SIN(0.10 * UNIX_TIMESTAMP() + 1.1e3) AS series2,
   RAND() * 0.03 * SIN(0.10 * UNIX_TIMESTAMP() + 1.2e3) AS series3
FROM orders;
```

---

### Best Practices for Moving SQL Statements to Production in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/operate-and-deploy/best-practices.html

Move SQL Statements to Production in Confluent Cloud for Apache Flink¶ When you move your Flink SQL statements to production in Confluent Cloud for Apache Flink®, consider the following recommendations and best practices. Validate your watermark strategy Validate or disable idleness handling Choose the correct Schema Registry compatibility type Separate workloads of different priorities into separate compute pools Use event-time temporal joins instead of streaming joins Implement state time-to-live (TTL) Use service account API keys for production Assign custom names to Flink SQL statements Validate your watermark strategy¶ When moving your Flink SQL statements to production, it’s crucial to validate your watermark strategy. Watermarks in Flink track the progress of event time and provide a way to trigger time-based operations. Confluent Cloud for Apache Flink provides a default watermark strategy for all tables, whether they’re created automatically from a Kafka topic or from a CREATE TABLE statement. The default watermark strategy is applied on the $rowtime system column, which is mapped to the associated timestamp of a Kafka record. Watermarks for this default strategy are calculated per Kafka partition, and at least 250 events are required per partition. Here are some situations when you need to define your own custom watermark strategy: When the event time needs to be based on data from the payload and not the timestamp of the Kafka record. If a delay of longer than 7 days can occur. When events might not arrive in the exact order they were generated. When data may arrive late due to network latency or processing delays. Validate or disable idleness handling¶ One critical aspect to consider when moving your Flink SQL statements to production is the handling of idleness in data streams. 
If no events arrive within a certain time (timeout duration) on a Kafka partition, that partition is marked as idle and does not contribute to the watermark calculation until a new event comes in. This situation creates a problem: if some partitions continue to receive events while others are idle, the overall watermark computation, which is based on the minimum across all parallel watermarks, may be inaccurately held back. Confluent Cloud for Apache Flink dynamically adjusts the consideration of idle partitions in watermark calculations with Confluent’s Progressive Idleness feature. The idle-time detection starts small at 15 seconds but grows linearly with the age of the statement up to a maximum of 5 minutes. Progressive Idleness can cause wrong watermarks if a partition is marked as idle too quickly, and this can cause the system to move ahead too quickly, impacting data processing. When you move your Flink SQL statement into production, make sure that you have validated how you want to handle idleness. You can configure or disable this behavior by using the sql.tables.scan.idle-timeout option. Choose the correct Schema Registry compatibility type¶ The Confluent Schema Registry plays a pivotal role in ensuring that the schemas of the data flowing through your Kafka topics are consistent, compatible, and evolve in a controlled manner. One of the key decisions in this process is selecting the appropriate schema compatibility type. Consider using FULL_TRANSITIVE compatibility to ensure that any new schema is fully compatible with all previous versions of the schema. This comprehensive check minimizes the risk of introducing changes that could disrupt data-processing applications relying on the data. When choosing any of the other compatibility modes, you need to consider the consequences on currently-running statements, especially since a Flink statement is both a producer and a consumer at the same time. 
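
To make the watermark and idleness recommendations earlier in this section concrete, here is a minimal sketch. The table name `orders`, the 5-second out-of-orderness bound, and the 1-minute idle timeout are illustrative assumptions, not recommended values. Check the option reference for the accepted duration formats and for how to disable idleness handling entirely.

```sql
-- Sketch: custom watermark that tolerates events arriving up to 5 seconds late.
-- `orders` is a hypothetical table.
ALTER TABLE orders
  MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '5' SECOND;

-- Sketch: use a fixed idle timeout instead of the progressive default.
-- The value shown is an assumption for illustration.
SET 'sql.tables.scan.idle-timeout' = '1 min';
```

A larger watermark delay trades latency for completeness, so it is worth validating against the lateness actually observed in the topic.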
Separate workloads of different priorities into separate compute pools¶ All statements using the same compute pools compete for resources. Although the Confluent Cloud Autopilot aims to provide each statement with the resources it needs, this may not always be possible, in particular, when the maximum resources of the compute pool are exhausted. To avoid situations in which statements with different latency and availability requirements compete for resources, consider using separate compute pools for different use cases, for example, ad-hoc exploration vs. mission-critical, long-running queries. Because statements may affect each other, you should share compute pools only between statements with comparable requirements. Use event-time temporal joins instead of streaming joins¶ When processing data streams, choosing the right type of join operation is crucial for efficiency and performance. Event-time temporal joins offer significant advantages over regular streaming joins. Temporal joins are particularly useful when the join condition is based on a time attribute. They enable you to join a primary stream with a historical version of another table, using the state of that table as it existed at the time of the event. This results in more efficient processing, because it avoids the need to keep large amounts of state in memory. Traditional streaming joins involve keeping a stateful representation of all joined records, which can be inefficient and resource-intensive, especially with large datasets or high-velocity streams. Also, event-time temporal joins typically result in insert-only outputs, when your inputs are also insert-only, which means that once a record is processed and joined, it is not updated or deleted later. Streaming joins often need to handle updates and deletions. When moving to production, prefer using temporal joins wherever applicable to ensure your data processing is efficient and performant. 
Avoid traditional streaming joins unless necessary, as they can lead to increased resource consumption and complexity. Implement state time-to-live (TTL)¶ Some operations in Flink, like streaming joins and pattern matching, require storing state. Managing this state effectively is crucial for application performance, resource optimization, and cost reduction. The state time-to-live (TTL) feature enables specifying a minimum time interval for how long idle state, meaning state that is not updated, is retained. This mechanism ensures that state is cleared at some time after the idle duration. When moving to production, you should configure the sql.state-ttl setting carefully to balance performance versus correctness of the results. Use service account API keys for production¶ API keys for Confluent Cloud can be created with user accounts and service accounts. A service account is intended to provide an identity for an application or service that needs to perform programmatic operations within Confluent Cloud. When moving to production, ensure that only service account API keys are used. Avoid user account API keys, except for development and testing. If a user leaves and a user account is deleted, all API keys created with that user account are deleted, and applications might break. Assign custom names to Flink SQL statements¶ Custom naming facilitates easier management, monitoring, and debugging of your streaming applications by providing clear, identifiable references to specific operations or data flows. You can do this easily by using the client.statement-name option. Review error handling and monitoring best practices¶ Review these topics: Error handling and recovery Best practices for alerting Notifications Related content¶ Flink Compute Pools Billing on Confluent Cloud for Apache Flink Managing and Monitoring Statements
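
The join, TTL, and naming recommendations above can be combined in one statement. This is a minimal sketch under stated assumptions: the tables `orders`, `currency_rates`, and `enriched_orders`, their columns, the statement name, and the TTL value are all hypothetical. For the temporal join to work, the right-hand table would need a primary key and a watermark so that it can act as a versioned table.

```sql
-- Sketch: name the statement and bound how long idle state is kept.
SET 'client.statement-name' = 'orders-enrichment-v1';  -- hypothetical name
SET 'sql.state-ttl' = '1 h';                            -- illustrative value

-- Sketch: event-time temporal join. Each order is joined with the rate
-- that was valid at the order's event time.
INSERT INTO enriched_orders
SELECT
  o.order_id,
  o.price * r.rate AS converted_price
FROM orders AS o
JOIN currency_rates FOR SYSTEM_TIME AS OF o.`$rowtime` AS r
  ON o.currency = r.currency;
```

A TTL that is too short can drop state that a later event still needs, which is the correctness trade-off the section describes.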

---

### Carry-over Offsets in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/operate-and-deploy/carry-over-offsets.html

Carry-over Offsets in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® supports carry-over offsets, which means that you can use the topic offsets from one statement to start a new statement. Carry-over offsets provide a streamlined way to update Flink statements without data loss. This feature eliminates the manual complexity of copying offsets between statements and reduces the need to monitor statement status when deploying CI/CD pipelines. Automatic orchestration handles the upgrade process. The system automatically waits for the old statement to stop before starting the new one, providing a seamless transition of processing between statements. Carry-over offsets are available only when replacing an existing statement. This feature enables you to evolve statements with exactly-once semantics across the update when the statement is “stateless”, as determined by the system. At a high level, “stateless” applies to statements that can process each event independently and in any order. For other scenarios, such as aggregates, lag, windows, pattern matching, or use of an upsert sink, this feature can’t be used, because the update may cause inconsistent results. To use carry-over offsets, add the sql.tables.initial-offset-from property to the statement configuration when you create your new statement. In the Confluent Cloud Console and the Flink SQL shell, you can set the property by using the SET statement, for example: SET 'sql.tables.initial-offset-from' = '<reference-statement-name>'; The <reference-statement-name> is the name of the statement that you want to use as the reference for the carry-over offsets. 
If you’re using the Statements API or the Confluent Terraform provider, you can set the property by using the properties field, for example: { "properties": { "sql.tables.initial-offset-from": "<reference-statement-name>" } } Considerations for carry-over offsets¶ Regional limitations¶ The referenced statement must be in the same organization, environment, and region as the new statement. Cross-region offset carry-over is not supported using this property. Timeout Behavior¶ New statements will wait up to 6 hours for the referenced statement to stop. If the timeout expires, the new statement will fail with an error message indicating the reason. Table Options Priority¶ Explicit table options in your SQL text take precedence over inherited offsets. Only tables without explicit options will use carried-over offsets. Example of table options priority: INSERT INTO output SELECT * FROM table1 UNION ALL SELECT * FROM table2 /*+ OPTIONS('scan.startup.mode' = 'latest-offset') */; Result: table1 uses carried-over offsets, table2 uses the specified latest-offset mode. Common Issues¶ Statement Not Found Error¶ Verify the referenced statement name is correct. Ensure the statement exists in the same org/env/region. Timeout Exceeded¶ Check if the old statement is actually stopping. Verify there are no blocking conditions preventing termination. Invalid SQL Error¶ The new statement’s syntax is validated immediately upon creation. Fix SQL syntax errors before the offset carry-over process begins. Referenced Statement Savepoint Failed¶ The statement failed to be submitted because the referenced statement didn’t enter a stopped state gracefully. Data inconsistencies can occur when using offsets from failed savepoints. Try to resume the referenced statement and stop it again. If there are still issues, contact Confluent Support. Examples¶ Statement already stopped¶ You have a stopped statement named my-original-statement. 
Create a new statement with updated logic: SET 'sql.tables.initial-offset-from' = 'my-original-statement'; INSERT INTO enhanced_output SELECT user_id, event_type, `timestamp`, new_field FROM user_events WHERE event_type IN ('click', 'view', 'purchase'); Statement still running¶ Your original statement metrics-processor-v1 is still running when you submit the replacement: SET 'sql.tables.initial-offset-from' = 'metrics-processor-v1'; INSERT INTO enhanced_output SELECT user_id, event_type, `timestamp`, new_field FROM user_events WHERE event_type IN ('click', 'view', 'purchase'); The new statement remains in the “Pending” state until you stop metrics-processor-v1. Related content¶ Schema and Statement Evolution Flink SQL Shell Quick Start Flink SQL Shell
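
When a replacement is part of a repeatable rollout, it can help to name the successor explicitly so that the next replacement has a stable name to reference. The following is a minimal sketch that combines the `client.statement-name` option with the carry-over property. The name `metrics-processor-v2` is hypothetical, and the query is the same stateless filter and projection used in the examples above.

```sql
-- Sketch: replace metrics-processor-v1 with a named successor.
SET 'client.statement-name' = 'metrics-processor-v2';           -- hypothetical new name
SET 'sql.tables.initial-offset-from' = 'metrics-processor-v1';  -- statement to carry offsets over from

-- Stateless statement: each event is filtered and projected independently.
INSERT INTO enhanced_output
SELECT
    user_id,
    event_type,
    `timestamp`,  -- quoted because TIMESTAMP is a reserved keyword
    new_field
FROM user_events
WHERE event_type IN ('click', 'view', 'purchase');
```

A later rollout could then set `sql.tables.initial-offset-from` to `metrics-processor-v2` without looking up a generated statement name.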

#### Code Examples

```sql
SET 'sql.tables.initial-offset-from' = '<reference-statement-name>';
```

```json
{
   "properties": {
      "sql.tables.initial-offset-from": "<reference-statement-name>"
   }
}
```

```sql
INSERT INTO output
  SELECT * FROM table1
UNION ALL
  SELECT * FROM table2 /*+ OPTIONS('scan.startup.mode' = 'latest-offset') */;
```

```sql
SET 'sql.tables.initial-offset-from' = 'my-original-statement';
INSERT INTO enhanced_output
SELECT
    user_id,
    event_type,
    `timestamp`,
    new_field
FROM user_events
WHERE event_type IN ('click', 'view', 'purchase');
```

```sql
SET 'sql.tables.initial-offset-from' = 'metrics-processor-v1';
INSERT INTO enhanced_output
SELECT
    user_id,
    event_type,
    `timestamp`,
    new_field
FROM user_events
WHERE event_type IN ('click', 'view', 'purchase');
```

---

### Manage Flink Compute Pools in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/operate-and-deploy/create-compute-pool.html

Manage Compute Pools in Confluent Cloud for Apache Flink¶ A compute pool represents the compute resources that are used to run your SQL statements. The resources provided by a compute pool are shared among all statements that use it. It enables you to limit or guarantee resources as your use cases require. A compute pool is bound to a region. There is no cost for creating compute pools. To create a compute pool, you need the OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin RBAC role. In addition to the Cloud Console, Confluent provides these tools for creating and managing Flink compute pools: Confluent CLI Confluent Cloud REST API Confluent Terraform provider Create a compute pool¶ Confluent Cloud ConsoleConfluent CLIREST APITerraform In the navigation menu, click Environments, and click the tile for the environment where you want to use Flink SQL. In the environment details page, click Flink. In the Flink page, click Compute pools, if it’s not selected already. Click Create compute pool to open the Create compute pool page. In the Region dropdown, select the region that hosts the data you want to process with SQL, or use any region if you just want to try out Flink using sample data. Click Continue. In the Pool name textbox, enter “my-compute-pool”. In the Max CFUs dropdown, select 10. For more information, see CFUs. Note You can increase the Max CFUs value later, but decreasing Max CFUs is not supported. Click Continue, and on the Review and create page, click Finish. A tile for your compute pool appears on the Flink page. It shows the pool in the Provisioning state. It may take a few minutes for the pool to enter the Running state. Tip The tile for your compute pool provides the Confluent CLI command for using the pool from the CLI. Learn more about the CLI in the Flink SQL Shell Quick Start. Run the confluent flink compute-pool create command to create a compute pool. 
Creating a compute pool requires the following inputs: export COMPUTE_POOL_NAME=<compute-pool-name> # human-readable name, for example, "my-compute-pool" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export ENV_ID="<environment-id>" # example: "env-z3y2x1" export MAX_CFU="<max-cfu>" # example: 5 Run the following command to create a compute pool in the specified cloud provider and environment. confluent flink compute-pool create ${COMPUTE_POOL_NAME} \ --cloud ${CLOUD_PROVIDER} \ --region ${CLOUD_REGION} \ --max-cfu ${MAX_CFU} \ --environment ${ENV_ID} Your output should resemble: +-------------+-----------------+ | Current | false | | ID | lfcp-xxd6og | | Name | my-compute-pool | | Environment | env-z3y2x1 | | Current CFU | 0 | | Max CFU | 5 | | Cloud | AWS | | Region | us-east-1 | | Status | PROVISIONING | +-------------+-----------------+ Create a compute pool in your environment by sending a POST request to the Compute Pools endpoint. This request uses your Cloud API key instead of the Flink API key. Creating a compute pool requires the following inputs: export COMPUTE_POOL_NAME="<compute-pool-name>" # human readable name, for example: "my-compute-pool" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export MAX_CFU="<max-cfu>" # example: 5 export JSON_DATA="<payload-string>" The following JSON shows an example payload. The network key is optional. 
{ "spec": { "display_name": "${COMPUTE_POOL_NAME}", "cloud": "${CLOUD_PROVIDER}", "region": "${CLOUD_REGION}", "max_cfu": ${MAX_CFU}, "environment": { "id": "${ENV_ID}" }, "network": { "id": "n-00000", "environment": "string" } } } Quotation mark characters in the JSON string must be escaped, so the payload string to send resembles the following: export JSON_DATA="{ \"spec\": { \"display_name\": \"${COMPUTE_POOL_NAME}\", \"cloud\": \"${CLOUD_PROVIDER}\", \"region\": \"${CLOUD_REGION}\", \"max_cfu\": ${MAX_CFU}, \"environment\": { \"id\": \"${ENV_ID}\" } } }" The following command sends a POST request to create a compute pool. curl --request POST \ --url https://api.confluent.cloud/fcpm/v2/compute-pools \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ --header 'content-type: application/json' \ --data "${JSON_DATA}" Your output should resemble: Response from a request to create a compute pool { "api_version": "fcpm/v2", "id": "lfcp-6g7h8i", "kind": "ComputePool", "metadata": { "created_at": "2024-02-27T22:44:27.18964Z", "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1/flink-region=aws.us-east-1/compute-pool=lfcp-6g7h8i", "self": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i", "updated_at": "2024-02-27T22:44:27.18964Z" }, "spec": { "cloud": "AWS", "display_name": "my-compute-pool", "environment": { "id": "env-z3y2x1", "related": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i", "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1" }, "http_endpoint": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1", "max_cfu": 5, "region": "us-east-1" }, "status": { "current_cfu": 0, "phase": "PROVISIONING" } } To create a compute pool by using the Confluent Terraform provider, use the confluent_flink_compute_pool resource. 
Configure your Terraform file. Provide your Confluent Cloud API key and secret. terraform { required_providers { confluent = { source = "confluentinc/confluent" version = "2.44.0" } } } provider "confluent" { cloud_api_key = var.confluent_cloud_api_key # optionally use CONFLUENT_CLOUD_API_KEY env var cloud_api_secret = var.confluent_cloud_api_secret # optionally use CONFLUENT_CLOUD_API_SECRET env var } Define the environment where the compute pool will be created. resource "confluent_environment" "development" { display_name = "Development" lifecycle { prevent_destroy = true } } Define the confluent_flink_compute_pool resource with the required parameters, like display_name, cloud, region, max_cfu, and the environment ID. resource "confluent_flink_compute_pool" "main" { display_name = "standard_compute_pool" cloud = "AWS" region = "us-east-1" max_cfu = 5 environment { id = confluent_environment.development.id } } Run the terraform apply command to create the resources. terraform apply If you need to import an existing compute pool, use the terraform import command. export CONFLUENT_CLOUD_API_KEY="<cloud_api_key>" export CONFLUENT_CLOUD_API_SECRET="<cloud_api_secret>" terraform import confluent_flink_compute_pool.main <your-environment-id>/<compute-pool-id> For more information, see confluent_flink_compute_pool resource. View details for a compute pool¶ Confluent Cloud ConsoleConfluent CLIREST APITerraform In the navigation menu, click Environments, and click the tile for the environment where you use Flink SQL. In the environment details page, click Flink. In the Flink page, click Compute pools, if it’s not selected already. The available compute pools are listed as tiles, with details like Max CFUs and the cloud provider and region. If the tile for your compute pool isn’t visible, start typing in the Search pools textbox to filter the view. 
Click the tile for your compute pool to open the details page, which shows information like consumption metrics and Flink SQL statements that are associated with the compute pool. Run the confluent flink compute-pool describe command to get details about a compute pool. Describing a compute pool requires the following inputs: export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to get details about a compute pool in the specified environment. confluent flink compute-pool describe ${COMPUTE_POOL_ID} \ --environment ${ENV_ID} Your output should resemble: +-------------+-----------------+ | Current | false | | ID | lfcp-xxd6og | | Name | my-compute-pool | | Environment | env-z3y2x1 | | Current CFU | 0 | | Max CFU | 5 | | Cloud | AWS | | Region | us-east-1 | | Status | PROVISIONED | +-------------+-----------------+ Get the details about a compute pool in your environment by sending a GET request to the Compute Pools endpoint. This request uses your Cloud API key instead of the Flink API key. Getting details about a compute pool requires the following inputs: export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to get details about the compute pool specified in the COMPUTE_POOL_ID environment variable. 
curl --request GET \ --url "https://api.confluent.cloud/fcpm/v2/compute-pools/${COMPUTE_POOL_ID}?environment=${ENV_ID}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" Your output should resemble: Response from a request to read a compute pool { "api_version": "fcpm/v2", "id": "lfcp-6g7h8i", "kind": "ComputePool", "metadata": { "created_at": "2024-02-27T22:44:27.18964Z", "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1/flink-region=aws.us-east-1/compute-pool=lfcp-6g7h8i", "self": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i", "updated_at": "2024-02-27T22:44:27.18964Z" }, "spec": { "cloud": "AWS", "display_name": "my-compute-pool", "environment": { "id": "env-z3y2x1", "related": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i", "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1" }, "http_endpoint": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1", "max_cfu": 5, "region": "us-east-1" }, "status": { "current_cfu": 0, "phase": "PROVISIONED" } } To view details for a compute pool by using the Confluent Terraform provider, use the confluent_flink_compute_pool data source and the data argument. data "confluent_flink_compute_pool" "example_using_id" { id = "lfcp-abc123" environment { id = "<your-environment-id>" } } output "example_using_id" { value = data.confluent_flink_compute_pool.example_using_id } data "confluent_flink_compute_pool" "example_using_name" { display_name = "my_compute_pool" environment { id = "<your-environment-id>" } } output "example_using_name" { value = data.confluent_flink_compute_pool.example_using_name } Run the terraform apply or terraform output command. The example_using_id and example_using_name output contains details for the compute pool with the specified ID or name. 
For more information, see confluent_flink_compute_pool data source. List compute pools¶ Confluent Cloud Console | Confluent CLI | REST API | Terraform In the navigation menu, click Environments, and click the tile for the environment where you use Flink SQL. In the environment details page, click Flink. In the Flink page, click Compute pools, if it’s not selected already. The available compute pools are listed as tiles, with details like Max CFUs and the cloud provider and region. Run the confluent flink compute-pool list command to list the compute pools in the specified environment. Listing compute pools may require the following inputs, depending on the command: export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to list the compute pools in the specified environment. confluent flink compute-pool list --environment ${ENV_ID} Your output should resemble: Current | ID | Name | Environment | Current CFU | Max CFU | Cloud | Region | Status ----------+-------------+---------------------------+-------------+-------------+---------+-------+-----------+-------------- * | lfcp-xxd6og | my-compute-pool | env-z3y2x1 | 0 | 5 | AWS | us-east-1 | PROVISIONED | lfcp-8m03rm | test-blue-compute-pool | env-z3q9rd | 0 | 10 | AWS | us-east-1 | PROVISIONED ... List the compute pools in your environment by sending a GET request to the Compute Pools endpoint. This request uses your Cloud API key instead of the Flink API key. Listing the compute pools in your environment requires the following inputs: export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to list the compute pools in your environment. 
curl --request GET \ --url "https://confluent.cloud/api/fcpm/v2/compute-pools?environment=${ENV_ID}&page_size=100" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ | jq -r '.data[] | .spec.display_name, {id}' Your output should resemble: compute_pool_0 { "id": "lfcp-j123kl" } compute_pool_2 { "id": "lfcp-abc1de" } my-lfcp-01 { "id": "lfcp-l2mn3o" } ... Find your compute pool in the list and save its ID in an environment variable. export COMPUTE_POOL_ID="<your-compute-pool-id>" To list all compute pools using the Confluent Terraform provider, use the confluent_flink_compute_pool data source and the data argument. provider "confluent" { cloud_api_key = var.confluent_cloud_api_key cloud_api_secret = var.confluent_cloud_api_secret } data "confluent_flink_compute_pools" "all_pools" { environment_id = "<your-environment-id>" } output "compute_pools" { value = data.confluent_flink_compute_pools.all_pools.compute_pools } Run the terraform apply or terraform output command. The compute_pools output contains a list of all compute pools in your environment. To filter the compute pools by a specific attribute, region, availability, or name, use the filter argument within the data block. data "confluent_flink_compute_pools" "pools_in_us_east" { environment_id = "<your-environment-id>" filter = "region == '<region-id>'" } For more information, see confluent_flink_compute_pool. Update a compute pool¶ You can update the name of the compute pool, its environment, and the MAX_CFUs setting. You can increase the Max CFUs value, but decreasing Max CFUs is not supported. Confluent Cloud ConsoleConfluent CLIREST APITerraform In the navigation menu, click Environments, and click the tile for the environment where you use Flink SQL. In the environment details page, click Flink. In the Flink page, click Compute pools, if it’s not selected already. In the listed compute pools, find the one you want to update, and click the options icon (⋮). 
In the context menu, click either Edit display name or Edit max CFUs and follow the instructions in the dialog. Click the tile for your compute pool to open the details page. In the details page, you can update the compute pool’s description or add metadata tags. Also, you can manage Flink SQL statements that are associated with the compute pool. Run the confluent flink compute-pool update command to update a compute pool. Updating a compute pool may require the following inputs, depending on the command: export COMPUTE_POOL_NAME="<compute-pool-name>" # human-readable name, for example, "my-compute-pool" export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export ENV_ID="<environment-id>" # example: "env-z3y2x1" export MAX_CFU="<max-cfu>" # example: 5 Run the following command to update a compute pool in the specified environment. confluent flink compute-pool update ${COMPUTE_POOL_ID} \ --environment ${ENV_ID} \ --name ${COMPUTE_POOL_NAME} \ --max-cfu ${MAX_CFU} Your output should resemble: +-------------+----------------------+ | Current | false | | ID | lfcp-xxd6og | | Name | renamed-compute-pool | | Environment | env-z3y2x1 | | Current CFU | 0 | | Max CFU | 10 | | Cloud | AWS | | Region | us-east-1 | | Status | PROVISIONED | +-------------+----------------------+ Update a compute pool in your environment by sending a PATCH request to the Compute Pools endpoint. This request uses your Cloud API key instead of the Flink API key. Updating a compute pool requires the following inputs: export COMPUTE_POOL_NAME="<compute-pool-name>" # human-readable name, for example, "my-compute-pool" export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" export MAX_CFU="<max-cfu>" # example: 5 export JSON_DATA="<payload-string>" The following JSON shows an example payload. The network key is optional. 
{ "spec": { "display_name": "${COMPUTE_POOL_NAME}", "max_cfu": ${MAX_CFU}, "environment": { "id": "${ENV_ID}" } } } Quotation mark characters in the JSON string must be escaped, so the payload string to send resembles the following: export JSON_DATA="{ \"spec\": { \"display_name\": \"${COMPUTE_POOL_NAME}\", \"max_cfu\": ${MAX_CFU}, \"environment\": { \"id\": \"${ENV_ID}\" } } }" Run the following command to update the compute pool specified in the COMPUTE_POOL_ID environment variable. curl --request PATCH \ --url "https://api.confluent.cloud/fcpm/v2/compute-pools/${COMPUTE_POOL_ID}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ --header 'content-type: application/json' \ --data "${JSON_DATA}" To update a compute pool by using the Confluent Terraform provider, use the confluent_flink_compute_pool resource. Find the definition for the compute pool resource in your Terraform configuration, for example: resource "confluent_flink_compute_pool" "example" { cloud = "AWS" region = "us-west-2" max_cfu = 10 # other required parameters } Modify the attributes of the confluent_flink_compute_pool resource in the Terraform configuration file. The following example updates the max_cfu attribute. resource "confluent_flink_compute_pool" "example" { cloud = "AWS" region = "us-west-2" max_cfu = 20 # Updated value # other required parameters } Run the terraform apply command to update the compute pool with the new configuration. terraform apply For more information, see confluent_flink_compute_pool. Delete a compute pool¶ Confluent Cloud ConsoleConfluent CLIREST APITerraform In the navigation menu, click Environments, and click the tile for the environment where you want to use Flink SQL. In the environment details page, click Flink. In the Flink page, click Compute pools, if it’s not selected already. In the listed compute pools, find the one you want to delete, and click the options icon (⋮). 
In the context menu, click Delete compute pool, and in the dialog, enter the compute pool name to confirm deletion. Run the confluent flink compute-pool delete command to delete a compute pool. Run the following command to delete a compute pool in the specified environment. The optional --force flag skips the confirmation prompt. confluent flink compute-pool delete ${COMPUTE_POOL_ID} \ --environment ${ENV_ID} \ --force Your output should resemble: Deleted Flink compute pool "lfcp-xxd6og". Delete a compute pool in your environment by sending a DELETE request to the Compute Pools endpoint. This request uses your Cloud API key instead of the Flink API key. Deleting a compute pool requires the following inputs: export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to delete the compute pool specified in the COMPUTE_POOL_ID environment variable. curl --request DELETE \ --url "https://api.confluent.cloud/fcpm/v2/compute-pools/${COMPUTE_POOL_ID}?environment=${ENV_ID}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" To delete a compute pool by using the Confluent Terraform provider, use the confluent_flink_compute_pool resource. Define the compute pool resource in your Terraform configuration file, for example: resource "confluent_flink_compute_pool" "main" { display_name = "standard_compute_pool" cloud = "AWS" region = "us-east-1" max_cfu = 5 environment { id = "<your-environment-id>" } } To avoid accidental deletions, review the plan before applying the destroy command. terraform plan -destroy -target=confluent_flink_compute_pool.main To delete the compute pool, run the following command to target the specific resource. 
This command deletes only the compute pool and not other resources. terraform apply -destroy -target=confluent_flink_compute_pool.main To remove all resources defined in your Terraform configuration file, including the compute pool, run the terraform destroy command. terraform destroy For more information, see confluent_flink_compute_pool. Related content¶ Flink Compute Pools Billing on Confluent Cloud for Apache Flink

#### Code Examples

```sql
export COMPUTE_POOL_NAME="<compute-pool-name>" # human-readable name, for example, "my-compute-pool"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export MAX_CFU="<max-cfu>" # example: 5
```

```sql
confluent flink compute-pool create ${COMPUTE_POOL_NAME} \
  --cloud ${CLOUD_PROVIDER} \
  --region ${CLOUD_REGION} \
  --max-cfu ${MAX_CFU} \
  --environment ${ENV_ID}
```

```sql
+-------------+-----------------+
| Current     | false           |
| ID          | lfcp-xxd6og     |
| Name        | my-compute-pool |
| Environment | env-z3y2x1      |
| Current CFU | 0               |
| Max CFU     | 5               |
| Cloud       | AWS             |
| Region      | us-east-1       |
| Status      | PROVISIONING    |
+-------------+-----------------+
```

```sql
export COMPUTE_POOL_NAME="<compute-pool-name>" # human readable name, for example: "my-compute-pool"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export MAX_CFU="<max-cfu>" # example: 5
export JSON_DATA="<payload-string>"
```
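The `-w 0` option used above to build `BASE64_CLOUD_KEY_AND_SECRET` is a GNU coreutils option and may not be accepted by other `base64` implementations. As a portable alternative, the following Python sketch produces the same header value from the key and secret. The function name `basic_auth_header` is illustrative and not part of any Confluent tooling.

```python
import base64


def basic_auth_header(api_key: str, api_secret: str) -> str:
    """Return the value for an HTTP Basic Authorization header."""
    # Join key and secret with a colon, then base64-encode the result.
    # b64encode never inserts line breaks, so no equivalent of -w 0 is needed.
    token = base64.b64encode(f"{api_key}:{api_secret}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


if __name__ == "__main__":
    # Placeholder credentials; substitute your Cloud API key and secret.
    print(basic_auth_header("<cloud-api-key>", "<cloud-api-secret>"))
```

The returned string can be passed as the value of the `Authorization` header in place of `Basic ${BASE64_CLOUD_KEY_AND_SECRET}`.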

```sql
{
  "spec": {
    "display_name": "${COMPUTE_POOL_NAME}",
    "cloud": "${CLOUD_PROVIDER}",
    "region": "${CLOUD_REGION}",
    "max_cfu": ${MAX_CFU},
    "environment": {
      "id": "${ENV_ID}"
    },
    "network": {
      "id": "n-00000",
      "environment": "string"
    }
  }
}
```

```sql
export JSON_DATA="{
  \"spec\": {
    \"display_name\": \"${COMPUTE_POOL_NAME}\",
    \"cloud\": \"${CLOUD_PROVIDER}\",
    \"region\": \"${CLOUD_REGION}\",
    \"max_cfu\": ${MAX_CFU},
    \"environment\": {
      \"id\": \"${ENV_ID}\"
    }
  }
}"
```
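Escaping every quotation mark by hand, as in the string above, is easy to get wrong. A minimal Python sketch, assuming the same `spec` fields as the example payload, lets a JSON serializer handle the quoting. The helper name `build_create_payload` is hypothetical.

```python
import json


def build_create_payload(name, cloud, region, max_cfu, env_id):
    """Build the request body for creating a compute pool."""
    return {
        "spec": {
            "display_name": name,
            "cloud": cloud,
            "region": region,
            # max_cfu is unquoted in the example payload, so send it as a number.
            "max_cfu": int(max_cfu),
            "environment": {"id": env_id},
        }
    }


if __name__ == "__main__":
    payload = build_create_payload(
        "my-compute-pool", "aws", "us-east-1", 5, "env-z3y2x1"
    )
    # json.dumps takes care of all quoting and escaping.
    print(json.dumps(payload, indent=2))
```

The printed JSON can be saved to a file and sent with `curl --data @<file>` instead of exporting an escaped `JSON_DATA` string.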

```sql
curl --request POST \
  --url https://api.confluent.cloud/fcpm/v2/compute-pools \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
  --header 'content-type: application/json' \
  --data "${JSON_DATA}"
```

```sql
{
    "api_version": "fcpm/v2",
    "id": "lfcp-6g7h8i",
    "kind": "ComputePool",
    "metadata": {
        "created_at": "2024-02-27T22:44:27.18964Z",
        "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1/flink-region=aws.us-east-1/compute-pool=lfcp-6g7h8i",
        "self": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i",
        "updated_at": "2024-02-27T22:44:27.18964Z"
    },
    "spec": {
        "cloud": "AWS",
        "display_name": "my-compute-pool",
        "environment": {
            "id": "env-z3y2x1",
            "related": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i",
            "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1"
        },
        "http_endpoint": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1",
        "max_cfu": 5,
        "region": "us-east-1"
    },
    "status": {
        "current_cfu": 0,
        "phase": "PROVISIONING"
    }
}
```
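The create response above reports `"phase": "PROVISIONING"`, while the later read examples show `PROVISIONED`, so a script may need to wait before using the pool. The following Python sketch polls until the phase changes. It assumes only the two phase values shown in this topic, and the `fetch` callable is a hypothetical stand-in for whatever issues the GET request and returns the decoded JSON.

```python
import time
from typing import Callable


def wait_until_provisioned(
    fetch: Callable[[], dict],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until status.phase is PROVISIONED or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        pool = fetch()
        phase = pool.get("status", {}).get("phase")
        if phase == "PROVISIONED":
            return pool
        if time.monotonic() >= deadline:
            raise TimeoutError(f"compute pool still in phase {phase!r}")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated responses standing in for repeated GET requests.
    responses = iter([
        {"id": "lfcp-6g7h8i", "status": {"current_cfu": 0, "phase": "PROVISIONING"}},
        {"id": "lfcp-6g7h8i", "status": {"current_cfu": 0, "phase": "PROVISIONED"}},
    ])
    pool = wait_until_provisioned(lambda: next(responses), sleep=lambda _: None)
    print(pool["status"]["phase"])
```

Passing `sleep` as a parameter keeps the loop testable without real delays. The timeout and interval defaults are arbitrary choices, not documented values.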

```sql
terraform {
  required_providers {
    confluent = {
      source = "confluentinc/confluent"
      version = "2.44.0"
    }
  }
}

provider "confluent" {
  cloud_api_key    = var.confluent_cloud_api_key    # optionally use CONFLUENT_CLOUD_API_KEY env var
  cloud_api_secret = var.confluent_cloud_api_secret # optionally use CONFLUENT_CLOUD_API_SECRET env var
}
```

```sql
resource "confluent_environment" "development" {
  display_name = "Development"
  lifecycle {
    prevent_destroy = true
  }
}
```

```sql
resource "confluent_flink_compute_pool" "main" {
  display_name = "standard_compute_pool"
  cloud        = "AWS"
  region       = "us-east-1"
  max_cfu      = 5

  environment {
    id = confluent_environment.development.id
  }
}
```

```sql
terraform apply
```

```sql
export CONFLUENT_CLOUD_API_KEY="<cloud_api_key>"
export CONFLUENT_CLOUD_API_SECRET="<cloud_api_secret>"
terraform import confluent_flink_compute_pool.main <your-environment-id>/<compute-pool-id>
```

```sql
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
confluent flink compute-pool describe ${COMPUTE_POOL_ID} \
  --environment ${ENV_ID}
```

```sql
+-------------+-----------------+
| Current     | false           |
| ID          | lfcp-xxd6og     |
| Name        | my-compute-pool |
| Environment | env-z3y2x1      |
| Current CFU | 0               |
| Max CFU     | 5               |
| Cloud       | AWS             |
| Region      | us-east-1       |
| Status      | PROVISIONED     |
+-------------+-----------------+
```

```sql
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
curl --request GET \
  --url "https://api.confluent.cloud/fcpm/v2/compute-pools/${COMPUTE_POOL_ID}?environment=${ENV_ID}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```

```sql
{
    "api_version": "fcpm/v2",
    "id": "lfcp-6g7h8i",
    "kind": "ComputePool",
    "metadata": {
        "created_at": "2024-02-27T22:44:27.18964Z",
        "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1/flink-region=aws.us-east-1/compute-pool=lfcp-6g7h8i",
        "self": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i",
        "updated_at": "2024-02-27T22:44:27.18964Z"
    },
    "spec": {
        "cloud": "AWS",
        "display_name": "my-compute-pool",
        "environment": {
            "id": "env-z3y2x1",
            "related": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i",
            "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1"
        },
        "http_endpoint": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1",
        "max_cfu": 5,
        "region": "us-east-1"
    },
    "status": {
        "current_cfu": 0,
        "phase": "PROVISIONED"
    }
}
```
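The REST response above nests the values that the CLI `describe` output shows as a flat table. As an illustration, the following Python sketch flattens a response of that shape into one dictionary. The function name `summarize_pool` and the derived `free_cfu` field are assumptions added for this example and are not returned by the API.

```python
def summarize_pool(pool: dict) -> dict:
    """Flatten a compute pool response into CLI-style fields."""
    spec = pool["spec"]
    status = pool["status"]
    return {
        "id": pool["id"],
        "name": spec["display_name"],
        "environment": spec["environment"]["id"],
        "cloud": spec["cloud"],
        "region": spec["region"],
        "current_cfu": status["current_cfu"],
        "max_cfu": spec["max_cfu"],
        # Derived value: CFUs still available below the configured maximum.
        "free_cfu": spec["max_cfu"] - status["current_cfu"],
        "status": status["phase"],
    }


if __name__ == "__main__":
    # Abbreviated copy of the example response above.
    response = {
        "id": "lfcp-6g7h8i",
        "spec": {
            "cloud": "AWS",
            "display_name": "my-compute-pool",
            "environment": {"id": "env-z3y2x1"},
            "max_cfu": 5,
            "region": "us-east-1",
        },
        "status": {"current_cfu": 0, "phase": "PROVISIONED"},
    }
    for key, value in summarize_pool(response).items():
        print(f"{key}: {value}")
```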

```sql
data "confluent_flink_compute_pool" "example_using_id" {
  id = "lfcp-abc123"
  environment {
    id = "<your-environment-id>"
  }
}

output "example_using_id" {
  value = data.confluent_flink_compute_pool.example_using_id
}

data "confluent_flink_compute_pool" "example_using_name" {
  display_name = "my_compute_pool"
  environment {
    id = "<your-environment-id>"
  }
}

output "example_using_name" {
  value = data.confluent_flink_compute_pool.example_using_name
}
```

```sql
terraform apply
```

```sql
terraform output
```

```sql
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
confluent flink compute-pool list --environment ${ENV_ID}
```

```sql
Current |     ID      |           Name            | Environment | Current CFU | Max CFU | Cloud |  Region   |   Status
----------+-------------+---------------------------+-------------+-------------+---------+-------+-----------+--------------
  *       | lfcp-xxd6og | my-compute-pool           | env-z3y2x1  |           0 |       5 | AWS   | us-east-1 | PROVISIONED
          | lfcp-8m03rm | test-blue-compute-pool    | env-z3q9rd  |           0 |      10 | AWS   | us-east-1 | PROVISIONED
...
```

```sql
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
curl --request GET \
     --url "https://confluent.cloud/api/fcpm/v2/compute-pools?environment=${ENV_ID}&page_size=100" \
     --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
     | jq -r '.data[] | .spec.display_name, {id}'
```

```sql
compute_pool_0
{
  "id": "lfcp-j123kl"
}
compute_pool_2
{
  "id": "lfcp-abc1de"
}
my-lfcp-01
{
  "id": "lfcp-l2mn3o"
}
...
```
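Where `jq` is not available, the same name-to-ID listing can be derived in a few lines of Python. This sketch assumes the list response has the shape implied by the `jq` filter above (a top-level `data` array whose items carry `id` and `spec.display_name`). The function name `pool_ids_by_name` is illustrative.

```python
import json


def pool_ids_by_name(list_response: dict) -> dict:
    """Map each pool's display name to its ID."""
    # If two pools share a display name, the later one wins in this mapping.
    return {
        pool["spec"]["display_name"]: pool["id"]
        for pool in list_response.get("data", [])
    }


if __name__ == "__main__":
    # Minimal stand-in for the body returned by the list request.
    body = json.loads("""
    {"data": [
      {"id": "lfcp-j123kl", "spec": {"display_name": "compute_pool_0"}},
      {"id": "lfcp-abc1de", "spec": {"display_name": "compute_pool_2"}},
      {"id": "lfcp-l2mn3o", "spec": {"display_name": "my-lfcp-01"}}
    ]}
    """)
    for name, pool_id in pool_ids_by_name(body).items():
        print(name, pool_id)
```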

```sql
export COMPUTE_POOL_ID="<your-compute-pool-id>"
```

```sql
provider "confluent" {
  cloud_api_key    = var.confluent_cloud_api_key
  cloud_api_secret = var.confluent_cloud_api_secret
}

data "confluent_flink_compute_pools" "all_pools" {
  environment_id = "<your-environment-id>"
}

output "compute_pools" {
  value = data.confluent_flink_compute_pools.all_pools.compute_pools
}
```

```sql
terraform apply
```

```sql
terraform output
```

```sql
data "confluent_flink_compute_pools" "pools_in_us_east" {
  environment_id = "<your-environment-id>"
  filter = "region == '<region-id>'"
}
```

```sql
export COMPUTE_POOL_NAME="<compute-pool-name>" # human-readable name, for example, "my-compute-pool"
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export MAX_CFU="<max-cfu>" # example: 5
```

```sql
confluent flink compute-pool update ${COMPUTE_POOL_ID} \
  --environment ${ENV_ID} \
  --name ${COMPUTE_POOL_NAME} \
  --max-cfu ${MAX_CFU}
```

```sql
+-------------+----------------------+
| Current     | false                |
| ID          | lfcp-xxd6og          |
| Name        | renamed-compute-pool |
| Environment | env-z3y2x1           |
| Current CFU | 0                    |
| Max CFU     | 10                   |
| Cloud       | AWS                  |
| Region      | us-east-1            |
| Status      | PROVISIONED          |
+-------------+----------------------+
```

```sql
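export COMPUTE_POOL_NAME="<compute-pool-name>" # human-readable name, for example, "my-compute-pool"
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"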
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export MAX_CFU="<max-cfu>" # example: 5
export JSON_DATA="<payload-string>"
```

```sql
{
  "spec": {
    "display_name": "${COMPUTE_POOL_NAME}",
    "max_cfu": ${MAX_CFU},
    "environment": {
      "id": "${ENV_ID}"
    }
  }
}
```

```sql
export JSON_DATA="{
  \"spec\": {
    \"display_name\": \"${COMPUTE_POOL_NAME}\",
    \"max_cfu\": ${MAX_CFU},
    \"environment\": {
      \"id\": \"${ENV_ID}\"
    }
  }
}"
```
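Because this topic states that Max CFUs can be increased but not decreased, a client-side check before sending the PATCH request can catch a mistaken value early. The following Python sketch builds the update payload and rejects a lower value. The helper name `build_update_payload` and its `current_max_cfu` parameter are hypothetical, and the check mirrors only the rule stated in this topic, not any validation the API itself performs.

```python
import json


def build_update_payload(name, new_max_cfu, env_id, current_max_cfu):
    """Build the PATCH body, refusing to lower max_cfu."""
    if new_max_cfu < current_max_cfu:
        raise ValueError(
            f"max_cfu cannot be decreased ({current_max_cfu} -> {new_max_cfu})"
        )
    return {
        "spec": {
            "display_name": name,
            "max_cfu": new_max_cfu,
            "environment": {"id": env_id},
        }
    }


if __name__ == "__main__":
    payload = build_update_payload(
        "renamed-compute-pool", 10, "env-z3y2x1", current_max_cfu=5
    )
    print(json.dumps(payload))
```

The current value can be read from `spec.max_cfu` in the response to a GET request for the same compute pool.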

```sql
curl --request PATCH \
  --url "https://api.confluent.cloud/fcpm/v2/compute-pools/${COMPUTE_POOL_ID}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
  --header 'content-type: application/json' \
  --data "${JSON_DATA}"
```

```sql
resource "confluent_flink_compute_pool" "example" {
  cloud  = "AWS"
  region = "us-west-2"
  max_cfu = 10
  # other required parameters
}
```

```sql
resource "confluent_flink_compute_pool" "example" {
  cloud  = "AWS"
  region = "us-west-2"
  max_cfu = 20 # Updated value
  # other required parameters
}
```

```sql
terraform apply
```

```sql
confluent flink compute-pool delete ${COMPUTE_POOL_ID} \
  --environment ${ENV_ID} \
  --force
```

```sql
Deleted Flink compute pool "lfcp-xxd6og".
```

```sql
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
curl --request DELETE \
  --url "https://api.confluent.cloud/fcpm/v2/compute-pools/${COMPUTE_POOL_ID}?environment=${ENV_ID}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```
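For scripts that avoid shelling out to `curl`, the same DELETE request can be assembled with the Python standard library. This is a sketch under the assumption that the endpoint and query parameter are exactly those in the command above. It only builds the request object and does not send it, and the function name `build_delete_request` is illustrative.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.confluent.cloud/fcpm/v2/compute-pools"


def build_delete_request(
    pool_id: str, env_id: str, auth_header: str
) -> urllib.request.Request:
    """Build, but do not send, the DELETE request for a compute pool."""
    # Encode the path segment and query string rather than concatenating raw text.
    query = urllib.parse.urlencode({"environment": env_id})
    url = f"{API_BASE}/{urllib.parse.quote(pool_id, safe='')}?{query}"
    return urllib.request.Request(
        url, method="DELETE", headers={"Authorization": auth_header}
    )


if __name__ == "__main__":
    req = build_delete_request("lfcp-8m03rm", "env-z3y2x1", "Basic <base64-value>")
    print(req.get_method(), req.full_url)
    # Passing req to urllib.request.urlopen() would perform the deletion.
```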

```sql
resource "confluent_flink_compute_pool" "main" {
  display_name = "standard_compute_pool"
  cloud        = "AWS"
  region       = "us-east-1"
  max_cfu      = 5
  environment {
    id = "<your-environment-id>"
  }
}
```

```sql
terraform plan -destroy -target=confluent_flink_compute_pool.main
```

```sql
terraform apply -destroy -target=confluent_flink_compute_pool.main
```

```sql
terraform destroy
```

---

### Deploy a Flink SQL Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/operate-and-deploy/deploy-flink-sql-statement.html

Deploy a Flink SQL Statement Using CI/CD and Confluent Cloud for Apache Flink¶ GitHub Actions is a powerful feature on GitHub that enables automating your software development workflows. If your source code is stored in a GitHub repository, you can easily create a custom workflow in GitHub Actions to build, test, package, release, or deploy any code project. This topic shows how to create a CI/CD workflow that deploys an Apache Flink® SQL statement programmatically on Confluent Cloud for Apache Flink by using Hashicorp Terraform and GitHub Actions. With the steps in this topic, you can streamline your development process. In this walkthrough, you perform the following steps: Step 1: Set up a Terraform Cloud workspace Step 2: Set up a repository and secrets in GitHub Step 3. Create a CI/CD workflow in GitHub Actions Step 4. Deploy resources in Confluent Cloud Step 5. Deploy a Flink SQL statement Prerequisites¶ You need the following prerequisites to complete this tutorial: Access to Confluent Cloud A GitHub account to set up a repository and create the CI/CD workflow A Terraform Cloud account Step 1: Set up a Terraform Cloud workspace¶ You need a Terraform Cloud account to follow this tutorial. If you don’t have one yet, create an account for free at Terraform Cloud. With a Terraform Cloud account, you can manage your infrastructure-as-code and collaborate with your team. Create a workspace¶ If you have created a new Terraform Cloud account and the Getting Started page is displayed, click Create a new organization, and in the Organization name textbox, enter “flink_ccloud”. Click Create organization. Otherwise, from the Terraform Cloud homepage, click New to create a new workspace. In the Create a new workspace page, click the API-Driven Workflow tile, and in the Workspace name textbox, enter “cicd_flink_ccloud”. Click Create to create the workspace. 
Create a Terraform Cloud API token¶ By creating an API token, you can authenticate securely with Terraform Cloud and integrate it with GitHub Actions. Save the token in a secure location, and don’t share it with anyone. At the top of the navigation menu, click your user icon and select User settings. In the navigation menu, click Tokens, and in the Tokens page, click Create an API token. Give your token a meaningful description, like “github_actions”, and click Generate token. Your token appears in the Tokens list. Save the API token in a secure location. It won’t be displayed again. Step 2: Set up a repository and secrets in GitHub¶ To create an Action Secret in GitHub for securely storing the API token from Terraform Cloud, follow these steps. Log in to your GitHub account and create a new repository. In the Create a new repository page, use the Owner dropdown to choose an owner, and give the repository a unique name, like “<your-name-flink-ccloud>”. Click Create. In the repository details page, click Settings. In the navigation menu, click Secrets and variables, and in the context menu, select Actions to open the Actions secrets and variables page. Click New repository secret. In the New secret page, enter the following settings. In the Name textbox, enter “TF_API_TOKEN”. In the Secret textbox, enter the API token value that you saved from the previous Terraform Cloud step. Click Add secret to save the Action Secret. By creating an Action Secret for the API token, you can use it securely in your CI/CD pipelines, such as in GitHub Actions. Keep the secret safe, and don’t share it with anyone who shouldn’t have access to it. Step 3. Create a CI/CD workflow in GitHub Actions¶ The following steps show how to create an Action Workflow for automating the deployment of a Flink SQL statement on Confluent Cloud using Terraform. In the toolbar at the top of the screen, click Actions. The Get started with GitHub Actions page opens. Click set up a workflow yourself ->. 
If you already have a workflow defined, click new workflow, and then click set up a workflow yourself ->. Copy the following YAML into the editor. This YAML file defines a workflow that runs when changes are pushed to the main branch of your repository. It includes a job named “terraform_flink_ccloud_tutorial” that runs on the latest version of Ubuntu. The job includes these steps: Check out the code Set up Terraform Log in to Terraform Cloud using the API token stored in the Action Secret Initialize Terraform Apply the Terraform configuration to deploy changes to your Confluent Cloud account on: push: branches: - main jobs: terraform_flink_ccloud_tutorial: name: "terraform_flink_ccloud_tutorial" runs-on: ubuntu-latest steps: - name: Checkout uses: actions/checkout@v4 - name: Setup Terraform uses: hashicorp/setup-terraform@v3 with: cli_config_credentials_token: ${{ secrets.TF_API_TOKEN }} - name: Terraform Init id: init run: terraform init - name: Terraform Validate id: validate run: terraform validate -no-color - name: Terraform Plan id: plan run: terraform plan env: TF_VAR_confluent_cloud_api_key: ${{ secrets.CONFLUENT_CLOUD_API_KEY }} TF_VAR_confluent_cloud_api_secret: ${{ secrets.CONFLUENT_CLOUD_API_SECRET }} - name: Terraform Apply id: apply run: terraform apply -auto-approve env: TF_VAR_confluent_cloud_api_key: ${{ secrets.CONFLUENT_CLOUD_API_KEY }} TF_VAR_confluent_cloud_api_secret: ${{ secrets.CONFLUENT_CLOUD_API_SECRET }} Click Commit changes, and in the dialog, enter a description in the Extended description textbox, for example, “CI/CD workflow to automate deployment on Confluent Cloud”. Click Commit changes. The file main.yml is created in the .github/workflows directory in your repository. With this Action Workflow, your deployment of Flink SQL statements on Confluent Cloud is now automatic. Step 4. 
Deploy resources in Confluent Cloud¶ In this section, you programmatically deploy a Flink SQL statement to Confluent Cloud that runs continuously until it is stopped manually. In VS Code or another IDE, clone your repository and create a new file in the root named “main.tf” with the following code. Replace the organization and workspace names with your Terraform Cloud organization name and workspace name from Step 1. terraform { cloud { organization = "<your-terraform-org-name>" workspaces { name = "cicd_flink_ccloud" } } required_providers { confluent = { source = "confluentinc/confluent" version = "2.2.0" } } } Commit and push the changes to the repository. The CI/CD workflow that you created previously runs automatically. Verify that it’s running by navigating to the Actions section in your repository and clicking on the latest workflow run. Create a Confluent Cloud API key¶ To access Confluent Cloud securely, you must have a Confluent Cloud API key. After you generate an API key, you store it securely in your GitHub repository’s Secrets and variables page, the same way that you stored the Terraform API token. Follow the instructions here to create a new API key for Confluent Cloud, and on the https://confluent.cloud/settings/api-keys page, select the Cloud resource management tile for the API key’s resource scope. You will use this API key to communicate securely with Confluent Cloud. Return to the Settings page for your GitHub repository, and in the navigation menu, click Secrets and variables. In the context menu, select Actions to open the Actions secrets and variables page. Click New repository secret. In the New secret page, enter the following settings. In the Name textbox, enter “CONFLUENT_CLOUD_API_KEY”. In the Secret textbox, enter the Cloud API key. Click Add secret to save the Cloud API key as an Action Secret. Click New repository secret and repeat the previous steps for the Cloud API secret. Name the secret “CONFLUENT_CLOUD_API_SECRET”. 
Your Repository secrets list should resemble the following: Deploy resources¶ In this section, you add resources to your Terraform configuration file and provision them when the GitHub Action runs. In your repository, create a new file named “variables.tf” with the following code. variable "confluent_cloud_api_key" { description = "Confluent Cloud API Key" type = string } variable "confluent_cloud_api_secret" { description = "Confluent Cloud API Secret" type = string sensitive = true } In the “main.tf” file, add the following code. This code references the Cloud API key and secret you added in the previous steps and creates a new environment and Kafka cluster for your organization. Optionally, you can choose to use an existing environment. locals { cloud = "AWS" region = "us-east-2" } provider "confluent" { cloud_api_key = var.confluent_cloud_api_key cloud_api_secret = var.confluent_cloud_api_secret } # Create a new environment. resource "confluent_environment" "my_env" { display_name = "my_env" stream_governance { package = "ESSENTIALS" } } # Create a new Kafka cluster. resource "confluent_kafka_cluster" "my_kafka_cluster" { display_name = "my_kafka_cluster" availability = "SINGLE_ZONE" cloud = local.cloud region = local.region basic {} environment { id = confluent_environment.my_env.id } depends_on = [ confluent_environment.my_env ] } # Look up the Schema Registry cluster provided by the environment's Stream Governance Essentials package. data "confluent_schema_registry_cluster" "my_sr_cluster" { environment { id = confluent_environment.my_env.id } } Create a Service Account and provide a role binding by adding the following code to “main.tf”. The role binding gives the Service Account the necessary permissions to create topics, Flink statements, and other resources. In production, you may want to assign a less privileged role than OrganizationAdmin. # Create a new Service Account. This will be used during Kafka API key creation and Flink SQL statement submission. 
resource "confluent_service_account" "my_service_account" { display_name = "my_service_account" } data "confluent_organization" "my_org" {} # Assign the OrganizationAdmin role binding to the above Service Account. # This will give the Service Account the necessary permissions to create topics, Flink statements, etc. # In production, you may want to assign a less privileged role. resource "confluent_role_binding" "my_org_admin_role_binding" { principal = "User:${confluent_service_account.my_service_account.id}" role_name = "OrganizationAdmin" crn_pattern = data.confluent_organization.my_org.resource_name depends_on = [ confluent_service_account.my_service_account ] } Push all changes to your repository and check the Actions page to ensure the workflow runs successfully. At this point, you should have a new environment, an Apache Kafka® cluster, and a Stream Governance package provisioned in your Confluent Cloud organization. Step 5. Deploy a Flink SQL statement¶ To use Flink, you must create a Flink compute pool. A compute pool represents a set of compute resources that are bound to a region and are used to run your Flink SQL statements. For more information, see Compute Pools. Create a new compute pool by adding the following code to “main.tf”. # Create a Flink compute pool to execute a Flink SQL statement. resource "confluent_flink_compute_pool" "my_compute_pool" { display_name = "my_compute_pool" cloud = local.cloud region = local.region max_cfu = 10 environment { id = confluent_environment.my_env.id } depends_on = [ confluent_environment.my_env ] } Create a Flink-specific API key, which is required for submitting statements to Confluent Cloud, by adding the following code to “main.tf”. # Create a Flink-specific API key that will be used to submit statements. 
data "confluent_flink_region" "my_flink_region" { cloud = local.cloud region = local.region } resource "confluent_api_key" "my_flink_api_key" { display_name = "my_flink_api_key" owner { id = confluent_service_account.my_service_account.id api_version = confluent_service_account.my_service_account.api_version kind = confluent_service_account.my_service_account.kind } managed_resource { id = data.confluent_flink_region.my_flink_region.id api_version = data.confluent_flink_region.my_flink_region.api_version kind = data.confluent_flink_region.my_flink_region.kind environment { id = confluent_environment.my_env.id } } depends_on = [ confluent_environment.my_env, confluent_service_account.my_service_account ] } Deploy a Flink SQL statement on Confluent Cloud by adding the following code to “main.tf”. The statement reads from examples.marketplace.orders, sums price and counts orders in 1-minute tumbling windows, and writes the results to a new table named my_sink_topic. Because you’re using a Service Account, the statement runs in Confluent Cloud continuously until manually stopped. # Deploy a Flink SQL statement to Confluent Cloud. resource "confluent_flink_statement" "my_flink_statement" { organization { id = data.confluent_organization.my_org.id } environment { id = confluent_environment.my_env.id } compute_pool { id = confluent_flink_compute_pool.my_compute_pool.id } principal { id = confluent_service_account.my_service_account.id } # This SQL reads from examples.marketplace.orders, aggregates revenue and order count per 1-minute window, and writes the results to my_sink_topic. 
statement = <<EOT CREATE TABLE my_sink_topic AS SELECT window_start, window_end, SUM(price) AS total_revenue, COUNT(*) AS cnt FROM TABLE(TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '1' MINUTE)) GROUP BY window_start, window_end; EOT properties = { "sql.current-catalog" = confluent_environment.my_env.display_name "sql.current-database" = confluent_kafka_cluster.my_kafka_cluster.display_name } rest_endpoint = data.confluent_flink_region.my_flink_region.rest_endpoint credentials { key = confluent_api_key.my_flink_api_key.id secret = confluent_api_key.my_flink_api_key.secret } depends_on = [ confluent_api_key.my_flink_api_key, confluent_flink_compute_pool.my_compute_pool, confluent_kafka_cluster.my_kafka_cluster ] } Push all changes to your repository and check the Actions page to ensure the workflow runs successfully. In Confluent Cloud Console, verify that the statement has been deployed and that my_sink_topic is receiving the data. You now have a fully functioning CI/CD pipeline with Confluent Cloud and Terraform. This pipeline automates the deployment and management of your infrastructure, making it more efficient and scalable. Related content¶ Get Started with Confluent Cloud for Apache Flink Compute Pools

#### Code Examples

```yaml
on:
  push:
    branches:
      - main

jobs:
  terraform_flink_ccloud_tutorial:
    name: "terraform_flink_ccloud_tutorial"
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          cli_config_credentials_token: ${{ secrets.TF_API_TOKEN }}

      - name: Terraform Init
        id: init
        run: terraform init

      - name: Terraform Validate
        id: validate
        run: terraform validate -no-color

      - name: Terraform Plan
        id: plan
        run: terraform plan
        env:
          TF_VAR_confluent_cloud_api_key: ${{ secrets.CONFLUENT_CLOUD_API_KEY }}
          TF_VAR_confluent_cloud_api_secret: ${{ secrets.CONFLUENT_CLOUD_API_SECRET }}

      - name: Terraform Apply
        id: apply
        run: terraform apply -auto-approve
        env:
          TF_VAR_confluent_cloud_api_key: ${{ secrets.CONFLUENT_CLOUD_API_KEY }}
          TF_VAR_confluent_cloud_api_secret: ${{ secrets.CONFLUENT_CLOUD_API_SECRET }}
```

```text
.github/workflows
```

```hcl
terraform {
  cloud {
    organization = "<your-terraform-org-name>"

    workspaces {
      name = "cicd_flink_ccloud"
    }
  }

  required_providers {
    confluent = {
      source  = "confluentinc/confluent"
      version = "2.2.0"
    }
  }
}
```

```hcl
variable "confluent_cloud_api_key" {
  description = "Confluent Cloud API Key"
  type        = string
}

variable "confluent_cloud_api_secret" {
  description = "Confluent Cloud API Secret"
  type        = string
  sensitive   = true
}
```
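
Terraform fills each declared input variable from an environment variable named `TF_VAR_<variable name>`, which is how the workflow's `env` entries reach the two variables above. For local runs outside GitHub Actions, a pre-flight check along these lines may help catch a missing credential before `terraform plan` prompts for it. This is a minimal sketch: `require_tf_vars` is a hypothetical helper name, not part of Terraform or the Confluent provider.

```shell
# Hypothetical helper (not part of Terraform): fail early when a Terraform
# input variable has no matching TF_VAR_<name> environment variable.
# Args: one or more Terraform variable names.
require_tf_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of TF_VAR_<name> that works in POSIX sh.
    eval "value=\${TF_VAR_${name}:-}"
    if [ -z "$value" ]; then
      echo "missing TF_VAR_${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_tf_vars confluent_cloud_api_key confluent_cloud_api_secret && terraform plan` runs the plan only when both variables are exported.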

```hcl
locals {
  cloud  = "AWS"
  region = "us-east-2"
}

provider "confluent" {
  cloud_api_key    = var.confluent_cloud_api_key
  cloud_api_secret = var.confluent_cloud_api_secret
}

# Create a new environment.
resource "confluent_environment" "my_env" {
  display_name = "my_env"

  stream_governance {
    package = "ESSENTIALS"
  }
}

# Create a new Kafka cluster.
resource "confluent_kafka_cluster" "my_kafka_cluster" {
  display_name = "my_kafka_cluster"
  availability = "SINGLE_ZONE"
  cloud        = local.cloud
  region       = local.region
  basic {}

  environment {
    id = confluent_environment.my_env.id
  }

  depends_on = [
    confluent_environment.my_env
  ]
}

# Look up the Schema Registry cluster that the Stream Governance Essentials package provides for the environment.
data "confluent_schema_registry_cluster" "my_sr_cluster" {
  environment {
    id = confluent_environment.my_env.id
  }
}
```

```hcl
# Create a new Service Account. This will be used during Kafka API key creation and Flink SQL statement submission.
resource "confluent_service_account" "my_service_account" {
  display_name = "my_service_account"
}

data "confluent_organization" "my_org" {}

# Assign the OrganizationAdmin role binding to the above Service Account.
# This will give the Service Account the necessary permissions to create topics, Flink statements, etc.
# In production, you may want to assign a less privileged role.
resource "confluent_role_binding" "my_org_admin_role_binding" {
  principal   = "User:${confluent_service_account.my_service_account.id}"
  role_name   = "OrganizationAdmin"
  crn_pattern = data.confluent_organization.my_org.resource_name

  depends_on = [
    confluent_service_account.my_service_account
  ]
}
```

```hcl
# Create a Flink compute pool to execute a Flink SQL statement.
resource "confluent_flink_compute_pool" "my_compute_pool" {
  display_name = "my_compute_pool"
  cloud        = local.cloud
  region       = local.region
  max_cfu      = 10

  environment {
    id = confluent_environment.my_env.id
  }

  depends_on = [
    confluent_environment.my_env
  ]
}
```

```hcl
# Create a Flink-specific API key that will be used to submit statements.
data "confluent_flink_region" "my_flink_region" {
  cloud  = local.cloud
  region = local.region
}

resource "confluent_api_key" "my_flink_api_key" {
  display_name = "my_flink_api_key"

  owner {
    id          = confluent_service_account.my_service_account.id
    api_version = confluent_service_account.my_service_account.api_version
    kind        = confluent_service_account.my_service_account.kind
  }

  managed_resource {
    id          = data.confluent_flink_region.my_flink_region.id
    api_version = data.confluent_flink_region.my_flink_region.api_version
    kind        = data.confluent_flink_region.my_flink_region.kind

    environment {
      id = confluent_environment.my_env.id
    }
  }

  depends_on = [
    confluent_environment.my_env,
    confluent_service_account.my_service_account
  ]
}
```

```sql
examples.marketplace.orders
```

```hcl
# Deploy a Flink SQL statement to Confluent Cloud.
resource "confluent_flink_statement" "my_flink_statement" {
  organization {
    id = data.confluent_organization.my_org.id
  }

  environment {
    id = confluent_environment.my_env.id
  }

  compute_pool {
    id = confluent_flink_compute_pool.my_compute_pool.id
  }

  principal {
    id = confluent_service_account.my_service_account.id
  }

  # This SQL reads from examples.marketplace.orders, aggregates revenue and order count per 1-minute window, and writes the results to my_sink_topic.
  statement = <<EOT
    CREATE TABLE my_sink_topic AS
    SELECT
      window_start,
      window_end,
      SUM(price) AS total_revenue,
      COUNT(*) AS cnt
    FROM
    TABLE(TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '1' MINUTE))
    GROUP BY window_start, window_end;
    EOT

  properties = {
    "sql.current-catalog"  = confluent_environment.my_env.display_name
    "sql.current-database" = confluent_kafka_cluster.my_kafka_cluster.display_name
  }

  rest_endpoint = data.confluent_flink_region.my_flink_region.rest_endpoint

  credentials {
    key    = confluent_api_key.my_flink_api_key.id
    secret = confluent_api_key.my_flink_api_key.secret
  }

  depends_on = [
    confluent_api_key.my_flink_api_key,
    confluent_flink_compute_pool.my_compute_pool,
    confluent_kafka_cluster.my_kafka_cluster
  ]
}
```
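
The `TUMBLE` call in the statement above assigns every row to exactly one fixed-size, non-overlapping window. As a rough illustration of that assignment, assuming non-negative epoch-second timestamps and windows aligned to the epoch (no offset), the window bounds can be computed by rounding the timestamp down to a multiple of the window size. `tumble_window` is a hypothetical helper for illustration only; it doesn't model watermarks or late-arriving rows.

```shell
# Hypothetical helper: print "window_start window_end" (epoch seconds) for a
# row timestamp, mimicking a tumbling window with no offset.
# $1 = row timestamp in epoch seconds, $2 = window size in seconds (default 60)
tumble_window() {
  ts="$1"
  size="${2:-60}"
  # Round the timestamp down to the nearest multiple of the window size.
  start=$(( ts - ts % size ))
  # window_end is exclusive: a row at exactly window_end opens the next window.
  echo "$start $(( start + size ))"
}
```

Rows with timestamps 119 and 120 therefore land in different windows (`60 120` and `120 180`), and each window contributes one row of `total_revenue` and `cnt` to `my_sink_topic`.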

---

### Grant Role-Based Access for Flink SQL Statements in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/operate-and-deploy/flink-rbac.html

Grant Role-Based Access in Confluent Cloud for Apache Flink¶ When deploying Flink SQL statements in production, you must configure appropriate access controls for different types of users and workloads. Confluent Cloud for Apache Flink® supports Role-based Access Control (RBAC). ACLs are not supported for Flink. The Flink-specific RBAC roles are: FlinkAdmin: Full access to Flink resources and compute pool management FlinkDeveloper: Limited access for running statements but not managing infrastructure Assigner: Enables delegation of statement execution to service accounts Operator: Has metadata access to Flink tables, databases, and catalogs. Layered permission model: Flink permissions follow a layered approach. Start with base permissions required for all Flink operations, then add additional layers based on what users need to accomplish. Operational considerations: Use service accounts for production workloads and apply least-privilege principles by granting only the permissions needed for each use case. For complete role definitions, see Predefined RBAC Roles. Permission layers¶ Flink permissions follow a layered approach, in which each layer builds upon the previous one. This design enables you to grant only the permissions needed for each use case, following the principle of least privilege. These are the permission layers: Base/required layer: Fundamental permissions needed for all Flink operations Data access layer: Read and write access to specific tables and topics Table management layer: Create, alter, and delete tables Administrative layer: Manage compute pools and infrastructure Logging permissions layer: Access to UDF logs and audit events Start with the base layer and add additional layers as needed for your specific use cases. Base/required layer¶ All Flink user accounts need these permissions. 
Flink access¶ Choose the appropriate Flink role based on the user’s responsibilities: FlinkDeveloper: Can create and run statements, manage workspaces and artifacts, but can’t manage compute pools FlinkAdmin: All FlinkDeveloper capabilities plus compute pool management (create, delete, alter compute pool settings) Run the following commands to grant the necessary permissions. # For most users (statement execution and development) confluent iam rbac role-binding create \ --environment ${ENV_ID} \ --principal User:${USER_ID} \ --role FlinkDeveloper # For infrastructure administrators confluent iam rbac role-binding create \ --environment ${ENV_ID} \ --principal User:${USER_ID} \ --role FlinkAdmin Kafka Transactional-Id permissions¶ Flink uses Kafka transactions to ensure exactly-once processing semantics. All Flink statements require: DeveloperRead on Transactional-Id _confluent-flink_* (to read transaction state) DeveloperWrite on Transactional-Id _confluent-flink_* (to create and manage transactions) Run the following commands to grant the necessary permissions. # Read transaction state confluent iam rbac role-binding create \ --role DeveloperRead \ --principal User:${USER_ID} \ --environment ${ENV_ID} \ --cloud-cluster ${KAFKA_ID} \ --kafka-cluster ${KAFKA_ID} \ --resource Transactional-Id:_confluent-flink_ \ --prefix # Create and manage transactions confluent iam rbac role-binding create \ --role DeveloperWrite \ --principal User:${USER_ID} \ --environment ${ENV_ID} \ --cloud-cluster ${KAFKA_ID} \ --kafka-cluster ${KAFKA_ID} \ --resource Transactional-Id:_confluent-flink_ \ --prefix Data access layer¶ The data access layer provides permissions for reading from and writing to existing tables in your Flink statements. This layer builds on the base layer and adds specific access to Kafka topics and Schema Registry subjects that your statements need to interact with. 
Read from existing tables¶ When your Flink SQL statements read from tables, for example, by using a SELECT * FROM my_table statement, you need these roles: DeveloperRead on the Kafka topic DeveloperRead on the Schema Registry subject Run the following commands to grant the necessary permissions. # Kafka topic read permission confluent iam rbac role-binding create \ --role DeveloperRead \ --principal User:${USER_ID} \ --environment ${ENV_ID} \ --cloud-cluster ${KAFKA_ID} \ --kafka-cluster ${KAFKA_ID} \ --resource Topic:${TOPIC_NAME} # Schema Registry subject read permission confluent iam rbac role-binding create \ --role DeveloperRead \ --principal User:${USER_ID} \ --environment ${ENV_ID} \ --cloud-cluster ${SR_ID} \ --schema-registry-cluster ${SR_ID} \ --resource Subject:${SUBJECT_NAME} Write to existing tables¶ When your Flink SQL statements write to tables, for example, by using an INSERT INTO my_sink_table statement, you need the following roles: DeveloperWrite on the Kafka topic DeveloperRead on the Schema Registry subject, to validate data format Run the following commands to grant the necessary permissions. # Kafka topic write permission confluent iam rbac role-binding create \ --role DeveloperWrite \ --principal User:${USER_ID} \ --environment ${ENV_ID} \ --cloud-cluster ${KAFKA_ID} \ --kafka-cluster ${KAFKA_ID} \ --resource Topic:${TOPIC_NAME} # Schema Registry subject read permission, to validate data format confluent iam rbac role-binding create \ --role DeveloperRead \ --principal User:${USER_ID} \ --environment ${ENV_ID} \ --cloud-cluster ${SR_ID} \ --schema-registry-cluster ${SR_ID} \ --resource Subject:${SUBJECT_NAME} Table management layer¶ The table management layer provides permissions for creating and modifying tables in your Flink statements. This layer builds on the data access layer and adds specific access to Kafka topics and Schema Registry subjects that your statements need to interact with. 
Create new tables¶ When your Flink SQL statements create new tables, for example, by using a CREATE TABLE or CREATE TABLE AS SELECT statement, you need the following roles: DeveloperManage on Kafka topics, to create topics DeveloperWrite on Schema Registry subjects, to create schemas Run the following commands to grant the necessary permissions. # Kafka topic create/manage permission confluent iam rbac role-binding create \ --role DeveloperManage \ --principal User:${USER_ID} \ --environment ${ENV_ID} \ --cloud-cluster ${KAFKA_ID} \ --kafka-cluster ${KAFKA_ID} \ --resource Topic:${TABLE_PREFIX} \ --prefix # Schema Registry subject create/write permission confluent iam rbac role-binding create \ --role DeveloperWrite \ --principal User:${USER_ID} \ --environment ${ENV_ID} \ --cloud-cluster ${SR_ID} \ --schema-registry-cluster ${SR_ID} \ --resource Subject:${TABLE_PREFIX} \ --prefix Modify existing tables¶ When your Flink SQL statements modify table structures, for example, by using an ALTER TABLE statement for watermarks, computed columns, or column type changes, you need the following roles: DeveloperManage on the Kafka topic, for table structure changes DeveloperWrite on the Schema Registry subject, for schema evolution Run the following commands to grant the necessary permissions. # Kafka topic manage permission, for table structure changes confluent iam rbac role-binding create \ --role DeveloperManage \ --principal User:${USER_ID} \ --environment ${ENV_ID} \ --cloud-cluster ${KAFKA_ID} \ --kafka-cluster ${KAFKA_ID} \ --resource Topic:${TABLE_NAME} # Schema Registry subject write permission, for schema evolution confluent iam rbac role-binding create \ --role DeveloperWrite \ --principal User:${USER_ID} \ --environment ${ENV_ID} \ --cloud-cluster ${SR_ID} \ --schema-registry-cluster ${SR_ID} \ --resource Subject:${TABLE_NAME} Administrative layer¶ The administrative layer provides permissions for managing Flink compute pools and infrastructure. 
This layer builds on the table management layer and adds specific access to Flink resources that your statements need to interact with. The following roles can manage Flink compute pools and infrastructure. FlinkAdmin: This role is for Flink-specific administrative access. It provides these capabilities: manage compute pools (create, delete, and alter compute pool settings); all FlinkDeveloper capabilities (statements, workspaces, and artifacts). It is the most common choice for Flink-focused administrators. EnvironmentAdmin: This role provides environment-wide administrative access. It provides all Flink administrative capabilities plus broader environment management, and is typically assigned for other reasons (managing multiple services in an environment). OrganizationAdmin: This role provides organization-wide administrative access. It provides all Flink administrative capabilities plus organization-wide management, and is typically assigned for other reasons, like managing the entire organization. Use FlinkAdmin for users who primarily manage Flink infrastructure. Users with EnvironmentAdmin or OrganizationAdmin roles already have the necessary Flink administrative capabilities. For complete role definitions and capabilities, see Predefined RBAC Roles in Confluent Cloud. Logging permissions layer¶ The logging permissions layer provides permissions for accessing UDF logs and audit events related to your Flink statements. This layer builds on the administrative layer and adds specific access to Kafka topics that your statements need to interact with. UDF logging access¶ To access UDF logs, you need these roles: FlinkAdmin or FlinkDeveloper role: provides describe access to UDF logs; DeveloperRead on the UDF log topics: to read the actual log messages; CloudClusterAdmin role: can manage logging settings by enabling or disabling logging. Run the following command to grant the necessary permissions. 
confluent iam rbac role-binding create \ --role DeveloperRead \ --principal User:${USER_ID} \ --environment ${ENV_ID} \ --cloud-cluster ${LOGGING_KAFKA_ID} \ --kafka-cluster ${LOGGING_KAFKA_ID} \ --resource Topic:${UDF_LOG_TOPIC} How Flink permissions work¶ Understanding the Flink permission model helps you make informed decisions about access control and troubleshoot permission issues effectively. Principal-Based Access Control¶ Flink uses a principal-based permission model, in which statements inherit all permissions from the principal that runs them. The principal can be a user or a service account. The following key concepts help you understand how Flink permissions work. Statements are not principals - A Flink SQL statement doesn’t have its own permissions. It uses the permissions of the principal that runs it. Flexible principal assignment - You can run statements under your user account, which is recommended for ad-hoc queries, or under a service account, which is recommended for production workloads. Permission inheritance - A statement can access any data that the principal has permissions to access in that region, even across environments. For example, if a service account has DeveloperRead on topics in multiple environments, any statement running under this service account can read from topics in all of these environments, when in the same region. Separation of control plane and data plane access¶ Flink separates access into two distinct planes: the control plane and the data plane. Understanding this separation is key to configuring permissions correctly. Control plane access (infrastructure)¶ This is managed by Flink-specific roles and governs what actions you can perform within the Flink service. 
Control plane access has these characteristics: it controls who can create statements, manage compute pools, and manage other Flink resources; it is managed by using the FlinkAdmin and FlinkDeveloper roles; and it uses environment-scoped permissions, which apply to a single environment, and organization-scoped permissions, which apply to all environments in an organization. Data plane access (data)¶ This is managed by Kafka and Schema Registry roles and governs which data your Flink statements can interact with. Data plane access has these characteristics: it controls which data your statements can read from and write to; it is managed by using the DeveloperRead, DeveloperWrite, and DeveloperManage roles; and it uses resource-scoped permissions (topics, subjects). This separation matters because a principal needs permissions on both planes to execute a Flink SQL statement successfully. For example, a user might have the FlinkDeveloper role (control plane access to create a statement), but if they lack DeveloperRead on a source topic (data plane access), the statement fails at runtime. In contrast, a principal with extensive data access but no Flink role can’t create statements in the first place. Compute pools as shared infrastructure¶ It’s important to understand that compute pools are resources, not principals. A compute pool provides the computational infrastructure for running statements. Compute pools don’t have their own permissions or identity. Multiple users can share the same compute pool if they have appropriate Flink roles. The principal running the statement determines data access, not the compute pool. For example, users Alice and Bob both have the FlinkDeveloper role and can use the same compute pool. Alice’s statements access data based on Alice’s permissions, while Bob’s statements use Bob’s permissions, even when running on the same compute pool. Cross-environment data access¶ Flink statements can access data across environment boundaries based on the principal’s permissions. 
For example, a statement in Environment A can read from topics in Environment B if the principal has the FlinkDeveloper role in Environment A, to create the statement, and the DeveloperRead role on the topics in Environment B, to access the data. Cross-environment data access is important in these use cases: cross-environment analytics and reporting; data pipeline orchestration across multiple environments; centralized processing with distributed data sources. Important: Grant cross-environment permissions carefully, because a statement has broad access based on its principal’s permissions. Common user scenarios¶ This section describes common user scenarios and the required permission configurations for each. These scenarios follow the layered permission model, starting with base permissions and adding additional layers as needed. Choose the scenario that best matches your use case, then follow the corresponding permission setup instructions. Developers¶ Assign the following permissions to developer accounts: Base/Required Layer (FlinkDeveloper role + Transactional-Id permissions); Data Access Layer (read/write access to existing tables); Table Management Layer (for creating and modifying tables). Run the commands shown in the previous sections to grant the necessary permissions. Production workloads (service accounts)¶ For automated deployments and long-running statements, use service accounts to ensure a stable identity that isn’t affected by changes to user accounts. Setup options¶ Broad-access approach: Create a service account and grant the EnvironmentAdmin role. Grant a user the Assigner role on the service account. Deploy statements using the service account. Least-privilege approach: Create a service account and grant base/required layer permissions. Grant specific Data Access Layer and Table Management Layer permissions as needed. Grant a user account the Assigner role on the service account. Run the following commands to grant the necessary permissions. 
Create the service account: `confluent iam service-account create ${SA_NAME} --description "${SA_DESCRIPTION}"` Broad access, grant the EnvironmentAdmin role: `confluent iam rbac role-binding create --environment ${ENV_ID} --principal User:${SERVICE_ACCOUNT_ID} --role EnvironmentAdmin` Grant a user the Assigner role (for both approaches): `confluent iam rbac role-binding create --principal User:${USER_ID} --resource service-account:${SERVICE_ACCOUNT_ID} --role Assigner` For the least-privilege approach, run the commands in the previous layer sections, using the service account as the principal instead of a user account. Administrators (infrastructure management)¶ For managing compute pools and Flink infrastructure, grant an administrative layer role. These are the administrative roles that can manage Flink infrastructure: FlinkAdmin: Most common choice for Flink-focused administrators; EnvironmentAdmin: If they already manage other services in the environment; OrganizationAdmin: If they already manage the entire organization. Production best practices¶ Grant permissions incrementally, starting with base permissions and adding additional layers as needed for your production use cases. Start with base/required layer - Grant fundamental Flink and Kafka permissions. Add data access - Grant read/write access to existing tables as needed. Add capabilities - Table management, administrative access as required. Validate each layer - Test functionality after adding each permission layer. Service account delegation pattern¶ For automated deployments, run the following command to grant the Assigner role on production service accounts. 
CI/CD service account with the Assigner role on production service accounts: `confluent iam rbac role-binding create --principal User:${CICD_SA_ID} --resource service-account:${PROD_SA_ID} --role Assigner` Audit log events¶ Auditable event methods for the FLINK_WORKSPACE and STATEMENT resource types are triggered by operations on a Flink workspace and generate event messages that are sent to the audit log cluster, where they are stored as event records in a Kafka topic. For more information, see Auditable Event Methods. Reference¶ Permission summary by layer¶

| Layer | Kafka | Schema Registry | Flink |
|---|---|---|---|
| Base/Required | Transactional-Id | – | FlinkDeveloper OR FlinkAdmin |
| Data Access | DeveloperRead/Write | DeveloperRead/Write | – |
| Table Management | DeveloperManage | DeveloperWrite | – |
| Administrative | – | – | FlinkAdmin, EnvironmentAdmin, or OrganizationAdmin |
| Logging | DeveloperRead (UDF logs) | – | – |

Access to Flink resources¶ The following table shows which Flink resources the RBAC roles can access. “CRUD” stands for “Create, Read, Update, Delete”.

| Scope | Statements | Workspaces | Compute pools | Artifacts | User-defined functions | UDF logging | AI inference models | Kafka clusters | Kafka Topics |
|---|---|---|---|---|---|---|---|---|---|
| EnvironmentAdmin | CRUD | CRUD | CRUD | CRUD | CRUD | CRUD | CRUD | CRUD | CRUD |
| FlinkAdmin | CRUD | CRUD | CRUD | CRUD | CRUD | -R-- | CRUD | – | – |
| FlinkDeveloper | CRUD | CRUD | -R-- | CRUD | CRUD [1] | -R-- | CRUD [1] | – | – |
| OrganizationAdmin | CRUD | CRUD | CRUD | CRUD | CRUD | CRUD | CRUD | CRUD | CRUD |

[1] Requires access to cluster.

Related content¶ Auditable Event Methods DDL Statements Manage RBAC Role Bindings Role-based Access Control (RBAC) Service Accounts UDF logs Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```shell
# For most users (statement execution and development)
confluent iam rbac role-binding create \
  --environment ${ENV_ID} \
  --principal User:${USER_ID} \
  --role FlinkDeveloper

# For infrastructure administrators
confluent iam rbac role-binding create \
  --environment ${ENV_ID} \
  --principal User:${USER_ID} \
  --role FlinkAdmin
```

```text
_confluent-flink_*
```

```shell
# Read transaction state
confluent iam rbac role-binding create \
  --role DeveloperRead \
  --principal User:${USER_ID} \
  --environment ${ENV_ID} \
  --cloud-cluster ${KAFKA_ID} \
  --kafka-cluster ${KAFKA_ID} \
  --resource Transactional-Id:_confluent-flink_ \
  --prefix

# Create and manage transactions
confluent iam rbac role-binding create \
  --role DeveloperWrite \
  --principal User:${USER_ID} \
  --environment ${ENV_ID} \
  --cloud-cluster ${KAFKA_ID} \
  --kafka-cluster ${KAFKA_ID} \
  --resource Transactional-Id:_confluent-flink_ \
  --prefix
```

```sql
SELECT * FROM my_table
```

```shell
# Kafka topic read permission
confluent iam rbac role-binding create \
  --role DeveloperRead \
  --principal User:${USER_ID} \
  --environment ${ENV_ID} \
  --cloud-cluster ${KAFKA_ID} \
  --kafka-cluster ${KAFKA_ID} \
  --resource Topic:${TOPIC_NAME}

# Schema Registry subject read permission
confluent iam rbac role-binding create \
  --role DeveloperRead \
  --principal User:${USER_ID} \
  --environment ${ENV_ID} \
  --cloud-cluster ${SR_ID} \
  --schema-registry-cluster ${SR_ID} \
  --resource Subject:${SUBJECT_NAME}
```

```sql
INSERT INTO my_sink_table
```

```shell
# Kafka topic write permission
confluent iam rbac role-binding create \
  --role DeveloperWrite \
  --principal User:${USER_ID} \
  --environment ${ENV_ID} \
  --cloud-cluster ${KAFKA_ID} \
  --kafka-cluster ${KAFKA_ID} \
  --resource Topic:${TOPIC_NAME}

# Schema Registry subject read permission, to validate data format
confluent iam rbac role-binding create \
  --role DeveloperRead \
  --principal User:${USER_ID} \
  --environment ${ENV_ID} \
  --cloud-cluster ${SR_ID} \
  --schema-registry-cluster ${SR_ID} \
  --resource Subject:${SUBJECT_NAME}
```
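
Reading from a table and writing to a table differ only in the role bound to the topic; the Schema Registry binding is DeveloperRead in both cases. The sketch below prints, without running them, the two commands for one table so that they can be reviewed first. `table_access_bindings` is a hypothetical helper, and the `<topic>-value` subject name is an assumption based on the default topic-name subject strategy; adjust it if your subjects are named differently.

```shell
# Hypothetical dry-run helper: print the role-binding commands for reading
# from ("read") or writing to ("write") an existing table.
# Args: mode principal_id env_id kafka_id sr_id topic
table_access_bindings() {
  mode="$1"; principal="$2"; env_id="$3"; kafka_id="$4"; sr_id="$5"; topic="$6"
  case "$mode" in
    read)  topic_role="DeveloperRead" ;;
    write) topic_role="DeveloperWrite" ;;
    *) echo "mode must be 'read' or 'write'" >&2; return 2 ;;
  esac
  # Topic binding: the role depends on the mode.
  printf '%s\n' "confluent iam rbac role-binding create --role ${topic_role} --principal User:${principal} --environment ${env_id} --cloud-cluster ${kafka_id} --kafka-cluster ${kafka_id} --resource Topic:${topic}"
  # Subject binding: DeveloperRead in both modes.
  # Assumption: the value schema is registered under "<topic>-value".
  printf '%s\n' "confluent iam rbac role-binding create --role DeveloperRead --principal User:${principal} --environment ${env_id} --cloud-cluster ${sr_id} --schema-registry-cluster ${sr_id} --resource Subject:${topic}-value"
}
```

The flags are the ones used in the commands above; only the function name, the argument order, and the subject naming are introduced here.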

```sql
CREATE TABLE
```

```sql
CREATE TABLE AS SELECT
```

```shell
# Kafka topic create/manage permission
confluent iam rbac role-binding create \
  --role DeveloperManage \
  --principal User:${USER_ID} \
  --environment ${ENV_ID} \
  --cloud-cluster ${KAFKA_ID} \
  --kafka-cluster ${KAFKA_ID} \
  --resource Topic:${TABLE_PREFIX} \
  --prefix

# Schema Registry subject create/write permission
confluent iam rbac role-binding create \
  --role DeveloperWrite \
  --principal User:${USER_ID} \
  --environment ${ENV_ID} \
  --cloud-cluster ${SR_ID} \
  --schema-registry-cluster ${SR_ID} \
  --resource Subject:${TABLE_PREFIX} \
  --prefix
```

```sql
ALTER TABLE
```

```bash
# Kafka topic manage permission, for table structure changes
confluent iam rbac role-binding create \
  --role DeveloperManage \
  --principal User:${USER_ID} \
  --environment ${ENV_ID} \
  --cloud-cluster ${KAFKA_ID} \
  --kafka-cluster ${KAFKA_ID} \
  --resource Topic:${TABLE_NAME}

# Schema Registry subject write permission, for schema evolution
confluent iam rbac role-binding create \
  --role DeveloperWrite \
  --principal User:${USER_ID} \
  --environment ${ENV_ID} \
  --cloud-cluster ${SR_ID} \
  --schema-registry-cluster ${SR_ID} \
  --resource Subject:${TABLE_NAME}
```

```bash
confluent iam rbac role-binding create \
  --role DeveloperRead \
  --principal User:${USER_ID} \
  --environment ${ENV_ID} \
  --cloud-cluster ${LOGGING_KAFKA_ID} \
  --kafka-cluster ${LOGGING_KAFKA_ID} \
  --resource Topic:${UDF_LOG_TOPIC}
```

```bash
# Create service account
confluent iam service-account create ${SA_NAME} \
  --description "${SA_DESCRIPTION}"

# Broad access: Grant EnvironmentAdmin role
confluent iam rbac role-binding create \
  --environment ${ENV_ID} \
  --principal User:${SERVICE_ACCOUNT_ID} \
  --role EnvironmentAdmin

# Grant user Assigner role (for both approaches)
confluent iam rbac role-binding create \
  --principal User:${USER_ID} \
  --resource service-account:${SERVICE_ACCOUNT_ID} \
  --role Assigner
```

```bash
# CI/CD service account with Assigner role on production service accounts
confluent iam rbac role-binding create \
  --principal User:${CICD_SA_ID} \
  --resource service-account:${PROD_SA_ID} \
  --role Assigner
```

```sql
FLINK_WORKSPACE
```

---

### Flink REST API in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/operate-and-deploy/flink-rest-api.html

Confluent Cloud for Apache Flink® provides a REST API for managing your Flink SQL statements, compute pools, and connections programmatically.

Use the REST API to manage these features:

- Artifacts (user-defined functions)
- Compute pools
- Connections
- List available regions
- Statements

For the complete Flink REST API reference, see:

- Artifacts (user-defined functions)
- Compute Pools
- Connections
- Regions
- Statements

In addition to the REST API, you can manage Flink resources by using these Confluent tools:

- Cloud Console
- Confluent CLI
- SQL shell
- Confluent Terraform Provider

#### Prerequisites

To manage Flink resources by using the REST API, you must generate an API key that’s specific to the Flink environment. Also, you need Confluent Cloud account details, like your organization and environment identifiers.

- Flink API Key: Follow the steps in Generate a Flink API key.
- Organization ID: The identifier of your organization, for example, “b0b21724-4586-4a07-b787-d0bb5aacbf87”.
- Environment ID: The identifier of the environment where your Flink SQL statements run, for example, “env-z3y2x1”.
- Cloud provider name: The name of the cloud provider where your cluster runs, for example, “AWS”. To see the available providers, run the confluent flink region list command.
- Cloud region: The name of the region where your cluster runs, for example, “us-east-1”. To see the available regions, run the confluent flink region list command.

Depending on the request, you may need these details:

- Cloud API key: Some requests require a Confluent Cloud API key and secret, which are distinct from a Flink API key and secret. Follow the instructions here to create a new API key for Confluent Cloud, and on the https://confluent.cloud/settings/api-keys page, select the Cloud resource management tile for the API key’s resource scope.
- Principal ID: The identifier of your user account or a service account, for example, “u-aq1dr2” for a user account or “sa-23kgz4” for a service account.
- Compute pool ID: The identifier of the compute pool that runs your Flink SQL statements, for example, “lfcp-8m03rm”.
- Statement name: A unique name for a Flink SQL statement.
- SQL code: The code for a Flink SQL statement.

#### Rate limits

Requests to the Flink REST API are rate-limited per IP address.

- Concurrent connections: 100
- Requests per minute: 1000
- Requests per second: 50

#### Private networking endpoints

If you have enabled Flink private networking, the REST endpoints are different.

- Without private network: `https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/`
- With private network: `https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.private.confluent.cloud/`

For example, if you send a request to the us-east-1 AWS region without a private network, the host is `https://flink.us-east-1.aws.confluent.cloud`. With a private network, the host is `https://flink.us-east-1.aws.private.confluent.cloud`.

#### Generate a Flink API key

To access the REST API, you need an API key specifically for Flink. This key is distinct from the Confluent Cloud API key.

Before you create an API key for Flink access, decide whether you want to create long-running Flink SQL statements. If you need long-running Flink SQL statements, Confluent recommends using a service account and creating an API key for it. If you want to run only interactive queries or statements for a short time while developing queries, you can create an API key for your user account.

Follow the steps in Generate an API Key for Access.

Run the following commands to save your API key and secret in environment variables.
```bash
export FLINK_API_KEY="<flink-api-key>"
export FLINK_API_SECRET="<flink-api-secret>"
```

The REST API uses basic authentication, which means that you provide a base64-encoded string made from your Flink API key and secret in the request header.

You can use the base64 command to encode the “key:secret” string. Be sure to use the -n option of the echo command to prevent newlines from being embedded in the encoded string. If you’re on Linux, be sure to use the -w 0 option of the base64 command, to prevent the string from being line-wrapped.

For convenience, save the encoded string in an environment variable:

```bash
export BASE64_FLINK_KEY_AND_SECRET=$(echo -n "${FLINK_API_KEY}:${FLINK_API_SECRET}" | base64 -w 0)
```

#### Manage statements

Using requests to the Flink REST API, you can perform these actions:

- Submit a statement
- Get a statement
- List statements
- Update metadata for a statement
- Delete a statement

##### Flink SQL statement schema

A statement has the following schema:

```yaml
api_version: "sql/v1"
kind: "Statement"
organization_id: "" # Identifier of your Confluent Cloud organization
environment_id: "" # Identifier of your Confluent Cloud environment
name: "" # Primary identifier of the statement, must be unique within the environment, 100 max length, [a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*
metadata:
  created_at: "" # Creation timestamp of this resource
  updated_at: "" # Last updated timestamp of this resource
  resource_version: "" # Generated by the system and updated whenever the statement is updated (including by the system). Opaque and should not be parsed.
  self: "" # An absolute URL to this resource
  uid: "" # uid is unique in time and space (i.e., even if the name is re-used)
spec:
  compute_pool_id: "" # The ID of the compute pool the statement should run in. DNS Subdomain (RFC 1123) – 255 max len, [a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*
  principal: "" # user or service account ID
  properties: map[string]string # Optional. request/client properties
  statement: "SELECT * from Orders;" # The raw SQL text
  stopped: false # Boolean, specifying if the statement should be stopped
status:
  phase: PENDING | RUNNING | COMPLETED | DELETING | FAILING | FAILED
  detail: "" # Optional. Human-readable description of phase.
  result_schema: "" # Optional. JSON object in TableSchema format; describes the data returned by the results serving API.
```

The statement name has a maximum length of 100 characters and must satisfy the following regular expression:

```text
[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*
```

The underscore character (_) and period character (.) are not supported.

##### Submit a statement

You can submit a Flink SQL statement by sending a POST request to the Statements endpoint.

Submitting a Flink SQL statement requires the following inputs:

```bash
export FLINK_API_KEY="<flink-api-key>"
export FLINK_API_SECRET="<flink-api-secret>"
export BASE64_FLINK_KEY_AND_SECRET=$(echo -n "${FLINK_API_KEY}:${FLINK_API_SECRET}" | base64 -w 0)
export STATEMENT_NAME="<statement-name>" # example: "user-filter"
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export PRINCIPAL_ID="<principal-id>" # (optional) example: "sa-23kgz4" for a service account, or "u-aq1dr2" for a user account
export SQL_CODE="<sql-statement-text>" # example: "SELECT * FROM USERS;"
export JSON_DATA="<payload-string>"
```

The PRINCIPAL_ID parameter is optional. Confluent Cloud infers the principal from the provided Flink API key.
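The inputs above can also be prepared and checked in a script before any request is sent. The following is a minimal Python sketch, not part of the Confluent tooling: the function names are hypothetical, the name pattern and 100-character limit are copied from the statement schema, the header mirrors the `echo -n ... | base64 -w 0` command, and the host follows the endpoint pattern given under private networking. Because the prose says periods are unsupported while the pattern admits them, the sketch rejects periods unless `allow_periods=True` is passed; that default is an assumption.

```python
import base64
import re

# Pattern and length limit copied from the statement schema in this section.
_NAME_RE = re.compile(
    r"[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)
MAX_NAME_LENGTH = 100


def is_valid_statement_name(name: str, allow_periods: bool = False) -> bool:
    """Check a statement name against the documented pattern and length."""
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    # Assumption: follow the prose ("period ... not supported") by default.
    if not allow_periods and "." in name:
        return False
    return _NAME_RE.fullmatch(name) is not None


def basic_auth_header(api_key: str, api_secret: str) -> str:
    """Build the Authorization header value from a key and secret."""
    raw = f"{api_key}:{api_secret}".encode("utf-8")
    # b64encode never inserts newlines or wraps, like `base64 -w 0`.
    return "Basic " + base64.b64encode(raw).decode("ascii")


def statements_url(region: str, provider: str, org_id: str, env_id: str,
                   private: bool = False) -> str:
    """Build the Statements endpoint URL for a region and provider."""
    suffix = ".private" if private else ""
    host = f"flink.{region}.{provider}{suffix}.confluent.cloud"
    return (f"https://{host}/sql/v1/organizations/{org_id}"
            f"/environments/{env_id}/statements")
```

For example, `is_valid_statement_name("user-filter")` passes, while a name containing an underscore or more than 100 characters is rejected before the POST request is made.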
The following JSON shows an example payload: { "name": "${STATEMENT_NAME}", "organization_id": "${ORG_ID}", "environment_id": "${ENV_ID}", "spec": { "statement": "${SQL_CODE}", "properties": { "key1": "value1", "key2": "value2" }, "compute_pool_id": "${COMPUTE_POOL_ID}", "principal": "${PRINCIPAL_ID}", "stopped": false } } Quotation mark characters in the JSON string must be escaped, so the payload string to send resembles the following: export JSON_DATA="{ \"name\": \"${STATEMENT_NAME}\", \"organization_id\": \"${ORG_ID}\", \"environment_id\": \"${ENV_ID}\", \"spec\": { \"statement\": \"${SQL_CODE}\", \"properties\": { \"key1\": \"value1\", \"key2\": \"value2\" }, \"compute_pool_id\": \"${COMPUTE_POOL_ID}\", \"principal\": \"${PRINCIPAL_ID}\", \"stopped\": false } }" The following command sends a POST request that submits a Flink SQL statement. curl --request POST \ --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/statements" \ --header "Authorization: Basic ${BASE64_FLINK_KEY_AND_SECRET}" \ --header 'content-type: application/json' \ --data "${JSON_DATA}" Your output should resemble: Response from a request to submit a SQL statement { "api_version": "sql/v1", "environment_id": "env-z3y2x1", "kind": "Statement", "metadata": { "created_at": "2023-12-16T17:12:08.914198Z", "resource_version": "1", "self": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1/statements/demo-statement-1", "uid": "0005dd7b-8a7e-4274-b97e-c21b134d98f0", "updated_at": "2023-12-16T17:12:08.914198Z" }, "name": "demo-statement-1", "organization_id": "b0b21724-4586-4a07-b787-d0bb5aacbf87", "spec": { "compute_pool_id": "lfcp-8m03rm", "principal": "u-aq1dr2", "properties": null, "statement": "select 1;", "stopped": false }, "status": { "detail": "", "phase": "PENDING" } } Get a statement¶ Get the details about a Flink SQL statement by sending a 
GET request to the Statements endpoint.

Getting a Flink SQL statement requires the following inputs:

```bash
export FLINK_API_KEY="<flink-api-key>"
export FLINK_API_SECRET="<flink-api-secret>"
export BASE64_FLINK_KEY_AND_SECRET=$(echo -n "${FLINK_API_KEY}:${FLINK_API_SECRET}" | base64 -w 0)
export STATEMENT_NAME="<statement-name>" # example: "user-filter"
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
```

The following command gets a Flink SQL statement’s details by its name. Attempting to get a deleted statement returns 404.

```bash
curl --request GET \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/statements/${STATEMENT_NAME}" \
  --header "Authorization: Basic ${BASE64_FLINK_KEY_AND_SECRET}"
```

Your output should resemble:

Response from a request to get a SQL statement

```json
{
  "api_version": "sql/v1",
  "environment_id": "env-z3y2x1",
  "kind": "Statement",
  "metadata": {
    "created_at": "2023-12-16T16:08:36.650591Z",
    "resource_version": "13",
    "self": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1/statements/demo-statement-1",
    "uid": "5387a4a4-02dd-4375-8db1-80bdd82ede96",
    "updated_at": "2023-12-16T16:10:05.353298Z"
  },
  "name": "demo-statement-1",
  "organization_id": "b0b21724-4586-4a07-b787-d0bb5aacbf87",
  "spec": {
    "compute_pool_id": "lfcp-8m03rm",
    "principal": "u-aq1dr2",
    "properties": null,
    "statement": "select 1;",
    "stopped": false
  },
  "status": {
    "detail": "",
    "phase": "COMPLETED",
    "result_schema": {
      "columns": [
        { "name": "EXPR$0", "type": { "nullable": false, "type": "INTEGER" } }
      ]
    }
  }
}
```

Tip: Pipe the result through jq to extract the code for the Flink SQL statement:

```bash
curl --request GET \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/statements/${STATEMENT_NAME}" \
  --header "Authorization: Basic ${BASE64_FLINK_KEY_AND_SECRET}" \
  | jq -r '.spec.statement'
```

Your output should resemble:

```text
select 1;
```

##### List statements

List the statements in an environment by sending a GET request to the Statements endpoint.

Request query parameters:

- spec.compute_pool_id (optional): Fetch only the statements under this compute pool ID.
- page_token (optional): Retrieve a page based on a previously received token (via the metadata.next field of StatementList).
- page_size (optional): Maximum number of items to return in a page.

Listing all Flink SQL statements requires the following inputs:

```bash
export FLINK_API_KEY="<flink-api-key>"
export FLINK_API_SECRET="<flink-api-secret>"
export BASE64_FLINK_KEY_AND_SECRET=$(echo -n "${FLINK_API_KEY}:${FLINK_API_SECRET}" | base64 -w 0)
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
```

The following command returns details for all non-deleted Flink SQL statements under the scope of the environment (one or more compute pools) where you have permission to do a GET request.
curl --request GET \ --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/statements" \ --header "Authorization: Basic ${BASE64_FLINK_KEY_AND_SECRET}" Your output should resemble: Response from a request to list the statements in an environment { "api_version": "sql/v1", "data": [ { "api_version": "sql/v1", "environment_id": "env-z3y2x1", "kind": "Statement", "metadata": { "created_at": "2023-12-16T16:08:36.650591Z", "resource_version": "13", "self": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1/statements/demo-statement-1", "uid": "5387a4a4-02dd-4375-8db1-80bdd82ede96", "updated_at": "2023-12-16T16:10:05.353298Z" }, "name": "demo-statement-1", "organization_id": "b0b21724-4586-4a07-b787-d0bb5aacbf87", "spec": { "compute_pool_id": "lfcp-8m03rm", "principal": "u-aq1dr2", "properties": null, "statement": "select 1;", "stopped": false }, "status": { "detail": "", "phase": "COMPLETED", "result_schema": { "columns": [ { "name": "EXPR$0", "type": { "nullable": false, "type": "INTEGER" } } ] } } } Update metadata for a statement¶ Update the metadata for a statement by sending a PUT request to the Statements endpoint. You can stop and resume a statement by setting stopped in the spec to true to stop the statement and false to resume the statement. You can update the statement’s name, compute pool, and security principal. To update the compute pool or principal, you must stop the statement, send the update request, then restart the statement. The statement’s code is immutable. You must specify a resource version in the payload metadata. 
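Stopping or resuming a statement comes down to sending back the statement you fetched with `spec.stopped` changed and the current resource version attached. The following Python sketch shows one way to derive that PUT payload from a GET response. The function name is hypothetical, and the set of top-level fields it copies (name, organization_id, environment_id, spec, metadata.resource_version) is an assumption modeled on the example payloads in this section, not a statement of what the API requires.

```python
import copy
import json


def build_stop_payload(statement: dict, stopped: bool = True) -> str:
    """Return a JSON payload that sets spec.stopped on a fetched statement.

    `statement` is the parsed JSON body of a GET request for the statement.
    """
    payload = {
        "name": statement["name"],
        "organization_id": statement["organization_id"],
        "environment_id": statement["environment_id"],
        # Copy the spec so the caller's dict is left untouched.
        "spec": copy.deepcopy(statement["spec"]),
        # The resource version must be the one just fetched.
        "metadata": {
            "resource_version": statement["metadata"]["resource_version"]
        },
    }
    payload["spec"]["stopped"] = stopped
    # json.dumps handles quoting, so no manual escaping is needed.
    return json.dumps(payload)
```

Passing `stopped=False` produces the payload for resuming the statement. Because the serializer takes care of quoting, SQL text that itself contains quotation marks does not need to be escaped by hand.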
Updating metadata for an existing Flink SQL statement requires the following inputs: export FLINK_API_KEY="<flink-api-key>" export FLINK_API_SECRET="<flink-api-secret>" export BASE64_FLINK_KEY_AND_SECRET=$(echo -n "${FLINK_API_KEY}:${FLINK_API_SECRET}" | base64 -w 0) export STATEMENT_NAME="<statement-name>" # example: "user-filter" export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87" export ENV_ID="<environment-id>" # example: "env-z3y2x1" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export PRINCIPAL_ID="<principal-id>" # (optional) example: "sa-23kgz4" for a service account, or "u-aq1dr2" for a user account export SQL_CODE="<sql-statement-text>" # example: "SELECT * FROM USERS;" export RESOURCE_VERSION="<version>" # example: "a3e", must be fetched from the latest version of the statement export JSON_DATA="<payload-string>" The PRINCIPAL_ID parameter is optional. Confluent Cloud infers the principal from the provided Flink API key. 
The following JSON shows an example payload:

```json
{
  "name": "${STATEMENT_NAME}",
  "organization_id": "${ORG_ID}",
  "environment_id": "${ENV_ID}",
  "spec": {
    "statement": "${SQL_CODE}",
    "properties": {
      "key1": "value1",
      "key2": "value2"
    },
    "compute_pool_id": "${COMPUTE_POOL_ID}",
    "principal": "${PRINCIPAL_ID}",
    "stopped": false
  },
  "metadata": {
    "resource_version": "${RESOURCE_VERSION}"
  }
}
```

Quotation mark characters in the JSON string must be escaped, so the payload string to send resembles the following:

```bash
export JSON_DATA="{
  \"name\": \"${STATEMENT_NAME}\",
  \"organization_id\": \"${ORG_ID}\",
  \"environment_id\": \"${ENV_ID}\",
  \"spec\": {
    \"statement\": \"${SQL_CODE}\",
    \"properties\": {
      \"key1\": \"value1\",
      \"key2\": \"value2\"
    },
    \"compute_pool_id\": \"${COMPUTE_POOL_ID}\",
    \"principal\": \"${PRINCIPAL_ID}\",
    \"stopped\": false
  },
  \"metadata\": {
    \"resource_version\": \"${RESOURCE_VERSION}\"
  }
}"
```

The following command sends a PUT request that updates metadata for an existing Flink SQL statement.

```bash
curl --request PUT \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/statements/${STATEMENT_NAME}" \
  --header "Authorization: Basic ${BASE64_FLINK_KEY_AND_SECRET}" \
  --header 'content-type: application/json' \
  --data "${JSON_DATA}"
```

Resource version is required in the PUT request and changes every time the statement is updated, by the system or by the user. It’s not possible to calculate the resource version ahead of time, so if the statement has changed since it was fetched, you must submit a GET request, reapply the modifications, and try the update again.

This means you must loop and retry on 409 errors. The following pseudocode shows the loop.

```python
def stop_statement(get_statement, update_statement):
    # get_statement and update_statement stand in for the GET and PUT requests
    while True:
        statement = get_statement()
        # make modifications to the current statement
        statement.spec.stopped = True
        # send the update
        response = update_statement(statement)
        # if a conflict, retry
        if response.code == 409:
            continue
        elif response.code == 200:
            return "success"
        else:
            return response.error()
```

##### Delete a statement

Delete a statement from the compute pool by sending a DELETE request to the Statements endpoint.

Once a statement is deleted, the deletion can’t be undone. State is cleaned up by Confluent Cloud. When deletion is complete, the statement is no longer accessible.

Deleting a statement requires the following inputs:

```bash
export FLINK_API_KEY="<flink-api-key>"
export FLINK_API_SECRET="<flink-api-secret>"
export BASE64_FLINK_KEY_AND_SECRET=$(echo -n "${FLINK_API_KEY}:${FLINK_API_SECRET}" | base64 -w 0)
export STATEMENT_NAME="<statement-name>" # example: "user-filter"
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
```

The following command deletes a statement in the specified organization and environment.

```bash
curl --request DELETE \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/statements/${STATEMENT_NAME}" \
  --header "Authorization: Basic ${BASE64_FLINK_KEY_AND_SECRET}"
```

#### Manage compute pools

Using requests to the Flink REST API, you can perform these actions:

- List Flink compute pools
- Create a Flink compute pool
- Read a Flink compute pool
- Update a Flink compute pool
- Delete a Flink compute pool

You must be authorized to create, update, delete (FlinkAdmin) or use (FlinkDeveloper) a compute pool. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink.
List Flink compute pools¶ List the compute pools in your environment by sending a GET request to the Compute Pools endpoint. This request uses your Cloud API key instead of the Flink API key. Listing the compute pools in your environment requires the following inputs: export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to list the compute pools in your environment. curl --request GET \ --url "https://confluent.cloud/api/fcpm/v2/compute-pools?environment=${ENV_ID}&page_size=100" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ | jq -r '.data[] | .spec.display_name, {id}' Your output should resemble: compute_pool_0 { "id": "lfcp-j123kl" } compute_pool_2 { "id": "lfcp-abc1de" } my-lfcp-01 { "id": "lfcp-l2mn3o" } ... Find your compute pool in the list and save its ID in an environment variable. export COMPUTE_POOL_ID="<your-compute-pool-id>" Create a Flink compute pool¶ Create a compute pool in your environment by sending a POST request to the Compute Pools endpoint. This request uses your Cloud API key instead of the Flink API key. Creating a compute pool requires the following inputs: export COMPUTE_POOL_NAME="<compute-pool-name>" # human readable name, for example: "my-compute-pool" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export MAX_CFU="<max-cfu>" # example: 5 export JSON_DATA="<payload-string>" The following JSON shows an example payload. The network key is optional. 
{ "spec": { "display_name": "${COMPUTE_POOL_NAME}", "cloud": "${CLOUD_PROVIDER}", "region": "${CLOUD_REGION}", "max_cfu": ${MAX_CFU}, "environment": { "id": "${ENV_ID}" }, "network": { "id": "n-00000", "environment": "string" } } } Quotation mark characters in the JSON string must be escaped, so the payload string to send resembles the following: export JSON_DATA="{ \"spec\": { \"display_name\": \"${COMPUTE_POOL_NAME}\", \"cloud\": \"${CLOUD_PROVIDER}\", \"region\": \"${CLOUD_REGION}\", \"max_cfu\": ${MAX_CFU}, \"environment\": { \"id\": \"${ENV_ID}\" } } }" The following command sends a POST request to create a compute pool. curl --request POST \ --url https://api.confluent.cloud/fcpm/v2/compute-pools \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ --header 'content-type: application/json' \ --data "${JSON_DATA}" Your output should resemble: Response from a request to create a compute pool { "api_version": "fcpm/v2", "id": "lfcp-6g7h8i", "kind": "ComputePool", "metadata": { "created_at": "2024-02-27T22:44:27.18964Z", "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1/flink-region=aws.us-east-1/compute-pool=lfcp-6g7h8i", "self": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i", "updated_at": "2024-02-27T22:44:27.18964Z" }, "spec": { "cloud": "AWS", "display_name": "my-compute-pool", "environment": { "id": "env-z3y2x1", "related": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i", "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1" }, "http_endpoint": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1", "max_cfu": 5, "region": "us-east-1" }, "status": { "current_cfu": 0, "phase": "PROVISIONING" } } Read a Flink compute pool¶ Get the details about a compute pool in your environment by sending a GET request to the Compute 
Pools endpoint. This request uses your Cloud API key instead of the Flink API key. Getting details about a compute pool requires the following inputs: export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to get details about the compute pool specified in the COMPUTE_POOL_ID environment variable. curl --request GET \ --url "https://api.confluent.cloud/fcpm/v2/compute-pools/${COMPUTE_POOL_ID}?environment=${ENV_ID}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" Your output should resemble: Response from a request to read a compute pool { "api_version": "fcpm/v2", "id": "lfcp-6g7h8i", "kind": "ComputePool", "metadata": { "created_at": "2024-02-27T22:44:27.18964Z", "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1/flink-region=aws.us-east-1/compute-pool=lfcp-6g7h8i", "self": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i", "updated_at": "2024-02-27T22:44:27.18964Z" }, "spec": { "cloud": "AWS", "display_name": "my-compute-pool", "environment": { "id": "env-z3y2x1", "related": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i", "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1" }, "http_endpoint": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1", "max_cfu": 5, "region": "us-east-1" }, "status": { "current_cfu": 0, "phase": "PROVISIONED" } } Update a Flink compute pool¶ Update a compute pool in your environment by sending a PATCH request to the Compute Pools endpoint. This request uses your Cloud API key instead of the Flink API key. 
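In the create payload shown earlier, `max_cfu` is written as an unquoted JSON number while the other values are quoted strings, which is easy to get wrong when the value comes from an environment variable. The following Python sketch builds a compute pool payload with a JSON serializer so that the types come out right. The function name is hypothetical, and treating `cloud` and `region` as optional is an assumption made so the same helper can cover both the create and update cases.

```python
import json


def compute_pool_payload(display_name, max_cfu, env_id,
                         cloud=None, region=None) -> str:
    """Serialize a compute pool spec with max_cfu as a JSON number."""
    spec = {
        "display_name": display_name,
        # Environment variables are strings; the API examples use a number.
        "max_cfu": int(max_cfu),
        "environment": {"id": env_id},
    }
    # Cloud and region appear only in the create payload.
    if cloud is not None:
        spec["cloud"] = cloud
    if region is not None:
        spec["region"] = region
    return json.dumps({"spec": spec})
```

The returned string can be passed to curl through the `--data` option in place of a hand-escaped `JSON_DATA` value.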
Updating a compute pool requires the following inputs: export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" export MAX_CFU="<max-cfu>" # example: 5 export JSON_DATA="<payload-string>" The following JSON shows an example payload. The network key is optional. { "spec": { "display_name": "${COMPUTE_POOL_NAME}", "max_cfu": ${MAX_CFU}, "environment": { "id": "${ENV_ID}" } } } Quotation mark characters in the JSON string must be escaped, so the payload string to send resembles the following: export JSON_DATA="{ \"spec\": { \"display_name\": \"${COMPUTE_POOL_NAME}\", \"max_cfu\": ${MAX_CFU}, \"environment\": { \"id\": \"${ENV_ID}\" } } }" Run the following command to update the compute pool specified in the COMPUTE_POOL_ID environment variable. curl --request PATCH \ --url "https://api.confluent.cloud/fcpm/v2/compute-pools/${COMPUTE_POOL_ID}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ --header 'content-type: application/json' \ --data "${JSON_DATA}" Delete a Flink compute pool¶ Delete a compute pool in your environment by sending a DELETE request to the Compute Pools endpoint. This request uses your Cloud API key instead of the Flink API key. Deleting a compute pool requires the following inputs: export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to delete the compute pool specified in the COMPUTE_POOL_ID environment variable. 
```bash
curl --request DELETE \
  --url "https://api.confluent.cloud/fcpm/v2/compute-pools/${COMPUTE_POOL_ID}?environment=${ENV_ID}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```

#### List Flink regions

List the regions where Flink is available by sending a GET request to the Regions endpoint. This request uses your Cloud API key instead of the Flink API key.

Listing the available Flink regions requires the following inputs:

```bash
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
```

Run the following command to list the available Flink regions.

```bash
curl --request GET \
  --url "https://api.confluent.cloud/fcpm/v2/regions" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
  | jq -r '.data[].id'
```

Your output should resemble:

```text
aws.eu-central-1
aws.us-east-1
aws.eu-west-1
aws.us-east-2
...
```

#### Manage Flink artifacts

Using requests to the Flink REST API, you can perform these actions:

- List Flink artifacts
- Create a Flink artifact
- Read an artifact
- Update an artifact
- Delete an artifact

An artifact has the following schema: api_version: artifact/v1 kind: FlinkArtifact id: dlz-f3a90de metadata: self: 'https://api.confluent.cloud/artifact/v1/flink-artifacts/fa-12345' resource_name: crn://confluent.cloud/organization=<org-id>/flink-artifact=fa-12345 created_at: '2006-01-02T15:04:05-07:00' updated_at: '2006-01-02T15:04:05-07:00' deleted_at: '2006-01-02T15:04:05-07:00' cloud: AWS region: us-east-1 environment: env-00000 display_name: string class: io.confluent.example.SumScalarFunction content_format: JAR description: string documentation_link: '^$|^(http://|https://).' 
runtime_language: JAVA versions: - version: cfa-ver-001 release_notes: string is_beta: true artifact_id: {} upload_source: api_version: artifact.v1/UploadSource kind: PresignedUrl id: dlz-f3a90de metadata: self: https://api.confluent.cloud/artifact.v1/UploadSource/presigned-urls/pu-12345 resource_name: crn://confluent.cloud/organization=<org-id>/presigned-url=pu-12345 created_at: '2006-01-02T15:04:05-07:00' updated_at: '2006-01-02T15:04:05-07:00' deleted_at: '2006-01-02T15:04:05-07:00' location: PRESIGNED_URL_LOCATION upload_id: <guid> List Flink artifacts¶ List the artifacts, like user-defined functions (UDFs), in your environment by sending a GET request to the List Artifacts endpoint. This request uses your Cloud API key instead of the Flink API key. Listing the artifacts in your environment requires the following inputs: export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to list the artifacts in your environment. curl --request GET \ --url "https://api.confluent.cloud/artifact/v1/flink-artifacts?cloud=${CLOUD_PROVIDER}&region=${CLOUD_REGION}&environment=${ENV_ID}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ | jq -r '.data[] | .spec.display_name, {id}' Your output should resemble: { "id": "cfa-e8rzq7" } Create a Flink artifact¶ Creating an artifact, like a user-defined function (UDF), requires these steps: Request a presigned upload URL for a new Flink Artifact by sending a POST request to the Presigned URLs endpoint. Upload your JAR file to the object storage provider by using the results from the presigned URL request. 
Create the artifact in your environment by sending a POST request to the Create Artifact endpoint. These requests use your Cloud API key instead of the Flink API key. Creating an artifact in your environment requires the following inputs: export ARTIFACT_DISPLAY_NAME="<human-readable-name>" # example: "my-udf" export ARTIFACT_DESCRIPTION="<description>" # example: "This is a demo UDF." export ARTIFACT_DOC_LINK="<url-to-documentation>" # example: "https://docs.example.com/my-udf" export CLASS_NAME="<java-class-name>" # example: "io.confluent.example.SumScalarFunction" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" The following JSON shows an example payload. { "content_format": "JAR", "cloud": "${CLOUD_PROVIDER}", "environment": "${ENV_ID}", "region": "${CLOUD_REGION}" } Quotation mark characters in the JSON string must be escaped, so the payload string to send resembles the following: export JSON_DATA="{ \"content_format\": \"JAR\", \"cloud\": \"${CLOUD_PROVIDER}\", \"environment\": \"${ENV_ID}\", \"region\": \"${CLOUD_REGION}\" }" Run the following command to request the upload identifier and the presigned upload URL for your artifact. 
curl --request POST \ --url https://api.confluent.cloud/artifact/v1/presigned-upload-url \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ --header 'content-type: application/json' \ --data "${JSON_DATA}" Your output should resemble: { "api_version": "artifact/v1", "cloud": "AWS", "content_format": "JAR", "kind": "PresignedUrl", "region": "us-east-1", "upload_form_data": { "bucket": "confluent-custom-connectors-prod-us-east-1", "key": "staging/ccp/v1/<your-org-id>/custom-plugins/<guid>/plugin.jar", "policy": "ey…", "x-amz-algorithm": "AWS4-HMAC-SHA256", "x-amz-credential": "AS…/20241121/us-east-1/s3/aws4_request", "x-amz-date": "20241121T212232Z", "x-amz-security-token": "IQ…", "x-amz-signature": "52…" }, "upload_id": "<upload-id-guid>", "upload_url": "https://confluent-custom-connectors-prod-us-east-1.s3.dualstack.us-east-1.amazonaws.com/" } For convenience, save the security details in environment variables: export UPLOAD_ID="<upload-id-guid>" export UPLOAD_URL="<upload_url>" export UPLOAD_BUCKET="<bucket>" export UPLOAD_KEY="<key>" export UPLOAD_POLICY="<policy>" export X_AMZ_ALGORITHM="<x-amz-algorithm>" export X_AMZ_CREDENTIAL="<x-amz-credential>" export X_AMZ_DATE="<x-amz-date>" export X_AMZ_SECURITY_TOKEN="<x-amz-security-token>" export X_AMZ_SIGNATURE="<x-amz-signature>" Once you have the presigned URL, ID, bucket policy, and other security details, upload your JAR to the bucket. The following example provides a curl command you can use to upload your JAR file. Note: When specifying the JAR file to upload, you must use the @ symbol at the start of the file path. For example, -F file=@</path/to/upload/file>. If the @ symbol is not used, you may see an error stating "Your proposed upload is smaller than the minimum allowed size." 
curl -X POST "${UPLOAD_URL}" \ -F "bucket=${UPLOAD_BUCKET}" \ -F "key=${UPLOAD_KEY}" \ -F "policy=${UPLOAD_POLICY}" \ -F "x-amz-algorithm=${X_AMZ_ALGORITHM}" \ -F "x-amz-credential=${X_AMZ_CREDENTIAL}" \ -F "x-amz-date=${X_AMZ_DATE}" \ -F "x-amz-security-token=${X_AMZ_SECURITY_TOKEN}" \ -F "x-amz-signature=${X_AMZ_SIGNATURE}" \ -F file=@/path/to/udf_file.jar When your JAR file is uploaded to the object store, you can create the UDF in Confluent Cloud for Apache Flink by sending a POST request to the Create Artifact endpoint. The following JSON shows an example payload. { "cloud": "${CLOUD_PROVIDER}", "region": "${CLOUD_REGION}", "environment": "${ENV_ID}", "display_name": "${ARTIFACT_DISPLAY_NAME}", "class": "${CLASS_NAME}", "content_format": "JAR", "description": "${ARTIFACT_DESCRIPTION}", "documentation_link": "${ARTIFACT_DOC_LINK}", "runtime_language": "JAVA", "upload_source": { "location": "PRESIGNED_URL_LOCATION", "upload_id": "${UPLOAD_ID}" } } Quotation mark characters in the JSON string must be escaped, so the payload string resembles the following: export JSON_DATA="{ \"cloud\": \"${CLOUD_PROVIDER}\", \"region\": \"${CLOUD_REGION}\", \"environment\": \"${ENV_ID}\", \"display_name\": \"${ARTIFACT_DISPLAY_NAME}\", \"class\": \"${CLASS_NAME}\", \"content_format\": \"JAR\", \"description\": \"${ARTIFACT_DESCRIPTION}\", \"documentation_link\": \"${ARTIFACT_DOC_LINK}\", \"runtime_language\": \"JAVA\", \"upload_source\": { \"location\": \"PRESIGNED_URL_LOCATION\", \"upload_id\": \"${UPLOAD_ID}\" } }" Run the following command to create the artifact in your environment. 
curl --request POST \ --url "https://api.confluent.cloud/artifact/v1/flink-artifacts?cloud=${CLOUD_PROVIDER}&region=${CLOUD_REGION}&environment=${ENV_ID}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ --header 'content-type: application/json' \ --data "${JSON_DATA}" Read an artifact¶ Get the details about an artifact in your environment by sending a GET request to the Read Artifact endpoint. This request uses your Cloud API key instead of the Flink API key. Getting details about an artifact requires the following inputs: export ARTIFACT_ID="<artifact-id>" # example: cfa-e8rzq7 export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to get details about the artifact specified by the ARTIFACT_ID environment variable. 
curl --request GET \ --url "https://api.confluent.cloud/artifact/v1/flink-artifacts/${ARTIFACT_ID}?cloud=${CLOUD_PROVIDER}&region=${CLOUD_REGION}&environment=${ENV_ID}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" Your output should resemble: Response from a request to get details about an artifact { "api_version": "artifact/v1", "class": "default", "cloud": "AWS", "content_format": "JAR", "description": "", "display_name": "udf_example", "documentation_link": "", "environment": "env-z3q9rd", "id": "cfa-e8rzq7", "kind": "FlinkArtifact", "metadata": { "created_at": "2024-11-21T21:52:43.788042Z", "resource_name": "crn://confluent.cloud/organization=<org-id>/flink-artifact=cfa-e8rzq7", "self": "https://api.confluent.cloud/artifact/v1/flink-artifacts/cfa-e8rzq7", "updated_at": "2024-11-21T21:52:44.625318Z" }, "region": "us-east-1", "runtime_language": "JAVA", "versions": [ { "artifact_id": {}, "is_beta": false, "release_notes": "", "upload_source": { "location": "PRESIGNED_URL_LOCATION", "upload_id": "" }, "version": "ver-xq72dk" } ] } Update an artifact¶ Update an artifact in your environment by sending a PATCH request to the Update Artifact endpoint. This request uses your Cloud API key instead of the Flink API key. Updating an artifact in your environment requires the following inputs: export ARTIFACT_ID="<artifact-id>" # example: cfa-e8rzq7 export ARTIFACT_DISPLAY_NAME="<human-readable-name>" # example: "my-udf" export ARTIFACT_DESCRIPTION="<description>" # example: "This is a demo UDF." export ARTIFACT_DOC_LINK="<url-to-documentation>" # example: "https://docs.example.com/my-udf", "^$|^(http://|https://)." 
export CLASS_NAME="<java-class-name>" # example: "io.confluent.example.SumScalarFunction" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" The following JSON shows an example payload. { "cloud": "${CLOUD_PROVIDER}", "region": "${CLOUD_REGION}", "environment": "${ENV_ID}", "display_name": "${ARTIFACT_DISPLAY_NAME}", "content_format": "JAR", "description": "${ARTIFACT_DESCRIPTION}", "documentation_link": "${ARTIFACT_DOC_LINK}", "runtime_language": "JAVA", "versions": [ { "version": "cfa-ver-001", "release_notes": "string", "is_beta": true, "artifact_id": { "cloud": "${CLOUD_PROVIDER}", "region": "${CLOUD_REGION}", "environment": "${ENV_ID}", "display_name": "${ARTIFACT_DISPLAY_NAME}", "class": "${CLASS_NAME}", "content_format": "JAR", "description": "${ARTIFACT_DESCRIPTION}", "documentation_link": "${ARTIFACT_DOC_LINK}", "runtime_language": "JAVA", "versions": [ {} ] }, "upload_source": { "location": "PRESIGNED_URL_LOCATION", "upload_id": "${UPLOAD_ID}" } } ] } Quotation mark characters in the JSON string must be escaped, so the payload string resembles the following: export JSON_DATA="{ \"cloud\": \"${CLOUD_PROVIDER}\", \"region\": \"${CLOUD_REGION}\", \"environment\": \"${ENV_ID}\", \"display_name\": \"${ARTIFACT_DISPLAY_NAME}\", \"content_format\": \"JAR\", \"description\": \"${ARTIFACT_DESCRIPTION}\", \"documentation_link\": \"${ARTIFACT_DOC_LINK}\", \"runtime_language\": \"JAVA\", \"versions\": [ { \"version\": \"cfa-ver-001\", \"release_notes\": \"string\", \"is_beta\": true, \"artifact_id\": { \"cloud\": \"${CLOUD_PROVIDER}\", \"region\": \"${CLOUD_REGION}\", \"environment\": \"${ENV_ID}\", \"display_name\": 
\"${ARTIFACT_DISPLAY_NAME}\", \"class\": \"${CLASS_NAME}\", \"content_format\": \"JAR\", \"description\": \"${ARTIFACT_DESCRIPTION}\", \"documentation_link\": \"${ARTIFACT_DOC_LINK}\", \"runtime_language\": \"JAVA\", \"versions\": [ {} ] }, \"upload_source\": { \"location\": \"PRESIGNED_URL_LOCATION\", \"upload_id\": \"${UPLOAD_ID}\" } } ] }" Run the following command to update the artifact specified by the ARTIFACT_ID environment variable. curl --request PATCH \ --url "https://api.confluent.cloud/artifact/v1/flink-artifacts/${ARTIFACT_ID}?cloud=${CLOUD_PROVIDER}&region=${CLOUD_REGION}&environment=${ENV_ID}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ --header 'content-type: application/json' \ --data "${JSON_DATA}" Delete an artifact¶ Delete an artifact in your environment by sending a DELETE request to the Delete Artifact endpoint. This request uses your Cloud API key instead of the Flink API key. Deleting an artifact in your environment requires the following inputs: export ARTIFACT_ID="<artifact-id>" # example: cfa-e8rzq7 export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to delete an artifact specified by the ARTIFACT_ID environment variable. curl --request DELETE \ --url "https://api.confluent.cloud/artifact/v1/flink-artifacts/${ARTIFACT_ID}?cloud=${CLOUD_PROVIDER}&region=${CLOUD_REGION}&environment=${ENV_ID}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" Manage UDF logging¶ When you create a user defined function (UDF) with Confluent Cloud for Apache Flink®, you have the option of enabling logging to a Kafka topic to help with monitoring and debugging. 
For more information, see Enable Logging in a User Defined Function. Using requests to the Flink REST API, you can perform these actions: Enable logging List UDF logs Disable a UDF log View log details Update the logging level for a UDF log Managing UDF logs requires the following inputs: export UDF_LOG_ID="<udf-log-id>" # example: "ccl-4l5klo" export UDF_LOG_TOPIC_NAME="<topic-name>" # example: "udf_log" export KAFKA_CLUSTER_ID="<kafka-cluster-id>" # example: "lkc-12345" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export ENV_ID="<environment-id>" # example: "env-z3y2x1" Enable logging¶ Run the following command to enable UDF logging. cat << EOF | curl --silent -X POST -u ${CLOUD_API_KEY}:${CLOUD_API_SECRET} \ -d @- https://api.confluent.cloud/ccl/v1/custom-code-loggings { "cloud":"${CLOUD_PROVIDER}", "region":"${CLOUD_REGION}", "environment": { "id":"${ENV_ID}" }, "destination_settings":{ "kind":"Kafka", "cluster_id":"${KAFKA_CLUSTER_ID}", "topic":"${UDF_LOG_TOPIC_NAME}", "log_level":"info" } } EOF List UDF logs¶ To list the active UDF logs, run the following commands. curl --silent -X GET \ -u ${CLOUD_API_KEY}:${CLOUD_API_SECRET} \ https://api.confluent.cloud/ccl/v1/custom-code-loggings?environment=${ENV_ID} Disable a UDF log¶ Run the following command to disable UDF logging. curl --silent -X DELETE \ -u ${CLOUD_API_KEY}:${CLOUD_API_SECRET} \ https://api.confluent.cloud/ccl/v1/custom-code-loggings/${UDF_LOG_ID}?environment=${ENV_ID} View log details¶ Run the following command to view the details of a UDF log. curl --silent -X GET \ -u ${CLOUD_API_KEY}:${CLOUD_API_SECRET} \ https://api.confluent.cloud/ccl/v1/custom-code-loggings/${UDF_LOG_ID}?environment=${ENV_ID} Update the logging level for a UDF log¶ Run the following command to change the logging level for an active UDF log. 
cat <<EOF | curl --silent -X PATCH \ -u ${CLOUD_API_KEY}:${CLOUD_API_SECRET} \ -d @- https://api.confluent.cloud/ccl/v1/custom-code-loggings/${UDF_LOG_ID} { "region":"${CLOUD_REGION}", "destination_settings":{ "kind":"Kafka", "log_level":"debug" } } EOF Manage connections¶ To manage connections, you can use the following endpoints: Create Connection Delete Connection Describe Connection List Connections Update Connection You must be authorized to create, update, delete (FlinkAdmin) or use (FlinkDeveloper) a connection. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink. Create a connection¶ Create a connection in your environment by sending a POST request to the Connections endpoint. Creating a connection requires the following inputs. Credentials vary by service. export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection" export CONNECTION_TYPE="<connection-type>" # example: "OPENAI" export ENDPOINT="<endpoint>" # example: "https://api.openai.com/v1/chat/completions" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87" export ENV_ID="<environment-id>" # example: "env-a1b2c3" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export JSON_DATA="<payload-string>" The following JSON shows an example payload. The auth_data key varies by service. 
{ "name": "${CONNECTION_NAME}", "spec": { "connection_type": "${CONNECTION_TYPE}", "endpoint": "${ENDPOINT}", "auth_data": { "kind": "PlaintextProvider", "data": "string" } }, "metadata": {} } Quotation mark characters in the JSON string must be escaped, so the payload string to send resembles the following: export JSON_DATA="{ \"name\": \"${CONNECTION_NAME}\", \"spec\": { \"connection_type\": \"${CONNECTION_TYPE}\", \"endpoint\": \"${ENDPOINT}\", \"auth_data\": { \"kind\": \"PlaintextProvider\", \"data\": \"string\" } }, \"metadata\": {} }" The following command sends a POST request to create a connection. curl --request POST \ --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ --header 'content-type: application/json' \ --data "${JSON_DATA}" Your output should resemble: Response from a request to create a connection { "api_version": "sql/v1", "kind": "Connection", "metadata": { "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-a1b2c3/connections/my-openai-connection", "resource_name": "", "created_at": "2006-01-02T15:04:05-07:00", "updated_at": "2006-01-02T15:04:05-07:00", "deleted_at": "2006-01-02T15:04:05-07:00", "uid": "12345678-1234-1234-1234-123456789012", "resource_version": "a23av" }, "name": "my-openai-connection", "spec": { "connection_type": "OPENAI", "endpoint": "https://api.openai.com/v1/chat/completions", "auth_data": { "kind": "PlaintextProvider", "data": "string" } }, "status": { "phase": "READY", "detail": "Lookup failed: ai.openai.com" } } Delete a connection¶ Delete a connection in your environment by sending a DELETE request to the Connections endpoint. This request uses your Cloud API key instead of the Flink API key. 
Deleting a connection requires the following inputs: export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87" export ENV_ID="<environment-id>" # example: "env-a1b2c3" Run the following command to delete the connection specified in the CONNECTION_NAME environment variable. curl --request DELETE \ --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections/${CONNECTION_NAME}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" Describe a connection Get the details about a connection in your environment by sending a GET request to the Connections endpoint. This request uses your Cloud API key instead of the Flink API key. Getting details about a connection requires the following inputs: export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87" export ENV_ID="<environment-id>" # example: "env-a1b2c3" Run the following command to get details about the connection specified in the CONNECTION_NAME environment variable. 
curl --request GET \ --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections/${CONNECTION_NAME}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" Your output should resemble: Response from a request to read a connection { "api_version": "sql/v1", "kind": "Connection", "metadata": { "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-123/connections/my-openai-connection", "resource_name": "", "created_at": "2006-01-02T15:04:05-07:00", "updated_at": "2006-01-02T15:04:05-07:00", "deleted_at": "2006-01-02T15:04:05-07:00", "uid": "12345678-1234-1234-1234-123456789012", "resource_version": "a23av" }, "name": "my-openai-connection", "spec": { "connection_type": "OPENAI", "endpoint": "https://api.openai.com/v1/chat/completions", "auth_data": { "kind": "PlaintextProvider", "data": "string" } }, "status": { "phase": "READY", "detail": "Lookup failed: ai.openai.com" } } List connections¶ List the connections in your environment by sending a GET request to the Connections endpoint. This request uses your Cloud API key instead of the Flink API key. Listing the connections in your environment requires the following inputs: export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87" export ENV_ID="<environment-id>" # example: "env-a1b2c3" Run the following command to list the connections in your environment. 
curl --request GET \ --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" Your output should resemble: Response from a request to list connections { "api_version": "sql/v1", "kind": "ConnectionList", "metadata": { "first": "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-abc123/connections", "last": "", "prev": "", "next": "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-abc123/connections?page_token=UvmDWOB1iwfAIBPj6EYb", "total_size": 123, "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-123/connections" }, "data": [ { "api_version": "sql/v1", "kind": "Connection", "metadata": { "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-123/connections/my-openai-connection", "resource_name": "", "created_at": "2006-01-02T15:04:05-07:00", "updated_at": "2006-01-02T15:04:05-07:00", "deleted_at": "2006-01-02T15:04:05-07:00", "uid": "12345678-1234-1234-1234-123456789012", "resource_version": "a23av" }, "name": "my-openai-connection", "spec": { "connection_type": "OPENAI", "endpoint": "https://api.openai.com/v1/chat/completions", "auth_data": { "kind": "PlaintextProvider", "data": "string" } }, "status": { "phase": "READY", "detail": "Lookup failed: ai.openai.com" } } ] } Update a connection Update a connection in your environment by sending a PUT request to the Connections endpoint. This request uses your Cloud API key instead of the Flink API key. 
Updating a connection requires the following inputs: export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection" export CONNECTION_TYPE="<connection-type>" # example: "OPENAI" export ENDPOINT="<endpoint>" # example: "https://api.openai.com/v1/chat/completions" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87" export ENV_ID="<environment-id>" # example: "env-a1b2c3" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export JSON_DATA="<payload-string>" The following JSON shows an example payload. The auth_data key varies by service. { "name": "${CONNECTION_NAME}", "spec": { "connection_type": "${CONNECTION_TYPE}", "endpoint": "${ENDPOINT}", "auth_data": { "kind": "PlaintextProvider", "data": "string" } }, "metadata": {} } Quotation mark characters in the JSON string must be escaped, so the payload string to send resembles the following: export JSON_DATA="{ \"name\": \"${CONNECTION_NAME}\", \"spec\": { \"connection_type\": \"${CONNECTION_TYPE}\", \"endpoint\": \"${ENDPOINT}\", \"auth_data\": { \"kind\": \"PlaintextProvider\", \"data\": \"string\" } }, \"metadata\": {} }" The following command sends a PUT request to update a connection. 
curl --request PUT \ --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections/${CONNECTION_NAME}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ --header 'content-type: application/json' \ --data "${JSON_DATA}" Your output should resemble: Response from a request to update a connection { "api_version": "sql/v1", "kind": "Connection", "metadata": { "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-a1b2c3/connections/my-openai-connection", "resource_name": "", "created_at": "2006-01-02T15:04:05-07:00", "updated_at": "2006-01-02T15:04:05-07:00", "deleted_at": "2006-01-02T15:04:05-07:00", "uid": "12345678-1234-1234-1234-123456789012", "resource_version": "a23av" }, "name": "my-openai-connection", "spec": { "connection_type": "OPENAI", "endpoint": "https://api.openai.com/v1/chat/completions", "auth_data": { "kind": "PlaintextProvider", "data": "string" } }, "status": { "phase": "READY", "detail": "Lookup failed: ai.openai.com" } }
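The artifact workflow described earlier in this section has three steps: request a presigned URL, upload the JAR with the returned form fields, and create the artifact with the returned `upload_id`. The following Python sketch shows how the presigned-URL response could feed the other two steps. The helper names `upload_form_fields` and `create_artifact_payload` are hypothetical, and the field list assumes the AWS-style response shown in the example output; other cloud providers may return different fields.

```python
import json

# Form-field order copied from the curl upload command in this section.
# The file part is not included here; in the curl command it is sent last.
UPLOAD_FIELD_ORDER = [
    "bucket",
    "key",
    "policy",
    "x-amz-algorithm",
    "x-amz-credential",
    "x-amz-date",
    "x-amz-security-token",
    "x-amz-signature",
]


def upload_form_fields(presigned: dict) -> list:
    """Return (name, value) pairs for the multipart upload request."""
    form = presigned["upload_form_data"]
    return [(name, form[name]) for name in UPLOAD_FIELD_ORDER]


def create_artifact_payload(presigned: dict, cloud: str, region: str, env_id: str,
                            display_name: str, class_name: str,
                            description: str = "", doc_link: str = "") -> str:
    """Build the Create Artifact JSON body; json.dumps handles the quoting."""
    return json.dumps({
        "cloud": cloud,
        "region": region,
        "environment": env_id,
        "display_name": display_name,
        "class": class_name,
        "content_format": "JAR",
        "description": description,
        "documentation_link": doc_link,
        "runtime_language": "JAVA",
        "upload_source": {
            "location": "PRESIGNED_URL_LOCATION",
            "upload_id": presigned["upload_id"],
        },
    })
```

Sending the requests is left out on purpose, so the sketch covers only how the response fields map onto the later steps.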

#### Code Examples

```bash
confluent flink region list
```

```text
<!-- Without private network -->
https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/

<!-- With private network -->
https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.private.confluent.cloud/
```

```text
<!-- Without private network -->
https://flink.us-east-1.aws.confluent.cloud
```

```text
<!-- With private network -->
https://flink.us-east-1.aws.private.confluent.cloud
```
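The two URL patterns above differ only in the `private` host segment. A minimal Python sketch of assembling the base URL, assuming these two patterns are the only variants; the function name `flink_base_url` is illustrative and not part of any Confluent library:

```python
def flink_base_url(region: str, provider: str, private: bool = False) -> str:
    """Build the regional Flink REST base URL from the patterns shown above."""
    host = f"flink.{region}.{provider}"
    if private:
        # Private networking adds a "private" segment before the domain.
        host += ".private"
    return f"https://{host}.confluent.cloud"
```

For example, `flink_base_url("us-east-1", "aws")` corresponds to the public example URL above, and passing `private=True` corresponds to the private one.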

```bash
export FLINK_API_KEY="<flink-api-key>"
export FLINK_API_SECRET="<flink-api-secret>"
```

```bash
export BASE64_FLINK_KEY_AND_SECRET=$(echo -n "${FLINK_API_KEY}:${FLINK_API_SECRET}" | base64 -w 0)
```
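The `-w 0` option in the command above disables line wrapping and may not be available in every platform's `base64` tool. As an alternative, the same header value can be computed with Python's standard library. This is a sketch, and `basic_auth_header` is a hypothetical helper name:

```python
import base64


def basic_auth_header(api_key: str, api_secret: str) -> dict:
    """Return the HTTP Basic Authorization header for a key/secret pair."""
    raw = f"{api_key}:{api_secret}".encode("utf-8")
    # b64encode never inserts line breaks, so no wrapping flag is needed.
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

The token inside the header is the value that the shell examples store in `BASE64_FLINK_KEY_AND_SECRET`.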

```yaml
api_version: "sql/v1"
kind: "Statement"
organization_id: "" # Identifier of your Confluent Cloud organization
environment_id: "" # Identifier of your Confluent Cloud environment
name: "" # Primary identifier of the statement, must be unique within the environment, 100 max length, [a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*
metadata:
  created_at: "" # Creation timestamp of this resource
  updated_at: "" # Last updated timestamp of this resource
  resource_version: "" # Generated by the system and updated whenever the statement is updated (including by the system). Opaque and should not be parsed.
  self: "" # An absolute URL to this resource
  uid: "" # uid is unique in time and space (i.e., even if the name is re-used)
spec:
  compute_pool_id: "" # The ID of the compute pool the statement should run in. DNS Subdomain (RFC 1123) – 255 max len, [a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*
  principal: "" # user or service account ID
  properties: map[string]string # Optional. request/client properties
  statement: "SELECT * from Orders;" # The raw SQL text
  stopped: false # Boolean, specifying if the statement should be stopped
status:
  phase: PENDING | RUNNING | COMPLETED | DELETING | FAILING | FAILED
  detail: "" # Optional. Human-readable description of phase.
  result_schema: "" # Optional. JSON object in TableSchema format; describes the data returned by the results serving API.
```

```text
[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*
```
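A statement name can be checked against this pattern before a request is sent. The sketch below applies the expression above together with the 100-character limit mentioned in the statement schema; `is_valid_statement_name` is a hypothetical helper, and the service may enforce further rules that are not covered here.

```python
import re

# Pattern and length limit copied from the statement schema above.
NAME_PATTERN = re.compile(
    r"[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)
MAX_NAME_LENGTH = 100


def is_valid_statement_name(name: str) -> bool:
    """Return True if the whole name matches the pattern and fits the limit."""
    if len(name) > MAX_NAME_LENGTH:
        return False
    # fullmatch requires the entire string to match, not just a prefix.
    return NAME_PATTERN.fullmatch(name) is not None
```

Lowercase names such as `user-filter` pass, while names with uppercase letters, underscores, or a leading hyphen are rejected.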

```bash
export FLINK_API_KEY="<flink-api-key>"
export FLINK_API_SECRET="<flink-api-secret>"
export BASE64_FLINK_KEY_AND_SECRET=$(echo -n "${FLINK_API_KEY}:${FLINK_API_SECRET}" | base64 -w 0)
export STATEMENT_NAME="<statement-name>" # example: "user-filter"
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export PRINCIPAL_ID="<principal-id>" # (optional) example: "sa-23kgz4" for a service account, or "u-aq1dr2" for a user account
export SQL_CODE="<sql-statement-text>" # example: "SELECT * FROM USERS;"
export JSON_DATA="<payload-string>"
```

```json
{
  "name": "${STATEMENT_NAME}",
  "organization_id": "${ORG_ID}",
  "environment_id": "${ENV_ID}",
  "spec": {
    "statement": "${SQL_CODE}",
    "properties": {
      "key1": "value1",
      "key2": "value2"
    },
    "compute_pool_id": "${COMPUTE_POOL_ID}",
    "principal": "${PRINCIPAL_ID}",
    "stopped": false
  }
}
```

```bash
export JSON_DATA="{
  \"name\": \"${STATEMENT_NAME}\",
  \"organization_id\": \"${ORG_ID}\",
  \"environment_id\": \"${ENV_ID}\",
  \"spec\": {
    \"statement\": \"${SQL_CODE}\",
    \"properties\": {
      \"key1\": \"value1\",
      \"key2\": \"value2\"
    },
    \"compute_pool_id\": \"${COMPUTE_POOL_ID}\",
    \"principal\": \"${PRINCIPAL_ID}\",
    \"stopped\": false
  }
}"
```
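Escaping every quotation mark by hand is error-prone, especially when the SQL text itself contains quotes. One alternative is to let a JSON serializer build the string. The following Python sketch produces the same payload shape as above; `build_statement_payload` is a hypothetical helper, and omitting `principal` and `properties` when they are not supplied is an assumption based on both being marked optional.

```python
import json


def build_statement_payload(name, org_id, env_id, sql, compute_pool_id,
                            principal=None, properties=None):
    """Serialize a statement payload; json.dumps handles all escaping."""
    spec = {
        "statement": sql,
        "compute_pool_id": compute_pool_id,
        "stopped": False,
    }
    if properties:
        spec["properties"] = properties
    if principal:
        spec["principal"] = principal
    payload = {
        "name": name,
        "organization_id": org_id,
        "environment_id": env_id,
        "spec": spec,
    }
    return json.dumps(payload)
```

The returned string can be used wherever the examples pass `${JSON_DATA}`.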

```bash
curl --request POST \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/statements" \
  --header "Authorization: Basic ${BASE64_FLINK_KEY_AND_SECRET}" \
  --header 'content-type: application/json' \
  --data "${JSON_DATA}"
```
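The same POST request can be prepared with Python's standard library. The sketch below only constructs the request object and does not send it; `build_create_statement_request` is a hypothetical helper, `base_url` is assumed to be the regional endpoint (for example `https://flink.us-east-1.aws.confluent.cloud`), and `auth_token` is assumed to be the base64-encoded key and secret.

```python
import json
import urllib.request


def build_create_statement_request(base_url, org_id, env_id, auth_token, payload):
    """Prepare (but do not send) the POST request that creates a statement."""
    url = (
        f"{base_url}/sql/v1/organizations/{org_id}"
        f"/environments/{env_id}/statements"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),  # request body as bytes
        headers={
            "Authorization": f"Basic {auth_token}",
            "content-type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; that call is left out so the sketch has no network side effects.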

```json
{
  "api_version": "sql/v1",
  "environment_id": "env-z3y2x1",
  "kind": "Statement",
  "metadata": {
    "created_at": "2023-12-16T17:12:08.914198Z",
    "resource_version": "1",
    "self": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1/statements/demo-statement-1",
    "uid": "0005dd7b-8a7e-4274-b97e-c21b134d98f0",
    "updated_at": "2023-12-16T17:12:08.914198Z"
  },
  "name": "demo-statement-1",
  "organization_id": "b0b21724-4586-4a07-b787-d0bb5aacbf87",
  "spec": {
    "compute_pool_id": "lfcp-8m03rm",
    "principal": "u-aq1dr2",
    "properties": null,
    "statement": "select 1;",
    "stopped": false
  },
  "status": {
    "detail": "",
    "phase": "PENDING"
  }
}
```

```bash
export FLINK_API_KEY="<flink-api-key>"
export FLINK_API_SECRET="<flink-api-secret>"
export BASE64_FLINK_KEY_AND_SECRET=$(echo -n "${FLINK_API_KEY}:${FLINK_API_SECRET}" | base64 -w 0)
export STATEMENT_NAME="<statement-name>" # example: "user-filter"
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
```

```bash
curl --request GET \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/statements/${STATEMENT_NAME}" \
  --header "Authorization: Basic ${BASE64_FLINK_KEY_AND_SECRET}"
```

```json
{
  "api_version": "sql/v1",
  "environment_id": "env-z3y2x1",
  "kind": "Statement",
  "metadata": {
    "created_at": "2023-12-16T16:08:36.650591Z",
    "resource_version": "13",
    "self": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1/statements/demo-statement-1",
    "uid": "5387a4a4-02dd-4375-8db1-80bdd82ede96",
    "updated_at": "2023-12-16T16:10:05.353298Z"
  },
  "name": "demo-statement-1",
  "organization_id": "b0b21724-4586-4a07-b787-d0bb5aacbf87",
  "spec": {
    "compute_pool_id": "lfcp-8m03rm",
    "principal": "u-aq1dr2",
    "properties": null,
    "statement": "select 1;",
    "stopped": false
  },
  "status": {
    "detail": "",
    "phase": "COMPLETED",
    "result_schema": {
      "columns": [
        {
          "name": "EXPR$0",
          "type": {
            "nullable": false,
            "type": "INTEGER"
          }
        }
      ]
    }
  }
}
```
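A response like the one above carries the statement phase and, once results are available, the result schema. The following Python sketch pulls out those fields; `summarize_statement` is a hypothetical helper, and it assumes that `result_schema` may be absent, as in the response to the create request.

```python
import json


def summarize_statement(response_text: str) -> dict:
    """Extract the name, phase, and result columns from a statement response."""
    doc = json.loads(response_text)
    status = doc.get("status", {})
    # result_schema is optional, so fall back to an empty schema.
    schema = status.get("result_schema") or {}
    columns = [
        (col["name"], col["type"]["type"], col["type"]["nullable"])
        for col in schema.get("columns", [])
    ]
    return {
        "name": doc.get("name"),
        "phase": status.get("phase"),
        "columns": columns,
    }
```

Each column is reported as a `(name, type, nullable)` tuple, and a statement without a result schema yields an empty column list.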

```bash
curl --request GET \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/statements/${STATEMENT_NAME}" \
  --header "Authorization: Basic ${BASE64_FLINK_KEY_AND_SECRET}" \
  | jq -r '.spec.statement'
```

```text
spec.compute_pool_id
```

```text
metadata.next
```

```text
StatementList
```

```bash
export FLINK_API_KEY="<flink-api-key>"
export FLINK_API_SECRET="<flink-api-secret>"
export BASE64_FLINK_KEY_AND_SECRET=$(echo -n "${FLINK_API_KEY}:${FLINK_API_SECRET}" | base64 -w 0)
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
```

```bash
curl --request GET \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/statements" \
  --header "Authorization: Basic ${BASE64_FLINK_KEY_AND_SECRET}"
```

```sql
{
  "api_version": "sql/v1",
  "data": [
    {
      "api_version": "sql/v1",
      "environment_id": "env-z3y2x1",
      "kind": "Statement",
      "metadata": {
        "created_at": "2023-12-16T16:08:36.650591Z",
        "resource_version": "13",
        "self": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1/statements/demo-statement-1",
        "uid": "5387a4a4-02dd-4375-8db1-80bdd82ede96",
        "updated_at": "2023-12-16T16:10:05.353298Z"
      },
      "name": "demo-statement-1",
      "organization_id": "b0b21724-4586-4a07-b787-d0bb5aacbf87",
      "spec": {
        "compute_pool_id": "lfcp-8m03rm",
        "principal": "u-aq1dr2",
        "properties": null,
        "statement": "select 1;",
        "stopped": false
      },
      "status": {
        "detail": "",
        "phase": "COMPLETED",
        "result_schema": {
          "columns": [
            {
              "name": "EXPR$0",
              "type": {
                "nullable": false,
                "type": "INTEGER"
              }
            }
          ]
        }
      }
    }
```
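List responses are paginated through the `metadata.next` field. The sketch below shows one way to walk every page. It assumes `metadata.next` holds the URL of the next page and is empty or missing on the last page. `fetch_page` is a hypothetical callable that you supply, which performs the authenticated GET and returns the parsed JSON body.

```python
from typing import Callable, Iterator, Optional


def iter_statements(first_url: str, fetch_page: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every statement across all pages of a list response."""
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        # Each page carries its items under "data".
        yield from page.get("data") or []
        # Assumption: an empty or absent "next" marks the last page.
        url = (page.get("metadata") or {}).get("next") or None


if __name__ == "__main__":
    # Fake two-page listing standing in for real HTTP responses.
    fake_pages = {
        "page-1": {"data": [{"name": "demo-statement-1"}], "metadata": {"next": "page-2"}},
        "page-2": {"data": [{"name": "demo-statement-2"}], "metadata": {"next": ""}},
    }
    for statement in iter_statements("page-1", fake_pages.__getitem__):
        print(statement["name"])
```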

```sql
export FLINK_API_KEY="<flink-api-key>"
export FLINK_API_SECRET="<flink-api-secret>"
export BASE64_FLINK_KEY_AND_SECRET=$(echo -n "${FLINK_API_KEY}:${FLINK_API_SECRET}" | base64 -w 0)
export STATEMENT_NAME="<statement-name>" # example: "user-filter"
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export PRINCIPAL_ID="<principal-id>" # (optional) example: "sa-23kgz4" for a service account, or "u-aq1dr2" for a user account
export SQL_CODE="<sql-statement-text>" # example: "SELECT * FROM USERS;"
export RESOURCE_VERSION="<version>" # example: "a3e", must be fetched from the latest version of the statement
export JSON_DATA="<payload-string>"
```

```sql
{
  "name": "${STATEMENT_NAME}",
  "organization_id": "${ORG_ID}",
  "environment_id": "${ENV_ID}",
  "spec": {
    "statement": "${SQL_CODE}",
    "properties": {
      "key1": "value1",
      "key2": "value2"
    },
    "compute_pool_id": "${COMPUTE_POOL_ID}",
    "principal": "${PRINCIPAL_ID}",
    "stopped": false
  },
  "metadata": {
     "resource_version": "${RESOURCE_VERSION}"
  }
}
```

```sql
export JSON_DATA="{
  \"name\": \"${STATEMENT_NAME}\",
  \"organization_id\": \"${ORG_ID}\",
  \"environment_id\": \"${ENV_ID}\",
  \"spec\": {
    \"statement\": \"${SQL_CODE}\",
    \"properties\": {
      \"key1\": \"value1\",
      \"key2\": \"value2\"
    },
    \"compute_pool_id\": \"${COMPUTE_POOL_ID}\",
    \"principal\": \"${PRINCIPAL_ID}\",
    \"stopped\": false
  },
  \"metadata\": {
    \"resource_version\": \"${RESOURCE_VERSION}\"
  }
}"
```
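Hand-escaping the payload, as in the `JSON_DATA` export above, breaks as soon as the SQL text itself contains double quotes or backslashes. The following sketch builds the same payload with `json.dumps`, which escapes the statement text for you. The function name and its parameters are hypothetical; the field names mirror the payload shown above.

```python
import json
from typing import Optional


def build_statement_payload(
    name: str,
    org_id: str,
    env_id: str,
    sql: str,
    compute_pool_id: str,
    principal: str,
    resource_version: str,
    properties: Optional[dict] = None,
    stopped: bool = False,
) -> str:
    """Serialize an update-statement request body as a JSON string."""
    payload = {
        "name": name,
        "organization_id": org_id,
        "environment_id": env_id,
        "spec": {
            "statement": sql,  # json.dumps escapes quotes and backslashes
            "properties": properties,
            "compute_pool_id": compute_pool_id,
            "principal": principal,
            "stopped": stopped,
        },
        "metadata": {"resource_version": resource_version},
    }
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_statement_payload(
        "user-filter", "<organization-id>", "env-z3y2x1",
        'SELECT "id" FROM USERS;', "lfcp-8m03rm", "sa-23kgz4", "a3e",
    ))
```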

```sql
curl --request PUT \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/statements/${STATEMENT_NAME}" \
  --header "Authorization: Basic ${BASE64_FLINK_KEY_AND_SECRET}" \
  --header 'content-type: application/json' \
  --data "${JSON_DATA}"
```

```sql
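def stop_statement():
  # getStatement() and updateStatement() stand in for your own GET and PUT
  # requests against the statements endpoint.
  while True:
    statement = getStatement()
    # make modifications to the current statement
    statement.spec.stopped = True
    # send the update
    response = updateStatement(statement)
    # if a conflict, fetch the latest resource_version and retry
    if response.code == 409:
      continue
    elif response.code == 200:
      return "success"
    else:
      return response.error()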
```
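The retry loop above never gives up, so a statement that keeps conflicting would spin forever. A bounded variant might look like the following sketch. `get_statement` and `update_statement` are hypothetical callables that you supply: the first returns the latest statement as a dictionary (including its current `metadata.resource_version`), and the second sends the PUT and returns the HTTP status code. Treating any 2xx status as success is an assumption.

```python
from typing import Callable


def stop_statement_with_retry(
    get_statement: Callable[[], dict],
    update_statement: Callable[[dict], int],
    max_attempts: int = 5,
) -> bool:
    """Set spec.stopped, retrying on 409 conflicts. Returns True on success."""
    for _ in range(max_attempts):
        # Re-read on every attempt so the payload carries the latest
        # resource_version.
        statement = get_statement()
        statement["spec"]["stopped"] = True
        status = update_statement(statement)
        if 200 <= status < 300:
            return True
        if status != 409:
            raise RuntimeError(f"update failed with HTTP {status}")
    # Every attempt hit a conflict.
    return False


if __name__ == "__main__":
    # Fake client: the first update conflicts, the second succeeds.
    statuses = iter([409, 200])
    ok = stop_statement_with_retry(
        lambda: {"spec": {"stopped": False}, "metadata": {"resource_version": "13"}},
        lambda statement: next(statuses),
    )
    print("stopped" if ok else "gave up")
```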

```sql
export FLINK_API_KEY="<flink-api-key>"
export FLINK_API_SECRET="<flink-api-secret>"
export BASE64_FLINK_KEY_AND_SECRET=$(echo -n "${FLINK_API_KEY}:${FLINK_API_SECRET}" | base64 -w 0)
export STATEMENT_NAME="<statement-name>" # example: "user-filter"
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
```

```sql
curl --request DELETE \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/statements/${STATEMENT_NAME}" \
  --header "Authorization: Basic ${BASE64_FLINK_KEY_AND_SECRET}"
```

```sql
FlinkDeveloper
```

```sql
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
curl --request GET \
     --url "https://api.confluent.cloud/fcpm/v2/compute-pools?environment=${ENV_ID}&page_size=100" \
     --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
     | jq -r '.data[] | .spec.display_name, {id}'
```

```sql
compute_pool_0
{
  "id": "lfcp-j123kl"
}
compute_pool_2
{
  "id": "lfcp-abc1de"
}
my-lfcp-01
{
  "id": "lfcp-l2mn3o"
}
...
```

```sql
export COMPUTE_POOL_ID="<your-compute-pool-id>"
```

```sql
export COMPUTE_POOL_NAME="<compute-pool-name>" # human readable name, for example: "my-compute-pool"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export MAX_CFU="<max-cfu>" # example: 5
export JSON_DATA="<payload-string>"
```

```sql
{
  "spec": {
    "display_name": "${COMPUTE_POOL_NAME}",
    "cloud": "${CLOUD_PROVIDER}",
    "region": "${CLOUD_REGION}",
    "max_cfu": ${MAX_CFU},
    "environment": {
      "id": "${ENV_ID}"
    },
    "network": {
      "id": "n-00000",
      "environment": "string"
    }
  }
}
```

```sql
export JSON_DATA="{
  \"spec\": {
    \"display_name\": \"${COMPUTE_POOL_NAME}\",
    \"cloud\": \"${CLOUD_PROVIDER}\",
    \"region\": \"${CLOUD_REGION}\",
    \"max_cfu\": ${MAX_CFU},
    \"environment\": {
      \"id\": \"${ENV_ID}\"
    }
  }
}"
```

```sql
curl --request POST \
  --url https://api.confluent.cloud/fcpm/v2/compute-pools \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
  --header 'content-type: application/json' \
  --data "${JSON_DATA}"
```

```sql
{
    "api_version": "fcpm/v2",
    "id": "lfcp-6g7h8i",
    "kind": "ComputePool",
    "metadata": {
        "created_at": "2024-02-27T22:44:27.18964Z",
        "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1/flink-region=aws.us-east-1/compute-pool=lfcp-6g7h8i",
        "self": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i",
        "updated_at": "2024-02-27T22:44:27.18964Z"
    },
    "spec": {
        "cloud": "AWS",
        "display_name": "my-compute-pool",
        "environment": {
            "id": "env-z3y2x1",
            "related": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i",
            "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1"
        },
        "http_endpoint": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1",
        "max_cfu": 5,
        "region": "us-east-1"
    },
    "status": {
        "current_cfu": 0,
        "phase": "PROVISIONING"
    }
}
```

```sql
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
curl --request GET \
  --url "https://api.confluent.cloud/fcpm/v2/compute-pools/${COMPUTE_POOL_ID}?environment=${ENV_ID}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```

```sql
{
    "api_version": "fcpm/v2",
    "id": "lfcp-6g7h8i",
    "kind": "ComputePool",
    "metadata": {
        "created_at": "2024-02-27T22:44:27.18964Z",
        "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1/flink-region=aws.us-east-1/compute-pool=lfcp-6g7h8i",
        "self": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i",
        "updated_at": "2024-02-27T22:44:27.18964Z"
    },
    "spec": {
        "cloud": "AWS",
        "display_name": "my-compute-pool",
        "environment": {
            "id": "env-z3y2x1",
            "related": "https://api.confluent.cloud/fcpm/v2/compute-pools/lfcp-6g7h8i",
            "resource_name": "crn://confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3y2x1"
        },
        "http_endpoint": "https://flink.us-east-1.aws.confluent.cloud/sql/v1/organizations/b0b21724-4586-4a07-b787-d0bb5aacbf87/environments/env-z3y2x1",
        "max_cfu": 5,
        "region": "us-east-1"
    },
    "status": {
        "current_cfu": 0,
        "phase": "PROVISIONED"
    }
}
```
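The create response reports `"phase": "PROVISIONING"`, while the later GET response reports `"phase": "PROVISIONED"`, so a script usually has to poll before it submits statements to a new pool. The following sketch shows one way to do that. `fetch_pool` is a hypothetical callable that performs the GET request above and returns the parsed JSON; the timeout and interval defaults are arbitrary choices, not documented values.

```python
import time
from typing import Callable


def wait_for_provisioned(
    fetch_pool: Callable[[], dict],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll a compute pool until status.phase is PROVISIONED."""
    deadline = time.monotonic() + timeout_s
    while True:
        pool = fetch_pool()
        phase = pool.get("status", {}).get("phase")
        if phase == "PROVISIONED":
            return pool
        if time.monotonic() >= deadline:
            raise TimeoutError(f"compute pool still in phase {phase!r}")
        sleep(interval_s)


if __name__ == "__main__":
    # Fake responses standing in for repeated GET calls.
    phases = iter(["PROVISIONING", "PROVISIONING", "PROVISIONED"])
    pool = wait_for_provisioned(
        lambda: {"id": "lfcp-6g7h8i", "status": {"phase": next(phases)}},
        sleep=lambda seconds: None,  # skip real waiting in the demo
    )
    print(pool["status"]["phase"])
```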

```sql
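export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export COMPUTE_POOL_NAME="<compute-pool-name>" # human readable name, for example: "my-compute-pool"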
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export MAX_CFU="<max-cfu>" # example: 5
export JSON_DATA="<payload-string>"
```

```sql
{
  "spec": {
    "display_name": "${COMPUTE_POOL_NAME}",
    "max_cfu": ${MAX_CFU},
    "environment": {
      "id": "${ENV_ID}"
    }
  }
}
```

```sql
export JSON_DATA="{
  \"spec\": {
    \"display_name\": \"${COMPUTE_POOL_NAME}\",
    \"max_cfu\": ${MAX_CFU},
    \"environment\": {
      \"id\": \"${ENV_ID}\"
    }
  }
}"
```

```sql
curl --request PATCH \
  --url "https://api.confluent.cloud/fcpm/v2/compute-pools/${COMPUTE_POOL_ID}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
  --header 'content-type: application/json' \
  --data "${JSON_DATA}"
```

```sql
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
curl --request DELETE \
  --url "https://api.confluent.cloud/fcpm/v2/compute-pools/${COMPUTE_POOL_ID}?environment=${ENV_ID}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```

```sql
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
```

```sql
curl --request GET \
  --url "https://api.confluent.cloud/fcpm/v2/regions" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
  | jq -r '.data[].id'
```

```sql
aws.eu-central-1
aws.us-east-1
aws.eu-west-1
aws.us-east-2
...
```

```sql
api_version: artifact/v1
kind: FlinkArtifact
id: dlz-f3a90de
metadata:
  self: 'https://api.confluent.cloud/artifact/v1/flink-artifacts/fa-12345'
  resource_name: crn://confluent.cloud/organization=<org-id>/flink-artifact=fa-12345
  created_at: '2006-01-02T15:04:05-07:00'
  updated_at: '2006-01-02T15:04:05-07:00'
  deleted_at: '2006-01-02T15:04:05-07:00'
cloud: AWS
region: us-east-1
environment: env-00000
display_name: string
class: io.confluent.example.SumScalarFunction
content_format: JAR
description: string
documentation_link: '^$|^(http://|https://).'
runtime_language: JAVA
versions:
  - version: cfa-ver-001
    release_notes: string
    is_beta: true
    artifact_id: {}
    upload_source:
      api_version: artifact.v1/UploadSource
      kind: PresignedUrl
      id: dlz-f3a90de
      metadata:
        self: https://api.confluent.cloud/artifact.v1/UploadSource/presigned-urls/pu-12345
        resource_name: crn://confluent.cloud/organization=<org-id>/presigned-url=pu-12345
        created_at: '2006-01-02T15:04:05-07:00'
        updated_at: '2006-01-02T15:04:05-07:00'
        deleted_at: '2006-01-02T15:04:05-07:00'
      location: PRESIGNED_URL_LOCATION
      upload_id: <guid>
```

```sql
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
curl --request GET \
     --url "https://api.confluent.cloud/artifact/v1/flink-artifacts?cloud=${CLOUD_PROVIDER}&region=${CLOUD_REGION}&environment=${ENV_ID}" \
     --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
     | jq -r '.data[] | .spec.display_name, {id}'
```

```sql
{
  "id": "cfa-e8rzq7"
}
```

```sql
export ARTIFACT_DISPLAY_NAME="<human-readable-name>" # example: "my-udf"
export ARTIFACT_DESCRIPTION="<description>" # example: "This is a demo UDF."
export ARTIFACT_DOC_LINK="<url-to-documentation>" # example: "https://docs.example.com/my-udf"
export CLASS_NAME="<java-class-name>" # example: "io.confluent.example.SumScalarFunction"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
{
  "content_format": "JAR",
  "cloud": "${CLOUD_PROVIDER}",
  "environment": "${ENV_ID}",
  "region": "${CLOUD_REGION}"
}
```

```sql
export JSON_DATA="{
  \"content_format\": \"JAR\",
  \"cloud\": \"${CLOUD_PROVIDER}\",
  \"environment\": \"${ENV_ID}\",
  \"region\": \"${CLOUD_REGION}\"
}"
```

```sql
curl --request POST \
     --url https://api.confluent.cloud/artifact/v1/presigned-upload-url \
     --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
     --header 'content-type: application/json' \
     --data "${JSON_DATA}"
```

```sql
{
  "api_version": "artifact/v1",
  "cloud": "AWS",
  "content_format": "JAR",
  "kind": "PresignedUrl",
  "region": "us-east-1",
  "upload_form_data": {
    "bucket": "confluent-custom-connectors-prod-us-east-1",
    "key": "staging/ccp/v1/<your-org-id>/custom-plugins/<guid>/plugin.jar",
    "policy": "ey…",
    "x-amz-algorithm": "AWS4-HMAC-SHA256",
    "x-amz-credential": "AS…/20241121/us-east-1/s3/aws4_request",
    "x-amz-date": "20241121T212232Z",
    "x-amz-security-token": "IQ…",
    "x-amz-signature": "52…"
  },
  "upload_id": "<upload-id-guid>",
  "upload_url": "https://confluent-custom-connectors-prod-us-east-1.s3.dualstack.us-east-1.amazonaws.com/"
}
```

```sql
export UPLOAD_ID="<upload-id-guid>"
export UPLOAD_URL="<upload_url>"
export UPLOAD_BUCKET="<bucket>"
export UPLOAD_KEY="<key>"
export UPLOAD_POLICY="<policy>"
export X_AMZ_ALGORITHM="<x-amz-algorithm>"
export X_AMZ_CREDENTIAL="<x-amz-credential>"
export X_AMZ_DATE="<x-amz-date>"
export X_AMZ_SECURITY_TOKEN="<x-amz-security-token>"
export X_AMZ_SIGNATURE="<x-amz-signature>"
```

```sql
-F file=@</path/to/upload/file>
```

```sql
Your proposed upload is smaller than the minimum allowed size.
```

```sql
curl -X POST "${UPLOAD_URL}" \
  -F "bucket=${UPLOAD_BUCKET}" \
  -F "key=${UPLOAD_KEY}" \
  -F "policy=${UPLOAD_POLICY}" \
  -F "x-amz-algorithm=${X_AMZ_ALGORITHM}" \
  -F "x-amz-credential=${X_AMZ_CREDENTIAL}" \
  -F "x-amz-date=${X_AMZ_DATE}" \
  -F "x-amz-security-token=${X_AMZ_SECURITY_TOKEN}" \
  -F "x-amz-signature=${X_AMZ_SIGNATURE}" \
  -F file=@/path/to/udf_file.jar
```
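Copying each value from `upload_form_data` into its own environment variable is easy to get wrong. The sketch below derives the `-F` arguments for the upload command directly from the presigned-URL response. The function name is hypothetical. It relies on two points: curl's `@` prefix makes it send the file's contents rather than the literal path, and S3 POST uploads expect the `file` field to come last.

```python
from typing import List


def build_upload_form_args(upload_form_data: dict, file_path: str) -> List[str]:
    """Return curl -F arguments for a presigned upload, with the file last."""
    args: List[str] = []
    for field, value in upload_form_data.items():
        args += ["-F", f"{field}={value}"]
    # '@' tells curl to read the file; the file field must be the final one.
    args += ["-F", f"file=@{file_path}"]
    return args


if __name__ == "__main__":
    # Placeholder values shaped like the presigned-URL response.
    form_data = {
        "bucket": "<bucket>",
        "key": "<key>",
        "policy": "<policy>",
        "x-amz-algorithm": "AWS4-HMAC-SHA256",
    }
    print(["curl", "-X", "POST", "<upload_url>"]
          + build_upload_form_args(form_data, "/path/to/udf_file.jar"))
```

The resulting list can be handed to `subprocess.run` without shell quoting concerns.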

```sql
{
  "cloud": "${CLOUD_PROVIDER}",
  "region": "${CLOUD_REGION}",
  "environment": "${ENV_ID}",
  "display_name": "${ARTIFACT_DISPLAY_NAME}",
  "class": "${CLASS_NAME}",
  "content_format": "JAR",
  "description": "${ARTIFACT_DESCRIPTION}",
  "documentation_link": "${ARTIFACT_DOC_LINK}",
  "runtime_language": "JAVA",
  "upload_source": {
    "location": "PRESIGNED_URL_LOCATION",
    "upload_id": "${UPLOAD_ID}"
  }
}
```

```sql
export JSON_DATA="{
  \"cloud\": \"${CLOUD_PROVIDER}\",
  \"region\": \"${CLOUD_REGION}\",
  \"environment\": \"${ENV_ID}\",
  \"display_name\": \"${ARTIFACT_DISPLAY_NAME}\",
  \"class\": \"${CLASS_NAME}\",
  \"content_format\": \"JAR\",
  \"description\": \"${ARTIFACT_DESCRIPTION}\",
  \"documentation_link\": \"${ARTIFACT_DOC_LINK}\",
  \"runtime_language\": \"JAVA\",
  \"upload_source\": {
    \"location\": \"PRESIGNED_URL_LOCATION\",
    \"upload_id\": \"${UPLOAD_ID}\"
  }
}"
```

```sql
curl --request POST \
     --url "https://api.confluent.cloud/artifact/v1/flink-artifacts?cloud=${CLOUD_PROVIDER}&region=${CLOUD_REGION}&environment=${ENV_ID}" \
     --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
     --header 'content-type: application/json' \
     --data "${JSON_DATA}"
```

```sql
export ARTIFACT_ID="<artifact-id>" # example: cfa-e8rzq7
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
curl --request GET \
  --url "https://api.confluent.cloud/artifact/v1/flink-artifacts/${ARTIFACT_ID}?cloud=${CLOUD_PROVIDER}&region=${CLOUD_REGION}&environment=${ENV_ID}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```

```sql
{
  "api_version": "artifact/v1",
  "class": "default",
  "cloud": "AWS",
  "content_format": "JAR",
  "description": "",
  "display_name": "udf_example",
  "documentation_link": "",
  "environment": "env-z3q9rd",
  "id": "cfa-e8rzq7",
  "kind": "FlinkArtifact",
  "metadata": {
    "created_at": "2024-11-21T21:52:43.788042Z",
    "resource_name": "crn://confluent.cloud/organization=<org-id>/flink-artifact=cfa-e8rzq7",
    "self": "https://api.confluent.cloud/artifact/v1/flink-artifacts/cfa-e8rzq7",
    "updated_at": "2024-11-21T21:52:44.625318Z"
  },
  "region": "us-east-1",
  "runtime_language": "JAVA",
  "versions": [
    {
      "artifact_id": {},
      "is_beta": false,
      "release_notes": "",
      "upload_source": {
        "location": "PRESIGNED_URL_LOCATION",
        "upload_id": ""
      },
      "version": "ver-xq72dk"
    }
  ]
}
```

```sql
export ARTIFACT_ID="<artifact-id>" # example: cfa-e8rzq7
export ARTIFACT_DISPLAY_NAME="<human-readable-name>" # example: "my-udf"
export ARTIFACT_DESCRIPTION="<description>" # example: "This is a demo UDF."
export ARTIFACT_DOC_LINK="<url-to-documentation>" # example: "https://docs.example.com/my-udf", "^$|^(http://|https://)."
export CLASS_NAME="<java-class-name>" # example: "io.confluent.example.SumScalarFunction"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
{
  "cloud": "${CLOUD_PROVIDER}",
  "region": "${CLOUD_REGION}",
  "environment": "${ENV_ID}",
  "display_name": "${ARTIFACT_DISPLAY_NAME}",
  "content_format": "JAR",
  "description": "${ARTIFACT_DESCRIPTION}",
  "documentation_link": "${ARTIFACT_DOC_LINK}",
  "runtime_language": "JAVA",
  "versions": [
    {
      "version": "cfa-ver-001",
      "release_notes": "string",
      "is_beta": true,
      "artifact_id": {
        "cloud": "${CLOUD_PROVIDER}",
        "region": "${CLOUD_REGION}",
        "environment": "${ENV_ID}",
        "display_name": "${ARTIFACT_DISPLAY_NAME}",
        "class": "${CLASS_NAME}",
        "content_format": "JAR",
        "description": "${ARTIFACT_DESCRIPTION}",
        "documentation_link": "${ARTIFACT_DOC_LINK}",
        "runtime_language": "JAVA",
        "versions": [
          {}
        ]
      },
      "upload_source": {
        "location": "PRESIGNED_URL_LOCATION",
        "upload_id": "${UPLOAD_ID}"
      }
    }
  ]
}
```

```sql
export JSON_DATA="{
  \"cloud\": \"${CLOUD_PROVIDER}\",
  \"region\": \"${CLOUD_REGION}\",
  \"environment\": \"${ENV_ID}\",
  \"display_name\": \"${ARTIFACT_DISPLAY_NAME}\",
  \"content_format\": \"JAR\",
  \"description\": \"${ARTIFACT_DESCRIPTION}\",
  \"documentation_link\": \"${ARTIFACT_DOC_LINK}\",
  \"runtime_language\": \"JAVA\",
  \"versions\": [
    {
      \"version\": \"cfa-ver-001\",
      \"release_notes\": \"string\",
      \"is_beta\": true,
      \"artifact_id\": {
        \"cloud\": \"${CLOUD_PROVIDER}\",
        \"region\": \"${CLOUD_REGION}\",
        \"environment\": \"${ENV_ID}\",
        \"display_name\": \"${ARTIFACT_DISPLAY_NAME}\",
        \"class\": \"${CLASS_NAME}\",
        \"content_format\": \"JAR\",
        \"description\": \"${ARTIFACT_DESCRIPTION}\",
        \"documentation_link\": \"${ARTIFACT_DOC_LINK}\",
        \"runtime_language\": \"JAVA\",
        \"versions\": [
          {}
        ]
      },
      \"upload_source\": {
        \"location\": \"PRESIGNED_URL_LOCATION\",
        \"upload_id\": \"${UPLOAD_ID}\"
      }
    }
  ]
}"
```

```sql
curl --request PATCH \
     --url "https://api.confluent.cloud/artifact/v1/flink-artifacts/${ARTIFACT_ID}?cloud=${CLOUD_PROVIDER}&region=${CLOUD_REGION}&environment=${ENV_ID}" \
     --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
     --header 'content-type: application/json' \
     --data "${JSON_DATA}"
```

```sql
export ARTIFACT_ID="<artifact-id>" # example: cfa-e8rzq7
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
curl --request DELETE \
  --url "https://api.confluent.cloud/artifact/v1/flink-artifacts/${ARTIFACT_ID}?cloud=${CLOUD_PROVIDER}&region=${CLOUD_REGION}&environment=${ENV_ID}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```

```sql
export UDF_LOG_ID="<udf-log-id>" # example: "ccl-4l5klo"
export UDF_LOG_TOPIC_NAME="<topic-name>" # example: "udf_log"
export KAFKA_CLUSTER_ID="<kafka-cluster-id>" # example: "lkc-12345"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
cat << EOF | curl --silent -X POST \
  -u "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" \
  -d @- https://api.confluent.cloud/ccl/v1/custom-code-loggings
{
  "cloud": "${CLOUD_PROVIDER}",
  "region": "${CLOUD_REGION}",
  "environment": {
    "id": "${ENV_ID}"
  },
  "destination_settings": {
    "kind": "Kafka",
    "cluster_id": "${KAFKA_CLUSTER_ID}",
    "topic": "${UDF_LOG_TOPIC_NAME}",
    "log_level": "info"
  }
}
EOF
```

```sql
curl --silent -X GET \
  -u "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" \
  "https://api.confluent.cloud/ccl/v1/custom-code-loggings?environment=${ENV_ID}"
```

```sql
curl --silent -X DELETE \
  -u "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" \
  "https://api.confluent.cloud/ccl/v1/custom-code-loggings/${UDF_LOG_ID}?environment=${ENV_ID}"
```

```sql
curl --silent -X GET \
  -u "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" \
  "https://api.confluent.cloud/ccl/v1/custom-code-loggings/${UDF_LOG_ID}?environment=${ENV_ID}"
```

```sql
cat << EOF | curl --silent -X PATCH \
  -u "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" \
  -d @- "https://api.confluent.cloud/ccl/v1/custom-code-loggings/${UDF_LOG_ID}"
{
  "region": "${CLOUD_REGION}",
  "destination_settings": {
    "kind": "Kafka"
  }
}
EOF
```

```sql
FlinkDeveloper
```

```sql
export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection"
export CONNECTION_TYPE="<connection-type>" # example: "OPENAI"
export ENDPOINT="<endpoint>" # example: "https://api.openai.com/v1/chat/completions"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-a1b2c3"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export JSON_DATA="<payload-string>"
```

```sql
{
  "name": "${CONNECTION_NAME}",
  "spec": {
    "connection_type": "${CONNECTION_TYPE}",
    "endpoint": "${ENDPOINT}",
    "auth_data": {
      "kind": "PlaintextProvider",
      "data": "string"
    }
  },
  "metadata": {}
}
```

```sql
export JSON_DATA="{
  \"name\": \"${CONNECTION_NAME}\",
  \"spec\": {
    \"connection_type\": \"${CONNECTION_TYPE}\",
    \"endpoint\": \"${ENDPOINT}\",
    \"auth_data\": {
      \"kind\": \"PlaintextProvider\",
      \"data\": \"string\"
    }
  },
  \"metadata\": {}
}"
```

```sql
curl --request POST \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
  --header 'content-type: application/json' \
  --data "${JSON_DATA}"
```

```sql
{
  "api_version": "sql/v1",
  "kind": "Connection",
  "metadata": {
    "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-a1b2c3/connections/my-openai-connection",
    "resource_name": "",
    "created_at": "2006-01-02T15:04:05-07:00",
    "updated_at": "2006-01-02T15:04:05-07:00",
    "deleted_at": "2006-01-02T15:04:05-07:00",
    "uid": "12345678-1234-1234-1234-123456789012",
    "resource_version": "a23av"
  },
  "name": "my-openai-connection",
  "spec": {
    "connection_type": "OPENAI",
    "endpoint": "https://api.openai.com/v1/chat/completions",
    "auth_data": {
      "kind": "PlaintextProvider",
      "data": "string"
    }
  },
  "status": {
    "phase": "READY",
    "detail": "Lookup failed: ai.openai.com"
  }
}
```

```sql
export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
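export ENV_ID="<environment-id>" # example: "env-a1b2c3"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
```

```sql
curl --request DELETE \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections/${CONNECTION_NAME}" \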
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```

```sql
export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
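export ENV_ID="<environment-id>" # example: "env-a1b2c3"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
```

```sql
curl --request GET \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections/${CONNECTION_NAME}" \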
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```

```sql
{
  "api_version": "sql/v1",
  "kind": "Connection",
  "metadata": {
    "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-123/connections/my-openai-connection",
    "resource_name": "",
    "created_at": "2006-01-02T15:04:05-07:00",
    "updated_at": "2006-01-02T15:04:05-07:00",
    "deleted_at": "2006-01-02T15:04:05-07:00",
    "uid": "12345678-1234-1234-1234-123456789012",
    "resource_version": "a23av"
  },
  "name": "my-openai-connection",
  "spec": {
    "connection_type": "OPENAI",
    "endpoint": "https://api.openai.com/v1/chat/completions",
    "auth_data": {
      "kind": "PlaintextProvider",
      "data": "string"
    }
  },
  "status": {
    "phase": "READY",
    "detail": "Lookup failed: ai.openai.com"
  }
}
```

```sql
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
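export ENV_ID="<environment-id>" # example: "env-a1b2c3"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
```

```sql
curl --request GET \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections" \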
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```

```sql
{
  "api_version": "sql/v1",
  "kind": "ConnectionList",
  "metadata": {
    "first": "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-abc123/connections",
    "last": "",
    "prev": "",
    "next": "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-abc123/connections?page_token=UvmDWOB1iwfAIBPj6EYb",
    "total_size": 123,
    "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-123/connections"
  },
  "data": [
    {
      "api_version": "sql/v1",
      "kind": "Connection",
      "metadata": {
        "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-123/connections/my-openai-connection",
        "resource_name": "",
        "created_at": "2006-01-02T15:04:05-07:00",
        "updated_at": "2006-01-02T15:04:05-07:00",
        "deleted_at": "2006-01-02T15:04:05-07:00",
        "uid": "12345678-1234-1234-1234-123456789012",
        "resource_version": "a23av"
      },
      "name": "my-openai-connection",
      "spec": {
        "connection_type": "OPENAI",
        "endpoint": "https://api.openai.com/v1/chat/completions",
        "auth_data": {
          "kind": "PlaintextProvider",
          "data": "string"
        }
      },
      "status": {
        "phase": "READY",
        "detail": "Lookup failed: ai.openai.com"
      }
    }
  ]
}
```

```sql
export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection"
export CONNECTION_TYPE="<connection-type>" # example: "OPENAI"
export ENDPOINT="<endpoint>" # example: "https://api.openai.com/v1/chat/completions"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-a1b2c3"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export JSON_DATA="<payload-string>"
```

```sql
{
  "name": "${CONNECTION_NAME}",
  "spec": {
    "connection_type": "${CONNECTION_TYPE}",
    "endpoint": "${ENDPOINT}",
    "auth_data": {
      "kind": "PlaintextProvider",
      "data": "string"
    }
  },
  "metadata": {}
}
```

```sql
export JSON_DATA="{
  \"name\": \"${CONNECTION_NAME}\",
  \"spec\": {
    \"connection_type\": \"${CONNECTION_TYPE}\",
    \"endpoint\": \"${ENDPOINT}\",
    \"auth_data\": {
      \"kind\": \"PlaintextProvider\",
      \"data\": \"string\"
    }
  },
  \"metadata\": {}
}"
```
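Hand-escaping quotation marks, as in the payload string above, is easy to get wrong. As an alternative sketch, the same payload can be built as a Python dictionary and serialized with json.dumps, which handles the quoting. The function name build_connection_payload is illustrative and not part of any Confluent tool.

```python
import json


def build_connection_payload(name: str, connection_type: str, endpoint: str, auth_data: str) -> str:
    """Serialize the connection payload; json.dumps takes care of all quoting."""
    payload = {
        "name": name,
        "spec": {
            "connection_type": connection_type,
            "endpoint": endpoint,
            # The auth_data content varies by service; a plain string is used here.
            "auth_data": {"kind": "PlaintextProvider", "data": auth_data},
        },
        "metadata": {},
    }
    return json.dumps(payload)


body = build_connection_payload(
    "my-openai-connection",
    "OPENAI",
    "https://api.openai.com/v1/chat/completions",
    "string",
)
print(body)
```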

```sql
curl --request PUT \
  --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections/${CONNECTION_NAME}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
  --header 'content-type: application/json' \
  --data "${JSON_DATA}"
```
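The request URL above is assembled from the region, cloud provider, organization, environment, and connection name. The following Python sketch shows one way to build it. The host pattern flink.<region>.<provider>.confluent.cloud is an assumption inferred from the self links in the example responses, and connection_url is a hypothetical helper.

```python
from urllib.parse import quote


def connection_url(region: str, provider: str, org_id: str, env_id: str, name: str = "") -> str:
    """Build the Connections endpoint URL; with a name, address a single connection."""
    # Assumption: the host is flink.<region>.<provider>.confluent.cloud,
    # inferred from the "self" links in the example responses.
    base = (
        f"https://flink.{region}.{provider}.confluent.cloud"
        f"/sql/v1/organizations/{quote(org_id, safe='')}"
        f"/environments/{quote(env_id, safe='')}/connections"
    )
    # Percent-encode the name so unusual characters cannot alter the path.
    return f"{base}/{quote(name, safe='')}" if name else base


url = connection_url(
    "us-east-1",
    "aws",
    "b0b21724-4586-4a07-b787-d0bb5aacbf87",
    "env-a1b2c3",
    "my-openai-connection",
)
print(url)
```

Omitting the name yields the collection URL used for create and list requests.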

```sql
{
  "api_version": "sql/v1",
  "kind": "Connection",
  "metadata": {
    "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-a1b2c3/connections/my-openai-connection",
    "resource_name": "",
    "created_at": "2006-01-02T15:04:05-07:00",
    "updated_at": "2006-01-02T15:04:05-07:00",
    "deleted_at": "2006-01-02T15:04:05-07:00",
    "uid": "12345678-1234-1234-1234-123456789012",
    "resource_version": "a23av"
  },
  "name": "my-openai-connection",
  "spec": {
    "connection_type": "OPENAI",
    "endpoint": "https://api.openai.com/v1/chat/completions",
    "auth_data": {
      "kind": "PlaintextProvider",
      "data": "string"
    }
  },
  "status": {
    "phase": "READY",
    "detail": "Lookup failed: ai.openai.com"
  }
}
```

---

### Generate an API key for Programmatic Access to Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/operate-and-deploy/generate-api-key-for-flink.html

To manage Flink workloads programmatically in Confluent Cloud for Apache Flink®, you need an API key that’s specific to Flink. You can use the Confluent CLI, the Confluent Cloud APIs, the Confluent Terraform Provider, or the Cloud Console to create API keys.

Before you create an API key for Flink access, decide whether you want to create long-running statements. If you need long-running statements, use a service account and create an API key for it. If you only need to run interactive queries or run statements for a short time while developing queries, you can create an API key for your user account.

A Flink API key is scoped to an environment and region pair, for example, env-abc123.aws.us-east-2. The key enables creating, reading, updating, and deleting Flink SQL statements.

To create an API key for Flink access by using the Confluent Cloud APIs or the Confluent Terraform Provider, you must first create a Cloud API key. This step is done automatically if you use the Confluent CLI to create an API key for Flink access.

#### Create a service account (optional)

If you need to create long-running Flink SQL statements, create a service account principal before you create a Flink API key.

1. Create a service account by using the Cloud Console or the CLI.
2. Assign the OrganizationAdmin role to the service account by following the steps in Add a role binding to a principal.
3. Store the service account ID in a convenient location, for example, in an environment variable: export PRINCIPAL_ID="<service-account-id>"

#### Generate an API Key

You can use the Confluent Cloud APIs, the Confluent Terraform Provider, the Confluent CLI, or the Cloud Console to create an API key for Flink access. For more information, see Manage API Keys.

Cloud Console: You can use the Cloud Console to generate an API key for Flink access.

1. Log in to the Confluent Cloud Console and navigate to the environment that hosts your data and compute pools.
2. Click Flink and in the Flink overview page, click API keys.
3. Click Add API Key to open the Create API key page.
4. Select either the My account tile to create an API key for your user account or the Service account tile to create an API key for a service account. For production Flink deployments, select the Service account option, and click either Existing account or New account to assign the service account principal.
5. Click Next to open the Resource scope page.
6. Select the cloud provider and region for the API key. Ensure that you choose the same provider and region where your data and compute pools are located.
7. Click Next to open the API key detail page.
8. Enter a name and a description for the new API key. This step is optional.
9. Click Create API key. The API key download page opens.
10. Click Download API key and save the key to a secure location on your local machine.
11. Click Complete.

Confluent CLI: You can use the Confluent CLI to generate an API key for Flink access. For more information, see confluent api-key create.

Log in to Confluent Cloud: confluent login

To see the available regions for Flink, run the following command: confluent flink region list

Your output should resemble:

| Current | Name | Cloud | Region |
|---------|------|-------|--------|
| | Frankfurt (eu-central-1) | aws | eu-central-1 |
| | Ireland (eu-west-1) | aws | eu-west-1 |
| * | N. Virginia (us-east-1) | aws | us-east-1 |
| | Ohio (us-east-2) | aws | us-east-2 |

Run the following commands to create an API key. Ensure that the environment variables are set to your values.

- export CLOUD_PROVIDER=aws
- export CLOUD_REGION=us-east-1
- export ENV_ID=env-a12b34
- confluent api-key create --resource flink --cloud ${CLOUD_PROVIDER} --region ${CLOUD_REGION} --environment ${ENV_ID}

Your output should resemble:

It may take a couple of minutes for the API key to be ready. Save the API key and secret. The secret is not retrievable later.

| API Key | ABC1DDN2BNASQVRU |
|---------|------------------|
| API Secret | B0b+xCoSPY2pSNETeuyrziWmsPmou0WP9rH0Nxed4y4/msnESzjj7kBrRWGOMu1a |

If the environment, cloud, and region flags are set globally, you can create an API key by running confluent api-key create --resource flink. For more information, see Manage API Keys in Confluent Cloud.

To create an API key for an existing service account, provide the --service-account <sa-a1b2c3> option. This enables submitting long-running Flink SQL statements.

Confluent Cloud APIs: To create an API key for Flink access by using the Confluent Cloud APIs, you must first create a Cloud API key. To generate the Flink key, you send your Cloud API key and secret in the request header, encoded as a base64 string.

Create a Cloud API key for the principal, which is either a service account or your user account. For more information, see Add an API key.

Assign the Cloud API key and secret to environment variables that you use in your REST API requests.

- export CLOUD_API_KEY="<cloud-api-key>"
- export CLOUD_API_SECRET="<cloud-api-secret>"
- export PRINCIPAL_ID="<service-account-id>" # or "<user-account-id>"
- export ENV_REGION_ID="<environment-id>.<cloud-region>" # example: "env-z3y2x1.aws.us-east-1"

The ENV_REGION_ID variable is a concatenation of your environment ID and the cloud provider and region of your Kafka cluster, separated by . characters. To see the available regions, run the confluent flink region list command.

Run the following command to send a POST request to the api-keys endpoint. The REST API uses basic authentication, which means that you provide a base64-encoded string made from your Cloud API key and secret in the request header.

curl --request POST --url 'https://api.confluent.cloud/iam/v2/api-keys' --header "Authorization: Basic $(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)" --header 'content-type: application/json' --data "{\"spec\":{\"display_name\":\"flinkapikey\",\"owner\":{\"id\":\"${PRINCIPAL_ID}\"},\"resource\":{\"api_version\":\"fcpm/v2\",\"id\":\"${ENV_REGION_ID}\"}}}"

Your output should resemble:

{ "api_version": "iam/v2", "id": "KJDYFDMBOBDNQEIU", "kind": "ApiKey", "metadata": { "created_at": "2023-12-15T23:10:20.406556Z", "resource_name": "crn://api.confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/user=u-lq1dr3/api-key=KJDYFDMBOBDNQEIU", "self": "https://api.confluent.cloud/iam/v2/api-keys/KJDYFDMBOBDNQEIU", "updated_at": "2023-12-15T23:10:20.406556Z" }, "spec": { "description": "", "display_name": "flinkapikey", "owner": { "api_version": "iam/v2", "id": "u-lq1dr3", "kind": "User", "related": "https://api.confluent.cloud/iam/v2/users/u-lq2dr7", "resource_name": "crn://api.confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/user=u-lq2dr7" }, "resource": { "api_version": "fcpm/v2", "id": "env-z3q9rd.aws.us-east-1", "kind": "Region", "related": "https://api.confluent.cloud/fcpm/v2/regions?cloud=aws", "resource_name": "crn://api.confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3q9rd/flink-region=aws.us-east-1" }, "secret": "B0BYFzyd0bb5Q58ZZJJYV52mbwDDHnZx21f0gOTz2k6Qv2V9I4KraVztwFOlQx6z" } }

Terraform: You can use the Confluent Terraform Provider to generate an API key for Flink access. Follow the steps in Sample Project for Confluent Terraform Provider and use the configuration shown in Example Flink API Key.

When your API key and secret are generated, save them in environment variables for later use.

- export FLINK_API_KEY="<flink-api-key>"
- export FLINK_API_SECRET="<flink-api-secret>"

You can manage the API key by using the Confluent CLI commands. For more information, see confluent api-key. You can also use the REST API and the Cloud Console.

Next steps: Flink SQL REST API

Related content: Manage API Keys; Confluent CLI commands with Confluent Cloud for Apache Flink

#### Code Examples

```sql
env-abc123.aws.us-east-2
```

```sql
export PRINCIPAL_ID="<service-account-id>"
```

```sql
confluent login
```

```sql
confluent flink region list
```

```sql
Current |           Name           | Cloud |    Region
----------+--------------------------+-------+---------------
          | Frankfurt (eu-central-1) | aws   | eu-central-1
          | Ireland (eu-west-1)      | aws   | eu-west-1
  *       | N. Virginia (us-east-1)  | aws   | us-east-1
          | Ohio (us-east-2)         | aws   | us-east-2
```

```sql
# Example values for environment variables.
export CLOUD_PROVIDER=aws
export CLOUD_REGION=us-east-1
export ENV_ID=env-a12b34

# Generate the API key and secret.
confluent api-key create \
  --resource flink \
  --cloud ${CLOUD_PROVIDER} \
  --region ${CLOUD_REGION} \
  --environment ${ENV_ID}
```
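When the CLI call above is scripted, building the argument list in code avoids shell-quoting mistakes. The following Python sketch uses only the flags shown in this section (--resource, --cloud, --region, --environment, and the optional --service-account). The helper name api_key_create_args is illustrative, and the sketch only prints the command instead of running it.

```python
import shlex
from typing import List, Optional


def api_key_create_args(
    cloud: str,
    region: str,
    environment: str,
    service_account: Optional[str] = None,
) -> List[str]:
    """Return the argv list for creating a Flink API key with the Confluent CLI."""
    args = [
        "confluent", "api-key", "create",
        "--resource", "flink",
        "--cloud", cloud,
        "--region", region,
        "--environment", environment,
    ]
    # A service account is needed only for long-running statements.
    if service_account:
        args += ["--service-account", service_account]
    return args


args = api_key_create_args("aws", "us-east-1", "env-a12b34", service_account="sa-a1b2c3")
# Print the command for review; pass args to subprocess.run to execute it.
print(shlex.join(args))
```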

```sql
It may take a couple of minutes for the API key to be ready.
Save the API key and secret. The secret is not retrievable later.
+------------+------------------------------------------------------------------+
| API Key    | ABC1DDN2BNASQVRU                                                 |
| API Secret | B0b+xCoSPY2pSNETeuyrziWmsPmou0WP9rH0Nxed4y4/msnESzjj7kBrRWGOMu1a |
+------------+------------------------------------------------------------------+
```

```sql
confluent api-key create --resource flink
```

```sql
--service-account <sa-a1b2c3>
```

```sql
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export PRINCIPAL_ID="<service-account-id>" # or "<user-account-id>"
export ENV_REGION_ID="<environment-id>.<cloud-region>" # example: "env-z3y2x1.aws.us-east-1"
```
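The ENV_REGION_ID value above joins the environment ID, cloud provider, and region with dots, as the env-z3y2x1.aws.us-east-1 example shows. The following Python sketch composes and parses that identifier. The names FlinkScope, format_scope, and parse_scope are hypothetical, and the parser assumes that neither the environment ID nor the cloud name contains a dot.

```python
from typing import NamedTuple


class FlinkScope(NamedTuple):
    environment: str
    cloud: str
    region: str


def format_scope(environment: str, cloud: str, region: str) -> str:
    """Join the three parts into the <environment>.<cloud>.<region> form."""
    return f"{environment}.{cloud}.{region}"


def parse_scope(scope_id: str) -> FlinkScope:
    """Split a scope identifier back into its three parts."""
    # Split on the first two dots only, so the last part keeps any remaining text.
    parts = scope_id.split(".", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected <environment>.<cloud>.<region>, got {scope_id!r}")
    return FlinkScope(*parts)


print(format_scope("env-z3y2x1", "aws", "us-east-1"))
print(parse_scope("env-abc123.aws.us-east-2"))
```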

```sql
confluent flink region list
```

```sql
curl --request POST \
  --url 'https://api.confluent.cloud/iam/v2/api-keys' \
  --header "Authorization: Basic $(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)" \
  --header 'content-type: application/json' \
  --data "{\"spec\":{\"display_name\":\"flinkapikey\",\"owner\":{\"id\":\"${PRINCIPAL_ID}\"},\"resource\":{\"api_version\":\"fcpm/v2\",\"id\":\"${ENV_REGION_ID}\"}}}"
```
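The same POST request can be prepared in Python with the standard library, which takes care of both the base64 encoding and the JSON quoting. This is a minimal sketch: build_flink_api_key_request is a hypothetical helper, the key and secret values are placeholders, and the request is constructed but not sent.

```python
import base64
import json
import urllib.request


def build_flink_api_key_request(
    cloud_api_key: str,
    cloud_api_secret: str,
    principal_id: str,
    env_region_id: str,
    display_name: str = "flinkapikey",
) -> urllib.request.Request:
    """Prepare the POST request that creates a Flink API key."""
    # Basic authentication: base64 of "<key>:<secret>".
    token = base64.b64encode(f"{cloud_api_key}:{cloud_api_secret}".encode("utf-8")).decode("ascii")
    body = {
        "spec": {
            "display_name": display_name,
            "owner": {"id": principal_id},
            "resource": {"api_version": "fcpm/v2", "id": env_region_id},
        }
    }
    return urllib.request.Request(
        "https://api.confluent.cloud/iam/v2/api-keys",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Authorization": f"Basic {token}", "Content-Type": "application/json"},
    )


request = build_flink_api_key_request("key", "secret", "sa-a1b2c3", "env-z3y2x1.aws.us-east-1")
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen and parse the response body as JSON.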

```sql
{
  "api_version": "iam/v2",
  "id": "KJDYFDMBOBDNQEIU",
  "kind": "ApiKey",
  "metadata": {
    "created_at": "2023-12-15T23:10:20.406556Z",
    "resource_name": "crn://api.confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/user=u-lq1dr3/api-key=KJDYFDMBOBDNQEIU",
    "self": "https://api.confluent.cloud/iam/v2/api-keys/KJDYFDMBOBDNQEIU",
    "updated_at": "2023-12-15T23:10:20.406556Z"
  },
  "spec": {
    "description": "",
    "display_name": "flinkapikey",
    "owner": {
      "api_version": "iam/v2",
      "id": "u-lq1dr3",
      "kind": "User",
      "related": "https://api.confluent.cloud/iam/v2/users/u-lq2dr7",
      "resource_name": "crn://api.confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/user=u-lq2dr7"
    },
    "resource": {
      "api_version": "fcpm/v2",
      "id": "env-z3q9rd.aws.us-east-1",
      "kind": "Region",
      "related": "https://api.confluent.cloud/fcpm/v2/regions?cloud=aws",
      "resource_name": "crn://api.confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3q9rd/flink-region=aws.us-east-1"
    },
    "secret": "B0BYFzyd0bb5Q58ZZJJYV52mbwDDHnZx21f0gOTz2k6Qv2V9I4KraVztwFOlQx6z"
  }
}
```
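In the response above, the new key is in the id field and the secret is in spec.secret. The following Python sketch extracts both and prints export lines for them. The helper names are illustrative, and the sample values are placeholders rather than real credentials.

```python
import shlex
from typing import List, Tuple


def flink_credentials(response: dict) -> Tuple[str, str]:
    """Return (key, secret) from an ApiKey creation response."""
    key = response.get("id")
    secret = response.get("spec", {}).get("secret")
    if not key or not secret:
        raise KeyError("response does not contain an API key ID and secret")
    return key, secret


def export_lines(key: str, secret: str) -> List[str]:
    """Format shell export statements, quoting the values when needed."""
    return [
        f"export FLINK_API_KEY={shlex.quote(key)}",
        f"export FLINK_API_SECRET={shlex.quote(secret)}",
    ]


response = {"id": "KJDYFDMBOBDNQEIU", "kind": "ApiKey", "spec": {"secret": "example-secret"}}
key, secret = flink_credentials(response)
print("\n".join(export_lines(key, secret)))
```

Because the secret is not retrievable later, store it immediately in a secure location.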

```sql
export FLINK_API_KEY="<flink-api-key>"
export FLINK_API_SECRET="<flink-api-secret>"
```

---

### Manage Flink Connections in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/operate-and-deploy/manage-connections.html

A connection in Confluent Cloud for Apache Flink® represents an external service that is used in your Flink statements. Connections are used to access external services, such as databases, APIs, and other systems, from your Flink statements.

To create a connection, you need the OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin RBAC role.

Confluent Cloud for Apache Flink makes a best-effort attempt to redact sensitive values from the CREATE CONNECTION and ALTER CONNECTION statements by masking the values for the known sensitive keys. In Confluent Cloud Console, the sensitive values are redacted in the Flink SQL workspace if you navigate away from the workspace and return, or if you reload the page in the browser. Alternatively, you can use the Confluent CLI commands to create and manage connections. In addition, if syntax in the CREATE CONNECTION statement is incorrect, Confluent Cloud for Apache Flink may not detect the secrets. For example, if you type CREATE CONNECTION my_conn WITH ('ap-key' = 'x'), Flink won’t redact the x, because api-key is misspelled.

Note: Connection resources are an Open Preview feature in Confluent Cloud. A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features. Confluent may discontinue providing preview releases of the Preview features at any time in Confluent’s sole discretion.

#### Create a connection

Flink SQL: In the Confluent Cloud Console or in the Flink SQL shell, run the CREATE CONNECTION statement to create a connection. The following example creates an OpenAI connection with an API key.

CREATE CONNECTION `my-connection` WITH ( 'type' = 'OPENAI', 'endpoint' = 'https://<your-endpoint>.openai.azure.com/openai/deployments/<deployment-name>/chat/completions?api-version=2025-01-01-preview', 'api-key' = '<your-api-key>' );

The following example creates a MongoDB connection with basic authorization.

CREATE CONNECTION `my-mongodb-connection` WITH ( 'type' = 'MONGODB', 'endpoint' = 'mongodb+srv://myCluster.mongodb.net/myDatabase', 'username' = '<atlas-user-name>', 'password' = '<atlas-password>' );

Run the CREATE TABLE statement to create a table that uses the connection. The following example creates a MongoDB external table that uses the MongoDB connection.

CREATE TABLE mongodb_movies_full_text_search ( title STRING, plot STRING ) WITH ( 'connector' = 'mongodb', 'mongodb.connection' = 'my-mongodb-connection', 'mongodb.database' = 'sample_mflix', 'mongodb.collection' = 'movies', 'mongodb.index' = 'default' );

Confluent Cloud Console:

1. In the navigation menu, click Environments, and click the tile for the environment where you’re using Flink SQL.
2. In the navigation menu, click Integrations.
3. Click Connections, then click Add connection. The available services are listed.
4. Click the tile of the service you want to connect to, and click Continue. The Define endpoint and credentials page opens.
5. In the Endpoint textbox, enter the URL for the service you want to connect to.
6. In the following fields, enter your credentials, which may be an API key, a username/password pair, or another type of credential, like a Service Account Key, depending on the service.
7. Click Continue. The Review and launch page opens.
8. In the Cloud provider and Region dropdowns, select the cloud provider and region where your Flink statements run. Important: You can access the connection only from a workspace that is in the same region as the connection.
9. Click Create connection. The connection is created and you can use it in your Flink statements.
Note: You can edit the credentials later, but you can’t change the other properties, like the cloud provider or region.

Confluent CLI: Run the confluent flink connection create command to create a connection. Creating a connection requires the following inputs. Credentials vary by service.

- export CONNECTION_NAME="<connection-name>" # human-readable name, for example, "azure-openai-connection"
- export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
- export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
- export ENV_ID="<environment-id>" # example: "env-a1b2c3"
- export CONNECTION_TYPE="<connection-type>" # example: "azureopenai"
- export ENDPOINT="<endpoint>" # example: "https://<your-project>.openai.azure.com/openai/deployments/<deployment-name>/chat/completions?api-version=2025-01-01-preview"
- export API_KEY="<api-key>"

Run the following command to create a connection in the specified cloud provider and environment.

confluent flink connection create ${CONNECTION_NAME} --cloud ${CLOUD_PROVIDER} --region ${CLOUD_REGION} --environment ${ENV_ID} --type ${CONNECTION_TYPE} --endpoint ${ENDPOINT} --api-key ${API_KEY}

Your output should resemble:

| Creation Date | 2025-08-13 22:04:57.972969 +0000 UTC |
|---------------|--------------------------------------|
| Name | azure-openai-connection |
| Environment | env-a1b2c3 |
| Cloud | aws |
| Region | us-west-2 |
| Type | AZUREOPENAI |
| Endpoint | https://<your-project-endpoint> |
| Data | <REDACTED> |
| Status | |

REST API: Create a connection in your environment by sending a POST request to the Connections endpoint. Creating a connection requires the following inputs. Credentials vary by service.
- export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection"
- export CONNECTION_TYPE="<connection-type>" # example: "OPENAI"
- export ENDPOINT="<endpoint>" # example: "https://api.openai.com/v1/chat/completions"
- export CLOUD_API_KEY="<cloud-api-key>"
- export CLOUD_API_SECRET="<cloud-api-secret>"
- export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
- export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
- export ENV_ID="<environment-id>" # example: "env-a1b2c3"
- export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
- export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
- export JSON_DATA="<payload-string>"

The following JSON shows an example payload. The auth_data key varies by service.

{ "name": "${CONNECTION_NAME}", "spec": { "connection_type": "${CONNECTION_TYPE}", "endpoint": "${ENDPOINT}", "auth_data": { "kind": "PlaintextProvider", "data": "string" } }, "metadata": {} }

Quotation mark characters in the JSON string must be escaped, so the payload string to send resembles the following:

export JSON_DATA="{ \"name\": \"${CONNECTION_NAME}\", \"spec\": { \"connection_type\": \"${CONNECTION_TYPE}\", \"endpoint\": \"${ENDPOINT}\", \"auth_data\": { \"kind\": \"PlaintextProvider\", \"data\": \"string\" } }, \"metadata\": {} }"

The following command sends a POST request to create a connection.

curl --request POST --url "https://flink.${CLOUD_REGION}.${CLOUD_PROVIDER}.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections" --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" --header 'content-type: application/json' --data "${JSON_DATA}"

Your output should resemble the following response from a request to create a connection:

{ "api_version": "sql/v1", "kind": "Connection", "metadata": { "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-a1b2c3/connections/my-openai-connection", "resource_name": "", "created_at": "2006-01-02T15:04:05-07:00", "updated_at": "2006-01-02T15:04:05-07:00", "deleted_at": "2006-01-02T15:04:05-07:00", "uid": "12345678-1234-1234-1234-123456789012", "resource_version": "a23av" }, "name": "my-openai-connection", "spec": { "connection_type": "OPENAI", "endpoint": "https://api.openai.com/v1/chat/completions", "auth_data": { "kind": "PlaintextProvider", "data": "string" } }, "status": { "phase": "READY", "detail": "Lookup failed: ai.openai.com" } }

Terraform: To create a connection by using the Confluent Terraform provider, use the confluent_flink_connection resource.

Configure your Terraform file. Provide your Confluent Cloud API key and secret.

terraform { required_providers { confluent = { source = "confluentinc/confluent" version = "2.44.0" } } } provider "confluent" { cloud_api_key = var.confluent_cloud_api_key # optionally use CONFLUENT_CLOUD_API_KEY env var cloud_api_secret = var.confluent_cloud_api_secret # optionally use CONFLUENT_CLOUD_API_SECRET env var }

Define the confluent_flink_connection resource with the required parameters, like display_name, cloud, region, and the environment ID.
resource "confluent_flink_connection" "openai-connection" { organization { id = data.confluent_organization.main.id } environment { id = data.confluent_environment.staging.id } compute_pool { id = confluent_flink_compute_pool.example.id } principal { id = confluent_service_account.app-manager-flink.id } rest_endpoint = data.confluent_flink_region.main.rest_endpoint credentials { key = confluent_api_key.env-admin-flink-api-key.id secret = confluent_api_key.env-admin-flink-api-key.secret } display_name = "connection1" type = "OPENAI" endpoint = "https://api.openai.com/v1/chat/completions" api_key = "API_Key_value" lifecycle { prevent_destroy = true } }

Run the terraform apply command to create the resources: terraform apply

For more information, see confluent_flink_connection resource.

#### View details for a connection

Flink SQL: In the Confluent Cloud Console or in the Flink SQL shell, run the DESCRIBE CONNECTION statement to get details about a connection.

DESCRIBE CONNECTION `my-connection`;

Your output should resemble:

| Creation Date | 2025-08-13 22:04:57.972969 +0000 UTC |
|---------------|--------------------------------------|
| Name | azure-openai-connection |
| Environment | env-a1b2c3 |
| Cloud | aws |
| Region | us-west-2 |
| Type | AZUREOPENAI |
| Endpoint | https://<your-project-endpoint> |
| Data | <REDACTED> |
| Status | |

Confluent Cloud Console:

1. In the navigation menu, click Environments, and click the tile for the environment where you’re using Flink SQL.
2. In the navigation menu, click Integrations.
3. Click Connections.
4. In the listed connections, find the one you want to view. If you have many connections in the list, use the search bar to find the connection.
5. Click the connection name to view the connection details.

Confluent CLI: Run the confluent flink connection describe command to get details about a connection. Describing a connection requires the following inputs:

- export CONNECTION_NAME="<connection-name>" # example: "azure-openai-connection"
- export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
- export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
- export ENV_ID="<environment-id>" # example: "env-a1b2c3"

Run the following command to get details about a connection.

confluent flink connection describe ${CONNECTION_NAME} --cloud ${CLOUD_PROVIDER} --region ${CLOUD_REGION} --environment ${ENV_ID}

Your output should resemble:

| Creation Date | 2025-08-13 22:04:57.972969 +0000 UTC |
|---------------|--------------------------------------|
| Name | azure-openai-connection |
| Environment | env-a1b2c3 |
| Cloud | aws |
| Region | us-west-2 |
| Type | AZUREOPENAI |
| Endpoint | https://<your-project-endpoint> |
| Data | <REDACTED> |
| Status | |

REST API: Get the details about a connection in your environment by sending a GET request to the Connections endpoint. This request uses your Cloud API key instead of the Flink API key. Getting details about a connection requires the following inputs:

- export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection"
- export CLOUD_API_KEY="<cloud-api-key>"
- export CLOUD_API_SECRET="<cloud-api-secret>"
- export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
- export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
- export ENV_ID="<environment-id>" # example: "env-a1b2c3"

Run the following command to get details about the connection specified in the CONNECTION_NAME environment variable.

curl --request GET --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections/${CONNECTION_NAME}" --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"

Your output should resemble the following response from a request to read a connection:

{ "api_version": "sql/v1", "kind": "Connection", "metadata": { "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-123/connections/my-openai-connection", "resource_name": "", "created_at": "2006-01-02T15:04:05-07:00", "updated_at": "2006-01-02T15:04:05-07:00", "deleted_at": "2006-01-02T15:04:05-07:00", "uid": "12345678-1234-1234-1234-123456789012", "resource_version": "a23av" }, "name": "my-openai-connection", "spec": { "connection_type": "OPENAI", "endpoint": "https://api.openai.com/v1/chat/completions", "auth_data": { "kind": "PlaintextProvider", "data": "string" } }, "status": { "phase": "READY", "detail": "Lookup failed: ai.openai.com" } }

Terraform: To view details for a connection by using the Confluent Terraform provider, use the confluent_flink_connection data source.

data "confluent_flink_connection" "existing_connection" { organization { id = "<your-organization-id>" } environment { id = "<your-environment-id>" } compute_pool { id = "<your-compute-pool-id>" } principal { id = "<your-service-account-id>" } rest_endpoint = "<your-flink-rest-endpoint>" credentials { key = "<your-flink-api-key>" secret = "<your-flink-api-secret>" } display_name = "my_connection" type = "JDBC" } output "connection_endpoint" { value = data.confluent_flink_connection.existing_connection.endpoint }

Run the terraform apply or terraform output command. The connection_endpoint output contains details for the connection. To inspect specific attributes after your configuration has been applied, run the terraform output command: terraform output connection_endpoint

For more information, see confluent_flink_connection data source.
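The REST responses in this section report a connection's state in a status object with phase and detail fields. The following Python sketch turns such a response into a one-line summary. The connection_status helper is hypothetical, and the UNKNOWN label is a fallback used only by this sketch, not a documented phase value.

```python
def connection_status(connection: dict) -> str:
    """Summarize a Connection response as '<name>: <phase> (<detail>)'."""
    status = connection.get("status") or {}
    # "UNKNOWN" is this sketch's own fallback when no phase is reported.
    phase = status.get("phase", "UNKNOWN")
    detail = status.get("detail", "")
    name = connection.get("name", "<unnamed>")
    return f"{name}: {phase}" + (f" ({detail})" if detail else "")


sample = {
    "name": "my-openai-connection",
    "status": {"phase": "READY", "detail": "Lookup failed: ai.openai.com"},
}
print(connection_status(sample))
```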
#### List connections

Flink SQL: In the Confluent Cloud Console or in the Flink SQL shell, run the SHOW CONNECTIONS statement to list the connections.

SHOW CONNECTIONS;

Your output should resemble:

| Creation Date | Name | Environment | Cloud | Region | Type | Endpoint | Data | Status | Status Detail |
|---------------|------|-------------|-------|--------|------|----------|------|--------|---------------|
| 2025-08-13 21:05:15.035376 +0000 UTC | azureopenai-connection-2 | env-a1b2c3 | aws | us-west-2 | AZUREOPENAI | https://<your-project-endpoint> | <REDACTED> | | |
| 2025-08-13 22:04:57.972969 +0000 UTC | azure-openai-connection | env-a1b2c3 | aws | us-west-2 | AZUREOPENAI | https://<your-project-endpoint> | <REDACTED> | | |

Confluent Cloud Console:

1. In the navigation menu, click Environments, and click the tile for the environment where you’re using Flink SQL.
2. In the navigation menu, click Integrations.
3. Click Connections. The available connections are listed.

Confluent CLI: Run the confluent flink connection list command to list connections in the specified environment. Listing connections requires the following inputs:

- export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
- export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
- export ENV_ID="<environment-id>" # example: "env-a1b2c3"

Run the following command to list connections in the specified environment.

confluent flink connection list --cloud ${CLOUD_PROVIDER} --region ${CLOUD_REGION} --environment ${ENV_ID}

Your output should resemble:

| Creation Date | Name | Environment | Cloud | Region | Type | Endpoint | Data | Status | Status Detail |
|---------------|------|-------------|-------|--------|------|----------|------|--------|---------------|
| 2025-08-13 21:05:15.035376 +0000 UTC | azureopenai-connection-2 | env-a1b2c3 | aws | us-west-2 | AZUREOPENAI | https://<your-project-endpoint> | <REDACTED> | | |
| 2025-08-13 22:04:57.972969 +0000 UTC | azure-openai-connection | env-a1b2c3 | aws | us-west-2 | AZUREOPENAI | https://<your-project-endpoint> | <REDACTED> | | |

REST API: List the connections in your environment by sending a GET request to the Connections endpoint. This request uses your Cloud API key instead of the Flink API key. Listing the connections in your environment requires the following inputs:

- export CLOUD_API_KEY="<cloud-api-key>"
- export CLOUD_API_SECRET="<cloud-api-secret>"
- export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
- export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
- export ENV_ID="<environment-id>" # example: "env-a1b2c3"

Run the following command to list the connections in your environment.

curl --request GET --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections" --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"

Your output should resemble the following response from a request to list connections:

{ "api_version": "sql/v1", "kind": "ConnectionList", "metadata": { "first": "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-abc123/connections", "last": "", "prev": "", "next": "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-abc123/connections?page_token=UvmDWOB1iwfAIBPj6EYb", "total_size": 123, "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-123/connections" }, "data": [ { "api_version": "sql/v1", "kind": "Connection", "metadata": { "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-123/connections/my-openai-connection", "resource_name": "", "created_at": "2006-01-02T15:04:05-07:00", "updated_at": "2006-01-02T15:04:05-07:00", "deleted_at": "2006-01-02T15:04:05-07:00", "uid": "12345678-1234-1234-1234-123456789012", "resource_version": "a23av" }, "name": "my-openai-connection", "spec": { "connection_type": "OPENAI", "endpoint": "https://api.openai.com/v1/chat/completions", "auth_data": { "kind": "PlaintextProvider", "data": "string" } }, "status": { "phase": "READY", "detail": "Lookup failed: ai.openai.com" } } ] }

Terraform: The Confluent Terraform provider does not support a plural data source or enumeration method that lists all existing connection resources in one operation. To view all connections, you must use Flink SQL, Confluent Cloud Console, the CLI, or the REST API. If you use the Flink SQL REST API, you could integrate the response list into Terraform workflows by scripting an external data source that queries the Flink SQL API, parses the results, and feeds them into Terraform by using the external provider. This is a custom integration, not a supported feature.
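As an illustration of the custom integration mentioned above, the following Python sketch follows the protocol of Terraform's external data source: the program reads a JSON query from stdin and writes a JSON object whose values are all strings to stdout. It assumes, purely for this sketch, that the query carries an already-fetched list response as a JSON string under a hypothetical key named response. A real script would call the REST API instead.

```python
import io
import json
import sys


def connections_to_external_result(connection_list: dict) -> dict:
    """Flatten a ConnectionList into the string-to-string map Terraform expects."""
    result = {}
    for item in connection_list.get("data", []):
        spec = item.get("spec", {})
        # Map each connection name to its type; all values must be strings.
        result[item["name"]] = str(spec.get("connection_type", ""))
    return result


def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
    # Terraform passes the data source's query as a JSON object on stdin.
    query = json.load(stdin)
    # Assumption: the hypothetical "response" key holds the list response as a JSON string.
    connection_list = json.loads(query["response"])
    json.dump(connections_to_external_result(connection_list), stdout)


# Usage with in-memory streams instead of a real Terraform run.
sample = {"data": [{"name": "my-openai-connection", "spec": {"connection_type": "OPENAI"}}]}
out = io.StringIO()
main(io.StringIO(json.dumps({"response": json.dumps(sample)})), out)
print(out.getvalue())
```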
For more information, see confluent_flink_connection.

#### Update a connection

You can update only the credentials for a connection.

In the Confluent Cloud Console or in the Flink SQL shell, run the ALTER CONNECTION statement to update the connection. ALTER CONNECTION `my-connection` SET ('api-key' = '<new-api-key>'); Your output should resemble: +---------------+------------------------------------+ | Creation Date | 2025-08-13 22:04:57.972969 | | | +0000 UTC | | Name | azure-openai-connection | | Environment | env-a1b2c3 | | Cloud | aws | | Region | us-west-2 | | Type | AZUREOPENAI | | Endpoint | https://<your-project-endpoint> | | Data | <REDACTED> | | Status | | +---------------+------------------------------------+

In the navigation menu, click Environments, and click the tile for the environment where you’re using Flink SQL. In the navigation menu, click Integrations. Click Connections. In the listed connections, find the one you want to update, and click the options icon (⋮). In the context menu, click Edit connection. In the credentials fields, enter the new credentials for the connection. Click Save changes. The connection is updated, and you can use it in your Flink statements.

Run the confluent flink connection update command to update a connection. Updating a connection requires the following inputs. Credentials vary by service. export CONNECTION_NAME="<connection-name>" # example: "azure-openai-connection" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export ENV_ID="<environment-id>" # example: "env-a1b2c3" export ENDPOINT="<endpoint>" # example: "https://<your-project>.openai.azure.com/openai/deployments/<deployment-name>/chat/completions?api-version=2025-01-01-preview" export NEWAPI_KEY="<new-api-key>" Run the following command to update a connection.
confluent flink connection update ${CONNECTION_NAME} \ --cloud ${CLOUD_PROVIDER} \ --region ${CLOUD_REGION} \ --environment ${ENV_ID} \ --endpoint ${ENDPOINT} \ --api-key ${NEWAPI_KEY} Your output should resemble: +---------------+------------------------------------+ | Creation Date | 2025-08-13 22:04:57.972969 | | | +0000 UTC | | Name | azure-openai-connection | | Environment | env-a1b2c3 | | Cloud | aws | | Region | us-west-2 | | Type | AZUREOPENAI | | Endpoint | https://<your-project-endpoint> | | Data | <REDACTED> | | Status | | +---------------+------------------------------------+

Update a connection in your environment by sending a PUT request to the Connections endpoint. This request uses your Cloud API key instead of the Flink API key. Updating a connection requires the following inputs: export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection" export CONNECTION_TYPE="<connection-type>" # example: "OPENAI" export ENDPOINT="<endpoint>" # example: "https://api.openai.com/v1/chat/completions" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87" export ENV_ID="<environment-id>" # example: "env-a1b2c3" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export JSON_DATA="<payload-string>" The following JSON shows an example payload. The auth_data key varies by service.
{ "name": "${CONNECTION_NAME}", "spec": { "connection_type": "${CONNECTION_TYPE}", "endpoint": "${ENDPOINT}", "auth_data": { "kind": "PlaintextProvider", "data": "string" } }, "metadata": {} } Quotation mark characters in the JSON string must be escaped, so the payload string to send resembles the following: export JSON_DATA="{ \"name\": \"${CONNECTION_NAME}\", \"spec\": { \"connection_type\": \"${CONNECTION_TYPE}\", \"endpoint\": \"${ENDPOINT}\", \"auth_data\": { \"kind\": \"PlaintextProvider\", \"data\": \"string\" } }, \"metadata\": {} }" The following command sends a PUT request to update a connection. curl --request PUT \ --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections/${CONNECTION_NAME}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \ --header 'content-type: application/json' \ --data "${JSON_DATA}" Your output should resemble: { "api_version": "sql/v1", "kind": "Connection", "metadata": { "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-a1b2c3/connections/my-openai-connection", "resource_name": "", "created_at": "2006-01-02T15:04:05-07:00", "updated_at": "2006-01-02T15:04:05-07:00", "deleted_at": "2006-01-02T15:04:05-07:00", "uid": "12345678-1234-1234-1234-123456789012", "resource_version": "a23av" }, "name": "my-openai-connection", "spec": { "connection_type": "OPENAI", "endpoint": "https://api.openai.com/v1/chat/completions", "auth_data": { "kind": "PlaintextProvider", "data": "string" } }, "status": { "phase": "READY", "detail": "Lookup failed: ai.openai.com" } }

To update a connection by using the Confluent Terraform provider, use the confluent_flink_connection resource. Find the definition for the connection resource in your Terraform configuration, for example: resource "confluent_flink_connection" "openai-connection" { ...
credentials { api_key = confluent_api_key.env-admin-flink-api-key.id } } Modify the attributes of the confluent_flink_connection resource in the Terraform configuration file. The following example updates the api_key attribute. resource "confluent_flink_connection" "openai-connection" { ... credentials { api_key = confluent_api_key.env-admin-flink-api-key.id # Updated value } } Run the terraform apply command to update the connection with the new configuration. terraform apply For more information, see confluent_flink_connection.

#### Delete a connection

In the Confluent Cloud Console or in the Flink SQL shell, run the DROP CONNECTION statement to delete the connection. DROP CONNECTION `my-connection`;

In the navigation menu, click Environments, and click the tile for the environment where you’re using Flink SQL. In the navigation menu, click Integrations. Click Connections. In the listed connections, find the one you want to delete, and click the options icon (⋮). In the context menu, click Delete connection. In the dialog, enter the connection name, and click Confirm. The connection is deleted.

Run the confluent flink connection delete command to delete a connection. Deleting a connection requires the following inputs: export CONNECTION_NAME="<connection-name>" # example: "azure-openai-connection" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export ENV_ID="<environment-id>" # example: "env-a1b2c3" Run the following command to delete a connection. confluent flink connection delete ${CONNECTION_NAME} \ --cloud ${CLOUD_PROVIDER} \ --region ${CLOUD_REGION} \ --environment ${ENV_ID} Your output should resemble: Deleted Flink connection "azure-openai-connection".

Delete a connection in your environment by sending a DELETE request to the Connections endpoint. This request uses your Cloud API key instead of the Flink API key.
Deleting a connection requires the following inputs: export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection" export CLOUD_API_KEY="<cloud-api-key>" export CLOUD_API_SECRET="<cloud-api-secret>" export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0) export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87" export ENV_ID="<environment-id>" # example: "env-a1b2c3" Run the following command to delete the connection specified in the CONNECTION_NAME environment variable. curl --request DELETE \ --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections/${CONNECTION_NAME}" \ --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" To delete a connection by using the Confluent Terraform provider, use the confluent_flink_connection resource. Find the definition for the connection resource in your Terraform configuration and copy the name of the resource. In the following example, the resource name is main. resource "confluent_flink_connection" "main" { display_name = "standard_connection" ... } } To avoid accidental deletions, review the plan before applying the destroy command. terraform plan -destroy -target=confluent_flink_connection.main To delete the connection, run the following command to target the specific resource. This command deletes only the connection and not other resources. terraform apply -destroy -target=confluent_flink_connection.main To remove all resources defined in your Terraform configuration file, including the connection, run the terraform destroy command. terraform destroy For more information, see confluent_flink_connection. Related content¶ Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
CREATE CONNECTION my_conn WITH ('api-key' = 'x');
```

```sql
CREATE CONNECTION `my-connection`
  WITH (
    'type' = 'AZUREOPENAI',
    'endpoint' = 'https://<your-endpoint>.openai.azure.com/openai/deployments/<deployment-name>/chat/completions?api-version=2025-01-01-preview',
    'api-key' = '<your-api-key>'
  );
```

```sql
CREATE CONNECTION `my-mongodb-connection`
  WITH (
    'type' = 'MONGODB',
    'endpoint' = 'mongodb+srv://myCluster.mongodb.net/myDatabase',

    'username' = '<atlas-user-name>',
    'password' = '<atlas-password>'
  );
```

```sql
-- Use the MongoDB connection to create a MongoDB external table.
CREATE TABLE mongodb_movies_full_text_search (
    title STRING,
    plot STRING
) WITH (
    'connector' = 'mongodb',
    'mongodb.connection' = 'my-mongodb-connection',
    'mongodb.database' = 'sample_mflix',
    'mongodb.collection' = 'movies',
    'mongodb.index' = 'default'
);
```

```shell
export CONNECTION_NAME="<connection-name>" # human-readable name, for example, "azure-openai-connection"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export ENV_ID="<environment-id>" # example: "env-a1b2c3"
export CONNECTION_TYPE="<connection-type>" # example: "azureopenai"
export ENDPOINT="<endpoint>" # example: "https://<your-project>.openai.azure.com/openai/deployments/<deployment-name>/chat/completions?api-version=2025-01-01-preview"
export API_KEY="<api-key>"
```

```shell
confluent flink connection create ${CONNECTION_NAME} \
  --cloud ${CLOUD_PROVIDER} \
  --region ${CLOUD_REGION} \
  --environment ${ENV_ID} \
  --type ${CONNECTION_TYPE} \
  --endpoint ${ENDPOINT} \
  --api-key ${API_KEY}
```

```text
+---------------+------------------------------------+
| Creation Date | 2025-08-13 22:04:57.972969         |
|               | +0000 UTC                          |
| Name          | azure-openai-connection            |
| Environment   | env-a1b2c3                         |
| Cloud         | aws                                |
| Region        | us-west-2                          |
| Type          | AZUREOPENAI                        |
| Endpoint      | https://<your-project-endpoint>    |
| Data          | <REDACTED>                         |
| Status        |                                    |
+---------------+------------------------------------+
```

```shell
export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection"
export CONNECTION_TYPE="<connection-type>" # example: "OPENAI"
export ENDPOINT="<endpoint>" # example: "https://api.openai.com/v1/chat/completions"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-a1b2c3"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export JSON_DATA="<payload-string>"
```
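
The `-w 0` flag used with `base64` above disables line wrapping in GNU coreutils, but it may not be accepted by every `base64` implementation. As a minimal sketch, assuming only `printf`, `base64`, and `tr` are available, the hypothetical helper `basic_auth_token` below (an illustration, not part of any Confluent tooling) produces the same single-line value for the `Authorization: Basic` header.

```shell
# Hypothetical helper (illustration only): builds the token that follows
# "Authorization: Basic " from an API key and secret.
basic_auth_token() {
  local key="$1" secret="$2"
  # printf avoids the trailing newline that a plain echo would add;
  # tr removes any line wrapping that a base64 implementation inserts.
  printf '%s' "${key}:${secret}" | base64 | tr -d '\n'
}

# Example with placeholder credentials.
basic_auth_token "key" "secret"   # prints a2V5OnNlY3JldA==
```

You could then assign the result with `export BASE64_CLOUD_KEY_AND_SECRET="$(basic_auth_token "${CLOUD_API_KEY}" "${CLOUD_API_SECRET}")"`.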

```json
{
  "name": "${CONNECTION_NAME}",
  "spec": {
    "connection_type": "${CONNECTION_TYPE}",
    "endpoint": "${ENDPOINT}",
    "auth_data": {
      "kind": "PlaintextProvider",
      "data": "string"
    }
  },
  "metadata": {}
}
```

```shell
export JSON_DATA="{
  \"name\": \"${CONNECTION_NAME}\",
  \"spec\": {
    \"connection_type\": \"${CONNECTION_TYPE}\",
    \"endpoint\": \"${ENDPOINT}\",
    \"auth_data\": {
      \"kind\": \"PlaintextProvider\",
      \"data\": \"string\"
    }
  },
  \"metadata\": {}
}"
```
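
Escaping every quotation mark by hand is easy to get wrong. One possible alternative, sketched here under the assumption that none of the values contain double quotes or backslashes (no JSON escaping is applied), is to build the payload with a here-document. The function name `connection_payload` and its fourth argument are hypothetical and exist only for this illustration.

```shell
# Hypothetical helper (illustration only): prints a connection payload as JSON.
# Assumes the arguments contain no double quotes or backslashes, because the
# values are inserted without JSON escaping.
connection_payload() {
  local name="$1" type="$2" endpoint="$3" auth_data="$4"
  cat <<EOF
{
  "name": "${name}",
  "spec": {
    "connection_type": "${type}",
    "endpoint": "${endpoint}",
    "auth_data": {
      "kind": "PlaintextProvider",
      "data": "${auth_data}"
    }
  },
  "metadata": {}
}
EOF
}

# Example with the placeholder values used elsewhere in this section.
JSON_DATA="$(connection_payload "my-openai-connection" "OPENAI" \
  "https://api.openai.com/v1/chat/completions" "string")"
echo "${JSON_DATA}"
```

Because the here-document delimiter is unquoted, the shell expands the variables while leaving the literal double quotes untouched, so no backslash escaping is needed.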

```shell
curl --request POST \
  --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
  --header 'content-type: application/json' \
  --data "${JSON_DATA}"
```
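
The URL in the request above contains the literal placeholders `region` and `provider`, while `CLOUD_REGION` and `CLOUD_PROVIDER` are exported but not used. The following sketch assumes the host follows the pattern `flink.<region>.<provider>.confluent.cloud`, as the sample responses suggest, and assembles the endpoint from those variables. The function name `connections_url` is hypothetical.

```shell
# Hypothetical helper (illustration only): assembles the Connections endpoint URL.
# Assumes the host pattern flink.<region>.<provider>.confluent.cloud.
connections_url() {
  local region="$1" provider="$2" org_id="$3" env_id="$4" name="${5:-}"
  local url="https://flink.${region}.${provider}.confluent.cloud/sql/v1/organizations/${org_id}/environments/${env_id}/connections"
  # Append the connection name only when one is given
  # (describe, update, and delete requests address a single connection).
  if [ -n "${name}" ]; then
    url="${url}/${name}"
  fi
  printf '%s\n' "${url}"
}

# Example with placeholder identifiers.
connections_url "us-east-1" "aws" "org-abc" "env-a1b2c3" "my-openai-connection"
```

Passing four arguments yields the collection URL used for create and list requests; passing a fifth yields the URL of one connection.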

```json
{
  "api_version": "sql/v1",
  "kind": "Connection",
  "metadata": {
    "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-a1b2c3/connections/my-openai-connection",
    "resource_name": "",
    "created_at": "2006-01-02T15:04:05-07:00",
    "updated_at": "2006-01-02T15:04:05-07:00",
    "deleted_at": "2006-01-02T15:04:05-07:00",
    "uid": "12345678-1234-1234-1234-123456789012",
    "resource_version": "a23av"
  },
  "name": "my-openai-connection",
  "spec": {
    "connection_type": "OPENAI",
    "endpoint": "https://api.openai.com/v1/chat/completions",
    "auth_data": {
      "kind": "PlaintextProvider",
      "data": "string"
    }
  },
  "status": {
    "phase": "READY",
    "detail": "Lookup failed: ai.openai.com"
  }
}
```

```hcl
terraform {
  required_providers {
    confluent = {
      source = "confluentinc/confluent"
      version = "2.44.0"
    }
  }
}

provider "confluent" {
  cloud_api_key    = var.confluent_cloud_api_key    # optionally use CONFLUENT_CLOUD_API_KEY env var
  cloud_api_secret = var.confluent_cloud_api_secret # optionally use CONFLUENT_CLOUD_API_SECRET env var
}
```

```hcl
resource "confluent_flink_connection" "openai-connection" {
  organization {
      id = data.confluent_organization.main.id
  }
  environment {
      id = data.confluent_environment.staging.id
  }
  compute_pool {
      id = confluent_flink_compute_pool.example.id
  }
  principal {
      id = confluent_service_account.app-manager-flink.id
  }
  rest_endpoint = data.confluent_flink_region.main.rest_endpoint
  credentials {
      key    = confluent_api_key.env-admin-flink-api-key.id
      secret = confluent_api_key.env-admin-flink-api-key.secret
  }

  display_name = "connection1"
  type         = "OPENAI"
  endpoint     = "https://api.openai.com/v1/chat/completions"
  api_key      = "API_Key_value"

  lifecycle {
      prevent_destroy = true
  }
}
```

```shell
terraform apply
```

```sql
DESCRIBE CONNECTION `my-connection`;
```

```text
+---------------+------------------------------------+
| Creation Date | 2025-08-13 22:04:57.972969         |
|               | +0000 UTC                          |
| Name          | azure-openai-connection            |
| Environment   | env-a1b2c3                         |
| Cloud         | aws                                |
| Region        | us-west-2                          |
| Type          | AZUREOPENAI                        |
| Endpoint      | https://<your-project-endpoint>    |
| Data          | <REDACTED>                         |
| Status        |                                    |
+---------------+------------------------------------+
```

```shell
export CONNECTION_NAME="<connection-name>" # example: "azure-openai-connection"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export ENV_ID="<environment-id>" # example: "env-a1b2c3"
```

```shell
confluent flink connection describe ${CONNECTION_NAME} \
  --cloud ${CLOUD_PROVIDER} \
  --region ${CLOUD_REGION} \
  --environment ${ENV_ID}
```

```text
+---------------+------------------------------------+
| Creation Date | 2025-08-13 22:04:57.972969         |
|               | +0000 UTC                          |
| Name          | azure-openai-connection            |
| Environment   | env-a1b2c3                         |
| Cloud         | aws                                |
| Region        | us-west-2                          |
| Type          | AZUREOPENAI                        |
| Endpoint      | https://<your-project-endpoint>    |
| Data          | <REDACTED>                         |
| Status        |                                    |
+---------------+------------------------------------+
```

```shell
export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-a1b2c3"
```

```shell
curl --request GET \
  --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections/${CONNECTION_NAME}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```

```json
{
  "api_version": "sql/v1",
  "kind": "Connection",
  "metadata": {
    "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-123/connections/my-openai-connection",
    "resource_name": "",
    "created_at": "2006-01-02T15:04:05-07:00",
    "updated_at": "2006-01-02T15:04:05-07:00",
    "deleted_at": "2006-01-02T15:04:05-07:00",
    "uid": "12345678-1234-1234-1234-123456789012",
    "resource_version": "a23av"
  },
  "name": "my-openai-connection",
  "spec": {
    "connection_type": "OPENAI",
    "endpoint": "https://api.openai.com/v1/chat/completions",
    "auth_data": {
      "kind": "PlaintextProvider",
      "data": "string"
    }
  },
  "status": {
    "phase": "READY",
    "detail": "Lookup failed: ai.openai.com"
  }
}
```

```hcl
data "confluent_flink_connection" "existing_connection" {
  organization {
     id = "<your-organization-id>"
  }
  environment {
     id = "<your-environment-id>"
  }
  compute_pool {
     id = "<your-compute-pool-id>"
  }
  principal {
     id = "<your-service-account-id>"
  }
  rest_endpoint = "<your-flink-rest-endpoint>"
  credentials {
     key    = "<your-flink-api-key>"
     secret = "<your-flink-api-secret>"
  }
  display_name = "my_connection"
  type         = "JDBC"
}

output "connection_endpoint" {
  value = data.confluent_flink_connection.existing_connection.endpoint
}
```

```shell
terraform apply
```

```shell
terraform output
```

```shell
terraform output connection_endpoint
```

```sql
SHOW CONNECTIONS;
```

```text
Creation Date          |           Name           | Environment | Cloud |  Region   |    Type     |            Endpoint             |    Data    | Status | Status Detail
---------------------------------+--------------------------+-------------+-------+-----------+-------------+---------------------------------+------------+--------+----------------
  2025-08-13 21:05:15.035376     | azureopenai-connection-2 | env-a1b2c3  | aws   | us-west-2 | AZUREOPENAI | https://<your-project-endpoint> | <REDACTED> |        |
  +0000 UTC                      |                          |             |       |           |             |                                 |            |        |
  2025-08-13 22:04:57.972969     | azure-openai-connection  | env-a1b2c3  | aws   | us-west-2 | AZUREOPENAI | https://<your-project-endpoint> | <REDACTED> |        |
  +0000 UTC                      |                          |             |       |           |             |                                 |            |        |
```

```shell
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export ENV_ID="<environment-id>" # example: "env-a1b2c3"
```

```shell
confluent flink connection list \
  --cloud ${CLOUD_PROVIDER} \
  --region ${CLOUD_REGION} \
  --environment ${ENV_ID}
```

```text
Creation Date          |           Name           | Environment | Cloud |  Region   |    Type     |            Endpoint             |    Data    | Status | Status Detail
---------------------------------+--------------------------+-------------+-------+-----------+-------------+---------------------------------+------------+--------+----------------
  2025-08-13 21:05:15.035376     | azureopenai-connection-2 | env-a1b2c3  | aws   | us-west-2 | AZUREOPENAI | https://<your-project-endpoint> | <REDACTED> |        |
  +0000 UTC                      |                          |             |       |           |             |                                 |            |        |
  2025-08-13 22:04:57.972969     | azure-openai-connection  | env-a1b2c3  | aws   | us-west-2 | AZUREOPENAI | https://<your-project-endpoint> | <REDACTED> |        |
  +0000 UTC                      |                          |             |       |           |             |                                 |            |        |
```

```shell
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-a1b2c3"
```

```shell
curl --request GET \
  --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```

```json
{
  "api_version": "sql/v1",
  "kind": "ConnectionList",
  "metadata": {
    "first": "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-abc123/connections",
    "last": "",
    "prev": "",
    "next": "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-abc123/connections?page_token=UvmDWOB1iwfAIBPj6EYb",
    "total_size": 123,
    "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-123/connections"
  },
  "data": [
    {
      "api_version": "sql/v1",
      "kind": "Connection",
      "metadata": {
        "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-123/connections/my-openai-connection",
        "resource_name": "",
        "created_at": "2006-01-02T15:04:05-07:00",
        "updated_at": "2006-01-02T15:04:05-07:00",
        "deleted_at": "2006-01-02T15:04:05-07:00",
        "uid": "12345678-1234-1234-1234-123456789012",
        "resource_version": "a23av"
      },
      "name": "my-openai-connection",
      "spec": {
        "connection_type": "OPENAI",
        "endpoint": "https://api.openai.com/v1/chat/completions",
        "auth_data": {
          "kind": "PlaintextProvider",
          "data": "string"
        }
      },
      "status": {
        "phase": "READY",
        "detail": "Lookup failed: ai.openai.com"
      }
    }
  ]
}
```
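
The sample list response carries a `page_token` query parameter in `metadata.next`, which suggests that results are paginated. As a rough sketch, the hypothetical helper `next_page_token` below extracts that token from a `next` URL by using shell parameter expansion only. Treating an empty `next` value as the last page is an assumption based on the sample, not a documented guarantee.

```shell
# Hypothetical helper (illustration only): extracts the page_token query
# parameter from a metadata.next URL. Prints an empty line when the URL
# has no page_token parameter.
next_page_token() {
  local next_url="$1" token=""
  case "${next_url}" in
    *page_token=*)
      token="${next_url##*page_token=}"  # drop everything up to the parameter value
      token="${token%%&*}"               # drop any parameters that follow it
      ;;
  esac
  printf '%s\n' "${token}"
}

# Example with the "next" URL from the sample response.
next_page_token "https://flink.us-west1.aws.confluent.cloud/sql/v1/environments/env-abc123/connections?page_token=UvmDWOB1iwfAIBPj6EYb"
# prints UvmDWOB1iwfAIBPj6EYb
```

A script could also request the `next` URL directly and stop when the field is empty; the helper is useful mainly when the token has to be logged or passed on separately.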

```sql
ALTER CONNECTION `my-connection` SET ('api-key' = '<new-api-key>');
```

```text
+---------------+------------------------------------+
| Creation Date | 2025-08-13 22:04:57.972969         |
|               | +0000 UTC                          |
| Name          | azure-openai-connection            |
| Environment   | env-a1b2c3                         |
| Cloud         | aws                                |
| Region        | us-west-2                          |
| Type          | AZUREOPENAI                        |
| Endpoint      | https://<your-project-endpoint>    |
| Data          | <REDACTED>                         |
| Status        |                                    |
+---------------+------------------------------------+
```

```shell
export CONNECTION_NAME="<connection-name>" # example: "azure-openai-connection"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export ENV_ID="<environment-id>" # example: "env-a1b2c3"
export ENDPOINT="<endpoint>" # example: "https://<your-project>.openai.azure.com/openai/deployments/<deployment-name>/chat/completions?api-version=2025-01-01-preview"
export NEWAPI_KEY="<new-api-key>"
```

```shell
confluent flink connection update ${CONNECTION_NAME} \
  --cloud ${CLOUD_PROVIDER} \
  --region ${CLOUD_REGION} \
  --environment ${ENV_ID} \
  --endpoint ${ENDPOINT} \
  --api-key ${NEWAPI_KEY}
```

```text
+---------------+------------------------------------+
| Creation Date | 2025-08-13 22:04:57.972969         |
|               | +0000 UTC                          |
| Name          | azure-openai-connection            |
| Environment   | env-a1b2c3                         |
| Cloud         | aws                                |
| Region        | us-west-2                          |
| Type          | AZUREOPENAI                        |
| Endpoint      | https://<your-project-endpoint>    |
| Data          | <REDACTED>                         |
| Status        |                                    |
+---------------+------------------------------------+
```

```shell
export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection"
export CONNECTION_TYPE="<connection-type>" # example: "OPENAI"
export ENDPOINT="<endpoint>" # example: "https://api.openai.com/v1/chat/completions"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-a1b2c3"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export JSON_DATA="<payload-string>"
```

```json
{
  "name": "${CONNECTION_NAME}",
  "spec": {
    "connection_type": "${CONNECTION_TYPE}",
    "endpoint": "${ENDPOINT}",
    "auth_data": {
      "kind": "PlaintextProvider",
      "data": "string"
    }
  },
  "metadata": {}
}
```

```shell
export JSON_DATA="{
  \"name\": \"${CONNECTION_NAME}\",
  \"spec\": {
    \"connection_type\": \"${CONNECTION_TYPE}\",
    \"endpoint\": \"${ENDPOINT}\",
    \"auth_data\": {
      \"kind\": \"PlaintextProvider\",
      \"data\": \"string\"
    }
  },
  \"metadata\": {}
}"
```

```shell
curl --request PUT \
  --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections/${CONNECTION_NAME}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}" \
  --header 'content-type: application/json' \
  --data "${JSON_DATA}"
```

```json
{
  "api_version": "sql/v1",
  "kind": "Connection",
  "metadata": {
    "self": "https://flink.us-west1.aws.confluent.cloud/sql/v1/organizations/org-abc/environments/env-a1b2c3/connections/my-openai-connection",
    "resource_name": "",
    "created_at": "2006-01-02T15:04:05-07:00",
    "updated_at": "2006-01-02T15:04:05-07:00",
    "deleted_at": "2006-01-02T15:04:05-07:00",
    "uid": "12345678-1234-1234-1234-123456789012",
    "resource_version": "a23av"
  },
  "name": "my-openai-connection",
  "spec": {
    "connection_type": "OPENAI",
    "endpoint": "https://api.openai.com/v1/chat/completions",
    "auth_data": {
      "kind": "PlaintextProvider",
      "data": "string"
    }
  },
  "status": {
    "phase": "READY",
    "detail": "Lookup failed: ai.openai.com"
  }
}
```

```hcl
resource "confluent_flink_connection" "openai-connection" {
  ...
  credentials {
      api_key    = confluent_api_key.env-admin-flink-api-key.id
  }
}
```

```hcl
resource "confluent_flink_connection" "openai-connection" {
  ...
  credentials {
      api_key    = confluent_api_key.env-admin-flink-api-key.id # Updated value
  }
}
```

```shell
terraform apply
```

```sql
DROP CONNECTION `my-connection`;
```

```shell
export CONNECTION_NAME="<connection-name>" # example: "azure-openai-connection"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export ENV_ID="<environment-id>" # example: "env-a1b2c3"
```

```shell
confluent flink connection delete ${CONNECTION_NAME} \
  --cloud ${CLOUD_PROVIDER} \
  --region ${CLOUD_REGION} \
  --environment ${ENV_ID}
```

```text
Deleted Flink connection "azure-openai-connection".
```

```shell
export CONNECTION_NAME="<connection-name>" # example: "my-openai-connection"
export CLOUD_API_KEY="<cloud-api-key>"
export CLOUD_API_SECRET="<cloud-api-secret>"
export BASE64_CLOUD_KEY_AND_SECRET=$(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)
export ORG_ID="<organization-id>" # example: "b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="<environment-id>" # example: "env-a1b2c3"
```

```shell
curl --request DELETE \
  --url "https://flink.region.provider.confluent.cloud/sql/v1/organizations/${ORG_ID}/environments/${ENV_ID}/connections/${CONNECTION_NAME}" \
  --header "Authorization: Basic ${BASE64_CLOUD_KEY_AND_SECRET}"
```

```hcl
resource "confluent_flink_connection" "main" {
  display_name = "standard_connection"
  ...
  }
}
```

```shell
terraform plan -destroy -target=confluent_flink_connection.main
```

```shell
terraform apply -destroy -target=confluent_flink_connection.main
```

```shell
terraform destroy
```

---

### Monitor and Manage Flink SQL Statements in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/operate-and-deploy/monitor-statements.html

You start a stream-processing app on Confluent Cloud for Apache Flink® by running a SQL statement. Once a statement is running, you can monitor its progress by using the Confluent Cloud Console. You can also set up integrations with monitoring services like Prometheus and Datadog.

#### View and monitor statements in Cloud Console

Cloud Console shows details about your statements on the Flink page.

1. If you don’t have running statements currently, run a SQL query like INSERT INTO FROM SELECT in the Flink SQL shell or in a workspace.
2. Log in to the Confluent Cloud Console.
3. Navigate to the Environments page.
4. Click the tile that has the environment where your Flink compute pools are provisioned.
5. Click Flink, and in the Flink page, click Flink statements. The Statements list opens.
6. You can use the Filter options on the page to identify the statements you want to view.

The following information is available in the Flink statements table to help you monitor your statements.

| Field | Description |
| --- | --- |
| Flink Statement Name | The name of the statement. The name is populated automatically when a statement is submitted. You can set the name by using the SET command. |
| Status | The statement status represents what is currently happening with the statement. These are the status values:<br>**Pending:** The statement has been submitted and Flink is preparing to start running the statement.<br>**Running:** Flink is actively running the statement.<br>**Completed:** The statement has completed all of its work.<br>**Deleting:** The statement is being deleted.<br>**Failed:** The statement has encountered an error and is no longer running.<br>**Degraded:** The statement appears unhealthy, for example, no transactions have been committed for a long time, or the statement has frequently restarted recently.<br>**Stopping:** The statement is about to be stopped.<br>**Stopped:** The statement has been stopped and is no longer running. |
| Statement Type | The type of SQL function that is used in the statement. |
| Statement CFU | The number of CFUs that the statement is consuming. |
| State size (GB) | The size of the state used by the statement, in gigabytes. |
| Created | Indicates when the statement started running. If you stop and resume the statement, the Created date shows the date when the statement was first submitted. |
| Messages Behind | The consumer lag of the statement. An indicator also shows whether the back pressure is increasing, decreasing, or being maintained at a stable rate. Ideally, the Messages Behind metric should be as close to zero as possible. A low, close-to-zero consumer lag is the best indicator that your statement is running smoothly and keeping up with all of its inputs. A growing consumer lag indicates there is a problem. |
| Messages in | The count of messages in per minute, which represents the rate at which records are read. A watermark for the messages read is also shown. The watermark displayed in the Flink statements table is the minimum watermark from the source(s) in the query. |
| Messages out | The count of messages out per minute, which represents the rate at which records are written. A watermark for the messages written is also shown. The watermark displayed in the Flink statements table is the minimum watermark from the sink(s) in the query. |
| Account | The name of the user account or service account the statement is running with. |

When you click a particular statement, a detailed side panel opens. The panel provides information on the statement at a more granular level, showing how messages are being read from sources and written to sinks. The watermarks for each individual source and sink table are shown in this panel along with the statement’s catalog, database, local time zone, and Scaling status. The SQL Content section shows the code used to generate the statement.
The panel also contains interactive graphs of the statement’s performance over time. There are charts for # Messages behind, Messages in per minute, and Messages out per minute.

#### Manage statements in Cloud Console

Cloud Console gives you actions to manage your statements on the Flink page.

1. In the statement list, click the checkbox next to one of your statements to select it.
2. Click Actions. A menu opens, showing options for managing the statement’s status. You can select Stop statement, Resume statement, or Delete statement.

#### Flink metrics integrations

Confluent Cloud for Apache Flink supports metrics integrations with services like Prometheus and Datadog.

1. If you don’t have running statements currently, run a SQL query like INSERT INTO FROM SELECT in the Flink SQL shell or in a workspace.
2. Log in to the Confluent Cloud Console.
3. Open the Administration menu and select Metrics to open the Metrics integration page.
4. In the Explore available metrics section, click the Metric dropdown.
5. Scroll until you find the Flink compute pool and Flink statement metrics, for example, Messages behind. This list doesn’t include all available metrics. For a full list of available metrics, see Metrics API Reference.
6. Click the Resource dropdown and select the corresponding compute pool or statement that you want to monitor. A graph showing the most recent data for your selected Flink metric displays.
7. Click New integration to export your metrics to a monitoring service. For more information, see Integrate with third-party monitoring tools.

For an introductory example of setting up monitoring with Grafana and Prometheus, see the Flink Monitoring repository.

#### Error handling and recovery

When errors occur during the runtime of a statement, Confluent Cloud for Apache Flink handles them differently depending on the type of error:

- **Statement failures:** When a statement encounters an error that prevents it from continuing, it moves to the FAILED state. FAILED statements do not consume any CFUs. You’ll see an error message in the statement details explaining what went wrong. Common causes of statement failures include:
  - Data format issues (deserialization errors)
  - Query logic problems (division by zero, invalid operations)
  - Missing or inaccessible topics
  - Insufficient permissions

  For deserialization errors, you can use custom error handling rules to skip problematic records or send them to a dead letter queue instead of failing the entire statement. FAILED statements can be resumed, but you must fix the underlying issue first to prevent the statement from failing again immediately. For more information on evolving statements, see Schema and Statement Evolution.
- **Statement degradation:** When a statement encounters issues but can continue running, it may enter the DEGRADED state. For more information, see Degraded statements.

##### Degraded statements

When a statement enters the DEGRADED state, it means the statement is unable to make consistent progress. There are two scenarios that can cause this:

- **Query-related issues:** When the degradation is caused by inefficient query logic or insufficient compute resources, you’ll see an error message like:

  > Your Flink statement has entered a Degraded state because it is unable to make consistent progress. This can be caused by inefficient query logic or insufficient compute resources. Please review your statement for performance bottlenecks. If the issue persists, consider scaling your compute pool or contacting Confluent support for assistance.

- **System-related issues:** When the degradation is caused by an unknown or internal system error, you’ll see this error message:

  > An internal system error has been detected that requires attention from our engineering team. We are actively working to resolve this issue. No action is required on your part at this time. If the issue persists, please contact Confluent support for further assistance.

DEGRADED statements continue to consume CFUs.
For query-related issues, see Resolve Common Statement Problems for a troubleshooting guide.

##### Custom error handling rules

Confluent Cloud for Apache Flink supports custom error handling for deserialization errors using the `error-handling.mode` table property. You can choose to fail, ignore, or log problematic records to a Dead Letter Queue (DLQ). When set to log, errors are sent to a DLQ table.

#### Notifications

Confluent Cloud for Apache Flink integrates with Notifications for Confluent Cloud. The following notifications are available for Flink statements. They apply only to background Data Manipulation Language (DML) statements like INSERT INTO, EXECUTE STATEMENT SET, or CREATE TABLE AS.

- **Statement failure:** This notification is triggered when a statement transitions from RUNNING to FAILED. A statement transitions to FAILED on exceptions that Confluent classifies as USER, as opposed to SYSTEM exceptions.
- **Statement degraded:** This notification is triggered when a statement transitions from RUNNING to DEGRADED.
- **Statement stuck in pending:** This notification is triggered when a newly submitted statement stays in PENDING for a long time. The time period for a statement to be considered stuck in the PENDING state depends on the cloud provider that’s running your Flink statements:
  - AWS: 10 minutes
  - Azure: 30 minutes
  - Google Cloud: 10 minutes
- **Statement auto-stopped:** This notification is triggered when a statement moves into STOPPED because the compute pool it is using was deleted by a user.

#### Best practices for alerting

Use the Metrics API and Notifications for Confluent Cloud to monitor your compute pools and statements over time. You should monitor and configure alerts for the following conditions:

Per compute pool:

- Alert on exhausted compute pools by comparing the current CFUs (`io.confluent.flink/compute_pool_utilization/current_cfus`) to the maximum CFUs of the pool (`io.confluent.flink/compute_pool_utilization/cfu_limit`).
- Flink statement stuck in pending notifications also indicate compute-pool exhaustion.

Per statement:

- Alert on statement failures (see Notifications).
- Alert on statement degradation (see Notifications).
- Alert on an increase of “Messages Behind”/“Consumer Lag” (metric name: `io.confluent.flink/pending_records`) over an extended period of time, for example, more than 10 minutes. The right period depends on your workload. Note that Confluent Cloud for Apache Flink does not appear as a consumer in the regular consumer lag monitoring feature in Confluent Cloud, because it uses the `assign()` method.
- (Optional) Alert on an increase of the difference between the output watermark (`io.confluent.flink/current_output_watermark_ms`) and the input watermark (`io.confluent.flink/current_input_watermark_ms`). The input watermark corresponds to the time up to which the input data is complete, and the output watermark corresponds to the time up to which the output data is complete. This difference can be considered a measure of the amount of data that’s currently “in-flight”. Depending on the logic of the statement, different patterns are expected. For example, for a tumbling event-time window, expect an increasing difference until the window is fired, at which point the difference drops to zero and starts increasing again.

#### Statement logging

Confluent Cloud for Apache Flink supports event logging for statements in Confluent Cloud Console. The following screenshot shows the event log for a statement that failed due to a division by zero error. The event log is available in the Logs tab of the statement details page.

The statement event log page provides logs for the following events:

- Changes of lifecycle, for example, PENDING or RUNNING. For more information, see Statement lifecycle.
- Scaling status changes, for example, OK or Pending Scale Up. For more information, see Scaling status.
- Autopilot scaling decisions, for example, “Autopilot is requesting to scale the statement to [New CFU Value] CFUs.” or “Autopilot is unable to scale up the statement because the compute pool’s CFU limit has been reached.”
- Errors and warnings.

The Cloud Console enables the following operations:

- **Search:** Search for specific log messages. Wildcards are supported.
- **Time range:** Select the time range for the log events.
- **Log level:** Filter log events by severity: Error, Warning, Info.
- **Chart:** View the log events in a chart.
- **Download:** Download log events as a CSV or JSON file.

##### UDF logging

Log messages from user-defined functions (UDFs) are also shown in the statement log page. For more information, see Log Debug Messages in UDFs.

#### Related content

- Video: How to work with a paused stream
- Statements
- Queries
- Flink SQL Shell Quick Start
- Flink SQL Shell

#### Code Examples

```text
Your Flink statement has entered a Degraded state because it is unable to make consistent progress. This can be caused by inefficient query logic or insufficient compute resources. Please review your statement for performance bottlenecks. If the issue persists, consider scaling your compute pool or contacting Confluent support for assistance.
```

```text
An internal system error has been detected that requires attention from our
engineering team. We are actively working to resolve this issue. No action
is required on your part at this time. If the issue persists, please contact
Confluent support for further assistance.
```

```text
io.confluent.flink/compute_pool_utilization/current_cfus
io.confluent.flink/compute_pool_utilization/cfu_limit
```
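The compute-pool check described under best practices for alerting (comparing current CFUs to the pool's CFU limit) can be sketched in Python as follows. This is a minimal illustration, not Confluent's alerting logic: it assumes you have already fetched the two metric values, for example through the Metrics API, and the function names and the 0.9 threshold are hypothetical choices.

```python
def pool_utilization(current_cfus: float, cfu_limit: float) -> float:
    """Return the fraction of the pool's maximum CFUs currently in use."""
    if cfu_limit <= 0:
        raise ValueError("cfu_limit must be positive")
    return current_cfus / cfu_limit


def is_pool_exhausted(current_cfus: float, cfu_limit: float,
                      threshold: float = 0.9) -> bool:
    """Flag a pool whose utilization is at or above the assumed threshold."""
    return pool_utilization(current_cfus, cfu_limit) >= threshold


if __name__ == "__main__":
    # Hypothetical sample values, not real measurements.
    print(is_pool_exhausted(current_cfus=9, cfu_limit=10))  # True
    print(is_pool_exhausted(current_cfus=4, cfu_limit=10))  # False
```

Alerting slightly below 100% leaves time to raise the pool's maximum size before new statements get stuck in PENDING; pick the threshold that suits your workloads.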

```text
io.confluent.flink/pending_records
```
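The recommendation to alert when “Messages Behind” keeps increasing over an extended period can be sketched as below. This is one possible definition of a sustained increase, not the product's own: the samples are assumed to be `(unix_seconds, pending_records)` pairs that you collected yourself, and the function name and the 10-minute default window are hypothetical.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, pending records)


def lag_growing(samples: Sequence[Sample], window_s: float = 600.0) -> bool:
    """Return True if lag never dropped and ended higher across a full window.

    `samples` must be sorted by timestamp. The check is skipped (False)
    when the history does not yet cover the whole window.
    """
    if len(samples) < 2:
        return False
    end = samples[-1][0]
    start = end - window_s
    if samples[0][0] > start:
        return False  # not enough history to cover the whole window
    recent = [lag for ts, lag in samples if ts >= start]
    never_dropped = all(b >= a for a, b in zip(recent, recent[1:]))
    return never_dropped and recent[-1] > recent[0]


if __name__ == "__main__":
    # Hypothetical samples taken every 5 minutes.
    print(lag_growing([(0, 100), (300, 200), (600, 300)]))  # True
    print(lag_growing([(0, 100), (300, 50), (600, 300)]))   # False
```

Requiring the lag to rise across the whole window avoids alerting on short bursts that the statement catches up on by itself.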

```text
io.confluent.flink/current_output_watermark_ms
io.confluent.flink/current_input_watermark_ms
```
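The optional watermark alert can be sketched as follows. The gap between the input and output watermarks is the “in-flight” measure described above. For a tumbling event-time window the gap is expected to grow until the window fires, so this sketch assumes an alert should trigger only when the gap exceeds the window size plus some slack. The function names, the slack value, and the sample timestamps are hypothetical.

```python
def watermark_gap_ms(input_watermark_ms: int, output_watermark_ms: int) -> int:
    """Event time by which the output trails the input, in milliseconds."""
    return input_watermark_ms - output_watermark_ms


def gap_exceeds_expected(input_watermark_ms: int, output_watermark_ms: int,
                         window_ms: int, slack_ms: int = 60_000) -> bool:
    """Return True if the gap is larger than one window plus the assumed slack."""
    gap = watermark_gap_ms(input_watermark_ms, output_watermark_ms)
    return gap > window_ms + slack_ms


if __name__ == "__main__":
    five_minutes = 5 * 60 * 1000
    # Gap of 5 minutes with a 5-minute window: within the expected sawtooth.
    print(gap_exceeds_expected(1_700_000_600_000, 1_700_000_300_000, five_minutes))  # False
    # Gap of 10 minutes with a 5-minute window: output is falling behind.
    print(gap_exceeds_expected(1_700_000_600_000, 1_700_000_000_000, five_minutes))  # True
```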

---

### Operate and Deploy Flink SQL Statements with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/operate-and-deploy/overview.html

Confluent provides tools for operating Confluent Cloud for Apache Flink® in the Cloud Console, the Confluent CLI, the Confluent Terraform Provider, and the REST API:

- Deploy a Statement
- Billing
- Monitor Statements with Cloud Console
- CLI commands
- Terraform resources
- REST API
- RBAC
- Flink API Keys

---

### Enable Private Networking with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/operate-and-deploy/private-networking.html

You have these options for using private networking with Confluent Cloud for Apache Flink®.

- **PrivateLink Attachment:** Works with any type of cluster and is available on AWS, Azure, and Google Cloud. For more information, see Supported Cloud Regions.
- **Existing or new Confluent Cloud network (CCN):** Available on AWS and Azure. To create a new Confluent Cloud network, follow the steps in Create Confluent Cloud Network on AWS.

For more information, see Private Networking with Confluent Cloud for Apache Flink.

#### Enable private networking with Confluent Cloud Network

If you already have a Confluent Cloud Network (CCN) created and configured, which is usually the case when you have any Dedicated cluster, you can use this network directly to connect to Flink. Little or no setup is required to configure Flink, because you can reuse connectivity to existing Private Endpoints, Peering, or Transit Gateway.

To access Flink from your local client, follow these steps.

##### Prerequisites

- Access to Confluent Cloud.
- The OrganizationAdmin, EnvironmentAdmin, or NetworkAdmin role to enable Flink private networking for an environment.

##### Configure DNS resolution

1. Ensure your VPC is configured to route your unique Flink endpoint to Confluent Cloud.
2. Have a client that is running within the VPC, or a proxy that reroutes your client to the VPC. For more information, see Use the Confluent Cloud Console with Private Networking.

If you already configured 1 and 2 for Apache Kafka®, you may not need any changes.

- For public DNS resolution with endpoints that resemble `flink-<network>.<region>.<cloud>.private.confluent.cloud`: if your local machine was already configured to access Kafka, no additional setup is necessary.
- With PrivateLink only: For private DNS resolution with endpoints that resemble `flink.<network>.<region>.<cloud>.private.confluent.cloud`, if routing is using `*.<network>.<region>.<cloud>.private.confluent.cloud`, no additional setup is necessary, but if your routing is using a more specific URL, you must add the Flink endpoint to your routing rules.

Note that if you use a reverse proxy with a custom route added to your local host file, you must add the Flink endpoint to your host file. Routing to `flinkpls...confluent.cloud` is necessary to enable auto-completion and error highlighting in the Flink SQL shell and Confluent Cloud Console.

#### Enable private networking with PrivateLink Attachment

Private networking with PrivateLink Attachment works with any type of cluster and is available on AWS, Azure, and Google Cloud.

##### Prerequisites

- Access to Confluent Cloud.
- The OrganizationAdmin, EnvironmentAdmin, or NetworkAdmin role to enable Flink private networking for an environment.
- A VPC in AWS, a VNet in Azure, or a VPC in Google Cloud.

##### Overview

In this walkthrough, you perform the following steps.

1. Set up a PrivateLink Attachment:
   1. Create a PrivateLink Attachment.
   2. Create a private endpoint.
      - For AWS, create a VPC Interface Endpoint to the PrivateLink Attachment.
      - For Azure, create a private endpoint that’s associated with the PrivateLink Attachment.
      - For Google Cloud, create a private endpoint that’s associated with the PrivateLink Attachment.
   3. Create a PrivateLink Attachment Connection.
   4. Set up DNS resolution.
2. Connect to the private network: If your client is not in the VPC or VNet, enable the Cloud Console or Confluent CLI to connect to your private network.

When the previous steps are completed, you can use Flink over your private network from the Confluent Cloud Console or Confluent CLI. The experience is the same as with public networking.
##### Step 1: Set up a PrivateLink Attachment and connection

In AWS, Azure, or Google Cloud, follow these steps to create a PrivateLink Attachment, a private endpoint, and a PrivateLink Attachment Connection, and to set up DNS resolution.

**AWS**

1. In Confluent Cloud, create a PrivateLinkAttachment.
2. In AWS, create a VPC Interface Endpoint to the PrivateLinkAttachment service.
3. In Confluent Cloud, create a PrivateLinkAttachmentConnection.
4. Set up a DNS resolution.

**Azure**

1. In Confluent Cloud, create a PrivateLinkAttachment.
2. In Azure, create a private endpoint.
3. In Confluent Cloud, create a PrivateLinkAttachmentConnection.
4. Set up a DNS resolution.

**Google Cloud**

1. In Confluent Cloud, create a PrivateLinkAttachment. PrivateLink Attachments are powered by Private Service Connect.
2. In Google Cloud, create a Private Service Connect endpoint to the service attachment URI you get in Step 1. If you use the Confluent Cloud Console for configuration, this step is merged into the next step and shows up as the first and second steps in access point creation.
3. In Confluent Cloud, create a PrivateLink Attachment Connection for the Private Service Connect endpoint you created. A PrivateLink Attachment Connection is required for each Private Service Connect endpoint.
4. Set up a DNS resolution.

##### Step 2: Connect to the network with Cloud Console or Confluent CLI

If your client is not in the VPC or VNet, enable the Confluent Cloud Console or Confluent CLI to connect to your private network. If you don’t connect from a machine in the VPC or VNet, you see the following error.

To connect to Confluent Cloud with your PrivateLink Attachment, see Use Confluent Cloud with Private Networking. One way to connect is to set up a reverse proxy.

1. Create an EC2 instance.
2. Connect to the instance with SSH.
3. Install NGINX.
4. Configure Routing Table.
5. Set up DNS resolution: point to the Flink regional endpoints you use, as described in Step 6 of Configure a proxy. The entry has the form `<Public IP Address of VM instance> <Flink-private-endpoint>`.

`<Flink-private-endpoint>` will resemble `flink.<region>.<cloud>.private.confluent.cloud`, for example: `flink.us-east-2.aws.private.confluent.cloud`.

Find the DNS part of the PrivateLink Attachment by navigating to your environment’s Network management page and finding the DNS domain setting. You can find the full list of supported Flink regions by using the Regions endpoint API.

Once networking is set up in Cloud Console, the interface uses the correct endpoint automatically, either public or private, based on the presence of a PrivateLink Attachment. If the connection is private, access to the Flink private network works transparently.

#### Related content

- Use Confluent Cloud with Private Networking
- Flink Compute Pools
- Billing on Confluent Cloud for Apache Flink

#### Code Examples

```text
flink-<network>.<region>.<cloud>.private.confluent.cloud
flink.<network>.<region>.<cloud>.private.confluent.cloud
*.<network>.<region>.<cloud>.private.confluent.cloud
flinkpls...confluent.cloud
```
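Whether an existing routing rule already covers the private Flink endpoint, as discussed under Configure DNS resolution, can be checked with a sketch like the one below. It assumes the usual DNS wildcard convention in which a `*` label matches exactly one label. The function name and the network ID `n-abc123` are hypothetical, and your proxy or DNS software may apply different matching rules.

```python
def rule_covers(rule: str, hostname: str) -> bool:
    """Return True if the routing rule matches the hostname.

    A "*" label matches exactly one DNS label (an assumption of this
    sketch); every other label must match exactly, ignoring case.
    """
    rule_labels = rule.lower().rstrip(".").split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    if len(rule_labels) != len(host_labels):
        return False
    return all(r == "*" or r == h for r, h in zip(rule_labels, host_labels))


if __name__ == "__main__":
    # Hypothetical network ID; region and cloud follow the documented pattern.
    suffix = "n-abc123.us-east-2.aws.private.confluent.cloud"
    flink_endpoint = "flink." + suffix
    print(rule_covers("*." + suffix, flink_endpoint))      # True: no extra setup
    print(rule_covers("kafka." + suffix, flink_endpoint))  # False: add the Flink endpoint
```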

```text
<Public IP Address of VM instance> <Flink-private-endpoint>
flink.<region>.<cloud>.private.confluent.cloud
flink.us-east-2.aws.private.confluent.cloud
```
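The host-file entry for a reverse proxy follows the pattern shown above: the proxy's public IP address followed by the regional Flink endpoint. A minimal sketch that builds such a line is shown below. The function names are hypothetical, and the IP address comes from a range reserved for documentation, so replace it with the address of your own proxy instance.

```python
import ipaddress


def flink_private_endpoint(region: str, cloud: str) -> str:
    """Build the regional endpoint name from the documented pattern."""
    return f"flink.{region}.{cloud}.private.confluent.cloud"


def hosts_entry(proxy_ip: str, region: str, cloud: str) -> str:
    """Return one host-file line that maps the endpoint to the proxy's IP."""
    ipaddress.ip_address(proxy_ip)  # raises ValueError for a malformed address
    return f"{proxy_ip} {flink_private_endpoint(region, cloud)}"


if __name__ == "__main__":
    # 203.0.113.10 is a placeholder from a documentation-only range.
    print(hosts_entry("203.0.113.10", "us-east-2", "aws"))
    # 203.0.113.10 flink.us-east-2.aws.private.confluent.cloud
```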

---

### Flink SQL Query Profiler in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/operate-and-deploy/query-profiler.html

The Query Profiler is a tool in Confluent Cloud for Apache Flink® that provides enhanced visibility into how a Flink SQL statement is processing data, which enables rapid identification of bottlenecks, data skew issues, and other performance concerns. To use the Query Profiler, see Profile a Query.

The Query Profiler is a dynamic, real-time visual dashboard that provides insights into the computations performed by your Flink SQL statements. It boosts observability, enabling you to monitor running statements and diagnose performance issues during execution. The Query Profiler presents key metrics and visual representations of the performance and behavior of individual tasks, subtasks, and operators within a statement. Query Profiler is available in the Confluent Cloud Console.

Key features of the Query Profiler include:

- **Monitor in real time:** Track the live performance of your Flink SQL statements, enabling you to react quickly to emerging issues.
- **View detailed metrics:** The profiler provides a breakdown of performance metrics at various levels, including statement, task, operator, and partition levels, which helps you understand how different components of a Flink SQL job are performing.
- **Visualize data flow:** The profiler visualizes data flow as a job graph, showing how data is processed through different tasks and operators. This helps you identify operators experiencing high latency, large amounts of state, or workload imbalances.
- **Reduce manual analysis:** By offering immediate visibility into performance data, the profiler reduces the need for extensive manual logging and analysis, which can consume significant developer time. This enables you to focus on optimizing your queries and improving performance.

The Query Profiler helps you manage the complexities of stream processing applications and optimize query performance in real time.

#### Available metrics

The Query Profiler provides the following metrics for the tasks in your Flink statements.

| Metric | Definition |
| --- | --- |
| Backpressure | Percentage of time a task is regulating data flow to match processing speed by reducing pending events. |
| Busyness | The percentage of time a task is actively processing data. If a task has multiple subtasks running in parallel, Query Profiler shows the highest busyness value seen among them. Note that idleness and busyness do not always add up to 100%. |
| Bytes in/min | Amount of data received by a task per minute. |
| Bytes out/min | Amount of data sent by a task per minute. |
| Idleness | The percentage of time a task is not actively processing data. If a task has multiple subtasks running in parallel, Query Profiler shows the highest idleness value seen among them. Note that idleness and busyness do not always add up to 100%. |
| Messages in/min | Number of events the task receives per minute. |
| Messages out/min | Number of events the task sends out per minute. |
| State size | Amount of data stored by the task during processing to track information across events. |
| Watermark | Timestamp Flink uses to track event time progress and handle out-of-order events. |

The Query Profiler provides the following metrics for the operators in your Flink statements.

| Metric | Definition |
| --- | --- |
| Messages in/min | Number of events the operator receives per minute. |
| Messages out/min | Number of events the operator sends out per minute. |
| State size | Amount of data stored by the operator during processing to track information across events. |
| Watermark | Timestamp Flink uses to track event time progress and handle out-of-order events. |

The Query Profiler provides the following metrics for the Kafka partitions in your data source(s).

| Metric | Definition |
| --- | --- |
| Active | Percentage of time the partition is active. An active partition processes events and creates watermarks to keep your statements running smoothly. |
| Blocked | Percentage of time the partition is blocked. A blocked partition is overwhelmed with data, causing delays in the watermark calculation. |
| Idle | Percentage of time the partition is idle. An idle partition has not received any events for a certain time period and is not contributing to the watermark calculation. |

#### Related content

- EXPLAIN Statement
- Flink SQL Statements
- Profile a Query
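#### Code Examples

The task-level busyness and idleness values are described as the highest value seen among a task's parallel subtasks. The sketch below applies that rule to made-up numbers to show why the two values do not have to add up to 100%. The function name and the sample percentages are hypothetical, and the sketch is not the profiler's implementation.

```python
from typing import Dict, List


def task_level(subtasks: List[Dict[str, float]]) -> Dict[str, float]:
    """Aggregate per-subtask percentages by taking the highest value of each."""
    return {
        "busyness": max(s["busyness"] for s in subtasks),
        "idleness": max(s["idleness"] for s in subtasks),
    }


if __name__ == "__main__":
    # Two hypothetical subtasks with skewed load.
    subtasks = [
        {"busyness": 80.0, "idleness": 20.0},
        {"busyness": 10.0, "idleness": 90.0},
    ]
    result = task_level(subtasks)
    print(result)                                   # {'busyness': 80.0, 'idleness': 90.0}
    print(result["busyness"] + result["idleness"])  # 170.0
```

Because each maximum can come from a different subtask, a large sum like this one hints at an uneven distribution of work across subtasks.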

---

### Stream Processing with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/overview.html

Apache Flink® is a powerful, scalable stream processing framework for running complex, stateful, low-latency streaming applications on large volumes of data. Flink excels at complex, high-performance, mission-critical streaming workloads and is used by many companies for production stream processing applications. Flink is the de facto industry standard for stream processing.

Get started for free: sign up for a Confluent Cloud trial and get $400 of free credit.

Confluent Cloud for Apache Flink provides a cloud-native, serverless service for Flink that enables simple, scalable, and secure stream processing that integrates seamlessly with Apache Kafka®. Your Kafka topics appear automatically as queryable Flink tables, with schemas and metadata attached by Confluent Cloud.

Confluent Cloud for Apache Flink supports creating stream-processing applications by using Flink SQL, the Flink Table API (Java and Python), and custom user-defined functions.

To run Flink on-premises with Confluent Platform, see Confluent Platform for Apache Flink.

#### What is Confluent Cloud for Apache Flink?

##### Confluent Cloud for Apache Flink integrates with the Kafka ecosystem

Confluent Cloud for Apache Flink is Flink re-imagined as a truly cloud-native service. Confluent’s fully managed Flink service enables you to:

- Easily filter, join, and enrich your data streams with Flink
- Enable high-performance and efficient stream processing at any scale, without the complexities of managing infrastructure
- Experience Kafka and Flink as a unified platform, with fully integrated monitoring, security, and governance

When bringing Flink to Confluent Cloud, the goal was to provide a uniquely serverless experience superior to just “cloud-hosted” Flink. Kafka on Confluent Cloud goes beyond Kafka by using the Kora engine, which showcases Confluent’s engineering expertise in building cloud-native data systems. Confluent’s goal is to deliver the same simplicity, security, and scalability for Flink that you expect for Kafka.

Confluent Cloud for Apache Flink is engineered to be:

- **Cloud-native:** Flink is fully managed on Confluent Cloud and autoscales up and down with your workloads.
- **Complete:** Flink is integrated deeply with Confluent Cloud to provide an enterprise-ready experience.
- **Everywhere:** Flink is available in AWS, Azure, and Google Cloud.

Get started with Confluent Cloud for Apache Flink:

- Flink SQL Quick Start with Confluent Cloud Console
- Flink SQL Shell Quick Start

#### Confluent Cloud for Apache Flink is cloud-native

##### Confluent Cloud for Apache Flink autoscales with your workloads

Confluent Cloud for Apache Flink provides a cloud-native experience for Flink. This means you can focus fully on your business logic, encapsulated in Flink SQL statements, and Confluent Cloud takes care of what’s needed to run them in a secure, resource-efficient and fault-tolerant manner. You don’t need to know about or interact with Flink clusters, state backends, checkpointing, or any of the other aspects that are usually involved when operating a production-ready Flink deployment.

- **Fully managed:** On Confluent Cloud, you don’t need to choose a runtime version of Flink. You’re always using the latest version and benefit from continuous improvements and innovations. All of your running statements automatically and transparently receive security patches and minor upgrades of the Flink runtime.
- **Autoscaling:** All of your Flink SQL statements on Confluent Cloud are monitored continuously and auto-scaled to keep up with the rate of their input topics. The resources required by a statement depend on its complexity and the throughput of topics it reads from.
- **Usage-based billing:** You pay only for what you use, not what you provision. Flink compute in Confluent Cloud is elastic: once you stop using the compute resources, they are deallocated, and you no longer pay for them. Coupled with the elasticity provided by scale-to-zero, you can benefit from unbounded scalability while maintaining cost efficiency. For more information, see Billing.

#### Confluent Cloud for Apache Flink is complete

##### Confluent Cloud for Apache Flink is a unified platform

Confluent has integrated Flink deeply with Confluent Cloud to provide an enterprise-ready, complete experience that enables data discovery and processing using familiar SQL semantics.

##### Confluent Cloud for Apache Flink is a regional service

Confluent Cloud for Apache Flink is a regional service, and you can create compute pools in any of the supported regions. Compute pools represent a set of resources that scale automatically between zero and their maximum size to provide all of the power required by your statements. A compute pool is bound to a region, and the resources provided by a compute pool are shared among all statements that use them.

While compute pools are created within an environment, you can query data in any topic in your Confluent Cloud organization, even if the data is in a different environment, as long as it’s in the same region. This enables Flink to do cross-cluster, cross-environment queries while providing low latency. Access control with RBAC still determines the data that can be read or written.

Flink can read from and write to any Kafka cluster in the same region, but by design, Confluent Cloud doesn’t allow you to query across regions. This helps you to avoid expensive data transfer charges, and also protects data locality and sovereignty by keeping reads and writes in-region.

For a list of available regions, see Supported Cloud Regions.

##### Metadata mapping between Kafka cluster, topics, schemas, and Flink

Kafka topics and schemas are always in sync with Flink, simplifying how you can process your data.
Any topic created in Kafka is visible directly as a table in Flink, and any table created in Flink is visible as a topic in Kafka. Effectively, Flink provides a SQL interface on top of Confluent Cloud.

Because Flink follows the SQL standard, the terminology is slightly different from Kafka. The following table shows the mapping between Kafka and Flink terminology.

| Kafka | Flink | Notes |
| --- | --- | --- |
| Environment | Catalog | Flink can query and join data that are in any environments/catalogs |
| Cluster | Database | Flink can query and join data that are in different clusters/databases |
| Topic + Schema | Table | Kafka topics and Flink tables are always in sync. You never need to declare tables manually for existing topics. Creating a table in Flink creates a topic and the associated schema. |

As a result, when you start using Flink, you can directly access all of the environments, clusters, and topics that you already have in Confluent Cloud, without any additional metadata creation.

##### Automatic metadata integration in Confluent Cloud for Apache Flink

Compared with Apache Flink, the main difference is that the Data Definition Language (DDL) statements related to catalogs, databases, and tables act on physical objects and not only on metadata. For example, when you create a table in Flink, the corresponding topic and schema are created immediately in Confluent Cloud.

Confluent Cloud provides a unified approach to metadata management. There is one object definition, and Flink integrates directly with this definition, avoiding unnecessary duplication of metadata and making all topics immediately queryable with Flink SQL. Also, any existing schemas in Schema Registry are used to surface fully-defined entities in Confluent Cloud. If you’re already on Confluent Cloud, you see tables automatically that are ready to query using Flink, simplifying data discovery and exploration.

##### Observability

Confluent Cloud provides you with a curated set of metrics, exposing them through Confluent’s existing Metrics API. If you have established observability platforms in place, Confluent Cloud provides first-class integrations with New Relic, Datadog, Grafana Cloud, and Dynatrace.

You can also monitor workloads directly within the Confluent Cloud Console. Clicking into a compute pool gives you insight into the health and performance of your applications, in addition to the resource consumption of your compute pool.

##### Security

Confluent Cloud for Apache Flink has a deep integration with Role-Based Access Control (RBAC), ensuring that you can easily access and process the data that you have access to, and no other data.

###### Access from Flink to the data

- For ad-hoc queries, you can use your user account, because the permissions of the current user are applied automatically without any additional setting needed.
- For long-running statements that need to run 24/7, like INSERT INTO, you should use a service account, so the statements are not affected by a user leaving the company or changing teams.

###### Access to Flink

To manage Flink access, Confluent has introduced two roles. In both cases, RBAC of the user on the underlying data is still applied.

- **FlinkDeveloper:** Basic access to Flink, enabling users to query data and manage their own statements.
- **FlinkAdmin:** Role that enables creating and managing Flink compute pools.

###### Service accounts

Service accounts are available for running statements permanently. If you want to run a statement with service account permissions, an OrganizationAdmin must create an Assigner role binding for the user on the service account. For more information, see Production workloads (service accounts).

###### Private networking

Confluent Cloud for Apache Flink supports private networking on AWS, Azure, and Google Cloud, providing a simple, secure, and flexible solution that enables new scenarios while keeping your data securely in private networking. All Kafka cluster types are supported, with any type of connectivity (public, Private Links, VPC Peering, and Transit Gateway). For more information, see Private Networking with Flink.

###### Cross-environment queries

Flink can perform cross-environment queries when using both public and private networking. This can be useful if you want to enable a single networking route from your VPC or VNet. In this case, you can use a single environment and a single PrivateLink Attachment (PLATT) where you run all your Flink workloads, and use three-part names to query data in other environments, for example: `` SELECT * FROM `myEnvironment`.`myDatabase`.`myTable`; ``

As a result, a single routing rule is necessary on the VPC or VNet side, per region, to redirect all traffic to the Flink regional endpoint(s) using this PrivateLink Attachment Connection.

To isolate different workloads, you can create different compute pools, which enables you to control budget and scale of these workloads independently.

Data access is protected by RBAC at the Kafka cluster (Flink database) or Kafka topic (Flink table) level. If the user account or service account that runs the query doesn’t have access, Flink can’t access sources and destinations.

To access Flink statements and workspaces, you must access them from a public IP address, if authorized, or from a PLATT or Confluent Cloud Network from the same environment and region. Flink statements themselves can then access all the environments in the same organization and region.

#### Program Flink with SQL, Java, and Python

Confluent Cloud for Apache Flink supports programming your streaming applications in these languages:

- SQL
- Java Table API
- Python Table API

Also, you can create custom user-defined functions and call them in your SQL statements. For more information, see User-defined Functions.

**Note:** The Flink Table API is available for preview. A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent.
The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features. Confluent may discontinue providing preview releases of the Preview features at any time in Confluent’s sole discretion. Comments, questions, and suggestions related to the Table API are encouraged and can be submitted through the established channels. Confluent for VS Code¶ Install Confluent for VS Code to access Smart Project Templates that accelerate project setup by providing ready-to-use templates tailored for common development patterns. These templates enable you to launch new projects quickly with minimal configuration, significantly reducing setup time. Next steps¶ Get Started Quick Start with Cloud Console Quick Start with Flink SQL Shell Java Table API Quick Start Python Table API Quick Start Related content¶ Stream Processing Concepts How-to Guides Operate and Deploy Flink SQL Reference Blog post: Introducing Confluent Cloud for Apache Flink Blog post: Your Guide to Flink SQL: An In-Depth Exploration Blog post: How to Use Flink SQL, Streamlit, and Kafka: Part 1 Blog post: How to Use Flink SQL, Streamlit, and Kafka: Part 2 Blog post: Data Products, Data Contracts, and Change Data Capture Course: Apache Flink SQL Course: Apache Flink 101 Course: Building Flink Applications in Java Course: Apache Flink® Table API: Processing Data Streams in Java Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
SELECT * FROM `myEnvironment`.`myDatabase`.`myTable`;
```
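
The following sketch illustrates the metadata mapping described on this page: environments appear as catalogs, Kafka clusters as databases, and topics as tables. All object names (`env-a`, `env-b`, `cluster-a`, `cluster-b`, `orders`, `customers`) and all column names are hypothetical placeholders, and both environments are assumed to be in the same region. Treat it as an outline to adapt, not a tested recipe.

```sql
-- Hypothetical names throughout; replace them with your own
-- environments, clusters, and topics.

-- Environments are exposed as catalogs, Kafka clusters as databases.
SHOW CATALOGS;
USE CATALOG `env-a`;
SHOW DATABASES;
USE `cluster-a`;

-- Existing topics are already visible as tables; no DDL is needed for them.
SHOW TABLES;

-- Creating a table also creates the backing Kafka topic and its schema.
CREATE TABLE orders (
  order_id STRING,
  customer_id STRING,
  amount DECIMAL(10, 2)
);

-- Cross-environment join using three-part names
-- (catalog.database.table). RBAC still applies to both tables.
SELECT o.order_id, o.amount, c.customer_name
FROM `env-a`.`cluster-a`.`orders` AS o
JOIN `env-b`.`cluster-b`.`customers` AS c
  ON o.customer_id = c.customer_id;
```

A regular join over two unbounded tables keeps state for both sides, so for a long-running statement you would likely run it under a service account, as the Security section recommends.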

---

### Supported Cloud Regions for Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/cloud-regions.html

Supported Cloud Regions for Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® is available on AWS, Azure, and Google Cloud. Flink is supported in the following regions. You can see the regions where Confluent Cloud for Apache Flink is supported by using the Confluent Cloud Console, the Confluent CLI, or the Flink REST API.

AWS supported regions¶

| Region | Networking |
|---|---|
| ap-east-1 | Public and private |
| ap-northeast-1 | Public and private |
| ap-northeast-2 | Public and private |
| ap-south-1 | Public and private |
| ap-southeast-1 | Public and private |
| ap-southeast-2 | Public and private |
| ca-central-1 | Public and private |
| eu-central-1 | Public and private |
| eu-north-1 | Public and private |
| eu-west-1 | Public and private |
| eu-west-2 | Public and private |
| me-south-1 | Public and private |
| sa-east-1 | Public and private |
| us-east-1 | Public and private |
| us-east-2 | Public and private |
| us-west-2 | Public and private |

Azure supported regions¶

| Region | Networking |
|---|---|
| australiaeast | Public and private |
| brazilsouth | Public and private |
| canadacentral | Public and private |
| centralindia | Public and private |
| centralus | Public and private |
| eastasia | Public and private |
| eastus | Public and private |
| eastus2 | Public and private |
| francecentral | Public and private |
| germanywestcentral | Public and private |
| northeurope | Public and private |
| southcentralus | Public and private |
| southeastasia | Public and private |
| spaincentral | Public and private |
| uaenorth | Public and private |
| uksouth | Public and private |
| westeurope | Public and private |
| westus2 | Public and private |
| westus3 | Public and private |

Google Cloud supported regions¶

| Region | Networking |
|---|---|
| asia-south1 | Public and private |
| asia-south2 | Public and private |
| asia-southeast1 | Public and private |
| asia-southeast2 | Public and private |
| australia-southeast1 | Public and private |
| europe-west1 | Public and private |
| europe-west2 | Public and private |
| europe-west3 | Public and private |
| europe-west4 | Public and private |
| northamerica-northeast1 | Public and private |
| northamerica-northeast2 | Public and private |
| us-central1 | Public and private |
| us-east1 | Public and private |
| us-east4 | Public and private |
| us-west1 | Public and private |
| us-west2 | Public and private |
| us-west4 | Public and private |

List regions by using Cloud Console, Confluent CLI, or REST API¶ You can see the regions where Confluent Cloud for Apache Flink is supported by using the Confluent Cloud Console, the Confluent CLI, or the Flink REST API.

- Confluent Cloud Console: Log in to Confluent Cloud and navigate to your environment. Click Flink and ensure that Compute pools is selected. Click Add compute pool. In the Create compute pool page, you can browse the available cloud providers and regions.
- Confluent CLI: Log in to Confluent Cloud by running `confluent login --organization-id ${ORG_ID} --prompt`. Run `confluent flink region list` to see the regions where Confluent Cloud for Apache Flink is supported. Use grep to filter the list by cloud provider. For example, `confluent flink region list | grep -i aws` shows the AWS regions where Flink is available. Your output should resemble the sample output in the Code Examples for this page.
- REST API: Send a GET request to the Flink REST API Regions endpoint to list the available regions. For more information, see List Flink Regions.

Related content¶ Compute Pools, Create a Compute Pool, Statements

#### Code Examples

```shell
confluent login --organization-id ${ORG_ID} --prompt
```

```shell
confluent flink region list
```

```text
Current |              Name              | Cloud |        Region
----------+--------------------------------+-------+-----------------------
          | Belgium (europe-west1)         | GCP   | europe-west1
          | Canada (ca-central-1)          | AWS   | ca-central-1
          | Iowa (centralus)               | AZURE | centralus
...
```

```shell
confluent flink region list | grep -i aws
```

```text
| Canada (ca-central-1)          | AWS   | ca-central-1
| Frankfurt (eu-central-1)       | AWS   | eu-central-1
| Ireland (eu-west-1)            | AWS   | eu-west-1
...
```

---

### Flink SQL Data Types in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/datatypes.html

Data Types in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® has a rich set of native data types that you can use in SQL statements and queries. The query planner supports the following SQL types.

| Flink SQL type | Java type | JSON Schema type | Protobuf type | Avro type | Avro logical type |
|---|---|---|---|---|---|
| ARRAY | `t[]` | Array | repeated T | array | – |
| BIGINT | long | Number | INT64 | long | – |
| BINARY | `byte[]` | String | BYTES | fixed | – |
| BOOLEAN | boolean | Boolean | BOOL | boolean | – |
| BYTES / VARBINARY | `byte[]` | String | BYTES | bytes | – |
| CHAR | String | String | STRING | string | – |
| DATE | java.time.LocalDate | Number | MESSAGE | int | date |
| DECIMAL | java.math.BigDecimal | Number | MESSAGE | bytes | decimal |
| DOUBLE | double | Number | DOUBLE | double | – |
| FLOAT | float | Number | FLOAT | float | – |
| INT | int | Number | INT32 | int | – |
| INTERVAL DAY TO SECOND | java.time.Duration | Not supported | Not supported | Not supported | – |
| INTERVAL YEAR TO MONTH | java.time.Period | Not supported | Not supported | Not supported | – |
| MAP | `java.util.Map<kt, vt>` | Array[Object] / Object | repeated MESSAGE | map / array | – |
| MULTISET | `java.util.Map<t, Integer>` | Array[Object] / Object | repeated MESSAGE | map / array | – |
| NULL | java.lang.Object | oneOf(Null, T) | [1] | union(avro_type, null) | – |
| ROW | org.apache.flink.types.Row | Object | MESSAGE | record [2] | – |
| SMALLINT | short | Number | INT32 | int | – |
| TIME | java.time.LocalTime | Number | – | int | time-millis |
| TIMESTAMP | java.time.LocalDateTime | Number | MESSAGE | long | local-timestamp-millis / local-timestamp-micros |
| TIMESTAMP_LTZ | java.time.Instant | Number | MESSAGE | long | timestamp-millis / timestamp-micros |
| TINYINT | byte | Number | INT32 | int | – |
| VARCHAR / STRING | String | String | STRING | string | – |

[1] See discussion at Flink SQL types to Protobuf types. [2] See discussion at Flink SQL types to Avro types.

Data type definition¶ A data type describes the logical type of a value in a SQL table. You use data types to declare the input and output types of an operation.
The Flink data types are similar to the SQL standard data type terminology, but for efficient handling of scalar expressions, they also contain information about the nullability of a value. These are examples of SQL data types: INT INT NOT NULL INTERVAL DAY TO SECOND(3) ROW<fieldOne ARRAY<BOOLEAN>, fieldTwo TIMESTAMP(3)> The following sections list all pre-defined data types in Flink SQL. Character strings¶ CHAR¶ Represents a fixed-length character string. Declaration CHAR CHAR(n) Bridging to JVM types Java Type Input Output Notes java.lang.String ✓ ✓ Default byte[] ✓ ✓ Assumes UTF-8 encoding org.apache.flink.table.data.StringData ✓ ✓ Internal data structure Formats The following table shows examples of the CHAR type in different formats. JSON for data type {"type":"CHAR","nullable":true,"length":8} CLI/UI format CHAR(8) JSON for payload "Example string" CLI/UI format for payload Example string Declare this type by using CHAR(n), where n is the number of code points. n must have a value between 1 and 2,147,483,647 (both inclusive). If no length is specified, n is equal to 1. CHAR(0) is not supported for CAST or persistence in catalogs, but it exists in protocols. VARCHAR / STRING¶ Represents a variable-length character string. Declaration VARCHAR VARCHAR(n) STRING Bridging to JVM types Java Type Input Output Notes java.lang.String ✓ ✓ Default byte[] ✓ ✓ Assumes UTF-8 encoding org.apache.flink.table.data.StringData ✓ ✓ Internal data structure Formats The following table shows examples of the VARCHAR type in different formats. JSON for data type {"type":"VARCHAR","nullable":true,"length":8} CLI/UI format VARCHAR(800) JSON for payload "Example string" CLI/UI format for payload Example string Declare this type by using VARCHAR(n), where n is the maximum number of code points. n must have a value between 1 and 2,147,483,647 (both inclusive). If no length is specified, n is equal to 1. STRING is equivalent to VARCHAR(2147483647). 
VARCHAR(0) is not supported for CAST or persistence in catalogs, but it exists in protocols. Binary strings¶ BINARY¶ Represents a fixed-length binary string (=a sequence of bytes). Declaration BINARY BINARY(n) Bridging to JVM types Java Type Input Output Notes byte[] ✓ ✓ Default Formats The following table shows examples of the BINARY type in different formats. JSON for data type {"type":"BINARY","nullable":true,"length":1} CLI/UI format BINARY(3) JSON for payload "x'7f0203'" CLI/UI format for payload x'7f0203' Declare this type by using BINARY(n), where n is the number of bytes. n must have a value between 1 and 2,147,483,647 (both inclusive). If no length is specified, n is equal to 1. The string representation is hexadecimal format. BINARY(0) is not supported for CAST or persistence in catalogs, but it exists in protocols. BYTES / VARBINARY¶ Represents a variable-length binary string (=a sequence of bytes). Declaration BYTES VARBINARY VARBINARY(n) Bridging to JVM types Java Type Input Output Notes byte[] ✓ ✓ Default Formats The following table shows examples of the VARBINARY type in different formats. JSON for data type {"type":"VARBINARY","nullable":true,"length":1} CLI/UI format VARBINARY(800) JSON for payload "x'7f0203'" CLI/UI format for payload x'7f0203' Declare this type by using VARBINARY(n) where n is the maximum number of bytes. n must have a value between 1 and 2,147,483,647 (both inclusive). If no length is specified, n is equal to 1. BYTES is equivalent to VARBINARY(2147483647). VARBINARY(0) is not supported for CAST or persistence in catalogs, but it exists in protocols. Exact numerics¶ BIGINT¶ Represents an 8-byte signed integer with values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. Declaration BIGINT Bridging to JVM types Java Type Input Output Notes java.lang.Long ✓ ✓ Default long ✓ (✓) Output only if type is not nullable Formats The following table shows examples of the BIGINT type in different formats.
JSON for data type {"type":"BIGINT","nullable":true} CLI/UI format BIGINT JSON for payload "23" CLI/UI format for payload 23 DECIMAL¶ Represents a decimal number with fixed precision and scale. Declaration DECIMAL DECIMAL(p) DECIMAL(p, s) DEC DEC(p) DEC(p, s) NUMERIC NUMERIC(p) NUMERIC(p, s) Bridging to JVM types Java Type Input Output Notes java.math.BigDecimal ✓ ✓ Default org.apache.flink.table.data.DecimalData ✓ ✓ Internal data structure Formats The following table shows examples of the DECIMAL type in different formats. JSON for data type {"type":"DECIMAL","nullable":true,"precision":5,"scale":3} CLI/UI format DECIMAL(5, 3) JSON for payload "12.123" CLI/UI format for payload 12.123 Declare this type by using DECIMAL(p, s) where p is the number of digits in a number (precision) and s is the number of digits to the right of the decimal point in a number (scale). p must have a value between 1 and 38 (both inclusive). The default value for p is 10. s must have a value between 0 and p (both inclusive). The default value for s is 0. The right side is padded with 0. The left side must be padded with spaces, like all other values. NUMERIC(p, s) and DEC(p, s) are synonyms for this type. INT¶ Represents a 4-byte signed integer with values from -2,147,483,648 to 2,147,483,647. Declaration INT INTEGER Bridging to JVM types Java Type Input Output Notes java.lang.Integer ✓ ✓ Default int ✓ (✓) Output only if type is not nullable Formats The following table shows examples of the INT type in different formats. JSON for data type {"type":"INT","nullable":true} CLI/UI format INT JSON for payload "23" CLI/UI format for payload 23 INTEGER is a synonym for this type. SMALLINT¶ Represents a 2-byte signed integer with values from -32,768 to 32,767. Declaration SMALLINT Bridging to JVM types Java Type Input Output Notes java.lang.Short ✓ ✓ Default short ✓ (✓) Output only if type is not nullable Formats The following table shows examples of the SMALLINT type in different formats.
JSON for data type {"type":"SMALLINT","nullable":true} CLI/UI format SMALLINT JSON for payload "23" CLI/UI format for payload 23 TINYINT¶ Represents a 1-byte signed integer with values from -128 to 127. Declaration TINYINT Bridging to JVM types Java Type Input Output Notes java.lang.Byte ✓ ✓ Default byte ✓ (✓) Output only if type is not nullable Formats The following table shows examples of the TINYINT type in different formats. JSON for data type {"type":"TINYINT","nullable":true} CLI/UI format TINYINT JSON for payload "23" CLI/UI format for payload 23 Approximate numerics¶ DOUBLE¶ Represents an 8-byte double precision floating point number. Declaration DOUBLE DOUBLE PRECISION Bridging to JVM types Java Type Input Output Notes java.lang.Double ✓ ✓ Default double ✓ (✓) Output only if type is not nullable Formats The following table shows examples of the DOUBLE type in different formats. JSON for data type {"type":"DOUBLE","nullable":true} CLI/UI format DOUBLE JSON for payload "1.1111112120000001E7" CLI/UI format for payload 1.1111112120000001E7 DOUBLE PRECISION is a synonym for this type. FLOAT¶ Represents a 4-byte single precision floating point number. Declaration FLOAT Bridging to JVM types Java Type Input Output Notes java.lang.Float ✓ ✓ Default float ✓ (✓) Output only if type is not nullable Formats The following table shows examples of the FLOAT type in different formats. JSON for data type {"type":"FLOAT","nullable":true} CLI/UI format FLOAT JSON for payload "1.1111112E7" CLI/UI format for payload 1.1111112E7 Compared to the SQL standard, this type doesn’t take parameters. Date and time¶ DATE¶ Represents a date consisting of year-month-day with values ranging from 0000-01-01 to 9999-12-31. Declaration DATE Bridging to JVM types Java Type Input Output Notes java.time.LocalDate ✓ ✓ Default java.sql.Date ✓ ✓ java.lang.Integer ✓ ✓ Describes the number of days since Unix epoch int ✓ (✓) Describes the number of days since Unix epoch. 
Output only if type is not nullable. Formats The following table shows examples of the DATE type in different formats. JSON for data type {"type":"DATE","nullable":true} CLI/UI format DATE JSON for payload "2023-04-06" CLI/UI format for payload 2023-04-06 Compared to the SQL standard, the range starts at year 0000. INTERVAL DAY TO SECOND¶ Data type for a group of day-time interval types. Declaration INTERVAL DAY INTERVAL DAY(p1) INTERVAL DAY(p1) TO HOUR INTERVAL DAY(p1) TO MINUTE INTERVAL DAY(p1) TO SECOND(p2) INTERVAL HOUR INTERVAL HOUR TO MINUTE INTERVAL HOUR TO SECOND(p2) INTERVAL MINUTE INTERVAL MINUTE TO SECOND(p2) INTERVAL SECOND INTERVAL SECOND(p2) Bridging to JVM types Java Type Input Output Notes java.time.Duration ✓ ✓ Default java.lang.Long ✓ ✓ Describes the number of milliseconds long ✓ (✓) Describes the number of milliseconds. Output only if type is not nullable. Formats The following table shows examples of the INTERVAL DAY TO SECOND type in different formats. JSON for data type {"type":"INTERVAL_DAY_TIME","nullable":true,"precision":1,"fractionalPrecision":3,"resolution":"DAY_TO_SECOND"} CLI/UI format INTERVAL DAY(1) TO SECOND(3) JSON for payload "+2 07:33:20.000" CLI/UI format for payload +2 07:33:20.000 Declare this type by using the above combinations, where p1 is the number of digits of days (day precision) and p2 is the number of digits of fractional seconds (fractional precision). p1 must have a value between 1 and 6 (both inclusive). If no p1 is specified, it is equal to 2 by default. p2 must have a value between 0 and 9 (both inclusive). If no p2 is specified, it is equal to 6 by default. 
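
As a rough illustration of the declarations above, the following sketch uses day-time interval literals in a query. The literal values and column aliases are hypothetical. It assumes the standard `INTERVAL '<value>' <unit>` literal syntax, and it keeps intervals inside a query rather than in a table definition, because the type-mapping table on this page lists interval types as not supported for JSON Schema, Protobuf, and Avro.

```sql
-- Illustrative only; the literals and aliases are hypothetical.
SELECT
  -- 2 days, 7 hours, 33 minutes, 20 seconds
  INTERVAL '2 07:33:20.000' DAY TO SECOND AS day_to_second_interval,
  -- a single-unit interval
  INTERVAL '90' MINUTE AS ninety_minutes,
  -- interval arithmetic on a timestamp
  TIMESTAMP '2023-04-06 10:59:32' + INTERVAL '1' HOUR AS one_hour_later;
```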
The type must be parameterized to one of these resolutions with up to nanosecond precision: Interval of days Interval of days to hours Interval of days to minutes Interval of days to seconds Interval of hours Interval of hours to minutes Interval of hours to seconds Interval of minutes Interval of minutes to seconds Interval of seconds An interval of day-time consists of +days hours:minutes:seconds.fractional with values ranging from -999999 23:59:59.999999999 to +999999 23:59:59.999999999. The value representation is the same for all types of resolutions. For example, an interval of seconds of 70 is always represented in an interval-of-days-to-seconds format (with default precisions): +00 00:01:10.000000. Formatting intervals is tricky, because they have different resolutions: DAY DAY_TO_HOUR DAY_TO_MINUTE DAY_TO_SECOND HOUR HOUR_TO_MINUTE HOUR_TO_SECOND MINUTE MINUTE_TO_SECOND SECOND Depending on the resolution, use: INTERVAL DAY(1) INTERVAL DAY(1) TO HOUR INTERVAL DAY(1) TO MINUTE INTERVAL DAY(1) TO SECOND(3) INTERVAL HOUR INTERVAL HOUR TO MINUTE INTERVAL HOUR TO SECOND(3) INTERVAL MINUTE INTERVAL MINUTE TO SECOND(3) INTERVAL SECOND(3) INTERVAL YEAR TO MONTH¶ Data type for a group of year-month interval types. Declaration INTERVAL YEAR INTERVAL YEAR(p) INTERVAL YEAR(p) TO MONTH INTERVAL MONTH Bridging to JVM types Java Type Input Output Notes java.time.Period ✓ ✓ Default. Ignores the days part. java.lang.Integer ✓ ✓ Describes the number of months. int ✓ (✓) Describes the number of months. Output only if type is not nullable. Formats The following table shows examples of the INTERVAL YEAR TO MONTH type in different formats. JSON for data type {"type":"INTERVAL_YEAR_MONTH","nullable":true,"precision":4,"resolution":"YEAR_TO_MONTH"} CLI/UI format INTERVAL YEAR(4) TO MONTH JSON for payload "+2000-02" CLI/UI format for payload +2000-02 Declare this type by using the above combinations, where p is the number of digits of years (year precision).
p must have a value between 1 and 4 (both inclusive). If no year precision is specified, p is equal to 2. The type must be parameterized to one of these resolutions: Interval of years Interval of years to months Interval of months An interval of year-month consists of +years-months with values ranging from -9999-11 to +9999-11. The value representation is the same for all types of resolutions. For example, an interval of months of 50 is always represented in an interval-of-years-to-months format (with default year precision): +04-02. Formatting intervals is tricky, because they have different resolutions: YEAR YEAR_TO_MONTH MONTH Depending on the resolution, use: INTERVAL YEAR(4) INTERVAL YEAR(4) TO MONTH INTERVAL MONTH TIME¶ Represents a time without timezone consisting of hour:minute:second[.fractional] with up to nanosecond precision and values ranging from 00:00:00.000000000 to 23:59:59.999999999. Declaration TIME TIME(p) TIME WITHOUT TIME ZONE TIME(p) WITHOUT TIME ZONE Bridging to JVM types Java Type Input Output Notes java.time.LocalTime ✓ ✓ Default java.sql.Time ✓ ✓ java.lang.Integer ✓ ✓ Describes the number of milliseconds of the day. int ✓ (✓) Describes the number of milliseconds of the day. Output only if type is not nullable. java.lang.Long ✓ ✓ Describes the number of nanoseconds of the day. long ✓ (✓) Describes the number of nanoseconds of the day. Output only if type is not nullable. Formats The following table shows examples of the TIME type in different formats. JSON for data type {"type":"TIME_WITHOUT_TIME_ZONE","nullable":true,"precision":3} CLI/UI format TIME(3) JSON for payload "10:56:22.541" CLI/UI format for payload 10:56:22.541 Declare this type by using TIME(p), where p is the number of digits of fractional seconds (precision). p must have a value between 0 and 9 (both inclusive). If no precision is specified, p is equal to 0.
Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported, as the semantics are closer to java.time.LocalTime. A time with timezone is not provided. TIME acts like a pure string and isn’t related to a time zone of any kind, including UTC. TIME WITHOUT TIME ZONE is a synonym for this type. TIMESTAMP¶ Represents a timestamp without timezone consisting of year-month-day hour:minute:second[.fractional] with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999. Declaration TIMESTAMP TIMESTAMP(p) TIMESTAMP WITHOUT TIME ZONE TIMESTAMP(p) WITHOUT TIME ZONE Bridging to JVM types Java Type Input Output Notes java.time.LocalDateTime ✓ ✓ Default java.sql.Timestamp ✓ ✓ org.apache.flink.table.data.TimestampData ✓ ✓ Internal data structure Formats The following table shows examples of the TIMESTAMP type in different formats. JSON for data type {"type":"TIMESTAMP_WITHOUT_TIME_ZONE","nullable":true,"precision":3} CLI/UI format TIMESTAMP(3) JSON for payload "2023-04-06 10:59:32.628" CLI/UI format for payload 2023-04-06 10:59:32.628 Declare this type by using TIMESTAMP(p), where p is the number of digits of fractional seconds (precision). p must have a value between 0 and 9 (both inclusive). If no precision is specified, p is equal to 6. A space separates the date and time parts. Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported, as the semantics are closer to java.time.LocalDateTime. A conversion from and to BIGINT (a JVM long type) is not supported, as this would imply a timezone, but this type is time-zone free. For more java.time.Instant-like semantics use TIMESTAMP_LTZ. TIMESTAMP acts like a pure string and isn’t related to a time zone of any kind, including UTC. TIMESTAMP WITHOUT TIME ZONE is a synonym for this type. 
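
The TIMESTAMP declarations above can be sketched as follows. The table name `sensor_readings`, its columns, and the literal values are hypothetical, and the statements are meant as a minimal illustration of declaring and casting the type, not as a reference implementation.

```sql
-- Illustrative only; table, column, and alias names are hypothetical.

-- Declare a column with millisecond precision.
CREATE TABLE sensor_readings (
  sensor_id STRING,
  reading_time TIMESTAMP(3),
  reading DOUBLE
);

-- A space separates the date and time parts in literals and casts.
SELECT
  TIMESTAMP '2023-04-06 10:59:32.628' AS ts_literal,
  CAST('2023-04-06 10:59:32.628' AS TIMESTAMP(3)) AS ts_millis,
  CAST(TIMESTAMP '2023-04-06 10:59:32.628' AS TIMESTAMP(0)) AS ts_seconds;
```

Because TIMESTAMP carries no time zone, none of these values identifies a specific instant; if that is needed, TIMESTAMP_LTZ is likely the better fit.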
TIMESTAMP_LTZ¶ Represents a timestamp with the local timezone consisting of year-month-day hour:minute:second[.fractional] zone with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 +14:59 to 9999-12-31 23:59:59.999999999 -14:59. Declaration TIMESTAMP_LTZ TIMESTAMP_LTZ(p) TIMESTAMP WITH LOCAL TIME ZONE TIMESTAMP(p) WITH LOCAL TIME ZONE Bridging to JVM types Java Type Input Output Notes java.time.Instant ✓ ✓ Default java.lang.Integer ✓ ✓ Describes the number of seconds since Unix epoch. int ✓ (✓) Describes the number of seconds since Unix epoch. Output only if type is not nullable. java.lang.Long ✓ ✓ Describes the number of milliseconds since Unix epoch. long ✓ (✓) Describes the number of milliseconds since Unix epoch. Output only if type is not nullable. java.sql.Timestamp ✓ ✓ Describes the number of milliseconds since Unix epoch. org.apache.flink.table.data.TimestampData ✓ ✓ Internal data structure Formats The following table shows examples of the TIMESTAMP_LTZ type in different formats. JSON for data type {"type":"TIMESTAMP_WITH_LOCAL_TIME_ZONE","nullable":true,"precision":3} CLI/UI format TIMESTAMP(3) WITH LOCAL TIME ZONE JSON for payload "2023-04-06 11:06:47.224" CLI/UI format for payload 2023-04-06 11:06:47.224 Declare this type by using TIMESTAMP_LTZ(p), where p is the number of digits of fractional seconds (precision). p must have a value between 0 and 9 (both inclusive). If no precision is specified, p is equal to 6. Leap seconds (23:59:60 and 23:59:61) are not supported, as the semantics are closer to java.time.OffsetDateTime. Compared to TIMESTAMP WITH TIME ZONE, the timezone offset information is not stored physically in every datum. Instead, the type assumes java.time.Instant semantics in the UTC timezone at the edges of the table ecosystem. Every datum is interpreted in the local timezone configured in the current session for computation and visualization. 
This type fills the gap between time-zone free and time-zone mandatory timestamp types by allowing the interpretation of UTC timestamps according to the configured session timezone. TIMESTAMP_LTZ resembles a TIMESTAMP without a timezone, but the string always considers the sessions/query’s timezone. Internally, it is always in the UTC time zone. If you require the short format, prefer TIMESTAMP_LTZ(3). TIMESTAMP WITH LOCAL TIME ZONE is a synonym for this type. TIMESTAMP and TIMESTAMP_LTZ comparison¶ Although TIMESTAMP and TIMESTAMP_LTZ are similarly named, they represent different concepts. TIMESTAMP_LTZ TIMESTAMP_LTZ in SQL is similar to the Instant class in Java. TIMESTAMP_LTZ represents a moment, or a specific point in the UTC timeline. TIMESTAMP_LTZ stores time as a UTC integer, which can be converted dynamically to every other timezone. When printing or casting TIMESTAMP_LTZ as a character string, the sql.local-time-zone setting is considered. TIMESTAMP TIMESTAMP in SQL is similar to LocalDateTime in Java. TIMESTAMP has no time zone or offset from UTC, so it can’t represent a moment. TIMESTAMP stores time as character string, not related to any timezone. TIMESTAMP WITH TIME ZONE¶ Represents a timestamp with time zone consisting of year-month-day hour:minute:second[.fractional] zone with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 +14:59 to 9999-12-31 23:59:59.999999999 -14:59. Declaration TIMESTAMP WITH TIME ZONE TIMESTAMP(p) WITH TIME ZONE Bridging to JVM types Java Type Input Output Notes java.time.OffsetDateTime ✓ ✓ Default java.time.ZonedDateTime ✓ Ignores the zone ID Compared to TIMESTAMP_LTZ, the time zone offset information is stored physically in every datum. It is used individually for every computation, visualization, or communication to external systems. Collection data types¶ ARRAY¶ Represents an array of elements with same subtype. 
Declaration ARRAY<t> t ARRAY Bridging to JVM types Java Type Input Output Notes t[] ✓ ✓ Default. Depends on the subtype. java.util.List<t> ✓ ✓ subclass of java.util.List<t> ✓ org.apache.flink.table.data.ArrayData ✓ ✓ Internal data structure Formats The following table shows examples of the ARRAY type in different formats. JSON for data type {"type":"ARRAY","nullable":true,"elementType":{"type":"INTEGER","nullable":true}} CLI/UI format ARRAY<INT> JSON for payload ["1", "2", "3", null] CLI/UI format for payload [1, 2, 3, NULL] Declare this type by using ARRAY<t>, where t is the data type of the contained elements. Compared to the SQL standard, the maximum cardinality of an array cannot be specified and is fixed at 2,147,483,647. Also, any valid type is supported as a subtype. t ARRAY is a synonym that is closer to the SQL standard. For example, INT ARRAY is equivalent to ARRAY<INT>. MAP¶ Represents an associative array that maps keys (including NULL) to values (including NULL). Declaration MAP<kt, vt> Bridging to JVM types Java Type Input Output Notes java.util.Map<kt, vt> ✓ ✓ Default subclass of java.util.Map<kt, vt> ✓ org.apache.flink.table.data.MapData ✓ ✓ Internal data structure Formats The following table shows examples of the MAP type in different formats. JSON for data type {"type":"MAP","nullable":true,"keyType":{"type":"INTEGER","nullable":true},"valueType":{"type":"VARCHAR","nullable":true,"length":2147483647}} CLI/UI format MAP<INT, STRING> JSON for payload [["1", "a"], ["2", "b"], [null, "c"]] CLI/UI format for payload {1=a, 2=b, NULL=c} Declare this type by using MAP<kt, vt> where kt is the data type of the key elements and vt is the data type of the value elements. A map can’t contain duplicate keys. Each key can map to at most one value. There is no restriction of element types. It is the responsibility of the user to ensure uniqueness. The map type is an extension to the SQL standard. MULTISET¶ Represents a multiset (=bag).
Declaration MULTISET<t> t MULTISET Bridging to JVM types Java Type Input Output Notes java.util.Map<t, java.lang.Integer> ✓ ✓ Default. Assigns each value to an integer multiplicity. subclass of java.util.Map<t, java.lang.Integer> ✓ org.apache.flink.table.data.MapData ✓ ✓ Internal data structure Formats The following table shows examples of the MULTISET type in different formats. JSON for data type {"type":"MULTISET","nullable":true,"elementType":{"type":"INTEGER","nullable":true}} CLI/UI format MULTISET<INT> JSON for payload [["a", "1"], ["b", "2"], [null, "1"]] CLI/UI format for payload {a=1, b=2, NULL=1} Declare this type by using MULTISET<t> where t is the data type of the contained elements. Unlike a set, the multiset allows for multiple instances for each of its elements with a common subtype. Each unique value (including NULL) is mapped to some multiplicity. There is no restriction of element types; it is the responsibility of the user to ensure uniqueness. t MULTISET is a synonym for being closer to the SQL standard. For example, INT MULTISET is equivalent to MULTISET<INT>. ROW¶ Represents a sequence of fields. Declaration ROW<name0 type0, name1 type1, ...> ROW<name0 type0 'description0', name1 type1 'description1', ...> ROW(name0 type0, name1 type1, ...) ROW(name0 type0 'description0', name1 type1 'description1', ...) Bridging to JVM types Java Type Input Output Notes org.apache.flink.types.Row ✓ ✓ Default org.apache.flink.table.data.RowData ✓ ✓ Internal data structure Formats The following table shows examples of the ROW type in different formats. 
JSON for data type {"type":"ROW","nullable":true,"fields":[{"name":"a","fieldType":{"type":"INTEGER","nullable":true}},{"name":"b","fieldType":{"type":"VARCHAR","nullable":true,"length":2147483647}}]} CLI/UI format MULTISET<INT> JSON for payload [["a", "1"], ["b", "2"], [null, "1"]] CLI/UI format for payload {a=1, b=2, NULL=1} Declare this type by using ROW<n0 t0 'd0', n1 t1 'd1', ...>, where n is the unique name of a field, t is the logical type of a field, d is the description of a field. A field consists of a field name, field type, and an optional description. The most specific type of a row of a table is a row type. In this case, each column of the row corresponds to the field of the row type that has the same ordinal position as the column. To create a table with a row type, use the following syntax: CREATE TABLE table_with_row_types ( `Customer` ROW<name STRING, age INT>, `Order` ROW<id BIGINT, title STRING> ); To insert a row into a table with a row type, use the following syntax: INSERT INTO table_with_row_types VALUES (('Alice', 30), (101, 'Book')), (('Bob', 25), (102, 'Laptop')), (('Charlie', 35), (103, 'Phone')), (('Diana', 28), (104, 'Tablet')), (('Eve', 22), (105, 'Headphones')); To work with fields from a row, use dot notation: SELECT `Customer`.name, `Customer`.age, `Order`.id, `Order`.title FROM table_with_row_types WHERE `Customer`.age > 30; Compared to the SQL standard, an optional field description simplifies the handling with complex structures. A row type is similar to the STRUCT type known from other non-standard-compliant frameworks. ROW(...) is a synonym for being closer to the SQL standard. For example, ROW(fieldOne INT, fieldTwo BOOLEAN) is equivalent to ROW<fieldOne INT, fieldTwo BOOLEAN>. If the fields of the data type contain characters other than [A-Za-z_], use escaping notation. 
Double backticks escape the backtick character, for example: ROW<`a-b` INT, b STRING, `weird_col``_umn` STRING> Rows fields can contain comments, for example: {"type":"ROW","nullable":true,"fields":[{"name":"a","fieldType":{"type":"INTEGER","nullable":true},"description":"hello"}]} Format using single quotes. Double single quotes escape single quotes, for example: ROW<a INT 'This field''s content'> Other data types¶ BOOLEAN¶ Represents a boolean with a (possibly) three-valued logic of TRUE, FALSE, and UNKNOWN. Declaration BOOLEAN Bridging to JVM types Java Type Input Output Notes java.lang.Boolean ✓ ✓ Default boolean ✓ (✓) Output only if type is not nullable. Formats The following table shows examples of the BOOLEAN type in different formats. JSON for data type {"type":"BOOLEAN","nullable":true} CLI/UI format NULL JSON for payload null CLI/UI format for payload NULL NULL¶ Data type for representing untyped NULL values. Declaration NULL Bridging to JVM types Java Type Input Output Notes java.lang.Object ✓ ✓ Default any class (✓) Any non-primitive type. Formats The following table shows examples of the NULL type in different formats. JSON for data type {"type":"NULL"} CLI/UI format NULL JSON for payload null CLI/UI format for payload NULL The NULL type is an extension to the SQL standard. A NULL type has no other value except NULL, thus, it can be cast to any nullable type similar to JVM semantics. This type helps in representing unknown types in API calls that use a NULL literal as well as bridging to formats such as JSON or Avro that define such a type as well. This type is not very useful in practice and is described here only for completeness. Casting¶ Flink SQL can perform casting between a defined input type and target type. While some casting operations can always succeed regardless of the input value, others can fail at runtime when there’s no way to create a value for the target type. 
For example, it’s always possible to convert INT to STRING, but you can’t always convert a STRING to INT.

During the planning stage, the query validator rejects queries for invalid type pairs with a ValidationException, for example, when trying to cast a TIMESTAMP to an INTERVAL. Valid type pairs that can fail at runtime are accepted by the query validator, so you must handle cast failures yourself.

In Flink SQL, casting can be performed by using one of these two built-in functions:

- CAST: The regular cast function defined by the SQL standard. It can fail the job if the cast operation is fallible and the provided input is not valid. Type inference preserves the nullability of the input type.
- TRY_CAST: An extension to the regular cast function that returns NULL if the cast operation fails. Its return type is always nullable.

For example:

- SELECT CAST('42' AS INT); returns 42 of type INT NOT NULL.
- SELECT CAST(NULL AS VARCHAR); returns NULL of type VARCHAR.
- SELECT CAST('non-number' AS INT); throws an exception and fails the job.
- SELECT TRY_CAST('42' AS INT); returns 42 of type INT.
- SELECT TRY_CAST(NULL AS VARCHAR); returns NULL of type VARCHAR.
- SELECT TRY_CAST('non-number' AS INT); returns NULL of type INT.
- SELECT COALESCE(TRY_CAST('non-number' AS INT), 0); returns 0 of type INT NOT NULL.

The following matrix shows the supported cast pairs, where “Y” means supported, “!” means fallible, and “N” means unsupported. Rows are input types and columns are target types.

| Input / Target | CHAR¹ / VARCHAR¹ / STRING | BINARY¹ / VARBINARY¹ / BYTES | BOOLEAN | DECIMAL | TINYINT | SMALLINT | INTEGER | BIGINT | FLOAT | DOUBLE | DATE | TIME | TIMESTAMP | TIMESTAMP_LTZ | INTERVAL | ARRAY | MULTISET | MAP | ROW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CHAR / VARCHAR / STRING | Y | ! | ! | ! | ! | ! | ! | ! | ! | ! | ! | ! | ! | ! | N | N | N | N | N |
| BINARY / VARBINARY / BYTES | Y | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
| BOOLEAN | Y | N | Y | Y | Y | Y | Y | Y | Y | Y | N | N | N | N | N | N | N | N | N |
| DECIMAL | Y | N | N | Y | Y | Y | Y | Y | Y | Y | N | N | N | N | N | N | N | N | N |
| TINYINT | Y | N | Y | Y | Y | Y | Y | Y | Y | Y | N | N | N² | N² | N | N | N | N | N |
| SMALLINT | Y | N | Y | Y | Y | Y | Y | Y | Y | Y | N | N | N² | N² | N | N | N | N | N |
| INTEGER | Y | N | Y | Y | Y | Y | Y | Y | Y | Y | N | N | N² | N² | Y⁵ | N | N | N | N |
| BIGINT | Y | N | Y | Y | Y | Y | Y | Y | Y | Y | N | N | N² | N² | Y⁶ | N | N | N | N |
| FLOAT | Y | N | N | Y | Y | Y | Y | Y | Y | Y | N | N | N | N | N | N | N | N | N |
| DOUBLE | Y | N | N | Y | Y | Y | Y | Y | Y | Y | N | N | N | N | N | N | N | N | N |
| DATE | Y | N | N | N | N | N | N | N | N | N | Y | N | Y | Y | N | N | N | N | N |
| TIME | Y | N | N | N | N | N | N | N | N | N | N | Y | Y | Y | N | N | N | N | N |
| TIMESTAMP | Y | N | N | N | N | N | N | N | N | N | Y | Y | Y | Y | N | N | N | N | N |
| TIMESTAMP_LTZ | Y | N | N | N | N | N | N | N | N | N | Y | Y | Y | Y | N | N | N | N | N |
| INTERVAL | Y | N | N | N | N | N | Y⁵ | Y⁶ | N | N | N | N | N | N | Y | N | N | N | N |
| ARRAY | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | !³ | N | N | N |
| MULTISET | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | !³ | N | N |
| MAP | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | !³ | N |
| ROW | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | !³ |

Notes:

- ¹ All casts to fixed-length or variable-length types also trim and pad the value according to the type definition.
- ² TO_TIMESTAMP and TO_TIMESTAMP_LTZ must be used instead of CAST/TRY_CAST.
- ³ Supported if and only if the children type pairs are supported. Fallible if and only if the children type pairs are fallible.
- ⁴ Supported if and only if the RAW class and serializer are equal.
- ⁵ Supported if and only if INTERVAL is a MONTH TO YEAR range.
- ⁶ Supported if and only if INTERVAL is a DAY TO TIME range.

Note: A cast of a NULL value always returns NULL, regardless of whether the function used is CAST or TRY_CAST.

Data type extraction¶

In many locations in the API, Flink tries to extract data types automatically from class information by using reflection, to avoid repetitive manual schema work. But extracting a data type by reflection is not always successful, because logical information might be missing. In these cases, it may be necessary to add information close to a class or field declaration to support the extraction logic. The following table lists classes that map implicitly to a data type without requiring further information.
Other JVM bridging classes require the @DataTypeHint annotation. Class Data Type boolean BOOLEAN NOT NULL byte TINYINT NOT NULL byte[] BYTES double DOUBLE NOT NULL float FLOAT NOT NULL int INT NOT NULL java.lang.Boolean BOOLEAN java.lang.Byte TINYINT java.lang.Double DOUBLE java.lang.Float FLOAT java.lang.Integer INT java.lang.Long BIGINT java.lang.Short SMALLINT java.lang.String STRING java.sql.Date DATE java.sql.Time TIME(0) java.sql.Timestamp TIMESTAMP(9) java.time.Duration INTERVAL SECOND(9) java.time.Instant TIMESTAMP_LTZ(9) java.time.LocalDate DATE java.time.LocalTime TIME(9) java.time.LocalDateTime TIMESTAMP(9) java.time.OffsetDateTime TIMESTAMP(9) WITH TIME ZONE java.time.Period INTERVAL YEAR(4) TO MONTH java.util.Map<K, V> MAP<K, V> short SMALLINT NOT NULL structured type T anonymous structured type T long BIGINT NOT NULL T[] ARRAY<T> Related content¶ DDL Statements Flink SQL Queries Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
INT
INT NOT NULL
INTERVAL DAY TO SECOND(3)
ROW<fieldOne ARRAY<BOOLEAN>, fieldTwo TIMESTAMP(3)>
```

```sql
CHAR
CHAR(n)
```

```sql
{"type":"CHAR","nullable":true,"length":8}
```

```sql
"Example string"
```

```sql
Example string
```

```sql
VARCHAR
VARCHAR(n)

STRING
```

```sql
{"type":"VARCHAR","nullable":true,"length":8}
```

```sql
VARCHAR(800)
```

```sql
"Example string"
```

```sql
Example string
```

```sql
VARCHAR(2147483647)
```

```sql
BINARY
BINARY(n)
```

```sql
{"type":"BINARY","nullable":true,"length":1}
```

```sql
"x'7f0203'"
```

```sql
BYTES

VARBINARY
VARBINARY(n)
```

```sql
{"type":"VARBINARY","nullable":true,"length":1}
```

```sql
VARBINARY(800)
```

```sql
"x'7f0203'"
```

```sql
VARBINARY(n)
```

```sql
VARBINARY(2147483647)
```

```sql
{"type":"BIGINT","nullable":true}
```

```sql
DECIMAL
DECIMAL(p)
DECIMAL(p, s)

DEC
DEC(p)
DEC(p, s)

NUMERIC
NUMERIC(p)
NUMERIC(p, s)
```

```sql
{"type":"DECIMAL","nullable":true,"precision":5,"scale":3}
```

```sql
DECIMAL(5, 3)
```

```sql
DECIMAL(p, s)
```

```sql
NUMERIC(p, s)
```

```sql
INT

INTEGER
```

```sql
{"type":"INT","nullable":true}
```

```sql
{"type":"SMALLINT","nullable":true}
```

```sql
{"type":"TINYINT","nullable":true}
```

```sql
DOUBLE

DOUBLE PRECISION
```

```sql
{"type":"DOUBLE","nullable":true}
```

```sql
"1.1111112120000001E7"
```

```sql
1.1111112120000001E7
```

```sql
DOUBLE PRECISION
```

```sql
{"type":"FLOAT","nullable":true}
```

```sql
"1.1111112E7"
```

```sql
1.1111112E7
```

```sql
year-month-day
```

```sql
{"type":"DATE","nullable":true}
```

```sql
"2023-04-06"
```

```sql
INTERVAL DAY
INTERVAL DAY(p1)
INTERVAL DAY(p1) TO HOUR
INTERVAL DAY(p1) TO MINUTE
INTERVAL DAY(p1) TO SECOND(p2)
INTERVAL HOUR
INTERVAL HOUR TO MINUTE
INTERVAL HOUR TO SECOND(p2)
INTERVAL MINUTE
INTERVAL MINUTE TO SECOND(p2)
INTERVAL SECOND
INTERVAL SECOND(p2)
```

```sql
{"type":"INTERVAL_DAY_TIME","nullable":true,"precision":1,"fractionalPrecision":3,"resolution":"DAY_TO_SECOND"}
```

```sql
INTERVAL DAY(1) TO SECOND(3)
```

```sql
"+2 07:33:20.000"
```

```sql
+2 07:33:20.000
```

```sql
+days hours:minutes:seconds.fractional
```

```sql
-999999 23:59:59.999999999
```

```sql
+999999 23:59:59.999999999
```

```sql
+00 00:01:10.000000
```

```sql
INTERVAL DAY(1)
INTERVAL DAY(1) TO HOUR
INTERVAL DAY(1) TO MINUTE
INTERVAL DAY(1) TO SECOND(3)
INTERVAL HOUR
INTERVAL HOUR TO MINUTE
INTERVAL HOUR TO SECOND(3)
INTERVAL MINUTE
INTERVAL MINUTE TO SECOND(3)
INTERVAL SECOND(3)
```
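The day-time interval declarations above can be exercised with interval literals. The following is a minimal sketch, not taken from the original page: the timestamp value and the column aliases are arbitrary choices for illustration.

```sql
-- Sketch: add day-time intervals to a TIMESTAMP literal.
-- The timestamp and the alias names are illustrative assumptions.
SELECT
  TIMESTAMP '2023-04-06 10:59:32' + INTERVAL '2' DAY                         AS plus_two_days,
  TIMESTAMP '2023-04-06 10:59:32' + INTERVAL '10' MINUTE                     AS plus_ten_minutes,
  TIMESTAMP '2023-04-06 10:59:32' + INTERVAL '2 07:33:20.000' DAY TO SECOND  AS plus_compound_interval;
```

The compound literal follows the +days hours:minutes:seconds.fractional layout, so it adds 2 days, 7 hours, 33 minutes, and 20 seconds.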

```sql
INTERVAL YEAR
INTERVAL YEAR(p)
INTERVAL YEAR(p) TO MONTH
INTERVAL MONTH
```

```sql
{"type":"INTERVAL_YEAR_MONTH","nullable":true,"precision":4,"resolution":"YEAR_TO_MONTH"}
```

```sql
INTERVAL YEAR(4) TO MONTH
```

```sql
+years-months
```

```sql
INTERVAL YEAR(4)
INTERVAL YEAR(4) TO MONTH
INTERVAL MONTH
```

```sql
hour:minute:second[.fractional]
```

```sql
00:00:00.000000000
```

```sql
23:59:59.999999999
```

```sql
TIME
TIME(p)

TIME_WITHOUT_TIME_ZONE
TIME_WITHOUT_TIME_ZONE(p)
```

```sql
{"type":"TIME_WITHOUT_TIME_ZONE","nullable":true,"precision":3}
```

```sql
"10:56:22.541"
```

```sql
10:56:22.541
```

```sql
java.time.LocalTime
```

```sql
TIME WITHOUT TIME ZONE
```

```sql
year-month-day hour:minute:second[.fractional]
```

```sql
0000-01-01 00:00:00.000000000
```

```sql
9999-12-31 23:59:59.999999999
```

```sql
TIMESTAMP
TIMESTAMP(p)

TIMESTAMP WITHOUT TIME ZONE
TIMESTAMP(p) WITHOUT TIME ZONE
```

```sql
{"type":"TIMESTAMP_WITHOUT_TIME_ZONE","nullable":true,"precision":3}
```

```sql
TIMESTAMP(3)
```

```sql
"2023-04-06 10:59:32.628"
```

```sql
2023-04-06 10:59:32.628
```

```sql
TIMESTAMP(p)
```

```sql
java.time.LocalDateTime
```

```sql
java.time.Instant
```

```sql
TIMESTAMP_LTZ
```

```sql
TIMESTAMP WITHOUT TIME ZONE
```

```sql
year-month-day hour:minute:second[.fractional] zone
```

```sql
0000-01-01 00:00:00.000000000 +14:59
```

```sql
9999-12-31 23:59:59.999999999 -14:59
```

```sql
TIMESTAMP_LTZ
TIMESTAMP_LTZ(p)

TIMESTAMP WITH LOCAL TIME ZONE
TIMESTAMP(p) WITH LOCAL TIME ZONE
```

```sql
{"type":"TIMESTAMP_WITH_LOCAL_TIME_ZONE","nullable":true,"precision":3}
```

```sql
TIMESTAMP(3) WITH LOCAL TIME ZONE
```

```sql
"2023-04-06 11:06:47.224"
```

```sql
2023-04-06 11:06:47.224
```

```sql
TIMESTAMP_LTZ(p)
```

```sql
java.time.OffsetDateTime
```

```sql
TIMESTAMP WITH TIME ZONE
```

```sql
java.time.Instant
```

```sql
TIMESTAMP_LTZ
```

```sql
TIMESTAMP_LTZ(3)
```

```sql
TIMESTAMP WITH LOCAL TIME ZONE
```

```sql
sql.local-time-zone
```
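To see how the session time zone affects the string form of a TIMESTAMP_LTZ value, a sketch like the following may help. It assumes a client that accepts SET statements, such as the SQL shell or a workspace; the two zone names are arbitrary.

```sql
-- Sketch: the same instant (epoch 0) rendered under two session time zones.
-- The zone names are arbitrary choices for illustration.
SET 'sql.local-time-zone' = 'UTC';
SELECT CAST(TO_TIMESTAMP_LTZ(0, 3) AS STRING) AS epoch_as_string;

SET 'sql.local-time-zone' = 'Asia/Tokyo';
SELECT CAST(TO_TIMESTAMP_LTZ(0, 3) AS STRING) AS epoch_as_string;
```

Both queries read the same stored instant. Only the rendered string should differ, by the offset between the two zones.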

```sql
LocalDateTime
```

```sql
TIMESTAMP WITH TIME ZONE
TIMESTAMP(p) WITH TIME ZONE
```

```sql
ARRAY<t>
t ARRAY
```

```sql
{"type":"ARRAY","nullable":true,"elementType":{"type":"INTEGER","nullable":true}}
```

```sql
["1", "2", "3", null]
```

```sql
[1, 2, 3, NULL]
```
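As a small illustration of working with ARRAY values, the following sketch builds an array with the ARRAY[...] constructor and reads it back. The inline VALUES table and the names t and arr are hypothetical.

```sql
-- Sketch: construct an ARRAY<INT> and access it.
-- t and arr are hypothetical names used only for this example.
SELECT
  arr              AS whole_array,
  arr[1]           AS first_element,   -- array indexes start at 1
  CARDINALITY(arr) AS element_count
FROM (VALUES (ARRAY[1, 2, 3])) AS t(arr);
```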

```sql
MAP<kt, vt>
```

```sql
{"type":"MAP","nullable":true,"keyType":{"type":"INTEGER","nullable":true},"valueType":{"type":"VARCHAR","nullable":true,"length":2147483647}}
```

```sql
MAP<INT, STRING>
```

```sql
[["1", "a"], ["2", "b"], [null, "c"]]
```

```sql
{1=a, 2=b, NULL=c}
```

```sql
MAP<kt, vt>
```
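A MAP value can be built with the MAP[...] constructor, which takes alternating keys and values. The sketch below uses an inline VALUES table; the names t and m are hypothetical.

```sql
-- Sketch: construct a MAP<STRING, INT> and look up a key.
-- t and m are hypothetical names used only for this example.
SELECT
  m              AS whole_map,
  m['a']         AS value_for_a,   -- looking up a missing key returns NULL
  CARDINALITY(m) AS entry_count
FROM (VALUES (MAP['a', 1, 'b', 2])) AS t(m);
```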

```sql
MULTISET<t>
t MULTISET
```

```sql
{"type":"MULTISET","nullable":true,"elementType":{"type":"INTEGER","nullable":true}}
```

```sql
MULTISET<INT>
```

```sql
[["a", "1"], ["b", "2"], [null, "1"]]
```

```sql
{a=1, b=2, NULL=1}
```

```sql
MULTISET<t>
```

```sql
INT MULTISET
```

```sql
MULTISET<INT>
```

```sql
ROW<name0 type0, name1 type1, ...>
ROW<name0 type0 'description0', name1 type1 'description1', ...>

ROW(name0 type0, name1 type1, ...)
ROW(name0 type0 'description0', name1 type1 'description1', ...)
```

```sql
{"type":"ROW","nullable":true,"fields":[{"name":"a","fieldType":{"type":"INTEGER","nullable":true}},{"name":"b","fieldType":{"type":"VARCHAR","nullable":true,"length":2147483647}}]}
```

```sql
ROW<n0 t0 'd0', n1 t1 'd1', ...>
```

```sql
CREATE TABLE table_with_row_types (
   `Customer` ROW<name STRING, age INT>,
   `Order` ROW<id BIGINT, title STRING>
);
```

```sql
INSERT INTO table_with_row_types VALUES
   (('Alice', 30), (101, 'Book')),
   (('Bob', 25), (102, 'Laptop')),
   (('Charlie', 35), (103, 'Phone')),
   (('Diana', 28), (104, 'Tablet')),
   (('Eve', 22), (105, 'Headphones'));
```

```sql
SELECT `Customer`.name, `Customer`.age, `Order`.id, `Order`.title
FROM table_with_row_types
WHERE `Customer`.age > 30;
```

```sql
ROW(fieldOne INT, fieldTwo BOOLEAN)
```

```sql
ROW<fieldOne INT, fieldTwo BOOLEAN>
```

```sql
ROW<`a-b` INT, b STRING, `weird_col``_umn` STRING>
```

```sql
{"type":"ROW","nullable":true,"fields":[{"name":"a","fieldType":{"type":"INTEGER","nullable":true},"description":"hello"}]}
```

```sql
ROW<a INT 'This field''s content'>
```

```sql
{"type":"BOOLEAN","nullable":true}
```

```sql
{"type":"NULL"}
```

```sql
ValidationException
```

```sql
-- returns 42 of type INT NOT NULL
SELECT CAST('42' AS INT);

-- returns NULL of type VARCHAR
SELECT CAST(NULL AS VARCHAR);

-- throws an exception and fails the job
SELECT CAST('non-number' AS INT);

-- returns 42 of type INT
SELECT TRY_CAST('42' AS INT);

-- returns NULL of type VARCHAR
SELECT TRY_CAST(NULL AS VARCHAR);

-- returns NULL of type INT
SELECT TRY_CAST('non-number' AS INT);

-- returns 0 of type INT NOT NULL
SELECT COALESCE(TRY_CAST('non-number' AS INT), 0);
```
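The same pattern applies to columns, not only literals. The following sketch assumes the examples.marketplace.customers table described elsewhere in this documentation, where postcode is a STRING column. The fallback value -1 is an arbitrary sentinel chosen for illustration.

```sql
-- Sketch: convert a STRING column to INT without failing the statement.
-- -1 is an arbitrary sentinel for values that can't be parsed.
SELECT
  customer_id,
  postcode,
  COALESCE(TRY_CAST(postcode AS INT), -1) AS postcode_int
FROM examples.marketplace.customers;
```

Using TRY_CAST with COALESCE keeps a long-running statement alive when a row holds a non-numeric value, at the cost of having to filter or interpret the sentinel downstream.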

```sql
TO_TIMESTAMP
```

```sql
TO_TIMESTAMP_LTZ
```
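Because numeric types can't be cast directly to TIMESTAMP or TIMESTAMP_LTZ, the conversion functions are used instead. A minimal sketch follows; the input string and the epoch-millisecond value are arbitrary.

```sql
-- Sketch: build timestamps with conversion functions instead of CAST.
-- The input values are arbitrary and chosen for illustration.
SELECT
  TO_TIMESTAMP('2023-04-06 10:59:32')  AS ts,      -- parses a 'yyyy-MM-dd HH:mm:ss' string
  TO_TIMESTAMP_LTZ(1680778772628, 3)   AS ts_ltz;  -- interprets a BIGINT as epoch milliseconds
```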

```sql
MONTH TO YEAR
```

```sql
DAY TO TIME
```

---

### Example Data Streams in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/example-data.html

Example Data Streams in Confluent Cloud for Apache Flink¶

Confluent Cloud for Apache Flink® provides an Examples catalog that has mock data streams you can use for experimenting with Flink SQL queries. The examples catalog is available in all environments. All example tables have $rowtime available as a system column. The SOURCE_WATERMARK() strategy for example tables is different from the SOURCE_WATERMARK() strategy for Kafka-based tables. For the example tables, SOURCE_WATERMARK() corresponds to the maximum timestamp seen to this point.

You can use example data in Flink workspaces, Flink shell, Terraform, and all other clients. Example data is read-only, so you can’t use INSERT INTO/ALTER/DROP/CREATE statements on these tables, the database, or the catalog. SHOW statements work for the database, catalog, and tables. SHOW CREATE TABLE works for the example tables.

Publish to a Kafka topic¶

You can publish any of the example streams to a Kafka topic by creating a Flink table and populating it with an INSERT INTO statement that selects from the example stream. Confluent Cloud for Apache Flink creates a Kafka topic automatically for the table.

Run the following statements to create and populate a customers_source table with the examples.marketplace.customers stream. CREATE TABLE customers_source ( customer_id INT, name STRING, address STRING, postcode STRING, city STRING, email STRING, PRIMARY KEY (customer_id) NOT ENFORCED ); INSERT INTO customers_source( customer_id, name, address, postcode, city, email ) SELECT * FROM examples.marketplace.customers;

Run the following statement to inspect the customers_source table: SELECT * FROM customers_source;

Your output should resemble: customer_id name address postcode city email 3172 Roseanna Bode 6744 Kacy Bypass 22635 Margarettborough rico.zboncak@yahoo.com 3055 Josiah Morissette PhD 61799 Friesen Islands 14194 North Abbybury thomas.dach@gmail.com 3177 Buddy Hill 6836 Graham Street 72767 South Earnest enoch.turcotte@hotmail.com ...
Navigate to the Environments page, and in the navigation menu, click Data portal. In the Data portal page, click the dropdown menu and select the environment for your workspace. In the Recently created section, find your customers_source topic and click it to open the details pane. Click View all messages to open the Message viewer on the customers_source topic. Observe the example data from the examples.marketplace.customers flowing into the Kafka topic. Important The INSERT INTO statement runs continuously until you stop it manually. Free resources in your compute pool by deleting the long-running statement when you’re done. Marketplace database¶ The marketplace database provides streams that simulate commerce-related data. The marketplace database has these tables: clicks: simulates a stream of user clicks on a web page. customers: simulates a stream of customers who order products. orders: simulates a stream of orders. products: simulates a stream of products that a customer has ordered. clicks table¶ To access the clicks example stream, use the fully qualified string, examples.marketplace.clicks in your queries. The clicks table has the following schema: CREATE TABLE clicks ( click_id STRING, -- UUID user_id INT, -- range between 3000 and 5000 url STRING, -- regex https://www[.]acme[.]com/product/[a-z]{5} user_agent STRING, -- set by the datafaker Internet class view_time INT -- range between 10 and 120 ); The user_agent field is assigned by the datafaker Internet class. 
Run the following statement to inspect the clicks data stream: SELECT * FROM examples.marketplace.clicks; Your output should resemble: click_id user_id url user_agent view_time 23add2ce-da47-47c1-925a-f7c1def06f0c 3278 https://www.acme.com/product/mqwpg Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; AS; rv:11.0) like … 11 b81dc020-5ad2-493f-8175-d3e50e40f411 4919 https://www.acme.com/product/vycnj Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)… 58 b62ae975-0f5d-4e87-9cbe-45b7661ad327 3461 https://www.acme.com/product/pghkm Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML… 105 ... customers table¶ To access the customers example stream, use the fully qualified string, examples.marketplace.customers in your queries. The customers table has the following schema: CREATE TABLE customers ( customer_id INT, -- range between 3000 and 3250 name STRING, -- set by the datafaker Name class address STRING, -- set by the datafaker Address class postcode STRING, -- set by the datafaker Address class city STRING, -- set by the datafaker Address class email STRING, -- set by the datafaker Internet class PRIMARY KEY (customer_id) NOT ENFORCED ); The name field is assigned by the datafaker Name class The address fields are assigned by the datafaker Address class. The email field is assigned by the datafaker Internet class. Run the following statement to inspect the customers data stream: SELECT * FROM examples.marketplace.customers; Your output should resemble: customer_id name address postcode city email 3023 Ellsworth Price 0644 Mara Drive 29407 Emilyhaven sheldon.sipes@gmail.com 3003 Jayme Buckridge 320 Schumm Green 38752 Schowalterchester johnsie.hane@yahoo.com 3010 Les Beier 7032 Gerda Road 66841 Deckowside minnie.becker@hotmail.com ... orders table¶ To access the orders example stream, use the fully qualified string, examples.marketplace.orders in your queries. 
The customer_id and product_id are suitable for joins with the customers and products streams. CREATE TABLE orders ( order_id STRING, -- UUID customer_id INT, -- range between 3000 and 3250 product_id INT, -- range between 1000 and 1500 price DOUBLE -- range between 0.00 and 100.00 ); Run the following statement to inspect the orders data stream: SELECT * FROM examples.marketplace.orders; Your output should resemble: order_id customer_id product_id price 36d77b21-e68f-4123-b87a-cc19ac1f36ac 3137 1305 65.71 7fd3cd2a-392b-4f8f-b953-0bfa1d331354 3063 1327 17.75 1a223c61-38a5-4b8c-8465-2a6b359bf05e 3064 1166 14.95 ... Run the following statement to join the orders data stream with the customers and products streams. The query shows the name of the customer, and the product name, and the price of the order. SELECT examples.marketplace.customers.name AS customer_name, examples.marketplace.products.name AS product_name, examples.marketplace.orders.price FROM examples.marketplace.products JOIN examples.marketplace.orders ON examples.marketplace.products.product_id = examples.marketplace.orders.product_id JOIN examples.marketplace.customers ON examples.marketplace.customers.customer_id = examples.marketplace.orders.customer_id; Your output should resemble: customer_name product_name price Mr. Lexie Collins Fantastic Rubber Car 32.76 Lyle Spencer Synergistic Leather Clock 21.28 Mrs. Candida Howe Lightweight Silk Hat 35.38 Colette Ebert Sleek Steel Keyboard 92.22 products table¶ To access the products example stream, use the fully qualified string, examples.marketplace.products in your queries. 
CREATE TABLE products ( product_id INT, -- range between 1000 and 1500 name STRING, -- set by the datafaker Commerce class brand STRING, -- set by the datafaker Commerce class vendor STRING, -- set by the datafaker Commerce class department STRING, -- set by the datafaker Commerce class PRIMARY KEY (product_id) NOT ENFORCED ); The product fields are assigned by the datafaker Commerce class. Run the following statement to inspect the products data stream: SELECT * FROM examples.marketplace.products; Your output should resemble: product_id name brand vendor department 1440 Enormous Aluminum Keyboard LG Dollar General Garden & Movies 1404 Practical Plastic Computer Adidas Target Outdoors 1132 Gorgeous Paper Watch Samsung Amazon Home, Kids & Movies ... Related content¶ DDL Statements Flink SQL Queries Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
SOURCE_WATERMARK()
```

```sql
customers_source
```

```sql
examples.marketplace.customers
```

```sql
CREATE TABLE customers_source (
  customer_id INT,
  name STRING,
  address STRING,
  postcode STRING,
  city STRING,
  email STRING,
  PRIMARY KEY (customer_id) NOT ENFORCED
);

INSERT INTO customers_source(
  customer_id,
  name,
  address,
  postcode,
  city,
  email
)
SELECT * FROM examples.marketplace.customers;
```

```sql
customers_source
```

```sql
SELECT * FROM customers_source;
```

```sql
customer_id name                  address                postcode city               email
3172        Roseanna Bode         6744 Kacy Bypass       22635    Margarettborough   rico.zboncak@yahoo.com
3055        Josiah Morissette PhD 61799 Friesen Islands  14194    North Abbybury     thomas.dach@gmail.com
3177        Buddy Hill            6836 Graham Street     72767    South Earnest      enoch.turcotte@hotmail.com
...
```
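The same publish pattern can be applied to the other example streams. The sketch below copies the orders stream; the table name orders_source is hypothetical, and the column list mirrors the orders schema shown later in this page.

```sql
-- Sketch: publish the example orders stream to a new table and its Kafka topic.
-- orders_source is a hypothetical table name.
CREATE TABLE orders_source (
  order_id STRING,
  customer_id INT,
  product_id INT,
  price DOUBLE
);

-- Columns are listed explicitly so the mapping does not depend on column order.
INSERT INTO orders_source (order_id, customer_id, product_id, price)
SELECT order_id, customer_id, product_id, price
FROM examples.marketplace.orders;
```

As with customers_source, the INSERT INTO statement keeps running until it is stopped, so delete it when it is no longer needed.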

```sql
customers_source
```

```sql
examples.marketplace.customers
```

```sql
marketplace
```

```sql
marketplace
```

```sql
examples.marketplace.clicks
```

```sql
CREATE TABLE clicks (
  click_id STRING, -- UUID
  user_id INT, -- range between 3000 and 5000
  url STRING, -- regex https://www[.]acme[.]com/product/[a-z]{5}
  user_agent STRING, -- set by the datafaker Internet class
  view_time INT -- range between 10 and 120
 );
```

```sql
SELECT * FROM examples.marketplace.clicks;
```

```sql
click_id                             user_id url                                user_agent                                                           view_time
23add2ce-da47-47c1-925a-f7c1def06f0c 3278    https://www.acme.com/product/mqwpg Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; AS; rv:11.0) like … 11
b81dc020-5ad2-493f-8175-d3e50e40f411 4919    https://www.acme.com/product/vycnj Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)… 58
b62ae975-0f5d-4e87-9cbe-45b7661ad327 3461    https://www.acme.com/product/pghkm Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML… 105
...
```
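Because every example table exposes the $rowtime system column, the clicks stream can be aggregated over time windows. The following is a sketch under the assumption that $rowtime can serve as the window time attribute; the one-minute window size and the alias names are arbitrary.

```sql
-- Sketch: count clicks and average the view time per one-minute tumbling window.
-- The window size and alias names are illustrative assumptions.
SELECT
  window_start,
  window_end,
  COUNT(*)       AS click_count,
  AVG(view_time) AS avg_view_time
FROM TABLE(
  TUMBLE(TABLE examples.marketplace.clicks, DESCRIPTOR(`$rowtime`), INTERVAL '1' MINUTE))
GROUP BY window_start, window_end;
```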

```sql
examples.marketplace.customers
```

```sql
CREATE TABLE customers (
  customer_id INT, -- range between 3000 and 3250
  name STRING, -- set by the datafaker Name class
  address STRING, -- set by the datafaker Address class
  postcode STRING, -- set by the datafaker Address class
  city STRING, -- set by the datafaker Address class
  email STRING, -- set by the datafaker Internet class
  PRIMARY KEY (customer_id) NOT ENFORCED
 );
```

```sql
SELECT * FROM examples.marketplace.customers;
```

```sql
customer_id name                 address                postcode city               email
3023        Ellsworth Price      0644 Mara Drive        29407    Emilyhaven         sheldon.sipes@gmail.com
3003        Jayme Buckridge      320 Schumm Green       38752    Schowalterchester  johnsie.hane@yahoo.com
3010        Les Beier            7032 Gerda Road        66841    Deckowside         minnie.becker@hotmail.com
...
```

```sql
examples.marketplace.orders
```

```sql
customer_id
```

```sql
CREATE TABLE orders (
  order_id STRING, -- UUID
  customer_id INT, -- range between 3000 and 3250
  product_id INT, -- range between 1000 and 1500
  price DOUBLE -- range between 0.00 and 100.00
);
```

```sql
SELECT * FROM examples.marketplace.orders;
```

```sql
order_id                             customer_id product_id price
36d77b21-e68f-4123-b87a-cc19ac1f36ac 3137        1305       65.71
7fd3cd2a-392b-4f8f-b953-0bfa1d331354 3063        1327       17.75
1a223c61-38a5-4b8c-8465-2a6b359bf05e 3064        1166       14.95
...
```

```sql
SELECT
  examples.marketplace.customers.name AS customer_name,
  examples.marketplace.products.name AS product_name,
  examples.marketplace.orders.price
FROM examples.marketplace.products
JOIN examples.marketplace.orders ON examples.marketplace.products.product_id = examples.marketplace.orders.product_id
JOIN examples.marketplace.customers ON examples.marketplace.customers.customer_id = examples.marketplace.orders.customer_id;
```

```sql
customer_name       product_name              price
Mr. Lexie Collins   Fantastic Rubber Car      32.76
Lyle Spencer        Synergistic Leather Clock 21.28
Mrs. Candida Howe   Lightweight Silk Hat      35.38
Colette Ebert       Sleek Steel Keyboard      92.22
```
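The join above can also be written with table aliases, which shortens the column references without changing the result. This is an equivalent sketch; the aliases o, p, and c are arbitrary.

```sql
-- Sketch: the same three-way join, written with table aliases.
-- o, p, and c are arbitrary alias names.
SELECT
  c.name AS customer_name,
  p.name AS product_name,
  o.price
FROM examples.marketplace.orders AS o
JOIN examples.marketplace.products  AS p ON p.product_id  = o.product_id
JOIN examples.marketplace.customers AS c ON c.customer_id = o.customer_id;
```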

```sql
examples.marketplace.products
```

```sql
CREATE TABLE products (
  product_id INT, -- range between 1000 and 1500
  name STRING, -- set by the datafaker Commerce class
  brand STRING, -- set by the datafaker Commerce class
  vendor STRING, -- set by the datafaker Commerce class
  department STRING, -- set by the datafaker Commerce class
  PRIMARY KEY (product_id) NOT ENFORCED
 );
```

```sql
SELECT * FROM examples.marketplace.products;
```

```sql
product_id name                        brand   vendor         department
1440       Enormous Aluminum Keyboard  LG      Dollar General Garden & Movies
1404       Practical Plastic Computer  Adidas  Target         Outdoors
1132       Gorgeous Paper Watch        Samsung Amazon         Home, Kids & Movies
...
```

---

### Confluent CLI commands with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/flink-sql-cli.html

Confluent CLI commands with Confluent Cloud for Apache Flink¶ Manage Flink SQL statements and compute pools in Confluent Cloud for Apache Flink® by using the confluent flink commands in the Confluent CLI. To see the available commands, use the --help option. confluent flink statement --help confluent flink compute-pool --help confluent flink region --help Use the Confluent CLI to manage these features: Statements Compute pools Regions For the complete CLI reference, see confluent flink statement. In addition to the CLI, you can manage Flink statements and compute pools by using these Confluent tools: Flink SQL REST API Cloud Console SQL shell Confluent Terraform Provider Manage statements¶ Using the Confluent CLI, you can perform these actions: Submit a statement List statements Describe a statement List exceptions from a statement Delete a statement Update a statement Managing Flink SQL statements may require the following inputs, depending on the command: export STATEMENT_NAME="<statement-name>" # example: "user-filter" export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export CLUSTER_ID="<kafka-cluster-id>" # example: "lkc-a1b2c3" export PRINCIPAL_ID="<principal-id>" # example: "sa-23kgz4" for a service account, or "u-aq1dr2" for a user account export SQL_CODE="<sql-statement-text>" # example: "SELECT * FROM USERS;" For the complete CLI reference, see confluent flink statement. Submit a statement¶ The confluent flink statement create command submits a statement in your compute pool. Run the following command to submit a Flink SQL statement in the current compute pool with your user account. 
confluent flink statement create --sql "${SQL_CODE}" Your output should resemble: +---------------+------------------------------------------------------------+ | Creation Date | 2024-02-28 21:08:08.9749 +0000 | | | UTC | | Name | cli-2024-02-28-130806-78dd77b5-16a9-40ab-9786-db95b9895eaa | | Statement | Select 1; | | Compute Pool | lfcp-8m09g0 | | Status | PENDING | +---------------+------------------------------------------------------------+ For long-running statements, Confluent recommends submitting statements with a service account instead of your user account. The following command submits a Flink SQL statement for the specified principal in the specified compute pool and Flink database (Kafka cluster). confluent flink statement create ${STATEMENT_NAME} \ --service-account ${PRINCIPAL_ID} \ --sql "${SQL_CODE}" \ --compute-pool ${COMPUTE_POOL_ID} \ --database ${CLUSTER_ID} List statements¶ Run the confluent flink statement list command to list all of the non-deleted statements in your environment. confluent flink statement list Your output should resemble: Creation Date | Name | Statement | Compute Pool | Status | Status Detail --------------------------------+----------------------+--------------------------------+--------------+-----------+--------------------------------- 2023-07-08 21:04:06 +0000 UTC | 4b1d3494-f0f7-460d-9 | INSERT INTO copytopic | lfcp-r2j1x9 | RUNNING | | | SELECT symbol,price from | | | | | topic_datagen; | | | 2023-07-08 21:07:04 +0000 UTC | 6c43b973-b3c6-4be8-9 | INSERT INTO copytopic | lfcp-r2j1x9 | RUNNING | | | SELECT symbol,price from | | | | | topic_datagen; | | | ... To list only the statements that you’ve created, get the context for your current Confluent Cloud login session and provide the context with the context option. 
confluent context list Your output should resemble: Current | Name | Platform | Credential ----------+--------------------------------------------------------+-----------------+------------------------------------ * | login-<your-email-address>-https://confluent.cloud | confluent.cloud | username-<your-email-address> For convenience, save the context in an environment variable: export MY_CONTEXT="login-<your-email-address>-https://confluent.cloud" Run the confluent flink statement list command with the --context option set to your context. confluent flink statement list --context ${MY_CONTEXT} Your output should resemble: Creation Date | Name | Statement | Compute Pool | Status | Status Detail ---------------------------------+------------------------------------------------------------+-----------+--------------+-----------+---------------- 2024-02-28 21:08:08.9749 +0000 | cli-2024-02-28-130806-78dd77b5-16a9-40ab-9786-db95b9895eaa | Select 1; | lfcp-8m09g0 | COMPLETED | UTC | | | | | ... To list only the statements in your compute pool, provide the compute pool ID with the --compute-pool option. confluent flink statement list --compute-pool ${COMPUTE_POOL_ID} Describe a statement¶ Run the confluent flink statement describe command to view the details of an existing statement. confluent flink statement describe ${STATEMENT_NAME} Your output should resemble: Creation Date | Name | Statement | Compute Pool | Status | Status Detail --------------------------------+--------------------+------------+--------------+-----------+---------------- 2023-07-19 19:26:52 +0000 UTC | fdc6cbf5-038a-408c | show jobs; | lfcp-a1b2c3 | COMPLETED | List exceptions from a statement¶ Run the confluent flink statement exception list command to get exceptions that have been thrown by a statement. confluent flink statement exception list ${STATEMENT_NAME} Delete a statement¶ Run the confluent flink statement delete command to delete an existing statement permanently. All of its resources, like checkpoints, are also deleted. 
Deleting a statement stops charges for its use. confluent flink statement delete ${STATEMENT_NAME} Your output should resemble: Deleted Flink SQL statement "ac23db14-b5dc-49fb-b". Update a statement¶ Run the confluent flink statement update command to stop an existing statement or resume a stopped statement. # Request to stop a statement. confluent flink statement update ${STATEMENT_NAME} --stopped=true # Request to resume a stopped statement. confluent flink statement update ${STATEMENT_NAME} --stopped=false Manage compute pools¶ Using the Confluent CLI, you can perform these actions: Create a compute pool Describe a compute pool List compute pools Update a compute pool Set the current compute pool Unset the current compute pool Delete a compute pool You must be authorized to create, update, delete (FlinkAdmin) or use (FlinkDeveloper) a compute pool. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink. Managing compute pools may require the following inputs, depending on the command: export COMPUTE_POOL_NAME="<compute-pool-name>" # human-readable name, for example, "my-compute-pool" export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export MAX_CFU="<max-cfu>" # example: 5 For the complete CLI reference, see confluent flink compute-pool. Create a compute pool¶ Run the confluent flink compute-pool create command to create a compute pool. Creating a compute pool requires the following inputs: export COMPUTE_POOL_NAME="<compute-pool-name>" # human-readable name, for example, "my-compute-pool" export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export ENV_ID="<environment-id>" # example: "env-z3y2x1" export MAX_CFU="<max-cfu>" # example: 5 Run the following command to create a compute pool in the specified cloud provider and environment. 
confluent flink compute-pool create ${COMPUTE_POOL_NAME} \ --cloud ${CLOUD_PROVIDER} \ --region ${CLOUD_REGION} \ --max-cfu ${MAX_CFU} \ --environment ${ENV_ID} Your output should resemble: +-------------+-----------------+ | Current | false | | ID | lfcp-xxd6og | | Name | my-compute-pool | | Environment | env-z3y2x1 | | Current CFU | 0 | | Max CFU | 5 | | Cloud | AWS | | Region | us-east-1 | | Status | PROVISIONING | +-------------+-----------------+ Describe a compute pool¶ Run the confluent flink compute-pool describe command to get details about a compute pool. Describing a compute pool requires the following inputs: export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to get details about a compute pool in the specified environment. confluent flink compute-pool describe ${COMPUTE_POOL_ID} \ --environment ${ENV_ID} Your output should resemble: +-------------+-----------------+ | Current | false | | ID | lfcp-xxd6og | | Name | my-compute-pool | | Environment | env-z3y2x1 | | Current CFU | 0 | | Max CFU | 5 | | Cloud | AWS | | Region | us-east-1 | | Status | PROVISIONED | +-------------+-----------------+ List compute pools¶ Run the confluent flink compute-pool list command to list the compute pools in the specified environment. Listing compute pools may require the following inputs, depending on the command: export CLOUD_REGION="<cloud-region>" # example: "us-east-1" export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following command to list the compute pools in the specified environment. 
confluent flink compute-pool list --environment ${ENV_ID} Your output should resemble: Current | ID | Name | Environment | Current CFU | Max CFU | Cloud | Region | Status ----------+-------------+---------------------------+-------------+-------------+---------+-------+-----------+-------------- * | lfcp-xxd6og | my-compute-pool | env-z3y2x1 | 0 | 5 | AWS | us-east-1 | PROVISIONED | lfcp-8m03rm | test-blue-compute-pool | env-z3q9rd | 0 | 10 | AWS | us-east-1 | PROVISIONED ... Update a compute pool¶ Run the confluent flink compute-pool update command to update a compute pool. Updating a compute pool may require the following inputs, depending on the command: export COMPUTE_POOL_NAME=<compute-pool-name> # human-readable name, for example, "my-compute-pool" export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export ENV_ID="<environment-id>" # example: "env-z3y2x1" export MAX_CFU="<max-cfu>" # example: 5 Run the following command to update a compute pool in the specified environment. confluent flink compute-pool update ${COMPUTE_POOL_ID} \ --environment ${ENV_ID} \ --name ${COMPUTE_POOL_NAME} \ --max-cfu ${MAX_CFU} Your output should resemble: +-------------+----------------------+ | Current | false | | ID | lfcp-xxd6og | | Name | renamed-compute-pool | | Environment | env-z3y2x1 | | Current CFU | 0 | | Max CFU | 10 | | Cloud | AWS | | Region | us-east-1 | | Status | PROVISIONED | +-------------+----------------------+ Set the current compute pool¶ Run the confluent flink compute-pool use command to use a compute pool in subsequent commands. Setting a compute pool requires the following inputs: export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm" export ENV_ID="<environment-id>" # example: "env-z3y2x1" Run the following commands to set the current compute pool in the specified environment. First, you must run the confluent environment use command to set the current environment. 
confluent environment use ${ENV_ID} && \ confluent flink compute-pool use ${COMPUTE_POOL_ID} Your output should resemble: Using environment "env-z3y2x1". Using Flink compute pool "lfcp-xxd6og". Unset the current compute pool¶ Run the confluent flink compute-pool unset command to unset the current compute pool. Run the following command to unset the current compute pool. confluent flink compute-pool unset Your output should resemble: Unset Flink compute pool "lfcp-xxd6og". Delete a compute pool¶ Run the confluent flink compute-pool delete command to delete a compute pool. Run the following command to delete a compute pool in the specified environment. The optional --force flag skips the confirmation prompt. confluent flink compute-pool delete ${COMPUTE_POOL_ID} \ --environment ${ENV_ID} --force Your output should resemble: Deleted Flink compute pool "lfcp-xxd6og". Manage regions¶ Using the Confluent CLI, you can perform these actions: List available regions Set the current region Managing Flink SQL regions may require the following inputs, depending on the command: export CLOUD_PROVIDER="<cloud-provider>" # example: "aws" export CLOUD_REGION="<cloud-region>" # example: "us-east-1" For the complete CLI reference, see confluent flink region. List available regions¶ Run the confluent flink region list to see all available regions where you can run Flink statements. confluent flink region list Your output should resemble: Current | Name | Cloud | Region ----------+-------------------------------+-------+----------------------- | Belgium (europe-west1) | gcp | europe-west1 | Frankfurt (eu-central-1) | aws | eu-central-1 | Frankfurt (europe-west3) | gcp | europe-west3 | Iowa (us-central1) | gcp | us-central1 | Ireland (eu-west-1) | aws | eu-west-1 | Las Vegas (us-west4) | gcp | us-west4 | London (eu-west-2) | aws | eu-west-2 * | N. Virginia (us-east-1) | aws | us-east-1 | N. 
Virginia (us-east4) | gcp | us-east4 | Netherlands (westeurope) | azure | westeurope | Ohio (us-east-2) | aws | us-east-2 | Oregon (us-west-2) | aws | us-west-2 | S. Carolina (us-east1) | gcp | us-east1 | Singapore (ap-southeast-1) | aws | ap-southeast-1 | Singapore (asia-southeast1) | gcp | asia-southeast1 | Singapore (southeastasia) | azure | southeastasia | Sydney (ap-southeast-2) | aws | ap-southeast-2 | Sydney (australia-southeast1) | gcp | australia-southeast1 | Virginia (eastus) | azure | eastus | Virginia (eastus2) | azure | eastus2 | Washington (westus2) | azure | westus2 Run the following command to filter the list of available regions by cloud provider. confluent flink region list --cloud ${CLOUD_PROVIDER} Your output should resemble: Current | Name | Cloud | Region ----------+----------------------------+-------+----------------- | Frankfurt (eu-central-1) | aws | eu-central-1 | Ireland (eu-west-1) | aws | eu-west-1 | London (eu-west-2) | aws | eu-west-2 * | N. Virginia (us-east-1) | aws | us-east-1 | Ohio (us-east-2) | aws | us-east-2 | Oregon (us-west-2) | aws | us-west-2 | Singapore (ap-southeast-1) | aws | ap-southeast-1 | Sydney (ap-southeast-2) | aws | ap-southeast-2 Set the current region¶ Run the confluent flink region use command to set the current region where subsequent Flink statements run. You must have a compute pool in the region to run statements. confluent flink region use --cloud ${CLOUD_PROVIDER} --region ${CLOUD_REGION} For CLOUD_PROVIDER=aws and CLOUD_REGION=us-east-2, your output should resemble: Using Flink region "Ohio (us-east-2)".

#### Code Examples

```sql
confluent flink statement --help
confluent flink compute-pool --help
confluent flink region --help
```

```sql
export STATEMENT_NAME="<statement-name>" # example: "user-filter"
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export CLUSTER_ID="<kafka-cluster-id>" # example: "lkc-a1b2c3"
export PRINCIPAL_ID="<principal-id>" # example: "sa-23kgz4" for a service account, or "u-aq1dr2" for a user account
export SQL_CODE="<sql-statement-text>" # example: "SELECT * FROM USERS;"
```
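
Because most statement commands depend on these variables, a script may want to check them before calling the CLI. The following is a minimal sketch; `require_vars` is a hypothetical helper name, not part of the Confluent CLI.

```shell
# require_vars NAME... : return non-zero if any named variable is unset or empty.
# Hypothetical helper for illustration; it does not call the Confluent CLI.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect variable lookup that also works in plain POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A script could run `require_vars STATEMENT_NAME COMPUTE_POOL_ID CLUSTER_ID PRINCIPAL_ID SQL_CODE || exit 1` before submitting a statement.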

```sql
confluent flink statement create --sql "${SQL_CODE}"
```

```sql
+---------------+------------------------------------------------------------+
| Creation Date | 2024-02-28 21:08:08.9749 +0000                             |
|               | UTC                                                        |
| Name          | cli-2024-02-28-130806-78dd77b5-16a9-40ab-9786-db95b9895eaa |
| Statement     | Select 1;                                                  |
| Compute Pool  | lfcp-8m09g0                                                |
| Status        | PENDING                                                    |
+---------------+------------------------------------------------------------+
```

```sql
confluent flink statement create ${STATEMENT_NAME} \
  --service-account ${PRINCIPAL_ID} \
  --sql "${SQL_CODE}" \
  --compute-pool ${COMPUTE_POOL_ID} \
  --database ${CLUSTER_ID}
```
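
Before submitting a long-running statement with a service account, it can be useful to preview the command that a script is about to run. The sketch below assumes a hypothetical `run_or_print` wrapper and a `DRY_RUN` variable; neither is provided by the Confluent CLI.

```shell
# run_or_print CMD ARGS... : execute the command, or only print it when DRY_RUN=1.
# Hypothetical wrapper; DRY_RUN is an assumed convention of this sketch.
run_or_print() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Arguments are joined with spaces for display only; quoting is not preserved.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

For example, with `DRY_RUN=1` set, `run_or_print confluent flink statement create "${STATEMENT_NAME}" --service-account "${PRINCIPAL_ID}" --sql "${SQL_CODE}" --compute-pool "${COMPUTE_POOL_ID}" --database "${CLUSTER_ID}"` prints the command instead of submitting the statement.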

```sql
confluent flink statement list
```

```sql
Creation Date         |         Name         |           Statement            | Compute Pool |  Status   |         Status Detail
--------------------------------+----------------------+--------------------------------+--------------+-----------+---------------------------------
  2023-07-08 21:04:06 +0000 UTC | 4b1d3494-f0f7-460d-9 | INSERT INTO copytopic          | lfcp-r2j1x9  | RUNNING   |
                                |                      | SELECT symbol,price from       |              |           |
                                |                      | topic_datagen;                 |              |           |
  2023-07-08 21:07:04 +0000 UTC | 6c43b973-b3c6-4be8-9 | INSERT INTO copytopic          | lfcp-r2j1x9  | RUNNING   |
                                |                      | SELECT symbol,price from       |              |           |
                                |                      | topic_datagen;                 |              |           |
...
```

```sql
confluent context list
```

```sql
Current |                          Name                          |    Platform     |            Credential
----------+--------------------------------------------------------+-----------------+------------------------------------
  *       |   login-<your-email-address>-https://confluent.cloud   | confluent.cloud | username-<your-email-address>
```
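
The context name can also be read from the table instead of being copied by hand. This sketch assumes the human-readable layout shown above, where the current row carries `*` in the first column; `current_context` is a hypothetical helper name.

```shell
# current_context : read `confluent context list` output on stdin and print
# the name of the row marked with "*". Assumes the table layout shown above.
current_context() {
  awk -F'|' '
    # The current context has "*" in the first column.
    $1 ~ /\*/ {
      name = $2
      gsub(/^[ \t]+|[ \t]+$/, "", name)  # trim column padding
      print name
      exit
    }'
}
```

A possible use is `MY_CONTEXT=$(confluent context list | current_context)`.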

```sql
export MY_CONTEXT="login-<your-email-address>-https://confluent.cloud"
```

```sql
confluent flink statement list --context ${MY_CONTEXT}
```

```sql
Creation Date          |                            Name                            | Statement | Compute Pool |  Status   | Status Detail
---------------------------------+------------------------------------------------------------+-----------+--------------+-----------+----------------
  2024-02-28 21:08:08.9749 +0000 | cli-2024-02-28-130806-78dd77b5-16a9-40ab-9786-db95b9895eaa | Select 1; | lfcp-8m09g0  | COMPLETED |
  UTC                            |                                                            |           |              |           |
...
```

```sql
confluent flink statement list --compute-pool ${COMPUTE_POOL_ID}
```

```sql
confluent flink statement describe ${STATEMENT_NAME}
```

```sql
Creation Date         |        Name        | Statement  | Compute Pool |  Status   | Status Detail
--------------------------------+--------------------+------------+--------------+-----------+----------------
  2023-07-19 19:26:52 +0000 UTC | fdc6cbf5-038a-408c | show jobs; | lfcp-a1b2c3  | COMPLETED |
```

```sql
confluent flink statement exception list ${STATEMENT_NAME}
```

```sql
confluent flink statement delete ${STATEMENT_NAME}
```

```sql
Deleted Flink SQL statement "ac23db14-b5dc-49fb-b".
```

```sql
# Request to stop a statement.
confluent flink statement update ${STATEMENT_NAME} --stopped=true

# Request to resume a stopped statement.
confluent flink statement update ${STATEMENT_NAME} --stopped=false
```
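
Scripts that stop and resume statements may read more clearly with named actions than with a boolean flag. The sketch below maps an action word to the `--stopped` value used above; `stopped_flag` is a hypothetical helper name.

```shell
# stopped_flag stop|resume : print the matching --stopped flag for
# `confluent flink statement update`. Hypothetical helper for illustration.
stopped_flag() {
  case "$1" in
    stop)   echo "--stopped=true" ;;
    resume) echo "--stopped=false" ;;
    *)      echo "usage: stopped_flag stop|resume" >&2; return 2 ;;
  esac
}
```

For example: `confluent flink statement update "${STATEMENT_NAME}" "$(stopped_flag stop)"`.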

```sql
export COMPUTE_POOL_NAME="<compute-pool-name>" # human-readable name, for example, "my-compute-pool"
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export MAX_CFU="<max-cfu>" # example: 5
```

```sql
export COMPUTE_POOL_NAME="<compute-pool-name>" # human-readable name, for example, "my-compute-pool"
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export MAX_CFU="<max-cfu>" # example: 5
```

```sql
confluent flink compute-pool create ${COMPUTE_POOL_NAME} \
  --cloud ${CLOUD_PROVIDER} \
  --region ${CLOUD_REGION} \
  --max-cfu ${MAX_CFU} \
  --environment ${ENV_ID}
```

```sql
+-------------+-----------------+
| Current     | false           |
| ID          | lfcp-xxd6og     |
| Name        | my-compute-pool |
| Environment | env-z3y2x1      |
| Current CFU | 0               |
| Max CFU     | 5               |
| Cloud       | AWS             |
| Region      | us-east-1       |
| Status      | PROVISIONING    |
+-------------+-----------------+
```

```sql
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
confluent flink compute-pool describe ${COMPUTE_POOL_ID} \
  --environment ${ENV_ID}
```

```sql
+-------------+-----------------+
| Current     | false           |
| ID          | lfcp-xxd6og     |
| Name        | my-compute-pool |
| Environment | env-z3y2x1      |
| Current CFU | 0               |
| Max CFU     | 5               |
| Cloud       | AWS             |
| Region      | us-east-1       |
| Status      | PROVISIONED     |
+-------------+-----------------+
```
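
The create output reports `PROVISIONING`, while a later describe reports `PROVISIONED`, so automation may need to wait between the two. The following sketch parses the `Status` row from the human-readable table shown above and polls until the pool is ready. The function names, the retry count, and the 10-second interval are assumptions of this sketch, not CLI features.

```shell
# pool_status : read `confluent flink compute-pool describe` output on stdin
# and print the value of the Status row. Assumes the table layout shown above.
pool_status() {
  awk -F'|' '
    {
      key = $2; val = $3
      gsub(/^[ \t]+|[ \t]+$/, "", key)  # trim column padding
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      if (key == "Status") { print val; exit }
    }'
}

# wait_for_pool POOL_ID ENV_ID [TRIES] : poll until the pool is PROVISIONED.
# Returns 0 when ready, 1 if the retry budget (default 30) runs out.
wait_for_pool() {
  pool_id=$1; env_id=$2; tries=${3:-30}
  while [ "$tries" -gt 0 ]; do
    status=$(confluent flink compute-pool describe "$pool_id" \
      --environment "$env_id" | pool_status)
    [ "$status" = "PROVISIONED" ] && return 0
    tries=$((tries - 1))
    sleep 10  # assumed polling interval
  done
  return 1
}
```

A script could then run `wait_for_pool "${COMPUTE_POOL_ID}" "${ENV_ID}"` before submitting statements to a newly created pool.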

```sql
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
confluent flink compute-pool list --environment ${ENV_ID}
```

```sql
Current |     ID      |           Name            | Environment | Current CFU | Max CFU | Cloud |  Region   |   Status
----------+-------------+---------------------------+-------------+-------------+---------+-------+-----------+--------------
  *       | lfcp-xxd6og | my-compute-pool           | env-z3y2x1  |           0 |       5 | AWS   | us-east-1 | PROVISIONED
          | lfcp-8m03rm | test-blue-compute-pool    | env-z3q9rd  |           0 |      10 | AWS   | us-east-1 | PROVISIONED
...
```

```sql
export COMPUTE_POOL_NAME="<compute-pool-name>" # human-readable name, for example, "my-compute-pool"
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
export MAX_CFU="<max-cfu>" # example: 5
```

```sql
confluent flink compute-pool update ${COMPUTE_POOL_ID} \
  --environment ${ENV_ID} \
  --name ${COMPUTE_POOL_NAME} \
  --max-cfu ${MAX_CFU}
```

```sql
+-------------+----------------------+
| Current     | false                |
| ID          | lfcp-xxd6og          |
| Name        | renamed-compute-pool |
| Environment | env-z3y2x1           |
| Current CFU | 0                    |
| Max CFU     | 10                   |
| Cloud       | AWS                  |
| Region      | us-east-1            |
| Status      | PROVISIONED          |
+-------------+----------------------+
```

```sql
export COMPUTE_POOL_ID="<compute-pool-id>" # example: "lfcp-8m03rm"
export ENV_ID="<environment-id>" # example: "env-z3y2x1"
```

```sql
confluent environment use ${ENV_ID} && \
confluent flink compute-pool use ${COMPUTE_POOL_ID}
```

```sql
Using environment "env-z3y2x1".
Using Flink compute pool "lfcp-xxd6og".
```

```sql
confluent flink compute-pool unset
```

```sql
Unset Flink compute pool "lfcp-xxd6og".
```

```sql
confluent flink compute-pool delete ${COMPUTE_POOL_ID} \
  --environment ${ENV_ID} \
  --force
```

```sql
Deleted Flink compute pool "lfcp-xxd6og".
```

```sql
export CLOUD_PROVIDER="<cloud-provider>" # example: "aws"
export CLOUD_REGION="<cloud-region>" # example: "us-east-1"
```

```sql
confluent flink region list
```

```sql
Current |             Name              | Cloud |        Region
----------+-------------------------------+-------+-----------------------
          | Belgium (europe-west1)        | gcp   | europe-west1
          | Frankfurt (eu-central-1)      | aws   | eu-central-1
          | Frankfurt (europe-west3)      | gcp   | europe-west3
          | Iowa (us-central1)            | gcp   | us-central1
          | Ireland (eu-west-1)           | aws   | eu-west-1
          | Las Vegas (us-west4)          | gcp   | us-west4
          | London (eu-west-2)            | aws   | eu-west-2
  *       | N. Virginia (us-east-1)       | aws   | us-east-1
          | N. Virginia (us-east4)        | gcp   | us-east4
          | Netherlands (westeurope)      | azure | westeurope
          | Ohio (us-east-2)              | aws   | us-east-2
          | Oregon (us-west-2)            | aws   | us-west-2
          | S. Carolina (us-east1)        | gcp   | us-east1
          | Singapore (ap-southeast-1)    | aws   | ap-southeast-1
          | Singapore (asia-southeast1)   | gcp   | asia-southeast1
          | Singapore (southeastasia)     | azure | southeastasia
          | Sydney (ap-southeast-2)       | aws   | ap-southeast-2
          | Sydney (australia-southeast1) | gcp   | australia-southeast1
          | Virginia (eastus)             | azure | eastus
          | Virginia (eastus2)            | azure | eastus2
          | Washington (westus2)          | azure | westus2
```

```sql
confluent flink region list --cloud ${CLOUD_PROVIDER}
```

```sql
Current |            Name            | Cloud |     Region
----------+----------------------------+-------+-----------------
          | Frankfurt (eu-central-1)   | aws   | eu-central-1
          | Ireland (eu-west-1)        | aws   | eu-west-1
          | London (eu-west-2)         | aws   | eu-west-2
  *       | N. Virginia (us-east-1)    | aws   | us-east-1
          | Ohio (us-east-2)           | aws   | us-east-2
          | Oregon (us-west-2)         | aws   | us-west-2
          | Singapore (ap-southeast-1) | aws   | ap-southeast-1
          | Sydney (ap-southeast-2)    | aws   | ap-southeast-2
```
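
To check which region subsequent statements will use, a script can read the row marked `*` from the region list. This sketch assumes the table layout shown above; `current_region` is a hypothetical helper name.

```shell
# current_region : read `confluent flink region list` output on stdin and
# print "<cloud> <region>" for the row marked with "*".
current_region() {
  awk -F'|' '
    # The current region has "*" in the first column.
    $1 ~ /\*/ {
      cloud = $3; region = $4
      gsub(/^[ \t]+|[ \t]+$/, "", cloud)   # trim column padding
      gsub(/^[ \t]+|[ \t]+$/, "", region)
      print cloud, region
      exit
    }'
}
```

For example, `confluent flink region list | current_region` prints the cloud provider and region of the current selection, or nothing if no region is set.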

```sql
confluent flink region use --cloud ${CLOUD_PROVIDER} --region ${CLOUD_REGION}
```

```sql
Using Flink region "Ohio (us-east-2)".
```

---

### SQL Information Schema in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/flink-sql-information-schema.html

Information Schema in Confluent Cloud for Apache Flink¶ An information schema, or data dictionary, is a standard SQL schema with a collection of predefined views that enable accessing metadata about objects in Confluent Cloud for Apache Flink®. The Confluent INFORMATION_SCHEMA is based on the SQL-92 ANSI Information Schema, with the addition of views and functions that are specific to Confluent Cloud for Apache Flink. The ANSI standard uses “catalog” to refer to a database. In Confluent Cloud, “schema” refers to a database. Conceptually, the terms are equivalent. The views in the INFORMATION_SCHEMA provide information about database objects, such as tables, columns, and constraints. The views are organized into tables that you can query by using standard SQL statements. For example, you can use the INFORMATION_SCHEMA.COLUMNS table to get details about the columns in a table, like the column name, data type, and whether it allows null values. Similarly, you can use the INFORMATION_SCHEMA.TABLES table to get singular, static configuration details of the relation, if it’s a view, a table, or a system table. For example, you can query for details like watermark definition and the number of partitions in the topic. Every Flink catalog has a corresponding INFORMATION_SCHEMA, so you can always run a statement like SELECT (...) FROM <catalog-name>.INFORMATION_SCHEMA.TABLES WHERE (...). Global views are available in every INFORMATION_SCHEMA, which means that you can query for information across catalogs. For example, you can query the global INFORMATION_SCHEMA.CATALOGS view to list all catalogs. The information schema is a powerful tool for querying metadata about your Flink catalogs and databases, and you can use it for a variety of purposes, such as generating reports, documenting a schema, and troubleshooting performance issues. 
The following views are supported in the Confluent INFORMATION_SCHEMA: Catalogs and databases CATALOGS INFORMATION_SCHEMA_CATALOG_NAME SCHEMATA / DATABASES Functions PARAMETERS ROUTINES TABLES COLUMNS KEY_COLUMN_USAGE SYSTEM_TABLES TABLE_CONSTRAINTS TABLE_OPTIONS VIEWS Query syntax in INFORMATION_SCHEMA¶ Metadata queries on the INFORMATION_SCHEMA tables support the following syntax. Supported data types: INT STRING Supported operators: SELECT WHERE UNION ALL Supported expressions: CAST(NULL AS dt), CAST(x as dt) UNION ALL (see this example) AND, OR = , <>, IS NULL, IS NOT NULL AS STRING and INT literals The following limitations apply to INFORMATION_SCHEMA: You can use INFORMATION_SCHEMA views only in SELECT statements, not in INSERT INTO statements. You can’t use INFORMATION_SCHEMA in joins with real tables. Only the previously listed equality and basic expressions are supported. Catalogs and databases¶ CATALOGS¶ The global catalogs view. The rows returned are limited to the schemas that you have permission to interact with. This view is an extension to the SQL standard. Column Name Data Type Standard Description CATALOG_ID STRING NOT NULL No The ID of the catalog/environment, for example, env-xmzdkk. CATALOG_NAME STRING NOT NULL No The human readable name of the catalog/environment, for example, default. ExampleRun the following code to query for all catalogs across environments. SELECT `CATALOG_ID`, `CATALOG_NAME` FROM `INFORMATION_SCHEMA`.`CATALOGS`; INFORMATION_SCHEMA_CATALOG_NAME¶ Local catalog view. Returns the name of the current information schema’s catalog. Column Name Data Type Standard Description CATALOG_ID STRING NOT NULL No The ID of the catalog/environment, for example, env-xmzdkk. CATALOG_NAME STRING NOT NULL Yes The human readable name of the catalog/environment, for example, default. ExampleRun the following code to query for the name of this information schema’s catalog. 
SELECT `CATALOG_ID`, `CATALOG_NAME` FROM `INFORMATION_SCHEMA`.`INFORMATION_SCHEMA_CATALOG_NAME` SCHEMATA / DATABASES¶ Describes databases within the catalog. For convenience, DATABASES is an alias for SCHEMATA. The rows returned are limited to the schemas that you have permission to interact with. Column Name Data Type Standard Description CATALOG_ID STRING NOT NULL No The ID of the catalog/environment, for example, env-xmzdkk. CATALOG_NAME STRING NOT NULL Yes The human readable name of the catalog/environment, for example, default. SCHEMA_ID STRING NOT NULL No The ID of the database/cluster, for example, lkc-kgjwwv. SCHEMA_NAME STRING NOT NULL Yes The human readable name of the database/cluster, for example, MyCluster. ExampleRun the following code to list all Flink databases within a catalog, (Kafka clusters within an environment), excluding information schema. SELECT `SCHEMA_ID`, `SCHEMA_NAME` FROM `INFORMATION_SCHEMA`.`SCHEMATA` WHERE `SCHEMA_NAME` <> 'INFORMATION_SCHEMA'; COLUMNS¶ Describes columns of tables and virtual tables (views) in the catalog. Column Name Data Type Standard Description COLUMN_NAME STRING NOT NULL Yes Column reference. COMMENT STRING NULL No (Like Databricks, Snowflake) An optional comment that describes the relation. DATA_TYPE STRING NOT NULL Yes Type root, for example, VARCHAR or ROW. DISTRIBUTION_ORDINAL_POSITION INT NULL No (Like BigQuery for clustering key) If the table IS_DISTRIBUTED, contains the position of the key in a DISTRIBUTED BY clause. FULL_DATA_TYPE STRING NOT NULL No (Like Databricks) Fully qualified data type. for example, VARCHAR(32) or ROW<…>. GENERATION_EXPRESSION STRING NULL Yes (Like BigQuery and Databricks) For computed columns. IS_GENERATED STRING NOT NULL Yes Indicates whether column is a computed column. Values are YES or NO. IS_HIDDEN STRING NOT NULL No (Like BigQuery) Indicates whether a column is a system column. Values are YES or NO. 
IS_METADATA STRING NOT NULL No Indicates whether column is a metadata column. Values are YES or NO. IS_NULLABLE STRING NOT NULL No Indicates whether the column is nullable. Values are YES or NO. IS_PERSISTED STRING NOT NULL No Indicates whether a metadata column is stored during INSERT INTO. Also YES if a physical column. Values are YES or NO. METADATA_KEY STRING NULL No For metadata columns. ORDINAL_POSITION INT NOT NULL Yes Position of the column in the key, starting at 1. TABLE_CATALOG STRING NOT NULL Yes The human readable name of the catalog. TABLE_CATALOG_ID STRING NOT NULL No The ID of the catalog. TABLE_NAME STRING NOT NULL Yes The name of the relation. TABLE_SCHEMA STRING NOT NULL Yes The human-readable name of the database. TABLE_SCHEMA_ID STRING NOT NULL No The ID of the database. ExamplesThis example shows a complex query. The complexity comes from reducing the number of requests. Because the views are in normal form, instead of issuing three requests, you can batch them into single one by using UNION ALL. UNION ALL avoids the need for various inner/outer joins. The result is a sparse table that contains different “sections”. The overall schema looks like this: ( section, column_name, column_pos, column_type, constraint_name, constraint_type, constraint_enforced ) Run the following code to list columns, like name, position, data type, and their primary key characteristics. 
( SELECT 'COLUMNS' AS `section`, `COLUMN_NAME` AS `column_name`, `ORDINAL_POSITION` AS `column_pos`, `FULL_DATA_TYPE` AS `column_type`, CAST(NULL AS STRING) AS `constraint_name`, CAST(NULL AS STRING) AS `constraint_type`, CAST(NULL AS STRING) AS `constraint_enforced` FROM `<current-catalog>`.`INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_CATALOG` = '<current-catalog>' AND `TABLE_SCHEMA` = '<current-database>' AND `TABLE_NAME` = '<current-table>' AND `IS_HIDDEN` = 'NO' ) UNION ALL ( SELECT 'TABLE_CONSTRAINTS' AS `section`, CAST(NULL AS STRING) AS `column_name`, CAST(NULL AS INT) AS `column_pos`, CAST(NULL AS STRING) AS `column_type`, `CONSTRAINT_NAME` AS `constraint_name`, `CONSTRAINT_TYPE` AS `constraint_type`, `ENFORCED` AS `constraint_enforced` FROM `<current-catalog>`.`INFORMATION_SCHEMA`.`TABLE_CONSTRAINTS` WHERE `CONSTRAINT_CATALOG` = '<current-catalog>' AND `CONSTRAINT_SCHEMA` = '<current-database>' AND `TABLE_CATALOG` = '<current-catalog>' AND `TABLE_SCHEMA` = '<current-database>' AND `TABLE_NAME` = '<current-table>' ) UNION ALL ( SELECT 'KEY_COLUMN_USAGE' AS `section`, `COLUMN_NAME` AS `column_name`, `ORDINAL_POSITION` AS `column_pos`, CAST(NULL AS STRING) AS `column_type`, `CONSTRAINT_NAME` AS `constraint_name`, CAST(NULL AS STRING) AS `constraint_type`, CAST(NULL AS STRING) AS `constraint_enforced` FROM `<current-catalog>`.`INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE` WHERE `TABLE_CATALOG` = '<current-catalog>' AND `TABLE_SCHEMA` = '<current-database>' AND `TABLE_NAME` = '<current-table>' ); KEY_COLUMN_USAGE¶ Side view of TABLE_CONSTRAINTS for key columns. Column Name Data Type Standard Description COLUMN_NAME STRING NOT NULL Yes The name of the constrained column. CONSTRAINT_CATALOG STRING NOT NULL Yes Catalog name containing the constraint. CONSTRAINT_CATALOG_ID STRING NOT NULL No Catalog ID containing the constraint. CONSTRAINT_SCHEMA STRING NOT NULL Yes Schema name containing the constraint. 
CONSTRAINT_SCHEMA_ID STRING NOT NULL No Schema ID containing the constraint. CONSTRAINT_NAME STRING NOT NULL Yes Name of the constraint. ORDINAL_POSITION INT NOT NULL Yes The ordinal position of the column within the constraint key (starting at 1). TABLE_CATALOG STRING NOT NULL Yes The human-readable name of the catalog. TABLE_CATALOG_ID STRING NOT NULL No The ID of the catalog. TABLE_NAME STRING NOT NULL Yes The name of the relation. TABLE_SCHEMA STRING NOT NULL Yes The human-readable name of the database. TABLE_SCHEMA_ID STRING NOT NULL No The ID of the database. Example: Run the following code to query for a side view of TABLE_CONSTRAINTS for key columns. SELECT * FROM `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE`; SYSTEM_TABLES¶ Contains the object-level metadata for virtual tables within the catalog. Virtual tables do not represent physical data. Instead, they expose specific metadata. The rows returned are limited to the schemas the user has permission to interact with. Column Name Data Type Description BASE_TABLE_NAME STRING NULL The name of the relation to which this system view corresponds. COMMENT STRING NULL An optional comment that describes the system view. TABLE_CATALOG STRING NOT NULL The human-readable name of the catalog. TABLE_CATALOG_ID STRING NOT NULL The ID of the catalog. TABLE_NAME STRING NOT NULL The name of the relation. TABLE_SCHEMA STRING NOT NULL The human-readable name of the database. TABLE_SCHEMA_ID STRING NOT NULL The ID of the database. TABLES¶ Contains the object-level metadata for tables and virtual tables (views) within the catalog. The rows returned are limited to the schemas that you have permission to interact with. Column Name Data Type Standard Description COMMENT STRING NULL No (Like Databricks and Snowflake) An optional comment that describes the relation. DISTRIBUTION_ALGORITHM STRING NULL No Only HASH is supported. DISTRIBUTION_BUCKETS INT NULL No Number of buckets, if defined.
IS_DISTRIBUTED STRING NOT NULL No (Like Snowflake for clustering key) Indicates whether the table is bucketed using the DISTRIBUTED BY clause. Values are YES or NO. IS_WATERMARKED STRING NOT NULL No Indicates whether the table has a watermark from the WATERMARK FOR clause. Values are YES or NO. TABLE_CATALOG STRING NOT NULL Yes The human-readable name of the catalog. TABLE_CATALOG_ID STRING NOT NULL No The ID of the catalog. TABLE_NAME STRING NOT NULL Yes The name of the relation. TABLE_SCHEMA STRING NOT NULL Yes The human-readable name of the database. TABLE_SCHEMA_ID STRING NOT NULL No The ID of the database. TABLE_TYPE STRING NOT NULL Yes Values are BASE TABLE, EXTERNAL TABLE, SYSTEM TABLE, or VIEW [1]. WATERMARK_COLUMN STRING NULL No Time attribute column for which the watermark is defined. WATERMARK_EXPRESSION STRING NULL No Watermark expression. WATERMARK_IS_HIDDEN STRING NULL No Indicates whether the watermark is the default, system-provided one. [1]These are the valid values for the TABLE_TYPE column. BASE TABLE: For Confluent-native tables that can be used conceptually for reading and writing, like a regular database table. EXTERNAL TABLE: For non-native Confluent table, for example non-Kafka and Tableflow. Usually, those tables are read-only. SYSTEM TABLE: For tables that the system creates, either with a BASE TABLE or on its own. Only $error is supported. Compared with BASE TABLE, these tables are read-only. VIEW: For SQL views. ExamplesRun the following code to list all tables within a catalog (Kafka topics within an environment), excluding the information schema. SELECT `TABLE_CATALOG`, `TABLE_SCHEMA`, `TABLE_NAME` FROM `INFORMATION_SCHEMA`.`TABLES` WHERE `TABLE_SCHEMA` <> 'INFORMATION_SCHEMA'; Run the following code to list all tables within a database (Kafka topics within a cluster). 
SELECT `TABLE_CATALOG`, `TABLE_SCHEMA`, `TABLE_NAME` FROM `<current-catalog>`.`INFORMATION_SCHEMA`.`TABLES` WHERE `TABLE_SCHEMA` = '<current-database>'; TABLE_CONSTRAINTS¶ Side view of TABLES for all primary key constraints within the catalog. Column Name Data Type Standard Description CONSTRAINT_CATALOG STRING NOT NULL Yes Catalog name containing the constraint. CONSTRAINT_CATALOG_ID STRING NOT NULL No Catalog ID containing the constraint. CONSTRAINT_SCHEMA STRING NOT NULL Yes Schema name containing the constraint. CONSTRAINT_SCHEMA_ID STRING NOT NULL No Schema ID containing the constraint. CONSTRAINT_NAME STRING NOT NULL Yes Name of the constraint. CONSTRAINT_TYPE STRING NOT NULL Yes Currently, only PRIMARY KEY. ENFORCED STRING NOT NULL Yes YES if constraint is enforced, otherwise NO. TABLE_CATALOG STRING NOT NULL Yes The human readable name of the catalog. TABLE_CATALOG_ID STRING NOT NULL No The ID of the catalog. TABLE_NAME STRING NOT NULL Yes The name of the relation. TABLE_SCHEMA STRING NOT NULL Yes The human readable name of the database. TABLE_SCHEMA_ID STRING NOT NULL No The ID of the database. ExamplesRun the following code to query for a side view of TABLES for all primary key constraints within the catalog. SELECT * FROM `INFORMATION_SCHEMA`.`TABLE_CONSTRAINTS`; TABLE_OPTIONS¶ Side view of TABLES for WITH. Extension to the SQL Standard Information Schema. Column Name Data Type Description TABLE_CATALOG STRING NOT NULL The human readable name of the catalog. TABLE_CATALOG_ID STRING NOT NULL The ID of the catalog. TABLE_NAME STRING NOT NULL The name of the relation. TABLE_SCHEMA STRING NOT NULL The human readable name of the database. TABLE_SCHEMA_ID STRING NOT NULL The ID of the database. OPTION_KEY STRING NOT NULL Option key. OPTION_VALUE STRING NOT NULL Option value. ExamplesRun the following code to query for a side view of TABLES for WITH. 
SELECT * FROM `INFORMATION_SCHEMA`.`TABLE_OPTIONS`; VIEWS¶ Contains the object-level metadata for views within the catalog. The rows returned are limited to the schemas the user has permission to interact with. Column Name Data Type Standard Description COMMENT STRING NULL No (Like Databricks and Snowflake) An optional comment that describes the relation. TABLE_CATALOG STRING NOT NULL Yes The human-readable name of the catalog. TABLE_CATALOG_ID STRING NOT NULL No The ID of the catalog. TABLE_NAME STRING NOT NULL Yes The name of the relation. TABLE_SCHEMA STRING NOT NULL Yes The human-readable name of the database. TABLE_SCHEMA_ID STRING NOT NULL No The ID of the database. VIEW_DEFINITION STRING NULL Yes Text of the view’s expanded query expression. Like Databricks, NULL if the user does not own the view.

Functions¶ Confluent Cloud for Apache Flink supports a number of features for routines:

- overloading
- structured types, that is, Java POJOs
- var-args
- procedures and polymorphic table functions (PTFs)
- table, model arguments with traits for PTFs
- input type strategies and return type strategies

These special cases are considered in the INFORMATION_SCHEMA design:

- overloading: SPECIFIC_NAME with _1, _2 behavior, similar to Databricks
- structured types, return type strategies: FULL_DATA_TYPE and DATA_TYPE = NULL
- var-args, input type strategies: at least indicated with IS_STATIC = NO
- procedures and PTFs, table, model arguments with traits for PTFs: special columns TRAITS and DATA_TYPE = NULL

ROUTINES¶ Contains the object-level metadata for functions within the catalog. Column Name Data Type Standard Description CREATED TIMESTAMP_LTZ(9) NOT NULL Yes Creation time of the function. DATA_TYPE STRING NULL Yes Type root, for example, VARCHAR or ROW. NULL is not standard but is reserved for user-defined functions with type strategies instead of a static return type or procedures.
Table-valued functions always return a ROW type: it is either an automatic wrapper in case of a non-row type or a ROW returned by the function. EXTERNAL_ARTIFACTS STRING NULL No Contains the content from USING JAR. Null if not external. For example: confluent-artifact://<artifact-id>/<version-id>. If multiple artifacts are supported, use a semicolon separated list. The information about this being a JAR can be derived from EXTERNAL_LANGUAGE for now. EXTERNAL_LANGUAGE STRING NULL Yes JAVA or PYTHON. NULL if not external. EXTERNAL_NAME STRING NULL Yes Identifier in the external language. For example, for Java, it’s the fully qualified class path. NULL is for non-external functions, that is, functions implemented in SQL. Contains the content of the AS clause, for example, ‘<class>’: CREATE FUNCTION … AS ‘<class>’ FULL_DATA_TYPE STRING NULL No Fully qualified data type, for example, VARCHAR(32) or ROW<…>. NULL is not standard but reserved for user-defined functions with type strategies instead of a static return type or procedures. Table-valued functions always return a ROW type: it is either an automatic wrapper in case of a non-row type or a ROW returned by the function. FUNCTION_KIND STRING NULL No For ROUTINE_TYPE of FUNCTION or PTF, defines a more specific function kind, corresponding to Flink’s FunctionKind. Values are: TABLE, SCALAR, AGGREGATE, PROCESS_TABLE. Null is reserved for PROCEDURES. FUNCTION_REQUIREMENTS STRING NULL No Semicolon separated list of requirements for ROUTINE_TYPE of FUNCTION or PTF. Corresponds to Flink’s FunctionRequirement. Values are: OVER_WINDOW_ONLY. Null if there are no calling requirements. IS_DETERMINISTIC STRING NOT NULL Yes YES or NO. IS_DYNAMIC STRING NOT NULL No YES or NO. Whether the signature has static arguments or uses a strategy. In the latter case, PARAMETERS doesn’t contain information about the given specific routine. ROUTINE_BODY STRING NOT NULL Yes EXTERNAL for Java and Python. 
Deviates from standard because PTFs are also EXTERNAL. ROUTINE_CATALOG STRING NOT NULL Yes Matches SPECIFIC_CATALOG. ROUTINE_CATALOG_ID STRING NOT NULL No Matches SPECIFIC_CATALOG_ID. ROUTINE_NAME STRING NOT NULL Yes Name of the routine. ROUTINE_SCHEMA STRING NOT NULL Yes Matches SPECIFIC_SCHEMA. ROUTINE_SCHEMA_ID STRING NOT NULL No Matches SPECIFIC_SCHEMA_ID. ROUTINE_TYPE STRING NOT NULL Yes FUNCTION for user-defined functions. PTF for user-defined PTFs. SPECIFIC_CATALOG STRING NOT NULL Yes The human-readable name of the catalog. SPECIFIC_CATALOG_ID STRING NOT NULL No The ID of the catalog/environment. SPECIFIC_NAME STRING NOT NULL Yes Uniquely identifies a potentially overloaded routine signature. For example, a function f takes both f(INT) and f(STRING). Each overload gets a specific name such as f_1 or f_2. The specific name is not callable in SQL but is used for references by other INFORMATION_SCHEMA views such as PARAMETERS. SPECIFIC_SCHEMA STRING NOT NULL Yes The human-readable name of the database. SPECIFIC_SCHEMA_ID STRING NOT NULL No The ID of the database. PARAMETERS¶ The parameters view supports only functions with static arguments. Column Name Data Type Standard Description DATA_TYPE STRING NULL Yes Type root, for example, VARCHAR or ROW. NULL is not standard but is reserved for PTFs with untyped table arguments. FULL_DATA_TYPE STRING NULL No Fully qualified data type, for example, VARCHAR(32) or ROW<…>. IS_OPTIONAL STRING NOT NULL No YES or NO whether this parameter is optional. ORDINAL_POSITION INT NOT NULL Yes Position (1-based) of the parameter in the signature. PARAMETER_MODE STRING NOT NULL Yes Always IN. Reserved for future use. PARAMETER_NAME STRING NOT NULL Yes Name of the parameter. ROUTINE_NAME STRING NOT NULL No Name of the routine. SPECIFIC_CATALOG STRING NOT NULL Yes The human-readable name of the catalog. SPECIFIC_CATALOG_ID STRING NOT NULL No The ID of the catalog/environment. 
SPECIFIC_NAME STRING NOT NULL Yes Uniquely identifies a potentially overloaded routine signature. For example, a function f takes both f(INT) and f(STRING). Then each overload gets a specific name such as: f_1 or f_2. The specific name is not callable in SQL but is used for references by other INFORMATION_SCHEMA views such as ROUTINES. SPECIFIC_SCHEMA STRING NOT NULL Yes The human-readable name of the database. SPECIFIC_SCHEMA_ID STRING NOT NULL No The ID of the database. TRAITS STRING NOT NULL No Semicolon-separated list of traits. By default, SCALAR only. Related content¶ Data Types Queries Reserved Keywords Statements Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.
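
As a rough sketch of how the ROUTINES and PARAMETERS views relate, the following two queries first list the routine overloads in a database and then look up the parameters of one overload through its SPECIFIC_NAME. The `<current-database>` placeholder and the specific name `f_1` are illustrative assumptions, and the queries use only plain filters, in the same style as the other examples on this page.

```sql
-- List each routine overload in one database (placeholder name).
SELECT
  `ROUTINE_NAME`,
  `SPECIFIC_NAME`,
  `ROUTINE_TYPE`,
  `FUNCTION_KIND`,
  `FULL_DATA_TYPE`
FROM `INFORMATION_SCHEMA`.`ROUTINES`
WHERE `ROUTINE_SCHEMA` = '<current-database>';

-- Look up the parameters of one overload. 'f_1' is a hypothetical
-- SPECIFIC_NAME taken from the result of the previous query.
SELECT
  `PARAMETER_NAME`,
  `ORDINAL_POSITION`,
  `FULL_DATA_TYPE`,
  `IS_OPTIONAL`
FROM `INFORMATION_SCHEMA`.`PARAMETERS`
WHERE `SPECIFIC_SCHEMA` = '<current-database>'
  AND `SPECIFIC_NAME` = 'f_1';
```

Because the specific name is not callable in SQL, it serves here only as the lookup key between the two views.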

#### Code Examples

```sql
SELECT (...) FROM <catalog-name>.INFORMATION_SCHEMA.TABLES WHERE (...)
```

```sql
SELECT
  `CATALOG_ID`,
  `CATALOG_NAME`
FROM `INFORMATION_SCHEMA`.`CATALOGS`;
```

```sql
SELECT
  `CATALOG_ID`,
  `CATALOG_NAME`
FROM `INFORMATION_SCHEMA`.`INFORMATION_SCHEMA_CATALOG_NAME`;
```

```sql
SELECT
  `SCHEMA_ID`,
  `SCHEMA_NAME`
FROM `INFORMATION_SCHEMA`.`SCHEMATA`
WHERE `SCHEMA_NAME` <> 'INFORMATION_SCHEMA';
```

```sql
(
  section,
  column_name,
  column_pos,
  column_type,
  constraint_name,
  constraint_type,
  constraint_enforced
)
```

```sql
(
  SELECT
    'COLUMNS' AS `section`,
    `COLUMN_NAME` AS `column_name`,
    `ORDINAL_POSITION` AS `column_pos`,
    `FULL_DATA_TYPE` AS `column_type`,
    CAST(NULL AS STRING) AS `constraint_name`,
    CAST(NULL AS STRING) AS `constraint_type`,
    CAST(NULL AS STRING) AS `constraint_enforced`
  FROM
    `<current-catalog>`.`INFORMATION_SCHEMA`.`COLUMNS`
  WHERE
    `TABLE_CATALOG` = '<current-catalog>' AND
    `TABLE_SCHEMA` = '<current-database>' AND
    `TABLE_NAME` = '<current-table>' AND
    `IS_HIDDEN` = 'NO'
)
UNION ALL
(
  SELECT
    'TABLE_CONSTRAINTS' AS `section`,
    CAST(NULL AS STRING) AS `column_name`,
    CAST(NULL AS INT) AS `column_pos`,
    CAST(NULL AS STRING) AS `column_type`,
    `CONSTRAINT_NAME` AS `constraint_name`,
    `CONSTRAINT_TYPE` AS `constraint_type`,
    `ENFORCED` AS `constraint_enforced`
  FROM
    `<current-catalog>`.`INFORMATION_SCHEMA`.`TABLE_CONSTRAINTS`
  WHERE
    `CONSTRAINT_CATALOG` = '<current-catalog>' AND
    `CONSTRAINT_SCHEMA` = '<current-database>' AND
    `TABLE_CATALOG` = '<current-catalog>' AND
    `TABLE_SCHEMA` = '<current-database>' AND
    `TABLE_NAME` = '<current-table>'
)
UNION ALL
(
  SELECT
    'KEY_COLUMN_USAGE' AS `section`,
    `COLUMN_NAME` AS `column_name`,
    `ORDINAL_POSITION` AS `column_pos`,
    CAST(NULL AS STRING) AS `column_type`,
    `CONSTRAINT_NAME` AS `constraint_name`,
    CAST(NULL AS STRING) AS `constraint_type`,
    CAST(NULL AS STRING) AS `constraint_enforced`
  FROM
    `<current-catalog>`.`INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE`
  WHERE
    `TABLE_CATALOG` = '<current-catalog>' AND
    `TABLE_SCHEMA` = '<current-database>' AND
    `TABLE_NAME` = '<current-table>'
);
```

```sql
SELECT *
FROM `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE`;
```
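
In practice the KEY_COLUMN_USAGE view is usually narrowed to a single table. The following is a minimal sketch, assuming the same `<current-database>` and `<current-table>` placeholders used in the combined query above. It returns the primary key columns of that table together with their position in the key.

```sql
-- Primary key columns of one table. The placeholders must be replaced
-- with real database and table names before running.
SELECT
  `CONSTRAINT_NAME`,
  `COLUMN_NAME`,
  `ORDINAL_POSITION`
FROM `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE`
WHERE `TABLE_SCHEMA` = '<current-database>'
  AND `TABLE_NAME` = '<current-table>';
```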

```sql
SELECT
  `TABLE_CATALOG`,
  `TABLE_SCHEMA`,
  `TABLE_NAME`
FROM `INFORMATION_SCHEMA`.`TABLES`
WHERE `TABLE_SCHEMA` <> 'INFORMATION_SCHEMA';
```

```sql
SELECT
  `TABLE_CATALOG`,
  `TABLE_SCHEMA`,
  `TABLE_NAME`
FROM `<current-catalog>`.`INFORMATION_SCHEMA`.`TABLES`
WHERE `TABLE_SCHEMA` = '<current-database>';
```
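
The TABLES view also exposes watermark metadata. As an illustrative sketch, assuming the `<current-database>` placeholder, the following query lists the tables that declare a watermark through the WATERMARK FOR clause, along with the time attribute column and the watermark expression.

```sql
-- Tables in one database that have a watermark defined.
SELECT
  `TABLE_NAME`,
  `TABLE_TYPE`,
  `WATERMARK_COLUMN`,
  `WATERMARK_EXPRESSION`,
  `WATERMARK_IS_HIDDEN`
FROM `INFORMATION_SCHEMA`.`TABLES`
WHERE `TABLE_SCHEMA` = '<current-database>'
  AND `IS_WATERMARKED` = 'YES';
```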

```sql
SELECT *
FROM `INFORMATION_SCHEMA`.`TABLE_CONSTRAINTS`;
```

```sql
SELECT *
FROM `INFORMATION_SCHEMA`.`TABLE_OPTIONS`;
```
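
To inspect the WITH options of a single table rather than the whole catalog, the TABLE_OPTIONS view can be filtered by database and table name. This is a sketch under the assumption that `<current-database>` and `<current-table>` are replaced with real names.

```sql
-- Key/value options set in the WITH clause of one table.
SELECT
  `OPTION_KEY`,
  `OPTION_VALUE`
FROM `INFORMATION_SCHEMA`.`TABLE_OPTIONS`
WHERE `TABLE_SCHEMA` = '<current-database>'
  AND `TABLE_NAME` = '<current-table>';
```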

---

### SQL aggregate functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/aggregate-functions.html

Aggregate Functions in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® provides these built-in functions to aggregate rows in Flink SQL queries: AVG COLLECT COUNT CUME_DIST DENSE_RANK FIRST_VALUE LAG LAST_VALUE LEAD LISTAGG MAX MIN NTILE PERCENT_RANK RANK ROW_NUMBER STDDEV_POP STDDEV_SAMP SUM VAR_POP VAR_SAMP VARIANCE The aggregate functions take an expression across all the rows as the input and return a single aggregated value as the result. AVG¶ SyntaxAVG([ ALL | DISTINCT ] expression) DescriptionBy default or with keyword ALL, returns the average (arithmetic mean) of expression over all input rows. Use DISTINCT to return one unique instance of each value. Example-- returns 1.500000 SELECT AVG(my_values) FROM (VALUES (0.0), (1.0), (2.0), (3.0)) AS my_values; COLLECT¶ SyntaxCOLLECT([ ALL | DISTINCT ] expression) DescriptionBy default or with the ALL keyword, returns a multiset of expression over all input rows. NULL values are ignored. Use DISTINCT to return one unique instance of each value. COUNT¶ SyntaxCOUNT([ ALL ] expression | DISTINCT expression1 [, expression2]*) DescriptionBy default or with ALL, returns the number of input rows for which expression isn’t NULL. Use DISTINCT to return one unique instance of each value. Use COUNT(*) or COUNT(1) to return the number of input rows. Example-- returns 4 SELECT COUNT(my_values) FROM (VALUES (0), (1), (2), (3)) AS my_values; CUME_DIST¶ SyntaxCUME_DIST() DescriptionReturns the cumulative distribution of a value in a group of values. The result is the number of rows preceding or equal to the current row in the partition ordering divided by the number of rows in the window partition. DENSE_RANK¶ SyntaxDENSE_RANK() DescriptionReturns the rank of a value in a group of values. The result is one plus the previously assigned rank value. Unlike the RANK function, DENSE_RANK doesn’t produce gaps in the ranking sequence. 
Related function RANK FIRST_VALUE¶ SyntaxFIRST_VALUE(expression) DescriptionReturns the first value in an ordered set of values. Example-- returns first SELECT FIRST_VALUE(my_values) FROM (VALUES ('first'), ('second'), ('third')) AS my_values; Related function LAST_VALUE LAG¶ SyntaxLAG(expression [, offset] [, default]) DescriptionReturns the value of expression at the offsetth row before the current row in the window. The default value of offset is 1, and the default value of the default argument is NULL. ExampleThe following example shows how to use the LAG function to see player scores changing over time. SELECT $rowtime AS row_time , player_id , game_room_id , points , LAG(points, 1) OVER (PARTITION BY player_id ORDER BY $rowtime) previous_points_value FROM gaming_player_activity; For the full code example, see Compare Current and Previous Values in a Data Stream. Related function LEAD LAST_VALUE¶ SyntaxLAST_VALUE(expression) DescriptionReturns the last value in an ordered set of values. Example-- returns third SELECT LAST_VALUE(my_values) FROM (VALUES ('first'), ('second'), ('third')) AS my_values; Related function FIRST_VALUE LEAD¶ SyntaxLEAD(expression [, offset] [, default]) DescriptionReturns the value of the expression at the offsetth row after the current row in the window. The default value of offset is 1, and the default value of the default argument is NULL. Related function LAG LISTAGG¶ SyntaxLISTAGG(expression [, separator]) DescriptionConcatenates the values of string expressions and inserts separator values between them. The separator isn’t added at the end of string. The default value of separator is ','. Example-- returns first,second,third SELECT LISTAGG(my_values) FROM (VALUES ('first'), ('second'), ('third')) AS my_values; MAX¶ SyntaxMAX([ ALL | DISTINCT ] expression) DescriptionBy default or with the ALL keyword, returns the maximum value of expression over all input rows. Use DISTINCT to return one unique instance of each value. 
Examples-- returns 3 SELECT MAX(my_values) FROM (VALUES (0), (1), (2), (3)) AS my_values; The following example shows how to use the MAX function to find the highest player score in a tumbling window. SELECT window_start, window_end, SUM(points) AS total, MIN(points) as min_points, MAX(points) as max_points FROM TABLE(TUMBLE(TABLE gaming_player_activity_source, DESCRIPTOR($rowtime), INTERVAL '10' SECOND)) GROUP BY window_start, window_end; For the full code example, see Aggregate a Stream in a Tumbling Window. Related function MIN MIN¶ SyntaxMIN([ ALL | DISTINCT ] expression ) DescriptionBy default or with the ALL keyword, returns the minimum value of expression across all input rows. Use DISTINCT to return one unique instance of each value. Examples-- returns 0 SELECT MIN(my_values) FROM (VALUES (0), (1), (2), (3)) AS my_values; The following example shows how to use the MIN function to find the lowest player score in a tumbling window. SELECT window_start, window_end, SUM(points) AS total, MIN(points) as min_points, MAX(points) as max_points FROM TABLE(TUMBLE(TABLE gaming_player_activity_source, DESCRIPTOR($rowtime), INTERVAL '10' SECOND)) GROUP BY window_start, window_end; For the full code example, see Aggregate a Stream in a Tumbling Window. Related function MAX NTILE¶ SyntaxNTILE(n) DescriptionDivides the rows for each window partition into n buckets ranging from 1 to at most n. If the number of rows in the window partition doesn’t divide evenly into the number of buckets, the remainder values are distributed one per bucket, starting with the first bucket. For example, with 6 rows and 4 buckets, the bucket values would be: 1 1 2 2 3 4 PERCENT_RANK¶ SyntaxPERCENT_RANK() DescriptionReturns the percentage ranking of a value in a group of values. The result is the rank value minus one, divided by the number of rows in the partition minus one. If the partition only contains one row, the PERCENT_RANK function returns 0. 
RANK¶ Syntax: RANK() Description: Returns the rank of a value in a group of values. The result is one plus the number of rows preceding or equal to the current row in the partition ordering. The values produce gaps in the sequence. Related functions: DENSE_RANK, ROW_NUMBER. ROW_NUMBER¶ Syntax: ROW_NUMBER() Description: Assigns a unique, sequential number to each row, starting with one, according to the ordering of rows within the window partition. The ROW_NUMBER and RANK functions are similar. ROW_NUMBER numbers all rows sequentially, for example, 1, 2, 3, 4, 5. RANK provides the same numeric value for ties, for example 1, 2, 2, 4, 5. Related functions: RANK, DENSE_RANK. STDDEV_POP¶ Syntax: STDDEV_POP([ ALL | DISTINCT ] expression) Description: By default or with the ALL keyword, returns the population standard deviation of expression over all input rows. Use DISTINCT to return one unique instance of each value. Example: -- returns 0.986154 SELECT STDDEV_POP(my_values) FROM (VALUES (0.5), (1.5), (2.2), (3.2)) AS my_values; Related function: STDDEV_SAMP. STDDEV_SAMP¶ Syntax: STDDEV_SAMP([ ALL | DISTINCT ] expression) Description: By default or with the ALL keyword, returns the sample standard deviation of expression over all input rows. Use DISTINCT to return one unique instance of each value. Example: -- returns 1.138713 SELECT STDDEV_SAMP(my_values) FROM (VALUES (0.5), (1.5), (2.2), (3.2)) AS my_values; Related function: STDDEV_POP. SUM¶ Syntax: SUM([ ALL | DISTINCT ] expression) Description: By default or with the ALL keyword, returns the sum of expression across all input rows. Use DISTINCT to return one unique instance of each value. Examples: -- returns 6 SELECT SUM(my_values) FROM (VALUES (0), (1), (2), (3)) AS my_values; The following example shows how to use the SUM function to find the total of player scores in a tumbling window.
SELECT window_start, window_end, SUM(points) AS total, MIN(points) as min_points, MAX(points) as max_points FROM TABLE(TUMBLE(TABLE gaming_player_activity_source, DESCRIPTOR($rowtime), INTERVAL '10' SECOND)) GROUP BY window_start, window_end; For the full code example, see Aggregate a Stream in a Tumbling Window. VAR_POP¶ SyntaxVAR_POP([ ALL | DISTINCT ] expression) DescriptionBy default or with the ALL keyword, returns the population variance, which is the square of the population standard deviation, of expression over all input rows. Use DISTINCT to return one unique instance of each value. Example-- returns 0.972500 SELECT VAR_POP(my_values) FROM (VALUES (0.5), (1.5), (2.2), (3.2)) AS my_values; Related function VAR_SAMP VAR_SAMP¶ SyntaxVAR_SAMP([ ALL | DISTINCT ] expression) DescriptionBy default or with the ALL keyword, returns the sample variance, which is the square of the sample standard deviation, of expression over all input rows. Use DISTINCT to return one unique instance of each value. The VARIANCE function is equivalent to VAR_SAMP. Example-- returns 1.296667 SELECT VAR_SAMP(my_values) FROM (VALUES (0.5), (1.5), (2.2), (3.2)) AS my_values; Related functions STDDEV_POP VARIANCE VARIANCE¶ SyntaxVARIANCE([ ALL | DISTINCT ] expression) DescriptionEquivalent to VAR_SAMP. Other built-in functions¶ Aggregate Functions Collection Functions Comparison Functions Conditional Functions Datetime Functions Hash Functions JSON Functions ML Preprocessing Functions Model Inference Functions Numeric Functions String Functions Table API Functions Related content¶ User-defined Functions Create a User Defined Function Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
AVG([ ALL | DISTINCT ] expression)
```

```sql
-- returns 1.500000
SELECT AVG(my_values)
FROM (VALUES (0.0), (1.0), (2.0), (3.0)) AS my_values;
```

```sql
COLLECT([ ALL | DISTINCT ] expression)
```
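
The COLLECT entry has no example, so here is a minimal sketch. The alias `t(v)` and the literal values are made up for illustration. Because COLLECT returns a multiset, a value that occurs twice in the input ('a' here) is kept with a multiplicity of two, while adding DISTINCT collapses it to a single instance.

```sql
-- Collect all values into a multiset, then only the distinct ones.
-- The inline table t(v) is illustrative, not from the original page.
SELECT
  COLLECT(v)          AS all_values,
  COLLECT(DISTINCT v) AS distinct_values
FROM (VALUES ('a'), ('b'), ('a')) AS t(v);
```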

```sql
COUNT([ ALL ] expression | DISTINCT expression1 [, expression2]*)
```

```sql
-- returns 4
SELECT COUNT(my_values)
FROM (VALUES (0), (1), (2), (3)) AS my_values;
```
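
The difference between COUNT(*), COUNT(expression), and COUNT(DISTINCT expression) is easiest to see on input that contains duplicates and a NULL. The following sketch uses an illustrative inline table `t(v)`. The counts in the comments follow from the description above: NULL is skipped by COUNT(expression) but not by COUNT(*).

```sql
-- Four input rows: 1, 1, 2, and NULL.
SELECT
  COUNT(*)          AS total_rows,      -- counts every row: 4
  COUNT(v)          AS non_null_rows,   -- skips the NULL: 3
  COUNT(DISTINCT v) AS distinct_values  -- distinct non-NULL values (1 and 2): 2
FROM (VALUES (1), (1), (2), (CAST(NULL AS INT))) AS t(v);
```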

```sql
CUME_DIST()
```

```sql
DENSE_RANK()
```

```sql
FIRST_VALUE(expression)
```

```sql
-- returns first
SELECT FIRST_VALUE(my_values)
FROM (VALUES ('first'), ('second'), ('third')) AS my_values;
```

```sql
LAG(expression [, offset] [, default])
```

```sql
SELECT $rowtime AS row_time
  , player_id
  , game_room_id
  , points
  , LAG(points, 1) OVER (PARTITION BY player_id ORDER BY $rowtime) previous_points_value
 FROM gaming_player_activity;
```

```sql
LAST_VALUE(expression)
```

```sql
-- returns third
SELECT LAST_VALUE(my_values)
FROM (VALUES ('first'), ('second'), ('third')) AS my_values;
```

```sql
LEAD(expression [, offset] [, default])
```

```sql
LISTAGG(expression [, separator])
```

```sql
-- returns first,second,third
SELECT LISTAGG(my_values)
FROM (VALUES ('first'), ('second'), ('third')) AS my_values;
```
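
The optional separator argument can be sketched the same way. This variant uses the same three values as the example above with an illustrative inline table `t(v)` and passes ' | ' instead of relying on the default ','. By analogy with the default case, the expected result is the three strings joined by ' | ', with no trailing separator.

```sql
-- Same input as the previous example, with a custom separator.
SELECT LISTAGG(v, ' | ') AS joined
FROM (VALUES ('first'), ('second'), ('third')) AS t(v);
```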

```sql
MAX([ ALL | DISTINCT ] expression)
```

```sql
-- returns 3
SELECT MAX(my_values)
FROM (VALUES (0), (1), (2), (3)) AS my_values;
```

```sql
SELECT
  window_start,
  window_end,
  SUM(points) AS total,
  MIN(points) AS min_points,
  MAX(points) AS max_points
FROM TABLE(TUMBLE(TABLE gaming_player_activity_source, DESCRIPTOR($rowtime), INTERVAL '10' SECOND))
GROUP BY window_start, window_end;
```

```sql
MIN([ ALL | DISTINCT ] expression )
```

```sql
-- returns 0
SELECT MIN(my_values)
FROM (VALUES (0), (1), (2), (3)) AS my_values;
```

```sql
PERCENT_RANK()
```

```sql
ROW_NUMBER()
```
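
ROW_NUMBER is most often used with an OVER clause to keep only the first few rows per group (the Top-N pattern). The following is a sketch, assuming the `gaming_player_activity` table and the `player_id`, `game_room_id`, and `points` columns from the LAG example. The alias `row_num` and the limit of 3 are arbitrary choices for illustration.

```sql
-- Keep the three highest-scoring rows per game room.
SELECT player_id, game_room_id, points
FROM (
  SELECT
    player_id,
    game_room_id,
    points,
    -- Number rows within each game room, highest points first.
    ROW_NUMBER() OVER (
      PARTITION BY game_room_id
      ORDER BY points DESC
    ) AS row_num
  FROM gaming_player_activity
)
WHERE row_num <= 3;
```

Swapping ROW_NUMBER for RANK would keep ties together, at the cost of possibly returning more than three rows per room.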

```sql
STDDEV_POP([ ALL | DISTINCT ] expression)
```

```sql
-- returns 0.986154
SELECT STDDEV_POP(my_values)
FROM (VALUES (0.5), (1.5), (2.2), (3.2)) AS my_values;
```

```sql
STDDEV_SAMP([ ALL | DISTINCT ] expression)
```

```sql
-- returns 1.138713
SELECT STDDEV_SAMP(my_values)
FROM (VALUES (0.5), (1.5), (2.2), (3.2)) AS my_values;
```

```sql
SUM([ ALL | DISTINCT ] expression)
```

```sql
-- returns 6
SELECT SUM(my_values)
FROM (VALUES (0), (1), (2), (3)) AS my_values;
```

```sql
VAR_POP([ ALL | DISTINCT ] expression)
```

```sql
-- returns 0.972500
SELECT VAR_POP(my_values)
FROM (VALUES (0.5), (1.5), (2.2), (3.2)) AS my_values;
```

```sql
VAR_SAMP([ ALL | DISTINCT ] expression)
```

```sql
-- returns 1.296667
SELECT VAR_SAMP(my_values)
FROM (VALUES (0.5), (1.5), (2.2), (3.2)) AS my_values;
```

```sql
VARIANCE([ ALL | DISTINCT ] expression)
```
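
Since VARIANCE is documented as equivalent to VAR_SAMP, the two can be compared side by side on the input used in the earlier examples. For the values 0.5, 1.5, 2.2, and 3.2, the mean is 1.85 and the squared deviations sum to 3.89. The population variance is therefore 3.89 / 4 = 0.9725 and the sample variance is 3.89 / 3 ≈ 1.296667, matching the results shown above. The inline table alias `t(v)` is an illustrative assumption.

```sql
-- VARIANCE and VAR_SAMP divide by (n - 1); VAR_POP divides by n.
SELECT
  VARIANCE(v) AS variance_value,  -- expected to match VAR_SAMP
  VAR_SAMP(v) AS sample_variance,
  VAR_POP(v)  AS population_variance
FROM (VALUES (0.5), (1.5), (2.2), (3.2)) AS t(v);
```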

---

### SQL Collection Functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/collection-functions.html

Collection Functions in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® provides these built-in collection functions to use in Flink SQL queries: ARRAY ARRAY_AGG ARRAY_APPEND ARRAY_CONCAT ARRAY_CONTAINS ARRAY_DISTINCT ARRAY_EXCEPT ARRAY_INTERSECT ARRAY_JOIN ARRAY_MAX ARRAY_MIN ARRAY_POSITION ARRAY_PREPEND ARRAY_REMOVE ARRAY_REVERSE ARRAY_SLICE ARRAY_SORT ARRAY_UNION CARDINALITY(array) CARDINALITY(map) ELEMENT GROUP_ID GROUPING Implicit row constructor MAP MAP_ENTRIES MAP_FROM_ARRAYS MAP_KEYS MAP_UNION MAP_VALUES ARRAY¶ SyntaxARRAY ‘[’ value1 [, value2 ]* ‘]’ DescriptionCreates an array from the specified list of values, (value1, value2, ...). Use the bracket syntax, array_name[INT], to return the element at position INT in the array. The index starts at 1. Example-- returns Java SELECT ARRAY['Java', 'SQL'][1]; ARRAY_AGG¶ SyntaxARRAY_AGG([ ALL | DISTINCT ] expression [ RESPECT NULLS | IGNORE NULLS ]) DescriptionConcatenates the input rows and returns an array, or NULL if there are no input rows. Use the DISTINCT keyword to specify one unique instance of each value. The ALL keyword concatenates all rows. The default is ALL. By default, NULL values are respected. You can use IGNORE NULLS to skip NULL values. Currently, the ORDER BY clause is not supported. Example-- returns: -- product_name quantities -- Apple [3, 7] -- Orange [2] -- Banana [5, 4] WITH sales_data (id, product_name, quantity_sold) AS ( VALUES (1, 'Apple', 3), (2, 'Banana', 5), (3, 'Apple', 7), (4, 'Orange', 2), (5, 'Banana', 4) ) SELECT product_name, ARRAY_AGG(quantity_sold) AS quantities FROM sales_data GROUP BY product_name; ARRAY_APPEND¶ SyntaxARRAY_APPEND(array, element) DescriptionAppends an element to the end of the array and returns the result. If array is NULL, the function returns NULL. If element is NULL, the NULL element is added to the end of the array. 
Example-- returns [SQL,Java,C#] SELECT ARRAY_APPEND(ARRAY['SQL', 'Java'], 'C#'); ARRAY_CONCAT¶ SyntaxARRAY_CONCAT(array1, array2, …) DescriptionReturns an array that is the result of concatenating at least one array. The returned array contains all of the elements in the first array, followed by all of the elements in the second array, and so forth, up to the Nth array. If any input array is NULL, the function returns NULL. Example-- returns [SQL,Java,Python,Python,Rust,Haskell,C#] SELECT ARRAY_CONCAT(ARRAY['SQL', 'Java'], ARRAY['Python'], ARRAY['Python', 'Rust', 'Haskell', 'C#']); ARRAY_CONTAINS¶ SyntaxARRAY_CONTAINS(array, element) DescriptionReturns a value indicating whether the element exists in array. Checking for NULL elements in the array is supported. If array is NULL, the ARRAY_CONTAINS function returns NULL. The specified element is cast implicitly to the array’s element type, if necessary. Example-- returns TRUE SELECT ARRAY_CONTAINS(ARRAY['Java', 'SQL'], 'SQL'); ARRAY_DISTINCT¶ SyntaxARRAY_DISTINCT(array) DescriptionReturns an array with unique elements. If array is NULL, the ARRAY_DISTINCT function returns NULL. The order of elements in the source array is preserved in the returned array. Example-- returns [SQL,Java,Python] SELECT ARRAY_DISTINCT(ARRAY['SQL', 'Java', 'SQL', 'Python', 'SQL']); ARRAY_EXCEPT¶ SyntaxARRAY_EXCEPT(array1, array2) DescriptionReturns an array that contains the elements from array1 that are not in array2, without duplicates. The order of the elements from array1 is retained. If no elements remain after excluding the elements in array2 from array1, the function returns an empty array. If one or both arguments are NULL, the function returns NULL. 
Example: -- returns [SQL, Java] SELECT ARRAY_EXCEPT(ARRAY['SQL', 'Java', 'Python', 'Rust'], ARRAY['Python', 'Rust', 'Haskell', 'C#']); ARRAY_INTERSECT¶ Syntax: ARRAY_INTERSECT(array1, array2) Description: Returns an array that contains the elements from array1 that are also in array2, without duplicates. The order of the elements from array1 is retained. If there are no common elements in array1 and array2, the function returns an empty array. If either array is NULL, the function returns NULL. Example: -- returns [Python, Rust] SELECT ARRAY_INTERSECT(ARRAY['SQL', 'Java', 'Python', 'Rust'], ARRAY['Python', 'Rust', 'Haskell', 'C#']); ARRAY_JOIN¶ Syntax: ARRAY_JOIN(array, delimiter [, nullReplacement]) Description: Returns a string that represents the concatenation of the elements in array. Elements are cast to their string representation. The delimiter is a string that separates each pair of consecutive elements of the array. The optional nullReplacement is a string that replaces null elements in the array. If nullReplacement is not specified, null elements in the array are omitted from the resulting string. Returns NULL if any of the inputs is NULL. Example: -- returns "Java, SQL, Python, not specified" SELECT ARRAY_JOIN(ARRAY['Java', 'SQL', 'Python', NULL], ', ', 'not specified'); ARRAY_MAX¶ Syntax: ARRAY_MAX(array) Description: Returns the maximum value from array, or NULL if array is NULL. Example: -- returns 4 SELECT ARRAY_MAX(ARRAY[1, 2, 3, 4]); ARRAY_MIN¶ Syntax: ARRAY_MIN(array) Description: Returns the minimum value from array, or NULL if array is NULL. Example: -- returns 1 SELECT ARRAY_MIN(ARRAY[1, 2, 3, 4]); ARRAY_POSITION¶ Syntax: ARRAY_POSITION(array, element) Description: Returns the position of the first occurrence of element in array as an integer. The index is 1-based, so the first element in the array has index 1. Returns 0 if element is not found in array. Returns NULL if either of the arguments is NULL.
Example-- returns 2 SELECT ARRAY_POSITION(ARRAY['Java', 'SQL', 'Python'], 'SQL'); ARRAY_PREPEND¶ SyntaxARRAY_PREPEND(array, element) DescriptionPrepends an element to the beginning of the array and returns the result. If array is NULL, the function returns NULL. If element is NULL, the NULL element is prepended to the beginning of the array. Example-- returns [SQL,Java,Python] SELECT ARRAY_PREPEND(ARRAY['Java', 'Python'], 'SQL'); ARRAY_REMOVE¶ SyntaxARRAY_REMOVE(array, element) DescriptionRemoves from array all elements that are equal to element. Order of elements is retained. If array is NULL, the function returns NULL. Example-- returns [Java,Python] SELECT ARRAY_REMOVE(ARRAY['Java', 'SQL', 'Python'], 'SQL'); ARRAY_REVERSE¶ SyntaxARRAY_REVERSE(array) DescriptionReturns an array that has elements in the reverse order of the elements in array. If array is NULL, the function returns NULL. Example-- returns [Python,SQL,Java] SELECT ARRAY_REVERSE(ARRAY['Java', 'SQL', 'Python']); ARRAY_SLICE¶ SyntaxARRAY_SLICE(array, start_offset [, end_offset]) DescriptionReturns a subarray of the input array between start_offset and end_offset, inclusive. The offsets are 1-based, but 0 is also treated as the beginning of the array. Elements of the subarray are returned in the order they appear in array. Positive values are counted from the beginning of the array. Negative values are counted from the end. If end_offset is omitted, this offset is treated as the length of the array. If start_offset is after end_offset, or both are out of array bounds, an empty array is returned. Returns NULL if any input value is NULL. Example-- returns [SQL,Python,C#,JavaScript] SELECT ARRAY_SLICE(ARRAY['Java', 'SQL', 'Python', 'C#', 'JavaScript', 'Go'], 2, 5); ARRAY_SORT¶ SyntaxARRAY_SORT(array [, ascending_order [, null_first]]) DescriptionReturns an array that has the elements of array in sorted order. When only array is specified, the function defaults to ascending order with NULLs at the start. 
Specifying ascending_order as TRUE orders the array in ascending order, with NULLs first. Setting ascending_order to FALSE orders the array in descending order, with NULLs last. Independently, specifying null_first as TRUE moves NULLs to the beginning. Specifying null_first as FALSE moves NULLs to the end, irrespective of the sorting order. The function returns NULL if any input is NULL. Example-- returns [1,2,3,4,5] SELECT ARRAY_SORT(ARRAY[5,4,3,2,1]); -- returns [NULL,SQL,Python,Java,Go,C#] SELECT ARRAY_SORT(ARRAY['Java', 'SQL', 'Python', NULL, 'Go', 'C#'], FALSE, TRUE); ARRAY_UNION¶ SyntaxARRAY_UNION(array1, array2) DescriptionReturns an array that has the elements from the union of array1 and array2. Duplicate elements are removed. If array1 or array2 is NULL, the function returns NULL. Example-- returns [Java,SQL,Python,C#,Go] SELECT ARRAY_UNION(ARRAY['Java', 'SQL', 'Python'], ARRAY['C#', 'SQL', 'Go']); CARDINALITY(array)¶ SyntaxCARDINALITY(array) DescriptionReturns the number of elements in the specified array. Example-- returns 5 SELECT CARDINALITY(ARRAY['Java', 'SQL', 'Python', 'Rust', 'C++']); CARDINALITY(map)¶ SyntaxCARDINALITY(map) DescriptionReturns the number of entries in the specified map. Example-- returns 3 SELECT CARDINALITY(MAP['Java', 5, 'SQL', 4, 'Python', 3]); ELEMENT¶ SyntaxELEMENT(array) DescriptionReturns the sole element of the specified array. The cardinality of array must be 1. Returns NULL if array is empty. Throws an exception if array has more than one element. Example-- returns Java SELECT ELEMENT(ARRAY['Java']); GROUP_ID¶ SyntaxGROUP_ID() DescriptionReturns an integer that uniquely identifies the combination of grouping keys. GROUPING¶ SyntaxGROUPING(expression1 [, expression2]* ) GROUPING_ID(expression1 [, expression2]* ) DescriptionReturns a bit vector of the specified grouping expressions. Implicit row constructor¶ Syntax(value1 [, value2]*) DescriptionReturns a row created from a list of values, (value1, value2,...). 
The implicit row constructor supports arbitrary expressions as fields and requires at least two fields. The explicit row constructor can deal with an arbitrary number of fields but doesn’t support all kinds of field expressions. Example-- returns (1, SQL) SELECT (1, 'SQL'); MAP¶ SyntaxMAP [ key1, value1 [, key2, value2 ], ... ] DescriptionReturns a map created from the specified list of key-value pairs, ((key1, value1), (key2, value2), ...). Use the bracket syntax, map_name[key], to return the value that corresponds with the specified key. Example-- returns 4 SELECT MAP['Java', 5, 'SQL', 4, 'Python', 3]['SQL']; MAP_ENTRIES¶ SyntaxMAP_ENTRIES(map) DescriptionReturns an array with all elements in map. Order of elements in the returned array is not guaranteed. Example-- returns [Java,5,SQL,4,Python,3] SELECT MAP_ENTRIES(MAP['Java', 5, 'SQL', 4, 'Python', 3]); MAP_FROM_ARRAYS¶ SyntaxMAP_FROM_ARRAYS(array_of_keys, array_of_values) DescriptionReturns a map created from an array of keys and an array of values. The lengths of array_of_keys and array_of_values must be the same. Example-- returns {key1=Python, key2=SQL, key3=Java} SELECT MAP_FROM_ARRAYS(ARRAY['key1', 'key2', 'key3'], ARRAY['Python', 'SQL', 'Java']); MAP_KEYS¶ SyntaxMAP_KEYS(map) DescriptionReturns the keys of map as an array. Order of elements in the returned array is not guaranteed. Example-- returns [Java,Python,SQL] SELECT MAP_KEYS(MAP['Java', 5, 'SQL', 4, 'Python', 3]); MAP_UNION¶ SyntaxMAP_UNION(map1, …) DescriptionReturns a map created by merging at least one map. The maps must have a common map type. If there are overlapping keys, the value from map2 overwrites the value from map1, the value from map3 overwrites the value from map2, and so on, so that the value from mapn overwrites the value from map(n-1). If any of the maps is NULL, the function returns NULL. 
Example-- returns ['Java', 5, 'SQL', 4, 'Python', 3, 'C#', 2, 'Rust', 1] SELECT MAP_UNION(MAP['Java', 5, 'SQL', 4, 'Python', 3], MAP['C#', 2, 'Rust', 1]); MAP_VALUES¶ SyntaxMAP_VALUES(map) DescriptionReturns the values of map as an array. Order of elements in the returned array is not guaranteed. Example-- returns [3,5,4] SELECT MAP_VALUES(MAP['Java', 5, 'SQL', 4, 'Python', 3]); Other built-in functions¶ Aggregate Functions Collection Functions Comparison Functions Conditional Functions Datetime Functions Hash Functions JSON Functions ML Preprocessing Functions Model Inference Functions Numeric Functions String Functions Table API Functions Related content¶ User-defined Functions Create a User Defined Function Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
ARRAY ‘[’ value1 [, value2 ]* ‘]’
```

```sql
(value1, value2, ...)
```

```sql
array_name[INT]
```

```sql
-- returns Java
SELECT ARRAY['Java', 'SQL'][1];
```

```sql
ARRAY_AGG([ ALL | DISTINCT ] expression [ RESPECT NULLS | IGNORE NULLS ])
```

```sql
-- returns:
-- product_name quantities
-- Apple        [3, 7]
-- Orange       [2]
-- Banana       [5, 4]
WITH sales_data (id, product_name, quantity_sold) AS (
  VALUES
    (1, 'Apple', 3),
    (2, 'Banana', 5),
    (3, 'Apple', 7),
    (4, 'Orange', 2),
    (5, 'Banana', 4)
)
SELECT
  product_name,
  ARRAY_AGG(quantity_sold) AS quantities
FROM sales_data
GROUP BY product_name;
```

```sql
ARRAY_APPEND(array, element)
```

```sql
-- returns [SQL,Java,C#]
SELECT ARRAY_APPEND(ARRAY['SQL', 'Java'], 'C#');
```

```sql
ARRAY_CONCAT(array1, array2, …)
```

```sql
-- returns [SQL,Java,Python,Python,Rust,Haskell,C#]
SELECT ARRAY_CONCAT(ARRAY['SQL', 'Java'], ARRAY['Python'], ARRAY['Python', 'Rust', 'Haskell', 'C#']);
```

```sql
ARRAY_CONTAINS(array, element)
```

```sql
-- returns TRUE
SELECT ARRAY_CONTAINS(ARRAY['Java', 'SQL'], 'SQL');
```

```sql
ARRAY_DISTINCT(array)
```

```sql
-- returns [SQL,Java,Python]
SELECT ARRAY_DISTINCT(ARRAY['SQL', 'Java', 'SQL', 'Python', 'SQL']);
```

```sql
ARRAY_EXCEPT(array1, array2)
```

```sql
-- returns [SQL, Java]
SELECT ARRAY_EXCEPT(ARRAY['SQL', 'Java', 'Python', 'Rust'], ARRAY['Python', 'Rust', 'Haskell', 'C#']);
```

```sql
ARRAY_INTERSECT(array1, array2)
```

```sql
-- returns [Python, Rust]
SELECT ARRAY_INTERSECT(ARRAY['SQL', 'Java', 'Python', 'Rust'], ARRAY['Python', 'Rust', 'Haskell', 'C#']);
```

```sql
ARRAY_JOIN(array, delimiter [, nullReplacement])
```

```sql
-- returns "Java, SQL, Python, not specified"
SELECT ARRAY_JOIN(ARRAY['Java', 'SQL', 'Python', NULL], ', ', 'not specified');
```
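The description above says that NULL elements are omitted when `nullReplacement` is not given. A minimal sketch of that case, with the expected result taken from that description rather than from a verified run:

```sql
-- No nullReplacement argument: the NULL element is skipped.
-- returns "Java, SQL, Python"
SELECT ARRAY_JOIN(ARRAY['Java', 'SQL', NULL, 'Python'], ', ');
```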

```sql
ARRAY_MAX(array)
```

```sql
-- returns 4
SELECT ARRAY_MAX(ARRAY[1, 2, 3, 4]);
```

```sql
ARRAY_MIN(array)
```

```sql
-- returns 1
SELECT ARRAY_MIN(ARRAY[1, 2, 3, 4]);
```

```sql
ARRAY_POSITION(array, element)
```

```sql
-- returns 2
SELECT ARRAY_POSITION(ARRAY['Java', 'SQL', 'Python'], 'SQL');
```
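Because ARRAY_POSITION is described as returning 0 when the element is missing, it can double as a membership test. The following sketch shows the not-found case next to ARRAY_CONTAINS; the expected values follow from the descriptions above.

```sql
-- 'Go' is not in the array, so the position is 0.
-- returns 0
SELECT ARRAY_POSITION(ARRAY['Java', 'SQL', 'Python'], 'Go');

-- A position greater than 0 means the element was found.
-- returns FALSE
SELECT ARRAY_POSITION(ARRAY['Java', 'SQL', 'Python'], 'Go') > 0;

-- ARRAY_CONTAINS answers the same question directly.
-- returns TRUE
SELECT ARRAY_CONTAINS(ARRAY['Java', 'SQL', 'Python'], 'SQL');
```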

```sql
ARRAY_PREPEND(array, element)
```

```sql
-- returns [SQL,Java,Python]
SELECT ARRAY_PREPEND(ARRAY['Java', 'Python'], 'SQL');
```

```sql
ARRAY_REMOVE(array, element)
```

```sql
-- returns [Java,Python]
SELECT ARRAY_REMOVE(ARRAY['Java', 'SQL', 'Python'], 'SQL');
```

```sql
ARRAY_REVERSE(array)
```

```sql
-- returns [Python,SQL,Java]
SELECT ARRAY_REVERSE(ARRAY['Java', 'SQL', 'Python']);
```

```sql
ARRAY_SLICE(array, start_offset [, end_offset])
```

```sql
-- returns [SQL,Python,C#,JavaScript]
SELECT ARRAY_SLICE(ARRAY['Java', 'SQL', 'Python', 'C#', 'JavaScript', 'Go'], 2, 5);
```

```sql
ARRAY_SORT(array [, ascending_order [, null_first]])
```

```sql
-- returns [1,2,3,4,5]
SELECT ARRAY_SORT(ARRAY[5,4,3,2,1]);

-- returns [NULL,SQL,Python,Java,Go,C#]
SELECT ARRAY_SORT(ARRAY['Java', 'SQL', 'Python', NULL, 'Go', 'C#'], FALSE, TRUE);
```
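The two optional flags are easier to follow with numeric input. This is a sketch only: the expected results are derived from the ARRAY_SORT description (ascending with `null_first` set to FALSE puts NULLs last; descending without `null_first` also puts NULLs last) and have not been checked against a running statement.

```sql
-- Ascending order, NULLs moved to the end.
-- returns [1,2,3,NULL]
SELECT ARRAY_SORT(ARRAY[3, NULL, 1, 2], TRUE, FALSE);

-- Descending order; per the description, NULLs come last by default.
-- returns [3,2,1,NULL]
SELECT ARRAY_SORT(ARRAY[3, NULL, 1, 2], FALSE);
```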

```sql
ARRAY_UNION(array1, array2)
```

```sql
-- returns [Java,SQL,Python,C#,Go]
SELECT ARRAY_UNION(ARRAY['Java', 'SQL', 'Python'], ARRAY['C#', 'SQL', 'Go']);
```

```sql
CARDINALITY(array)
```

```sql
-- returns 5
SELECT CARDINALITY(ARRAY['Java', 'SQL', 'Python', 'Rust', 'C++']);
```

```sql
CARDINALITY(map)
```

```sql
-- returns 3
SELECT CARDINALITY(MAP['Java', 5, 'SQL', 4, 'Python', 3]);
```

```sql
ELEMENT(array)
```

```sql
-- returns Java
SELECT ELEMENT(ARRAY['Java']);
```

```sql
GROUPING(expression1 [, expression2]* )
GROUPING_ID(expression1 [, expression2]* )
```

```sql
(value1 [, value2]*)
```

```sql
(value1, value2,...)
```

```sql
-- returns (1, SQL)
SELECT (1, 'SQL');
```

```sql
MAP [ key1, value1 [, key2, value2 ], ... ]
```

```sql
((key1, value1), (key2, value2), ...)
```

```sql
map_name[key]
```

```sql
-- returns 4
SELECT MAP['Java', 5, 'SQL', 4, 'Python', 3]['SQL'];
```

```sql
MAP_ENTRIES(map)
```

```sql
-- returns [Java,5,SQL,4,Python,3]
SELECT MAP_ENTRIES(MAP['Java', 5, 'SQL', 4, 'Python', 3]);
```

```sql
MAP_FROM_ARRAYS(array_of_keys, array_of_values)
```

```sql
-- returns {key1=Python, key2=SQL, key3=Java}
SELECT MAP_FROM_ARRAYS(ARRAY['key1', 'key2', 'key3'], ARRAY['Python', 'SQL', 'Java']);
```
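The map built by MAP_FROM_ARRAYS can be read with the same bracket syntax described for MAP. A short sketch that combines the two, reusing the arrays from the example above:

```sql
-- Build the map from parallel arrays, then look up one key.
-- returns SQL
SELECT MAP_FROM_ARRAYS(ARRAY['key1', 'key2', 'key3'], ARRAY['Python', 'SQL', 'Java'])['key2'];

-- The map has one entry per key.
-- returns 3
SELECT CARDINALITY(MAP_FROM_ARRAYS(ARRAY['key1', 'key2', 'key3'], ARRAY['Python', 'SQL', 'Java']));
```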

```sql
MAP_KEYS(map)
```

```sql
-- returns [Java,Python,SQL]
SELECT MAP_KEYS(MAP['Java', 5, 'SQL', 4, 'Python', 3]);
```

```sql
MAP_UNION(map1, …)
```

```sql
-- returns ['Java', 5, 'SQL', 4, 'Python', 3, 'C#', 2, 'Rust', 1]
SELECT MAP_UNION(MAP['Java', 5, 'SQL', 4, 'Python', 3], MAP['C#', 2, 'Rust', 1]);
```

```sql
MAP_VALUES(map)
```

```sql
-- returns [3,5,4]
SELECT MAP_VALUES(MAP['Java', 5, 'SQL', 4, 'Python', 3]);
```

---

### SQL comparison functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/comparison-functions.html

Comparison Functions in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® provides these built-in comparison functions to use in SQL queries: Equality operations Logical operations Comparison Functions Conversion functions Equality operations¶ SQL function Description value1 = value2 Returns TRUE if value1 is equal to value2. Returns UNKNOWN if value1 or value2 is NULL. value1 <> value2 Returns TRUE if value1 is not equal to value2. Returns UNKNOWN if value1 or value2 is NULL. value1 > value2 Returns TRUE if value1 is greater than value2. Returns UNKNOWN if value1 or value2 is NULL. value1 >= value2 Returns TRUE if value1 is greater than or equal to value2. Returns UNKNOWN if value1 or value2 is NULL. value1 < value2 Returns TRUE if value1 is less than value2. Returns UNKNOWN if value1 or value2 is NULL. value1 <= value2 Returns TRUE if value1 is less than or equal to value2. Returns UNKNOWN if value1 or value2 is NULL. Logical operations¶ Logical operation Description boolean1 OR boolean2 Returns TRUE if boolean1 is TRUE or boolean2 is TRUE. Supports three-valued logic. For example, TRUE || NULL(BOOLEAN) returns TRUE. boolean1 AND boolean2 Returns TRUE if boolean1 and boolean2 are both TRUE. Supports three-valued logic. For example, TRUE && NULL(BOOLEAN) returns UNKNOWN. NOT boolean Returns TRUE if boolean is FALSE; returns FALSE if boolean is TRUE; returns UNKNOWN if boolean is UNKNOWN. boolean IS FALSE Returns TRUE if boolean is FALSE; returns FALSE if boolean is TRUE or UNKNOWN. boolean IS NOT FALSE Returns TRUE if boolean is TRUE or UNKNOWN; returns FALSE if boolean is FALSE. boolean IS TRUE Returns TRUE if boolean is TRUE; returns FALSE if boolean is FALSE or UNKNOWN. boolean IS NOT TRUE Returns TRUE if boolean is FALSE or UNKNOWN; returns FALSE if boolean is TRUE. boolean IS UNKNOWN Returns TRUE if boolean is UNKNOWN; returns FALSE if boolean is TRUE or FALSE. 
boolean IS NOT UNKNOWN Returns TRUE if boolean is TRUE or FALSE; returns FALSE if boolean is UNKNOWN. Comparison functions¶ BETWEEN NOT BETWEEN IN NOT IN IS DISTINCT FROM IS NOT DISTINCT FROM IS NULL IS NOT NULL LIKE NOT LIKE SIMILAR TO NOT SIMILAR TO EXISTS BETWEEN¶ Checks whether a value is between two other values. Syntaxvalue1 BETWEEN [ ASYMMETRIC | SYMMETRIC ] value2 AND value3 DescriptionBy default (or with the ASYMMETRIC keyword), the BETWEEN function returns TRUE if value1 is greater than or equal to value2 and less than or equal to value3. If SYMMETRIC is specified, the BETWEEN function returns TRUE if value1 is inclusively between value2 and value3, regardless of the order of the two bounds. When either value2 or value3 is NULL, returns FALSE or UNKNOWN. Examples-- returns FALSE SELECT 12 BETWEEN 15 AND 12; -- returns TRUE SELECT 12 BETWEEN SYMMETRIC 15 AND 12; -- returns UNKNOWN SELECT 12 BETWEEN 10 AND NULL; -- returns FALSE SELECT 12 BETWEEN NULL AND 10; -- returns UNKNOWN SELECT 12 BETWEEN SYMMETRIC NULL AND 12; NOT BETWEEN¶ Checks whether a value is not between two other values. Syntaxvalue1 NOT BETWEEN [ ASYMMETRIC | SYMMETRIC ] value2 AND value3 DescriptionBy default (or with the ASYMMETRIC keyword), the NOT BETWEEN function returns TRUE if value1 is less than value2 or greater than value3. If SYMMETRIC is specified, the NOT BETWEEN function returns TRUE if value1 is not inclusively between value2 and value3. When either value2 or value3 is NULL, returns TRUE or UNKNOWN. Examples-- returns TRUE SELECT 12 NOT BETWEEN 15 AND 12; -- returns FALSE SELECT 12 NOT BETWEEN SYMMETRIC 15 AND 12; -- returns UNKNOWN SELECT 12 NOT BETWEEN NULL AND 15; -- returns TRUE SELECT 12 NOT BETWEEN 15 AND NULL; -- returns UNKNOWN SELECT 12 NOT BETWEEN SYMMETRIC 12 AND NULL; EXISTS¶ Checks whether a query returns a row. SyntaxEXISTS (sub-query) DescriptionThe EXISTS function returns TRUE if sub-query returns at least one row. 
The EXISTS function is supported only if the operation can be rewritten in a join and group operation. For streaming queries, the operation is rewritten in a join and group operation. The required state to compute the query result might grow indefinitely, depending on the number of distinct input rows. Provide a query configuration with a valid retention interval to prevent excessive state size. ExamplesSELECT user_id, item_id FROM user_behavior WHERE EXISTS ( SELECT * FROM category WHERE category.item_id = user_behavior.item_id AND category.name = 'book' ); IN¶ Checks whether a value exists in a list. Syntaxvalue1 IN (value2 [, value3]* ) value IN (sub-query) DescriptionThe IN function returns TRUE if value1 exists in the specified list (value2, value3, ...). If a subquery is specified, the IN function returns TRUE if value is equal to a row returned by sub-query. When (value2, value3, ...) contains NULL, the IN function returns TRUE if the element can be found and UNKNOWN otherwise. Always returns UNKNOWN if value1 is NULL. Examples-- returns FALSE SELECT 4 IN (1, 2, 3); -- returns TRUE SELECT 1 IN (1, 2, NULL); -- returns UNKNOWN SELECT 4 IN (1, 2, NULL); NOT IN¶ Checks whether a value doesn’t exist in a list. Syntaxvalue1 NOT IN (value2 [, value3]* ) value NOT IN (sub-query) DescriptionThe NOT IN function returns TRUE if value1 does not exist in the specified list (value2, value3, ...). If a subquery is specified, the NOT IN function returns TRUE if value isn’t equal to a row returned by sub-query. When (value2, value3, ...) contains NULL, the NOT IN function returns FALSE if value1 can be found and UNKNOWN otherwise. Always returns UNKNOWN if value1 is NULL. Examples-- returns TRUE SELECT 4 NOT IN (1, 2, 3); -- returns FALSE SELECT 1 NOT IN (1, 2, NULL); -- returns UNKNOWN SELECT 4 NOT IN (1, 2, NULL); IS DISTINCT FROM¶ Checks whether two values are different. 
Syntaxvalue1 IS DISTINCT FROM value2 DescriptionThe IS DISTINCT FROM function returns TRUE if two values are different. NULL values are treated as identical. Examples-- returns TRUE SELECT 1 IS DISTINCT FROM 2; -- returns TRUE SELECT 1 IS DISTINCT FROM NULL; -- returns FALSE SELECT NULL IS DISTINCT FROM NULL; IS NOT DISTINCT FROM¶ Checks whether two values are equal. Syntaxvalue1 IS NOT DISTINCT FROM value2 DescriptionThe IS NOT DISTINCT FROM function returns TRUE if two values are equal. NULL values are treated as identical. Examples-- returns FALSE SELECT 1 IS NOT DISTINCT FROM 2; -- returns FALSE SELECT 1 IS NOT DISTINCT FROM NULL; -- returns TRUE SELECT NULL IS NOT DISTINCT FROM NULL; IS NULL¶ Checks whether a value is NULL. Syntaxvalue IS NULL DescriptionThe IS NULL function returns TRUE if value is NULL. Examples-- returns FALSE SELECT 1 IS NULL; -- returns TRUE SELECT NULL IS NULL; IS NOT NULL¶ Checks whether a value is assigned. Syntaxvalue IS NOT NULL DescriptionThe IS NOT NULL function returns TRUE if value is not NULL. Examples-- returns TRUE SELECT 1 IS NOT NULL; -- returns FALSE SELECT NULL IS NOT NULL; LIKE¶ Checks whether a string matches a pattern. Syntaxstring1 LIKE string2 DescriptionThe LIKE function returns TRUE if string1 matches the pattern specified by string2. The pattern can contain these special characters: % – matches any number of characters _ – matches a single character Returns UNKNOWN if either string1 or string2 is NULL. Examples-- returns TRUE SELECT 'book-23' LIKE 'book-%'; -- returns FALSE SELECT 'book23' LIKE 'book_'; -- returns TRUE SELECT 'book2' LIKE 'book_'; NOT LIKE¶ Checks whether a string doesn’t match a pattern. Syntaxstring1 NOT LIKE string2 [ ESCAPE char ] DescriptionThe NOT LIKE function returns TRUE if string1 does not match the pattern specified by string2. The pattern can contain these special characters: % – matches any number of characters _ – matches a single character Returns UNKNOWN if string1 or string2 is NULL. 
Examples-- returns FALSE SELECT 'book-23' NOT LIKE 'book-%'; -- returns TRUE SELECT 'book23' NOT LIKE 'book_'; -- returns FALSE SELECT 'book2' NOT LIKE 'book_'; SIMILAR TO¶ Checks whether a string matches a regular expression. Syntaxstring1 SIMILAR TO string2 DescriptionThe SIMILAR TO function returns TRUE if string1 matches the SQL regular expression in string2. The pattern can contain any characters that are valid in regular expressions, like ., which matches any character, *, which matches zero or more occurrences, and + which matches one or more occurrences. Returns UNKNOWN if string1 or string2 is NULL. Examples-- returns TRUE SELECT 'book-523' SIMILAR TO 'book-[0-9]+'; -- returns TRUE SELECT 'bob.dobbs@example.com' SIMILAR TO '%@example.com'; NOT SIMILAR TO¶ Checks whether a string doesn’t match a regular expression. Syntaxstring1 NOT SIMILAR TO string2 [ ESCAPE char ] DescriptionThe NOT SIMILAR TO function returns TRUE if string1 does not match the SQL regular expression specified by string2. Returns UNKNOWN if string1 or string2 is NULL. Examples-- returns TRUE SELECT 'book-nan' NOT SIMILAR TO 'book-[0-9]+'; -- returns TRUE SELECT 'bob.dobbs@company.com' NOT SIMILAR TO '%@example.com'; Conversion functions¶ CAST TRY_CAST TYPEOF CAST¶ Casts a value to a different type. SyntaxCAST(value AS type) DescriptionThe CAST function returns the specified value cast to the type specified by type. A cast error throws an exception and fails the job. When performing a cast operation that may fail, like STRING to INT, prefer TRY_CAST, to enable handling errors. If table.exec.legacy-cast-behaviour is enabled, the CAST function behaves like TRY_CAST. Examples-- returns 42 SELECT CAST('42' AS INT); -- returns NULL of type STRING SELECT CAST(NULL AS STRING); -- throws an exception and fails the job SELECT CAST('not-a-number' AS INT); TRY_CAST¶ Casts a value to a different type and returns NULL on error. 
SyntaxTRY_CAST(value AS type) DescriptionSimilar to the CAST function, but in case of error, returns NULL rather than failing the job. Examples-- returns 42 SELECT TRY_CAST('42' AS INT); -- returns NULL of type STRING SELECT TRY_CAST(NULL AS STRING); -- returns NULL of type INT SELECT TRY_CAST('not-a-number' AS INT); -- returns 0 of type INT SELECT COALESCE(TRY_CAST('not-a-number' AS INT), 0); TYPEOF¶ Gets the string representation of a data type. SyntaxTYPEOF(input) TYPEOF(input, force_serializable) DescriptionThe TYPEOF function returns the string representation of the input expression’s data type. By default, the returned string is a summary string that might omit certain details for readability. If force_serializable is set to TRUE, the string represents a full data type that can be persisted in a catalog. Anonymous, inline data types have no serializable string representation. In these cases, NULL is returned. Examples-- returns "CHAR(13) NOT NULL" SELECT TYPEOF('a string type'); -- returns "INT NOT NULL" SELECT TYPEOF(23); -- returns "DATE NOT NULL" SELECT TYPEOF(DATE '2023-05-04'); -- returns "NULL" SELECT TYPEOF(NULL);

#### Code Examples

```sql
value1 = value2
```

```sql
value1 <> value2
```

```sql
value1 > value2
```

```sql
value1 >= value2
```

```sql
value1 < value2
```

```sql
value1 <= value2
```

```sql
boolean1 OR boolean2
```

```sql
TRUE || NULL(BOOLEAN)
```

```sql
boolean1 AND boolean2
```

```sql
TRUE && NULL(BOOLEAN)
```

```sql
NOT boolean
```

```sql
boolean IS FALSE
```

```sql
boolean IS NOT FALSE
```

```sql
boolean IS TRUE
```

```sql
boolean IS NOT TRUE
```

```sql
boolean IS UNKNOWN
```

```sql
boolean IS NOT UNKNOWN
```

```sql
value1 BETWEEN [ ASYMMETRIC | SYMMETRIC ] value2 AND value3
```

```sql
-- returns FALSE
SELECT 12 BETWEEN 15 AND 12;

-- returns TRUE
SELECT 12 BETWEEN SYMMETRIC 15 AND 12;

-- returns UNKNOWN
SELECT 12 BETWEEN 10 AND NULL;

-- returns FALSE
SELECT 12 BETWEEN NULL AND 10;

-- returns UNKNOWN
SELECT 12 BETWEEN SYMMETRIC NULL AND 12;
```
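It can help to read BETWEEN as shorthand for two inclusive comparisons. The sketch below places the default (ASYMMETRIC) form next to its expanded equivalent; the column aliases are illustrative only.

```sql
-- Both columns return TRUE: the default form is equivalent to
-- value1 >= value2 AND value1 <= value3.
SELECT
  12 BETWEEN 10 AND 15  AS in_range,
  12 >= 10 AND 12 <= 15 AS in_range_expanded;

-- With SYMMETRIC, the order of the bounds does not matter.
-- returns TRUE
SELECT 12 BETWEEN SYMMETRIC 15 AND 10;
```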

```sql
value1 NOT BETWEEN [ ASYMMETRIC | SYMMETRIC ] value2 AND value3
```

```sql
-- returns TRUE
SELECT 12 NOT BETWEEN 15 AND 12;

-- returns FALSE
SELECT 12 NOT BETWEEN SYMMETRIC 15 AND 12;

-- returns UNKNOWN
SELECT 12 NOT BETWEEN NULL AND 15;

-- returns TRUE
SELECT 12 NOT BETWEEN 15 AND NULL;

--  returns UNKNOWN
SELECT 12 NOT BETWEEN SYMMETRIC 12 AND NULL;
```

```sql
EXISTS (sub-query)
```

```sql
SELECT user_id, item_id
FROM user_behavior
WHERE EXISTS (
  SELECT * FROM category
  WHERE category.item_id = user_behavior.item_id
  AND category.name = 'book'
);
```
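The EXISTS description recommends a retention interval to bound state size. One way this might look is sketched below. It assumes that the `sql.state-ttl` option is the statement-level setting for state retention in your environment and that `'1 h'` is an accepted duration format; check the SET statement reference before relying on either.

```sql
-- Assumption: 'sql.state-ttl' controls how long idle state is kept.
SET 'sql.state-ttl' = '1 h';

-- Same query as above; state for rows idle longer than the TTL can be dropped.
SELECT user_id, item_id
FROM user_behavior
WHERE EXISTS (
  SELECT * FROM category
  WHERE category.item_id = user_behavior.item_id
  AND category.name = 'book'
);
```

A shorter TTL reduces state size, but rows whose state has expired can no longer be matched, so results may differ from an unbounded run.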

```sql
value1 IN (value2 [, value3]* )
value IN (sub-query)
```

```sql
-- returns FALSE
SELECT 4 IN (1, 2, 3);

-- returns TRUE
SELECT 1 IN (1, 2, NULL);

-- returns UNKNOWN
SELECT 4 IN (1, 2, NULL);
```

```sql
value1 NOT IN (value2 [, value3]* )
value NOT IN (sub-query)
```

```sql
-- returns TRUE
SELECT 4 NOT IN (1, 2, 3);

-- returns FALSE
SELECT 1 NOT IN (1, 2, NULL);

-- returns UNKNOWN
SELECT 4 NOT IN (1, 2, NULL);
```

```sql
value1 IS DISTINCT FROM value2
```

```sql
--  returns TRUE
SELECT 1 IS DISTINCT FROM 2;

--  returns TRUE
SELECT 1 IS DISTINCT FROM NULL;

--  returns FALSE
SELECT NULL IS DISTINCT FROM NULL;
```

```sql
value1 IS NOT DISTINCT FROM value2
```

```sql
--  returns FALSE
SELECT 1 IS NOT DISTINCT FROM 2;

--  returns FALSE
SELECT 1 IS NOT DISTINCT FROM NULL;

--  returns TRUE
SELECT NULL IS NOT DISTINCT FROM NULL;
```
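The practical difference from `=` shows up when both sides are NULL. A small sketch, using typed NULLs (an illustrative choice, so that both operands have a known type):

```sql
-- Plain equality cannot decide when an operand is NULL.
-- returns UNKNOWN
SELECT CAST(NULL AS INT) = CAST(NULL AS INT);

-- The null-safe comparison treats two NULLs as identical.
-- returns TRUE
SELECT CAST(NULL AS INT) IS NOT DISTINCT FROM CAST(NULL AS INT);
```

This makes IS NOT DISTINCT FROM a reasonable choice for join or filter conditions on nullable columns where NULL should match NULL.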

```sql
value IS NULL
```

```sql
--  returns FALSE
SELECT 1 IS NULL;

--  returns TRUE
SELECT NULL IS NULL;
```

```sql
value IS NOT NULL
```

```sql
--  returns TRUE
SELECT 1 IS NOT NULL;

--  returns FALSE
SELECT NULL IS NOT NULL;
```

```sql
string1 LIKE string2
```

```sql
-- returns TRUE
SELECT 'book-23' LIKE 'book-%';

-- returns FALSE
SELECT 'book23' LIKE 'book_';

-- returns TRUE
SELECT 'book2' LIKE 'book_';
```
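To match a literal `%` or `_`, standard SQL offers an ESCAPE clause. The LIKE syntax above does not list it, so treat the following as a sketch that assumes LIKE accepts the same optional `ESCAPE char` clause shown for NOT LIKE; the escape character `!` is an arbitrary choice.

```sql
-- '!%' stands for a literal percent sign, not a wildcard.
-- returns TRUE
SELECT '100%' LIKE '100!%' ESCAPE '!';

-- The input has no literal '%', so the escaped pattern does not match.
-- returns FALSE
SELECT '1000' LIKE '100!%' ESCAPE '!';
```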

```sql
string1 NOT LIKE string2 [ ESCAPE char ]
```

```sql
-- returns FALSE
SELECT 'book-23' NOT LIKE 'book-%';

-- returns TRUE
SELECT 'book23' NOT LIKE 'book_';

-- returns FALSE
SELECT 'book2' NOT LIKE 'book_';
```

```sql
string1 SIMILAR TO string2
```

```sql
-- returns TRUE
SELECT 'book-523' SIMILAR TO 'book-[0-9]+';

-- returns TRUE
SELECT 'bob.dobbs@example.com' SIMILAR TO '%@example.com';
```

```sql
string1 NOT SIMILAR TO string2 [ ESCAPE char ]
```

```sql
-- returns TRUE
SELECT 'book-nan' NOT SIMILAR TO 'book-[0-9]+';

-- returns TRUE
SELECT 'bob.dobbs@company.com' NOT SIMILAR TO '%@example.com';
```

```sql
CAST(value AS type)
```

```sql
table.exec.legacy-cast-behaviour
```

```sql
--  returns 42
SELECT CAST('42' AS INT);

-- returns NULL of type STRING
SELECT CAST(NULL AS STRING);

--  throws an exception and fails the job
SELECT CAST('not-a-number' AS INT);
```

```sql
TRY_CAST(value AS type)
```

```sql
--  returns 42
SELECT TRY_CAST('42' AS INT);

--  returns NULL of type STRING
SELECT TRY_CAST(NULL AS STRING);

--  returns NULL of type INT
SELECT TRY_CAST('not-a-number' AS INT);

--  returns 0 of type INT
SELECT COALESCE(TRY_CAST('not-a-number' AS INT), 0);
```
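TRY_CAST is typically applied to a whole column of untrusted strings. The sketch below uses an inline VALUES table with hypothetical names (`raw_readings`, `sensor_id`, `reading`) to show malformed input turning into NULL or a default instead of failing the statement.

```sql
-- returns (row order is not guaranteed):
-- sensor_id  reading_int  reading_or_default
-- s1         42           42
-- s2         NULL         -1
-- s3         17           17
WITH raw_readings (sensor_id, reading) AS (
  VALUES
    ('s1', '42'),
    ('s2', 'NA'),
    ('s3', '17')
)
SELECT
  sensor_id,
  TRY_CAST(reading AS INT) AS reading_int,
  COALESCE(TRY_CAST(reading AS INT), -1) AS reading_or_default
FROM raw_readings;
```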

```sql
TYPEOF(input)
TYPEOF(input, force_serializable)
```

```sql
-- returns "CHAR(13) NOT NULL"
SELECT TYPEOF('a string type');

-- returns "INT NOT NULL"
SELECT TYPEOF(23);

-- returns "DATE NOT NULL"
SELECT TYPEOF(DATE '2023-05-04');

-- returns "NULL"
SELECT TYPEOF(NULL);
```

---

### SQL conditional functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/conditional-functions.html

Conditional Functions in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® provides these built-in functions for controlling execution flow in SQL queries: CASE CASE WHEN CONDITION COALESCE GREATEST IF IFNULL IS_ALPHA IS_DECIMAL IS_DIGIT LEAST NULLIF CASE¶ SyntaxCASE value WHEN value1_1 [, value1_2]* THEN result1 (WHEN value2_1 [, value2_2 ]* THEN result2)* (ELSE result_z) END DescriptionReturns resultX when the specified value is contained in (valueX_1, valueX_2, ...). If no value matches, CASE returns result_z, if it’s provided, otherwise NULL. CASE WHEN CONDITION¶ SyntaxCASE WHEN condition1 THEN result1 (WHEN condition2 THEN result2)* (ELSE result_z) END Returns resultX when the first conditionX is met. When no condition is met, returns result_z, if it’s provided, otherwise NULL. COALESCE¶ SyntaxCOALESCE(value1 [, value2]*) Returns the first argument that is not NULL. If all arguments are NULL, the COALESCE function returns NULL. The return type is the least-restrictive, common type of all the arguments. The return type is nullable if all arguments are nullable as well. ExampleThe following SELECT statements return the values indicated in the comment lines. -- Returns 'default' SELECT COALESCE(NULL, 'default'); -- Returns the first non-null value among column0 and column1, -- or 'default' if column0 and column1 are both NULL. SELECT COALESCE(column0, column1, 'default'); GREATEST¶ SyntaxGREATEST(value1[, value2]*) Returns the greatest value in the specified list of arguments. Returns NULL if any argument is NULL. Example-- returns 4 SELECT GREATEST(1, 2, 3, 4); -- returns d SELECT GREATEST('a', 'b', 'c', 'd'); IF¶ SyntaxIF(condition, true_value, false_value) Returns the true_value if condition is met, otherwise false_value. ExampleThe following SELECT statements return the values indicated in the comment lines. 
-- returns 5 SELECT IF(5 > 3, 5, 3); IFNULL¶ SyntaxIFNULL(input, null_replacement) Returns null_replacement if input is NULL; otherwise returns input. The IFNULL function enables passing nullable columns into a function or table that is declared with a NOT NULL constraint. Compared with COALESCE or CASE, the IFNULL function returns a data type that’s specific with respect to nullability. The returned type is the common type of both arguments but only nullable if the null_replacement is nullable. For example, IFNULL(nullable_column, 5) never returns NULL. IS_ALPHA¶ SyntaxIS_ALPHA(string) Returns TRUE if all characters in the specified string are alphabetic, otherwise FALSE. Example-- returns FALSE SELECT IS_ALPHA('42'); -- returns TRUE SELECT IS_ALPHA('string'); IS_DECIMAL¶ SyntaxIS_DECIMAL(string) Returns TRUE if the specified string can be parsed to a valid NUMERIC, otherwise FALSE. Example-- returns TRUE SELECT IS_DECIMAL('23'); -- returns FALSE SELECT IS_DECIMAL('not a number'); IS_DIGIT¶ SyntaxIS_DIGIT(string) Returns TRUE if all characters in the specified string are digits, otherwise FALSE. Example-- returns TRUE SELECT IS_DIGIT('23'); -- returns FALSE SELECT IS_DIGIT('2 not a digit 3'); LEAST¶ SyntaxLEAST(value1[, value2]*) Returns the lowest value in the specified list of arguments. Returns NULL if any argument is NULL. Example-- returns 1 SELECT LEAST(1, 2, 3, 4); -- returns a SELECT LEAST('a', 'b', 'c', 'd'); NULLIF¶ SyntaxNULLIF(value1, value2) DescriptionReturns NULL if value1 is equal to value2, otherwise returns value1. 
Example-- returns NULL SELECT NULLIF(5, 5); -- returns 5 SELECT NULLIF(5, 0);

#### Code Examples

```sql
CASE value
  WHEN value1_1 [, value1_2]* THEN result1
  (WHEN value2_1 [, value2_2 ]* THEN result2)*
  (ELSE result_z)
END
```

```sql
CASE
  WHEN condition1 THEN result1
  (WHEN condition2 THEN result2)*
  (ELSE result_z)
END
```
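
As a minimal sketch of both CASE forms shown above, the following statements use only literals, so they assume nothing beyond the documented syntax. The result labels (`'low'`, `'mid'`, `'top'`, `'greater'`, `'smaller'`) are illustrative values, not part of the original page.

```sql
-- Simple CASE: the value 3 is compared against each WHEN list in order.
-- 3 appears in the second list, so the result is 'mid'.
SELECT CASE 3
  WHEN 1, 2 THEN 'low'
  WHEN 3, 4 THEN 'mid'
  ELSE 'top'
END;

-- Searched CASE: each WHEN holds a boolean condition.
-- 5 > 3 is TRUE, so the result is 'greater'.
SELECT CASE
  WHEN 5 > 3 THEN 'greater'
  ELSE 'smaller'
END;
```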

```sql
COALESCE(value1 [, value2]*)
```

```sql
-- Returns 'default'
SELECT COALESCE(NULL, 'default');

-- Returns the first non-null value among column0 and column1,
-- or 'default' if column0 and column1 are both NULL.
SELECT COALESCE(column0, column1, 'default');
```

```sql
GREATEST(value1[, value2]*)
```

```sql
-- returns 4
SELECT GREATEST(1, 2, 3, 4);

-- returns d
SELECT GREATEST('a', 'b', 'c', 'd');
```

```sql
IF(condition, true_value, false_value)
```

```sql
-- returns 5
SELECT IF(5 > 3, 5, 3);
```

```sql
IFNULL(input, null_replacement)
```
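
To make the nullability point concrete, here is a hedged sketch of IFNULL applied to a nullable column. The table `orders` and its columns `order_id` and `discount` are hypothetical names chosen for illustration.

```sql
-- Hypothetical table: orders(order_id STRING, discount INT),
-- where discount is a nullable column.
-- Because the replacement value 0 is not nullable, the resulting
-- column discount_or_zero is typed as NOT NULL.
SELECT
  order_id,
  IFNULL(discount, 0) AS discount_or_zero
FROM orders;
```

A result column typed this way can be written to a sink column declared NOT NULL, which is the use case the description mentions.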

```sql
IS_ALPHA(string)
```

```sql
-- returns FALSE
SELECT IS_ALPHA('42');

-- returns TRUE
SELECT IS_ALPHA('string');
```

```sql
IS_DECIMAL(string)
```

```sql
-- returns TRUE
SELECT IS_DECIMAL('23');

-- returns FALSE
SELECT IS_DECIMAL('not a number');
```

```sql
IS_DIGIT(string)
```

```sql
-- returns TRUE
SELECT IS_DIGIT('23');

-- returns FALSE
SELECT IS_DIGIT('2 not a digit 3');
```
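
The difference between IS_DECIMAL and IS_DIGIT shows up with a decimal point. The sketch below follows directly from the two descriptions: the string parses as a number, but not every character is a digit.

```sql
-- '3.14' can be parsed as a numeric value.
SELECT IS_DECIMAL('3.14');  -- TRUE

-- The '.' character is not a digit, so the all-digits check fails.
SELECT IS_DIGIT('3.14');    -- FALSE
```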

```sql
LEAST(value1[, value2]*)
```

```sql
-- returns 1
SELECT LEAST(1, 2, 3, 4);

-- returns a
SELECT LEAST('a', 'b', 'c', 'd');
```
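
Because LEAST returns NULL if any argument is NULL, nullable inputs often need a guard. A small sketch using literals only; the replacement value 2 is an arbitrary choice for illustration.

```sql
-- A NULL argument makes the whole result NULL.
SELECT LEAST(1, CAST(NULL AS INT), 3);

-- Replacing the NULL first (here with 2) gives a non-NULL result: 1.
SELECT LEAST(1, COALESCE(CAST(NULL AS INT), 2), 3);
```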

```sql
NULLIF(value1, value2)
```

```sql
-- returns NULL
SELECT NULLIF(5, 5);

-- returns 5
SELECT NULLIF(5, 0);
```
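
A common use of NULLIF is guarding a divisor. The following is a sketch under stated assumptions: the table `sales` and its columns `total` and `item_count` are hypothetical.

```sql
-- Hypothetical table: sales(total DOUBLE, item_count INT).
-- NULLIF turns a zero divisor into NULL, so the division yields NULL
-- instead of dividing by zero; COALESCE then substitutes 0.
SELECT COALESCE(total / NULLIF(item_count, 0), 0) AS avg_price
FROM sales;
```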

---

### SQL Datetime Functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/datetime-functions.html

Confluent Cloud for Apache Flink® provides these built-in functions for handling date and time logic in SQL queries:

- Date: CURRENT_DATE, DATE_FORMAT, DATE, DAYOFMONTH, DAYOFWEEK, DAYOFYEAR, MONTH, QUARTER, TO_DATE, WEEK, YEAR
- Time: CONVERT_TZ, CURRENT_TIME, HOUR, LOCALTIME, MINUTE, NOW, SECOND, TIME
- Timestamp: CURRENT_TIMESTAMP, CURRENT_ROW_TIMESTAMP, LOCALTIMESTAMP, TIMESTAMP, TO_TIMESTAMP, TO_TIMESTAMP_LTZ, TIMESTAMPADD, TIMESTAMPDIFF, UNIX_TIMESTAMP
- Utility: CEIL, CURRENT_WATERMARK, EXTRACT, FLOOR, FROM_UNIXTIME, INTERVAL, SOURCE_WATERMARK, OVERLAPS

Time interval and point unit specifiers

The following lists give the specifiers for time interval units and time point units.

- Time interval units: MILLENNIUM, CENTURY, DECADE, YEAR, YEAR TO MONTH, QUARTER, MONTH, WEEK, DAY, DAY TO HOUR, DAY TO MINUTE, DAY TO SECOND, HOUR, HOUR TO MINUTE, HOUR TO SECOND, MINUTE, MINUTE TO SECOND, SECOND, MILLISECOND, MICROSECOND, NANOSECOND, EPOCH, DOY, DOW, ISODOW, ISOYEAR
- Time point units: YEAR, QUARTER, MONTH, WEEK, DAY, HOUR, MINUTE, SECOND, MILLISECOND, MICROSECOND, SQL_TSI_YEAR, SQL_TSI_QUARTER, SQL_TSI_MONTH, SQL_TSI_WEEK, SQL_TSI_DAY, SQL_TSI_HOUR, SQL_TSI_MINUTE, SQL_TSI_SECOND

CEIL: Rounds a time point up. Syntax: `CEIL(timepoint TO timeintervalunit)`. Description: The CEIL function returns a value that rounds `timepoint` up to the time unit specified by `timeintervalunit`. Example: `SELECT CEIL(TIME '12:44:31' TO MINUTE);` returns `12:45:00`. Related function: FLOOR.

CONVERT_TZ: Converts a datetime from one time zone to another. Syntax: `CONVERT_TZ(string1, string2, string3)`. Description: The CONVERT_TZ function converts a datetime `string1` that has the default ISO timestamp format, `yyyy-MM-dd HH:mm:ss`, from the time zone specified by `string2` to the time zone specified by `string3`. The format of the time zone arguments is either an abbreviation, like `PST`, a full name, like `America/Los_Angeles`, or a custom ID, like `GMT-08:00`.
Example: `SELECT CONVERT_TZ('1970-01-01 00:00:00', 'UTC', 'America/Los_Angeles');` returns `1969-12-31 16:00:00`.

CURRENT_DATE: Returns the current date. Syntax: `CURRENT_DATE`. Description: The CURRENT_DATE function returns the current SQL date in the local time zone. In streaming mode, the current date is evaluated for each record. In batch mode, the current date is evaluated once when the query starts, and CURRENT_DATE returns the same result for every row. Example: `SELECT CURRENT_DATE;` returns the current date.

CURRENT_ROW_TIMESTAMP: Returns the current timestamp for each row. Syntax: `CURRENT_ROW_TIMESTAMP()`. Description: The CURRENT_ROW_TIMESTAMP function returns the current SQL timestamp in the local time zone. The return type is `TIMESTAMP_LTZ(3)`. The timestamp is evaluated for each row, in both batch and streaming mode. Example: `SELECT CURRENT_ROW_TIMESTAMP();` returns the timestamp of the current datetime.

CURRENT_TIME syntax: `CURRENT_TIME`. Description: The CURRENT_TIME function returns the current SQL time in the local time zone. The CURRENT_TIME function is equivalent to LOCALTIME. Example: `SELECT CURRENT_TIME;` returns the current time, for example, `13:03:56`.

CURRENT_TIMESTAMP syntax: `CURRENT_TIMESTAMP`. Description: The CURRENT_TIMESTAMP function returns the current SQL timestamp in the local time zone. The return type is `TIMESTAMP_LTZ(3)`. In streaming mode, the current timestamp is evaluated for each record. In batch mode, the current timestamp is evaluated once when the query starts, and CURRENT_TIMESTAMP returns the same result for every row. The CURRENT_TIMESTAMP function is equivalent to NOW. Example: `SELECT CURRENT_TIMESTAMP;` returns the current timestamp, for example, `2023-10-16 13:04:58.081`.

CURRENT_WATERMARK: Gets the current watermark for a rowtime column.
Syntax: `CURRENT_WATERMARK(rowtime)`. Description: The CURRENT_WATERMARK function returns the current watermark for the given rowtime attribute, or NULL if no common watermark of all upstream operations is available at the current operation in the pipeline. The return type of the function is inferred to match that of the provided rowtime attribute, but with an adjusted precision of 3. For example, if the rowtime attribute is `TIMESTAMP_LTZ(9)`, the function returns `TIMESTAMP_LTZ(3)`. Because this function can return NULL, queries may need to handle that case explicitly. For more information, see watermarks. Example: The following condition filters out late data by using the CURRENT_WATERMARK function with a rowtime column named `ts`: `WHERE CURRENT_WATERMARK(ts) IS NULL OR ts > CURRENT_WATERMARK(ts)`. Related function: SOURCE_WATERMARK.

DATE_FORMAT: Converts a timestamp to a formatted string. Syntax: `DATE_FORMAT(timestamp, date_format)`. Description: The DATE_FORMAT function converts the specified timestamp to a string value in the format specified by the `date_format` string. The format string is compatible with the Java SimpleDateFormat class. Example: `SELECT DATE_FORMAT('2023-03-15 17:32:01.009', 'K:mm a, z');` returns `5:32 PM, UTC`.

DATE: Parses a DATE from a string. Syntax: `DATE string`. Description: The DATE function returns a SQL date parsed from the specified string. The date format of the input string must be `yyyy-MM-dd`. Example: `SELECT DATE '2023-05-23';` returns `2023-05-23`.

DAYOFMONTH: Gets the day of month from a DATE. Syntax: `DAYOFMONTH(date)`. Description: The DAYOFMONTH function returns the day of a month from the specified SQL DATE as an integer between 1 and 31. The DAYOFMONTH function is equivalent to `EXTRACT(DAY FROM date)`. Example: `SELECT DAYOFMONTH(DATE '1994-09-27');` returns 27.

DAYOFWEEK: Gets the day of week from a DATE. Syntax: `DAYOFWEEK(date)`. Description: The DAYOFWEEK function returns the day of a week from the specified SQL DATE as an integer between 1 and 7.
The DAYOFWEEK function is equivalent to `EXTRACT(DOW FROM date)`. Example: `SELECT DAYOFWEEK(DATE '1994-09-27');` returns 3.

DAYOFYEAR: Gets the day of year from a DATE. Syntax: `DAYOFYEAR(date)`. Description: The DAYOFYEAR function returns the day of a year from the specified SQL DATE as an integer between 1 and 366. The DAYOFYEAR function is equivalent to `EXTRACT(DOY FROM date)`. Example: `SELECT DAYOFYEAR(DATE '1994-09-27');` returns 270.

EXTRACT: Gets a time interval unit from a datetime. Syntax: `EXTRACT(timeintervalunit FROM temporal)`. Description: The EXTRACT function returns a LONG value extracted from the specified `timeintervalunit` part of `temporal`. Example: `SELECT EXTRACT(DAY FROM DATE '2006-06-05');` returns 5. Related functions: DAYOFMONTH, DAYOFWEEK, DAYOFYEAR.

FLOOR: Rounds a time point down. Syntax: `FLOOR(timepoint TO timeintervalunit)`. Description: The FLOOR function returns a value that rounds `timepoint` down to the time unit specified by `timeintervalunit`. Example: `SELECT FLOOR(TIME '12:44:31' TO MINUTE);` returns `12:44:00`. Related function: CEIL.

FROM_UNIXTIME: Gets a Unix time as a formatted string. Syntax: `FROM_UNIXTIME(numeric[, string])`. Description: The FROM_UNIXTIME function returns a representation of the NUMERIC argument as a value in string format. The default format is `yyyy-MM-dd HH:mm:ss`. The specified NUMERIC is an internal timestamp value representing seconds since `1970-01-01 00:00:00` UTC, such as produced by the UNIX_TIMESTAMP function. The return value is expressed in the session time zone (specified in TableConfig). Example: `SELECT FROM_UNIXTIME(44);` returns `1970-01-01 00:00:44` if in the UTC time zone, but returns `1970-01-01 09:00:44` if in the `Asia/Tokyo` time zone.

HOUR: Gets the hour of day from a timestamp. Syntax: `HOUR(timestamp)`. Description: The HOUR function returns the hour of a day from the specified SQL timestamp as an integer between 0 and 23. The HOUR function is equivalent to `EXTRACT(HOUR FROM timestamp)`.
Example: `SELECT HOUR(TIMESTAMP '1994-09-27 13:14:15');` returns 13. Related functions: MINUTE, SECOND.

INTERVAL: Parses an interval string. Syntax: `INTERVAL string range`. Description: The INTERVAL function parses an interval string in the form `dd hh:mm:ss.fff` for SQL intervals of milliseconds, or `yyyy-mm` for SQL intervals of months. For intervals of milliseconds, these interval ranges apply: DAY, MINUTE, DAY TO HOUR, DAY TO SECOND. For intervals of months, these interval ranges apply: YEAR, YEAR TO MONTH. Examples: `SELECT INTERVAL '10 00:00:00.004' DAY TO SECOND;` returns `+10 00:00:00.004`; `SELECT INTERVAL '10' DAY;` returns `+10 00:00:00.000`; `SELECT INTERVAL '2-10' YEAR TO MONTH;` returns `+2-10`.

LOCALTIME: Gets the current local time. Syntax: `LOCALTIME`. Description: The LOCALTIME function returns the current SQL time in the local time zone. The return type is `TIME(0)`. In streaming mode, the current local time is evaluated for each record. In batch mode, the current local time is evaluated once when the query starts, and LOCALTIME returns the same result for every row. Example: `SELECT LOCALTIME;` returns the local machine time as "hh:mm:ss", for example, `13:16:03`.

LOCALTIMESTAMP: Gets the current timestamp. Syntax: `LOCALTIMESTAMP`. Description: The LOCALTIMESTAMP function returns the current SQL timestamp in the local time zone. The return type is `TIMESTAMP(3)`. In streaming mode, the current timestamp is evaluated for each record. In batch mode, the current timestamp is evaluated once when the query starts, and LOCALTIMESTAMP returns the same result for every row. Example: `SELECT LOCALTIMESTAMP;` returns the local machine datetime as "yyyy-mm-dd hh:mm:ss.sss", for example, `2023-10-16 13:15:32.390`.

MINUTE: Gets the minute of hour from a timestamp. Syntax: `MINUTE(timestamp)`. Description: The MINUTE function returns the minute of an hour from the specified SQL timestamp as an integer between 0 and 59.
The MINUTE function is equivalent to `EXTRACT(MINUTE FROM timestamp)`. Example: `SELECT MINUTE(TIMESTAMP '1994-09-27 13:14:15');` returns 14. Related functions: HOUR, SECOND.

MONTH: Gets the month of year from a DATE. Syntax: `MONTH(date)`. Description: The MONTH function returns the month of a year from the specified SQL date as an integer between 1 and 12. The MONTH function is equivalent to `EXTRACT(MONTH FROM date)`. Example: `SELECT MONTH(DATE '1994-09-27');` returns 9. Related functions: DAYOFMONTH, DAYOFYEAR, WEEK, YEAR.

NOW: Gets the current timestamp. Syntax: `NOW()`. Description: The NOW function returns the current SQL timestamp in the local time zone. The NOW function is equivalent to CURRENT_TIMESTAMP. Example: `SELECT NOW();` returns the local machine datetime as "yyyy-mm-dd hh:mm:ss.sss", for example, `2023-10-16 13:17:54.382`.

OVERLAPS: Checks whether two time intervals overlap. Syntax: `(timepoint1, temporal1) OVERLAPS (timepoint2, temporal2)`. Description: The OVERLAPS function returns TRUE if two time intervals defined by `(timepoint1, temporal1)` and `(timepoint2, temporal2)` overlap. The temporal values can be either a time point or a time interval. Examples: `SELECT (TIME '2:55:00', INTERVAL '1' HOUR) OVERLAPS (TIME '3:30:00', INTERVAL '2' HOUR);` returns TRUE; `SELECT (TIME '9:00:00', TIME '10:00:00') OVERLAPS (TIME '10:15:00', INTERVAL '3' HOUR);` returns FALSE.

QUARTER: Gets the quarter of year from a DATE. Syntax: `QUARTER(date)`. Description: The QUARTER function returns the quarter of a year from the specified SQL DATE as an integer between 1 and 4. The QUARTER function is equivalent to `EXTRACT(QUARTER FROM date)`. Example: `SELECT QUARTER(DATE '1994-09-27');` returns 3. Related functions: DAYOFMONTH, DAYOFYEAR, WEEK, YEAR.

SECOND: Gets the second of minute from a TIMESTAMP. Syntax: `SECOND(timestamp)`. Description: The SECOND function returns the second of a minute from the specified SQL TIMESTAMP as an integer between 0 and 59. The SECOND function is equivalent to `EXTRACT(SECOND FROM timestamp)`.
Example: `SELECT SECOND(TIMESTAMP '1994-09-27 13:14:15');` returns 15. Related functions: HOUR, MINUTE.

SOURCE_WATERMARK: Provides a default watermark strategy. Syntax: `WATERMARK FOR column AS SOURCE_WATERMARK()`. Description: The SOURCE_WATERMARK function provides a default watermark strategy. Watermarks are assigned per Kafka partition in the source operator. They are based on a moving histogram of observed out-of-orderness in the table, that is, the difference between the event time of the current event and the maximum event time seen so far. The watermark is then assigned as the maximum event time seen to this point, minus the 95% quantile of observed out-of-orderness. In other words, the default watermark strategy aims to assign watermarks so that at most 5% of messages are "late", meaning they arrive after the watermark. The minimum out-of-orderness is 50 milliseconds. The maximum out-of-orderness is 7 days. The algorithm always considers the out-of-orderness of the last 5000 events per partition. During warmup, before the algorithm has seen 1000 messages (per partition), it applies an additional safety margin to the observed out-of-orderness. The safety margin depends on the number of messages seen so far.

| Number of messages | Safety margin |
|---|---|
| 1 - 250 | 7 days |
| 251 - 500 | 30s |
| 501 - 750 | 10s |
| 751 - 1000 | 1s |

In effect, the algorithm doesn't provide a usable watermark before it has seen 250 records per partition.

Example: The following statement creates a table that has the default watermark strategy on the `ts` column: `CREATE TABLE t2 ( i INT, ts TIMESTAMP_LTZ(3), WATERMARK FOR ts AS SOURCE_WATERMARK());`. The queryable schema for the table has the default watermark strategy on the `ts` column: ``( i INT, ts TIMESTAMP_LTZ(3), `$rowtime` TIMESTAMP_LTZ(3) NOT NULL METADATA VIRTUAL COMMENT 'SYSTEM', WATERMARK FOR ts AS SOURCE_WATERMARK() );``. Related functions: CURRENT_WATERMARK, Watermark clause.

TIME: Parses a string to a TIME.
Syntax: `TIME string`. Description: The TIME function returns a SQL TIME parsed from the specified string. The time format of the input string must be `HH:mm:ss`. Example: `SELECT TIME '23:42:55';` returns 23:42:55 as a TIME.

TIMESTAMP syntax: `TIMESTAMP string`. Description: The TIMESTAMP function returns a SQL TIMESTAMP parsed from the specified string. The timestamp format of the input string must be `yyyy-MM-dd HH:mm:ss[.SSS]`. Example: `SELECT TIMESTAMP '2023-05-04 23:42:55';` returns 2023-05-04 23:42:55 as a TIMESTAMP.

TO_DATE: Converts a date string to a DATE. Syntax: `TO_DATE(string1[, string2])`. Description: The TO_DATE function converts the date string `string1` with format `string2` to a DATE. The default format is `yyyy-MM-dd`. Example: `SELECT TO_DATE('2023-05-04');` returns 2023-05-04 as a DATE.

TO_TIMESTAMP: Converts a date string to a TIMESTAMP. Syntax: `TO_TIMESTAMP(string1[, string2])`. Description: The TO_TIMESTAMP function converts the datetime string `string1` with format `string2` under the 'UTC+0' time zone to a TIMESTAMP. The default format is `yyyy-MM-dd HH:mm:ss`. Example: `SELECT TO_TIMESTAMP('2023-05-04 23:42:55', 'yyyy-MM-dd HH:mm:ss');` returns 2023-05-04 23:42:55.000 as a TIMESTAMP.

TO_TIMESTAMP_LTZ: Converts a Unix time to a TIMESTAMP_LTZ. Syntax: `TO_TIMESTAMP_LTZ(numeric, precision)` or `TO_TIMESTAMP_LTZ(string1[, string2[, string3]])`. Description: The first version of the TO_TIMESTAMP_LTZ function converts Unix epoch seconds or epoch milliseconds to a TIMESTAMP_LTZ. These are the valid precision values: 0, which represents `TO_TIMESTAMP_LTZ(epoch_seconds, 0)`, and 3, which represents `TO_TIMESTAMP_LTZ(epoch_milliseconds, 3)`. If no precision is provided, the default precision is 3. The second version converts a timestamp string `string1` with format `string2` (by default `yyyy-MM-dd HH:mm:ss.SSS`) in time zone `string3` (by default `UTC`) to a TIMESTAMP_LTZ. If any input is NULL, the function returns NULL.
Examples: `SELECT TO_TIMESTAMP_LTZ(1000, 0);` converts 1000 epoch seconds and returns 1970-01-01 00:16:40.000 as a TIMESTAMP_LTZ; `SELECT TO_TIMESTAMP_LTZ(1000, 3);` converts 1000 epoch milliseconds and returns 1970-01-01 00:00:01.000 as a TIMESTAMP_LTZ; `SELECT TO_TIMESTAMP_LTZ('2023-05-04 12:00:00', 'yyyy-MM-dd HH:mm:ss', 'America/Los_Angeles');` converts a timestamp string with a custom format and time zone and returns the appropriate TIMESTAMP_LTZ based on the time zone.

TIMESTAMPADD: Adds a time interval to a datetime. Syntax: `TIMESTAMPADD(timeintervalunit, interval, timepoint)`. Description: Returns the sum of `timepoint` and the `interval` number of time units specified by `timeintervalunit`. The unit for the interval is given by the first argument, which must be one of the following values: DAY, HOUR, MINUTE, MONTH, SECOND, YEAR. Examples: `SELECT TIMESTAMPADD(DAY, 1, DATE '1999-12-31');` returns 2000-01-01; `SELECT TIMESTAMPADD(HOUR, 2, TIMESTAMP '1999-12-31 23:00:00');` returns 2000-01-01 01:00:00.

TIMESTAMPDIFF: Computes the interval between two datetimes. Syntax: `TIMESTAMPDIFF(timepointunit, timepoint1, timepoint2)`. Description: The TIMESTAMPDIFF function returns the (signed) number of `timepointunit` between `timepoint1` and `timepoint2`. The unit for the interval is given by the first argument, which must be one of the following values: DAY, HOUR, MINUTE, MONTH, SECOND, YEAR. Examples: `SELECT TIMESTAMPDIFF(DAY, DATE '2000-01-01', DATE '1999-12-31');` returns -1; `SELECT TIMESTAMPDIFF(HOUR, TIMESTAMP '2000-01-01 01:00:00', TIMESTAMP '1999-12-31 23:00:00');` returns -2.

UNIX_TIMESTAMP (no arguments): Gets the current Unix timestamp in seconds. Syntax: `UNIX_TIMESTAMP()`. Description: The UNIX_TIMESTAMP function is not deterministic, which means the value is recalculated for each row. Example: `SELECT UNIX_TIMESTAMP();` returns Epoch seconds, for example, 1697487923.

UNIX_TIMESTAMP (string arguments): Converts a datetime string to a Unix timestamp.
Syntax: `UNIX_TIMESTAMP(string1[, string2])`. Description: The UNIX_TIMESTAMP(string) function converts the specified datetime string `string1` in format `string2` to a Unix timestamp (in seconds), using the time zone specified in table config. The default format is `yyyy-MM-dd HH:mm:ss`. If a time zone is specified in the datetime string and parsed by the UTC+X format, like `yyyy-MM-dd HH:mm:ss.SSS X`, this function uses the time zone specified in the datetime string instead of the time zone in the table configuration. If the datetime string can't be parsed, the default value of `Long.MIN_VALUE` (-9223372036854775808) is returned. Examples: `SELECT UNIX_TIMESTAMP('2023-05-04 12:00:00');` returns 1683201600 when the configured time zone is UTC. `SELECT UNIX_TIMESTAMP('1970-01-01 08:00:01.001', 'yyyy-MM-dd HH:mm:ss.SSS');` returns 25201, which corresponds to a configured time zone of UTC+1 (in UTC, the result would be 28801). `SELECT UNIX_TIMESTAMP('1970-01-01 08:00:01.001 +0800', 'yyyy-MM-dd HH:mm:ss.SSS X');` returns 1, because the time zone in the string is used. `SELECT UNIX_TIMESTAMP('1970-01-01 08:00:01.001 +0800', 'yyyy-MM-dd HH:mm:ss.SSS');` returns 25201, because the format has no time zone part and the configured time zone applies. `SELECT UNIX_TIMESTAMP('1970-01-01 08:00:01.001', 'yyyy-MM-dd HH:mm:ss.SSS X');` returns -9223372036854775808, because the string can't be parsed with this format.

WEEK: Gets the week of year from a DATE. Syntax: `WEEK(date)`. Description: The WEEK function returns the week of a year from the specified SQL DATE as an integer between 1 and 53. The WEEK function is equivalent to `EXTRACT(WEEK FROM date)`. Example: `SELECT WEEK(DATE '1994-09-27');` returns 39. Related functions: DAYOFMONTH, DAYOFYEAR, QUARTER, YEAR.

YEAR: Gets the year from a DATE. Syntax: `YEAR(date)`. Description: The YEAR function returns the year from the specified SQL DATE. The YEAR function is equivalent to `EXTRACT(YEAR FROM date)`.
Example: `SELECT YEAR(DATE '1994-09-27');` returns 1994. Related functions: DAYOFMONTH, DAYOFYEAR, QUARTER, MONTH.

#### Code Examples

```sql
CEIL(timepoint TO timeintervalunit)
```

```sql
-- returns "12:45:00"
SELECT CEIL(TIME '12:44:31' TO MINUTE);
```

```sql
CONVERT_TZ(string1, string2, string3)
```

```sql
-- returns "1969-12-31 16:00:00"
SELECT CONVERT_TZ('1970-01-01 00:00:00', 'UTC', 'America/Los_Angeles');
```

```sql
CURRENT_DATE
```

```sql
-- returns the current date
SELECT CURRENT_DATE;
```

```sql
CURRENT_ROW_TIMESTAMP()
```

```sql
-- returns the timestamp of the current datetime
SELECT CURRENT_ROW_TIMESTAMP();
```

```sql
CURRENT_TIME
```

```sql
-- returns the current time, for example:
-- 13:03:56
SELECT CURRENT_TIME;
```

```sql
CURRENT_TIMESTAMP
```

```sql
-- returns the current timestamp, for example:
-- 2023-10-16 13:04:58.081
SELECT CURRENT_TIMESTAMP;
```

```sql
CURRENT_WATERMARK(rowtime)
```

```sql
-- Keep only rows that are not late. `my_table` is a placeholder name.
SELECT *
FROM my_table
WHERE
  CURRENT_WATERMARK(ts) IS NULL
  OR ts > CURRENT_WATERMARK(ts);
```

```sql
DATE_FORMAT(timestamp, date_format)
```

```sql
-- returns "5:32 PM, UTC"
SELECT DATE_FORMAT('2023-03-15 17:32:01.009', 'K:mm a, z');
```
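
Since the format string follows Java SimpleDateFormat pattern letters, other common patterns work the same way. A short sketch with literal input, assuming DATE_FORMAT accepts a TIMESTAMP literal as its first argument:

```sql
-- Date part only: yyyy = year, MM = month, dd = day of month.
SELECT DATE_FORMAT(TIMESTAMP '2023-03-15 17:32:01', 'yyyy-MM-dd');  -- 2023-03-15

-- 24-hour time: HH = hour of day (0-23), mm = minute.
SELECT DATE_FORMAT(TIMESTAMP '2023-03-15 17:32:01', 'HH:mm');       -- 17:32
```

Note the case sensitivity of the pattern letters: `MM` is the month, while `mm` is the minute.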

```sql
DATE string
```

```sql
-- returns "2023-05-23"
SELECT DATE '2023-05-23';
```

```sql
DAYOFMONTH(date)
```

```sql
EXTRACT(DAY FROM date)
```

```sql
-- returns 27
SELECT DAYOFMONTH(DATE '1994-09-27');
```

```sql
DAYOFWEEK(date)
```

```sql
EXTRACT(DOW FROM date)
```

```sql
-- returns 3
SELECT DAYOFWEEK(DATE '1994-09-27');
```

```sql
DAYOFYEAR(date)
```

```sql
EXTRACT(DOY FROM date)
```

```sql
-- returns 270
SELECT DAYOFYEAR(DATE '1994-09-27');
```

```sql
EXTRACT(timeintervalunit FROM temporal)
```

```sql
-- returns 5
SELECT EXTRACT(DAY FROM DATE '2006-06-05');
```
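
EXTRACT accepts other units as well. The following sketch reuses the literals from this page and shows the equivalences that the MONTH, YEAR, and HOUR descriptions state.

```sql
-- Same result as YEAR(DATE '2006-06-05').
SELECT EXTRACT(YEAR FROM DATE '2006-06-05');                 -- 2006

-- Same result as MONTH(DATE '2006-06-05').
SELECT EXTRACT(MONTH FROM DATE '2006-06-05');                -- 6

-- Same result as HOUR(TIMESTAMP '1994-09-27 13:14:15').
SELECT EXTRACT(HOUR FROM TIMESTAMP '1994-09-27 13:14:15');   -- 13
```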

```sql
FLOOR(timepoint TO timeintervalunit)
```

```sql
-- returns 12:44:00
SELECT FLOOR(TIME '12:44:31' TO MINUTE);
```
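
FLOOR is often used to bucket timestamps by hour or day. This is a sketch under the assumption that FLOOR accepts TIMESTAMP values with the HOUR and DAY units in the same way as the TIME example above.

```sql
-- Truncates to the start of the hour: 23:00:00 on 2023-05-04.
SELECT FLOOR(TIMESTAMP '2023-05-04 23:42:55' TO HOUR);

-- Truncates to the start of the day: 00:00:00 on 2023-05-04.
SELECT FLOOR(TIMESTAMP '2023-05-04 23:42:55' TO DAY);
```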

```sql
FROM_UNIXTIME(numeric[, string])
```

```sql
-- Returns "1970-01-01 00:00:44" if in the UTC time zone,
-- but returns "1970-01-01 09:00:44" if in the 'Asia/Tokyo' time zone.
SELECT FROM_UNIXTIME(44);
```

```sql
HOUR(timestamp)
```

```sql
EXTRACT(HOUR FROM timestamp)
```

```sql
-- returns 13
SELECT HOUR(TIMESTAMP '1994-09-27 13:14:15');
```

```sql
INTERVAL string range
```

```sql
-- returns +10 00:00:00.004
SELECT INTERVAL '10 00:00:00.004' DAY TO SECOND;

-- returns +10 00:00:00.000
SELECT INTERVAL '10' DAY;

-- returns +2-10
SELECT INTERVAL '2-10' YEAR TO MONTH;
```
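
Interval values are mostly useful in arithmetic with time points. A minimal sketch with literals, assuming the standard SQL `+` and `-` operators on TIMESTAMP and INTERVAL values:

```sql
-- Add one day to a timestamp: 2023-05-05 23:42:55.
SELECT TIMESTAMP '2023-05-04 23:42:55' + INTERVAL '1' DAY;

-- Subtract 30 minutes from a timestamp: 2023-05-04 23:12:55.
SELECT TIMESTAMP '2023-05-04 23:42:55' - INTERVAL '30' MINUTE;
```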

```sql
-- returns the local machine time as "hh:mm:ss", for example:
-- 13:16:03
SELECT LOCALTIME;
```

```sql
LOCALTIMESTAMP
```

```sql
-- returns the local machine datetime as "yyyy-mm-dd hh:mm:ss.sss", for example:
-- 2023-10-16 13:15:32.390
SELECT LOCALTIMESTAMP;
```

```sql
MINUTE(timestamp)
```

```sql
EXTRACT(MINUTE FROM timestamp)
```

```sql
-- returns 14
SELECT MINUTE(TIMESTAMP '1994-09-27 13:14:15');
```

```sql
MONTH(date)
```

```sql
EXTRACT(MONTH FROM date)
```

```sql
-- returns 9
SELECT MONTH(DATE '1994-09-27');
```

```sql
-- returns the local machine datetime as "yyyy-mm-dd hh:mm:ss.sss", for example:
-- 2023-10-16 13:17:54.382
SELECT NOW();
```

```sql
(timepoint1, temporal1) OVERLAPS (timepoint2, temporal2)
```

```sql
-- returns TRUE
SELECT (TIME '2:55:00', INTERVAL '1' HOUR) OVERLAPS (TIME '3:30:00', INTERVAL '2' HOUR);

-- returns FALSE
SELECT (TIME '9:00:00', TIME '10:00:00') OVERLAPS (TIME '10:15:00', INTERVAL '3' HOUR);
```
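
The second element of each pair can be an end point or a duration, and both forms describe the same interval. The sketch below expresses one pair of overlapping slots both ways; the times are arbitrary illustrative values.

```sql
-- End-point form: 09:00-10:30 and 10:00-11:00 share 10:00-10:30.
SELECT (TIME '09:00:00', TIME '10:30:00') OVERLAPS (TIME '10:00:00', TIME '11:00:00');  -- TRUE

-- Duration form of the same two intervals.
SELECT (TIME '09:00:00', INTERVAL '90' MINUTE) OVERLAPS (TIME '10:00:00', INTERVAL '1' HOUR);  -- TRUE
```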

```sql
QUARTER(date)
```

```sql
EXTRACT(QUARTER FROM date)
```

```sql
-- returns 3
SELECT QUARTER(DATE '1994-09-27');
```

```sql
SECOND(timestamp)
```

```sql
EXTRACT(SECOND FROM timestamp)
```

```sql
-- returns 15
SELECT SECOND(TIMESTAMP '1994-09-27 13:14:15');
```

```sql
WATERMARK FOR column AS SOURCE_WATERMARK()
```

```sql
-- Create a table that has the default watermark strategy
-- on the ts column.
CREATE TABLE t2 (
  i INT,
  ts TIMESTAMP_LTZ(3),
  WATERMARK FOR ts AS SOURCE_WATERMARK()
);

-- The queryable schema for the table has the default watermark
-- strategy on the ts column (shown as a comment, not a statement):
-- (
--   i INT,
--   ts TIMESTAMP_LTZ(3),
--   `$rowtime` TIMESTAMP_LTZ(3) NOT NULL METADATA VIRTUAL COMMENT 'SYSTEM',
--   WATERMARK FOR ts AS SOURCE_WATERMARK()
-- );
```
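
The core of the default strategy is a subtraction: maximum event time seen so far minus the 95% quantile of observed out-of-orderness. The following worked example only illustrates that arithmetic with assumed numbers (a maximum event time of 12:00:10 and a quantile of 2 seconds); it is not the internal implementation, and it ignores the warmup safety margin.

```sql
-- Assumed inputs for illustration:
--   maximum event time seen so far: 2023-10-16 12:00:10
--   95% quantile of out-of-orderness: 2 seconds
-- Watermark = 12:00:10 minus 2 seconds = 2023-10-16 12:00:08.
SELECT TIMESTAMPADD(SECOND, -2, TIMESTAMP '2023-10-16 12:00:10');
```

With these numbers, an event that arrives later with an event time of 12:00:07 would count as late, because it falls behind the watermark.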

```sql
TIME string
```

```sql
-- returns 23:42:55 as a TIME
SELECT TIME '23:42:55';
```

```sql
TIMESTAMP string
```

```sql
-- returns 2023-05-04 23:42:55 as a TIMESTAMP
SELECT TIMESTAMP '2023-05-04 23:42:55';
```

```sql
TO_DATE(string1[, string2])
```

```sql
-- returns 2023-05-04 as a DATE
SELECT TO_DATE('2023-05-04');
```

```sql
TO_TIMESTAMP(string1[, string2])
```

```sql
-- returns 2023-05-04 23:42:55.000 as a TIMESTAMP
SELECT TO_TIMESTAMP('2023-05-04 23:42:55', 'yyyy-MM-dd HH:mm:ss');
```

```sql
TO_TIMESTAMP_LTZ(numeric, precision)
TO_TIMESTAMP_LTZ(string1[, string2[, string3]])
```

```sql
-- convert 1000 epoch seconds
-- returns 1970-01-01 00:16:40.000 as a TIMESTAMP_LTZ
SELECT TO_TIMESTAMP_LTZ(1000, 0);

-- convert 1000 epoch milliseconds
-- returns 1970-01-01 00:00:01.000 as a TIMESTAMP_LTZ
SELECT TO_TIMESTAMP_LTZ(1000, 3);

-- convert timestamp string with custom format and timezone
-- returns appropriate TIMESTAMP_LTZ based on the timezone
SELECT TO_TIMESTAMP_LTZ('2023-05-04 12:00:00', 'yyyy-MM-dd HH:mm:ss', 'America/Los_Angeles');
```
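
In practice, the numeric form is usually applied to a column that stores epoch values. A sketch under stated assumptions: the table `events` and its columns `event_id` and `created_ms` are hypothetical.

```sql
-- Hypothetical table: events(event_id STRING, created_ms BIGINT),
-- where created_ms holds epoch milliseconds.
-- Precision 3 tells the function to interpret the value as milliseconds.
SELECT
  event_id,
  TO_TIMESTAMP_LTZ(created_ms, 3) AS created_at
FROM events;
```

If the column held epoch seconds instead, the precision argument would be 0.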

```sql
TIMESTAMPADD(timeintervalunit, interval, timepoint)
```

```sql
-- returns 2000-01-01
SELECT TIMESTAMPADD(DAY, 1, DATE '1999-12-31');

-- returns 2000-01-01 01:00:00
SELECT TIMESTAMPADD(HOUR, 2, TIMESTAMP '1999-12-31 23:00:00');
```

```sql
TIMESTAMPDIFF(timepointunit, timepoint1, timepoint2)
```

```sql
-- returns -1
SELECT TIMESTAMPDIFF(DAY, DATE '2000-01-01', DATE '1999-12-31');

-- returns -2
SELECT TIMESTAMPDIFF(HOUR, TIMESTAMP '2000-01-01 01:00:00', TIMESTAMP '1999-12-31 23:00:00');
```
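
TIMESTAMPDIFF and TIMESTAMPADD are inverses for the same unit. The following sketch uses MINUTE, one of the listed units, with literal timestamps.

```sql
-- 12:00:00 to 13:30:00 is 90 minutes.
SELECT TIMESTAMPDIFF(MINUTE, TIMESTAMP '2023-05-04 12:00:00', TIMESTAMP '2023-05-04 13:30:00');  -- 90

-- Adding those 90 minutes back to the start gives 2023-05-04 13:30:00.
SELECT TIMESTAMPADD(MINUTE, 90, TIMESTAMP '2023-05-04 12:00:00');
```

The sign of TIMESTAMPDIFF follows argument order: the result is positive when `timepoint2` is later than `timepoint1`.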

```sql
UNIX_TIMESTAMP()
```

```sql
-- returns Epoch seconds, for example:
-- 1697487923
SELECT UNIX_TIMESTAMP();
```

```sql
UNIX_TIMESTAMP(string1[, string2])
```

```sql
-- Returns 1683201600 when the configured time zone is UTC
SELECT UNIX_TIMESTAMP('2023-05-04 12:00:00');

-- Returns 25201 when the configured time zone is UTC+1 (28801 in UTC)
SELECT UNIX_TIMESTAMP('1970-01-01 08:00:01.001', 'yyyy-MM-dd HH:mm:ss.SSS');

-- Returns 1: the time zone in the string (+0800) is used
SELECT UNIX_TIMESTAMP('1970-01-01 08:00:01.001 +0800', 'yyyy-MM-dd HH:mm:ss.SSS X');

-- Returns 25201 under UTC+1: the format has no time zone part
SELECT UNIX_TIMESTAMP('1970-01-01 08:00:01.001 +0800', 'yyyy-MM-dd HH:mm:ss.SSS');

-- Returns -9223372036854775808: the string can't be parsed with this format
SELECT UNIX_TIMESTAMP('1970-01-01 08:00:01.001', 'yyyy-MM-dd HH:mm:ss.SSS X');
```

```sql
EXTRACT(WEEK FROM date)
```

```sql
-- returns 39
SELECT WEEK(DATE '1994-09-27');
```

```sql
EXTRACT(YEAR FROM date)
```

```sql
-- returns 1994
SELECT YEAR(DATE '1994-09-27');
```

---

### SQL hash functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/hash-functions.html

Confluent Cloud for Apache Flink® provides these built-in functions to generate hash codes in SQL queries: MD5, SHA1, SHA2, SHA224, SHA256, SHA384, SHA512.

MD5: Gets the MD5 hash of a string. Syntax: `MD5(string)`. Description: The MD5 function returns the MD5 hash of the specified string as a string of 32 hexadecimal digits. Returns NULL if `string` is NULL. Example: `SELECT MD5('string-to-hash');` returns `99dc0ea422440e5b3f675cffe6d...`.

SHA1: Gets the SHA-1 hash of a string. Syntax: `SHA1(string)`. Description: The SHA1 function returns the SHA-1 hash of the specified string as a string of 40 hexadecimal digits. Returns NULL if `string` is NULL. Example: `SELECT SHA1('string-to-hash');` returns `771a2b04044f8c51e3383a2675a...`.

SHA2: Hashes a string with one of the SHA-2 functions. Syntax: `SHA2(string, hashLength)`. Description: The SHA2 function returns the hash using the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512). The first argument, `string`, is the string to be hashed. The second argument, `hashLength`, is the bit length of the result. These are the valid bit lengths for `hashLength`: 224, 256, 384, 512. Returns NULL if `string` or `hashLength` is NULL. Example: `SELECT SHA2('string-to-hash', 512);` returns `222145560dbaa2abc1617e2c7ce...`.

SHA224: Gets the SHA-224 hash of a string. Syntax: `SHA224(string)`. Description: The SHA224 function returns the SHA-224 hash of the specified string as a string of 56 hexadecimal digits. Returns NULL if `string` is NULL. Example: `SELECT SHA224('string-to-hash');` returns `af1f1c988d9154f2ddb6201f60f...`.

SHA256: Gets the SHA-256 hash of a string. Syntax: `SHA256(string)`. Description: The SHA256 function returns the SHA-256 hash of the specified string as a string of 64 hexadecimal digits. Returns NULL if `string` is NULL. Example: `SELECT SHA256('string-to-hash');` returns `2267d414e45335fd02e64057d55...`.

SHA384: Gets the SHA-384 hash of a string.
Syntax: `SHA384(string)`. Description: The SHA384 function returns the SHA-384 hash of the specified string as a string of 96 hexadecimal digits. Returns NULL if `string` is NULL. Example: `SELECT SHA384('string-to-hash');` returns `02ba979b23f1b4a098732463ea8...`.

SHA512: Gets the SHA-512 hash of a string. Syntax: `SHA512(string)`. Description: The SHA512 function returns the SHA-512 hash of the specified string as a string of 128 hexadecimal digits. Returns NULL if `string` is NULL. Example: `SELECT SHA512('string-to-hash');` returns `222145560dbaa2abc1617e2c7ce...`.

#### Code Examples

```sql
MD5(string)
```

```sql
-- returns 99dc0ea422440e5b3f675cffe6d...
SELECT MD5('string-to-hash');
```

```sql
SHA1(string)
```

```sql
-- returns 771a2b04044f8c51e3383a2675a...
SELECT SHA1('string-to-hash');
```

```sql
SHA2(string, hashLength)
```

```sql
-- returns 222145560dbaa2abc1617e2c7ce...
SELECT SHA2('string-to-hash', 512);
```

```sql
SHA224(string)
```

```sql
-- returns af1f1c988d9154f2ddb6201f60f...
SELECT SHA224('string-to-hash');
```

```sql
SHA256(string)
```

```sql
-- returns 2267d414e45335fd02e64057d55...
SELECT SHA256('string-to-hash');
```

```sql
SHA384(string)
```

```sql
-- returns 02ba979b23f1b4a098732463ea8...
SELECT SHA384('string-to-hash');
```

```sql
SHA512(string)
```

```sql
-- returns 222145560dbaa2abc1617e2c7ce...
SELECT SHA512('string-to-hash');
```
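
As a sketch of how these functions might be combined in practice, the following query pseudonymizes a column and checks two properties stated in the descriptions above: SHA2 with a hashLength of 256 should produce the same digest as SHA256, and an MD5 digest should be 32 hexadecimal digits long. The `customers` table and its `customer_id` and `email` columns are hypothetical and not part of the original documentation.

```sql
-- Assumption: a hypothetical table `customers` with STRING columns
-- `customer_id` and `email`. Adjust the names to your own schema.
SELECT
  customer_id,
  -- 64 hexadecimal digits; NULL if email is NULL
  SHA256(email) AS email_sha256,
  -- SHA2 with hashLength 256 selects SHA-256, so this is expected
  -- to be TRUE for every non-NULL email
  SHA2(email, 256) = SHA256(email) AS sha2_matches_sha256,
  -- expected to be 32 for every non-NULL email
  CHAR_LENGTH(MD5(email)) AS md5_hex_length
FROM customers;
```

Because every function returns NULL for NULL input, all three derived columns are NULL for rows where `email` is NULL.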

---

### SQL JSON functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/json-functions.html

Confluent Cloud for Apache Flink® provides these built-in functions to help with JSON in SQL queries:

- IS JSON
- JSON_ARRAY
- JSON_ARRAYAGG
- JSON_EXISTS
- JSON_OBJECT
- JSON_OBJECTAGG
- JSON_QUERY
- JSON_QUOTE
- JSON_STRING
- JSON_UNQUOTE
- JSON_VALUE

JSON functions make use of JSON path expressions as described in ISO/IEC TR 19075-6 of the SQL standard. Their syntax is inspired by and adopts many features of ECMAScript, but is neither a subset nor superset of the standard.

Path expressions come in two flavors, lax and strict. When the mode is omitted, it defaults to strict mode. Strict mode is intended to examine data from a schema perspective and throws errors whenever data does not adhere to the path expression. However, functions like JSON_VALUE allow defining fallback behavior if an error is encountered. Lax mode, on the other hand, is more forgiving and converts errors to empty sequences.

The special character `$` denotes the root node in a JSON path. Paths can access properties (`$.a`), array elements (`$.a[0].b`), or branch over all elements in an array (`$.a[*].b`).

Known limitations: Not all features of lax mode are currently supported. This is an upstream bug (CALCITE-4717). Non-standard behavior is not guaranteed.

#### IS JSON

Checks whether a string is valid JSON.

Syntax: `IS JSON [ { VALUE | SCALAR | ARRAY | OBJECT } ]`

Description: The IS JSON function determines whether the specified string is valid JSON. Providing the optional type argument constrains the type of JSON object to check for validity. The default is VALUE. If the string is valid JSON but not the provided type, IS JSON returns FALSE.

Examples: The following SELECT statements return TRUE.

- `SELECT '1' IS JSON;`
- `SELECT '[]' IS JSON;`
- `SELECT '{}' IS JSON;`
- `SELECT '"abc"' IS JSON;`
- `SELECT '1' IS JSON SCALAR;`
- `SELECT '{}' IS JSON OBJECT;`

The following SELECT statements return FALSE.

- `SELECT 'abc' IS JSON;`
- `SELECT '1' IS JSON ARRAY;`
- `SELECT '1' IS JSON OBJECT;`
- `SELECT '{}' IS JSON SCALAR;`
- `SELECT '{}' IS JSON ARRAY;`

#### JSON_ARRAY

Creates a JSON array string from a list of values.

Syntax: `JSON_ARRAY([value]* [ { NULL | ABSENT } ON NULL ])`

Description: The JSON_ARRAY function returns a JSON string from the specified list of values. The values can be arbitrary expressions. The ON NULL behavior defines how to handle NULL values. If omitted, ABSENT ON NULL is the default. Elements that are created from other JSON construction function calls are inserted directly, rather than as a string. This enables building nested JSON structures by using the JSON_OBJECT and JSON_ARRAY construction functions.

Examples:

- `SELECT JSON_ARRAY();` returns `'[]'`
- `SELECT JSON_ARRAY(1, '2');` returns `'[1,"2"]'`
- `SELECT JSON_ARRAY(orders.orderId);` uses an expression as a value.
- `SELECT JSON_ARRAY(CAST(NULL AS STRING) NULL ON NULL);` returns `'[null]'`
- `SELECT JSON_ARRAY(CAST(NULL AS STRING) ABSENT ON NULL);` returns `'[]'`
- `SELECT JSON_ARRAY(JSON_ARRAY(1));` returns `'[[1]]'`
- `SELECT JSON_ARRAY(JSON('{"nested_json": {"value": 42}}'));` returns `'[{"nested_json":{"value":42}}]'`

#### JSON_ARRAYAGG

Aggregates items into a JSON array string.

Syntax: `JSON_ARRAYAGG(items [ { NULL | ABSENT } ON NULL ])`

Description: The JSON_ARRAYAGG function creates a JSON array string by aggregating the specified items into an array. The item expressions can be arbitrary, including other JSON functions. If a value is NULL, the ON NULL behavior defines what to do. If omitted, ABSENT ON NULL is the default. The JSON_ARRAYAGG function isn’t supported in OVER windows, unbounded session windows, or HOP windows.

Example: `SELECT JSON_ARRAYAGG(product) FROM orders;` returns `'["Apple","Banana","Orange"]'`

#### JSON_EXISTS

Checks a JSON path.

Syntax: `JSON_EXISTS(jsonValue, path [ { TRUE | FALSE | UNKNOWN | ERROR } ON ERROR ])`

Description: The JSON_EXISTS function determines whether a JSON string satisfies a specified path search criterion. If the ON ERROR behavior is omitted, the default is FALSE ON ERROR.

Examples: The following SELECT statements return TRUE.

- `SELECT JSON_EXISTS('{"a": true}', '$.a');`
- `SELECT JSON_EXISTS('{"a": [{ "b": 1 }]}', '$.a[0].b');`
- `SELECT JSON_EXISTS('{"a": true}', 'strict $.b' TRUE ON ERROR);`

The following SELECT statements return FALSE.

- `SELECT JSON_EXISTS('{"a": true}', '$.b');`
- `SELECT JSON_EXISTS('{"a": true}', 'strict $.b' FALSE ON ERROR);`

#### JSON_OBJECT

Syntax: `JSON_OBJECT([[KEY] key VALUE value]* [ { NULL | ABSENT } ON NULL ])`

Description: The JSON_OBJECT function creates a JSON object string from the specified list of key-value pairs. Keys must be non-NULL string literals, and values may be arbitrary expressions. The JSON_OBJECT function returns a JSON string. The ON NULL behavior defines how to treat NULL values. If omitted, NULL ON NULL is the default. Values that are created from other JSON construction function calls are inserted directly, rather than as a string. This enables building nested JSON structures by using the JSON_OBJECT and JSON_ARRAY construction functions.

Examples:

- `SELECT JSON_OBJECT();` returns `'{}'`
- `SELECT JSON_OBJECT('K1' VALUE 'V1', 'K2' VALUE 'V2');` returns `'{"K1":"V1","K2":"V2"}'`
- `SELECT JSON_OBJECT('orderNo' VALUE orders.orderId);` uses an expression as a value.
- `SELECT JSON_OBJECT(KEY 'K1' VALUE CAST(NULL AS STRING) NULL ON NULL);` returns `'{"K1":null}'`
- `SELECT JSON_OBJECT(KEY 'K1' VALUE CAST(NULL AS STRING) ABSENT ON NULL);` returns `'{}'`
- `SELECT JSON_OBJECT('K1' VALUE JSON('{"nested_json": {"value": 42}}'));` returns `'{"K1":{"nested_json":{"value":42}}}'`
- `SELECT JSON_OBJECT(KEY 'K1' VALUE JSON_OBJECT(KEY 'K2' VALUE 'V'));` returns `'{"K1":{"K2":"V"}}'`

#### JSON_OBJECTAGG

Aggregates key-value expressions into a JSON string.

Syntax: `JSON_OBJECTAGG([KEY] key VALUE value [ { NULL | ABSENT } ON NULL ])`

Description: The JSON_OBJECTAGG function creates a JSON object string by aggregating key-value expressions into a single JSON object. The key expression must return a non-nullable character string. Value expressions can be arbitrary, including other JSON functions. Keys must be unique. If a key occurs multiple times, an error is thrown. If a value is NULL, the ON NULL behavior defines what to do. If omitted, NULL ON NULL is the default. The JSON_OBJECTAGG function isn’t supported in OVER windows.

#### JSON_QUERY

Gets values from a JSON string.

Syntax: `JSON_QUERY(jsonValue, path [ RETURNING <dataType> ] [ { WITHOUT | WITH CONDITIONAL | WITH UNCONDITIONAL } [ ARRAY ] WRAPPER ] [ { NULL | EMPTY ARRAY | EMPTY OBJECT | ERROR } ON EMPTY ] [ { NULL | EMPTY ARRAY | EMPTY OBJECT | ERROR } ON ERROR ])`

Description: The JSON_QUERY function extracts JSON values from the specified JSON string. The result is returned as a STRING or an ARRAY<STRING>. Use the RETURNING clause to control the return type. The WRAPPER clause specifies whether the extracted value should be wrapped into an array and whether to do so unconditionally or only if the value itself isn’t an array already. The ON EMPTY and ON ERROR clauses specify the behavior if the path expression is empty, or in case an error was raised, respectively. By default, in both cases NULL is returned. Other choices are to use an empty array, an empty object, or to raise an error.

Examples:

- `SELECT JSON_QUERY('{ "a": { "b": 1 } }', '$.a');` returns `'{ "b": 1 }'`
- `SELECT JSON_QUERY('[1, 2]', '$');` returns `'[1, 2]'`
- `SELECT JSON_QUERY(CAST(NULL AS STRING), '$');` returns NULL
- `SELECT JSON_QUERY('{"a":[{"c":"c1"},{"c":"c2"}]}', 'lax $.a[*].c' RETURNING ARRAY<STRING>);` returns the array `['c1','c2']`

Wrap the result into an array:

- `SELECT JSON_QUERY('{}', '$' WITH CONDITIONAL ARRAY WRAPPER);` returns `'[{}]'`
- `SELECT JSON_QUERY('[1, 2]', '$' WITH CONDITIONAL ARRAY WRAPPER);` returns `'[1, 2]'`
- `SELECT JSON_QUERY('[1, 2]', '$' WITH UNCONDITIONAL ARRAY WRAPPER);` returns `'[[1, 2]]'`

Scalars must be wrapped to be returned:

- `SELECT JSON_QUERY(1, '$');` returns NULL
- `SELECT JSON_QUERY(1, '$' WITH CONDITIONAL ARRAY WRAPPER);` returns `'[1]'`

Behavior if the path expression is empty or has an error:

- `SELECT JSON_QUERY('{}', 'lax $.invalid' EMPTY OBJECT ON EMPTY);` returns `'{}'`
- `SELECT JSON_QUERY('{}', 'strict $.invalid' EMPTY ARRAY ON ERROR);` returns `'[]'`

#### JSON_QUOTE

Quotes a string as a JSON value by wrapping it with double-quote characters.

Syntax: `JSON_QUOTE(string)`

Description: The JSON_QUOTE function quotes a string as a JSON value by wrapping it with double-quote characters, escaping interior quote and special characters (`"`, `\`, `/`, `b`, `f`, `n`, `r`, `t`), and returning the result as a string. If string is NULL, the function returns NULL.

Example: `SELECT JSON_QUOTE('SQL string');` returns `"SQL string"`

#### JSON_STRING

Serializes a value to a JSON string.

Syntax: `JSON_STRING(value)`

Description: The JSON_STRING function returns a JSON string containing the serialized value. If the value is NULL, the function returns NULL.

Examples:

- `SELECT JSON_STRING(CAST(NULL AS INT));` returns NULL
- `SELECT JSON_STRING(1);` returns `'1'`
- `SELECT JSON_STRING(TRUE);` returns `'true'`
- `SELECT JSON_STRING('Hello, World!');` returns `'"Hello, World!"'`
- `SELECT JSON_STRING(ARRAY[1, 2]);` returns `'[1,2]'`

#### JSON_UNQUOTE

Unquotes a JSON value.

Syntax: `JSON_UNQUOTE(string)`

Description: The JSON_UNQUOTE function unquotes a JSON value, unescapes escaped special characters (`"`, `\`, `/`, `b`, `f`, `n`, `r`, `t`, `u`), and returns the result as a string. If string is NULL, the function returns NULL. If string doesn’t start and end with double quotes, or if it starts and ends with double quotes but is not a valid JSON string literal, the value is passed through unmodified.

Example: `SELECT JSON_UNQUOTE('SQL string');` returns `SQL string`, because the input isn’t wrapped in double quotes and is passed through unmodified.

#### JSON_VALUE

Gets a value from a JSON string.

Syntax: `JSON_VALUE(jsonValue, path [RETURNING <dataType>] [ { NULL | ERROR | DEFAULT <defaultExpr> } ON EMPTY ] [ { NULL | ERROR | DEFAULT <defaultExpr> } ON ERROR ])`

Description: The JSON_VALUE function extracts a scalar value from a JSON string. It searches a JSON string with the specified path expression and returns the value if the value at that path is scalar. Non-scalar values can’t be returned. By default, the value is returned as STRING. Use RETURNING to specify a different return type. The following return types are supported:

- BOOLEAN
- DOUBLE
- INTEGER
- VARCHAR / STRING

For empty path expressions or errors, you can define a behavior to return NULL, raise an error, or return a defined default value instead. The default is NULL ON EMPTY or NULL ON ERROR, respectively. The default value may be a literal or an expression. If the default value itself raises an error, it falls through to the error behavior for ON EMPTY and raises an error for ON ERROR.

For paths that contain special characters, like spaces, you can use `['property']` or `["property"]` to select the specified property in a parent object. Be sure to put single or double quotes around the property name. When using JSON_VALUE in SQL, the path is a character parameter that’s already single-quoted, so you must escape the single quotes around the property name, for example, `JSON_VALUE('{"a b": "true"}', '$.[''a b'']')`.

Examples:

- `SELECT JSON_VALUE('{"a": true}', '$.a');` returns `"true"`
- `SELECT JSON_VALUE('{"a": true}', '$.a' RETURNING BOOLEAN);` returns TRUE
- `SELECT JSON_VALUE('{"a": true}', 'lax $.b' DEFAULT FALSE ON EMPTY);` returns `"false"`
- `SELECT JSON_VALUE('{"a": true}', 'strict $.b' DEFAULT FALSE ON ERROR);` returns `"false"`
- `SELECT JSON_VALUE('{"a.b": [0.998,0.996]}','$.["a.b"][0]' RETURNING DOUBLE);` returns `0.998D`
- `SELECT JSON_VALUE('{"contains blank": "right"}', 'strict $.[''contains blank'']' NULL ON EMPTY DEFAULT 'wrong' ON ERROR);` returns `"right"`

#### Code Examples

```sql
IS JSON [ { VALUE | SCALAR | ARRAY | OBJECT } ]
```

```sql
-- The following statements return TRUE.
SELECT '1' IS JSON;
SELECT '[]' IS JSON;
SELECT '{}' IS JSON;
SELECT '"abc"' IS JSON;
SELECT '1' IS JSON SCALAR;
SELECT '{}' IS JSON OBJECT;
```

```sql
-- The following statements return FALSE.
SELECT 'abc' IS JSON;
SELECT '1' IS JSON ARRAY;
SELECT '1' IS JSON OBJECT;
SELECT '{}' IS JSON SCALAR;
SELECT '{}' IS JSON ARRAY;
```
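
A common use of IS JSON is to filter out malformed records before applying path-based functions. The following is a minimal sketch, assuming a hypothetical table `raw_events` with a STRING column `payload`; neither name comes from the original documentation.

```sql
-- Assumption: hypothetical table `raw_events` with a STRING column `payload`.
-- Keeps only rows whose payload is a valid JSON object. Valid JSON of
-- another type, such as '1' or '[]', is filtered out because
-- IS JSON OBJECT returns FALSE for it.
SELECT payload
FROM raw_events
WHERE payload IS JSON OBJECT;
```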

```sql
JSON_ARRAY([value]* [ { NULL | ABSENT } ON NULL ])
```

```sql
-- returns '[]'
SELECT JSON_ARRAY();

-- returns '[1,"2"]'
SELECT JSON_ARRAY(1, '2');

-- Use an expression as a value.
SELECT JSON_ARRAY(orders.orderId);

-- ON NULL
-- returns '[null]'
SELECT JSON_ARRAY(CAST(NULL AS STRING) NULL ON NULL);

-- ON NULL
-- returns '[]'
SELECT JSON_ARRAY(CAST(NULL AS STRING) ABSENT ON NULL);

-- returns '[[1]]'
SELECT JSON_ARRAY(JSON_ARRAY(1));

-- returns '[{"nested_json":{"value":42}}]'
SELECT JSON_ARRAY(JSON('{"nested_json": {"value": 42}}'));
```

```sql
JSON_ARRAYAGG(items [ { NULL | ABSENT } ON NULL ])
```

```sql
-- '["Apple","Banana","Orange"]'
SELECT
  JSON_ARRAYAGG(product)
FROM orders;
```
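
Because JSON_ARRAYAGG is an aggregate function, it can also be combined with GROUP BY to build one array per group. This sketch reuses the `orders` table and `product` column from the example above and assumes an additional, hypothetical `customer_id` column.

```sql
-- Assumption: `orders` also has a hypothetical `customer_id` column.
-- Produces one JSON array string of products per customer.
-- NULL products are skipped, because ABSENT ON NULL is the default.
SELECT
  customer_id,
  JSON_ARRAYAGG(product) AS products_json
FROM orders
GROUP BY customer_id;
```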

```sql
JSON_EXISTS(jsonValue, path [ { TRUE | FALSE | UNKNOWN | ERROR } ON ERROR ])
```

```sql
-- The following statements return TRUE.
SELECT JSON_EXISTS('{"a": true}', '$.a');
SELECT JSON_EXISTS('{"a": [{ "b": 1 }]}', '$.a[0].b');
SELECT JSON_EXISTS('{"a": true}', 'strict $.b' TRUE ON ERROR);
```

```sql
-- The following statements return FALSE.
SELECT JSON_EXISTS('{"a": true}', '$.b');
SELECT JSON_EXISTS('{"a": true}', 'strict $.b' FALSE ON ERROR);
```
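
The ON ERROR clause matters most when the path is evaluated against column data of varying shape. The sketch below assumes a hypothetical table `raw_events` with a STRING column `payload`. It contrasts the default behavior with UNKNOWN ON ERROR, which is expected to yield NULL instead of FALSE when evaluation fails.

```sql
-- Assumption: hypothetical table `raw_events` with a STRING column `payload`.
SELECT
  payload,
  -- Default behavior is FALSE ON ERROR, so rows where the path can't be
  -- evaluated are reported as FALSE.
  JSON_EXISTS(payload, '$.customer.id') AS has_customer_id,
  -- UNKNOWN ON ERROR separates "path failed to evaluate" from a
  -- definite TRUE result.
  JSON_EXISTS(payload, 'strict $.customer.id' UNKNOWN ON ERROR) AS has_customer_id_or_unknown
FROM raw_events;
```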

```sql
JSON_OBJECT([[KEY] key VALUE value]* [ { NULL | ABSENT } ON NULL ])
```

```sql
-- returns '{}'
SELECT JSON_OBJECT();

-- returns '{"K1":"V1","K2":"V2"}'
SELECT JSON_OBJECT('K1' VALUE 'V1', 'K2' VALUE 'V2');

-- Use an expression as a value.
SELECT JSON_OBJECT('orderNo' VALUE orders.orderId);

-- ON NULL
-- '{"K1":null}'
SELECT JSON_OBJECT(KEY 'K1' VALUE CAST(NULL AS STRING) NULL ON NULL);

-- ON NULL
-- '{}'
SELECT JSON_OBJECT(KEY 'K1' VALUE CAST(NULL AS STRING) ABSENT ON NULL);

-- returns '{"K1":{"nested_json":{"value":42}}}'
SELECT JSON_OBJECT('K1' VALUE JSON('{"nested_json": {"value": 42}}'));

-- returns '{"K1":{"K2":"V"}}'
SELECT JSON_OBJECT(
  KEY 'K1'
  VALUE JSON_OBJECT(
    KEY 'K2'
    VALUE 'V'
  )
);
```
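
JSON_OBJECT and JSON_ARRAY can be mixed when nesting, because results of JSON construction functions are inserted directly rather than as quoted strings. The following sketch uses only literals; the key names `id` and `tags` are illustrative.

```sql
-- The inner JSON_ARRAY call is embedded as a JSON array,
-- not as an escaped string.
-- returns '{"id":1,"tags":["a","b"]}'
SELECT JSON_OBJECT(
  KEY 'id' VALUE 1,
  KEY 'tags' VALUE JSON_ARRAY('a', 'b')
);
```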

```sql
JSON_OBJECTAGG([KEY] key VALUE value [ { NULL | ABSENT } ON NULL ])
```
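
The source page gives no example for JSON_OBJECTAGG. The following is a minimal sketch, assuming a hypothetical `orders` table with a non-nullable STRING column `product` that is unique per row and a numeric column `cnt`.

```sql
-- Assumption: hypothetical table `orders` with columns
-- `product` (non-nullable STRING, unique per row) and `cnt` (numeric).
-- Builds a single JSON object string with one key per product.
-- A duplicate product value raises an error, because keys must be unique.
SELECT
  JSON_OBJECTAGG(KEY product VALUE cnt)
FROM orders;
```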

```sql
JSON_QUERY(jsonValue, path
  [ RETURNING <dataType> ]
  [ { WITHOUT | WITH CONDITIONAL | WITH UNCONDITIONAL } [ ARRAY ] WRAPPER ]
  [ { NULL | EMPTY ARRAY | EMPTY OBJECT | ERROR } ON EMPTY ]
  [ { NULL | EMPTY ARRAY | EMPTY OBJECT | ERROR } ON ERROR ])
```

```sql
-- returns '{ "b": 1 }'
SELECT JSON_QUERY('{ "a": { "b": 1 } }', '$.a');

-- returns '[1, 2]'
SELECT JSON_QUERY('[1, 2]', '$');

-- returns NULL
SELECT JSON_QUERY(CAST(NULL AS STRING), '$');

-- returns array ['c1','c2']
SELECT JSON_QUERY('{"a":[{"c":"c1"},{"c":"c2"}]}', 'lax $.a[*].c' RETURNING ARRAY<STRING>);

-- Wrap the result into an array.
-- returns '[{}]'
SELECT JSON_QUERY('{}', '$' WITH CONDITIONAL ARRAY WRAPPER);

-- returns '[1, 2]'
SELECT JSON_QUERY('[1, 2]', '$' WITH CONDITIONAL ARRAY WRAPPER);

-- returns '[[1, 2]]'
SELECT JSON_QUERY('[1, 2]', '$' WITH UNCONDITIONAL ARRAY WRAPPER);

-- Scalars must be wrapped to be returned.
-- returns NULL
SELECT JSON_QUERY(1, '$');

-- returns '[1]'
SELECT JSON_QUERY(1, '$' WITH CONDITIONAL ARRAY WRAPPER);

-- Behavior if the path expression is empty.
-- returns '{}'
SELECT JSON_QUERY('{}', 'lax $.invalid' EMPTY OBJECT ON EMPTY);

-- Behavior if the path expression has an error.
-- returns '[]'
SELECT JSON_QUERY('{}', 'strict $.invalid' EMPTY ARRAY ON ERROR);
```
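
Applied to column data, JSON_QUERY is typically used to pull out nested objects or arrays. This sketch assumes a hypothetical table `raw_orders` with a STRING column `payload` whose JSON contains `items` and `tags` properties; all of these names are illustrative.

```sql
-- Assumption: hypothetical table `raw_orders` with a STRING column `payload`,
-- for example '{"items":[{"sku":"A1"},{"sku":"B2"}],"tags":["new"]}'.
SELECT
  -- The whole `items` array as a JSON string
  JSON_QUERY(payload, '$.items') AS items_json,
  -- Every `sku` inside `items`, returned as ARRAY<STRING>
  JSON_QUERY(payload, 'lax $.items[*].sku' RETURNING ARRAY<STRING>) AS skus,
  -- An empty array instead of NULL when `tags` is missing
  JSON_QUERY(payload, 'lax $.tags' EMPTY ARRAY ON EMPTY) AS tags_json
FROM raw_orders;
```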

```sql
JSON_QUOTE(string)
```

```sql
-- returns "SQL string" (the input wrapped in double quotes)
SELECT JSON_QUOTE('SQL string');
```

```sql
JSON_STRING(value)
```

```sql
-- returns NULL
SELECT JSON_STRING(CAST(NULL AS INT));

-- returns '1'
SELECT JSON_STRING(1);

-- returns 'true'
SELECT JSON_STRING(TRUE);

-- returns '"Hello, World!"'
SELECT JSON_STRING('Hello, World!');

-- returns '[1,2]'
SELECT JSON_STRING(ARRAY[1, 2]);
```

```sql
JSON_UNQUOTE(string)
```

```sql
-- returns SQL string (unmodified, because the input isn't wrapped in double quotes)
SELECT JSON_UNQUOTE('SQL string');
```
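
JSON_UNQUOTE only changes its input when the input is a double-quoted JSON string literal. The following sketch, based on the function descriptions rather than on output captured from a running statement, shows a quoted input and a round trip through JSON_QUOTE.

```sql
-- The input starts and ends with double quotes and is a valid JSON
-- string literal, so the quotes are removed.
SELECT JSON_UNQUOTE('"SQL string"');

-- JSON_QUOTE wraps and escapes the value, and JSON_UNQUOTE reverses it,
-- so the result should equal the original input string.
SELECT JSON_UNQUOTE(JSON_QUOTE('He said "hi"'));
```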

```sql
JSON_VALUE(jsonValue, path
  [RETURNING <dataType>]
  [ { NULL | ERROR | DEFAULT <defaultExpr> } ON EMPTY ]
  [ { NULL | ERROR | DEFAULT <defaultExpr> } ON ERROR ])
```

```sql
JSON_VALUE('{"a b": "true"}', '$.[''a b'']')
```

```sql
-- returns "true"
SELECT JSON_VALUE('{"a": true}', '$.a');

-- returns TRUE
SELECT JSON_VALUE('{"a": true}', '$.a' RETURNING BOOLEAN);

-- returns "false"
SELECT JSON_VALUE('{"a": true}', 'lax $.b' DEFAULT FALSE ON EMPTY);

-- returns "false"
SELECT JSON_VALUE('{"a": true}', 'strict $.b' DEFAULT FALSE ON ERROR);

-- returns 0.998D
SELECT JSON_VALUE('{"a.b": [0.998,0.996]}','$.["a.b"][0]' RETURNING DOUBLE);

-- returns "right"
SELECT JSON_VALUE('{"contains blank": "right"}', 'strict $.[''contains blank'']' NULL ON EMPTY DEFAULT 'wrong' ON ERROR);
```
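
In practice, JSON_VALUE is often used to project typed columns out of a raw JSON payload. This sketch assumes a hypothetical table `raw_events` with a STRING column `payload` that holds objects like `{"user": {"id": 7, "name": "Ada", "vip": true}}`; the table, column, and property names are all illustrative.

```sql
-- Assumption: hypothetical table `raw_events` with a STRING column `payload`.
SELECT
  -- Typed extraction; NULL if the path is empty or evaluation fails,
  -- because NULL ON EMPTY and NULL ON ERROR are the defaults
  JSON_VALUE(payload, '$.user.id' RETURNING INTEGER) AS user_id,
  -- Returned as STRING when RETURNING is omitted
  JSON_VALUE(payload, '$.user.name') AS user_name,
  -- Lax mode turns a missing property into an empty result,
  -- which the ON EMPTY clause replaces with a default
  JSON_VALUE(payload, 'lax $.user.vip' RETURNING BOOLEAN DEFAULT FALSE ON EMPTY) AS is_vip
FROM raw_events;
```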

---

### Machine-Learning Preprocessing Functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/ml-preprocessing-functions.html

Machine-Learning Preprocessing Functions in Confluent Cloud for Apache Flink¶ The following built-in functions are available for ML preprocessing in Confluent Cloud for Apache Flink®. These functions help transform features into representations more suitable for downstream processors. ML_BUCKETIZE ML_CHARACTER_TEXT_SPLITTER ML_FILE_FORMAT_TEXT_SPLITTER ML_LABEL_ENCODER ML_MAX_ABS_SCALER ML_MIN_MAX_SCALER ML_NGRAMS ML_NORMALIZER ML_ONE_HOT_ENCODER ML_RECURSIVE_TEXT_SPLITTER ML_ROBUST_SCALER ML_STANDARD_SCALER ML_BUCKETIZE¶ Bucketizes numerical values into discrete bins based on split points. SyntaxML_BUCKETIZE(value, splitBucketPoints [, bucketNames]) DescriptionThe ML_BUCKETIZE function divides numerical values into discrete buckets based on specified split points. Each bucket represents a range of values, and the function returns the bucket index or name for each input value. Arguments value: Numerical expression to be bucketized. If the input value is NaN or NULL, it is bucketized to the NULL bucket. splitBucketPoints: Array of numerical values that define the bucket boundaries, or split points. If the splitBucketPoints array is empty, an exception is thrown. Any split points that are NaN or NULL are removed from the splitBucketPoints array. splitBucketPoints must be in ascending order, or an exception is thrown. Duplicates are removed from splitBucketPoints. bucketNames: (Optional) Array of names of the buckets defined in splitBucketPoints. If the bucketNames array is not provided, buckets are named bin_NULL, bin_1, bin_2 … bin_n, with n being the total number of buckets in splitBucketPoints. If the bucketNames array is provided, names must be in the same order as in the splitBucketPoints array. Names for all of the buckets must be provided, including the NULL bucket, or an exception is thrown. If the bucketNames array is provided, the first name is the name for the NULL bucket. 
Example-- returns 'bin_2' SELECT ML_BUCKETIZE(2, ARRAY[1, 4, 7]); -- returns 'b2' SELECT ML_BUCKETIZE(2, ARRAY[1, 4, 7], ARRAY['b_null','b1','b2','b3','b4']); ML_CHARACTER_TEXT_SPLITTER¶ Splits text into chunks based on character count and separators. SyntaxML_CHARACTER_TEXT_SPLITTER(text, chunkSize, chunkOverlap, separator, isSeparatorRegex [, trimWhitespace] [, keepSeparator] [, separatorPosition]) DescriptionThe ML_CHARACTER_TEXT_SPLITTER function splits text into chunks based on character count and specified separators. This is useful for processing large text documents into smaller, manageable pieces. If any argument other than text is NULL, an exception is thrown. The returned array of chunks has the same order as the input. The function tries to keep every chunk within the chunkSize limit, but if a chunk is more than the limit, it is returned as is. Arguments text: The input text to be split. If the input text is NULL, it is returned as is. chunkSize: The size of each chunk. If chunkSize < 0 or chunkOverlap > chunkSize, an exception is thrown. chunkOverlap: The number of overlapping characters between chunks. If chunkOverlap < 0, an exception is thrown. separator: The separator used for splitting. isSeparatorRegex: Whether the separator is a regex pattern. trimWhitespace: (Optional) Whether to trim whitespace from chunks. The default is TRUE. keepSeparator: (Optional) Whether to keep the separator in the chunks. The default is FALSE. separatorPosition: (Optional) The position of the separator. Valid values are START or END. The default is START. START means place the separator at the start of the following chunk, and END means place the separator at the end of the previous chunk. Example-- returns ['This is the text I would like to ch', 'o chunk up. It is the example text ', 'ext for this exercise'] SELECT ML_CHARACTER_TEXT_SPLITTER('This is the text I would like to chunk up. 
It is the example text for this exercise', 35, 4, '', TRUE, FALSE, TRUE, 'END'); ML_FILE_FORMAT_TEXT_SPLITTER¶ Splits text into chunks based on specific file format patterns. SyntaxML_FILE_FORMAT_TEXT_SPLITTER(text, chunkSize, chunkOverlap, formatName, [trimWhitespace] [, keepSeparator] [, separatorPosition]) DescriptionThe ML_FILE_FORMAT_TEXT_SPLITTER function splits text into chunks based on specific file format patterns. It uses format-specific separators to split code intelligently or structure text. The returned array of chunks has the same order as the input. The function starts splitting the chunks with the first separator in the separators list. If a chunk is bigger than chunkSize, the function splits the chunk recursively using the next separator in the separators list for the given file format. If separators are exhausted, and the remaining text is bigger than chunkSize, the function returns the smallest chunk possible, even though it is bigger than chunkSize. Arguments text: The input text to be split. If the input text is NULL, it is returned as is. chunkSize: The size of each chunk. If chunkSize < 0 or chunkOverlap > chunkSize, an exception is thrown. chunkOverlap: The number of overlapping characters between chunks. If chunkOverlap < 0, an exception is thrown. formatName: ENUM of the format names. Valid values are: Valid values for formatName C CPP CSHARP ELIXIR GO HTML JAVA JAVASCRIPT JSON KOTLIN LATEX MARKDOWN PHP PYTHON RUBY RUST SCALA SQL SWIFT TYPESCRIPT XML trimWhitespace: (Optional) Whether to trim whitespace from chunks. The default is TRUE. keepSeparator: (Optional) Whether to keep the separator in the chunks. The default is FALSE. separatorPosition: (Optional) The position of the separator. Valid values are START or END. The default is START. START means place the separator at the start of the following chunk, and END means place the separator at the end of the previous chunk. 
Example-- returns ['def hello_world():\n print("Hello, World!")', '# Call the function\nhello_world()'] SELECT ML_FILE_FORMAT_TEXT_SPLITTER('def hello_world():\n print("Hello, World!")\n\n# Call the function\nhello_world()\n', 50, 0, 'PYTHON'); ML_LABEL_ENCODER¶ Encodes categorical variables into numerical labels. SyntaxML_LABEL_ENCODER(input, categories [, includeZeroLabel]) DescriptionThe ML_LABEL_ENCODER function encodes categorical variables into numerical labels. Each unique category is assigned a unique integer label. Arguments input: Input value to encode. If the input value is NULL, NaN, or Infinity, it is considered in the unknown category, which is given the 0 label. If the input value is not one of the categories, it is labeled as -1 or 0 depending on includeZeroLabel: -1 if includeZeroLabel is TRUE and 0 if includeZeroLabel is FALSE. categories: Arrays of category values to encode input value to. Category values must be the same type as the input value. If the categories array is empty, all inputs are considered to be in the unknown category, which is given the 0 label. The categories array can’t be NULL, or an exception is thrown. The categories array can’t have NULL or duplicate values, or an exception is thrown. The categories array must be sorted in ascending lexicographical order, or an exception is thrown. includeZeroLabel: (Optional) The start index for valid categories is 0. The default is FALSE. If includeZeroLabel is TRUE, the valid categories index starts at 0, and unknown values are labeled as -1. If includeZeroLabel is FALSE, the valid categories index starts at 1, and unknown values are labeled as 0. Example-- returns 1 SELECT ML_LABEL_ENCODER('abc', ARRAY['abc', 'def', 'efg', 'hikj']); -- returns 0 SELECT ML_LABEL_ENCODER('abc', ARRAY['abc', 'def', 'efg', 'hikj'], TRUE ); ML_MAX_ABS_SCALER¶ Scales numerical values by their maximum absolute value. 
SyntaxML_MAX_ABS_SCALER(value, absoluteMax) DescriptionThe ML_MAX_ABS_SCALER function scales numerical values by dividing them by the maximum absolute value. This preserves zero entries in sparse data. Arguments value: Numerical expression to be scaled. If the input value is NULL, NaN, or Infinity, it is returned as is. absoluteMax: Absolute Maximum value of the feature data seen in the dataset. If absoluteMax is NULL or NaN, an exception is thrown. If absoluteMax is Infinity, 0 is returned. If absoluteMax is 0, the scaled value is returned as is. Example-- returns 0.2 SELECT ML_MAX_ABS_SCALER(1, 5); ML_MIN_MAX_SCALER¶ Scales numerical values to a specified range using min-max normalization. SyntaxML_MIN_MAX_SCALER(value, min, max) DescriptionThe ML_MIN_MAX_SCALER function scales numerical values to a specified range using min-max normalization. The function transforms values to the range [0, 1] by default, or to a custom range if min and max are specified. Arguments value: Numerical expression to be scaled. If the input value is NULL, NaN, or Infinity, it is returned as is. If value > max, it is set to 1.0. If value < min, it is set to 0.0. If max == min, the range is set to 1.0 to avoid division by zero. min: Minimum value of the feature data seen in the dataset. If min is NULL, NaN, or Infinity, an exception is thrown. max: Maximum value of the feature data seen in the dataset. If max is NULL, NaN, or Infinity, an exception is thrown. If max < min, an exception is thrown. Example-- returns 0.25 SELECT ML_MIN_MAX_SCALER(2, 1, 5); ML_NGRAMS¶ Generates n-grams from an array of strings. SyntaxML_NGRAMS(input [, nValue] [, separator]) DescriptionThe ML_NGRAMS function generates n-grams from an array of strings. N-grams are contiguous sequences of n items from a given sample of text. The ordering of the returned output is the same as the input array. Arguments input: Array of CHAR or VARCHAR to return n-gram for. 
If the input array has NULL, it is ignored while forming N-GRAMS. If the input array is NULL or empty, an empty N-GRAMS array is returned. Empty strings in the input array are treated as is. Strings with only whitespace are treated as empty strings. nValue: (Optional) N value of n-gram function. The default is 2. If nValue < 1, an exception is thrown. If nValue > input.size(), an empty N-GRAMS array is returned. separator: (Optional) Characters to join n-gram values with. The default is whitespace. Example-- returns ['ab', 'cd', 'de', 'pwe'] SELECT ML_NGRAMS(ARRAY['ab', 'cd', 'de', 'pwe'], 1, '#'); -- returns ['ab#cd', 'cd#de'] SELECT ML_NGRAMS(ARRAY['ab','cd','de', NULL], 2, '#'); ML_NORMALIZER¶ Normalizes numerical values using p-norm normalization. SyntaxML_NORMALIZER(value, normValue) DescriptionThe ML_NORMALIZER function normalizes numerical values using p-norm normalization. This scales each sample to have unit norm. Arguments value: Numerical expression to be scaled. If the input value is NULL, NaN, or Infinity, it is returned as is. normValue: Calculated norm value of the feature data using p-norm. If normValue is NULL or NaN, an exception is thrown. If normValue is Infinity, 0 is returned. If normValue is 0, which is only possible when all the values are 0, the input value is returned as is. Example-- returns 0.6 SELECT ML_NORMALIZER(3.0, 5.0); ML_ONE_HOT_ENCODER¶ Encodes categorical variables into a binary vector representation. SyntaxML_ONE_HOT_ENCODER(input, categories [, dropLast] [, handleUnknown]) DescriptionThe ML_ONE_HOT_ENCODER function encodes categorical variables into a binary vector representation. Each category is represented by a binary vector where only one element is 1 and the rest are 0. Arguments input: Input value to encode. If the input value is NULL, it is considered to be in the unknown category. categories: Array of category values to encode input value to. The input argument must be of same type as the categories array. 
If the categories array is empty, an exception is thrown. The categories array can’t be NULL, or an exception is thrown. The categories array can’t have NULL or duplicate values, or an exception is thrown. dropLast: (Optional) Whether to drop the last category. The default is TRUE. By default, the last category is dropped, to prevent perfectly collinear features. handleUnknown: (Optional) ERROR, IGNORE, KEEP options to indicate how to handle unknown values. The default is IGNORE. If handleUnknown is ERROR, an exception is thrown when the input is an unknown value. If handleUnknown is IGNORE, unknown values are ignored and values of all the columns are 0. If handleUnknown is KEEP, the unknown category column has value 1. If handleUnknown is KEEP, the last column is for the unknown category. Example-- returns [1, 0, 0, 0] SELECT ML_ONE_HOT_ENCODER('abc', ARRAY['abc', 'def', 'efg', 'hikj']); -- returns [0, 0, 0, 0, 1] SELECT ML_ONE_HOT_ENCODER('abcd', ARRAY['abc', 'def', 'efg', 'hik'], TRUE, 'KEEP' ); ML_RECURSIVE_TEXT_SPLITTER¶ Splits text into chunks using multiple separators recursively. SyntaxML_RECURSIVE_TEXT_SPLITTER(text, chunkSize, chunkOverlap [, separators] [, isSeparatorRegex] [, trimWhitespace] [, keepSeparator] [, separatorPosition]) DescriptionThe ML_RECURSIVE_TEXT_SPLITTER function splits text into chunks using multiple separators recursively. It starts with the first separator and recursively applies subsequent separators if chunks are still too large. If any argument other than text is NULL, an exception is thrown. The returned array of chunks has the same order as the input. Arguments text: The input text to be split. If the input text is NULL, it is returned as is. chunkSize: The size of each chunk. If chunkSize < 0 or chunkOverlap > chunkSize, an exception is thrown. chunkOverlap: The number of overlapping characters between chunks. If chunkOverlap < 0, an exception is thrown. separators: (Optional) The list of separators used for splitting. 
The default is `["\n\n", "\n", " ", ""]`.
- isSeparatorRegex: (Optional) Whether the separator is a regex pattern. The default is FALSE.
- trimWhitespace: (Optional) Whether to trim whitespace from chunks. The default is TRUE.
- keepSeparator: (Optional) Whether to keep the separator in the chunks. The default is FALSE.
- separatorPosition: (Optional) The position of the separator. Valid values are START or END. The default is START. START means place the separator at the start of the following chunk, and END means place the separator at the end of the previous chunk.

Example: `SELECT ML_RECURSIVE_TEXT_SPLITTER('Hello. world!', 0, 0, ARRAY['[!]','[.]'], TRUE, TRUE, TRUE, 'START');` returns `['Hello', '. world', '!']`.

#### ML_ROBUST_SCALER

Scales numerical values using statistics that are robust to outliers.

Syntax: `ML_ROBUST_SCALER(value, median, firstQuartile, thirdQuartile [, withCentering] [, withScaling])`

Description: The ML_ROBUST_SCALER function scales numerical values using statistics that are robust to outliers. It removes the median and scales the data according to the quantile range.

Arguments:
- value: Numerical expression to be scaled. If the input value is NULL, NaN, or Infinity, it is returned as is.
- median: Median of the feature data seen in the training dataset. If median is NULL, NaN, or Infinity, an exception is thrown.
- firstQuartile: First quartile of the feature data seen in the dataset. If firstQuartile is NULL, NaN, or Infinity, an exception is thrown.
- thirdQuartile: Third quartile of the feature data seen in the dataset. If thirdQuartile is NULL, NaN, or Infinity, an exception is thrown. If thirdQuartile - firstQuartile = 0, the range is set to 1.0 to avoid division by zero.
- withCentering: (Optional) Boolean value indicating whether to center the numerical value using the median before scaling. The default is TRUE. If withCentering is FALSE, the median value is ignored.
- withScaling: (Optional) Boolean value indicating whether to scale the numerical value using the interquartile range (IQR) after centering. The default is TRUE.
If withScaling is FALSE, the firstQuartile and thirdQuartile values are ignored.

Example: `SELECT ML_ROBUST_SCALER(2, 1, 0, 3, TRUE, TRUE);` returns `0.3333333333333333`.

#### ML_STANDARD_SCALER

Standardizes numerical values by removing the mean and scaling to unit variance.

Syntax: `ML_STANDARD_SCALER(value, mean, standardDeviation [, withCentering] [, withScaling])`

Description: The ML_STANDARD_SCALER function standardizes numerical values by removing the mean and scaling to unit variance. This is useful for features that follow a normal distribution.

Arguments:
- value: Numerical expression to be scaled. If the input value is NULL, NaN, or Infinity, it is returned as is.
- mean: Mean of the feature data seen in the dataset. If mean is NULL, NaN, or Infinity, an exception is thrown.
- standardDeviation: Standard deviation of the feature data seen in the dataset. If standardDeviation is NULL or NaN, an exception is thrown. If standardDeviation is Infinity, 0 is returned. If standardDeviation is 0, the value does not need to be scaled, so it is returned as is.
- withCentering: (Optional) Boolean value indicating whether to center the numerical value using the mean before scaling. The default is TRUE. If withCentering is FALSE, the mean value is ignored.
- withScaling: (Optional) Boolean value indicating whether to scale the numerical value using the standard deviation after centering. The default is TRUE. If withScaling is FALSE, the standardDeviation value is ignored.

Example: `SELECT ML_STANDARD_SCALER(2, 1, 5, TRUE, TRUE);` returns `0.2`.

#### Code Examples

```sql
ML_BUCKETIZE(value, splitBucketPoints [, bucketNames])
```

```sql
-- returns 'bin_2'
SELECT ML_BUCKETIZE(2, ARRAY[1, 4, 7]);

-- returns 'b2'
SELECT ML_BUCKETIZE(2, ARRAY[1, 4, 7], ARRAY['b_null','b1','b2','b3','b4']);
```
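
As a sketch of how bucket names line up with split points when the function is applied to a column, the query below labels rows of a hypothetical `customers` table. The table, its `customer_id` and `age` columns, and the bucket names are assumptions for illustration. The name order mirrors the documented example, where three split points take five names and the first name looks reserved for NULL input.

```sql
-- Hypothetical table and columns; bucket names are illustrative only.
-- Three split points take five names, following the documented example.
SELECT
  customer_id,
  ML_BUCKETIZE(
    age,
    ARRAY[18, 40, 65],
    ARRAY['unknown', 'minor', 'adult', 'middle_aged', 'senior']
  ) AS age_group
FROM customers;
```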

```sql
ML_CHARACTER_TEXT_SPLITTER(text, chunkSize, chunkOverlap, separator,
isSeparatorRegex [, trimWhitespace] [, keepSeparator] [, separatorPosition])
```

```sql
-- returns ['This is the text I would like to ch', 'o chunk up. It is the example text ', 'ext for this exercise']
SELECT ML_CHARACTER_TEXT_SPLITTER('This is the text I would like to chunk up. It is the example text for this exercise', 35, 4, '', TRUE, FALSE, TRUE, 'END');
```

```sql
ML_FILE_FORMAT_TEXT_SPLITTER(text, chunkSize, chunkOverlap, formatName
[, trimWhitespace] [, keepSeparator] [, separatorPosition])
```

```sql
-- returns ['def hello_world():\n print("Hello, World!")', '# Call the function\nhello_world()']
SELECT ML_FILE_FORMAT_TEXT_SPLITTER('def hello_world():\n print("Hello, World!")\n\n# Call the function\nhello_world()\n', 50, 0, 'PYTHON');
```

```sql
ML_LABEL_ENCODER(input, categories [, includeZeroLabel])
```

```sql
-- returns 1
SELECT ML_LABEL_ENCODER('abc', ARRAY['abc', 'def', 'efg', 'hikj']);

-- returns 0
SELECT ML_LABEL_ENCODER('abc', ARRAY['abc', 'def', 'efg', 'hikj'], TRUE );
```

```sql
ML_MAX_ABS_SCALER(value, absoluteMax)
```

```sql
-- returns 0.2
SELECT ML_MAX_ABS_SCALER(1, 5);
```

```sql
ML_MIN_MAX_SCALER(value, min, max)
```

```sql
-- returns 0.25
SELECT ML_MIN_MAX_SCALER(2, 1, 5);
```

```sql
ML_NGRAMS(input [, nValue] [, separator])
```

```sql
-- returns ['ab', 'cd', 'de', 'pwe']
SELECT ML_NGRAMS(ARRAY['ab', 'cd', 'de', 'pwe'], 1, '#');

-- returns ['ab#cd', 'cd#de']
SELECT ML_NGRAMS(ARRAY['ab','cd','de', NULL], 2, '#');
```
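
Because nValue and separator are optional, the function can also be called with only the input array. A minimal sketch, using a made-up word list:

```sql
-- Relies on the documented defaults: nValue = 2 and a whitespace separator.
SELECT ML_NGRAMS(ARRAY['the', 'quick', 'brown', 'fox']) AS bigrams;

-- Trigrams joined with an underscore instead of the default separator.
SELECT ML_NGRAMS(ARRAY['the', 'quick', 'brown', 'fox'], 3, '_') AS trigrams;
```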

```sql
ML_NORMALIZER(value, normValue)
```

```sql
-- returns 0.6
SELECT ML_NORMALIZER(3.0, 5.0);
```
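
The normValue argument has to be computed by the caller. As a sketch, assuming an L2 norm over two features, the norm can be built from standard arithmetic functions and passed in. The inline VALUES row and the column names `x` and `y` are assumptions for illustration. With (3, 4) the L2 norm is 5, so the first call reduces to the documented `ML_NORMALIZER(3.0, 5.0)` case.

```sql
-- Illustrative two-feature row; the inline VALUES table is an assumption.
-- SQRT(x * x + y * y) is the L2 norm of the row (3, 4), which is 5.
SELECT
  ML_NORMALIZER(x, SQRT(x * x + y * y)) AS x_unit,
  ML_NORMALIZER(y, SQRT(x * x + y * y)) AS y_unit
FROM (VALUES (CAST(3.0 AS DOUBLE), CAST(4.0 AS DOUBLE))) AS t(x, y);
```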

```sql
ML_ONE_HOT_ENCODER(input, categories [, dropLast] [, handleUnknown])
```

```sql
-- returns [1, 0, 0, 0]
SELECT ML_ONE_HOT_ENCODER('abc', ARRAY['abc', 'def', 'efg', 'hikj']);

-- returns [0, 0, 0, 0, 1]
SELECT ML_ONE_HOT_ENCODER('abcd', ARRAY['abc', 'def', 'efg', 'hik'], TRUE, 'KEEP' );
```
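
To compare the three handleUnknown options side by side, the following sketch encodes a value that is not in the categories array. The input `'xyz'` and the category list are made up; the comments restate the behavior described for each option rather than captured output.

```sql
-- 'xyz' is not in the categories array, so it counts as an unknown value.

-- IGNORE (the default): the values of all the columns are 0.
SELECT ML_ONE_HOT_ENCODER('xyz', ARRAY['abc', 'def', 'efg'], FALSE, 'IGNORE');

-- KEEP: the last column is for the unknown category.
SELECT ML_ONE_HOT_ENCODER('xyz', ARRAY['abc', 'def', 'efg'], FALSE, 'KEEP');

-- ERROR: an exception is thrown for an unknown input.
SELECT ML_ONE_HOT_ENCODER('xyz', ARRAY['abc', 'def', 'efg'], FALSE, 'ERROR');
```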

```sql
ML_RECURSIVE_TEXT_SPLITTER(text, chunkSize, chunkOverlap [, separators]
[, isSeparatorRegex] [, trimWhitespace] [, keepSeparator]
[, separatorPosition])
```

```sql
-- returns ['Hello', '. world', '!']
SELECT ML_RECURSIVE_TEXT_SPLITTER('Hello. world!', 0, 0, ARRAY['[!]','[.]'], TRUE, TRUE, TRUE, 'START');
```
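
Because the splitter returns an array of chunks, a common next step is to expand that array into one row per chunk. The following is a sketch only: the `documents` table and its `id` and `body` columns are hypothetical, and the chunk size of 200 with an overlap of 20 is an arbitrary choice. It uses the standard Flink SQL `CROSS JOIN UNNEST` pattern and leaves the optional splitter arguments at their defaults.

```sql
-- Hypothetical source table: documents(id, body).
-- Each element of the returned chunk array becomes its own row.
SELECT d.id, c.chunk
FROM documents AS d
CROSS JOIN UNNEST(ML_RECURSIVE_TEXT_SPLITTER(d.body, 200, 20)) AS c (chunk);
```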

```sql
ML_ROBUST_SCALER(value, median, firstQuartile, thirdQuartile
[, withCentering] [, withScaling])
```

```sql
-- returns 0.3333333333333333
SELECT ML_ROBUST_SCALER(2, 1, 0, 3, TRUE, TRUE);
```
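
Reading the description as "subtract the median, then divide by the interquartile range" (an interpretation of the text, not a statement about the implementation), the documented example with both flags set to TRUE works out as follows:

```latex
\[
\text{scaled}
  = \frac{\text{value} - \text{median}}{\text{thirdQuartile} - \text{firstQuartile}}
  = \frac{2 - 1}{3 - 0}
  = \frac{1}{3} \approx 0.3333
\]
```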

```sql
ML_STANDARD_SCALER(value, mean, standardDeviation [, withCentering]
[, withScaling])
```

```sql
-- returns 0.2
SELECT ML_STANDARD_SCALER(2, 1, 5, TRUE, TRUE);
```
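
The two optional flags can be switched independently. The following sketch reuses the inputs from the example above and changes only the flags; the comments restate the argument descriptions and are not captured output.

```sql
-- withScaling = FALSE: the standardDeviation argument is ignored.
SELECT ML_STANDARD_SCALER(2, 1, 5, TRUE, FALSE);

-- withCentering = FALSE: the mean argument is ignored.
SELECT ML_STANDARD_SCALER(2, 1, 5, FALSE, TRUE);
```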

---

### AI Model Inference and Machine Learning Functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/model-inference-functions.html

Confluent Cloud for Apache Flink® provides built-in functions for invoking remote AI/ML models in Flink SQL queries. These simplify developing and deploying AI applications by providing a unified platform for both data processing and AI/ML tasks.
- AI_COMPLETE: Generate text completions.
- AI_EMBEDDING: Create embeddings.
- AI_FORECAST: Forecast trends.
- AI_TOOL_INVOKE: Invoke model context protocol (MCP) tools.
- ML_DETECT_ANOMALIES: Detect anomalies in your data.
- ML_EVALUATE: Evaluate the performance of an AI/ML model.
- ML_PREDICT: Run a remote AI/ML model for tasks like predicting outcomes, generating text, and classification.

#### Search Functions

Confluent Cloud for Apache Flink also supports read-only external tables to enable search with federated query execution on external databases.
- KEY_SEARCH_AGG: Perform exact key lookups in external databases like JDBC, REST APIs, MongoDB, and Couchbase.
- TEXT_SEARCH_AGG: Execute full-text searches in external databases like MongoDB, Couchbase, and Elasticsearch.
- VECTOR_SEARCH_AGG: Run semantic similarity searches using vector embeddings in databases like MongoDB, Pinecone, Elasticsearch, and Couchbase.

For machine-learning preprocessing utilities, see ML Preprocessing Functions.

#### ML_PREDICT

Run a remote AI/ML model for tasks like predicting outcomes, generating text, and classification.

Syntax: ``ML_PREDICT(`model_name[$version_id]`, column);``

The map settings are optional: ``ML_PREDICT(`model_name[$version_id]`, column, map['async_enabled', [boolean], 'client_timeout', [int], 'max_parallelism', [int], 'retry_count', [int]]);``

Description: The ML_PREDICT function performs predictions using pre-trained machine learning models. The first argument to the ML_PREDICT table function is the model name. The other arguments are the columns used for prediction. They are defined in the model resource INPUT for AI models and may vary in length or type.
Before using ML_PREDICT, you must register the model by using the CREATE MODEL statement. For more information, see Run an AI Model.

Configuration: You can control how calls to the remote model execute with these optional parameters.
- async_enabled: Calls to remote models are asynchronous and don’t block. The default is true.
- client_timeout: Time, in seconds, after which the request to the model endpoint times out. The default is 30 seconds.
- debug: Return a detailed stack trace in the API response. The default is false. Confluent Cloud for Apache Flink implements data masking for error messages to remove any secrets or customer input, but the stack trace may contain the prompt itself or some part of the response string.
- retry_count: Maximum number of times the remote model request is retried if the request to the model fails. The default is 3.
- max_parallelism: Maximum number of parallel requests that the function can make. Can be used only when async_enabled is true. The default is 10.

Example: After you have registered the AI model by using the CREATE MODEL statement, run the model by using the ML_PREDICT function in a SQL query. The following example runs a model named embeddingmodel on the data in a table named text_stream.

`SELECT id, text, embedding FROM text_stream, LATERAL TABLE(ML_PREDICT('embeddingmodel', text));`

The following examples call the ML_PREDICT function with different configurations.

Specify the timeout: ``SELECT * FROM `db1`.`tb1`, LATERAL TABLE(ML_PREDICT('md1', key, map['client_timeout', 60 ]));``

Specify all configuration parameters: ``SELECT * FROM `db1`.`tb1`, LATERAL TABLE(ML_PREDICT('md1', key, map['async_enabled', true, 'client_timeout', 60, 'max_parallelism', 20, 'retry_count', 5]));``

#### ML_DETECT_ANOMALIES

Identify outliers in a data stream.
Syntax: `ML_DETECT_ANOMALIES( data_column, timestamp_column, JSON_OBJECT('p' VALUE 1, 'q' VALUE 1, 'd' VALUE 1, 'minTrainingSize' VALUE 10));`

Description: The ML_DETECT_ANOMALIES function uses an ARIMA model to identify outliers in time-series data. Your data must include:
- A timestamp column.
- A target column representing some quantity of interest at each timestamp.

For more information, see Detect Anomalies in Data.

Parameters: For anomaly detection parameters, see ARIMA model parameters.

Example: `SELECT ML_DETECT_ANOMALIES( total_orderunits, summed_ts, JSON_OBJECT('p' VALUE 1, 'q' VALUE 1, 'd' VALUE 1, 'minTrainingSize' VALUE 10)) OVER ( ORDER BY summed_ts RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS anomalies FROM test_table;`

#### ML_EVALUATE

Aggregate a table and return model evaluation metrics.

Syntax: ``ML_EVALUATE(`model_name`, label, col1, col2, ...) FROM 'eval_data_table';``

Description: The ML_EVALUATE function is a table aggregation function that takes an entire table and returns a single row of model evaluation metrics. If run on all versions of a model, the function returns one row for each model version. After comparing the metrics for different versions, you can update the default version for deployment with the model that has the best evaluation metrics.

Internally, the ML_EVALUATE function runs ML_PREDICT and processes the results. Before using ML_EVALUATE, you must register the model by using the CREATE MODEL statement.

The first argument to the ML_EVALUATE table function is the model name. The second argument is the true label that the output of the model should be evaluated against. Its type depends on the model OUTPUT type and the model task. The other arguments are the columns used for prediction. They are defined in the model resource INPUT for AI models and may vary in length or type.

The return type of the ML_EVALUATE function is `Map<String, Double>` for all types of tasks.
The metric keys in the map depend on the task type.

Metrics: The metric columns returned by ML_EVALUATE depend on the task type of the specified model.

Classification: Classification models choose a group to place their inputs in and return one of N possible values. A classification model that returns only 2 possible values is called a binary classifier. If it returns more than 2 values, it is referred to as multi-class. Classification models return these metrics:
- Accuracy: Total fraction of correct predictions across all classes.
- F1 Score: Harmonic mean of precision and recall.
- Precision: (Class X Correctly Predicted) / (# of Class X Predicted)
- Recall: (Class X Correctly Predicted) / (# of actual Class X)

Clustering: Clustering models group the model examples into K groups. Metrics are a measure of how compact the clusters are. Clustering models return these metrics:
- Davies Bouldin Index: A measure of how separated clusters are and how compact they are.
- Intra-Cluster Variance (Mean Squared Distance): Average squared distance of each training point to the centroid of the cluster it was assigned to.
- Silhouette Score: Compares how similar each point is to its own cluster with how dissimilar it is to other clusters.

Embedding: Embedding models return these metrics:
- Mean Cosine Similarity: A measure of how similar two vectors are.
- Mean Jaccard Similarity: A measure of how similar two sets are.
- Mean Euclidean Distance: A measure of how similar two vectors are.

Regression: Regression models predict a continuous output variable based on one or more input features. Regression models return these metrics:
- Mean Absolute Error: The average of the absolute differences between the predicted and actual values.
- Mean Squared Error: The average of the squared differences between the predicted and actual values.

Text generation: Text generation models generate text based on a prompt.
Text generation models return these metrics:
- Mean BLEU: A measure of how similar two texts are.
- Mean ROUGE: A measure of how similar two texts are.
- Mean Semantic Similarity: A measure of how similar two texts are.

Example metrics: The following table shows example metrics for different task types.

| Task type | Example metrics |
| --- | --- |
| Classification | {Accuracy=0.9999991465990892, Precision=0.9996998081063332, Recall=0.0013025368892873059, F1=0.0013025368892873059} |
| Clustering | {Mean Davies-Bouldin Index=0.9999991465990892} |
| Embedding | {Mean Cosine Similarity=0.9999991465990892, Mean Jaccard Similarity=0.9996998081063332, Mean Euclidean Distance=0.0013025368892873059} |
| Regression | {MAE=0.9999991465990892, MSE=0.9996998081063332, RMSE=0.0013025368892873059, MAPE=0.0013025368892873059, R²=0.0043025368892873059} |
| Text generation | {Mean BLEU=0.9999991465990892, Mean ROUGE=0.9996998081063332, Mean Semantic Similarity=0.0013025368892873059} |

Example: After you have registered the AI model by using the CREATE MODEL statement, run the model by using the ML_EVALUATE function in a SQL query. The following example statement registers a remote OpenAI model for a classification task.

``CREATE MODEL `my_remote_model` INPUT (f1 INT, f2 STRING) OUTPUT (output_label STRING) WITH( 'task' = 'classification', 'type' = 'remote', 'provider' = 'openai', 'openai.endpoint' = 'https://api.openai.com/v1/llm/v1/chat', 'openai.api_key' = '<api-key>' );``

The following statements show how to run the ML_EVALUATE function on various versions of my_remote_model using data in a table named eval_data.

Model evaluation with all versions: ``SELECT ML_EVALUATE(`my_remote_model$all`, label, f1, f2) FROM `eval_data`;``

Model evaluation with the default version: ``SELECT ML_EVALUATE(`my_remote_model`, label, f1, f2) FROM `eval_data`;``

Model evaluation with specific version 2: ``SELECT ML_EVALUATE(`my_remote_model$2`, label, f1, f2) FROM `eval_data`;``

#### KEY_SEARCH_AGG

Run a key search over an external table.
Syntax: `KEY_SEARCH_AGG(<external_table>, DESCRIPTOR(<input_column>), <search_column>);`

Description: Use the KEY_SEARCH_AGG function to run key searches over external databases in Confluent Cloud for Apache Flink. The KEY_SEARCH_AGG function uses a combination of serialized table properties and configuration settings to interact with external databases. It’s designed to handle the deserialization of table properties and manage the runtime environment for executing search queries.

The output of KEY_SEARCH_AGG is an array with all rows in the external table that have a matching key in the search column.

| `<input_column>` | Search result |
| --- | --- |
| `<input_column_key>` | `array[row1<column1, column2…>, row2<column1, column2…>, …]` |

#### ML_FORECAST

Perform continuous forecasting on a table.

Syntax: `ML_FORECAST( data_column, timestamp_column, JSON_OBJECT('p' VALUE 1, 'q' VALUE 1, 'd' VALUE 1, 'minTrainingSize' VALUE 10));`

Description: The ML_FORECAST function uses an ARIMA model to perform time-series forecasting. Your data must include:
- A timestamp column.
- A target column representing some quantity of interest at each timestamp.

For more information, see Forecast Data Trends.

Parameters: For forecasting parameters, see ARIMA model parameters.

Example: `SELECT ML_FORECAST( total_orderunits, summed_ts, JSON_OBJECT('p' VALUE 1, 'q' VALUE 1, 'd' VALUE 1, 'minTrainingSize' VALUE 10)) OVER ( ORDER BY summed_ts RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS forecast FROM test_table;`

#### AI_COMPLETE

Invoke a large language model (LLM) to generate text completions, summaries, or answers.

Syntax: `AI_COMPLETE(model_name, input_prompt [, invocation_config]);`

Description: The AI_COMPLETE function provides a streamlined approach for generating text, taking a single string as input and returning a single string as output. This functionality enables you to leverage LLMs to produce text based on any given prompt.

Configuration:
- model_name: Name of the model entity to call for prediction [STRING].
- input_prompt: Input prompt to pass to the LLM for prediction [STRING].
- invocation_config (optional): Map to pass the configuration to manage function behavior, for example, `MAP['debug', true]`.

Example: The following example shows how to invoke an LLM to generate text completions.

Create an OpenAI connection: `CREATE CONNECTION openai_connection WITH ( 'type' = 'openai', 'endpoint' = 'https://api.openai.com/v1/chat/completions', 'api-key' = '<api-key>' );`

Create the model: `CREATE MODEL description_extractor INPUT (input STRING) OUTPUT (output_json STRING) WITH( 'provider' = 'openai', 'openai.connection' = 'openai_connection', 'openai.system_prompt' = 'Extract json from input free text', 'task' = 'text_generation' );`

Create the output table: `CREATE TABLE claims_with_structured_description(id INT, customer_id INT, output_json STRING);`

Run the model: `INSERT INTO claims_with_structured_description SELECT id, customer_id, output_json FROM claims_submitted, LATERAL TABLE(AI_COMPLETE('description_extractor', description));`

#### AI_EMBEDDING

Generate vector embeddings for text or other data using a registered embedding model.

Syntax: `AI_EMBEDDING(model_name, input_text [, invocation_config]);`

Description: The AI_EMBEDDING function provides a straightforward interface, accepting a single string input and returning an array of floats as the embedding response. This functionality enables you to leverage large language models (LLMs) to generate embeddings for text efficiently.

Configuration:
- model_name: Name of the model entity to call for embeddings [STRING].
- input_text: Input text to pass to the LLM for embeddings [STRING].
- invocation_config (optional): Map to pass the configuration to manage function behavior, for example, `MAP['debug', true]`.

Example: The following example shows how to generate vector embeddings for text or other data using a registered embedding model.

Create an OpenAI connection:
`CREATE CONNECTION openai_embedding_connection WITH ( 'type' = 'openai', 'endpoint' = 'https://api.openai.com/v1/embeddings', 'api-key' = '<api-key>' );`

Create the model: `CREATE MODEL description_embedding INPUT (input STRING) OUTPUT (embeddings ARRAY<FLOAT>) WITH( 'provider' = 'openai', 'openai.connection' = 'openai_embedding_connection', 'task' = 'embedding' );`

Create the output table: `CREATE TABLE claims_embeddings(id INT, customer_id INT, embeddings ARRAY<FLOAT>);`

Run the model: `INSERT INTO claims_embeddings SELECT id, customer_id, embeddings FROM claims_submitted, LATERAL TABLE(AI_EMBEDDING('description_embedding', description));`

#### AI_TOOL_INVOKE

Invoke a registered tool, either externally by using an MCP server or locally by using a UDF, as part of an AI workflow.

Syntax: `AI_TOOL_INVOKE(model_name, input_prompt, remote_udf_descriptor, mcp_tool_descriptor [, invocation_config]);`

Description: The AI_TOOL_INVOKE function enables large language models (LLMs) to access various tools. The LLM decides which tools should be accessed, then the AI_TOOL_INVOKE function invokes the tools, gets the responses, and returns the responses to the LLM. The function returns a map that includes all the tools that were accessed, along with their responses and the status of the call, indicating whether it was a SUCCESS or FAILURE. This function supports only SSE-based MCP servers.

The following models are supported:
- Anthropic
- AzureOpenAI
- Gemini
- OpenAI

Note: The AI_TOOL_INVOKE function is available for preview. A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features. Confluent may discontinue providing preview releases of the Preview features at any time in Confluent’s sole discretion.

Configuration:
- model_name: Name of the model entity to call [STRING].
- input_prompt: Input prompt to pass to the LLM [STRING].
- remote_udf_descriptor: Map to pass UDF names as key and function description as value [MAP<String, String>]. A maximum of 3 UDFs can be passed.
- mcp_tool_descriptor: Map to pass MCP tool names as key and tool description as value [MAP<String, String>]. A maximum of 5 tools can be passed. If the MCP server already has a description for a tool, the description you pass here is sent to the LLM as “Additional description”. If the server doesn’t have a description, mcp_tool_descriptor is added as the description. You can leave it empty, in which case no changes are made to the description provided by the server.
- invocation_config (optional): Map to pass the config to manage function behavior, for example, `MAP['debug', true, 'on_error', 'continue']`.

Example: The following example shows how to invoke a UDF and a registered external tool or API as part of an AI workflow. When you create an MCP server connection, specify the following options:
- endpoint: Defines the base URL for all non-SSE communications with the MCP server, including other HTTP calls and general data exchange.
- sse-endpoint: Specifies the explicit URL endpoint used to establish a Server-Sent Events (SSE) connection with the MCP server. If omitted, the client defaults to constructing the SSE endpoint by appending /sse to the domain specified in endpoint.

Create an MCP server connection: `CREATE CONNECTION claims_mcp_server WITH ( 'type' = 'mcp_server', 'endpoint' = 'https://mcp.deepwiki.com', 'sse-endpoint' = 'https://mcp.deepwiki.com/sse', 'api-key' = 'api_key' );`

Create a model that uses the MCP server connection: `CREATE MODEL tool_invoker INPUT (input_message STRING) OUTPUT (tool_calls STRING) WITH( 'provider' = 'openai', 'openai.connection' = 'openai_connection', 'openai.system_prompt' = 'Select the best tools to complete the task', 'mcp.connection' = 'claims_mcp_server' );`

Create a table that contains the input prompts:
`CREATE TABLE claims_verified ( id int, customer_id int );`

Run the AI_TOOL_INVOKE function: `SELECT id, customer_id, AI_TOOL_INVOKE( 'tool_invoker', customer_id, MAP['udf_1', 'udf_1 description', 'udf_2', 'udf_2 description'], MAP['tool_1', 'tool_1_description', 'tool_2', 'tool_2_description'] ) AS verified_result FROM claims_verified;`

#### TEXT_SEARCH_AGG

Run a text search over an external table.

Syntax: `SELECT * FROM key_input, LATERAL TABLE(TEXT_SEARCH_AGG(<external_table>, DESCRIPTOR(<input_column>), <search_column>, <LIMIT>));`

Description: Use the TEXT_SEARCH_AGG function to run full-text searches over external databases in Confluent Cloud for Apache Flink. The TEXT_SEARCH_AGG function uses a combination of serialized table properties and configuration settings to interact with external databases. It’s designed to handle the deserialization of table properties and manage the runtime environment for executing search queries.

The output of TEXT_SEARCH_AGG is an array with all rows in the external table that have matching text in the search column.

| `<input_column>` | Search result |
| --- | --- |
| `<input_column_text>` | `array[row1<column1, column2…>, row2<column1, column2…>, …]` |

#### VECTOR_SEARCH_AGG

Run a vector search over an external table.

Syntax: `VECTOR_SEARCH_AGG(<external_table>, DESCRIPTOR(<input_column>), <embedding_column>, <LIMIT>);`

Note: Vector Search is an Open Preview feature in Confluent Cloud. A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features. Confluent may discontinue providing preview releases of the Preview features at any time in Confluent’s sole discretion.
Description: Use the VECTOR_SEARCH_AGG function in conjunction with AI model inference to enable LLM-RAG use cases on Confluent Cloud. The VECTOR_SEARCH_AGG function uses a combination of serialized table properties and configuration settings to interact with external databases. It’s designed to handle the deserialization of table properties and manage the runtime environment for executing search queries.

The output of VECTOR_SEARCH_AGG is an array with all rows in the external table that have a matching vector in the search column.

| `<input_column>` | Search result |
| --- | --- |
| `<input_column_vector>` | `array[row1<column1, column2…>, row2<column1, column2…>, …]` |

Example: After you have registered the AI inference model by using the CREATE MODEL statement, you can start running vector searches. The following example assumes a vector search endpoint as shown in Elasticsearch Quick Start Guide and an API key as shown in Kibana API Keys. Once your vector search is created, the following example shows these steps:
1. Create a connection resource with the Elasticsearch endpoint and API key.
2. Create an Elasticsearch external table.
3. Create an input vector table.
4. Run the vector search.

Run the following statement to create a connection resource named elastic-connection that uses your Elasticsearch endpoint and API key.

`CREATE CONNECTION elastic-connection WITH ( 'type' = 'elastic', 'endpoint' = '<ELASTICSEARCH_ENDPOINT>', 'api-key' = '<ELASTIC_API_KEY>' );`

Run the following statements to create the tables and run the vector search.

Create the external table: `CREATE TABLE elastic ( vector array<FLOAT>, text string ) WITH ( 'connector' = 'elastic', 'elastic.connection' = 'elastic-connection', 'elastic.index' = 'vector-search-index' );`

Create the embedding output table: `CREATE TABLE embedding_output (text string, embedding array<float>);`

Insert mock data: `INSERT INTO embedding_output values ('hello world', ARRAY[1, 5, -20]);`

Run the vector search:
`SELECT * FROM embedding_output, LATERAL TABLE(VECTOR_SEARCH_AGG('elastic', DESCRIPTOR(embedding), embedding, 3));`

For more examples, see Vector Search with Confluent Cloud for Apache Flink.

#### Code Examples

```sql
ML_PREDICT(`model_name[$version_id]`, column);

-- map settings are optional
ML_PREDICT(`model_name[$version_id]`, column, map['async_enabled', [boolean], 'client_timeout', [int], 'max_parallelism', [int], 'retry_count', [int]]);
```

```sql
SELECT id, text, embedding FROM text_stream, LATERAL TABLE(ML_PREDICT('embeddingmodel', text));
```

```sql
-- Specify the timeout.
SELECT * FROM `db1`.`tb1`, LATERAL TABLE(ML_PREDICT('md1', key, map['client_timeout', 60 ]));

-- Specify all configuration parameters.
SELECT * FROM `db1`.`tb1`, LATERAL TABLE(ML_PREDICT('md1', key, map['async_enabled', true, 'client_timeout', 60, 'max_parallelism', 20, 'retry_count', 5]));
```
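
The configuration notes say that max_parallelism applies only when async_enabled is true. As a sketch of the synchronous case, the following call reuses the table and model names from the examples above, turns off async execution, and leaves max_parallelism out. The specific timeout and retry values are arbitrary.

```sql
-- Synchronous calls: max_parallelism is omitted because it can be used
-- only when async_enabled is true.
SELECT * FROM `db1`.`tb1`, LATERAL TABLE(ML_PREDICT('md1', key, map['async_enabled', false, 'client_timeout', 60, 'retry_count', 5]));
```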

```sql
ML_DETECT_ANOMALIES(
 data_column,
 timestamp_column,
 JSON_OBJECT('p' VALUE 1, 'q' VALUE 1, 'd' VALUE 1, 'minTrainingSize' VALUE 10));
```

```sql
SELECT
    ML_DETECT_ANOMALIES(
     total_orderunits,
     summed_ts,
     JSON_OBJECT('p' VALUE 1, 'q' VALUE 1, 'd' VALUE 1, 'minTrainingSize' VALUE 10))
    OVER (
        ORDER BY summed_ts
        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS anomalies
FROM test_table;
```

```sql
ML_EVALUATE(`model_name`, label, col1, col2, ...) FROM 'eval_data_table';
```

```sql
CREATE MODEL `my_remote_model`
INPUT (f1 INT, f2 STRING)
OUTPUT (output_label STRING)
WITH(
  'task' = 'classification',
  'type' = 'remote',
  'provider' = 'openai',
  'openai.endpoint' = 'https://api.openai.com/v1/llm/v1/chat',
  'openai.api_key' = '<api-key>'
);
```

```sql
-- Model evaluation with all versions
SELECT ML_EVALUATE(`my_remote_model$all`, label, f1, f2) FROM `eval_data`;

-- Model evaluation with default version
SELECT ML_EVALUATE(`my_remote_model`, label, f1, f2) FROM `eval_data`;

-- Model evaluation with specific version 2
SELECT ML_EVALUATE(`my_remote_model$2`, label, f1, f2) FROM `eval_data`;
```

```sql
KEY_SEARCH_AGG(<external_table>, DESCRIPTOR(<input_column>), <search_column>);
```

```sql
ML_FORECAST(
 data_column,
 timestamp_column,
 JSON_OBJECT('p' VALUE 1, 'q' VALUE 1, 'd' VALUE 1, 'minTrainingSize' VALUE 10));
```

```sql
SELECT
    ML_FORECAST(
     total_orderunits,
     summed_ts,
     JSON_OBJECT('p' VALUE 1, 'q' VALUE 1, 'd' VALUE 1, 'minTrainingSize' VALUE 10))
    OVER (
        ORDER BY summed_ts
        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS forecast
FROM test_table;
```

```sql
AI_COMPLETE(model_name, input_prompt [, invocation_config]);
```

```sql
MAP['debug', true]
```

```sql
-- Create an OpenAI connection.
CREATE CONNECTION openai_connection
  WITH (
    'type' = 'openai',
    'endpoint' = 'https://api.openai.com/v1/chat/completions',
    'api-key' = '<api-key>'
  );

CREATE MODEL description_extractor
  INPUT (input STRING)
  OUTPUT (output_json STRING)
WITH(
    'provider' = 'openai',
    'openai.connection' = 'openai_connection',
    'openai.system_prompt' = 'Extract json from input free text',
    'task' = 'text_generation'
  );

CREATE TABLE claims_with_structured_description(id INT, customer_id INT, output_json STRING);

INSERT INTO claims_with_structured_description
  SELECT id, customer_id, output_json FROM claims_submitted, LATERAL TABLE(AI_COMPLETE('description_extractor', description));
```

```sql
AI_EMBEDDING(model_name, input_text [, invocation_config]);
```

```sql
-- Create an OpenAI connection.
CREATE CONNECTION openai_embedding_connection
  WITH (
    'type' = 'openai',
    'endpoint' = 'https://api.openai.com/v1/embeddings',
    'api-key' = '<api-key>'
  );

CREATE MODEL description_embedding
  INPUT (input STRING)
  OUTPUT (embeddings ARRAY<FLOAT>)
  WITH(
    'provider' = 'openai',
    'openai.connection' = 'openai_embedding_connection',
    'task' = 'embedding'
  );

CREATE TABLE claims_embeddings(id INT, customer_id INT, embeddings ARRAY<FLOAT>);

INSERT INTO claims_embeddings
  SELECT id, customer_id, embeddings FROM claims_submitted, LATERAL TABLE(AI_EMBEDDING('description_embedding', description));
```

```sql
AI_TOOL_INVOKE(model_name, input_prompt, remote_udf_descriptor, mcp_tool_descriptor [, invocation_config]);
```

```sql
MAP['debug', true, 'on_error', 'continue']
```

```sql
-- Create an MCP server connection.
CREATE CONNECTION claims_mcp_server
  WITH (
    'type' = 'mcp_server',
    'endpoint' = 'https://mcp.deepwiki.com',
    'sse-endpoint' = 'https://mcp.deepwiki.com/sse',
    'api-key' = 'api_key'
  );
```

```sql
-- Create a model that uses the MCP server connection.
CREATE MODEL tool_invoker
  INPUT (input_message STRING)
  OUTPUT (tool_calls STRING)
  WITH(
    'provider' = 'openai',
    'openai.connection' = 'openai_connection',
    'openai.system_prompt' = 'Select the best tools to complete the task',
    'mcp.connection' = 'claims_mcp_server'
  );

-- Create a table that contains the input prompts.
CREATE TABLE claims_verified (
  id int,
  customer_id int
);

-- Run the AI_TOOL_INVOKE function.
SELECT
  id,
  customer_id,
  AI_TOOL_INVOKE(
    'tool_invoker',
    customer_id,
    MAP['udf_1', 'udf_1 description', 'udf_2', 'udf_2 description'],
    MAP['tool_1', 'tool_1_description', 'tool_2', 'tool_2_description']
  ) AS verified_result
FROM claims_verified;
```

```sql
SELECT * FROM key_input,
  LATERAL TABLE(TEXT_SEARCH_AGG(<external_table>, DESCRIPTOR(<input_column>), <search_column>, <LIMIT>));
```

```sql
VECTOR_SEARCH_AGG(<external_table>, DESCRIPTOR(<input_column>), <embedding_column>, <LIMIT>);
```

```sql
CREATE CONNECTION elastic-connection
  WITH (
    'type' = 'elastic',
    'endpoint' = '<ELASTICSEARCH_ENDPOINT>',
    'api-key' = '<ELASTIC_API_KEY>'
  );
```

```sql
-- Create the external table.
CREATE TABLE elastic (
  vector array<FLOAT>,
  text string
) WITH (
  'connector' = 'elastic',
  'elastic.connection' = 'elastic-connection',
  'elastic.index' = 'vector-search-index'
);

-- Create the embedding output table.
CREATE TABLE embedding_output (text string, embedding array<float>);

-- Insert mock data.
INSERT INTO embedding_output values ('hello world', ARRAY[1, 5, -20]);

-- Run the vector search.
SELECT * FROM embedding_output, LATERAL TABLE(VECTOR_SEARCH_AGG('elastic', DESCRIPTOR(embedding), embedding, 3));
```

---

### SQL numeric functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/numeric-functions.html

Confluent Cloud for Apache Flink® provides these built-in numeric functions to use in SQL queries:

- Numeric: ABS, BIN, CEILING, E, EXP, FLOOR, LN, LOG, LOG10, LOG2, PERCENTILE, PI, POWER, ROUND, SIGN, SQRT, TRUNCATE
- Trigonometry: ACOS, ASIN, ATAN, ATAN2, COS, COSH, COT, DEGREES, RADIANS, SIN, SINH, TAN, TANH
- Random number generators: RAND, RAND(INT), RAND_INTEGER(INT), RAND_INTEGER(INT1, INT2)
- Utility: HEX, UUID, UNHEX

ABS: Gets the absolute value of a number. Syntax: `ABS(numeric)`. Description: The ABS function returns the absolute value of the specified NUMERIC. Examples: `SELECT ABS(-23);` returns 23; `SELECT ABS(23);` returns 23.

ACOS: Computes the arccosine. Syntax: `ACOS(numeric)`. Description: The ACOS function returns the arccosine of the specified NUMERIC. Examples: `SELECT ACOS(0);` returns 1.5707963267948966 (approximately PI/2); `SELECT ACOS(1);` returns 0.0.

ASIN: Computes the arcsine. Syntax: `ASIN(numeric)`. Description: The ASIN function returns the arcsine of the specified NUMERIC. Examples: `SELECT ASIN(0);` returns 0.0; `SELECT ASIN(1);` returns 1.5707963267948966 (approximately PI/2).

ATAN: Computes the arctangent. Syntax: `ATAN(numeric)`. Description: The ATAN function returns the arctangent of the specified NUMERIC. Examples: `SELECT ATAN(0);` returns 0.0; `SELECT ATAN(1);` returns 0.7853981633974483 (approximately PI/4).

ATAN2: Computes the arctangent of a 2D point. Syntax: `ATAN2(numeric1, numeric2)`. Description: Returns the arctangent of the coordinate specified by (numeric1, numeric2). Examples: `SELECT ATAN2(0, 0);` returns 0.0; `SELECT ATAN2(1, 1);` returns 0.7853981633974483 (approximately PI/4).

BIN: Converts an INTEGER number to binary. Syntax: `BIN(int)`. Description: The BIN function returns a string representation of the specified INTEGER in binary format. Returns NULL if int is NULL. Examples: `SELECT BIN(4);` returns "100"; `SELECT BIN(12);` returns "1100".

CEILING: Rounds a number up.
Syntax: `CEILING(numeric)`. Description: The CEILING function rounds the specified NUMERIC up and returns the smallest integer that's greater than or equal to the NUMERIC. This function can be abbreviated to `CEIL(numeric)`. Examples: `SELECT CEIL(23.55);` returns 24; `SELECT CEIL(-23.55);` returns -23.

COS: Computes the cosine of an angle. Syntax: `COS(numeric)`. Description: Returns the cosine of the specified NUMERIC in radians. Examples: `SELECT COS(0);` returns 1.0; `SELECT COS(PI()/2);` returns 6.123233995736766E-17 (approximately 0).

COSH: Computes the hyperbolic cosine. Syntax: `COSH(numeric)`. Description: The COSH function returns the hyperbolic cosine of the specified NUMERIC. The return value type is DOUBLE. Example: `SELECT COSH(0);` returns 1.0.

COT: Computes the cotangent of an angle. Syntax: `COT(numeric)`. Description: The COT function returns the cotangent of the specified NUMERIC in radians. Example: `SELECT COT(PI()/2);` returns 6.123233995736766E-17 (approximately 0).

DEGREES: Converts an angle in radians to degrees. Syntax: `DEGREES(numeric)`. Description: The DEGREES function converts the specified NUMERIC value in radians to degrees. Examples: `SELECT DEGREES(PI()/2);` returns 90.0; `SELECT DEGREES(PI());` returns 180.0; `SELECT DEGREES(-PI()/4);` returns -45.0.

E: Gets the approximate value of e. Syntax: `E()`. Description: Returns a value that is closer than any other values to e, the base of the natural logarithm. Examples: `SELECT E();` returns 2.718281828459045, which is the approximate value of e; `SELECT LN(E());` returns 1.0.

EXP: Computes e raised to a power. Syntax: `EXP(numeric)`. Description: The EXP function returns e, the base of the natural logarithm, raised to the power of the specified NUMERIC. Examples: `SELECT EXP(1);` returns 2.718281828459045, which is the approximate value of e; `SELECT EXP(2);` returns 7.38905609893065; `SELECT EXP(-1);` returns 0.36787944117144233.

FLOOR: Rounds a number down.
Syntax: `FLOOR(numeric)`. Description: The FLOOR function rounds the specified NUMERIC down and returns the largest integer that is less than or equal to the NUMERIC. Examples: `SELECT FLOOR(23.55);` returns 23; `SELECT FLOOR(-23.55);` returns -24.

HEX: Converts an integer or string to hexadecimal. Syntax: `HEX(numeric)` or `HEX(string)`. Description: The HEX function returns a string representation of an integer NUMERIC value or a STRING in hexadecimal format. Returns NULL if the argument is NULL. Examples: `SELECT HEX(20);` returns "14"; `SELECT HEX(100);` returns "64"; `SELECT HEX('hello,world');` returns "68656C6C6F2C776F726C64". Related function: UNHEX.

LN: Computes the natural log. Syntax: `LN(numeric)`. Description: The LN function returns the natural logarithm (base e) of the specified NUMERIC. Examples: `SELECT LN(E());` returns 1.0; `SELECT LN(1);` returns 0.0.

LOG: Computes a logarithm. Syntax: `LOG(numeric1, numeric2)`. Description: The LOG function returns the logarithm of numeric2 to the base of numeric1. When called with one argument, returns the natural logarithm of numeric2. numeric2 must be greater than 0, and numeric1 must be greater than 1. Examples: `SELECT LOG(10, 10);` returns 1.0; `SELECT LOG(2, 256);` returns 8.0; `SELECT LOG(E());` returns 1.0.

LOG10: Computes the base-10 logarithm. Syntax: `LOG10(numeric)`. Description: The LOG10 function returns the base-10 logarithm of the specified NUMERIC. Examples: `SELECT LOG10(10);` returns 1.0; `SELECT LOG10(1000);` returns 3.0.

LOG2: Computes the base-2 logarithm. Syntax: `LOG2(numeric)`. Description: The LOG2 function returns the base-2 logarithm of the specified NUMERIC. Examples: `SELECT LOG2(2);` returns 1.0; `SELECT LOG2(1024);` returns 10.0.

PERCENTILE: Gets a percentile value based on a continuous distribution. Syntax: `PERCENTILE(expr, percentage[, frequency])`. Arguments: expr: A NUMERIC expression. percentage: A NUMERIC expression between 0 and 1, or an ARRAY of NUMERIC expressions, each between 0 and 1.
frequency: An optional integral number greater than 0 that describes the number of times expr must be counted. The default is 1. Returns: DOUBLE if percentage is numeric, or an ARRAY of DOUBLE if percentage is an ARRAY. Description: The PERCENTILE function returns a percentile value based on a continuous distribution of the input column. If no input row lies exactly at the desired percentile, the result is calculated using linear interpolation of the two nearest input values. NULL values are ignored in the calculation. Examples: `SELECT PERCENTILE(col, 0.3) FROM (VALUES (0), (10), (10)) AS col;` returns 6.0; `SELECT PERCENTILE(col, 0.3, freq) FROM ( VALUES (0, 1), (10, 2)) AS tab(col, freq);` returns 6.0; `SELECT PERCENTILE(col, ARRAY(0.25, 0.75)) FROM (VALUES (0), (10)) AS col;` returns [2.5,7.5]; `SELECT PERCENTILE(age, 0.5) FROM (VALUES 0, 50, 100) AS age;` returns 50.0.

PI: Gets the approximate value of pi. Syntax: `PI()`. Description: The PI function returns a value that is closer than any other values to pi. Examples: `SELECT PI();` returns 3.141592653589793 (approximately PI); `SELECT COS(PI());` returns -1.0.

POWER: Raises a number to a power. Syntax: `POWER(numeric1, numeric2)`. Description: The POWER function returns numeric1 raised to the power of numeric2. Examples: `SELECT POWER(10, 3);` returns 1000.0; `SELECT POWER(2, 8);` returns 256.0; `SELECT POWER(500, 0);` returns 1.0.

RADIANS: Converts an angle in degrees to radians. Syntax: `RADIANS(numeric)`. Description: The RADIANS function converts the specified NUMERIC value in degrees to radians. Examples: `SELECT RADIANS(180);` returns 3.141592653589793 (approximately PI); `SELECT RADIANS(45);` returns 0.7853981633974483 (approximately PI/4).

RAND: Gets a random number. Syntax: `RAND()`. Description: The RAND function returns a pseudorandom DOUBLE value in the range [0.0, 1.0). Example: `SELECT RAND();` returns a value such as 0.9346105267662114.

RAND(INT): Gets a random number from a seed.
Syntax: `RAND(seed INT)`. Description: The RAND(INT) function returns a pseudorandom DOUBLE value in the range [0.0, 1.0) with the initial seed integer. Two RAND functions return identical sequences of numbers if they have the same initial seed value. Examples: `SELECT RAND(23);` returns 0.7321323355141605; `SELECT RAND(42);` returns 0.7275636800328681.

RAND_INTEGER(INT): Gets a pseudorandom integer. Syntax: `RAND_INTEGER(upper_bound INT)`. Description: The RAND_INTEGER(INT) function returns a pseudorandom integer value in the range [0, upper_bound). Examples: `SELECT RAND_INTEGER(23);` returns a value such as 20; `SELECT RAND_INTEGER(42);` returns a value such as 28.

RAND_INTEGER(INT1, INT2): Gets a random integer in a range. Syntax: `RAND_INTEGER(seed INT, upper_bound INT)`. Description: The RAND_INTEGER(INT1, INT2) function returns a pseudorandom integer value in the range [0, upper_bound) with the initial seed value seed. Two RAND_INTEGER functions return identical sequences of numbers if they have the same initial seed and bound. Examples: `SELECT RAND_INTEGER(23, 1000);` returns 227; `SELECT RAND_INTEGER(42, 10000);` returns 1130.

ROUND: Rounds a number to the specified precision. Syntax: `ROUND(numeric, int)`. Description: The ROUND function returns a number rounded to int decimal places for the specified NUMERIC. Examples: `SELECT ROUND(23.58, 1);` returns 23.6; `SELECT ROUND(PI(), 4);` returns 3.1416.

SIGN: Gets the sign of a number. Syntax: `SIGN(numeric)`. Description: The SIGN function returns the signum of the specified NUMERIC. Examples: `SELECT SIGN(-23.55);` returns -1.00; `SELECT SIGN(606.808);` returns 1.000.

SIN: Computes the sine of an angle. Syntax: `SIN(numeric)`. Description: The SIN function returns the sine of the specified NUMERIC in radians. Examples: `SELECT SIN(PI()/2);` returns 1.0; `SELECT SIN(-PI()/2);` returns -1.0.

SINH: Computes the hyperbolic sine. Syntax: `SINH(numeric)`. Description: The SINH function returns the hyperbolic sine of the specified NUMERIC. The return type is DOUBLE.
Example: `SELECT SINH(0);` returns 0.0.

SQRT: Computes the square root of a number. Syntax: `SQRT(numeric)`. Description: The SQRT function returns the square root of the specified NUMERIC, which must be greater than or equal to 0. Examples: `SELECT SQRT(64);` returns 8.0; `SELECT SQRT(100);` returns 10.0; `SELECT SQRT(144);` returns 12.0.

TAN: Computes the tangent of an angle. Syntax: `TAN(numeric)`. Description: The TAN function returns the tangent of the specified NUMERIC in radians. Examples: `SELECT TAN(0);` returns 0.0; `SELECT TAN(PI()/4);` returns 0.9999999999999999.

TANH: Computes the hyperbolic tangent. Syntax: `TANH(numeric)`. Description: The TANH function returns the hyperbolic tangent of the specified NUMERIC. The return type is DOUBLE. Examples: `SELECT TANH(0);` returns 0.0; `SELECT TANH(5);` returns 0.9999092042625951.

TRUNCATE: Truncates a number to the specified precision. Syntax: `TRUNCATE(numeric, integer)`. Description: The TRUNCATE(numeric, integer) function returns the specified NUMERIC truncated to the number of decimal places specified by integer. Returns NULL if numeric or integer is NULL. If integer is 0, the result has no decimal point or fractional part. The integer value can be negative, which causes integer digits to the left of the decimal point to become zero. If integer is not set, the function truncates as if integer were 0. Examples: `SELECT TRUNCATE(42.324, 2);` returns 42.32; `SELECT TRUNCATE(42.324);` returns 42.0; `SELECT TRUNCATE(42.324, -1);` returns 40.

UNHEX: Converts a hexadecimal expression to BINARY. Syntax: `UNHEX(str)`. Arguments: str: a hexadecimal STRING. The characters in str must be legal hexadecimal digits: 0 - 9, A - F, and a - f. Returns: A BINARY string. If str contains any nonhexadecimal digits, or is NULL, the return value is NULL. Description: The UNHEX function interprets each pair of characters in str as a hexadecimal number and converts it to the byte represented by the number.
If the length of str is odd, the first character is discarded, and the result is left-padded with a NULL byte. Examples: `SELECT DECODE(UNHEX('466C696E6B') , 'UTF-8');` returns "Flink"; `SELECT UNHEX('ZZ');` returns NULL. Related functions: DECODE, HEX.

UUID: Generates a UUID. Syntax: `UUID()`. Description: The UUID() function returns a Universally Unique Identifier (UUID) string that conforms to the RFC 4122 type 4 specification. The UUID is generated using a cryptographically strong pseudo-random number generator. Example: `SELECT UUID();` returns a value such as 3d3c68f7-f608-473f-b60c-b0c44ad4cc4e.

Related content: User-defined Functions, Create a User Defined Function.

#### Code Examples

```sql
ABS(numeric)
```

```sql
-- returns 23
SELECT ABS(-23);

-- returns 23
SELECT ABS(23);
```

```sql
ACOS(numeric)
```

```sql
-- returns 1.5707963267948966
-- (approximately PI/2)
SELECT ACOS(0);

-- returns 0.0
SELECT ACOS(1);
```

```sql
ASIN(numeric)
```

```sql
-- returns 0.0
SELECT ASIN(0);

-- returns 1.5707963267948966
-- (approximately PI/2)
SELECT ASIN(1);
```

```sql
ATAN(numeric)
```

```sql
-- returns 0.0
SELECT ATAN(0);

-- returns 0.7853981633974483
-- (approximately PI/4)
SELECT ATAN(1);
```

```sql
ATAN2(numeric1, numeric2)
```

```sql
-- returns 0.0
SELECT ATAN2(0, 0);

-- returns 0.7853981633974483
-- (approximately PI/4)
SELECT ATAN2(1, 1);
```

```sql
-- returns "100"
SELECT BIN(4);

-- returns "1100"
SELECT BIN(12);
```

```sql
CEILING(numeric)
```

```sql
CEIL(numeric)
```

```sql
-- returns 24
SELECT CEIL(23.55);

-- returns -23
SELECT CEIL(-23.55);
```

```sql
COS(numeric)
```

```sql
-- returns 1.0
SELECT COS(0);

-- returns 6.123233995736766E-17
-- (approximately 0)
SELECT COS(PI()/2);
```
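
The tiny nonzero result for COS(PI()/2) comes from PI() being a DOUBLE approximation of pi. A minimal sketch of treating such results as zero by comparing against a tolerance; the threshold 1E-9 is an arbitrary assumption chosen for illustration, not a documented constant:

```sql
-- Compare against a small tolerance instead of testing COS(PI()/2) = 0.
-- The 1E-9 threshold is an illustrative assumption.
SELECT ABS(COS(PI()/2)) < 1E-9 AS is_effectively_zero;
```

With the value shown above (about 6.1E-17), the comparison should evaluate to TRUE.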

```sql
COSH(numeric)
```

```sql
-- returns 1.0
SELECT COSH(0);
```

```sql
COT(numeric)
```

```sql
-- returns 6.123233995736766E-17
-- (approximately 0)
SELECT COT(PI()/2);
```

```sql
DEGREES(numeric)
```

```sql
-- returns 90.0
SELECT DEGREES(PI()/2);

-- returns 180.0
SELECT DEGREES(PI());

-- returns -45.0
SELECT DEGREES(-PI()/4);
```

```sql
-- returns 2.718281828459045
-- which is the approximate value of e
SELECT E();

-- returns 1.0
SELECT LN(E());
```

```sql
EXP(numeric)
```

```sql
-- returns 2.718281828459045
-- which is the approximate value of e
SELECT EXP(1);

-- returns 7.38905609893065
SELECT EXP(2);

-- returns 0.36787944117144233
SELECT EXP(-1);
```

```sql
FLOOR(numeric)
```

```sql
-- returns 23
SELECT FLOOR(23.55);

-- returns -24
SELECT FLOOR(-23.55);
```

```sql
HEX(numeric)
HEX(string)
```

```sql
-- returns "14"
SELECT HEX(20);

-- returns "64"
SELECT HEX(100);

-- returns "68656C6C6F2C776F726C64"
SELECT HEX('hello,world');
```

```sql
LN(numeric)
```

```sql
-- returns 1.0
SELECT LN(E());

-- returns 0.0
SELECT LN(1);
```

```sql
LOG(numeric1, numeric2)
```

```sql
-- returns 1.0
SELECT LOG(10, 10);

-- returns 8.0
SELECT LOG(2, 256);

-- returns 1.0
SELECT LOG(E());
```
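
The two-argument form of LOG relates to LN through the change-of-base identity, log_b(x) = ln(x) / ln(b). A small sketch that puts the three ways of computing a base-2 logarithm side by side; because the results are DOUBLE values, the manually divided form may differ from the others in the last digits:

```sql
-- Each column computes the base-2 logarithm of 256.
SELECT
  LOG(2, 256)     AS log_two_args,    -- logarithm of 256 to the base 2
  LN(256) / LN(2) AS change_of_base,  -- same value via natural logarithms
  LOG2(256)       AS log2_shortcut;   -- dedicated base-2 function
```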

```sql
LOG10(numeric)
```

```sql
-- returns 1.0
SELECT LOG10(10);

-- returns 3.0
SELECT LOG10(1000);
```

```sql
LOG2(numeric)
```

```sql
-- returns 1.0
SELECT LOG2(2);

-- returns 10.0
SELECT LOG2(1024);
```

```sql
PERCENTILE(expr, percentage[, frequency])
```

```sql
-- returns 6.0
SELECT PERCENTILE(col, 0.3) FROM (VALUES (0), (10), (10)) AS col;

-- returns 6.0
SELECT PERCENTILE(col, 0.3, freq) FROM ( VALUES (0, 1), (10, 2)) AS tab(col, freq);

-- returns [2.5,7.5]
SELECT PERCENTILE(col, ARRAY(0.25, 0.75)) FROM (VALUES (0), (10)) AS col;

-- returns 50.0
SELECT PERCENTILE(age, 0.5) FROM (VALUES 0, 50, 100) AS age;
```
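
The first PERCENTILE example can be checked by hand. The sketch below assumes that the interpolation position is computed as percentage * (n - 1) over the sorted input, which is a common convention and is consistent with all four results above, but is not stated explicitly in this reference:

```sql
-- Sorted input: 0, 10, 10 (n = 3), percentage = 0.3.
-- Assumed position: 0.3 * (3 - 1) = 0.6, which lies between index 0 (value 0)
-- and index 1 (value 10).
-- Interpolated value: lower + (upper - lower) * fractional part of the position.
SELECT 0 + (10 - 0) * (0.3 * (3 - 1)) AS interpolated_by_hand;
```

The expression works out to 6, which agrees with the `-- returns 6.0` result of the first example.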

```sql
-- returns 3.141592653589793
-- (approximately PI)
SELECT PI();

-- returns -1.0
SELECT COS(PI());
```

```sql
POWER(numeric1, numeric2)
```

```sql
-- returns 1000.0
SELECT POWER(10, 3);

-- returns 256.0
SELECT POWER(2, 8);

-- returns 1.0
SELECT POWER(500, 0);
```

```sql
RADIANS(numeric)
```

```sql
-- returns 3.141592653589793
-- (approximately PI)
SELECT RADIANS(180);

-- returns 0.7853981633974483
-- (approximately PI/4)
SELECT RADIANS(45);
```

```sql
-- an example return value is 0.9346105267662114
SELECT RAND();
```

```sql
RAND(seed INT)
```

```sql
-- returns 0.7321323355141605
SELECT RAND(23);

-- returns 0.7275636800328681
SELECT RAND(42);
```
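
Because a seeded RAND produces the same sequence for the same seed, two calls with an identical seed are expected to agree. A minimal sketch of that property; the column aliases are illustrative only:

```sql
-- Both calls use seed 23, so the two columns are expected to hold the same value.
SELECT RAND(23) AS first_call, RAND(23) AS second_call;
```

This can be useful when a query needs reproducible pseudorandom values, for example in tests.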

```sql
RAND_INTEGER(upper_bound INT)
```

```sql
-- an example return value is 20
SELECT RAND_INTEGER(23);

-- an example return value is 28
SELECT RAND_INTEGER(42);
```

```sql
RAND_INTEGER(seed INT, upper_bound INT)
```

```sql
-- returns 227
SELECT RAND_INTEGER(23, 1000);

-- returns 1130
SELECT RAND_INTEGER(42, 10000);
```

```sql
ROUND(numeric, int)
```

```sql
-- returns 23.6
SELECT ROUND(23.58, 1);

-- returns 3.1416
SELECT ROUND(PI(), 4);
```

```sql
SIGN(numeric)
```

```sql
-- returns -1.00
SELECT SIGN(-23.55);

-- returns 1.000
SELECT SIGN(606.808);
```

```sql
SIN(numeric)
```

```sql
-- returns 1.0
SELECT SIN(PI()/2);

-- returns -1.0
SELECT SIN(-PI()/2);
```

```sql
SINH(numeric)
```

```sql
-- returns 0.0
SELECT SINH(0);
```

```sql
SQRT(numeric)
```

```sql
-- returns 8.0
SELECT SQRT(64);

-- returns 10.0
SELECT SQRT(100);

-- returns 12.0
SELECT SQRT(144);
```

```sql
TAN(numeric)
```

```sql
-- returns 0.0
SELECT TAN(0);

-- returns 0.9999999999999999
SELECT TAN(PI()/4);
```

```sql
TANH(numeric)
```

```sql
-- returns 0.0
SELECT TANH(0);

-- returns 0.9999092042625951
SELECT TANH(5);
```

```sql
TRUNCATE(numeric, integer)
```

```sql
-- returns 42.32
SELECT TRUNCATE(42.324, 2);

-- returns 42.0
SELECT TRUNCATE(42.324);

-- returns 40
SELECT TRUNCATE(42.324, -1);
```
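
TRUNCATE and ROUND both take a precision argument but treat the discarded digits differently. A short sketch that applies both to the input used in the ROUND example:

```sql
-- TRUNCATE drops the digits beyond the requested precision;
-- ROUND moves to the nearest value at that precision.
SELECT
  TRUNCATE(23.58, 1) AS truncated,  -- the trailing 8 is dropped
  ROUND(23.58, 1)    AS rounded;    -- 23.6, as in the ROUND example
```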

```sql
-- returns "Flink"
SELECT DECODE(UNHEX('466C696E6B') , 'UTF-8');

-- returns NULL
SELECT UNHEX('ZZ');
```
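
HEX and UNHEX are inverse operations for string input, which the following sketch illustrates with a round trip. It assumes that the original string is UTF-8 text, so that DECODE with 'UTF-8' recovers it:

```sql
-- HEX encodes the bytes of the string as hexadecimal digits,
-- UNHEX converts the digits back to BINARY,
-- and DECODE interprets the bytes as a UTF-8 string.
SELECT DECODE(UNHEX(HEX('hello,world')), 'UTF-8') AS round_trip;
```

The query is expected to return the original string, hello,world.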

```sql
-- an example return value is
-- 3d3c68f7-f608-473f-b60c-b0c44ad4cc4e
SELECT UUID();
```

---

### SQL Functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/overview.html

Confluent Cloud for Apache Flink® enables you to do data transformations and other operations with the following built-in functions:

- Aggregate Functions
- Collection Functions
- Comparison Functions
- Conditional Functions
- Datetime Functions
- Hash Functions
- JSON Functions
- ML Preprocessing Functions
- Model Inference Functions
- Numeric Functions
- String Functions
- Table API Functions

Related content: User-defined Functions, Create a User Defined Function, Flink SQL Queries, DDL Statements.

---

### SQL string functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/string-functions.html

Confluent Cloud for Apache Flink® provides these built-in string functions to use in SQL queries: ASCII, BTRIM, string1 || string2, CHARACTER_LENGTH, CHR, CONCAT, CONCAT_WS, DECODE, ELT, ENCODE, FROM_BASE64, INITCAP, INSTR, LEFT, LOCATE, LOWER, LPAD, LTRIM, OVERLAY, PARSE_URL, POSITION, REGEXP, REGEXP_EXTRACT, REGEXP_REPLACE, REPEAT, REPLACE, REVERSE, RIGHT, RPAD, RTRIM, SPLIT_INDEX, STR_TO_MAP, SUBSTRING, TO_BASE64, TRANSLATE, TRIM, UPPER, URL_DECODE, URL_ENCODE.

ASCII: Gets the ASCII value of the first character of a string. Syntax: `ASCII(string)`. Description: The ASCII function returns the numeric value of the first character of the specified string. Returns NULL if string is NULL. Examples: `SELECT ASCII('abc');` returns 97; `SELECT ASCII(CAST(NULL AS VARCHAR));` returns NULL.

string1 || string2: Concatenates two strings. Syntax: `string1 || string2`. Description: The || function returns the concatenation of string1 and string2. Example: `SELECT 'Flink' || 'SQL';` returns "FlinkSQL". Related functions: CONCAT, CONCAT_WS.

BTRIM: Trims both sides of a string. Syntax: `BTRIM(str[, trimStr])`. Arguments: str: A source STRING expression. trimStr: An optional STRING expression that has characters to be trimmed. The default is the space character. Returns: A trimmed STRING. Description: The BTRIM function trims the leading and trailing characters from str. Examples: `SELECT BTRIM(' www.apache.org ');` returns 'www.apache.org'; `SELECT BTRIM('/www.apache.org/', '/');` returns 'www.apache.org'; `SELECT BTRIM('/*www.apache.org*/', '/*');` returns 'www.apache.org'. Related functions: LTRIM, RTRIM, TRIM.

CHARACTER_LENGTH: Gets the length of a string. Syntax: `CHARACTER_LENGTH(string)`. Description: The CHARACTER_LENGTH function returns the number of characters in the specified string. This function can be abbreviated to `CHAR_LENGTH(string)`. Example: `SELECT CHAR_LENGTH('Thomas A. Anderson');` returns 18.

CHR: Gets the character for an ASCII code.
Syntax: `CHR(integer)`. Description: The CHR function returns the ASCII character that has the binary equivalent to the specified integer. Returns NULL if integer is NULL. If integer is larger than 255, the function computes the modulus of integer divided by 256 first and returns CHR of the modulus. Examples: `SELECT CHR(97);` returns 'a'; `SELECT CHR(353);` returns 'a'.

CONCAT: Concatenates a list of strings. Syntax: `CONCAT(string1, string2, ...)`. Description: The CONCAT function returns the concatenation of the specified strings. Returns NULL if any argument is NULL. Example: `SELECT CONCAT('AA', 'BB', 'CC');` returns "AABBCC". Related functions: string1 || string2, CONCAT_WS.

CONCAT_WS: Concatenates a list of strings with a separator. Syntax: `CONCAT_WS(string1, string2, string3, ...)`. Description: The CONCAT_WS function returns a string that concatenates string2, string3, ... with the separator specified by string1. The separator is added between the strings to be concatenated. Returns NULL if string1 is NULL. Example: `SELECT CONCAT_WS('~', 'AA', 'BB', '', 'CC');` returns "AA~BB~~CC". Related functions: string1 || string2, CONCAT.

DECODE: Decodes a binary into a string. Syntax: `DECODE(binary, string)`. Description: The DECODE function decodes the binary argument into a string using the specified character set. Returns NULL if either argument is NULL. These are the supported character set strings: 'ISO-8859-1', 'US-ASCII', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16'. Related function: ENCODE.

ELT: Gets the expression at the specified index. Syntax: `ELT(index, expr[, exprs]*)`. Arguments: index: The 1-based index of the expression to get. index must be an integer between 1 and the number of expressions. expr: An expression that resolves to CHAR, VARCHAR, BINARY, or VARBINARY. Returns: The expression at the location in the argument list specified by index. The result has the type of the least common type of all expressions. Returns NULL if index is NULL or out of range. Description: Returns the index-th expression.
Example: `SELECT ELT(2, 'scala-1', 'java-2', 'go-3');` returns java-2.

ENCODE: Encodes a string to a BINARY. Syntax: `ENCODE(string1, string2)`. Description: The ENCODE function encodes string1 into a BINARY using the specified string2 character set. Returns NULL if either argument is NULL. These are the supported character set strings: 'ISO-8859-1', 'US-ASCII', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16'. Related function: DECODE.

FROM_BASE64: Decodes a base-64 encoded string. Syntax: `FROM_BASE64(string)`. Description: The FROM_BASE64 function returns the base64-decoded result from the specified string. Returns NULL if string is NULL. Example: `SELECT FROM_BASE64('aGVsbG8gd29ybGQ=');` returns "hello world". Related function: TO_BASE64.

INITCAP: Converts a string to title case. Syntax: `INITCAP(string)`. Description: The INITCAP function returns a string that has the first character of each word converted to uppercase and the other characters converted to lowercase. A "word" is assumed to be a sequence of alphanumeric characters. Example: `SELECT INITCAP('title case this string');` returns "Title Case This String". Related functions: LOWER, UPPER.

INSTR: Finds a substring in a string. Syntax: `INSTR(string1, string2)`. Description: The INSTR function returns the position of the first occurrence of string2 in string1. Returns NULL if either argument is NULL. The search is case-sensitive. Example: `SELECT INSTR('The quick brown fox jumped over the lazy dog.', 'the');` returns 33. Related function: LOCATE.

LEFT: Gets the leftmost characters in a string. Syntax: `LEFT(string, integer)`. Description: The LEFT function returns the leftmost integer characters from the specified string. Returns an empty string if integer is negative. Returns NULL if either argument is NULL. Example: `SELECT LEFT('Morpheus', 5);` returns "Morph". Related function: RIGHT.

LOCATE: Finds a substring in a string after a specified position.
Syntax: `LOCATE(string1, string2[, integer])`. Description: The LOCATE function returns the position of the first occurrence of string1 in string2 after position integer. Returns 0 if string1 isn't found. Returns NULL if any of the arguments is NULL. Example: `SELECT LOCATE('the', 'the play’s the thing', 10);` returns 12.

LOWER: Lowercases a string. Syntax: `LOWER(string)`. Description: The LOWER function returns the specified string in lowercase. To uppercase a string, use the UPPER function. Example: `SELECT LOWER('The Quick Brown Fox Jumped Over The Lazy Dog.');` returns "the quick brown fox jumped over the lazy dog." Related functions: INITCAP, UPPER.

LPAD: Left-pads a string. Syntax: `LPAD(string1, integer, string2)`. Description: The LPAD function returns a new string from string1 that's left-padded with string2 to a length of integer characters. If string1 is longer than integer characters, the LPAD function returns string1 shortened to integer characters. To right-pad a string, use the RPAD function. Examples: `SELECT LPAD('hi', 4, '??');` returns "??hi"; `SELECT LPAD('hi', 1, '??');` returns "h". Related function: RPAD.

LTRIM: Removes left whitespaces from a string. Syntax: `LTRIM(string)`. Description: The LTRIM function removes the left whitespaces from the specified string. To remove the right whitespaces from a string, use the RTRIM function. Example: `SELECT LTRIM(' This is a test string.');` returns "This is a test string." Related functions: BTRIM, RTRIM, TRIM.

OVERLAY: Replaces characters in a string with another string. Syntax: `OVERLAY(string1 PLACING string2 FROM integer1 [ FOR integer2 ])`. Description: The OVERLAY function returns a string that replaces integer2 characters of string1 with string2, starting from position integer1. If integer2 isn't specified, the default is the length of string2.
Examples: `SELECT OVERLAY('xxxxxtest' PLACING 'xxxx' FROM 6);` returns "xxxxxxxxx"; `SELECT OVERLAY('xxxxxtest' PLACING 'xxxx' FROM 6 FOR 2);` returns "xxxxxxxxxst". Related functions: REGEXP_REPLACE, REPLACE, TRANSLATE.

PARSE_URL: Gets parts of a URL. Syntax: `PARSE_URL(string1, string2[, string3])`. Description: The PARSE_URL function returns the part specified by string2 from the URL in string1. For a URL that has a query, the optional string3 argument specifies the key to extract from the query string. Returns NULL if string1 or string2 is NULL. These are the valid values for string2: 'AUTHORITY', 'FILE', 'HOST', 'PATH', 'PROTOCOL', 'QUERY', 'REF', 'USERINFO'. Examples: `SELECT PARSE_URL('http://confluent.io/path1/p.php?k1=v1&k2=v2#Ref1', 'HOST');` returns 'confluent.io'; `SELECT PARSE_URL('http://confluent.io/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k1');` returns 'v1'.

POSITION: Finds a substring in a string. Syntax: `POSITION(string1 IN string2)`. Description: The POSITION function returns the position of the first occurrence of string1 in string2. Returns 0 if string1 isn't found in string2. The position is 1-based, so the index of the first character is 1. Examples: `SELECT POSITION('the' IN 'the quick brown fox');` returns 1; `SELECT POSITION('fox' IN 'the quick brown fox');` returns 17.

REGEXP: Matches a string against a regular expression. Syntax: `REGEXP(string1, string2)`. Description: The REGEXP function returns TRUE if any (possibly empty) substring of string1 matches the regular expression in string2; otherwise, FALSE. Returns NULL if either of the arguments is NULL. Examples: `SELECT REGEXP('800 439 3207', '.?(\d{3}).*(\d{3}).*(\d{4})');` returns TRUE; `SELECT REGEXP('2023-05-04', '((\d{4}.\d{2}).(\d{2}))');` returns TRUE.

REGEXP_EXTRACT: Gets a string from a regular expression matching group.
SyntaxREGEXP_EXTRACT(string1, string2[, integer]) DescriptionThe REGEXP_EXTRACT function returns a string from string1 that’s extracted with the regular expression specified in string2 and a regex match group index integer. The regex match group index starts from 1, and 0 specifies matching the whole regex. The regex match group index must not exceed the number of the defined groups. Example-- returns "bar" SELECT REGEXP_EXTRACT('foothebar', 'foo(.*?)(bar)', 2); REGEXP_REPLACE¶ Replaces substrings in a string that match a regular expression. SyntaxREGEXP_REPLACE(string1, string2, string3) DescriptionThe REGEXP_REPLACE function returns a string from string1 with all of the substrings that match the regular expression in string2 consecutively replaced with string3. Example-- returns "fb" SELECT REGEXP_REPLACE('foobar', 'oo|ar', ''); Related functions OVERLAY REPLACE TRANSLATE REPEAT¶ Concatenates copies of a string. SyntaxREPEAT(string, integer) DescriptionThe REPEAT function returns a string that repeats the base string integer times. Example-- returns "TestingTesting" SELECT REPEAT('Testing', 2); REPLACE¶ Replace substrings in a string. SyntaxREPLACE(string1, string2, string3) DescriptionThe REPLACE function returns a new string that replaces all occurrences of string2 with string3 (non-overlapping) from string1. Examples-- returns "hello flink" SELECT REPLACE('hello world', 'world', 'flink'); -- returns "zab" SELECT REPLACE('ababab', 'abab', 'z'); Related functions OVERLAY REGEXP_REPLACE TRANSLATE REVERSE¶ Reverses a string. SyntaxREVERSE(string) DescriptionThe REVERSE function returns the reversed string. Returns NULL if string is NULL. Example-- returns "xof nworb kciuq eht" SELECT REVERSE('the quick brown fox'); RIGHT¶ Gets the rightmost characters in a string. SyntaxRIGHT(string, integer) DescriptionThe RIGHT function returns the rightmost integer characters from the specified string. Returns an empty string if integer is negative. 
Returns NULL if either argument is NULL. Example-- returns "Anderson" SELECT RIGHT('Thomas A. Anderson', 8); Related function LEFT RPAD¶ Right-pad a string. SyntaxRPAD(string1, integer, string2) DescriptionThe RPAD function returns a new string from string1 that’s right-padded with string2 to a length of integer characters. If the length of string1 is shorter than integer, returns string1 shortened to integer characters. To left-pad a string, use the LPAD function. Examples-- returns "hi??" SELECT RPAD('hi', 4, '??'); -- returns "h" SELECT RPAD('hi', 1, '??'); Related function LPAD RTRIM¶ Removes right whitespaces from a string. SyntaxRTRIM(string) DescriptionThe RTRIM function removes the right whitespaces from the specified string. To remove the left whitespaces from a string, use the LTRIM function. Example-- returns "This is a test string." SELECT RTRIM('This is a test string. '); Related functions BTRIM LTRIM TRIM SPLIT_INDEX¶ Splits a string by a delimiter. SyntaxSPLIT_INDEX(string1, string2, integer1) DescriptionThe SPLIT_INDEX function splits string1 by the delimiter in string2 and returns the integer1 zero-based string of the split strings. Returns NULL if integer is negative. Returns NULL if any of the arguments is NULL. Example-- returns "fox" SELECT SPLIT_INDEX('The quick brown fox', ' ', 3); STR_TO_MAP¶ Creates a map from a list of key-value strings. SyntaxSTR_TO_MAP(string1[, string2, string3]) DescriptionThe STR_TO_MAP function returns a map after splitting string1 into key/value pairs using the pair delimiter specified in string2. The default is ','. The string3 argument specifies the key-value delimiter. The default is '='. Both the pair delimiter and the key-value delimiter are treated as regular expressions, so special characters, like <([{\^-=$!|]})?*+.>), must be properly escaped before using as a delimiter literal. 
Example-- returns {a=1, b=2, c=3} SELECT STR_TO_MAP('a=1,b=2,c=3'); -- returns {a=1, b=2, c=3} SELECT STR_TO_MAP('a:1;b:2;c:3', ';', ':'); SUBSTRING¶ Finds a substring in a string. SyntaxSUBSTRING(string, integer1 [ FOR integer2 ]) DescriptionThe SUBSTRING function returns a substring of the specified string, starting from position integer1 with length integer2. If integer2 isn’t specified, the substring runs to the end of string. This function can be abbreviated to SUBSTR(string, integer1[, integer2]), but SUBSTR doesn’t support the FROM and FOR keywords. Examples-- returns "fox" SELECT SUBSTR('The quick brown fox', 17); -- returns "The" SELECT SUBSTR('The quick brown fox', 1, 3); TO_BASE64¶ Encodes a string to base64. SyntaxTO_BASE64(string) DescriptionThe TO_BASE64 function returns the base64-encoded representation of the specified string. Returns NULL if string is NULL. Example-- returns "aGVsbG8gd29ybGQ=" SELECT TO_BASE64('hello world'); Related function FROM_BASE64 TRANSLATE¶ Substitutes characters in a string. SyntaxTRANSLATE(expr, from, to) Arguments expr: A source STRING expression. from: A STRING expression that specifies a set of characters to be replaced. to: A STRING expression that specifies a corresponding set of replacement characters. ReturnsA STRING that has the characters of expr replaced with the characters specified in the to string. DescriptionThe TRANSLATE function replaces the characters in the expr source string according to the replacement rules specified in the from and to strings. The replacement is case-sensitive. Examples:-- returns A1B2C3 SELECT TRANSLATE('AaBbCc', 'abc', '123'); -- returns A1BC SELECT TRANSLATE('AaBbCc', 'abc', '1'); -- returns ABC SELECT TRANSLATE('AaBbCc', 'abc', ''); -- returns .APACHE.com SELECT TRANSLATE('www.apache.org', 'wapcheorg', ' APCHEcom'); Related functions OVERLAY REGEXP_REPLACE REPLACE TRIM¶ Removes leading and/or trailing characters from a string. 
SyntaxTRIM([ BOTH | LEADING | TRAILING ] string1 FROM string2) DescriptionThe TRIM function returns a string that removes leading and/or trailing characters string2 from string1. Examples-- returns "The quick brown " SELECT TRIM(TRAILING 'fox' FROM 'The quick brown fox'); -- returns " quick brown fox" SELECT TRIM(LEADING 'The' FROM 'The quick brown fox'); -- returns " The quick brown fox " SELECT TRIM(BOTH 'yyy' FROM 'yyy The quick brown fox yyy'); Related functions BTRIM LTRIM RTRIM UPPER¶ Uppercases a string. SyntaxUPPER(string) DescriptionThe UPPER function returns the specified string in uppercase. To lowercase a string, use the LOWER function. Example-- returns "THE QUICK BROWN FOX" SELECT UPPER('The quick brown fox'); URL_DECODE¶ Decodes a URL string. SyntaxURL_DECODE(string) DescriptionThe URL_DECODE function decodes the specified string in application/x-www-form-urlencoded format using the UTF-8 encoding scheme. If the input string is NULL, or there is an issue with the decoding process, like encountering an illegal escape pattern, or the encoding scheme is not supported, the function returns NULL. Example-- returns "http://confluent.io" SELECT URL_DECODE('http%3A%2F%2Fconfluent.io'); URL_ENCODE¶ Encodes a URL string. SyntaxURL_ENCODE(string) DescriptionThe URL_ENCODE function translates the specified string into application/x-www-form-urlencoded format using the UTF-8 encoding scheme. If the input string is NULL, or there is an issue with the decoding process, like encountering an illegal escape pattern, or the encoding scheme is not supported, the function returns NULL. 
Example-- returns "http%3A%2F%2Fconfluent.io" SELECT URL_ENCODE('http://confluent.io'); Other built-in functions¶ Aggregate Functions Collection Functions Comparison Functions Conditional Functions Datetime Functions Hash Functions JSON Functions ML Preprocessing Functions Model Inference Functions Numeric Functions String Functions Table API Functions Related content¶ User-defined Functions Create a User Defined Function Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
ASCII(string)
```

```sql
-- returns 97
SELECT ASCII('abc');

-- returns NULL
SELECT ASCII(CAST(NULL AS VARCHAR));
```

```sql
string1 || string2
```

```sql
-- returns "FlinkSQL"
SELECT 'Flink' || 'SQL';
```

```sql
BTRIM(str[, trimStr])
```

```sql
-- returns 'www.apache.org'
SELECT BTRIM('  www.apache.org  ');

-- returns 'www.apache.org'
SELECT BTRIM('/www.apache.org/', '/');

-- returns 'www.apache.org'
SELECT BTRIM('/*www.apache.org*/', '/*');
```

```sql
CHARACTER_LENGTH(string)
```

```sql
CHAR_LENGTH(string)
```

```sql
-- returns 18
SELECT CHAR_LENGTH('Thomas A. Anderson');
```

```sql
CHR(integer)
```

```sql
-- returns 'a'
SELECT CHR(97);

-- returns 'a'
SELECT CHR(353);
```

```sql
CONCAT(string1, string2, ...)
```

```sql
--  returns "AABBCC"
SELECT CONCAT('AA', 'BB', 'CC');
```

```sql
CONCAT_WS(string1, string2, string3, ...)
```

```sql
-- returns "AA~BB~~CC"
SELECT CONCAT_WS('~', 'AA', 'BB', '', 'CC');
```

```sql
DECODE(binary, string)
```

```sql
ELT(index, expr[, exprs]*)
```

```sql
-- returns java-2
SELECT ELT(2, 'scala-1', 'java-2', 'go-3');
```

```sql
ENCODE(string1, string2)
```

```sql
FROM_BASE64(string)
```

```sql
-- returns "hello world"
SELECT FROM_BASE64('aGVsbG8gd29ybGQ=');
```

```sql
INITCAP(string)
```

```sql
-- returns "Title Case This String"
SELECT INITCAP('title case this string');
```

```sql
INSTR(string1, string2)
```

```sql
-- returns 33
SELECT INSTR('The quick brown fox jumped over the lazy dog.', 'the');
```

```sql
LEFT(string, integer)
```

```sql
-- returns "Morph"
SELECT LEFT('Morpheus', 5);
```

```sql
LOCATE(string1, string2[, integer])
```

```sql
-- returns 12
SELECT LOCATE('the', 'the play’s the thing', 10);
```

```sql
LOWER(string)
```

```sql
-- returns "the quick brown fox jumped over the lazy dog."
SELECT LOWER('The Quick Brown Fox Jumped Over The Lazy Dog.');
```

```sql
LPAD(string1, integer, string2)
```

```sql
-- returns "??hi"
SELECT LPAD('hi', 4, '??');

-- returns "h"
SELECT LPAD('hi', 1, '??');
```
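
A common use of LPAD is zero-padding identifiers to a fixed width. The following is a minimal sketch with illustrative literals (not taken from the source page); it also shows the truncation behavior when the input is longer than the target length.

```sql
-- returns "00042"
SELECT LPAD('42', 5, '0');

-- returns "12345": the input is longer than 5 characters, so it is shortened
SELECT LPAD('1234567', 5, '0');
```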

```sql
LTRIM(string)
```

```sql
-- returns "This is a test string."
SELECT LTRIM(' This is a test string.');
```

```sql
OVERLAY(string1 PLACING string2 FROM integer1 [ FOR integer2 ])
```

```sql
-- returns "xxxxxxxxx"
SELECT OVERLAY('xxxxxtest' PLACING 'xxxx' FROM 6);

-- returns "xxxxxxxxxst"
SELECT OVERLAY('xxxxxtest' PLACING 'xxxx' FROM 6 FOR 2);
```

```sql
PARSE_URL(string1, string2[, string3])
```

```sql
-- returns 'confluent.io'
SELECT PARSE_URL('http://confluent.io/path1/p.php?k1=v1&k2=v2#Ref1', 'HOST');

-- returns 'v1'
SELECT PARSE_URL('http://confluent.io/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k1');
```
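
The other part names accepted by PARSE_URL can be sketched against the same sample URL. The expected values in the comments follow the usual URL decomposition of that string.

```sql
-- returns 'http'
SELECT PARSE_URL('http://confluent.io/path1/p.php?k1=v1&k2=v2#Ref1', 'PROTOCOL');

-- returns '/path1/p.php'
SELECT PARSE_URL('http://confluent.io/path1/p.php?k1=v1&k2=v2#Ref1', 'PATH');

-- returns 'k1=v1&k2=v2'
SELECT PARSE_URL('http://confluent.io/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY');

-- returns 'Ref1'
SELECT PARSE_URL('http://confluent.io/path1/p.php?k1=v1&k2=v2#Ref1', 'REF');
```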

```sql
POSITION(string1 IN string2)
```

```sql
-- returns 1
SELECT POSITION('the' IN 'the quick brown fox');

-- returns 17
SELECT POSITION('fox' IN 'the quick brown fox');
```

```sql
REGEXP(string1, string2)
```

```sql
-- returns TRUE
SELECT REGEXP('800 439 3207', '.?(\d{3}).*(\d{3}).*(\d{4})');

-- returns TRUE
SELECT REGEXP('2023-05-04', '((\d{4}.\d{2}).(\d{2}))');
```

```sql
REGEXP_EXTRACT(string1, string2[, integer])
```

```sql
-- returns "bar"
SELECT REGEXP_EXTRACT('foothebar', 'foo(.*?)(bar)', 2);
```
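
To illustrate how the group index selects a result, the following sketch reuses the same input and pattern with index 0 (the whole match) and index 1 (the first capturing group).

```sql
-- returns "foothebar": index 0 is the whole regex match
SELECT REGEXP_EXTRACT('foothebar', 'foo(.*?)(bar)', 0);

-- returns "the": index 1 is the first (lazy) capturing group
SELECT REGEXP_EXTRACT('foothebar', 'foo(.*?)(bar)', 1);
```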

```sql
REGEXP_REPLACE(string1, string2, string3)
```

```sql
--  returns "fb"
SELECT REGEXP_REPLACE('foobar', 'oo|ar', '');
```

```sql
REPEAT(string, integer)
```

```sql
-- returns "TestingTesting"
SELECT REPEAT('Testing', 2);
```

```sql
REPLACE(string1, string2, string3)
```

```sql
-- returns "hello flink"
SELECT REPLACE('hello world', 'world', 'flink');

-- returns "zab"
SELECT REPLACE('ababab', 'abab', 'z');
```

```sql
REVERSE(string)
```

```sql
-- returns "xof nworb kciuq eht"
SELECT REVERSE('the quick brown fox');
```

```sql
RIGHT(string, integer)
```

```sql
-- returns "Anderson"
SELECT RIGHT('Thomas A. Anderson', 8);
```

```sql
RPAD(string1, integer, string2)
```

```sql
-- returns "hi??"
SELECT RPAD('hi', 4, '??');

-- returns "h"
SELECT RPAD('hi', 1, '??');
```

```sql
RTRIM(string)
```

```sql
-- returns "This is a test string."
SELECT RTRIM('This is a test string. ');
```

```sql
SPLIT_INDEX(string1, string2, integer1)
```

```sql
-- returns "fox"
SELECT SPLIT_INDEX('The quick brown fox', ' ', 3);
```
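
Because the index is zero-based, index 0 selects the first token. The second statement below is a hypothetical use, extracting the domain from an illustrative email address.

```sql
-- returns "The": index 0 is the first token
SELECT SPLIT_INDEX('The quick brown fox', ' ', 0);

-- returns "example.com"
SELECT SPLIT_INDEX('user@example.com', '@', 1);
```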

```sql
STR_TO_MAP(string1[, string2, string3])
```

```sql
-- returns {a=1, b=2, c=3}
SELECT STR_TO_MAP('a=1,b=2,c=3');

-- returns {a=1, b=2, c=3}
SELECT STR_TO_MAP('a:1;b:2;c:3', ';', ':');
```
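
Because both delimiters are treated as regular expressions, a delimiter such as the pipe character needs escaping. The following is a sketch under the assumption that a backslash in a Flink SQL string literal is passed through unchanged to the regex engine, as the `\d` patterns in the REGEXP examples suggest. The input string is illustrative.

```sql
-- '\|' escapes the pipe so that it is matched literally
-- instead of being interpreted as regex alternation.
SELECT STR_TO_MAP('a=1|b=2|c=3', '\|', '=');
```

Under that assumption, the statement should produce the same three-entry map as the examples above.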

```sql
SUBSTRING(string, integer1 [ FOR integer2 ])
```

```sql
SUBSTR(string, integer1[, integer2])
```

```sql
-- returns "fox"
SELECT SUBSTR('The quick brown fox', 17);

-- returns "The"
SELECT SUBSTR('The quick brown fox', 1, 3);
```
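
The keyword form is only available with SUBSTRING. Apache Flink documents it as `SUBSTRING(string FROM integer1 [ FOR integer2 ])`; the following sketch assumes that form is accepted here as well.

```sql
-- returns "quick": 5 characters starting at position 5
SELECT SUBSTRING('The quick brown fox' FROM 5 FOR 5);

-- returns "fox": runs to the end of the string
SELECT SUBSTRING('The quick brown fox' FROM 17);
```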

```sql
TO_BASE64(string)
```

```sql
-- returns "aGVsbG8gd29ybGQ="
SELECT TO_BASE64('hello world');
```

```sql
TRANSLATE(expr, from, to)
```

```sql
-- returns A1B2C3
SELECT TRANSLATE('AaBbCc', 'abc', '123');

-- returns A1BC
SELECT TRANSLATE('AaBbCc', 'abc', '1');

-- returns ABC
SELECT TRANSLATE('AaBbCc', 'abc', '');

-- returns    .APACHE.com
SELECT TRANSLATE('www.apache.org', 'wapcheorg', ' APCHEcom');
```

```sql
TRIM([ BOTH | LEADING | TRAILING ] string1 FROM string2)
```

```sql
-- returns "The quick brown "
SELECT TRIM(TRAILING 'fox' FROM 'The quick brown fox');

-- returns " quick brown fox"
SELECT TRIM(LEADING 'The' FROM 'The quick brown fox');

-- returns " The quick brown fox "
SELECT TRIM(BOTH 'yyy' FROM 'yyy The quick brown fox yyy');
```

```sql
UPPER(string)
```

```sql
-- returns "THE QUICK BROWN FOX"
SELECT UPPER('The quick brown fox');
```

```sql
URL_DECODE(string)
```

```sql
-- returns "http://confluent.io"
SELECT URL_DECODE('http%3A%2F%2Fconfluent.io');
```

```sql
URL_ENCODE(string)
```

```sql
-- returns "http%3A%2F%2Fconfluent.io"
SELECT URL_ENCODE('http://confluent.io');
```
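
URL_ENCODE and URL_DECODE are inverses for valid input, so a round trip returns the original string. The literal below is illustrative.

```sql
-- returns "a b&c=d"
SELECT URL_DECODE(URL_ENCODE('a b&c=d'));
```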

---

### Table API functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/functions/table-api-functions.html

Table API in Confluent Cloud for Apache Flink

Confluent Cloud for Apache Flink® supports programming applications with the Table API. For more information, see the Table API Overview. To get started with programming a streaming data application with the Table API, see the Java Table API Quick Start.

Confluent Cloud for Apache Flink supports the following Table API functions.

TableEnvironment interface: TableEnvironment.createStatementSet(), TableEnvironment.createTable(String, TableDescriptor), TableEnvironment.executeSql(String), TableEnvironment.explainSql(String), TableEnvironment.from(String), TableEnvironment.fromValues(…), TableEnvironment.getConfig(), TableEnvironment.getCurrentCatalog(), TableEnvironment.getCurrentDatabase(), TableEnvironment.listCatalogs(), TableEnvironment.listDatabases(), TableEnvironment.listFunctions(), TableEnvironment.listTables(), TableEnvironment.listTables(String, String), TableEnvironment.listViews(), TableEnvironment.sqlQuery(String), TableEnvironment.useCatalog(String), TableEnvironment.useDatabase(String).

Table interface, SQL equivalents: Table.as(…), Table.distinct(), Table.executeInsert(String), Table.fetch(…), Table.filter(…), Table.fullOuterJoin(…), Table.groupBy(…), Table.insertInto(String), Table.intersect(…), Table.intersectAll(…), Table.join(…), Table.leftOuterJoin(…), Table.limit(…), Table.minus(…), Table.minusAll(…), Table.offset(…), Table.orderBy(…), Table.rightOuterJoin(…), Table.select(…), Table.union(…), Table.unionAll(…), Table.where(…), Table.window(…).

Table interface, API extensions: Table.addColumns(…), Table.addOrReplaceColumns(…), Table.dropColumns(…), Table.execute(), Table.explain(), Table.getResolvedSchema(), Table.map(…), Table.printExplain(), Table.printSchema(), Table.renameColumns(…).

TablePipeline interface: TablePipeline.execute(), TablePipeline.explain(), TablePipeline.printExplain().

StatementSet interface: StatementSet.add(TablePipeline), StatementSet.addInsert(String, Table), StatementSet.addInsertSql(String), StatementSet.execute(), StatementSet.explain().

TableResult interface: TableResult.await(…), TableResult.collect(), TableResult.getJobClient().cancel(), TableResult.getResolvedSchema(), TableResult.print().

TableConfig class: TableConfig.set(…).

Expressions class: `Expressions.*` (except for call()).

Others: `FormatDescriptor.*`, `TableDescriptor.*`, `Over.*`, `Session.*`, `Slide.*`, `Tumble.*`.

Confluent: Confluent adds the following classes for more convenience: `ConfluentSettings.*`, `ConfluentTools.*`, `ConfluentTableDescriptor.*`.

Related content: Course: Apache Flink® Table API: Processing Data Streams in Java, Table API Overview, Java Table API Quick Start.

---

### Flink SQL Keywords in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/keywords.html

Flink SQL Reserved Keywords in Confluent Cloud for Apache Flink¶ Keywords are words that have significance in Confluent Cloud for Apache Flink®. Some keywords, like AND, CHAR, and SELECT are reserved and require special treatment for use as identifiers like table names, column names, and the names of built-in functions. You can use reserved words as identifiers if you quote them with backtick characters. If you want to use one of the reserved words as a field name, enclose it with backticks, for example: `DATABASES` `RAW` You can use nonreserved keywords as identifiers without enclosing them with backticks. In the following tables, reserved keywords are shown in bold. Some string combinations are reserved as keywords for future use. Index¶ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z A¶ A ABS ABSENT ABSOLUTE ACTION ADA ADD ADMIN AFTER ALL ALLOCATE ALLOW ALTER ALWAYS AND ANALYZE ANY APPLY ARE ARRAY ARRAY_AGG ARRAY_CONCAT_AGG ARRAY_MAX_CARDINALITY AS ASC ASENSITIVE ASSERTION ASSIGNMENT ASYMMETRIC AT ATOMIC ATTRIBUTE ATTRIBUTES AUTHORIZATION AVG B¶ BEFORE BEGIN BEGIN_FRAME BEGIN_PARTITION BERNOULLI BETWEEN BIGINT BINARY BIT BLOB BOOLEAN BOTH BREADTH BUCKETS BY BYTES C¶ C CALL CALLED CARDINALITY CASCADE CASCADED CASE CAST CATALOG CATALOG_NAME CATALOGS CEIL CEILING CENTURY CHAIN CHANGELOG_MODE CHAR CHARACTER CHARACTERISTICS CHARACTERS CHARACTER_LENGTH CHARACTER_SET_CATALOG CHARACTER_SET_NAME CHARACTER_SET_SCHEMA CHAR_LENGTH CHECK CLASS_ORIGIN CLASSIFIER CLOB CLOSE COALESCE COBOL COLLATE COLLATION COLLATION_CATALOG COLLATION_NAME COLLATION_SCHEMA COLLECT COLUMN COLUMNS COLUMN_NAME COMMAND_FUNCTION COMMAND_FUNCTION_CODE COMMENT COMMIT COMMITTED COMPACT COMPILE COMPUTE CONDITION CONDITION_NUMBER CONDITIONAL CONNECT CONNECTION CONNECTION_NAME CONSTRAINT CONSTRAINTS CONSTRAINT_CATALOG CONSTRAINT_NAME CONSTRAINT_SCHEMA CONSTRUCTOR CONTAINS CONTAINS_SUBSTR CONTINUE CONTINUOUS CONVERT CORR CORRESPONDING COUNT COVAR_POP COVAR_SAMP CREATE CROSS CUBE CUME_DIST CURRENT 
CURRENT_CATALOG CURRENT_DATE CURRENT_DEFAULT_TRANSFORM_GROUP CURRENT_PATH CURRENT_ROLE CURRENT_ROW CURRENT_SCHEMA CURRENT_TIME CURRENT_TIMESTAMP CURRENT_TRANSFORM_GROUP_FOR_TYPE CURRENT_USER CURSOR CURSOR_NAME CYCLE D¶ DATA DATABASE DATABASES DATE DATE_DIFF DATE_TRUNC DATETIME DATETIME_DIFF DATETIME_INTERVAL_CODE DATETIME_INTERVAL_PRECISION DAY DAYOFWEEK DAYS DAYOFYEAR DATETIME_TRUNC DEALLOCATE DEC DECADE DECIMAL DECLARE DEFAULT DEFAULTS DEFERRABLE DEFERRED DEFINE DEFINED DEFINER DEGREE DELETE DENSE_RANK DEPTH DEREF DERIVED DESC DESCRIBE DESCRIPTION DESCRIPTOR DETERMINISTIC DIAGNOSTICS DISALLOW DISCONNECT DISPATCH DISTINCT DISTRIBUTED DISTRIBUTION DOMAIN DOT DOUBLE DOW DOY DRAIN DROP DYNAMIC DYNAMIC_FUNCTION DYNAMIC_FUNCTION_CODE E¶ EACH ELEMENT ELSE EMPTY ENCODING END END-EXEC END_FRAME END_PARTITION ENFORCED EPOCH EQUALS ERROR ESCAPE ESTIMATED_COST EVERY EXCEPT EXCEPTION EXCLUDE EXCLUDING EXEC EXECUTE EXISTS EXP EXPLAIN EXTEND EXTENDED EXTERNAL EXTRACT F¶ FALSE FETCH FILTER FINAL FIRST FIRST_VALUE FLOAT FLOOR FOLLOWING FOR FOREIGN FORMAT FORTRAN FOUND FRAC_SECOND FRAME_ROW FREE FRESHNESS FRIDAY FROM FULL FUNCTION FUNCTIONS FUSION G¶ G GENERAL GENERATED GEOMETRY GET GLOBAL GO GOTO GRANT GRANTED GROUP GROUPING GROUPS GROUP_CONCAT H¶ HAVING HASH HIERARCHY HOLD HOP HOUR HOURS I¶ IDENTITY IF IGNORE IMMEDIATE IMMEDIATELY IMPLEMENTATION ILIKE IMPORT IN INCLUDE INCLUDING INCREMENT INDICATOR INITIAL INITIALLY INNER INOUT INPUT INSENSITIVE INSERT INSTANCE INSTANTIABLE INT INTEGER INTERSECT INTERSECTION INTERVAL INTO INVOKER IS ISODOW ISOLATION ISOYEAR J¶ JAR JARS JAVA JOB JOBS JOIN JSON JSON_ARRAY JSON_ARRAYAGG JSON_EXECUTION_PLAN JSON_EXISTS JSON_OBJECT JSON_OBJECTAGG JSON_QUERY JSON_SCOPE JSON_VALUE K¶ K KEY KEY_MEMBER KEY_TYPE L¶ LABEL LAG LANGUAGE LARGE LAST LAST_VALUE LATERAL LEAD LEADING LEFT LENGTH LEVEL LIBRARY LIKE LIKE_REGEX LIMIT LN LOAD LOCAL LOCALTIME LOCALTIMESTAMP LOCATOR LOWER M¶ M MAP MATCH MATCHED MATCHES MATCH_NUMBER MATCH_RECOGNIZE MATERIALIZED MAX 
MAXVALUE MEASURES MEMBER MERGE MESSAGE_LENGTH MESSAGE_OCTET_LENGTH MESSAGE_TEXT METADATA METHOD MICROSECOND MILLENNIUM MILLISECOND MIN MINUS MINUTE MINUTES MINUTE MINVALUE ML_PREDICT MOD MODEL MODELS MODIFIES MODIFY MODULE MODULES MONDAY MONTH MONTHS MORE MULTISET MUMPS N¶ NAME NAMES NANOSECOND NATIONAL NATURAL NCHAR NCLOB NESTING NEW NEXT NO NONE NORMALIZE NORMALIZED NOT NTH_VALUE NTILE NULL NULLABLE NULLIF NULLS NUMBER NUMERIC O¶ OBJECT OCCURRENCES_REGEX OCTETS OCTET_LENGTH OF OFFSET OLD OMIT ON ONE ONLY OPEN OPTION OPTIONS OR ORDER ORDERING ORDINAL ORDINALITY OTHERS OUT OUTER OUTPUT OVER OVERLAPS OVERLAY OVERRIDING OVERWRITE OVERWRITING P¶ PAD PARAMETER PARAMETER_MODE PARAMETER_NAME PARAMETER_ORDINAL_POSITION PARAMETER_SPECIFIC_CATALOG PARAMETER_SPECIFIC_NAME PARAMETER_SPECIFIC_SCHEMA PARTIAL PARTITION PARTITIONED PARTITIONS PASCAL PASSING PASSTHROUGH PAST PATH PATTERN PER PERCENT PERCENTILE_CONT PERCENTILE_DISC PERCENT_RANK PERIOD PERMUTE PIVOT PLACING PLAN PLAN_ADVICE PLI PORTION POSITION POSITION_REGEX POWER PRECEDES PRECEDING PRECISION PREPARE PRESERVE PREV PRIMARY PRIOR PRIVILEGES PROCEDURE PROCEDURES PUBLIC PYTHON Q¶ QUALIFY QUARTER QUARTERS R¶ RANGE RANK RAW READ READS REAL RECURSIVE REF REFERENCES REFERENCING REFRESH_MODE REGR_AVGX REGR_AVGY REGR_COUNT REGR_INTERCEPT REGR_R2 REGR_SLOPE REGR_SXX REGR_SXY REGR_SYY RELATIVE RELEASE REMOVE RENAME REPEATABLE REPLACE RESET RESPECT RESTART RESTRICT RESULT RETURN RETURNED_CARDINALITY RETURNED_LENGTH RETURNED_OCTET_LENGTH RETURNED_SQLSTATE RETURNING RETURNS REVOKE RIGHT RLIKE ROLE ROLLBACK ROLLUP ROUTINE ROUTINE_CATALOG ROUTINE_NAME ROUTINE_SCHEMA ROW ROWS ROW_COUNT ROW_NUMBER RUNNING S¶ SAFE_CAST SAFE_OFFSET SAFE_ORDINAL SATURDAY SAVEPOINT SCALA SCALAR SCALE SCHEMA SCHEMA_NAME SCOPE SCOPE_CATALOGS SCOPE_NAME SCOPE_SCHEMA SCROLL SEARCH SECOND SECONDS SECTION SECURITY SEEK SELECT SELF SENSITIVE SEPARATOR SEQUENCE SERIALIZABLE SERVER SERVER_NAME SESSION SESSION_USER SET SETS SHOW SIMILAR SIMPLE SIZE SKIP SMALLINT 
SOME SOURCE SPACE SPECIFIC SPECIFICTYPE SPECIFIC_NAME SQL SQLEXCEPTION SQLSTATE SQLWARNING SQL_BIGINT SQL_BINARY SQL_BIT SQL_BLOB SQL_BOOLEAN SQL_CHAR SQL_CLOB SQL_DATE SQL_DECIMAL SQL_DOUBLE SQL_FLOAT SQL_INTEGER SQL_INTERVAL_DAY SQL_INTERVAL_DAY_TO_HOUR SQL_INTERVAL_DAY_TO_MINUTE SQL_INTERVAL_DAY_TO_SECOND SQL_INTERVAL_HOUR SQL_INTERVAL_HOUR_TO_MINUTE SQL_INTERVAL_HOUR_TO_SECOND SQL_INTERVAL_MINUTE SQL_INTERVAL_MINUTE_TO_SECOND SQL_INTERVAL_MONTH SQL_INTERVAL_SECOND SQL_INTERVAL_YEAR SQL_INTERVAL_YEAR_TO_MONTH SQL_LONGVARBINARY SQL_LONGVARCHAR SQL_LONGVARNCHAR SQL_NCHAR SQL_NCLOB SQL_NUMERIC SQL_NVARCHAR SQL_REAL SQL_SMALLINT SQL_TIME SQL_TIMESTAMP SQL_TINYINT SQL_TSI_DAY SQL_TSI_FRAC_SECOND SQL_TSI_HOUR SQL_TSI_MICROSECOND SQL_TSI_MINUTE SQL_TSI_MONTH SQL_TSI_QUARTER SQL_TSI_SECOND SQL_TSI_WEEK SQL_TSI_YEAR SQL_VARBINARY SQL_VARCHAR SQRT START STATE STATEMENT STATIC STATISTICS STDDEV_POP STDDEV_SAMP STOP STREAM STRING STRING_AGG STRUCTURE STYLE SUBCLASS_ORIGIN SUBMULTISET SUBSET SUBSTITUTE SUBSTRING SUBSTRING_REGEX SUCCEEDS SUM SUNDAY SUSPEND SYMMETRIC SYSTEM SYSTEM_TIME SYSTEM_USER T¶ TABLE TABLES TABLESAMPLE TABLE_NAME TEMPORARY THEN THURSDAY TIES TIME TIMESTAMP TIMESTAMP_DIFF TIMESTAMP_LTZ TIMESTAMP_TRUNC TIMESTAMPADD TIMESTAMPDIFF TIMEZONE_HOUR TIMEZONE_MINUTE TIME_DIFF TIME_TRUNC TINYINT TO TOP_LEVEL_COUNT TRAILING TRANSACTION TRANSACTIONS_ACTIVE TRANSACTIONS_COMMITTED TRANSACTIONS_ROLLED_BACK TRANSFORM TRANSFORMS TRANSLATE TRANSLATE_REGEX TRANSLATION TREAT TRIGGER TRIGGER_CATALOG TRIGGER_NAME TRIGGER_SCHEMA TRIM TRIM_ARRAY TRUE TRUNCATE TRY_CAST TUESDAY TUMBLE TYPE U¶ UESCAPE UNBOUNDED UNCOMMITTED UNCONDITIONAL UNDER UNION UNIQUE UNKNOWN UNLOAD UNNAMED UNNEST UNPIVOT UPDATE UPPER UPSERT USAGE USE USER USER_DEFINED_TYPE_CATALOG USER_DEFINED_TYPE_CODE USER_DEFINED_TYPE_NAME USER_DEFINED_TYPE_SCHEMA USING UTF16 UTF32 UTF8 V¶ VALUE VALUES VALUE_OF VARBINARY VARCHAR VARYING VAR_POP VAR_SAMP VERSION VERSIONING VIEW VIEWS VIRTUAL W¶ WATERMARK WATERMARKS WEDNESDAY 
WEEK WEEKS WHEN WHENEVER WHERE WIDTH_BUCKET WINDOW WITH WITHIN WITHOUT WORK WRAPPER WRITE X¶ XML Y¶ YEAR YEARS Z¶ ZONE Related content¶ DDL Statements Flink SQL Queries

#### Code Examples

```sql
`DATABASES`
`RAW`
```
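
As a minimal sketch of quoting in a full statement, the following query uses both reserved words as column names. The inline VALUES rows and the alias `t` are hypothetical and exist only to make the statement self-contained.

```sql
-- DATABASES and RAW are reserved, so every use is enclosed in backticks.
SELECT `DATABASES`, `RAW`
FROM (VALUES (1, 'payload')) AS t(`DATABASES`, `RAW`);
```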

---

### Flink SQL and Table API Reference in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/overview.html

Flink SQL and Table API Reference in Confluent Cloud for Apache Flink

This section describes the SQL language support in Confluent Cloud for Apache Flink®, including Data Definition Language (DDL) statements, Data Manipulation Language (DML) statements, built-in functions, and the Table API. Apache Flink® SQL is based on Apache Calcite, which implements the SQL standard.

Data Types: Flink SQL has a rich set of native data types that you can use in SQL statements and queries. See Data Types.

Serialize and deserialize data: See Data Type Mappings.

Reserved keywords: Some string combinations are reserved as keywords for future use. See Flink SQL Reserved Keywords.

Related content: Stream Processing Concepts, Time and Watermarks.

---

### SQL Deduplication Queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/deduplication.html

Deduplication Queries in Confluent Cloud for Apache Flink

Confluent Cloud for Apache Flink® enables removing duplicate rows over a set of columns in a Flink SQL table.

Syntax: `SELECT [column_list] FROM ( SELECT [column_list], ROW_NUMBER() OVER ([PARTITION BY column1[, column2...]] ORDER BY time_attr [asc|desc]) AS rownum FROM table_name) WHERE rownum = 1`

Parameter Specification

Note: This query pattern must be followed exactly; otherwise, the optimizer can’t translate the query.

- `ROW_NUMBER()`: Assigns a unique, sequential number to each row, starting with one.
- `PARTITION BY column1[, column2...]`: Specifies the partition columns, which form the deduplication key.
- `ORDER BY time_attr [asc|desc]`: Specifies the ordering column, which must be a time attribute. Flink SQL supports the event time attribute. Processing time is not supported in Confluent Cloud for Apache Flink. Ordering by ASC keeps the first row, and ordering by DESC keeps the last row.
- `WHERE rownum = 1`: The `rownum = 1` condition is required for Flink SQL to recognize the query as a deduplication query.

Description: Deduplication removes duplicate rows over a set of columns, keeping only the first or last row. Flink SQL uses the `ROW_NUMBER()` function to remove duplicates, similar to its usage in Top-N Queries in Confluent Cloud for Apache Flink. Deduplication is a special case of the Top-N query, in which N is 1 and row order is by event time.

In some cases, an upstream ETL job isn’t end-to-end exactly-once, which may cause duplicate records in the sink after a failover. Duplicate records affect the correctness of downstream analytical jobs, like SUM and COUNT, so deduplication is required before further analysis can continue.

See deduplication in action: Apply the Deduplicate Topic action to generate a table that contains only unique records from an input table.

Example: In the Flink SQL shell or in a Cloud Console workspace, run the following statement to see an example of row deduplication. It returns the first URL that each customer visited. Rows are partitioned by user_id and ordered by the $rowtime column, which is the system column mapped to the Kafka record timestamp and can be either LogAppendTime or CreateTime.

`SELECT user_id, url, $rowtime FROM ( SELECT *, $rowtime, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY $rowtime ASC) AS rownum FROM `examples`.`marketplace`.`clicks`) WHERE rownum = 1;`

Your output should resemble:

| user_id | url | $rowtime |
|---|---|---|
| 3246 | https://www.acme.com/product/upmtv | 2024-04-16 08:04:47.365 |
| 4028 | https://www.acme.com/product/jtahp | 2024-04-16 08:04:47.367 |
| 4549 | https://www.acme.com/product/ixsir | 2024-04-16 08:04:47.367 |

Related content: Flink action: Deduplicate Rows in a Table, Flink SQL Queries, Flink SQL Functions, Statements.

#### Code Examples

```sql
SELECT [column_list]
FROM (
   SELECT [column_list],
     ROW_NUMBER() OVER ([PARTITION BY column1[, column2...]]
       ORDER BY time_attr [asc|desc]) AS rownum
   FROM table_name)
WHERE rownum = 1
```

```sql
SELECT user_id, url, $rowtime
FROM (
   SELECT *, $rowtime,
     ROW_NUMBER() OVER (PARTITION BY user_id
       ORDER BY $rowtime ASC) AS rownum
   FROM `examples`.`marketplace`.`clicks`)
WHERE rownum = 1;
```

```text
user_id    url                                  $rowtime
3246       https://www.acme.com/product/upmtv   2024-04-16 08:04:47.365
4028       https://www.acme.com/product/jtahp   2024-04-16 08:04:47.367
4549       https://www.acme.com/product/ixsir   2024-04-16 08:04:47.367
```
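
Ordering by DESC instead keeps the last row per key. The following sketch is the same statement against the same `examples`.`marketplace`.`clicks` table with the sort direction flipped, so it returns the most recent URL per user. In a streaming query this result is expected to update as newer clicks arrive for a user.

```sql
-- Keep the most recent click per user (last row by $rowtime).
SELECT user_id, url, $rowtime
FROM (
   SELECT *, $rowtime,
     ROW_NUMBER() OVER (PARTITION BY user_id
       ORDER BY $rowtime DESC) AS rownum
   FROM `examples`.`marketplace`.`clicks`)
WHERE rownum = 1;
```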

---

### SQL Group Aggregation Queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/group-aggregation.html

Group Aggregation Queries in Confluent Cloud for Apache Flink

Confluent Cloud for Apache Flink® enables computing a single result from multiple input rows in a Flink SQL table.

Description: Like most data systems, Apache Flink® supports aggregate functions. An aggregate function computes a single result from multiple input rows. For example, there are aggregates to compute the COUNT, SUM, AVG (average), MAX (maximum), and MIN (minimum) values over a set of rows. The following example shows how to count the number of rows in a table by using the COUNT function.

`SELECT COUNT(*) FROM orders`

For streaming queries, Flink runs continuous queries that never terminate. A continuous query updates the result table according to the updates on its input tables. For the previous query, Flink outputs an updated count each time a new row is inserted into the orders table.

GROUP BY clause: Flink SQL supports the standard GROUP BY clause for aggregating data. The following example shows how to count the number of rows in a table and group the results by a table column.

`SELECT COUNT(*) FROM orders GROUP BY order_id`

For streaming queries, the required state for computing the query result might grow indefinitely. State size depends on the number of groups and the number and type of aggregation functions. For example, MIN and MAX are heavy on state size while COUNT is inexpensive.

DISTINCT Aggregation: Distinct aggregates remove duplicate values before applying an aggregation function. The following example counts the number of distinct order_ids instead of the total number of rows in an orders table.

`SELECT COUNT(DISTINCT order_id) FROM orders`

For streaming queries, the required state for computing the query result might grow indefinitely. State size depends primarily on the number of distinct rows and the time that a group is maintained. Short-lived GROUP BY windows are not a problem.
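
The difference between a plain and a DISTINCT aggregate can be sketched on a small inline table. The VALUES rows and the Products alias below are hypothetical sample data, chosen so that the two counts differ for one group.

```sql
-- supplier1 has two rows with the same rating, supplier2 has two different ratings.
-- Expected per-group results:
--   supplier1: total = 2, distinct_ratings = 1
--   supplier2: total = 2, distinct_ratings = 2
SELECT supplier_id,
       COUNT(*) AS total,
       COUNT(DISTINCT rating) AS distinct_ratings
FROM (VALUES
    ('supplier1', 'product1', 4),
    ('supplier1', 'product2', 4),
    ('supplier2', 'product3', 3),
    ('supplier2', 'product4', 4))
AS Products(supplier_id, product_id, rating)
GROUP BY supplier_id;
```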
GROUPING SETS¶ Grouping sets enable more complex grouping operations than those you can describe with a standard GROUP BY clause. Rows are grouped separately by each specified grouping set, and aggregates are computed for each group just as for simple GROUP BY clauses. The following example shows how to use GROUPING SETS to compute counts for several groupings in a single query. SELECT supplier_id, rating, COUNT(*) AS total FROM (VALUES ('supplier1', 'product1', 4), ('supplier1', 'product2', 3), ('supplier2', 'product3', 3), ('supplier2', 'product4', 4)) AS Products(supplier_id, product_id, rating) GROUP BY GROUPING SETS ((supplier_id, rating), (supplier_id), ()) Results: +-------------+--------+-------+ | supplier_id | rating | total | +-------------+--------+-------+ | supplier1 | 4 | 1 | | supplier1 | (NULL) | 2 | | (NULL) | (NULL) | 4 | | supplier1 | 3 | 1 | | supplier2 | 3 | 1 | | supplier2 | (NULL) | 2 | | supplier2 | 4 | 1 | +-------------+--------+-------+ Each sublist of GROUPING SETS specifies zero or more columns or expressions and is interpreted as if it were used directly in the GROUP BY clause. An empty grouping set means that all rows are aggregated down to a single group, which is output even if no input rows were present. References to the grouping columns or expressions are replaced by null values in result rows for grouping sets in which those columns don’t appear. For streaming queries, the required state for computing the query result might grow indefinitely. State size depends on the number of grouping sets and the type of aggregation functions. ROLLUP¶ ROLLUP is a shorthand notation for specifying a common type of grouping set. It represents the given list of expressions and all prefixes of the list, including the empty list. For example, the following query is equivalent to the previous GROUP BY GROUPING SETS query.
SELECT supplier_id, rating, COUNT(*) FROM (VALUES ('supplier1', 'product1', 4), ('supplier1', 'product2', 3), ('supplier2', 'product3', 3), ('supplier2', 'product4', 4)) AS Products(supplier_id, product_id, rating) GROUP BY ROLLUP (supplier_id, rating) CUBE¶ CUBE is a shorthand notation for specifying a common type of grouping set. It represents the given list and all of its possible subsets, which is also known as the power set. For example, the following two queries are equivalent. SELECT supplier_id, rating, product_id, COUNT(*) FROM (VALUES ('supplier1', 'product1', 4), ('supplier1', 'product2', 3), ('supplier2', 'product3', 3), ('supplier2', 'product4', 4)) AS Products(supplier_id, product_id, rating) GROUP BY CUBE (supplier_id, rating, product_id) SELECT supplier_id, rating, product_id, COUNT(*) FROM (VALUES ('supplier1', 'product1', 4), ('supplier1', 'product2', 3), ('supplier2', 'product3', 3), ('supplier2', 'product4', 4)) AS Products(supplier_id, product_id, rating) GROUP BY GROUPING SETS ( ( supplier_id, product_id, rating ), ( supplier_id, product_id ), ( supplier_id, rating ), ( supplier_id ), ( product_id, rating ), ( product_id ), ( rating ), ( ) ) HAVING¶ The HAVING clause eliminates group rows that don’t satisfy the specified condition. HAVING is distinct from the WHERE clause, because WHERE filters individual rows before the GROUP BY, while HAVING filters group rows created by GROUP BY. Each column referenced in the condition must unambiguously reference a grouping column, unless it appears within an aggregate function. SELECT SUM(amount) FROM orders GROUP BY users HAVING SUM(amount) > 50 The presence of a HAVING clause turns a query into a grouped query, even if there is no GROUP BY clause. It’s the same as what happens when the query contains aggregate functions but no GROUP BY clause.
The query considers all selected rows to form a single group, and the SELECT list and HAVING clause can reference only table columns from within aggregate functions. Such a query emits a single row if the HAVING condition is true, and zero rows if it’s not true. Related content¶ Flink SQL Queries Flink SQL Functions Statements Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
SELECT COUNT(*) FROM orders
```

```sql
SELECT COUNT(*)
FROM orders
GROUP BY order_id
```
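The grouped query above returns only the counts, so the output does not show which group each count belongs to. A minimal variation that also returns the grouping column, assuming the same `orders` table; the alias `row_count` is illustrative and not part of the original example:

```sql
-- Return the grouping key next to its count so each result row is identifiable.
SELECT order_id, COUNT(*) AS row_count
FROM orders
GROUP BY order_id
```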

```sql
SELECT COUNT(DISTINCT order_id) FROM orders
```

```sql
SELECT supplier_id, rating, COUNT(*) AS total
FROM (VALUES
    ('supplier1', 'product1', 4),
    ('supplier1', 'product2', 3),
    ('supplier2', 'product3', 3),
    ('supplier2', 'product4', 4))
AS Products(supplier_id, product_id, rating)
GROUP BY GROUPING SETS ((supplier_id, rating), (supplier_id), ())
```

```text
+-------------+--------+-------+
| supplier_id | rating | total |
+-------------+--------+-------+
|   supplier1 |      4 |     1 |
|   supplier1 | (NULL) |     2 |
|      (NULL) | (NULL) |     4 |
|   supplier1 |      3 |     1 |
|   supplier2 |      3 |     1 |
|   supplier2 | (NULL) |     2 |
|   supplier2 |      4 |     1 |
+-------------+--------+-------+
```

```sql
SELECT supplier_id, rating, COUNT(*)
FROM (VALUES
    ('supplier1', 'product1', 4),
    ('supplier1', 'product2', 3),
    ('supplier2', 'product3', 3),
    ('supplier2', 'product4', 4))
AS Products(supplier_id, product_id, rating)
GROUP BY ROLLUP (supplier_id, rating)
```

```sql
SELECT supplier_id, rating, product_id, COUNT(*)
FROM (VALUES
    ('supplier1', 'product1', 4),
    ('supplier1', 'product2', 3),
    ('supplier2', 'product3', 3),
    ('supplier2', 'product4', 4))
AS Products(supplier_id, product_id, rating)
GROUP BY CUBE (supplier_id, rating, product_id)

SELECT supplier_id, rating, product_id, COUNT(*)
FROM (VALUES
    ('supplier1', 'product1', 4),
    ('supplier1', 'product2', 3),
    ('supplier2', 'product3', 3),
    ('supplier2', 'product4', 4))
AS Products(supplier_id, product_id, rating)
GROUP BY GROUPING SETS (
    ( supplier_id, product_id, rating ),
    ( supplier_id, product_id         ),
    ( supplier_id,             rating ),
    ( supplier_id                     ),
    (              product_id, rating ),
    (              product_id         ),
    (                          rating ),
    (                                 )
)
```

```sql
SELECT SUM(amount)
FROM orders
GROUP BY users
HAVING SUM(amount) > 50
```
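The description also covers HAVING without a GROUP BY clause, where all selected rows form a single group. A minimal sketch of that case, assuming the same `orders` table with an `amount` column; the alias `total_amount` is illustrative:

```sql
-- No GROUP BY: all rows of orders form one group.
-- The SELECT list and HAVING clause reference columns only inside aggregate functions.
SELECT SUM(amount) AS total_amount
FROM orders
HAVING SUM(amount) > 50
```

Under these assumptions, the query would emit a single row while the overall sum exceeds 50 and no row otherwise.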

---

### SQL INSERT INTO FROM SELECT Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/insert-into-from-select.html

INSERT INTO FROM SELECT Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables inserting SELECT query results directly into a Flink SQL table. Syntax¶ [EXECUTE] INSERT { INTO | OVERWRITE } [catalog_name.][database_name.]table_name [PARTITION (partition_column_name1=value1 [, partition_column_name2=value2, ...])] [(column_name1 [, column_name2, ...])] select_statement OVERWRITE: INSERT OVERWRITE overwrites all existing data in the table or partition. Otherwise, new records are appended. PARTITION: The PARTITION clause contains static partition columns for the insertion. COLUMN LIST: For a table T(a INT, b INT, c INT), Flink supports INSERT INTO T(c, b) SELECT x, y FROM S. The x result is written to column c, and the y result is written to column b. If column a is nullable, a is set to NULL. Description¶ Insert query results into a table. Use the INSERT INTO FROM SELECT statement to insert rows into a table from another table or query. For example, if you have a table T with columns a, b, and c, and another table S with columns x and y, the following query writes the values of x and y from S into c and b of T, respectively. INSERT INTO T (c, b) SELECT x, y FROM S If column a of T is nullable, Flink sets it to NULL. Examples¶ Insert rows into a simple table¶ In the Flink SQL shell or in a Cloud Console workspace, run the following commands to see an example of the INSERT INTO FROM SELECT statement. Create a table for web page click events. -- Create a table for web page click events. CREATE TABLE clicks ( ip_address VARCHAR, url VARCHAR, click_ts_raw BIGINT ); Populate the table with mock clickstream data. -- Populate the table with mock clickstream data.
INSERT INTO clicks VALUES( '10.0.0.1', 'https://acme.com/index.html', 1692812175), ( '10.0.0.12', 'https://apache.org/index.html', 1692826575), ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575), ( '10.0.0.1', 'https://acme.com/index.html', 1692812175), ( '10.0.0.12', 'https://apache.org/index.html', 1692819375), ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575); Press ENTER to return to the SQL shell. Because INSERT INTO VALUES is a point-in-time statement, it exits after it completes inserting records. Create another table for filtered web page click events. CREATE TABLE filtered_clicks ( ip_address VARCHAR, url VARCHAR, click_ts_raw BIGINT ); Run the following statement to insert filtered rows into the filtered_clicks table. Only clicks that have an IP address of 10.0.0.1 are inserted. INSERT INTO filtered_clicks( ip_address, url, click_ts_raw ) SELECT * FROM clicks WHERE ip_address = '10.0.0.1'; View the rows in the filtered_clicks table. SELECT * FROM filtered_clicks; Your output should resemble: ip_address url click_ts_raw 10.0.0.1 https://acme.com/index.html 1692812175 10.0.0.1 https://acme.com/index.html 1692812175 Fill a table without specifying all columns¶ CREATE TABLE t_insert_gaps (c1 STRING, c2 STRING, c3 STRING, c4 STRING); INSERT INTO t_insert_gaps (c3) SELECT 'Bob'; INSERT INTO t_insert_gaps (c3, c2) SELECT 'Bob', 'Alice'; SELECT * FROM t_insert_gaps; Properties: A column list is defined between the table name and the SELECT in the INSERT INTO statement, so the SELECT statement uses a reduced schema. Columns that are omitted from the column list (c1, c2, and c4 in the first INSERT; c1 and c4 in the second) are filled with NULLs. If one of the omitted columns is declared NOT NULL, an error occurs. Related content¶ Convert the Serialization Format of a Topic INSERT VALUES Flink SQL Queries Flink SQL Functions Statements Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
[EXECUTE] INSERT { INTO | OVERWRITE } [catalog_name.][database_name.]table_name
  [PARTITION (partition_column_name1=value1 [, partition_column_name2=value2, ...])]
  [(column_name1 [, column_name2, ...])]
  select_statement
```

```sql
INSERT INTO T (c, b) SELECT x, y FROM S
```

```sql
-- Create a table for web page click events.
CREATE TABLE clicks (
  ip_address VARCHAR,
  url VARCHAR,
  click_ts_raw BIGINT
);
```

```sql
-- Populate the table with mock clickstream data.
INSERT INTO clicks
VALUES( '10.0.0.1',  'https://acme.com/index.html',     1692812175),
      ( '10.0.0.12', 'https://apache.org/index.html',   1692826575),
      ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575),
      ( '10.0.0.1',  'https://acme.com/index.html',     1692812175),
      ( '10.0.0.12', 'https://apache.org/index.html',   1692819375),
      ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575);
```

```sql
CREATE TABLE filtered_clicks (
  ip_address VARCHAR,
  url VARCHAR,
  click_ts_raw BIGINT
);
```

```sql
INSERT INTO filtered_clicks(
  ip_address,
  url,
  click_ts_raw
)
SELECT * FROM clicks WHERE ip_address = '10.0.0.1';
```

```sql
SELECT * FROM filtered_clicks;
```

```text
ip_address url                         click_ts_raw
10.0.0.1   https://acme.com/index.html 1692812175
10.0.0.1   https://acme.com/index.html 1692812175
```
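The `filtered_clicks` table stores the click time as raw epoch seconds in `click_ts_raw`. If a readable timestamp is preferred, the conversion can be done in the SELECT part of the INSERT statement. A minimal sketch, assuming a hypothetical target table `clicks_with_ts` with a `click_timestamp` column (neither is part of the original example) and assuming `click_ts_raw` holds seconds since the epoch:

```sql
-- Hypothetical target table; the table name and the click_timestamp column are assumptions.
CREATE TABLE clicks_with_ts (
  ip_address VARCHAR,
  url VARCHAR,
  click_timestamp TIMESTAMP_LTZ(3)
);

-- TO_TIMESTAMP_LTZ(value, 0) interprets the value as epoch seconds.
INSERT INTO clicks_with_ts (ip_address, url, click_timestamp)
SELECT ip_address, url, TO_TIMESTAMP_LTZ(click_ts_raw, 0)
FROM clicks
WHERE ip_address = '10.0.0.1';
```

The displayed value of `click_timestamp` depends on the session time zone, so the same row can render differently in different environments.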

```sql
CREATE TABLE t_insert_gaps (c1 STRING, c2 STRING, c3 STRING, c4 STRING);

INSERT INTO t_insert_gaps (c3) SELECT 'Bob';

INSERT INTO t_insert_gaps (c3, c2) SELECT 'Bob', 'Alice';

SELECT * FROM t_insert_gaps;
```
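To make the effect of the column list concrete, the second INSERT above can also be written without a column list by supplying a value for every column. This is a sketch for comparison only, using the same `t_insert_gaps` table:

```sql
-- Same effect as: INSERT INTO t_insert_gaps (c3, c2) SELECT 'Bob', 'Alice';
-- Columns that were omitted from the column list are written as explicit NULLs.
INSERT INTO t_insert_gaps
SELECT
  CAST(NULL AS STRING) AS c1,
  'Alice'              AS c2,
  'Bob'                AS c3,
  CAST(NULL AS STRING) AS c4;
```

After the two original INSERT statements, the table holds the rows (NULL, NULL, 'Bob', NULL) and (NULL, 'Alice', 'Bob', NULL).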

---

### SQL INSERT VALUES Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/insert-values.html

INSERT VALUES Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables inserting data directly into a Flink SQL table. Syntax¶ [EXECUTE] INSERT { INTO | OVERWRITE } [catalog_name.][database_name.]table_name VALUES (value1 [, value2, ...]) [, (value3 [, value4, ...])] Description¶ Insert data into a table. Use the INSERT VALUES statement to insert one or more rows into a table by specifying the value for each column. For example, the following statement inserts a single row into a table named orders that has four columns. INSERT INTO orders VALUES (1, 1001, '2023-02-24', 50.0); You can insert multiple rows by using a comma-separated list of values. INSERT INTO orders VALUES (1, 1001, '2023-02-24', 50.0), (2, 1002, '2023-02-25', 60.0), (3, 1003, '2023-02-26', 70.0); Example¶ In the Flink SQL shell or in a Cloud Console workspace, run the following commands to see an example of the INSERT VALUES statement. Create a users table. -- Create a users table. CREATE TABLE users ( user_id STRING, registertime BIGINT, gender STRING, regionid STRING ); Insert rows into the users table. -- Populate the table with mock users data. INSERT INTO users VALUES ('Thomas A. Anderson', 1677260724, 'male', 'Region_4'), ('Trinity', 1677260733, 'female', 'Region_4'), ('Morpheus', 1677260742, 'male', 'Region_8'), ('Dozer', 1677260823, 'male', 'Region_1'), ('Agent Smith', 1677260955, 'male', 'Region_0'), ('Persephone', 1677260901, 'female', 'Region_2'), ('Niobe', 1677260921, 'female', 'Region_3'), ('Zee', 1677260922, 'female', 'Region_5'); Inspect the inserted rows. SELECT * FROM users; Your output should resemble: user_id registertime gender regionid Thomas A. 
Anderson 1677260724 male Region_4 Trinity 1677260733 female Region_4 Morpheus 1677260742 male Region_8 Dozer 1677260823 male Region_1 Agent Smith 1677260955 male Region_0 Persephone 1677260901 female Region_2 Niobe 1677260921 female Region_3 Zee 1677260922 female Region_5 Related content¶ INSERT INTO FROM SELECT Flink SQL Queries Flink SQL Functions Statements Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
[EXECUTE] INSERT { INTO | OVERWRITE } [catalog_name.][database_name.]table_name VALUES
  (value1 [, value2, ...])
  [, (value3 [, value4, ...])]
```

```sql
INSERT INTO orders VALUES (1, 1001, '2023-02-24', 50.0);
```

```sql
INSERT INTO orders VALUES
  (1, 1001, '2023-02-24', 50.0),
  (2, 1002, '2023-02-25', 60.0),
  (3, 1003, '2023-02-26', 70.0);
```
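The `orders` table used in the two statements above is not defined on this page. A minimal sketch of a schema that would accept those rows; the column names and types are assumptions chosen only to match the four values in each row:

```sql
-- Hypothetical schema; column names and types are assumptions, not taken from the original page.
CREATE TABLE orders (
  order_id    INT,
  customer_id INT,
  order_date  STRING,
  amount      DECIMAL(10, 2)
);
```

Values are assigned to columns by position, so the order of values in each row must follow the order of columns in the table definition.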

```sql
-- Create a users table.
CREATE TABLE users (
  user_id STRING,
  registertime BIGINT,
  gender STRING,
  regionid STRING
);
```

```sql
-- Populate the table with mock users data.
INSERT INTO users VALUES
  ('Thomas A. Anderson', 1677260724, 'male', 'Region_4'),
  ('Trinity', 1677260733, 'female', 'Region_4'),
  ('Morpheus', 1677260742, 'male', 'Region_8'),
  ('Dozer', 1677260823, 'male', 'Region_1'),
  ('Agent Smith', 1677260955, 'male', 'Region_0'),
  ('Persephone', 1677260901, 'female', 'Region_2'),
  ('Niobe', 1677260921, 'female', 'Region_3'),
  ('Zee', 1677260922, 'female', 'Region_5');
```

```sql
SELECT * FROM users;
```

```text
user_id            registertime gender regionid
Thomas A. Anderson 1677260724   male   Region_4
Trinity            1677260733   female Region_4
Morpheus           1677260742   male   Region_8
Dozer              1677260823   male   Region_1
Agent Smith        1677260955   male   Region_0
Persephone         1677260901   female Region_2
Niobe              1677260921   female Region_3
Zee                1677260922   female Region_5
```

---

### SQL Join Queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/joins.html

Join Queries in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables joining data streams over Flink SQL dynamic tables. Description¶ Flink supports complex and flexible join operations over dynamic tables. There are a number of different types of joins to account for the wide variety of semantics that queries may require. By default, the order of joins is not optimized. Tables are joined in the order in which they are specified in the FROM clause. You can tweak the performance of your join queries by listing the tables with the lowest update frequency first and the tables with the highest update frequency last. Make sure to specify tables in an order that doesn’t yield a cross join (Cartesian product), which isn’t supported and would cause the query to fail. Regular joins¶ Regular joins are the most generic type of join in which any new row, or changes to either side of the join, are visible and affect the whole join result. For example, if there is a new record on the left side, it is joined with all of the previous and future records on the right side when the join fields are equal. SELECT * FROM orders INNER JOIN Product ON orders.product_id = Product.id For streaming queries, the grammar of regular joins is the most flexible and enables any kind of updates (insert, update, delete) on the input table. But this operation has important implications: it requires keeping both sides of the join input in state forever, so the required state for computing the query result might grow indefinitely, depending on the number of distinct input rows of all input tables and intermediate join results. INNER Equi-JOIN¶ Returns a simple Cartesian product restricted by the join condition. Only equi-joins are supported, i.e., joins that have at least one conjunctive condition with an equality predicate. Arbitrary cross or theta joins aren’t supported.
SELECT * FROM orders INNER JOIN Product ON orders.product_id = Product.id OUTER Equi-JOIN¶ Returns all rows in the qualified Cartesian product (i.e., all combined rows that pass its join condition), plus one copy of each row in an outer table for which the join condition did not match with any row of the other table. Flink supports LEFT, RIGHT, and FULL outer joins. Only equi-joins are supported, i.e., joins that have at least one conjunctive condition with an equality predicate. Arbitrary cross or theta joins aren’t supported. SELECT * FROM orders LEFT JOIN Product ON orders.product_id = Product.id SELECT * FROM orders RIGHT JOIN Product ON orders.product_id = Product.id SELECT * FROM orders FULL OUTER JOIN Product ON orders.product_id = Product.id Interval joins¶ An interval join returns a simple Cartesian product restricted by the join condition and a time constraint. An interval join requires at least one equi-join predicate and a join condition that bounds the time on both sides. Such a condition can be defined by two appropriate range predicates (<, <=, >=, >), a BETWEEN predicate, or a single equality predicate that compares time attributes of both input tables. For example, the following query joins all orders with their corresponding shipments if the order was shipped within four hours after the order was received. SELECT * FROM orders o, Shipments s WHERE o.id = s.order_id AND o.order_time BETWEEN s.ship_time - INTERVAL '4' HOUR AND s.ship_time The following predicates are examples of valid interval join conditions: ltime = rtime ltime >= rtime AND ltime < rtime + INTERVAL '10' MINUTE ltime BETWEEN rtime - INTERVAL '10' SECOND AND rtime + INTERVAL '5' SECOND For streaming queries, unlike the regular join, the interval join supports only append-only tables with time attributes. Because time attributes increase quasi-monotonically, Flink can remove old values from its state without affecting the correctness of the result.
Temporal joins¶ A temporal join joins one table with another table that is updated over time. This join is made possible by linking both tables using a time attribute, which allows the join to consider the historical changes in the table. When viewing the table at a specific point in time, the join becomes a time-versioned join. In a temporal join, the join condition is based on a time attribute, and the join result includes all rows that satisfy the temporal relationship. A common use case for temporal joins is analyzing financial data, which often includes information that changes over time, such as stock prices, interest rates, and exchange rates. Event-time temporal joins¶ Event-time temporal joins are used to join two or more tables based on a common event time. Event time is the time at which an event occurred, which is typically embedded in the data itself. With Confluent Cloud for Apache Flink, you can use the $rowtime system column to get the timestamp from an Apache Kafka® record. This is also used for the default watermark strategy in Confluent Cloud. Temporal joins take an arbitrary table (left input/probe side) and correlate each row to the corresponding row’s relevant version in the versioned table (right input/build side). Flink uses the SQL syntax of FOR SYSTEM_TIME AS OF to perform this operation from the SQL:2011 standard. The syntax of a temporal join is as follows: SELECT [column_list] FROM table1 [AS <alias1>] [LEFT] JOIN table2 FOR SYSTEM_TIME AS OF table1.{ rowtime } [AS <alias2>] ON table1.column-name1 = table2.column-name1 With an event-time attribute, you can retrieve the value of a key as it was at some point in the past. This enables joining the two tables at a common point in time. The versioned table stores all versions, identified by time, since the last watermark. For example, suppose you have a table of orders, each with prices in different currencies. 
To properly normalize this table to a single currency, such as USD, each order needs to be joined with the proper currency conversion rate from the point in time when the order was placed. CREATE TABLE orders ( order_id STRING, price DECIMAL(32,2), currency STRING ); CREATE TABLE currency_rates ( currency STRING, conversion_rate DECIMAL(32, 2), PRIMARY KEY(currency) NOT ENFORCED ); SELECT orders.order_id, orders.price, orders.currency, currency_rates.conversion_rate FROM orders LEFT JOIN currency_rates FOR SYSTEM_TIME AS OF orders.`$rowtime` ON orders.currency = currency_rates.currency; The event-time temporal join requires that the primary key of the versioned table be contained in the equality condition of the join. In this example, the primary key currency_rates.currency of the currency_rates table is constrained by the condition orders.currency = currency_rates.currency. With temporal joins, there’s some indeterminate amount of latency involved. In the example with orders and currency_rates, when enriching a particular order, an event-time temporal join waits until the watermark on the currency-rate stream reaches the timestamp of that order, because only then is it reasonable to be confident that the result of the join is being produced with complete knowledge of the relevant exchange-rate data. Event-time temporal joins can’t guarantee perfectly correct results. Despite having waited for the watermark, the most relevant exchange-rate record could still be late, in which case the join will be executed using an earlier version of the exchange rate. If the enrichment stream has infrequent updates, this will cause problems, because of the behavior of watermarking on idle streams. The operator, like any operator with two input streams, normally waits for the watermarks on both incoming streams to reach the desired timestamp before taking action. Array expansion¶ Returns a new row for each element in the given array.
Unnesting WITH ORDINALITY is not yet supported. SELECT order_id, tag FROM orders CROSS JOIN UNNEST(tags) AS t (tag) Related content¶ Confluent Developer: Temporal Joins Explained Flink SQL Queries Flink SQL Functions Statements Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
SELECT * FROM orders
INNER JOIN Product
ON orders.product_id = Product.id
```

```sql
SELECT *
FROM orders
INNER JOIN Product
ON orders.product_id = Product.id
```

```sql
SELECT *
FROM orders
LEFT JOIN Product
ON orders.product_id = Product.id

SELECT *
FROM orders
RIGHT JOIN Product
ON orders.product_id = Product.id

SELECT *
FROM orders
FULL OUTER JOIN Product
ON orders.product_id = Product.id
```

```sql
SELECT *
FROM orders o, Shipments s
WHERE o.id = s.order_id
AND o.order_time BETWEEN s.ship_time - INTERVAL '4' HOUR AND s.ship_time
```
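The same time bound can also be expressed with explicit JOIN ... ON syntax and two range predicates, which is one of the condition forms the description lists. A sketch under the same assumptions as the query above (tables `orders` and `Shipments`, with `order_time` and `ship_time` as time attributes); it is intended to be logically equivalent to the BETWEEN form:

```sql
-- Equi-join predicate plus two range predicates that bound the time on both sides:
-- the shipment happens at or after the order, and at most four hours later.
SELECT *
FROM orders o
INNER JOIN Shipments s
  ON o.id = s.order_id
 AND s.ship_time >= o.order_time
 AND s.ship_time <= o.order_time + INTERVAL '4' HOUR
```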

```sql
ltime = rtime
```

```sql
ltime >= rtime AND ltime < rtime + INTERVAL '10' MINUTE
```

```sql
ltime BETWEEN rtime - INTERVAL '10' SECOND AND rtime + INTERVAL '5' SECOND
```

```sql
SELECT [column_list]
FROM table1 [AS <alias1>]
[LEFT] JOIN table2 FOR SYSTEM_TIME AS OF table1.{ rowtime } [AS <alias2>]
ON table1.column-name1 = table2.column-name1
```

```sql
CREATE TABLE orders (
    order_id    STRING,
    price       DECIMAL(32,2),
    currency    STRING
);

CREATE TABLE currency_rates (
    currency STRING,
    conversion_rate DECIMAL(32, 2),
    PRIMARY KEY(currency) NOT ENFORCED
);

SELECT
     orders.order_id,
     orders.price,
     orders.currency,
     currency_rates.conversion_rate
FROM orders
LEFT JOIN currency_rates FOR SYSTEM_TIME AS OF orders.`$rowtime`
ON orders.currency = currency_rates.currency;
```
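The stated goal of this example is to normalize order prices to a single currency. A minimal sketch that extends the query above with the converted price, assuming `conversion_rate` is the multiplier from the order's currency to the target currency; the alias `price_usd` is illustrative:

```sql
SELECT
     orders.order_id,
     orders.price,
     orders.currency,
     currency_rates.conversion_rate,
     -- NULL when no rate version exists for the order's currency (LEFT JOIN).
     orders.price * currency_rates.conversion_rate AS price_usd
FROM orders
LEFT JOIN currency_rates FOR SYSTEM_TIME AS OF orders.`$rowtime`
ON orders.currency = currency_rates.currency;
```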

```sql
SELECT order_id, tag
FROM orders CROSS JOIN UNNEST(tags) AS t (tag)
```
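The array expansion query assumes a table with an array-typed column. A minimal sketch of such a schema, using a hypothetical table `tagged_orders` (the name and columns are assumptions for illustration) so that it does not clash with other `orders` definitions on this page:

```sql
-- Hypothetical table; the name and columns are assumptions for illustration.
CREATE TABLE tagged_orders (
  order_id STRING,
  tags     ARRAY<STRING>
);

-- Produces one output row per (order_id, tag) pair.
SELECT order_id, tag
FROM tagged_orders CROSS JOIN UNNEST(tags) AS t (tag);
```

With this schema, an input row whose `tags` array holds two elements would yield two output rows that share the same `order_id`, while a row with an empty array would yield none.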

---

### SQL LIMIT clause in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/limit.html

LIMIT Clause in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables constraining the number of rows returned by a SELECT statement. Description¶ The LIMIT clause constrains the number of rows returned by a SELECT statement. Usually, this clause is used in conjunction with ORDER BY to ensure that the results are deterministic. Example¶ In the Flink SQL shell or in a Cloud Console workspace, run the following commands to see an example of the LIMIT clause. The following example selects the first 3 rows from a web page clicks table. Create a table for web page click events. -- Create a table for web page click events. CREATE TABLE clicks ( ip_address VARCHAR, url VARCHAR, click_ts_raw BIGINT ); Populate the table with mock clickstream data. -- Populate the table with mock clickstream data. INSERT INTO clicks VALUES( '10.0.0.1', 'https://acme.com/index.html', 1692812175), ( '10.0.0.12', 'https://apache.org/index.html', 1692826575), ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575), ( '10.0.0.1', 'https://acme.com/index.html', 1692812175), ( '10.0.0.12', 'https://apache.org/index.html', 1692819375), ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575); Press ENTER to return to the SQL shell. Because INSERT INTO VALUES is a point-in-time statement, it exits after it completes inserting records. View the rows in the clicks table and limit the result to 3 rows. SELECT * FROM clicks LIMIT 3; Your output should resemble: ip_address url click_ts_raw 10.0.0.1 https://acme.com/index.html 1692812175 10.0.0.12 https://apache.org/index.html 1692826575 10.0.0.13 https://confluent.io/index.html 1692826575 Related content¶ Flink SQL Queries Flink SQL Functions Statements Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
-- Create a table for web page click events.
CREATE TABLE clicks (
  ip_address VARCHAR,
  url VARCHAR,
  click_ts_raw BIGINT
);
```

```sql
-- Populate the table with mock clickstream data.
INSERT INTO clicks
VALUES( '10.0.0.1',  'https://acme.com/index.html',     1692812175),
      ( '10.0.0.12', 'https://apache.org/index.html',   1692826575),
      ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575),
      ( '10.0.0.1',  'https://acme.com/index.html',     1692812175),
      ( '10.0.0.12', 'https://apache.org/index.html',   1692819375),
      ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575);
```

```sql
SELECT * FROM clicks LIMIT 3;
```

```text
ip_address url                             click_ts_raw
10.0.0.1   https://acme.com/index.html     1692812175
10.0.0.12  https://apache.org/index.html   1692826575
10.0.0.13  https://confluent.io/index.html 1692826575
```
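The description notes that LIMIT is usually combined with ORDER BY to make the result deterministic. A sketch of that combination on the same `clicks` table, assuming that sorting on the non-time column `click_ts_raw` together with LIMIT is accepted in your environment:

```sql
-- Return the three most recent clicks by raw timestamp.
SELECT * FROM clicks
ORDER BY click_ts_raw DESC
LIMIT 3;
```

For the mock data above, exactly three rows share the largest `click_ts_raw` value (1692826575), so those three rows would be returned; their relative order is not determined by the sort key.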

---

### SQL Pattern Recognition Queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/match_recognize.html

Pattern Recognition Queries in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables pattern detection in event streams. Syntax¶ SELECT T.aid, T.bid, T.cid FROM MyTable MATCH_RECOGNIZE ( PARTITION BY userid ORDER BY $rowtime MEASURES A.id AS aid, B.id AS bid, C.id AS cid PATTERN (A B C) DEFINE A AS name = 'a', B AS name = 'b', C AS name = 'c' ) AS T Pattern recognition¶ It is a common use case to search for a set of event patterns, especially in case of data streams. Apache Flink® comes with a complex event processing (CEP) library, which enables pattern detection in event streams. Furthermore, the Flink SQL API provides a relational way of expressing queries with a large set of built-in functions and rule-based optimizations that you can use out of the box. In December 2016, the International Organization for Standardization (ISO) released a new version of the SQL standard which includes Row Pattern Recognition in SQL (ISO/IEC TR 19075-5:2016). It enables Flink to consolidate CEP and SQL API using the MATCH_RECOGNIZE clause for complex event processing in SQL. A MATCH_RECOGNIZE clause enables the following tasks: Logically partition and order the data that is used with the PARTITION BY and ORDER BY clauses. Define patterns of rows to seek using the PATTERN clause. These patterns use a syntax similar to that of regular expressions. The logical components of the row pattern variables are specified in the DEFINE clause. Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. This topic explains each keyword in more detail and illustrates more complex examples. Important The Flink implementation of the MATCH_RECOGNIZE clause is a subset of the full standard. Only the features documented in the following sections are supported. For more information, see Known limitations. 
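As a reading aid, the syntax example above can be laid out with one comment per clause, mapping each clause to the task it performs. This is the same query, assuming the same `MyTable` with `userid`, `id`, and `name` columns; only the layout and comments are added:

```sql
SELECT T.aid, T.bid, T.cid
FROM MyTable
  MATCH_RECOGNIZE (
    -- Logically partition and order the input rows.
    PARTITION BY userid
    ORDER BY $rowtime
    -- Measures: expressions exposed to the outer query.
    MEASURES
      A.id AS aid,
      B.id AS bid,
      C.id AS cid
    -- Row pattern in a regex-like syntax: an A row, then a B row, then a C row.
    PATTERN (A B C)
    -- Conditions that each pattern variable must satisfy.
    DEFINE
      A AS name = 'a',
      B AS name = 'b',
      C AS name = 'c'
  ) AS T
```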
Installation¶ To use the MATCH_RECOGNIZE clause in the Flink SQL CLI, no action is necessary, because all dependencies are included by default. SQL semantics¶ Every MATCH_RECOGNIZE query consists of the following clauses: PARTITION BY - defines the logical partitioning of the table, similar to a GROUP BY operation. ORDER BY - specifies how the incoming rows should be ordered, which is essential, because patterns depend on an order. MEASURES - defines the output of the clause, similar to a SELECT clause. ONE ROW PER MATCH - output mode that defines how many rows per match to produce. AFTER MATCH SKIP - specifies where the next match should start. This is also a way to control how many distinct matches a single event can belong to. PATTERN - enables constructing patterns that will be searched for using a syntax that’s similar to regular expressions. DEFINE - defines the conditions that the pattern variables must satisfy. Examples¶ These examples assume that a table Ticker has been registered. The table contains prices of stocks at a particular point in time. The table has the following schema: Ticker |-- symbol: String # symbol of the stock |-- price: Long # price of the stock |-- tax: Long # tax liability of the stock |-- rowtime: TimeIndicatorTypeInfo(rowtime) # point in time when the change to those values happened For simplicity, only the incoming data for a single stock, named ACME, is considered. A ticker could look similar to the following table, where rows are continuously appended.
symbol rowtime price tax ====== ==================== ======= ======= 'ACME' '01-Apr-11 10:00:00' 12 1 'ACME' '01-Apr-11 10:00:01' 17 2 'ACME' '01-Apr-11 10:00:02' 19 1 'ACME' '01-Apr-11 10:00:03' 21 3 'ACME' '01-Apr-11 10:00:04' 25 2 'ACME' '01-Apr-11 10:00:05' 18 1 'ACME' '01-Apr-11 10:00:06' 15 1 'ACME' '01-Apr-11 10:00:07' 14 2 'ACME' '01-Apr-11 10:00:08' 24 2 'ACME' '01-Apr-11 10:00:09' 25 2 'ACME' '01-Apr-11 10:00:10' 19 1 The task is to find periods in which the price of a single ticker decreases continuously. To accomplish this, you could write a query like the following: SELECT * FROM Ticker MATCH_RECOGNIZE ( PARTITION BY symbol ORDER BY $rowtime MEASURES START_ROW.rowtime AS start_tstamp, LAST(PRICE_DOWN.$rowtime) AS bottom_tstamp, LAST(PRICE_UP.$rowtime) AS end_tstamp ONE ROW PER MATCH AFTER MATCH SKIP TO LAST PRICE_UP PATTERN (START_ROW PRICE_DOWN+ PRICE_UP) DEFINE PRICE_DOWN AS (LAST(PRICE_DOWN.price, 1) IS NULL AND PRICE_DOWN.price < START_ROW.price) OR PRICE_DOWN.price < LAST(PRICE_DOWN.price, 1), PRICE_UP AS PRICE_UP.price > LAST(PRICE_DOWN.price, 1) ) MR; The query partitions the Ticker table by the symbol column and orders it by the rowtime time attribute. The PATTERN clause specifies a pattern with a starting event START_ROW that is followed by one or more PRICE_DOWN events and concluded with a PRICE_UP event. If such a pattern is found, the search for the next match starts at the last PRICE_UP event, as indicated by the AFTER MATCH SKIP TO LAST clause. The DEFINE clause specifies the conditions that must be met for a PRICE_DOWN and a PRICE_UP event. Although the START_ROW pattern variable has no entry in the DEFINE clause, it has an implicit condition that always evaluates to TRUE. A pattern variable PRICE_DOWN is defined as a row with a price that is smaller than the price of the last row that met the PRICE_DOWN condition. 
For the initial case or when there is no last row that met the PRICE_DOWN condition, the price of the row should be smaller than the price of the preceding row in the pattern (referenced by START_ROW). A pattern variable PRICE_UP is defined as a row with a price that is larger than the price of the last row that met the PRICE_DOWN condition. This query produces a summary row for each period in which the price of a stock was continuously decreasing. The exact representation of the output rows is defined in the MEASURES part of the query. The number of output rows is defined by the ONE ROW PER MATCH output mode. symbol start_tstamp bottom_tstamp end_tstamp ========= ================== ================== ================== ACME 01-APR-11 10:00:04 01-APR-11 10:00:07 01-APR-11 10:00:08 The resulting row describes a period of falling prices that started at 01-APR-11 10:00:04, reached its lowest price at 01-APR-11 10:00:07, and ended when the price rose again at 01-APR-11 10:00:08. Partitioning¶ It is possible to look for patterns in partitioned data, for example, trends for a single ticker or a particular user. This can be expressed using the PARTITION BY clause. The clause is similar to using GROUP BY for aggregations. Partitioning the incoming data is highly recommended, because otherwise the MATCH_RECOGNIZE clause is translated into a non-parallel operator to ensure global ordering. Order of events¶ Flink enables searching for patterns based on event time. Processing time is not supported in Confluent Cloud for Apache Flink. With event time, the events are sorted before they are passed to the internal pattern state machine. As a consequence, the produced output is correct regardless of the order in which rows are appended to the table, because the pattern is evaluated in the order specified by the time contained in each row. The MATCH_RECOGNIZE clause assumes a time attribute with ascending ordering as the first argument to the ORDER BY clause. 
For the example Ticker table, a definition like ORDER BY rowtime ASC, price DESC is valid but ORDER BY price, rowtime or ORDER BY rowtime DESC, price ASC is not. Define and measures¶ The DEFINE and MEASURES keywords have similar meanings to the WHERE and SELECT clauses in a simple SQL query. The MEASURES clause defines what will be included in the output of a matching pattern. It can project columns and define expressions for evaluation. The number of produced rows depends on the output mode setting. The DEFINE clause specifies the conditions that rows must fulfill to be mapped to the corresponding pattern variable. If a condition isn’t defined for a pattern variable, a default condition is used, which evaluates to TRUE for every row. For a more detailed explanation of the expressions that can be used in these clauses, see the Pattern navigation section. Aggregations¶ Aggregations can be used in DEFINE and MEASURES clauses. Built-in functions are supported. Aggregate functions are applied to each subset of rows mapped to a match. To understand how these subsets are evaluated, see the Pattern navigation section. The task of the following example is to find the longest period of time for which the average price of a ticker did not go below a certain threshold. It shows how expressive MATCH_RECOGNIZE becomes with aggregations. The following query performs this task. 
SELECT * FROM Ticker MATCH_RECOGNIZE ( PARTITION BY symbol ORDER BY rowtime MEASURES FIRST(A.rowtime) AS start_tstamp, LAST(A.rowtime) AS end_tstamp, AVG(A.price) AS avgPrice ONE ROW PER MATCH AFTER MATCH SKIP PAST LAST ROW PATTERN (A+ B) DEFINE A AS AVG(A.price) < 15 ) MR; Given this query and following input values: symbol rowtime price tax ====== ==================== ======= ======= 'ACME' '01-Apr-11 10:00:00' 12 1 'ACME' '01-Apr-11 10:00:01' 17 2 'ACME' '01-Apr-11 10:00:02' 13 1 'ACME' '01-Apr-11 10:00:03' 16 3 'ACME' '01-Apr-11 10:00:04' 25 2 'ACME' '01-Apr-11 10:00:05' 2 1 'ACME' '01-Apr-11 10:00:06' 4 1 'ACME' '01-Apr-11 10:00:07' 10 2 'ACME' '01-Apr-11 10:00:08' 15 2 'ACME' '01-Apr-11 10:00:09' 25 2 'ACME' '01-Apr-11 10:00:10' 25 1 'ACME' '01-Apr-11 10:00:11' 30 1 The query accumulates events as part of the pattern variable A, as long as their average price doesn’t exceed 15. For example, such a limit exceeding happens at 01-Apr-11 10:00:04. The following period exceeds the average price of 15 again at 01-Apr-11 10:00:11. Here are results of the query: symbol start_tstamp end_tstamp avgPrice ========= ================== ================== ============ ACME 01-APR-11 10:00:00 01-APR-11 10:00:03 14.5 ACME 01-APR-11 10:00:05 01-APR-11 10:00:10 13.5 Aggregations can be applied to expressions, but only if they reference a single pattern variable. For example, SUM(A.price * A.tax) is valid, but AVG(A.price * B.tax) is not. Note DISTINCT aggregations aren’t supported. Define a pattern¶ The MATCH_RECOGNIZE clause enables you to search for patterns in event streams using a powerful and expressive syntax that is somewhat similar to the widely used regular expression syntax. Every pattern is constructed from basic building blocks, called pattern variables, to which operators (quantifiers and other modifiers) can be applied. The whole pattern must be enclosed in brackets. 
The following SQL shows an example pattern: PATTERN (A B+ C* D) You can use the following operators: Concatenation - a pattern like (A B) means that the contiguity is strict between A and B, so there can be no rows that weren’t mapped to A or B in between. Quantifiers - modify the number of rows that can be mapped to the pattern variable. * — 0 or more rows + — 1 or more rows ? — 0 or 1 rows { n } — exactly n rows (n > 0) { n, } — n or more rows (n ≥ 0) { n, m } — between n and m (inclusive) rows (0 ≤ n ≤ m, 0 < m) { , m } — between 0 and m (inclusive) rows (m > 0) Important Patterns that can potentially produce an empty match aren’t supported. For example, patterns like these produce an empty match: PATTERN (A*) PATTERN (A? B*) PATTERN (A{0,} B{0,} C*) Greedy and reluctant quantifiers¶ Each quantifier can be either greedy (default behavior) or reluctant. Greedy quantifiers try to match as many rows as possible, while reluctant quantifiers try to match as few as possible. To see the difference, the following example shows a query where a greedy quantifier is applied to the B variable: SELECT * FROM Ticker MATCH_RECOGNIZE( PARTITION BY symbol ORDER BY rowtime MEASURES C.price AS lastPrice ONE ROW PER MATCH AFTER MATCH SKIP PAST LAST ROW PATTERN (A B* C) DEFINE A AS A.price > 10, B AS B.price < 15, C AS C.price > 12 ) Given the following input: symbol tax price rowtime ======= ===== ======== ===================== XYZ 1 10 2018-09-17 10:00:02 XYZ 2 11 2018-09-17 10:00:03 XYZ 1 12 2018-09-17 10:00:04 XYZ 2 13 2018-09-17 10:00:05 XYZ 1 14 2018-09-17 10:00:06 XYZ 2 16 2018-09-17 10:00:07 The example pattern produces the following output: symbol lastPrice ======== =========== XYZ 16 If the query is modified to be reluctant, changing B* to B*?, it produces the following output: symbol lastPrice ======== =========== XYZ 13 XYZ 16 The pattern variable B matches only the row with price 12 instead of swallowing the rows with prices 12, 13, and 14. 
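As a rough illustration of the difference between the two quantifier modes, the following Python sketch replays the example input through a small hand-written matcher for `PATTERN (A B* C)`. It is a simplification and not how Flink evaluates patterns internally: the helper names `match_from` and `last_prices` are invented for this sketch, and it assumes a single partition and `AFTER MATCH SKIP PAST LAST ROW`.

```python
def match_from(prices, start, greedy):
    """Try to match A B* C with A at index `start`; return the index of C or None."""
    if not prices[start] > 10:              # A AS A.price > 10
        return None
    max_b = 0
    while start + 1 + max_b < len(prices) and prices[start + 1 + max_b] < 15:
        max_b += 1                          # B AS B.price < 15
    # Greedy tries the longest run of B rows first, reluctant the shortest.
    counts = range(max_b, -1, -1) if greedy else range(0, max_b + 1)
    for n_b in counts:
        c = start + 1 + n_b
        if c < len(prices) and prices[c] > 12:   # C AS C.price > 12
            return c
    return None


def last_prices(prices, greedy):
    """Return the lastPrice measure of every match (SKIP PAST LAST ROW)."""
    results, start = [], 0
    while start < len(prices):
        c = match_from(prices, start, greedy)
        if c is None:
            start += 1
        else:
            results.append(prices[c])
            start = c + 1
    return results


prices = [10, 11, 12, 13, 14, 16]
print(last_prices(prices, greedy=True))    # [16]
print(last_prices(prices, greedy=False))   # [13, 16]
```

With the greedy setting, B absorbs the rows priced 12, 13, and 14 before C is tried, which leaves a single match. With the reluctant setting, C is tried as early as possible, so the input yields two shorter matches.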
You can’t use a greedy quantifier for the last variable of a pattern. So a pattern like (A B*) isn’t valid. You can work around this limitation by introducing an artificial state, like C, that has a negated condition of B. The following query shows an example. PATTERN (A B* C) DEFINE A AS condA(), B AS condB(), C AS NOT condB() Note The optional-reluctant quantifier (A?? or A{0,1}?) isn’t supported. Time constraint¶ Especially for streaming use cases, it’s often required that a pattern finishes within a given period of time. This enables limiting the overall state size that Flink must maintain internally, even in the case of greedy quantifiers. For this reason, Flink SQL supports the additional (non-standard SQL) WITHIN clause for defining a time constraint for a pattern. The clause can be defined after the PATTERN clause and takes an interval of millisecond resolution. If the time between the first and last event of a potential match is longer than the given value, a match isn’t appended to the result table. Note It’s good practice to use the WITHIN clause, because it helps Flink with efficient memory management. Underlying state can be pruned once the threshold is reached. But the WITHIN clause isn’t part of the SQL standard. The recommended way of dealing with time constraints might change in the future. The following example query shows the WITHIN clause used with MATCH_RECOGNIZE. SELECT * FROM Ticker MATCH_RECOGNIZE( PARTITION BY symbol ORDER BY rowtime MEASURES C.rowtime AS dropTime, A.price - C.price AS dropDiff ONE ROW PER MATCH AFTER MATCH SKIP PAST LAST ROW PATTERN (A B* C) WITHIN INTERVAL '1' HOUR DEFINE B AS B.price > A.price - 10, C AS C.price < A.price - 10 ) The query detects a price drop of 10 that happens within an interval of 1 hour. Assume the query is used to analyze the following ticker data. 
symbol rowtime price tax ====== ==================== ======= ======= 'ACME' '01-Apr-11 10:00:00' 20 1 'ACME' '01-Apr-11 10:20:00' 17 2 'ACME' '01-Apr-11 10:40:00' 18 1 'ACME' '01-Apr-11 11:00:00' 11 3 'ACME' '01-Apr-11 11:20:00' 14 2 'ACME' '01-Apr-11 11:40:00' 9 1 'ACME' '01-Apr-11 12:00:00' 15 1 'ACME' '01-Apr-11 12:20:00' 14 2 'ACME' '01-Apr-11 12:40:00' 24 2 'ACME' '01-Apr-11 13:00:00' 1 2 'ACME' '01-Apr-11 13:20:00' 19 1 The query produces the following results: symbol dropTime dropDiff ====== ==================== ============= 'ACME' '01-Apr-11 13:00:00' 14 The resulting row represents a price drop from 15 (at 01-Apr-11 12:00:00) to 1 (at 01-Apr-11 13:00:00). The dropDiff column contains the price difference. Prices also drop by larger amounts elsewhere in the data, for example, by 11 (between 01-Apr-11 10:00:00 and 01-Apr-11 11:40:00), but the time difference between those two events is larger than 1 hour, so they don’t produce a match. Output mode¶ The output mode describes how many rows should be emitted for every found match. The SQL standard describes two modes: ALL ROWS PER MATCH and ONE ROW PER MATCH. In Flink SQL, the only supported output mode is ONE ROW PER MATCH, and it always produces one output summary row for each found match. The schema of the output row is a concatenation of [partitioning columns] + [measures columns], in that order. 
The following example shows the output of a query defined as: SELECT * FROM Ticker MATCH_RECOGNIZE( PARTITION BY symbol ORDER BY rowtime MEASURES FIRST(A.price) AS startPrice, LAST(A.price) AS topPrice, B.price AS lastPrice ONE ROW PER MATCH PATTERN (A+ B) DEFINE A AS LAST(A.price, 1) IS NULL OR A.price > LAST(A.price, 1), B AS B.price < LAST(A.price) ) For the following input rows: symbol tax price rowtime ======== ===== ======== ===================== XYZ 1 10 2018-09-17 10:00:02 XYZ 2 12 2018-09-17 10:00:03 XYZ 1 13 2018-09-17 10:00:04 XYZ 2 11 2018-09-17 10:00:05 The query produces the following output: symbol startPrice topPrice lastPrice ======== ============ ========== =========== XYZ 10 13 11 The pattern recognition is partitioned by the symbol column. Even though it isn’t explicitly mentioned in the MEASURES clause, the partitioning column is added at the beginning of the result. Pattern navigation¶ The DEFINE and MEASURES clauses enable navigating within the list of rows that (potentially) match a pattern. This section discusses navigation for declaring conditions or producing output results. Pattern variable referencing¶ A pattern variable reference enables referencing a set of rows mapped to a particular pattern variable in the DEFINE or MEASURES clauses. For example, the expression A.price describes the set of rows mapped so far to A plus the current row, if the query tries to match the current row to A. If an expression in the DEFINE / MEASURES clause requires a single row, for example, A.price or A.price > 10, it selects the last value belonging to the corresponding set. If no pattern variable is specified, for example, SUM(price), an expression references the default pattern variable *, which references all variables in the pattern. In other words, it creates a list of all the rows mapped so far to any variable plus the current row. Example¶ For a more thorough example, consider the following pattern and corresponding conditions. 
PATTERN (A B+) DEFINE A AS A.price >= 10, B AS B.price > A.price AND SUM(price) < 100 AND SUM(B.price) < 80 The following table describes how these conditions are evaluated for each incoming event. The table consists of the following columns: # - the row identifier that uniquely identifies an incoming row in the lists [A.price] / [B.price] / [price]. price - the price of the incoming row. [A.price]/ [B.price]/ [price] - describe lists of rows which are used in the DEFINE clause to evaluate conditions. Classifier - the classifier of the current row which indicates the pattern variable the row is mapped to. A.price/ B.price/ SUM(price)/ SUM(B.price) - describes the result after those expressions have been evaluated. == ===== ========== ========= ============== ================== ======= ======= ========== ============ # price Classifier [A.price] [B.price] [price] A.price B.price SUM(price) SUM(B.price) == ===== ========== ========= ============== ================== ======= ======= ========== ============ #1 10 -> A #1 - - 10 - - - #2 15 -> B #1 #2 #1, #2 10 15 25 15 #3 20 -> B #1 #2, #3 #1, #2, #3 10 20 45 35 #4 31 -> B #1 #2, #3, #4 #1, #2, #3, #4 10 31 76 66 #5 35 #1 #2, #3, #4, #5 #1, #2, #3, #4, #5 10 35 111 101 == ===== ========== ========= ============== ================== ======= ======= ========== ============ The table shows that the first row is mapped to pattern variable A, and subsequent rows are mapped to pattern variable B. But the last row doesn’t fulfill the B condition, because the sum over all mapped rows, SUM(price), and the sum over all rows in B exceed the specified thresholds. Logical offsets¶ Logical offsets enable navigation within the events that were mapped to a particular pattern variable. This can be expressed with two corresponding functions. Offset functions Description LAST(variable.field, n) Returns the value of the field from the event that was mapped to the n-th last element of the variable. 
The counting starts at the last element mapped. FIRST(variable.field, n) Returns the value of the field from the event that was mapped to the n-th element of the variable. The counting starts at the first element mapped. Examples¶ For a more thorough example, consider the following pattern and corresponding conditions: PATTERN (A B+) DEFINE A AS A.price >= 10, B AS (LAST(B.price, 1) IS NULL OR B.price > LAST(B.price, 1)) AND (LAST(B.price, 2) IS NULL OR B.price > 2 * LAST(B.price, 2)) The following table describes how these conditions are evaluated for each incoming event. The table consists of the following columns: price - the price of the incoming row. Classifier - the classifier of the current row which indicates the pattern variable the row is mapped to. LAST(B.price, 1)/ LAST(B.price, 2) - describes the result after these expressions have been evaluated. ===== ========== ================ ================ ======================================================================================== price Classifier LAST(B.price, 1) LAST(B.price, 2) Comment ===== ========== ================ ================ ======================================================================================== 10 -> A 15 -> B null null Notice that ``LAST(B.price, 1)`` is null because there is still nothing mapped to ``B``. 20 -> B 15 null 31 -> B 20 15 35 31 20 Not mapped because ``35 < 2 * 20``. ===== ========== ================ ================ ======================================================================================== It might also make sense to use the default pattern variable with logical offsets. In this case, an offset considers all the rows mapped so far: PATTERN (A B? 
C) DEFINE B AS B.price < 20, C AS LAST(price, 1) < C.price ===== ========== ============== ===================================================================================== price Classifier LAST(price, 1) Comment ===== ========== ============== ===================================================================================== 10 -> A 15 -> B 20 -> C 15 ``LAST(price, 1)`` is evaluated as the price of the row mapped to the ``B`` variable. ===== ========== ============== ===================================================================================== If the second row didn’t map to the B variable, the query returns the following results: ===== ========== ============== ===================================================================================== price Classifier LAST(price, 1) Comment ===== ========== ============== ===================================================================================== 10 -> A 20 -> C 10 ``LAST(price, 1)`` is evaluated as the price of the row mapped to the ``A`` variable. ===== ========== ============== ===================================================================================== It’s also possible to use multiple pattern variable references in the first argument of the FIRST/LAST functions. This way, you can write an expression that accesses multiple columns, but all of them must use the same pattern variable. In other words, the value of the LAST/FIRST function must be computed from a single row. This means that it’s possible to use LAST(A.price * A.tax), but an expression like LAST(A.price * B.tax) is not valid. After-match strategy¶ The AFTER MATCH SKIP clause specifies where to start a new matching procedure after a complete match was found. There are four different strategies: SKIP PAST LAST ROW - resumes the pattern matching at the next row after the last row of the current match. SKIP TO NEXT ROW - continues searching for a new match starting at the next row after the starting row of the match. 
SKIP TO LAST variable - resumes the pattern matching at the last row that is mapped to the specified pattern variable. SKIP TO FIRST variable - resumes the pattern matching at the first row that is mapped to the specified pattern variable. This is also a way to specify how many matches a single event can belong to. For example, with the SKIP PAST LAST ROW strategy, every event can belong to at most one match. Examples¶ To better understand the differences between these strategies consider the following example. For the following input rows: symbol tax price rowtime ======== ===== ======= ===================== XYZ 1 7 2018-09-17 10:00:01 XYZ 2 9 2018-09-17 10:00:02 XYZ 1 10 2018-09-17 10:00:03 XYZ 2 5 2018-09-17 10:00:04 XYZ 2 10 2018-09-17 10:00:05 XYZ 2 7 2018-09-17 10:00:06 XYZ 2 14 2018-09-17 10:00:07 Evaluate the following query with different strategies: SELECT * FROM Ticker MATCH_RECOGNIZE( PARTITION BY symbol ORDER BY rowtime MEASURES SUM(A.price) AS sumPrice, FIRST(rowtime) AS startTime, LAST(rowtime) AS endTime ONE ROW PER MATCH [AFTER MATCH STRATEGY] PATTERN (A+ C) DEFINE A AS SUM(A.price) < 30 ) The query returns the sum of the prices of all rows mapped to A and the first and last timestamp of the overall match. The query produces different results based on which AFTER MATCH strategy is used: AFTER MATCH SKIP PAST LAST ROW¶ symbol sumPrice startTime endTime ======== ========== ===================== ===================== XYZ 26 2018-09-17 10:00:01 2018-09-17 10:00:04 XYZ 17 2018-09-17 10:00:05 2018-09-17 10:00:07 The first result matched against the rows #1, #2, #3, #4. The second result matched against the rows #5, #6, #7. 
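To make the effect of the after-match strategy more concrete, the following Python sketch replays the seven example prices through a simplified matcher for `PATTERN (A+ C)` with `A AS SUM(A.price) < 30`. It is an illustration under stated assumptions, not Flink's implementation: the function name `run` and the strategy strings are invented here, A is matched greedily without backtracking, and a match is only reported once a row for C has arrived.

```python
def run(prices, strategy, limit=30):
    """Return (sumPrice, first row number, last row number) per match; rows are 1-based."""
    results, start = [], 0
    while start < len(prices):
        total, end = 0, start
        # Map rows to A while the running sum stays below the limit.
        while end < len(prices) and total + prices[end] < limit:
            total += prices[end]
            end += 1
        if end == start or end >= len(prices):
            start += 1                      # no row for A, or no row left for C
            continue
        results.append((total, start + 1, end + 1))   # row `end` is mapped to C
        if strategy == "PAST LAST ROW":
            start = end + 1
        elif strategy == "TO NEXT ROW":
            start += 1
        elif strategy == "TO LAST A":
            if end - 1 == start:
                # Restarting at the row where the match began would loop forever.
                raise RuntimeError("cannot skip to the first row of the match")
            start = end - 1
        else:
            raise ValueError(strategy)
    return results


prices = [7, 9, 10, 5, 10, 7, 14]
print(run(prices, "PAST LAST ROW"))  # [(26, 1, 4), (17, 5, 7)]
print(run(prices, "TO NEXT ROW"))    # [(26, 1, 4), (24, 2, 5), (25, 3, 6), (22, 4, 7), (17, 5, 7)]
print(run(prices, "TO LAST A"))      # [(26, 1, 4), (25, 3, 6), (17, 5, 7)]
```

Only the restart position changes between the three runs, which is why the first match is identical in each case and the later matches overlap to different degrees.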
AFTER MATCH SKIP TO NEXT ROW¶ symbol sumPrice startTime endTime ======== ========== ===================== ===================== XYZ 26 2018-09-17 10:00:01 2018-09-17 10:00:04 XYZ 24 2018-09-17 10:00:02 2018-09-17 10:00:05 XYZ 25 2018-09-17 10:00:03 2018-09-17 10:00:06 XYZ 22 2018-09-17 10:00:04 2018-09-17 10:00:07 XYZ 17 2018-09-17 10:00:05 2018-09-17 10:00:07 Again, the first result matched against the rows #1, #2, #3, #4. Compared to the previous strategy, the next matching attempt includes row #2 again. Therefore, the second result matched against the rows #2, #3, #4, #5. The third result matched against the rows #3, #4, #5, #6. The fourth result matched against the rows #4, #5, #6, #7. The last result matched against the rows #5, #6, #7. AFTER MATCH SKIP TO LAST A¶ symbol sumPrice startTime endTime ======== ========== ===================== ===================== XYZ 26 2018-09-17 10:00:01 2018-09-17 10:00:04 XYZ 25 2018-09-17 10:00:03 2018-09-17 10:00:06 XYZ 17 2018-09-17 10:00:05 2018-09-17 10:00:07 Again, the first result matched against the rows #1, #2, #3, #4. Compared to the previous strategy, the next matching attempt includes only row #3 (mapped to A) again. Therefore, the second result matched against the rows #3, #4, #5, #6. The last result matched against the rows #5, #6, #7. AFTER MATCH SKIP TO FIRST A¶ This combination produces a runtime exception, because it would always try to start a new match where the last one started. This would produce an infinite loop, so it’s not valid. With the SKIP TO FIRST/LAST variable strategy, it’s possible that no rows are mapped to that variable, for example, for pattern A*. In such cases, a runtime exception is thrown, because the standard requires a valid row to continue the matching. Time attributes¶ To apply subsequent queries on top of MATCH_RECOGNIZE, it may be necessary to use time attributes. 
The following function selects a time attribute: MATCH_ROWTIME([rowtime_field]) - Returns the timestamp of the last row that was mapped to the given pattern. The function accepts zero or one operand, which is a field reference with a rowtime attribute. If there is no operand, the function returns the rowtime attribute with TIMESTAMP type. Otherwise, the return type is the same as the operand type. The resulting attribute is a rowtime attribute that you can use in subsequent time-based operations, like interval joins and group window or over-window aggregations. Control memory consumption¶ Memory consumption is an important consideration when writing MATCH_RECOGNIZE queries, because the space of potential matches is built in a breadth-first-like manner. This means that you must ensure that the pattern can finish, preferably with a reasonable number of rows mapped to the match, because they have to fit into memory. For example, the pattern must not have a quantifier without an upper limit that accepts every single row. Such a pattern could look like this: PATTERN (A B+ C) DEFINE A as A.price > 10, C as C.price > 20 This query maps every incoming row to the B variable, so it never finishes. This query could be fixed, for example, by giving B a condition that is the negation of the condition for C: PATTERN (A B+ C) DEFINE A as A.price > 10, B as B.price <= 20, C as C.price > 20 Also, the query could be fixed by using the reluctant quantifier: PATTERN (A B+? C) DEFINE A as A.price > 10, C as C.price > 20 Note The MATCH_RECOGNIZE clause doesn’t use a configured state retention time. You may want to use the WITHIN clause for this purpose, as described in the Time constraint section. Known limitations¶ The Flink SQL implementation of the MATCH_RECOGNIZE clause is an ongoing effort, and some features of the SQL standard are not yet supported. Unsupported features include: Pattern expressions Pattern groups - this means that, for example, quantifiers can’t be applied to a subsequence of the pattern. 
Thus, (A (B C)+) is not a valid pattern. Alternation - patterns like PATTERN((A B | C D) E), which means that either a subsequence A B or C D has to be found before looking for the E row. PERMUTE operator - which is equivalent to all permutations of the variables that it is applied to, for example, PATTERN (PERMUTE (A, B, C)) = PATTERN (A B C | A C B | B A C | B C A | C A B | C B A). Anchors - ^ and $, which denote the beginning and end of a partition. These do not make sense in a streaming context and will not be supported. Exclusion - PATTERN ({- A -} B), meaning that A will be looked for but will not participate in the output. This works only for the ALL ROWS PER MATCH mode. Reluctant optional quantifier - PATTERN (A??). Only the greedy optional quantifier is supported. ALL ROWS PER MATCH output mode - which produces an output row for every row that participated in the creation of a found match. This also means: The only supported semantic for the MEASURES clause is FINAL. The CLASSIFIER function, which returns the pattern variable that a row was mapped to, is not yet supported. SUBSET - which allows creating logical groups of pattern variables and using those groups in the DEFINE and MEASURES clauses. Physical offsets - PREV/NEXT, which index all events seen rather than only those that were mapped to a pattern variable (as in the logical offsets case). MATCH_RECOGNIZE is supported only for SQL. There is no equivalent in the Table API. Aggregations Distinct aggregations are not supported. Related content¶ Time Attributes Flink SQL Queries Flink SQL Functions Statements Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
SELECT T.aid, T.bid, T.cid
FROM MyTable
    MATCH_RECOGNIZE (
      PARTITION BY userid
      ORDER BY $rowtime
      MEASURES
        A.id AS aid,
        B.id AS bid,
        C.id AS cid
      PATTERN (A B C)
      DEFINE
        A AS name = 'a',
        B AS name = 'b',
        C AS name = 'c'
    ) AS T
```

```sql
Ticker
     |-- symbol: String                           # symbol of the stock
     |-- price: Long                              # price of the stock
     |-- tax: Long                                # tax liability of the stock
     |-- rowtime: TimeIndicatorTypeInfo(rowtime)  # point in time when the change to those values happened
```

```sql
symbol         rowtime         price    tax
======  ====================  ======= =======
'ACME'  '01-Apr-11 10:00:00'   12      1
'ACME'  '01-Apr-11 10:00:01'   17      2
'ACME'  '01-Apr-11 10:00:02'   19      1
'ACME'  '01-Apr-11 10:00:03'   21      3
'ACME'  '01-Apr-11 10:00:04'   25      2
'ACME'  '01-Apr-11 10:00:05'   18      1
'ACME'  '01-Apr-11 10:00:06'   15      1
'ACME'  '01-Apr-11 10:00:07'   14      2
'ACME'  '01-Apr-11 10:00:08'   24      2
'ACME'  '01-Apr-11 10:00:09'   25      2
'ACME'  '01-Apr-11 10:00:10'   19      1
```

```sql
SELECT *
FROM Ticker
    MATCH_RECOGNIZE (
        PARTITION BY symbol
        ORDER BY $rowtime
        MEASURES
            START_ROW.rowtime AS start_tstamp,
            LAST(PRICE_DOWN.$rowtime) AS bottom_tstamp,
            LAST(PRICE_UP.$rowtime) AS end_tstamp
        ONE ROW PER MATCH
        AFTER MATCH SKIP TO LAST PRICE_UP
        PATTERN (START_ROW PRICE_DOWN+ PRICE_UP)
        DEFINE
            PRICE_DOWN AS
                (LAST(PRICE_DOWN.price, 1) IS NULL AND PRICE_DOWN.price < START_ROW.price) OR
                    PRICE_DOWN.price < LAST(PRICE_DOWN.price, 1),
            PRICE_UP AS
                PRICE_UP.price > LAST(PRICE_DOWN.price, 1)
    ) MR;
```

```sql
symbol       start_tstamp       bottom_tstamp         end_tstamp
=========  ==================  ==================  ==================
ACME       01-APR-11 10:00:04  01-APR-11 10:00:07  01-APR-11 10:00:08
```
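The single result row above can be reproduced by hand. The following Python sketch is a simplified stand-in for the `START_ROW PRICE_DOWN+ PRICE_UP` pattern with `AFTER MATCH SKIP TO LAST PRICE_UP`; it is not Flink's matching engine, and the function name `falling_periods` is invented for this illustration. Prices are listed in row order, so index 0 corresponds to 10:00:00.

```python
def falling_periods(prices):
    """Return (start, bottom, end) row indexes for each falling period."""
    matches, start = [], 0
    while start < len(prices) - 2:          # a match needs at least three rows
        prev, i = prices[start], start + 1
        # PRICE_DOWN+: each row must be cheaper than the row before it.
        while i < len(prices) and prices[i] < prev:
            prev = prices[i]
            i += 1
        n_down = i - (start + 1)
        # PRICE_UP: the next row must be more expensive than the last PRICE_DOWN row.
        if n_down >= 1 and i < len(prices) and prices[i] > prev:
            matches.append((start, i - 1, i))
            start = i                       # SKIP TO LAST PRICE_UP
        else:
            start += 1
    return matches


prices = [12, 17, 19, 21, 25, 18, 15, 14, 24, 25, 19]
print(falling_periods(prices))  # [(4, 7, 8)]
```

Indexes 4, 7, and 8 correspond to the timestamps 10:00:04, 10:00:07, and 10:00:08 in the result row. The final drop from 25 to 19 is not reported, because no later row has arrived yet to play the role of `PRICE_UP`.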

```sql
ORDER BY rowtime ASC, price DESC
```

```sql
ORDER BY price, rowtime
```

```sql
ORDER BY rowtime DESC, price ASC
```
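The three `ORDER BY` fragments above differ only in whether the time attribute comes first and is sorted ascending. As a rough sketch of that rule, the following Python helper checks an `ORDER BY` list given as a plain string. The helper name `is_valid_order_by` and the string-based parsing are assumptions made for this illustration and are unrelated to how Flink validates queries.

```python
def is_valid_order_by(order_by, time_attribute="rowtime"):
    """Check that the first sort key is the time attribute in ascending order."""
    first = order_by.split(",")[0].split()
    if not first or first[0] != time_attribute:
        return False
    # A missing direction defaults to ascending, as in standard SQL.
    direction = first[1].upper() if len(first) > 1 else "ASC"
    return direction == "ASC"


print(is_valid_order_by("rowtime ASC, price DESC"))   # True
print(is_valid_order_by("price, rowtime"))            # False
print(is_valid_order_by("rowtime DESC, price ASC"))   # False
```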

```sql
SELECT *
FROM Ticker
    MATCH_RECOGNIZE (
        PARTITION BY symbol
        ORDER BY rowtime
        MEASURES
            FIRST(A.rowtime) AS start_tstamp,
            LAST(A.rowtime) AS end_tstamp,
            AVG(A.price) AS avgPrice
        ONE ROW PER MATCH
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (A+ B)
        DEFINE
            A AS AVG(A.price) < 15
    ) MR;
```

```sql
symbol         rowtime         price    tax
======  ====================  ======= =======
'ACME'  '01-Apr-11 10:00:00'   12      1
'ACME'  '01-Apr-11 10:00:01'   17      2
'ACME'  '01-Apr-11 10:00:02'   13      1
'ACME'  '01-Apr-11 10:00:03'   16      3
'ACME'  '01-Apr-11 10:00:04'   25      2
'ACME'  '01-Apr-11 10:00:05'   2       1
'ACME'  '01-Apr-11 10:00:06'   4       1
'ACME'  '01-Apr-11 10:00:07'   10      2
'ACME'  '01-Apr-11 10:00:08'   15      2
'ACME'  '01-Apr-11 10:00:09'   25      2
'ACME'  '01-Apr-11 10:00:10'   25      1
'ACME'  '01-Apr-11 10:00:11'   30      1
```

```sql
symbol       start_tstamp       end_tstamp          avgPrice
=========  ==================  ==================  ============
ACME       01-APR-11 10:00:00  01-APR-11 10:00:03     14.5
ACME       01-APR-11 10:00:05  01-APR-11 10:00:10     13.5
```
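The two `avgPrice` values above can be checked with a few lines of arithmetic. The following Python sketch mimics `PATTERN (A+ B)` with `A AS AVG(A.price) < 15` and `AFTER MATCH SKIP PAST LAST ROW` under simplifying assumptions: A is extended greedily without backtracking, B accepts any row, and the function name `avg_below_threshold` is invented for this illustration.

```python
def avg_below_threshold(prices, threshold=15):
    """Return (first A row, last A row, average of A) for each match, 0-based."""
    results, start = [], 0
    while start < len(prices):
        total, end = 0, start
        # Keep mapping rows to A while the running average stays below the threshold.
        while end < len(prices) and (total + prices[end]) / (end - start + 1) < threshold:
            total += prices[end]
            end += 1
        n_a = end - start
        if n_a >= 1 and end < len(prices):  # row `end` is mapped to B
            results.append((start, end - 1, total / n_a))
            start = end + 1
        else:
            start += 1
    return results


prices = [12, 17, 13, 16, 25, 2, 4, 10, 15, 25, 25, 30]
print(avg_below_threshold(prices))  # [(0, 3, 14.5), (5, 10, 13.5)]
```

The first run stops at index 4, because adding the price 25 would lift the average to 16.6. The second run stops at index 11, where adding 30 would lift it to about 15.86.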

```sql
SUM(A.price * A.tax)
```

```sql
AVG(A.price * B.tax)
```

```sql
PATTERN (A B+ C* D)
```

```sql
PATTERN (A*)
PATTERN (A? B*)
PATTERN (A{0,} B{0,} C*)
```
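Each of the three patterns above can be satisfied by zero rows, because every variable carries a quantifier whose lower bound is 0. The following Python sketch computes the minimum number of rows that a flat pattern requires. The names `TOKEN`, `min_rows`, and `can_match_empty` are invented for this illustration, and the parser assumes only plain variables with the quantifiers listed in this topic, without groups or alternation.

```python
import re

# A variable name, an optional quantifier, and an optional reluctant marker "?".
TOKEN = re.compile(r"([A-Za-z_]\w*)\s*(\*|\+|\?|\{\s*(\d*)\s*(?:,\s*\d*\s*)?\})?\??")


def min_rows(pattern):
    """Return the smallest number of rows that can satisfy the pattern body."""
    total = 0
    for _name, quant, lower in TOKEN.findall(pattern):
        if quant in ("", "+"):
            total += 1                      # exactly one row, or at least one row
        elif quant.startswith("{"):
            total += int(lower) if lower else 0   # {n}, {n,}, {n,m}, or {,m}
        # "*" and "?" allow zero rows and add nothing
    return total


def can_match_empty(pattern):
    return min_rows(pattern) == 0


for body in ["A*", "A? B*", "A{0,} B{0,} C*", "A B+ C* D"]:
    print(body, can_match_empty(body))
# A* True
# A? B* True
# A{0,} B{0,} C* True
# A B+ C* D False
```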

```sql
SELECT *
FROM Ticker
    MATCH_RECOGNIZE(
        PARTITION BY symbol
        ORDER BY rowtime
        MEASURES
            C.price AS lastPrice
        ONE ROW PER MATCH
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (A B* C)
        DEFINE
            A AS A.price > 10,
            B AS B.price < 15,
            C AS C.price > 12
    )
```

```sql
symbol  tax   price          rowtime
======= ===== ======== =====================
 XYZ     1     10       2018-09-17 10:00:02
 XYZ     2     11       2018-09-17 10:00:03
 XYZ     1     12       2018-09-17 10:00:04
 XYZ     2     13       2018-09-17 10:00:05
 XYZ     1     14       2018-09-17 10:00:06
 XYZ     2     16       2018-09-17 10:00:07
```

```sql
symbol   lastPrice
======== ===========
 XYZ      16
```

```sql
symbol   lastPrice
======== ===========
 XYZ      13
 XYZ      16
```

```sql
PATTERN (A B* C)
DEFINE
    A AS condA(),
    B AS condB(),
    C AS NOT condB()
```

```sql
MATCH_RECOGNIZE
```

```sql
SELECT *
FROM Ticker
    MATCH_RECOGNIZE(
        PARTITION BY symbol
        ORDER BY rowtime
        MEASURES
            C.rowtime AS dropTime,
            A.price - C.price AS dropDiff
        ONE ROW PER MATCH
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (A B* C) WITHIN INTERVAL '1' HOUR
        DEFINE
            B AS B.price > A.price - 10,
            C AS C.price < A.price - 10
    )
```

```sql
symbol         rowtime         price    tax
======  ====================  ======= =======
'ACME'  '01-Apr-11 10:00:00'   20      1
'ACME'  '01-Apr-11 10:20:00'   17      2
'ACME'  '01-Apr-11 10:40:00'   18      1
'ACME'  '01-Apr-11 11:00:00'   11      3
'ACME'  '01-Apr-11 11:20:00'   14      2
'ACME'  '01-Apr-11 11:40:00'   9       1
'ACME'  '01-Apr-11 12:00:00'   15      1
'ACME'  '01-Apr-11 12:20:00'   14      2
'ACME'  '01-Apr-11 12:40:00'   24      2
'ACME'  '01-Apr-11 13:00:00'   1       2
'ACME'  '01-Apr-11 13:20:00'   19      1
```

```sql
symbol         dropTime         dropDiff
======  ====================  =============
'ACME'  '01-Apr-11 13:00:00'      14
```

```sql
01-Apr-11 12:00:00
```

```sql
01-Apr-11 13:00:00
```

```sql
01-Apr-11 10:00:00
```

```sql
01-Apr-11 11:40:00
```

```sql
ALL ROWS PER MATCH
```

```sql
ONE ROW PER MATCH
```

```sql
[partitioning columns] + [measures columns]
```

```sql
SELECT *
FROM Ticker
    MATCH_RECOGNIZE(
        PARTITION BY symbol
        ORDER BY rowtime
        MEASURES
            FIRST(A.price) AS startPrice,
            LAST(A.price) AS topPrice,
            B.price AS lastPrice
        ONE ROW PER MATCH
        PATTERN (A+ B)
        DEFINE
            A AS LAST(A.price, 1) IS NULL OR A.price > LAST(A.price, 1),
            B AS B.price < LAST(A.price)
    )
```

```sql
symbol   tax   price          rowtime
======== ===== ======== =====================
 XYZ      1     10       2018-09-17 10:00:02
 XYZ      2     12       2018-09-17 10:00:03
 XYZ      1     13       2018-09-17 10:00:04
 XYZ      2     11       2018-09-17 10:00:05
```

```sql
symbol   startPrice   topPrice   lastPrice
======== ============ ========== ===========
 XYZ      10           13         11
```

```sql
A.price > 10
```

```sql
PATTERN (A B+)
DEFINE
  A AS A.price >= 10,
  B AS B.price > A.price AND SUM(price) < 100 AND SUM(B.price) < 80
```

```sql
SUM(B.price)
```

```sql
== ===== ========== ========= ============== ================== ======= ======= ========== ============
#  price Classifier [A.price] [B.price]      [price]            A.price B.price SUM(price) SUM(B.price)
== ===== ========== ========= ============== ================== ======= ======= ========== ============
#1 10    -> A       #1        -              -                  10      -       -          -
#2 15    -> B       #1        #2             #1, #2             10      15      25         15
#3 20    -> B       #1        #2, #3         #1, #2, #3         10      20      45         35
#4 31    -> B       #1        #2, #3, #4     #1, #2, #3, #4     10      31      76         66
#5 35               #1        #2, #3, #4, #5 #1, #2, #3, #4, #5 10      35      111        101
== ===== ========== ========= ============== ================== ======= ======= ========== ============
```

```sql
LAST(variable.field, n)
```

```sql
FIRST(variable.field, n)
```

```sql
PATTERN (A B+)
DEFINE
  A AS A.price >= 10,
  B AS (LAST(B.price, 1) IS NULL OR B.price > LAST(B.price, 1)) AND
       (LAST(B.price, 2) IS NULL OR B.price > 2 * LAST(B.price, 2))
```

```sql
LAST(B.price, 1)
```

```sql
LAST(B.price, 2)
```

```sql
===== ========== ================ ================ ========================================================================================
price Classifier LAST(B.price, 1) LAST(B.price, 2) Comment
===== ========== ================ ================ ========================================================================================
10    -> A
15    -> B       null             null             Notice that ``LAST(B.price, 1)`` is null because there is still nothing mapped to ``B``.
20    -> B       15               null
31    -> B       20               15
35               31               20               Not mapped because ``35 < 2 * 20``.
===== ========== ================ ================ ========================================================================================
```

```sql
PATTERN (A B? C)
DEFINE
  B AS B.price < 20,
  C AS LAST(price, 1) < C.price
```

```sql
===== ========== ============== =====================================================================================
price Classifier LAST(price, 1) Comment
===== ========== ============== =====================================================================================
10    -> A
15    -> B
20    -> C       15             ``LAST(price, 1)`` is evaluated as the price of the row mapped to the ``B`` variable.
===== ========== ============== =====================================================================================
```

```sql
===== ========== ============== =====================================================================================
price Classifier LAST(price, 1) Comment
===== ========== ============== =====================================================================================
10    -> A
20    -> C       10             ``LAST(price, 1)`` is evaluated as the price of the row mapped to the ``A`` variable.
===== ========== ============== =====================================================================================
```

```sql
LAST(A.price * A.tax)
```

```sql
LAST(A.price * B.tax)
```

```sql
AFTER MATCH SKIP
```

```sql
SKIP PAST LAST ROW
```

```sql
SKIP TO NEXT ROW
```

```sql
SKIP TO LAST variable
```

```sql
SKIP TO FIRST variable
```

```sql
SKIP PAST LAST ROW
```

```sql
symbol   tax   price         rowtime
======== ===== ======= =====================
 XYZ      1     7       2018-09-17 10:00:01
 XYZ      2     9       2018-09-17 10:00:02
 XYZ      1     10      2018-09-17 10:00:03
 XYZ      2     5       2018-09-17 10:00:04
 XYZ      2     10      2018-09-17 10:00:05
 XYZ      2     7       2018-09-17 10:00:06
 XYZ      2     14      2018-09-17 10:00:07
```

```sql
SELECT *
FROM Ticker
    MATCH_RECOGNIZE(
        PARTITION BY symbol
        ORDER BY rowtime
        MEASURES
            SUM(A.price) AS sumPrice,
            FIRST(rowtime) AS startTime,
            LAST(rowtime) AS endTime
        ONE ROW PER MATCH
        [AFTER MATCH STRATEGY]
        PATTERN (A+ C)
        DEFINE
            A AS SUM(A.price) < 30
    )
```

```sql
AFTER MATCH
```

```sql
AFTER MATCH SKIP PAST LAST ROW
```

```sql
symbol   sumPrice        startTime              endTime
======== ========== ===================== =====================
 XYZ      26         2018-09-17 10:00:01   2018-09-17 10:00:04
 XYZ      17         2018-09-17 10:00:05   2018-09-17 10:00:07
```

```sql
AFTER MATCH SKIP TO NEXT ROW
```

```sql
symbol   sumPrice        startTime              endTime
======== ========== ===================== =====================
 XYZ      26         2018-09-17 10:00:01   2018-09-17 10:00:04
 XYZ      24         2018-09-17 10:00:02   2018-09-17 10:00:05
 XYZ      25         2018-09-17 10:00:03   2018-09-17 10:00:06
 XYZ      22         2018-09-17 10:00:04   2018-09-17 10:00:07
 XYZ      17         2018-09-17 10:00:05   2018-09-17 10:00:07
```

```sql
AFTER MATCH SKIP TO LAST A
```

```sql
symbol   sumPrice        startTime              endTime
======== ========== ===================== =====================
 XYZ      26         2018-09-17 10:00:01   2018-09-17 10:00:04
 XYZ      25         2018-09-17 10:00:03   2018-09-17 10:00:06
 XYZ      17         2018-09-17 10:00:05   2018-09-17 10:00:07
```

```sql
AFTER MATCH SKIP TO FIRST A
```

```sql
SKIP TO FIRST/LAST variable
```

```sql
MATCH_RECOGNIZE
```

```sql
MATCH_ROWTIME([rowtime_field])
```
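
As a rough sketch of how `MATCH_ROWTIME` might be used, the following query reuses the `Ticker` table and the `A B* C` pattern from the earlier examples and exposes the match's time attribute as a measure. The alias `matchTime` is illustrative, and passing `rowtime` as the argument assumes that this column is the time attribute named in `ORDER BY`.

```sql
SELECT *
FROM Ticker
    MATCH_RECOGNIZE(
        PARTITION BY symbol
        ORDER BY rowtime
        MEASURES
            -- Time attribute of the match (illustrative alias), so that
            -- downstream time-based operations can be applied to the result.
            MATCH_ROWTIME(rowtime) AS matchTime,
            C.price AS lastPrice
        ONE ROW PER MATCH
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (A B* C)
        DEFINE
            A AS A.price > 10,
            B AS B.price < 15,
            C AS C.price > 12
    );
```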

```sql
MATCH_RECOGNIZE
```

```sql
PATTERN (A B+ C)
DEFINE
  A as A.price > 10,
  C as C.price > 20
```

```sql
PATTERN (A B+ C)
DEFINE
  A as A.price > 10,
  B as B.price <= 20,
  C as C.price > 20
```

```sql
PATTERN (A B+? C)
DEFINE
  A as A.price > 10,
  C as C.price > 20
```

```sql
MATCH_RECOGNIZE
```

```sql
PATTERN((A B | C D) E)
```

```sql
PATTERN (PERMUTE (A, B, C))
```

```sql
PATTERN (A B C | A C B | B A C | B C A | C A B | C B A)
```

```sql
PATTERN ({- A -} B)
```

```sql
ALL ROWS PER MATCH
```

```sql
PATTERN A??
```

```sql
ALL ROWS PER MATCH
```

```sql
MATCH_RECOGNIZE
```

---

### SQL ORDER BY Clause in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/orderby.html

Confluent Cloud for Apache Flink® enables sorting rows from a SELECT statement.

#### Description

The ORDER BY clause causes the result rows to be sorted according to the specified expressions. If two rows are equal according to the leftmost expression, they are compared according to the next expression, and so on. If they are equal according to all specified expressions, they are returned in an implementation-dependent order.

When running in streaming mode, the primary sort order of a table must be ascending on a time attribute. All subsequent sort orders can be freely chosen. When running in batch mode, there is no sort-order limitation.

#### Example

`SELECT * FROM orders ORDER BY order_time, order_id`

Note: This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
SELECT *
FROM orders
ORDER BY order_time, order_id
```
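
To illustrate the streaming-mode rule, the following sketch keeps the ascending time attribute as the primary sort key and sorts the secondary key in descending order. It assumes that `order_time` is declared as a time attribute on the `orders` table.

```sql
-- Primary sort key: ascending time attribute (required in streaming mode).
-- Secondary sort key: any direction can be chosen.
SELECT *
FROM orders
ORDER BY order_time ASC, order_id DESC;
```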

---

### SQL OVER Aggregation Queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/over-aggregation.html

Confluent Cloud for Apache Flink® enables computing an aggregated value for every row over a range of ordered rows.

#### Syntax

`SELECT agg_func(agg_col) OVER ( [PARTITION BY column1[, column2, ...]] ORDER BY time_column range_definition), ... FROM ...`

#### Description

OVER aggregates compute an aggregated value for every input row over a range of ordered rows. In contrast to a GROUP BY aggregate, an OVER aggregate doesn’t reduce the number of result rows to a single row for every group. Instead, it produces an aggregated value for every input row.

You can define multiple OVER window aggregates in a SELECT clause. However, for streaming queries, the OVER windows for all aggregates must be identical due to a current limitation.

##### ORDER BY

OVER windows are defined on an ordered sequence of rows. Because tables don’t have an inherent order, the ORDER BY clause is mandatory. For streaming queries, Flink currently supports only OVER windows that are ordered by an ascending time attribute. Additional orderings are not supported.

##### PARTITION BY

OVER windows can be defined on a partitioned table. When a PARTITION BY clause is present, the aggregate is computed for each input row only over the rows of its partition.

##### Range definitions

The range definition specifies how many rows are included in the aggregate. The range is defined with a BETWEEN clause that sets a lower and an upper boundary. All rows between these boundaries are included in the aggregate. Flink supports only CURRENT ROW as the upper boundary.

There are two options to define the range: ROWS intervals and RANGE intervals.

A RANGE interval is defined on the values of the ORDER BY column, which in Flink is always a time attribute. The following RANGE interval includes all rows whose time attribute is at most 30 minutes earlier than that of the current row: `RANGE BETWEEN INTERVAL '30' MINUTE PRECEDING AND CURRENT ROW`

A ROWS interval is a count-based interval. It defines exactly how many rows are included in the aggregate. The following ROWS interval includes the 10 rows preceding the current row and the current row itself, so 11 rows in total: `ROWS BETWEEN 10 PRECEDING AND CURRENT ROW`

##### WINDOW clause

The WINDOW clause can be used to define an OVER window outside of the SELECT clause. It can make queries more readable and enables reusing the window definition for multiple aggregates.

`SELECT order_id, order_time, amount, SUM(amount) OVER w AS sum_amount, AVG(amount) OVER w AS avg_amount FROM orders WINDOW w AS ( PARTITION BY product ORDER BY order_time RANGE BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW)`

#### Example

The following query computes, for every order, the sum of amounts of all orders for the same product that were received within one hour before the current order.

`SELECT order_id, order_time, amount, SUM(amount) OVER ( PARTITION BY product ORDER BY order_time RANGE BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW ) AS one_hour_prod_amount_sum FROM orders`

#### Code Examples

```sql
SELECT
  agg_func(agg_col) OVER (
    [PARTITION BY column1[, column2, ...]]
    ORDER BY time_column
    range_definition),
  ...
FROM ...
```

```sql
PARTITION BY
```

```sql
CURRENT ROW
```

```sql
RANGE BETWEEN INTERVAL '30' MINUTE PRECEDING AND CURRENT ROW
```

```sql
ROWS BETWEEN 10 PRECEDING AND CURRENT ROW
```

```sql
SELECT order_id, order_time, amount,
  SUM(amount) OVER w AS sum_amount,
  AVG(amount) OVER w AS avg_amount
FROM orders
WINDOW w AS (
  PARTITION BY product
  ORDER BY order_time
  RANGE BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW)
```

```sql
SELECT order_id, order_time, amount,
  SUM(amount) OVER (
    PARTITION BY product
    ORDER BY order_time
    RANGE BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW
  ) AS one_hour_prod_amount_sum
FROM orders
```
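
The examples above use RANGE intervals. As a complementary sketch, the following query applies the count-based ROWS interval to the same `orders` table, averaging the current order and the 10 preceding orders for the same product. The alias `avg_amount_last_11` is illustrative, and `order_time` is assumed to be a time attribute.

```sql
SELECT order_id, order_time, amount,
  AVG(amount) OVER (
    PARTITION BY product
    ORDER BY order_time
    -- Count-based window: the 10 preceding rows plus the current row.
    ROWS BETWEEN 10 PRECEDING AND CURRENT ROW
  ) AS avg_amount_last_11
FROM orders;
```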

---

### SQL Queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/overview.html

In Confluent Cloud for Apache Flink®, Data Manipulation Language (DML) statements, also known as queries, are declarative verbs that read and modify data in Apache Flink® tables. Unlike Data Definition Language (DDL) statements, DML statements modify only data and don’t change metadata. When you want to change metadata, use DDL statements.

These are the available DML statements in Confluent Cloud for Flink SQL.

- Deduplication Queries
- Group Aggregation Queries
- INSERT INTO FROM SELECT Statement
- INSERT VALUES Statement
- Interval joins
- LIMIT Clause
- EXECUTE STATEMENT SET
- ORDER BY Clause
- Pattern Recognition Queries
- Regular joins
- SELECT Statement
- Set Logic
- Temporal joins
- Top-N Queries
- Window Aggregation Queries
- Window Deduplication Queries
- Window Join Queries
- Window Top-N Queries
- Windowing Table-Valued Functions (Windowing TVFs)
- WITH Clause

#### Prerequisites

You need the following prerequisites to use Confluent Cloud for Apache Flink.

- Access to Confluent Cloud.
- The organization ID, environment ID, and compute pool ID for your organization.
- The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, reach out to your OrganizationAdmin or EnvironmentAdmin.
- The Confluent CLI. To use the Flink SQL shell, update to the latest version of the Confluent CLI by running `confluent update --yes`. If you used Homebrew to install the Confluent CLI, update the CLI by using the `brew upgrade` command instead of `confluent update`. For more information, see Confluent CLI.

#### Use a workspace or the Flink SQL shell

You can run queries and statements either in a Confluent Cloud Console workspace or in the Flink SQL shell.

To run queries in the Confluent Cloud Console, follow these steps.

1. Log in to the Confluent Cloud Console.
2. Navigate to the Environments page.
3. Click the tile that has the environment where your Flink compute pools are provisioned.
4. Click Flink. The Compute Pools list opens.
5. In the compute pool where you want to run statements, click Open SQL workspace. The workspace opens with a cell for editing SQL statements.

To run queries in the Flink SQL shell, run the following command: `confluent flink shell --compute-pool <compute-pool-id> --environment <env-id>`

You’re ready to run your first Flink SQL query.

#### Hello SQL

Run the following simple query to print “Hello SQL”: `SELECT 'Hello SQL';`

Your output should resemble a single column named `EXPR$0` that contains the value `Hello SQL`.

Run the following query to aggregate values in a table: `SELECT Name, COUNT(*) AS Num FROM (VALUES ('Neo'), ('Trinity'), ('Morpheus'), ('Trinity')) AS NameTable(Name) GROUP BY Name;`

Your output should resemble:

| Name | Num |
|----------|-----|
| Neo | 1 |
| Morpheus | 1 |
| Trinity | 2 |

#### Functions

Flink supports many built-in functions that help you build sophisticated SQL queries. Run the `SHOW FUNCTIONS;` statement to see the full list of built-in functions. Your output should resemble a one-column table of function names that begins with the operators `%`, `*`, `+`, `-`, `/`, `<`, `<=`, `<>`, `=`, `>`, and `>=`, followed by `ABS`, `ACOS`, `AND`, `ARRAY`, `ARRAY_CONTAINS`, and so on.

Run the following statement to execute the built-in CURRENT_TIMESTAMP function, which returns the local machine’s current system time: `SELECT CURRENT_TIMESTAMP;`

Your output should resemble a single column named `CURRENT_TIMESTAMP` with a value like `2024-01-17 13:07:43.537`.

Run the following statement to compute the cosine of 0: `SELECT COS(0) AS cosine;`

Your output should resemble a single column named `cosine` with the value `1.0`.

#### Source Tables

As with all SQL engines, Flink SQL queries operate on rows in tables. But unlike traditional databases, Flink doesn’t manage data-at-rest in a local store. Instead, Flink SQL queries operate continuously over external tables.

Flink data processing pipelines begin with source tables. Source tables produce the rows that are operated on during the query’s execution; they are the tables referenced in the FROM clause of a query.

In Confluent Cloud, tables are created automatically from all Apache Kafka® topics. You can also create tables by using the SQL shell. The Flink SQL shell supports SQL DDL commands similar to traditional SQL, and standard SQL DDL is used to create and alter tables.

The following statement creates an employee_information table: `CREATE TABLE employee_information( emp_id INT, name VARCHAR, dept_id INT);`

Confluent Cloud creates the corresponding employee_information topic automatically.

#### Continuous Queries

You can define a continuous foreground query on the employee_information table that reads new rows as they are made available and immediately outputs their results. For example, you can filter for the employees who work in department 1: `SELECT * from employee_information WHERE dept_id = 1;`

Although SQL wasn’t designed initially with streaming semantics in mind, it’s a powerful tool for building continuous data pipelines. A Flink query differs from a traditional database query by consuming rows continuously as they arrive and producing updates to the query results. A continuous query never terminates and produces a dynamic table as a result. Dynamic tables are the core concept of Flink’s SQL support for streaming data.

Aggregations on continuous streams must store aggregated results continuously during the execution of the query. For example, suppose you need to count the number of employees for each department from an incoming data stream. To output timely results as new rows are processed, the query must maintain the most up-to-date count for each department: `SELECT dept_id, COUNT(*) as emp_count FROM employee_information GROUP BY dept_id;`

Such queries are considered stateful. Flink’s advanced fault-tolerance mechanism maintains internal state and consistency, so queries always return the correct result, even in the face of hardware failure.

#### Sink Tables

When you run the previous query, Flink SQL provides output in real time but in a read-only fashion. Storing results, for example to power a report or dashboard, requires writing out to another table. You can achieve this by using an INSERT INTO statement. The table referenced in this clause is known as a sink table. An INSERT INTO statement is submitted as a detached query to Flink: `INSERT INTO department_counts SELECT dept_id, COUNT(*) as emp_count FROM employee_information GROUP BY dept_id;`

Once submitted, this query runs and stores the results directly in the sink table, instead of loading the results into system memory.

#### Syntax

Flink parses SQL using Apache Calcite, which supports standard ANSI SQL. A BNF grammar describes the superset of supported SQL features; it is listed in full in the code examples for this section.

Flink uses a lexical policy for identifiers (table, attribute, and function names) that’s similar to Java:

- The case of identifiers is preserved whether or not they are quoted.
- After that, identifiers are matched case-sensitively.
- Unlike Java, back-ticks enable identifiers to contain non-alphanumeric characters, for example: ``SELECT a AS `my field` FROM t;``

String literals must be enclosed in single quotes, for example, `SELECT 'Hello World'`. Duplicate a single quote for escaping, for example, `SELECT 'It''s me'`. Running `SELECT 'Hello World', 'It''s me';` should produce output with the columns `EXPR$0` and `EXPR$1` and the values `Hello World` and `It's me`.

Unicode characters are supported in string literals. If explicit Unicode code points are required, use the following syntax.

- Use the backslash (`\`) as the escaping character (default), for example, `SELECT U&'\263A';`. Your output should resemble a single column `EXPR$0` with the value ☺.
- Use a custom escaping character with UESCAPE, for example, `SELECT U&'#2713' UESCAPE '#';`. Your output should resemble a single column `EXPR$0` with the value ✓.

#### Code Examples

```shell
confluent update --yes
```

```shell
brew upgrade
```

```shell
confluent update
```

```shell
confluent flink shell --compute-pool <compute-pool-id> --environment <env-id>
```

```sql
SELECT 'Hello SQL';
```

```sql
EXPR$0
Hello SQL
```

```sql
SELECT Name, COUNT(*) AS Num
FROM
  (VALUES ('Neo'), ('Trinity'), ('Morpheus'), ('Trinity')) AS NameTable(Name)
GROUP BY Name;
```

```sql
Name     Num
Neo      1
Morpheus 1
Trinity  2
```

```sql
SHOW FUNCTIONS;
```

```sql
+------------------------+
|     function name      |
+------------------------+
| %                      |
| *                      |
| +                      |
| -                      |
| /                      |
| <                      |
| <=                     |
| <>                     |
| =                      |
| >                      |
| >=                     |
| ABS                    |
| ACOS                   |
| AND                    |
| ARRAY                  |
| ARRAY_CONTAINS         |
| ...
```

```sql
CURRENT_TIMESTAMP
```

```sql
SELECT CURRENT_TIMESTAMP;
```

```sql
CURRENT_TIMESTAMP
2024-01-17 13:07:43.537
```

```sql
SELECT COS(0) AS cosine;
```

```sql
employee_information
```

```sql
CREATE TABLE employee_information(
  emp_id INT,
  name VARCHAR,
  dept_id INT);
```
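
To try the queries that follow, the table needs some rows. A minimal sketch, assuming the `employee_information` table was created as shown above; the three rows are made-up sample values, not data from the original page.

```sql
-- Illustrative sample rows: two employees in department 1, one in department 2.
INSERT INTO employee_information VALUES
  (1, 'Neo', 1),
  (2, 'Trinity', 1),
  (3, 'Morpheus', 2);
```

With these rows, the filter on `dept_id = 1` would return the first two employees.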

```sql
SELECT * from employee_information WHERE dept_id = 1;
```

```sql
SELECT
   dept_id,
   COUNT(*) as emp_count
FROM employee_information
GROUP BY dept_id;
```

```sql
INSERT INTO
```

```sql
INSERT INTO department_counts
SELECT
   dept_id,
   COUNT(*) as emp_count
FROM employee_information
GROUP BY dept_id;
```
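
The `INSERT INTO` statement assumes that a `department_counts` sink table already exists. The source page doesn't show its definition, so the following is only a sketch of a schema that would fit the query's output. The primary key on `dept_id` is an assumption, chosen because a grouped aggregate emits an updated count per department rather than append-only rows.

```sql
-- Hypothetical sink table matching the columns produced by the query.
CREATE TABLE department_counts (
  dept_id INT,
  emp_count BIGINT,            -- COUNT(*) returns BIGINT
  PRIMARY KEY (dept_id) NOT ENFORCED
);
```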

```sql
query:
    values
  | WITH withItem [ , withItem ]* query
  | {
        select
      | selectWithoutFrom
      | query UNION [ ALL ] query
      | query EXCEPT query
      | query INTERSECT query
    }
    [ ORDER BY orderItem [, orderItem ]* ]
    [ LIMIT { count | ALL } ]
    [ OFFSET start { ROW | ROWS } ]
    [ FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY]

withItem:
    name
    [ '(' column [, column ]* ')' ]
    AS '(' query ')'

orderItem:
    expression [ ASC | DESC ]

select:
    SELECT [ ALL | DISTINCT ]
    { * | projectItem [, projectItem ]* }
    FROM tableExpression
    [ WHERE booleanExpression ]
    [ GROUP BY { groupItem [, groupItem ]* } ]
    [ HAVING booleanExpression ]
    [ WINDOW windowName AS windowSpec [, windowName AS windowSpec ]* ]

selectWithoutFrom:
    SELECT [ ALL | DISTINCT ]
    { * | projectItem [, projectItem ]* }

projectItem:
    expression [ [ AS ] columnAlias ]
  | tableAlias . *

tableExpression:
    tableReference [, tableReference ]*
  | tableExpression [ NATURAL ] [ LEFT | RIGHT | FULL ] JOIN tableExpression [ joinCondition ]

joinCondition:
    ON booleanExpression
  | USING '(' column [, column ]* ')'

tableReference:
    tablePrimary
    [ matchRecognize ]
    [ [ AS ] alias [ '(' columnAlias [, columnAlias ]* ')' ] ]

tablePrimary:
    [ TABLE ] tablePath [ dynamicTableOptions ] [systemTimePeriod] [[AS] correlationName]
  | LATERAL TABLE '(' functionName '(' expression [, expression ]* ')' ')'
  | [ LATERAL ] '(' query ')'
  | UNNEST '(' expression ')'

tablePath:
    [ [ catalogName . ] databaseName . ] tableName

systemTimePeriod:
    FOR SYSTEM_TIME AS OF dateTimeExpression

dynamicTableOptions:
    /*+ OPTIONS(key=val [, key=val]*) */

key:
    stringLiteral

val:
    stringLiteral

values:
    VALUES expression [, expression ]*

groupItem:
    expression
  | '(' ')'
  | '(' expression [, expression ]* ')'
  | CUBE '(' expression [, expression ]* ')'
  | ROLLUP '(' expression [, expression ]* ')'
  | GROUPING SETS '(' groupItem [, groupItem ]* ')'

windowRef:
    windowName
  | windowSpec

windowSpec:
    [ windowName ]
    '('
    [ ORDER BY orderItem [, orderItem ]* ]
    [ PARTITION BY expression [, expression ]* ]
    [
        RANGE numericOrIntervalExpression {PRECEDING}
      | ROWS numericExpression {PRECEDING}
    ]
    ')'

matchRecognize:
    MATCH_RECOGNIZE '('
    [ PARTITION BY expression [, expression ]* ]
    [ ORDER BY orderItem [, orderItem ]* ]
    [ MEASURES measureColumn [, measureColumn ]* ]
    [ ONE ROW PER MATCH ]
    [ AFTER MATCH
      ( SKIP TO NEXT ROW
      | SKIP PAST LAST ROW
      | SKIP TO FIRST variable
      | SKIP TO LAST variable
      | SKIP TO variable )
    ]
    PATTERN '(' pattern ')'
    [ WITHIN intervalLiteral ]
    DEFINE variable AS condition [, variable AS condition ]*
    ')'

measureColumn:
    expression AS alias

pattern:
    patternTerm [ '|' patternTerm ]*

patternTerm:
    patternFactor [ patternFactor ]*

patternFactor:
    variable [ patternQuantifier ]

patternQuantifier:
    '*'
  | '*?'
  | '+'
  | '+?'
  | '?'
  | '??'
  | '{' { [ minRepeat ], [ maxRepeat ] } '}' ['?']
  | '{' repeat '}'

statementSet:
    EXECUTE STATEMENT SET
    BEGIN
      { insertStatement ';' }+
    END ';'
```
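
As one concrete instance of the `statementSet` rule in the grammar, the following sketch submits two `INSERT INTO` statements together. The sink tables `dept_1_employees` and `other_employees` are hypothetical and are assumed to have the same schema as `employee_information`.

```sql
EXECUTE STATEMENT SET
BEGIN
  -- Each insertStatement is terminated by its own semicolon.
  INSERT INTO dept_1_employees
    SELECT * FROM employee_information WHERE dept_id = 1;
  INSERT INTO other_employees
    SELECT * FROM employee_information WHERE dept_id <> 1;
END;
```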

```sql
SELECT a AS `my field` FROM t;
```

```sql
SELECT 'Hello World'
```

```sql
SELECT 'It''s me'
```

```sql
SELECT 'Hello World', 'It''s me';
```

```sql
EXPR$0      EXPR$1
Hello World It's me
```

```sql
SELECT U&'\263A';
```

```sql
SELECT U&'#2713' UESCAPE '#';
```

---

### SQL SELECT statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/select.html

SELECT Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables querying the content of your tables by using familiar SELECT syntax. Syntax¶ SELECT [DISTINCT] select_list FROM table_expression [ WHERE boolean_expression ] [ LIMIT row_limit ] Description¶ The SELECT statement in Flink does what the SQL standard says it must do. You needn’t look further than standard SQL itself to understand the behavior. For example, UNION without ALL means that duplicate rows must be removed. Flink maintains the relation, called a dynamic table, specified by the SQL query. Its behavior is always the same as if you ran the SQL query again, over the current snapshot of the data, each time a new row arrives for any table in the relation. This formalism is what enables you to reason about exactly what Flink will do just by understanding what any SQL system, like MySQL, Snowflake, or Oracle, would do. Another way to understand what Flink SQL does is to consider the following statement: SELECT * FROM clicks ORDER BY clickTime LIMIT 10; This statement doesn’t only look at 10 rows, sort them, and terminate. It maintains this relation, and as new orders arrive, the relation changes, always showing the top 10 most recent orders. This is exactly as if you re-ran the query each time a new row was written to the clicks table. You’ll get the same result. Select list¶ The select_list specification * means the query resolves all columns. But in production, using * is not recommended, because it makes queries less robust to catalog changes. Instead, use a select_list to specify a subset of available columns or make calculations using the columns. For example, if an orders table has columns named order_id, price, and tax you could write the following query: SELECT order_id, price + tax FROM orders Table expression¶ The table_expression can be any source of data, including a table, view, or VALUES clause, the joined results of multiple existing tables, or a subquery. 
Assuming that an orders table is available in the catalog, the following would read all rows from orders. SELECT * FROM orders;

VALUES clause¶ Queries can consume from inline data by using the VALUES clause. Each tuple corresponds to one row. You can provide an alias to assign a name to each column. SELECT order_id, price FROM (VALUES (1, 2.0), (2, 3.1)) AS t (order_id, price); Your output should resemble: order_id price 1 2.0 2 3.1

WHERE clause¶ Filter rows by using the WHERE clause. SELECT price + tax FROM orders WHERE id = 10;

Functions¶ You can invoke built-in scalar functions on the columns of a single row. SELECT PRETTY_PRINT(order_id) FROM orders;

DISTINCT¶ If SELECT DISTINCT is specified, all duplicate rows are removed from the result set, which means that one row is kept from each group of duplicates. For streaming queries, the required state for computing the query result might grow infinitely. State size depends on the number of distinct rows. SELECT DISTINCT id FROM orders;

Usage¶ In the Flink SQL shell or in a Cloud Console workspace, run the following commands to see examples of the SELECT statement. Create a table for web page click events. -- Create a table for web page click events. CREATE TABLE clicks ( ip_address VARCHAR, url VARCHAR, click_ts_raw BIGINT ); Populate the table with mock clickstream data. -- Populate the table with mock clickstream data. INSERT INTO clicks VALUES( '10.0.0.1', 'https://acme.com/index.html', 1692812175), ( '10.0.0.12', 'https://apache.org/index.html', 1692826575), ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575), ( '10.0.0.1', 'https://acme.com/index.html', 1692812175), ( '10.0.0.12', 'https://apache.org/index.html', 1692819375), ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575); Press ENTER to return to the SQL shell. Because INSERT INTO VALUES is a point-in-time statement, it exits after it completes inserting records. View all rows in the clicks table by using a SELECT statement.
SELECT * FROM clicks; Your output should resemble: ip_address url click_ts_raw 10.0.0.1 https://acme.com/index.html 1692812175 10.0.0.12 https://apache.org/index.html 1692826575 10.0.0.13 https://confluent.io/index.html 1692826575 10.0.0.1 https://acme.com/index.html 1692812175 10.0.0.12 https://apache.org/index.html 1692819375 10.0.0.13 https://confluent.io/index.html 1692826575

View only unique rows in the clicks table by using a SELECT DISTINCT statement. SELECT DISTINCT * FROM clicks; Your output should resemble: ip_address url click_ts_raw 10.0.0.1 https://acme.com/index.html 1692812175 10.0.0.12 https://apache.org/index.html 1692826575 10.0.0.13 https://confluent.io/index.html 1692826575 10.0.0.12 https://apache.org/index.html 1692819375

View only records that have the ip_address of 10.0.0.1 by using a SELECT WHERE statement. SELECT * FROM clicks WHERE ip_address='10.0.0.1'; Your output should resemble: ip_address url click_ts_raw 10.0.0.1 https://acme.com/index.html 1692812175 10.0.0.1 https://acme.com/index.html 1692812175

Examples¶ The following examples show frequently encountered scenarios with SELECT.

Most minimal statement¶ Syntax: SELECT 1; Properties: Statement is bounded.

Check local time zone is configured correctly¶ Syntax: SELECT NOW(); Properties: Statement is bounded. NOW() returns a TIMESTAMP_LTZ(3), so if the client is configured correctly, it should show a timestamp in your local time zone.
Combine multiple tables into one¶ Syntax: CREATE TABLE t_union_1 (i INT); CREATE TABLE t_union_2 (i INT); TABLE t_union_1 UNION ALL TABLE t_union_2; -- alternate syntax SELECT * FROM t_union_1 UNION ALL SELECT * FROM t_union_2;

Get insights into the current watermark¶ Syntax: CREATE TABLE t_watermarked_insight (s STRING) DISTRIBUTED INTO 1 BUCKETS; INSERT INTO t_watermarked_insight VALUES ('Bob'), ('Alice'), ('Charly'); SELECT $rowtime, CURRENT_WATERMARK($rowtime) FROM t_watermarked_insight; The output resembles: $rowtime EXPR$1 2024-04-29 11:59:01.080 NULL 2024-04-29 11:59:01.093 2024-04-04 15:27:37.433 2024-04-29 11:59:01.094 2024-04-04 15:27:37.433

Properties: The CURRENT_WATERMARK function returns the watermark that arrived at the operator evaluating the SELECT statement. The returned watermark is the minimum of all inputs, across all tables/topics and their partitions. If a common watermark was not received from all inputs, the function returns NULL. The CURRENT_WATERMARK function takes a time attribute, which is a column that has WATERMARK FOR defined. A watermark is always emitted after the row has been processed, so the first row always has a NULL watermark. Because the default watermark algorithm requires at least 250 records, initially it assumes the maximum lag of 7 days plus a safety margin of 7 days. The watermark quickly (exponentially) goes down as more data arrives. Sources emit watermarks every 200 ms, but within the first 200 ms they emit per row to power examples like this.

Flatten fields into columns¶ Syntax: CREATE TABLE t_flattening (i INT, r1 ROW<i INT, s STRING>, r2 ROW<other INT>); SELECT r1.*, r2.* FROM t_flattening; Properties: You can apply the * operator on nested data, which enables flattening fields into columns of the table.

Related content¶ Flink SQL Queries, Flink SQL Functions, Statements

Note: This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
SELECT [DISTINCT] select_list FROM table_expression [ WHERE boolean_expression ] [ LIMIT row_limit ]
```

```sql
SELECT * FROM clicks ORDER BY clickTime LIMIT 10;
```
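
The statement above sorts in ascending order, so it maintains the 10 rows with the smallest clickTime values. If the goal is the 10 most recent clicks, a descending sort is the usual way to express it. The following is a minimal sketch, assuming the same illustrative clicks table and clickTime column as the statement above; whether a descending sort is accepted can depend on the execution mode, so treat it as something to verify in your environment.

```sql
-- Hypothetical variant: keep the 10 rows with the largest clickTime values.
-- clickTime is the illustrative column from the statement above.
SELECT *
FROM clicks
ORDER BY clickTime DESC
LIMIT 10;
```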

```sql
SELECT order_id, price + tax FROM orders
```
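
A computed expression such as price + tax has no column name of its own, so the result column receives a generated name (the watermark example on this page shows one such name, EXPR$1). A minimal sketch of naming the column explicitly follows; the alias total_price is an illustrative choice and not part of the original example.

```sql
-- Give the computed column an explicit name instead of a generated one.
-- total_price is a hypothetical alias chosen for this sketch.
SELECT order_id, price + tax AS total_price
FROM orders;
```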

```sql
SELECT * FROM orders;
```

```sql
SELECT order_id, price
   FROM (VALUES (1, 2.0), (2, 3.1))
   AS t (order_id, price);
```

```sql
order_id price
1        2.0
2        3.1
```

```sql
SELECT price + tax
   FROM orders
   WHERE id = 10;
```

```sql
SELECT PRETTY_PRINT(order_id) FROM orders;
```

```sql
SELECT DISTINCT id FROM orders;
```
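
Because the state for SELECT DISTINCT grows with the number of distinct rows, long-running streaming statements sometimes bound it with a state time-to-live. The sketch below assumes the sql.state-ttl session option and the '1 h' duration format; both should be checked against the SET statement reference before use. Expiring state trades correctness for bounded size: a value that reappears after its state has expired is emitted again.

```sql
-- Assumption: 'sql.state-ttl' is available as a session option and accepts
-- a duration string. Idle state older than one hour may be dropped.
SET 'sql.state-ttl' = '1 h';

-- The DISTINCT state for an id can now expire after an hour of inactivity,
-- so an id that shows up again later may be emitted a second time.
SELECT DISTINCT id FROM orders;
```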

```sql
-- Create a table for web page click events.
CREATE TABLE clicks (
  ip_address VARCHAR,
  url VARCHAR,
  click_ts_raw BIGINT
);
```

```sql
-- Populate the table with mock clickstream data.
INSERT INTO clicks
VALUES( '10.0.0.1',  'https://acme.com/index.html',     1692812175),
      ( '10.0.0.12', 'https://apache.org/index.html',   1692826575),
      ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575),
      ( '10.0.0.1',  'https://acme.com/index.html',     1692812175),
      ( '10.0.0.12', 'https://apache.org/index.html',   1692819375),
      ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575);
```

```sql
SELECT * FROM clicks;
```

```sql
ip_address url                             click_ts_raw
10.0.0.1   https://acme.com/index.html     1692812175
10.0.0.12  https://apache.org/index.html   1692826575
10.0.0.13  https://confluent.io/index.html 1692826575
10.0.0.1   https://acme.com/index.html     1692812175
10.0.0.12  https://apache.org/index.html   1692819375
10.0.0.13  https://confluent.io/index.html 1692826575
```

```sql
SELECT DISTINCT * FROM clicks;
```

```sql
ip_address url                             click_ts_raw
10.0.0.1   https://acme.com/index.html     1692812175
10.0.0.12  https://apache.org/index.html   1692826575
10.0.0.13  https://confluent.io/index.html 1692826575
10.0.0.12  https://apache.org/index.html   1692819375
```

```sql
SELECT * FROM clicks WHERE ip_address='10.0.0.1';
```

```sql
ip_address url                         click_ts_raw
10.0.0.1   https://acme.com/index.html 1692812175
10.0.0.1   https://acme.com/index.html 1692812175
```
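
The click_ts_raw column is a BIGINT, which is hard to read in query results. As a sketch, and assuming the values are seconds since the Unix epoch (the mock data is consistent with this, but the original doesn't state it), the built-in TO_TIMESTAMP_LTZ function can convert them in the select list. The alias click_ts is illustrative.

```sql
-- Convert the raw epoch value into a TIMESTAMP_LTZ for display.
-- The second argument is the precision: 0 means the input is in seconds.
SELECT
  ip_address,
  url,
  TO_TIMESTAMP_LTZ(click_ts_raw, 0) AS click_ts
FROM clicks;
```

The rendered timestamps depend on the session time zone, so no sample output is shown here.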

```sql
SELECT NOW();
```

```sql
CREATE TABLE t_union_1 (i INT);
CREATE TABLE t_union_2 (i INT);
TABLE t_union_1 UNION ALL TABLE t_union_2;

-- alternate syntax
SELECT * FROM t_union_1
UNION ALL
SELECT * FROM t_union_2;
```

```sql
CREATE TABLE t_watermarked_insight (s STRING) DISTRIBUTED INTO 1 BUCKETS;

INSERT INTO t_watermarked_insight VALUES ('Bob'), ('Alice'), ('Charly');

SELECT $rowtime, CURRENT_WATERMARK($rowtime) FROM t_watermarked_insight;
```

```sql
$rowtime                EXPR$1
2024-04-29 11:59:01.080 NULL
2024-04-29 11:59:01.093 2024-04-04 15:27:37.433
2024-04-29 11:59:01.094 2024-04-04 15:27:37.433
```

```sql
CREATE TABLE t_flattening (i INT, r1 ROW<i INT, s STRING>, r2 ROW<other INT>);

SELECT r1.*, r2.* FROM t_flattening;
```
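
When only some nested fields are needed, or when flattened names would collide (t_flattening has a top-level column i and a nested field r1.i), individual fields can be selected with dot notation and renamed. This is a minimal sketch; the aliases r1_i, r1_s, and r2_other are hypothetical names chosen for illustration.

```sql
-- Select nested fields one by one and give each an unambiguous name.
SELECT
  i,
  r1.i     AS r1_i,
  r1.s     AS r1_s,
  r2.other AS r2_other
FROM t_flattening;
```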

---

### SQL Set Logic in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/set-logic.html

Set Logic in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables set logic operations on tables in SQL statements. The supported operations are EXCEPT, EXISTS, IN, INTERSECT, and UNION.

Example data¶ The following examples use these tables to show how the different logical operators work. -- Create tables for the set logic operations. CREATE TABLE t1(chr CHAR); INSERT INTO t1 VALUES('c'), ('a'), ('b'), ('b'), ('c'); CREATE TABLE t2(chr CHAR); INSERT INTO t2 VALUES('d'), ('e'), ('a'), ('b'), ('b');

EXCEPT¶ EXCEPT and EXCEPT ALL return the rows that are found in the first table but not in the second. EXCEPT returns only distinct rows. EXCEPT ALL doesn’t remove duplicates from the result rows. The following code example shows output from the EXCEPT operator on tables t1 and t2. (SELECT chr FROM t1) EXCEPT (SELECT chr FROM t2); Your output should resemble: chr c The following code example shows output from the EXCEPT ALL operator on tables t1 and t2. (SELECT chr FROM t1) EXCEPT ALL (SELECT chr FROM t2); Your output should resemble: chr c c

EXISTS¶ Returns TRUE if the sub-query returns at least one row. SELECT `user`, amount FROM orders WHERE EXISTS ( SELECT product FROM NewProducts WHERE NewProducts.product = orders.product ) Only supported if the operation can be rewritten in a join and group operation. The optimizer rewrites the EXISTS operation into a join and group operation. For streaming queries, the required state for computing the query result might grow infinitely depending on the number of distinct input rows.

IN¶ Returns TRUE if an expression exists in a table sub-query. The sub-query table must consist of one column. This column must have the same data type as the expression. SELECT `user`, amount FROM orders WHERE product IN ( SELECT product FROM NewProducts ) The optimizer rewrites the IN condition into a join and group operation. For streaming queries, the required state for computing the query result might grow infinitely depending on the number of distinct input rows.
INTERSECT¶ INTERSECT and INTERSECT ALL return the rows that are found in both tables. INTERSECT returns only distinct rows. INTERSECT ALL doesn’t remove duplicates from the result rows. The following code example shows output from the INTERSECT operator on tables t1 and t2. (SELECT chr FROM t1) INTERSECT (SELECT chr FROM t2); Your output should resemble: chr a b The following code example shows output from the INTERSECT ALL operator on tables t1 and t2. (SELECT chr FROM t1) INTERSECT ALL (SELECT chr FROM t2); Your output should resemble: chr a b b

UNION¶ UNION and UNION ALL return the rows that are found in either table. UNION returns only distinct rows. UNION ALL doesn’t remove duplicates from the result rows. The following code example shows output from the UNION operator on tables t1 and t2. (SELECT chr FROM t1) UNION (SELECT chr FROM t2); Your output should resemble: chr c a b d e The following code example shows output from the UNION ALL operator on tables t1 and t2. (SELECT chr FROM t1) UNION ALL (SELECT chr FROM t2); Your output should resemble: chr c a b b c d e a b b

Related content¶ Flink SQL Queries, Flink SQL Functions, Statements

#### Code Examples

```sql
-- Create tables for the set logic operations.
CREATE TABLE t1(chr CHAR);
INSERT INTO t1 VALUES('c'), ('a'), ('b'), ('b'), ('c');

CREATE TABLE t2(chr CHAR);
INSERT INTO t2 VALUES('d'), ('e'), ('a'), ('b'), ('b');
```

```sql
(SELECT chr FROM t1) EXCEPT (SELECT chr FROM t2);
```

```sql
(SELECT chr FROM t1) EXCEPT ALL (SELECT chr FROM t2);
```

```sql
+----+
| chr|
+----+
|   c|
|   c|
+----+
```
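
One way to reason about the EXCEPT ALL result is to count how often each value occurs in each input. The following sketch uses only the t1 and t2 tables defined above; the alias cnt is illustrative, and the counts in the comments are the final values derived by hand from the INSERT statements.

```sql
-- Count how often each value occurs in each table.
SELECT chr, COUNT(*) AS cnt FROM t1 GROUP BY chr;
-- t1 holds: a x1, b x2, c x2

SELECT chr, COUNT(*) AS cnt FROM t2 GROUP BY chr;
-- t2 holds: a x1, b x2, d x1, e x1

-- EXCEPT ALL subtracts the per-value counts and never goes below zero:
--   a: 1 - 1 = 0,  b: 2 - 2 = 0,  c: 2 - 0 = 2
-- which matches the two 'c' rows in the EXCEPT ALL output.
```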

```sql
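SELECT `user`, amount
FROM orders
WHERE EXISTS (
    -- correlated sub-query: keep orders whose product appears in NewProducts
    SELECT product FROM NewProducts
    WHERE NewProducts.product = orders.product
)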
```

```sql
SELECT `user`, amount  -- user is a reserved keyword, so it is quoted with backticks
FROM orders
WHERE product IN (
    SELECT product FROM NewProducts
)
```
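
To make the "join and group operation" concrete, the IN query can be written by hand as a join against the distinct products. This is only a sketch of a semantically similar query for non-NULL product values, not the plan that the optimizer actually produces; the aliases o and n are hypothetical.

```sql
-- Hand-written approximation of the IN condition as a join.
-- DISTINCT plays the role of the "group" step: each product matches once,
-- so an order is never duplicated by repeated rows in NewProducts.
SELECT o.`user`, o.amount
FROM orders AS o
INNER JOIN (
    SELECT DISTINCT product FROM NewProducts
) AS n
  ON o.product = n.product;
```

The distinct set of products is what has to be kept in state, which is why state size depends on the number of distinct input rows.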

```sql
(SELECT chr FROM t1) INTERSECT (SELECT chr FROM t2);
```

```sql
(SELECT chr FROM t1) INTERSECT ALL (SELECT chr FROM t2);
```

```sql
(SELECT chr FROM t1) UNION (SELECT chr FROM t2);
```

```sql
chr
c
a
b
d
e
```

```sql
(SELECT chr FROM t1) UNION ALL (SELECT chr FROM t2);
```

```sql
chr
c
a
b
b
c
d
e
a
b
b
```

---

### SQL Statement Sets in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/statement-set.html

EXECUTE STATEMENT SET in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables executing multiple SQL statements as a single, optimized statement by using statement sets.

Syntax¶ EXECUTE STATEMENT SET BEGIN -- one or more INSERT INTO statements { INSERT INTO <select_statement>; }+ END;

Description¶ Statement sets are a feature of Confluent Cloud for Apache Flink® that enables executing a set of SQL statements as a single, optimized statement. This is useful when you have multiple SQL statements that share common intermediate results, as it enables you to reuse those results and avoid unnecessary computation. To use statement sets, you enclose one or more SQL statements in a block and execute them as a single unit. All statements in the block are optimized and executed together as a single Flink statement. Statement sets are particularly useful when you have multiple INSERT INTO statements that read from the same table or share intermediate results. By executing these statements together as a single statement, you can avoid redundant computation and improve performance.

Example¶ The following query results in a single statement being executed which reads from an orders table. If the status is completed, the product and quantity values are written to the sales table. If the status is returned, the product and quantity values are written to the returns table. EXECUTE STATEMENT SET BEGIN INSERT INTO `sales` (product, quantity) SELECT product, quantity FROM orders WHERE status = 'completed'; INSERT INTO `returns` (product, quantity) SELECT product, quantity FROM orders WHERE status = 'returned'; END;

Related content¶ INSERT INTO FROM SELECT, INSERT VALUES, Flink SQL Queries, Flink SQL Functions, Statements

#### Code Examples

```sql
EXECUTE STATEMENT SET
BEGIN
  -- one or more INSERT INTO statements
  { INSERT INTO <select_statement>; }+
END;
```

```sql
EXECUTE STATEMENT SET
BEGIN
   INSERT INTO `sales` (product, quantity) SELECT product, quantity FROM orders WHERE status = 'completed';
   INSERT INTO `returns` (product, quantity) SELECT product, quantity FROM orders WHERE status = 'returned';
END;
```
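
The statement set above assumes that the orders, sales, and returns tables already exist. A minimal sketch of compatible definitions follows. The column types are assumptions made for illustration, because the original example doesn't show the schemas.

```sql
-- Hypothetical schemas; adjust the column types to match your data.
CREATE TABLE orders (
  product  STRING,
  quantity INT,
  status   STRING
);

CREATE TABLE `sales` (
  product  STRING,
  quantity INT
);

CREATE TABLE `returns` (
  product  STRING,
  quantity INT
);
```

With these tables in place, both INSERT INTO statements read orders once inside the single statement that the set produces.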

---

### SQL Top-N queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/topn.html

Top-N Queries in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables finding the smallest or largest values, ordered by columns, in a table.

Syntax¶ SELECT [column_list] FROM ( SELECT [column_list], ROW_NUMBER() OVER ([PARTITION BY column1[, column2...]] ORDER BY column1 [asc|desc][, column2 [asc|desc]...]) AS rownum FROM table_name) WHERE rownum <= N [AND conditions]

Parameter Specification

Note: This query pattern must be followed exactly; otherwise, the optimizer can’t translate the query.

- ROW_NUMBER(): Assigns a unique, sequential number to each row, starting with one, according to the ordering of rows within the partition. Currently, Flink supports only ROW_NUMBER as the over window function. In the future, Flink may support RANK() and DENSE_RANK().
- PARTITION BY column1[, column2...]: Specifies the partition columns. Each partition has a Top-N result.
- ORDER BY column1 [asc|desc][, column2 [asc|desc]...]: Specifies the ordering columns. The ordering directions can be different on different columns.
- WHERE rownum <= N: The rownum <= N condition is required for Flink to recognize that this query is a Top-N query. N represents the number of smallest or largest records to retain.
- [AND conditions]: You can add other conditions in the WHERE clause, but they can only be combined with rownum <= N by using the AND conjunction.

Description¶ Top-N queries return the N smallest or largest values in a table, ordered by columns. Both the smallest-values and largest-values sets are considered Top-N queries. Top-N queries are useful when you need to display only the N bottom-most or the N top-most records from a batch or streaming table, based on a condition. This result set can be used for further analysis. Flink uses the combination of an OVER window clause and a filter condition to express a Top-N query. With the power of the OVER window PARTITION BY clause, Flink also supports per-group Top-N.
For example, you can query the top five products per category that have the maximum sales in real time. Top-N queries are supported for SQL on batch and streaming tables. The Top-N query is Result Updating, which means that Flink sorts the input stream according to the order key. If the top N rows have changed, the changed rows are sent downstream as retraction/update records.

Examples¶ The following examples show how to specify Top-N queries on streaming tables. The unique key of a Top-N query is the combination of the partition columns and the row number column. A Top-N query can also derive the unique key of its upstream input. The following example shows how to get “the top five products per category that have the maximum sales in real time”. If product_id is the unique key of the ShopSales table, the unique keys of the Top-N query are [category, row_num] and [product_id]. CREATE TABLE ShopSales ( product_id STRING, category STRING, product_name STRING, sales BIGINT ) WITH (...); SELECT * FROM ( SELECT *, ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS row_num FROM ShopSales) WHERE row_num <= 5

No ranking output optimization¶ As described in the previous example, the row_num field is written into the result table as one field of the unique key, which may cause many records to be written to the result table. For example, when a record such as product-1001 at rank 9 is updated and its rank rises to 1, all the records ranked 1 through 9 are output to the result table as update messages. If the result table receives too many rows, it may slow the SQL job execution. To optimize the query, omit the row_num field in the outer SELECT clause of the Top-N query. This approach is reasonable, because the number of Top-N rows usually isn’t large, so consumers can sort the rows themselves quickly. Without the row_num field, only the changed record (product-1001) must be sent downstream, which can greatly reduce the I/O to the result table.
The following example shows how to optimize the previous Top-N example by omitting the row_num field from the output: CREATE TABLE ShopSales ( product_id STRING, category STRING, product_name STRING, sales BIGINT ) WITH (...); -- omit row_num field from the output SELECT product_id, category, product_name, sales FROM ( SELECT *, ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS row_num FROM ShopSales) WHERE row_num <= 5

Note: In streaming mode, to write the output of the above query to external storage and get a correct result, the external storage must have the same unique key as the Top-N query. In the above example query, if product_id is the unique key of the query, then the external table should also have product_id as its unique key.

Related content¶ Window Aggregation Queries, Windowing Table-Valued Functions (Windowing TVFs), Top-N Queries

#### Code Examples

```sql
SELECT [column_list]
FROM (
  SELECT [column_list],
    ROW_NUMBER() OVER ([PARTITION BY column1[, column2...]]
      ORDER BY column1 [asc|desc][, column2 [asc|desc]...]) AS rownum
    FROM table_name)
WHERE rownum <= N [AND conditions]
```

```sql
CREATE TABLE ShopSales (
  product_id   STRING,
  category     STRING,
  product_name STRING,
  sales        BIGINT
) WITH (...);

SELECT *
FROM (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS row_num
  FROM ShopSales)
WHERE row_num <= 5
```
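
The same pattern returns the smallest values when the sort direction is ascending. The following sketch reuses the ShopSales table from the example above and keeps the three lowest-selling products per category; the limit of 3 is an arbitrary choice for illustration.

```sql
-- Bottom-N variant: ascending order ranks the smallest sales first.
SELECT product_id, category, product_name, sales, row_num
FROM (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales ASC) AS row_num
  FROM ShopSales)
WHERE row_num <= 3;
```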

```sql
CREATE TABLE ShopSales (
  product_id   STRING,
  category     STRING,
  product_name STRING,
  sales        BIGINT
) WITH (...);

-- omit row_num field from the output
SELECT product_id, category, product_name, sales
FROM (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS row_num
  FROM ShopSales)
WHERE row_num <= 5
```
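
When the optimized query writes to a result table, that table needs the same unique key as the query. The following is a sketch under stated assumptions: the table name top_products is hypothetical, product_id is assumed to be the unique key of ShopSales, and the sink declares it with the standard PRIMARY KEY ... NOT ENFORCED clause. Connector options and changelog settings for the sink are omitted and may be required in your environment.

```sql
-- Hypothetical result table keyed by product_id, matching the unique key
-- of the Top-N query when the rank column is omitted.
CREATE TABLE top_products (
  product_id   STRING,
  category     STRING,
  product_name STRING,
  sales        BIGINT,
  PRIMARY KEY (product_id) NOT ENFORCED
);

-- Write the Top-N result without the rank column into the keyed table.
INSERT INTO top_products
SELECT product_id, category, product_name, sales
FROM (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS row_num
  FROM ShopSales)
WHERE row_num <= 5;
```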

---

### SQL Window Aggregation Queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/window-aggregation.html

Window Aggregation Queries in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables aggregating data over windows in a table.

Syntax¶ SELECT ... FROM <windowed_table> -- relation applied windowing TVF GROUP BY window_start, window_end, ...

Description¶

Window TVF Aggregation¶ Window aggregations are defined in a GROUP BY clause that contains the window_start and window_end columns of the relation to which a windowing TVF is applied. Just like queries with regular GROUP BY clauses, queries with a group-by window aggregation compute a single result row per group. Unlike other aggregations on continuous tables, window aggregations do not emit intermediate results but only a final result: the total aggregation at the end of the window. Moreover, window aggregations purge all intermediate state when it’s no longer needed.

Windowing TVFs¶ Flink supports TUMBLE, HOP, CUMULATE, and SESSION types of window aggregations. The time attribute field of a window table-valued function must be an event-time attribute. For more information, see Windowing TVF. In batch mode, the time attribute field of a window table-valued function must be an attribute of type TIMESTAMP or TIMESTAMP_LTZ. SESSION window aggregation is not supported in batch mode.

Examples¶ The following examples show window aggregations over example data streams that you can experiment with. Note: To show the behavior of windowing more clearly in the following examples, TIMESTAMP(3) values may be simplified so that trailing zeroes aren’t shown. For example, 2020-04-15 08:05:00.000 may be shown as 2020-04-15 08:05. Columns may be hidden intentionally to enhance the readability of the content. Here are some examples for TUMBLE, HOP, CUMULATE, and SESSION window aggregations.
DESCRIBE `examples`.`marketplace`.`orders`; +--------------+-----------+----------+---------------+ | Column Name | Data Type | Nullable | Extras | +--------------+-----------+----------+---------------+ | order_id | STRING | NOT NULL | | | customer_id | INT | NOT NULL | | | product_id | STRING | NOT NULL | | | price | DOUBLE | NOT NULL | | +--------------+-----------+----------+---------------+ SELECT * FROM `examples`.`marketplace`.`orders`; order_id customer_id product_id price d770a538-a70c-4de6-9d06-e6c16c5bef5a 3075 1379 32.21 787ee1f4-d0d0-4c39-bdb9-44dc2d203d55 3028 1335 34.74 7ab7ce23-5f61-4398-afad-b1e3f548fee3 3148 1045 69.26 6fea712c-9454-497e-8038-ebaf6dfc7a17 3247 1390 67.26 dc9daf5e-98d5-4bcd-8839-251fed13b75e 3167 1309 12.04 ab3151d0-2950-49cd-9783-016ccc6a3281 3105 1094 21.52 d27ca945-3cff-48a4-afcc-7b17446aa95d 3168 1250 99.95 -- apply aggregation on the tumbling windowed table SELECT window_start, window_end, SUM(price) as `sum` FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) GROUP BY window_start, window_end; window_start window_end sum 2023-11-02 10:40:00 2023-11-02 10:50:00 258484.93 2023-11-02 10:50:00 2023-11-02 11:00:00 287632.15 2023-11-02 11:00:00 2023-11-02 11:10:00 271945.78 2023-11-02 11:10:00 2023-11-02 11:20:00 315207.46 2023-11-02 11:20:00 2023-11-02 11:30:00 342618.92 2023-11-02 11:30:00 2023-11-02 11:40:00 329754.31 -- apply aggregation on the hopping windowed table SELECT window_start, window_end, SUM(price) as `sum` FROM TABLE( HOP(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES, INTERVAL '10' MINUTES)) GROUP BY window_start, window_end; window_start window_end sum 2023-11-02 11:10:00 2023-11-02 11:20:00 296049.38 2023-11-02 11:15:00 2023-11-02 11:25:00 1122455.07 2023-11-02 11:20:00 2023-11-02 11:30:00 1648270.20 2023-11-02 11:25:00 2023-11-02 11:35:00 2143271.00 2023-11-02 11:30:00 2023-11-02 11:40:00 2701592.45 2023-11-02 11:35:00 
2023-11-02 11:45:00 3214376.78 -- apply aggregation on the cumulating windowed table SELECT window_start, window_end, SUM(price) as `sum` FROM TABLE( CUMULATE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '2' MINUTES, INTERVAL '10' MINUTES)) GROUP BY window_start, window_end; window_start window_end sum 2023-11-02 12:40:00.000 2023-11-02 12:46:00.000 327376.23 2023-11-02 12:40:00.000 2023-11-02 12:48:00.000 661272.70 2023-11-02 12:40:00.000 2023-11-02 12:50:00.000 989294.13 2023-11-02 12:50:00.000 2023-11-02 12:52:00.000 1316596.58 2023-11-02 12:50:00.000 2023-11-02 12:54:00.000 1648097.20 2023-11-02 12:50:00.000 2023-11-02 12:56:00.000 1977881.53 2023-11-02 12:50:00.000 2023-11-02 12:58:00.000 2304080.32 2023-11-02 12:50:00.000 2023-11-02 13:00:00.000 2636795.56 -- apply aggregation on the session windowed table SELECT window_start, window_end, customer_id, SUM(price) as `sum` FROM TABLE( SESSION(TABLE `examples`.`marketplace`.`orders` PARTITION BY customer_id, DESCRIPTOR($rowtime), INTERVAL '1' MINUTES)) GROUP BY window_start, window_end, customer_id; window_start window_end sum 2023-11-02 12:40:00 2023-11-02 12:46:00 327376.23 2023-11-02 12:40:00 2023-11-02 12:48:00 661272.70 2023-11-02 12:40:00 2023-11-02 12:50:00 989294.13 2023-11-02 12:50:00 2023-11-02 12:52:00 1316596.58 2023-11-02 12:50:00 2023-11-02 12:54:00 1648097.20 2023-11-02 12:50:00 2023-11-02 12:56:00 1977881.53 2023-11-02 12:50:00 2023-11-02 12:58:00 2304080.32 2023-11-02 12:50:00 2023-11-02 13:00:00 2636795.56 GROUPING SETS¶ Window aggregations also support GROUPING SETS syntax. Grouping sets allow for more complex grouping operations than those describable by a standard GROUP BY. Rows are grouped separately by each specified grouping set and aggregates are computed for each group just as for simple GROUP BY clauses. 
Window aggregations with GROUPING SETS require that both the window_start and window_end columns appear in the GROUP BY clause, but not in the GROUPING SETS clause. SELECT window_start, window_end, player_id, SUM(points) as `sum` FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) GROUP BY window_start, window_end, GROUPING SETS ((player_id), ()); window_start window_end player_id sum 2023-11-03 11:20 2023-11-03 11:30 (NULL) 6596 2023-11-03 11:20 2023-11-03 11:30 1025 6232 2023-11-03 11:20 2023-11-03 11:30 1007 4486 2023-11-03 11:30 2023-11-03 11:40 (NULL) 6073 2023-11-03 11:30 2023-11-03 11:40 1025 6953 2023-11-03 11:30 2023-11-03 11:40 1007 3723 Each sublist of GROUPING SETS may specify zero or more columns or expressions and is interpreted the same way as though it were used directly in the GROUP BY clause. An empty grouping set means that all rows are aggregated down to a single group, which is output even if no input rows were present. References to the grouping columns or expressions are replaced by null values in result rows for grouping sets in which those columns do not appear.

ROLLUP¶ ROLLUP is a shorthand notation for specifying a common type of grouping set. It represents the given list of expressions and all prefixes of the list, including the empty list. Window aggregations with ROLLUP require that both the window_start and window_end columns appear in the GROUP BY clause, but not in the ROLLUP clause. For example, the following query is equivalent to the one above. SELECT window_start, window_end, player_id, SUM(points) as `sum` FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) GROUP BY window_start, window_end, ROLLUP (player_id);

CUBE¶ CUBE is a shorthand notation for specifying a common type of grouping set. It represents the given list and all of its possible subsets, that is, the power set.
Window aggregations with CUBE require that both the window_start and window_end columns appear in the GROUP BY clause, but not in the CUBE clause. For example, the following two queries are equivalent. SELECT window_start, window_end, game_room_id, player_id, SUM(points) as `sum` FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) GROUP BY window_start, window_end, CUBE (player_id, game_room_id); SELECT window_start, window_end, game_room_id, player_id, SUM(points) as `sum` FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) GROUP BY window_start, window_end, GROUPING SETS ( (player_id, game_room_id), (player_id ), ( game_room_id), ( ) );

Selecting Group Window Start and End Timestamps¶ The start and end timestamps of group windows can be selected with the grouped window_start and window_end columns.

Cascading Window Aggregation¶ The window_start and window_end columns are regular timestamp columns, not time attributes, so they can’t be used as time attributes in subsequent time-based operations. To propagate time attributes, you also need to add the window_time column to the GROUP BY clause. The window_time column is the third column produced by windowing TVFs, and it is a time attribute of the assigned window. Adding window_time to a GROUP BY clause also makes window_time a group key that can be selected. Subsequent queries can use this column for time-based operations, like cascading window aggregations and Window TopN. The following code shows a cascading window aggregation in which the first window aggregation propagates the time attribute for the second window aggregation. -- tumbling 5 minutes for each player_id WITH fiveminutewindow AS ( -- Note: The window start and window end fields of inner Window TVF -- are optional in the SELECT clause.
But if they appear in the clause, -- they must be aliased to prevent name conflicts with the window start -- and window end of the outer Window TVF. SELECT window_start AS window_5mintumble_start, window_end as window_5mintumble_end, window_time AS rowtime, SUM(points) as `partial_sum` FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES)) GROUP BY player_id, window_start, window_end, window_time ) -- tumbling 10 minutes on the first window SELECT window_start, window_end, SUM(partial_sum) as total_sum FROM TABLE( TUMBLE(TABLE fiveminutewindow, DESCRIPTOR(rowtime), INTERVAL '10' MINUTES)) GROUP BY window_start, window_end;

Related content¶ Course: Window Aggregations, Top-N Queries, Window Top-N Queries, Windowing Table-Valued Functions (Windowing TVFs)

#### Code Examples

```sql
SELECT ...
FROM <windowed_table> -- relation applied windowing TVF
GROUP BY window_start, window_end, ...
```

```sql
DESCRIBE `examples`.`marketplace`.`orders`;
```

```sql
+--------------+-----------+----------+---------------+
| Column Name  | Data Type | Nullable |    Extras     |
+--------------+-----------+----------+---------------+
| order_id     | STRING    | NOT NULL |               |
| customer_id  | INT       | NOT NULL |               |
| product_id   | STRING    | NOT NULL |               |
| price        | DOUBLE    | NOT NULL |               |
+--------------+-----------+----------+---------------+
```

```sql
SELECT * FROM `examples`.`marketplace`.`orders`;
```

```sql
order_id                             customer_id  product_id price
d770a538-a70c-4de6-9d06-e6c16c5bef5a 3075         1379       32.21
787ee1f4-d0d0-4c39-bdb9-44dc2d203d55 3028         1335       34.74
7ab7ce23-5f61-4398-afad-b1e3f548fee3 3148         1045       69.26
6fea712c-9454-497e-8038-ebaf6dfc7a17 3247         1390       67.26
dc9daf5e-98d5-4bcd-8839-251fed13b75e 3167         1309       12.04
ab3151d0-2950-49cd-9783-016ccc6a3281 3105         1094       21.52
d27ca945-3cff-48a4-afcc-7b17446aa95d 3168         1250       99.95
```

```sql
-- apply aggregation on the tumbling windowed table
SELECT window_start, window_end, SUM(price) as `sum`
  FROM TABLE(
    TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))
  GROUP BY window_start, window_end;
```

```sql
window_start        window_end          sum
2023-11-02 10:40:00 2023-11-02 10:50:00 258484.93
2023-11-02 10:50:00 2023-11-02 11:00:00 287632.15
2023-11-02 11:00:00 2023-11-02 11:10:00 271945.78
2023-11-02 11:10:00 2023-11-02 11:20:00 315207.46
2023-11-02 11:20:00 2023-11-02 11:30:00 342618.92
2023-11-02 11:30:00 2023-11-02 11:40:00 329754.31
```

```sql
-- apply aggregation on the hopping windowed table
SELECT window_start, window_end, SUM(price) as `sum`
  FROM TABLE(
    HOP(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES, INTERVAL '10' MINUTES))
  GROUP BY window_start, window_end;
```

```sql
window_start        window_end          sum
2023-11-02 11:10:00 2023-11-02 11:20:00 296049.38
2023-11-02 11:15:00 2023-11-02 11:25:00 1122455.07
2023-11-02 11:20:00 2023-11-02 11:30:00 1648270.20
2023-11-02 11:25:00 2023-11-02 11:35:00 2143271.00
2023-11-02 11:30:00 2023-11-02 11:40:00 2701592.45
2023-11-02 11:35:00 2023-11-02 11:45:00 3214376.78
```
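
To make the hopping window assignment concrete, the following Python sketch lists the windows that contain one timestamp for the 5-minute slide and 10-minute size used in the `HOP` call above. It is an illustrative model only, not Flink's implementation; the `hop_windows` helper and the epoch-aligned window boundaries are assumptions made for this example.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed alignment origin for window boundaries


def hop_windows(ts, slide, size):
    """Return every [start, end) hopping window that contains ts."""
    start = ts - (ts - EPOCH) % slide  # latest window start at or before ts
    windows = []
    while start > ts - size:  # this window still covers ts
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


windows = hop_windows(
    datetime(2023, 11, 2, 11, 17),
    slide=timedelta(minutes=5),
    size=timedelta(minutes=10),
)
for start, end in windows:
    print(start.time(), end.time())
```

With these settings each row falls into size / slide = 2 overlapping windows, so the same order contributes to two rows of the aggregated result.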

```sql
-- apply aggregation on the cumulating windowed table
SELECT window_start, window_end, SUM(price) as `sum`
  FROM TABLE(
    CUMULATE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '2' MINUTES, INTERVAL '10' MINUTES))
  GROUP BY window_start, window_end;
```

```sql
window_start            window_end              sum
2023-11-02 12:40:00.000 2023-11-02 12:46:00.000 327376.23
2023-11-02 12:40:00.000 2023-11-02 12:48:00.000 661272.70
2023-11-02 12:40:00.000 2023-11-02 12:50:00.000 989294.13
2023-11-02 12:50:00.000 2023-11-02 12:52:00.000 1316596.58
2023-11-02 12:50:00.000 2023-11-02 12:54:00.000 1648097.20
2023-11-02 12:50:00.000 2023-11-02 12:56:00.000 1977881.53
2023-11-02 12:50:00.000 2023-11-02 12:58:00.000 2304080.32
2023-11-02 12:50:00.000 2023-11-02 13:00:00.000 2636795.56
```
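
The cumulating output above can be read as follows: every window in a 10-minute span shares the same start, and the end advances in 2-minute steps. The Python sketch below models which cumulating windows contain a given timestamp. The `cumulate_windows` helper and the epoch alignment are assumptions for illustration, not Flink internals.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed alignment origin for window boundaries


def cumulate_windows(ts, step, max_size):
    """Return every [start, end) cumulating window that contains ts."""
    start = ts - (ts - EPOCH) % max_size  # shared start of the current span
    windows = []
    end = start + step
    while end <= start + max_size:
        if end > ts:  # windows that already closed before ts are skipped
            windows.append((start, end))
        end += step
    return windows


windows = cumulate_windows(
    datetime(2023, 11, 2, 12, 45),
    step=timedelta(minutes=2),
    max_size=timedelta(minutes=10),
)
for start, end in windows:
    print(start.time(), end.time())
```

A row at 12:45 contributes to the windows ending at 12:46, 12:48, and 12:50, which is why the sums grow from one row of the result to the next within a span.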

```sql
-- apply aggregation on the session windowed table
SELECT window_start, window_end, customer_id, SUM(price) as `sum`
  FROM TABLE(
    SESSION(TABLE `examples`.`marketplace`.`orders` PARTITION BY customer_id, DESCRIPTOR($rowtime), INTERVAL '1' MINUTES))
  GROUP BY window_start, window_end, customer_id;
```

```sql
window_start        window_end          sum
2023-11-02 12:40:00 2023-11-02 12:46:00 327376.23
2023-11-02 12:40:00 2023-11-02 12:48:00 661272.70
2023-11-02 12:40:00 2023-11-02 12:50:00 989294.13
2023-11-02 12:50:00 2023-11-02 12:52:00 1316596.58
2023-11-02 12:50:00 2023-11-02 12:54:00 1648097.20
2023-11-02 12:50:00 2023-11-02 12:56:00 1977881.53
2023-11-02 12:50:00 2023-11-02 12:58:00 2304080.32
2023-11-02 12:50:00 2023-11-02 13:00:00 2636795.56
```

```sql
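-- Assumes the gaming_player_activity_source table (Datagen "Gaming Player Activity"
-- quickstart), which provides the player_id and points columns used here.
SELECT window_start, window_end, player_id, SUM(points) as `sum`
  FROM TABLE(
    TUMBLE(TABLE gaming_player_activity_source, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))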
  GROUP BY window_start, window_end, GROUPING SETS ((player_id), ());
```

```sql
window_start     window_end       player_id sum
2023-11-03 11:20 2023-11-03 11:30 (NULL)    6596
2023-11-03 11:20 2023-11-03 11:30 1025      6232
2023-11-03 11:20 2023-11-03 11:30 1007      4486
2023-11-03 11:30 2023-11-03 11:40 (NULL)    6073
2023-11-03 11:30 2023-11-03 11:40 1025      6953
2023-11-03 11:30 2023-11-03 11:40 1007      3723
```

```sql
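-- Assumes the gaming_player_activity_source table (Datagen "Gaming Player Activity"
-- quickstart), which provides the player_id and points columns used here.
SELECT window_start, window_end, player_id, SUM(points) as `sum`
    FROM TABLE(
      TUMBLE(TABLE gaming_player_activity_source, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))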
    GROUP BY window_start, window_end, ROLLUP (player_id);
```

```sql
-- Assumes the gaming_player_activity_source table (Datagen "Gaming Player Activity"
-- quickstart), which provides the player_id, game_room_id, and points columns.

-- Query 1: CUBE over two columns
SELECT window_start, window_end, game_room_id, player_id, SUM(points) as `sum`
  FROM TABLE(
    TUMBLE(TABLE gaming_player_activity_source, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))
  GROUP BY window_start, window_end, CUBE (player_id, game_room_id);

-- Query 2: the equivalent GROUPING SETS form
SELECT window_start, window_end, game_room_id, player_id, SUM(points) as `sum`
  FROM TABLE(
    TUMBLE(TABLE gaming_player_activity_source, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))
  GROUP BY window_start, window_end, GROUPING SETS (
    (player_id, game_room_id),
    (player_id              ),
    (           game_room_id),
    (                       )
  );
```
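
The equivalence between `CUBE` and `GROUPING SETS` follows from `CUBE` expanding to every subset of its columns. The Python sketch below enumerates those subsets for the two columns used above. The `cube_grouping_sets` helper is a hypothetical name for this illustration and is not part of any Flink API.

```python
from itertools import combinations


def cube_grouping_sets(columns):
    """Return every subset of columns, largest first, as CUBE would expand them."""
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets


grouping_sets = cube_grouping_sets(["player_id", "game_room_id"])
for grouping_set in grouping_sets:
    print(grouping_set)
```

For n columns the expansion has 2^n grouping sets, so two columns yield the four sets listed in the `GROUPING SETS` query.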

```sql
-- Assumes the gaming_player_activity_source table (Datagen "Gaming Player Activity"
-- quickstart), which provides the player_id and points columns used here.
-- tumbling 5 minutes for each player_id
WITH fiveminutewindow AS (
  -- Note: The window start and window end fields of inner Window TVF
  -- are optional in the SELECT clause. But if they appear in the clause,
  -- they must be aliased to prevent name conflicts with the window start
  -- and window end of the outer Window TVF.
  SELECT window_start AS window_5mintumble_start, window_end AS window_5mintumble_end,
         window_time AS rowtime, SUM(points) AS `partial_sum`
    FROM TABLE(
      TUMBLE(TABLE gaming_player_activity_source, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
    GROUP BY player_id, window_start, window_end, window_time
)
-- tumbling 10 minutes on the first window
-- The CTE exposes the propagated time attribute as rowtime and the sum as partial_sum.
SELECT window_start, window_end, SUM(partial_sum) AS total_sum
  FROM TABLE(
    TUMBLE(TABLE fiveminutewindow, DESCRIPTOR(rowtime), INTERVAL '10' MINUTES))
  GROUP BY window_start, window_end;
```
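
The cascading pattern above can be checked on paper: summing 5-minute partial sums into 10-minute windows should give the same totals as a direct 10-minute aggregation. The following Python sketch models this with a few made-up events. All names (`tumble_start`, `window_sums`) and the sample data are hypothetical, and the window time is modeled as the window end minus one millisecond.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed alignment origin for window boundaries
FIVE, TEN = timedelta(minutes=5), timedelta(minutes=10)


def tumble_start(ts, size):
    """Start of the tumbling window of length size that contains ts."""
    return ts - (ts - EPOCH) % size


def window_sums(rows, size):
    """rows: (timestamp, key, value). Returns {(window_start, key): sum}."""
    sums = defaultdict(int)
    for ts, key, value in rows:
        sums[(tumble_start(ts, size), key)] += value
    return dict(sums)


# Hypothetical (rowtime, player_id, points) events.
events = [
    (datetime(2023, 11, 3, 11, 21), 1025, 10),
    (datetime(2023, 11, 3, 11, 23), 1007, 5),
    (datetime(2023, 11, 3, 11, 27), 1025, 7),
    (datetime(2023, 11, 3, 11, 31), 1007, 3),
]

# Stage 1: 5-minute sums per player, re-stamped with window_time.
partial = window_sums(events, FIVE)
stage_two_input = [
    (start + FIVE - timedelta(milliseconds=1), None, total)
    for (start, _player), total in partial.items()
]

# Stage 2: 10-minute totals over the partial sums, compared with a direct pass.
totals = window_sums(stage_two_input, TEN)
direct = window_sums([(ts, None, points) for ts, _player, points in events], TEN)
print(totals == direct)
```

Because each window time lies inside its 5-minute window, every partial sum lands in the 10-minute window that covers it.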

---

### SQL Window Deduplication Queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/window-deduplication.html

Confluent Cloud for Apache Flink® enables removing duplicate rows over a set of columns in a windowed table.

#### Syntax

The syntax statement is the first block under Code Examples below. This query pattern must be followed exactly; otherwise, the optimizer won't translate the query to Window Deduplication.

Parameter specification:

- `ROW_NUMBER()`: Assigns a unique, sequential number to each row, starting with one.
- `PARTITION BY window_start, window_end [, column_key1...]`: Specifies the partition columns, which must contain `window_start` and `window_end` and may contain other partition keys.
- `ORDER BY time_attr [asc|desc]`: Specifies the ordering column, which must be a time attribute. Flink SQL supports the event time attribute. Processing time is not supported in Confluent Cloud for Apache Flink. Ordering by ASC keeps the first row, and ordering by DESC keeps the last row.
- `WHERE (rownum = 1 | rownum <=1 | rownum < 2)`: One of the conditions `rownum = 1`, `rownum <= 1`, or `rownum < 2` is required for the optimizer to recognize that the query should be translated to Window Deduplication.

#### Description

Window Deduplication is a special deduplication that removes duplicate rows over a set of columns, keeping the first row or the last row for each window and set of partition keys.

For streaming queries, unlike regular deduplication on continuous tables, Window Deduplication doesn't emit intermediate results. It emits only a final result at the end of the window. Also, Window Deduplication purges all intermediate state when it's no longer needed. As a result, Window Deduplication queries perform better if you don't need results updated per row.

Usually, Window Deduplication is used with a Windowing TVF directly. Window Deduplication can also be used with other operations based on Windowing TVFs, like Window Aggregation, Window Top-N, and Window Join.

Window Deduplication can be defined with the same syntax as regular Deduplication. For more information, see Deduplication Queries in Confluent Cloud for Apache Flink. Window Deduplication requires that the `PARTITION BY` clause contains the `window_start` and `window_end` columns of the relation; otherwise, the optimizer can't translate the query.

Flink uses `ROW_NUMBER()` to remove duplicates, similar to its usage in Top-N Queries in Confluent Cloud for Apache Flink. Deduplication is a special case of the Top-N query, in which N is one and the order is by event time.

#### Example

The example under Code Examples below shows how to keep the last record for every 10-minute tumbling window. The mock data is produced by the Datagen Source Connector configured with the Gaming Player Activity quickstart. The blocks show, in order, the table schema, sample rows, the deduplication query, and its output.

#### Related content

- Top-N Queries
- Window Top-N Queries
- Windowing Table-Valued Functions (Windowing TVFs)

#### Code Examples

```sql
SELECT [column_list]
FROM (
   SELECT [column_list],
     ROW_NUMBER() OVER (PARTITION BY window_start, window_end [, column_key1...]
       ORDER BY time_attr [asc|desc]) AS rownum
   FROM table_name) -- relation applied windowing TVF
WHERE (rownum = 1 | rownum <=1 | rownum < 2) [AND conditions]
```

```sql
DESCRIBE gaming_player_activity_source;
```

```sql
+--------------+-----------+----------+---------------+
| Column Name  | Data Type | Nullable |    Extras     |
+--------------+-----------+----------+---------------+
| key          | BYTES     | NULL     | PARTITION KEY |
| player_id    | INT       | NOT NULL |               |
| game_room_id | INT       | NOT NULL |               |
| points       | INT       | NOT NULL |               |
| coordinates  | STRING    | NOT NULL |               |
+--------------+-----------+----------+---------------+
```

```sql
SELECT * FROM gaming_player_activity_source;
```

```sql
player_id game_room_id points coordinates
1051      1144         371    [65,36]
1079      3451         38     [20,71]
1017      4177         419    [63,05]
1092      1801         209    [31,67]
1074      3013         401    [32,69]
1003      1038         284    [18,32]
1081      2265         196    [78,68]
```

```sql
SELECT *
  FROM (
    SELECT $rowtime, points, game_room_id, player_id, window_start, window_end,
      ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY $rowtime DESC) AS rownum
    FROM TABLE(
               TUMBLE(TABLE gaming_player_activity_source, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))
  ) WHERE rownum <= 1;
```

```sql
$rowtime                points game_room_id player_id window_start     window_end       rownum
2023-11-03 19:59:59.407 371    2504         1094      2023-11-03 19:50 2023-11-03 20:00 1
2023-11-03 20:09:59.921 188    4342         1036      2023-11-03 20:00 2023-11-03 20:10 1
2023-11-03 20:19:59.741 128    3427         1046      2023-11-03 20:10 2023-11-03 20:20 1
2023-11-03 20:29:59.992 311    1000         1049      2023-11-03 20:20 2023-11-03 20:30 1
2023-11-03 20:39:59.569 429    1217         1062      2023-11-03 20:30 2023-11-03 20:40 1
```
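
As a rough model of what the query above computes, the Python sketch below keeps only the latest row in each 10-minute tumbling window, which mirrors `ORDER BY $rowtime DESC` with `rownum <= 1`. The sample rows, the `dedup_last_per_window` helper, and the epoch-aligned windows are assumptions for illustration. Ties on the timestamp are resolved here by keeping the first row seen, which Flink does not guarantee.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed alignment origin for window boundaries


def tumble_start(ts, size):
    """Start of the tumbling window of length size that contains ts."""
    return ts - (ts - EPOCH) % size


def dedup_last_per_window(rows, size):
    """rows: (rowtime, player_id, points). Keep the latest row in each window."""
    last = {}
    for row in rows:
        start = tumble_start(row[0], size)
        if start not in last or row[0] > last[start][0]:
            last[start] = row
    return [last[start] for start in sorted(last)]


# Hypothetical sample rows.
rows = [
    (datetime(2023, 11, 3, 19, 52, 10), 1051, 371),
    (datetime(2023, 11, 3, 19, 59, 59), 1094, 38),
    (datetime(2023, 11, 3, 20, 3, 0), 1017, 419),
]
kept = dedup_last_per_window(rows, timedelta(minutes=10))
for row in kept:
    print(row)
```

Switching the comparison from `>` to `<` would model `ORDER BY $rowtime ASC`, which keeps the first row of each window instead.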

---

### SQL Window Join Queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/window-join.html

Confluent Cloud for Apache Flink® enables joining data over time windows in dynamic tables.

#### Syntax

The first block under Code Examples below shows the syntax of the INNER/LEFT/RIGHT/FULL OUTER Window Join statement, in which `L` and `R` are relations with a windowing TVF applied.

#### Description

A window join adds the dimension of time to the join criteria. It joins the elements of two streams that share a common key and are in the same window.

For streaming queries, unlike other joins on continuous tables, a window join doesn't emit intermediate results. It emits only final results at the end of the window. Also, a window join purges all intermediate state when it's no longer needed.

Usually, Window Join is used with a Windowing TVF. Window Join can also follow other operations based on Windowing TVFs, like Window Aggregation and Window Top-N.

Window Join requires that the join condition contains equality of the `window_start` columns of the input tables and equality of the `window_end` columns of the input tables.

Window Join supports INNER/LEFT/RIGHT/FULL OUTER/ANTI/SEMI JOIN. The syntax is very similar for all of these joins.

#### Examples

The examples under Code Examples below show window joins over mock data produced by the Datagen Source Connector configured with the Gaming Player Activity quickstart.

Note: To show the behavior of windowing more clearly in the examples, `TIMESTAMP(3)` values may be simplified so that trailing zeroes aren't shown. For example, `2020-04-15 08:05:00.000` may be shown as `2020-04-15 08:05`. Columns may be hidden intentionally to enhance the readability of the content.

##### FULL OUTER JOIN

The `FULL JOIN` example is a Window Join that works on a Tumble Window TVF. When performing a window join, all elements with a common key and a common tumbling window are joined together. Scoping the join to fixed five-minute intervals chops the datasets into two distinct windows of time, which appear in the output as [13:20, 13:25) and [13:25, 13:30). The L2 and R2 rows don't join together because they fall into separate windows.

##### SEMI

A Semi Window Join returns a row from the left side if there is at least one matching row on the right side within the common window. The examples show two forms, one that uses `IN` and one that uses `EXISTS`.

##### ANTI

Anti Window Joins are the obverse of the Inner Window Join: they contain all of the unjoined rows within each common window. The examples show two forms, one that uses `NOT IN` and one that uses `NOT EXISTS`.

#### Limitations

##### Limitation on Join clause

Currently, the window join requires that the join condition contains equality of the window starts of the input tables and equality of the window ends of the input tables. In the future, the join condition could be simplified to include only the window start equality if the windowing TVF is TUMBLE or HOP.

##### Limitation on Windowing TVFs of inputs

Currently, the windowing TVFs must be the same for the left and right inputs. This could be extended in the future, for example, to join tumbling windows with sliding windows of the same window size.

#### Related content

- Top-N Queries
- Window Top-N Queries
- Windowing Table-Valued Functions (Windowing TVFs)

#### Code Examples

```sql
SELECT ...
FROM L [LEFT|RIGHT|FULL OUTER] JOIN R -- L and R are relations applied windowing TVF
ON L.window_start = R.window_start AND L.window_end = R.window_end AND ...
```

```sql
describe LeftTable;
```

```sql
+-------------+--------------+----------+--------+
| Column Name |  Data Type   | Nullable | Extras |
+-------------+--------------+----------+--------+
| row_time    | TIMESTAMP(3) | NULL     |        |
| num         | INT          | NULL     |        |
| id          | STRING       | NULL     |        |
+-------------+--------------+----------+--------+
```

```sql
SELECT * FROM LeftTable;
```

```sql
row_time                num id
2023-11-03 12:22:47.268 1   L1
2023-11-03 12:22:43.189 2   L2
2023-11-03 12:22:47.486 3   L3
```

```sql
describe RightTable;
```

```sql
+-------------+--------------+----------+--------+
| Column Name |  Data Type   | Nullable | Extras |
+-------------+--------------+----------+--------+
| row_time    | TIMESTAMP(3) | NULL     |        |
| num         | INT          | NULL     |        |
| id          | STRING       | NULL     |        |
+-------------+--------------+----------+--------+
```

```sql
SELECT * FROM RightTable;
```

```sql
row_time                num id
2023-11-03 12:23:22.045 2   R2
2023-11-03 12:23:16.437 3   R3
2023-11-03 12:23:18.349 4   R4
```

```sql
SELECT L.num as L_Num, L.id as L_Id, R.num as R_Num, R.id as R_Id,
  COALESCE(L.window_start, R.window_start) as window_start,
  COALESCE(L.window_end, R.window_end) as window_end
  FROM (
    SELECT * FROM TABLE(TUMBLE(TABLE LeftTable, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
  ) L
  FULL JOIN (
    SELECT * FROM TABLE(TUMBLE(TABLE RightTable, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
  ) R
  ON L.num = R.num AND L.window_start = R.window_start AND L.window_end = R.window_end;
```

```sql
L_Num L_Id R_Num R_Id window_start     window_end
1     L1   NULL  NULL 2023-11-03 13:20 2023-11-03 13:25
NULL  NULL 2     R2   2023-11-03 13:20 2023-11-03 13:25
3     L3   3     R3   2023-11-03 13:20 2023-11-03 13:25
2     L2   NULL  NULL 2023-11-03 13:25 2023-11-03 13:30
NULL  NULL 4     R4   2023-11-03 13:25 2023-11-03 13:30
```
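
The output above can be reproduced with a small model of the join: rows match only when both the window and `num` agree, and unmatched rows from either side are padded with NULL. In the Python sketch below, the window labels are hypothetical stand-ins for the assigned `window_start` values, and `full_outer_window_join` is an illustrative helper, not a Flink API. For tumbling windows of equal size, equal starts imply equal ends, so only the start is compared.

```python
def full_outer_window_join(left, right):
    """left/right: lists of (window_start, num, id). Join on (window_start, num)."""
    out = []
    matched_right = set()
    for l_win, l_num, l_id in left:
        hit = False
        for i, (r_win, r_num, r_id) in enumerate(right):
            if l_win == r_win and l_num == r_num:
                out.append((l_num, l_id, r_num, r_id, l_win))
                matched_right.add(i)
                hit = True
        if not hit:  # left row with no partner in its window
            out.append((l_num, l_id, None, None, l_win))
    for i, (r_win, r_num, r_id) in enumerate(right):
        if i not in matched_right:  # right row with no partner in its window
            out.append((None, None, r_num, r_id, r_win))
    return out


left = [("13:20", 1, "L1"), ("13:25", 2, "L2"), ("13:20", 3, "L3")]
right = [("13:20", 2, "R2"), ("13:20", 3, "R3"), ("13:25", 4, "R4")]
joined = full_outer_window_join(left, right)
for row in joined:
    print(row)
```

L2 and R2 share `num = 2` but sit in different windows, so each appears once with NULL on the other side.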

```sql
SELECT *
  FROM (
     SELECT * FROM TABLE(TUMBLE(TABLE LeftTable, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
  ) L WHERE L.num IN (
    SELECT num FROM (
      SELECT * FROM TABLE(TUMBLE(TABLE RightTable, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
    ) R WHERE L.window_start = R.window_start AND L.window_end = R.window_end);
```

```sql
row_time                num id window_start     window_end       window_time
2023-11-03 12:43:57.095 1   L3 2023-11-03 13:40 2023-11-03 13:45 2023-11-03 13:44:59.999
2023-11-03 12:43:54.914 1   L2 2023-11-03 13:40 2023-11-03 13:45 2023-11-03 13:44:59.999
2023-11-03 12:43:56.898 1   L1 2023-11-03 13:40 2023-11-03 13:45 2023-11-03 13:44:59.999
2023-11-03 12:43:59.112 1   L1 2023-11-03 13:40 2023-11-03 13:45 2023-11-03 13:44:59.999
2023-11-03 12:43:59.626 1   L5 2023-11-03 13:40 2023-11-03 13:45 2023-11-03 13:44:59.999
```

```sql
SELECT *
  FROM (
     SELECT * FROM TABLE(TUMBLE(TABLE LeftTable, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
  ) L WHERE EXISTS (
    SELECT * FROM (
      SELECT * FROM TABLE(TUMBLE(TABLE RightTable, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
    ) R WHERE L.num = R.num AND L.window_start = R.window_start AND L.window_end = R.window_end);
```

```sql
row_time                num id  window_start     window_end       window_time
2023-11-03 12:45:08.329 2   L4  2023-11-03 13:45 2023-11-03 13:50 2023-11-03 13:49:59.999
2023-11-03 12:45:06.702 2   L3  2023-11-03 13:45 2023-11-03 13:50 2023-11-03 13:49:59.999
2023-11-03 12:45:07.024 2   L4  2023-11-03 13:45 2023-11-03 13:50 2023-11-03 13:49:59.999
2023-11-03 12:45:05.581 2   L3  2023-11-03 13:45 2023-11-03 13:50 2023-11-03 13:49:59.999
```

```sql
SELECT *
  FROM (
    SELECT * FROM TABLE(TUMBLE(TABLE LeftTable, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
  ) L WHERE L.num NOT IN (
     SELECT num FROM (
       SELECT * FROM TABLE(TUMBLE(TABLE RightTable, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
     ) R WHERE L.window_start = R.window_start AND L.window_end = R.window_end);
```

```sql
row_time                num id window_start     window_end       window_time
2023-11-03 12:23:42.865 1   L1 2023-11-03 13:20 2023-11-03 13:25 2023-11-03 13:24:59.999
2023-11-03 12:23:42.956 1   L5 2023-11-03 13:20 2023-11-03 13:25 2023-11-03 13:24:59.999
2023-11-03 12:23:41.029 2   L1 2023-11-03 13:20 2023-11-03 13:25 2023-11-03 13:24:59.999
2023-11-03 12:23:36.826 1   L1 2023-11-03 13:20 2023-11-03 13:25 2023-11-03 13:24:59.999
2023-11-03 12:23:36.435 1   L4 2023-11-03 13:20 2023-11-03 13:25 2023-11-03 13:24:59.999
```

```sql
SELECT *
  FROM (
    SELECT * FROM TABLE(TUMBLE(TABLE LeftTable, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
  ) L WHERE NOT EXISTS (
    SELECT * FROM (
      SELECT * FROM TABLE(TUMBLE(TABLE RightTable, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
    ) R WHERE L.num = R.num AND L.window_start = R.window_start AND L.window_end = R.window_end);
```

```sql
row_time                num id window_start     window_end       window_time
2023-11-03 12:23:14.693 2   L1 2023-11-03 13:20 2023-11-03 13:25 2023-11-03 13:24:59.999
2023-11-03 12:23:19.174 2   L1 2023-11-03 13:20 2023-11-03 13:25 2023-11-03 13:24:59.999
2023-11-03 12:23:11.035 2   L1 2023-11-03 13:20 2023-11-03 13:25 2023-11-03 13:24:59.999
2023-11-03 12:23:11.764 2   L3 2023-11-03 13:20 2023-11-03 13:25 2023-11-03 13:24:59.999
2023-11-03 12:23:16.240 2   L5 2023-11-03 13:20 2023-11-03 13:25 2023-11-03 13:24:59.999
```
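
The semi and anti forms can be modeled as filters on the left side. The Python sketch below corresponds to the `EXISTS` and `NOT EXISTS` statements above: a left row is kept (semi) or dropped (anti) when some right row has the same window and the same `num`. The data and helper names are hypothetical, and the sketch ignores NULL handling, where `NOT IN` and `NOT EXISTS` can differ in SQL.

```python
def right_keys(right):
    """Set of (window_start, num) pairs present on the right side."""
    return {(window, num) for window, num, _id in right}


def semi_window_join(left, right):
    """Left rows that have at least one match in the same window."""
    keys = right_keys(right)
    return [row for row in left if (row[0], row[1]) in keys]


def anti_window_join(left, right):
    """Left rows that have no match in the same window."""
    keys = right_keys(right)
    return [row for row in left if (row[0], row[1]) not in keys]


left = [("13:20", 1, "L1"), ("13:25", 2, "L2"), ("13:20", 3, "L3")]
right = [("13:20", 2, "R2"), ("13:20", 3, "R3"), ("13:25", 4, "R4")]
semi = semi_window_join(left, right)
anti = anti_window_join(left, right)
print(semi)
print(anti)
```

Together the two results partition the left input: every left row appears in exactly one of them.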

---

### SQL Window Top-N Queries in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/window-topn.html

Window Top-N Queries in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables Window Top-N queries in dynamic tables. Syntax¶ SELECT [column_list] FROM ( SELECT [column_list], ROW_NUMBER() OVER (PARTITION BY window_start, window_end [, col_key1...] ORDER BY col1 [asc|desc][, col2 [asc|desc]...]) AS rownum FROM table_name) -- relation applied windowing TVF WHERE rownum <= N [AND conditions] Description¶ Window Top-N is a special Top-N that returns the N smallest or largest values for each window and other partitioned keys. For streaming queries, unlike regular Top-N on continuous tables, Window Top-N doesn’t emit intermediate results, but only a final result, the total Top N records at the end of the window. Moreover, Window Top-N purges all intermediate state when no longer needed, so Window Top-N queries have better performance if you don’t need results updated per record. Usually, Window Top-N is used with Windowing TVF directly, but Window Top-N can be used with other operations based on Windowing TVF, like Window Aggregation, and Window Join. You can define Window Top-N with the same syntax as regular Top-N. For more information, see Top-N. In addition, Window Top-N requires that the PARTITION BY clause contains window_start and window_end columns of the relation applied by Windowing TVF or Window Aggregation. Otherwise, the optimizer can’t translate the query. Examples¶ The following examples show Window Top-N aggregations over example data streams that you can experiment with. Note To show the behavior of windowing more clearly in the following examples, TIMESTAMP(3) values may be simplified so that trailing zeroes aren’t shown. For example, 2020-04-15 08:05:00.000 may be shown as 2020-04-15 08:05. Columns may be hidden intentionally to enhance the readability of the content. 
Window Top-N follows after Window Aggregation¶ The following example shows how to calculate Top 3 customers who have the highest order value for every tumbling 10 minutes window. DESCRIBE `examples`.`marketplace`.`orders`; +--------------+-----------+----------+---------------+ | Column Name | Data Type | Nullable | Extras | +--------------+-----------+----------+---------------+ | order_id | STRING | NOT NULL | | | customer_id | INT | NOT NULL | | | product_id | STRING | NOT NULL | | | price | DOUBLE | NOT NULL | | +--------------+-----------+----------+---------------+ SELECT * FROM `examples`.`marketplace`.`orders`; order_id customer_id product_id price d770a538-a70c-4de6-9d06-e6c16c5bef5a 3075 1379 32.21 787ee1f4-d0d0-4c39-bdb9-44dc2d203d55 3028 1335 34.74 7ab7ce23-5f61-4398-afad-b1e3f548fee3 3148 1045 69.26 6fea712c-9454-497e-8038-ebaf6dfc7a17 3247 1390 67.26 dc9daf5e-98d5-4bcd-8839-251fed13b75e 3167 1309 12.04 ab3151d0-2950-49cd-9783-016ccc6a3281 3105 1094 21.52 d27ca945-3cff-48a4-afcc-7b17446aa95d 3168 1250 99.95 SELECT * FROM ( SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY price DESC) as rownum FROM ( SELECT window_start, window_end, customer_id, SUM(price) as price, COUNT(*) as cnt FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) GROUP BY window_start, window_end, customer_id ) ) WHERE rownum <= 3; window_start window_end customer_id price cnt rownum 2023-11-02 17:50 2023-11-02 18:00 3084 1523.75 18 1 2023-11-02 17:50 2023-11-02 18:00 3092 1487.32 15 2 2023-11-02 17:50 2023-11-02 18:00 3082 1452.18 17 3 2023-11-02 18:00 2023-11-02 18:10 3095 1698.50 20 1 2023-11-02 18:00 2023-11-02 18:10 3088 1645.23 19 2 2023-11-02 18:00 2023-11-02 18:10 3079 1589.75 16 3 Window Top-N follows after Windowing TVF¶ The following example shows how to calculate Top 3 customers which have the highest order value for every tumbling 10 minutes window. 
`` SELECT * FROM ( SELECT $rowtime, price, product_id, customer_id, window_start, window_end, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY price DESC) as rownum FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) ) WHERE rownum <= 3; ``

| $rowtime | price | product_id | customer_id | window_start | window_end | rownum |
|----------|-------|------------|-------------|--------------|------------|--------|
| 2023-11-05 19:35:38 | 99.53 | 1382 | 3120 | 2023-11-05 19:30 | 2023-11-05 19:40 | 1 |
| 2023-11-05 19:35:39 | 99.04 | 1216 | 3204 | 2023-11-05 19:30 | 2023-11-05 19:40 | 2 |
| 2023-11-05 19:35:32 | 98.95 | 1364 | 3114 | 2023-11-05 19:30 | 2023-11-05 19:40 | 3 |
| 2023-11-05 19:42:41 | 97.75 | 1295 | 3187 | 2023-11-05 19:40 | 2023-11-05 19:50 | 1 |
| 2023-11-05 19:41:53 | 97.30 | 1428 | 3256 | 2023-11-05 19:40 | 2023-11-05 19:50 | 2 |
| 2023-11-05 19:43:17 | 96.80 | 1173 | 3092 | 2023-11-05 19:40 | 2023-11-05 19:50 | 3 |

#### Related content

- Top-N Queries
- Windowing Table-Valued Functions (Windowing TVFs)
- Window Aggregation Queries

Note: This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
SELECT [column_list]
FROM (
   SELECT [column_list],
     ROW_NUMBER() OVER (PARTITION BY window_start, window_end [, col_key1...]
       ORDER BY col1 [asc|desc][, col2 [asc|desc]...]) AS rownum
   FROM table_name) -- relation applied windowing TVF
WHERE rownum <= N [AND conditions]
```

```sql
DESCRIBE `examples`.`marketplace`.`orders`;
```

```sql
+--------------+-----------+----------+---------------+
| Column Name  | Data Type | Nullable |    Extras     |
+--------------+-----------+----------+---------------+
| order_id     | STRING    | NOT NULL |               |
| customer_id  | INT       | NOT NULL |               |
| product_id   | STRING    | NOT NULL |               |
| price        | DOUBLE    | NOT NULL |               |
+--------------+-----------+----------+---------------+
```

```sql
SELECT * FROM `examples`.`marketplace`.`orders`;
```

```sql
order_id                             customer_id  product_id price
d770a538-a70c-4de6-9d06-e6c16c5bef5a 3075         1379       32.21
787ee1f4-d0d0-4c39-bdb9-44dc2d203d55 3028         1335       34.74
7ab7ce23-5f61-4398-afad-b1e3f548fee3 3148         1045       69.26
6fea712c-9454-497e-8038-ebaf6dfc7a17 3247         1390       67.26
dc9daf5e-98d5-4bcd-8839-251fed13b75e 3167         1309       12.04
ab3151d0-2950-49cd-9783-016ccc6a3281 3105         1094       21.52
d27ca945-3cff-48a4-afcc-7b17446aa95d 3168         1250       99.95
```

```sql
SELECT *
  FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY price DESC) as rownum
    FROM (
      SELECT window_start, window_end, customer_id, SUM(price) as price, COUNT(*) as cnt
      FROM TABLE(
        TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))
      GROUP BY window_start, window_end, customer_id
    )
  ) WHERE rownum <= 3;
```

```sql
window_start      window_end       customer_id price   cnt rownum
2023-11-02 17:50  2023-11-02 18:00 3084        1523.75 18  1
2023-11-02 17:50  2023-11-02 18:00 3092        1487.32 15  2
2023-11-02 17:50  2023-11-02 18:00 3082        1452.18 17  3
2023-11-02 18:00  2023-11-02 18:10 3095        1698.50 20  1
2023-11-02 18:00  2023-11-02 18:10 3088        1645.23 19  2
2023-11-02 18:00  2023-11-02 18:10 3079        1589.75 16  3
```

```sql
SELECT *
  FROM (
    SELECT $rowtime, price, product_id, customer_id, window_start, window_end,
      ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY price DESC) as rownum
    FROM TABLE(
      TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))
  ) WHERE rownum <= 3;
```

```sql
$rowtime            price product_id customer_id window_start        window_end          rownum
2023-11-05 19:35:38 99.53 1382       3120        2023-11-05 19:30    2023-11-05 19:40    1
2023-11-05 19:35:39 99.04 1216       3204        2023-11-05 19:30    2023-11-05 19:40    2
2023-11-05 19:35:32 98.95 1364       3114        2023-11-05 19:30    2023-11-05 19:40    3
2023-11-05 19:42:41 97.75 1295       3187        2023-11-05 19:40    2023-11-05 19:50    1
2023-11-05 19:41:53 97.30 1428       3256        2023-11-05 19:40    2023-11-05 19:50    2
2023-11-05 19:43:17 96.80 1173       3092        2023-11-05 19:40    2023-11-05 19:50    3
```

---

### SQL Windowing Table-Valued Functions in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/window-tvf.html

Windowing Table-Valued Functions (Windowing TVFs) in Confluent Cloud for Apache Flink

Confluent Cloud for Apache Flink® provides several window table-valued functions (TVFs) for dividing the elements of a table into windows.

#### Description

Windows are central to processing infinite streams. Windows split the stream into “buckets” of finite size, over which you can apply computations. This document focuses on how windowing is performed in Confluent Cloud for Apache Flink and how you can benefit from windowed functions.

Flink provides several window table-valued functions (TVFs) to divide the elements of your table into windows, including:

- Tumble Windows
- Hop Windows
- Cumulate Windows
- Session Windows (not supported in batch mode)

Note that each element can logically belong to more than one window, depending on the windowing table-valued function you use. For example, HOP windowing creates overlapping windows in which a single element can be assigned to multiple windows.

Windowing TVFs are Flink-defined Polymorphic Table Functions (PTFs). PTFs are part of the SQL:2016 standard. A PTF is a special table function that can take a table as a parameter, which makes it a powerful feature for changing the shape of a table. Because PTFs are used semantically like tables, they are invoked in the FROM clause of a SELECT statement.

The following computations based on windowing TVFs are frequently used:

- Window Aggregation
- Window TopN
- Window Join
- Window Deduplication

#### Window functions

Flink provides four built-in windowing TVFs: TUMBLE, HOP, CUMULATE, and SESSION. The return value of a windowing TVF is a new relation that includes all columns of the original relation, as well as three additional columns named window_start, window_end, and window_time that indicate the assigned window. In streaming mode, the window_time field is a time attribute of the window. In batch mode, the window_time field is an attribute of type TIMESTAMP or TIMESTAMP_LTZ, based on the input time field type.
The window_time field can be used in subsequent time-based operations, for example, another windowing TVF, an interval join, or an over aggregation. The value of window_time is always equal to window_end - 1ms.

#### Window alignment

Time-based window boundaries align with clock seconds, minutes, hours, and days. For example, assume that you have events with these timestamps (in UTC):

- 00:59:00.000
- 00:59:30.000
- 01:00:15.000

If you put these events into hour-long tumbling windows, the first two land in the window for 00:00:00-00:59:59.999, and the third event lands in the following hour.

#### Supported time units

Window TVFs support the following time units:

- SECOND
- MINUTE
- HOUR
- DAY

MONTH and YEAR time units are not currently supported.

#### Examples

The following examples show Window TVFs over example data streams that you can experiment with.

Note: To show the behavior of windowing more clearly in the following examples, TIMESTAMP(3) values may be simplified so that trailing zeroes aren’t shown. For example, 2020-04-15 08:05:00.000 may be shown as 2020-04-15 08:05. Columns may be hidden intentionally to enhance the readability of the content.

##### TUMBLE

The TUMBLE function assigns each element to a window of a specified size. Tumbling windows have a fixed size and do not overlap. For example, if you specify a tumbling window with a size of 5 minutes, Flink evaluates the current window and starts a new window every five minutes, as illustrated by the following figure.

The TUMBLE function assigns a window to each row of a relation based on a time attribute field. In streaming mode, the time attribute field must be an event time attribute. In batch mode, the time attribute field of the window table function must be an attribute of type TIMESTAMP or TIMESTAMP_LTZ.
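As a rough illustration of tumbling-window assignment and alignment, the following plain-Python sketch computes window_start, window_end, and window_time for the three example timestamps from the window alignment discussion. It assumes that boundaries are aligned to the Unix epoch in UTC. The `tumble_window` helper and the calendar date are hypothetical. This models the arithmetic only and is not Flink code.

```python
from datetime import datetime, timedelta, timezone

# Assumption: window boundaries are aligned to the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def tumble_window(ts, size):
    """Return (window_start, window_end, window_time) for the window containing ts."""
    window_start = ts - ((ts - EPOCH) % size)
    window_end = window_start + size                      # exclusive upper bound
    window_time = window_end - timedelta(milliseconds=1)  # window_end - 1ms
    return window_start, window_end, window_time

# The times come from the alignment example; the date is arbitrary.
events = [
    datetime(2023, 11, 2, 0, 59, 0, tzinfo=timezone.utc),
    datetime(2023, 11, 2, 0, 59, 30, tzinfo=timezone.utc),
    datetime(2023, 11, 2, 1, 0, 15, tzinfo=timezone.utc),
]

for event in events:
    start, end, time_attr = tumble_window(event, timedelta(hours=1))
    print(event.time(), "->", start.time(), end.time(), time_attr.time())
```

The first two events map to the window starting at 00:00:00 with a window_time of 00:59:59.999, and the third maps to the window starting at 01:00:00.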
The return value of TUMBLE is a new relation that includes all columns of the original relation, as well as an additional 3 columns named window_start, window_end, and window_time to indicate the assigned window. The original time attribute, timecol is a regular timestamp column after windowing TVF. The TUMBLE function takes three required parameters and one optional parameter: TUMBLE(TABLE data, DESCRIPTOR(timecol), size [, offset ]) data: is a table parameter that can be any relation with a time attribute column. timecol: is a column descriptor indicating which time attributes column of data should be mapped to tumbling windows. size: is a duration specifying the width of the tumbling windows. offset: is an optional parameter to specify the offset which window start would be shifted by. Here is an example invocation on the orders table: DESCRIBE `examples`.`marketplace`.`orders`; The output resembles: +--------------+-----------+----------+---------------+ | Column Name | Data Type | Nullable | Extras | +--------------+-----------+----------+---------------+ | order_id | STRING | NOT NULL | | | customer_id | INT | NOT NULL | | | product_id | STRING | NOT NULL | | | price | DOUBLE | NOT NULL | | +--------------+-----------+----------+---------------+ The following query returns all rows in the orders table. SELECT * FROM `examples`.`marketplace`.`orders`; The output resembles: order_id customer_id product_id price d770a538-a70c-4de6-9d06-e6c16c5bef5a 3075 1379 32.21 787ee1f4-d0d0-4c39-bdb9-44dc2d203d55 3028 1335 34.74 7ab7ce23-5f61-4398-afad-b1e3f548fee3 3148 1045 69.26 6fea712c-9454-497e-8038-ebaf6dfc7a17 3247 1390 67.26 dc9daf5e-98d5-4bcd-8839-251fed13b75e 3167 1309 12.04 ab3151d0-2950-49cd-9783-016ccc6a3281 3105 1094 21.52 d27ca945-3cff-48a4-afcc-7b17446aa95d 3168 1250 99.95 The following queries return all rows in the orders table in 10-minute tumbling windows. 
SELECT * FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) -- or with the named params -- note: the DATA param must be the first SELECT * FROM TABLE( TUMBLE( DATA => TABLE `examples`.`marketplace`.`orders`, TIMECOL => DESCRIPTOR($rowtime), SIZE => INTERVAL '10' MINUTES)); The output resembles: order_id customer_id product_id price $rowtime window_start window_end window_time e69058b5-7ed9-44fa-86ff-4d6f8baff028 3145 1488 63.94 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999 92e81cc4-93c4-488b-9386-ae9300d7cd21 3223 1328 29.37 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999 7ca2ddaa-dd5e-41dc-ac47-c9aa7477d913 3223 1402 49.78 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999 84efa0d0-7157-4cd3-a893-e7d2780cefdd 3076 1321 47.38 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999 d72a37d2-ef15-4740-8ae8-1199ddf84ea9 3211 1234 56.27 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999 4d57c754-63e1-413a-8af8-768d54d128ee 3126 1223 21.52 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999 80f9fe0b-3e5d-4c25-aa6e-0b3dacfa36de 3087 1393 70.26 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999 ea733533-1516-41b6-b5e3-cadcb6f71529 3079 1488 17.55 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999 cef1cd9f-379e-4791-8a0d-69eec8adae35 3211 1293 91.20 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999 The following query computes the sum of the price column in the orders table within 10-minute tumbling windows. 
-- apply aggregation on the tumbling windowed table SELECT window_start, window_end, SUM(price) as `sum` FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) GROUP BY window_start, window_end; The output resembles: window_start window_end sum 2023-11-02 10:40:00 2023-11-02 10:50:00 258484.93 2023-11-02 10:50:00 2023-11-02 11:00:00 287632.15 2023-11-02 11:00:00 2023-11-02 11:10:00 271945.78 2023-11-02 11:10:00 2023-11-02 11:20:00 315207.46 2023-11-02 11:20:00 2023-11-02 11:30:00 342618.92 2023-11-02 11:30:00 2023-11-02 11:40:00 329754.31 HOP¶ The HOP function assigns elements to windows of fixed length. Like a TUMBLE windowing function, the size of the windows is configured by the window size parameter. An additional window slide parameter controls how frequently a hopping window is started. Hence, hopping windows can be overlapping if the slide is smaller than the window size. In this case, elements are assigned to multiple windows. Hopping windows are also known as “sliding windows”. For example, you could have windows of size 10 minutes that slides by 5 minutes. With this, you get every 5 minutes a window that contains the events that arrived during the last 10 minutes, as depicted by the following figure. The HOP function assigns windows that cover rows within the interval of size and shifting every slide based on a time attribute field. In streaming mode, the time attribute field must be an event time attribute. In batch mode, the time attribute field of window table function must be an attribute of type TIMESTAMP or TIMESTAMP_LTZ. The return value of HOP is a new relation that includes all columns of the original relation as well as an additional 3 columns named window_start, window_end, and window_time to indicate the assigned window. The original time attribute, timecol, is a regular timestamp column after windowing TVF. 
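To see why a single row can appear in more than one hopping window, the following plain-Python sketch lists every window that contains a given timestamp. It assumes epoch-aligned window starts in UTC. The `hop_windows` helper is hypothetical, and the code models the assignment arithmetic only, not Flink's implementation.

```python
from datetime import datetime, timedelta, timezone

# Assumption: window starts are aligned to the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def hop_windows(ts, slide, size):
    """Return every half-open (window_start, window_end) hopping window containing ts."""
    # Latest window start that is at or before ts.
    start = ts - ((ts - EPOCH) % slide)
    windows = []
    # A window [start, start + size) contains ts as long as start > ts - size.
    while start > ts - size:
        windows.append((start, start + size))
        start -= slide
    return windows

ts = datetime(2023, 11, 2, 19, 24, 46, tzinfo=timezone.utc)
for start, end in hop_windows(ts, slide=timedelta(minutes=5), size=timedelta(minutes=10)):
    print(start.time(), end.time())
```

With a 5-minute slide and a 10-minute size, a row stamped 19:24:46 belongs to the windows starting at 19:20:00 and 19:15:00. In general, each row is assigned to size / slide windows when size is a multiple of slide.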
The HOP function takes four required parameters and one optional parameter:

`HOP(TABLE data, DESCRIPTOR(timecol), slide, size [, offset ])`

- data: a table parameter that can be any relation with a time attribute column.
- timecol: a column descriptor indicating which time attribute column of data should be mapped to hopping windows.
- slide: a duration specifying the interval between the starts of sequential hopping windows.
- size: a duration specifying the width of the hopping windows.
- offset: an optional parameter specifying the offset by which the window start is shifted.

The following queries return all rows in the orders table in hopping windows with a 5-minute slide and 10-minute size.

`` SELECT * FROM TABLE( HOP(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES, INTERVAL '10' MINUTES)); ``

Or, with named parameters (the DATA parameter must come first):

`` SELECT * FROM TABLE( HOP( DATA => TABLE `examples`.`marketplace`.`orders`, TIMECOL => DESCRIPTOR($rowtime), SLIDE => INTERVAL '5' MINUTES, SIZE => INTERVAL '10' MINUTES)); ``

The output resembles:

order_id customer_id product_id price $rowtime window_start window_end window_time
10ae1386-496e-4c6c-9436-7f7e2e7a59f9 3160 1015 26.20 2023-11-02 19:24:46 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
10ae1386-496e-4c6c-9436-7f7e2e7a59f9 3160 1015 26.20 2023-11-02 19:24:46 2023-11-02 19:15:00 2023-11-02 19:25:00 2023-11-02 19:24:59.999
66ecb3b3-7a3d-43ac-b3a2-4c35e06a8d7c 3046 1081 20.24 2023-11-02 19:24:46 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
66ecb3b3-7a3d-43ac-b3a2-4c35e06a8d7c 3046 1081 20.24 2023-11-02 19:24:46 2023-11-02 19:15:00 2023-11-02 19:25:00 2023-11-02 19:24:59.999
4d86db03-a573-4fc2-9699-85455331a7c4 3023 1346 85.45 2023-11-02 19:24:46 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
4d86db03-a573-4fc2-9699-85455331a7c4 3023 1346 85.45 2023-11-02 19:24:46 2023-11-02 19:15:00 2023-11-02 19:25:00 2023-11-02
19:24:59.999 d1460cf7-9472-45e0-9c2d-40537c9f34c0 3114 1333 49.56 2023-11-02 19:24:47 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999 d1460cf7-9472-45e0-9c2d-40537c9f34c0 3114 1333 49.56 2023-11-02 19:24:47 2023-11-02 19:15:00 2023-11-02 19:25:00 2023-11-02 19:24:59.999 e38984d8-5683-4e55-9f7a-e43350de7c3d 3024 1402 90.75 2023-11-02 19:24:47 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999 e38984d8-5683-4e55-9f7a-e43350de7c3d 3024 1402 90.75 2023-11-02 19:24:47 2023-11-02 19:15:00 2023-11-02 19:25:00 2023-11-02 19:24:59.999 The following query computes the sum of the price column in the orders table within hopping windows that have a 5-minute slide and 10-minute size. -- apply aggregation on the hopping windowed table SELECT window_start, window_end, SUM(price) as `sum` FROM TABLE( HOP(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES, INTERVAL '10' MINUTES)) GROUP BY window_start, window_end; The output resembles: window_start window_end sum 2023-11-02 11:10:00 2023-11-02 11:20:00 296049.38 2023-11-02 11:15:00 2023-11-02 11:25:00 1122455.07 2023-11-02 11:20:00 2023-11-02 11:30:00 1648270.20 2023-11-02 11:25:00 2023-11-02 11:35:00 2143271.00 2023-11-02 11:30:00 2023-11-02 11:40:00 2701592.45 2023-11-02 11:35:00 2023-11-02 11:45:00 3214376.78 CUMULATE¶ Cumulating windows are useful in some scenarios, such as tumbling windows with early firing in a fixed window interval. For example, a daily dashboard might display cumulative unique views (UVs) from 00:00 to every minute, and the UV at 10:00 might represent the total number of UVs from 00:00 to 10:00. This can be implemented easily and efficiently by CUMULATE windowing. The CUMULATE function assigns elements to windows that cover rows within an initial interval of a specified step size, and it expands by one more step size, keeping the window start fixed, for every step, until the maximum window size is reached. 
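The way cumulating windows grow can be sketched in plain Python. The following model lists every cumulating window that contains a given timestamp, assuming epoch-aligned window starts in UTC and a maximum size that is an integral multiple of the step. The `cumulate_windows` helper is hypothetical and shows the arithmetic only, not Flink's implementation.

```python
from datetime import datetime, timedelta, timezone

# Assumption: window starts are aligned to the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def cumulate_windows(ts, step, size):
    """Return every half-open (window_start, window_end) cumulating window containing ts.

    Assumes size is an integral multiple of step.
    """
    # All windows in one cycle share the same start.
    window_start = ts - ((ts - EPOCH) % size)
    # End of the shortest window in the cycle that already contains ts.
    end = window_start + ((ts - window_start) // step + 1) * step
    windows = []
    while end <= window_start + size:
        windows.append((window_start, end))
        end += step
    return windows

ts = datetime(2023, 11, 2, 19, 27, 39, tzinfo=timezone.utc)
for start, end in cumulate_windows(ts, step=timedelta(minutes=2), size=timedelta(minutes=10)):
    print(start.time(), end.time())
```

With a 2-minute step and a 10-minute maximum size, a row stamped 19:27:39 falls into the windows [19:20, 19:28) and [19:20, 19:30). A row that arrives earlier in the cycle falls into more windows, because every later, longer window in the cycle also contains it.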
All CUMULATE windows in a cycle share the same window start, and each successive window is one step longer than the previous one, until the maximum size is reached. As a result, the window size keeps changing, and the windows overlap. When the maximum size is reached, the window start advances to the end of the last window, and the size resets to the step size. In comparison, TUMBLE windows all have the same fixed size and do not overlap.

For example, with a cumulating window that has a 1-hour step and a 1-day maximum size, you get these windows for every day:

- [00:00, 01:00)
- [00:00, 02:00)
- [00:00, 03:00)
- …
- [00:00, 24:00)

The CUMULATE function assigns windows based on a time attribute column. In streaming mode, the time attribute field must be an event time attribute. In batch mode, the time attribute field of the window table function must be an attribute of type TIMESTAMP or TIMESTAMP_LTZ.

The return value of CUMULATE is a new relation that includes all columns of the original relation, as well as three additional columns named window_start, window_end, and window_time that indicate the assigned window. The original time attribute, timecol, is a regular timestamp column after the windowing TVF.

The CUMULATE function takes four required parameters and one optional parameter:

`CUMULATE(TABLE data, DESCRIPTOR(timecol), step, size [, offset ])`

- data: a table parameter that can be any relation with a time attribute column.
- timecol: a column descriptor indicating which time attribute column of data should be mapped to cumulating windows.
- step: a duration specifying the increase in window size between the ends of sequential cumulating windows.
- size: a duration specifying the maximum width of the cumulating windows. size must be an integral multiple of step.
- offset: an optional parameter specifying the offset by which the window start is shifted.

The following queries return all rows in the orders table in CUMULATE windows that have a 2-minute step and 10-minute size.
SELECT * FROM TABLE( CUMULATE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '2' MINUTES, INTERVAL '10' MINUTES)); -- or with the named params -- note: the DATA param must be the first SELECT * FROM TABLE( CUMULATE( DATA => TABLE `examples`.`marketplace`.`orders`, TIMECOL => DESCRIPTOR($rowtime), STEP => INTERVAL '2' MINUTES, SIZE => INTERVAL '10' MINUTES)); The output resembles: order_id customer_id product_id price $rowtime window_start window_end window_time 2572a2e0-2ba2-4947-8926-e70e31b68df3 3239 1015 13.59 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:28:00 2023-11-02 19:27:59.999 2572a2e0-2ba2-4947-8926-e70e31b68df3 3239 1015 13.59 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999 7f791e40-a524-4a9b-bb0d-35a2c1b5a7c4 3102 1374 93.59 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:28:00 2023-11-02 19:27:59.999 7f791e40-a524-4a9b-bb0d-35a2c1b5a7c4 3102 1374 93.59 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999 47e70310-8fa4-4568-b521-7e2b68b06634 3026 1142 58.26 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:28:00 2023-11-02 19:27:59.999 47e70310-8fa4-4568-b521-7e2b68b06634 3026 1142 58.26 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999 fe1b440e-dc75-4092-be11-8e1c3afe55c7 3106 1057 11.37 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:28:00 2023-11-02 19:27:59.999 fe1b440e-dc75-4092-be11-8e1c3afe55c7 3106 1057 11.37 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999 6668e4dc-d574-44db-8f0f-2b8e1b1f3c2e 3061 1049 26.20 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:28:00 2023-11-02 19:27:59.999 6668e4dc-d574-44db-8f0f-2b8e1b1f3c2e 3061 1049 26.20 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999 The following query computes the sum of the price column in the orders table within CUMULATE windows that have a 2-minute step and 
10-minute size. -- apply aggregation on the cumulating windowed table SELECT window_start, window_end, SUM(price) as `sum` FROM TABLE( CUMULATE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '2' MINUTES, INTERVAL '10' MINUTES)) GROUP BY window_start, window_end; The output resembles: window_start window_end sum 2023-11-02 12:40:00.000 2023-11-02 12:46:00.000 327376.23 2023-11-02 12:40:00.000 2023-11-02 12:48:00.000 661272.70 2023-11-02 12:40:00.000 2023-11-02 12:50:00.000 989294.13 2023-11-02 12:50:00.000 2023-11-02 12:52:00.000 1316596.58 2023-11-02 12:50:00.000 2023-11-02 12:54:00.000 1648097.20 2023-11-02 12:50:00.000 2023-11-02 12:56:00.000 1977881.53 2023-11-02 12:50:00.000 2023-11-02 12:58:00.000 2304080.32 2023-11-02 12:50:00.000 2023-11-02 13:00:00.000 2636795.56 SESSION¶ The SESSION function groups elements by sessions of activity. Unlike TUMBLE and HOP windows, session windows do not overlap and do not have a fixed start and end time. Instead, a session window closes when it doesn’t receive elements for a certain period of time, that is, when a gap of inactivity occurs. A session window is configured with a static session gap that defines the duration of inactivity. When this period expires, the current session closes and subsequent elements are assigned to a new session window. For example, you could have windows with a gap of 1 minute. With this configuration, when the interval between two events is less than 1 minute, these events are grouped into the same session window. If there is no data for 1 minute following the latest event, then this session window closes and is sent downstream. Subsequent events are assigned to a new session window. The SESSION function assigns windows that cover rows based on a time attribute. In streaming mode, the time attribute field must be an event time attribute. SESSION Window TVF is not supported in batch mode. 
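The gap-based grouping can be modeled in plain Python as follows. This sketch covers a single partition key and treats two events as belonging to the same session only when they are strictly less than the gap apart, as the description above states. How an interval exactly equal to the gap is handled is an assumption here. The `session_windows` helper is hypothetical, and the last timestamp is invented to show a second session.

```python
from datetime import datetime, timedelta

def session_windows(timestamps, gap):
    """Group event timestamps for one key into (window_start, window_end, events) sessions."""
    sessions = []  # each entry is the list of event timestamps in one session
    for ts in sorted(timestamps):
        # Assumption: a new session starts when the distance to the previous event is >= gap.
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    # A session ends one gap after its last event.
    return [(events[0], events[-1] + gap, events) for events in sessions]

events = [
    datetime(2023, 11, 2, 19, 43, 58, 405000),
    datetime(2023, 11, 2, 19, 44, 0, 365000),
    datetime(2023, 11, 2, 19, 44, 7, 925000),
    datetime(2023, 11, 2, 19, 46, 30),  # hypothetical later event, more than 1 minute after the previous one
]

for start, end, members in session_windows(events, gap=timedelta(minutes=1)):
    print(start.time(), end.time(), len(members))
```

The first three events form one session whose end is the last event time plus the gap (19:45:07.925), and the later event opens a new session. In a partitioned SESSION window, this grouping is applied independently for each key.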
The return value of SESSION is a new relation that includes all columns of the original relation, as well as three additional columns named window_start, window_end, and window_time to indicate the assigned window. The original time attribute timecol becomes a regular timestamp column after the windowing TVF. The SESSION function takes three required parameters and one optional parameter: SESSION(TABLE data [PARTITION BY(keycols, ...)], DESCRIPTOR(timecol), gap) data: is a table parameter that can be any relation with a time attribute column. keycols: is a column or set of columns indicating which columns should be used to partition the data prior to session windows. timecol: is a column descriptor indicating which time attribute column of data should be mapped to session windows. gap: is the maximum interval in timestamp for two events to be considered part of the same session window. The following query returns all columns from the orders table within SESSION windows that have a 1-minute gap, partitioned by product_id: SELECT * FROM TABLE( SESSION(TABLE `examples`.`marketplace`.`orders` PARTITION BY product_id, DESCRIPTOR($rowtime), INTERVAL '1' MINUTES)); -- or with the named params -- note: the DATA param must be the first SELECT * FROM TABLE( SESSION( DATA => TABLE `examples`.`marketplace`.`orders` PARTITION BY product_id, TIMECOL => DESCRIPTOR($rowtime), GAP => INTERVAL '1' MINUTES)); The output resembles: order_id customer_id product_id price $rowtime window_start window_end window_time d7ef1f9a-4f5f-406e-bbad-25db521c38bf 3068 1234 17.08 2023-11-02T19:43:58.626Z 2023-11-02 21:43:58.626 2023-11-02 21:44:58.626 2023-11-02T19:44:58.625Z 804f0c86-a59a-4425-a293-b28bafaa9674 3071 1332 48.12 2023-11-02T19:44:00.506Z 2023-11-02 21:44:00.506 2023-11-02 21:45:00.506 2023-11-02T19:45:00.505Z 61ea63e3-f040-4501-b78e-8db1fdcf45fc 3179 1267 12.35 2023-11-02T19:43:58.405Z 2023-11-02 21:43:58.405 2023-11-02 21:45:07.925 2023-11-02T19:45:07.924Z 
b70ba5bc-428c-41d7-b8fc-8014dd3fd429 3234 1267 40.81 2023-11-02T19:44:00.365Z 2023-11-02 21:43:58.405 2023-11-02 21:45:07.925 2023-11-02T19:45:07.924Z
37688f8c-65ee-4e27-a567-4890e6c7663b 3179 1267 98.17 2023-11-02T19:44:07.925Z 2023-11-02 21:43:58.405 2023-11-02 21:45:07.925 2023-11-02T19:45:07.924Z
4cfa0cc6-881a-43b3-bb34-1746c3b93094 3077 1047 16.78 2023-11-02T19:44:01.985Z 2023-11-02 21:44:01.985 2023-11-02 21:45:23.285 2023-11-02T19:45:23.284Z
e007ce6e-5a76-4390-8fb3-50f46025b965 3095 1047 77.48 2023-11-02T19:44:11.365Z 2023-11-02 21:44:01.985 2023-11-02 21:45:23.285 2023-11-02T19:45:23.284Z
487a0248-a534-489e-bbc5-733e87d19cc7 3200 1047 47.86 2023-11-02T19:44:23.285Z 2023-11-02 21:44:01.985 2023-11-02 21:45:23.285 2023-11-02T19:45:23.284Z
4dd1ab51-8ca4-4de6-9f79-bb2ad7ab2498 3043 1235 36.5 2023-11-02T19:43:57.785Z 2023-11-02 21:43:57.785 2023-11-02 21:45:24.625 2023-11-02T19:45:24.624Z
bb524ec6-1b21-40f1-8c54-3aac7b454c5b 3232 1235 36.98 2023-11-02T19:44:07.265Z 2023-11-02 21:43:57.785 2023-11-02 21:45:24.625 2023-11-02T19:45:24.624Z
9c218c8a-1566-4982-9640-a0deb9ac203c 3065 1235 30.17 2023-11-02T19:44:16.966Z 2023-11-02 21:43:57.785 2023-11-02 21:45:24.625 2023-11-02T19:45:24.624Z
6623c41b-04fa-4df0-a312-45b6dfcdc639 3143 1235 12.2 2023-11-02T19:44:24.625Z 2023-11-02 21:43:57.785 2023-11-02 21:45:24.625 2023-11-02T19:45:24.624Z

The following query computes the sum of the price column in the orders table within SESSION windows that have a 1-minute gap, partitioned by customer_id.
`` SELECT window_start, window_end, customer_id, SUM(price) as `sum` FROM TABLE( SESSION(TABLE `examples`.`marketplace`.`orders` PARTITION BY customer_id, DESCRIPTOR($rowtime), INTERVAL '1' MINUTES)) GROUP BY window_start, window_end, customer_id; ``

The output resembles:

| window_start | window_end | sum |
|--------------|------------|-----|
| 2023-11-02 12:40:00 | 2023-11-02 12:46:00 | 327376.23 |
| 2023-11-02 12:40:00 | 2023-11-02 12:48:00 | 661272.70 |
| 2023-11-02 12:40:00 | 2023-11-02 12:50:00 | 989294.13 |
| 2023-11-02 12:50:00 | 2023-11-02 12:52:00 | 1316596.58 |
| 2023-11-02 12:50:00 | 2023-11-02 12:54:00 | 1648097.20 |
| 2023-11-02 12:50:00 | 2023-11-02 12:56:00 | 1977881.53 |
| 2023-11-02 12:50:00 | 2023-11-02 12:58:00 | 2304080.32 |
| 2023-11-02 12:50:00 | 2023-11-02 13:00:00 | 2636795.56 |

#### Window Offset

Offset is an optional parameter that you can use to change the window assignment. It can be a positive or a negative duration. The default value for a window offset is 0. The same record may be assigned to a different window if the offset is set to a different value.

For example, to which window is a record with a timestamp of 2021-06-30 00:00:00 assigned, for a tumbling window with a size of 10 MINUTE?

- If the offset is -16 MINUTE, the record is assigned to window [2021-06-29 23:54:00, 2021-06-30 00:04:00).
- If the offset is -6 MINUTE, the record is assigned to window [2021-06-29 23:54:00, 2021-06-30 00:04:00).
- If the offset is -4 MINUTE, the record is assigned to window [2021-06-29 23:56:00, 2021-06-30 00:06:00).
- If the offset is 0, the record is assigned to window [2021-06-30 00:00:00, 2021-06-30 00:10:00).
- If the offset is 4 MINUTE, the record is assigned to window [2021-06-29 23:54:00, 2021-06-30 00:04:00).
- If the offset is 6 MINUTE, the record is assigned to window [2021-06-29 23:56:00, 2021-06-30 00:06:00).
- If the offset is 16 MINUTE, the record is assigned to window [2021-06-29 23:56:00, 2021-06-30 00:06:00).

Because windows repeat every 10 minutes, offsets that differ by a multiple of the window size, such as -16, -6, and 4 minutes, produce the same assignment.

Note: The window offset affects only window assignment. It has no effect on watermarks.
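The offset arithmetic can be checked with a short plain-Python sketch. It assumes that window starts are aligned to the Unix epoch in UTC and then shifted by the offset, which corresponds to the modular arithmetic used by open-source Apache Flink. The `tumble_window_with_offset` helper is hypothetical and is not part of any Flink API.

```python
from datetime import datetime, timedelta, timezone

# Assumption: unshifted window starts are aligned to the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def tumble_window_with_offset(ts, size, offset):
    """Return the half-open (window_start, window_end) tumbling window containing ts."""
    # Python's % on timedelta is never negative for a positive divisor,
    # so negative offsets need no special handling.
    start = ts - ((ts - EPOCH - offset) % size)
    return start, start + size

ts = datetime(2021, 6, 30, 0, 0, 0, tzinfo=timezone.utc)
size = timedelta(minutes=10)

for minutes in (-16, -6, -4, 0, 4, 6, 16):
    start, end = tumble_window_with_offset(ts, size, timedelta(minutes=minutes))
    print(f"offset {minutes:>3} min -> [{start:%Y-%m-%d %H:%M:%S}, {end:%Y-%m-%d %H:%M:%S})")
```

Only the offset modulo the window size matters, so an offset of 16 minutes selects the same window as an offset of 6 minutes, and the assigned window always contains the record's timestamp.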
Examples¶ The following SQL examples show how to use offset in a tumbling window. SELECT * FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES, INTERVAL '1' MINUTES)); -- or with the named params -- note: the DATA param must be the first SELECT * FROM TABLE( TUMBLE( DATA => TABLE `examples`.`marketplace`.`orders`, TIMECOL => DESCRIPTOR($rowtime), SIZE => INTERVAL '10' MINUTES, OFFSET => INTERVAL '1' MINUTES)); The output resembles: order_id customer_id product_id price $rowtime window_start window_end window_time 0932497b-a3c2-4f80-9b1f-9d099b091696 3063 1035 75.85 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999 20f4529c-9c86-4a54-8c38-f6c3caa1d7b8 3131 1207 89.00 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999 cbda6c08-e0c7-41cb-ae04-c50f5b1f5e3c 3074 1312 63.71 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999 d049ed28-cbbb-479b-8df6-8c637c1b68f5 3006 1201 72.14 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999 63b6f2ef-c0e9-4737-ab81-f5acb93e4a64 3182 1346 76.18 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999 00c088db-9cb7-4128-a4fd-4e06c0e95f7a 3198 1166 63.49 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999 b9ca292e-635a-4ef7-a6ee-bcf099df7c1b 3236 1462 69.13 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999 3299fd08-264e-4e49-8bb9-82cae18c5d7c 3058 1226 59.53 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999 45878388-7cb3-409d-91a4-8ef1f02c8576 3028 1228 16.63 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999 c2fef024-c0c2-4c0f-9880-bc423d1c2db6 3219 1071 80.66 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999 The following query computes the sum of the price column 
in the orders table within 10-minute tumbling windows that have an offset of 1 minute. -- apply aggregation on the tumbling windowed table SELECT window_start, window_end, SUM(price) as `sum` FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES, INTERVAL '1' MINUTES)) GROUP BY window_start, window_end; The output resembles: window_start window_end sum 2023-11-02 19:21:00 2023-11-02 19:31:00 7285.64 2023-11-02 19:22:00 2023-11-02 19:32:00 6932.18 2023-11-02 19:23:00 2023-11-02 19:33:00 7104.53 2023-11-02 19:24:00 2023-11-02 19:34:00 7456.92 2023-11-02 19:25:00 2023-11-02 19:35:00 7198.75 2023-11-02 19:26:00 2023-11-02 19:36:00 6875.39 2023-11-02 19:27:00 2023-11-02 19:37:00 7312.87 2023-11-02 19:28:00 2023-11-02 19:38:00 7089.26 2023-11-02 19:29:00 2023-11-02 19:39:00 7401.58 2023-11-02 19:30:00 2023-11-02 19:40:00 7156.43 Related content¶ Course: Window Aggregations Confluent Developer: How to create cumulating windows Top-N Queries Window Top-N Queries Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
TUMBLE(TABLE data, DESCRIPTOR(timecol), size [, offset ])
```

```sql
DESCRIBE `examples`.`marketplace`.`orders`;
```

```sql
+--------------+-----------+----------+---------------+
| Column Name  | Data Type | Nullable |    Extras     |
+--------------+-----------+----------+---------------+
| order_id     | STRING    | NOT NULL |               |
| customer_id  | INT       | NOT NULL |               |
| product_id   | STRING    | NOT NULL |               |
| price        | DOUBLE    | NOT NULL |               |
+--------------+-----------+----------+---------------+
```

```sql
SELECT * FROM `examples`.`marketplace`.`orders`;
```

```sql
order_id                             customer_id  product_id price
d770a538-a70c-4de6-9d06-e6c16c5bef5a 3075         1379       32.21
787ee1f4-d0d0-4c39-bdb9-44dc2d203d55 3028         1335       34.74
7ab7ce23-5f61-4398-afad-b1e3f548fee3 3148         1045       69.26
6fea712c-9454-497e-8038-ebaf6dfc7a17 3247         1390       67.26
dc9daf5e-98d5-4bcd-8839-251fed13b75e 3167         1309       12.04
ab3151d0-2950-49cd-9783-016ccc6a3281 3105         1094       21.52
d27ca945-3cff-48a4-afcc-7b17446aa95d 3168         1250       99.95
```

```sql
SELECT * FROM TABLE(
   TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES));

-- or with named parameters
-- note: the DATA parameter must be first
SELECT * FROM TABLE(
   TUMBLE(
     DATA => TABLE `examples`.`marketplace`.`orders`,
     TIMECOL => DESCRIPTOR($rowtime),
     SIZE => INTERVAL '10' MINUTES));
```

```sql
order_id                             customer_id product_id price $rowtime            window_start        window_end          window_time
e69058b5-7ed9-44fa-86ff-4d6f8baff028 3145        1488       63.94 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999
92e81cc4-93c4-488b-9386-ae9300d7cd21 3223        1328       29.37 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999
7ca2ddaa-dd5e-41dc-ac47-c9aa7477d913 3223        1402       49.78 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999
84efa0d0-7157-4cd3-a893-e7d2780cefdd 3076        1321       47.38 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999
d72a37d2-ef15-4740-8ae8-1199ddf84ea9 3211        1234       56.27 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999
4d57c754-63e1-413a-8af8-768d54d128ee 3126        1223       21.52 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999
80f9fe0b-3e5d-4c25-aa6e-0b3dacfa36de 3087        1393       70.26 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999
ea733533-1516-41b6-b5e3-cadcb6f71529 3079        1488       17.55 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999
cef1cd9f-379e-4791-8a0d-69eec8adae35 3211        1293       91.20 2023-11-02 13:20:27 2023-11-02 13:20:00 2023-11-02 13:30:00 2023-11-02 13:29:59.999
```

```sql
-- apply aggregation on the tumbling windowed table
SELECT window_start, window_end, SUM(price) as `sum`
  FROM TABLE(
    TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))
  GROUP BY window_start, window_end;
```

```sql
window_start        window_end          sum
2023-11-02 10:40:00 2023-11-02 10:50:00 258484.93
2023-11-02 10:50:00 2023-11-02 11:00:00 287632.15
2023-11-02 11:00:00 2023-11-02 11:10:00 271945.78
2023-11-02 11:10:00 2023-11-02 11:20:00 315207.46
2023-11-02 11:20:00 2023-11-02 11:30:00 342618.92
2023-11-02 11:30:00 2023-11-02 11:40:00 329754.31
```
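
Window aggregations can also group by additional columns alongside `window_start` and `window_end`. The following is a minimal sketch against the same `orders` example table; the output aliases `order_count` and `revenue` are illustrative names, not part of the original examples.

```sql
-- Sketch: per-product order count and revenue in 10-minute tumbling windows.
-- order_count and revenue are illustrative aliases chosen for this example.
SELECT
  window_start,
  window_end,
  product_id,
  COUNT(*)   AS order_count,
  SUM(price) AS revenue
FROM TABLE(
  TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))
GROUP BY window_start, window_end, product_id;
```

Each result row then describes one product within one window, rather than one row per window.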

```sql
HOP(TABLE data, DESCRIPTOR(timecol), slide, size [, offset ])
```

```sql
SELECT * FROM TABLE(
    HOP(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES, INTERVAL '10' MINUTES));

-- or with named parameters
-- note: the DATA parameter must be first
SELECT * FROM TABLE(
    HOP(
      DATA => TABLE `examples`.`marketplace`.`orders`,
      TIMECOL => DESCRIPTOR($rowtime),
      SLIDE => INTERVAL '5' MINUTES,
      SIZE => INTERVAL '10' MINUTES));
```

```sql
order_id                             customer_id product_id price $rowtime            window_start        window_end          window_time
10ae1386-496e-4c6c-9436-7f7e2e7a59f9 3160        1015       26.20 2023-11-02 19:24:46 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
10ae1386-496e-4c6c-9436-7f7e2e7a59f9 3160        1015       26.20 2023-11-02 19:24:46 2023-11-02 19:15:00 2023-11-02 19:25:00 2023-11-02 19:24:59.999
66ecb3b3-7a3d-43ac-b3a2-4c35e06a8d7c 3046        1081       20.24 2023-11-02 19:24:46 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
66ecb3b3-7a3d-43ac-b3a2-4c35e06a8d7c 3046        1081       20.24 2023-11-02 19:24:46 2023-11-02 19:15:00 2023-11-02 19:25:00 2023-11-02 19:24:59.999
4d86db03-a573-4fc2-9699-85455331a7c4 3023        1346       85.45 2023-11-02 19:24:46 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
4d86db03-a573-4fc2-9699-85455331a7c4 3023        1346       85.45 2023-11-02 19:24:46 2023-11-02 19:15:00 2023-11-02 19:25:00 2023-11-02 19:24:59.999
d1460cf7-9472-45e0-9c2d-40537c9f34c0 3114        1333       49.56 2023-11-02 19:24:47 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
d1460cf7-9472-45e0-9c2d-40537c9f34c0 3114        1333       49.56 2023-11-02 19:24:47 2023-11-02 19:15:00 2023-11-02 19:25:00 2023-11-02 19:24:59.999
e38984d8-5683-4e55-9f7a-e43350de7c3d 3024        1402       90.75 2023-11-02 19:24:47 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
e38984d8-5683-4e55-9f7a-e43350de7c3d 3024        1402       90.75 2023-11-02 19:24:47 2023-11-02 19:15:00 2023-11-02 19:25:00 2023-11-02 19:24:59.999
```

```sql
-- apply aggregation on the hopping windowed table
SELECT window_start, window_end, SUM(price) as `sum`
  FROM TABLE(
    HOP(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES, INTERVAL '10' MINUTES))
  GROUP BY window_start, window_end;
```

```sql
window_start        window_end          sum
2023-11-02 11:10:00 2023-11-02 11:20:00 296049.38
2023-11-02 11:15:00 2023-11-02 11:25:00 1122455.07
2023-11-02 11:20:00 2023-11-02 11:30:00 1648270.20
2023-11-02 11:25:00 2023-11-02 11:35:00 2143271.00
2023-11-02 11:30:00 2023-11-02 11:40:00 2701592.45
2023-11-02 11:35:00 2023-11-02 11:45:00 3214376.78
```
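
Because the slide (5 minutes) is half the size (10 minutes), every order is assigned to two overlapping windows, as the duplicated rows in the HOP output show. The sketch below uses the named-parameter form to count distinct customers per hopping window; the alias `active_customers` is an illustrative name.

```sql
-- Sketch: distinct customers seen in each 10-minute window, evaluated every 5 minutes.
-- A customer who orders once is counted in both windows that cover the order's $rowtime.
SELECT
  window_start,
  window_end,
  COUNT(DISTINCT customer_id) AS active_customers
FROM TABLE(
  HOP(
    DATA => TABLE `examples`.`marketplace`.`orders`,
    TIMECOL => DESCRIPTOR($rowtime),
    SLIDE => INTERVAL '5' MINUTES,
    SIZE => INTERVAL '10' MINUTES))
GROUP BY window_start, window_end;
```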

```sql
CUMULATE(TABLE data, DESCRIPTOR(timecol), step, size)
```

```sql
SELECT * FROM TABLE(
    CUMULATE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '2' MINUTES, INTERVAL '10' MINUTES));

-- or with named parameters
-- note: the DATA parameter must be first
SELECT * FROM TABLE(
    CUMULATE(
      DATA => TABLE `examples`.`marketplace`.`orders`,
      TIMECOL => DESCRIPTOR($rowtime),
      STEP => INTERVAL '2' MINUTES,
      SIZE => INTERVAL '10' MINUTES));
```

```sql
order_id                             customer_id product_id price $rowtime            window_start        window_end          window_time
2572a2e0-2ba2-4947-8926-e70e31b68df3 3239        1015       13.59 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:28:00 2023-11-02 19:27:59.999
2572a2e0-2ba2-4947-8926-e70e31b68df3 3239        1015       13.59 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
7f791e40-a524-4a9b-bb0d-35a2c1b5a7c4 3102        1374       93.59 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:28:00 2023-11-02 19:27:59.999
7f791e40-a524-4a9b-bb0d-35a2c1b5a7c4 3102        1374       93.59 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
47e70310-8fa4-4568-b521-7e2b68b06634 3026        1142       58.26 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:28:00 2023-11-02 19:27:59.999
47e70310-8fa4-4568-b521-7e2b68b06634 3026        1142       58.26 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
fe1b440e-dc75-4092-be11-8e1c3afe55c7 3106        1057       11.37 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:28:00 2023-11-02 19:27:59.999
fe1b440e-dc75-4092-be11-8e1c3afe55c7 3106        1057       11.37 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
6668e4dc-d574-44db-8f0f-2b8e1b1f3c2e 3061        1049       26.20 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:28:00 2023-11-02 19:27:59.999
6668e4dc-d574-44db-8f0f-2b8e1b1f3c2e 3061        1049       26.20 2023-11-02 19:27:39 2023-11-02 19:20:00 2023-11-02 19:30:00 2023-11-02 19:29:59.999
```

```sql
-- apply aggregation on the cumulating windowed table
SELECT window_start, window_end, SUM(price) as `sum`
  FROM TABLE(
    CUMULATE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '2' MINUTES, INTERVAL '10' MINUTES))
  GROUP BY window_start, window_end;
```

```sql
window_start            window_end              sum
2023-11-02 12:40:00.000 2023-11-02 12:46:00.000 327376.23
2023-11-02 12:40:00.000 2023-11-02 12:48:00.000 661272.70
2023-11-02 12:40:00.000 2023-11-02 12:50:00.000 989294.13
2023-11-02 12:50:00.000 2023-11-02 12:52:00.000 1316596.58
2023-11-02 12:50:00.000 2023-11-02 12:54:00.000 1648097.20
2023-11-02 12:50:00.000 2023-11-02 12:56:00.000 1977881.53
2023-11-02 12:50:00.000 2023-11-02 12:58:00.000 2304080.32
2023-11-02 12:50:00.000 2023-11-02 13:00:00.000 2636795.56
```
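
A common use of cumulating windows is a running total that restarts each day. The following sketch assumes a 1-hour step and a 1-day maximum size (the size must be an integral multiple of the step); the alias `running_total` is an illustrative name.

```sql
-- Sketch: running daily revenue, emitted once per hour.
-- Every window in a given day shares the same window_start; window_end advances by one step.
SELECT
  window_start,
  window_end,
  SUM(price) AS running_total
FROM TABLE(
  CUMULATE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '1' HOUR, INTERVAL '1' DAY))
GROUP BY window_start, window_end;
```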

```sql
SESSION(TABLE data [PARTITION BY(keycols, ...)], DESCRIPTOR(timecol), gap)
```

```sql
SELECT * FROM TABLE(
  SESSION(TABLE `examples`.`marketplace`.`orders` PARTITION BY product_id, DESCRIPTOR($rowtime), INTERVAL '1' MINUTES));

-- or with named parameters
-- note: the DATA parameter must be first
SELECT * FROM TABLE(
    SESSION(
      DATA => TABLE `examples`.`marketplace`.`orders` PARTITION BY product_id,
      TIMECOL => DESCRIPTOR($rowtime),
      GAP => INTERVAL '1' MINUTES));
```

```sql
order_id                             customer_id product_id price     $rowtime                window_start         window_end           window_time
d7ef1f9a-4f5f-406e-bbad-25db521c38bf 3068        1234       17.08     2023-11-02T19:43:58.626Z 2023-11-02 21:43:58.626 2023-11-02 21:44:58.626 2023-11-02T19:44:58.625Z
804f0c86-a59a-4425-a293-b28bafaa9674 3071        1332       48.12     2023-11-02T19:44:00.506Z 2023-11-02 21:44:00.506 2023-11-02 21:45:00.506 2023-11-02T19:45:00.505Z
61ea63e3-f040-4501-b78e-8db1fdcf45fc 3179        1267       12.35     2023-11-02T19:43:58.405Z 2023-11-02 21:43:58.405 2023-11-02 21:45:07.925 2023-11-02T19:45:07.924Z
b70ba5bc-428c-41d7-b8fc-8014dd3fd429 3234        1267       40.81     2023-11-02T19:44:00.365Z 2023-11-02 21:43:58.405 2023-11-02 21:45:07.925 2023-11-02T19:45:07.924Z
37688f8c-65ee-4e27-a567-4890e6c7663b 3179        1267       98.17     2023-11-02T19:44:07.925Z 2023-11-02 21:43:58.405 2023-11-02 21:45:07.925 2023-11-02T19:45:07.924Z
4cfa0cc6-881a-43b3-bb34-1746c3b93094 3077        1047       16.78     2023-11-02T19:44:01.985Z 2023-11-02 21:44:01.985 2023-11-02 21:45:23.285 2023-11-02T19:45:23.284Z
e007ce6e-5a76-4390-8fb3-50f46025b965 3095        1047       77.48     2023-11-02T19:44:11.365Z 2023-11-02 21:44:01.985 2023-11-02 21:45:23.285 2023-11-02T19:45:23.284Z
487a0248-a534-489e-bbc5-733e87d19cc7 3200        1047       47.86     2023-11-02T19:44:23.285Z 2023-11-02 21:44:01.985 2023-11-02 21:45:23.285 2023-11-02T19:45:23.284Z
4dd1ab51-8ca4-4de6-9f79-bb2ad7ab2498 3043        1235       36.5      2023-11-02T19:43:57.785Z 2023-11-02 21:43:57.785 2023-11-02 21:45:24.625 2023-11-02T19:45:24.624Z
bb524ec6-1b21-40f1-8c54-3aac7b454c5b 3232        1235       36.98     2023-11-02T19:44:07.265Z 2023-11-02 21:43:57.785 2023-11-02 21:45:24.625 2023-11-02T19:45:24.624Z
9c218c8a-1566-4982-9640-a0deb9ac203c 3065        1235       30.17     2023-11-02T19:44:16.966Z 2023-11-02 21:43:57.785 2023-11-02 21:45:24.625 2023-11-02T19:45:24.624Z
6623c41b-04fa-4df0-a312-45b6dfcdc639 3143        1235       12.2      2023-11-02T19:44:24.625Z 2023-11-02 21:43:57.785 2023-11-02 21:45:24.625 2023-11-02T19:45:24.624Z
```

```sql
SELECT window_start, window_end, customer_id, SUM(price) as `sum`
  FROM TABLE(
    SESSION(TABLE `examples`.`marketplace`.`orders` PARTITION BY customer_id, DESCRIPTOR($rowtime), INTERVAL '1' MINUTES))
  GROUP BY window_start, window_end, customer_id;
```

```sql
window_start        window_end          sum
2023-11-02 12:40:00 2023-11-02 12:46:00 327376.23
2023-11-02 12:40:00 2023-11-02 12:48:00 661272.70
2023-11-02 12:40:00 2023-11-02 12:50:00 989294.13
2023-11-02 12:50:00 2023-11-02 12:52:00 1316596.58
2023-11-02 12:50:00 2023-11-02 12:54:00 1648097.20
2023-11-02 12:50:00 2023-11-02 12:56:00 1977881.53
2023-11-02 12:50:00 2023-11-02 12:58:00 2304080.32
2023-11-02 12:50:00 2023-11-02 13:00:00 2636795.56
```
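
Session windows have no fixed length, so it can help to report how many rows each session contains. The sketch below uses the named-parameter form with an assumed 5-minute gap; the aliases `order_count` and `total_spent` are illustrative names.

```sql
-- Sketch: one row per customer session, where a session closes after
-- 5 minutes without a new order from that customer.
SELECT
  window_start,
  window_end,
  customer_id,
  COUNT(*)   AS order_count,
  SUM(price) AS total_spent
FROM TABLE(
  SESSION(
    DATA => TABLE `examples`.`marketplace`.`orders` PARTITION BY customer_id,
    TIMECOL => DESCRIPTOR($rowtime),
    GAP => INTERVAL '5' MINUTES))
GROUP BY window_start, window_end, customer_id;
```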

```sql
SELECT * FROM TABLE(
   TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES, INTERVAL '1' MINUTES));

-- or with named parameters
-- note: the DATA parameter must be first
SELECT * FROM TABLE(
   TUMBLE(
     DATA => TABLE `examples`.`marketplace`.`orders`,
     TIMECOL => DESCRIPTOR($rowtime),
     SIZE => INTERVAL '10' MINUTES,
     OFFSET => INTERVAL '1' MINUTES));
```

```sql
order_id                             customer_id product_id price $rowtime            window_start        window_end          window_time
0932497b-a3c2-4f80-9b1f-9d099b091696 3063        1035       75.85 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999
20f4529c-9c86-4a54-8c38-f6c3caa1d7b8 3131        1207       89.00 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999
cbda6c08-e0c7-41cb-ae04-c50f5b1f5e3c 3074        1312       63.71 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999
d049ed28-cbbb-479b-8df6-8c637c1b68f5 3006        1201       72.14 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999
63b6f2ef-c0e9-4737-ab81-f5acb93e4a64 3182        1346       76.18 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999
00c088db-9cb7-4128-a4fd-4e06c0e95f7a 3198        1166       63.49 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999
b9ca292e-635a-4ef7-a6ee-bcf099df7c1b 3236        1462       69.13 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999
3299fd08-264e-4e49-8bb9-82cae18c5d7c 3058        1226       59.53 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999
45878388-7cb3-409d-91a4-8ef1f02c8576 3028        1228       16.63 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999
c2fef024-c0c2-4c0f-9880-bc423d1c2db6 3219        1071       80.66 2023-11-02 19:29:51 2023-11-02 19:21:00 2023-11-02 19:31:00 2023-11-02 19:30:59.999
```

```sql
-- apply aggregation on the tumbling windowed table
SELECT window_start, window_end, SUM(price) as `sum`
  FROM TABLE(
    TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES, INTERVAL '1' MINUTES))
  GROUP BY window_start, window_end;
```

```sql
window_start        window_end          sum
2023-11-02 19:21:00 2023-11-02 19:31:00 7285.64
2023-11-02 19:22:00 2023-11-02 19:32:00 6932.18
2023-11-02 19:23:00 2023-11-02 19:33:00 7104.53
2023-11-02 19:24:00 2023-11-02 19:34:00 7456.92
2023-11-02 19:25:00 2023-11-02 19:35:00 7198.75
2023-11-02 19:26:00 2023-11-02 19:36:00 6875.39
2023-11-02 19:27:00 2023-11-02 19:37:00 7312.87
2023-11-02 19:28:00 2023-11-02 19:38:00 7089.26
2023-11-02 19:29:00 2023-11-02 19:39:00 7401.58
2023-11-02 19:30:00 2023-11-02 19:40:00 7156.43
```
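
An offset is often used to shift where larger windows begin. As a sketch, assuming a business day that starts at 06:00, a 1-day tumbling window with a 6-hour offset should produce windows that run from 06:00 to 06:00 instead of from 00:00 to 00:00. The alias `daily_total` and the 6-hour value are illustrative choices.

```sql
-- Sketch: daily revenue with window boundaries shifted by 6 hours.
SELECT
  window_start,
  window_end,
  SUM(price) AS daily_total
FROM TABLE(
  TUMBLE(
    DATA => TABLE `examples`.`marketplace`.`orders`,
    TIMECOL => DESCRIPTOR($rowtime),
    SIZE => INTERVAL '1' DAY,
    OFFSET => INTERVAL '6' HOURS))
GROUP BY window_start, window_end;
```

The offset changes only how rows are assigned to windows. It doesn't change the window size.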

---

### SQL WITH Clause in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/queries/with.html

Confluent Cloud for Apache Flink® enables writing auxiliary statements to use in larger SQL queries.

#### Syntax

``WITH <with_item_definition> [ , ... ] SELECT ... FROM ...;``

where `<with_item_definition>` is `with_item_name (column_name[, ...n]) AS ( <select_query> )`.

#### Description

The WITH clause provides a way to write auxiliary statements for use in a larger query. These statements, which are often referred to as Common Table Expressions (CTEs), can be thought of as defining temporary views that exist just for one query.

#### Example

The following example defines a common table expression named `orders_with_total` and uses it in a GROUP BY query: `WITH orders_with_total AS ( SELECT order_id, price + tax AS total FROM orders ) SELECT order_id, SUM(total) FROM orders_with_total GROUP BY order_id;`

#### Related content

- Flink SQL Queries
- Flink SQL Functions
- Statements

#### Code Examples

```sql
WITH <with_item_definition> [ , ... ]
SELECT ... FROM ...;

<with_item_definition>:
   with_item_name (column_name[, ...n]) AS ( <select_query> )
```

```sql
WITH orders_with_total AS (
    SELECT order_id, price + tax AS total
    FROM orders
)
SELECT order_id, SUM(total)
FROM orders_with_total
GROUP BY order_id;
```
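
The syntax also allows several comma-separated definitions and an explicit column list. The following is a minimal sketch that chains two common table expressions over the `orders` example table; the names `order_totals` and `big_spenders` and the threshold of 1000 are hypothetical, chosen only for illustration.

```sql
-- Sketch: two CTEs, where the second reads from the first.
-- order_totals declares its output column names in the column list.
WITH order_totals (customer_id, total_spent) AS (
    SELECT customer_id, SUM(price)
    FROM `examples`.`marketplace`.`orders`
    GROUP BY customer_id
),
big_spenders AS (
    SELECT customer_id, total_spent
    FROM order_totals
    WHERE total_spent > 1000
)
SELECT customer_id, total_spent
FROM big_spenders;
```

Both definitions exist only for the duration of this one query, consistent with the description of CTEs as temporary views.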

---

### Data Type Mappings with Flink SQL Statements in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/serialization.html

Confluent Cloud for Apache Flink® supports records in the Avro Schema Registry, JSON_SR, and Protobuf Schema Registry formats.

#### Avro schemas

##### Known limitations

- Avro enums have limited support. Flink supports reading and writing enums but treats them as a STRING type. From Flink’s perspective, enums are not distinguishable from the STRING type. You can’t create an Avro schema from Flink that has an enum field.
- Flink doesn’t support reading Avro time-micros as a TIME type. Flink supports TIME with precision up to 3. time-micros is read and written as BIGINT.
- Field names must match Avro criteria. Avro expects field names to start with [A-Za-z_] and subsequently contain only [A-Za-z0-9_].
- These Flink types are not supported:
  - INTERVAL_DAY_TIME
  - INTERVAL_YEAR_MONTH
  - TIMESTAMP_WITH_TIMEZONE

##### Flink SQL types to Avro types

This section lists the mapping of Flink SQL types to Avro physical types. This mapping is important for creating tables, because it defines the Avro schema that’s produced by a CREATE TABLE statement.
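
To make the role of the mapping concrete, here is a minimal sketch of a CREATE TABLE statement whose value schema would be registered as Avro. The table name `orders_avro_demo` and its columns are hypothetical, and the statement assumes the `avro-registry` value format. The comments restate the Avro types given for each Flink SQL type in this section.

```sql
-- Sketch: hypothetical table used only to illustrate the type mapping.
CREATE TABLE orders_avro_demo (
  order_id    STRING NOT NULL,       -- Avro string
  quantity    SMALLINT,              -- Avro int with connect.type=int16
  price       DECIMAL(6, 3),         -- Avro bytes, logical type decimal (precision 6, scale 3)
  order_date  DATE,                  -- Avro int, logical type date
  created_at  TIMESTAMP_LTZ(3),      -- Avro long, logical type timestamp-millis
  tags        ARRAY<STRING>,         -- Avro array
  attributes  MAP<STRING, BOOLEAN>   -- Avro map (character key)
) WITH (
  'value.format' = 'avro-registry'
);
```

Columns declared without NOT NULL are nullable, which corresponds to an Avro union of the mapped type and null.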
- ARRAY
  - Avro type: array
  - Avro logical type: –
  - Additional properties: –
  - Example: `{ "type" : "array", "items" : "long" }`
- BIGINT
  - Avro type: long
  - Avro logical type: –
  - Additional properties: –
  - Example: `long`
- BINARY
  - Avro type: fixed
  - Avro logical type: –
  - Additional properties: flink.maxLength (MAX_LENGTH if not set)
  - Example: `{ "type" : "fixed", "name" : "row", "namespace" : "io.confluent", "size" : 123 }`
- BOOLEAN
  - Avro type: boolean
  - Avro logical type: –
  - Additional properties: –
  - Example: `boolean`
- CHAR
  - Avro type: string
  - Avro logical type: –
  - Additional properties: flink.maxLength = flink.minLength
  - Example: `{ "type" : "string", "flink.maxLength" : 123, "flink.minLength" : 123, "flink.version" : "1" }`
- DATE
  - Avro type: int
  - Avro logical type: date
  - Additional properties: –
  - Example: `{ "type" : "int", "logicalType" : "date" }`
- DECIMAL
  - Avro type: bytes
  - Avro logical type: decimal
  - Additional properties: –
  - Example: `{ "type" : "bytes", "logicalType" : "decimal", "precision" : 6, "scale" : 3 }`
- DOUBLE
  - Avro type: double
  - Avro logical type: –
  - Additional properties: –
  - Example: `double`
- FLOAT
  - Avro type: float
  - Avro logical type: –
  - Additional properties: –
  - Example: `float`
- INT
  - Avro type: int
  - Avro logical type: –
  - Additional properties: –
  - Example: `int`
- MAP (character key)
  - Avro type: map
  - Avro logical type: –
  - Additional properties: –
  - Example: `{ "type" : "map", "values" : "boolean" }`
- MAP (non-character key)
  - Avro type: array
  - Avro logical type: –
  - Additional properties: array of io.confluent.connect.avro.MapEntry(key, value)
  - Example: `{ "type" : "array", "items" : { "type" : "record", "name" : "MapEntry", "namespace" : "io.confluent.connect.avro", "fields" : [ { "name" : "key", "type" : "int" }, { "name" : "value", "type" : "bytes" } ] } }`
- MULTISET (character element)
  - Avro type: map
  - Avro logical type: –
  - Additional properties: flink.type : multiset
  - Example: `{ "type" : "map", "values" : "int", "flink.type" : "multiset", "flink.version" : "1" }`
- MULTISET (non-character key)
  - Avro type: array
  - Avro logical type: –
  - Additional properties: array of io.confluent.connect.avro.MapEntry(key, value), flink.type : multiset
  - Example: `{ "type" : "array", "items" : { "type" : "record", "name" : "MapEntry", "namespace" : "io.confluent.connect.avro", "fields" : [ { "name" : "key", "type" : "long" }, { "name" : "value", "type" : "int" } ] }, "flink.type" : "multiset", "flink.version" : "1" }`
- ROW
  - Avro type: record
  - Avro logical type: –
  - Additional properties: –
  - Name: org.apache.flink.avro.generated.record
  - Nested records name: org.apache.flink.avro.generated.record_$fieldName
  - Example: `{ "type" : "record", "name" : "row", "namespace" : "io.confluent", "fields" : [ { "name" : "f0", "type" : "long", "doc" : "field comment" } ] }`
- SMALLINT
  - Avro type: int
  - Avro logical type: –
  - Additional properties: connect.type=int16
  - Example: `{ "type" : "int", "connect.type" : "int16" }`
- STRING / VARCHAR
  - Avro type: string
  - Avro logical type: –
  - Additional properties: flink.maxLength (MAX_LENGTH if not set)
  - Example: `{ "type" : "string", "flink.maxLength" : 123, "flink.version" : "1" }`
- TIME
  - Avro type: int
  - Avro logical type: time-millis
  - Additional properties: flink.precision (default: 3, max supported: 3)
  - Example: `{ "type" : "int", "flink.precision" : 2, "flink.version" : "1", "logicalType" : "time-millis" }`
- TIMESTAMP
  - Avro type: long
  - Avro logical type: local-timestamp-millis / local-timestamp-micros
  - Additional properties: flink.precision (default: 3/6, max supported: 3/9)
  - Example: `{ "type" : "long", "flink.precision" : 2, "flink.version" : "1", "logicalType" : "local-timestamp-millis" }`
- TIMESTAMP_LTZ
  - Avro type: long
  - Avro logical type: timestamp-millis / timestamp-micros
  - Additional properties: flink.precision (default: 3/6, max supported: 3/9)
  - Example: `{ "type" : "long", "flink.precision" : 2, "flink.version" : "1", "logicalType" : "timestamp-millis" }`
- TINYINT
  - Avro type: int
  - Avro logical type: –
  - Additional properties: connect.type=int8
  - Example: `{ "type" : "int", "connect.type" : "int8" }`
- VARBINARY
  - Avro type: bytes
  - Avro logical type: –
  - Additional properties: flink.maxLength (MAX_LENGTH if not set)
  - Example: `{ "type" : "bytes", "flink.maxLength" : 123, "flink.version" : "1" }`

##### Avro types to Flink SQL types

The following table shows the mapping of Avro types to Flink SQL types. It shows only mappings that are not covered by the preceding mappings. These types can’t originate from Flink SQL. This mapping is important when consuming or reading records with a schema that was created outside of Flink. The mapping defines the Flink table’s schema inferred from an Avro schema.

Flink SQL supports reading and writing nullable types. A nullable type is mapped to an Avro union(avro_type, null), with the avro_type converted from the corresponding Flink type.

| Avro type | Avro logical type | Flink SQL type | Example |
|---|---|---|---|
| long | time-micros | BIGINT | – |
| enum | – | STRING | – |
| union with null type (null + one other type) | – | NULLABLE(type) | – |
| union (other unions) | – | ROW(type_name Type0, …) | `[ "long", "string", { "type": "record", "name": "User", "namespace": "io.test1", "fields": [ { "name": "f0", "type": "long" } ] } ]` |
| string (uuid) | – | STRING | – |
| fixed (duration) | – | BINARY(size) | – |

#### JSON Schema

##### Flink SQL types to JSON Schema types

This section lists the mapping of Flink SQL types to JSON Schema types. This mapping is important for creating tables, because it defines the JSON Schema that’s produced by a CREATE TABLE statement.

- Nullable types are expressed as oneOf(Null, T).
- Object for a MAP and MULTISET must have two fields [key, value].
- MULTISET is equivalent to MAP[K, INT] and is serialized accordingly.
ARRAY¶ JSON Schema type: Array Additional properties: – JSON type title: – Example: { "type": "array", "items": { "type": "number", "title": "org.apache.kafka.connect.data.Time", "flink.precision": 2, "connect.type": "int32", "flink.version": "1" } } BIGINT¶ JSON Schema type: Number Additional properties: connect.type=int64 JSON type title: – Example: { "type": "number", "connect.type": "int64" } BINARY¶ JSON Schema type: String Additional properties: connect.type=bytes flink.minLength=flink.maxLength: Different from JSON’s minLength/maxLength, because this property describes bytes length, not string length. JSON type title: – Example: { "type": "string", "flink.maxLength": 123, "flink.minLength": 123, "flink.version": "1", "connect.type": "bytes" } BOOLEAN¶ JSON Schema type: Boolean Additional properties: – JSON type title: – Example: { "type": "array", "items": { "type": "number", "title": "org.apache.kafka.connect.data.Time", "flink.precision": 2, "connect.type": "int32", "flink.version": "1" } } CHAR¶ JSON Schema type: String Additional properties: minLength=maxLength JSON type title: – Example: { "type": "string", "minLength": 123, "maxLength": 123 } DATE¶ JSON Schema type: Number Additional properties: connect.type=int32 JSON type title: org.apache.kafka.connect.data.Date Example: – DECIMAL¶ JSON Schema type: Number Additional properties: connect.type=bytes JSON type title: org.apache.kafka.connect.data.Decimal Example: – DOUBLE¶ JSON Schema type: Number Additional properties: connect.type=float64 JSON type title: – Example: { "type": "number", "connect.type": "float64" } FLOAT¶ JSON Schema type: Number Additional properties: connect.type=float32 JSON type title: – Example: { "type": "number", "connect.type": "float32" } INT¶ JSON Schema type: Number Additional properties: connect.type=int32 JSON type title: – Example: { "type": "number", "connect.type": "int32" } MAP[K, V]¶ JSON Schema type: Array[Object] Additional properties: connect.type=map JSON type 
title: – Example: { "type": "array", "connect.type": "map", "items": { "type": "object", "properties": { "value": { "type": "number", "connect.type": "int64" }, "key": { "type": "number", "connect.type": "int32" } } } } MAP[VARCHAR, V]¶ JSON Schema type: Object Additional properties: connect.type=map JSON type title: – Example: { "type":"object", "connect.type":"map", "additionalProperties": { "type":"number", "connect.type":"int64" } } MULTISET[K]¶ JSON Schema type: Array[Object] Additional properties: connect.type=map flink.type=multiset JSON type title: The count (value) in the JSON schema must map to a Flink INT type. For MULTISET types, the count (value) in the JSON schema must map to a Flink INT type, which corresponds to connect.type: int32 in the JSON Schema. Using connect.type: int64 causes a validation error. Example: { "type": "array", "connect.type": "map", "flink.type": "multiset", "items": { "type": "object", "properties": { "value": { "type": "number", "connect.type": "int32" }, "key": { "type": "number", "connect.type": "int32" } } } } MULTISET[VARCHAR]¶ JSON Schema type: Object Additional properties: connect.type=map flink.type=multiset JSON type title: The count (value) in the JSON schema must map to a Flink INT type. For MULTISET types, the count (value) in the JSON schema must map to a Flink INT type, which corresponds to connect.type: int32 in the JSON Schema. Using connect.type: int64 causes a validation error. 
Example: { "type": "object", "connect.type": "map", "flink.type": "multiset", "additionalProperties": { "type": "number", "connect.type": "int32" } } ROW¶ JSON Schema type: Object Additional properties: – JSON type title: – Example: – SMALLINT¶ JSON Schema type: Number Additional properties: connect.type=int16 JSON type title: – Example: { "type": "number", "connect.type": "int16" } TIME¶ JSON Schema type: Number Additional properties: connect.type=int32 flink.precision JSON type title: org.apache.kafka.connect.data.Time Example: { "type":"number", "title":"org.apache.kafka.connect.data.Time", "flink.precision":2, "connect.type":"int32", "flink.version":"1" } TIMESTAMP¶ JSON Schema type: Number Additional properties: connect.type=int64 flink.precision flink.type=timestamp JSON type title: org.apache.kafka.connect.data.Timestamp Example: { "type":"number", "title":"org.apache.kafka.connect.data.Timestamp", "flink.precision":2, "flink.type":"timestamp", "connect.type":"int64", "flink.version":"1" } TIMESTAMP_LTZ¶ JSON Schema type: Number Additional properties: connect.type=int64 flink.precision JSON type title: org.apache.kafka.connect.data.Timestamp Example: { "type":"number", "title":"org.apache.kafka.connect.data.Timestamp", "flink.precision":2, "connect.type":"int64", "flink.version":"1" } TINYINT¶ JSON Schema type: Number Additional properties: connect.type=int8 JSON type title: – Example: { "type": "number", "connect.type": "int8" } VARBINARY¶ JSON Schema type: String Additional properties: connect.type=bytes flink.maxLength: Different from JSON’s maxLength, because this property describes bytes length, not string length. 
JSON type title: – Example: { "type": "string", "flink.maxLength": 123, "flink.version": "1", "connect.type": "bytes" } VARCHAR¶ JSON Schema type: String Additional properties: maxLength JSON type title: – Example: { "type": "string", "maxLength": 123 } JSON types to Flink SQL types¶ The following table shows the mapping of JSON types to Flink SQL types. It shows only mappings that are not covered by the previous table. These types can’t originate from Flink SQL. This mapping is important when consuming/reading records with a schema that was created outside of Flink. The mapping defines the Flink table’s schema inferred from JSON Schema. JSON type Flink SQL type Combined ROW Enum VARCHAR Number(requiresInteger=true) BIGINT Number(requiresInteger=false) DOUBLE Protobuf schema¶ Flink SQL types to Protobuf types¶ The following table shows the mapping of Flink SQL types to Protobuf types. This mapping is important for creating tables, because it defines the Protobuf schema that’s produced by a CREATE TABLE statement. ARRAY[T]¶ Protobuf type: repeated T Message type: – Additional properties: flink.wrapped, which indicates that Flink wrappers are used to represent nullability, because Protobuf doesn’t support nullable repeated natively. 
- Example: `repeated int64 value = 1;`
- Nullable array: `arrayNullableRepeatedWrapper arrayNullable = 1 [(confluent.field_meta) = { params: [ { key: "flink.wrapped", value: "true" }, { key: "flink.version", value: "1" } ] }]; message arrayNullableRepeatedWrapper { repeated int64 value = 1; }`
- Nullable elements: `repeated elementNullableElementWrapper elementNullable = 2 [(confluent.field_meta) = { params: [ { key: "flink.wrapped", value: "true" }, { key: "flink.version", value: "1" } ] }]; message elementNullableElementWrapper { optional int64 value = 1; }`

###### BIGINT

- Protobuf type: INT64
- Message type: –
- Additional properties: –
- Example: `optional int64 bigint = 8;`

###### BINARY

- Protobuf type: BYTES
- Message type: –
- Additional properties: `flink.maxLength=flink.minLength`
- Example: `optional bytes binary = 13 [(confluent.field_meta) = { params: [ { key: "flink.maxLength", value: "123" }, { key: "flink.minLength", value: "123" }, { key: "flink.version", value: "1" } ] }];`

###### BOOLEAN

- Protobuf type: BOOL
- Message type: –
- Additional properties: –
- Example: `optional bool boolean = 2;`

###### CHAR

- Protobuf type: STRING
- Message type: –
- Additional properties: `flink.maxLength=flink.minLength`
- Example: `optional string char = 11 [(confluent.field_meta) = { params: [ { key: "flink.maxLength", value: "123" }, { key: "flink.minLength", value: "123" }, { key: "flink.version", value: "1" } ] }];`

###### DATE

- Protobuf type: MESSAGE
- Message type: `google.type.Date`
- Additional properties: –
- Example: `optional .google.type.Date date = 17;`

###### DECIMAL

- Protobuf type: MESSAGE
- Message type: `confluent.type.Decimal`
- Additional properties: –
- Example: `optional .confluent.type.Decimal decimal = 19 [(confluent.field_meta) = { params: [ { value: "5", key: "precision" }, { value: "1", key: "scale" }, { key: "flink.version", value: "1" } ] }];`

###### DOUBLE

- Protobuf type: DOUBLE
- Message type: –
- Additional properties: –
- Example: `optional double double = 10;`

###### FLOAT

- Protobuf type: FLOAT
- Message type: –
- Additional properties: –
- Example: `optional float float = 9;`

###### INT

- Protobuf type: INT32
- Message type: –
- Additional properties: –
- Example: `optional int32 int = 7;`

###### MAP[K, V]

- Protobuf type: `repeated MESSAGE`
- Message type: `XXEntry(K key, V value)`
- Additional properties: `flink.wrapped`, which indicates that Flink wrappers are used to represent nullability, because Protobuf doesn’t support nullable repeated natively. For examples, see the ARRAY type.
- Example: `repeated MapEntry map = 20; message MapEntry { optional string key = 1; optional int64 value = 2; }`

###### MULTISET[V]

- Protobuf type: `repeated MESSAGE`
- Message type: `XXEntry(V key, int32 value)`
- Additional properties:
  - `flink.wrapped`, which indicates that Flink wrappers are used to represent nullability, because Protobuf doesn’t support nullable repeated natively. For examples, see the ARRAY type.
  - `flink.type=multiset`
- Example: `repeated MultisetEntry multiset = 1 [(confluent.field_meta) = { params: [ { key: "flink.type", value: "multiset" }, { key: "flink.version", value: "1" } ] }]; message MultisetEntry { optional string key = 1; int32 value = 2; }`

###### ROW

- Protobuf type: MESSAGE
- Message type: `fieldName`
- Additional properties: –
- Example: `meta_Row meta = 1; message meta_Row { float a = 1; float b = 2; }`

###### SMALLINT

- Protobuf type: INT32
- Message type: –
- Additional properties: MetaProto extension: `connect.type = int16`
- Example: `optional int32 smallInt = 6 [(confluent.field_meta) = { doc: "smallInt comment", params: [ { key: "flink.version", value: "1" }, { key: "connect.type", value: "int16" } ] }];`

###### TIMESTAMP

- Protobuf type: MESSAGE
- Message type: `google.protobuf.Timestamp`
- Additional properties: `flink.precision`, `flink.type=timestamp`
- Example: `optional .google.protobuf.Timestamp timestamp_ltz_3 = 16 [(confluent.field_meta) = { params: [ { key: "flink.type", value: "timestamp" }, { key: "flink.precision", value: "3" }, { key: "flink.version", value: "1" } ] }];`

###### TIMESTAMP_LTZ

- Protobuf type: MESSAGE
- Message type: `google.protobuf.Timestamp`
- Additional properties: `flink.precision`
- Example: `optional .google.protobuf.Timestamp timestamp_ltz_3 = 15 [(confluent.field_meta) = { params: [ { key: "flink.precision", value: "3" }, { key: "flink.version", value: "1" } ] }];`

###### TIME_WITHOUT_TIME_ZONE

- Protobuf type: MESSAGE
- Message type: `google.type.TimeOfDay`
- Additional properties: –
- Example: `optional .google.type.TimeOfDay time = 18 [(confluent.field_meta) = { params: [ { key: "flink.precision", value: "3" }, { key: "flink.version", value: "1" } ] }];`

###### TINYINT

- Protobuf type: INT32
- Message type: –
- Additional properties: MetaProto extension: `connect.type = int8`
- Example: `optional int32 tinyInt = 4 [(confluent.field_meta) = { doc: "tinyInt comment", params: [ { key: "flink.version", value: "1" }, { key: "connect.type", value: "int8" } ] }];`

###### VARBINARY

- Protobuf type: BYTES
- Message type: –
- Additional properties: `flink.maxLength` (default = MAX_LENGTH)
- Example: `optional bytes varbinary = 14 [(confluent.field_meta) = { params: [ { key: "flink.maxLength", value: "123" }, { key: "flink.version", value: "1" } ] }];`

###### VARCHAR

- Protobuf type: STRING
- Message type: –
- Additional properties: `flink.maxLength` (default = MAX_LENGTH)
- Example: `optional string varchar = 12 [(confluent.field_meta) = { params: [ { key: "flink.maxLength", value: "123" }, { key: "flink.version", value: "1" } ] }];`

##### Protobuf types to Flink SQL types

The following table shows the mapping of Protobuf types to Flink SQL and Connect types. It shows only mappings that are not covered by the previous table. These types can’t originate from Flink SQL. This mapping is important when consuming/reading records with a schema that was created outside of Flink. The mapping defines the Flink table’s schema inferred from a Protobuf schema.
| Protobuf type | Flink SQL type | Message type | Connect type annotation |
| --- | --- | --- | --- |
| FIXED32 \| FIXED64 \| SFIXED64 | BIGINT | – | – |
| INT32 \| SINT32 \| SFIXED32 | INT | – | – |
| INT32 \| SINT32 \| SFIXED32 | SMALLINT | – | int16 |
| INT32 \| SINT32 \| SFIXED32 | TINYINT | – | int8 |
| INT64 \| SINT64 | BIGINT | – | – |
| UINT32 \| UINT64 | BIGINT | – | – |
| MESSAGE | BIGINT | google.protobuf.Int64Value | – |
| MESSAGE | BIGINT | google.protobuf.UInt64Value | – |
| MESSAGE | BIGINT | google.protobuf.UInt32Value | – |
| MESSAGE | BOOLEAN | google.protobuf.BoolValue | – |
| MESSAGE | DOUBLE | google.protobuf.DoubleValue | – |
| MESSAGE | FLOAT | google.protobuf.FloatValue | – |
| MESSAGE | INT | google.protobuf.Int32Value | – |
| MESSAGE | VARBINARY | google.protobuf.BytesValue | – |
| MESSAGE | VARCHAR | google.protobuf.StringValue | – |
| oneOf | ROW | – | – |

##### Protobuf 3 nullable field behavior

When working with Protobuf 3 schemas in Confluent Cloud for Apache Flink, it’s important to understand how nullable fields are handled.

When converting to a Protobuf schema, Flink marks all NULLABLE fields as `optional`.

In Protobuf, expressing something as NULLABLE or NOT NULL is not straightforward.

- All non-MESSAGE types are NOT NULL. If not set explicitly, the default value is assigned.
- Non-MESSAGE types marked with `optional` can be checked if they were set. If not set, Flink assumes NULL.
- MESSAGE types are all NULLABLE, which means that all fields of MESSAGE type are optional, and there is no way to ensure on a format level they are NOT NULL. To store this information, Flink uses the `flink.notNull` property, for example: `message Row { .google.type.Date date = 1 [(confluent.field_meta) = { params: [ { key: "flink.version", value: "1" }, { key: "flink.notNull", value: "true" } ] }]; }`

**Fields without the optional keyword:** In Protobuf 3, fields without the `optional` keyword are treated as NOT NULL by Flink. This is because Protobuf 3 doesn’t support nullable getters/setters by default. If a field is omitted in the data, Protobuf 3 assigns the default value, which is 0 for numbers, the empty string for strings, and false for booleans.
**Fields with the optional keyword:** Fields marked with `optional` in Protobuf 3 are treated as nullable by Flink. When such a field is not set in the data, Flink interprets it as NULL.

**Fields with the repeated keyword:** Fields marked with `repeated` in Protobuf 3 are treated as arrays by Flink. The array itself is NOT NULL, but individual elements within the array can be nullable depending on their type. For MESSAGE types, elements are nullable by default. For primitive types, elements are NOT NULL.

This behavior is consistent across all streaming platforms that work with Protobuf 3, including Kafka Streams and other Confluent products, and is not specific to Flink. It’s a fundamental characteristic of the Protobuf 3 specification itself.

In a Protobuf 3 schema, if you want a field to be nullable in Flink, you must explicitly mark it as `optional`. For example, in a message named `Example`:

- `string required_field = 1;` is NOT NULL in Flink.
- `optional string nullable_field = 2;` is NULLABLE in Flink.
- `repeated string array_field = 3;` is a NOT NULL array in Flink.
- `repeated optional string nullable_array_field = 4;` is a NOT NULL array with nullable elements.

#### Related content

- Data Types
- Apache Avro Specification
- JSON Schema Specification
- Protocol Buffers Version 3 Language Specification

Note: This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
time-micros
```

```sql
time-micros
```

```sql
[A-Za-z0-9_]
```

```json
{
  "type" : "array",
  "items" : "long"
}
```

```sql
flink.maxLength
```

```json
{
  "type" : "fixed",
  "name" : "row",
  "namespace" : "io.confluent",
  "size" : 123
}
```

```sql
flink.maxLength
```

```json
{
  "type" : "string",
  "flink.maxLength" : 123,
  "flink.minLength" : 123,
  "flink.version" : "1"
}
```

```json
{
  "type" : "int",
  "logicalType" : "date"
}
```

```json
{
  "type" : "bytes",
  "logicalType" : "decimal",
  "precision" : 6,
  "scale" : 3
}
```

```json
{
  "type" : "map",
  "values" : "boolean"
}
```

```sql
io.confluent.connect.avro.MapEntry(key, value)
```

```json
{
  "type" : "array",
  "items" : {
    "type" : "record",
    "name" : "MapEntry",
    "namespace" : "io.confluent.connect.avro",
    "fields" : [ {
      "name" : "key",
      "type" : "int"
    }, {
      "name" : "value",
      "type" : "bytes"
    } ]
  }
}
```

```sql
flink.type : multiset
```

```json
{
  "type" : "map",
  "values" : "int",
  "flink.type" : "multiset",
  "flink.version" : "1"
}
```

```sql
io.confluent.connect.avro.MapEntry(key, value)
```

```sql
flink.type : multiset
```

```json
{
  "type" : "array",
  "items" : {
    "type" : "record",
    "name" : "MapEntry",
    "namespace" : "io.confluent.connect.avro",
    "fields" : [ {
      "name" : "key",
      "type" : "long"
    }, {
      "name" : "value",
      "type" : "int"
    } ]
  },
  "flink.type" : "multiset",
  "flink.version" : "1"
}
```

```sql
connect.type=int16
```

```sql
org.apache.flink.avro.generated.record
```

```sql
org.apache.flink.avro.generated.record_$fieldName
```

```json
{
  "type" : "record",
  "name" : "row",
  "namespace" : "io.confluent",
  "fields" : [ {
    "name" : "f0",
    "type" : "long",
    "doc" : "field comment"
  } ]
}
```

```sql
connect.type=int16
```

```json
{
  "type" : "int",
  "connect.type" : "int16"
}
```

```sql
flink.maxLength = flink.minLength
```

```json
{
  "type" : "string",
  "flink.maxLength" : 123,
  "flink.version" : "1"
}
```

```sql
time-millis
```

```sql
flink.precision
```

```json
{
  "type" : "int",
  "flink.precision" : 2,
  "flink.version" : "1",
  "logicalType" : "time-millis"
}
```

```sql
local-timestamp-millis
```

```sql
local-timestamp-micros
```

```sql
flink.precision
```

```json
{
  "type" : "long",
  "flink.precision" : 2,
  "flink.version" : "1",
  "logicalType" : "local-timestamp-millis"
}
```

```sql
timestamp-millis
```

```sql
timestamp-micros
```

```sql
flink.precision
```

```json
{
  "type" : "long",
  "flink.precision" : 2,
  "flink.version" : "1",
  "logicalType" : "timestamp-millis"
}
```

```sql
connect.type=int8
```

```json
{
  "type" : "int",
  "connect.type" : "int8"
}
```

```sql
flink.maxLength
```

```json
{
  "type" : "bytes",
  "flink.maxLength" : 123,
  "flink.version" : "1"
}
```

```sql
union(avro_type, null)
```

```json
[
  "long",
  "string",
  {
    "type": "record",
    "name": "User",
    "namespace": "io.test1",
    "fields": [
      {
        "name": "f0",
        "type": "long"
      }
    ]
  }
]
```

```json
{
  "type": "array",
  "items": {
    "type": "number",
    "title": "org.apache.kafka.connect.data.Time",
    "flink.precision": 2,
    "connect.type": "int32",
    "flink.version": "1"
  }
}
```

```sql
connect.type=int64
```

```json
{
  "type": "number",
  "connect.type": "int64"
}
```

```sql
connect.type=bytes
```

```sql
flink.minLength=flink.maxLength
```

```sql
minLength/maxLength
```

```json
{
  "type": "string",
  "flink.maxLength": 123,
  "flink.minLength": 123,
  "flink.version": "1",
  "connect.type": "bytes"
}
```

```json
{
  "type": "array",
  "items": {
    "type": "number",
    "title": "org.apache.kafka.connect.data.Time",
    "flink.precision": 2,
    "connect.type": "int32",
    "flink.version": "1"
  }
}
```

```sql
minLength=maxLength
```

```json
{
  "type": "string",
  "minLength": 123,
  "maxLength": 123
}
```

```sql
connect.type=int32
```

```sql
org.apache.kafka.connect.data.Date
```

```sql
connect.type=bytes
```

```sql
org.apache.kafka.connect.data.Decimal
```

```sql
connect.type=float64
```

```json
{
  "type": "number",
  "connect.type": "float64"
}
```

```sql
connect.type=float32
```

```json
{
  "type": "number",
  "connect.type": "float32"
}
```

```sql
connect.type=int32
```

```json
{
  "type": "number",
  "connect.type": "int32"
}
```

```sql
Array[Object]
```

```sql
connect.type=map
```

```json
{
  "type": "array",
  "connect.type": "map",
  "items": {
    "type": "object",
    "properties": {
      "value": {
        "type": "number",
        "connect.type": "int64"
      },
      "key": {
        "type": "number",
        "connect.type": "int32"
      }
    }
  }
}
```

```sql
connect.type=map
```

```json
{
  "type": "object",
  "connect.type": "map",
  "additionalProperties": {
    "type": "number",
    "connect.type": "int64"
  }
}
```

```sql
Array[Object]
```

```sql
connect.type=map
```

```sql
flink.type=multiset
```

```sql
connect.type: int32
```

```sql
connect.type: int64
```

```json
{
  "type": "array",
  "connect.type": "map",
  "flink.type": "multiset",
  "items": {
    "type": "object",
    "properties": {
      "value": {
        "type": "number",
        "connect.type": "int32"
      },
      "key": {
        "type": "number",
        "connect.type": "int32"
      }
    }
  }
}
```

```sql
connect.type=map
```

```sql
flink.type=multiset
```

```sql
connect.type: int32
```

```sql
connect.type: int64
```

```json
{
  "type": "object",
  "connect.type": "map",
  "flink.type": "multiset",
  "additionalProperties": {
    "type": "number",
    "connect.type": "int32"
  }
}
```

```sql
connect.type=int16
```

```json
{
  "type": "number",
  "connect.type": "int16"
}
```

```sql
connect.type=int32
```

```sql
flink.precision
```

```sql
org.apache.kafka.connect.data.Time
```

```json
{
  "type":"number",
  "title":"org.apache.kafka.connect.data.Time",
  "flink.precision":2,
  "connect.type":"int32",
  "flink.version":"1"
}
```

```sql
connect.type=int64
```

```sql
flink.precision
```

```sql
flink.type=timestamp
```

```sql
org.apache.kafka.connect.data.Timestamp
```

```json
{
  "type":"number",
  "title":"org.apache.kafka.connect.data.Timestamp",
  "flink.precision":2,
  "flink.type":"timestamp",
  "connect.type":"int64",
  "flink.version":"1"
}
```

```sql
connect.type=int64
```

```sql
flink.precision
```

```sql
org.apache.kafka.connect.data.Timestamp
```

```json
{
  "type":"number",
  "title":"org.apache.kafka.connect.data.Timestamp",
  "flink.precision":2,
  "connect.type":"int64",
  "flink.version":"1"
}
```

```sql
connect.type=int8
```

```json
{
  "type": "number",
  "connect.type": "int8"
}
```

```sql
connect.type=bytes
```

```sql
flink.maxLength
```

```json
{
  "type": "string",
  "flink.maxLength": 123,
  "flink.version": "1",
  "connect.type": "bytes"
}
```

```json
{
  "type": "string",
  "maxLength": 123
}
```

```sql
flink.wrapped
```

```protobuf
repeated int64 value = 1;
```

```protobuf
arrayNullableRepeatedWrapper arrayNullable = 1 [(confluent.field_meta) = {
  params: [
    {
      key: "flink.wrapped",
      value: "true"
    },
    {
      key: "flink.version",
      value: "1"
    }
  ]
}];

message arrayNullableRepeatedWrapper {
  repeated int64 value = 1;
}
```

```protobuf
repeated elementNullableElementWrapper elementNullable = 2 [(confluent.field_meta) = {
  params: [
    {
      key: "flink.wrapped",
      value: "true"
    },
    {
      key: "flink.version",
      value: "1"
    }
  ]
}];

message elementNullableElementWrapper {
  optional int64 value = 1;
}
```

```protobuf
optional int64 bigint = 8;
```

```sql
flink.maxLength=flink.minLength
```

```protobuf
optional bytes binary = 13 [(confluent.field_meta) = {
  params: [
    {
      key: "flink.maxLength",
      value: "123"
    },
    {
      key: "flink.minLength",
      value: "123"
    },
    {
      key: "flink.version",
      value: "1"
    }
  ]
}];
```

```protobuf
optional bool boolean = 2;
```

```sql
flink.maxLength=flink.minLength
```

```protobuf
optional string char = 11 [(confluent.field_meta) = {
  params: [
    {
      key: "flink.maxLength",
      value: "123"
    },
    {
      key: "flink.minLength",
      value: "123"
    },
    {
      key: "flink.version",
      value: "1"
    }
  ]
}];
```

```sql
google.type.Date
```

```protobuf
optional .google.type.Date date = 17;
```

```sql
confluent.type.Decimal
```

```protobuf
optional .confluent.type.Decimal decimal = 19 [(confluent.field_meta) = {
  params: [
    {
      value: "5",
      key: "precision"
    },
    {
      value: "1",
      key: "scale"
    },
    {
      key: "flink.version",
      value: "1"
    }
  ]
}];
```

```protobuf
optional double double = 10;
```

```protobuf
optional float float = 9;
```

```protobuf
optional int32 int = 7;
```

```sql
repeated MESSAGE
```

```sql
XXEntry(K key, V value)
```

```sql
flink.wrapped
```

```protobuf
repeated MapEntry map = 20;

message MapEntry {
  optional string key = 1;
  optional int64 value = 2;
}
```

```sql
repeated MESSAGE
```

```sql
XXEntry(V key, int32 value)
```

```sql
flink.wrapped
```

```sql
flink.type=multiset
```

```protobuf
repeated MultisetEntry multiset = 1 [(confluent.field_meta) = {
  params: [
    {
      key: "flink.type",
      value: "multiset"
    },
    {
      key: "flink.version",
      value: "1"
    }
  ]
}];

message MultisetEntry {
  optional string key = 1;
  int32 value = 2;
}
```

```protobuf
meta_Row meta = 1;

message meta_Row {
  float a = 1;
  float b = 2;
}
```

```sql
connect.type = int16
```

```protobuf
optional int32 smallInt = 6 [(confluent.field_meta) = {
  doc: "smallInt comment",
  params: [
    {
      key: "flink.version",
      value: "1"
    },
    {
      key: "connect.type",
      value: "int16"
    }
  ]
}];
```

```sql
google.protobuf.Timestamp
```

```sql
flink.precision
```

```sql
flink.type=timestamp
```

```protobuf
optional .google.protobuf.Timestamp timestamp_ltz_3 = 16 [(confluent.field_meta) = {
  params: [
    {
      key: "flink.type",
      value: "timestamp"
    },
    {
      key: "flink.precision",
      value: "3"
    },
    {
      key: "flink.version",
      value: "1"
    }
  ]
}];
```

```sql
google.protobuf.Timestamp
```

```sql
flink.precision
```

```protobuf
optional .google.protobuf.Timestamp timestamp_ltz_3 = 15 [(confluent.field_meta) = {
  params: [
    {
      key: "flink.precision",
      value: "3"
    },
    {
      key: "flink.version",
      value: "1"
    }
  ]
}];
```

```sql
google.type.TimeOfDay
```

```protobuf
optional .google.type.TimeOfDay time = 18 [(confluent.field_meta) = {
  params: [
    {
      key: "flink.precision",
      value: "3"
    },
    {
      key: "flink.version",
      value: "1"
    }
  ]
}];
```

```sql
connect.type = int8
```

```protobuf
optional int32 tinyInt = 4 [(confluent.field_meta) = {
  doc: "tinyInt comment",
  params: [
    {
      key: "flink.version",
      value: "1"
    },
    {
      key: "connect.type",
      value: "int8"
    }
  ]
}];
```

```sql
flink.maxLength
```

```protobuf
optional bytes varbinary = 14 [(confluent.field_meta) = {
  params: [
    {
      key: "flink.maxLength",
      value: "123"
    },
    {
      key: "flink.version",
      value: "1"
    }
  ]
}];
```

```sql
flink.maxLength
```

```protobuf
optional string varchar = 12 [(confluent.field_meta) = {
  params: [
    {
      key: "flink.maxLength",
      value: "123"
    },
    {
      key: "flink.version",
      value: "1"
    }
  ]
}];
```

```sql
flink.notNull
```

```protobuf
message Row {
  .google.type.Date date = 1 [(confluent.field_meta) = {
    params: [
      {
        key: "flink.version",
        value: "1"
      },
      {
        key: "flink.notNull",
        value: "true"
      }
    ]
  }];
}
```

```protobuf
message Example {
  string required_field = 1;        // NOT NULL in Flink
  optional string nullable_field = 2;  // NULLABLE in Flink
  repeated string array_field = 3;     // NOT NULL array in Flink
  repeated optional string nullable_array_field = 4;  // NOT NULL array with nullable elements
}
```
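
The Protobuf 3 nullability rules described for this page can be condensed into a small decision function. The following Python sketch is only one reading of those rules: `ProtoField` and `flink_nullability` are hypothetical names introduced for illustration, not part of any Flink, Confluent, or Protobuf API, and the sketch reports the nullability of the field itself (for `repeated` fields, of the array), not of array elements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtoField:
    """Simplified, hypothetical description of one Protobuf 3 field."""
    name: str
    is_message: bool = False      # True for MESSAGE-typed fields
    optional: bool = False        # declared with the `optional` keyword
    repeated: bool = False        # declared with the `repeated` keyword
    flink_not_null: bool = False  # carries the flink.notNull=true property


def flink_nullability(field: ProtoField) -> str:
    """Return 'NOT NULL' or 'NULLABLE' for the column derived from a field."""
    if field.repeated:
        # The array itself is NOT NULL; element nullability is separate.
        return "NOT NULL"
    if field.is_message:
        # MESSAGE fields are nullable unless flink.notNull is recorded.
        return "NOT NULL" if field.flink_not_null else "NULLABLE"
    # Scalar fields: only `optional` ones can be observed as unset.
    return "NULLABLE" if field.optional else "NOT NULL"
```

For instance, `flink_nullability(ProtoField("nullable_field", optional=True))` yields `"NULLABLE"`, while a plain `string` field yields `"NOT NULL"`.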

---

### Flink SQL Examples in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/sql-examples.html

The following code examples show common Flink SQL use cases with Confluent Cloud for Apache Flink®.

- CREATE TABLE
- Inferred tables
- ALTER TABLE
- SELECT
- Schema reference

#### CREATE TABLE examples

The following examples show how to create Flink tables with various options.

##### Minimal table

`CREATE TABLE t_minimal (s STRING);`

Properties:

- Append changelog mode.
- No Schema Registry key.
- Round robin distribution.
- 6 Kafka partitions.
- The `$rowtime` column and system watermark are added implicitly.

##### Table with a primary key

Syntax: `CREATE TABLE t_pk (k INT PRIMARY KEY NOT ENFORCED, s STRING);`

Properties:

- Upsert changelog mode.
- The primary key defines an implicit `DISTRIBUTED BY(k)`.
- `k` is the Schema Registry key.
- Hash distribution on `k`.
- The table has 6 Kafka partitions.
- `k` is declared as being unique, meaning no duplicate rows.
- `k` must not contain NULLs, so an implicit NOT NULL is added.
- The `$rowtime` column and system watermark are added implicitly.

##### Table with a primary key in append mode

Syntax: `CREATE TABLE t_pk_append (k INT PRIMARY KEY NOT ENFORCED, s STRING) DISTRIBUTED INTO 4 BUCKETS WITH ('changelog.mode' = 'append');`

Properties:

- Append changelog mode.
- `k` is the Schema Registry key.
- Hash distribution on `k`.
- The table has 4 Kafka partitions.
- `k` is declared as being unique, meaning no duplicate rows.
- `k` must not contain NULLs, meaning implicit NOT NULL.
- The `$rowtime` column and system watermark are added implicitly.

##### Table with hash distribution

Syntax: `CREATE TABLE t_dist (k INT, s STRING) DISTRIBUTED BY (k) INTO 4 BUCKETS;`

Properties:

- Append changelog mode.
- `k` is the Schema Registry key.
- Hash distribution on `k`.
- The table has 4 Kafka partitions.
- The `$rowtime` column and system watermark are added implicitly.
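
To make the difference between hash distribution and round-robin distribution concrete, the following Python sketch assigns records to buckets both ways. It is an illustration only: `zlib.crc32` is a stand-in hash chosen for this example, and the hash function that Confluent Cloud actually applies is not stated in this document. The function names `hash_bucket` and `round_robin` are hypothetical.

```python
import zlib
from itertools import count


def hash_bucket(key: bytes, buckets: int) -> int:
    """Map a serialized key to a bucket; equal keys always share a bucket."""
    # zlib.crc32 is a stand-in hash used for illustration only.
    return zlib.crc32(key) % buckets


def round_robin(buckets: int):
    """Yield bucket numbers in rotation, ignoring record content."""
    for i in count():
        yield i % buckets
```

With hash distribution, every record with the same key lands in the same bucket, which is what upsert tables rely on. With round-robin distribution, consecutive records rotate through all buckets regardless of content.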
##### Complex table with all concepts combined

Syntax: `CREATE TABLE t_complex (k1 INT, k2 INT, PRIMARY KEY (k1, k2) NOT ENFORCED, s STRING) COMMENT 'My complex table' DISTRIBUTED BY HASH(k1) INTO 4 BUCKETS WITH ('changelog.mode' = 'append');`

Properties:

- Append changelog mode.
- `k1` is the Schema Registry key.
- Hash distribution on `k1`.
- `k2` is treated as a value column and is stored in the value part of Schema Registry.
- The table has 4 Kafka partitions.
- `k1` and `k2` are declared as being unique, meaning no duplicates.
- `k1` and `k2` must not contain NULLs, meaning implicit NOT NULL.
- The `$rowtime` column and system watermark are added implicitly.
- An additional comment is added.

##### Table with overlapping names in key/value of Schema Registry but disjoint data

Syntax: `CREATE TABLE t_disjoint (from_key_k INT, k STRING) DISTRIBUTED BY (from_key_k) WITH ('key.fields-prefix' = 'from_key_');`

Properties:

- Append changelog mode.
- Hash distribution on `from_key_k`.
- The key prefix `from_key_` is defined and is stripped before storing the schema in Schema Registry.
  - Therefore, `k` is the Schema Registry key of type INT.
  - Also, `k` is the Schema Registry value of type STRING.
- Both key and value store disjoint data, so they can have different data types.

##### Create with overlapping names in key/value of Schema Registry but joint data

Syntax: `CREATE TABLE t_joint (k INT, v STRING) DISTRIBUTED BY (k) WITH ('value.fields-include' = 'all');`

Properties:

- Append changelog mode.
- Hash distribution on `k`.
- By default, the key is never included in the value in Schema Registry.
- By setting `'value.fields-include' = 'all'`, the value contains the full table schema.
  - Therefore, `k` is the Schema Registry key.
  - Also, `k` and `v` make up the Schema Registry value.
- The payload of `k` is stored twice in the Kafka message, because key and value store joint data and they have the same data type for `k`.
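
The way table columns are divided between the Schema Registry key and value in the examples above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actual implementation: `split_schema` and its parameters are hypothetical names, the key columns are taken to be exactly the `DISTRIBUTED BY` columns, and the combination of a key prefix with `'value.fields-include' = 'all'` is not modeled.

```python
def split_schema(columns, distributed_by, key_prefix="", include_all=False):
    """Split table columns into (key_fields, value_fields).

    columns        -- ordered list of (name, type) pairs from CREATE TABLE
    distributed_by -- column names listed in DISTRIBUTED BY
    key_prefix     -- value of 'key.fields-prefix' (stripped from key names)
    include_all    -- True when 'value.fields-include' = 'all'
    """
    key_columns = set(distributed_by)
    key_fields, value_fields = [], []
    for name, sql_type in columns:
        if name in key_columns:
            # Strip the prefix before the name is stored in the key schema.
            stripped = name
            if key_prefix and name.startswith(key_prefix):
                stripped = name[len(key_prefix):]
            key_fields.append((stripped, sql_type))
            if include_all:
                # Key columns are repeated in the value only on request.
                value_fields.append((name, sql_type))
        else:
            value_fields.append((name, sql_type))
    return key_fields, value_fields
```

Applied to `t_disjoint`, the sketch produces a key field `k` of type INT and a value field `k` of type STRING. Applied to `t_joint` with `include_all=True`, `k` appears in both the key and the value.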
##### Table with metadata columns for writing a Kafka message timestamp

Syntax: `CREATE TABLE t_metadata_write (name STRING, ts TIMESTAMP_LTZ(3) NOT NULL METADATA FROM 'timestamp') DISTRIBUTED INTO 1 BUCKETS;`

Properties:

- Adds the `ts` metadata column, which isn’t part of Schema Registry but instead is a pure Flink concept.
- In contrast with `$rowtime`, which is declared as a METADATA VIRTUAL column, `ts` is selected in a `SELECT *` statement and is writable.

The following examples show how to fill Kafka messages with an instant.

- `INSERT INTO t_metadata_write (ts, name) SELECT NOW(), 'Alice';`
- `INSERT INTO t_metadata_write (ts, name) SELECT TO_TIMESTAMP_LTZ(0, 3), 'Bob';`
- `SELECT $rowtime, * FROM t_metadata_write;`

The Schema Registry subject compatibility mode must be FULL or FULL_TRANSITIVE. For more information, see Schema Evolution and Compatibility for Schema Registry on Confluent Cloud.

##### Table with string key and value in Schema Registry

Syntax: `CREATE TABLE t_raw_string_key (key STRING, i INT) DISTRIBUTED BY (key) WITH ('key.format' = 'raw');`

Properties:

- Schema Registry is filled with a value subject containing `i`.
- The key columns are determined by the DISTRIBUTED BY clause.
- By default, Avro in Schema Registry would be used for the key, but the WITH clause overrides this to the raw format.

##### Tables with cross-region schema sharing

1. Create two Kafka clusters in different regions, for example, `eu-west-1` and `us-west-2`.
2. Create two Flink compute pools in different regions, for example, `eu-west-1` and `us-west-2`.
3. In the first region, run the following statement: `CREATE TABLE t_shared_schema (key STRING, s STRING) DISTRIBUTED BY (key);`
4. In the second region, run the same statement: `CREATE TABLE t_shared_schema (key STRING, s STRING) DISTRIBUTED BY (key);`

Properties:

- Schema Registry is shared across regions.
- The SQL metastore, Flink compute pools, and Kafka clusters are regional.
- Both tables in either region share the Schema Registry subjects `t_shared_schema-key` and `t_shared_schema-value`.
##### Create with different changelog modes

There are three ways of storing events in a table’s log, that is, in the underlying Kafka topic.

- **append**: Every insertion event is an immutable fact. Every event is insert-only. Events can be distributed in a round-robin fashion across workers/shards because they are unrelated.
- **upsert**: Events are related using a primary key. Every event is either an upsert or delete event for a primary key. Events for the same primary key should land at the same worker/shard.
- **retract**: Every upsert event is a fact that can be “undone”. This means that every event is either an insertion or its retraction. So, two events are related by all columns. In other words, the entire row is the key. For example, `+I['Bob', 42]` is related to `-D['Bob', 42]` and `+U['Alice', 13]` is related to `-U['Alice', 13]`.

The retract mode is intermediate between the append and upsert modes. The append and upsert modes are natural to existing Kafka consumers and producers. Kafka compaction is a kind of upsert.

Start with a table created by the following statement: `CREATE TABLE t_changelog_modes (i BIGINT);`

Properties:

- Confluent Cloud for Apache Flink always derives an appropriate changelog mode for the preceding declaration. If there is no primary key, append is the safest option, because it prevents users from pushing updates into a topic accidentally, and it has the best support of downstream consumers.

The following statement works because the query is non-updating: `INSERT INTO t_changelog_modes SELECT 1;`

The following statement does not work because the query is updating, which causes an error: `INSERT INTO t_changelog_modes SELECT COUNT(*) FROM (VALUES (1), (2), (3));`

If you need updates, and if downstream consumers support it, for example, when the consumer is another Flink job, you can set the changelog mode to retract: `ALTER TABLE t_changelog_modes SET ('changelog.mode' = 'retract');`

Properties:

- The table starts accepting retractions during INSERT INTO.
- Already existing records in the Kafka topic are treated as insertions.
- Newly added records receive a changeflag (+I, +U, -U, -D) in the Kafka message header.

Going back to append mode is possible, but retractions (-U, -D) appear as insertions, and the Kafka header metadata column reveals the changeflag.

- `ALTER TABLE t_changelog_modes SET ('changelog.mode' = 'append');`
- `ALTER TABLE t_changelog_modes ADD headers MAP<BYTES, BYTES> METADATA VIRTUAL;`
- `SELECT i, headers FROM t_changelog_modes;` shows what is serialized internally.

##### Table with infinite retention time

`CREATE TABLE t_infinite_retention (i INT) WITH ('kafka.retention.time' = '0');`

Properties:

- By default, the retention time is 7 days, as in all other APIs.
- Flink doesn’t support -1 for durations, so 0 means infinite retention time.
- Durations in Flink support `2 day` or `2 d` syntax, so the value doesn’t need to be in milliseconds.
- If no unit is specified, the unit is milliseconds.
- The following units are supported: `"d", "day", "h", "hour", "m", "min", "minute", "ms", "milli", "millisecond", "µs", "micro", "microsecond", "ns", "nano", "nanosecond"`

#### Inferred table examples

Inferred tables are tables that have not been created by using a CREATE TABLE statement, but instead are automatically detected from information about existing Kafka topics and Schema Registry entries.

You can use the ALTER TABLE statement to evolve schemas for inferred tables.

The following examples show output from the SHOW CREATE TABLE statement called on the resulting table.

##### No key or value in Schema Registry

For an inferred table with no registered key or value schemas, SHOW CREATE TABLE returns the following output:

`` CREATE TABLE `t_raw` ( `key` VARBINARY(2147483647), `val` VARBINARY(2147483647) ) DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.format' = 'raw', 'value.format' = 'raw' ... ) ``

Properties:

- Key and value formats are raw (binary format) with BYTES.
- Following Kafka message semantics, both key and value support NULL as well, so the following code is valid: `INSERT INTO t_raw (key, val) SELECT CAST(NULL AS BYTES), CAST(NULL AS BYTES);`

##### No key but record value in Schema Registry

For the following value schema in Schema Registry:

`{ "type": "record", "name": "TestRecord", "fields": [ { "name": "i", "type": "int" }, { "name": "s", "type": "string" } ] }`

SHOW CREATE TABLE returns the following output:

`` CREATE TABLE `t_raw_key` ( `key` VARBINARY(2147483647), `i` INT NOT NULL, `s` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.format' = 'raw', 'value.format' = 'avro-registry' ... ) ``

Properties:

- The key format is raw (binary format) with BYTES.
- Following Kafka message semantics, the key supports NULL as well, so the following code is valid: `INSERT INTO t_raw_key SELECT CAST(NULL AS BYTES), 12, 'Bob';`

##### Atomic key and record value in Schema Registry

For the following key schema in Schema Registry:

`"int"`

And for the following value schema in Schema Registry:

`{ "type": "record", "name": "TestRecord", "fields": [ { "name": "i", "type": "int" }, { "name": "s", "type": "string" } ] }`

SHOW CREATE TABLE returns the following output:

`` CREATE TABLE `t_atomic_key` ( `key` INT NOT NULL, `i` INT NOT NULL, `s` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.format' = 'avro-registry', 'value.format' = 'avro-registry' ... ) ``

Properties:

- Schema Registry defines the column data type as INT NOT NULL.
- The column name, `key`, is used as the default, because Schema Registry doesn’t provide a column name.
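
The column inference shown in the two preceding examples can be approximated with a short Python sketch. It covers only the `int` and `string` Avro primitives that appear in these examples. The names `AVRO_TO_FLINK`, `infer_value_columns`, and `infer_key_column` are hypothetical and are not part of any Confluent or Flink API.

```python
# Only the two Avro primitives used in the examples above are covered.
AVRO_TO_FLINK = {
    "int": "INT",
    "string": "VARCHAR(2147483647)",
}


def infer_value_columns(avro_record: dict) -> list:
    """Render one column definition per field of an Avro record schema."""
    columns = []
    for field in avro_record["fields"]:
        flink_type = AVRO_TO_FLINK[field["type"]]
        # A plain (non-union) Avro type cannot hold null, hence NOT NULL.
        columns.append(f"`{field['name']}` {flink_type} NOT NULL")
    return columns


def infer_key_column(key_schema=None) -> str:
    """Render the key column for a raw key (None) or an atomic Avro key."""
    if key_schema is None:
        # Raw keys have no schema and become a nullable VARBINARY column.
        return "`key` VARBINARY(2147483647)"
    # Atomic keys carry no field name, so the default name `key` is used.
    return f"`key` {AVRO_TO_FLINK[key_schema]} NOT NULL"
```

Calling `infer_key_column()` models the `t_raw_key` case, and `infer_key_column("int")` models the `t_atomic_key` case.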
##### Overlapping names in key/value, no key in Schema Registry

For the following value schema in Schema Registry:

`{ "type": "record", "name": "TestRecord", "fields": [ { "name": "i", "type": "int" }, { "name": "key", "type": "string" } ] }`

SHOW CREATE TABLE returns the following output:

`` CREATE TABLE `t_raw_disjoint` ( `key_key` VARBINARY(2147483647), `i` INT NOT NULL, `key` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`key_key`) INTO 1 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.fields-prefix' = 'key_', 'key.format' = 'raw', 'value.format' = 'avro-registry' ... ) ``

Properties:

- The Schema Registry value schema defines columns `i INT NOT NULL` and `key STRING`.
- The column name `key BYTES` is used as the default if no key is in Schema Registry.
- Because `key` would collide with the value schema column, the `key_` prefix is added.

##### Record key and record value in Schema Registry

For the following key schema in Schema Registry:

`{ "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" } ] }`

And for the following value schema in Schema Registry:

`{ "type": "record", "name": "TestRecord", "fields": [ { "name": "name", "type": "string" }, { "name": "zip_code", "type": "string" } ] }`

SHOW CREATE TABLE returns the following output:

`` CREATE TABLE `t_sr_disjoint` ( `uid` INT NOT NULL, `name` VARCHAR(2147483647) NOT NULL, `zip_code` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`uid`) INTO 1 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'value.format' = 'avro-registry' ... ) ``

Properties:

- Schema Registry defines columns for both key and value.
- The column names of key and value are disjoint sets and don’t overlap.
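
The naming rule for a key without a schema, including the `key_` prefix used on a collision, can be expressed as a small Python function. This is a sketch of the behavior shown in the `t_raw_key` and `t_raw_disjoint` examples only; `raw_key_column_name` is a hypothetical helper, and how a further collision (for example, a value field already named `key_key`) would be resolved is not covered by this document.

```python
def raw_key_column_name(value_field_names, default="key", prefix="key_"):
    """Return (column_name, prefix_option) for a key without a schema.

    prefix_option is the 'key.fields-prefix' value that would be set,
    or None when the default name does not collide with a value field.
    """
    if default in value_field_names:
        # The default name is taken by a value column, so prefix the key.
        return prefix + default, prefix
    return default, None
```

For the value fields `i` and `key`, the sketch returns the column name `key_key` together with the prefix `key_`, which matches the `SHOW CREATE TABLE` output for `t_raw_disjoint`.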
Record key and record value with overlap in Schema Registry¶ For the following key schema in Schema Registry: { "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" } ] } And for the following value schema in Schema Registry: { "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" },{ "name": "name", "type": "string" }, { "name": "zip_code", "type": "string" } ] } SHOW CREATE TABLE returns the following output: CREATE TABLE `t_sr_joint` ( `uid` INT NOT NULL, `name` VARCHAR(2147483647) NOT NULL, `zip_code` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`uid`) INTO 1 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'value.fields-include' = 'all', 'value.format' = 'avro-registry' ... ) Properties Schema Registry defines columns for both key and value. The column names of key and value overlap on uid. 'value.fields-include' = 'all' is set to exclude the key, because it is fully contained in the value. Detecting that key is fully contained in the value requires that both field name and data type match completely, including nullability, and all fields of the key are included in the value. Union types in Schema Registry¶ For the following value schema in Schema Registry: ["int", "string"] SHOW CREATE TABLE returns the following output: CREATE TABLE `t_union` ( `key` VARBINARY(2147483647), `int` INT, `string` VARCHAR(2147483647) ) ... For the following value schema in Schema Registry: [ "string", { "type": "record", "name": "User", "fields": [ { "name": "uid", "type": "int" },{ "name": "name", "type": "string" } ] }, { "type": "record", "name": "Address", "fields": [ { "name": "zip_code", "type": "string" } ] } ] SHOW CREATE TABLE returns the following output: CREATE TABLE `t_union` ( `key` VARBINARY(2147483647), `string` VARCHAR(2147483647), `User` ROW<`uid` INT NOT NULL, `name` VARCHAR(2147483647) NOT NULL>, `Address` ROW<`zip_code` VARCHAR(2147483647) NOT NULL> ) ... 
Properties NULL and NOT NULL are inferred depending on whether a union contains NULL. Elements of a union are always nullable, because they need to be set to NULL when a different element is set. If a record defines a namespace, the field is prefixed with it, for example, org.myorg.avro.User. Multi-message protobuf schema in Schema Registry¶ For the following value schema in Schema Registry: syntax = "proto3"; message Purchase { string item = 1; double amount = 2; string customer_id = 3; } message Pageview { string url = 1; bool is_special = 2; string customer_id = 3; } SHOW CREATE TABLE returns the following output: CREATE TABLE `t` ( `key` VARBINARY(2147483647), `Purchase` ROW< `item` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL >, `Pageview` ROW< `url` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL > ) ... For the following value schema in Schema Registry: syntax = "proto3"; message Purchase { string item = 1; double amount = 2; string customer_id = 3; Pageview pageview = 4; } message Pageview { string url = 1; bool is_special = 2; string customer_id = 3; } SHOW CREATE TABLE returns the following output: CREATE TABLE `t` ( `key` VARBINARY(2147483647), `Purchase` ROW< `item` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL, `pageview` ROW< `url` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL > >, `Pageview` ROW< `url` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL > ) ...
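The nested message columns in the table inferred just above (where `Purchase` references `Pageview`) could be read with dot notation. This is a sketch only, assuming the table `t` exists with that inferred schema. The alias `pageview_url` is a hypothetical name chosen for illustration.

```sql
-- Read Purchase messages only. Each top-level message column is a nullable ROW,
-- so filter on the one that is set.
SELECT
  `Purchase`.`item`,
  `Purchase`.`amount`,
  `Purchase`.`pageview`.`url` AS pageview_url  -- nested ROW; may be NULL
FROM `t`
WHERE `Purchase` IS NOT NULL;
```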
For the following value schema in Schema Registry: syntax = "proto3"; message Purchase { string item = 1; double amount = 2; string customer_id = 3; Pageview pageview = 4; message Pageview { string url = 1; bool is_special = 2; string customer_id = 3; } } SHOW CREATE TABLE returns the following output: CREATE TABLE `t` ( `key` VARBINARY(2147483647), `item` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL, `pageview` ROW< `url` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL > ) ... Debezium CDC format in Schema Registry¶ For a Debezium CDC format with the following value schema in Schema Registry: { "type": "record", "name": "Customer", "namespace": "io.debezium.data", "fields": [ { "name": "before", "type": ["null", { "type": "record", "name": "Value", "fields": [ {"name": "id", "type": "int"}, {"name": "name", "type": "string"}, {"name": "email", "type": "string"} ] }], "default": null }, { "name": "after", "type": ["null", "Value"], "default": null }, { "name": "source", "type": { "type": "record", "name": "Source", "fields": [ {"name": "version", "type": "string"}, {"name": "connector", "type": "string"}, {"name": "name", "type": "string"}, {"name": "ts_ms", "type": "long"}, {"name": "db", "type": "string"}, {"name": "schema", "type": "string"}, {"name": "table", "type": "string"} ] } }, {"name": "op", "type": "string"}, {"name": "ts_ms", "type": ["null", "long"], "default": null}, {"name": "transaction", "type": ["null", { "type": "record", "name": "Transaction", "fields": [ {"name": "id", "type": "string"}, {"name": "total_order", "type": "long"}, {"name": "data_collection_order", "type": "long"} ] }], "default": null} ] } SHOW CREATE TABLE returns the following output: CREATE TABLE `customer_changes` ( `key` VARBINARY(2147483647), `id` INT NOT NULL, `name` VARCHAR(2147483647) NOT NULL, `email` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`key`) 
INTO 6 BUCKETS WITH ( 'changelog.mode' = 'retract', 'connector' = 'confluent', 'key.format' = 'raw', 'value.format' = 'avro-debezium-registry' ... ) Properties Flink detects the Debezium format automatically, based on the schema structure with after, before, and op fields. The table schema is inferred from the after schema, exposing only the actual data fields. Automatic Debezium Envelope Detection: For schemas created after May 19, 2025 at 09:00 UTC, Flink automatically detects Debezium envelopes and sets appropriate defaults: value.format defaults to *-debezium-registry (instead of *-registry) changelog.mode defaults to retract (instead of append) Exception: If Kafka cleanup.policy is compact, changelog.mode is set to upsert The default changelog.mode is retract, which properly handles all CDC operations, including inserts, updates, and deletes. You can manually override the changelog mode if necessary: -- Change to upsert mode for primary key-based operations ALTER TABLE customer_changes SET ('changelog.mode' = 'upsert'); -- Change to append mode (processes only inserts and updates) ALTER TABLE customer_changes SET ('changelog.mode' = 'append'); ALTER TABLE examples¶ The following examples show frequently used scenarios for ALTER TABLE. Define a watermark for perfectly ordered data¶ Flink guarantees that rows are always emitted before the watermark is generated. The following statements ensure that for perfectly ordered events, meaning events without time-skew, a watermark can be equal to the timestamp or 1 ms less than the timestamp. CREATE TABLE t_perfect_watermark (i INT); -- If multiple events can have the same timestamp. ALTER TABLE t_perfect_watermark MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '0.001' SECOND; -- If a single event can have the timestamp. ALTER TABLE t_perfect_watermark MODIFY WATERMARK FOR $rowtime AS $rowtime; Drop your custom watermark strategy¶ Remove the custom watermark strategy to restore the default watermark strategy. 
View the current table schema and metadata. DESCRIBE `orders`; Your output should resemble: +-------------+------------------------+----------+-------------------+ | Column Name | Data Type | Nullable | Extras | +-------------+------------------------+----------+-------------------+ | user | BIGINT | NOT NULL | PRIMARY KEY | | product | STRING | NULL | | | amount | INT | NULL | | | ts | TIMESTAMP(3) *ROWTIME* | NULL | WATERMARK AS `ts` | +-------------+------------------------+----------+-------------------+ Remove the watermark strategy of the table. ALTER TABLE `orders` DROP WATERMARK; Your output should resemble: Statement phase is COMPLETED. Check the new table schema and metadata. DESCRIBE `orders`; Your output should resemble: +-------------+--------------+----------+-------------+ | Column Name | Data Type | Nullable | Extras | +-------------+--------------+----------+-------------+ | user | BIGINT | NOT NULL | PRIMARY KEY | | product | STRING | NULL | | | amount | INT | NULL | | | ts | TIMESTAMP(3) | NULL | | +-------------+--------------+----------+-------------+ Configure Debezium format for CDC data¶ Change regular format to Debezium format¶ Note For schemas created after May 19, 2025 at 09:00 UTC, Flink automatically detects Debezium envelopes and configures the appropriate format and changelog mode. Manual conversion is necessary only for older schemas or when you want to override the default behavior. 
For tables that have been inferred with regular formats but contain Debezium CDC (Change Data Capture) data: AvroJSON SchemaProtobuf-- Convert from regular Avro format to Debezium CDC format -- and configure the appropriate Flink changelog interpretation mode: -- * append: Treats each record as an INSERT operation with no relationship between records -- * retract: Handles paired operations (INSERT/UPDATE/DELETE) where changes to the same row -- are represented as a retraction of the old value followed by an addition of the new value -- * upsert: Groups all operations for the primary key (derived from the Kafka message key), -- with each operation effectively merging with or replacing previous state -- (INSERT creates, UPDATE modifies, DELETE removes) ALTER TABLE customer_data SET ( 'value.format' = 'avro-debezium-registry', 'changelog.mode' = 'retract' ); -- Convert from regular JSON format to Debezium CDC format -- and configure the appropriate Flink changelog interpretation mode: -- * append: Treats each record as an INSERT operation with no relationship between records -- * retract: Handles paired operations (INSERT/UPDATE/DELETE) where changes to the same row -- are represented as a retraction of the old value followed by an addition of the new value -- * upsert: Groups all operations for the primary key (derived from the Kafka message key), -- with each operation effectively merging with or replacing previous state -- (INSERT creates, UPDATE modifies, DELETE removes) ALTER TABLE customer_data_json SET ( 'value.format' = 'json-debezium-registry', 'changelog.mode' = 'retract' ); -- Convert from regular Protobuf format to Debezium CDC format -- and configure the appropriate Flink changelog interpretation mode: -- * append: Treats each record as an INSERT operation with no relationship between records -- * retract: Handles paired operations (INSERT/UPDATE/DELETE) where changes to the same row -- are represented as a retraction of the old value followed by an 
addition of the new value -- * upsert: Groups all operations for the primary key (derived from the Kafka message key), -- with each operation effectively merging with or replacing previous state -- (INSERT creates, UPDATE modifies, DELETE removes) ALTER TABLE customer_data_proto SET ( 'value.format' = 'proto-debezium-registry', 'changelog.mode' = 'retract' ); Modify Changelog Processing Mode¶ For tables with any type of data that need a different processing mode for handling changes: -- Change to append mode (default) -- Best for event streams where each record is independent ALTER TABLE customer_changes SET ( 'changelog.mode' = 'append' ); -- Change to retract mode -- Useful when changes to the same row are represented as paired operations ALTER TABLE customer_changes SET ( 'changelog.mode' = 'retract' ); -- Change upsert mode when working with primary keys -- Best when tracking state changes using a primary key (derived from Kafka message key) ALTER TABLE customer_changes SET ( 'changelog.mode' = 'upsert' ); Read and/or write Kafka headers¶ -- Create example topic CREATE TABLE t_headers (i INT); -- For read-only (virtual) ALTER TABLE t_headers ADD headers MAP<BYTES, BYTES> METADATA VIRTUAL; -- For read and write (persisted). Column becomes mandatory in INSERT INTO. ALTER TABLE t_headers MODIFY headers MAP<BYTES, BYTES> METADATA; -- Use implicit casting (origin is always MAP<BYTES, BYTES>) ALTER TABLE t_headers MODIFY headers MAP<STRING, STRING> METADATA; -- Insert and read INSERT INTO t_headers SELECT 42, MAP['k1', 'v1', 'k2', 'v2']; SELECT * FROM t_headers; Properties The metadata key is headers. If you don’t want to name the column this way, use: other_name MAP<BYTES, BYTES> METADATA FROM 'headers' VIRTUAL. Keys of headers must be unique. Multi-key headers are not supported. Add headers as a metadata column¶ You can get the headers of a Kafka record as a map of raw bytes by adding a headers virtual metadata column. 
Run the following statement to add the Kafka headers as a metadata column: ALTER TABLE `orders` ADD ( `headers` MAP<BYTES,BYTES> METADATA VIRTUAL); View the new schema. DESCRIBE `orders`; Your output should resemble: +-------------+-------------------+----------+-------------------------+ | Column Name | Data Type | Nullable | Extras | +-------------+-------------------+----------+-------------------------+ | user | BIGINT | NOT NULL | PRIMARY KEY, BUCKET KEY | | product | STRING | NULL | | | amount | INT | NULL | | | ts | TIMESTAMP(3) | NULL | | | headers | MAP<BYTES, BYTES> | NULL | METADATA VIRTUAL | +-------------+-------------------+----------+-------------------------+ Read topic from specific offsets¶ -- Create example topic with 1 partition filled with values CREATE TABLE t_specific_offsets (i INT) DISTRIBUTED INTO 1 BUCKETS; INSERT INTO t_specific_offsets VALUES (1), (2), (3), (4), (5); -- Returns 1, 2, 3, 4, 5 SELECT * FROM t_specific_offsets; -- Changes the scan range ALTER TABLE t_specific_offsets SET ( 'scan.startup.mode' = 'specific-offsets', 'scan.startup.specific-offsets' = 'partition:0,offset:3' ); -- Returns 4, 5 SELECT * FROM t_specific_offsets; Properties scan.startup.mode and scan.bounded.mode control which range in the changelog (Kafka topic) to read. scan.startup.specific-offsets and scan.bounded.specific-offsets define offsets per partition. In the example, only 1 partition is used. For multiple partitions, use the following syntax: 'scan.startup.specific-offsets' = 'partition:0,offset:3; partition:1,offset:42; partition:2,offset:0' Debug “no output” and no watermark cases¶ The root cause for most “no output” cases is that a time-based operation, for example, TUMBLE, MATCH_RECOGNIZE, and FOR SYSTEM_TIME AS OF, did not receive recent enough watermarks. The current time of an operator is calculated by the minimum watermark of all inputs, meaning across all tables/topics and their partitions.
If one partition does not emit a watermark, it can affect the entire pipeline. The following statements may be helpful for debugging issues related to watermarks. -- example table CREATE TABLE t_watermark_debugging (k INT, s STRING) DISTRIBUTED BY (k) INTO 4 BUCKETS; -- Each value lands in a separate Kafka partition (out of 4). -- Leave out values to see missing watermarks. INSERT INTO t_watermark_debugging VALUES (1, 'Bob'), (2, 'Alice'), (8, 'John'), (15, 'David'); -- If ROW_NUMBER doesn't show results, it's clearly a watermark issue. SELECT ROW_NUMBER() OVER (ORDER BY $rowtime ASC) AS `number`, * FROM t_watermark_debugging; -- Add partition information as metadata column ALTER TABLE t_watermark_debugging ADD part INT METADATA FROM 'partition' VIRTUAL; -- Use the CURRENT_WATERMARK() function to check which watermark is calculated SELECT *, part AS `Row Partition`, $rowtime AS `Row Timestamp`, CURRENT_WATERMARK($rowtime) AS `Operator Watermark` FROM t_watermark_debugging; -- Visualize the highest timestamp per Kafka partition -- Due to the table declaration (with 4 buckets), this query should show 4 rows. -- If not, the missing partitions might be the cause for watermark issues. SELECT part AS `Partition`, MAX($rowtime) AS `Max Timestamp in Partition` FROM t_watermark_debugging GROUP BY part; -- A workaround could be to not use the system watermark: ALTER TABLE t_watermark_debugging MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '2' SECOND; -- Or for perfect input data: ALTER TABLE t_watermark_debugging MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '0.001' SECOND; -- Add "fresh" data while the above statements with -- ROW_NUMBER() or CURRENT_WATERMARK() are running. INSERT INTO t_watermark_debugging VALUES (1, 'Fresh Bob'), (2, 'Fresh Alice'), (8, 'Fresh John'), (15, 'Fresh David'); The debugging examples above won’t solve everything but may help in finding the root cause. 
The system watermark strategy is smart and excludes idle Kafka partitions from the watermark calculation after some time, but at least one partition must produce new data for the “logical clock” with watermarks. Typically, root causes are: idle Kafka partitions; no data in Kafka partitions; not enough data in Kafka partitions; a watermark strategy that is too conservative; no fresh data after warm up with historical data for progressing the logical clock. Handle idle partitions for missing watermarks¶ Idle partitions often cause missing watermarks. Also, no data in a partition or infrequent data can be a root cause. -- Create a topic with 4 partitions. CREATE TABLE t_watermark_idle (k INT, s STRING) DISTRIBUTED BY (k) INTO 4 BUCKETS; -- Avoid the "not enough data" problem by using a custom watermark. -- The watermark strategy is still coarse-grained enough for this example. ALTER TABLE t_watermark_idle MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '2' SECONDS; -- Each value lands in a separate Kafka partition, and partition 1 is empty. INSERT INTO t_watermark_idle VALUES (1, 'Bob in partition 0'), (2, 'Alice in partition 3'), (8, 'John in partition 2'); -- Thread 1: Start a streaming job. SELECT ROW_NUMBER() OVER (ORDER BY $rowtime ASC) AS `number`, * FROM t_watermark_idle; -- Thread 2: Insert some data immediately -> Thread 1 still without results. INSERT INTO t_watermark_idle VALUES (1, 'Another Bob in partition 0 shortly after'); -- Thread 2: Insert some data after 15s -> Thread 1 should show results. INSERT INTO t_watermark_idle VALUES (1, 'Another Bob in partition 0 after 15s'); Within the first 15 seconds, all partitions contribute to the watermark calculation, so the first INSERT INTO has no effect because partition 1 is still empty. After 15 seconds, all partitions are marked as idle. No partition contributes to the watermark calculation. But when the second INSERT INTO is executed, it becomes the main driving partition for the logical clock.
The global watermark jumps to “second INSERT INTO - 2 seconds”. In the following code, the sql.tables.scan.idle-timeout configuration overrides the default idle-detection algorithm, so even an immediate INSERT INTO can be the main driving partition for the logical clock, because all other partitions are marked as idle after 1 second. -- Thread 1: Start a streaming job. -- Lower the idle timeout further. SET 'sql.tables.scan.idle-timeout' = '1s'; SELECT ROW_NUMBER() OVER (ORDER BY $rowtime ASC) AS `number`, * FROM t_watermark_idle; -- Thread 2: Insert some data immediately -> Thread 1 should show results. INSERT INTO t_watermark_idle VALUES (1, 'Another Bob in partition 0 shortly after'); Change the schema context property¶ You can set the schema context for key and value formats to control the namespace for your schema resolution in Schema Registry. Set the schema context for the value format ALTER TABLE `orders` SET ('value.format.schema-context' = '.lsrc-newcontext'); Your output should resemble: Statement phase is COMPLETED. Check the new table properties. 
SHOW CREATE TABLE `orders`; Your output should resemble: +----------------------------------------------------------------------+ | SHOW CREATE TABLE | +----------------------------------------------------------------------+ | CREATE TABLE `catalog`.`database`.`orders` ( | | `user` BIGINT NOT NULL, | | `product` VARCHAR(2147483647), | | `amount` INT, | | `ts` TIMESTAMP(3) | | ) | | DISTRIBUTED BY HASH(`user`) INTO 6 BUCKETS | | WITH ( | | 'changelog.mode' = 'upsert', | | 'connector' = 'confluent', | | 'kafka.cleanup-policy' = 'delete', | | 'kafka.max-message-size' = '2097164 bytes', | | 'kafka.retention.size' = '0 bytes', | | 'kafka.retention.time' = '604800000 ms', | | 'key.format' = 'avro-registry', | | 'scan.bounded.mode' = 'unbounded', | | 'scan.startup.mode' = 'latest-offset', | | 'value.format' = 'avro-registry', | | 'value.format.schema-context' = '.lsrc-newcontext' | | ) | | | +----------------------------------------------------------------------+ Inferred tables schema evolution¶ You can use the ALTER TABLE statement to evolve schemas for inferred tables. The following examples show output from the SHOW CREATE TABLE statement called on the resulting table. Schema Registry columns overlap with computed/metadata columns¶ For the following value schema in Schema Registry: { "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" } ] } Evolve a table by adding metadata: ALTER TABLE t_metadata_overlap ADD `timestamp` TIMESTAMP_LTZ(3) NOT NULL METADATA; SHOW CREATE TABLE returns the following output: CREATE TABLE `t_metadata_overlap` ( `key` VARBINARY(2147483647), `uid` INT NOT NULL, `timestamp` TIMESTAMP(3) WITH LOCAL TIME ZONE NOT NULL METADATA ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( ... ) Properties Schema Registry says there is a timestamp physical column, but Flink says there is a timestamp metadata column.
In this case, metadata columns and computed columns have precedence, and Confluent Cloud for Apache Flink removes the physical column from the schema. Because Confluent Cloud for Apache Flink advertises FULL_TRANSITIVE mode, queries still work, and the physical column is set to NULL in the payload: INSERT INTO t_metadata_overlap SELECT CAST(NULL AS BYTES), 42, TO_TIMESTAMP_LTZ(0, 3); Evolve the table by renaming metadata: ALTER TABLE t_metadata_overlap DROP `timestamp`; ALTER TABLE t_metadata_overlap ADD message_timestamp TIMESTAMP_LTZ(3) METADATA FROM 'timestamp'; SELECT * FROM t_metadata_overlap; SHOW CREATE TABLE returns the following output: CREATE TABLE `t_metadata_overlap` ( `key` VARBINARY(2147483647), `uid` INT NOT NULL, `timestamp` VARCHAR(2147483647), `message_timestamp` TIMESTAMP(3) WITH LOCAL TIME ZONE METADATA FROM 'timestamp' ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( ... ) Properties Now, both physical and metadata columns appear and can be accessed for reading and writing. Enrich a column that has no Schema Registry information¶ For the following value schema in Schema Registry: { "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" } ] } SHOW CREATE TABLE returns the following output: CREATE TABLE `t_enrich_raw_key` ( `key` VARBINARY(2147483647), `uid` INT NOT NULL ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.format' = 'raw', 'value.format' = 'avro-registry' ... ) Properties Schema Registry provides only information for the value part. Because the key part is not backed by Schema Registry, the key.format is raw. The default data type of raw is BYTES, but you can change this by using the ALTER TABLE statement. 
Evolve the table by giving a raw format column a specific type: ALTER TABLE t_enrich_raw_key MODIFY key STRING; SHOW CREATE TABLE returns the following output: CREATE TABLE `t_enrich_raw_key` ( `key` STRING, `uid` INT NOT NULL ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.format' = 'raw', 'value.format' = 'avro-registry' ... ) Properties Only changes to simple, atomic types, like INT, BYTES, and STRING are supported, where the binary representation is clear. For more complex modifications, use Schema Registry. In multi-cluster scenarios, the ALTER TABLE statement must be executed for every cluster, because the data type for key is stored in the Flink regional metastore. Configure Schema Registry subject names¶ When working with topics that use RecordNameStrategy or TopicRecordNameStrategy, you can configure the subject names for the schema resolution in Schema Registry. This is particularly useful when handling multiple event types in a single topic. 
For topics using these strategies, Flink initially infers a raw binary table: SHOW CREATE TABLE events; Your output will show a raw binary structure: CREATE TABLE `events` ( `key` VARBINARY(2147483647), `value` VARBINARY(2147483647) ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.format' = 'raw', 'value.format' = 'raw' ) Configure value schema subject names for each format: AvroJSON SchemaProtobufALTER TABLE events SET ( 'value.format' = 'avro-registry', 'value.avro-registry.subject-names' = 'com.example.Order;com.example.Shipment' ); ALTER TABLE events SET ( 'value.format' = 'json-registry', 'value.json-registry.subject-names' = 'com.example.Order;com.example.Shipment' ); ALTER TABLE events SET ( 'value.format' = 'proto-registry', 'value.proto-registry.subject-names' = 'com.example.Order;com.example.Shipment' ); If your topic uses keyed messages, you can also configure the key format: ALTER TABLE events SET ( 'key.format' = 'avro-registry', 'key.avro-registry.subject-names' = 'com.example.OrderKey' ); You can configure both key and value schema subject names in a single statement: ALTER TABLE events SET ( 'key.format' = 'avro-registry', 'key.avro-registry.subject-names' = 'com.example.OrderKey', 'value.format' = 'avro-registry', 'value.avro-registry.subject-names' = 'com.example.Order;com.example.Shipment' ); Properties: Use semicolons (;) to separate multiple subject names Subject names must match exactly with the names registered in Schema Registry The format prefix (avro-registry, json-registry, or proto-registry) must match the schema format in Schema Registry Reset a key value¶ You can use the RESET option to set any key to its default value. The following example shows how to reset a table that has a JSON Schema back to raw format. 
ALTER TABLE json_table RESET ( 'value.json-registry.wire-encoding', 'value.json-registry.subject-names' ); Custom error handling¶ You can use ALTER TABLE with the error-handling.mode and error-handling.log.target table properties to set custom error handling for deserialization errors. The following code example shows how to log errors to the specified Dead Letter Queue (DLQ) table and enable processing to continue. ALTER TABLE my_table SET ( 'error-handling.mode' = 'log', 'error-handling.log.target' = 'my_error_table' ); Related content¶ Video: How to Set Idle Timeouts SELECT examples¶ The following examples show frequently used scenarios for SELECT. Most minimal statement¶ Syntax SELECT 1; Properties Statement is bounded Check local time zone is configured correctly¶ Syntax SELECT NOW(); Properties Statement is bounded NOW() returns a TIMESTAMP_LTZ(3), so if the client is configured correctly, it should show a timestamp in your local time zone. Combine multiple tables into one¶ Syntax CREATE TABLE t_union_1 (i INT); CREATE TABLE t_union_2 (i INT); TABLE t_union_1 UNION ALL TABLE t_union_2; -- alternate syntax SELECT * FROM t_union_1 UNION ALL SELECT * FROM t_union_2; Get insights into the current watermark¶ Syntax CREATE TABLE t_watermarked_insight (s STRING) DISTRIBUTED INTO 1 BUCKETS; INSERT INTO t_watermarked_insight VALUES ('Bob'), ('Alice'), ('Charly'); SELECT $rowtime, CURRENT_WATERMARK($rowtime) FROM t_watermarked_insight; The output resembles: $rowtime EXPR$1 2024-04-29 11:59:01.080 NULL 2024-04-29 11:59:01.093 2024-04-04 15:27:37.433 2024-04-29 11:59:01.094 2024-04-04 15:27:37.433 Properties The CURRENT_WATERMARK function returns the watermark that arrived at the operator evaluating the SELECT statement. The returned watermark is the minimum of all inputs, across all tables/topics and their partitions. If a common watermark was not received from all inputs, the function returns NULL.
The CURRENT_WATERMARK function takes a time attribute, which is a column that has WATERMARK FOR defined. A watermark is always emitted after the row has been processed, so the first row always has a NULL watermark. Because the default watermark algorithm requires at least 250 records, it initially assumes the maximum lag of 7 days plus a safety margin of 7 days. The watermark lag shrinks quickly (exponentially) as more data arrives. Sources emit watermarks every 200 ms, but within the first 200 ms they emit a watermark per row, which powers examples like this one.

#### Flatten fields into columns

Syntax:

    CREATE TABLE t_flattening (i INT, r1 ROW<i INT, s STRING>, r2 ROW<other INT>);
    SELECT r1.*, r2.* FROM t_flattening;

Properties:

- You can apply the * operator on nested data, which enables flattening fields into columns of the table.

#### Schema reference examples

The following examples show how to use schema references in Flink SQL.

For the following schemas in Schema Registry:

Purchase schema:

- Avro: `{ "type":"record", "namespace": "io.confluent.developer.avro", "name":"Purchase", "fields": [ {"name": "item", "type":"string"}, {"name": "amount", "type": "double"}, {"name": "customer_id", "type": "string"} ] }`
- Protobuf: `syntax = "proto3"; package io.confluent.developer.proto; message Purchase { string item = 1; double amount = 2; string customer_id = 3; }`
- JSON: `{ "$schema": "http://json-schema.org/draft-07/schema#", "title": "Purchase", "type": "object", "properties": { "item": { "type": "string" }, "amount": { "type": "number" }, "customer_id": { "type": "string" } }, "required": ["item", "amount", "customer_id"] }`

Pageview schema:

- Avro: `{ "type":"record", "namespace": "io.confluent.developer.avro", "name":"Pageview", "fields": [ {"name": "url", "type":"string"}, {"name": "is_special", "type": "boolean"}, {"name": "customer_id", "type": "string"} ] }`
- Protobuf: `syntax = "proto3"; package io.confluent.developer.proto; message Pageview { string url = 1; bool is_special = 2; string customer_id = 3; }`
- JSON: `{ "$schema": "http://json-schema.org/draft-07/schema#", "title": "Pageview", "type": "object", "properties": { "url": { "type": "string" }, "is_special": { "type": "boolean" }, "customer_id": { "type": "string" } }, "required": ["url", "is_special", "customer_id"] }`

CustomerEvent schema:

- Avro: `[ "io.confluent.developer.avro.Purchase", "io.confluent.developer.avro.Pageview" ]`
- Protobuf: `syntax = "proto3"; package io.confluent.developer.proto; import "purchase.proto"; import "pageview.proto"; message CustomerEvent { oneof action { Purchase purchase = 1; Pageview pageview = 2; } }`
- JSON: `{ "$schema": "http://json-schema.org/draft-07/schema#", "title": "CustomerEvent", "type": "object", "oneOf": [ { "$ref": "io.confluent.developer.json.Purchase" }, { "$ref": "io.confluent.developer.json.Pageview" } ] }`

and references:

- Avro: `[ { "name": "io.confluent.developer.avro.Purchase", "subject": "purchase", "version": 1 }, { "name": "io.confluent.developer.avro.Pageview", "subject": "pageview", "version": 1 } ]`
- Protobuf: `[ { "name": "purchase.proto", "subject": "purchase", "version": 1 }, { "name": "pageview.proto", "subject": "pageview", "version": 1 } ]`
- JSON: `[ { "name": "io.confluent.developer.json.Purchase", "subject": "purchase", "version": 1 }, { "name": "io.confluent.developer.json.Pageview", "subject": "pageview", "version": 1 } ]`

the following statement:

    SHOW CREATE TABLE `customer-events`;

returns the following output:

    CREATE TABLE `customer-events` (
      `key` VARBINARY(2147483647),
      `Purchase` ROW<`item` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL>,
      `Pageview` ROW<`url` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL>
    )
    DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS
    WITH (
      'changelog.mode' = 'append',
      'connector' = 'confluent',
      'kafka.cleanup-policy' = 'delete',
      'kafka.max-message-size' = '2097164 bytes',
      'kafka.retention.size' = '0 bytes',
      'kafka.retention.time' = '7 d',
      'key.format' = 'raw',
      'scan.bounded.mode' = 'unbounded',
      'scan.startup.mode' = 'earliest-offset',
      'value.format' = '[VALUE_FORMAT]'
    )

#### Split into tables for each type

Syntax:

    CREATE TABLE purchase AS
      SELECT Purchase.* FROM `customer-events`
      WHERE Purchase IS NOT NULL;
    SELECT * FROM purchase;

    CREATE TABLE pageview AS
      SELECT Pageview.* FROM `customer-events`
      WHERE Pageview IS NOT NULL;
    SELECT * FROM pageview;

Output:

| item  | amount | customer_id |
|-------|--------|-------------|
| apple | 9.99   | u-21        |
| jam   | 4.29   | u-67        |
| mango | 13.99  | u-67        |
| socks | 7.99   | u-123       |

| url                      | is_special | customer_id |
|--------------------------|------------|-------------|
| https://www.confluent.io | TRUE       | u-67        |
| http://www.cflt.io       | FALSE      | u-12        |

#### Related content

- Flink SQL Queries
- Flink SQL Functions
- DDL Statements in Confluent Cloud for Apache Flink

Note: This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
CREATE TABLE t_minimal (s STRING);
```

```sql
CREATE TABLE t_pk (k INT PRIMARY KEY NOT ENFORCED, s STRING);
```

```sql
CREATE TABLE t_pk_append (k INT PRIMARY KEY NOT ENFORCED, s STRING)
  DISTRIBUTED INTO 4 BUCKETS
  WITH ('changelog.mode' = 'append');
```

```sql
CREATE TABLE t_dist (k INT, s STRING) DISTRIBUTED BY (k) INTO 4 BUCKETS;
```

```sql
CREATE TABLE t_complex (k1 INT, k2 INT, PRIMARY KEY (k1, k2) NOT ENFORCED, s STRING)
  COMMENT 'My complex table'
  DISTRIBUTED BY HASH(k1) INTO 4 BUCKETS
  WITH ('changelog.mode' = 'append');
```

```sql
CREATE TABLE t_disjoint (from_key_k INT, k STRING)
  DISTRIBUTED BY (from_key_k)
  WITH ('key.fields-prefix' = 'from_key_');
```

```sql
CREATE TABLE t_joint (k INT, v STRING)
  DISTRIBUTED BY (k)
  WITH ('value.fields-include' = 'all');
```

```sql
'value.fields-include' = 'all'
```

```sql
CREATE TABLE t_metadata_write (name STRING, ts TIMESTAMP_LTZ(3) NOT NULL METADATA FROM 'timestamp')
  DISTRIBUTED INTO 1 BUCKETS;
```

```sql
INSERT INTO t_metadata_write (ts, name) SELECT NOW(), 'Alice';
INSERT INTO t_metadata_write (ts, name) SELECT TO_TIMESTAMP_LTZ(0, 3), 'Bob';
SELECT $rowtime, * FROM t_metadata_write;
```

```sql
CREATE TABLE t_raw_string_key (key STRING, i INT)
  DISTRIBUTED BY (key)
  WITH ('key.format' = 'raw');
```

```sql
CREATE TABLE t_shared_schema (key STRING, s STRING) DISTRIBUTED BY (key);
```

```text
t_shared_schema-key
```

```text
t_shared_schema-value
```

```text
+I['Bob', 42]
```

```text
-D['Bob', 42]
```

```text
+U['Alice', 13]
```

```text
-U['Alice', 13]
```

```sql
CREATE TABLE t_changelog_modes (i BIGINT);
```

```sql
-- works because the query is non-updating
INSERT INTO t_changelog_modes SELECT 1;

-- does not work because the query is updating, causing an error
INSERT INTO t_changelog_modes SELECT COUNT(*) FROM (VALUES (1), (2), (3));
```

```sql
ALTER TABLE t_changelog_modes SET ('changelog.mode' = 'retract');
```

```sql
ALTER TABLE t_changelog_modes SET ('changelog.mode' = 'append');
ALTER TABLE t_changelog_modes ADD headers MAP<BYTES, BYTES> METADATA VIRTUAL;

-- Shows what is serialized internally
SELECT i, headers FROM t_changelog_modes;
```

```sql
CREATE TABLE t_infinite_retention (i INT) WITH ('kafka.retention.time' = '0');
```

```text
"d", "day", "h", "hour", "m", "min", "minute", "ms", "milli", "millisecond",
"µs", "micro", "microsecond", "ns", "nano", "nanosecond"
```

```sql
CREATE TABLE `t_raw` (
  `key` VARBINARY(2147483647),
  `val` VARBINARY(2147483647)
) DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'raw'
  ...
)
```

```sql
INSERT INTO t_raw (key, val) SELECT CAST(NULL AS BYTES), CAST(NULL AS BYTES);
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "i",
      "type": "int"
    },
    {
      "name": "s",
      "type": "string"
    }
  ]
}
```

```sql
CREATE TABLE `t_raw_key` (
  `key` VARBINARY(2147483647),
  `i` INT NOT NULL,
  `s` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
INSERT INTO t_raw_key SELECT CAST(NULL AS BYTES), 12, 'Bob';
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "i",
      "type": "int"
    },
    {
      "name": "s",
      "type": "string"
    }
  ]
}
```

```sql
CREATE TABLE `t_atomic_key` (
  `key` INT NOT NULL,
  `i` INT NOT NULL,
  `s` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'avro-registry',
  'value.format' = 'avro-registry'
  ...
)
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "i",
      "type": "int"
    },
    {
      "name": "key",
      "type": "string"
    }
  ]
}
```

```sql
CREATE TABLE `t_raw_disjoint` (
  `key_key` VARBINARY(2147483647),
  `i` INT NOT NULL,
  `key` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`key_key`) INTO 1 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.fields-prefix' = 'key_',
  'key.format' = 'raw',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
i INT NOT NULL
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "uid",
      "type": "int"
    }
  ]
}
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "zip_code",
      "type": "string"
    }
  ]
}
```

```sql
CREATE TABLE `t_sr_disjoint` (
  `uid` INT NOT NULL,
  `name` VARCHAR(2147483647) NOT NULL,
  `zip_code` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`uid`) INTO 1 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'value.format' = 'avro-registry'
  ...
)
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "uid",
      "type": "int"
    }
  ]
}
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "uid",
      "type": "int"
    },
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "zip_code",
      "type": "string"
    }
  ]
}
```

```sql
CREATE TABLE `t_sr_joint` (
  `uid` INT NOT NULL,
  `name` VARCHAR(2147483647) NOT NULL,
  `zip_code` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`uid`) INTO 1 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'value.fields-include' = 'all',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
'value.fields-include' = 'all'
```

```json
["int", "string"]
```

```sql
CREATE TABLE `t_union` (
  `key` VARBINARY(2147483647),
  `int` INT,
  `string` VARCHAR(2147483647)
)
...
```

```json
[
  "string",
  {
    "type": "record",
    "name": "User",
    "fields": [
      {
        "name": "uid",
        "type": "int"
      },{
        "name": "name",
        "type": "string"
      }
    ]
  },
  {
    "type": "record",
    "name": "Address",
    "fields": [
      {
        "name": "zip_code",
        "type": "string"
      }
    ]
  }
]
```

```sql
CREATE TABLE `t_union` (
  `key` VARBINARY(2147483647),
  `string` VARCHAR(2147483647),
  `User` ROW<`uid` INT NOT NULL, `name` VARCHAR(2147483647) NOT NULL>,
  `Address` ROW<`zip_code` VARCHAR(2147483647) NOT NULL>
)
...
```

```text
org.myorg.avro.User
```

```protobuf
syntax = "proto3";

message Purchase {
   string item = 1;
   double amount = 2;
   string customer_id = 3;
}

message Pageview {
   string url = 1;
   bool is_special = 2;
   string customer_id = 3;
}
```

```sql
CREATE TABLE `t` (
  `key` VARBINARY(2147483647),
  `Purchase` ROW<
      `item` VARCHAR(2147483647) NOT NULL,
      `amount` DOUBLE NOT NULL,
      `customer_id` VARCHAR(2147483647) NOT NULL
   >,
  `Pageview` ROW<
      `url` VARCHAR(2147483647) NOT NULL,
      `is_special` BOOLEAN NOT NULL,
      `customer_id` VARCHAR(2147483647) NOT NULL
   >
)
...
```

```protobuf
syntax = "proto3";

message Purchase {
   string item = 1;
   double amount = 2;
   string customer_id = 3;
   Pageview pageview = 4;
}

message Pageview {
   string url = 1;
   bool is_special = 2;
   string customer_id = 3;
}
```

```sql
CREATE TABLE `t` (
  `key` VARBINARY(2147483647),
  `Purchase` ROW<
      `item` VARCHAR(2147483647) NOT NULL,
      `amount` DOUBLE NOT NULL,
      `customer_id` VARCHAR(2147483647) NOT NULL,
      `pageview` ROW<
         `url` VARCHAR(2147483647) NOT NULL,
         `is_special` BOOLEAN NOT NULL,
         `customer_id` VARCHAR(2147483647) NOT NULL
      >
   >,
  `Pageview` ROW<
      `url` VARCHAR(2147483647) NOT NULL,
      `is_special` BOOLEAN NOT NULL,
      `customer_id` VARCHAR(2147483647) NOT NULL
   >
)
...
```

```protobuf
syntax = "proto3";

message Purchase {
   string item = 1;
   double amount = 2;
   string customer_id = 3;
   Pageview pageview = 4;
   message Pageview {
      string url = 1;
      bool is_special = 2;
      string customer_id = 3;
   }
}
```

```sql
CREATE TABLE `t` (
  `key` VARBINARY(2147483647),
  `item` VARCHAR(2147483647) NOT NULL,
  `amount` DOUBLE NOT NULL,
  `customer_id` VARCHAR(2147483647) NOT NULL,
  `pageview` ROW<
      `url` VARCHAR(2147483647) NOT NULL,
      `is_special` BOOLEAN NOT NULL,
      `customer_id` VARCHAR(2147483647) NOT NULL
   >
)
...
```

```json
{
  "type": "record",
  "name": "Customer",
  "namespace": "io.debezium.data",
  "fields": [
    {
      "name": "before",
      "type": ["null", {
        "type": "record",
        "name": "Value",
        "fields": [
          {"name": "id", "type": "int"},
          {"name": "name", "type": "string"},
          {"name": "email", "type": "string"}
        ]
      }],
      "default": null
    },
    {
      "name": "after",
      "type": ["null", "Value"],
      "default": null
    },
    {
      "name": "source",
      "type": {
        "type": "record",
        "name": "Source",
        "fields": [
          {"name": "version", "type": "string"},
          {"name": "connector", "type": "string"},
          {"name": "name", "type": "string"},
          {"name": "ts_ms", "type": "long"},
          {"name": "db", "type": "string"},
          {"name": "schema", "type": "string"},
          {"name": "table", "type": "string"}
        ]
      }
    },
    {"name": "op", "type": "string"},
    {"name": "ts_ms", "type": ["null", "long"], "default": null},
    {"name": "transaction", "type": ["null", {
      "type": "record",
      "name": "Transaction",
      "fields": [
        {"name": "id", "type": "string"},
        {"name": "total_order", "type": "long"},
        {"name": "data_collection_order", "type": "long"}
      ]
    }], "default": null}
  ]
}
```

```sql
CREATE TABLE `customer_changes` (
  `key` VARBINARY(2147483647),
  `id` INT NOT NULL,
  `name` VARCHAR(2147483647) NOT NULL,
  `email` VARCHAR(2147483647) NOT NULL
)
DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'changelog.mode' = 'retract',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'avro-debezium-registry'
  ...
)
```

```sql
value.format
```

```sql
*-debezium-registry
```

```sql
changelog.mode
```

```sql
cleanup.policy
```

```sql
-- Change to upsert mode for primary key-based operations
ALTER TABLE customer_changes SET ('changelog.mode' = 'upsert');

-- Change to append mode (processes only inserts and updates)
ALTER TABLE customer_changes SET ('changelog.mode' = 'append');
```

```sql
CREATE TABLE t_perfect_watermark (i INT);

-- If multiple events can have the same timestamp.
ALTER TABLE t_perfect_watermark
  MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '0.001' SECOND;

-- If only a single event can have a given timestamp.
ALTER TABLE t_perfect_watermark
  MODIFY WATERMARK FOR $rowtime AS $rowtime;
```

```sql
DESCRIBE `orders`;
```

```text
+-------------+------------------------+----------+-------------------+
| Column Name |       Data Type        | Nullable |      Extras       |
+-------------+------------------------+----------+-------------------+
| user        | BIGINT                 | NOT NULL | PRIMARY KEY       |
| product     | STRING                 | NULL     |                   |
| amount      | INT                    | NULL     |                   |
| ts          | TIMESTAMP(3) *ROWTIME* | NULL     | WATERMARK AS `ts` |
+-------------+------------------------+----------+-------------------+
```

```sql
ALTER TABLE `orders` DROP WATERMARK;
```

```text
Statement phase is COMPLETED.
```

```sql
DESCRIBE `orders`;
```

```text
+-------------+--------------+----------+-------------+
| Column Name |  Data Type   | Nullable |   Extras    |
+-------------+--------------+----------+-------------+
| user        | BIGINT       | NOT NULL | PRIMARY KEY |
| product     | STRING       | NULL     |             |
| amount      | INT          | NULL     |             |
| ts          | TIMESTAMP(3) | NULL     |             |
+-------------+--------------+----------+-------------+
```

```sql
-- Convert from regular Avro format to Debezium CDC format
-- and configure the appropriate Flink changelog interpretation mode:
-- * append:  Treats each record as an INSERT operation with no relationship between records
-- * retract: Handles paired operations (INSERT/UPDATE/DELETE) where changes to the same row
--            are represented as a retraction of the old value followed by an addition of the new value
-- * upsert: Groups all operations for the primary key (derived from the Kafka message key),
--           with each operation effectively merging with or replacing previous state
--           (INSERT creates, UPDATE modifies, DELETE removes)
ALTER TABLE customer_data SET (
  'value.format' = 'avro-debezium-registry',
  'changelog.mode' = 'retract'
);
```

```sql
-- Convert from regular JSON format to Debezium CDC format
-- and configure the appropriate Flink changelog interpretation mode:
-- * append:  Treats each record as an INSERT operation with no relationship between records
-- * retract: Handles paired operations (INSERT/UPDATE/DELETE) where changes to the same row
--            are represented as a retraction of the old value followed by an addition of the new value
-- * upsert: Groups all operations for the primary key (derived from the Kafka message key),
--           with each operation effectively merging with or replacing previous state
--           (INSERT creates, UPDATE modifies, DELETE removes)
ALTER TABLE customer_data_json SET (
  'value.format' = 'json-debezium-registry',
  'changelog.mode' = 'retract'
);
```

```sql
-- Convert from regular Protobuf format to Debezium CDC format
-- and configure the appropriate Flink changelog interpretation mode:
-- * append:  Treats each record as an INSERT operation with no relationship between records
-- * retract: Handles paired operations (INSERT/UPDATE/DELETE) where changes to the same row
--            are represented as a retraction of the old value followed by an addition of the new value
-- * upsert: Groups all operations for the primary key (derived from the Kafka message key),
--           with each operation effectively merging with or replacing previous state
--           (INSERT creates, UPDATE modifies, DELETE removes)
ALTER TABLE customer_data_proto SET (
  'value.format' = 'proto-debezium-registry',
  'changelog.mode' = 'retract'
);
```

```sql
-- Change to append mode (default)
-- Best for event streams where each record is independent
ALTER TABLE customer_changes SET (
  'changelog.mode' = 'append'
);

-- Change to retract mode
-- Useful when changes to the same row are represented as paired operations
ALTER TABLE customer_changes SET (
  'changelog.mode' = 'retract'
);

-- Change to upsert mode when working with primary keys
-- Best when tracking state changes using a primary key (derived from Kafka message key)
ALTER TABLE customer_changes SET (
  'changelog.mode' = 'upsert'
);
```

```sql
-- Create example topic
CREATE TABLE t_headers (i INT);

-- For read-only (virtual)
ALTER TABLE t_headers ADD headers MAP<BYTES, BYTES> METADATA VIRTUAL;

-- For read and write (persisted). Column becomes mandatory in INSERT INTO.
ALTER TABLE t_headers MODIFY headers MAP<BYTES, BYTES> METADATA;

-- Use implicit casting (origin is always MAP<BYTES, BYTES>)
ALTER TABLE t_headers MODIFY headers MAP<STRING, STRING> METADATA;

-- Insert and read
INSERT INTO t_headers SELECT 42, MAP['k1', 'v1', 'k2', 'v2'];
SELECT * FROM t_headers;
```
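
A possible follow-up, sketched under the assumption that the `headers` column has already been modified to `MAP<STRING, STRING>` as in the statements above: individual header values can be read with the map element access syntax. The alias `k1_value` is illustrative only and does not come from the original documentation.

```sql
-- Read a single header value by its key (assumes headers is MAP<STRING, STRING>).
-- 'k1' is the key written by the INSERT statement in the previous example.
SELECT i, headers['k1'] AS k1_value FROM t_headers;
```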

```sql
other_name MAP<BYTES, BYTES> METADATA FROM 'headers' VIRTUAL
```

```sql
ALTER TABLE `orders` ADD (
  `headers` MAP<BYTES,BYTES> METADATA VIRTUAL);
```

```sql
DESCRIBE `orders`;
```

```text
+-------------+-------------------+----------+-------------------------+
| Column Name |     Data Type     | Nullable |         Extras          |
+-------------+-------------------+----------+-------------------------+
| user        | BIGINT            | NOT NULL | PRIMARY KEY, BUCKET KEY |
| product     | STRING            | NULL     |                         |
| amount      | INT               | NULL     |                         |
| ts          | TIMESTAMP(3)      | NULL     |                         |
| headers     | MAP<BYTES, BYTES> | NULL     | METADATA VIRTUAL        |
+-------------+-------------------+----------+-------------------------+
```

```sql
-- Create example topic with 1 partition filled with values
CREATE TABLE t_specific_offsets (i INT) DISTRIBUTED INTO 1 BUCKETS;
INSERT INTO t_specific_offsets VALUES (1), (2), (3), (4), (5);

-- Returns 1, 2, 3, 4, 5
SELECT * FROM t_specific_offsets;

-- Changes the scan range
ALTER TABLE t_specific_offsets SET (
  'scan.startup.mode' = 'specific-offsets',
  'scan.startup.specific-offsets' = 'partition:0,offset:3'
);

-- Returns 4, 5
SELECT * FROM t_specific_offsets;
```

```sql
scan.startup.mode
```

```sql
scan.bounded.mode
```

```sql
scan.startup.specific-offsets
```

```sql
scan.bounded.specific-offsets
```

```sql
'scan.startup.specific-offsets' = 'partition:0,offset:3; partition:1,offset:42; partition:2,offset:0'
```

```sql
-- example table
CREATE TABLE t_watermark_debugging (k INT, s STRING)
  DISTRIBUTED BY (k) INTO 4 BUCKETS;

-- Each value lands in a separate Kafka partition (out of 4).
-- Leave out values to see missing watermarks.
INSERT INTO t_watermark_debugging
  VALUES (1, 'Bob'), (2, 'Alice'), (8, 'John'), (15, 'David');

-- If ROW_NUMBER doesn't show results, it's clearly a watermark issue.
SELECT ROW_NUMBER() OVER (ORDER BY $rowtime ASC) AS `number`, *
  FROM t_watermark_debugging;

-- Add partition information as metadata column
ALTER TABLE t_watermark_debugging ADD part INT METADATA FROM 'partition' VIRTUAL;

-- Use the CURRENT_WATERMARK() function to check which watermark is calculated
SELECT
  *,
  part AS `Row Partition`,
  $rowtime AS `Row Timestamp`,
  CURRENT_WATERMARK($rowtime) AS `Operator Watermark`
FROM t_watermark_debugging;

-- Visualize the highest timestamp per Kafka partition
-- Due to the table declaration (with 4 buckets), this query should show 4 rows.
-- If not, the missing partitions might be the cause for watermark issues.
SELECT part AS `Partition`, MAX($rowtime) AS `Max Timestamp in Partition`
  FROM t_watermark_debugging
  GROUP BY part;

-- A workaround could be to not use the system watermark:
ALTER TABLE t_watermark_debugging
  MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '2' SECOND;
-- Or for perfect input data:
ALTER TABLE t_watermark_debugging
  MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '0.001' SECOND;

-- Add "fresh" data while the above statements with
-- ROW_NUMBER() or CURRENT_WATERMARK() are running.
INSERT INTO t_watermark_debugging VALUES
  (1, 'Fresh Bob'),
  (2, 'Fresh Alice'),
  (8, 'Fresh John'),
  (15, 'Fresh David');
```

```sql
-- Create a topic with 4 partitions.
CREATE TABLE t_watermark_idle (k INT, s STRING)
  DISTRIBUTED BY (k) INTO 4 BUCKETS;

-- Avoid the "not enough data" problem by using a custom watermark.
-- The watermark strategy is still coarse-grained enough for this example.
ALTER TABLE t_watermark_idle
  MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '2' SECONDS;

-- Each value lands in a separate Kafka partition, and partition 1 is empty.
INSERT INTO t_watermark_idle
  VALUES
    (1, 'Bob in partition 0'),
    (2, 'Alice in partition 3'),
    (8, 'John in partition 2');

-- Thread 1: Start a streaming job.
SELECT ROW_NUMBER() OVER (ORDER BY $rowtime ASC) AS `number`, *
  FROM t_watermark_idle;

-- Thread 2: Insert some data immediately -> Thread 1 still without results.
INSERT INTO t_watermark_idle
  VALUES (1, 'Another Bob in partition 0 shortly after');

-- Thread 2: Insert some data after 15s -> Thread 1 should show results.
INSERT INTO t_watermark_idle
  VALUES (1, 'Another Bob in partition 0 after 15s');
```

```sql
sql.tables.scan.idle-timeout
```

```sql
-- Thread 1: Start a streaming job.
-- Lower the idle timeout further.
SET 'sql.tables.scan.idle-timeout' = '1s';
SELECT ROW_NUMBER() OVER (ORDER BY $rowtime ASC) AS `number`, *
  FROM t_watermark_idle;

-- Thread 2: Insert some data immediately -> Thread 1 should show results.
INSERT INTO t_watermark_idle
  VALUES (1, 'Another Bob in partition 0 shortly after');
```

```sql
ALTER TABLE `orders` SET ('value.format.schema-context' = '.lsrc-newcontext');
```

```text
Statement phase is COMPLETED.
```

```sql
SHOW CREATE TABLE `orders`;
```

```text
+----------------------------------------------------------------------+
|                          SHOW CREATE TABLE                           |
+----------------------------------------------------------------------+
| CREATE TABLE `catalog`.`database`.`orders` (                         |
|   `user` BIGINT NOT NULL,                                            |
|   `product` VARCHAR(2147483647),                                     |
|   `amount` INT,                                                      |
|   `ts` TIMESTAMP(3)                                                  |
| )                                                                    |
|   DISTRIBUTED BY HASH(`user`) INTO 6 BUCKETS                         |
| WITH (                                                               |
|   'changelog.mode' = 'upsert',                                       |
|   'connector' = 'confluent',                                         |
|   'kafka.cleanup-policy' = 'delete',                                 |
|   'kafka.max-message-size' = '2097164 bytes',                        |
|   'kafka.retention.size' = '0 bytes',                                |
|   'kafka.retention.time' = '604800000 ms',                           |
|   'key.format' = 'avro-registry',                                    |
|   'scan.bounded.mode' = 'unbounded',                                 |
|   'scan.startup.mode' = 'latest-offset',                             |
|   'value.format' = 'avro-registry',                                  |
|   'value.format.schema-context' = '.lsrc-newcontext'                 |
| )                                                                    |
|                                                                      |
+----------------------------------------------------------------------+
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "uid",
      "type": "int"
    }
  ]
}
```

```sql
ALTER TABLE t_metadata_overlap ADD `timestamp` TIMESTAMP_LTZ(3) NOT NULL METADATA;
```

```sql
CREATE TABLE `t_metadata_overlap` (
  `key` VARBINARY(2147483647),
  `uid` INT NOT NULL,
  `timestamp` TIMESTAMP(3) WITH LOCAL TIME ZONE NOT NULL METADATA
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  ...
)
```

```sql
INSERT INTO t_metadata_overlap
  SELECT CAST(NULL AS BYTES), 42, TO_TIMESTAMP_LTZ(0, 3);
```

```sql
ALTER TABLE t_metadata_overlap DROP `timestamp`;

ALTER TABLE t_metadata_overlap
  ADD message_timestamp TIMESTAMP_LTZ(3) METADATA FROM 'timestamp';

SELECT * FROM t_metadata_overlap;
```

```sql
CREATE TABLE `t_metadata_overlap` (
  `key` VARBINARY(2147483647),
  `uid` INT NOT NULL,
  `timestamp` VARCHAR(2147483647),
  `message_timestamp` TIMESTAMP(3) WITH LOCAL TIME ZONE METADATA FROM 'timestamp'
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  ...
)
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "uid",
      "type": "int"
    }
  ]
}
```

```sql
CREATE TABLE `t_enrich_raw_key` (
  `key` VARBINARY(2147483647),
  `uid` INT NOT NULL
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
ALTER TABLE t_enrich_raw_key MODIFY key STRING;
```

```sql
CREATE TABLE `t_enrich_raw_key` (
  `key` STRING,
  `uid` INT NOT NULL
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
SHOW CREATE TABLE events;
```

```sql
CREATE TABLE `events` (
  `key` VARBINARY(2147483647),
  `value` VARBINARY(2147483647)
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'raw'
)
```

```sql
ALTER TABLE events SET (
  'value.format' = 'avro-registry',
  'value.avro-registry.subject-names' = 'com.example.Order;com.example.Shipment'
);
```

```sql
ALTER TABLE events SET (
  'value.format' = 'json-registry',
  'value.json-registry.subject-names' = 'com.example.Order;com.example.Shipment'
);
```

```sql
ALTER TABLE events SET (
  'value.format' = 'proto-registry',
  'value.proto-registry.subject-names' = 'com.example.Order;com.example.Shipment'
);
```

```sql
ALTER TABLE events SET (
  'key.format' = 'avro-registry',
  'key.avro-registry.subject-names' = 'com.example.OrderKey'
);
```

```sql
ALTER TABLE events SET (
  'key.format' = 'avro-registry',
  'key.avro-registry.subject-names' = 'com.example.OrderKey',
  'value.format' = 'avro-registry',
  'value.avro-registry.subject-names' = 'com.example.Order;com.example.Shipment'
);
```

```sql
avro-registry
```

```sql
json-registry
```

```sql
proto-registry
```

```sql
ALTER TABLE json_table RESET (
  'value.json-registry.wire-encoding',
  'value.json-registry.subject-names'
);
```

```sql
ALTER TABLE my_table SET (
  'error-handling.mode' = 'log',
  'error-handling.log.target' = 'my_error_table'
);
```

```sql
SELECT NOW();
```

```sql
CREATE TABLE t_union_1 (i INT);
CREATE TABLE t_union_2 (i INT);
TABLE t_union_1 UNION ALL TABLE t_union_2;

-- alternate syntax
SELECT * FROM t_union_1
UNION ALL
SELECT * FROM t_union_2;
```
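
To keep track of which input a row came from, a literal column can be added to each branch of the union. This is a minimal sketch that reuses the two tables above; the `src` column name and its values are illustrative assumptions.

```sql
-- Tag every row with the name of the table it was read from.
SELECT 't_union_1' AS src, i FROM t_union_1
UNION ALL
SELECT 't_union_2' AS src, i FROM t_union_2;
```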

```sql
CREATE TABLE t_watermarked_insight (s STRING) DISTRIBUTED INTO 1 BUCKETS;

INSERT INTO t_watermarked_insight VALUES ('Bob'), ('Alice'), ('Charly');

SELECT $rowtime, CURRENT_WATERMARK($rowtime) FROM t_watermarked_insight;
```

```text
$rowtime                EXPR$1
2024-04-29 11:59:01.080 NULL
2024-04-29 11:59:01.093 2024-04-04 15:27:37.433
2024-04-29 11:59:01.094 2024-04-04 15:27:37.433
```

```sql
CREATE TABLE t_flattening (i INT, r1 ROW<i INT, s STRING>, r2 ROW<other INT>);

SELECT r1.*, r2.* FROM t_flattening;
```
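
When a nested row shares a field name with a top-level column (here, both `t_flattening` and `r1` have a field named `i`), it can be clearer to select the nested fields explicitly and alias them instead of using `*`. The following is a minimal sketch; the alias names are illustrative assumptions.

```sql
-- Select nested fields one by one and give each a distinct column name.
SELECT
  i,
  r1.i AS r1_i,
  r1.s AS r1_s,
  r2.other AS r2_other
FROM t_flattening;
```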

```json
{
   "type":"record",
   "namespace": "io.confluent.developer.avro",
   "name":"Purchase",
   "fields": [
      {"name": "item", "type":"string"},
      {"name": "amount", "type": "double"},
      {"name": "customer_id", "type": "string"}
   ]
}
```

```protobuf
syntax = "proto3";

package io.confluent.developer.proto;

message Purchase {
   string item = 1;
   double amount = 2;
   string customer_id = 3;
}
```

```json
{
   "$schema": "http://json-schema.org/draft-07/schema#",
   "title": "Purchase",
   "type": "object",
   "properties": {
      "item": {
         "type": "string"
      },
      "amount": {
         "type": "number"
      },
      "customer_id": {
         "type": "string"
      }
   },
   "required": ["item", "amount", "customer_id"]
}
```

```json
{
   "type":"record",
   "namespace": "io.confluent.developer.avro",
   "name":"Pageview",
   "fields": [
      {"name": "url", "type":"string"},
      {"name": "is_special", "type": "boolean"},
      {"name": "customer_id", "type":  "string"}
   ]
}
```

```protobuf
syntax = "proto3";

package io.confluent.developer.proto;

message Pageview {
   string url = 1;
   bool is_special = 2;
   string customer_id = 3;
}
```

```json
{
   "$schema": "http://json-schema.org/draft-07/schema#",
   "title": "Pageview",
   "type": "object",
   "properties": {
      "url": {
         "type": "string"
      },
      "is_special": {
         "type": "boolean"
      },
      "customer_id": {
         "type": "string"
      }
   },
   "required": ["url", "is_special", "customer_id"]
}
```

```json
[
   "io.confluent.developer.avro.Purchase",
   "io.confluent.developer.avro.Pageview"
]
```

```protobuf
syntax = "proto3";

package io.confluent.developer.proto;

import "purchase.proto";
import "pageview.proto";

message CustomerEvent {
   oneof action {
      Purchase purchase = 1;
      Pageview pageview = 2;
   }
}
```

```json
{
   "$schema": "http://json-schema.org/draft-07/schema#",
   "title": "CustomerEvent",
   "type": "object",
   "oneOf": [
      { "$ref": "io.confluent.developer.json.Purchase" },
      { "$ref": "io.confluent.developer.json.Pageview" }
   ]
}
```

```json
[
   {
      "name": "io.confluent.developer.avro.Purchase",
      "subject": "purchase",
      "version": 1
   },
   {
      "name": "io.confluent.developer.avro.Pageview",
      "subject": "pageview",
      "version": 1
   }
]
```

```json
[
   {
      "name": "purchase.proto",
      "subject": "purchase",
      "version": 1
   },
   {
      "name": "pageview.proto",
      "subject": "pageview",
      "version": 1
   }
]
```

```json
[
   {
      "name": "io.confluent.developer.json.Purchase",
      "subject": "purchase",
      "version": 1
   },
   {
      "name": "io.confluent.developer.json.Pageview",
      "subject": "pageview",
      "version": 1
   }
]
```

```sql
SHOW CREATE TABLE `customer-events`;
```

```sql
CREATE TABLE `customer-events` (
  `key` VARBINARY(2147483647),
  `Purchase` ROW<`item` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL>,
  `Pageview` ROW<`url` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL>
)
DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'kafka.cleanup-policy' = 'delete',
  'kafka.max-message-size' = '2097164 bytes',
  'kafka.retention.size' = '0 bytes',
  'kafka.retention.time' = '7 d',
  'key.format' = 'raw',
  'scan.bounded.mode' = 'unbounded',
  'scan.startup.mode' = 'earliest-offset',
  'value.format' = '[VALUE_FORMAT]'
)
```

```sql
CREATE TABLE purchase AS
   SELECT Purchase.* FROM `customer-events`
   WHERE Purchase IS NOT NULL;

SELECT * FROM purchase;
```

```sql
CREATE TABLE pageview AS
   SELECT Pageview.* FROM `customer-events`
   WHERE Pageview IS NOT NULL;

SELECT * FROM pageview;
```
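
As an alternative to splitting into one table per type, a single query can label each event with its type. This is only a sketch built on the `customer-events` table shown above; the `event_type` column name and its values are illustrative assumptions, not part of the original example.

```sql
-- Label each row by whichever ROW column is populated.
-- Assumes one of the two columns is non-NULL for each event.
SELECT
  CASE
    WHEN Purchase IS NOT NULL THEN 'purchase'
    WHEN Pageview IS NOT NULL THEN 'pageview'
  END AS event_type,
  COALESCE(Purchase.customer_id, Pageview.customer_id) AS customer_id
FROM `customer-events`;
```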

---

### Flink SQL Syntax in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/sql-syntax.html

SQL is a domain-specific language for managing and manipulating data. It’s used primarily to work with structured data, where the types and relationships across entities are well-defined. Originally adopted for relational databases, SQL is rapidly becoming the language of choice for stream processing. It’s declarative, expressive, and ubiquitous.

The American National Standards Institute (ANSI) maintains a standard for the specification of SQL. Flink SQL is compliant with ANSI SQL 2011. Beyond the standard, there are many flavors and extensions to SQL so that it can express programs beyond what’s possible with the SQL 2011 grammar.

#### Lexical structure

Apache Flink® parses SQL using Apache Calcite, which supports standard ANSI SQL.

##### Syntax

Flink SQL inputs are made up of a series of statements. Each statement is made up of a series of tokens and ends in a semicolon (;). The tokens that apply depend on the statement being invoked.

A token is any keyword, identifier, backticked identifier, literal, or special character. By convention, tokens are separated by whitespace, unless there is no ambiguity in the grammar. This happens when tokens flank a special character.

The following example statements are syntactically valid Flink SQL input:

    -- Create a users table.
    CREATE TABLE users (
      user_id STRING,
      registertime BIGINT,
      gender STRING,
      regionid STRING
    );

    -- Populate the table with mock users data.
    INSERT INTO users VALUES
      ('Thomas A. Anderson', 1677260724, 'male', 'Region_4'),
      ('Trinity', 1677260733, 'female', 'Region_4'),
      ('Morpheus', 1677260742, 'male', 'Region_8');

    SELECT * FROM users;

##### Keywords

Some tokens, such as SELECT, INSERT, and CREATE, are keywords. Keywords are reserved tokens that have a specific meaning in Flink’s syntax. They control their surrounding allowable tokens and execution semantics. Keywords are case-insensitive, meaning SELECT and select are equivalent.
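
As a small illustration of keyword case-insensitivity, the following sketch reuses the `users` table from the example above. Both statements are expected to parse identically, because only the keywords differ in case. The identifiers keep their original casing, since identifiers are case-sensitive by default.

```sql
-- Keywords are case-insensitive, so these two statements are equivalent.
SELECT user_id, regionid FROM users;
select user_id, regionid from users;
```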
You can’t create an identifier that is already a reserved word, unless you use backticked identifiers, for example, `table`. For a complete list of keywords, see Flink SQL Reserved Keywords. Identifiers¶ Identifiers are symbols that represent user-defined entities, like tables, columns, and other objects. For example, if you have a table named t1, t1 is an identifier for that table. By default, identifiers are case-sensitive, meaning t1 and T1 refer to different tables. Unless an identifier is backticked, it may be composed only of characters that are a letter, number, or underscore. There is no imposed limit on the number of characters. To make it possible to use any character in an identifier, you can enclose it in backtick characters (`) when you declare and use it. A backticked identifier is useful when you don’t control the data, so it might have special characters, or even keywords. If you want to use one of the keyword strings as an identifier, enclose them with backticks, for example: `value` `count` When you use backticked identifiers, Flink SQL captures the case exactly, and any future references to the identifier are case-sensitive. For example, if you declare the following table: CREATE TABLE `t1` ( id VARCHAR, `@MY-identifier-table-column!` INT); You must select from it by backticking the table name and column name and using the original casing: SELECT `@MY-identifier-table-column!` FROM `t1`; If you use an invalid identifier without enclosing it in backticks, you receive a SQL parse failed error. For example, the following SQL query tries to read records from a table named table-with-dashes, but the dash character (-) is not valid in an identifier. SELECT * FROM table-with-dashes; The error output resembles: SQL parse failed. Encountered "-" at line 1, column 20. 
You can fix the error by enclosing the identifier with backticks: SELECT * FROM `table-with-dashes`; Constants¶ There are three implicitly typed constants, or literals, in Flink SQL: strings, numbers, and booleans. String constants¶ A string constant is an arbitrary series of characters surrounded by single quotes ('), like 'Hello world'. To include a quote inside of a string literal, escape the quote by prefixing it with another quote, for example, 'You can call me ''Stuart'', or Stu.' Numeric constants¶ Numeric constants are accepted in the following forms: digits digits.[digits][e[+-]digits] [digits].digits[e[+-]digits] digitse[+-]digits where digits is one or more single-digit integers (0 through 9). At least one digit must be present before or after the decimal point, if there is one. At least one digit must follow the exponent symbol e, if there is one. No spaces, underscores, or any other characters are allowed in the constant. Numeric constants may also have a + or - prefix, but this is considered to be a function applied to the constant, not the constant itself. Here are some examples of valid numeric constants: 5 7.2 0.0087 1. .5 1e-3 1.332434e+2 +100 -250 Boolean constants¶ A boolean constant is represented as either the identifier true or false. Boolean constants are not case-sensitive, which means that true evaluates to the same value as TRUE. Operators¶ Operators are infix functions composed of special characters. Flink SQL doesn’t allow you to add user-space operators. For a complete list of operators, see Comparison Functions in Confluent Cloud for Apache Flink. Special characters¶ Some characters have a particular meaning that doesn’t correspond to an operator. The following list describes the special characters and their purposes. Parentheses (()) retain their usual meaning in programming languages for grouping expressions and controlling the order of evaluation. 
Brackets ([]) are used to work with arrays, both in their construction and subscript access. They also allow you to key into maps. Commas (,) delineate a discrete list of entities. The semi-colon (;) terminates a SQL statement. The asterisk (*), when used in particular syntax, is used as an “all” qualifier. This is seen most commonly in a SELECT command to retrieve all columns. The period (.) accesses a column in a table or a field in a struct data type. Comments¶ A comment is a string beginning with two dashes. It includes all of the content from the dashes to the end of the line: -- Here is a comment. You can also span a comment over multiple lines by using C-style syntax: /* Here is another comment. */ Lexical precedence¶ Operators are evaluated using the following order of precedence: *, /, % +, - =, >, <, >=, <=, <>, != NOT AND BETWEEN, LIKE, OR In an expression, when two operators have the same precedence level, they’re evaluated left-to-right, based on their position. You can enclose an expression in parentheses to force precedence or clarify precedence, for example, (5 + 2) * 3. Related content¶ Flink SQL Reserved Keywords Data Types Flink SQL Queries DDL Statements in Confluent Cloud for Apache Flink Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
-- Create a users table.
CREATE TABLE users (
  user_id STRING,
  registertime BIGINT,
  gender STRING,
  regionid STRING
);

-- Populate the table with mock users data.
INSERT INTO users VALUES
  ('Thomas A. Anderson', 1677260724, 'male', 'Region_4'),
  ('Trinity', 1677260733, 'female', 'Region_4'),
  ('Morpheus', 1677260742, 'male', 'Region_8');

SELECT * FROM users;
```

```sql
CREATE TABLE `t1` (
  id VARCHAR,
  `@MY-identifier-table-column!` INT);
```

```sql
SELECT `@MY-identifier-table-column!` FROM `t1`;
```

```sql
SELECT * FROM table-with-dashes;
```

```text
SQL parse failed. Encountered "-" at line 1, column 20.
```

```sql
SELECT * FROM `table-with-dashes`;
```

```sql
'Hello world'
```

```sql
'You can call me ''Stuart'', or Stu.'
```

```sql
1.332434e+2
```

```sql
-- Here is a comment.
```

```sql
/* Here is
   another comment.
*/
```

```sql
(5 + 2) * 3
```
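
The lexical rules on this page can be exercised together in a pair of short queries. This is a minimal sketch rather than an example from the original page: the column aliases and the inline `VALUES` table are made up for illustration, and it assumes a `SELECT` without a `FROM` clause is accepted in your workspace.

```sql
-- Illustrative only: all aliases are hypothetical names.
SELECT
  'You can call me ''Stuart'''  AS escaped_string,    -- doubled quote inside a string constant
  1.332434e+2                   AS numeric_constant,  -- digits.[digits][e[+-]digits] form
  TRUE                          AS boolean_constant,  -- same value as true
  5 + 2 * 3                     AS default_order,     -- * binds tighter than +, giving 11
  (5 + 2) * 3                   AS forced_order;      -- parentheses force + first, giving 21

-- Keyword strings used as identifiers must be backticked.
SELECT `value` AS `count`
FROM (VALUES (1)) AS t (`value`);
```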

---

### SQL ALTER CONNECTION Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/alter-connection.html

Confluent Cloud for Apache Flink® supports creating secure connections to external services and data sources. You can use these connections in your Flink statements. Use the ALTER CONNECTION statement to change the API key or credentials of an existing connection.

#### Syntax

`ALTER CONNECTION [IF EXISTS] [catalog_name.][db_name.]connection_name SET (key1=val1[, key2=val2]...)`

#### Description

Change the API key or credentials of a connection.

Secrets are extracted to the secret store and aren’t displayed in subsequent DESCRIBE CONNECTION statements, the Flink SQL shell, or the Confluent Cloud Console.

Confluent Cloud for Apache Flink makes a best-effort attempt to redact sensitive values from the CREATE CONNECTION and ALTER CONNECTION statements by masking the values for the known sensitive keys. In Confluent Cloud Console, the sensitive values are redacted in the Flink SQL workspace if you navigate away from the workspace and return, or if you reload the page in the browser. Alternatively, you can use the Confluent CLI commands to create and manage connections.

In addition, if syntax in the CREATE CONNECTION statement is incorrect, Confluent Cloud for Apache Flink may not detect the secrets. For example, if you type `CREATE CONNECTION my_conn WITH ('ap-key' = 'x')`, Flink won’t redact the `x`, because `api-key` is misspelled.

#### Examples

- Update the API key for a connection: `` ALTER CONNECTION `conn-one` SET ('api-key' = '<new-api-key>'); ``
- Update the credentials for a connection: `` ALTER CONNECTION `my-couchbase-conn` SET ('username' = '<user-name>', 'password' = '<new-password>'); ``

#### Related content

- CREATE CONNECTION
- DESCRIBE CONNECTION
- DROP CONNECTION
- SHOW CONNECTIONS

#### Code Examples

```sql
ALTER CONNECTION [IF EXISTS] [catalog_name.][db_name.]connection_name
SET (key1=val1[, key2=val2]...)
```

```sql
CREATE CONNECTION my_conn WITH ('ap-key' = 'x')
```

```sql
-- Update the API key for a connection.
ALTER CONNECTION `conn-one` SET ('api-key' = '<new-api-key>');

-- Update the credentials for a connection.
ALTER CONNECTION `my-couchbase-conn` SET (
  'username' = '<user-name>',
  'password' = '<new-password>'
);
```
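
The syntax also allows an `IF EXISTS` guard and a qualified connection name. The following is a sketch under assumptions: `my-env` and `my-cluster` are hypothetical catalog and database names, and the qualified form of `DESCRIBE CONNECTION` is assumed to mirror the `ALTER CONNECTION` syntax.

```sql
-- Hypothetical catalog, database, and connection names.
-- IF EXISTS and the qualifiers come from the ALTER CONNECTION syntax.
ALTER CONNECTION IF EXISTS `my-env`.`my-cluster`.`conn-one`
  SET ('api-key' = '<new-api-key>');

-- Inspect the connection afterwards. Secret values are not displayed.
-- Assumption: DESCRIBE CONNECTION accepts the same qualified name.
DESCRIBE CONNECTION `my-env`.`my-cluster`.`conn-one`;
```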

---

### SQL ALTER MODEL Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/alter-model.html

Confluent Cloud for Apache Flink® enables real-time inference and prediction with AI models. Use the CREATE MODEL statement to register an AI model, and use the ALTER MODEL statement to rename it or change its options.

#### Syntax

- Rename a model: `ALTER MODEL [IF EXISTS] [catalog_name.][database_name.]model_name RENAME TO [catalog_name.][database_name.]new_model_name`
- Alter model options: `ALTER MODEL [IF EXISTS] [catalog_name.][database_name.]model_name[$version_id] SET (key1=val1[, key2=val2]...)`
- Reset model options: `ALTER MODEL [IF EXISTS] [catalog_name.][database_name.]model_name[$version_id] RESET (key1[, key2]...)`

#### Description

Rename an AI model or change model options.

Use the `<model_name>$<model_version>` syntax to change a specific version of a model. For more information, see Model versioning.

ALTER MODEL options apply only to model metadata, not model data.

If the IF EXISTS clause is provided and the model or the specified model version doesn’t exist, nothing happens.

For RESET, the specified model option keys are reset to the default value.

#### Examples

- Rename a model: `` ALTER MODEL `my_model` RENAME TO `my_new_model`; ``
- Check for model existence and rename if it exists: `` ALTER MODEL IF EXISTS `my_model` RENAME TO `my_new_model`; ``
- Change options for version 2: `` ALTER MODEL `my_model$2` SET (tag = 'prod', description = 'new_description'); ``
- Reset the tag option: `` ALTER MODEL `my_model` RESET (tag); ``

#### Related content

- CREATE MODEL
- DROP MODEL
- Run an AI Model

#### Code Examples

```sql
-- Rename a model.
ALTER MODEL [IF EXISTS] [catalog_name.][database_name.]model_name
RENAME TO [catalog_name.][database_name.]new_model_name

-- Alter model options.
ALTER MODEL [IF EXISTS] [catalog_name.][database_name.]model_name[$version_id]
SET (key1=val1[, key2=val2]...)

-- Reset model options.
ALTER MODEL [IF EXISTS] [catalog_name.][database_name.]model_name[$version_id]
RESET (key1[, key2]...)
```

```sql
<model_name>$<model_version>
```

```sql
-- Rename a model.
ALTER MODEL `my_model` RENAME TO `my_new_model`;

-- Check for model existence and rename if it exists.
ALTER MODEL IF EXISTS `my_model` RENAME TO `my_new_model`;

-- Change options for version 2.
-- String constants use single quotes.
ALTER MODEL `my_model$2` SET (
  tag = 'prod',
  description = 'new_description'
);

-- Reset the tag option.
ALTER MODEL `my_model` RESET (tag);
```
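
The version suffix and the `IF EXISTS` clause can be combined with `RESET`. This is a small sketch that reuses the hypothetical `my_model` name and the version-quoting style of the examples above. If version 2 doesn’t exist, the statement is expected to do nothing.

```sql
-- Reset the tag option on version 2 only, and do nothing if that version is missing.
ALTER MODEL IF EXISTS `my_model$2` RESET (tag);
```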

---

### SQL ALTER TABLE Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/alter-table.html

ALTER TABLE Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables changing properties of an existing table. Syntax¶ ALTER TABLE [catalog_name.][db_name.]table_name { ADD (metadata_column_name metadata_column_type METADATA [FROM metadata_key] VIRTUAL [COMMENT column_comment]) | ADD (computed_column_name AS computed_column_expression [COMMENT column_comment]) | MODIFY WATERMARK FOR rowtime_column_name AS watermark_strategy_expression | DROP WATERMARK | SET (key1='value1' [, key2='value2', ...]) | RESET (key1 [, key2, ...]) } Description¶ ALTER TABLE allows you to add metadata columns, computed columns, change or remove the watermark, and modify table properties. Physical columns cannot be added, modified, or dropped within Confluent Cloud for Apache Flink directly, but schemas can be evolved in Schema Registry. Examples¶ The following examples show frequently encountered scenarios with ALTER TABLE. Define a watermark for perfectly ordered data¶ Flink guarantees that rows are always emitted before the watermark is generated. The following statements ensure that for perfectly ordered events, meaning events without time-skew, a watermark can be equal to the timestamp or 1 ms less than the timestamp. CREATE TABLE t_perfect_watermark (i INT); -- If multiple events can have the same timestamp. ALTER TABLE t_perfect_watermark MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '0.001' SECOND; -- If a single event can have the timestamp. ALTER TABLE t_perfect_watermark MODIFY WATERMARK FOR $rowtime AS $rowtime; Drop your custom watermark strategy¶ Remove the custom watermark strategy to restore the default watermark strategy. View the current table schema and metadata. 
DESCRIBE `orders`; Your output should resemble: +-------------+------------------------+----------+-------------------+ | Column Name | Data Type | Nullable | Extras | +-------------+------------------------+----------+-------------------+ | user | BIGINT | NOT NULL | PRIMARY KEY | | product | STRING | NULL | | | amount | INT | NULL | | | ts | TIMESTAMP(3) *ROWTIME* | NULL | WATERMARK AS `ts` | +-------------+------------------------+----------+-------------------+ Remove the watermark strategy of the table. ALTER TABLE `orders` DROP WATERMARK; Your output should resemble: Statement phase is COMPLETED. Check the new table schema and metadata. DESCRIBE `orders`; Your output should resemble: +-------------+--------------+----------+-------------+ | Column Name | Data Type | Nullable | Extras | +-------------+--------------+----------+-------------+ | user | BIGINT | NOT NULL | PRIMARY KEY | | product | STRING | NULL | | | amount | INT | NULL | | | ts | TIMESTAMP(3) | NULL | | +-------------+--------------+----------+-------------+ Configure Debezium format for CDC data¶ Change regular format to Debezium format¶ Note For schemas created after May 19, 2025 at 09:00 UTC, Flink automatically detects Debezium envelopes and configures the appropriate format and changelog mode. Manual conversion is necessary only for older schemas or when you want to override the default behavior. 
For tables that have been inferred with regular formats but contain Debezium CDC (Change Data Capture) data: AvroJSON SchemaProtobuf-- Convert from regular Avro format to Debezium CDC format -- and configure the appropriate Flink changelog interpretation mode: -- * append: Treats each record as an INSERT operation with no relationship between records -- * retract: Handles paired operations (INSERT/UPDATE/DELETE) where changes to the same row -- are represented as a retraction of the old value followed by an addition of the new value -- * upsert: Groups all operations for the primary key (derived from the Kafka message key), -- with each operation effectively merging with or replacing previous state -- (INSERT creates, UPDATE modifies, DELETE removes) ALTER TABLE customer_data SET ( 'value.format' = 'avro-debezium-registry', 'changelog.mode' = 'retract' ); -- Convert from regular JSON format to Debezium CDC format -- and configure the appropriate Flink changelog interpretation mode: -- * append: Treats each record as an INSERT operation with no relationship between records -- * retract: Handles paired operations (INSERT/UPDATE/DELETE) where changes to the same row -- are represented as a retraction of the old value followed by an addition of the new value -- * upsert: Groups all operations for the primary key (derived from the Kafka message key), -- with each operation effectively merging with or replacing previous state -- (INSERT creates, UPDATE modifies, DELETE removes) ALTER TABLE customer_data_json SET ( 'value.format' = 'json-debezium-registry', 'changelog.mode' = 'retract' ); -- Convert from regular Protobuf format to Debezium CDC format -- and configure the appropriate Flink changelog interpretation mode: -- * append: Treats each record as an INSERT operation with no relationship between records -- * retract: Handles paired operations (INSERT/UPDATE/DELETE) where changes to the same row -- are represented as a retraction of the old value followed by an 
addition of the new value -- * upsert: Groups all operations for the primary key (derived from the Kafka message key), -- with each operation effectively merging with or replacing previous state -- (INSERT creates, UPDATE modifies, DELETE removes) ALTER TABLE customer_data_proto SET ( 'value.format' = 'proto-debezium-registry', 'changelog.mode' = 'retract' ); Modify Changelog Processing Mode¶ For tables with any type of data that need a different processing mode for handling changes: -- Change to append mode (default) -- Best for event streams where each record is independent ALTER TABLE customer_changes SET ( 'changelog.mode' = 'append' ); -- Change to retract mode -- Useful when changes to the same row are represented as paired operations ALTER TABLE customer_changes SET ( 'changelog.mode' = 'retract' ); -- Change upsert mode when working with primary keys -- Best when tracking state changes using a primary key (derived from Kafka message key) ALTER TABLE customer_changes SET ( 'changelog.mode' = 'upsert' ); Read and/or write Kafka headers¶ -- Create example topic CREATE TABLE t_headers (i INT); -- For read-only (virtual) ALTER TABLE t_headers ADD headers MAP<BYTES, BYTES> METADATA VIRTUAL; -- For read and write (persisted). Column becomes mandatory in INSERT INTO. ALTER TABLE t_headers MODIFY headers MAP<BYTES, BYTES> METADATA; -- Use implicit casting (origin is always MAP<BYTES, BYTES>) ALTER TABLE t_headers MODIFY headers MAP<STRING, STRING> METADATA; -- Insert and read INSERT INTO t_headers SELECT 42, MAP['k1', 'v1', 'k2', 'v2']; SELECT * FROM t_headers; Properties The metadata key is headers. If you don’t want to name the column this way, use: other_name MAP<BYTES, BYTES> METADATA FROM 'headers' VIRTUAL. Keys of headers must be unique. Multi-key headers are not supported. Add headers as a metadata column¶ You can get the headers of a Kafka record as a map of raw bytes by adding a headers virtual metadata column. 
Run the following statement to add the Kafka record headers as a metadata column: ALTER TABLE `orders` ADD ( `headers` MAP<BYTES,BYTES> METADATA VIRTUAL); View the new schema. DESCRIBE `orders`; Your output should resemble: +-------------+-------------------+----------+-------------------------+ | Column Name | Data Type | Nullable | Extras | +-------------+-------------------+----------+-------------------------+ | user | BIGINT | NOT NULL | PRIMARY KEY, BUCKET KEY | | product | STRING | NULL | | | amount | INT | NULL | | | ts | TIMESTAMP(3) | NULL | | | headers | MAP<BYTES, BYTES> | NULL | METADATA VIRTUAL | +-------------+-------------------+----------+-------------------------+ Read topic from specific offsets¶ -- Create example topic with 1 partition filled with values CREATE TABLE t_specific_offsets (i INT) DISTRIBUTED INTO 1 BUCKETS; INSERT INTO t_specific_offsets VALUES (1), (2), (3), (4), (5); -- Returns 1, 2, 3, 4, 5 SELECT * FROM t_specific_offsets; -- Changes the scan range ALTER TABLE t_specific_offsets SET ( 'scan.startup.mode' = 'specific-offsets', 'scan.startup.specific-offsets' = 'partition:0,offset:3' ); -- Returns 4, 5 SELECT * FROM t_specific_offsets; Properties scan.startup.mode and scan.bounded.mode control which range in the changelog (Kafka topic) to read. scan.startup.specific-offsets and scan.bounded.specific-offsets define offsets per partition. In the example, only 1 partition is used. For multiple partitions, use the following syntax: 'scan.startup.specific-offsets' = 'partition:0,offset:3; partition:1,offset:42; partition:2,offset:0' Debug “no output” and no watermark cases¶ The root cause for most “no output” cases is that a time-based operation, for example, TUMBLE, MATCH_RECOGNIZE, and FOR SYSTEM_TIME AS OF, did not receive recent enough watermarks. The current time of an operator is calculated by the minimum watermark of all inputs, meaning across all tables/topics and their partitions. 
If one partition does not emit a watermark, it can affect the entire pipeline. The following statements may be helpful for debugging issues related to watermarks. -- example table CREATE TABLE t_watermark_debugging (k INT, s STRING) DISTRIBUTED BY (k) INTO 4 BUCKETS; -- Each value lands in a separate Kafka partition (out of 4). -- Leave out values to see missing watermarks. INSERT INTO t_watermark_debugging VALUES (1, 'Bob'), (2, 'Alice'), (8, 'John'), (15, 'David'); -- If ROW_NUMBER doesn't show results, it's clearly a watermark issue. SELECT ROW_NUMBER() OVER (ORDER BY $rowtime ASC) AS `number`, * FROM t_watermark_debugging; -- Add partition information as metadata column ALTER TABLE t_watermark_debugging ADD part INT METADATA FROM 'partition' VIRTUAL; -- Use the CURRENT_WATERMARK() function to check which watermark is calculated SELECT *, part AS `Row Partition`, $rowtime AS `Row Timestamp`, CURRENT_WATERMARK($rowtime) AS `Operator Watermark` FROM t_watermark_debugging; -- Visualize the highest timestamp per Kafka partition -- Due to the table declaration (with 4 buckets), this query should show 4 rows. -- If not, the missing partitions might be the cause for watermark issues. SELECT part AS `Partition`, MAX($rowtime) AS `Max Timestamp in Partition` FROM t_watermark_debugging GROUP BY part; -- A workaround could be to not use the system watermark: ALTER TABLE t_watermark_debugging MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '2' SECOND; -- Or for perfect input data: ALTER TABLE t_watermark_debugging MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '0.001' SECOND; -- Add "fresh" data while the above statements with -- ROW_NUMBER() or CURRENT_WATERMARK() are running. INSERT INTO t_watermark_debugging VALUES (1, 'Fresh Bob'), (2, 'Fresh Alice'), (8, 'Fresh John'), (15, 'Fresh David'); The debugging examples above won’t solve everything but may help in finding the root cause. 
The system watermark strategy is smart and excludes idle Kafka partitions from the watermark calculation after some time, but at least one partition must produce new data for the “logical clock” with watermarks. Typically, root causes are: Idle Kafka partitions No data in Kafka partitions Not enough data in Kafka partitions Watermark strategy is too conservative No fresh data after warm up with historical data for progressing the logical clock Handle idle partitions for missing watermarks¶ Idle partitions often cause missing watermarks. Also, no data in a partition or infrequent data can be a root cause. -- Create a topic with 4 partitions. CREATE TABLE t_watermark_idle (k INT, s STRING) DISTRIBUTED BY (k) INTO 4 BUCKETS; -- Avoid the "not enough data" problem by using a custom watermark. -- The watermark strategy is still coarse-grained enough for this example. ALTER TABLE t_watermark_idle MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '2' SECONDS; -- Each value lands in a separate Kafka partition, and partition 1 is empty. INSERT INTO t_watermark_idle VALUES (1, 'Bob in partition 0'), (2, 'Alice in partition 3'), (8, 'John in partition 2'); -- Thread 1: Start a streaming job. SELECT ROW_NUMBER() OVER (ORDER BY $rowtime ASC) AS `number`, * FROM t_watermark_idle; -- Thread 2: Insert some data immediately -> Thread 1 still without results. INSERT INTO t_watermark_idle VALUES (1, 'Another Bob in partition 0 shortly after'); -- Thread 2: Insert some data after 15s -> Thread 1 should show results. INSERT INTO t_watermark_idle VALUES (1, 'Another Bob in partition 0 after 15s') Within the first 15 seconds, all partitions contribute to the watermark calculation, so the first INSERT INTO has no effect because partition 1 is still empty. After 15 seconds, all partitions are marked as idle. No partition contributes to the watermark calculation. But when the second INSERT INTO is executed, it becomes the main driving partition for the logical clock. 
The global watermark jumps to “second INSERT INTO - 2 seconds”. In the following code, the sql.tables.scan.idle-timeout configuration overrides the default idle-detection algorithm, so even an immediate INSERT INTO can be the main driving partition for the logical clock, because all other partitions are marked as idle after 1 second. -- Thread 1: Start a streaming job. -- Lower the idle timeout further. SET 'sql.tables.scan.idle-timeout' = '1s'; SELECT ROW_NUMBER() OVER (ORDER BY $rowtime ASC) AS `number`, * FROM t_watermark_idle; -- Thread 2: Insert some data immediately -> Thread 1 should show results. INSERT INTO t_watermark_idle VALUES (1, 'Another Bob in partition 0 shortly after'); Change the schema context property¶ You can set the schema context for key and value formats to control the namespace for your schema resolution in Schema Registry. Set the schema context for the value format ALTER TABLE `orders` SET ('value.format.schema-context' = '.lsrc-newcontext'); Your output should resemble: Statement phase is COMPLETED. Check the new table properties. 
SHOW CREATE TABLE `orders`; Your output should resemble: +----------------------------------------------------------------------+ | SHOW CREATE TABLE | +----------------------------------------------------------------------+ | CREATE TABLE `catalog`.`database`.`orders` ( | | `user` BIGINT NOT NULL, | | `product` VARCHAR(2147483647), | | `amount` INT, | | `ts` TIMESTAMP(3) | | ) | | DISTRIBUTED BY HASH(`user`) INTO 6 BUCKETS | | WITH ( | | 'changelog.mode' = 'upsert', | | 'connector' = 'confluent', | | 'kafka.cleanup-policy' = 'delete', | | 'kafka.max-message-size' = '2097164 bytes', | | 'kafka.retention.size' = '0 bytes', | | 'kafka.retention.time' = '604800000 ms', | | 'key.format' = 'avro-registry', | | 'scan.bounded.mode' = 'unbounded', | | 'scan.startup.mode' = 'latest-offset', | | 'value.format' = 'avro-registry', | | 'value.format.schema-context' = '.lsrc-newcontext' | | ) | | | +----------------------------------------------------------------------+ Inferred tables schema evolution¶ You can use the ALTER TABLE statement to evolve schemas for inferred tables. The following examples show output from the SHOW CREATE TABLE statement called on the resulting table. Schema Registry columns overlap with computed/metadata columns¶ For the following value schema in Schema Registry: { "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" } ] } Evolve a table by adding metadata: ALTER TABLE t_metadata_overlap ADD `timestamp` TIMESTAMP_LTZ(3) NOT NULL METADATA; SHOW CREATE TABLE returns the following output: CREATE TABLE t_metadata_overlap` ( `key` VARBINARY(2147483647), `uid` INT NOT NULL, `timestamp` TIMESTAMP(3) WITH LOCAL TIME ZONE NOT NULL METADATA ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( ... ) Properties Schema Registry says there is a timestamp physical column, but Flink says there is timestamp metadata column. 
In this case, metadata columns and computed columns have precedence, and Confluent Cloud for Apache Flink removes the physical column from the schema. Because Confluent Cloud for Apache Flink advertises FULL_TRANSITIVE mode, queries still work, and the physical column is set to NULL in the payload: INSERT INTO t_metadata_overlap SELECT CAST(NULL AS BYTES), 42, TO_TIMESTAMP_LTZ(0, 3); Evolve the table by renaming metadata: ALTER TABLE t_metadata_overlap DROP `timestamp`; ALTER TABLE t_metadata_overlap ADD message_timestamp TIMESTAMP_LTZ(3) METADATA FROM 'timestamp'; SELECT * FROM t_metadata_overlap; SHOW CREATE TABLE returns the following output: CREATE TABLE `t_metadata_overlap` ( `key` VARBINARY(2147483647), `uid` INT NOT NULL, `timestamp` VARCHAR(2147483647), `message_timestamp` TIMESTAMP(3) WITH LOCAL TIME ZONE METADATA FROM 'timestamp' ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( ... ) Properties Now, both physical and metadata columns appear and can be accessed for reading and writing. Enrich a column that has no Schema Registry information¶ For the following value schema in Schema Registry: { "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" } ] } SHOW CREATE TABLE returns the following output: CREATE TABLE `t_enrich_raw_key` ( `key` VARBINARY(2147483647), `uid` INT NOT NULL ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.format' = 'raw', 'value.format' = 'avro-registry' ... ) Properties Schema Registry provides only information for the value part. Because the key part is not backed by Schema Registry, the key.format is raw. The default data type of raw is BYTES, but you can change this by using the ALTER TABLE statement. 
Evolve the table by giving a raw format column a specific type: ALTER TABLE t_enrich_raw_key MODIFY key STRING; SHOW CREATE TABLE returns the following output: CREATE TABLE `t_enrich_raw_key` ( `key` STRING, `uid` INT NOT NULL ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.format' = 'raw', 'value.format' = 'avro-registry' ... ) Properties Only changes to simple, atomic types, like INT, BYTES, and STRING are supported, where the binary representation is clear. For more complex modifications, use Schema Registry. In multi-cluster scenarios, the ALTER TABLE statement must be executed for every cluster, because the data type for key is stored in the Flink regional metastore. Configure Schema Registry subject names¶ When working with topics that use RecordNameStrategy or TopicRecordNameStrategy, you can configure the subject names for the schema resolution in Schema Registry. This is particularly useful when handling multiple event types in a single topic. 
For topics using these strategies, Flink initially infers a raw binary table: SHOW CREATE TABLE events; Your output will show a raw binary structure: CREATE TABLE `events` ( `key` VARBINARY(2147483647), `value` VARBINARY(2147483647) ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.format' = 'raw', 'value.format' = 'raw' ) Configure value schema subject names for each format: AvroJSON SchemaProtobufALTER TABLE events SET ( 'value.format' = 'avro-registry', 'value.avro-registry.subject-names' = 'com.example.Order;com.example.Shipment' ); ALTER TABLE events SET ( 'value.format' = 'json-registry', 'value.json-registry.subject-names' = 'com.example.Order;com.example.Shipment' ); ALTER TABLE events SET ( 'value.format' = 'proto-registry', 'value.proto-registry.subject-names' = 'com.example.Order;com.example.Shipment' ); If your topic uses keyed messages, you can also configure the key format: ALTER TABLE events SET ( 'key.format' = 'avro-registry', 'key.avro-registry.subject-names' = 'com.example.OrderKey' ); You can configure both key and value schema subject names in a single statement: ALTER TABLE events SET ( 'key.format' = 'avro-registry', 'key.avro-registry.subject-names' = 'com.example.OrderKey', 'value.format' = 'avro-registry', 'value.avro-registry.subject-names' = 'com.example.Order;com.example.Shipment' ); Properties: Use semicolons (;) to separate multiple subject names Subject names must match exactly with the names registered in Schema Registry The format prefix (avro-registry, json-registry, or proto-registry) must match the schema format in Schema Registry Reset a key value¶ You can use the RESET option to set any key to its default value. The following example shows how to reset a table that has a JSON Schema back to raw format. 
ALTER TABLE json_table RESET ( 'value.json-registry.wire-encoding', 'value.json-registry.subject-names' ); Custom error handling¶ You can use ALTER TABLE with the error-handling.mode and error-handling.log.target table properties to set custom error handling for deserialization errors. The following code example shows how to log errors to the specified Dead Letter Queue (DLQ) table and enable processing to continue. ALTER TABLE my_table SET ( 'error-handling.mode' = 'log', 'error-handling.log.target' = 'my_error_table' ); Related content¶ Video: How to Set Idle Timeouts
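
The RESET option and the error-handling properties described above can be combined to undo a custom error-handling setup. This is a minimal sketch that reuses the `my_table` name from the example above. It assumes that resetting both keys restores the default error-handling behavior, as RESET does for any other key.

```sql
-- Reset both error-handling properties to their default values.
ALTER TABLE my_table RESET (
  'error-handling.mode',
  'error-handling.log.target'
);

-- Check the resulting table properties.
SHOW CREATE TABLE my_table;
```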

#### Code Examples

```sql
ALTER TABLE [catalog_name.][db_name.]table_name {
   ADD (metadata_column_name metadata_column_type METADATA [FROM metadata_key] VIRTUAL [COMMENT column_comment])
 | ADD (computed_column_name AS computed_column_expression [COMMENT column_comment])
 | MODIFY WATERMARK FOR rowtime_column_name AS watermark_strategy_expression
 | DROP WATERMARK
 | SET (key1='value1' [, key2='value2', ...])
 | RESET (key1 [, key2, ...])
}
```

```sql
CREATE TABLE t_perfect_watermark (i INT);

-- If multiple events can have the same timestamp.
ALTER TABLE t_perfect_watermark
  MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '0.001' SECOND;

-- If only a single event can have a given timestamp.
ALTER TABLE t_perfect_watermark
  MODIFY WATERMARK FOR $rowtime AS $rowtime;
```

```sql
DESCRIBE `orders`;
```

```sql
+-------------+------------------------+----------+-------------------+
| Column Name |       Data Type        | Nullable |      Extras       |
+-------------+------------------------+----------+-------------------+
| user        | BIGINT                 | NOT NULL | PRIMARY KEY       |
| product     | STRING                 | NULL     |                   |
| amount      | INT                    | NULL     |                   |
| ts          | TIMESTAMP(3) *ROWTIME* | NULL     | WATERMARK AS `ts` |
+-------------+------------------------+----------+-------------------+
```

```sql
ALTER TABLE `orders` DROP WATERMARK;
```

```sql
Statement phase is COMPLETED.
```

```sql
DESCRIBE `orders`;
```

```sql
+-------------+--------------+----------+-------------+
| Column Name |  Data Type   | Nullable |   Extras    |
+-------------+--------------+----------+-------------+
| user        | BIGINT       | NOT NULL | PRIMARY KEY |
| product     | STRING       | NULL     |             |
| amount      | INT          | NULL     |             |
| ts          | TIMESTAMP(3) | NULL     |             |
+-------------+--------------+----------+-------------+
```
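
After the watermark is dropped, a custom strategy can be declared again with the `MODIFY WATERMARK` clause from the syntax at the top of this section. The following is a minimal sketch, not a recommended configuration: it assumes that `ts` is still the intended event-time column of `orders`, and the 5-second delay is an arbitrary value chosen for illustration.

```sql
-- Assumption: `ts` is the event-time column shown in the DESCRIBE output above.
-- The 5-second out-of-orderness bound is illustrative only.
ALTER TABLE `orders`
  MODIFY WATERMARK FOR `ts` AS `ts` - INTERVAL '5' SECOND;

-- Inspect the table to check whether the watermark is listed again.
DESCRIBE `orders`;
```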

```sql
-- Convert from regular Avro format to Debezium CDC format
-- and configure the appropriate Flink changelog interpretation mode:
-- * append:  Treats each record as an INSERT operation with no relationship between records
-- * retract: Handles paired operations (INSERT/UPDATE/DELETE) where changes to the same row
--            are represented as a retraction of the old value followed by an addition of the new value
-- * upsert: Groups all operations for the primary key (derived from the Kafka message key),
--           with each operation effectively merging with or replacing previous state
--           (INSERT creates, UPDATE modifies, DELETE removes)
ALTER TABLE customer_data SET (
  'value.format' = 'avro-debezium-registry',
  'changelog.mode' = 'retract'
);
```

```sql
-- Convert from regular JSON format to Debezium CDC format
-- and configure the appropriate Flink changelog interpretation mode:
-- * append:  Treats each record as an INSERT operation with no relationship between records
-- * retract: Handles paired operations (INSERT/UPDATE/DELETE) where changes to the same row
--            are represented as a retraction of the old value followed by an addition of the new value
-- * upsert: Groups all operations for the primary key (derived from the Kafka message key),
--           with each operation effectively merging with or replacing previous state
--           (INSERT creates, UPDATE modifies, DELETE removes)
ALTER TABLE customer_data_json SET (
  'value.format' = 'json-debezium-registry',
  'changelog.mode' = 'retract'
);
```

```sql
-- Convert from regular Protobuf format to Debezium CDC format
-- and configure the appropriate Flink changelog interpretation mode:
-- * append:  Treats each record as an INSERT operation with no relationship between records
-- * retract: Handles paired operations (INSERT/UPDATE/DELETE) where changes to the same row
--            are represented as a retraction of the old value followed by an addition of the new value
-- * upsert: Groups all operations for the primary key (derived from the Kafka message key),
--           with each operation effectively merging with or replacing previous state
--           (INSERT creates, UPDATE modifies, DELETE removes)
ALTER TABLE customer_data_proto SET (
  'value.format' = 'proto-debezium-registry',
  'changelog.mode' = 'retract'
);
```

```sql
-- Change to append mode (default)
-- Best for event streams where each record is independent
ALTER TABLE customer_changes SET (
  'changelog.mode' = 'append'
);

-- Change to retract mode
-- Useful when changes to the same row are represented as paired operations
ALTER TABLE customer_changes SET (
  'changelog.mode' = 'retract'
);

-- Change to upsert mode when working with primary keys
-- Best when tracking state changes using a primary key (derived from Kafka message key)
ALTER TABLE customer_changes SET (
  'changelog.mode' = 'upsert'
);
```

```sql
-- Create example topic
CREATE TABLE t_headers (i INT);

-- For read-only (virtual)
ALTER TABLE t_headers ADD headers MAP<BYTES, BYTES> METADATA VIRTUAL;

-- For read and write (persisted). Column becomes mandatory in INSERT INTO.
ALTER TABLE t_headers MODIFY headers MAP<BYTES, BYTES> METADATA;

-- Use implicit casting (origin is always MAP<BYTES, BYTES>)
ALTER TABLE t_headers MODIFY headers MAP<STRING, STRING> METADATA;

-- Insert and read
INSERT INTO t_headers SELECT 42, MAP['k1', 'v1', 'k2', 'v2'];
SELECT * FROM t_headers;
```

```sql
other_name MAP<BYTES, BYTES> METADATA FROM 'headers' VIRTUAL
```
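
The column fragment above can be placed in a complete `ALTER TABLE ... ADD` statement. The following sketch assumes a hypothetical table named `t_headers_renamed`; the column name `other_name` and the `FROM 'headers'` clause are taken from the fragment.

```sql
-- Hypothetical example table (the name is an assumption, not from the source).
CREATE TABLE t_headers_renamed (i INT);

-- Expose the Kafka 'headers' metadata key under a different column name.
ALTER TABLE t_headers_renamed
  ADD other_name MAP<BYTES, BYTES> METADATA FROM 'headers' VIRTUAL;

-- The column is virtual, so it can be read but isn't part of INSERT INTO.
SELECT i, other_name FROM t_headers_renamed;
```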

```sql
ALTER TABLE `orders` ADD (
  `headers` MAP<BYTES,BYTES> METADATA VIRTUAL);
```

```sql
DESCRIBE `orders`;
```

```sql
+-------------+-------------------+----------+-------------------------+
| Column Name |     Data Type     | Nullable |         Extras          |
+-------------+-------------------+----------+-------------------------+
| user        | BIGINT            | NOT NULL | PRIMARY KEY, BUCKET KEY |
| product     | STRING            | NULL     |                         |
| amount      | INT               | NULL     |                         |
| ts          | TIMESTAMP(3)      | NULL     |                         |
| headers     | MAP<BYTES, BYTES> | NULL     | METADATA VIRTUAL        |
+-------------+-------------------+----------+-------------------------+
```

```sql
-- Create example topic with 1 partition filled with values
CREATE TABLE t_specific_offsets (i INT) DISTRIBUTED INTO 1 BUCKETS;
INSERT INTO t_specific_offsets VALUES (1), (2), (3), (4), (5);

-- Returns 1, 2, 3, 4, 5
SELECT * FROM t_specific_offsets;

-- Changes the scan range
ALTER TABLE t_specific_offsets SET (
  'scan.startup.mode' = 'specific-offsets',
  'scan.startup.specific-offsets' = 'partition:0,offset:3'
);

-- Returns 4, 5
SELECT * FROM t_specific_offsets;
```

```sql
'scan.startup.specific-offsets' = 'partition:0,offset:3; partition:1,offset:42; partition:2,offset:0'
```
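
The multi-partition option value above fits into the same `ALTER TABLE ... SET` pattern as the single-partition example. This is a sketch under stated assumptions: `t_three_partitions` is a hypothetical table with three buckets, and the offsets are the illustrative values from the fragment, which must exist in the underlying topic for the scan to behave as intended.

```sql
-- Hypothetical table backed by a topic with 3 partitions.
CREATE TABLE t_three_partitions (i INT) DISTRIBUTED INTO 3 BUCKETS;

-- Start reading from a different offset in each partition.
-- Partition/offset pairs are separated by semicolons.
ALTER TABLE t_three_partitions SET (
  'scan.startup.mode' = 'specific-offsets',
  'scan.startup.specific-offsets' = 'partition:0,offset:3; partition:1,offset:42; partition:2,offset:0'
);
```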

```sql
-- example table
CREATE TABLE t_watermark_debugging (k INT, s STRING)
  DISTRIBUTED BY (k) INTO 4 BUCKETS;

-- Each value lands in a separate Kafka partition (out of 4).
-- Leave out values to see missing watermarks.
INSERT INTO t_watermark_debugging
  VALUES (1, 'Bob'), (2, 'Alice'), (8, 'John'), (15, 'David');

-- If ROW_NUMBER doesn't show results, it's clearly a watermark issue.
SELECT ROW_NUMBER() OVER (ORDER BY $rowtime ASC) AS `number`, *
  FROM t_watermark_debugging;

-- Add partition information as metadata column
ALTER TABLE t_watermark_debugging ADD part INT METADATA FROM 'partition' VIRTUAL;

-- Use the CURRENT_WATERMARK() function to check which watermark is calculated
SELECT
  *,
  part AS `Row Partition`,
  $rowtime AS `Row Timestamp`,
  CURRENT_WATERMARK($rowtime) AS `Operator Watermark`
FROM t_watermark_debugging;

-- Visualize the highest timestamp per Kafka partition
-- Due to the table declaration (with 4 buckets), this query should show 4 rows.
-- If not, the missing partitions might be the cause for watermark issues.
SELECT part AS `Partition`, MAX($rowtime) AS `Max Timestamp in Partition`
  FROM t_watermark_debugging
  GROUP BY part;

-- A workaround could be to not use the system watermark:
ALTER TABLE t_watermark_debugging
  MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '2' SECOND;
-- Or for perfect input data:
ALTER TABLE t_watermark_debugging
  MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '0.001' SECOND;

-- Add "fresh" data while the above statements with
-- ROW_NUMBER() or CURRENT_WATERMARK() are running.
INSERT INTO t_watermark_debugging VALUES
  (1, 'Fresh Bob'),
  (2, 'Fresh Alice'),
  (8, 'Fresh John'),
  (15, 'Fresh David');
```

```sql
-- Create a topic with 4 partitions.
CREATE TABLE t_watermark_idle (k INT, s STRING)
  DISTRIBUTED BY (k) INTO 4 BUCKETS;

-- Avoid the "not enough data" problem by using a custom watermark.
-- The watermark strategy is still coarse-grained enough for this example.
ALTER TABLE t_watermark_idle
  MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '2' SECONDS;

-- Each value lands in a separate Kafka partition, and partition 1 is empty.
INSERT INTO t_watermark_idle
  VALUES
    (1, 'Bob in partition 0'),
    (2, 'Alice in partition 3'),
    (8, 'John in partition 2');

-- Thread 1: Start a streaming job.
SELECT ROW_NUMBER() OVER (ORDER BY $rowtime ASC) AS `number`, *
  FROM t_watermark_idle;

-- Thread 2: Insert some data immediately -> Thread 1 still without results.
INSERT INTO t_watermark_idle
  VALUES (1, 'Another Bob in partition 0 shortly after');

-- Thread 2: Insert some data after 15s -> Thread 1 should show results.
INSERT INTO t_watermark_idle
  VALUES (1, 'Another Bob in partition 0 after 15s');
```

```sql
-- Thread 1: Start a streaming job.
-- Lower the idle timeout further.
SET 'sql.tables.scan.idle-timeout' = '1s';
SELECT ROW_NUMBER() OVER (ORDER BY $rowtime ASC) AS `number`, *
  FROM t_watermark_idle;

-- Thread 2: Insert some data immediately -> Thread 1 should show results.
INSERT INTO t_watermark_idle
  VALUES (1, 'Another Bob in partition 0 shortly after');
```
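
A very low idle timeout is useful for this experiment but may not suit other statements in the same session. The following sketch assumes that the `RESET` statement is available in your Flink SQL shell or workspace session; it returns the option to its default value.

```sql
-- Assumption: the session supports the RESET statement for session options.
-- Restore the idle timeout to its default after the experiment.
RESET 'sql.tables.scan.idle-timeout';
```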

```sql
ALTER TABLE `orders` SET ('value.format.schema-context' = '.lsrc-newcontext');
```

```sql
Statement phase is COMPLETED.
```

```sql
SHOW CREATE TABLE `orders`;
```

```sql
+----------------------------------------------------------------------+
|                          SHOW CREATE TABLE                           |
+----------------------------------------------------------------------+
| CREATE TABLE `catalog`.`database`.`orders` (                         |
|   `user` BIGINT NOT NULL,                                            |
|   `product` VARCHAR(2147483647),                                     |
|   `amount` INT,                                                      |
|   `ts` TIMESTAMP(3)                                                  |
| )                                                                    |
|   DISTRIBUTED BY HASH(`user`) INTO 6 BUCKETS                         |
| WITH (                                                               |
|   'changelog.mode' = 'upsert',                                       |
|   'connector' = 'confluent',                                         |
|   'kafka.cleanup-policy' = 'delete',                                 |
|   'kafka.max-message-size' = '2097164 bytes',                        |
|   'kafka.retention.size' = '0 bytes',                                |
|   'kafka.retention.time' = '604800000 ms',                           |
|   'key.format' = 'avro-registry',                                    |
|   'scan.bounded.mode' = 'unbounded',                                 |
|   'scan.startup.mode' = 'latest-offset',                             |
|   'value.format' = 'avro-registry',                                  |
|   'value.format.schema-context' = '.lsrc-newcontext'                 |
| )                                                                    |
|                                                                      |
+----------------------------------------------------------------------+
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "uid",
      "type": "int"
    }
  ]
}
```

```sql
ALTER TABLE t_metadata_overlap ADD `timestamp` TIMESTAMP_LTZ(3) NOT NULL METADATA;
```

```sql
CREATE TABLE `t_metadata_overlap` (
  `key` VARBINARY(2147483647),
  `uid` INT NOT NULL,
  `timestamp` TIMESTAMP(3) WITH LOCAL TIME ZONE NOT NULL METADATA
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  ...
)
```

```sql
INSERT INTO t_metadata_overlap
  SELECT CAST(NULL AS BYTES), 42, TO_TIMESTAMP_LTZ(0, 3);
```

```sql
ALTER TABLE t_metadata_overlap DROP `timestamp`;

ALTER TABLE t_metadata_overlap
  ADD message_timestamp TIMESTAMP_LTZ(3) METADATA FROM 'timestamp';

SELECT * FROM t_metadata_overlap;
```

```sql
CREATE TABLE `t_metadata_overlap` (
  `key` VARBINARY(2147483647),
  `uid` INT NOT NULL,
  `timestamp` VARCHAR(2147483647),
  `message_timestamp` TIMESTAMP(3) WITH LOCAL TIME ZONE METADATA FROM 'timestamp'
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  ...
)
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "uid",
      "type": "int"
    }
  ]
}
```

```sql
CREATE TABLE `t_enrich_raw_key` (
  `key` VARBINARY(2147483647),
  `uid` INT NOT NULL
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
ALTER TABLE t_enrich_raw_key MODIFY `key` STRING;
```

```sql
CREATE TABLE `t_enrich_raw_key` (
  `key` STRING,
  `uid` INT NOT NULL
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
SHOW CREATE TABLE events;
```

```sql
CREATE TABLE `events` (
  `key` VARBINARY(2147483647),
  `value` VARBINARY(2147483647)
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'raw'
)
```

```sql
ALTER TABLE events SET (
  'value.format' = 'avro-registry',
  'value.avro-registry.subject-names' = 'com.example.Order;com.example.Shipment'
);
```

```sql
ALTER TABLE events SET (
  'value.format' = 'json-registry',
  'value.json-registry.subject-names' = 'com.example.Order;com.example.Shipment'
);
```

```sql
ALTER TABLE events SET (
  'value.format' = 'proto-registry',
  'value.proto-registry.subject-names' = 'com.example.Order;com.example.Shipment'
);
```

```sql
ALTER TABLE events SET (
  'key.format' = 'avro-registry',
  'key.avro-registry.subject-names' = 'com.example.OrderKey'
);
```

```sql
ALTER TABLE events SET (
  'key.format' = 'avro-registry',
  'key.avro-registry.subject-names' = 'com.example.OrderKey',
  'value.format' = 'avro-registry',
  'value.avro-registry.subject-names' = 'com.example.Order;com.example.Shipment'
);
```

```sql
ALTER TABLE json_table RESET (
  'value.json-registry.wire-encoding',
  'value.json-registry.subject-names'
);
```

```sql
ALTER TABLE my_table SET (
  'error-handling.mode' = 'log',
  'error-handling.log.target' = 'my_error_table'
);
```
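
Because `RESET` sets any key back to its default value, the custom error handling configured above can presumably be reverted the same way. A minimal sketch, assuming `my_table` was configured with the preceding `SET` statement:

```sql
-- Assumption: my_table currently has custom error handling configured.
-- Return both error-handling properties to their default values.
ALTER TABLE my_table RESET (
  'error-handling.mode',
  'error-handling.log.target'
);
```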

---

### SQL ALTER VIEW Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/alter-view.html

ALTER VIEW Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables modifying properties of an existing view. Syntax¶ ALTER VIEW [catalog_name.][db_name.]view_name RENAME TO new_view_name ALTER VIEW [catalog_name.][db_name.]view_name AS new_statement_expression Description¶ ALTER VIEW enables you to change the name of a view or modify the statement expression that defines the view. The first syntax enables renaming a view within the same catalog and database. The new view name must not already exist in the catalog and database. The second syntax enables changing the underlying statement that defines the view. The new statement expression must be a valid SELECT statement supported by Flink SQL. The schema of the new statement expression must be compatible with the schema of the existing view. Examples¶ The following examples show frequently encountered scenarios with ALTER VIEW. Rename a view¶ In the Confluent CLI or in a Cloud Console workspace, run the following commands to rename a view. Create a view. CREATE VIEW customer_orders AS SELECT customer_id, SUM(price) AS total_spent FROM `examples`.`marketplace`.`orders` GROUP BY customer_id; Rename the view. ALTER VIEW customer_orders RENAME TO vip_customers; Your output should resemble: Statement phase is COMPLETED. Query the renamed view. SELECT * FROM vip_customers; The statement now references the view by its new name. Change the statement expression of a view¶ View the current definition of the view. 
SHOW CREATE VIEW vip_customers; Your output should resemble: +------------------------------------------------------------------------------+ | SHOW CREATE VIEW | +------------------------------------------------------------------------------+ | CREATE VIEW vip_customers AS SELECT customer_id, SUM(price) AS total_spent | | FROM orders | | GROUP BY customer_id; | +------------------------------------------------------------------------------+ Change the statement expression of the view. ALTER VIEW vip_customers AS SELECT customer_id, SUM(price) AS total_spent, COUNT(*) AS order_count FROM `examples`.`marketplace`.`orders` GROUP BY customer_id HAVING SUM(price) > 1000; Your output should resemble: Statement phase is COMPLETED. View the updated definition of the view. SHOW CREATE VIEW vip_customers; Your output should resemble: +-----------------------------------------------------------------------------------------------------+ | SHOW CREATE VIEW | +-----------------------------------------------------------------------------------------------------+ | CREATE VIEW vip_customers AS SELECT customer_id, SUM(price) AS total_spent, COUNT(*) AS order_count | | FROM orders | | GROUP BY customer_id | | HAVING SUM(price) > 1000; | +-----------------------------------------------------------------------------------------------------+ The view now includes an additional order_count column representing the number of orders per customer, and filters for only those customers who have spent more than 1000. Related content¶ CREATE VIEW statement SELECT statement Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
ALTER VIEW [catalog_name.][db_name.]view_name RENAME TO new_view_name

ALTER VIEW [catalog_name.][db_name.]view_name AS new_statement_expression
```

```sql
CREATE VIEW customer_orders AS
SELECT customer_id, SUM(price) AS total_spent
FROM `examples`.`marketplace`.`orders`
GROUP BY customer_id;
```

```sql
ALTER VIEW customer_orders RENAME TO vip_customers;
```

```sql
Statement phase is COMPLETED.
```

```sql
SELECT * FROM vip_customers;
```

```sql
SHOW CREATE VIEW vip_customers;
```

```sql
+------------------------------------------------------------------------------+
|                              SHOW CREATE VIEW                                |
+------------------------------------------------------------------------------+
| CREATE VIEW vip_customers AS SELECT customer_id, SUM(price) AS total_spent   |
| FROM orders                                                                  |
| GROUP BY customer_id;                                                        |
+------------------------------------------------------------------------------+
```

```sql
ALTER VIEW vip_customers AS
SELECT customer_id, SUM(price) AS total_spent, COUNT(*) AS order_count
FROM `examples`.`marketplace`.`orders`
GROUP BY customer_id
HAVING SUM(price) > 1000;
```

```sql
Statement phase is COMPLETED.
```

```sql
SHOW CREATE VIEW vip_customers;
```

```sql
+-----------------------------------------------------------------------------------------------------+
|                                         SHOW CREATE VIEW                                            |
+-----------------------------------------------------------------------------------------------------+
| CREATE VIEW vip_customers AS SELECT customer_id, SUM(price) AS total_spent, COUNT(*) AS order_count |
| FROM orders                                                                                         |
| GROUP BY customer_id                                                                                |
| HAVING SUM(price) > 1000;                                                                           |
+-----------------------------------------------------------------------------------------------------+
```
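
To confirm the effect of the changed statement expression, the view can be queried with an explicit column list. This sketch uses only the view and column names from the example above.

```sql
-- Query the altered view, including the new order_count column.
-- Only customers whose total spending exceeds 1000 are returned.
SELECT customer_id, total_spent, order_count
FROM vip_customers;
```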

---

### SQL CREATE CONNECTION Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/create-connection.html

CREATE CONNECTION Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® supports creating secure connections to external services and data sources. You can use these connections in your Flink statements. Connections are resources that you define to configure the parameters needed for connecting to third-party services. Connections include endpoint and authentication information. They provide a way to handle sensitive information, such as credentials, securely. Connections enable Confluent AI features and Flink UDFs to make secure calls to external services. For more information, see Reuse Confluent Cloud Connections With External Services. A connection has its own lifecycle and can be created, managed, updated, or deleted by users with appropriate permissions. For more information, see Manage Connections. Confluent Cloud for Apache Flink makes a best-effort attempt to redact sensitive values from the CREATE CONNECTION and ALTER CONNECTION statements by masking the values for the known sensitive keys. In Confluent Cloud Console, the sensitive values are redacted in the Flink SQL workspace if you navigate away from the workspace and return, or if you reload the page in the browser. Alternatively, you can use the Confluent CLI commands to create and manage connections. In addition, if the syntax in the CREATE CONNECTION statement is incorrect, Confluent Cloud for Apache Flink may not detect the secrets. For example, if you type CREATE CONNECTION my_conn WITH ('ap-key' = 'x'), Flink won’t redact the x, because api-key is misspelled. Note Connection resources are an Open Preview feature in Confluent Cloud. A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. 
The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features. Confluent may discontinue providing preview releases of the Preview features at any time in Confluent’s sole discretion. Syntax¶ CREATE [OR REPLACE] CONNECTION [IF NOT EXISTS] [catalog_name.][db_name.]connection_name [COMMENT connection_comment] WITH ( 'type' = '<connection-type>', 'endpoint' = '<endpoint-url>', ['sse-endpoint' = '<sse-endpoint-url>'], ['api-key' = 'api_key'] | ['username' = 'user_name', 'password' = 'user_password'] | ['aws-access-key' = '<aws-access-key-id>', 'aws-secret-key' = '<aws-secret-access-key>', 'aws-session-token' = '<aws-session-token>'] | ); Description¶ Create a new secure connection to an external service or data source. Change the authorization settings of an existing connection by using the ALTER CONNECTION statement. To remove a connection from the current database, use the DROP CONNECTION statement. Confluent Cloud for Apache Flink supports these authentication methods: Basic: username and password. The credentials are added to the HTTP request as a BASIC header. Bearer: token. The credentials are added to the HTTP request as a BEARER header. OAuth: token-endpoint, client-id, client-secret, and scope. The provided options are used to retrieve the OAuth token from the token endpoint and add the token to the HTTP request as a BEARER token. 
Connection types¶ The following connection types are supported: azureml, azureopenai, bedrock, confluent_jdbc, couchbase, elastic, googleai, mcp_server, mongodb, openai, pinecone, rest, sagemaker, vertexai. Authorization¶ Depending on the connection type, the following authorization methods are supported: API key: azureml, azureopenai, elastic, googleai, mcp_server, openai, pinecone; basic: mongodb, couchbase, confluent_jdbc, or rest; bearer: rest or mcp_server; oauth: rest or mcp_server. Secrets are extracted to the secret store and aren’t displayed in subsequent DESCRIBE CONNECTION statements, the Flink SQL shell, or the Confluent Cloud Console. The maximum secret length is 4000 bytes, which is checked after the string is converted to bytes. Examples¶ -- example AzureML connection with API key CREATE CONNECTION `my-azureml-connection` WITH ( 'type' = 'AZUREML', 'endpoint' = 'https://myworkspace.myregion.inference.ml.azure.com/test', 'api-key' = '<your-api-key>' ); -- example AzureML connection with comment CREATE CONNECTION `my-azureml-connection` COMMENT 'Connection Comment' WITH ( 'type' = 'AZUREML', 'endpoint' = 'https://myworkspace.myregion.inference.ml.azure.com/test', 'api-key' = '<your-api-key>' ); -- example Couchbase connection with basic authorization CREATE CONNECTION `my-couchbase-connection` WITH ( 'type' = 'COUCHBASE', 'endpoint' = 'couchbases://my-cluster.cloud.couchbase.com', 'username' = '<user-name>', 'password' = '<password>' ); -- example Bedrock connection with AWS authentication CREATE CONNECTION `my-bedrock-connection` WITH ( 'type' = 'BEDROCK', 'endpoint' = 'https://bedrock-runtime.us-east-1.amazonaws.com/model/my-model/invoke', 'aws-access-key' = '<aws-access-key-id>', 'aws-secret-key' = '<aws-secret-access-key>', 'aws-session-token' = '<aws-session-token>' ); -- example REST connection with bearer token CREATE CONNECTION `my-rest-connection` WITH ( 'type' = 'REST', 'endpoint' = 'https://myrest.connection.com', 'token' = '<token>' 
); -- example MCP server connection with OAuth CREATE CONNECTION `my-mcp-connection` WITH ( 'type' = 'MCP_SERVER', 'endpoint' = 'https://mymcp.connection.com', 'scope' = '<scope>', 'token-endpoint' = '<token-endpoint>', 'client-id' = '<client-id>', 'client-secret' = '<client-secret>' ); MongoDB external table¶ -- Create a MongoDB connection with basic authorization. CREATE CONNECTION `my-mongodb-connection` WITH ( 'type' = 'MONGODB', 'endpoint' = 'mongodb+srv://myCluster.mongodb.net/myDatabase', 'username' = '<atlas-user-name>', 'password' = '<atlas-password>' ); -- Use the MongoDB connection to create a MongoDB external table. CREATE TABLE mongodb_movies_full_text_search ( title STRING, plot STRING ) WITH ( 'connector' = 'mongodb', 'mongodb.connection' = 'my-mongodb-connection', 'mongodb.database' = 'sample_mflix', 'mongodb.collection' = 'movies', 'mongodb.index' = 'default' ); Confluent JDBC¶ -- Create a Confluent JDBC connection with basic authorization. CREATE CONNECTION `jdbc-postgres-connection` WITH ( 'type' = 'confluent_jdbc', 'endpoint' = 'jdbc:postgresql://my.example.com:5432/mydatabase', 'username' = '<user-name>', 'password' = '<password>'); -- Use the Confluent JDBC connection to create a table. CREATE TABLE jdbc_postgres ( show_id STRING, type STRING, title STRING, cast_members STRING, country STRING, date_added DATE, release_year INT, rating STRING, duration STRING, listed_in STRING, description STRING ) WITH ( 'connector' = 'confluent-jdbc', 'confluent-jdbc.connection' = 'jdbc-postgres-connection', 'confluent-jdbc.table-name' = 'netflix_shows' ); Related content¶ ALTER CONNECTION DESCRIBE CONNECTION DROP CONNECTION SHOW CONNECTIONS Reuse Confluent Cloud Connections With External Services Note This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.

#### Code Examples

```sql
-- The key is misspelled ('ap-key' instead of 'api-key'), so the value 'x' isn't redacted.
CREATE CONNECTION my_conn WITH ('ap-key' = 'x');
```

```sql
CREATE [OR REPLACE] CONNECTION [IF NOT EXISTS] [catalog_name.][db_name.]connection_name
[COMMENT connection_comment]
WITH (
    'type' = '<connection-type>',
    'endpoint' = '<endpoint-url>',
    ['sse-endpoint' = '<sse-endpoint-url>'],
    ['api-key' = 'api_key'] |
    ['username' = 'user_name', 'password' = 'user_password'] |
    ['aws-access-key' = '<aws-access-key-id>', 'aws-secret-key' = '<aws-secret-access-key>', 'aws-session-token' = '<aws-session-token>'] |
);
```

```sql
-- example AzureML connection with API key
CREATE CONNECTION `my-azureml-connection`
  WITH (
    'type' = 'AZUREML',
    'endpoint' = 'https://myworkspace.myregion.inference.ml.azure.com/test',
    'api-key' = '<your-api-key>'
  );

-- example AzureML connection with comment
CREATE CONNECTION `my-azureml-connection`
  COMMENT 'Connection Comment'
  WITH (
    'type' = 'AZUREML',
    'endpoint' = 'https://myworkspace.myregion.inference.ml.azure.com/test',
    'api-key' = '<your-api-key>'
  );

-- example Couchbase connection with basic authorization
CREATE CONNECTION `my-couchbase-connection`
  WITH (
    'type' = 'COUCHBASE',
    'endpoint' = 'couchbases://my-cluster.cloud.couchbase.com',
    'username' = '<user-name>',
    'password' = '<password>'
  );

-- example Bedrock connection with AWS authentication
CREATE CONNECTION `my-bedrock-connection`
  WITH (
    'type' = 'BEDROCK',
    'endpoint' = 'https://bedrock-runtime.us-east-1.amazonaws.com/model/my-model/invoke',
    'aws-access-key' = '<aws-access-key-id>',
    'aws-secret-key' = '<aws-secret-access-key>',
    'aws-session-token' = '<aws-session-token>'
  );

-- example REST connection with bearer token
CREATE CONNECTION `my-rest-connection`
  WITH (
    'type' = 'REST',
    'endpoint' = 'https://myrest.connection.com',
    'token' = '<token>'
  );

-- example MCP server connection with OAuth
CREATE CONNECTION `my-mcp-connection`
  WITH (
    'type' = 'MCP_SERVER',
    'endpoint' = 'https://mymcp.connection.com',
    'scope' = '<scope>',
    'token-endpoint' = '<token-endpoint>',
    'client-id' = '<client-id>',
    'client-secret' = '<client-secret>'
  );
```
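
The authorization list names basic authorization as an option for `rest` connections, but the examples above show only the bearer-token variant. The following sketch combines the `username` and `password` options from the syntax with the REST type; the connection name is hypothetical and the endpoint is the placeholder used in the bearer example.

```sql
-- Hypothetical connection name; replace the placeholders with real values.
-- Uses basic authorization (username and password) instead of a bearer token.
CREATE CONNECTION `my-rest-basic-connection`
  WITH (
    'type' = 'REST',
    'endpoint' = 'https://myrest.connection.com',
    'username' = '<user-name>',
    'password' = '<password>'
  );
```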

```sql
-- Create a MongoDB connection with basic authorization.
CREATE CONNECTION `my-mongodb-connection`
  WITH (
    'type' = 'MONGODB',
    'endpoint' = 'mongodb+srv://myCluster.mongodb.net/myDatabase',
    'username' = '<atlas-user-name>',
    'password' = '<atlas-password>'
  );

-- Use the MongoDB connection to create a MongoDB external table.
CREATE TABLE mongodb_movies_full_text_search (
    title STRING,
    plot STRING
) WITH (
    'connector' = 'mongodb',
    'mongodb.connection' = 'my-mongodb-connection',
    'mongodb.database' = 'sample_mflix',
    'mongodb.collection' = 'movies',
    'mongodb.index' = 'default'
);
```

```sql
-- Create a Confluent JDBC connection with basic authorization.
CREATE CONNECTION `jdbc-postgres-connection`
  WITH (
    'type' = 'confluent_jdbc',
    'endpoint' = 'jdbc:postgresql://my.example.com:5432/mydatabase',
    'username' = '<user-name>',
    'password' = '<password>');

-- Use the Confluent JDBC connection to create a table.
CREATE TABLE jdbc_postgres (
    show_id STRING,
    type STRING,
    title STRING,
    cast_members STRING,
    country STRING,
    date_added DATE,
    release_year INT,
    rating STRING,
    duration STRING,
    listed_in STRING,
    description STRING
) WITH (
    'connector' = 'confluent-jdbc',
    'confluent-jdbc.connection' = 'jdbc-postgres-connection',
    'confluent-jdbc.table-name' = 'netflix_shows'
);
```

---

### Flink SQL CREATE FUNCTION Statement in Confluent Cloud | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/create-function.html

CREATE FUNCTION Statement¶ Confluent Cloud for Apache Flink® enables registering custom user-defined functions (UDFs) by using the CREATE FUNCTION statement. When your UDFs are registered in a Flink database, you can use them in your SQL queries. Syntax¶ CREATE FUNCTION <function-name> AS <class-name> USING JAR 'confluent-artifact://<plugin-id>/<version-id>'; Description¶ Register a user-defined function (UDF) in the current database. To remove a UDF from the current database, use the DROP FUNCTION statement. Related content¶ Create a User-Defined Function with Confluent Cloud for Apache Flink confluent flink artifact create

#### Code Examples

```sql
CREATE FUNCTION <function-name>
  AS <class-name>
  USING JAR 'confluent-artifact://<plugin-id>/<version-id>';
```
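
The syntax above can be made concrete with a minimal sketch. The function name `mask_email`, the class `com.example.udf.MaskEmail`, the table `customers`, and the `cfa-abc123` and `ver-xyz789` identifiers are all placeholders invented for this illustration. The class name is written as a string literal, as in Apache Flink; substitute the plugin and version IDs of your own uploaded artifact.

```sql
-- Register the UDF from an uploaded artifact (all identifiers are hypothetical).
CREATE FUNCTION mask_email
  AS 'com.example.udf.MaskEmail'
  USING JAR 'confluent-artifact://cfa-abc123/ver-xyz789';

-- Call the registered function like a built-in function.
-- `customers`, `customer_id`, and `email` are assumed names.
SELECT customer_id, mask_email(email) AS masked_email
FROM customers;

-- Remove the function from the current database when it is no longer needed.
DROP FUNCTION mask_email;
```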

---

### SQL CREATE MODEL Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/create-model.html

CREATE MODEL Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables real-time inference and prediction with AI and ML models. The Flink SQL interface is available in Cloud Console and the Flink SQL shell. Get started using AI models with Run an AI Model. The following providers are supported: AWS Bedrock AWS Sagemaker Azure Machine Learning (Azure ML) Azure OpenAI Google AI OpenAI Vertex AI Syntax¶ CREATE MODEL [IF NOT EXISTS] [[catalogname].[database_name]].model_name [INPUT (input_column_list)] [OUTPUT (output_column_list)] [COMMENT model_comment] WITH(model_option_list) Description¶ Create a new AI model. If a model with the same name exists already, a new version of the model is created. For more information, see version. If the IF NOT EXISTS option is specified and a model with the same name exists already, the statement is ignored. To view the currently registered models, use the SHOW MODELS statement. To view the WITH options that were used to create the model, run the SHOW CREATE MODEL statement. To view the versions, inputs, and outputs of the model, run the Models statement. To change the name or options of an existing model, use the ALTER MODEL statement. To delete a model from the current environment, use the DROP MODEL statement. Tip If you get a 429 error when you run a CREATE MODEL statement, the most likely cause is rate limiting by the model provider. Some providers, like Azure OpenAI, support increasing the default limit of tokens per minute. Increasing this limit to match your throughput may fix 429 errors. Task types¶ Confluent Cloud for Apache Flink supports these types of analysis for AI model inference: Classification: Categorize input data into predefined classes or labels. This task is used in applications like spam detection, where emails are classified as “spam” or “not spam”, and image recognition. 
Clustering: Group a set of objects so that objects in the same group, called a “cluster”, are more similar to each other than to those in other groups. This task is a form of unsupervised learning, because it doesn’t rely on predefined categories. Applications include customer segmentation in marketing and gene sequence analysis in biology. Embedding: Transform high-dimensional data into lower-dimensional vectors while preserving the relative distances between data points. This is crucial for tasks like natural language processing (NLP), where words or sentences are converted into vectors, enabling models to understand semantic similarities. Embeddings are used in recommendation systems, search engines, and more. Regression: Regression models predict a continuous output variable based on one or more input features. This task is used in scenarios like predicting house prices based on features like size, location, and number of bedrooms, or forecasting stock prices. Regression analysis helps in understanding the relationships between variables and forecasting. Text generation: Generate human-like text based on input data. Applications include chatbots, content creation, and language translation. When you register an AI or ML model, you specify the task type by using the task property. task is a required property, but it applies only when using the ML_EVALUATE function. Examples¶ The following code example shows how to run an AI model. The model must be created with the model provider and registered by using the CREATE MODEL statement with <model-name>. SELECT * FROM my_table, LATERAL TABLE(ML_PREDICT('<model-name>', column1, column2)); All of the CREATE MODEL statements require a connection resource that you create by using the CREATE CONNECTION statement. For example, the following code example shows how to create a connection for AWS Bedrock. # Example command to create a connection for AWS Bedrock. 
CREATE CONNECTION bedrock-connection WITH ( 'type' = 'bedrock', 'endpoint' = 'https://bedrock-runtime.us-west-2.amazonaws.com/model/amazon.titan-embed-text-v1/invoke', 'aws-access-key' = '<aws-access-key>', 'aws-secret-key' = '<aws-secret-key>', 'aws-session-token' = '<aws-session-token>' ); Classification task¶ The following example shows how to create an OpenAI classification model. For more information, see Sentiment analysis with OpenAI LLM. CREATE MODEL sentimentmodel INPUT(text STRING) OUTPUT(sentiment STRING) COMMENT 'sentiment analysis model' WITH ( 'provider' = 'openai', 'task' = 'classification', 'openai.connection' = '<cli-connection>', 'openai.model_version' = 'gpt-3.5-turbo', 'openai.system_prompt' = 'Analyze the sentiment of the text and return only POSITIVE, NEGATIVE, or NEUTRAL.' ); Clustering task¶ The following example shows how to create an Azure ML clustering model. It requires that a K-Means model has been trained and deployed on Azure. Replace <ENDPOINT> and <REGION> with your values. CREATE MODEL clusteringmodel INPUT (vectors ARRAY<FLOAT>, other_feature INT, other_feature2 STRING) OUTPUT (cluster_num INT) WITH ( 'task' = 'clustering', 'provider' = 'azureml', 'azureml.connection' = '<cli-connection>' ); Embedding task¶ The following example shows how to create an AWS Bedrock text embedding model. Replace <REGION> with your value. For more information, see Text embedding with AWS Bedrock and Azure OpenAI. CREATE MODEL embeddingmodel INPUT (text STRING) OUTPUT (embedding ARRAY<FLOAT>) WITH ( 'task' = 'embedding', 'provider' = 'bedrock', 'bedrock.connection' = '<cli-connection>' ); Text generation task¶ The following example shows how to create an OpenAI text generation task for translating from English to Spanish. 
CREATE MODEL translatemodel INPUT(english STRING) OUTPUT(spanish STRING) COMMENT 'spanish translation model' WITH ( 'provider' = 'openai', 'task' = 'text_generation', 'openai.connection' = '<cli-connection>', 'openai.model_version' = 'gpt-3.5-turbo', 'openai.system_prompt' = 'Translate to spanish' ); For more examples, see Run an AI Model. Model versioning¶ A model can have multiple versions. A version is an integer number that starts at 1. The default version for a new model is 1. Currently, the maximum number of supported versions is 10. New versions are created by running the CREATE MODEL statement again with the same model name. A new version increments the current maximum version by 1. To view the versions of a model, use the DESCRIBE MODEL statement. Only model options are versioned, which means that the input/output format and comments don’t change across versions. The statement fails if the input format, output format, or comments change. For model options, model task changes are not permitted. The following code example shows the result of running CREATE MODEL twice with the same model name. CREATE MODEL `my-model` ... -- Output `my-model` with version 1 created. Default version: 1 CREATE MODEL `my-model` ... -- Output `my-model` with version 2 created. Default version: 1 Version 1 is the default version when a model is first created. As more versions are created by the CREATE MODEL statement, you can change the default version by using the ALTER MODEL statement. The following example shows how to change the default version of an existing model. ALTER MODEL <model-name> SET ('default_version'='<version>'); You can access a specific version of a model in queries by using the <model_name>$<model_version> syntax. If no version is specified, the default version is used. The following code examples show how to use a specific version of a model in a query. -- Use version 2 of the model.
SELECT * FROM `my-table` LATERAL TABLE (ML_PREDICT('my-model$2', col1, col2)); -- Use the default version of the model. SELECT * FROM `my-table` LATERAL TABLE (ML_PREDICT('my-model', col1, col2)); Use the <model_name>$<model_version> syntax to delete a specific version of a model: -- Delete a specific version of the model. DROP MODEL `<model-name>$<version>`; -- Delete all versions and the model. DROP MODEL `<model-name>$all`; The maximum version number is the next default version. If all versions are dropped, the whole model is deleted. To change the version of an existing model, use the ALTER MODEL statement. If no version is specified, the default version is changed. ALTER MODEL `<model-name>$<version>` SET ('k1'='v1', 'k2'='v2'); WITH options¶ Specify the details of your AI inference model by using the WITH clause. The following tables show the supported properties in the WITH clause. Model Provider Property Common {PROVIDER}.client_timeout {PROVIDER}.connection {PROVIDER}.input_format {PROVIDER}.input_content_type {PROVIDER}.output_format {PROVIDER}.output_content_type {PROVIDER}.PARAMS.* {PROVIDER}.system_prompt OpenAI openai.input_format openai.model_version Azure OpenAI azureopenai.input_format azureopenai.model_version Azure ML azureml.input_format azureml.deployment_name Google AI googleai.input_format Sagemaker sagemaker.custom_attributes sagemaker.enable_explanations sagemaker.inference_component_name sagemaker.inference_id sagemaker.input_content_type sagemaker.output_content_type sagemaker.target_container_hostname sagemaker.target_model sagemaker.target_variant Vertex AI vertexai.service_key vertexai.input_format Connection resource¶ Secrets must be set by using a connection resource that you create by using the CREATE CONNECTION statement. The connection resource securely contains the provider endpoint and secrets like the API key. For example, the following code example shows how to create a connection to OpenAI, named openai-connection. 
CREATE CONNECTION openai-connection WITH ( 'type' = 'openai', 'endpoint' = 'https://api.openai.com/v1/chat/completions', 'api-key' = '<your-api-key>' ); Specify the connection by name in the {PROVIDER}.connection property of the WITH clause. The environment, cloud, and region options in the CREATE CONNECTION statement must be the same as the compute pool which uses the connection. The following code example shows how to refer to the connection named openai-connection in the WITH clause: 'openai.connection' = 'openai-connection' The maximum secret length is 4000 bytes, which is checked after the string is converted to bytes. Common properties¶ The following properties are common to all of the model providers. {PROVIDER}.client_timeout¶ Set the request timeout to the client endpoint. {PROVIDER}.connection¶ Set the credentials for connecting to a model provider. Create the connection resource by using the CREATE CONNECTION statement. This property is required. {PROVIDER}.input_format¶ Set the json, text, or binary input format used by the model. Each provider has a default value. This property is optional. For supported input formats, see Text generation and LLM model formats and Other formats. {PROVIDER}.input_content_type¶ The HTTP content media type header to set when calling the model. The value is a Media/MIME type. The default is chosen based on input_format. Usually, this property is required only for Sagemaker and Bedrock models. {PROVIDER}.output_format¶ Set the json, text, or binary output format used by the model. The default is chosen based on input_format. This property is optional. For supported output formats, see Text generation and LLM model formats and Other formats.. {PROVIDER}.output_content_type¶ The HTTP Accept media type header to set when calling the model. The value is a Media/MIME type. The default is chosen based on output_format. Usually, this property is required only for Sagemaker and Bedrock models. 
{PROVIDER}.PARAMS.*¶ Provide parameters based on the input_format. The maximum number of parameters you can set is 32. This property is optional. For more information, see Parameters. {PROVIDER}.system_prompt¶ A system prompt passed to an LLM model to give it general behavioral instructions. The value is a string. Not all models support a system prompt. This property is optional. task¶ Specify the kind of analysis to perform. Supported values are: “classification” “clustering” “embedding” “regression” “text_generation” This property is required, but it applies only when using the ML_EVALUATE function. OpenAI properties¶ openai.input_format¶ Set the input format used by the model. The default is OPENAI-CHAT. This property is optional. openai.model_version¶ Set the version string of the requested model. The default is gpt-3.5-turbo. This property is optional. Azure OpenAI properties¶ Properties for OpenAI models deployed in Azure AI Studio. Azure OpenAI accepts all of the OpenAI parameters, but with a different endpoint. azureopenai.input_format¶ Set the input format used by the model. The default is OPENAI-CHAT. This property is optional. azureopenai.model_version¶ Set the version string of the requested model. The default is gpt-3.5-turbo. This property is optional. Azure ML properties¶ Properties for both Azure Machine Learning and LLM models from Azure AI Studio can use this provider. azureml.input_format¶ Set the input format used by the model. The default is AZUREML-PANDAS-DATAFRAME. For AI Studio LLMs, OPENAI-CHAT is usually the correct format, even for non-OpenAI models. This property is optional. azureml.deployment_name¶ Set the model name. Bedrock properties¶ The default input_format for Bedrock is determined automatically based on the model endpoint, or AMAZON-TITAN-TEXT if there is no match. If necessary, change it to match the model for your endpoint. Google AI properties¶ googleai.input_format¶ Set the input format used by the model. 
The default is GEMINI-GENERATE. This property is optional. Sagemaker properties¶ sagemaker.custom_attributes¶ Set a model-dependent value that is passed through to Sagemaker in the header of the same name. This property is optional. sagemaker.enable_explanations¶ Enable writing explanations, if your model supports them. Passed through to Sagemaker in the header of the same name. If your model supports writing explanations, they should be disabled, because Confluent Cloud for Apache Flink currently doesn’t support reading them. Don’t set enable_explanations if the model doesn’t support explanations, because this causes Sagemaker to return an error. This property is optional. sagemaker.inference_component_name¶ Specify which inference component to use in the endpoint. Passed through to Sagemaker in the header of the same name. This property is optional. sagemaker.inference_id¶ Set an ID that is passed through to Sagemaker in the header of the same name. Used for tracking request origins. This property is optional. sagemaker.input_content_type¶ The HTTP content media type header to set when calling the model. Setting this property overrides the Content-type header for the model request. Many Sagemaker models use this header to determine their behavior, but set it only if choosing an appropriate input_format is not sufficient. This property is optional. sagemaker.output_content_type¶ The HTTP Accept media type header to set when calling the model. Setting this property overrides the Accept header for the model request. Some Sagemaker models use this header to determine their outputs, but set it only if choosing an appropriate output_format is not sufficient. This property is optional. sagemaker.target_container_hostname¶ Allows calling a specific container when the endpoint has multiple containers. Passed through to Sagemaker in the header of the same name. This property is optional. 
sagemaker.target_model¶ Enables calling a specific model from multiple models deployed to the same endpoint. Passed through to Sagemaker in the header of the same name. This property is optional. sagemaker.target_variant¶ Enables calling a specific version of the model from multiple deployed variants. Passed through to Sagemaker in the header of the same name. This property is optional. Vertex AI properties¶ vertexai.service_key¶ Set the Service Account Key of a service account with permission to call the inference endpoint. This value is a secret. This property is required. vertexai.input_format¶ Set the input format used by the model. The default is TF-SERVING. Defaults to GEMINI-GENERATE if the endpoint is for a published Gemini model. This property is optional. Supported input/output formats¶ The following input/output formats for text generation and LLM models are supported. AI-21-COMPLETE AMAZON-TITAN-EMBED AMAZON-TITAN-TEXT ANTHROPIC-COMPLETIONS ANTHROPIC-MESSAGES AZURE-EMBED BEDROCK-LLAMA COHERE-CHAT COHERE-EMBED COHERE-GENERATE GEMINI-GENERATE GEMINI-CHAT MISTRAL-CHAT MISTRAL-COMPLETIONS OPENAI-CHAT OPENAI-EMBED VERTEX-EMBED The following additional input/output formats are supported. AZUREML-PANDAS-DATAFRAME AZUREML-TENSOR BINARY CSV JSON JSON-ARRAY JSON:wrapper KSERVE-V1 KSERVE-V2 MLFLOW-TENSOR PANDAS-DATAFRAME TEXT TF-SERVING TF-SERVING-COLUMN TRITON VERTEXAI-PYTORCH Parameters¶ The text generation and LLM formats support some or all of the following parameters. {PROVIDER}.PARAMS.temperature¶ Controls the randomness or “creativity” of the output. Typical values are between 0.0 and 1.0. This parameter is model-dependent. Its type is Float. {PROVIDER}.PARAMS.top_p¶ The probability cutoff for token selection. Usually, either temperature or top_p are specified, but not both. This parameter is model-dependent. Its type is Float. {PROVIDER}.PARAMS.top_k¶ The number of possible tokens to sample from at each step. This parameter is model-dependent. 
Its type is Float. {PROVIDER}.PARAMS.stop¶ A CSV list of strings to pass as stop sequences to the model. {PROVIDER}.PARAMS.max_tokens¶ The maximum number of tokens for the model to return. Its type is Int. Text generation and LLM model formats¶ The following formats are intended for text generation models and LLMs. They require that the model has a single STRING input and a single STRING output. AI-21-COMPLETE¶ This format is for models using the AI21 Labs J2 Complete API, including the AI21 Labs Foundation models on AWS Bedrock. This format does not support the top_k parameter. AMAZON-TITAN-EMBED¶ This format is for Amazon Titan Text Embedding models. AMAZON-TITAN-TEXT¶ The format is for Amazon’s Titan Text models. This is the default format for the AWS Bedrock provider. This format does not support the top_k parameter. ANTHROPIC-COMPLETIONS¶ This format is for models using the Anthropic Claude Text Completions API, including some Anthropic models on AWS Bedrock. ANTHROPIC-MESSAGES¶ This format is for models using the Anthropic Claude Messages API, including some Anthropic models on AWS Bedrock. Some Anthropic models accept both this and the Completions API format. AZURE-EMBED¶ The embedding format used by other foundation models on Azure. This format is the same as OPENAI-EMBED. BEDROCK-LLAMA¶ The format used by Llama models on AWS Bedrock. This format does not support the top_k or stop parameters. COHERE-CHAT¶ The Cohere Chat API format. COHERE-EMBED¶ Cohere’s Embedding API format. COHERE-GENERATE¶ The legacy Cohere Chat API format. This format is used by AWS Bedrock Cohere Command models. GEMINI-GENERATE¶ The Google Gemini API format. This is the default format for the Google AI provider, but you can also use it with Gemini models on the Google Vertex AI. GEMINI-CHAT¶ Same as the GEMINI-GENERATE format. MISTRAL-CHAT¶ The standard Mistral API format. MISTRAL-COMPLETIONS¶ The legacy Mistral Completions API format used by AWS Bedrock. 
OPENAI-CHAT¶ The OpenAI Chat API format. This is the default for the OpenAI and Azure OpenAI providers. It is also generally used by most non-OpenAI LLM models deployed in Azure AI Studio using the Azure ML provider. OPENAI-EMBED¶ The OpenAI Embedding model format. VERTEX-EMBED¶ The Embedding format for Vertex AI Gemini models. Other formats¶ The following formats are intended for predictive models running on providers like Sagemaker, Vertex AI, and Azure ML. Usually, these models are used for tasks like classification, regression, and clustering. Currently, none of these formats support PARAMS. Unless specified, each input format defaults to the associated output format with the same name. AZUREML-PANDAS-DATAFRAME¶ Azure ML’s version of the Pandas Dataframe Split format. The only difference is that this version has “input_data” as the top-level field, instead of “dataframe_split”. This is the default format for Azure ML models. The output format defaults to JSON-ARRAY. AZUREML-TENSOR¶ Azure ML’s version of named input tensors. Equivalent to the “JSON:input_data” input format. the output format defaults to “JSON:outputs”. BINARY¶ Raw binary inputs, serialized in little-endian byte order. This input format accepts multiple input columns, which are packed in order. CSV¶ Comma separated text. This is the default format for Sagemaker models, but Sagemaker models vary widely, and most models must choose a different format. JSON¶ The inputs are formatted as a JSON object, with field names equal to the column names of the model input schema. The JSON format supports user-defined parameters. If you specify '{provider}.params.some_key'='value' in the WITH options, the key and value are used in the JSON input as {"some_key": "value"}. Example: { "column1": "String Data", "column2": [1,2,3,4] } JSON-ARRAY¶ The inputs are formatted as a JSON array, including [] brackets, but without the {} braces of a top-level JSON object. Column names are not included in the format. 
If the model takes a single input array column, it will be output as the top-level array. Models with multiple inputs have their arrays nested in JSON fashion. This format is usually appropriate for models that expect NumPy arrays. Example: [1,2,3,"String Data"] JSON:wrapper¶ Similar to the default JSON behavior, but all fields are wrapped in a named top-level object. The wrapper may be any valid JSON string. Example: { "wrapper": { "column1": "String Data", "column2": [1,2,3,4] } } KSERVE-V1¶ Same as the TF-SERVING format. KSERVE-V2¶ Same as the TRITON format. MLFLOW-TENSOR¶ The format used by some MLflow models. It is the same format as TF-SERVING-COLUMN. PANDAS-DATAFRAME¶ The Pandas Dataframe Split format used by most MLflow models. The output format defaults to JSON-ARRAY. TEXT¶ Model input values formatted as raw text. Use newlines to separate multiple inputs. TF-SERVING¶ The TensorFlow Serving Row format. This is the default format for Vertex AI models. It is generally the correct format to use for most predictive models trained in Vertex AI. TF-SERVING-COLUMN¶ The TensorFlow Serving Column format. It is exactly equivalent to “JSON:inputs”. The output format defaults to “JSON:outputs”. TRITON¶ The Triton/KServeV2 format used by NVIDIA Triton Inference Servers. When possible, this format serializes data in the protocol’s mixed json+binary format. Note that some Tensor datatypes, like 16-bit floats, do not have an exact equivalent in Flink SQL, but they are converted, when possible. VERTEXAI-PYTORCH¶ Vertex AI’s format for PyTorch models. This format is the TF-SERVING format with an extra wrapper around the data. The output format defaults to TF-SERVING. Related content¶ ALTER MODEL DROP MODEL Run an AI Model

#### Code Examples

```sql
CREATE MODEL [IF NOT EXISTS] [[catalogname].[database_name]].model_name
  [INPUT (input_column_list)]
  [OUTPUT (output_column_list)]
  [COMMENT model_comment]
  WITH(model_option_list)
```

```sql
SELECT * FROM my_table, LATERAL TABLE(ML_PREDICT('<model-name>', column1, column2));
```

```sql
-- Example statement to create a connection for AWS Bedrock.
-- The connection name contains a hyphen, so it must be quoted with backticks.
CREATE CONNECTION `bedrock-connection`
  WITH (
    'type' = 'bedrock',
    'endpoint' = 'https://bedrock-runtime.us-west-2.amazonaws.com/model/amazon.titan-embed-text-v1/invoke',
    'aws-access-key' = '<aws-access-key>',
    'aws-secret-key' = '<aws-secret-key>',
    'aws-session-token' = '<aws-session-token>'
  );
```

```sql
CREATE MODEL sentimentmodel
INPUT(text STRING)
OUTPUT(sentiment STRING)
COMMENT 'sentiment analysis model'
WITH (
  'provider' = 'openai',
  'task' = 'classification',
  'openai.connection' = '<cli-connection>',
  'openai.model_version' = 'gpt-3.5-turbo',
  'openai.system_prompt' = 'Analyze the sentiment of the text and return only POSITIVE, NEGATIVE, or NEUTRAL.'
);
```
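
As a minimal sketch of how the classification model above might be invoked, the following query passes one STRING column to `ML_PREDICT` and reads back the `sentiment` output column. The table `product_reviews` and its columns `review_id` and `review_text` are assumptions made for this illustration.

```sql
-- `product_reviews`, `review_id`, and `review_text` are hypothetical names.
-- `sentiment` is the output column declared in the CREATE MODEL statement.
SELECT review_id, review_text, sentiment
FROM product_reviews,
  LATERAL TABLE(ML_PREDICT('sentimentmodel', review_text));
```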

```sql
CREATE MODEL clusteringmodel
INPUT (vectors ARRAY<FLOAT>, other_feature INT, other_feature2 STRING)
OUTPUT (cluster_num INT)
WITH (
  'task' = 'clustering',
  'provider' = 'azureml',
  'azureml.connection' = '<cli-connection>'
);
```

```sql
CREATE MODEL embeddingmodel
INPUT (text STRING)
OUTPUT (embedding ARRAY<FLOAT>)
WITH (
  'task' = 'embedding',
  'provider' = 'bedrock',
  'bedrock.connection' = '<cli-connection>'
);
```
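
The Bedrock input format is normally inferred from the model endpoint. If the inferred format doesn't match your endpoint, a sketch like the following sets it explicitly through the common `{PROVIDER}.input_format` property. The model name `titan_embedding_explicit` and the table `support_tickets` with its columns are hypothetical.

```sql
-- Same embedding model as above, with the input format set explicitly
-- instead of relying on endpoint-based detection.
CREATE MODEL titan_embedding_explicit
INPUT (text STRING)
OUTPUT (embedding ARRAY<FLOAT>)
WITH (
  'task' = 'embedding',
  'provider' = 'bedrock',
  'bedrock.connection' = '<cli-connection>',
  'bedrock.input_format' = 'AMAZON-TITAN-EMBED'
);

-- Compute an embedding for each row of a hypothetical table.
SELECT ticket_id, embedding
FROM support_tickets,
  LATERAL TABLE(ML_PREDICT('titan_embedding_explicit', ticket_text));
```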

```sql
CREATE MODEL translatemodel
INPUT(english STRING)
OUTPUT(spanish STRING)
COMMENT 'spanish translation model'
WITH (
  'provider' = 'openai',
  'task' = 'text_generation',
  'openai.connection' = '<cli-connection>',
  'openai.model_version' = 'gpt-3.5-turbo',
  'openai.system_prompt' = 'Translate to spanish'
);
```

```sql
CREATE MODEL `my-model` ...

-- Output:
-- `my-model` with version 1 created. Default version: 1

CREATE MODEL `my-model` ...

-- Output:
-- `my-model` with version 2 created. Default version: 1
```
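
To make the versioning rules concrete, the following sketch reruns CREATE MODEL for the `translatemodel` example with a changed option. Only the system prompt differs; the input, output, comment, and task stay the same, because changing any of those causes the statement to fail. The revised prompt text is invented for this illustration.

```sql
-- Creates version 2 of `translatemodel`, assuming version 1 already exists.
-- INPUT, OUTPUT, COMMENT, and 'task' must match version 1 exactly.
CREATE MODEL translatemodel
INPUT(english STRING)
OUTPUT(spanish STRING)
COMMENT 'spanish translation model'
WITH (
  'provider' = 'openai',
  'task' = 'text_generation',
  'openai.connection' = '<cli-connection>',
  'openai.model_version' = 'gpt-3.5-turbo',
  'openai.system_prompt' = 'Translate to formal Latin American Spanish'
);

-- List the versions, inputs, and outputs of the model.
DESCRIBE MODEL translatemodel;
```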

```sql
ALTER MODEL <model-name> SET ('default_version'='<version>');
```
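
With the placeholders filled in, switching the default version might look like the following sketch. It assumes that `my-model` already has a version 2, and that `my-table`, `col1`, and `col2` exist.

```sql
-- Make version 2 the default for `my-model`.
ALTER MODEL `my-model` SET ('default_version' = '2');

-- Queries that omit the $<version> suffix now resolve to version 2.
SELECT * FROM `my-table`, LATERAL TABLE(ML_PREDICT('my-model', col1, col2));
```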

```sql
-- Use version 2 of the model.
SELECT * FROM `my-table`, LATERAL TABLE (ML_PREDICT('my-model$2', col1, col2));

-- Use the default version of the model.
SELECT * FROM `my-table`, LATERAL TABLE (ML_PREDICT('my-model', col1, col2));
```

```sql
-- Delete a specific version of the model.
DROP MODEL `<model-name>$<version>`;

-- Delete all versions and the model.
DROP MODEL `<model-name>$all`;
```

```sql
ALTER MODEL `<model-name>$<version>` SET ('k1'='v1', 'k2'='v2');
```

```sql
CREATE CONNECTION `openai-connection`
  WITH (
    'type' = 'openai',
    'endpoint' = 'https://api.openai.com/v1/chat/completions',
    'api-key' = '<your-api-key>'
  );
```

```sql
'openai.connection' = 'openai-connection'
```
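
Putting the connection reference in context, the following sketch registers a text generation model that uses `openai-connection` and sets two of the optional LLM parameters. The model name `summarizer`, its column names, the prompt, and the parameter values are assumptions for illustration. The lowercase `params` key form follows the `'{provider}.params.some_key'` pattern used for user-defined parameters.

```sql
-- Hypothetical model that references the connection by name.
CREATE MODEL summarizer
INPUT (article STRING)
OUTPUT (summary STRING)
WITH (
  'provider' = 'openai',
  'task' = 'text_generation',
  'openai.connection' = 'openai-connection',
  'openai.system_prompt' = 'Summarize the text in one sentence.',
  -- Optional, model-dependent parameters: lower randomness, capped output.
  'openai.params.temperature' = '0.2',
  'openai.params.max_tokens' = '128'
);
```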

```sql
'{provider}.params.some_key'='value'
```

```json
{"some_key": "value"}
```

```json
{
  "column1": "String Data",
  "column2": [1,2,3,4]
}
```

```json
[1,2,3,"String Data"]
```

```json
{
  "wrapper": {
    "column1": "String Data",
    "column2": [1,2,3,4]
  }
}
```
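
As a sketch of how a wrapped JSON format might be selected, the following statement registers a regression model whose request body nests the input columns under a top-level `features` object. The provider value `sagemaker`, the model name, the column names, and the wrapper name `features` are assumptions for this illustration; check them against your own endpoint.

```sql
-- Hypothetical regression model on a Sagemaker endpoint.
-- With 'JSON:features', the request body is expected to take the shape
-- {"features": {"sqft": ..., "bedrooms": ...}}.
CREATE MODEL house_price_model
INPUT (sqft INT, bedrooms INT)
OUTPUT (price DOUBLE)
WITH (
  'provider' = 'sagemaker',
  'task' = 'regression',
  'sagemaker.connection' = '<cli-connection>',
  'sagemaker.input_format' = 'JSON:features'
);
```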

---

### SQL CREATE TABLE Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/create-table.html

CREATE TABLE Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables creating tables backed by Apache Kafka® topics by using the CREATE TABLE statement. With Flink tables, you can run SQL queries on streaming data in Kafka topics. Syntax¶ CREATE TABLE [IF NOT EXISTS] [catalog_name.][db_name.]table_name ( { <physical_column_definition> | <metadata_column_definition> | <computed_column_definition> | <column_in_vector_db_provider> }[ , ...n] [ <watermark_definition> ] [ <table_constraint> ][ , ...n] ) [COMMENT table_comment] [DISTRIBUTED BY (distribution_column_name1, distribution_column_name2, ...) INTO n BUCKETS] WITH (key1=value1, key2=value2, ...) [ LIKE source_table [( <like_options> )] | AS select_query ] <physical_column_definition>: column_name column_type [ <column_constraint> ] [COMMENT column_comment] <metadata_column_definition>: column_name column_type METADATA [ FROM metadata_key ] [ VIRTUAL ] <computed_column_definition>: column_name AS computed_column_expression [COMMENT column_comment] <column_in_vector_db_provider> column_name column_type <watermark_definition>: WATERMARK FOR rowtime_column_name AS watermark_strategy_expression <table_constraint>: [CONSTRAINT constraint_name] PRIMARY KEY (column_name, ...) NOT ENFORCED <like_options>: { { INCLUDING | EXCLUDING } { ALL | CONSTRAINTS | PARTITIONS } | { INCLUDING | EXCLUDING | OVERWRITING } { GENERATED | OPTIONS | WATERMARKS } } Description¶ Register a table into the current or specified catalog. When a table is registered, you can use it in SQL queries. The CREATE TABLE statement always creates a backing Kafka topic as well as the corresponding schema subjects for key and value. Trying to create a table with a name that exists in the catalog causes an exception. The table name can be in these formats: catalog_name.db_name.table_name: The table is registered with the catalog named “catalog_name” and the database named “db_name”. 
db_name.table_name: The table is registered into the current catalog of the execution table environment and the database named “db_name”. table_name: The table is registered into the current catalog and the database of the execution table environment. A table registered with the CREATE TABLE statement can be used as both table source and table sink. Flink can’t determine whether the table is used as a source or a sink until it’s referenced in a DML query. The following sections show the options and clauses that are available with the CREATE TABLE statement. Physical / Regular Columns Metadata columns Computed columns System columns Watermark clause PRIMARY KEY constraint DISTRIBUTED BY clause CREATE TABLE AS SELECT (CTAS) LIKE WITH options Usage¶ This following CREATE TABLE statement registers a table named t1 in the current catalog. Also, it creates a backing Kafka topic and corresponding value-schema. By default, the table is registered as append-only, uses AVRO serializers, and reads from the earliest offset. CREATE TABLE t1 ( `id` BIGINT, `name` STRING, `age` INT, `salary` DECIMAL(10,2), `active` BOOLEAN, `created_at` TIMESTAMP_LTZ(3) ); You can override defaults by specifying WITH options. The following SQL registers the table in retraction mode, so you can use the table to sink the results of a streaming join. CREATE TABLE t2 ( `id` BIGINT, `name` STRING, `age` INT, `salary` DECIMAL(10,2), `active` BOOLEAN, `created_at` TIMESTAMP_LTZ(3) ) WITH ( 'changelog.mode' = 'retract' ); Physical / Regular Columns¶ Physical or regular columns are the columns that define the structure of the table and the data types of its fields. Each physical column is defined by a name and a data type, and optionally, a column constraint. You can use the column constraint to specify additional properties of the column, such as whether it is a unique key. ExampleThe following SQL shows how to declare physical columns of various types in a table named t1. 
For available column types, see Data Types.

    CREATE TABLE t1 (
      `id` BIGINT,
      `name` STRING,
      `age` INT,
      `salary` DECIMAL(10,2),
      `active` BOOLEAN,
      `created_at` TIMESTAMP_LTZ(3)
    );

#### Metadata columns

You can access the following table metadata as metadata columns in a table definition.

- headers
- leader-epoch
- offset
- partition
- raw-key
- raw-value
- timestamp
- timestamp-type
- topic

Use the METADATA keyword to declare a metadata column.

Metadata fields are readable or readable/writable. Read-only columns must be declared VIRTUAL to exclude them during INSERT INTO operations.

Metadata columns are not registered in Schema Registry.

Example: The following CREATE TABLE statement shows the syntax for exposing metadata fields.

    CREATE TABLE t (
      `user_id` BIGINT,
      `item_id` BIGINT,
      `behavior` STRING,
      `event_time` TIMESTAMP_LTZ(3) METADATA FROM 'timestamp',
      `partition` BIGINT METADATA VIRTUAL,
      `offset` BIGINT METADATA VIRTUAL
    );

##### Available metadata

###### headers

- Type: MAP NOT NULL
- Access: readable/writable

Headers of the Kafka record as a map of raw bytes.

###### leader-epoch

- Type: INT NULL
- Access: readable

Leader epoch of the Kafka record, if available.

###### offset

- Type: BIGINT NOT NULL
- Access: readable

Offset of the Kafka record in the partition.

###### partition

- Type: INT NOT NULL
- Access: readable

Partition ID of the Kafka record.

###### raw-key

- Type: BYTES NOT NULL
- Access: readable

The unique identifier or key of the Kafka record as raw bytes. The type may vary based on the serializer used, for example, STRING for StringSerializer.

###### raw-value

- Type: BYTES NOT NULL
- Access: readable

The actual message content or payload of the Kafka record as raw bytes. Contains the main data being transmitted. The type may vary based on the serializer used, for example, STRING for StringSerializer.

###### timestamp

- Type: TIMESTAMP_LTZ(3) NOT NULL
- Access: readable/writable

Timestamp of the Kafka record. With timestamp, you can pass event time end-to-end. Otherwise, the sink uses the ingestion time by default.
###### timestamp-type

- Type: STRING NOT NULL
- Access: readable

Timestamp type of the Kafka record. Valid values are:

- “NoTimestampType”
- “CreateTime” (also set when writing metadata)
- “LogAppendTime”

###### topic

- Type: STRING NOT NULL
- Access: readable

Topic name of the Kafka record.

#### Computed columns

Computed columns are virtual columns that are not stored in the table but are computed on the fly based on the values of other columns. These virtual columns are not registered in Schema Registry.

A computed column is defined by using an expression that references one or more physical or metadata columns in the table. The expression can use arithmetic operators, functions, and other SQL constructs to manipulate the values of the physical and metadata columns and compute the value of the computed column.

Example: The following CREATE TABLE statement shows the syntax for declaring a full_name computed column by concatenating a first_name column and a last_name column.

    CREATE TABLE t (
      `id` BIGINT,
      `first_name` STRING,
      `last_name` STRING,
      `full_name` AS CONCAT(first_name, ' ', last_name)
    );

#### Vector database columns

Confluent Cloud for Apache Flink supports read-only external tables to enable search with federated query execution on external vector databases, like MongoDB, Pinecone, and Elasticsearch.

Note: Vector Search is an Open Preview feature in Confluent Cloud. A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features. Confluent may discontinue providing preview releases of the Preview features at any time in Confluent’s sole discretion.

For more information, see Vector Search.

#### System columns

Confluent Cloud for Apache Flink introduces system columns for Flink tables.
System columns build on the metadata columns. System columns can only be read and are not part of the query-to-sink schema.

System columns aren’t selected in a SELECT * statement, and they’re not shown in DESCRIBE or SHOW CREATE TABLE statements. The result from the DESCRIBE EXTENDED statement does include system columns.

Both inferred and manual tables are provisioned with a set of default system columns.

##### $rowtime

Currently, $rowtime TIMESTAMP_LTZ(3) NOT NULL is provided as a system column.

You can use the $rowtime system column to get the timestamp from a Kafka record, because $rowtime is exactly the Kafka record timestamp. If you want to write out $rowtime, you must use the timestamp metadata key.

#### PRIMARY KEY constraint

A primary key constraint is a hint that Flink SQL can leverage for optimizations. It specifies that a column or a set of columns in a table or a view is unique and doesn’t contain nulls.

A primary key uniquely identifies a row in a table. No columns in a primary key can be nullable.

You can declare a primary key constraint together with a column definition (a column constraint) or as a single line (a table constraint). In both cases, it must be declared as a singleton. If you define more than one primary key constraint in the same statement, Flink SQL throws an exception.

The SQL standard specifies that a constraint can be ENFORCED or NOT ENFORCED, which controls whether the constraint checks are performed on the incoming/outgoing data. Flink SQL doesn’t own the data, so the only mode it supports is NOT ENFORCED. It’s your responsibility to ensure that the query enforces key integrity.

Flink SQL assumes correctness of the primary key by assuming that the column’s nullability is aligned with the columns in the primary key. Connectors must ensure that these are aligned.

The PRIMARY KEY constraint distributes the table implicitly by the key column.
A Kafka message key is defined either by an implicit DISTRIBUTED BY clause from a PRIMARY KEY constraint or an explicit DISTRIBUTED BY.

Note: In a CREATE TABLE statement, a primary key constraint alters the column’s nullability, which means that a column with a primary key constraint isn’t nullable.

Example: The following SQL statement creates a table named latest_page_per_ip with a primary key defined on ip. This statement creates a Kafka topic, a value-schema, and a key-schema. The value-schema contains the definitions for page_url and ts, while the key-schema contains the definition for ip.

    CREATE TABLE latest_page_per_ip (
      `ip` STRING,
      `page_url` STRING,
      `ts` TIMESTAMP_LTZ(3),
      PRIMARY KEY(`ip`) NOT ENFORCED
    );

#### DISTRIBUTED BY clause

The DISTRIBUTED BY clause buckets the created table by the specified columns.

Bucketing enables a file-like structure with a small, human-enumerable key space. It groups rows that have “infinite” key space, like user_id, usually by using a hash function, for example:

    bucket = hash(user_id) % number_of_buckets

Kafka partitions map 1:1 to SQL buckets. The n BUCKETS are used for the number of partitions when creating a topic. If n is not defined, the default is 6.

The number of buckets is fixed. A bucket is identifiable regardless of partition. Bucketing is good in long-term storage for reading across partitions based on a large key space, for example, user_id. Also, bucketing is good for short-term storage for load balancing.

Every mode comes with a default distribution, so DISTRIBUTED BY is required only by power users. In most cases, a simple CREATE TABLE t (schema); is sufficient.

- For upsert mode, the bucket key must be equal to the primary key.
- For append/retract mode, the bucket key can be a subset of the primary key.
- The bucket key can be undefined, which corresponds to a “connector defined” distribution: round robin for append, and hash-by-row for retract.
Custom distributions are possible, but currently only custom hash distributions are supported.

Example: The following SQL declares a table named t_dist that has one key column named k and 4 Kafka partitions.

    CREATE TABLE t_dist (k INT, s STRING) DISTRIBUTED BY (k) INTO 4 BUCKETS;

#### PARTITIONED BY clause

Deprecated: Use the DISTRIBUTED BY clause instead.

The PARTITIONED BY clause partitions the created table by the specified columns.

Use PARTITIONED BY to declare key columns in a table explicitly. A Kafka message key is defined either by an explicit PARTITIONED BY clause or an implicit PARTITIONED BY clause from a PRIMARY KEY constraint.

If compaction is enabled, the Kafka message key is overloaded with another semantic used for compaction, which influences constraints on the Kafka message key for partitioning.

Example: The following SQL declares a table named t that has one key column named partition_key of type INT.

    CREATE TABLE t (partition_key INT, example_value STRING) PARTITIONED BY (partition_key);

#### Watermark clause

The WATERMARK clause defines the event-time attributes of a table.

A watermark in Flink is used to track the progress of event time and provide a way to trigger time-based operations.

##### Default watermark strategy

Confluent Cloud for Apache Flink provides a default watermark strategy for all tables, whether created automatically from a Kafka topic or from a CREATE TABLE statement.

The default watermark strategy is applied on the $rowtime system column.

Watermarks are calculated per Kafka partition, and at least 250 events are required per partition.

If a delay of longer than 7 days can occur, choose a custom watermark strategy.

Because the concrete implementation is provided by Confluent, you see only WATERMARK FOR $rowtime AS SOURCE_WATERMARK() in the declaration.

##### Custom watermark strategies

You can replace the default strategy with a custom strategy at any time by using ALTER TABLE.
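As a minimal sketch of such a replacement, the following statement swaps the default strategy for a bounded out-of-orderness strategy with a 5-second delay. The table name `orders` is hypothetical, and the `MODIFY WATERMARK` form is assumed to match the ALTER TABLE syntax available in your environment, so check the ALTER TABLE reference before relying on it.

```sql
-- Hypothetical table `orders`; assumes ALTER TABLE ... MODIFY WATERMARK is supported.
-- Emits watermarks that trail the largest observed $rowtime by 5 seconds.
ALTER TABLE `orders`
  MODIFY WATERMARK FOR `$rowtime` AS `$rowtime` - INTERVAL '5' SECOND;
```

A larger interval tolerates more out-of-order records, at the cost of later results from time-based operations.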
##### Watermark strategy reference

    WATERMARK FOR rowtime_column_name AS watermark_strategy_expression

The rowtime_column_name defines an existing column that is marked as the event-time attribute of the table. The column must be of type TIMESTAMP(3), and it must be a top-level column in the schema.

The watermark_strategy_expression defines the watermark generation strategy. It allows arbitrary non-query expressions, including computed columns, to calculate the watermark. The expression return type must be TIMESTAMP(3), which represents the timestamp since the Unix Epoch.

The returned watermark is emitted only if it’s non-null and its value is larger than the previously emitted local watermark, to respect the contract of ascending watermarks.

The watermark generation expression is evaluated by Flink SQL for every record. The framework emits the largest generated watermark periodically.

No new watermark is emitted if any of the following conditions apply.

- The current watermark is null.
- The current watermark is identical to the previous watermark.
- The value of the returned watermark is smaller than the value of the last emitted watermark.

When you use event-time semantics, your tables must contain an event-time attribute and watermarking strategy.

Flink SQL provides these watermark strategies.

- Strictly ascending timestamps: Emit a watermark of the maximum observed timestamp so far. Rows that have a timestamp larger than the max timestamp are not late.
  `WATERMARK FOR rowtime_column AS rowtime_column`
- Ascending timestamps: Emit a watermark of the maximum observed timestamp so far, minus 1. Rows that have a timestamp larger than or equal to the max timestamp are not late.
  `WATERMARK FOR rowtime_column AS rowtime_column - INTERVAL '0.001' SECOND`
- Bounded out-of-orderness timestamps: Emit watermarks which are the maximum observed timestamp minus the specified delay.
  `WATERMARK FOR rowtime_column AS rowtime_column - INTERVAL 'string' timeUnit`

The following example shows a “5-second delayed” watermark strategy.

    WATERMARK FOR rowtime_column AS rowtime_column - INTERVAL '5' SECOND

Example: The following CREATE TABLE statement defines an orders table that has a rowtime column named order_time and a watermark strategy with a 5-second delay.

    CREATE TABLE orders (
      `user` BIGINT,
      `product` STRING,
      `order_time` TIMESTAMP(3),
      WATERMARK FOR `order_time` AS `order_time` - INTERVAL '5' SECOND
    );

##### Progressive idleness detection

When a source does not receive any elements for a timeout period, which is specified by the sql.tables.scan.idle-timeout property, the source is marked as temporarily idle. This enables each downstream task to advance its watermark without the need to wait for watermarks from this source while it’s idle.

By default, Confluent Cloud for Apache Flink has progressive idleness detection that starts with an idle-timeout of 15 seconds, and increases to a maximum of 5 minutes over time.

You can disable idleness detection by setting the sql.tables.scan.idle-timeout property to 0, or you can set a fixed idleness timeout with your desired value.

When idleness detection is disabled, a single idle partition on any of the sources causes the watermarks to stop advancing. In turn, this causes operations that rely on watermarks to stop producing results. On the other hand, with idleness detection enabled, with either progressive idleness or a fixed value, the watermark advances unless all partitions of all sources are idle.

For more information, see the video, How to Set Idle Timeouts.

#### CREATE TABLE AS SELECT (CTAS)

Tables can also be created and populated by the results of a query in one create-table-as-select (CTAS) statement. CTAS is the simplest and fastest way to create and insert data into a table with a single command.

The CTAS statement consists of two parts:

- The SELECT part can be any SELECT query supported by Flink SQL.
- The CREATE part takes the resulting schema from the SELECT part and creates the target table.

The following two code examples are equivalent.

    -- Equivalent to the following CREATE TABLE and INSERT INTO statements.
    CREATE TABLE my_ctas_table
    AS SELECT id, name, age FROM source_table WHERE mod(id, 10) = 0;

    -- These two statements are equivalent to the preceding CREATE TABLE AS statement.
    CREATE TABLE my_ctas_table (
      id BIGINT,
      name STRING,
      age INT
    );

    INSERT INTO my_ctas_table
    SELECT id, name, age FROM source_table WHERE mod(id, 10) = 0;

Similar to CREATE TABLE, CTAS requires all options of the target table to be specified in the WITH clause. The syntax is CREATE TABLE t WITH (…) AS SELECT …, for example:

    CREATE TABLE t WITH ('scan.startup.mode' = 'latest-offset') AS SELECT * FROM b;

##### Specifying explicit columns

The CREATE part enables you to specify explicit columns. The resulting table schema contains the columns defined in the CREATE part first, followed by the columns from the SELECT part. Columns named in both parts retain the same column position as defined in the SELECT part.

You can also override the data type of SELECT columns if you specify it in the CREATE part.

    CREATE TABLE my_ctas_table (
      `desc` STRING,
      quantity DOUBLE,
      cost AS price * quantity,
      WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
    ) AS SELECT id, price, quantity, order_time FROM source_table;

##### Primary keys and distribution strategies

The CREATE part enables you to specify primary keys and distribution strategies. Primary keys work only on NOT NULL columns. Currently, primary keys only allow you to define columns from the SELECT part, which may be NOT NULL.

The following two code examples are equivalent.

    -- Equivalent to the following CREATE TABLE and INSERT INTO statements.
    CREATE TABLE my_ctas_table (
      PRIMARY KEY (id) NOT ENFORCED
    ) DISTRIBUTED BY HASH(id) INTO 4 BUCKETS
    AS SELECT id, name FROM source_table;

    -- These two statements are equivalent to the preceding CREATE TABLE AS statement.
    CREATE TABLE my_ctas_table (
      id BIGINT NOT NULL PRIMARY KEY NOT ENFORCED,
      name STRING
    ) DISTRIBUTED BY HASH(id) INTO 4 BUCKETS;

    INSERT INTO my_ctas_table SELECT id, name FROM source_table;

#### LIKE

The CREATE TABLE LIKE clause enables creating a new table with the same schema as an existing table. It is a combination of SQL features and can be used to extend or exclude certain parts of the original table. The clause must be defined at the top-level of a CREATE statement and applies to multiple parts of the table definition.

Use the LIKE options to control the merging logic of table features. You can control the merging behavior of:

- CONSTRAINTS - Constraints such as primary and unique keys.
- GENERATED - Computed columns.
- METADATA - Metadata columns.
- OPTIONS - Table options.
- PARTITIONS - Partition options.
- WATERMARKS - Watermark strategies.

with three different merging strategies:

- INCLUDING - Includes the feature of the source table and fails on duplicate entries, for example, if an option with the same key exists in both tables.
- EXCLUDING - Does not include the given feature of the source table.
- OVERWRITING - Includes the feature of the source table, overwrites duplicate entries of the source table with properties of the new table. For example, if an option with the same key exists in both tables, the option from the current statement is used.

Additionally, you can use the INCLUDING/EXCLUDING ALL option to specify what should be the strategy if no specific strategy is defined. For example, if you use EXCLUDING ALL INCLUDING WATERMARKS, only the watermarks are included from the source table.

If you provide no LIKE options, INCLUDING ALL OVERWRITING OPTIONS is used as a default.
##### Example

The following CREATE TABLE statement defines a table named t that has five physical columns, one computed column, and three metadata columns.

    CREATE TABLE t (
      `user_id` BIGINT,
      `item_id` BIGINT,
      `price` DOUBLE,
      `behavior` STRING,
      `created_at` TIMESTAMP(3),
      `price_with_tax` AS `price` * 1.19,
      `event_time` TIMESTAMP_LTZ(3) METADATA FROM 'timestamp',
      `partition` BIGINT METADATA VIRTUAL,
      `offset` BIGINT METADATA VIRTUAL
    );

You can run the following CREATE TABLE LIKE statement to define table t_derived, which contains the physical and computed columns of t, drops the metadata and default watermark strategy, and applies a custom watermark strategy on created_at.

    CREATE TABLE t_derived (
      WATERMARK FOR `created_at` AS `created_at` - INTERVAL '5' SECOND
    )
    LIKE t (
      EXCLUDING WATERMARKS
      EXCLUDING METADATA
    );

#### WITH options

Table properties used to create a table source or sink.

Both the key and value of the expression key1=val1 are string literals.

You can change an existing table’s property values by using the ALTER TABLE Statement in Confluent Cloud for Apache Flink.

You can set the following properties when you create a table.

- changelog.mode
- error-handling.log.target
- error-handling.mode
- kafka.cleanup-policy
- kafka.max-message-size
- kafka.retention.size
- kafka.retention.time
- key.fields-prefix
- key.format
- key.format.schema-context
- scan.bounded.mode
- scan.bounded.timestamp-millis
- scan.startup.mode
- value.fields-include
- value.format
- value.format.schema-context

##### changelog.mode

Set the changelog mode of the connector. For more information on changelog modes, see dynamic tables.
`'changelog.mode' = [append | upsert | retract]`

These are the changelog modes for an inferred table:

- append (if uncompacted and not a Debezium envelope)
- upsert (if compacted)
- retract (if a Debezium envelope is detected and uncompacted)

These are the changelog modes for a manually created table:

- append
- retract
- upsert

###### Primary key interaction

With a primary key declared, the changelog modes have these properties:

- append means that every row can be treated as an independent fact.
- retract means that the combination of +U and -U is related and must be partitioned together.
- upsert means that all rows with the same primary key are related and must be partitioned together.

To build indices, primary keys must be partitioned together.

| Encoding of changes | Default Partitioning without PK | Default Partitioning with PK | Custom Partitioning without PK | Custom Partitioning with PK |
|---|---|---|---|---|
| Each value is an insertion (+I). | round robin | hash by PK | hash by specified column(s) | hash by subset of PK |
| A special op header represents the change (+I, -U, +U, -D). The header is omitted for insertions. Append queries encoding is the same for all modes. | hash by entire value | hash by PK | hash by specified column(s) | hash by subset of PK |
| If value is null, it represents a deletion (-D). Other values are +U and the engine will normalize the changelog internally. | unsupported, PK is mandatory | hash by PK | unsupported, PK is mandatory | unsupported |

###### Change type header

Changes for an updating table have the change type encoded in the Kafka record as a special op header that represents the change (+I, -U, +U, -D). The value of the op header, if present, represents the kind of change that a row can describe in a changelog:

- 0: represents INSERT (+I), an insertion operation.
- 1: represents UPDATE_BEFORE (-U), an update operation with the previous content of the updated row.
- 2: represents UPDATE_AFTER (+U), an update operation with new content for the updated row.
- 3: represents DELETE (-D), a deletion operation.

The default is 0.
For more information, see Changelog entries.

##### error-handling.log.target

- Type: string
- Default: error_log

`'error-handling.log.target' = '<dlq_table_name>'`

Specify the destination Dead Letter Queue (DLQ) table for error logs when error-handling.mode is set to log. If error-handling.log.target isn’t set, the default is error_log.

If the DLQ table doesn’t exist and can’t be created, the job fails.

The principal running the CREATE TABLE or ALTER TABLE statement must have permissions to create the DLQ topic and schema. If permissions are missing, the statement fails.

If a principal runs a SELECT or any other query, it needs permissions to write into the defined DLQ table. If permissions are missing, the statement fails. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink.

##### error-handling.mode

- Type: enum
- Default: fail

`'error-handling.mode' = [fail | ignore | log]`

Control how Flink handles deserialization errors for a table. The following values are supported.

- fail: The statement fails on error (default).
- ignore: The error is skipped and processing continues.
- log: The error is logged to a Dead Letter Queue (DLQ) table and processing continues.

When a statement reads from the table, for example, SELECT * FROM my_table, and a deserialization error occurs, as with a poison pill, Flink handles the error based on the error-handling.mode setting.

- fail: Flink fails the statement.
- ignore: Flink ignores the error and continues processing with the next row.
- log: Flink sends the poison pill to the DLQ table and continues processing with the next row.

All Flink tables receive the error-handling.mode setting. If you don’t specify a value, the default is fail.

You can override the setting for an existing table by using the ALTER TABLE statement. Only table-level overrides are supported. Per-statement overrides are not supported.

The following limitations apply:

- Only deserialization errors at the source are supported.
- Errors outside the source, for example, in windowed aggregations, are not handled.

##### kafka.cleanup-policy

- Type: enum
- Default: delete

`'kafka.cleanup-policy' = [delete | compact | delete-compact]`

Set the default cleanup policy for Kafka topic log segments beyond the retention window. Translates to the Kafka log.cleanup.policy property. For more information, see Log Compaction.

- compact: topic log is compacted periodically in the background by the log cleaner.
- delete: old log segments are discarded when their retention time or size limit is reached.
- delete-compact: compact the log and follow the retention time or size limit settings.

##### kafka.consumer.isolation-level

- Type: enum
- Default: read-committed

`'kafka.consumer.isolation-level' = [read-committed | read-uncommitted]`

Controls which transactional messages to read:

- read-committed: Only return messages from committed transactions. Any transactional messages from aborted or in-progress transactions are filtered out.
- read-uncommitted: Return all messages, including those from transactional messages that were aborted or are still in progress.

For more information, see delivery guarantees and latency.

##### kafka.max-message-size

`'kafka.max-message-size' = MemorySize`

Translates to the Kafka max.message.bytes property.

The default is 2097164 bytes.

##### kafka.producer.compression.type

- Type: enum
- Default: none

`'kafka.producer.compression.type' = [none | gzip | snappy | lz4 | zstd]`

Translates to the Kafka compression.type property.

##### kafka.retention.size

- Type: Integer
- Default: 0

`'kafka.retention.size' = MemorySize`

Translates to the Kafka log.retention.bytes property.

##### kafka.retention.time

- Type: Duration
- Default: 7 days

`'kafka.retention.time' = '<duration>'`

Translates to the Kafka log.retention.ms property.

##### key.fields-prefix

- Type: String
- Default: “”

Specify a custom prefix for all fields of the key format.
`'key.fields-prefix' = '<prefix-string>'`

The key.fields-prefix property defines a custom prefix for all fields of the key format, which avoids name clashes with fields of the value format.

By default, the prefix is empty. If a custom prefix is defined, the table schema property works with prefixed names.

When constructing the data type of the key format, the prefix is removed, and the non-prefixed names are used within the key format.

This option requires that the value.fields-include property is set to EXCEPT_KEY.

The prefix for an inferred table is key_, for non-atomic Schema Registry types and fields that have a name.

##### key.format

- Type: String
- Default: “avro-registry”

Specify the serialization format of the table’s key fields.

`'key.format' = '<key-format>'`

These are the key formats for an inferred table:

- raw (if no Schema Registry entry)
- avro-registry (for Avro Schema Registry entry)
- json-registry (for JSON Schema Registry entry)
- proto-registry (for Protobuf Schema Registry entry)

These are the key formats for a manually created table:

- avro-registry (for Avro Schema Registry entry)
- json-registry (for JSON Schema Registry entry)
- proto-registry (for Protobuf Schema Registry entry)

If no format is specified, Avro Schema Registry is used by default. This applies only if a primary or distribution key is defined.

The Schema Registry subject compatibility mode must be FULL or FULL_TRANSITIVE. For more information, see Schema Evolution and Compatibility for Schema Registry on Confluent Cloud.

##### key.format.schema-context

- Type: String
- Default: (none)

Specify the Confluent Schema Registry Schema Context for the key format.

`'key.<format>.schema-context' = '<schema-context>'`

Similar to value.format.schema-context, this option enables you to specify a schema context for the key format. It provides an independent scope in Schema Registry for key schemas.

##### scan.bounded.mode

- Type: Enum
- Default: unbounded

Specify the bounded mode for the Kafka consumer.
`'scan.bounded.mode' = [latest-offset | timestamp | unbounded]`

The following list shows the valid bounded mode values.

- latest-offset: bounded by latest offsets. This is evaluated at the start of consumption from a given partition.
- timestamp: bounded by a user-supplied timestamp.
- unbounded: table is unbounded.

If scan.bounded.mode isn’t set, the default is an unbounded table. For more information, see Bounded and unbounded tables.

If timestamp is specified, the scan.bounded.timestamp-millis config option is required to specify a specific bounded timestamp in milliseconds since the Unix epoch, January 1, 1970 00:00:00.000 GMT.

##### scan.bounded.timestamp-millis

- Type: Long
- Default: (none)

End at the specified epoch timestamp (milliseconds) when the timestamp bounded mode is set in the scan.bounded.mode property.

    'scan.bounded.mode' = 'timestamp',
    'scan.bounded.timestamp-millis' = '<long-value>'

##### scan.startup.mode

- Type: Enum
- Default: earliest-offset

The startup mode for Kafka consumers.

`'scan.startup.mode' = '<startup-mode>'`

The following list shows the valid startup mode values.

- earliest-offset: start from the earliest offset possible.
- latest-offset: start from the latest offset.
- timestamp: start from the user-supplied timestamp for each partition.

The default is earliest-offset. This differs from the default in Apache Flink, which is group-offsets.

If timestamp is specified, the scan.startup.timestamp-millis config option is required, to define a specific startup timestamp in milliseconds since the Unix epoch, January 1, 1970 00:00:00.000 GMT.

##### scan.startup.timestamp-millis

- Type: Long
- Default: (none)

Start from the specified Unix epoch timestamp (milliseconds) when the timestamp mode is set in the scan.startup.mode property.

    'scan.startup.mode' = 'timestamp',
    'scan.startup.timestamp-millis' = '<long-value>'

##### value.fields-include

- Type: Enum
- Default: except-key

Specify a strategy for handling key columns in the data type of the value format.
`'value.fields-include' = [all | except-key]`

If all is specified, all physical columns of the table schema are included in the value format, which means that key columns appear in the data type for both the key and value format.

##### value.format

- Type: String
- Default: “avro-registry”

Specify the format for serializing and deserializing the value part of Kafka messages.

`'value.format' = '<format>'`

These are the value formats for an inferred table:

- raw (if no Schema Registry entry)
- avro-registry (for Avro Schema Registry entry)
- json-registry (for JSON Schema Registry entry)
- proto-registry (for Protobuf Schema Registry entry)
- avro-debezium-registry (for Avro Debezium Schema Registry entry)
- json-debezium-registry (for JSON Debezium Schema Registry entry)
- proto-debezium-registry (for Protobuf Debezium Schema Registry entry)

These are the value formats for a manually created table:

- avro-registry (for Avro Schema Registry entry)
- json-registry (for JSON Schema Registry entry)
- proto-registry (for Protobuf Schema Registry entry)

If no format is specified, Avro Schema Registry is used by default.

##### value.format.schema-context

- Type: String
- Default: (none)

Specify the Confluent Schema Registry Schema Context for the value format.

`'value.<format>.schema-context' = '<schema-context>'`

A schema context represents an independent scope in Schema Registry and can be used to create separate “sub-registries” within one Schema Registry. Each schema context is an independent grouping of schema IDs and subject names, enabling the same schema ID in different contexts to represent completely different schemas.

#### Inferred tables

Inferred tables are tables that have not been created by using a CREATE TABLE statement, but instead are automatically detected from information about existing Kafka topics and Schema Registry entries.

You can use the ALTER TABLE statement to evolve schemas for inferred tables.
The following examples show output from the SHOW CREATE TABLE statement called on the resulting table.

##### No key or value in Schema Registry

For an inferred table with no registered key or value schemas, SHOW CREATE TABLE returns the following output:

    CREATE TABLE `t_raw` (
      `key` VARBINARY(2147483647),
      `val` VARBINARY(2147483647)
    ) DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS
    WITH (
      'changelog.mode' = 'append',
      'connector' = 'confluent',
      'key.format' = 'raw',
      'value.format' = 'raw'
      ...
    )

Properties:

- Key and value formats are raw (binary format) with BYTES.
- Following Kafka message semantics, both key and value support NULL as well, so the following code is valid: `INSERT INTO t_raw (key, val) SELECT CAST(NULL AS BYTES), CAST(NULL AS BYTES);`

##### No key but record value in Schema Registry

For the following value schema in Schema Registry:

    {
      "type": "record",
      "name": "TestRecord",
      "fields": [
        { "name": "i", "type": "int" },
        { "name": "s", "type": "string" }
      ]
    }

SHOW CREATE TABLE returns the following output:

    CREATE TABLE `t_raw_key` (
      `key` VARBINARY(2147483647),
      `i` INT NOT NULL,
      `s` VARCHAR(2147483647) NOT NULL
    ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
    WITH (
      'changelog.mode' = 'append',
      'connector' = 'confluent',
      'key.format' = 'raw',
      'value.format' = 'avro-registry'
      ...
    )

Properties:

- The key format is raw (binary format) with BYTES.
- Following Kafka message semantics, the key supports NULL as well, so the following code is valid: `INSERT INTO t_raw_key SELECT CAST(NULL AS BYTES), 12, 'Bob';`

##### Atomic key and record value in Schema Registry

For the following key schema in Schema Registry:

    "int"

And for the following value schema in Schema Registry:

    {
      "type": "record",
      "name": "TestRecord",
      "fields": [
        { "name": "i", "type": "int" },
        { "name": "s", "type": "string" }
      ]
    }

SHOW CREATE TABLE returns the following output:

    CREATE TABLE `t_atomic_key` (
      `key` INT NOT NULL,
      `i` INT NOT NULL,
      `s` VARCHAR(2147483647) NOT NULL
    ) DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS
    WITH (
      'changelog.mode' = 'append',
      'connector' = 'confluent',
      'key.format' = 'avro-registry',
      'value.format' = 'avro-registry'
      ...
    )

Properties:

- Schema Registry defines the column data type as INT NOT NULL.
- The column name, key, is used as the default, because Schema Registry doesn’t provide a column name.

##### Overlapping names in key/value, no key in Schema Registry

For the following value schema in Schema Registry:

    {
      "type": "record",
      "name": "TestRecord",
      "fields": [
        { "name": "i", "type": "int" },
        { "name": "key", "type": "string" }
      ]
    }

SHOW CREATE TABLE returns the following output:

    CREATE TABLE `t_raw_disjoint` (
      `key_key` VARBINARY(2147483647),
      `i` INT NOT NULL,
      `key` VARCHAR(2147483647) NOT NULL
    ) DISTRIBUTED BY HASH(`key_key`) INTO 1 BUCKETS
    WITH (
      'changelog.mode' = 'append',
      'connector' = 'confluent',
      'key.fields-prefix' = 'key_',
      'key.format' = 'raw',
      'value.format' = 'avro-registry'
      ...
    )

Properties:

- The Schema Registry value schema defines columns i INT NOT NULL and key STRING.
- The column name key BYTES is used as the default if no key is in Schema Registry.
- Because key would collide with value schema column, the key_ prefix is added.
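To illustrate how the prefixed key column and the value field of the same name coexist, the following sketch reads both from the `t_raw_disjoint` table shown above. Casting the raw key to STRING assumes that the key bytes hold UTF-8 text, which nothing in the inferred schema guarantees. The alias `key_as_text` is made up for this example.

```sql
-- Sketch only: assumes the raw message key contains UTF-8 encoded text.
SELECT
  CAST(`key_key` AS STRING) AS key_as_text,  -- Kafka message key, exposed with the key_ prefix
  `i`,                                       -- value field
  `key`                                      -- value field that happens to be named "key"
FROM `t_raw_disjoint`;
```

The backticks around `key` keep the column name from being read as a SQL keyword.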
Record key and record value in Schema Registry¶ For the following key schema in Schema Registry: { "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" } ] } And for the following value schema in Schema Registry: { "type": "record", "name": "TestRecord", "fields": [ { "name": "name", "type": "string" }, { "name": "zip_code", "type": "string" } ] } SHOW CREATE TABLE returns the following output: CREATE TABLE `t_sr_disjoint` ( `uid` INT NOT NULL, `name` VARCHAR(2147483647) NOT NULL, `zip_code` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`uid`) INTO 1 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'value.format' = 'avro-registry' ... ) Properties Schema Registry defines columns for both key and value. The column names of key and value are disjoint sets and don’t overlap. Record key and record value with overlap in Schema Registry¶ For the following key schema in Schema Registry: { "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" } ] } And for the following value schema in Schema Registry: { "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" },{ "name": "name", "type": "string" }, { "name": "zip_code", "type": "string" } ] } SHOW CREATE TABLE returns the following output: CREATE TABLE `t_sr_joint` ( `uid` INT NOT NULL, `name` VARCHAR(2147483647) NOT NULL, `zip_code` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`uid`) INTO 1 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'value.fields-include' = 'all', 'value.format' = 'avro-registry' ... ) Properties Schema Registry defines columns for both key and value. The column names of key and value overlap on uid. 'value.fields-include' = 'all' is set to exclude the key, because it is fully contained in the value. 
Detecting that the key is fully contained in the value requires that both field name and data type match completely, including nullability, and that all fields of the key are included in the value. Union types in Schema Registry¶ For the following value schema in Schema Registry: ["int", "string"] SHOW CREATE TABLE returns the following output: CREATE TABLE `t_union` ( `key` VARBINARY(2147483647), `int` INT, `string` VARCHAR(2147483647) ) ... For the following value schema in Schema Registry: [ "string", { "type": "record", "name": "User", "fields": [ { "name": "uid", "type": "int" },{ "name": "name", "type": "string" } ] }, { "type": "record", "name": "Address", "fields": [ { "name": "zip_code", "type": "string" } ] } ] SHOW CREATE TABLE returns the following output: CREATE TABLE `t_union` ( `key` VARBINARY(2147483647), `string` VARCHAR(2147483647), `User` ROW<`uid` INT NOT NULL, `name` VARCHAR(2147483647) NOT NULL>, `Address` ROW<`zip_code` VARCHAR(2147483647) NOT NULL> ) ... Properties NULL and NOT NULL are inferred depending on whether a union contains NULL. Elements of a union are always nullable, because they need to be set to NULL when a different element is set. If a record defines a namespace, the field is prefixed with it, for example, org.myorg.avro.User. Multi-message protobuf schema in Schema Registry¶ For the following value schema in Schema Registry: syntax = "proto3"; message Purchase { string item = 1; double amount = 2; string customer_id = 3; } message Pageview { string url = 1; bool is_special = 2; string customer_id = 3; } SHOW CREATE TABLE returns the following output: CREATE TABLE `t` ( `key` VARBINARY(2147483647), `Purchase` ROW< `item` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL >, `Pageview` ROW< `url` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL > ) ...
For the following value schema in Schema Registry: syntax = "proto3"; message Purchase { string item = 1; double amount = 2; string customer_id = 3; Pageview pageview = 4; } message Pageview { string url = 1; bool is_special = 2; string customer_id = 3; } SHOW CREATE TABLE returns the following output: CREATE TABLE `t` ( `key` VARBINARY(2147483647), `Purchase` ROW< `item` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL, `pageview` ROW< `url` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL > >, `Pageview` ROW< `url` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL > ) ... For the following value schema in Schema Registry: syntax = "proto3"; message Purchase { string item = 1; double amount = 2; string customer_id = 3; Pageview pageview = 4; message Pageview { string url = 1; bool is_special = 2; string customer_id = 3; } } SHOW CREATE TABLE returns the following output: CREATE TABLE `t` ( `key` VARBINARY(2147483647), `item` VARCHAR(2147483647) NOT NULL, `amount` DOUBLE NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL, `pageview` ROW< `url` VARCHAR(2147483647) NOT NULL, `is_special` BOOLEAN NOT NULL, `customer_id` VARCHAR(2147483647) NOT NULL > ) ... 
Debezium CDC format in Schema Registry¶ For a Debezium CDC format with the following value schema in Schema Registry: { "type": "record", "name": "Customer", "namespace": "io.debezium.data", "fields": [ { "name": "before", "type": ["null", { "type": "record", "name": "Value", "fields": [ {"name": "id", "type": "int"}, {"name": "name", "type": "string"}, {"name": "email", "type": "string"} ] }], "default": null }, { "name": "after", "type": ["null", "Value"], "default": null }, { "name": "source", "type": { "type": "record", "name": "Source", "fields": [ {"name": "version", "type": "string"}, {"name": "connector", "type": "string"}, {"name": "name", "type": "string"}, {"name": "ts_ms", "type": "long"}, {"name": "db", "type": "string"}, {"name": "schema", "type": "string"}, {"name": "table", "type": "string"} ] } }, {"name": "op", "type": "string"}, {"name": "ts_ms", "type": ["null", "long"], "default": null}, {"name": "transaction", "type": ["null", { "type": "record", "name": "Transaction", "fields": [ {"name": "id", "type": "string"}, {"name": "total_order", "type": "long"}, {"name": "data_collection_order", "type": "long"} ] }], "default": null} ] } SHOW CREATE TABLE returns the following output: CREATE TABLE `customer_changes` ( `key` VARBINARY(2147483647), `id` INT NOT NULL, `name` VARCHAR(2147483647) NOT NULL, `email` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( 'changelog.mode' = 'retract', 'connector' = 'confluent', 'key.format' = 'raw', 'value.format' = 'avro-debezium-registry' ... ) Properties Flink detects the Debezium format automatically, based on the schema structure with after, before, and op fields. The table schema is inferred from the after schema, exposing only the actual data fields. 
Automatic Debezium Envelope Detection: For schemas created after May 19, 2025 at 09:00 UTC, Flink automatically detects Debezium envelopes and sets appropriate defaults: value.format defaults to *-debezium-registry (instead of *-registry); changelog.mode defaults to retract (instead of append). Exception: If the Kafka cleanup.policy is compact, changelog.mode is set to upsert. The default changelog.mode is retract, which properly handles all CDC operations, including inserts, updates, and deletes. You can manually override the changelog mode if necessary: -- Change to upsert mode for primary key-based operations ALTER TABLE customer_changes SET ('changelog.mode' = 'upsert'); -- Change to append mode (processes only inserts and updates) ALTER TABLE customer_changes SET ('changelog.mode' = 'append'); Examples¶ The following examples show how to create Flink tables for frequently encountered scenarios. Minimal table¶ CREATE TABLE t_minimal (s STRING); Properties Append changelog mode. No Schema Registry key. Round robin distribution. 6 Kafka partitions. The $rowtime column and system watermark are added implicitly. Table with a primary key¶ Syntax CREATE TABLE t_pk (k INT PRIMARY KEY NOT ENFORCED, s STRING); Properties Upsert changelog mode. The primary key defines an implicit DISTRIBUTED BY(k). k is the Schema Registry key. Hash distribution on k. The table has 6 Kafka partitions. k is declared as being unique, meaning no duplicate rows. k must not contain NULLs, so an implicit NOT NULL is added. The $rowtime column and system watermark are added implicitly. Table with a primary key in append mode¶ Syntax CREATE TABLE t_pk_append (k INT PRIMARY KEY NOT ENFORCED, s STRING) DISTRIBUTED INTO 4 BUCKETS WITH ('changelog.mode' = 'append'); Properties Append changelog mode. k is the Schema Registry key. Hash distribution on k. The table has 4 Kafka partitions. k is declared as being unique, meaning no duplicate rows. k must not contain NULLs, meaning implicit NOT NULL.
The $rowtime column and system watermark are added implicitly. Table with hash distribution¶ Syntax CREATE TABLE t_dist (k INT, s STRING) DISTRIBUTED BY (k) INTO 4 BUCKETS; Properties Append changelog mode. k is the Schema Registry key. Hash distribution on k. The table has 4 Kafka partitions. The $rowtime column and system watermark are added implicitly. Complex table with all concepts combined¶ Syntax CREATE TABLE t_complex (k1 INT, k2 INT, PRIMARY KEY (k1, k2) NOT ENFORCED, s STRING) COMMENT 'My complex table' DISTRIBUTED BY HASH(k1) INTO 4 BUCKETS WITH ('changelog.mode' = 'append'); Properties Append changelog mode. k1 is the Schema Registry key. Hash distribution on k1. k2 is treated as a value column and is stored in the value part of Schema Registry. The table has 4 Kafka partitions. k1 and k2 are declared as being unique, meaning no duplicates. k1 and k2 must not contain NULLs, meaning implicit NOT NULL. The $rowtime column and system watermark are added implicitly. An additional comment is added. Table with overlapping names in key/value of Schema Registry but disjoint data¶ Syntax CREATE TABLE t_disjoint (from_key_k INT, k STRING) DISTRIBUTED BY (from_key_k) WITH ('key.fields-prefix' = 'from_key_'); Properties Append changelog mode. Hash distribution on from_key_k. The key prefix from_key_ is defined and is stripped before storing the schema in Schema Registry. Therefore, k is the Schema Registry key of type INT. Also, k is the Schema Registry value of type STRING. Both key and value store disjoint data, so they can have different data types. Create with overlapping names in key/value of Schema Registry but joint data¶ Syntax CREATE TABLE t_joint (k INT, v STRING) DISTRIBUTED BY (k) WITH ('value.fields-include' = 'all'); Properties Append changelog mode. Hash distribution on k. By default, the key is never included in the value in Schema Registry.
By setting 'value.fields-include' = 'all', the value contains the full table schema. Therefore, k is the Schema Registry key. Also, k, v is the Schema Registry value. The payload of k is stored twice in the Kafka message, because key and value store joint data and they have the same data type for k. Table with metadata columns for writing a Kafka message timestamp¶ Syntax CREATE TABLE t_metadata_write (name STRING, ts TIMESTAMP_LTZ(3) NOT NULL METADATA FROM 'timestamp') DISTRIBUTED INTO 1 BUCKETS; Properties Adds the ts metadata column, which isn’t part of Schema Registry but instead is a pure Flink concept. In contrast with $rowtime, which is declared as a METADATA VIRTUAL column, ts is selected in a SELECT * statement and is writable. The following examples show how to fill Kafka messages with an instant. INSERT INTO t_metadata_write (ts, name) SELECT NOW(), 'Alice'; INSERT INTO t_metadata_write (ts, name) SELECT TO_TIMESTAMP_LTZ(0, 3), 'Bob'; SELECT $rowtime, * FROM t_metadata_write; The Schema Registry subject compatibility mode must be FULL or FULL_TRANSITIVE. For more information, see Schema Evolution and Compatibility for Schema Registry on Confluent Cloud. Table with string key and value in Schema Registry¶ Syntax CREATE TABLE t_raw_string_key (key STRING, i INT) DISTRIBUTED BY (key) WITH ('key.format' = 'raw'); Properties Schema Registry is filled with a value subject containing i. The key columns are determined by the DISTRIBUTED BY clause. By default, Avro in Schema Registry would be used for the key, but the WITH clause overrides this to the raw format. Tables with cross-region schema sharing¶ Create two Kafka clusters in different regions, for example, eu-west-1 and us-west-2. Create two Flink compute pools in different regions, for example, eu-west-1 and us-west-2. In the first region, run the following statement. CREATE TABLE t_shared_schema (key STRING, s STRING) DISTRIBUTED BY (key); In the second region, run the same statement.
CREATE TABLE t_shared_schema (key STRING, s STRING) DISTRIBUTED BY (key); Properties Schema Registry is shared across regions. The SQL metastore, Flink compute pools, and Kafka clusters are regional. Both tables in either region share the Schema Registry subjects t_shared_schema-key and t_shared_schema-value. Create with different changelog modes¶ There are three ways of storing events in a table’s log, that is, in the underlying Kafka topic. append: Every insertion event is an immutable fact. Every event is insert-only. Events can be distributed in a round-robin fashion across workers/shards because they are unrelated. upsert: Events are related using a primary key. Every event is either an upsert or delete event for a primary key. Events for the same primary key should land at the same worker/shard. retract: Every upsert event is a fact that can be “undone”. This means that every event is either an insertion or its retraction. So, two events are related by all columns. In other words, the entire row is the key. For example, +I['Bob', 42] is related to -D['Bob', 42] and +U['Alice', 13] is related to -U['Alice', 13]. The retract mode is intermediate between the append and upsert modes. The append and upsert modes are natural to existing Kafka consumers and producers. Kafka compaction is a kind of upsert. Start with a table created by the following statement. CREATE TABLE t_changelog_modes (i BIGINT); Properties Confluent Cloud for Apache Flink always derives an appropriate changelog mode for the preceding declaration. If there is no primary key, append is the safest option, because it prevents users from pushing updates into a topic accidentally, and it has the best support of downstream consumers.
-- works because the query is non-updating INSERT INTO t_changelog_modes SELECT 1; -- does not work because the query is updating, causing an error INSERT INTO t_changelog_modes SELECT COUNT(*) FROM (VALUES (1), (2), (3)); If you need updates, and if downstream consumers support it, for example, when the consumer is another Flink job, you can set the changelog mode to retract. ALTER TABLE t_changelog_modes SET ('changelog.mode' = 'retract'); Properties The table starts accepting retractions during INSERT INTO. Already existing records in the Kafka topic are treated as insertions. Newly added records receive a changeflag (+I, +U, -U, -D) in the Kafka message header. Going back to append mode is possible, but retractions (-U, -D) appear as insertions, and the Kafka header metadata column reveals the changeflag. ALTER TABLE t_changelog_modes SET ('changelog.mode' = 'append'); ALTER TABLE t_changelog_modes ADD headers MAP<BYTES, BYTES> METADATA VIRTUAL; -- Shows what is serialized internally SELECT i, headers FROM t_changelog_modes; Table with infinite retention time¶ CREATE TABLE t_infinite_retention (i INT) WITH ('kafka.retention.time' = '0'); Properties By default, the retention time is 7 days, as in all other APIs. Flink doesn’t support -1 for durations, so 0 means infinite retention time. Durations in Flink support 2 day or 2 d syntax, so the value doesn’t need to be in milliseconds. If no unit is specified, the unit is milliseconds. The following units are supported: "d", "day", "h", "hour", "m", "min", "minute", "ms", "milli", "millisecond", "µs", "micro", "microsecond", "ns", "nano", "nanosecond".
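
The duration rules for `kafka.retention.time` can be sketched with two hypothetical tables. The table names `t_retention_two_days` and `t_retention_millis` are assumptions made for illustration; only the option name and the unit syntax come from the description of the retention option.

```sql
-- Hypothetical table: keep records for two days, using the unit syntax.
CREATE TABLE t_retention_two_days (i INT)
  WITH ('kafka.retention.time' = '2 d');

-- Hypothetical table: no unit is given, so the value is read as milliseconds
-- (86400000 ms = 1 day).
CREATE TABLE t_retention_millis (i INT)
  WITH ('kafka.retention.time' = '86400000');
```

Spelling out the unit is usually easier to review than a raw millisecond count, which is why the first form may be preferable in shared statements.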

#### Code Examples

```sql
CREATE TABLE [IF NOT EXISTS] [catalog_name.][db_name.]table_name
  (
    { <physical_column_definition> |
      <metadata_column_definition> |
      <computed_column_definition> |
      <column_in_vector_db_provider> }[ , ...n]
    [ <watermark_definition> ]
    [ <table_constraint> ][ , ...n]
  )
  [COMMENT table_comment]
  [DISTRIBUTED BY (distribution_column_name1, distribution_column_name2, ...) INTO n BUCKETS]
  WITH (key1=value1, key2=value2, ...)
  [ LIKE source_table [( <like_options> )] | AS select_query ]

<physical_column_definition>:
  column_name column_type [ <column_constraint> ] [COMMENT column_comment]

<metadata_column_definition>:
  column_name column_type METADATA [ FROM metadata_key ] [ VIRTUAL ]

<computed_column_definition>:
  column_name AS computed_column_expression [COMMENT column_comment]

<column_in_vector_db_provider>:
  column_name column_type

<watermark_definition>:
  WATERMARK FOR rowtime_column_name AS watermark_strategy_expression

<table_constraint>:
  [CONSTRAINT constraint_name] PRIMARY KEY (column_name, ...) NOT ENFORCED

<like_options>:
{
 { INCLUDING | EXCLUDING } { ALL | CONSTRAINTS | PARTITIONS } |
 { INCLUDING | EXCLUDING | OVERWRITING } { GENERATED | OPTIONS | WATERMARKS }
}
```

```sql
catalog_name.db_name.table_name
```

```sql
db_name.table_name
```

```sql
CREATE TABLE t1 (
  `id` BIGINT,
  `name` STRING,
  `age` INT,
  `salary` DECIMAL(10,2),
  `active` BOOLEAN,
  `created_at` TIMESTAMP_LTZ(3)
);
```

```sql
CREATE TABLE t2 (
  `id` BIGINT,
  `name` STRING,
  `age` INT,
  `salary` DECIMAL(10,2),
  `active` BOOLEAN,
  `created_at` TIMESTAMP_LTZ(3)
) WITH (
  'changelog.mode' = 'retract'
);
```

```sql
CREATE TABLE t (
  `user_id` BIGINT,
  `item_id` BIGINT,
  `behavior` STRING,
  `event_time` TIMESTAMP_LTZ(3) METADATA FROM 'timestamp',
  `partition` BIGINT METADATA VIRTUAL,
  `offset` BIGINT METADATA VIRTUAL
);
```

```sql
StringSerializer
```

```sql
CREATE TABLE t (
  `id` BIGINT,
  `first_name` STRING,
  `last_name` STRING,
  `full_name` AS CONCAT(first_name, ' ', last_name)
);
```

```sql
SHOW CREATE TABLE
```

```sql
DESCRIBE EXTENDED
```

```sql
$rowtime TIMESTAMP_LTZ(3) NOT NULL
```

```sql
NOT ENFORCED
```

```sql
PRIMARY KEY
```

```sql
DISTRIBUTED BY
```

```sql
latest_page_per_ip
```

```sql
CREATE TABLE latest_page_per_ip (
    `ip` STRING,
    `page_url` STRING,
    `ts` TIMESTAMP_LTZ(3),
    PRIMARY KEY(`ip`) NOT ENFORCED
);
```

```sql
DISTRIBUTED BY
```

```sql
bucket = hash(user_id) % number_of_buckets
```

```sql
CREATE TABLE t (schema);
```

```sql
CREATE TABLE t_dist (k INT, s STRING) DISTRIBUTED BY (k) INTO 4 BUCKETS;
```

```sql
PARTITIONED BY
```

```sql
CREATE TABLE t (partition_key INT, example_value STRING) PARTITIONED BY (partition_key);
```

```sql
WATERMARK FOR $rowtime AS SOURCE_WATERMARK()
```

```sql
WATERMARK FOR rowtime_column_name AS watermark_strategy_expression
```

```sql
rowtime_column_name
```

```sql
TIMESTAMP(3)
```

```sql
watermark_strategy_expression
```

```sql
TIMESTAMP(3)
```

```sql
WATERMARK FOR rowtime_column AS rowtime_column
```

```sql
WATERMARK FOR rowtime_column AS rowtime_column - INTERVAL '0.001' SECOND
```

```sql
WATERMARK FOR rowtime_column AS rowtime_column - INTERVAL 'string' timeUnit
```

```sql
WATERMARK FOR rowtime_column AS rowtime_column - INTERVAL '5' SECOND
```

```sql
CREATE TABLE orders (
    `user` BIGINT,
    `product` STRING,
    `order_time` TIMESTAMP(3),
    WATERMARK FOR `order_time` AS `order_time` - INTERVAL '5' SECOND
);
```

```sql
sql.tables.scan.idle-timeout
```

```sql
-- Equivalent to the following CREATE TABLE and INSERT INTO statements.
CREATE TABLE my_ctas_table
AS SELECT id, name, age FROM source_table WHERE mod(id, 10) = 0;
```

```sql
-- These two statements are equivalent to the preceding CREATE TABLE AS statement.
CREATE TABLE my_ctas_table (
    id BIGINT,
    name STRING,
    age INT
);

INSERT INTO my_ctas_table SELECT id, name, age FROM source_table WHERE mod(id, 10) = 0;
```

```sql
CREATE TABLE t WITH (…) AS SELECT …
```

```sql
CREATE TABLE t WITH ('scan.startup.mode' = 'latest-offset') AS SELECT * FROM b;
```

```sql
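CREATE TABLE my_ctas_table (
    -- DESC is a reserved keyword, so the column name is quoted with backticks.
    `desc` STRING,
    quantity DOUBLE,
    cost AS price * quantity,
    -- No trailing comma after the last element of the column list.
    WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) AS SELECT id, price, quantity, order_time FROM source_table;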
```

```sql
-- Equivalent to the following CREATE TABLE and INSERT INTO statements.
CREATE TABLE my_ctas_table (
    PRIMARY KEY (id) NOT ENFORCED
) DISTRIBUTED BY HASH(id) INTO 4 BUCKETS
AS SELECT id, name FROM source_table;
```

```sql
-- These two statements are equivalent to the preceding CREATE TABLE AS statement.
CREATE TABLE my_ctas_table (
    id BIGINT NOT NULL PRIMARY KEY NOT ENFORCED,
    name STRING
) DISTRIBUTED BY HASH(id) INTO 4 BUCKETS;

INSERT INTO my_ctas_table SELECT id, name FROM source_table;
```

```sql
CREATE TABLE t (
  `user_id` BIGINT,
  `item_id` BIGINT,
  `price` DOUBLE,
  `behavior` STRING,
  `created_at` TIMESTAMP(3),
  `price_with_tax` AS `price` * 1.19,
  `event_time` TIMESTAMP_LTZ(3) METADATA FROM 'timestamp',
  `partition` BIGINT METADATA VIRTUAL,
  `offset` BIGINT METADATA VIRTUAL
);
```

```sql
CREATE TABLE t_derived (
    WATERMARK FOR `created_at` AS `created_at` - INTERVAL '5' SECOND
)
LIKE t (
    EXCLUDING WATERMARKS
    EXCLUDING METADATA
);
```

```sql
'changelog.mode' = [append | upsert | retract]
```

```sql
'error-handling.log.target' = '<dlq_table_name>'
```

```sql
error-handling.log.target
```

```sql
'error-handling.mode' = [fail | ignore | log]
```

```sql
SELECT * FROM my_table
```

```sql
error-handling.mode
```

```sql
'kafka.cleanup-policy' = [delete | compact | delete-compact]
```

```sql
log.cleanup.policy
```

```sql
delete-compact
```

```sql
read-committed
```

```sql
'kafka.consumer.isolation-level' = [read-committed | read-uncommitted]
```

```sql
read-committed
```

```sql
read-uncommitted
```

```sql
'kafka.max-message-size' = MemorySize
```

```sql
max.message.bytes
```

```sql
'kafka.producer.compression.type' = [none | gzip | snappy | lz4 | zstd]
```

```sql
compression.type
```

```sql
'kafka.retention.size' = MemorySize
```

```sql
log.retention.bytes
```

```sql
'kafka.retention.time' = '<duration>'
```

```sql
log.retention.ms
```

```sql
'key.fields-prefix' = '<prefix-string>'
```

```sql
key.fields-prefix
```

```sql
'key.format' = '<key-format>'
```

```sql
avro-registry
```

```sql
json-registry
```

```sql
proto-registry
```

```sql
avro-registry
```

```sql
json-registry
```

```sql
proto-registry
```

```sql
'key.<format>.schema-context' = '<schema-context>'
```

```sql
'scan.bounded.mode' = [latest-offset | timestamp | unbounded]
```

```sql
latest-offset
```

```sql
scan.bounded.mode
```

```sql
January 1, 1970 00:00:00.000 GMT
```

```sql
'scan.bounded.mode' = 'timestamp',
'scan.bounded.timestamp-millis' = '<long-value>'
```

```sql
earliest-offset
```

```sql
'scan.startup.mode' = '<startup-mode>'
```

```sql
earliest-offset
```

```sql
latest-offset
```

```sql
earliest-offset
```

```sql
group-offsets
```

```sql
'scan.startup.mode' = 'timestamp',
'scan.startup.timestamp-millis' = '<long-value>'
```

```sql
'value.fields-include' = [all | except-key]
```

```sql
'value.format' = '<format>'
```

```sql
avro-registry
```

```sql
json-registry
```

```sql
proto-registry
```

```sql
avro-debezium-registry
```

```sql
json-debezium-registry
```

```sql
proto-debezium-registry
```

```sql
avro-registry
```

```sql
json-registry
```

```sql
proto-registry
```

```sql
'value.<format>.schema-context' = '<schema-context>'
```

```sql
CREATE TABLE `t_raw` (
  `key` VARBINARY(2147483647),
  `val` VARBINARY(2147483647)
) DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'raw'
  ...
)
```

```sql
INSERT INTO t_raw (key, val) SELECT CAST(NULL AS BYTES), CAST(NULL AS BYTES);
```
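
The NULL insert above has a non-NULL counterpart. The following is a minimal sketch, assuming the inferred `t_raw` table shown earlier, that the payloads are UTF-8 text, and that explicit casts between STRING and BYTES are available in your environment. The literal values and the `key_text` and `val_text` aliases are hypothetical.

```sql
-- Write a non-NULL key and value as bytes (assumed UTF-8 text payloads).
INSERT INTO `t_raw` (`key`, `val`)
SELECT CAST('user-1' AS BYTES), CAST('hello' AS BYTES);

-- Read the raw payloads back as text for inspection.
SELECT
  CAST(`key` AS STRING) AS key_text,
  CAST(`val` AS STRING) AS val_text
FROM `t_raw`;
```

Rows written with NULL key or value come back as NULL in both text columns, which matches the Kafka message semantics described for raw formats.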

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "i",
      "type": "int"
    },
    {
      "name": "s",
      "type": "string"
    }
  ]
}
```

```sql
CREATE TABLE `t_raw_key` (
  `key` VARBINARY(2147483647),
  `i` INT NOT NULL,
  `s` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
INSERT INTO t_raw_key SELECT CAST(NULL AS BYTES), 12, 'Bob';
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "i",
      "type": "int"
    },
    {
      "name": "s",
      "type": "string"
    }
  ]
}
```

```sql
CREATE TABLE `t_atomic_key` (
  `key` INT NOT NULL,
  `i` INT NOT NULL,
  `s` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'avro-registry',
  'value.format' = 'avro-registry'
  ...
)
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "i",
      "type": "int"
    },
    {
      "name": "key",
      "type": "string"
    }
  ]
}
```

```sql
CREATE TABLE `t_raw_disjoint` (
  `key_key` VARBINARY(2147483647),
  `i` INT NOT NULL,
  `key` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`key_key`) INTO 1 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.fields-prefix' = 'key_',
  'key.format' = 'raw',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
i INT NOT NULL
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "uid",
      "type": "int"
    }
  ]
}
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "zip_code",
      "type": "string"
    }
  ]
}
```

```sql
CREATE TABLE `t_sr_disjoint` (
  `uid` INT NOT NULL,
  `name` VARCHAR(2147483647) NOT NULL,
  `zip_code` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`uid`) INTO 1 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'value.format' = 'avro-registry'
  ...
)
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "uid",
      "type": "int"
    }
  ]
}
```

```json
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "uid",
      "type": "int"
    },
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "zip_code",
      "type": "string"
    }
  ]
}
```

```sql
CREATE TABLE `t_sr_joint` (
  `uid` INT NOT NULL,
  `name` VARCHAR(2147483647) NOT NULL,
  `zip_code` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`uid`) INTO 1 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'value.fields-include' = 'all',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
'value.fields-include' = 'all'
```

```json
["int", "string"]
```

```sql
CREATE TABLE `t_union` (
  `key` VARBINARY(2147483647),
  `int` INT,
  `string` VARCHAR(2147483647)
)
...
```
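
A minimal sketch of reading from the inferred `t_union` table above, assuming it was derived from the `["int", "string"]` union. The `branch` alias is a hypothetical name. Because only one union element is set per record and the others are NULL, checking which column is non-NULL identifies the branch that was written.

```sql
-- Report which union branch each record carries.
SELECT
  CASE
    WHEN `int` IS NOT NULL THEN 'int'
    WHEN `string` IS NOT NULL THEN 'string'
  END AS branch,
  `int`,
  `string`
FROM `t_union`;
```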

```json
[
  "string",
  {
    "type": "record",
    "name": "User",
    "fields": [
      {
        "name": "uid",
        "type": "int"
      },{
        "name": "name",
        "type": "string"
      }
    ]
  },
  {
    "type": "record",
    "name": "Address",
    "fields": [
      {
        "name": "zip_code",
        "type": "string"
      }
    ]
  }
]
```

```sql
CREATE TABLE `t_union` (
  `key` VARBINARY(2147483647),
  `string` VARCHAR(2147483647),
  `User` ROW<`uid` INT NOT NULL, `name` VARCHAR(2147483647) NOT NULL>,
  `Address` ROW<`zip_code` VARCHAR(2147483647) NOT NULL>
)
...
```

```sql
org.myorg.avro.User
```

```protobuf
syntax = "proto3";

message Purchase {
   string item = 1;
   double amount = 2;
   string customer_id = 3;
}

message Pageview {
   string url = 1;
   bool is_special = 2;
   string customer_id = 3;
}
```

```sql
CREATE TABLE `t` (
  `key` VARBINARY(2147483647),
  `Purchase` ROW<
      `item` VARCHAR(2147483647) NOT NULL,
      `amount` DOUBLE NOT NULL,
      `customer_id` VARCHAR(2147483647) NOT NULL
   >,
  `Pageview` ROW<
      `url` VARCHAR(2147483647) NOT NULL,
      `is_special` BOOLEAN NOT NULL,
      `customer_id` VARCHAR(2147483647) NOT NULL
   >
)
...
```
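
A minimal sketch of querying the multi-message table above, assuming the inferred table `t` with the nullable `Purchase` and `Pageview` row columns. Each record is expected to populate only one of the two rows, so filtering on a non-NULL `Purchase` keeps the purchase messages. The output aliases are hypothetical.

```sql
-- Keep only Purchase messages and flatten their fields into columns.
SELECT
  `Purchase`.`item`        AS item,
  `Purchase`.`amount`      AS amount,
  `Purchase`.`customer_id` AS customer_id
FROM `t`
WHERE `Purchase` IS NOT NULL;
```

The same pattern with `Pageview` in place of `Purchase` would select the page view messages.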

```protobuf
syntax = "proto3";

message Purchase {
   string item = 1;
   double amount = 2;
   string customer_id = 3;
   Pageview pageview = 4;
}

message Pageview {
   string url = 1;
   bool is_special = 2;
   string customer_id = 3;
}
```

```sql
CREATE TABLE `t` (
  `key` VARBINARY(2147483647),
  `Purchase` ROW<
      `item` VARCHAR(2147483647) NOT NULL,
      `amount` DOUBLE NOT NULL,
      `customer_id` VARCHAR(2147483647) NOT NULL,
      `pageview` ROW<
         `url` VARCHAR(2147483647) NOT NULL,
         `is_special` BOOLEAN NOT NULL,
         `customer_id` VARCHAR(2147483647) NOT NULL
      >
   >,
  `Pageview` ROW<
      `url` VARCHAR(2147483647) NOT NULL,
      `is_special` BOOLEAN NOT NULL,
      `customer_id` VARCHAR(2147483647) NOT NULL
   >
)
...
```

```protobuf
syntax = "proto3";

message Purchase {
   string item = 1;
   double amount = 2;
   string customer_id = 3;
   Pageview pageview = 4;
   message Pageview {
      string url = 1;
      bool is_special = 2;
      string customer_id = 3;
   }
}
```

```sql
CREATE TABLE `t` (
  `key` VARBINARY(2147483647),
  `item` VARCHAR(2147483647) NOT NULL,
  `amount` DOUBLE NOT NULL,
  `customer_id` VARCHAR(2147483647) NOT NULL,
  `pageview` ROW<
      `url` VARCHAR(2147483647) NOT NULL,
      `is_special` BOOLEAN NOT NULL,
      `customer_id` VARCHAR(2147483647) NOT NULL
   >
)
...
```

```json
{
  "type": "record",
  "name": "Customer",
  "namespace": "io.debezium.data",
  "fields": [
    {
      "name": "before",
      "type": ["null", {
        "type": "record",
        "name": "Value",
        "fields": [
          {"name": "id", "type": "int"},
          {"name": "name", "type": "string"},
          {"name": "email", "type": "string"}
        ]
      }],
      "default": null
    },
    {
      "name": "after",
      "type": ["null", "Value"],
      "default": null
    },
    {
      "name": "source",
      "type": {
        "type": "record",
        "name": "Source",
        "fields": [
          {"name": "version", "type": "string"},
          {"name": "connector", "type": "string"},
          {"name": "name", "type": "string"},
          {"name": "ts_ms", "type": "long"},
          {"name": "db", "type": "string"},
          {"name": "schema", "type": "string"},
          {"name": "table", "type": "string"}
        ]
      }
    },
    {"name": "op", "type": "string"},
    {"name": "ts_ms", "type": ["null", "long"], "default": null},
    {"name": "transaction", "type": ["null", {
      "type": "record",
      "name": "Transaction",
      "fields": [
        {"name": "id", "type": "string"},
        {"name": "total_order", "type": "long"},
        {"name": "data_collection_order", "type": "long"}
      ]
    }], "default": null}
  ]
}
```

```sql
CREATE TABLE `customer_changes` (
  `key` VARBINARY(2147483647),
   `id` INT NOT NULL,
   `name` VARCHAR(2147483647) NOT NULL,
   `email` VARCHAR(2147483647) NOT NULL
)
DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'changelog.mode' = 'retract',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'avro-debezium-registry'
  ...
)
```

```sql
value.format
```

```sql
*-debezium-registry
```

```sql
changelog.mode
```

```sql
cleanup.policy
```

```sql
changelog.mode
```

```sql
-- Change to upsert mode for primary key-based operations
ALTER TABLE customer_changes SET ('changelog.mode' = 'upsert');

-- Change to append mode (processes only inserts and updates)
ALTER TABLE customer_changes SET ('changelog.mode' = 'append');
```

```sql
CREATE TABLE t_minimal (s STRING);
```

```sql
CREATE TABLE t_pk (k INT PRIMARY KEY NOT ENFORCED, s STRING);
```

```sql
CREATE TABLE t_pk_append (k INT PRIMARY KEY NOT ENFORCED, s STRING)
  DISTRIBUTED INTO 4 BUCKETS
  WITH ('changelog.mode' = 'append');
```

```sql
CREATE TABLE t_dist (k INT, s STRING) DISTRIBUTED BY (k) INTO 4 BUCKETS;
```

```sql
CREATE TABLE t_complex (k1 INT, k2 INT, PRIMARY KEY (k1, k2) NOT ENFORCED, s STRING)
  COMMENT 'My complex table'
  DISTRIBUTED BY HASH(k1) INTO 4 BUCKETS
  WITH ('changelog.mode' = 'append');
```

```sql
CREATE TABLE t_disjoint (from_key_k INT, k STRING)
  DISTRIBUTED BY (from_key_k)
  WITH ('key.fields-prefix' = 'from_key_');
```

```sql
CREATE TABLE t_joint (k INT, v STRING)
  DISTRIBUTED BY (k)
  WITH ('value.fields-include' = 'all');
```

```sql
'value.fields-include' = 'all'
```

```sql
CREATE TABLE t_metadata_write (name STRING, ts TIMESTAMP_LTZ(3) NOT NULL METADATA FROM 'timestamp')
  DISTRIBUTED INTO 1 BUCKETS;
```

```sql
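-- Write the Kafka record timestamp explicitly through the metadata column
INSERT INTO t_metadata_write (ts, name) SELECT NOW(), 'Alice';
INSERT INTO t_metadata_write (ts, name) SELECT TO_TIMESTAMP_LTZ(0, 3), 'Bob';
-- $rowtime reflects the timestamps written above
SELECT $rowtime, * FROM t_metadata_write;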
```

```sql
CREATE TABLE t_raw_string_key (key STRING, i INT)
  DISTRIBUTED BY (key)
  WITH ('key.format' = 'raw');
```

```sql
CREATE TABLE t_shared_schema (key STRING, s STRING) DISTRIBUTED BY (key);
```

```sql
t_shared_schema-key
```

```sql
t_shared_schema-value
```

```sql
+I['Bob', 42]
```

```sql
-D['Bob', 42]
```

```sql
+U['Alice', 13]
```

```sql
-U['Alice', 13]
```

```sql
CREATE TABLE t_changelog_modes (i BIGINT);
```

```sql
-- works because the query is non-updating
INSERT INTO t_changelog_modes SELECT 1;

-- does not work because the query is updating, causing an error
INSERT INTO t_changelog_modes SELECT COUNT(*) FROM (VALUES (1), (2), (3));
```

```sql
ALTER TABLE t_changelog_modes SET ('changelog.mode' = 'retract');
```

```sql
ALTER TABLE t_changelog_modes SET ('changelog.mode' = 'append');
ALTER TABLE t_changelog_modes ADD headers MAP<BYTES, BYTES> METADATA VIRTUAL;

-- Shows what is serialized internally
SELECT i, headers FROM t_changelog_modes;
```

```sql
CREATE TABLE t_infinite_retention (i INT) WITH ('kafka.retention.time' = '0');
```

```sql
"d", "day", "h", "hour", "m", "min", "minute", "ms", "milli", "millisecond",
"µs", "micro", "microsecond", "ns", "nano", "nanosecond"
```
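
As a sketch of how the duration units listed above might be combined with `kafka.retention.time`, the following statement keeps records for seven days. The table name `t_week_retention` is hypothetical, and the example assumes that a number followed by one of the listed unit labels is an accepted value.

```sql
-- Hypothetical table: retain records for 7 days, using the "d" unit label
CREATE TABLE t_week_retention (i INT)
  WITH ('kafka.retention.time' = '7 d');
```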

---

### SQL CREATE VIEW Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/create-view.html

Confluent Cloud for Apache Flink® enables creating views based on statement expressions by using the CREATE VIEW statement. With Flink views, you can encapsulate complex queries and reference them like regular tables.

Syntax: `CREATE VIEW [IF NOT EXISTS] [catalog_name.][db_name.]view_name [( columnName [, columnName ]* )] [COMMENT view_comment] AS statement_expression`

Description: Create a view with the given statement expression. If a view with the same name already exists in the catalog, an exception is thrown. If you specify IF NOT EXISTS, nothing happens if the view exists already.

The view name can be in these formats:

- `catalog_name.db_name.view_name`: The view is registered with the catalog named "catalog_name" and the database named "db_name".
- `db_name.view_name`: The view is registered into the current catalog of the execution table environment and the database named "db_name".
- `view_name`: The view is registered into the current catalog and the database of the execution table environment.

A view created with the CREATE VIEW statement acts as a virtual table that refers to the result of the specified statement expression. The statement expression can be any valid SELECT statement supported by Flink SQL.

Views vs. tables: Views in Flink are similar to tables in that they can be referenced in SQL queries just like regular tables. But there are some key differences:

- Views are read-only and can't be used as sinks in INSERT statements. Tables support both read and write operations.
- Views don't have a physical representation and are computed on-the-fly when referenced in a statement.
- Creating a view results in creating a special Kafka topic. This Flink resource only reserves the name and doesn't store data. Creating a table results in creating a regular Kafka topic that stores data and corresponding key and value schemas in Confluent Schema Registry.
- Views are lightweight and store only the statement expression.

Despite these differences, views and tables share the same namespace in Flink. This means a view can't have the same fully qualified name as an existing table in the same catalog and database.

Usage: The following CREATE VIEW statement defines a view named `customer_orders` that computes the total order value per customer from an orders table: ``CREATE VIEW customer_orders AS SELECT customer_id, SUM(price) AS total_spent FROM `examples`.`marketplace`.`orders` GROUP BY customer_id;``

You can then use this view in queries as if it were a table: `SELECT customer_id, total_spent FROM customer_orders WHERE total_spent > 1000;`

This statement retrieves all customers with a total order value greater than 1000, leveraging the aggregation already defined in the `customer_orders` view.

Related content: CREATE TABLE statement, SELECT statement.

#### Code Examples

```sql
CREATE VIEW [IF NOT EXISTS] [catalog_name.][db_name.]view_name
  [( columnName [, columnName ]* )] [COMMENT view_comment]
  AS statement_expression
```

```sql
catalog_name.db_name.view_name
```

```sql
db_name.view_name
```

```sql
customer_orders
```

```sql
CREATE VIEW customer_orders AS
SELECT customer_id, SUM(price) AS total_spent
FROM `examples`.`marketplace`.`orders`
GROUP BY customer_id;
```

```sql
SELECT customer_id, total_spent
FROM customer_orders
WHERE total_spent > 1000;
```
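
The optional clauses in the syntax above can be combined in one statement. The following is a minimal sketch, not an official example: the view name `big_spenders`, the column names `customer` and `total`, and the comment text are hypothetical, and the query reuses the `orders` example table shown earlier.

```sql
-- Hypothetical view: IF NOT EXISTS avoids an exception when the view already exists.
-- The column list renames the two columns produced by the SELECT.
CREATE VIEW IF NOT EXISTS big_spenders (customer, total)
  COMMENT 'Total order value per customer'
AS
SELECT customer_id, SUM(price)
FROM `examples`.`marketplace`.`orders`
GROUP BY customer_id;

-- Query the view by using the renamed columns.
SELECT customer, total
FROM big_spenders
WHERE total > 1000;
```

Naming the columns in the view definition means that downstream queries don't depend on aliases inside the SELECT statement.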

---

### SQL DESCRIBE Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/describe.html

Confluent Cloud for Apache Flink® enables viewing the schema of an Apache Kafka® topic. Also, you can view details of an AI model, function, or connection.

Syntax:

- View table details: `{ DESCRIBE | DESC } [EXTENDED] [catalog_name.][db_name.]table_name`
- View model details: `{ DESCRIBE | DESC } MODEL [[catalog_name].[database_name]].model_name`
- View function details: `{ DESCRIBE | DESC } FUNCTION [EXTENDED] [catalog_name.][db_name.]function_name`
- View connection details: `{ DESCRIBE | DESC } CONNECTION [catalog_name.][db_name.]connection_name`

Description: The DESCRIBE statement shows the following properties of a table:

- Columns and their data type, including nullability constraints
- Primary keys
- Bucket keys, i.e., keys of distribution
- Implicit NOT NULL for primary key columns
- Custom watermark

The DESCRIBE EXTENDED statement shows all of the properties from the DESCRIBE statement and also shows system columns, like `$rowtime`, including the system watermark.

The DESCRIBE MODEL statement shows the following properties of an AI model:

- Input format
- Output format
- Model version
- isDefault version (yes or no)

The DESCRIBE FUNCTION statement shows the following properties of a function:

- System function (yes or no)
- Temporary (yes or no)
- Class name
- Function language
- Plugin ID
- Version ID
- Argument types
- Return type

The DESCRIBE FUNCTION EXTENDED statement shows all of the properties from the DESCRIBE FUNCTION statement and also shows the following properties:

- Kind, i.e., SCALAR, TABLE, or AGGREGATE
- Requirements, e.g., an aggregate function that can only be applied in an OVER window
- Deterministic (yes or no)
- Constant folding (yes or no)
- Signature

The DESCRIBE CONNECTION statement shows the following properties of a connection:

- Name
- Type
- Endpoint
- Comment

Examples, tables: In the Flink SQL shell or in a Cloud Console workspace, run the following commands to see an example of the DESCRIBE statement. Create a table.
``CREATE TABLE orders (`user` BIGINT NOT NULL, product STRING, amount INT, ts TIMESTAMP(3), PRIMARY KEY(`user`) NOT ENFORCED);``

Your output should resemble: `[INFO] Execute statement succeed.`

View the table's schema: `DESCRIBE orders;` Your output should resemble:

| Column Name | Data Type | Nullable | Extras |
|---|---|---|---|
| user | BIGINT | NOT NULL | PRIMARY KEY, BUCKET KEY |
| product | STRING | NULL | |
| amount | INT | NULL | |
| ts | TIMESTAMP(3) | NULL | |

View the table's schema and system columns: `DESCRIBE EXTENDED orders;` Your output should resemble:

| Column Name | Data Type | Nullable | Extras | Comment |
|---|---|---|---|---|
| user | BIGINT | NOT NULL | PRIMARY KEY, BUCKET KEY | |
| product | STRING | NULL | | |
| amount | INT | NULL | | |
| ts | TIMESTAMP(3) | NULL | | |
| $rowtime | `TIMESTAMP_LTZ(3) *ROWTIME*` | NOT NULL | METADATA VIRTUAL, WATERMARK AS `SOURCE_WATERMARK`() | SYSTEM |

Models: If you have an AI model registered in the Flink environment, you can view its details and creation options by using the DESCRIBE MODEL statement.

The following code example shows how to view the default model version: ``DESCRIBE MODEL `my-model`;`` Your output should resemble:

| Inputs | Outputs | Options | Comment |
|---|---|---|---|
| ( `credit_limit` INT, `age` INT ) | ( `predicted_default` INT ) | `{ AZUREML.API_KEY=******, AZUREML.ENDPOINT=h...` | |

The following code example shows how to view a specific model version: ``DESCRIBE MODEL `my-model$2`;`` Your output should resemble:

| VersionId | IsDefaultVersion | Inputs | Outputs | Options | Comment |
|---|---|---|---|---|---|
| 2 | true | ( `credit_limit` INT, `age` INT ) | ( `predicted_default` INT ) | `{ AZUREML.API_K...` | |

The following code example shows how to view all model versions: ``DESCRIBE MODEL `my-model$all`;`` Your output should resemble:

| VersionId | IsDefaultVersion | Inputs | Outputs | Options | Comment |
|---|---|---|---|---|---|
| 1 | true | ( `credit_limit` INT, `age` INT ) | ( `predicted_default` INT ) | `{ AZUREML.API_K...` | |
| 2 | false | ( `credit_limit` INT, `age` INT ) | ( `predicted_default` INT ) | `{ AZUREML.API_K...` | |

For more information, see Model versioning.

Functions: You can view the details of any system functions or registered user-defined functions in the Flink environment by using the DESCRIBE FUNCTION statement.

The following code example shows how to describe a system function: ``DESCRIBE FUNCTION `SUM`;`` Your output should resemble:

| info name | info value |
|---|---|
| system function | true |
| temporary | false |

View more details about the system function definition: ``DESCRIBE FUNCTION EXTENDED `SUM`;`` Your output should resemble:

| info name | info value |
|---|---|
| system function | true |
| temporary | false |
| kind | AGGREGATE |
| requirements | [] |
| deterministic | true |
| constant folding | true |
| signature | `SUM(<NUMERIC>)` |

The following code example shows how to describe a user-defined function: ``DESCRIBE FUNCTION `MyUpperCaseUdf`;`` Your output should resemble:

| info name | info value |
|---|---|
| system function | false |
| temporary | true |
| class name | org.example.UpperUDF |
| function language | JAVA |
| plugin id | ccp-xyz |
| version id | ver-123 |
| argument types | [str] |
| return type | str |

View more details about the user-defined function definition: ``DESCRIBE FUNCTION EXTENDED `MyUpperCaseUdf`;`` Your output should resemble:

| info name | info value |
|---|---|
| system function | false |
| temporary | true |
| class name | org.example.UpperUDF |
| function language | JAVA |
| kind | SCALAR |
| requirements | [] |
| deterministic | true |
| constant folding | true |
| signature | `cat.db.MyUpperCaseUdf(STRING)` |
| plugin id | ccp-xyz |
| version id | ver-123 |
| argument types | [str] |
| return type | str |

Connections: You can view the details of any connection in the Flink environment by using the DESCRIBE CONNECTION statement.

The following code example shows how to describe an example connection named `azure-openai-connection`: ``DESCRIBE CONNECTION `azure-openai-connection`;`` Your output should resemble:

| Name | Type | Endpoint | Comment |
|---|---|---|---|
| azure-openai-connection | AZUREOPENAI | `https://<your-project>.openai.azure.com/openai/deployments/matrix-...` | |

Related content: CREATE TABLE, CREATE MODEL, CREATE FUNCTION, CREATE CONNECTION, USE CATALOG.

#### Code Examples

```sql
-- View table details.
{ DESCRIBE | DESC } [EXTENDED] [catalog_name.][db_name.]table_name

-- View model details.
{ DESCRIBE | DESC } MODEL [[catalog_name].[database_name]].model_name

-- View function details.
{ DESCRIBE | DESC } FUNCTION [EXTENDED] [catalog_name.][db_name.]function_name

-- View connection details.
{ DESCRIBE | DESC } CONNECTION [catalog_name.][db_name.]connection_name
```

```sql
CREATE TABLE orders (
  `user` BIGINT NOT NULL,
  product STRING,
  amount INT,
  ts TIMESTAMP(3),
  PRIMARY KEY(`user`) NOT ENFORCED
);
```

```sql
[INFO] Execute statement succeed.
```

```sql
DESCRIBE orders;
```

```sql
+-------------+--------------+----------+-------------------------+
| Column Name |  Data Type   | Nullable |         Extras          |
+-------------+--------------+----------+-------------------------+
| user        | BIGINT       | NOT NULL | PRIMARY KEY, BUCKET KEY |
| product     | STRING       | NULL     |                         |
| amount      | INT          | NULL     |                         |
| ts          | TIMESTAMP(3) | NULL     |                         |
+-------------+--------------+----------+-------------------------+
```

```sql
DESCRIBE EXTENDED orders;
```

```sql
+-------------+----------------------------+----------+-----------------------------------------------------+---------+
| Column Name |         Data Type          | Nullable |                       Extras                        | Comment |
+-------------+----------------------------+----------+-----------------------------------------------------+---------+
| user        | BIGINT                     | NOT NULL | PRIMARY KEY, BUCKET KEY                             |         |
| product     | STRING                     | NULL     |                                                     |         |
| amount      | INT                        | NULL     |                                                     |         |
| ts          | TIMESTAMP(3)               | NULL     |                                                     |         |
| $rowtime    | TIMESTAMP_LTZ(3) *ROWTIME* | NOT NULL | METADATA VIRTUAL, WATERMARK AS `SOURCE_WATERMARK`() | SYSTEM  |
+-------------+----------------------------+----------+-----------------------------------------------------+---------+
```
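
The syntax also allows the `DESC` shorthand and a qualified table name. The following sketch assumes the `orders` table from the previous example; the catalog name `my_env` and the database name `my_cluster` are hypothetical placeholders, not names from the original documentation.

```sql
-- Shorthand form, equivalent to DESCRIBE orders
DESC orders;

-- Shorthand with EXTENDED and a fully qualified name.
-- `my_env` and `my_cluster` are hypothetical catalog and database names.
DESC EXTENDED `my_env`.`my_cluster`.`orders`;
```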

```sql
DESCRIBE MODEL `my-model`;
```

```sql
+-----------------------+---------------------------+---------------------------+---------+
|        Inputs         |          Outputs          |          Options          | Comment |
+-----------------------+---------------------------+---------------------------+---------+
| (                     | (                         | {                         |         |
|   `credit_limit` INT, |   `predicted_default` INT |   AZUREML.API_KEY=******, |         |
|   `age` INT           | )                         |   AZUREML.ENDPOINT=h...   |         |
| )                     |                           |                           |         |
+-----------------------+---------------------------+---------------------------+---------+
```

```sql
DESCRIBE MODEL `my-model$2`;
```

```sql
+-----------+------------------+-----------------------+---------------------------+--------------------+---------+
| VersionId | IsDefaultVersion |        Inputs         |          Outputs          |      Options       | Comment |
+-----------+------------------+-----------------------+---------------------------+--------------------+---------+
| 2         | true             | (                     | (                         | {                  |         |
|           |                  |   `credit_limit` INT, |   `predicted_default` INT |   AZUREML.API_K... |         |
|           |                  |   `age` INT           | )                         |                    |         |
|           |                  | )                     |                           |                    |         |
+-----------+------------------+-----------------------+---------------------------+--------------------+---------+
```

```sql
DESCRIBE MODEL `my-model$all`;
```

```sql
+-----------+------------------+-----------------------+---------------------------+--------------------+---------+
| VersionId | IsDefaultVersion |        Inputs         |          Outputs          |      Options       | Comment |
+-----------+------------------+-----------------------+---------------------------+--------------------+---------+
| 1         | true             | (                     | (                         | {                  |         |
|           |                  |   `credit_limit` INT, |   `predicted_default` INT |   AZUREML.API_K... |         |
|           |                  |   `age` INT           | )                         |                    |         |
|           |                  | )                     |                           |                    |         |
| 2         | false            | (                     | (                         | {                  |         |
|           |                  |   `credit_limit` INT, |   `predicted_default` INT |   AZUREML.API_K... |         |
|           |                  |   `age` INT           | )                         |                    |         |
|           |                  | )                     |                           |                    |         |
+-----------+------------------+-----------------------+---------------------------+--------------------+---------+
```

```sql
DESCRIBE FUNCTION `SUM`;
```

```sql
+-----------------+------------+
|       info name | info value |
+-----------------+------------+
| system function |       true |
|       temporary |      false |
+-----------------+------------+
```

```sql
DESCRIBE FUNCTION EXTENDED `SUM`;
```

```sql
+------------------+----------------+
|        info name |     info value |
+------------------+----------------+
|  system function |           true |
|        temporary |          false |
|             kind |      AGGREGATE |
|     requirements |             [] |
|    deterministic |           true |
| constant folding |           true |
|        signature | SUM(<NUMERIC>) |
+------------------+----------------+
```

```sql
DESCRIBE FUNCTION `MyUpperCaseUdf`;
```

```sql
+-------------------+----------------------+
|         info name |           info value |
+-------------------+----------------------+
|   system function |                false |
|         temporary |                 true |
|        class name | org.example.UpperUDF |
| function language |                 JAVA |
|         plugin id |              ccp-xyz |
|        version id |              ver-123 |
|    argument types |                [str] |
|       return type |                  str |
+-------------------+----------------------+
```

```sql
DESCRIBE FUNCTION EXTENDED `MyUpperCaseUdf`;
```

```sql
+-------------------+-------------------------------+
|         info name |                    info value |
+-------------------+-------------------------------+
|   system function |                         false |
|         temporary |                          true |
|        class name |          org.example.UpperUDF |
| function language |                          JAVA |
|              kind |                        SCALAR |
|      requirements |                            [] |
|     deterministic |                          true |
|  constant folding |                          true |
|         signature | cat.db.MyUpperCaseUdf(STRING) |
|         plugin id |                       ccp-xyz |
|        version id |                       ver-123 |
|    argument types |                         [str] |
|       return type |                           str |
+-------------------+-------------------------------+
```

```sql
azure-openai-connection
```

```sql
DESCRIBE CONNECTION `azure-openai-connection`;
```

```sql
+-------------------------+-------------+-----------------------------------------------------------------------+---------+
|          Name           |    Type     |                              Endpoint                                 | Comment |
+-------------------------+-------------+-----------------------------------------------------------------------+---------+
| azure-openai-connection | AZUREOPENAI | https://<your-project>.openai.azure.com/openai/deployments/matrix-... |         |
+-------------------------+-------------+-----------------------------------------------------------------------+---------+
```

---

### SQL DROP CONNECTION Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/drop-connection.html

Confluent Cloud for Apache Flink® supports creating secure connections to external services and data sources. You can use these connections in your Flink statements. You remove these connections by using the DROP CONNECTION statement.

Syntax: `DROP CONNECTION [IF EXISTS] [catalog_name.][db_name.]connection_name`

Description: Delete a connection from the Flink environment. Dropping a connection deletes the corresponding credentials stored in the `SecretStore`.

Example: ``DROP CONNECTION `azure-openai-connection`;``

Related content: ALTER CONNECTION, CREATE CONNECTION, DESCRIBE CONNECTION, SHOW CONNECTIONS.

#### Code Examples

```sql
DROP CONNECTION [IF EXISTS] [catalog_name.][db_name.]connection_name
```

```sql
SecretStore
```

```sql
DROP CONNECTION `azure-openai-connection`;
```
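
As a small sketch of the optional clause in the syntax above, `IF EXISTS` can be added so that cleanup scripts can run repeatedly. The assumption here, based on how `IF EXISTS` is described for the other DROP statements, is that no error is raised when the connection is already gone.

```sql
-- Drop the example connection only if it is still registered
DROP CONNECTION IF EXISTS `azure-openai-connection`;
```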

---

### SQL DROP MODEL Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/drop-model.html

Confluent Cloud for Apache Flink® enables real-time inference and prediction with AI models. Use the CREATE MODEL statement to register an AI model, and use the DROP MODEL statement to delete it.

Syntax:

- Delete the default version (the min version becomes the new default): `DROP MODEL [IF EXISTS] [[catalog_name].[database_name]].model_name`
- Delete the specified version: `DROP MODEL [IF EXISTS] [[catalog_name].[database_name]].model_name[$version-id]`
- Delete all versions and the model: `DROP MODEL [IF EXISTS] [[catalog_name].[database_name]].model_name[$all]`

Description: Delete an AI model in Confluent Cloud for Apache Flink.

- Use the `<model-name>$<model-version>` syntax to delete a specific version of a model. For more information, see Model versioning.
- If no version ID is specified, DROP deletes the default version, and the min version becomes the default version.
- `DROP MODEL <model-name>$all` deletes all versions.
- When the IF EXISTS clause is provided and the model or version doesn't exist, no action is taken.

Examples:

- Delete the default version (the min version becomes the new default): ``DROP MODEL `<model-name>`;``
- Delete a specific version of the model: ``DROP MODEL `<model-name>$<version>`;``
- Delete all versions and the model: ``DROP MODEL `<model-name>$all`;``

Related content: CREATE MODEL, ALTER MODEL, Run an AI Model.

#### Code Examples

```sql
-- Delete the default version. The min version becomes the new default.
DROP MODEL [IF EXISTS] [[catalog_name].[database_name]].model_name

-- Delete the specified version.
DROP MODEL [IF EXISTS] [[catalog_name].[database_name]].model_name[$version-id]

-- Delete all versions and the model.
DROP MODEL [IF EXISTS] [[catalog_name].[database_name]].model_name[$all]
```

```sql
<model-name>$<model-version>
```

```sql
DROP MODEL <model-name>$all
```

```sql
-- Delete the default version. The min version becomes the new default.
DROP MODEL `<model-name>`;

-- Delete a specific version of the model.
DROP MODEL `<model-name>$<version>`;

-- Delete all versions and the model.
DROP MODEL `<model-name>$all`;
```
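
To make the placeholders concrete, the following sketch drops one version and then lists what remains. It assumes a model named `my-model` with versions 1 and 2, as in the DESCRIBE MODEL examples; the model name and version numbers are illustrative only.

```sql
-- Delete version 2 only; IF EXISTS means no action is taken if it's already gone
DROP MODEL IF EXISTS `my-model$2`;

-- Check which versions are left and which one is the default
DESCRIBE MODEL `my-model$all`;
```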

---

### SQL DROP TABLE Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/drop-table.html

The DROP TABLE statement removes a table definition from Confluent Cloud for Apache Flink® and, depending on the table type, also deletes associated resources, like the Kafka topic and schemas in Schema Registry.

Syntax: `DROP TABLE [IF EXISTS] table_name`

Parameters:

- `IF EXISTS`: Optional clause that prevents an error if the table does not exist.
- `table_name`: The name of the table to drop.

Description: The DROP TABLE statement behavior varies depending on the table type.

Regular tables: For tables backed by Kafka topics, which are created by using CREATE TABLE or inferred from existing topics, DROP TABLE:

- Deletes the underlying Kafka topic permanently.
- When using TopicNameStrategy (default), deletes all versions of the associated schemas from Schema Registry.
- When using RecordNameStrategy or TopicRecordNameStrategy, deletes the Kafka topic but preserves schemas in Schema Registry.

External tables: Note that external tables are an Open Preview feature in Confluent Cloud. A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features. Confluent may discontinue providing preview releases of the Preview features at any time in Confluent's sole discretion.

Confluent Cloud for Apache Flink enables vector search with external tables. Use the CREATE TABLE statement to register an external table. For external tables, like vector databases and lookup tables, DROP TABLE:

- Removes the table definition from Flink metadata.
- Does not delete data from the external system.

Examples of external tables include vector search tables and federated search tables.

Permissions: To execute DROP TABLE, you need an RBAC role that enables you to delete the Kafka topics and Schema Registry schema subjects.

Important considerations:

- The DROP TABLE operation is not atomic. If either the Kafka topic deletion or the schema deletion fails, the operation may partially complete.
- Dropping a table permanently deletes the Kafka topic data.
- Running statements that depend on a dropped table transition to DEGRADED status. You should stop dependent statements before dropping a table.
- When using TopicNameStrategy, dropping a table deletes schemas, even if they are used by other topics.

Examples:

- Drop a Kafka-backed table: `DROP TABLE my_table;`
- Drop a table if it exists: `DROP TABLE IF EXISTS my_table;`
- Drop an external table: ``DROP TABLE `<external-table-name>`;``

Related content: ALTER TABLE, CREATE TABLE, Search External Tables.

#### Code Examples

```sql
DROP TABLE [IF EXISTS] table_name
```

```sql
-- Drop a Kafka-backed table.
DROP TABLE my_table;

-- Drop a table if it exists.
DROP TABLE IF EXISTS my_table;

-- Drop an external table.
DROP TABLE `<external-table-name>`;
```
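
Because dropping a Kafka-backed table permanently deletes the topic data, it can help to inspect the table first. The following is one possible sequence, a sketch rather than a prescribed procedure; it reuses the `my_table` name from the examples above and assumes that dependent statements have already been stopped.

```sql
-- List tables in the current database to confirm the name
SHOW TABLES;

-- Inspect the schema one last time before deleting
DESCRIBE my_table;

-- Drop the table; IF EXISTS prevents an error if it was already removed
DROP TABLE IF EXISTS my_table;
```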

---

### SQL DROP VIEW Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/drop-view.html

Confluent Cloud for Apache Flink® enables dropping views using the DROP VIEW statement. When a view is dropped, its definition is removed from the catalog, and the Flink resource that reserved the corresponding Kafka topic name is removed. Any new statement referencing the dropped view will fail.

Syntax: `DROP VIEW [IF EXISTS] [catalog_name.][db_name.]view_name`

Description: DROP VIEW removes a view from the catalog. If the view does not exist, an exception is thrown unless IF EXISTS is specified.

The view name can be in these formats:

- `catalog_name.db_name.view_name`: The view with the given name is dropped from the catalog named "catalog_name" and the database named "db_name".
- `db_name.view_name`: The view with the given name is dropped from the current catalog of the execution table environment and the database named "db_name".
- `view_name`: The view with the given name is dropped from the current catalog and the current database of the execution table environment.

Examples: The following example drops the `vip_customers` view. In the Confluent CLI or in a Cloud Console workspace, run the following command: `DROP VIEW vip_customers;`

Your output should resemble: `Statement phase is COMPLETED.`

If you try to query the dropped view with `SELECT * FROM vip_customers;`, you get an error message indicating that the view does not exist: `[Code: 1, SQL State: 42000]: Object 'default_catalog.default_database.vip_customers' does not exist.`

To avoid the error when dropping a view that may not exist, use the IF EXISTS clause: `DROP VIEW IF EXISTS vip_customers;` This statement does not throw an error if the `vip_customers` view does not exist.

Related content: CREATE VIEW, ALTER VIEW.

#### Code Examples

```sql
DROP VIEW [IF EXISTS] [catalog_name.][db_name.]view_name
```

```sql
DROP VIEW vip_customers;
```

```text
Statement phase is COMPLETED.
```

```sql
SELECT * FROM vip_customers;
```

```text
[Code: 1, SQL State: 42000]: Object 'default_catalog.default_database.vip_customers' does not exist.
```

```sql
DROP VIEW IF EXISTS vip_customers;
```
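
The three name formats listed in the description can be sketched side by side. In this sketch, `my_env` and `my_cluster` are hypothetical catalog and database names that do not come from the source; `IF EXISTS` keeps each statement from failing if the view is already gone.

```sql
-- Hypothetical names: `my_env` (catalog) and `my_cluster` (database).

-- Fully qualified: drops the view from the named catalog and database.
DROP VIEW IF EXISTS `my_env`.`my_cluster`.`vip_customers`;

-- Database-qualified: the catalog is the current catalog.
DROP VIEW IF EXISTS `my_cluster`.`vip_customers`;

-- Unqualified: both the current catalog and the current database are used.
DROP VIEW IF EXISTS `vip_customers`;
```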

---

### SQL EXPLAIN Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/explain.html

EXPLAIN Statement in Confluent Cloud for Apache Flink¶ Confluent Cloud for Apache Flink® enables viewing and analyzing the query plans of Flink SQL statements. Syntax¶ EXPLAIN { <query_statement> | <insert_statement> | <statement_set> | CREATE TABLE ... AS SELECT ... } <statement_set>: STATEMENT SET BEGIN -- one or more INSERT INTO statements { INSERT INTO <select_statement>; }+ END; Description¶ The EXPLAIN statement provides detailed information about how Flink executes a specified query or INSERT statement. EXPLAIN shows: The optimized physical execution plan If the changelog mode is not append-only, details about the changelog mode per operator Upsert keys and primary keys where applicable Table source and sink details This information is valuable for understanding query performance, optimizing complex queries, and debugging unexpected results. Use the EXPLAIN statement in conjunction with the Flink SQL Query Profiler to understand the physical plan of your query. Example queries¶ Basic query analysis¶ This example analyzes a query finding users who clicked but never placed an order: EXPLAIN SELECT c.* FROM `examples`.`marketplace`.`clicks` c LEFT JOIN ( SELECT DISTINCT customer_id FROM `examples`.`marketplace`.`orders` ) o ON c.user_id = o.customer_id WHERE o.customer_id IS NULL; The output shows the physical plan and operator details: == Physical Plan == StreamSink [11] +- StreamCalc [10] +- StreamJoin [9] +- StreamExchange [3] : +- StreamCalc [2] : +- StreamTableSourceScan [1] +- StreamExchange [8] +- StreamGroupAggregate [7] +- StreamExchange [6] +- StreamCalc [5] +- StreamTableSourceScan [4] == Physical Details == [1] StreamTableSourceScan Table: `examples`.`marketplace`.`clicks` Changelog mode: append State size: low [4] StreamTableSourceScan Table: `examples`.`marketplace`.`orders` Changelog mode: append State size: low [7] StreamGroupAggregate Changelog mode: retract Upsert key: (customer_id) State size: medium [8] StreamExchange Changelog mode: retract 
Upsert key: (customer_id) [9] StreamJoin Changelog mode: retract State size: medium [10] StreamCalc Changelog mode: retract [11] StreamSink Table: Foreground Changelog mode: retract State size: low Note that the [11] StreamSink Table: Foreground in the output indicates this is a preview execution plan. For more accurate optimization analysis, it’s recommended to test queries using either the final target table or CREATE TABLE AS statements, which will determine the optimal primary key and changelog mode for your specific use case. Creating tables¶ This example shows creating a new table from a query: EXPLAIN CREATE TABLE clicks_without_orders AS SELECT c.* FROM `examples`.`marketplace`.`clicks` c LEFT JOIN ( SELECT DISTINCT customer_id FROM `examples`.`marketplace`.`orders` ) o ON c.user_id = o.customer_id WHERE o.customer_id IS NULL; The output includes sink information for the new table: == Physical Plan == StreamSink [11] +- StreamCalc [10] +- StreamJoin [9] +- StreamExchange [3] : +- StreamCalc [2] : +- StreamTableSourceScan [1] +- StreamExchange [8] +- StreamGroupAggregate [7] +- StreamExchange [6] +- StreamCalc [5] +- StreamTableSourceScan [4] == Physical Details == [1] StreamTableSourceScan Table: `examples`.`marketplace`.`clicks` Changelog mode: append State size: low [4] StreamTableSourceScan Table: `examples`.`marketplace`.`orders` Changelog mode: append State size: low [7] StreamGroupAggregate Changelog mode: retract Upsert key: (customer_id) State size: medium [8] StreamExchange Changelog mode: retract Upsert key: (customer_id) [9] StreamJoin Changelog mode: retract State size: medium [10] StreamCalc Changelog mode: retract [11] StreamSink Table: `catalog`.`database`.`clicks_without_orders` Changelog mode: retract State size: low Inserting values¶ This example shows inserting static values: EXPLAIN INSERT INTO orders VALUES (1, 1001, '2023-02-24', 50.0), (2, 1002, '2023-02-25', 60.0), (3, 1003, '2023-02-26', 70.0); The output shows a simple insertion 
plan: == Physical Plan == StreamSink [3] +- StreamCalc [2] +- StreamValues [1] == Physical Details == [1] StreamValues Changelog mode: append State size: low [3] StreamSink Table: `catalogs`.`database`.`orders` Changelog mode: append State size: low Multiple operations¶ This example demonstrates operation reuse across multiple inserts: EXPLAIN STATEMENT SET BEGIN INSERT INTO low_orders SELECT * from `orders` where price < 100; INSERT INTO high_orders SELECT * from `orders` where price > 100; END; The output shows table scan reuse: == Physical Plan == StreamSink [3] +- StreamCalc [2] +- StreamTableSourceScan [1] StreamSink [5] +- StreamCalc [4] +- (reused) [1] == Physical Details == [1] StreamTableSourceScan Table: `examples`.`marketplace`.`orders` Changelog mode: append State size: low [3] StreamSink Table: `catalog`.`database`.`low_orders` Changelog mode: append State size: low [5] StreamSink Table: `catalog`.`database`.`high_orders` Changelog mode: append State size: low Window functions¶ This example shows window functions and self-joins: EXPLAIN WITH windowed_customers AS ( SELECT * FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`customers`, DESCRIPTOR($rowtime), INTERVAL '1' MINUTE) ) ) SELECT c1.window_start, c1.city, COUNT(DISTINCT c1.customer_id) as unique_customers, COUNT(c2.customer_id) as total_connections FROM windowed_customers c1 JOIN windowed_customers c2 ON c1.city = c2.city AND c1.customer_id < c2.customer_id AND c1.window_start = c2.window_start GROUP BY c1.window_start, c1.city HAVING COUNT(DISTINCT c1.customer_id) > 5; The output shows the complex processing required for windowed aggregations: == Physical Plan == StreamSink [14] +- StreamCalc [13] +- StreamGroupAggregate [12] +- StreamExchange [11] +- StreamCalc [10] +- StreamJoin [9] +- StreamExchange [8] : +- StreamCalc [7] : +- StreamWindowTableFunction [6] : +- StreamCalc [5] : +- StreamChangelogNormalize [4] : +- StreamExchange [3] : +- StreamCalc [2] : +- StreamTableSourceScan [1] +- 
(reused) [8] == Physical Details == [1] StreamTableSourceScan Table: `examples`.`marketplace`.`customers` Primary key: (customer_id) Changelog mode: upsert Upsert key: (customer_id) State size: low [2] StreamCalc Changelog mode: upsert Upsert key: (customer_id) [3] StreamExchange Changelog mode: upsert Upsert key: (customer_id) [4] StreamChangelogNormalize Changelog mode: retract Upsert key: (customer_id) State size: medium [5] StreamCalc Changelog mode: retract Upsert key: (customer_id) [6] StreamWindowTableFunction Changelog mode: retract State size: low [7] StreamCalc Changelog mode: retract [8] StreamExchange Changelog mode: retract [9] StreamJoin Changelog mode: retract State size: medium [10] StreamCalc Changelog mode: retract [11] StreamExchange Changelog mode: retract [12] StreamGroupAggregate Changelog mode: retract Upsert key: (window_start,city) State size: medium [13] StreamCalc Changelog mode: retract Upsert key: (window_start,city) [14] StreamSink Table: Foreground Changelog mode: retract Upsert key: (window_start,city) State size: low

#### Understanding the output

##### Reading physical plans

The physical plan shows how Flink executes your query. Each operation is numbered and indented to show its position in the execution flow. Indentation indicates data flow, with each operator passing results to its parent.

##### Changelog modes

Changelog modes describe how operators handle data modifications:

- Append: The operator processes only insert operations. New rows are simply added.
- Upsert: The operator handles both inserts and updates. It uses an “upsert key” to identify rows. If a row with a given key exists already, the operator updates it; otherwise, it inserts a new row.
- Retract: The operator handles inserts, updates, and deletes. Updates are typically represented as a retraction (deletion) of the old row followed by an insertion of the new row. Deletes are represented as retractions.

Operators change changelog modes when different update patterns are needed, such as when moving from streaming reads to aggregations.

##### Data movement

The physical details section shows how data moves between operators. Watch for:

- Exchange operators indicating data redistribution
- Changes in upsert keys showing where data must be reshuffled
- Operator reuse marked by “(reused)” references

##### State size

Each operator in the physical plan includes a “State Size” property indicating its memory requirements during execution:

- LOW: Minimal state maintenance, typically efficient memory usage
- MEDIUM: Moderate state requirements, may need attention with high cardinality
- HIGH: Significant state maintenance that requires careful management

When operators show HIGH state size, you should configure a state TTL to prevent unbounded state growth. Without TTL configuration, these operators can accumulate unlimited state over time, potentially leading to resource exhaustion and the statement ending up in a DEGRADED state. For example: `SET 'sql.state-ttl' = '12 hours';`

For MEDIUM state size, consider TTL settings if your data has high cardinality or frequent updates per key.

#### Physical operators

Below is a reference of common operators you may see in EXPLAIN output, along with examples of SQL that typically produces them.

##### Basic operations

- StreamTableSourceScan: Reads data from a source table. The foundation of any query reading from a table. Example: `SELECT * FROM orders;`
- StreamCalc: Performs row-level computations and filtering. Appears when using WHERE clauses or expressions in SELECT. Example: `SELECT amount * 1.1 as amount_with_tax FROM orders WHERE status = 'completed';`
- StreamValues: Generates literal row values. Commonly seen with INSERT statements. Example: `INSERT INTO orders VALUES (1, 'pending', 100);`
- StreamSink: Writes results to a destination. Present in any INSERT or when displaying query results. Example: `INSERT INTO order_summaries SELECT status, COUNT(*) FROM orders GROUP BY status;` It supports two modes of operation:
  - Append-only: Each record is treated as a new event, which displays as State size: Low.
  - Upsert-materialize: Maintains state to handle updates and deletes based on key fields, which displays as State size: High.

##### Aggregation operations

- StreamGroupAggregate: Performs grouping and aggregation. Created by GROUP BY clauses. Example: `SELECT customer_id, SUM(price) FROM orders GROUP BY customer_id;`
- StreamLocalWindowAggregate and StreamGlobalWindowAggregate: These operators implement Flink's two-phase aggregation strategy for distributed stream processing. They work together to compute aggregations efficiently across multiple parallel instances while maintaining exactly-once processing semantics. The LocalGroupAggregate performs initial aggregation within each parallel task, maintaining partial results in its state. The GlobalGroupAggregate then combines these partial results to produce final aggregations. This two-phase approach appears in both regular GROUP BY operations and windowed aggregations. For window operations, these operators appear as StreamLocalWindowAggregate and StreamGlobalWindowAggregate. Here’s an example that triggers their use: `SELECT window_start, window_end, SUM(price) as total_price FROM TABLE( TUMBLE(TABLE orders, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) GROUP BY window_start, window_end;`

##### Join operations

- StreamJoin: Performs standard stream-to-stream joins. Example: `SELECT o.*, c.name FROM orders o JOIN customers c ON o.customer_id = c.id;`
- StreamTemporalJoin: Joins streams using temporal (time-versioned) semantics. Example: ``SELECT orders.*, customers.* FROM orders LEFT JOIN customers FOR SYSTEM_TIME AS OF orders.`$rowtime` ON orders.customer_id = customers.customer_id;``
- StreamIntervalJoin: Joins streams within a time interval. Example: ``SELECT * FROM orders o, clicks c WHERE o.customer_id = c.user_id AND o.`$rowtime` BETWEEN c.`$rowtime` - INTERVAL '1' MINUTE AND c.`$rowtime`;``
- StreamWindowJoin: Joins streams within defined windows. Example: `SELECT * FROM ( SELECT * FROM TABLE(TUMBLE(TABLE clicks, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES)) ) c JOIN ( SELECT * FROM TABLE(TUMBLE(TABLE orders, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES)) ) o ON c.user_id = o.customer_id AND c.window_start = o.window_start AND c.window_end = o.window_end;`

##### Ordering and ranking

- StreamRank: Computes the smallest or largest values (Top-N queries). Example: `SELECT product_id, price FROM ( SELECT *, ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY price DESC) AS row_num FROM orders) WHERE row_num <= 5;`
- StreamLimit: Limits the number of returned rows. Example: `SELECT * FROM orders LIMIT 10;`
- StreamSortLimit: Combines sorting with row limiting. Example: `SELECT * FROM orders ORDER BY $rowtime LIMIT 10;`
- StreamWindowRank: Computes the smallest or largest values within window boundaries (Window Top-N queries). Example: ``SELECT * FROM ( SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY price DESC) as rownum FROM ( SELECT window_start, window_end, customer_id, SUM(price) as price, COUNT(*) as cnt FROM TABLE( TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) GROUP BY window_start, window_end, customer_id ) ) WHERE rownum <= 3;``

##### Data movement and distribution

- StreamExchange: Redistributes data between parallel instances. For example, when you write a query with a GROUP BY clause, Flink might use a HASH exchange to ensure all records with the same key are processed by the same task. It appears in plans with GROUP BY on a different key than the source distribution. Example: `SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id;`
- StreamUnion: Combines results from multiple queries. Example: `SELECT * FROM european_orders UNION ALL SELECT * FROM american_orders;`
- StreamExpand: Generates multiple rows from a single row for CUBE, ROLLUP, and GROUPING SETS. Example: `SELECT department, brand, COUNT(*) as product_count, COUNT(DISTINCT vendor) as vendor_count FROM products GROUP BY CUBE(department, brand) HAVING COUNT(*) > 1;`

##### Specialized operations

- StreamChangelogNormalize: Converts upsert-based changelog streams (based on primary key) into retract-based streams (with explicit +/- records) to support correct aggregation results in streaming queries. It appears when processing versioned data, like a table that uses upsert semantics. Example: `SELECT COUNT(*) as cnt FROM products;`
- StreamAsyncCalc: Executes user-defined functions (UDFs). This operator allows for non-blocking execution of UDFs. Example: `SELECT my_udf(name) FROM customers;`
- StreamWindowTableFunction: Applies windowing operations as table functions. Example: `SELECT * FROM TABLE( TUMBLE(TABLE orders, DESCRIPTOR($rowtime), INTERVAL '1' HOUR) );`
- StreamCorrelate: Handles correlated subqueries (UNNEST) and table function calls. Example: `EXPLAIN SELECT product_id, product_name, tag FROM ( VALUES (1, 'Laptop', ARRAY['electronics', 'computers']), (2, 'Phone', ARRAY['electronics', 'mobile']) ) AS products (product_id, product_name, tags) CROSS JOIN UNNEST(tags) AS t (tag);`
- StreamMatch: Executes pattern-matching operations using MATCH_RECOGNIZE. Example: `SELECT * FROM orders MATCH_RECOGNIZE ( PARTITION BY customer_id ORDER BY $rowtime MEASURES COUNT(*) as order_count PATTERN (A B+) DEFINE A as price > 100, B as price <= 100 );`

#### Optimizing query performance

##### Minimizing data movement

Data shuffling impacts performance. When examining EXPLAIN output:

- Look for exchange operators and upsert key changes.
- Consider keeping compatible partitioning keys through your query.
- Watch for opportunities to reduce data redistribution.

Pay special attention to data skew when designing your queries. If a particular key value appears much more frequently than others, it can lead to uneven processing where a single parallel instance becomes overwhelmed handling that key’s data. Consider strategies like adding additional dimensions to your keys or pre-aggregating hot keys to distribute the workload more evenly.

##### Using operator reuse

Flink automatically reuses operators when possible. In EXPLAIN output:

- Look for “(reused)” references showing optimization.
- Consider restructuring queries to enable more reuse.
- Verify that similar operations share scan results.

##### Optimizing sink configuration

When working with sinks in upsert mode, it’s crucial to align your primary and upsert keys for optimal performance:

- Whenever possible, configure the primary key to be identical to the upsert key.
- Having different primary and upsert keys in upsert mode can lead to significant performance degradation.
- If you must use different keys, carefully evaluate the performance impact and consider restructuring your query to align these keys.

#### Related content

- Flink SQL Query Profiler
- Profile a Query
- SELECT
- INSERT VALUES
- INSERT INTO FROM SELECT
- CREATE TABLE AS

#### Code Examples

```sql
EXPLAIN { <query_statement> | <insert_statement> | <statement_set> | CREATE TABLE ... AS SELECT ... }

<statement_set>:
STATEMENT SET
BEGIN
  -- one or more INSERT INTO statements
  { INSERT INTO <select_statement>; }+
END;
```

```sql
EXPLAIN
SELECT c.*
FROM `examples`.`marketplace`.`clicks` c
LEFT JOIN (
  SELECT DISTINCT customer_id
  FROM `examples`.`marketplace`.`orders`
) o ON c.user_id = o.customer_id
WHERE o.customer_id IS NULL;
```

```text
== Physical Plan ==

StreamSink [11]
  +- StreamCalc [10]
    +- StreamJoin [9]
      +- StreamExchange [3]
      :  +- StreamCalc [2]
      :    +- StreamTableSourceScan [1]
      +- StreamExchange [8]
        +- StreamGroupAggregate [7]
          +- StreamExchange [6]
            +- StreamCalc [5]
              +- StreamTableSourceScan [4]

== Physical Details ==

[1] StreamTableSourceScan
Table: `examples`.`marketplace`.`clicks`
Changelog mode: append
State size: low

[4] StreamTableSourceScan
Table: `examples`.`marketplace`.`orders`
Changelog mode: append
State size: low

[7] StreamGroupAggregate
Changelog mode: retract
Upsert key: (customer_id)
State size: medium

[8] StreamExchange
Changelog mode: retract
Upsert key: (customer_id)

[9] StreamJoin
Changelog mode: retract
State size: medium

[10] StreamCalc
Changelog mode: retract

[11] StreamSink
Table: Foreground
Changelog mode: retract
State size: low
```

```sql
EXPLAIN
CREATE TABLE clicks_without_orders AS
SELECT c.*
FROM `examples`.`marketplace`.`clicks` c
LEFT JOIN (
  SELECT DISTINCT customer_id
  FROM `examples`.`marketplace`.`orders`
) o ON c.user_id = o.customer_id
WHERE o.customer_id IS NULL;
```

```text
== Physical Plan ==

StreamSink [11]
  +- StreamCalc [10]
    +- StreamJoin [9]
      +- StreamExchange [3]
      :  +- StreamCalc [2]
      :    +- StreamTableSourceScan [1]
      +- StreamExchange [8]
        +- StreamGroupAggregate [7]
          +- StreamExchange [6]
            +- StreamCalc [5]
              +- StreamTableSourceScan [4]

== Physical Details ==

[1] StreamTableSourceScan
Table: `examples`.`marketplace`.`clicks`
Changelog mode: append
State size: low

[4] StreamTableSourceScan
Table: `examples`.`marketplace`.`orders`
Changelog mode: append
State size: low

[7] StreamGroupAggregate
Changelog mode: retract
Upsert key: (customer_id)
State size: medium

[8] StreamExchange
Changelog mode: retract
Upsert key: (customer_id)

[9] StreamJoin
Changelog mode: retract
State size: medium

[10] StreamCalc
Changelog mode: retract

[11] StreamSink
Table: `catalog`.`database`.`clicks_without_orders`
Changelog mode: retract
State size: low
```

```sql
EXPLAIN
INSERT INTO orders VALUES
  (1, 1001, '2023-02-24', 50.0),
  (2, 1002, '2023-02-25', 60.0),
  (3, 1003, '2023-02-26', 70.0);
```

```text
== Physical Plan ==

StreamSink [3]
  +- StreamCalc [2]
    +- StreamValues [1]

== Physical Details ==

[1] StreamValues
Changelog mode: append
State size: low

[3] StreamSink
Table: `catalogs`.`database`.`orders`
Changelog mode: append
State size: low
```

```sql
EXPLAIN STATEMENT SET
BEGIN
  INSERT INTO low_orders SELECT * from `orders` where price < 100;
  INSERT INTO high_orders SELECT * from `orders` where price > 100;
END;
```

```text
== Physical Plan ==

StreamSink [3]
  +- StreamCalc [2]
    +- StreamTableSourceScan [1]

StreamSink [5]
  +- StreamCalc [4]
    +- (reused) [1]

== Physical Details ==

[1] StreamTableSourceScan
Table: `examples`.`marketplace`.`orders`
Changelog mode: append
State size: low

[3] StreamSink
Table: `catalog`.`database`.`low_orders`
Changelog mode: append
State size: low

[5] StreamSink
Table: `catalog`.`database`.`high_orders`
Changelog mode: append
State size: low
```

```sql
EXPLAIN
WITH windowed_customers AS (
  SELECT * FROM TABLE(
    TUMBLE(TABLE `examples`.`marketplace`.`customers`, DESCRIPTOR($rowtime), INTERVAL '1' MINUTE)
  )
)
SELECT
    c1.window_start,
    c1.city,
    COUNT(DISTINCT c1.customer_id) as unique_customers,
    COUNT(c2.customer_id) as total_connections
FROM
    windowed_customers c1
    JOIN windowed_customers c2
    ON c1.city = c2.city
    AND c1.customer_id < c2.customer_id
    AND c1.window_start = c2.window_start
GROUP BY
    c1.window_start,
    c1.city
HAVING
    COUNT(DISTINCT c1.customer_id) > 5;
```

```text
== Physical Plan ==

StreamSink [14]
  +- StreamCalc [13]
    +- StreamGroupAggregate [12]
      +- StreamExchange [11]
        +- StreamCalc [10]
          +- StreamJoin [9]
            +- StreamExchange [8]
            :  +- StreamCalc [7]
            :    +- StreamWindowTableFunction [6]
            :      +- StreamCalc [5]
            :        +- StreamChangelogNormalize [4]
            :          +- StreamExchange [3]
            :            +- StreamCalc [2]
            :              +- StreamTableSourceScan [1]
            +- (reused) [8]

== Physical Details ==

[1] StreamTableSourceScan
Table: `examples`.`marketplace`.`customers`
Primary key: (customer_id)
Changelog mode: upsert
Upsert key: (customer_id)
State size: low

[2] StreamCalc
Changelog mode: upsert
Upsert key: (customer_id)

[3] StreamExchange
Changelog mode: upsert
Upsert key: (customer_id)

[4] StreamChangelogNormalize
Changelog mode: retract
Upsert key: (customer_id)
State size: medium

[5] StreamCalc
Changelog mode: retract
Upsert key: (customer_id)

[6] StreamWindowTableFunction
Changelog mode: retract
State size: low

[7] StreamCalc
Changelog mode: retract

[8] StreamExchange
Changelog mode: retract

[9] StreamJoin
Changelog mode: retract
State size: medium

[10] StreamCalc
Changelog mode: retract

[11] StreamExchange
Changelog mode: retract

[12] StreamGroupAggregate
Changelog mode: retract
Upsert key: (window_start,city)
State size: medium

[13] StreamCalc
Changelog mode: retract
Upsert key: (window_start,city)

[14] StreamSink
Table: Foreground
Changelog mode: retract
Upsert key: (window_start,city)
State size: low
```

```sql
SET 'sql.state-ttl' = '12 hours';
```
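
As a rough sketch of the workflow described under "State size", one might first inspect the plan and only then set a TTL before running the stateful query. The table and column names are reused from the examples above; the TTL value is illustrative, not a recommendation, and should reflect how long a key stays relevant in your data.

```sql
-- Step 1: inspect the plan and check "State size" on the stateful
-- operators (for example, StreamGroupAggregate).
EXPLAIN
SELECT customer_id, SUM(price) AS total_spent
FROM `examples`.`marketplace`.`orders`
GROUP BY customer_id;

-- Step 2: if the reported state size is a concern, bound the state
-- before submitting the query. '12 hours' is an illustrative value.
SET 'sql.state-ttl' = '12 hours';

SELECT customer_id, SUM(price) AS total_spent
FROM `examples`.`marketplace`.`orders`
GROUP BY customer_id;
```

With a TTL in place, state for a key that has been idle longer than the TTL can be discarded, so a later event for that key may start a fresh aggregate. Results can therefore differ from a run with unbounded state.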

```sql
SELECT * FROM orders;
```

```sql
SELECT amount * 1.1 as amount_with_tax
FROM orders
WHERE status = 'completed';
```

```sql
INSERT INTO orders VALUES (1, 'pending', 100);
```

```sql
INSERT INTO order_summaries
SELECT status, COUNT(*)
FROM orders
GROUP BY status;
```

```sql
SELECT customer_id, SUM(price)
FROM orders
GROUP BY customer_id;
```

```sql
SELECT window_start, window_end, SUM(price) as total_price
   FROM TABLE(
       TUMBLE(TABLE orders, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))
   GROUP BY window_start, window_end;
```

```sql
SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id;
```

```sql
SELECT
     orders.*,
     customers.*
FROM orders
LEFT JOIN customers FOR SYSTEM_TIME AS OF orders.`$rowtime`
ON orders.customer_id = customers.customer_id;
```

```sql
SELECT *
FROM orders o, clicks c
WHERE o.customer_id = c.user_id
AND o.`$rowtime` BETWEEN c.`$rowtime` - INTERVAL '1' MINUTE AND c.`$rowtime`;
```

```sql
SELECT *
FROM (
    SELECT * FROM TABLE(TUMBLE(TABLE clicks, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
) c
JOIN (
    SELECT * FROM TABLE(TUMBLE(TABLE orders, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
) o
ON c.user_id = o.customer_id
    AND c.window_start = o.window_start
    AND c.window_end = o.window_end;
```

```sql
SELECT product_id, price
FROM (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY price DESC) AS row_num
  FROM orders)
WHERE row_num <= 5;
```

```sql
SELECT * FROM orders LIMIT 10;
```

```sql
SELECT * FROM orders ORDER BY $rowtime LIMIT 10;
```

```sql
SELECT *
  FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY price DESC) as rownum
    FROM (
      SELECT window_start, window_end, customer_id, SUM(price) as price, COUNT(*) as cnt
      FROM TABLE(
        TUMBLE(TABLE `examples`.`marketplace`.`orders`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))
      GROUP BY window_start, window_end, customer_id
    )
  ) WHERE rownum <= 3;
```

```sql
-- Appears in plans with GROUP BY on a different key than the source distribution
SELECT customer_id, COUNT(*)
   FROM orders
   GROUP BY customer_id;
```
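
The data-skew advice under "Minimizing data movement" (adding dimensions to keys, pre-aggregating hot keys) can be sketched as a two-stage aggregation. This is one possible shape, not a prescribed pattern: it assumes that `orders` has a `product_id` column, as in the Top-N example above, and that orders for a busy customer are spread across many products.

```sql
-- Stage 1 (inner query): pre-aggregate on a finer key, so that a hot
-- customer_id is split across several (customer_id, product_id) groups.
-- Stage 2 (outer query): combine the partial counts per customer.
SELECT customer_id, SUM(partial_count) AS order_count
FROM (
  SELECT customer_id, product_id, COUNT(*) AS partial_count
  FROM orders
  GROUP BY customer_id, product_id
)
GROUP BY customer_id;
```

The result matches a plain `COUNT(*) ... GROUP BY customer_id`, but the extra aggregation adds state and another exchange. Running EXPLAIN on both variants is a reasonable way to judge whether the trade-off is worthwhile.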

```sql
SELECT * FROM european_orders
UNION ALL
SELECT * FROM american_orders;
```

```sql
SELECT
    department,
    brand,
    COUNT(*) as product_count,
    COUNT(DISTINCT vendor) as vendor_count
FROM products
GROUP BY CUBE(department, brand)
HAVING COUNT(*) > 1;
```

```sql
-- Appears when processing versioned data, like a table that uses upsert semantics
SELECT COUNT(*) as cnt
FROM products;
```
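
To observe the upsert-to-retract conversion described under "Changelog modes", one could run EXPLAIN on an aggregation over a table that has a primary key. The sketch below uses the `examples`.`marketplace`.`customers` table, which the window example above reports as upsert with primary key `customer_id`; the exact plan you get may differ.

```sql
-- Aggregating an upsert source: check the plan for a
-- StreamChangelogNormalize operator between the source scan and the
-- StreamGroupAggregate, and compare the "Changelog mode" before and after it.
EXPLAIN
SELECT city, COUNT(*) AS customer_count
FROM `examples`.`marketplace`.`customers`
GROUP BY city;
```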

```sql
SELECT
    my_udf(name)
FROM customers;
```

```sql
SELECT * FROM TABLE(
     TUMBLE(TABLE orders, DESCRIPTOR($rowtime), INTERVAL '1' HOUR)
   );
```

```sql
EXPLAIN
SELECT
    product_id,
    product_name,
    tag
FROM (
    VALUES
        (1, 'Laptop', ARRAY['electronics', 'computers']),
        (2, 'Phone', ARRAY['electronics', 'mobile'])
) AS products (product_id, product_name, tags)
CROSS JOIN UNNEST(tags) AS t (tag);
```

```sql
SELECT *
   FROM orders
   MATCH_RECOGNIZE (
     PARTITION BY customer_id
     ORDER BY $rowtime
     MEASURES
       COUNT(*) as order_count
     PATTERN (A B+)
     DEFINE
       A as price > 100,
       B as price <= 100
   );
```
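
The guidance under "Optimizing sink configuration" can be illustrated by declaring a sink whose primary key is the same column the query groups by. This is a minimal sketch under stated assumptions: `orders_per_customer` is a hypothetical table, and the column types (`INT`, `DOUBLE`) are assumed to match the source table.

```sql
-- Hypothetical sink table. Its primary key (customer_id) is chosen to
-- match the GROUP BY key of the query that writes to it.
CREATE TABLE orders_per_customer (
  customer_id INT NOT NULL,
  total_price DOUBLE,
  PRIMARY KEY (customer_id) NOT ENFORCED
) WITH (
  'changelog.mode' = 'upsert'
);

-- Check the StreamSink entry in the output: the goal is for its
-- "Primary key" and "Upsert key" to name the same column(s).
EXPLAIN
INSERT INTO orders_per_customer
SELECT customer_id, SUM(price) AS total_price
FROM `examples`.`marketplace`.`orders`
GROUP BY customer_id;
```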

---

### SQL HINTS in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/hints.html

Confluent Cloud for Apache Flink® supports dynamic table options, or SQL hints, which enable you to specify or override table options dynamically.

#### Syntax

To use dynamic table options, employ the following Oracle-style SQL hint syntax, in which `key` and `val` are both string literals: `table_path /*+ OPTIONS(key=val [, key=val]*) */`

The dynamic options must be placed next to the table and not by any aliases, for example: `SELECT * FROM t /*+ OPTIONS(...) */ AS alias;`

#### Description

Dynamic Table Options in Confluent Cloud for Apache Flink offer the following benefits:

- Flexible configuration: Specify table options on a per-statement basis, providing more flexibility than static options as stored in the table definition.
- Query-specific adjustments: Customize table behavior for individual queries without altering the permanent table definition.

#### Examples

Here are some examples of using dynamic table options in Confluent Cloud for Apache Flink:

- Override scan startup mode for a table: `SELECT id, name FROM table /*+ OPTIONS('scan.startup.mode'='earliest-offset') */;`
- Set options for multiple tables in a join: `SELECT * FROM table1 /*+ OPTIONS('scan.startup.mode'='earliest-offset') */ t1 JOIN table2 /*+ OPTIONS('scan.startup.mode'='earliest-offset') */ t2 ON t1.id = t2.id;`
- Set the scan startup mode to use the latest offset: `SELECT * FROM orders /*+ OPTIONS('scan.startup.mode'='latest-offset') */;`
- Set the scan startup mode to use specific offsets, for example, using the latest_offsets attribute from a previous statement: `INSERT INTO customers_sink (customer_id, name, address, postcode, city, email) SELECT customer_id, name, address, postcode, city, email FROM customers_source /*+ OPTIONS( 'scan.startup.mode' = 'specific-offsets', 'scan.startup.specific-offsets' = 'partition:0,offset:10;partition:1,offset:123' ) */;`
- For a statement with multiple topics, use OPTIONS for each table: `SELECT * FROM table1 /*+ OPTIONS('scan.startup.mode'='specific-offsets', 'scan.startup.specific-offsets' = '...') */ t1 JOIN table2 /*+ OPTIONS('scan.startup.mode'='specific-offsets', 'scan.startup.specific-offsets' = '...') */ t2 ON t1.id = t2.id;`

#### State TTL Hints

For stateful computations such as Regular Joins and Group Aggregations, Confluent Cloud for Apache Flink supports the STATE_TTL hint. This hint allows you to specify operator-level Idle State Retention Time, enabling these operators to have a different TTL from the pipeline-level configuration set by sql.state-ttl.

##### Syntax

The syntax for using State TTL hints is as follows, where `ttl_value` is a string literal such as '6h', '2d', or '10800s': `table_path /*+ STATE_TTL('table_name_or_alias'='ttl_value') */`

##### Examples

Here are some examples of using State TTL hints in Confluent Cloud for Apache Flink for social media analytics:

- Set State TTL for a Regular Join of posts and users: `SELECT /*+ STATE_TTL('posts'='6h', 'users'='2d') */ * FROM posts JOIN users ON posts.user_id = users.id;`
- Use table aliases with State TTL hints for analyzing engagement: `SELECT /*+ STATE_TTL('p'='4h', 'e'='12h') */ * FROM posts p JOIN engagement e ON p.post_id = e.post_id;`
- Apply State TTL hints in a Group Aggregation for trending hashtags: `SELECT /*+ STATE_TTL('hashtags' = '1h') */ hashtag, COUNT(*) AS usage_count FROM hashtags GROUP BY hashtag;`

##### Important Considerations

When using State TTL hints, keep the following in mind:

- You can use either the table name or table alias as the hint key. If you specify an alias for a table, you must use that alias in the STATE_TTL hint.
- For queries with multiple joins, the specified TTLs are applied in a bottom-up order.
- The STATE_TTL hint only affects the query block where it’s applied.
- If a hint key is duplicated within a single STATE_TTL hint, the last occurrence takes precedence.
- When multiple STATE_TTL hints are used with the same hint key, the first occurrence is applied.

#### Related content

- CREATE TABLE
- ALTER TABLE
- Table Options

#### Code Examples

```sql
table_path /*+ OPTIONS(key=val [, key=val]*) */

key:
    stringLiteral
val:
    stringLiteral
```

```sql
SELECT * FROM t /*+ OPTIONS(...) */ AS alias;
```

```sql
SELECT id, name
FROM table /*+ OPTIONS('scan.startup.mode'='earliest-offset') */;
```

```sql
SELECT *
FROM table1 /*+ OPTIONS('scan.startup.mode'='earliest-offset') */ t1
JOIN table2 /*+ OPTIONS('scan.startup.mode'='earliest-offset') */ t2
ON t1.id = t2.id;
```

```sql
SELECT *
FROM orders /*+ OPTIONS('scan.startup.mode'='latest-offset') */;
```

```sql
INSERT INTO customers_sink (customer_id, name, address, postcode, city, email)
    SELECT customer_id, name, address, postcode, city, email
    FROM customers_source
    /*+ OPTIONS(
        'scan.startup.mode' = 'specific-offsets',
        'scan.startup.specific-offsets'  = 'partition:0,offset:10;partition:1,offset:123'
    ) */;

-- Note: for a statement with multiple topics, use OPTIONS for each table
SELECT *
FROM table1 /*+ OPTIONS('scan.startup.mode'='specific-offsets', 'scan.startup.specific-offsets' = '...') */ t1
JOIN table2 /*+ OPTIONS('scan.startup.mode'='specific-offsets', 'scan.startup.specific-offsets' = '...') */ t2
ON t1.id = t2.id;
```
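
As a further illustration of per-table OPTIONS hints, the following sketch starts reading from a point in time instead of explicit offsets. It assumes that `timestamp` is an accepted value for `scan.startup.mode` and that `scan.startup.timestamp-millis` can be set through a hint; the epoch value is a placeholder, not a value from the source.

```sql
-- Sketch: start reading orders from a wall-clock position.
-- Assumption: 'timestamp' is an accepted scan.startup.mode value.
-- 1704067200000 is a placeholder (2024-01-01T00:00:00Z in epoch milliseconds).
SELECT *
FROM orders /*+ OPTIONS(
    'scan.startup.mode' = 'timestamp',
    'scan.startup.timestamp-millis' = '1704067200000'
) */;
```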

```sql
table_path /*+ STATE_TTL('table_name_or_alias'='ttl_value') */

ttl_value:
    stringLiteral (e.g., '6h', '2d', '10800s')
```

```sql
SELECT /*+ STATE_TTL('posts'='6h', 'users'='2d') */ *
FROM posts
JOIN users ON posts.user_id = users.id;
```

```sql
SELECT /*+ STATE_TTL('p'='4h', 'e'='12h') */ *
FROM posts p
JOIN engagement e ON p.post_id = e.post_id;
```

```sql
SELECT /*+ STATE_TTL('hashtags' = '1h') */
       hashtag, COUNT(*) AS usage_count
FROM hashtags
GROUP BY hashtag;
```
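
The precedence rules listed under Important Considerations can be sketched as follows. This is an illustration based on how the STATE_TTL hint behaves in Apache Flink, reusing the posts and users tables from the examples above. The TTL values are arbitrary, and the comments describe the expected outcome rather than verified output.

```sql
-- Duplicate key inside one hint: the last value is expected to win,
-- so state for posts would be retained for 2d.
SELECT /*+ STATE_TTL('posts'='1d', 'posts'='2d') */ *
FROM posts
JOIN users ON posts.user_id = users.id;

-- Same key in two separate hints: the first hint is expected to win,
-- so state for users would be retained for 2d, not 12h.
SELECT /*+ STATE_TTL('posts'='1d', 'users'='2d'), STATE_TTL('users'='12h') */ *
FROM posts
JOIN users ON posts.user_id = users.id;
```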

---

### SQL Statements in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/overview.html

DDL Statements in Confluent Cloud for Apache Flink

In Confluent Cloud for Apache Flink®, a statement is a high-level resource that’s created when you enter a SQL query. Data Definition Language (DDL) statements are imperative verbs that define metadata in Flink SQL by adding, changing, or deleting tables. Unlike Data Manipulation Language (DML) statements, DDL statements modify only metadata and don’t change data. When you want to change data, use DML statements. For valid lexical structure of statements, see Flink SQL Syntax in Confluent Cloud for Apache Flink.

Available DDL statements

These are the available DDL statements in Confluent Cloud for Flink SQL.

- ALTER
  - ALTER TABLE: Change properties of an existing table.
  - ALTER MODEL: Rename an AI model or change model options.
- CREATE
  - CREATE TABLE: Register a table into the current or specified catalog (Confluent Cloud environment).
  - CREATE FUNCTION: Register a user-defined function (UDF) in the current database (Apache Kafka® cluster).
  - CREATE MODEL: Create a new AI model.
- DESCRIBE
  - DESCRIBE: Show properties of a table, AI model, or UDF.
- DROP
  - DROP MODEL: Remove an AI model.
  - DROP TABLE: Remove a table.
  - DROP VIEW: Remove a view from a catalog.
- EXPLAIN
  - EXPLAIN: View the query plan of a Flink SQL statement.
- RESET
  - RESET: Reset the Flink SQL shell configuration to default settings.
- SET
  - SET: Modify or list the Flink SQL shell configuration.
- SHOW
  - SHOW CATALOGS: List all catalogs.
  - SHOW CREATE MODEL: Show details about an AI inference model.
  - SHOW CREATE TABLE: Show details about a table.
  - SHOW CURRENT CATALOG: Show the current catalog.
  - SHOW CURRENT DATABASE: Show the current database.
  - SHOW DATABASES: List all databases in the current catalog.
  - SHOW FUNCTIONS: List all functions in the current catalog and database.
  - SHOW JOBS: List the status of all statements in the current catalog.
  - SHOW MODELS: List all AI models that are registered in the current catalog.
  - SHOW TABLES: List all tables for the current database.
- USE
  - USE CATALOG: Set the current catalog.
  - USE [database_name]: Set the current database.

Related content: Flink SQL Syntax, Flink SQL Queries, Stream Processing Concepts, Built-in Functions.
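
To show how several of these statements fit together, here is a minimal sketch of an exploratory session. The catalog, database, and table names (`my_environment`, `cluster_0`, `flights`) are hypothetical placeholders borrowed from the SHOW examples in this documentation; substitute your own.

```sql
-- Select the environment (catalog) and Kafka cluster (database) to work in.
USE CATALOG my_environment;
USE cluster_0;

-- Inspect what is registered there.
SHOW TABLES;
DESCRIBE flights;
SHOW CREATE TABLE flights;
```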

---

### SQL RESET Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/reset.html

RESET Statement in Confluent Cloud for Apache Flink

Confluent Cloud for Apache Flink® enables resetting Flink SQL shell properties to default values.

Syntax

RESET 'key';

Description

Reset the Flink SQL shell configuration to the default settings. If no key is specified, all properties are set to their default values. To assign a session property, use the SET Statement in Confluent Cloud for Apache Flink.

Example

The following examples show how to run a RESET statement in the Flink SQL shell.

RESET 'table.local-time-zone';

Your output should resemble:

configuration key "table.local-time-zone" has been reset successfully.

| Key | Value |
| --- | --- |
| client.service-account | <unset> (default) |
| sql.local-time-zone | GMT+02:00 (default) |

RESET;

configuration has been reset successfully.

| Key | Value |
| --- | --- |
| client.service-account | <unset> (default) |
| sql.local-time-zone | GMT+02:00 (default) |

Related content: SET Statement in Confluent Cloud for Apache Flink.

#### Code Examples

```sql
RESET 'key';
```

```sql
RESET 'table.local-time-zone';
```

```sql
configuration key "table.local-time-zone" has been reset successfully.
+------------------------+---------------------+
|          Key           |        Value        |
+------------------------+---------------------+
| client.service-account | <unset> (default)   |
| sql.local-time-zone    | GMT+02:00 (default) |
+------------------------+---------------------+
```

```sql
configuration has been reset successfully.
+------------------------+---------------------+
|          Key           |        Value        |
+------------------------+---------------------+
| client.service-account | <unset> (default)   |
| sql.local-time-zone    | GMT+02:00 (default) |
+------------------------+---------------------+
```
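
The relationship between SET and RESET can be sketched as a short round trip. This is a minimal illustration that assumes the Flink SQL shell and uses the `sql.local-time-zone` key from the SET options table; the time zone value is arbitrary.

```sql
-- Override the session time zone, then restore its default.
SET 'sql.local-time-zone' = 'America/Los_Angeles';
RESET 'sql.local-time-zone';

-- Restore every session property to its default value.
RESET;
```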

---

### SQL SET Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/set.html

SET Statement in Confluent Cloud for Apache Flink

Confluent Cloud for Apache Flink® enables setting Flink SQL shell properties to different values.

Syntax

SET 'key' = 'value';

Description

Modify or list the Flink SQL shell configuration. If no key and value are specified, SET prints all of the properties that you have assigned for the session. To reset a session property to its default value, use the RESET Statement in Confluent Cloud for Apache Flink.

Note: In a Cloud Console workspace, the SET statement can’t be run separately and must be submitted along with another Flink SQL statement, like SELECT, CREATE, or INSERT, for example:

SET 'sql.current-catalog' = 'default'; SET 'sql.current-database' = 'cluster_0'; SELECT * FROM pageviews;

Example

The following examples show how to run a SET statement in the Flink SQL shell.

SET 'table.local-time-zone' = 'America/Los_Angeles';

Your output should resemble:

Statement successfully submitted. Statement phase is COMPLETED. configuration updated successfully.

To list the current session settings, run the SET command with no parameters.

SET;

Your output should resemble:

Statement successfully submitted. Statement phase is COMPLETED.

| Key | Value |
| --- | --- |
| catalog | default (default) |
| default_database | <your_cluster> (default) |
| table.local-time-zone | America/Los_Angeles |

The SET; operation is not supported in Cloud Console workspaces.

Available SET Options

These are the configuration options that you can set by using the SET statement in Confluent Cloud for Apache Flink. For a comparison of option names with corresponding options in Apache Flink, see Configuration options.

Table options

| Key | Default | Type | Description |
| --- | --- | --- | --- |
| sql.current-catalog | (None) | String | Defines the current catalog. Semantically equivalent to USE CATALOG [catalog_name]. Required if object identifiers are not fully qualified. |
| sql.current-database | (None) | String | Defines the current database. Semantically equivalent to USE [database_id]. Required if object identifiers are not fully qualified. |
| sql.dry-run | false | Boolean | If true, the statement is parsed and validated but not executed. |
| sql.inline-result | false | Boolean | If true, query results are returned inline. |
| sql.local-time-zone | “UTC” | String | Specifies the local time zone offset for TIMESTAMP_LTZ conversions. When converting to data types that don’t include a time zone (for example, TIMESTAMP, TIME, or simply STRING), this time zone is used. The input for this option is either a Time Zone Database (TZDB) ID, like “America/Los_Angeles”, or fixed offset, like “GMT+03:00”. |
| sql.snapshot.mode | “off” | String | Specifies the mode for snapshot queries. Valid values are “now” and “off”. If not specified, the default value is “off”. For more information, see Snapshot Queries in Confluent Cloud for Apache Flink. |
| sql.state-ttl | 0 ms | Duration | Specifies a minimum time interval for how long idle state, which is state that hasn’t been updated, is retained. The system decides on actual clearance after this interval. If set to the default value of 0, no clearance is performed. |
| sql.tables.initial-offset-from | (None) | String | Specifies the name of a reference statement from which to carry over topic offsets when creating a new statement. Applies only when replacing an existing statement in the same organization, environment, and region. For details, see Carry Over Offsets. |
| sql.tables.scan.bounded.timestamp-millis | (None) | Long | Overwrites scan.bounded.timestamp-millis for Confluent-native tables used in newly created queries. This option is not applied if the table uses a value that differs from the default value. |
| sql.tables.scan.bounded.mode | (None) | GlobalScanBoundedMode | Overwrites scan.bounded.mode for Confluent-native tables used in newly created queries. This option is not applied if the table uses a value that differs from the default value. |
| sql.tables.scan.idle-timeout | (None) | Duration | Specifies the timeout interval for progressive idleness detection. Setting this value to 0 disables idleness detection. For more information, see Progressive idleness detection. |
| sql.tables.scan.watermark-alignment.max-allowed-drift | 5 min | Duration | Specifies the maximum allowed drift for watermark alignment across different splits or partitions to ensure even processing. Setting to 0 disables watermark alignment, which can prevent performance bottlenecks and latency for queries that don’t require event-time semantics, like regular joins, non-windowed aggregations, and ETL. Intended for advanced use cases, because incorrect use can cause issues, for example, state growth, in queries that depend on event-time. For more information, see Watermark alignment. |
| sql.tables.scan.startup.timestamp-millis | (None) | Long | Overwrites scan.startup.timestamp-millis for Confluent-native tables used in newly created queries. This option is not applied if the table uses a value that differs from the default value. |
| sql.tables.scan.startup.mode | (None) | GlobalScanStartupMode | Overwrites scan.startup.mode for Confluent-native tables used in newly created queries. This option is not applied if the table uses a value that differs from the default value. |

Flink SQL shell options

The following SET options are available only in the Flink SQL shell. In a Cloud Console workspace, the only client option you can set is client.statement-name.

| Key | Default | Type | Description |
| --- | --- | --- | --- |
| client.output-format | standard | String | Output format. Valid values are “standard” or “plain-text”. |
| client.results-timeout | 600000 | Long | Total amount of time, in milliseconds, to wait before timing out the request waiting for results to be ready. |
| client.service-account | (None) | String | Service account to use instead of running statements attached to your user account. For more information, see Production workloads (service accounts). |
| client.statement-name | (None) | String | Give your Flink statement a meaningful name that can help you identify it more easily. Instead of an autogenerated name, like 123e4567-e89b-12d3, this sets the statement name to the given value. To avoid naming conflicts, the name resets itself after successful submission. The underscore character (_) and period character (.) are not supported. |

Related content: RESET Statement in Confluent Cloud for Apache Flink.

#### Code Examples

```sql
SET 'key' = 'value';
```

```sql
SET 'sql.current-catalog' = 'default';
SET 'sql.current-database' = 'cluster_0';
SELECT * FROM pageviews;
```

```sql
SET 'table.local-time-zone' = 'America/Los_Angeles';
```

```sql
Statement successfully submitted.
Statement phase is COMPLETED.
configuration updated successfully.
```

```sql
Statement successfully submitted.
Statement phase is COMPLETED.
+-----------------------+--------------------------+
|          Key          |          Value           |
+-----------------------+--------------------------+
| catalog               | default (default)        |
| default_database      | <your_cluster> (default) |
| table.local-time-zone | America/Los_Angeles      |
+-----------------------+--------------------------+
```
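
As an illustration of combining several of the options listed above, the following sketch names a statement and bounds its idle state before running an aggregation. It rests on a few assumptions: the duration literal `'1 h'` is an accepted format for `sql.state-ttl`, the statement name is a hypothetical value that avoids underscores and periods, and the `user_id` column on `pageviews` is hypothetical.

```sql
-- Interpret TIMESTAMP_LTZ values in a specific time zone.
SET 'sql.local-time-zone' = 'America/Los_Angeles';

-- Retain idle state for at least one hour (assumed duration format).
SET 'sql.state-ttl' = '1 h';

-- Hypothetical statement name: hyphens only, no underscores or periods.
SET 'client.statement-name' = 'pageviews-per-user';

-- user_id is a hypothetical column used for illustration.
SELECT user_id, COUNT(*) AS view_count
FROM pageviews
GROUP BY user_id;
```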

---

### SQL SHOW Statements in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/show.html

SHOW Statements in Confluent Cloud for Apache Flink

Confluent Cloud for Apache Flink® enables listing catalogs, which map to Confluent Cloud environments, databases, which map to Apache Kafka® clusters, and other available Flink resources, like AI models, UDFs, connections, and tables.

Confluent Cloud for Apache Flink supports these SHOW statements.

- SHOW CATALOGS
- SHOW CONNECTIONS
- SHOW CREATE MODEL
- SHOW CURRENT CATALOG
- SHOW CREATE TABLE
- SHOW CURRENT DATABASE
- SHOW DATABASES
- SHOW JOBS
- SHOW FUNCTIONS
- SHOW MODELS
- SHOW TABLES

SHOW CATALOGS

Syntax: SHOW CATALOGS;

Description: Show all catalogs. Confluent Cloud for Apache Flink maps Flink catalogs to environments.

Example: SHOW CATALOGS;

Your output should resemble:

| catalog name | catalog id |
| --- | --- |
| my_environment | env-12abcz |
| example-streams-env | env-23xjoo |
| quickstart-env | env-9wg8ny |
| default | env-t12345 |

Run the USE CATALOG statement to set the current Flink catalog (Confluent Cloud environment).

USE CATALOG my_environment;

Your output should resemble:

| Key | Value |
| --- | --- |
| sql.current-catalog | my_environment |

SHOW CONNECTIONS

Syntax: SHOW CONNECTIONS [LIKE <sql-like-pattern>];

Description: Show all connections.

Example: SHOW CONNECTIONS;

With a name filter: SHOW CONNECTIONS LIKE 'sql%';

Your output should resemble:

| Name |
| --- |
| azure-openai-connection |
| deepwiki-mcp-connection |
| demo-day-mcp-connection |
| mcp-connection |

SHOW CURRENT CATALOG

Syntax: SHOW CURRENT CATALOG;

Description: Show the current catalog.

Example: SHOW CURRENT CATALOG;

Your output should resemble:

| current catalog name |
| --- |
| my_environment |

SHOW DATABASES

Syntax: SHOW DATABASES;

Description: Show all databases in the current catalog. Confluent Cloud for Apache Flink maps Flink databases to Kafka clusters.

Example: SHOW DATABASES;

Your output should resemble:

| database name | database id |
| --- | --- |
| cluster_0 | lkc-r289m7 |

Run the USE statement to set the current database (Kafka cluster).

USE cluster_0;

Your output should resemble:

| Key | Value |
| --- | --- |
| sql.current-database | cluster_0 |

SHOW CURRENT DATABASE

Syntax: SHOW CURRENT DATABASE;

Description: Show the current database. Confluent Cloud for Apache Flink maps Flink databases to Kafka clusters.

Example: SHOW CURRENT DATABASE;

Your output should resemble:

| current database name |
| --- |
| cluster_0 |

SHOW TABLES

Syntax: SHOW TABLES [ [catalog_name.]database_name ] [ [NOT] LIKE <sql_like_pattern> ]

Description: Show all tables for the current database. You can filter the output of SHOW TABLES by using the LIKE clause with an optional matching pattern.

The optional LIKE clause shows all tables with names that match <sql_like_pattern>. The syntax of the SQL pattern in a LIKE clause is the same as in the MySQL dialect.

- % matches any number of characters, including zero characters. Use the backslash character to escape the % character: \% matches one % character.
- _ matches exactly one character. Use the backslash character to escape the _ character: \_ matches one _ character.

Example: Create two tables in the current catalog: flights and orders.

Create a flights table:

CREATE TABLE flights ( flight_id STRING, origin STRING, destination STRING );

Create an orders table:

CREATE TABLE orders ( user_id BIGINT NOT NULL, product_id STRING, amount INT );

Show all tables in the current database that are similar to the specified SQL pattern.

SHOW TABLES LIKE 'f%';

Your output should resemble:

| table name |
| --- |
| flights |

Show all tables in the given database that are not similar to the specified SQL pattern.

SHOW TABLES NOT LIKE 'f%';

Your output should resemble:

| table name |
| --- |
| orders |

Show all tables in the current database.

SHOW TABLES;

| table name |
| --- |
| flights |
| orders |

SHOW CREATE TABLE

Syntax: SHOW CREATE TABLE [catalog_name.][db_name.]table_name;

Description: Show details about the specified table.

Example: SHOW CREATE TABLE flights;

Your output should resemble:

CREATE TABLE `my_environment`.`cluster_0`.`flights` ( `flight_id` VARCHAR(2147483647), `origin` VARCHAR(2147483647), `destination` VARCHAR(2147483647) ) WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'kafka.cleanup-policy' = 'delete', 'kafka.max-message-size' = '2097164 bytes', 'kafka.partitions' = '6', 'kafka.retention.size' = '0 bytes', 'kafka.retention.time' = '604800000 ms', 'scan.bounded.mode' = 'unbounded', 'scan.startup.mode' = 'earliest-offset', 'value.format' = 'avro-registry' )

Inferred Tables

Inferred tables are tables that have not been created with CREATE TABLE but are detected automatically by using information about existing topics and Schema Registry entries. The following examples show SHOW CREATE TABLE called on the resulting table.
No key and no value in Schema Registry

SHOW CREATE TABLE returns:

CREATE TABLE `t_raw` ( `key` VARBINARY(2147483647), `val` VARBINARY(2147483647) ) DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.format' = 'raw', 'value.format' = 'raw' ... );

Properties:

- Key and value formats are raw (binary format) with BYTES.
- Following Kafka message semantics, both key and value support NULL as well, so the following statement is supported: INSERT INTO t_raw (key, val) SELECT CAST(NULL AS BYTES), CAST(NULL AS BYTES);

No key but record value in Schema Registry

Given the following value in Schema Registry:

{ "type": "record", "name": "TestRecord", "fields": [ { "name": "i", "type": "int" }, { "name": "s", "type": "string" } ] }

SHOW CREATE TABLE returns:

CREATE TABLE `t_raw_key` ( `key` VARBINARY(2147483647), `i` INT NOT NULL, `s` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.format' = 'raw', 'value.format' = 'avro-registry' ... )

Properties:

- Key format is raw (binary format) with BYTES.
- Following Kafka message semantics, key supports NULL as well, so the following statement is supported: INSERT INTO t_raw_key SELECT CAST(NULL AS BYTES), 12, 'Bob';

Atomic key and record value in Schema Registry

Given the following key and value in Schema Registry:

Key schema: "int"

Value schema: { "type": "record", "name": "TestRecord", "fields": [ { "name": "i", "type": "int" }, { "name": "s", "type": "string" } ] }

SHOW CREATE TABLE returns:

CREATE TABLE `t_atomic_key` ( `key` INT NOT NULL, `i` INT NOT NULL, `s` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.format' = 'avro-registry', 'value.format' = 'avro-registry' ... )

Properties:

- Schema Registry defines column data type INT NOT NULL.
- The column name key is used as a default, because Schema Registry doesn’t provide a column name.

Overlapping names in key/value, no key in Schema Registry

Given the following value in Schema Registry:

{ "type": "record", "name": "TestRecord", "fields": [ { "name": "i", "type": "int" }, { "name": "key", "type": "string" } ] }

SHOW CREATE TABLE returns:

CREATE TABLE `t_raw_disjoint` ( `key_key` VARBINARY(2147483647), `i` INT NOT NULL, `key` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`key_key`) INTO 1 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'key.fields-prefix' = 'key_', 'key.format' = 'raw', 'value.format' = 'avro-registry' ... )

Properties:

- Schema Registry value defines columns i INT NOT NULL and key STRING.
- The column name key BYTES is used as a default if no key is in Schema Registry.
- Because key would collide with the value column, the key_ prefix is added.

Record key and record value in Schema Registry

Given the following key and value in Schema Registry:

Key schema: { "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" } ] }

Value schema: { "type": "record", "name": "TestRecord", "fields": [ { "name": "name", "type": "string" }, { "name": "zip_code", "type": "string" } ] }

SHOW CREATE TABLE returns:

CREATE TABLE `t_sr_disjoint` ( `uid` INT NOT NULL, `name` VARCHAR(2147483647) NOT NULL, `zip_code` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`uid`) INTO 1 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'value.format' = 'avro-registry' ... )

Properties:

- Schema Registry defines columns for both key and value.
- The column names of key and value are disjoint sets and don’t overlap.

Record key and record value with overlap in Schema Registry

Given the following key and value in Schema Registry:

Key schema: { "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" } ] }

Value schema: { "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" }, { "name": "name", "type": "string" }, { "name": "zip_code", "type": "string" } ] }

SHOW CREATE TABLE returns:

CREATE TABLE `t_sr_joint` ( `uid` INT NOT NULL, `name` VARCHAR(2147483647) NOT NULL, `zip_code` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`uid`) INTO 1 BUCKETS WITH ( 'changelog.mode' = 'append', 'connector' = 'confluent', 'value.fields-include' = 'all', 'value.format' = 'avro-registry' ... )

Properties:

- Schema Registry defines columns for both key and value.
- The column names of key and value overlap on uid.
- 'value.fields-include' = 'all' is set to exclude the key because it is fully contained in the value.

Inferred tables schema evolution

Schema Registry columns overlap with computed/metadata columns

Given the following value in Schema Registry:

{ "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" } ] }

Evolve the table by adding metadata:

ALTER TABLE t_metadata_overlap ADD `timestamp` TIMESTAMP_LTZ(3) NOT NULL METADATA;

Evolve the table by adding an optional schema column:

{ "type": "record", "name": "TestRecord", "fields": [ { "name": "uid", "type": "int" }, { "name": "timestamp", "type": ["null", "string"], "default": null } ] }

SHOW CREATE TABLE shows:

CREATE TABLE `t_metadata_overlap` ( `key` VARBINARY(2147483647), `uid` INT NOT NULL, `timestamp` TIMESTAMP(3) WITH LOCAL TIME ZONE NOT NULL METADATA ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( ... )

Properties: Schema Registry says there is a timestamp physical column, but Flink says there is a timestamp metadata column. In this case, metadata columns and computed columns have precedence, so Flink removes the physical column from the schema.
Given that Flink advertises FULL_TRANSITIVE mode, queries still work, and the physical column is set to NULL in the payload:

INSERT INTO t_metadata_overlap SELECT CAST(NULL AS BYTES), 42, TO_TIMESTAMP_LTZ(0, 3); SELECT * FROM t_metadata_overlap;

Evolve the table by renaming metadata:

ALTER TABLE t_metadata_overlap DROP `timestamp`; ALTER TABLE t_metadata_overlap ADD message_timestamp TIMESTAMP_LTZ(3) METADATA FROM 'timestamp';

SHOW CREATE TABLE shows:

CREATE TABLE `t_metadata_overlap` ( `key` VARBINARY(2147483647), `uid` INT NOT NULL, `timestamp` VARCHAR(2147483647), `message_timestamp` TIMESTAMP(3) WITH LOCAL TIME ZONE METADATA FROM 'timestamp' ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( ... )

Properties: Now both the physical column and the metadata column show up, and both can be accessed for reading and writing.

SHOW JOBS

Syntax: SHOW JOBS;

Description: Show the status of all statements in the current catalog/environment.

Example: SHOW JOBS;

Your output should resemble:

| Name | Phase | Statement | Compute Pool | Creation Time | Detail |
| --- | --- | --- | --- | --- | --- |
| 0fb72c57-8e3d-4614 | COMPLETED | CREATE TABLE ... | lfcp-8m03rm | 2024-01-23 13... | Table 'flight... |
| 8567b0eb-fabd-4cb8 | COMPLETED | CREATE TABLE ... | lfcp-8m03rm | 2024-01-23 13... | Table 'orders... |
| 4cd171ca-77db-48ce | COMPLETED | SHOW TABLES L... | lfcp-8m03rm | 2024-01-23 13... | |
| 291eb50b-965c-4a53 | COMPLETED | SHOW TABLES N... | lfcp-8m03rm | 2024-01-23 13... | |
| 7a30e70a-36af-41f4 | COMPLETED | SHOW TABLES; | lfcp-8m03rm | 2024-01-23 13... | |

SHOW FUNCTIONS

Syntax: SHOW [USER] FUNCTIONS;

Description: Show all functions, including system functions and user-defined functions, in the current catalog and current database. Both system and catalog functions are returned. The USER option shows only user-defined functions in the current catalog and current database. Functions of internal modules are shown if your Organization is in the allow-list, for example, OLTP functions.

For convenience, SHOW FUNCTIONS also shows functions with special syntax or keywords that don’t follow a traditional functional-style syntax, like FUNC(arg0). For example, || (string concatenation) or IS BETWEEN.

Example: SHOW FUNCTIONS;

Your output should resemble a single function name column that lists: %, *, +, -, /, <, <=, <>, =, >, >=, ABS, ACOS, AND, ARRAY, ARRAY_CONTAINS, ASCII, ASIN, ATAN, ATAN2, AVG, ...

SHOW MODELS

Syntax: SHOW MODELS [ ( FROM | IN ) [catalog_name.]database_name ] [ [NOT] LIKE <sql_like_pattern> ];

Description: Show all AI models that are registered in the current Flink environment. To register an AI model, run the CREATE MODEL statement.

Example: SHOW MODELS;

Your output should resemble:

| Model Name |
| --- |
| demo_model |

SHOW CREATE MODEL

Syntax: SHOW CREATE MODEL <model-name>;

Description: Show details about the specified AI inference model. This command is useful for understanding the configuration and options that were set when the model was created with the CREATE MODEL statement.

Example: For an example AWS Bedrock model named “bedrock_embed”, the following statement might display the shown output.

SHOW CREATE MODEL bedrock_embed;

Example SHOW CREATE MODEL output:

CREATE MODEL `model-testing`.`virtual_topic_GCP`.`bedrock_embed` INPUT (`text` VARCHAR(2147483647)) OUTPUT (`response` ARRAY<FLOAT>) WITH ( 'BEDROCK.CONNECTION' = 'bedrock-connection-hao', 'BEDROCK.INPUT_FORMAT' = 'AMAZON-TITAN-EMBED', 'PROVIDER' = 'bedrock', 'TASK' = 'text_generation' );

Related content: DESCRIBE, USE CATALOG.
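
The optional clauses in the SHOW FUNCTIONS and SHOW MODELS syntax can be sketched as follows. These statements are derived only from the syntax shown above; `my_environment`, `cluster_0`, and `demo_model` are placeholder names from the earlier examples, and the LIKE patterns are arbitrary.

```sql
-- List only user-defined functions in the current catalog and database.
SHOW USER FUNCTIONS;

-- List models whose names start with "demo".
SHOW MODELS LIKE 'demo%';

-- List models in a specific database, excluding names that start with "bedrock".
SHOW MODELS IN my_environment.cluster_0 NOT LIKE 'bedrock%';

-- Inspect how one of the listed models was defined.
SHOW CREATE MODEL demo_model;
```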

#### Code Examples

```sql
SHOW CATALOGS;
```

```sql
+-------------------------+------------+
|      catalog name       | catalog id |
+-------------------------+------------+
| my_environment          | env-12abcz |
| example-streams-env     | env-23xjoo |
| quickstart-env          | env-9wg8ny |
| default                 | env-t12345 |
+-------------------------+------------+
```

```sql
USE CATALOG my_environment;
```

```sql
+---------------------+----------------+
|         Key         |      Value     |
+---------------------+----------------+
| sql.current-catalog | my_environment |
+---------------------+----------------+
```

```sql
SHOW CONNECTIONS [LIKE <sql-like-pattern>];
```

```sql
SHOW CONNECTIONS;

-- with name filter
SHOW CONNECTIONS LIKE 'sql%';
```

```sql
+-------------------------+
|          Name           |
+-------------------------+
| azure-openai-connection |
| deepwiki-mcp-connection |
| demo-day-mcp-connection |
| mcp-connection          |
+-------------------------+
```

```sql
SHOW CURRENT CATALOG;
```

```sql
+----------------------+
| current catalog name |
+----------------------+
| my_environment       |
+----------------------+
```

```sql
SHOW DATABASES;
```

```sql
+---------------+-------------+
| database name | database id |
+---------------+-------------+
| cluster_0     | lkc-r289m7  |
+---------------+-------------+
```

```sql
USE cluster_0;
```

```sql
+----------------------+-----------+
|         Key          |   Value   |
+----------------------+-----------+
| sql.current-database | cluster_0 |
+----------------------+-----------+
```

```sql
SHOW CURRENT DATABASE;
```

```sql
+-----------------------+
| current database name |
+-----------------------+
| cluster_0             |
+-----------------------+
```

```sql
SHOW TABLES [ [catalog_name.]database_name ] [ [NOT] LIKE <sql_like_pattern> ]
```

```sql
-- Create a flights table.
CREATE TABLE flights (
  flight_id STRING,
  origin STRING,
  destination STRING
);
```

```sql
-- Create an orders table.
CREATE TABLE orders (
  user_id BIGINT NOT NULL,
  product_id STRING,
  amount INT
);
```

```sql
SHOW TABLES LIKE 'f%';
```

```sql
+------------+
| table name |
+------------+
| flights    |
+------------+
```

```sql
SHOW TABLES NOT LIKE 'f%';
```

```sql
+------------+
| table name |
+------------+
| orders     |
+------------+
```

```sql
SHOW TABLES;
```

```sql
+------------+
| table name |
+------------+
| flights    |
| orders     |
+------------+
```
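
The wildcard and escape rules for LIKE patterns can be sketched with the same flights and orders tables. The comments describe which tables each pattern is expected to match under the rules stated above; they are not captured output.

```sql
-- '_' matches exactly one character: expected to match orders.
SHOW TABLES LIKE '_rders';

-- '%' matches zero or more characters: expected to match flights.
SHOW TABLES LIKE '%light%';

-- '\_' matches a literal underscore: neither flights nor orders contains one,
-- so no tables are expected to match.
SHOW TABLES LIKE '%\_%';
```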

```sql
SHOW CREATE TABLE [catalog_name.][db_name.]table_name;
```

```sql
SHOW CREATE TABLE flights;
```

```sql
+-----------------------------------------------------------+
|                     SHOW CREATE TABLE                     |
+-----------------------------------------------------------+
| CREATE TABLE `my_environment`.`cluster_0`.`flights` (     |
|   `flight_id` VARCHAR(2147483647),                        |
|   `origin` VARCHAR(2147483647),                           |
|   `destination` VARCHAR(2147483647)                       |
| ) WITH (                                                  |
|   'changelog.mode' = 'append',                            |
|   'connector' = 'confluent',                              |
|   'kafka.cleanup-policy' = 'delete',                      |
|   'kafka.max-message-size' = '2097164 bytes',             |
|   'kafka.partitions' = '6',                               |
|   'kafka.retention.size' = '0 bytes',                     |
|   'kafka.retention.time' = '604800000 ms',                |
|   'scan.bounded.mode' = 'unbounded',                      |
|   'scan.startup.mode' = 'earliest-offset',                |
|   'value.format' = 'avro-registry'                        |
| )                                                         |
|                                                           |
+-----------------------------------------------------------+
```

```sql
CREATE TABLE `t_raw` (
  `key` VARBINARY(2147483647),
  `val` VARBINARY(2147483647)
) DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'raw'
  ...
);
```

```sql
INSERT INTO t_raw (key, val) SELECT CAST(NULL AS BYTES), CAST(NULL AS BYTES);
```

```sql
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "i",
      "type": "int"
    },
    {
      "name": "s",
      "type": "string"
    }
  ]
}
```

```sql
CREATE TABLE `t_raw_key` (
  `key` VARBINARY(2147483647),
  `i` INT NOT NULL,
  `s` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'raw',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
INSERT INTO t_raw_key SELECT CAST(NULL AS BYTES), 12, 'Bob';
```

```sql
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
     {
        "name": "i",
        "type": "int"
     },
     {
        "name": "s",
        "type": "string"
     }
  ]
}
```

```sql
CREATE TABLE `t_atomic_key` (
  `key` INT NOT NULL,
  `i` INT NOT NULL,
  `s` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`key`) INTO 2 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.format' = 'avro-registry',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
     {
        "name": "i",
        "type": "int"
     },
     {
        "name": "s",
        "type": "string"
     }
  ]
}
```

```sql
CREATE TABLE `t_raw_disjoint` (
  `key_key` VARBINARY(2147483647),
  `i` INT NOT NULL,
  `key` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`key_key`) INTO 1 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'key.fields-prefix' = 'key_',
  'key.format' = 'raw',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
     {
        "name": "uid",
        "type": "int"
     }
  ]
}
```

```sql
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
     {
        "name": "name",
        "type": "string"
     },
     {
        "name": "zip_code",
        "type": "string"
     }
  ]
}
```

```sql
CREATE TABLE `t_sr_disjoint` (
  `uid` INT NOT NULL,
  `name` VARCHAR(2147483647) NOT NULL,
  `zip_code` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`uid`) INTO 1 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
     {
        "name": "uid",
        "type": "int"
     }
  ]
}
```

```sql
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
     {
        "name": "uid",
        "type": "int"
     },
     {
        "name": "name",
        "type": "string"
     },
     {
        "name": "zip_code",
        "type": "string"
     }
  ]
}
```

```sql
CREATE TABLE `t_sr_joint` (
  `uid` INT NOT NULL,
  `name` VARCHAR(2147483647) NOT NULL,
  `zip_code` VARCHAR(2147483647) NOT NULL
) DISTRIBUTED BY HASH(`uid`) INTO 1 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'value.fields-include' = 'all',
  'value.format' = 'avro-registry'
  ...
)
```

```sql
'value.fields-include' = 'all'
```

```sql
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
     {
        "name": "uid",
        "type": "int"
     }
  ]
}
```

```sql
ALTER TABLE t_metadata_overlap ADD `timestamp` TIMESTAMP_LTZ(3) NOT NULL METADATA;
```

```sql
{
  "type": "record",
  "name": "TestRecord",
  "fields": [
     {
        "name": "uid",
        "type": "int"
     },
     {
        "name": "timestamp",
        "type": ["null", "string"],
        "default": null
     }
  ]
}
```

```sql
CREATE TABLE `t_metadata_overlap` (
  `key` VARBINARY(2147483647),
  `uid` INT NOT NULL,
  `timestamp` TIMESTAMP(3) WITH LOCAL TIME ZONE NOT NULL METADATA
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  ...
)
```

```sql
INSERT INTO t_metadata_overlap
  SELECT CAST(NULL AS BYTES), 42, TO_TIMESTAMP_LTZ(0, 3);

SELECT * FROM t_metadata_overlap;
```

```sql
ALTER TABLE t_metadata_overlap DROP `timestamp`;

ALTER TABLE t_metadata_overlap
  ADD message_timestamp TIMESTAMP_LTZ(3) METADATA FROM 'timestamp';
```

```sql
CREATE TABLE `t_metadata_overlap` (
  `key` VARBINARY(2147483647),
  `uid` INT NOT NULL,
  `timestamp` VARCHAR(2147483647),
  `message_timestamp` TIMESTAMP(3) WITH LOCAL TIME ZONE METADATA FROM 'timestamp'
) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS
WITH (
  ...
)
```

```sql
+----------------------------------+-----------+------------------+--------------+------------------+------------------+
|               Name               |   Phase   |    Statement     | Compute Pool |  Creation Time   |      Detail      |
+----------------------------------+-----------+------------------+--------------+------------------+------------------+
| 0fb72c57-8e3d-4614               | COMPLETED | CREATE TABLE ... | lfcp-8m03rm  | 2024-01-23 13... | Table 'flight... |
| 8567b0eb-fabd-4cb8               | COMPLETED | CREATE TABLE ... | lfcp-8m03rm  | 2024-01-23 13... | Table 'orders... |
| 4cd171ca-77db-48ce               | COMPLETED | SHOW TABLES L... | lfcp-8m03rm  | 2024-01-23 13... |                  |
| 291eb50b-965c-4a53               | COMPLETED | SHOW TABLES N... | lfcp-8m03rm  | 2024-01-23 13... |                  |
| 7a30e70a-36af-41f4               | COMPLETED | SHOW TABLES;     | lfcp-8m03rm  | 2024-01-23 13... |                  |
+----------------------------------+-----------+------------------+--------------+------------------+------------------+
```

```sql
SHOW [USER] FUNCTIONS;
```

```sql
SHOW FUNCTIONS;
```

```sql
+------------------------+
|     function name      |
+------------------------+
| %                      |
| *                      |
| +                      |
| -                      |
| /                      |
| <                      |
| <=                     |
| <>                     |
| =                      |
| >                      |
| >=                     |
| ABS                    |
| ACOS                   |
| AND                    |
| ARRAY                  |
| ARRAY_CONTAINS         |
| ASCII                  |
| ASIN                   |
| ATAN                   |
| ATAN2                  |
| AVG                    |
...
```

```sql
SHOW MODELS [ ( FROM | IN ) [catalog_name.]database_name ]
[ [NOT] LIKE <sql_like_pattern> ];
```

```sql
SHOW MODELS;
```

```sql
+----------------+
|   Model Name   |
+----------------+
|   demo_model   |
+----------------+
```

```sql
SHOW CREATE MODEL <model-name>;
```

```sql
SHOW CREATE MODEL bedrock_embed;

-- Example SHOW CREATE MODEL output:
CREATE MODEL `model-testing`.`virtual_topic_GCP`.`bedrock_embed`
INPUT (`text` VARCHAR(2147483647))
OUTPUT (`response` ARRAY<FLOAT>)
WITH (
  'BEDROCK.CONNECTION' = 'bedrock-connection-hao',
  'BEDROCK.INPUT_FORMAT' = 'AMAZON-TITAN-EMBED',
  'PROVIDER' = 'bedrock',
  'TASK' = 'text_generation'
);
```

---

### SQL USE CATALOG Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/use-catalog.html

Confluent Cloud for Apache Flink® enables setting the active environment with the SQL `USE CATALOG` statement.

#### Syntax

```sql
USE CATALOG catalog_name;
```

#### Description

Set the current catalog (Confluent Cloud environment). All subsequent commands that don't specify a catalog use `catalog_name`.

Confluent Cloud for Apache Flink interprets your Confluent Cloud environments as catalogs. Flink can access various databases (Apache Kafka® clusters) in a catalog.

The `catalog_name` parameter is case-sensitive. The default current catalog is named `default`. If `catalog_name` doesn't exist, Flink throws an exception on the next DML or DDL statement.

Important: `USE CATALOG` is a client-side setting statement and sets corresponding properties that are attached to future requests. By itself, a `USE CATALOG` statement is a no-op. To see its effect, you must follow it with one or more DML or DDL statements, for example:

```sql
-- Set the current catalog (environment).
USE CATALOG my_env;

-- Set the current database (Kafka cluster).
USE cluster_0;

-- Submit a query.
SELECT * FROM my_table;
```

Use the `USE <database_name>` statement to set the current Flink database (Kafka cluster).

#### USE CATALOG in Cloud Console workspaces

When you run the `USE CATALOG` statement in a Cloud Console workspace, it sets the catalog that is used in any subsequent CREATE statement requests for the specific editor cell. Different cells can use different catalogs within the same workspace. The catalog parameter is unquoted, for example, `USE CATALOG catalog1;`.

Any USE statements within an editor cell take precedence over the settings in the workspace's global catalog and database dropdown controls.

Related content: the `USE <database_name>` statement.

#### Code Examples

```sql
USE CATALOG catalog_name;
```

```sql
-- Set the current catalog (environment).
USE CATALOG my_env;

-- Set the current database (Kafka cluster).
USE cluster_0;

-- Submit a query.
SELECT * FROM my_table;
```

```sql
USE CATALOG catalog1;
```
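To make the effect of the current catalog and database concrete, here is a minimal sketch of how an unqualified table name could be expanded to a full `catalog.database.table` path. The function `resolve_table` is hypothetical and only models the behavior described above. It is not Flink's resolver, and it does not handle backtick-quoted identifiers that contain dots.

```python
def resolve_table(identifier, current_catalog=None, current_database=None):
    """Expand a one-, two-, or three-part identifier to (catalog, database, table)."""
    parts = identifier.split(".")
    if len(parts) == 3:
        # Fully qualified: the current catalog and database are ignored.
        return tuple(parts)
    if len(parts) == 2:
        if current_catalog is None:
            raise ValueError("no current catalog set for " + identifier)
        return (current_catalog, parts[0], parts[1])
    if len(parts) == 1:
        if current_catalog is None or current_database is None:
            raise ValueError("catalog and database must both be set for " + identifier)
        return (current_catalog, current_database, parts[0])
    raise ValueError("too many name parts in " + identifier)


# After: USE CATALOG my_env; USE cluster_0;
print(resolve_table("my_table", "my_env", "cluster_0"))
# ('my_env', 'cluster_0', 'my_table')

# A fully qualified name works without any USE statement.
print(resolve_table("my_env.cluster_0.my_table"))
# ('my_env', 'cluster_0', 'my_table')
```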

---

### SQL USE database_name Statement in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/statements/use-database.html

Confluent Cloud for Apache Flink® enables setting the current Apache Kafka® cluster with the `USE <database_name>` statement.

#### Syntax

```sql
USE database_name;
```

#### Description

Set the current database (Kafka cluster). The `USE <database_name>` statement enables you to access tables in various databases without specifying the full paths.

In Confluent Cloud, Apache Flink® databases are equivalent to Kafka clusters. All Kafka clusters in the region where Flink is running are registered automatically as databases and can be accessed by Flink when you use the correct catalog (environment).

All subsequent commands that don't specify a database use `<database_name>`. If `<database_name>` doesn't exist, Flink throws an exception on the next DML or DDL statement.

Important: `USE <database_name>` is a client-side setting statement and sets corresponding properties that are attached to future requests. By itself, a `USE <database_name>` statement is a no-op. To see its effect, you must follow it with one or more DML or DDL statements, for example:

```sql
-- Set the current catalog (environment).
USE CATALOG my_env;

-- Set the current database (Kafka cluster).
USE cluster_0;

-- Submit a query.
SELECT * FROM my_table;
```

Run the `USE CATALOG` statement to set the current Flink catalog (Confluent Cloud environment).

#### USE database_name in Cloud Console workspaces

When you run the `USE <database_name>` statement in a Cloud Console workspace, it sets the database that is used in any subsequent CREATE statement requests for the specific editor cell. Different cells can use different databases within the same workspace. The `<database_name>` parameter is unquoted, for example, `USE database1;`.

Any USE statements within an editor cell take precedence over the settings in the workspace's global catalog and database dropdown controls.

#### Example

In the Flink SQL shell, run the following commands to see an example of the `USE <database_name>` statement.

View the existing databases.

```sql
SHOW DATABASES;
```

Your output should resemble:

```text
+---------------+-------------+
| database name | database id |
+---------------+-------------+
| cluster_0     | lkc-a123c4  |
+---------------+-------------+
```

Set the current database to `cluster_0`.

```sql
USE cluster_0;
```

Your output should resemble:

```text
+----------------------+-----------+
|         Key          |   Value   |
+----------------------+-----------+
| sql.current-database | cluster_0 |
+----------------------+-----------+
```

Run the `SHOW CURRENT DATABASE` statement to check the database change.

```sql
SHOW CURRENT DATABASE;
```

Your output should resemble:

```text
+-----------------------+
| current database name |
+-----------------------+
| cluster_0             |
+-----------------------+
```

Related content: the `USE CATALOG` statement.

#### Code Examples

```sql
USE database_name;
```

```sql
-- Set the current catalog (environment).
USE CATALOG my_env;

-- Set the current database (Kafka cluster).
USE cluster_0;

-- Submit a query.
SELECT * FROM my_table;
```

```sql
USE database1;
```

```sql
SHOW DATABASES;
```

```sql
+---------------+-------------+
| database name | database id |
+---------------+-------------+
| cluster_0     | lkc-a123c4  |
+---------------+-------------+
```

```sql
USE cluster_0;
```

```sql
+----------------------+-----------+
|         Key          |   Value   |
+----------------------+-----------+
| sql.current-database | cluster_0 |
+----------------------+-----------+
```

```sql
SHOW CURRENT DATABASE;
```

```sql
+-----------------------+
| current database name |
+-----------------------+
| cluster_0             |
+-----------------------+
```
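The "client-side setting" behavior described for `USE <database_name>` can be illustrated with a toy model: the statement only records a property, and validation happens when the next statement is submitted. This is a rough sketch under stated assumptions. The `Session` class, its methods, and the error type are hypothetical. Only the `sql.current-database` key comes from the output shown above.

```python
class Session:
    """Toy model of a SQL client session; not the real Flink SQL shell."""

    def __init__(self, known_databases):
        # Stand-in for metadata that only the service knows about.
        self.known_databases = set(known_databases)
        self.properties = {}

    def use(self, database_name):
        # Client-side only: record the property. Nothing is validated or sent.
        self.properties["sql.current-database"] = database_name
        return dict(self.properties)

    def submit(self, statement):
        # The recorded properties travel with the request and are checked now.
        request = {"statement": statement, "properties": dict(self.properties)}
        database = request["properties"].get("sql.current-database")
        if database not in self.known_databases:
            raise LookupError("database does not exist: " + str(database))
        return request


session = Session(known_databases=["cluster_0"])

print(session.use("cluster_0"))
# {'sql.current-database': 'cluster_0'}

session.use("no_such_cluster")      # no error yet: USE alone is a no-op
try:
    session.submit("SELECT * FROM my_table;")
except LookupError as err:
    print("failed on the next statement:", err)
```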

---

### Table API on Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/table-api.html

Confluent Cloud for Apache Flink® supports programming applications with the Table API in Java and Python. Confluent provides a plugin for running applications that use the Table API on Confluent Cloud.

The Table API enables a programmatic way of developing, testing, and submitting Flink pipelines for processing data streams. Streams can be finite or infinite, with insert-only or changelog data. Changelog data enables handling Change Data Capture (CDC) events.

To use the Table API, you work with tables that change over time, a concept inspired by relational databases. A Table program is a declarative and structured graph of transformations. The Table API is inspired by SQL and complements it with additional tools for manipulating real-time data. You can use both Flink SQL and the Table API in your applications.

A table program has these characteristics:

- Runs in a regular `main()` method (Java)
- Uses Flink APIs
- Communicates with Confluent Cloud by using REST requests, for example, the Statements endpoint

For a list of Table API functions supported by Confluent Cloud for Apache Flink, see Table API functions. For a list of Table API limitations in Confluent Cloud for Apache Flink, see Known limitations.

Use the Confluent for VS Code extension to generate a new Flink Table API project that interacts with your Confluent Cloud resources. This option is ideal if you're learning about the Table API. For more information, see Confluent for VS Code for Confluent Cloud.

Note: The Flink Table API is available for preview. A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features.
Confluent may discontinue providing preview releases of the Preview features at any time in Confluent's sole discretion. Comments, questions, and suggestions related to the Table API are encouraged and can be submitted through the established channels.

#### Add the Table API to an existing Java project

To add the Table API to an existing project, include the following dependencies in the `<dependencies>` section of your `pom.xml` file.

```xml
<!-- Apache Flink dependencies -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-api-java</artifactId>
  <version>${flink.version}</version>
</dependency>

<!-- Confluent Flink Table API Java plugin -->
<dependency>
  <groupId>io.confluent.flink</groupId>
  <artifactId>confluent-flink-table-api-java-plugin</artifactId>
  <version>${confluent-plugin.version}</version>
</dependency>
```

#### Configure the plugin

The plugin requires a set of configuration options for establishing a connection to Confluent Cloud.

The following configuration options are required.

| Property key | Command-line argument | Environment variable | Notes |
|---|---|---|---|
| `client.cloud` | `--cloud` | `CLOUD_PROVIDER` | Confluent identifier for a cloud provider. Valid values are `aws`, `azure`, and `gcp`. |
| `client.compute-pool-id` | `--compute-pool-id` | `COMPUTE_POOL_ID` | ID of the compute pool, for example, `lfcp-8m03rm`. |
| `client.environment-id` | `--environment-id` | `ENV_ID` | ID of the environment, for example, `env-z3y2x1`. |
| `client.flink-api-key` | `--flink-api-key` | `FLINK_API_KEY` | API key for Flink access. For more information, see Generate an API Key. |
| `client.flink-api-secret` | `--flink-api-secret` | `FLINK_API_SECRET` | API secret for Flink access. For more information, see Generate an API Key. |
| `client.organization-id` | `--organization-id` | `ORG_ID` | ID of the organization, for example, `b0b21724-4586-4a07-b787-d0bb5aacbf87`. |
| `client.region` | `--region` | `CLOUD_REGION` | Confluent identifier for a cloud provider's region, for example, `us-east-1`. For available regions, see Supported Regions or run `confluent flink region list`. |

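As a quick pre-flight check before starting a Table API program, the required options listed above can be validated against the environment. The following sketch is an illustration only: `load_required_options` is a hypothetical helper and not part of the Confluent plugin. The property keys, environment variable names, and valid cloud values are taken from the required-options list above. The example uses a plain dictionary instead of `os.environ` so that it runs the same everywhere.

```python
# Property key -> environment variable, as listed for the required options.
REQUIRED_OPTIONS = {
    "client.cloud": "CLOUD_PROVIDER",
    "client.compute-pool-id": "COMPUTE_POOL_ID",
    "client.environment-id": "ENV_ID",
    "client.flink-api-key": "FLINK_API_KEY",
    "client.flink-api-secret": "FLINK_API_SECRET",
    "client.organization-id": "ORG_ID",
    "client.region": "CLOUD_REGION",
}
VALID_CLOUDS = {"aws", "azure", "gcp"}


def load_required_options(environ):
    """Return (options, problems) for a mapping of environment variables."""
    options = {}
    problems = []
    for key, variable in REQUIRED_OPTIONS.items():
        value = environ.get(variable)
        if value:
            options[key] = value
        else:
            problems.append("missing " + variable + " (for " + key + ")")
    cloud = options.get("client.cloud")
    if cloud is not None and cloud not in VALID_CLOUDS:
        problems.append("unsupported cloud provider: " + cloud)
    return options, problems


example_env = {
    "CLOUD_PROVIDER": "aws",
    "CLOUD_REGION": "us-east-1",
    "FLINK_API_KEY": "key",
    "FLINK_API_SECRET": "secret",
    "ORG_ID": "b0b21724-4586-4a07-b787-d0bb5aacbf87",
    "ENV_ID": "env-z3y2x1",
    "COMPUTE_POOL_ID": "lfcp-8m03rm",
}
options, problems = load_required_options(example_env)
print(options["client.region"])  # us-east-1
print(problems)                  # []
```

In a real program you would pass `os.environ` instead of `example_env` and stop early if `problems` is not empty.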
The following configuration options are required for supporting UDF uploads. For more information, see Upload the jar as a Flink artifact.

Note: Create the artifact key and secret as a Confluent Cloud API key in Confluent Cloud Console. For more information, see Manage API Keys in Confluent Cloud.

| Property key | Command-line argument | Environment variable | Notes |
|---|---|---|---|
| `client.artifact-api-key` | `--artifact-api-key` | `ARTIFACT_API_KEY` | API key for artifact creation. |
| `client.artifact-api-secret` | `--artifact-api-secret` | `ARTIFACT_API_SECRET` | API secret for artifact creation. |

The following configuration options are optional.

| Property key | Command-line argument | Environment variable | Notes |
|---|---|---|---|
| `client.artifact-endpoint-template` | `--artifact-endpoint-template` | `ARTIFACT_ENDPOINT_TEMPLATE` | A template for the artifact endpoint URL, for example, `https://api.{region}.{cloud}.confluent.cloud`. |
| `client.catalog-cache` | | | Expiration time for catalog objects, for example, `'5 min'`. The default is `'1 min'`. `'0'` disables caching. |
| `client.context` | `--context` | | A name for the current Table API session, for example, `my_table_program`. |
| `client.endpoint-template` | `--endpoint-template` | `ENDPOINT_TEMPLATE` | A template for the endpoint URL, for example, `https://flinkpls-dom123.{region}.{cloud}.confluent.cloud`. |
| `client.principal-id` | `--principal-id` | `PRINCIPAL_ID` | Principal that runs submitted statements, for example, `sa-23kgz4` for a service account. |
| `client.rest-endpoint` | `--rest-endpoint` | `REST_ENDPOINT` | URL to the REST endpoint, for example, `proxyto.confluent.cloud`. |
| `client.statement-name` | `--statement-name` | | Unique name for statement submission. By default, generated using a UUID. |
| `client.tmp-dir` | `--tmp-dir` | | Directory for temporary files created by the plugin, like UDF jars, for example, `/tmp`. The default is `java.io.tmpdir`. |

##### Endpoint configuration

The Confluent Flink plugin provides options to configure endpoints for connecting to Confluent Cloud services. The template-based approach is the recommended method.

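As a rough illustration of how the endpoint options in this section combine, the sketch below resolves the statement and artifact endpoints from a region, a cloud, and the optional template or REST endpoint settings. `resolve_endpoints` is a hypothetical function, not the plugin's implementation. The default URLs, the `{region}` and `{cloud}` placeholders, the mutual-exclusion rule, and the derived `rest-endpoint` URLs follow the descriptions and example values given in this section.

```python
DEFAULT_STATEMENT_TEMPLATE = "https://flink.{region}.{cloud}.confluent.cloud"
DEFAULT_ARTIFACT_ENDPOINT = "https://api.confluent.cloud"


def resolve_endpoints(region, cloud, endpoint_template=None,
                      artifact_endpoint_template=None, rest_endpoint=None):
    """Return (statement_endpoint, artifact_endpoint) for the given options."""
    if rest_endpoint and (endpoint_template or artifact_endpoint_template):
        # The template options and rest-endpoint can't be set simultaneously.
        raise ValueError("rest-endpoint can't be combined with endpoint templates")

    if rest_endpoint:
        # Deprecated style: both endpoints are derived from the base domain.
        return ("https://flink." + region + "." + cloud + "." + rest_endpoint,
                "https://api." + rest_endpoint)

    def fill(template):
        return template.replace("{region}", region).replace("{cloud}", cloud)

    # Each endpoint is constructed independently from its own template.
    return (fill(endpoint_template or DEFAULT_STATEMENT_TEMPLATE),
            fill(artifact_endpoint_template or DEFAULT_ARTIFACT_ENDPOINT))


print(resolve_endpoints("us-east-1", "aws"))
# ('https://flink.us-east-1.aws.confluent.cloud', 'https://api.confluent.cloud')

print(resolve_endpoints(
    "us-east-1", "aws",
    endpoint_template="https://flinkpls-dom123.{region}.{cloud}.confluent.cloud"))
# ('https://flinkpls-dom123.us-east-1.aws.confluent.cloud', 'https://api.confluent.cloud')

print(resolve_endpoints("us-east-1", "aws", rest_endpoint="proxy.confluent.cloud"))
# ('https://flink.us-east-1.aws.proxy.confluent.cloud', 'https://api.proxy.confluent.cloud')
```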
###### client.endpoint-template

This option provides a template for constructing the Flink statement API endpoint URL.

- Default value: `https://flink.{region}.{cloud}.confluent.cloud`
- Example: `https://flinkpls-dom123.{region}.{cloud}.confluent.cloud`
- Usage: The template supports the placeholders `{region}` and `{cloud}`, which are replaced with the configured region and cloud provider values.
- Environment variable: `ENDPOINT_TEMPLATE`

###### client.artifact-endpoint-template

This option provides a template for constructing the URL used for uploading artifacts, like UDF JARs.

- Default value: `https://api.confluent.cloud`
- Example: `https://api.{region}.{cloud}.confluent.cloud`
- Usage: Similar to the endpoint template, this supports the placeholders `{region}` and `{cloud}`.
- Environment variable: `ARTIFACT_ENDPOINT_TEMPLATE`

###### client.rest-endpoint (Deprecated)

This option specifies the base domain for REST API calls to Confluent Cloud. While still supported, using the template-based configuration is preferred.

- Default value: No default value
- Example: `proxy.confluent.cloud`
- Usage: When specified, the plugin constructs the full Flink statement API endpoint URL as `https://flink.{region}.{cloud}.{rest-endpoint}`, where `{region}` and `{cloud}` are replaced with the configured region and cloud provider values.
- Environment variable: `REST_ENDPOINT`

Important: `client.endpoint-template` and `client.rest-endpoint` are mutually exclusive. If both are set, an exception is thrown.

###### Relationship and default behavior

The following rules control the relationship between the configuration options.

- The `client.endpoint-template` and `client.rest-endpoint` configuration options can't be set simultaneously.
- The `client.artifact-endpoint-template` and `client.rest-endpoint` configuration options can't be set simultaneously.

The following rules control the default behavior.

- If neither `client.rest-endpoint` nor `client.endpoint-template` is configured, the default template, `https://flink.{region}.{cloud}.confluent.cloud`, is used for the statement API.
- If neither `client.rest-endpoint` nor `client.artifact-endpoint-template` is specified, the default artifact endpoint, `https://api.confluent.cloud`, is used.
- If endpoint templates are used, each endpoint is constructed independently with the provided templates.

The following example shows different ways to configure endpoints.

```java
// Option 1 (RECOMMENDED): Using endpoint templates
// Resolved endpoints:
// - Statement API: https://flinkpls-dom123.us-east-1.aws.confluent.cloud
ConfluentSettings settings1 = ConfluentSettings.newBuilder()
    .setRegion("us-east-1")
    .setCloud("aws")
    .setEndpointTemplate("https://flinkpls-dom123.{region}.{cloud}.confluent.cloud")
    .setArtifactEndpointTemplate("https://artifacts.{region}.{cloud}.custom-domain.com")
    // Other required settings...
    .build();

// Option 2: Using properties file with endpoint templates
// cloud.properties:
//   client.region=us-east-1
//   client.cloud=aws
//   client.endpoint-template=https://flinkpls-dom123.{region}.{cloud}.confluent.cloud
// Resolved endpoints:
// - Statement API: https://flinkpls-dom123.us-east-1.aws.confluent.cloud
// - Artifact API: https://api.confluent.cloud (default)
ConfluentSettings settings2 = ConfluentSettings.fromResource("/cloud.properties");

// Option 3 (DISCOURAGED): Using rest-endpoint
// (both the statement and artifact endpoints are derived from it)
// Resolved endpoints:
// - Statement API: https://flink.us-east-1.aws.proxy.confluent.cloud
// - Artifact API: https://api.proxy.confluent.cloud
ConfluentSettings settings3 = ConfluentSettings.newBuilder()
    .setRegion("us-east-1")
    .setCloud("aws")
    .setRestEndpoint("proxy.confluent.cloud")
    // Other required settings...
    .build();
```

#### ConfluentSettings class

The `ConfluentSettings` class provides configuration options from various sources, so you can combine external input, code, and environment variables to set up your applications.

The following precedence order applies to configuration sources, from highest to lowest:

1. CLI arguments or properties file
2. Code
3. Environment variables

The following code example shows a `TableEnvironment` that's configured by a combination of command-line arguments and code.

Java:

```java
public static void main(String[] args) {
  // Args might set cloud, region, org, env, and compute pool.
  // Environment variables might pass key and secret.

  // Code sets the session name and SQL-specific options.
  ConfluentSettings settings = ConfluentSettings.newBuilder(args)
    .setContextName("MyTableProgram")
    .setOption("sql.local-time-zone", "UTC")
    .build();

  TableEnvironment env = TableEnvironment.create(settings);
}
```

Python:

```python
from pyflink.table.confluent import ConfluentSettings
from pyflink.table import TableEnvironment

def run():
  # Properties file might set cloud, region, org, env, and compute pool.
  # Environment variables might pass key and secret.

  # Code sets the session name and SQL-specific options.
  settings = ConfluentSettings.new_builder_from_file(...) \
    .set_context_name("MyTableProgram") \
    .set_option("sql.local-time-zone", "UTC") \
    .build()

  env = TableEnvironment.create(settings)
```

##### Properties file

You can store options in a `cloud.properties` file and reference the file in code.

```properties
# Cloud region
client.cloud=aws
client.region=eu-west-1

# Access & compute resources
client.flink-api-key=XXXXXXXXXXXXXXXX
client.flink-api-secret=XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
client.organization-id=00000000-0000-0000-0000-000000000000
client.environment-id=env-xxxxx
client.compute-pool-id=lfcp-xxxxxxxxxx
```

Reference the `cloud.properties` file in code:

Java:

```java
// Arbitrary file location in file system
ConfluentSettings settings = ConfluentSettings.fromPropertiesFile("/path/to/cloud.properties");

// Part of the JAR package (in src/main/resources)
ConfluentSettings settings = ConfluentSettings.fromPropertiesResource("/cloud.properties");
```

Python:

```python
from pyflink.table.confluent import ConfluentSettings

# Arbitrary file location in file system
settings = ConfluentSettings.from_file("/path/to/cloud.properties")
```

##### Command-line arguments

You can pass the configuration settings as command-line options when you run your application's jar:

```shell
java -jar my-table-program.jar \
  --cloud aws \
  --region us-east-1 \
  --flink-api-key key \
  --flink-api-secret secret \
  --organization-id b0b21724-4586-4a07-b787-d0bb5aacbf87 \
  --environment-id env-z3y2x1 \
  --compute-pool-id lfcp-8m03rm
```

Access the configuration settings from the command-line arguments by using the `ConfluentSettings.fromArgs` method:

Java:

```java
public static void main(String[] args) {
  ConfluentSettings settings = ConfluentSettings.fromArgs(args);
}
```

Python:

```python
from pyflink.table.confluent import ConfluentSettings

settings = ConfluentSettings.from_global_variables()
```

##### Code

You can assign the configuration settings in code by using the builder provided with the `ConfluentSettings` class:

Java:

```java
ConfluentSettings settings = ConfluentSettings.newBuilder()
  .setCloud("aws")
  .setRegion("us-east-1")
  .setFlinkApiKey("key")
  .setFlinkApiSecret("secret")
  .setOrganizationId("b0b21724-4586-4a07-b787-d0bb5aacbf87")
  .setEnvironmentId("env-z3y2x1")
  .setComputePoolId("lfcp-8m03rm")
  .build();
```

Python:

```python
from pyflink.table.confluent import ConfluentSettings

settings = ConfluentSettings.new_builder() \
  .set_cloud("aws") \
  .set_region("us-east-1") \
  .set_flink_api_key("key") \
  .set_flink_api_secret("secret") \
  .set_organization_id("b0b21724-4586-4a07-b787-d0bb5aacbf87") \
  .set_environment_id("env-z3y2x1") \
  .set_compute_pool_id("lfcp-8m03rm") \
  .build()
```

##### Environment variables

Set the following environment variables to provide configuration settings.

```shell
export CLOUD_PROVIDER="aws"
export CLOUD_REGION="us-east-1"
export FLINK_API_KEY="key"
export FLINK_API_SECRET="secret"
export ORG_ID="b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="env-z3y2x1"
export COMPUTE_POOL_ID="lfcp-8m03rm"

java -jar my-table-program.jar
```

In code, call:

Java:

```java
ConfluentSettings settings = ConfluentSettings.fromGlobalVariables();
```

Python:

```python
from pyflink.table.confluent import ConfluentSettings

settings = ConfluentSettings.from_global_variables()
```

#### Confluent utilities

The `ConfluentTools` class provides more methods that you can use for developing and testing Table API programs.

##### ConfluentTools.collectChangelog and ConfluentTools.printChangelog

Runs the specified table transformations on Confluent Cloud and returns the results locally as a list of changelog rows or prints to the console in a table style.

These methods run `table.execute().collect()` and consume a fixed number of rows from the returned iterator.

These methods can work on both finite and infinite input tables. If the pipeline is potentially unbounded, they stop fetching after the desired number of rows has been reached.

Java:

```java
// On a Table object
Table table = env.from("examples.marketplace.customers");
List<Row> rows = ConfluentTools.collectChangelog(table, 100);
ConfluentTools.printChangelog(table, 100);

// On a TableResult object
TableResult tableResult = env.executeSql("SELECT * FROM examples.marketplace.customers");
List<Row> rows = ConfluentTools.collectChangelog(tableResult, 100);
ConfluentTools.printChangelog(tableResult, 100);

// For finite (i.e. bounded) tables
ConfluentTools.collectChangelog(table);
ConfluentTools.printChangelog(table);
```

Python:

```python
from pyflink.table.confluent import ConfluentSettings, ConfluentTools
from pyflink.table import TableEnvironment

settings = ConfluentSettings.from_global_variables()
env = TableEnvironment.create(settings)

# On a Table object
table = env.from_path("examples.marketplace.customers")
rows = ConfluentTools.collect_changelog_limit(table, 100)
ConfluentTools.print_changelog_limit(table, 100)

# On a TableResult object
tableResult = env.execute_sql("SELECT * FROM examples.marketplace.customers")
rows = ConfluentTools.collect_changelog_limit(tableResult, 100)
ConfluentTools.print_changelog_limit(tableResult, 100)

# For finite (i.e. bounded) tables
ConfluentTools.collect_changelog(table)
ConfluentTools.print_changelog(table)
```

##### ConfluentTools.collectMaterialized and ConfluentTools.printMaterialized

Runs the specified table transformations on Confluent Cloud and returns the results locally as a materialized changelog. Changes are applied to an in-memory table and returned as a list of insert-only rows or printed to the console in a table style.

These methods run `table.execute().collect()` and consume a fixed number of rows from the returned iterator.

These methods can work on both finite and infinite input tables. If the pipeline is potentially unbounded, they stop fetching after the desired number of rows has been reached.

Java:

```java
// On a Table object
Table table = env.from("examples.marketplace.customers");
List<Row> rows = ConfluentTools.collectMaterialized(table, 100);
ConfluentTools.printMaterialized(table, 100);

// On a TableResult object
TableResult tableResult = env.executeSql("SELECT * FROM examples.marketplace.customers");
List<Row> rows = ConfluentTools.collectMaterialized(tableResult, 100);
ConfluentTools.printMaterialized(tableResult, 100);

// For finite (i.e. bounded) tables
ConfluentTools.collectMaterialized(table);
ConfluentTools.printMaterialized(table);
```

Python:

```python
from pyflink.table.confluent import ConfluentSettings, ConfluentTools
from pyflink.table import TableEnvironment

settings = ConfluentSettings.from_global_variables()
env = TableEnvironment.create(settings)

# On a Table object
table = env.from_path("examples.marketplace.customers")
rows = ConfluentTools.collect_materialized_limit(table, 100)
ConfluentTools.print_materialized_limit(table, 100)

# On a TableResult object
tableResult = env.execute_sql("SELECT * FROM examples.marketplace.customers")
rows = ConfluentTools.collect_materialized_limit(tableResult, 100)
ConfluentTools.print_materialized_limit(tableResult, 100)

# For finite (i.e. bounded) tables
ConfluentTools.collect_materialized(table)
ConfluentTools.print_materialized(table)
```

##### ConfluentTools.getStatementName and ConfluentTools.stopStatement

Additional lifecycle methods for controlling statements on Confluent Cloud after they have been submitted.

Java:

```java
// On a TableResult object
TableResult tableResult = env.executeSql("SELECT * FROM examples.marketplace.customers");
String statementName = ConfluentTools.getStatementName(tableResult);
ConfluentTools.stopStatement(tableResult);

// Based on statement name
ConfluentTools.stopStatement(env, "table-api-2024-03-21-150457-36e0dbb2e366-sql");
```

Python:

```python
# On a TableResult object
table_result = env.execute_sql("SELECT * FROM examples.marketplace.customers")
statement_name = ConfluentTools.get_statement_name(table_result)
ConfluentTools.stop_statement(table_result)

# Based on statement name
ConfluentTools.stop_statement_by_name(env, "table-api-2024-03-21-150457-36e0dbb2e366-sql")
```

#### Confluent table descriptor

A table descriptor for creating tables located in Confluent Cloud programmatically. Compared to the regular Flink class, the `ConfluentTableDescriptor` class adds support for Confluent's system columns and convenience methods for working with Confluent tables.
The for_managed() method corresponds to TableDescriptor.for_connector("confluent"). In the following examples, the watermark definition accesses the $rowtime system column.

Java: `TableDescriptor descriptor = ConfluentTableDescriptor.forManaged().schema(Schema.newBuilder().column("i", DataTypes.INT()).column("s", DataTypes.INT()).watermark("$rowtime", $("$rowtime").minus(lit(5).seconds())).build()).build(); env.createTable("t1", descriptor);`

Python (with ConfluentTableDescriptor imported from pyflink.table.confluent, Schema and DataTypes from pyflink.table, and col and lit from pyflink.table.expressions): `descriptor = ConfluentTableDescriptor.for_managed().schema(Schema.new_builder().column("i", DataTypes.INT()).column("s", DataTypes.INT()).watermark("$rowtime", col("$rowtime").minus(lit(5).seconds)).build()).build(); env.create_table("t1", descriptor)`

#### Known limitations

The Table API plugin is in Open Preview stage.

##### Unsupported by Table API Plugin

The following features are not supported.

- Temporary catalog objects (including tables, views, functions)
- Custom modules
- Custom catalogs
- User-defined functions (including system functions)
- Anonymous, inline objects (including functions, data types)
- CompiledPlan features
- Batch mode
- Restrictions from Confluent Cloud:
  - custom connectors/formats
  - processing time operations
  - structured data types
  - many configuration options
  - limited SQL syntax
  - batch execution mode

##### Issues in Apache Flink

- Both catalog and database must be set, or identifiers must be fully qualified. A mixture of setting a current catalog and using two-part identifiers can cause errors.
- String concatenation with .plus causes errors. Instead, use Expressions.concat.
- Selecting .rowtime in windows causes errors.
- Using .limit() can cause errors.
#### Next steps

- Java Table API Quick Start on Confluent Cloud for Apache Flink
- Python Table API Quick Start on Confluent Cloud for Apache Flink

#### Related content

- Course: Apache Flink® Table API: Processing Data Streams in Java
- Table API functions
- Built-in Functions

#### Code Examples

```sql
<dependencies>
```

```xml
<!-- Apache Flink dependencies -->
<dependency>
   <groupId>org.apache.flink</groupId>
   <artifactId>flink-table-api-java</artifactId>
   <version>${flink.version}</version>
</dependency>

<!-- Confluent Flink Table API Java plugin -->
<dependency>
   <groupId>io.confluent.flink</groupId>
   <artifactId>confluent-flink-table-api-java-plugin</artifactId>
   <version>${confluent-plugin.version}</version>
</dependency>
```

```sql
lfcp-8m03rm
```

```sql
b0b21724-4586-4a07-b787-d0bb5aacbf87
```

```sql
confluent flink region list
```

```sql
https://api.{region}.{cloud}.confluent.cloud
```

```sql
https://flinkpls-dom123.{region}.{cloud}.confluent.cloud
```

```sql
proxyto.confluent.cloud
```

```sql
java.io.tmpdir
```

```sql
https://flink.{region}.{cloud}.confluent.cloud
```

```sql
https://flinkpls-dom123.{region}.{cloud}.confluent.cloud
```

```sql
ENDPOINT_TEMPLATE
```

```sql
https://api.confluent.cloud
```

```sql
https://api.{region}.{cloud}.confluent.cloud
```

```sql
ARTIFACT_ENDPOINT_TEMPLATE
```

```sql
proxy.confluent.cloud
```

```sql
https://flink.{region}.{cloud}.{rest-endpoint}
```

```sql
REST_ENDPOINT
```

```sql
client.endpoint-template
```

```sql
client.rest-endpoint
```

```sql
client.endpoint-template
```

```sql
client.rest-endpoint
```

```sql
client.artifact-endpoint-template
```

```sql
client.rest-endpoint
```

```sql
client.rest-endpoint
```

```sql
client.endpoint-template
```

```sql
https://flink.{region}.{cloud}.confluent.cloud
```

```sql
client.rest-endpoint
```

```sql
client.artifact-endpoint-template
```

```sql
https://api.confluent.cloud
```

```java
// Option 1 (RECOMMENDED): Using endpoint templates
// Resolved endpoints:
// - Statement API: https://flinkpls-dom123.us-east-1.aws.confluent.cloud
// - Artifact API: https://artifacts.us-east-1.aws.custom-domain.com
ConfluentSettings settings1 = ConfluentSettings.newBuilder()
      .setRegion("us-east-1")
      .setCloud("aws")
      .setEndpointTemplate("https://flinkpls-dom123.{region}.{cloud}.confluent.cloud")
      .setArtifactEndpointTemplate("https://artifacts.{region}.{cloud}.custom-domain.com")
      // Other required settings...
      .build();

// Option 2: Using properties file with endpoint templates
// cloud.properties:
// client.region=us-east-1
// client.cloud=aws
// client.endpoint-template=https://flinkpls-dom123.{region}.{cloud}.confluent.cloud
// Resolved endpoints:
// - Statement API: https://flinkpls-dom123.us-east-1.aws.confluent.cloud
// - Artifact API: https://api.confluent.cloud (default)
ConfluentSettings settings2 = ConfluentSettings.fromResource("/cloud.properties");

// Option 3 (DISCOURAGED): Using rest-endpoint (both the statement and artifact endpoints are derived from this)
// Resolved endpoints:
// - Statement API: https://flink.us-east-1.aws.proxy.confluent.cloud
// - Artifact API: https://api.proxy.confluent.cloud
ConfluentSettings settings3 = ConfluentSettings.newBuilder()
   .setRegion("us-east-1")
   .setCloud("aws")
   .setRestEndpoint("proxy.confluent.cloud")
   // Other required settings...
   .build();
```
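The comments in the example above list "resolved endpoints" for each template. As a rough illustration of what that resolution amounts to, the following Python sketch substitutes the `{region}` and `{cloud}` placeholders by hand. The `resolve_endpoint` helper is hypothetical and written only for this illustration; it is not part of the Confluent plugin, which may resolve templates differently.

```python
def resolve_endpoint(template: str, region: str, cloud: str) -> str:
    """Fill the {region} and {cloud} placeholders of an endpoint template.

    Hypothetical helper for illustration; not a Confluent plugin API.
    """
    return template.replace("{region}", region).replace("{cloud}", cloud)


# Template and values taken from the Option 1 example above.
statement_api = resolve_endpoint(
    "https://flinkpls-dom123.{region}.{cloud}.confluent.cloud",
    region="us-east-1",
    cloud="aws",
)
print(statement_api)  # https://flinkpls-dom123.us-east-1.aws.confluent.cloud
```

The printed value matches the Statement API endpoint given in the Option 1 comments.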

```sql
ConfluentSettings
```

```sql
TableEnvironment
```

```java
public static void main(String[] args) {
  // Args might set cloud, region, org, env, and compute pool.
  // Environment variables might pass key and secret.

  // Code sets the session name and SQL-specific options.
  ConfluentSettings settings = ConfluentSettings.newBuilder(args)
   .setContextName("MyTableProgram")
   .setOption("sql.local-time-zone", "UTC")
   .build();

  TableEnvironment env = TableEnvironment.create(settings);
}
```

```python
from pyflink.table.confluent import ConfluentSettings
from pyflink.table import TableEnvironment

def run():
  # Properties file might set cloud, region, org, env, and compute pool.
  # Environment variables might pass key and secret.

  # Code sets the session name and SQL-specific options.
  settings = ConfluentSettings.new_builder_from_file(...) \
   .set_context_name("MyTableProgram") \
   .set_option("sql.local-time-zone", "UTC") \
   .build()

  env = TableEnvironment.create(settings)
```

```sql
cloud.properties
```

```properties
# Cloud region
client.cloud=aws
client.region=eu-west-1

# Access & compute resources
client.flink-api-key=XXXXXXXXXXXXXXXX
client.flink-api-secret=XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
client.organization-id=00000000-0000-0000-0000-000000000000
client.environment-id=env-xxxxx
client.compute-pool-id=lfcp-xxxxxxxxxx
```

```sql
cloud.properties
```

```java
// Arbitrary file location in file system
ConfluentSettings settings = ConfluentSettings.fromPropertiesFile("/path/to/cloud.properties");

// Part of the JAR package (in src/main/resources)
ConfluentSettings settings = ConfluentSettings.fromPropertiesResource("/cloud.properties");
```

```python
from pyflink.table.confluent import ConfluentSettings

# Arbitrary file location in file system
settings = ConfluentSettings.from_file("/path/to/cloud.properties")
```

```shell
java -jar my-table-program.jar \
  --cloud aws \
  --region us-east-1 \
  --flink-api-key key \
  --flink-api-secret secret \
  --organization-id b0b21724-4586-4a07-b787-d0bb5aacbf87 \
  --environment-id env-z3y2x1 \
  --compute-pool-id lfcp-8m03rm
```

```sql
ConfluentSettings.fromArgs
```

```java
public static void main(String[] args) {
  ConfluentSettings settings = ConfluentSettings.fromArgs(args);
}
```

```python
from pyflink.table.confluent import ConfluentSettings

settings = ConfluentSettings.from_global_variables()
```

```sql
ConfluentSettings
```

```java
ConfluentSettings settings = ConfluentSettings.newBuilder()
  .setCloud("aws")
  .setRegion("us-east-1")
  .setFlinkApiKey("key")
  .setFlinkApiSecret("secret")
  .setOrganizationId("b0b21724-4586-4a07-b787-d0bb5aacbf87")
  .setEnvironmentId("env-z3y2x1")
  .setComputePoolId("lfcp-8m03rm")
  .build();
```

```python
from pyflink.table.confluent import ConfluentSettings

settings = ConfluentSettings.new_builder() \
  .set_cloud("aws") \
  .set_region("us-east-1") \
  .set_flink_api_key("key") \
  .set_flink_api_secret("secret") \
  .set_organization_id("b0b21724-4586-4a07-b787-d0bb5aacbf87") \
  .set_environment_id("env-z3y2x1") \
  .set_compute_pool_id("lfcp-8m03rm") \
  .build()
```

```shell
export CLOUD_PROVIDER="aws"
export CLOUD_REGION="us-east-1"
export FLINK_API_KEY="key"
export FLINK_API_SECRET="secret"
export ORG_ID="b0b21724-4586-4a07-b787-d0bb5aacbf87"
export ENV_ID="env-z3y2x1"
export COMPUTE_POOL_ID="lfcp-8m03rm"

java -jar my-table-program.jar
```

```java
ConfluentSettings settings = ConfluentSettings.fromGlobalVariables();
```

```python
from pyflink.table.confluent import ConfluentSettings

settings = ConfluentSettings.from_global_variables()
```

```sql
ConfluentTools
```

```sql
ConfluentTools.collectChangelog
```

```sql
ConfluentTools.printChangelog
```

```sql
table.execute().collect()
```

```java
// On a Table object
Table table = env.from("examples.marketplace.customers");
List<Row> rows = ConfluentTools.collectChangelog(table, 100);
ConfluentTools.printChangelog(table, 100);

// On a TableResult object
TableResult tableResult = env.executeSql("SELECT * FROM examples.marketplace.customers");
List<Row> rows = ConfluentTools.collectChangelog(tableResult, 100);
ConfluentTools.printChangelog(tableResult, 100);

// For finite (i.e. bounded) tables
ConfluentTools.collectChangelog(table);
ConfluentTools.printChangelog(table);
```

```python
from pyflink.table.confluent import ConfluentSettings, ConfluentTools
from pyflink.table import TableEnvironment

settings = ConfluentSettings.from_global_variables()
env = TableEnvironment.create(settings)

# On a Table object
table = env.from_path("examples.marketplace.customers")
rows = ConfluentTools.collect_changelog_limit(table, 100)
ConfluentTools.print_changelog_limit(table, 100)

# On a TableResult object
table_result = env.execute_sql("SELECT * FROM examples.marketplace.customers")
rows = ConfluentTools.collect_changelog_limit(table_result, 100)
ConfluentTools.print_changelog_limit(table_result, 100)

# For finite (i.e. bounded) tables
ConfluentTools.collect_changelog(table)
ConfluentTools.print_changelog(table)
```

```sql
ConfluentTools.collect_materialized
```

```sql
ConfluentTools.print_materialized
```

```sql
table.execute().collect()
```

```java
// On a Table object
Table table = env.from("examples.marketplace.customers");
List<Row> rows = ConfluentTools.collectMaterialized(table, 100);
ConfluentTools.printMaterialized(table, 100);

// On a TableResult object
TableResult tableResult = env.executeSql("SELECT * FROM examples.marketplace.customers");
List<Row> rows = ConfluentTools.collectMaterialized(tableResult, 100);
ConfluentTools.printMaterialized(tableResult, 100);

// For finite (i.e. bounded) tables
ConfluentTools.collectMaterialized(table);
ConfluentTools.printMaterialized(table);
```

```python
from pyflink.table.confluent import ConfluentSettings, ConfluentTools
from pyflink.table import TableEnvironment

settings = ConfluentSettings.from_global_variables()
env = TableEnvironment.create(settings)

# On a Table object
table = env.from_path("examples.marketplace.customers")
rows = ConfluentTools.collect_materialized_limit(table, 100)
ConfluentTools.print_materialized_limit(table, 100)

# On a TableResult object
table_result = env.execute_sql("SELECT * FROM examples.marketplace.customers")
rows = ConfluentTools.collect_materialized_limit(table_result, 100)
ConfluentTools.print_materialized_limit(table_result, 100)

# For finite (i.e. bounded) tables
ConfluentTools.collect_materialized(table)
ConfluentTools.print_materialized(table)
```
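The materialized variants are described as applying changes to an in-memory table and returning insert-only rows. The following Python sketch illustrates that idea on a small, made-up changelog. It is not the ConfluentTools implementation: the `materialize` function and the sample rows are assumptions for illustration, and it assumes a retract-style changelog that uses Flink's row kind short strings (`+I`, `-U`, `+U`, `-D`).

```python
from typing import List, Tuple

# A change is a (row kind, row) pair. The kinds use Flink's RowKind short strings.
Change = Tuple[str, tuple]


def materialize(changelog: List[Change]) -> List[tuple]:
    """Apply a retract-style changelog to an in-memory table (illustrative only)."""
    table: List[tuple] = []
    for kind, row in changelog:
        if kind in ("+I", "+U"):
            # Insert and update-after both add the new version of the row.
            table.append(row)
        elif kind in ("-U", "-D"):
            # Update-before and delete both retract one matching row.
            table.remove(row)
        else:
            raise ValueError(f"unknown row kind: {kind}")
    return table


# Made-up sample data, not taken from examples.marketplace.customers.
changelog = [
    ("+I", (1, "Alice")),
    ("+I", (2, "Bob")),
    ("-U", (1, "Alice")),
    ("+U", (1, "Alicia")),
    ("-D", (2, "Bob")),
]
print(materialize(changelog))  # [(1, 'Alicia')]
```

The changelog variants would surface all five changes, while the materialized view of the same stream contains only the one surviving row.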

```sql
ConfluentTools.getStatementName
```

```sql
ConfluentTools.stopStatement
```

```java
// On TableResult object
TableResult tableResult = env.executeSql("SELECT * FROM examples.marketplace.customers");
String statementName = ConfluentTools.getStatementName(tableResult);
ConfluentTools.stopStatement(tableResult);

// Based on statement name
ConfluentTools.stopStatement(env, "table-api-2024-03-21-150457-36e0dbb2e366-sql");
```

```python
# On TableResult object
table_result = env.execute_sql("SELECT * FROM examples.marketplace.customers")
statement_name = ConfluentTools.get_statement_name(table_result)
ConfluentTools.stop_statement(table_result)

# Based on statement name
ConfluentTools.stop_statement_by_name(env, "table-api-2024-03-21-150457-36e0dbb2e366-sql")
```

```sql
ConfluentTableDescriptor
```

```sql
for_managed()
```

```sql
TableDescriptor.for_connector("confluent")
```

```java
TableDescriptor descriptor = ConfluentTableDescriptor.forManaged()
  .schema(
    Schema.newBuilder()
      .column("i", DataTypes.INT())
      .column("s", DataTypes.INT())
      .watermark("$rowtime", $("$rowtime").minus(lit(5).seconds())) // Access $rowtime system column
      .build())
  .build();

env.createTable("t1", descriptor);
```

```python
from pyflink.table.confluent import ConfluentTableDescriptor
from pyflink.table import Schema, DataTypes
from pyflink.table.expressions import col, lit

descriptor = ConfluentTableDescriptor.for_managed() \
  .schema(
     Schema.new_builder()
       .column("i", DataTypes.INT())
       .column("s", DataTypes.INT())
       .watermark("$rowtime", col("$rowtime").minus(lit(5).seconds)) # Access $rowtime system column
       .build()) \
  .build()

# PyFlink uses snake_case method names: create_table, not createTable
env.create_table("t1", descriptor)
```

```sql
Expressions.concat
```

---

### SQL Timezone Types in Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/flink/reference/timezone.html

Confluent Cloud for Apache Flink® provides rich data types for date and time, including these:

- DATE
- TIME
- TIMESTAMP
- TIMESTAMP_LTZ
- INTERVAL YEAR TO MONTH
- INTERVAL DAY TO SECOND

These datetime types and the related datetime functions enable processing business data across timezones.

#### TIMESTAMP vs TIMESTAMP_LTZ

##### TIMESTAMP type

TIMESTAMP(p) is an abbreviation for TIMESTAMP(p) WITHOUT TIME ZONE. The precision p supports a range from 0 to 9. The default is 6. TIMESTAMP describes a timestamp that represents year, month, day, hour, minute, second, and fractional seconds. TIMESTAMP can be specified from a string literal. The following code example shows a SELECT statement that creates a timestamp from a string.

`SELECT TIMESTAMP '1970-01-01 00:00:04.001';`

Your output should resemble:

| EXPR$0 |
| --- |
| 1970-01-01 00:00:04.001 |

##### TIMESTAMP_LTZ type

TIMESTAMP_LTZ(p) is an abbreviation for TIMESTAMP(p) WITH LOCAL TIME ZONE. The precision p supports a range from 0 to 9. The default is 6. TIMESTAMP_LTZ describes an absolute point in time on the timeline. It stores a LONG value representing epoch-milliseconds and an INT representing nanosecond-of-millisecond. The epoch time is measured from the standard Java epoch of 1970-01-01T00:00:00Z. Every datum of TIMESTAMP_LTZ type is interpreted in the local timezone configured in the current session. Typically, the local timezone is used for computation and visualization.

TIMESTAMP_LTZ can be used for business logic that spans timezones because it represents an absolute point in time. For example, 4001 milliseconds since the epoch describes the same instant in every timezone. If the clocks of all machines in the world report the same epoch value, for example, 4001 milliseconds, they all refer to the same moment, and this is the meaning of "absolute point in time".

TIMESTAMP_LTZ has no literal representation, so you can't create it from a literal. It can be derived from a LONG epoch time, as shown in the following code example.
`SET 'sql.local-time-zone' = 'UTC';`

Your output should resemble:

| Key | Value |
| --- | --- |
| sql.local-time-zone | UTC |

Query the TO_TIMESTAMP_LTZ function to convert a Unix time to a TIMESTAMP_LTZ.

`SELECT TO_TIMESTAMP_LTZ(4001, 3);`

Your output should resemble:

| EXPR$0 |
| --- |
| 1970-01-01 00:00:04.001 |

Change the timezone:

`SET 'sql.local-time-zone' = 'Asia/Shanghai';`

Your output should resemble:

| Key | Value |
| --- | --- |
| sql.local-time-zone | Asia/Shanghai |

Query the time again:

`SELECT TO_TIMESTAMP_LTZ(4001, 3);`

Your output should resemble:

| EXPR$0 |
| --- |
| 1970-01-01 08:00:04.001 |

#### Set the timezone

The local timezone defines the current session timezone ID. You can configure the timezone in the Flink SQL shell or in your applications.

- Set to the UTC timezone: `SET 'sql.local-time-zone' = 'UTC';`
- Set to the Shanghai timezone: `SET 'sql.local-time-zone' = 'Asia/Shanghai';`
- Set to the Los_Angeles timezone: `SET 'sql.local-time-zone' = 'America/Los_Angeles';`

#### Datetime functions and timezones

The return values of the following datetime functions depend on the configured timezone.

- LOCALTIME
- LOCALTIMESTAMP
- CURRENT_DATE
- CURRENT_TIME
- CURRENT_TIMESTAMP
- CURRENT_ROW_TIMESTAMP
- NOW

The following example code shows the return types of these datetime functions.
`CREATE TABLE timeview AS SELECT LOCALTIME, LOCALTIMESTAMP, CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, CURRENT_ROW_TIMESTAMP() AS current_row_ts, NOW() AS now; DESC timeview;`

Your output should resemble:

| Column Name | Data Type | Nullable | Extras |
| --- | --- | --- | --- |
| LOCALTIME | TIME(0) | NOT NULL | |
| LOCALTIMESTAMP | TIMESTAMP(3) | NOT NULL | |
| CURRENT_DATE | DATE | NOT NULL | |
| CURRENT_TIME | TIME(0) | NOT NULL | |
| CURRENT_TIMESTAMP | TIMESTAMP_LTZ(3) | NOT NULL | |
| current_row_ts | TIMESTAMP_LTZ(3) | NOT NULL | |
| now | TIMESTAMP_LTZ(3) | NOT NULL | |

Set the timezone to UTC and query the table.

`SET 'sql.local-time-zone' = 'UTC'; SELECT * FROM timeview;`

Your output should resemble:

| LOCALTIME | LOCALTIMESTAMP | CURRENT_DATE | CURRENT_TIME | CURRENT_TIMESTAMP | current_row_ts | now |
| --- | --- | --- | --- | --- | --- | --- |
| 04:33:01 | 2024-09-26 04:33:01.822 | 2024-09-26 | 04:33:01 | 2024-09-25 20:33:01.822 | 2024-09-25 20:33:01.822 | 2024-09-25 20:33:01.822 |

Change the timezone and query the table again.
`SET 'sql.local-time-zone' = 'Asia/Shanghai'; SELECT * FROM timeview;`

Your output should resemble:

| LOCALTIME | LOCALTIMESTAMP | CURRENT_DATE | CURRENT_TIME | CURRENT_TIMESTAMP | current_row_ts | now |
| --- | --- | --- | --- | --- | --- | --- |
| 04:33:01 | 2024-09-26 04:33:01.822 | 2024-09-26 | 04:33:01 | 2024-09-26 04:33:01.822 | 2024-09-26 04:33:01.822 | 2024-09-26 04:33:01.822 |

#### TIMESTAMP_LTZ string representation

The session timezone is used when a TIMESTAMP_LTZ value is represented in string format, that is, when you print the value, cast the value to STRING, cast the value to TIMESTAMP, or cast a TIMESTAMP value to TIMESTAMP_LTZ:

`CREATE TABLE timeview2 AS SELECT TO_TIMESTAMP_LTZ(4001, 3) AS ltz, TIMESTAMP '1970-01-01 00:00:01.001' AS ntz; DESC timeview2;`

Your output should resemble:

| Column Name | Data Type | Nullable | Extras |
| --- | --- | --- | --- |
| ltz | TIMESTAMP_LTZ(3) | NULL | |
| ntz | TIMESTAMP(3) | NOT NULL | |

Set the timezone to UTC and query the table.

`SET 'sql.local-time-zone' = 'UTC'; SELECT * FROM timeview2;`

Your output should resemble:

| ltz | ntz |
| --- | --- |
| 1970-01-01 00:00:04.001 | 1970-01-01 00:00:01.001 |

Change the timezone and query the table again.

`SET 'sql.local-time-zone' = 'Asia/Shanghai'; SELECT * FROM timeview2;`

Your output should resemble:

| ltz | ntz |
| --- | --- |
| 1970-01-01 08:00:04.001 | 1970-01-01 00:00:01.001 |

The following example shows the column data types that result from casting.
`CREATE TABLE timeview3 AS SELECT ltz, CAST(ltz AS TIMESTAMP(3)) AS ts3, CAST(ltz AS STRING) AS string_rep, ntz, CAST(ntz AS TIMESTAMP_LTZ(3)) AS ts_ltz3 FROM timeview2; DESC timeview3;`

Your output should resemble:

| Column Name | Data Type | Nullable | Extras |
| --- | --- | --- | --- |
| ltz | TIMESTAMP_LTZ(3) | NULL | |
| ts3 | TIMESTAMP(3) | NULL | |
| string_rep | STRING | NULL | |
| ntz | TIMESTAMP(3) | NOT NULL | |
| ts_ltz3 | TIMESTAMP_LTZ(3) | NOT NULL | |

Query the table.

`SELECT * FROM timeview3;`

Your output should resemble:

| ltz | ts3 | string_rep | ntz | ts_ltz3 |
| --- | --- | --- | --- | --- |
| 1970-01-01 08:00:04.001 | 1970-01-01 08:00:04.001 | 1970-01-01 08:00:04.001 | 1970-01-01 00:00:01.001 | 1970-01-01 00:00:01.001 |

#### Time attribute and timezone

For more information about time attributes, see Time attributes.

##### Event time and timezone

Flink SQL supports defining an event-time attribute on TIMESTAMP and TIMESTAMP_LTZ columns.

###### Event-time attribute on TIMESTAMP

If the timestamp data in the source is represented as year-month-day-hour-minute-second, usually a string value without timezone information, for example, 2020-04-15 20:13:40.564, you can define the event-time attribute as a TIMESTAMP column.

###### Event-time attribute on TIMESTAMP_LTZ

If the timestamp data in the source is represented as an epoch time, usually as a LONG value, for example, 1618989564564, you can define an event-time attribute as a TIMESTAMP_LTZ column.

##### Daylight Saving Time support

Flink SQL supports defining time attributes on a TIMESTAMP_LTZ column, and Flink SQL uses the TIMESTAMP and TIMESTAMP_LTZ types in window processing to support Daylight Saving Time. Flink SQL uses a timestamp literal to split the window and assigns windows to data according to the epoch time of each row.
This means that Flink SQL uses the TIMESTAMP type for window start and window end, like TUMBLE_START and TUMBLE_END, and it uses TIMESTAMP_LTZ for window-time attributes, like TUMBLE_ROWTIME. Consider an example tumble window. Daylight Saving Time in the America/Los_Angeles timezone starts at 2021-03-14 02:00:00:

- `long epoch1 = 1615708800000L;` is 2021-03-14 00:00:00
- `long epoch2 = 1615712400000L;` is 2021-03-14 01:00:00
- `long epoch3 = 1615716000000L;` is 2021-03-14 03:00:00, which skips one hour (2021-03-14 02:00:00)
- `long epoch4 = 1615719600000L;` is 2021-03-14 04:00:00

The tumble window [2021-03-14 00:00:00, 2021-03-14 04:00:00] collects 3 hours' worth of data in the America/Los_Angeles timezone, but it collects 4 hours' worth of data in non-DST timezones. You only need to define the time attribute on a TIMESTAMP_LTZ column. All windows in Flink SQL, like Hop, Session, and Cumulative windows, follow this pattern, and all operations in Flink SQL support TIMESTAMP_LTZ, so Flink SQL provides complete support for Daylight Saving Time.

#### Related content

- Datetime Functions
- Time attributes
- Flink SQL Queries
- DDL Statements

#### Code Examples

```sql
TIMESTAMP(p)
```

```sql
TIMESTAMP(p) WITHOUT TIME ZONE
```

```sql
SELECT TIMESTAMP '1970-01-01 00:00:04.001';
```

```sql
EXPR$0
1970-01-01 00:00:04.001
```

```sql
TIMESTAMP_LTZ(p)
```

```sql
TIMESTAMP(p) WITH LOCAL TIME ZONE
```

```sql
TIMESTAMP_LTZ
```

```sql
1970-01-01T00:00:00Z
```

```sql
TIMESTAMP_LTZ
```

```sql
SET 'sql.local-time-zone' = 'UTC';
```

```sql
+---------------------+-------+
|         Key         | Value |
+---------------------+-------+
| sql.local-time-zone | UTC   |
+---------------------+-------+
```

```sql
TIMESTAMP_LTZ
```

```sql
SELECT TO_TIMESTAMP_LTZ(4001, 3);
```

```sql
EXPR$0
1970-01-01 00:00:04.001
```

```sql
SET 'sql.local-time-zone' = 'Asia/Shanghai';
```

```sql
+---------------------+---------------+
|         Key         |     Value     |
+---------------------+---------------+
| sql.local-time-zone | Asia/Shanghai |
+---------------------+---------------+
```

```sql
SELECT TO_TIMESTAMP_LTZ(4001, 3);
```

```sql
EXPR$0
1970-01-01 08:00:04.001
```
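The two query results above show the same epoch value, 4001 milliseconds, rendered differently depending on the session timezone. The following Python sketch reproduces that behavior outside of Flink as an analogy. The `render_ltz` helper is hypothetical, and the sketch assumes Python 3.9 or later with a timezone database available to the standard `zoneinfo` module.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# The standard epoch, 1970-01-01T00:00:00Z.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def render_ltz(epoch_ms: int, session_zone: str) -> str:
    """Render an absolute instant as wall-clock text in the given session timezone."""
    instant = EPOCH + timedelta(milliseconds=epoch_ms)
    local = instant.astimezone(ZoneInfo(session_zone))
    # Format with millisecond precision, like TIMESTAMP_LTZ(3).
    return local.strftime("%Y-%m-%d %H:%M:%S") + f".{local.microsecond // 1000:03d}"


print(render_ltz(4001, "UTC"))            # 1970-01-01 00:00:04.001
print(render_ltz(4001, "Asia/Shanghai"))  # 1970-01-01 08:00:04.001
```

The stored instant never changes; only its textual representation depends on the timezone, which mirrors the two `TO_TIMESTAMP_LTZ(4001, 3)` results.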

```sql
-- set to UTC timezone
SET 'sql.local-time-zone' = 'UTC';

-- set to Shanghai timezone
SET 'sql.local-time-zone' = 'Asia/Shanghai';

-- set to Los_Angeles timezone
SET 'sql.local-time-zone' = 'America/Los_Angeles';
```

```sql
CREATE TABLE timeview AS SELECT
  LOCALTIME,
  LOCALTIMESTAMP,
  CURRENT_DATE,
  CURRENT_TIME,
  CURRENT_TIMESTAMP,
  CURRENT_ROW_TIMESTAMP() as current_row_ts,
  NOW() as now;

DESC timeview;
```

```sql
+-------------------+------------------+----------+--------+
|    Column Name    |    Data Type     | Nullable | Extras |
+-------------------+------------------+----------+--------+
| LOCALTIME         | TIME(0)          | NOT NULL |        |
| LOCALTIMESTAMP    | TIMESTAMP(3)     | NOT NULL |        |
| CURRENT_DATE      | DATE             | NOT NULL |        |
| CURRENT_TIME      | TIME(0)          | NOT NULL |        |
| CURRENT_TIMESTAMP | TIMESTAMP_LTZ(3) | NOT NULL |        |
| current_row_ts    | TIMESTAMP_LTZ(3) | NOT NULL |        |
| now               | TIMESTAMP_LTZ(3) | NOT NULL |        |
+-------------------+------------------+----------+--------+
```

```sql
SET 'sql.local-time-zone' = 'UTC';
SELECT * FROM timeview;
```

```sql
LOCALTIME LOCALTIMESTAMP          CURRENT_DATE CURRENT_TIME CURRENT_TIMESTAMP       current_row_ts          now
04:33:01  2024-09-26 04:33:01.822 2024-09-26   04:33:01     2024-09-25 20:33:01.822 2024-09-25 20:33:01.822 2024-09-25 20:33:01.822
```

```sql
SET 'sql.local-time-zone' = 'Asia/Shanghai';
SELECT * FROM timeview;
```

```sql
LOCALTIME LOCALTIMESTAMP          CURRENT_DATE CURRENT_TIME CURRENT_TIMESTAMP       current_row_ts          now
04:33:01  2024-09-26 04:33:01.822 2024-09-26   04:33:01     2024-09-26 04:33:01.822 2024-09-26 04:33:01.822 2024-09-26 04:33:01.822
```

```sql
TIMESTAMP_LTZ
```

```sql
CREATE TABLE timeview2 AS SELECT
  TO_TIMESTAMP_LTZ(4001, 3) AS ltz,
  TIMESTAMP '1970-01-01 00:00:01.001' AS ntz;

DESC timeview2;
```

```sql
+-------------+------------------+----------+--------+
| Column Name |    Data Type     | Nullable | Extras |
+-------------+------------------+----------+--------+
| ltz         | TIMESTAMP_LTZ(3) | NULL     |        |
| ntz         | TIMESTAMP(3)     | NOT NULL |        |
+-------------+------------------+----------+--------+
```

```sql
SET 'sql.local-time-zone' = 'UTC';
SELECT * FROM timeview2;
```

```sql
ltz                     ntz
1970-01-01 00:00:04.001 1970-01-01 00:00:01.001
```

```sql
SET 'sql.local-time-zone' = 'Asia/Shanghai';
SELECT * FROM timeview2;
```

```sql
ltz                     ntz
1970-01-01 08:00:04.001 1970-01-01 00:00:01.001
```

```sql
CREATE TABLE timeview3 AS SELECT ltz,
  CAST(ltz AS TIMESTAMP(3)) AS ts3,
  CAST(ltz AS STRING) AS string_rep,
  ntz,
  CAST(ntz AS TIMESTAMP_LTZ(3)) AS ts_ltz3 FROM timeview2;

DESC timeview3;
```

```sql
+-------------+------------------+----------+--------+
| Column Name |    Data Type     | Nullable | Extras |
+-------------+------------------+----------+--------+
| ltz         | TIMESTAMP_LTZ(3) | NULL     |        |
| ts3         | TIMESTAMP(3)     | NULL     |        |
| string_rep  | STRING           | NULL     |        |
| ntz         | TIMESTAMP(3)     | NOT NULL |        |
| ts_ltz3     | TIMESTAMP_LTZ(3) | NOT NULL |        |
+-------------+------------------+----------+--------+
```

```sql
SELECT * FROM timeview3;
```

```sql
ltz                     ts3                     string_rep              ntz                     ts_ltz3
1970-01-01 08:00:04.001 1970-01-01 08:00:04.001 1970-01-01 08:00:04.001 1970-01-01 00:00:01.001 1970-01-01 00:00:01.001
```
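The `ts_ltz3` column above comes from casting a TIMESTAMP, which has no timezone, to TIMESTAMP_LTZ, which requires interpreting the wall-clock value in the session timezone. The following Python sketch shows which epoch value the same wall-clock text maps to under two session timezones. It is an analogy rather than Flink code: `wall_clock_to_epoch_ms` is a hypothetical helper, and Python 3.9 or later with an available timezone database is assumed.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# The standard epoch, 1970-01-01T00:00:00Z.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def wall_clock_to_epoch_ms(wall_clock: str, session_zone: str) -> int:
    """Interpret a zone-less timestamp in the session timezone and return epoch milliseconds."""
    naive = datetime.strptime(wall_clock, "%Y-%m-%d %H:%M:%S.%f")
    local = naive.replace(tzinfo=ZoneInfo(session_zone))
    return (local - EPOCH) // timedelta(milliseconds=1)


print(wall_clock_to_epoch_ms("1970-01-01 00:00:01.001", "UTC"))            # 1001
print(wall_clock_to_epoch_ms("1970-01-01 00:00:01.001", "Asia/Shanghai"))  # -28798999
```

Under Asia/Shanghai the value maps to an instant eight hours earlier than under UTC, yet it prints back as 1970-01-01 00:00:01.001 in that session, which is consistent with the `ts_ltz3` output.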

```sql
2020-04-15 20:13:40.564
```

```sql
1618989564564
```

```sql
TIMESTAMP_LTZ
```

```sql
TUMBLE_START
```

```sql
TIMESTAMP_LTZ
```

```sql
TUMBLE_ROWTIME
```

```sql
America/Los_Angeles
```

```sql
2021-03-14 02:00:00
```

```java
long epoch1 = 1615708800000L; // 2021-03-14 00:00:00
long epoch2 = 1615712400000L; // 2021-03-14 01:00:00
long epoch3 = 1615716000000L; // 2021-03-14 03:00:00, skip one hour (2021-03-14 02:00:00)
long epoch4 = 1615719600000L; // 2021-03-14 04:00:00
```
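The comments in the block above can be checked independently of Flink. The following Python sketch converts the four epoch values to America/Los_Angeles wall-clock time and computes the elapsed time between the first and last one. The `wall_clock` helper is hypothetical, and the sketch assumes Python 3.9 or later with a timezone database available to `zoneinfo`.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Epoch values in milliseconds, copied from the example above.
EPOCHS_MS = [1615708800000, 1615712400000, 1615716000000, 1615719600000]


def wall_clock(epoch_ms: int, zone: str) -> str:
    """Convert epoch milliseconds to wall-clock text in the given timezone."""
    instant = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return instant.astimezone(ZoneInfo(zone)).strftime("%Y-%m-%d %H:%M:%S")


for epoch_ms in EPOCHS_MS:
    print(epoch_ms, wall_clock(epoch_ms, "America/Los_Angeles"))

# Real elapsed time between the first and last instant, in hours.
elapsed_hours = (EPOCHS_MS[-1] - EPOCHS_MS[0]) // 3_600_000
print(elapsed_hours)  # 3
```

The wall clock jumps from 01:00:00 to 03:00:00, so a window whose bounds read 00:00:00 to 04:00:00 local time covers only three hours of epoch time.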

```sql
America/Los_Angeles
```

---

### Flink authentication and authorization event methods (Confluent Cloud audit logs) | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/monitoring/audit-logging/event-methods/flink-authn-authz.html

Confluent Cloud audit logs contain records of auditable events for authentication and authorization operations. When an auditable event occurs, a message is sent to the audit log and is stored as an audit log record.

#### Flink region authentication auditable event methods

Included here are operations authenticating to a Flink region that generate auditable event messages for the io.confluent.flink.server/authentication event type.

| Method name | Action triggering an auditable event message |
| --- | --- |
| flink.Authenticate | A request for authentication to a Flink region. |

##### Examples

###### flink.Authenticate

The flink.Authenticate event method is triggered by a request to authenticate to a Flink region.

SUCCESS

`{ "type": "io.confluent.flink.server/authentication", "id": "f388a04b-0bbe-4e10-9b97-b2f565274196", "subject": "crn://confluent.cloud/organization=7c210ed4-6e1e-4355-abf9-b25e25a8b25a/environment=env-xmzdkk/flink-region=AWS.eu-central-1", "@timestamp": "2024-01-12T13:33:46.296Z", "datacontenttype": "application/json", "@version": "1", "kafka.partition": "106", "dataschema": "https://confluent.io/internal/events/AuditLog.v2", "specversion": "1.0", "source": "crn://confluent.cloud/", "kafka.offset": "2495047099", "time": "2024-01-12T13:33:46.296209728Z", "data": { "requestMetadata": { "clientAddress": [ { "ip": "134.238.54.136" } ], "requestId": [ "d31875a39d6e5eae08e0419176808af3" ] }, "internalServiceName": "crn://confluent.cloud/organization=7c210ed4-6e1e-4355-abf9-b25e25a8b25a/environment=env-xmzdkk/flink-region=AWS.eu-central-1", "cloudResources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resourceId": "7c210ed4-6e1e-4355-abf9-b25e25a8b25a" }, { "type": "ENVIRONMENT", "resourceId": "env-xmzdkk" } ] }, "resource": { "type": "FLINK_REGION", "resourceId": "AWS.eu-central-1" } } ], "result": { "status": "SUCCESS" }, "request": { "accessType": "READ_ONLY", "data": "{\"intendedLogicalClusterCrn\":\"crn://confluent.cloud/organization=7c210ed4-6e1e-4355-abf9-b25e25a8b25a/environment=env-xmzdkk/flink-region=AWS.eu-central-1\"}" }, "serviceName": "crn://confluent.cloud/organization=7c210ed4-6e1e-4355-abf9-b25e25a8b25a/environment=env-xmzdkk/flink-region=AWS.eu-central-1", "methodName": "flink.Authenticate", "authenticationInfo": { "result": "SUCCESS", "exposure": "CUSTOMER", "credentials": { "mechanism": "HTTP_BEARER", "idTokenCredentials": { "type": "JWT", "issuer": "Confluent", "subject": "1281943" } } } } }`

#### Flink Authorization auditable event methods

Included here are operations authorizing principals to access, modify, delete, or create a Flink resource that generate auditable event messages for the io.confluent.flink.server/authorization event type.

| Method name | Action triggering an auditable event message |
| --- | --- |
| flink.Authorize | A request to authorize a principal to access, modify, delete, or create a Flink resource. |

##### Examples

###### flink.Authorize

The flink.Authorize event method is triggered by a request to authorize a principal to access, modify, delete, or create a Flink resource (STATEMENT or WORKSPACE).
SUCCESS { "cloudResources": [ { "scope": { "resources": [ { "resourceId": "49aea135-19f4-4e75-adb3-8ca5dd04e292", "type": "ORGANIZATION" }, { "resourceId": "env-3ny01o", "type": "ENVIRONMENT" }, { "resourceId": "azure.eastus2", "type": "FLINK_REGION" } ] }, "resource": { "resourceId": "workspace-2024-03-07-030236-92003e1d-1abf-4401-bbfb-57b6b9ead5de", "type": "STATEMENT" } } ], "authorizationInfo": { "resourceName": "workspace-2024-03-07-030236-92003e1d-1abf-4401-bbfb-57b6b9ead5de", "operation": "Describe", "resourceType": "STATEMENT", "rbacAuthorization": { "patternType": "LITERAL", "resourceType": "Statement", "actingPrincipal": { "group": { "resourceId": "group-Xmgn" } }, "role": "FlinkAdmin", "patternName": "*", "operation": "Describe", "cloudScope": { "resources": [ { "resourceId": "49aea135-19f4-4e75-adb3-8ca5dd04e292", "type": "ORGANIZATION" }, { "resourceId": "env-3px32m", "type": "ENVIRONMENT" } ] } }, "result": "ALLOW" }, "request": { "accessType": "READ_ONLY" }, "internalServiceName": "crn://confluent.cloud/organization=49afb126-18f4-4e76-adb3-8ca5dd04e393/environment=env-3px32m/flink-region=azure.eastus2", "authenticationInfo": { "exposure": "CUSTOMER", "identity": "crn://confluent.cloud/organization=49afb126-18f4-4e76-adb3-8ca5dd04e393/identity-provider=Confluent/identity=u-nqxk78", "principal": { "confluentUser": { "resourceId": "u-nqxk78" } }, "result": "SUCCESS" }, "serviceName": "crn://confluent.cloud/organization=49afb126-18f4-4e76-adb3-8ca5dd04e393/environment=env-3px32m/flink-region=azure.eastus2", "methodName": "flink.Authorize", "requestMetadata": { "requestId": [ "52107f4df7fce0356e278c20ce143418" ], "clientAddress": [ { "ip": "1.2.3.4.5" } ] }, "result": { "status": "SUCCESS" } }
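When consuming these records, it is often enough to pull out the method name and the outcome. The following Python sketch shows one way this might be done. The `summarize_audit_event` helper is hypothetical, and the two sample records are heavily trimmed versions of the examples above. The sketch assumes that a record either wraps its payload in a `data` field, as in the flink.Authenticate example, or is the payload itself, as in the flink.Authorize example.

```python
import json


def summarize_audit_event(raw: str) -> dict:
    """Extract the method name and outcome fields from one audit log record."""
    record = json.loads(raw)
    # Use the nested payload when the record has a "data" envelope.
    payload = record.get("data", record)
    authorization = payload.get("authorizationInfo", {})
    return {
        "method": payload.get("methodName"),
        "status": payload.get("result", {}).get("status"),
        "authorization": authorization.get("result"),
    }


# Trimmed sample records based on the examples above.
authenticate_event = json.dumps({
    "type": "io.confluent.flink.server/authentication",
    "data": {"methodName": "flink.Authenticate", "result": {"status": "SUCCESS"}},
})
authorize_event = json.dumps({
    "methodName": "flink.Authorize",
    "authorizationInfo": {"operation": "Describe", "result": "ALLOW"},
    "result": {"status": "SUCCESS"},
})

print(summarize_audit_event(authenticate_event))
print(summarize_audit_event(authorize_event))
```

A filter built on such a summary could, for instance, keep only the records whose `authorization` value is not `ALLOW`.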

#### Code Examples

```text
io.confluent.flink.server/authentication
```

```text
flink.Authenticate
```

```json
{
  "type": "io.confluent.flink.server/authentication",
  "id": "f388a04b-0bbe-4e10-9b97-b2f565274196",
  "subject": "crn://confluent.cloud/organization=7c210ed4-6e1e-4355-abf9-b25e25a8b25a/environment=env-xmzdkk/flink-region=AWS.eu-central-1",
  "@timestamp": "2024-01-12T13:33:46.296Z",
  "datacontenttype": "application/json",
  "@version": "1",
  "kafka.partition": "106",
  "dataschema": "https://confluent.io/internal/events/AuditLog.v2",
  "specversion": "1.0",
  "source": "crn://confluent.cloud/",
  "kafka.offset": "2495047099",
  "time": "2024-01-12T13:33:46.296209728Z",
  "data": {
    "requestMetadata": {
      "clientAddress": [
        {
          "ip": "134.238.54.136"
        }
      ],
      "requestId": [
        "d31875a39d6e5eae08e0419176808af3"
      ]
    },
    "internalServiceName": "crn://confluent.cloud/organization=7c210ed4-6e1e-4355-abf9-b25e25a8b25a/environment=env-xmzdkk/flink-region=AWS.eu-central-1",
    "cloudResources": [
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resourceId": "7c210ed4-6e1e-4355-abf9-b25e25a8b25a"
            },
            {
              "type": "ENVIRONMENT",
              "resourceId": "env-xmzdkk"
            }
          ]
        },
        "resource": {
          "type": "FLINK_REGION",
          "resourceId": "AWS.eu-central-1"
        }
      }
    ],
    "result": {
      "status": "SUCCESS"
    },
    "request": {
      "accessType": "READ_ONLY",
      "data": "{\"intendedLogicalClusterCrn\":\"crn://confluent.cloud/organization=7c210ed4-6e1e-4355-abf9-b25e25a8b25a/environment=env-xmzdkk/flink-region=AWS.eu-central-1\"}"
    },
    "serviceName": "crn://confluent.cloud/organization=7c210ed4-6e1e-4355-abf9-b25e25a8b25a/environment=env-xmzdkk/flink-region=AWS.eu-central-1",
    "methodName": "flink.Authenticate",
    "authenticationInfo": {
      "result": "SUCCESS",
      "exposure": "CUSTOMER",
      "credentials": {
        "mechanism": "HTTP_BEARER",
        "idTokenCredentials": {
          "type": "JWT",
          "issuer": "Confluent",
          "subject": "1281943"
        }
      }
    }
  }
}
```
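When triaging authentication records like the one above, it usually helps to flatten the handful of fields that answer "which method, which credential mechanism, and did it succeed". The following is a minimal sketch, assuming the record has already been consumed from the audit log topic as a JSON string. The helper name `summarize_authentication` and the output keys are hypothetical, and the sample is a trimmed copy of the record shown above.

```python
import json

# Trimmed copy of the flink.Authenticate record shown above; only the
# fields read by the helper are kept.
RAW_EVENT = """
{
  "type": "io.confluent.flink.server/authentication",
  "time": "2024-01-12T13:33:46.296209728Z",
  "data": {
    "methodName": "flink.Authenticate",
    "result": {"status": "SUCCESS"},
    "authenticationInfo": {
      "result": "SUCCESS",
      "exposure": "CUSTOMER",
      "credentials": {
        "mechanism": "HTTP_BEARER",
        "idTokenCredentials": {"type": "JWT", "issuer": "Confluent", "subject": "1281943"}
      }
    }
  }
}
"""


def summarize_authentication(event: dict) -> dict:
    """Flatten the authentication fields of one audit record.

    Hypothetical helper: missing fields come back as None instead of
    raising, because not every record carries every field.
    """
    data = event.get("data", {})
    auth = data.get("authenticationInfo", {})
    creds = auth.get("credentials", {})
    token = creds.get("idTokenCredentials", {})
    return {
        "event_type": event.get("type"),
        "method": data.get("methodName"),
        "auth_result": auth.get("result"),
        "mechanism": creds.get("mechanism"),
        "token_subject": token.get("subject"),
    }


summary = summarize_authentication(json.loads(RAW_EVENT))
print(summary)
```

Using `dict.get` with empty-dict defaults keeps the helper tolerant of records that omit the `credentials` block, which is a design choice of this sketch rather than documented behavior.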

```text
io.confluent.flink.server/authorization
```

```text
flink.Authorize
```

```json
{
    "cloudResources": [
        {
            "scope": {
                "resources": [
                    {
                        "resourceId": "49aea135-19f4-4e75-adb3-8ca5dd04e292",
                        "type": "ORGANIZATION"
                    },
                    {
                        "resourceId": "env-3ny01o",
                        "type": "ENVIRONMENT"
                    },
                    {
                        "resourceId": "azure.eastus2",
                        "type": "FLINK_REGION"
                    }
                ]
            },
            "resource": {
                "resourceId": "workspace-2024-03-07-030236-92003e1d-1abf-4401-bbfb-57b6b9ead5de",
                "type": "STATEMENT"
            }
        }
    ],
    "authorizationInfo": {
        "resourceName": "workspace-2024-03-07-030236-92003e1d-1abf-4401-bbfb-57b6b9ead5de",
        "operation": "Describe",
        "resourceType": "STATEMENT",
        "rbacAuthorization": {
            "patternType": "LITERAL",
            "resourceType": "Statement",
            "actingPrincipal": {
                "group": {
                    "resourceId": "group-Xmgn"
                }
            },
            "role": "FlinkAdmin",
            "patternName": "*",
            "operation": "Describe",
            "cloudScope": {
                "resources": [
                    {
                        "resourceId": "49aea135-19f4-4e75-adb3-8ca5dd04e292",
                        "type": "ORGANIZATION"
                    },
                    {
                        "resourceId": "env-3px32m",
                        "type": "ENVIRONMENT"
                    }
                ]
            }
        },
        "result": "ALLOW"
    },
    "request": {
        "accessType": "READ_ONLY"
    },
    "internalServiceName": "crn://confluent.cloud/organization=49afb126-18f4-4e76-adb3-8ca5dd04e393/environment=env-3px32m/flink-region=azure.eastus2",
    "authenticationInfo": {
        "exposure": "CUSTOMER",
        "identity": "crn://confluent.cloud/organization=49afb126-18f4-4e76-adb3-8ca5dd04e393/identity-provider=Confluent/identity=u-nqxk78",
        "principal": {
            "confluentUser": {
                "resourceId": "u-nqxk78"
            }
        },
        "result": "SUCCESS"
    },
    "serviceName": "crn://confluent.cloud/organization=49afb126-18f4-4e76-adb3-8ca5dd04e393/environment=env-3px32m/flink-region=azure.eastus2",
    "methodName": "flink.Authorize",
    "requestMetadata": {
        "requestId": [
            "52107f4df7fce0356e278c20ce143418"
        ],
        "clientAddress": [
            {
                "ip": "1.2.3.4.5"
            }
        ]
    },
    "result": {
        "status": "SUCCESS"
    }
}
```
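An authorization record answers four questions: who asked, for which operation, on which resource, and what the decision was. The sketch below pulls those answers out of a payload shaped like the flink.Authorize example above. It assumes the payload may arrive either bare (as shown) or wrapped in an envelope under a `data` key. The helper name `summarize_authorization` is hypothetical, and the only decision value taken from the example is `ALLOW`, so anything else is treated as "not allowed".

```python
# Trimmed copy of the flink.Authorize payload shown above.
AUTHZ_PAYLOAD = {
    "authorizationInfo": {
        "resourceName": "workspace-2024-03-07-030236-92003e1d-1abf-4401-bbfb-57b6b9ead5de",
        "operation": "Describe",
        "resourceType": "STATEMENT",
        "rbacAuthorization": {
            "role": "FlinkAdmin",
            "actingPrincipal": {"group": {"resourceId": "group-Xmgn"}},
        },
        "result": "ALLOW",
    },
    "authenticationInfo": {
        "principal": {"confluentUser": {"resourceId": "u-nqxk78"}},
        "result": "SUCCESS",
    },
    "methodName": "flink.Authorize",
}


def summarize_authorization(record: dict) -> dict:
    """Summarize one authorization record (hypothetical helper).

    Accepts either the bare payload or an envelope that nests the
    payload under a "data" key.
    """
    payload = record.get("data", record)
    authz = payload.get("authorizationInfo", {})
    rbac = authz.get("rbacAuthorization", {})
    principal = payload.get("authenticationInfo", {}).get("principal", {})
    return {
        "user": principal.get("confluentUser", {}).get("resourceId"),
        "operation": authz.get("operation"),
        "resource_type": authz.get("resourceType"),
        "resource_name": authz.get("resourceName"),
        "role": rbac.get("role"),
        # Only "ALLOW" appears in the example above; every other value,
        # including a missing one, is reported as not allowed.
        "allowed": authz.get("result") == "ALLOW",
    }


authz_summary = summarize_authorization(AUTHZ_PAYLOAD)
print(authz_summary)
```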

---

### Auditable event methods for Apache Flink (Confluent Cloud) | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/monitoring/audit-logging/event-methods/flink.html

Auditable Event Methods for Apache Flink on Confluent Cloud

Auditable event methods for Confluent Cloud for Apache Flink are triggered by operations on Apache Flink® in Confluent Cloud. Each method sends an event message about the operation to the audit log cluster, where it is stored as an event record in a Kafka topic.

Auditable event methods are triggered for the following resource types:

- Flink region (FLINK_REGION)
- Flink compute pool (COMPUTE_POOL)
- Flink workspace (FLINK_WORKSPACE)
- Flink statement (STATEMENT)

The following sections provide details about the auditable event methods for each of these resource types.

Flink region

Auditable event methods for the resource type FLINK_REGION are triggered by operations on a Flink region and generate event messages that are sent to the audit log cluster, where they are stored as event records in a Kafka topic.

| Method name | Action triggering an auditable event message |
| --- | --- |
| ListFlinkRegions | A request to list the Flink regions in the organization. |

ListFlinkRegions

The ListFlinkRegions event method is triggered by a request to get a list of the Flink regions in the organization and sends an event message that is saved in the audit log as an event record.

Examples¶ SUCCESS { "specversion": "1.0", "id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "source": "crn://confluent.cloud/", "type": "io.confluent.cloud/request", "subject": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005", "datacontenttype": "application/json", "dataschema": "https://confluent.io/internal/events/AuditLog.v2", "data": { "service_name": "crn://confluent.cloud/service=cc-ksql-api-service", "internal_service_name": "crn://confluent.cloud/service=cc-ksql-api-service", "method_name": "ListFlinkRegions", "cloud_resources": [ { "resource": { "type": "ORGANIZATION", "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005" } } ], "authentication_info": { "exposure": "CUSTOMER", "principal": { "confluent_user": { "resource_id": "user-1", "internal_id": "99" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/identity-provider=Confluent/identity=user-1" }, "request_metadata": { "request_id": [ "74726163656964303132333435363738" ], "client_address": [ { "ip": "1.2.3.4" } ] }, "request": { "access_type": "READ_ONLY", "data": { "BypassCache": false, "Cloud": 0, "PageSize": 10, "PageToken": "", "RegionName": "" } }, "result": { "status": "SUCCESS", "data": { "elements": [ { "fcpm_v_2_region": { "id": "aws.af-south-1", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-east-1", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-northeast-1", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-northeast-2", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-northeast-3", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-south-1", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-south-2", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-southeast-1", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-southeast-2", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-southeast-3", "metadata": null } } ] } } } } 
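The record above spells its keys in snake_case (`method_name`, `cloud_resources`), while other records in this reference use camelCase (`methodName`, `cloudResources`). A consumer that reads both kinds may want to normalize key names before inspecting a record. The following is a minimal sketch of that idea: `to_snake`, `normalize_keys`, and `region_ids` are hypothetical helpers, the sample is a trimmed copy of the ListFlinkRegions record, and the normalization also rewrites payload keys such as `PageSize`, which may or may not be what you want.

```python
import re


def to_snake(name: str) -> str:
    """Convert a camelCase or PascalCase key to snake_case."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def normalize_keys(value):
    """Recursively rewrite every dict key to snake_case.

    Note: this touches all nested keys, including request payload keys
    like "PageSize", not only the envelope fields.
    """
    if isinstance(value, dict):
        return {to_snake(key): normalize_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [normalize_keys(item) for item in value]
    return value


def region_ids(event: dict) -> list:
    """Return the region IDs listed in a ListFlinkRegions result."""
    data = normalize_keys(event).get("data", {})
    elements = data.get("result", {}).get("data", {}).get("elements", [])
    return [
        element["fcpm_v_2_region"]["id"]
        for element in elements
        if "fcpm_v_2_region" in element
    ]


# Trimmed copy of the ListFlinkRegions record shown above.
LIST_REGIONS_EVENT = {
    "type": "io.confluent.cloud/request",
    "data": {
        "method_name": "ListFlinkRegions",
        "request": {"access_type": "READ_ONLY", "data": {"PageSize": 10}},
        "result": {
            "status": "SUCCESS",
            "data": {
                "elements": [
                    {"fcpm_v_2_region": {"id": "aws.af-south-1", "metadata": None}},
                    {"fcpm_v_2_region": {"id": "aws.ap-east-1", "metadata": None}},
                ]
            },
        },
    },
}

print(region_ids(LIST_REGIONS_EVENT))
print(normalize_keys({"methodName": "GetWorkspace"}))
```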
Flink compute pool

Auditable event methods for the resource type COMPUTE_POOL are triggered by operations on a Flink compute pool and generate event messages that are sent to the audit log cluster, where they are stored as event records in a Kafka topic.

| Method name | Action triggering an auditable event message |
| --- | --- |
| CreateComputePool | A request to create a Flink compute pool. |
| DeleteComputePool | A request to delete a Flink compute pool. |
| GetComputePool | A request to get the details of a Flink compute pool. |
| ListComputePools | A request for a list of Flink compute pools. |
| UpdateComputePool | A request to update a Flink compute pool. |

CreateComputePool

The CreateComputePool event method is triggered by a request to create a Flink compute pool and sends an event message that is saved in the audit log as an event record.

Examples

SUCCESS { "specversion": "1.0", "id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "source": "crn://confluent.cloud/", "type": "io.confluent.cloud/request", "subject": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005", "datacontenttype": "application/json", "dataschema": "https://confluent.io/internal/events/AuditLog.v2", "data": { "service_name": "crn://confluent.cloud/service=cc-ksql-api-service", "internal_service_name": "crn://confluent.cloud/service=cc-ksql-api-service", "method_name": "ListRegions", "cloud_resources": [ { "resource": { "type": "ORGANIZATION", "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005" } } ], "authentication_info": { "exposure": "CUSTOMER", "principal": { "confluent_user": { "resource_id": "user-1", "internal_id": "99" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/identity-provider=Confluent/identity=user-1" }, "request_metadata": { "request_id": [ "74726163656964303132333435363738" ], "client_address": [ { "ip": "1.2.3.4" } ] }, "request": { "access_type": "READ_ONLY", "data": { "BypassCache": false, "Cloud": 0, "PageSize": 10, "PageToken": 
"", "RegionName": "" } }, "result": { "status": "SUCCESS", "data": { "elements": [ { "fcpm_v_2_region": { "id": "aws.af-south-1", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-east-1", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-northeast-1", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-northeast-2", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-northeast-3", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-south-1", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-south-2", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-southeast-1", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-southeast-2", "metadata": null } }, { "fcpm_v_2_region": { "id": "aws.ap-southeast-3", "metadata": null } } ] } } } } DeleteComputePool¶ The DeleteComputePool event method is triggered by a request to delete a Flink compute pool and sends an event message that is saved in the audit log as an event record. Examples¶ SUCCESS { "specversion": "1.0", "id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "source": "crn://confluent.cloud/", "type": "io.confluent.cloud/request", "subject": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/environment=env-j30y0iqp/flink-region=azure.uksouth/compute-pool=lfcp-1", "datacontenttype": "application/json", "dataschema": "https://confluent.io/internal/events/AuditLog.v2", "data": { "service_name": "crn://confluent.cloud/service=cc-ksql-api-service", "internal_service_name": "crn://confluent.cloud/service=cc-ksql-api-service", "method_name": "DeleteComputePool", "cloud_resources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005" }, { "type": "ENVIRONMENT", "resource_id": "env-j30y0iqp" }, { "type": "FLINK_REGION", "resource_id": "azure.uksouth" } ] }, "resource": { "type": "COMPUTE_POOL", "resource_id": "lfcp-1" } } ], "authentication_info": { "exposure": "CUSTOMER", "principal": { 
"confluent_user": { "resource_id": "user-1", "internal_id": "99" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/identity-provider=Confluent/identity=user-1" }, "request_metadata": { "request_id": [ "74726163656964303132333435363738" ], "client_address": [ { "ip": "1.2.3.4" } ] }, "request": { "access_type": "MODIFICATION", "data": { "environment_id": "env-j30y0iqp", "id": "lfcp-1" } }, "result": { "status": "SUCCESS" } } } GetComputePool¶ The GetComputePool event method is triggered by a request to get the details for a Flink compute pool and sends an event message that is saved in the audit log as an event record. Examples¶ SUCCESS { "specversion": "1.0", "id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "source": "crn://confluent.cloud/", "type": "io.confluent.cloud/request", "subject": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/environment=env-j30y0iqp/flink-region=azure.uksouth/compute-pool=lfcp-1", "datacontenttype": "application/json", "dataschema": "https://confluent.io/internal/events/AuditLog.v2", "data": { "service_name": "crn://confluent.cloud/service=cc-ksql-api-service", "internal_service_name": "crn://confluent.cloud/service=cc-ksql-api-service", "method_name": "DeleteComputePool", "cloud_resources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005" }, { "type": "ENVIRONMENT", "resource_id": "env-j30y0iqp" }, { "type": "FLINK_REGION", "resource_id": "azure.uksouth" } ] }, "resource": { "type": "COMPUTE_POOL", "resource_id": "lfcp-1" } } ], "authentication_info": { "exposure": "CUSTOMER", "principal": { "confluent_user": { "resource_id": "user-1", "internal_id": "99" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/identity-provider=Confluent/identity=user-1" }, "request_metadata": { "request_id": [ "74726163656964303132333435363738" ], 
"client_address": [ { "ip": "1.2.3.4" } ] }, "request": { "access_type": "MODIFICATION", "data": { "environment_id": "env-j30y0iqp", "id": "lfcp-1" } }, "result": { "status": "SUCCESS" } } } ListComputePools¶ The ListComputePools event method is triggered by a request for a list of Flink compute pools and sends an event message that is saved in the audit log as an event record. Examples¶ SUCCESS { "specversion": "1.0", "id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "source": "crn://confluent.cloud/", "type": "io.confluent.cloud/request", "subject": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/environment=env-j30y0iqp/flink-region=azure.uksouth/compute-pool=lfcp-1", "datacontenttype": "application/json", "dataschema": "https://confluent.io/internal/events/AuditLog.v2", "data": { "service_name": "crn://confluent.cloud/service=cc-ksql-api-service", "internal_service_name": "crn://confluent.cloud/service=cc-ksql-api-service", "method_name": "DeleteComputePool", "cloud_resources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005" }, { "type": "ENVIRONMENT", "resource_id": "env-j30y0iqp" }, { "type": "FLINK_REGION", "resource_id": "azure.uksouth" } ] }, "resource": { "type": "COMPUTE_POOL", "resource_id": "lfcp-1" } } ], "authentication_info": { "exposure": "CUSTOMER", "principal": { "confluent_user": { "resource_id": "user-1", "internal_id": "99" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/identity-provider=Confluent/identity=user-1" }, "request_metadata": { "request_id": [ "74726163656964303132333435363738" ], "client_address": [ { "ip": "1.2.3.4" } ] }, "request": { "access_type": "MODIFICATION", "data": { "environment_id": "env-j30y0iqp", "id": "lfcp-1" } }, "result": { "status": "SUCCESS" } } } UpdateComputePool¶ The UpdateComputePool event method is triggered by a request to update a Flink compute pool and sends 
an event message that is saved in the audit log as an event record. Examples¶ SUCCESS { "specversion": "1.0", "id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "source": "crn://confluent.cloud/", "type": "io.confluent.cloud/request", "subject": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/environment=env-j30y0iqp/flink-region=azure.uksouth/compute-pool=lfcp-1", "datacontenttype": "application/json", "dataschema": "https://confluent.io/internal/events/AuditLog.v2", "data": { "service_name": "crn://confluent.cloud/service=cc-ksql-api-service", "internal_service_name": "crn://confluent.cloud/service=cc-ksql-api-service", "method_name": "DeleteComputePool", "cloud_resources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005" }, { "type": "ENVIRONMENT", "resource_id": "env-j30y0iqp" }, { "type": "FLINK_REGION", "resource_id": "azure.uksouth" } ] }, "resource": { "type": "COMPUTE_POOL", "resource_id": "lfcp-1" } } ], "authentication_info": { "exposure": "CUSTOMER", "principal": { "confluent_user": { "resource_id": "user-1", "internal_id": "99" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/identity-provider=Confluent/identity=user-1" }, "request_metadata": { "request_id": [ "74726163656964303132333435363738" ], "client_address": [ { "ip": "1.2.3.4" } ] }, "request": { "access_type": "MODIFICATION", "data": { "environment_id": "env-j30y0iqp", "id": "lfcp-1" } }, "result": { "status": "SUCCESS" } } } Flink workspace¶ Auditable event methods for the resource type FLINK_WORKSPACE are triggered by operations on a Flink workspace and generate event messages that are sent to the audit log cluster, where they are stored as event records in a Kafka topic. Method name Action triggering an auditable event message CreateWorkspace A request to create a Flink workspace. DeleteWorkspace A request to delete a Flink workspace. 
GetWorkspace A request for a query of a Flink workspace details. ListWorkspaces A request for a list of Flink workspaces. UpdateWorkspace A request to update a Flink workspace. CreateWorkspace¶ The CreateWorkspace event method is triggered by a request to create a Flink workspace and sends an event message that is saved in the audit log as an event record. Examples¶ SUCCESS { "datacontenttype": "application/json", "data": { "serviceName": "crn://confluent.cloud/", "methodName": "CreateWorkspace", "cloudResources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resourceId": "a56cf537-ab71-480e-b272-43e71531798b" }, { "type": "ENVIRONMENT", "resourceId": "env-rzhxp2" }, { "type": "FLINK_REGION", "resourceId": "aws.us-east-1" } ] }, "resource": { "type": "FLINK_WORKSPACE", "resourceId": "workspace-2023-09-22-162414" } } ], "authenticationInfo": { "principal": { "confluentUser": { "resourceId": "u-123456" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/identity-provider=Confluent/identity=u-123456" }, "requestMetadata": { "requestId": [ "8b4f7ec5693a01fb4a1ae0a24240f944" ], "clientAddress": [ { "ip": "1.2.3.4" } ] }, "request": { "accessType": "MODIFICATION", "data": { "workspace_name": "workspace-2023-09-22-162414", "environment_id": "env-rzhxp2", "org_resource_id": "a56cf537-ab71-480e-b272-43e71531798b", "spec": { "compute_pool": { "id": "lfcp-stgcc30xr80" }, "service_account": null } } }, "result": { "status": "SUCCESS", "data": { "environment_id": "env-rzhxp2", "name": "workspace-2023-09-22-162414", "org_id": "a56cf537-ab71-480e-b272-43e71531798b", "spec": { "compute_pool": { "id": "lfcp-stgcc30xr80" }, "service_account": null } } }, "resourceName": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414" }, "subject": 
"crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414", "specversion": "1.0", "id": "b76bee22-7678-49ea-8902-67519b0d4133", "source": "crn://confluent.cloud/", "time": "2023-09-22T16:24:15.007233032Z", "type": "io.confluent.cloud/request" } DeleteWorkspace¶ The DeleteWorkspace event method is triggered by a request to delete a Flink workspace and sends an event message that is saved in the audit log as an event record. Examples¶ SUCCESS { "datacontenttype": "application/json", "data": { "serviceName": "crn://confluent.cloud/", "methodName": "DeleteWorkspace", "cloudResources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resourceId": "a56cf537-ab71-480e-b272-43e71531798b" }, { "type": "ENVIRONMENT", "resourceId": "env-rzhxp2" }, { "type": "FLINK_REGION", "resourceId": "aws.us-east-1" } ] }, "resource": { "type": "FLINK_WORKSPACE", "resourceId": "workspace-2023-09-22-162414" } } ], "authenticationInfo": { "principal": { "confluentUser": { "resourceId": "u-123456" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/identity-provider=Confluent/identity=u-123456" }, "requestMetadata": { "requestId": [ "6a4dd657fe6fc5241360983cbf8dc8ce" ], "clientAddress": [ { "ip": "1.2.3.4" } ] }, "request": { "accessType": "MODIFICATION", "data": { "workspace_name": "workspace-2023-09-22-162414", "environment_id": "env-rzhxp2", "org_resource_id": "a56cf537-ab71-480e-b272-43e71531798b" } }, "result": { "status": "SUCCESS" }, "resourceName": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414" }, "subject": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414", "specversion": 
"1.0", "id": "36791901-6bd6-4057-8820-9d6860d56d0c", "source": "crn://confluent.cloud/", "time": "2023-09-22T16:24:41.773914645Z", "type": "io.confluent.cloud/request" } GetWorkspace¶ The GetWorkspace event method is triggered by a request to get the details for a Flink workspace and sends an event message that is saved in the audit log as an event record. Examples¶ SUCCESS { "datacontenttype": "application/json", "data": { "serviceName": "crn://confluent.cloud/", "methodName": "GetWorkspace", "cloudResources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resourceId": "a56cf537-ab71-480e-b272-43e71531798b" }, { "type": "ENVIRONMENT", "resourceId": "env-rzhxp2" }, { "type": "FLINK_REGION", "resourceId": "aws.us-east-1" } ] }, "resource": { "type": "FLINK_WORKSPACE", "resourceId": "workspace-2023-09-22-162414" } } ], "authenticationInfo": { "principal": { "confluentUser": { "resourceId": "u-123456" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/identity-provider=Confluent/identity=u-123456" }, "requestMetadata": { "requestId": [ "ae0fe8164a496916ba2494a4f5cef447" ], "clientAddress": [ { "ip": "1.2.3.4" } ] }, "request": { "accessType": "READ_ONLY", "data": { "environment_id": "env-rzhxp2", "org_resource_id": "a56cf537-ab71-480e-b272-43e71531798b", "workspace_name": "workspace-2023-09-22-162414" } }, "result": { "status": "SUCCESS", "data": { "environment_id": "env-rzhxp2", "name": "workspace-2023-09-22-162414", "org_id": "a56cf537-ab71-480e-b272-43e71531798b", "spec": { "service_account": null, "compute_pool": { "id": "lfcp-stgcc30xr80" } } } }, "resourceName": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414" }, "subject": 
"crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414", "specversion": "1.0", "id": "ae935a4b-bcc6-4359-9149-3c31e728877a", "source": "crn://confluent.cloud/", "time": "2023-09-22T16:24:15.666686762Z", "type": "io.confluent.cloud/request" } ListWorkspaces¶ The ListWorkspaces event method is triggered by a request for a list of Flink workspaces and sends an event message that is saved in the audit log as an event record. Examples¶ SUCCESS { "datacontenttype": "application/json", "data": { "serviceName": "crn://confluent.cloud/", "methodName": "ListWorkspace", "cloudResources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resourceId": "a56cf537-ab71-480e-b272-43e71531798b" }, { "type": "ENVIRONMENT", "resourceId": "env-rzhxp2" }, { "type": "FLINK_REGION", "resourceId": "aws.us-east-1" } ] }, "resource": { "type": "FLINK_WORKSPACE", "resourceId": "workspace-2023-09-22-162414" } } ], "authenticationInfo": { "principal": { "confluentUser": { "resourceId": "u-123456" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/identity-provider=Confluent/identity=u-123456" }, "requestMetadata": { "requestId": [ "5e926b0c56f3131f8fb350f228ad9b11" ], "clientAddress": [ { "ip": "1.2.3.4" } ] }, "request": { "accessType": "READ_ONLY", "data": { "environment_id": "env-rzhxp2", "org_resource_id": "a56cf537-ab71-480e-b272-43e71531798b", "page_size": 100 } }, "result": { "status": "SUCCESS", "data": { "data": [ { "name": "workspace-2023-09-22-162414", "org_id": "a56cf537-ab71-480e-b272-43e71531798b", "spec": { "compute_pool": { "id": "lfcp-stgcc30xr80" }, "service_account": null }, "environment_id": "env-rzhxp2" } ] } }, "resourceName": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414" }, 
"subject": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414", "specversion": "1.0", "id": "f1f9c92e-f3b8-425e-971f-c0206b0eadc0", "source": "crn://confluent.cloud/", "time": "2023-09-22T16:24:29.707277883Z", "type": "io.confluent.cloud/request" } UpdateWorkspace¶ The UpdateWorkspace event method is triggered by a request to update a Flink workspace and sends an event message that is saved in the audit log as an event record. Examples¶ SUCCESS { "datacontenttype": "application/json", "data": { "serviceName": "crn://confluent.cloud/", "methodName": "UpdateWorkspace", "cloudResources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resourceId": "a56cf537-ab71-480e-b272-43e71531798b" }, { "type": "ENVIRONMENT", "resourceId": "env-rzhxp2" }, { "type": "FLINK_REGION", "resourceId": "aws.us-east-1" } ] }, "resource": { "type": "FLINK_WORKSPACE", "resourceId": "workspace-2023-09-22-162803" } } ], "authenticationInfo": { "principal": { "confluentUser": { "resourceId": "u-123456" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/identity-provider=Confluent/identity=u-123456" }, "requestMetadata": { "requestId": [ "8dd4507a31c9fa9f7ca08fdad18020c5" ], "clientAddress": [ { "ip": "1.2.3.4" } ] }, "request": { "accessType": "MODIFICATION", "data": { "spec": { "compute_pool": null, "service_account": null }, "workspace_name": "workspace-2023-09-22-162803", "environment_id": "env-rzhxp2", "org_resource_id": "a56cf537-ab71-480e-b272-43e71531798b" } }, "result": { "status": "SUCCESS", "data": { "environment_id": "env-rzhxp2", "name": "workspace-2023-09-22-162803", "org_id": "a56cf537-ab71-480e-b272-43e71531798b", "spec": { "compute_pool": null, "service_account": null } } }, "resourceName": 
"crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162803" }, "subject": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162803", "specversion": "1.0", "id": "b59d471f-3da3-41e2-847a-8363ab4f9077", "source": "crn://confluent.cloud/", "time": "2023-09-22T16:29:09.323947120Z", "type": "io.confluent.cloud/request" } Flink statement¶ Auditable event methods for the resource type STATEMENT are triggered by operations on a Flink statement and generate event messages that are sent to the audit log cluster, where they are stored as event records in a Kafka topic. Method name Action triggering an auditable event message CreateStatement A request to create a Flink statement. DeleteStatement A request to delete a Flink statement. GetStatement A request for a query of a Flink statement details. ListStatements A request for a list of Flink statements. UpdateStatement A request to update a Flink statement. PatchStatement A request to patch a Flink statement. CreateStatement¶ The CreateStatement event method is triggered by a request to create a Flink statement and sends an event message that is saved in the audit log as an event record. 
Examples¶ SUCCESS { "datacontenttype": "application/json", "data": { "serviceName": "crn://confluent.cloud/", "methodName": "CreateStatement", "cloudResources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resourceId": "e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5" }, { "type": "ENVIRONMENT", "resourceId": "env-xx5q1x" }, { "type": "FLINK_REGION", "resourceId": "aws.us-west-2" } ] }, "resource": { "type": "STATEMENT", "resourceId": "d730eb03-d3b5-412d" } } ], "authenticationInfo": { "principal": { "confluentUser": { "resourceId": "u-5q0mkq" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/identity-provider=Confluent/identity=u-5q0mkq" }, "requestMetadata": { "requestId": [ "38cf3bb10d833c36d7b022c633522153" ], "clientAddress": [ { "ip": "1.2.3.4" } ] }, "request": { "accessType": "MODIFICATION", "data": { "environment_id": "env-xx5q1x", "org_resource_id": "e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5", "spec": { "compute_pool_id": "lfcp-devccxwdpvk", "name": "d730eb03-d3b5-412d", "principal": "u-5q0mkq" } } }, "result": { "status": "SUCCESS", "data": { "metadata": { "environment_id": "env-xx5q1x" }, "spec": { "compute_pool_id": "lfcp-devccxwdpvk", "name": "d730eb03-d3b5-412d", "principal": "u-5q0mkq" } } }, "resourceName": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/environment=env-xx5q1x/flink-region=aws.us-west-2/statement=d730eb03-d3b5-412d" }, "subject": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/environment=env-xx5q1x/flink-region=aws.us-west-2/statement=d730eb03-d3b5-412d", "specversion": "1.0", "id": "d1fbc567-e5bb-4728-bf54-de88a1aba84e", "source": "crn://confluent.cloud/", "time": "2023-09-22T16:45:13.689395512Z", "type": "io.confluent.cloud/request" } DeleteStatement¶ The DeleteStatement event method is triggered by a request to delete a Flink statement and sends an event message that is saved in the audit log as an event record. 
Examples¶ SUCCESS { "datacontenttype": "application/json", "data": { "serviceName": "crn://confluent.cloud/", "methodName": "DeleteStatement", "cloudResources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resourceId": "e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5" }, { "type": "ENVIRONMENT", "resourceId": "env-v6x7j0" }, { "type": "FLINK_REGION", "resourceId": "aws.us-west-2" } ] }, "resource": { "type": "STATEMENT", "resourceId": "workspace-2023-09-19-024944-b9c724de-c284-486e-a45f-e7dc1100e181" } } ], "authenticationInfo": { "principal": { "confluentUser": { "resourceId": "u-devccq71mwp" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/identity-provider=Confluent/identity=u-devccq71mwp" }, "requestMetadata": { "requestId": [ "7e9362e01607ffacb08fa80dd2241db2" ], "clientAddress": [ { "ip": "1.2.3.4" } ] }, "request": { "accessType": "MODIFICATION", "data": { "StatementName": "workspace-2023-09-19-024944-b9c724de-c284-486e-a45f-e7dc1100e181", "OrgResourceId": "e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5", "EnvironmentId": "env-v6x7j0" } }, "result": { "status": "SUCCESS" }, "resourceName": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/environment=env-v6x7j0/flink-region=aws.us-west-2/statement=workspace-2023-09-19-024944-b9c724de-c284-486e-a45f-e7dc1100e181" }, "subject": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/environment=env-v6x7j0/flink-region=aws.us-west-2/statement=workspace-2023-09-19-024944-b9c724de-c284-486e-a45f-e7dc1100e181", "specversion": "1.0", "id": "de07cd1b-ec0f-4d0e-abce-050a993e7532", "source": "crn://confluent.cloud/", "time": "2023-09-22T16:48:05.106656163Z", "type": "io.confluent.cloud/request" } GetStatement¶ The GetStatement event method is triggered by a request to get the details for a Flink statement and sends an event message that is saved in the audit log as an event record. 
Examples

SUCCESS

{ "datacontenttype": "application/json", "data": { "serviceName": "crn://confluent.cloud/", "methodName": "GetStatement", "cloudResources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resourceId": "a56cf537-ab71-480e-b272-43e71531798b" }, { "type": "ENVIRONMENT", "resourceId": "env-9pjxk0" }, { "type": "FLINK_REGION", "resourceId": "aws.us-west-2" } ] }, "resource": { "type": "STATEMENT", "resourceId": "928c8647-582b-4d3b" } } ], "authenticationInfo": { "principal": { "confluentUser": { "resourceId": "u-21r8oo" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/identity-provider=Confluent/identity=u-21r8oo" }, "requestMetadata": { "requestId": [ "a688f8810ba426f39511c04f7b511a0a" ], "clientAddress": [ { "ip": "1.2.3.4" } ] }, "request": { "accessType": "READ_ONLY", "data": { "statement_name": "928c8647-582b-4d3b", "environment_id": "env-9pjxk0" } }, "result": { "status": "SUCCESS", "data": { "metadata": { "environment_id": "env-9pjxk0" }, "spec": { "compute_pool_id": "lfcp-stgccgjvgr1", "name": "928c8647-582b-4d3b", "principal": "u-21r8oo" } } }, "resourceName": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-9pjxk0/flink-region=aws.us-west-2/statement=928c8647-582b-4d3b" }, "subject": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-9pjxk0/flink-region=aws.us-west-2/statement=928c8647-582b-4d3b", "specversion": "1.0", "id": "f6f45075-3d85-4e41-8677-c06a80ef903e", "source": "crn://confluent.cloud/", "time": "2023-09-22T16:35:20.968310060Z", "type": "io.confluent.cloud/request" }

ListStatements

The ListStatements event method is triggered by a request for a list of Flink statements and sends an event message that is saved in the audit log as an event record.

Examples

SUCCESS

{ "datacontenttype": "application/json", "data": { "serviceName": "crn://confluent.cloud/", "methodName": "ListStatements", "cloudResources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resourceId": "e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5" }, { "type": "ENVIRONMENT", "resourceId": "env-xx3gwz" }, { "type": "FLINK_REGION", "resourceId": "aws.eu-west-1" } ] }, "resource": { "type": "STATEMENT", "resourceId": "3ab9a756-4bcf-475b" } }, { "scope": { "resources": [ { "type": "ORGANIZATION", "resourceId": "e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5" }, { "type": "ENVIRONMENT", "resourceId": "env-xx3gwz" }, { "type": "FLINK_REGION", "resourceId": "aws.eu-west-1" } ] }, "resource": { "type": "STATEMENT", "resourceId": "e264b999-269c-46d6" } } ], "authenticationInfo": { "principal": { "confluentUser": { "resourceId": "u-devccq71mwp" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/identity-provider=Confluent/identity=u-devccq71mwp" }, "requestMetadata": { "requestId": [ "7cec6a84ab0b05ccb38ecf14981da31b" ], "clientAddress": [ { "ip": "1.2.3.4" } ] }, "request": { "accessType": "READ_ONLY", "data": { "compute_pool_id": "", "environment_id": "env-xx3gwz", "page_size": 100 } }, "result": { "status": "SUCCESS", "data": { "data": [ { "metadata": { "environment_id": "env-xx3gwz" }, "spec": { "compute_pool_id": "lfcp-devcc36z5jj", "name": "3ab9a756-4bcf-475b", "principal": "u-rk1gy7" } }, { "metadata": { "environment_id": "env-xx3gwz" }, "spec": { "principal": "u-rk1gy7", "compute_pool_id": "lfcp-devcc36z5jj", "name": "e264b999-269c-46d6" } } ] } }, "resourceName": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/environment=env-xx3gwz/flink-region=aws.eu-west-1" }, "subject": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/environment=env-xx3gwz/flink-region=aws.eu-west-1", "specversion": "1.0", "id": "5e6bc2d3-9881-442b-af0c-a0a6aa127867", "source": "crn://confluent.cloud/", "time": "2023-09-22T16:47:00.894461118Z", "type": "io.confluent.cloud/request" }

UpdateStatement

The UpdateStatement event method is triggered by a request to update a Flink statement and sends an event message that is saved in the audit log as an event record.

Examples

SUCCESS

{ "datacontenttype": "application/json", "data": { "service_name": "crn://confluent.cloud/", "method_name": "UpdateStatement", "cloud_resources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resource_id": "org-123" }, { "type": "ENVIRONMENT", "resource_id": "env-123" }, { "type": "FLINK_REGION", "resource_id": "aws.us-east-2" } ] }, "resource": { "type": "STATEMENT", "resource_id": "statement-123" } }, { "scope": { "resources": [ { "type": "ORGANIZATION", "resource_id": "org-123" }, { "type": "ENVIRONMENT", "resource_id": "env-123" }, { "type": "FLINK_REGION", "resource_id": "aws.us-east-2" } ] }, "resource": { "type": "STATEMENT", "resource_id": "statement-123" } } ], "authentication_info": { "exposure": "CUSTOMER", "principal": { "confluent_user": { "resource_id": "u-123" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=org-123/identity-provider=Confluent/identity=u-123" }, "request_metadata": { "request_id": [ "74726163656964303132333435363738" ], "client_address": [ { "ip": "127.0.0.1" } ] }, "request": { "access_type": "MODIFICATION", "data": { "environment_id": "env-123", "org_resource_id": "org-123", "spec": { "compute_pool_id": "lfcp-123", "name": "statement-123", "principal": "sa-123" } } }, "result": { "status": "SUCCESS", "data": { "metadata": { "environment_id": "env-123" }, "spec": { "compute_pool_id": "lfcp-123", "name": "statement-123", "principal": "sa-123" } } } } }

PatchStatement

The PatchStatement event method is triggered by a request to patch a Flink statement and sends an event message that is saved in the audit log as an event record.

Examples

SUCCESS

{ "datacontenttype": "application/json", "data": { "service_name": "crn://confluent.cloud/", "method_name": "PatchStatement", "cloud_resources": [ { "scope": { "resources": [ { "type": "ORGANIZATION", "resource_id": "org-123" }, { "type": "ENVIRONMENT", "resource_id": "env-123" }, { "type": "FLINK_REGION", "resource_id": "aws.us-east-2" } ] }, "resource": { "type": "STATEMENT", "resource_id": "statement-123" } }, { "scope": { "resources": [ { "type": "ORGANIZATION", "resource_id": "org-123" }, { "type": "ENVIRONMENT", "resource_id": "env-123" }, { "type": "FLINK_REGION", "resource_id": "aws.us-east-2" } ] }, "resource": { "type": "STATEMENT", "resource_id": "statement-123" } } ], "authentication_info": { "exposure": "CUSTOMER", "principal": { "confluent_user": { "resource_id": "u-123" } }, "result": "SUCCESS", "identity": "crn://confluent.cloud/organization=org-123/identity-provider=Confluent/identity=u-123" }, "request_metadata": { "request_id": [ "74726163656964303132333435363738" ], "client_address": [ { "ip": "127.0.0.1" } ] }, "request": { "access_type": "MODIFICATION", "data": { "environment_id": "env-123", "org_resource_id": "org-123", "statement_name": "statement-123" } }, "result": { "status": "SUCCESS" } } }
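
The statement examples above use two key styles: the CreateStatement, DeleteStatement, GetStatement, and ListStatements records use camelCase keys (`methodName`, `authenticationInfo`), while the UpdateStatement and PatchStatement records use snake_case keys (`method_name`, `authentication_info`). A consumer of these records may need to tolerate both. The following Python sketch shows one possible way to do that. The helper names `_pick` and `summarize_event` are hypothetical and not part of any Confluent library, and the two sample records are trimmed copies of the GetStatement and PatchStatement examples.

```python
import json


def _pick(obj, *keys, default=None):
    """Return the value of the first key in `keys` that is present in `obj`."""
    if isinstance(obj, dict):
        for key in keys:
            if key in obj:
                return obj[key]
    return default


def summarize_event(record):
    """Pull a few fields out of one parsed audit log event record.

    Accepts both key styles seen in the examples (camelCase and snake_case).
    Assumption: only user principals are looked up; other kinds yield None.
    """
    data = _pick(record, "data", default={})
    auth = _pick(data, "authenticationInfo", "authentication_info", default={})
    principal = _pick(auth, "principal", default={})
    user = _pick(principal, "confluentUser", "confluent_user", default={})
    request = _pick(data, "request", default={})
    result = _pick(data, "result", default={})
    return {
        "method": _pick(data, "methodName", "method_name"),
        "principal": _pick(user, "resourceId", "resource_id"),
        "access_type": _pick(request, "accessType", "access_type"),
        "status": _pick(result, "status"),
    }


# Trimmed copy of the GetStatement example (camelCase keys).
CAMEL = json.loads("""
{ "data": { "methodName": "GetStatement",
            "authenticationInfo": { "principal": { "confluentUser": { "resourceId": "u-21r8oo" } },
                                    "result": "SUCCESS" },
            "request": { "accessType": "READ_ONLY" },
            "result": { "status": "SUCCESS" } } }
""")

# Trimmed copy of the PatchStatement example (snake_case keys).
SNAKE = json.loads("""
{ "data": { "method_name": "PatchStatement",
            "authentication_info": { "principal": { "confluent_user": { "resource_id": "u-123" } },
                                     "result": "SUCCESS" },
            "request": { "access_type": "MODIFICATION" },
            "result": { "status": "SUCCESS" } } }
""")

for event in (CAMEL, SNAKE):
    print(summarize_event(event))
```

Looking up both spellings at each level keeps the reader independent of which serialization a given record uses, at the cost of silently returning `None` when a field is absent.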

#### Code Examples

```text
FLINK_REGION
```

```text
ListFlinkRegions
```

```json
{
     "specversion": "1.0",
     "id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
     "source": "crn://confluent.cloud/",
     "type": "io.confluent.cloud/request",
     "subject": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005",
     "datacontenttype": "application/json",
     "dataschema": "https://confluent.io/internal/events/AuditLog.v2",
     "data": {
             "service_name": "crn://confluent.cloud/service=cc-ksql-api-service",
             "internal_service_name": "crn://confluent.cloud/service=cc-ksql-api-service",
             "method_name": "ListFlinkRegions",
             "cloud_resources": [
                     {
                             "resource": {
                                     "type": "ORGANIZATION",
                                     "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005"
                             }
                     }
             ],
             "authentication_info": {
                     "exposure": "CUSTOMER",
                     "principal": {
                             "confluent_user": {
                                     "resource_id": "user-1",
                                     "internal_id": "99"
                             }
                     },
                     "result": "SUCCESS",
                     "identity": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/identity-provider=Confluent/identity=user-1"
             },
             "request_metadata": {
                     "request_id": [
                             "74726163656964303132333435363738"
                     ],
                     "client_address": [
                             {
                                     "ip": "1.2.3.4"
                             }
                     ]
             },
             "request": {
                     "access_type": "READ_ONLY",
                     "data": {
                             "BypassCache": false,
                             "Cloud": 0,
                             "PageSize": 10,
                             "PageToken": "",
                             "RegionName": ""
                     }
             },
             "result": {
                     "status": "SUCCESS",
                     "data": {
                             "elements": [
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.af-south-1",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-east-1",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-northeast-1",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-northeast-2",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-northeast-3",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-south-1",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-south-2",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-southeast-1",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-southeast-2",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-southeast-3",
                                                     "metadata": null
                                             }
                                     }
                             ]
                     }
             }
     }
}
```
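
In the ListFlinkRegions record above, the region identifiers sit under `data.result.data.elements[].fcpm_v_2_region.id`. As a minimal sketch, assuming the record has already been parsed into a Python dict, the IDs could be collected as follows. The function name `region_ids` is hypothetical, and the sample keeps only the first three elements of the example.

```python
def region_ids(record):
    """Return the region IDs listed in a ListFlinkRegions-style event record."""
    result = (record.get("data") or {}).get("result") or {}
    elements = (result.get("data") or {}).get("elements") or []
    return [
        element["fcpm_v_2_region"]["id"]
        for element in elements
        if isinstance(element.get("fcpm_v_2_region"), dict)
        and "id" in element["fcpm_v_2_region"]
    ]


# Trimmed copy of the example record: only the first three elements are kept.
EVENT = {
    "data": {
        "method_name": "ListFlinkRegions",
        "result": {
            "status": "SUCCESS",
            "data": {
                "elements": [
                    {"fcpm_v_2_region": {"id": "aws.af-south-1", "metadata": None}},
                    {"fcpm_v_2_region": {"id": "aws.ap-east-1", "metadata": None}},
                    {"fcpm_v_2_region": {"id": "aws.ap-northeast-1", "metadata": None}},
                ]
            },
        },
    }
}

print(region_ids(EVENT))
# ['aws.af-south-1', 'aws.ap-east-1', 'aws.ap-northeast-1']
```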

```text
COMPUTE_POOL
```

```text
CreateComputePool
```

```json
{
     "specversion": "1.0",
     "id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
     "source": "crn://confluent.cloud/",
     "type": "io.confluent.cloud/request",
     "subject": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005",
     "datacontenttype": "application/json",
     "dataschema": "https://confluent.io/internal/events/AuditLog.v2",
     "data": {
             "service_name": "crn://confluent.cloud/service=cc-ksql-api-service",
             "internal_service_name": "crn://confluent.cloud/service=cc-ksql-api-service",
             "method_name": "ListRegions",
             "cloud_resources": [
                     {
                             "resource": {
                                     "type": "ORGANIZATION",
                                     "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005"
                             }
                     }
             ],
             "authentication_info": {
                     "exposure": "CUSTOMER",
                     "principal": {
                             "confluent_user": {
                                     "resource_id": "user-1",
                                     "internal_id": "99"
                             }
                     },
                     "result": "SUCCESS",
                     "identity": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/identity-provider=Confluent/identity=user-1"
             },
             "request_metadata": {
                     "request_id": [
                             "74726163656964303132333435363738"
                     ],
                     "client_address": [
                             {
                                     "ip": "1.2.3.4"
                             }
                     ]
             },
             "request": {
                     "access_type": "READ_ONLY",
                     "data": {
                             "BypassCache": false,
                             "Cloud": 0,
                             "PageSize": 10,
                             "PageToken": "",
                             "RegionName": ""
                     }
             },
             "result": {
                     "status": "SUCCESS",
                     "data": {
                             "elements": [
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.af-south-1",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-east-1",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-northeast-1",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-northeast-2",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-northeast-3",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-south-1",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-south-2",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-southeast-1",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-southeast-2",
                                                     "metadata": null
                                             }
                                     },
                                     {
                                             "fcpm_v_2_region": {
                                                     "id": "aws.ap-southeast-3",
                                                     "metadata": null
                                             }
                                     }
                             ]
                     }
             }
     }
}
```

```text
DeleteComputePool
```

```json
{
     "specversion": "1.0",
     "id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
     "source": "crn://confluent.cloud/",
     "type": "io.confluent.cloud/request",
     "subject": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/environment=env-j30y0iqp/flink-region=azure.uksouth/compute-pool=lfcp-1",
     "datacontenttype": "application/json",
     "dataschema": "https://confluent.io/internal/events/AuditLog.v2",
     "data": {
             "service_name": "crn://confluent.cloud/service=cc-ksql-api-service",
             "internal_service_name": "crn://confluent.cloud/service=cc-ksql-api-service",
             "method_name": "DeleteComputePool",
             "cloud_resources": [
                     {
                             "scope": {
                                     "resources": [
                                             {
                                                     "type": "ORGANIZATION",
                                                     "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005"
                                             },
                                             {
                                                     "type": "ENVIRONMENT",
                                                     "resource_id": "env-j30y0iqp"
                                             },
                                             {
                                                     "type": "FLINK_REGION",
                                                     "resource_id": "azure.uksouth"
                                             }
                                     ]
                             },
                             "resource": {
                                     "type": "COMPUTE_POOL",
                                     "resource_id": "lfcp-1"
                             }
                     }
             ],
             "authentication_info": {
                     "exposure": "CUSTOMER",
                     "principal": {
                             "confluent_user": {
                                     "resource_id": "user-1",
                                     "internal_id": "99"
                             }
                     },
                     "result": "SUCCESS",
                     "identity": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/identity-provider=Confluent/identity=user-1"
             },
             "request_metadata": {
                     "request_id": [
                             "74726163656964303132333435363738"
                     ],
                     "client_address": [
                             {
                                     "ip": "1.2.3.4"
                             }
                     ]
             },
             "request": {
                     "access_type": "MODIFICATION",
                     "data": {
                             "environment_id": "env-j30y0iqp",
                             "id": "lfcp-1"
                     }
             },
             "result": {
                     "status": "SUCCESS"
             }
     }
}
```
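
The `subject` of the record above is a CRN that encodes the resource hierarchy as `key=value` segments separated by `/`. The following Python sketch splits such a string into a dict. It assumes that every CRN follows the shape shown in these examples (the `crn://confluent.cloud/` prefix followed by zero or more `key=value` segments whose values contain no `/`); the function name `parse_crn` is hypothetical.

```python
CRN_PREFIX = "crn://confluent.cloud/"


def parse_crn(crn):
    """Split a CRN like the ones in the examples into a {key: value} dict."""
    if not crn.startswith(CRN_PREFIX):
        raise ValueError(f"unexpected CRN prefix: {crn!r}")
    parts = {}
    for segment in crn[len(CRN_PREFIX):].split("/"):
        if not segment:
            continue  # bare prefix, as in "source": "crn://confluent.cloud/"
        key, _, value = segment.partition("=")
        parts[key] = value
    return parts


subject = (
    "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005"
    "/environment=env-j30y0iqp/flink-region=azure.uksouth/compute-pool=lfcp-1"
)
print(parse_crn(subject))
```

Because Python dicts preserve insertion order, the keys come back in the order they appear in the CRN, which here runs from the organization down to the compute pool.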

```text
GetComputePool
```

```json
{
     "specversion": "1.0",
     "id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
     "source": "crn://confluent.cloud/",
     "type": "io.confluent.cloud/request",
     "subject": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/environment=env-j30y0iqp/flink-region=azure.uksouth/compute-pool=lfcp-1",
     "datacontenttype": "application/json",
     "dataschema": "https://confluent.io/internal/events/AuditLog.v2",
     "data": {
             "service_name": "crn://confluent.cloud/service=cc-ksql-api-service",
             "internal_service_name": "crn://confluent.cloud/service=cc-ksql-api-service",
             "method_name": "GetComputePool",
             "cloud_resources": [
                     {
                             "scope": {
                                     "resources": [
                                             {
                                                     "type": "ORGANIZATION",
                                                     "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005"
                                             },
                                             {
                                                     "type": "ENVIRONMENT",
                                                     "resource_id": "env-j30y0iqp"
                                             },
                                             {
                                                     "type": "FLINK_REGION",
                                                     "resource_id": "azure.uksouth"
                                             }
                                     ]
                             },
                             "resource": {
                                     "type": "COMPUTE_POOL",
                                     "resource_id": "lfcp-1"
                             }
                     }
             ],
             "authentication_info": {
                     "exposure": "CUSTOMER",
                     "principal": {
                             "confluent_user": {
                                     "resource_id": "user-1",
                                     "internal_id": "99"
                             }
                     },
                     "result": "SUCCESS",
                     "identity": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/identity-provider=Confluent/identity=user-1"
             },
             "request_metadata": {
                     "request_id": [
                             "74726163656964303132333435363738"
                     ],
                     "client_address": [
                             {
                                     "ip": "1.2.3.4"
                             }
                     ]
             },
             "request": {
                     "access_type": "READ_ONLY",
                     "data": {
                             "environment_id": "env-j30y0iqp",
                             "id": "lfcp-1"
                     }
             },
             "result": {
                     "status": "SUCCESS"
             }
     }
}
```

```text
ListComputePools
```

```json
{
     "specversion": "1.0",
     "id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
     "source": "crn://confluent.cloud/",
     "type": "io.confluent.cloud/request",
     "subject": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/environment=env-j30y0iqp/flink-region=azure.uksouth/compute-pool=lfcp-1",
     "datacontenttype": "application/json",
     "dataschema": "https://confluent.io/internal/events/AuditLog.v2",
     "data": {
             "service_name": "crn://confluent.cloud/service=cc-ksql-api-service",
             "internal_service_name": "crn://confluent.cloud/service=cc-ksql-api-service",
             "method_name": "ListComputePools",
             "cloud_resources": [
                     {
                             "scope": {
                                     "resources": [
                                             {
                                                     "type": "ORGANIZATION",
                                                     "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005"
                                             },
                                             {
                                                     "type": "ENVIRONMENT",
                                                     "resource_id": "env-j30y0iqp"
                                             },
                                             {
                                                     "type": "FLINK_REGION",
                                                     "resource_id": "azure.uksouth"
                                             }
                                     ]
                             },
                             "resource": {
                                     "type": "COMPUTE_POOL",
                                     "resource_id": "lfcp-1"
                             }
                     }
             ],
             "authentication_info": {
                     "exposure": "CUSTOMER",
                     "principal": {
                             "confluent_user": {
                                     "resource_id": "user-1",
                                     "internal_id": "99"
                             }
                     },
                     "result": "SUCCESS",
                     "identity": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/identity-provider=Confluent/identity=user-1"
             },
             "request_metadata": {
                     "request_id": [
                             "74726163656964303132333435363738"
                     ],
                     "client_address": [
                             {
                                     "ip": "1.2.3.4"
                             }
                     ]
             },
             "request": {
                     "access_type": "READ_ONLY",
                     "data": {
                             "environment_id": "env-j30y0iqp",
                             "id": "lfcp-1"
                     }
             },
             "result": {
                     "status": "SUCCESS"
             }
     }
}
```

```text
UpdateComputePool
```

```json
{
     "specversion": "1.0",
     "id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
     "source": "crn://confluent.cloud/",
     "type": "io.confluent.cloud/request",
     "subject": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/environment=env-j30y0iqp/flink-region=azure.uksouth/compute-pool=lfcp-1",
     "datacontenttype": "application/json",
     "dataschema": "https://confluent.io/internal/events/AuditLog.v2",
     "data": {
             "service_name": "crn://confluent.cloud/service=cc-ksql-api-service",
             "internal_service_name": "crn://confluent.cloud/service=cc-ksql-api-service",
             "method_name": "UpdateComputePool",
             "cloud_resources": [
                     {
                             "scope": {
                                     "resources": [
                                             {
                                                     "type": "ORGANIZATION",
                                                     "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005"
                                             },
                                             {
                                                     "type": "ENVIRONMENT",
                                                     "resource_id": "env-j30y0iqp"
                                             },
                                             {
                                                     "type": "FLINK_REGION",
                                                     "resource_id": "azure.uksouth"
                                             }
                                     ]
                             },
                             "resource": {
                                     "type": "COMPUTE_POOL",
                                     "resource_id": "lfcp-1"
                             }
                     }
             ],
             "authentication_info": {
                     "exposure": "CUSTOMER",
                     "principal": {
                             "confluent_user": {
                                     "resource_id": "user-1",
                                     "internal_id": "99"
                             }
                     },
                     "result": "SUCCESS",
                     "identity": "crn://confluent.cloud/organization=6c2e1a25-2292-483b-9c76-79982e3dc005/identity-provider=Confluent/identity=user-1"
             },
             "request_metadata": {
                     "request_id": [
                             "74726163656964303132333435363738"
                     ],
                     "client_address": [
                             {
                                     "ip": "1.2.3.4"
                             }
                     ]
             },
             "request": {
                     "access_type": "MODIFICATION",
                     "data": {
                             "environment_id": "env-j30y0iqp",
                             "id": "lfcp-1"
                     }
             },
             "result": {
                     "status": "SUCCESS"
             }
     }
}
```
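
Each compute pool record above describes its target under `data.cloud_resources`: a `scope` listing the enclosing organization, environment, and Flink region, followed by the `resource` itself. As a rough sketch, assuming the snake_case field names shown in these records, the chain can be flattened into `(type, id)` pairs in the order the record lists them. The function name `resource_chains` is hypothetical, and the sample is trimmed to the fields the function reads.

```python
def resource_chains(record):
    """Return one list of (type, resource_id) pairs per cloud_resources entry.

    Each list holds the scope resources in the order given by the record,
    followed by the target resource.
    """
    chains = []
    for entry in (record.get("data") or {}).get("cloud_resources") or []:
        scope = (entry.get("scope") or {}).get("resources") or []
        chain = [(item.get("type"), item.get("resource_id")) for item in scope]
        target = entry.get("resource")
        if target:
            chain.append((target.get("type"), target.get("resource_id")))
        chains.append(chain)
    return chains


# Trimmed copy of a compute pool record: only cloud_resources is kept.
EVENT = {
    "data": {
        "cloud_resources": [
            {
                "scope": {
                    "resources": [
                        {"type": "ORGANIZATION",
                         "resource_id": "6c2e1a25-2292-483b-9c76-79982e3dc005"},
                        {"type": "ENVIRONMENT", "resource_id": "env-j30y0iqp"},
                        {"type": "FLINK_REGION", "resource_id": "azure.uksouth"},
                    ]
                },
                "resource": {"type": "COMPUTE_POOL", "resource_id": "lfcp-1"},
            }
        ]
    }
}

for chain in resource_chains(EVENT):
    print(" / ".join(f"{kind}={rid}" for kind, rid in chain))
```

Entries that carry no `scope`, such as the organization-level entry in the ListFlinkRegions record, produce a chain holding only the target resource.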

```text
FLINK_WORKSPACE
```

```text
CreateWorkspace
```

```json
{
  "datacontenttype": "application/json",
  "data": {
    "serviceName": "crn://confluent.cloud/",
    "methodName": "CreateWorkspace",
    "cloudResources": [
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resourceId": "a56cf537-ab71-480e-b272-43e71531798b"
            },
            {
              "type": "ENVIRONMENT",
              "resourceId": "env-rzhxp2"
            },
            {
              "type": "FLINK_REGION",
              "resourceId": "aws.us-east-1"
            }
          ]
        },
        "resource": {
          "type": "FLINK_WORKSPACE",
          "resourceId": "workspace-2023-09-22-162414"
        }
      }
    ],
    "authenticationInfo": {
      "principal": {
        "confluentUser": {
          "resourceId": "u-123456"
        }
      },
      "result": "SUCCESS",
      "identity": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/identity-provider=Confluent/identity=u-123456"
    },
    "requestMetadata": {
      "requestId": [
        "8b4f7ec5693a01fb4a1ae0a24240f944"
      ],
      "clientAddress": [
        {
          "ip": "1.2.3.4"
        }
      ]
    },
    "request": {
      "accessType": "MODIFICATION",
      "data": {
        "workspace_name": "workspace-2023-09-22-162414",
        "environment_id": "env-rzhxp2",
        "org_resource_id": "a56cf537-ab71-480e-b272-43e71531798b",
        "spec": {
          "compute_pool": {
            "id": "lfcp-stgcc30xr80"
          },
          "service_account": null
        }
      }
    },
    "result": {
      "status": "SUCCESS",
      "data": {
        "environment_id": "env-rzhxp2",
        "name": "workspace-2023-09-22-162414",
        "org_id": "a56cf537-ab71-480e-b272-43e71531798b",
        "spec": {
          "compute_pool": {
            "id": "lfcp-stgcc30xr80"
          },
          "service_account": null
        }
      }
    },
    "resourceName": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414"
  },
  "subject": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414",
  "specversion": "1.0",
  "id": "b76bee22-7678-49ea-8902-67519b0d4133",
  "source": "crn://confluent.cloud/",
  "time": "2023-09-22T16:24:15.007233032Z",
  "type": "io.confluent.cloud/request"
}
```

```sql
DeleteWorkspace
```

```json
{
  "datacontenttype": "application/json",
  "data": {
    "serviceName": "crn://confluent.cloud/",
    "methodName": "DeleteWorkspace",
    "cloudResources": [
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resourceId": "a56cf537-ab71-480e-b272-43e71531798b"
            },
            {
              "type": "ENVIRONMENT",
              "resourceId": "env-rzhxp2"
            },
            {
              "type": "FLINK_REGION",
              "resourceId": "aws.us-east-1"
            }
          ]
        },
        "resource": {
          "type": "FLINK_WORKSPACE",
          "resourceId": "workspace-2023-09-22-162414"
        }
      }
    ],
    "authenticationInfo": {
      "principal": {
        "confluentUser": {
          "resourceId": "u-123456"
        }
      },
      "result": "SUCCESS",
      "identity": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/identity-provider=Confluent/identity=u-123456"
    },
    "requestMetadata": {
      "requestId": [
        "6a4dd657fe6fc5241360983cbf8dc8ce"
      ],
      "clientAddress": [
        {
          "ip": "1.2.3.4"
        }
      ]
    },
    "request": {
      "accessType": "MODIFICATION",
      "data": {
        "workspace_name": "workspace-2023-09-22-162414",
        "environment_id": "env-rzhxp2",
        "org_resource_id": "a56cf537-ab71-480e-b272-43e71531798b"
      }
    },
    "result": {
      "status": "SUCCESS"
    },
    "resourceName": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414"
  },
  "subject": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414",
  "specversion": "1.0",
  "id": "36791901-6bd6-4057-8820-9d6860d56d0c",
  "source": "crn://confluent.cloud/",
  "time": "2023-09-22T16:24:41.773914645Z",
  "type": "io.confluent.cloud/request"
}
```

```sql
GetWorkspace
```

```json
{
  "datacontenttype": "application/json",
  "data": {
    "serviceName": "crn://confluent.cloud/",
    "methodName": "GetWorkspace",
    "cloudResources": [
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resourceId": "a56cf537-ab71-480e-b272-43e71531798b"
            },
            {
              "type": "ENVIRONMENT",
              "resourceId": "env-rzhxp2"
            },
            {
              "type": "FLINK_REGION",
              "resourceId": "aws.us-east-1"
            }
          ]
        },
        "resource": {
          "type": "FLINK_WORKSPACE",
          "resourceId": "workspace-2023-09-22-162414"
        }
      }
    ],
    "authenticationInfo": {
      "principal": {
        "confluentUser": {
          "resourceId": "u-123456"
        }
      },
      "result": "SUCCESS",
      "identity": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/identity-provider=Confluent/identity=u-123456"
    },
    "requestMetadata": {
      "requestId": [
        "ae0fe8164a496916ba2494a4f5cef447"
      ],
      "clientAddress": [
        {
          "ip": "1.2.3.4"
        }
      ]
    },
    "request": {
      "accessType": "READ_ONLY",
      "data": {
        "environment_id": "env-rzhxp2",
        "org_resource_id": "a56cf537-ab71-480e-b272-43e71531798b",
        "workspace_name": "workspace-2023-09-22-162414"
      }
    },
    "result": {
      "status": "SUCCESS",
      "data": {
        "environment_id": "env-rzhxp2",
        "name": "workspace-2023-09-22-162414",
        "org_id": "a56cf537-ab71-480e-b272-43e71531798b",
        "spec": {
          "service_account": null,
          "compute_pool": {
            "id": "lfcp-stgcc30xr80"
          }
        }
      }
    },
    "resourceName": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414"
  },
  "subject": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414",
  "specversion": "1.0",
  "id": "ae935a4b-bcc6-4359-9149-3c31e728877a",
  "source": "crn://confluent.cloud/",
  "time": "2023-09-22T16:24:15.666686762Z",
  "type": "io.confluent.cloud/request"
}
```

```sql
ListWorkspaces
```

```json
{
  "datacontenttype": "application/json",
  "data": {
    "serviceName": "crn://confluent.cloud/",
    "methodName": "ListWorkspace",
    "cloudResources": [
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resourceId": "a56cf537-ab71-480e-b272-43e71531798b"
            },
            {
              "type": "ENVIRONMENT",
              "resourceId": "env-rzhxp2"
            },
            {
              "type": "FLINK_REGION",
              "resourceId": "aws.us-east-1"
            }
          ]
        },
        "resource": {
          "type": "FLINK_WORKSPACE",
          "resourceId": "workspace-2023-09-22-162414"
        }
      }
    ],
    "authenticationInfo": {
      "principal": {
        "confluentUser": {
          "resourceId": "u-123456"
        }
      },
      "result": "SUCCESS",
      "identity": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/identity-provider=Confluent/identity=u-123456"
    },
    "requestMetadata": {
      "requestId": [
        "5e926b0c56f3131f8fb350f228ad9b11"
      ],
      "clientAddress": [
        {
          "ip": "1.2.3.4"
        }
      ]
    },
    "request": {
      "accessType": "READ_ONLY",
      "data": {
        "environment_id": "env-rzhxp2",
        "org_resource_id": "a56cf537-ab71-480e-b272-43e71531798b",
        "page_size": 100
      }
    },
    "result": {
      "status": "SUCCESS",
      "data": {
        "data": [
          {
            "name": "workspace-2023-09-22-162414",
            "org_id": "a56cf537-ab71-480e-b272-43e71531798b",
            "spec": {
              "compute_pool": {
                "id": "lfcp-stgcc30xr80"
              },
              "service_account": null
            },
            "environment_id": "env-rzhxp2"
          }
        ]
      }
    },
    "resourceName": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414"
  },
  "subject": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162414",
  "specversion": "1.0",
  "id": "f1f9c92e-f3b8-425e-971f-c0206b0eadc0",
  "source": "crn://confluent.cloud/",
  "time": "2023-09-22T16:24:29.707277883Z",
  "type": "io.confluent.cloud/request"
}
```

```sql
UpdateWorkspace
```

```json
{
  "datacontenttype": "application/json",
  "data": {
    "serviceName": "crn://confluent.cloud/",
    "methodName": "UpdateWorkspace",
    "cloudResources": [
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resourceId": "a56cf537-ab71-480e-b272-43e71531798b"
            },
            {
              "type": "ENVIRONMENT",
              "resourceId": "env-rzhxp2"
            },
            {
              "type": "FLINK_REGION",
              "resourceId": "aws.us-east-1"
            }
          ]
        },
        "resource": {
          "type": "FLINK_WORKSPACE",
          "resourceId": "workspace-2023-09-22-162803"
        }
      }
    ],
    "authenticationInfo": {
      "principal": {
        "confluentUser": {
          "resourceId": "u-123456"
        }
      },
      "result": "SUCCESS",
      "identity": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/identity-provider=Confluent/identity=u-123456"
    },
    "requestMetadata": {
      "requestId": [
        "8dd4507a31c9fa9f7ca08fdad18020c5"
      ],
      "clientAddress": [
        {
          "ip": "1.2.3.4"
        }
      ]
    },
    "request": {
      "accessType": "MODIFICATION",
      "data": {
        "spec": {
          "compute_pool": null,
          "service_account": null
        },
        "workspace_name": "workspace-2023-09-22-162803",
        "environment_id": "env-rzhxp2",
        "org_resource_id": "a56cf537-ab71-480e-b272-43e71531798b"
      }
    },
    "result": {
      "status": "SUCCESS",
      "data": {
        "environment_id": "env-rzhxp2",
        "name": "workspace-2023-09-22-162803",
        "org_id": "a56cf537-ab71-480e-b272-43e71531798b",
        "spec": {
          "compute_pool": null,
          "service_account": null
        }
      }
    },
    "resourceName": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162803"
  },
  "subject": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-rzhxp2/flink-region=aws.us-east-1/flink-workspace=workspace-2023-09-22-162803",
  "specversion": "1.0",
  "id": "b59d471f-3da3-41e2-847a-8363ab4f9077",
  "source": "crn://confluent.cloud/",
  "time": "2023-09-22T16:29:09.323947120Z",
  "type": "io.confluent.cloud/request"
}
```

```sql
CreateStatement
```

```json
{
  "datacontenttype": "application/json",
  "data": {
    "serviceName": "crn://confluent.cloud/",
    "methodName": "CreateStatement",
    "cloudResources": [
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resourceId": "e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5"
            },
            {
              "type": "ENVIRONMENT",
              "resourceId": "env-xx5q1x"
            },
            {
              "type": "FLINK_REGION",
              "resourceId": "aws.us-west-2"
            }
          ]
        },
        "resource": {
          "type": "STATEMENT",
          "resourceId": "d730eb03-d3b5-412d"
        }
      }
    ],
    "authenticationInfo": {
      "principal": {
        "confluentUser": {
          "resourceId": "u-5q0mkq"
        }
      },
      "result": "SUCCESS",
      "identity": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/identity-provider=Confluent/identity=u-5q0mkq"
    },
    "requestMetadata": {
      "requestId": [
        "38cf3bb10d833c36d7b022c633522153"
      ],
      "clientAddress": [
        {
          "ip": "1.2.3.4"
        }
      ]
    },
    "request": {
      "accessType": "MODIFICATION",
      "data": {
        "environment_id": "env-xx5q1x",
        "org_resource_id": "e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5",
        "spec": {
          "compute_pool_id": "lfcp-devccxwdpvk",
          "name": "d730eb03-d3b5-412d",
          "principal": "u-5q0mkq"
        }
      }
    },
    "result": {
      "status": "SUCCESS",
      "data": {
        "metadata": {
          "environment_id": "env-xx5q1x"
        },
        "spec": {
          "compute_pool_id": "lfcp-devccxwdpvk",
          "name": "d730eb03-d3b5-412d",
          "principal": "u-5q0mkq"
        }
      }
    },
    "resourceName": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/environment=env-xx5q1x/flink-region=aws.us-west-2/statement=d730eb03-d3b5-412d"
  },
  "subject": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/environment=env-xx5q1x/flink-region=aws.us-west-2/statement=d730eb03-d3b5-412d",
  "specversion": "1.0",
  "id": "d1fbc567-e5bb-4728-bf54-de88a1aba84e",
  "source": "crn://confluent.cloud/",
  "time": "2023-09-22T16:45:13.689395512Z",
  "type": "io.confluent.cloud/request"
}
```

```sql
DeleteStatement
```

```json
{
  "datacontenttype": "application/json",
  "data": {
    "serviceName": "crn://confluent.cloud/",
    "methodName": "DeleteStatement",
    "cloudResources": [
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resourceId": "e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5"
            },
            {
              "type": "ENVIRONMENT",
              "resourceId": "env-v6x7j0"
            },
            {
              "type": "FLINK_REGION",
              "resourceId": "aws.us-west-2"
            }
          ]
        },
        "resource": {
          "type": "STATEMENT",
          "resourceId": "workspace-2023-09-19-024944-b9c724de-c284-486e-a45f-e7dc1100e181"
        }
      }
    ],
    "authenticationInfo": {
      "principal": {
        "confluentUser": {
          "resourceId": "u-devccq71mwp"
        }
      },
      "result": "SUCCESS",
      "identity": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/identity-provider=Confluent/identity=u-devccq71mwp"
    },
    "requestMetadata": {
      "requestId": [
        "7e9362e01607ffacb08fa80dd2241db2"
      ],
      "clientAddress": [
        {
          "ip": "1.2.3.4"
        }
      ]
    },
    "request": {
      "accessType": "MODIFICATION",
      "data": {
        "StatementName": "workspace-2023-09-19-024944-b9c724de-c284-486e-a45f-e7dc1100e181",
        "OrgResourceId": "e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5",
        "EnvironmentId": "env-v6x7j0"
      }
    },
    "result": {
      "status": "SUCCESS"
    },
    "resourceName": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/environment=env-v6x7j0/flink-region=aws.us-west-2/statement=workspace-2023-09-19-024944-b9c724de-c284-486e-a45f-e7dc1100e181"
  },
  "subject": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/environment=env-v6x7j0/flink-region=aws.us-west-2/statement=workspace-2023-09-19-024944-b9c724de-c284-486e-a45f-e7dc1100e181",
  "specversion": "1.0",
  "id": "de07cd1b-ec0f-4d0e-abce-050a993e7532",
  "source": "crn://confluent.cloud/",
  "time": "2023-09-22T16:48:05.106656163Z",
  "type": "io.confluent.cloud/request"
}
```

```sql
GetStatement
```

```json
{
  "datacontenttype": "application/json",
  "data": {
    "serviceName": "crn://confluent.cloud/",
    "methodName": "GetStatement",
    "cloudResources": [
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resourceId": "a56cf537-ab71-480e-b272-43e71531798b"
            },
            {
              "type": "ENVIRONMENT",
              "resourceId": "env-9pjxk0"
            },
            {
              "type": "FLINK_REGION",
              "resourceId": "aws.us-west-2"
            }
          ]
        },
        "resource": {
          "type": "STATEMENT",
          "resourceId": "928c8647-582b-4d3b"
        }
      }
    ],
    "authenticationInfo": {
      "principal": {
        "confluentUser": {
          "resourceId": "u-21r8oo"
        }
      },
      "result": "SUCCESS",
      "identity": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/identity-provider=Confluent/identity=u-21r8oo"
    },
    "requestMetadata": {
      "requestId": [
        "a688f8810ba426f39511c04f7b511a0a"
      ],
      "clientAddress": [
        {
          "ip": "1.2.3.4"
        }
      ]
    },
    "request": {
      "accessType": "READ_ONLY",
      "data": {
        "statement_name": "928c8647-582b-4d3b",
        "environment_id": "env-9pjxk0"
      }
    },
    "result": {
      "status": "SUCCESS",
      "data": {
        "metadata": {
          "environment_id": "env-9pjxk0"
        },
        "spec": {
          "compute_pool_id": "lfcp-stgccgjvgr1",
          "name": "928c8647-582b-4d3b",
          "principal": "u-21r8oo"
        }
      }
    },
    "resourceName": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-9pjxk0/flink-region=aws.us-west-2/statement=928c8647-582b-4d3b"
  },
  "subject": "crn://confluent.cloud/organization=a56cf537-ab71-480e-b272-43e71531798b/environment=env-9pjxk0/flink-region=aws.us-west-2/statement=928c8647-582b-4d3b",
  "specversion": "1.0",
  "id": "f6f45075-3d85-4e41-8677-c06a80ef903e",
  "source": "crn://confluent.cloud/",
  "time": "2023-09-22T16:35:20.968310060Z",
  "type": "io.confluent.cloud/request"
}
```

```sql
ListStatements
```

```json
{
  "datacontenttype": "application/json",
  "data": {
    "serviceName": "crn://confluent.cloud/",
    "methodName": "ListStatements",
    "cloudResources": [
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resourceId": "e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5"
            },
            {
              "type": "ENVIRONMENT",
              "resourceId": "env-xx3gwz"
            },
            {
              "type": "FLINK_REGION",
              "resourceId": "aws.eu-west-1"
            }
          ]
        },
        "resource": {
          "type": "STATEMENT",
          "resourceId": "3ab9a756-4bcf-475b"
        }
      },
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resourceId": "e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5"
            },
            {
              "type": "ENVIRONMENT",
              "resourceId": "env-xx3gwz"
            },
            {
              "type": "FLINK_REGION",
              "resourceId": "aws.eu-west-1"
            }
          ]
        },
        "resource": {
          "type": "STATEMENT",
          "resourceId": "e264b999-269c-46d6"
        }
      }
    ],
    "authenticationInfo": {
      "principal": {
        "confluentUser": {
          "resourceId": "u-devccq71mwp"
        }
      },
      "result": "SUCCESS",
      "identity": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/identity-provider=Confluent/identity=u-devccq71mwp"
    },
    "requestMetadata": {
      "requestId": [
        "7cec6a84ab0b05ccb38ecf14981da31b"
      ],
      "clientAddress": [
        {
          "ip": "1.2.3.4"
        }
      ]
    },
    "request": {
      "accessType": "READ_ONLY",
      "data": {
        "compute_pool_id": "",
        "environment_id": "env-xx3gwz",
        "page_size": 100
      }
    },
    "result": {
      "status": "SUCCESS",
      "data": {
        "data": [
          {
            "metadata": {
              "environment_id": "env-xx3gwz"
            },
            "spec": {
              "compute_pool_id": "lfcp-devcc36z5jj",
              "name": "3ab9a756-4bcf-475b",
              "principal": "u-rk1gy7"
            }
          },
          {
            "metadata": {
              "environment_id": "env-xx3gwz"
            },
            "spec": {
              "principal": "u-rk1gy7",
              "compute_pool_id": "lfcp-devcc36z5jj",
              "name": "e264b999-269c-46d6"
            }
          }
        ]
      }
    },
    "resourceName": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/environment=env-xx3gwz/flink-region=aws.eu-west-1"
  },
  "subject": "crn://confluent.cloud/organization=e9eb4f2c-ef73-475c-ba7f-6b37a4ff00e5/environment=env-xx3gwz/flink-region=aws.eu-west-1",
  "specversion": "1.0",
  "id": "5e6bc2d3-9881-442b-af0c-a0a6aa127867",
  "source": "crn://confluent.cloud/",
  "time": "2023-09-22T16:47:00.894461118Z",
  "type": "io.confluent.cloud/request"
}
```

```sql
UpdateStatement
```

```json
{
  "datacontenttype": "application/json",
  "data": {
    "service_name": "crn://confluent.cloud/",
    "method_name": "UpdateStatement",
    "cloud_resources": [
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resource_id": "org-123"
            },
            {
              "type": "ENVIRONMENT",
              "resource_id": "env-123"
            },
            {
              "type": "FLINK_REGION",
              "resource_id": "aws.us-east-2"
            }
          ]
        },
        "resource": {
          "type": "STATEMENT",
          "resource_id": "statement-123"
        }
      },
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resource_id": "org-123"
            },
            {
              "type": "ENVIRONMENT",
              "resource_id": "env-123"
            },
            {
              "type": "FLINK_REGION",
              "resource_id": "aws.us-east-2"
            }
          ]
        },
        "resource": {
          "type": "STATEMENT",
          "resource_id": "statement-123"
        }
      }
    ],
    "authentication_info": {
      "exposure": "CUSTOMER",
      "principal": {
        "confluent_user": {
          "resource_id": "u-123"
        }
      },
      "result": "SUCCESS",
      "identity": "crn://confluent.cloud/organization=org-123/identity-provider=Confluent/identity=u-123"
    },
    "request_metadata": {
      "request_id": [
        "74726163656964303132333435363738"
      ],
      "client_address": [
        {
          "ip": "127.0.0.1"
        }
      ]
    },
    "request": {
      "access_type": "MODIFICATION",
      "data": {
        "environment_id": "env-123",
        "org_resource_id": "org-123",
        "spec": {
          "compute_pool_id": "lfcp-123",
          "name": "statement-123",
          "principal": "sa-123"
        }
      }
    },
    "result": {
      "status": "SUCCESS",
      "data": {
        "metadata": {
          "environment_id": "env-123"
        },
        "spec": {
          "compute_pool_id": "lfcp-123",
          "name": "statement-123",
          "principal": "sa-123"
        }
      }
    }
  }
}
```

```sql
PatchStatement
```

```json
{
  "datacontenttype": "application/json",
  "data": {
    "service_name": "crn://confluent.cloud/",
    "method_name": "PatchStatement",
    "cloud_resources": [
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resource_id": "org-123"
            },
            {
              "type": "ENVIRONMENT",
              "resource_id": "env-123"
            },
            {
              "type": "FLINK_REGION",
              "resource_id": "aws.us-east-2"
            }
          ]
        },
        "resource": {
          "type": "STATEMENT",
          "resource_id": "statement-123"
        }
      },
      {
        "scope": {
          "resources": [
            {
              "type": "ORGANIZATION",
              "resource_id": "org-123"
            },
            {
              "type": "ENVIRONMENT",
              "resource_id": "env-123"
            },
            {
              "type": "FLINK_REGION",
              "resource_id": "aws.us-east-2"
            }
          ]
        },
        "resource": {
          "type": "STATEMENT",
          "resource_id": "statement-123"
        }
      }
    ],
    "authentication_info": {
      "exposure": "CUSTOMER",
      "principal": {
        "confluent_user": {
          "resource_id": "u-123"
        }
      },
      "result": "SUCCESS",
      "identity": "crn://confluent.cloud/organization=org-123/identity-provider=Confluent/identity=u-123"
    },
    "request_metadata": {
      "request_id": [
        "74726163656964303132333435363738"
      ],
      "client_address": [
        {
          "ip": "127.0.0.1"
        }
      ]
    },
    "request": {
      "access_type": "MODIFICATION",
      "data": {
        "environment_id": "env-123",
        "org_resource_id": "org-123",
        "statement_name": "statement-123"
      }
    },
    "result": {
      "status": "SUCCESS"
    }
  }
}
```

---

### Query Encrypted Data with Flink & Confluent Cloud | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/security/encrypt/csfle/flink-integration.html

Secure Stream Processing: Query Encrypted Data with Flink on Confluent Cloud

Processing sensitive data like personally identifiable information (PII) or financial records in real-time data streams presents a significant challenge. How do you perform meaningful operations like filtering, joining, or aggregating data while it remains fully encrypted and secure? Traditionally, you couldn’t. But with Client-Side Field Level Encryption (CSFLE) and deterministic encryption, Confluent Cloud for Flink gives you the power to query and process encrypted data streams directly, unlocking critical use cases while ensuring your data’s privacy and compliance. This combination lets you use the full capabilities of stream processing while your sensitive data remains protected from start to finish.

#### What is deterministic encryption?

At the core of this capability is deterministic encryption, a method where encrypting the same plaintext value with the same key always produces the exact same ciphertext. This property is what allows Flink to process the encrypted ciphertext directly, effectively performing equality comparisons, joins, and groupings on the original data without ever needing to decrypt it.

#### Supported operations on encrypted data

While Flink itself does not perform decryption, it operates on raw bytes. This allows it to process the ciphertext produced by CSFLE, and because the encryption is deterministic, you can:

- Process non-encrypted fields in your data stream without any limitations.
- Run SQL queries that operate directly on your encrypted fields.

Here are some of the key operations possible on deterministically encrypted columns:

- Filtering and equality: Use encrypted fields in WHERE clauses for exact matches.
- Grouping and aggregation: Perform GROUP BY operations on encrypted fields. The only aggregation functions that work correctly are those based on uniqueness comparison, such as COUNT and COUNT(DISTINCT).
- Joins: Join multiple streams together using an encrypted column (for example, joining a stream of user activity to a stream of user profiles on an encrypted user ID).
- Window functions: Use comparison-based window functions like LEAD and LAG.

#### Example: Query on an encrypted column

Suppose you want to count the number of active users by counting the distinct values of the deterministically encrypted email field: `SELECT COUNT(DISTINCT email_encrypted) FROM users_stream WHERE status = 'ACTIVE';`

This example shows a common use case, counting unique active users, where the email_encrypted field can be compared for uniqueness without being decrypted, because the encryption is deterministic.

#### Important limitations and trade-offs

This capability comes with two important considerations you must understand:

- Limited aggregation functions: Because the data’s actual value is never revealed to Flink, mathematical operations do not produce correct results. Aggregation functions like SUM, AVG, MIN, and MAX execute but yield erroneous values.
- The deterministic trade-off: Deterministic encryption inherently reveals when two encrypted values are identical. This is a necessary trade-off that enables querying, but it’s a piece of information that can be analyzed. You should carefully consider this when deciding which fields to encrypt deterministically.

#### How it works

When you use CSFLE with Flink on Confluent Cloud, the security of your data is maintained because the actual decryption only happens when the data is read from a sink (like a database or materialized view) by an authorized client application that holds the decryption keys. Flink processes the data without ever having access to the plaintext. This ensures that sensitive data cannot be exposed even in the event of a compromise within the processing environment.

#### Powered by Google Tink

The CSFLE implementation uses the open-source Google Tink Cryptographic library to perform deterministic encryption using the AES256_SIV algorithm.

For more information on Google Tink, see the following:

- I want to encrypt data deterministically
- Deterministic Authenticated Encryption with Associated Data
- Use Tink to meet FIPS 140-2 requirements

#### Related content

- Protect Sensitive Data Using Client-Side Field Level Encryption on Confluent Cloud
- Secure Stream Processing: Query Encrypted Data with Flink on Confluent Cloud

#### Code Examples

```sql
COUNT(DISTINCT)
```

```sql
SELECT COUNT(DISTINCT email_encrypted)
FROM users_stream
WHERE status = 'ACTIVE';
```
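As a further sketch of the grouping case described above, the following query counts rows per encrypted email instead of counting distinct emails. It reuses the users_stream table and the email_encrypted and status columns from the example; the event_count alias is an assumption added for illustration. The grouping key in the result is still ciphertext, so only a client that holds the decryption keys can map a group back to an address.

```sql
-- Sketch: GROUP BY on a deterministically encrypted column.
-- Equal plaintext emails produce equal ciphertext, so rows for the same
-- user fall into the same group without Flink decrypting anything.
-- event_count is an illustrative alias, not part of the source example.
SELECT
  email_encrypted,
  COUNT(*) AS event_count
FROM users_stream
WHERE status = 'ACTIVE'
GROUP BY email_encrypted;
```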

```sql
email_encrypted
```
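The join case mentioned in the text (user activity joined to user profiles on an encrypted user ID) could look roughly like the sketch below. The table names user_activity and user_profiles and the columns user_id_encrypted, activity_type, and plan are hypothetical and chosen only for illustration. The sketch assumes both tables encrypt the user ID deterministically with the same key, since identical ciphertext is only produced for the same plaintext under the same key.

```sql
-- Sketch: equality join on a deterministically encrypted column.
-- All table and column names here are hypothetical.
-- Assumes user_id_encrypted is encrypted with the same key in both tables,
-- so matching user IDs have byte-identical ciphertext.
SELECT
  a.user_id_encrypted,
  a.activity_type,
  p.plan
FROM user_activity AS a
JOIN user_profiles AS p
  ON a.user_id_encrypted = p.user_id_encrypted;
```

Only the equality comparison in the ON clause relies on the encrypted column; the non-encrypted columns can be used without restriction.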

---

### Query Tableflow Tables with Confluent Cloud for Apache Flink | Confluent Documentation
Source: https://docs.confluent.io/cloud/current/topics/tableflow/how-to-guides/query-engines/query-with-flink.html

Query Tableflow Tables with Flink in Confluent Cloud for Apache Flink®

Confluent Cloud for Apache Flink® supports snapshot queries that read data from a Tableflow-enabled topic at a specific point in time. Querying a Tableflow-enabled topic is similar to querying a Flink topic. If Tableflow is enabled on a topic with Confluent Managed Storage, the query reads from both Kafka and Parquet. If Tableflow is enabled on a topic with custom storage, the query reads from your S3 bucket. This guide shows how to run a snapshot query on a Tableflow-enabled topic.

Note: Snapshot query is an Early Access Program feature in Confluent Cloud for Apache Flink. An Early Access feature is a component of Confluent Cloud introduced to gain feedback. This feature should be used only for evaluation and non-production testing purposes or to provide feedback to Confluent, particularly as it becomes more widely available in follow-on preview editions. Early Access Program features are intended for evaluation use in development and testing environments only, and not for production use. Early Access Program features are provided: (a) without support; (b) “AS IS”; and (c) without indemnification, warranty, or condition of any kind. No service level commitment will apply to Early Access Program features. Early Access Program features are considered to be a Proof of Concept as defined in the Confluent Cloud Terms of Service. Confluent may discontinue providing preview releases of the Early Access Program features at any time in Confluent’s sole discretion.

#### Prerequisites

- Access to Confluent Cloud.
- The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, contact your OrganizationAdmin or EnvironmentAdmin. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink.
- A provisioned Flink compute pool.

#### Step 1: Enable Tableflow on your topic

If you want to try querying a table with mock data, complete the steps in Run a Snapshot Query, then proceed to the next step.

If you want to query a table with mock data or data from your Kafka topic, and you want to use Confluent Managed Storage, complete the following steps, then proceed to Step 2: Run a snapshot query with Tableflow.

1. In Confluent Cloud Console, navigate to your cluster.
2. In the navigation menu, click Topics.
3. In the topics list, find your topic and click it to open the details page.
4. Click Enable Tableflow.
5. In the Enable Tableflow dialog, select Iceberg and click Use Confluent storage.

The topic status updates to Tableflow Syncing.

If you want to query a table with data from your Kafka topic, and you want to use custom storage, complete steps 1-4 in Tableflow Quick Start Using Your Storage and AWS Glue and proceed to Step 2: Run a snapshot query with Tableflow.

#### Step 2: Run a snapshot query with Tableflow

Once Tableflow is enabled on your topic, you can run a snapshot query on the table by using the same statements that you use for Flink tables.

In a Flink workspace or the Flink SQL shell, prepend your query with the following SET statement: `SET 'sql.snapshot.mode' = 'now';`

Also, in the Flink workspace, you can change the Mode dropdown setting to Snapshot before running your query. For more information, see Run a Snapshot Query.

#### Related content

- Query with AWS
- Query with Snowflake
- Query with Trino
- Stream Processing with Confluent Cloud for Apache Flink

#### Code Examples

```sql
SET 'sql.snapshot.mode' = 'now';
```
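The "prepend your query" step can be sketched in Python as a small helper that builds the statement text to paste into a Flink workspace or the Flink SQL shell. This is only an illustration: the helper name `as_snapshot_query` and the table name `orders` are hypothetical, and only the SET statement itself comes from the guide above.

```python
# The SET statement is copied from the guide; everything else is illustrative.
SNAPSHOT_SET = "SET 'sql.snapshot.mode' = 'now';"


def as_snapshot_query(query: str) -> str:
    """Return the query text with the snapshot-mode SET statement prepended."""
    body = query.strip()
    if not body:
        raise ValueError("query must not be empty")
    if not body.endswith(";"):
        body += ";"  # terminate the statement so it can follow the SET line
    return f"{SNAPSHOT_SET}\n{body}"


if __name__ == "__main__":
    # `orders` is a hypothetical Tableflow-enabled topic.
    print(as_snapshot_query("SELECT * FROM `orders` LIMIT 10"))
```

The helper only assembles text; it does not submit anything to Confluent Cloud.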

---

### confluent flink application create | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/confluent_flink_application_create.html

confluent flink application create Description Create a Flink application. confluent flink application create <resourceFilePath> [flags] Flags --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "json" or "yaml". (default "json") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink application - Manage Flink applications.

#### Code Examples

```shell
confluent flink application create <resourceFilePath> [flags]
```

```text
    --environment string                  REQUIRED: Name of the Flink environment.
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "json" or "yaml". (default "json")
```

```text
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
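When scripting this command, one option is to assemble the argument list in Python and hand it to `subprocess.run`. The sketch below uses only the flags listed above; the function name `application_create_argv`, the file name `application.json`, the environment name `dev`, and the URL `http://localhost:8080` are hypothetical placeholders.

```python
from typing import List, Optional


def application_create_argv(
    resource_file: str,
    environment: str,
    url: Optional[str] = None,
    output: str = "json",
) -> List[str]:
    """Build the argument list for `confluent flink application create`."""
    if output not in ("json", "yaml"):
        raise ValueError('output must be "json" or "yaml"')
    argv = [
        "confluent", "flink", "application", "create", resource_file,
        "--environment", environment,
        "--output", output,
    ]
    # --url can be left out when CONFLUENT_CMF_URL is set in the environment.
    if url is not None:
        argv += ["--url", url]
    return argv


if __name__ == "__main__":
    # Placeholder values; nothing is executed here, the command is only printed.
    print(" ".join(application_create_argv(
        "application.json", "dev", url="http://localhost:8080", output="yaml")))
```

To run the command for real, pass the returned list to `subprocess.run(argv, check=True)` on a machine where the Confluent CLI is installed.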

---

### confluent flink application delete | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/confluent_flink_application_delete.html

confluent flink application delete Description Delete one or more Flink applications. confluent flink application delete <name-1> [name-2] ... [name-n] [flags] Flags --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. --force Skip the deletion confirmation prompt. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink application - Manage Flink applications.

#### Code Examples

```shell
confluent flink application delete <name-1> [name-2] ... [name-n] [flags]
```

```text
--environment string                  REQUIRED: Name of the Flink environment.
--url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
--client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
--client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
--certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
--force                               Skip the deletion confirmation prompt.
```

```text
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
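Because the command accepts one or more names, a script that deletes several applications can pass them all in a single invocation. The following sketch builds such an argument list; the function name `application_delete_argv` and the application and environment names are hypothetical.

```python
from typing import List, Sequence


def application_delete_argv(
    names: Sequence[str],
    environment: str,
    force: bool = False,
) -> List[str]:
    """Build the argument list for `confluent flink application delete`."""
    if not names:
        raise ValueError("at least one application name is required")
    argv = [
        "confluent", "flink", "application", "delete", *names,
        "--environment", environment,
    ]
    if force:
        # --force skips the deletion confirmation prompt, which is usually
        # what a non-interactive script needs.
        argv.append("--force")
    return argv


if __name__ == "__main__":
    # Hypothetical application and environment names.
    print(" ".join(application_delete_argv(["app-a", "app-b"], "dev", force=True)))
```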

---

### confluent flink application describe | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/confluent_flink_application_describe.html

confluent flink application describe Description Describe a Flink application. confluent flink application describe <name> [flags] Flags --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "json" or "yaml". (default "json") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink application - Manage Flink applications.

#### Code Examples

```shell
confluent flink application describe <name> [flags]
```

```text
    --environment string                  REQUIRED: Name of the Flink environment.
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "json" or "yaml". (default "json")
```

```text
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
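The `-v` global flag is a counter, so each log level corresponds to a repeated flag. As a rough sketch, a wrapper script could map level names to the documented flags like this; the names `VERBOSITY_FLAGS` and `with_verbosity`, and the application and environment names in the demo, are hypothetical.

```python
from typing import List, Optional, Sequence

# Mapping taken from the global flag description above.
VERBOSITY_FLAGS = {
    "warn": "-v",
    "info": "-vv",
    "debug": "-vvv",
    "trace": "-vvvv",
}


def with_verbosity(argv: Sequence[str], level: Optional[str] = None) -> List[str]:
    """Return a copy of argv with the verbosity flag for `level` appended."""
    if level is None:
        return list(argv)
    if level not in VERBOSITY_FLAGS:
        raise ValueError(f"unknown level {level!r}; expected one of {sorted(VERBOSITY_FLAGS)}")
    return [*argv, VERBOSITY_FLAGS[level]]


if __name__ == "__main__":
    base = ["confluent", "flink", "application", "describe", "my-app",
            "--environment", "dev"]  # hypothetical names
    print(" ".join(with_verbosity(base, "debug")))
```

The sketch deliberately leaves out `--unsafe-trace`, since that flag also logs HTTP requests and responses that might contain plaintext secrets.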

---

### confluent flink application list | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/confluent_flink_application_list.html

confluent flink application list Description List Flink applications. confluent flink application list [flags] Flags --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink application - Manage Flink applications.

#### Code Examples

```shell
confluent flink application list [flags]
```

```text
    --environment string                  REQUIRED: Name of the Flink environment.
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```text
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
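The four connection flags can each be replaced by an environment variable, which keeps repeated invocations short. A minimal sketch of preparing such an environment for `subprocess.run` follows; the function name `cmf_environment`, the URL, and the file paths are hypothetical, and only the variable names come from the flag descriptions above.

```python
import os
from typing import Dict, Mapping, Optional


def cmf_environment(
    url: str,
    client_key_path: Optional[str] = None,
    client_cert_path: Optional[str] = None,
    certificate_authority_path: Optional[str] = None,
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a process environment with the CMF connection variables set."""
    env = dict(os.environ if base is None else base)
    env["CONFLUENT_CMF_URL"] = url
    optional = {
        "CONFLUENT_CMF_CLIENT_KEY_PATH": client_key_path,
        "CONFLUENT_CMF_CLIENT_CERT_PATH": client_cert_path,
        "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH": certificate_authority_path,
    }
    for name, value in optional.items():
        if value is not None:  # only set the variables the caller supplied
            env[name] = value
    return env


if __name__ == "__main__":
    # Placeholder URL and paths, starting from an empty base for a short demo.
    env = cmf_environment(
        "https://cmf.example.com",
        client_key_path="/etc/cmf/client.key",
        client_cert_path="/etc/cmf/client.crt",
        base={},
    )
    for name in sorted(env):
        print(f"{name}={env[name]}")
```

The result can be passed as `env=` to `subprocess.run(["confluent", "flink", "application", "list", "--environment", "dev"], env=env, check=True)`, where `dev` is again a placeholder.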

---

### confluent flink application update | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/confluent_flink_application_update.html

confluent flink application update Description Update a Flink application. confluent flink application update <resourceFilePath> [flags] Flags --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "json" or "yaml". (default "json") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink application - Manage Flink applications.

#### Code Examples

```shell
confluent flink application update <resourceFilePath> [flags]
```

```text
    --environment string                  REQUIRED: Name of the Flink environment.
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "json" or "yaml". (default "json")
```

```text
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

---

### confluent flink application web-ui-forward | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/confluent_flink_application_web-ui-forward.html

confluent flink application web-ui-forward Description Forward the web UI of a Flink application. confluent flink application web-ui-forward <name> [flags] Flags --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. --port uint16 Port to forward the web UI to. If not provided, a random, OS-assigned port will be used. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink application - Manage Flink applications.

#### Code Examples

```shell
confluent flink application web-ui-forward <name> [flags]
```

```text
--environment string                  REQUIRED: Name of the Flink environment.
--url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
--client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
--client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
--certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
--port uint16                         Port to forward the web UI to. If not provided, a random, OS-assigned port will be used.
```

```text
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
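The `--port` flag is typed as `uint16`, so a wrapper script may want to reject out-of-range values before invoking the CLI. The sketch below does that and omits the flag when no port is given, in which case a random, OS-assigned port is used. The function name `web_ui_forward_argv` and the application and environment names are hypothetical.

```python
from typing import List, Optional


def web_ui_forward_argv(
    name: str,
    environment: str,
    port: Optional[int] = None,
) -> List[str]:
    """Build the argument list for `confluent flink application web-ui-forward`."""
    argv = [
        "confluent", "flink", "application", "web-ui-forward", name,
        "--environment", environment,
    ]
    if port is not None:
        # uint16 covers 0..65535; anything else cannot be parsed by the flag.
        if not 0 <= port <= 65535:
            raise ValueError("port must fit in an unsigned 16-bit integer")
        argv += ["--port", str(port)]
    return argv


if __name__ == "__main__":
    # Hypothetical application name, environment name, and port.
    print(" ".join(web_ui_forward_argv("my-app", "dev", port=8081)))
```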

---

### confluent flink application | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/application/index.html

confluent flink application Aliases application, app Description Manage Flink applications. Subcommands Command Description confluent flink application create Create a Flink application. confluent flink application delete Delete one or more Flink applications. confluent flink application describe Describe a Flink application. confluent flink application list List Flink applications. confluent flink application update Update a Flink application. confluent flink application web-ui-forward Forward the web UI of a Flink application.

#### Code Examples

```text
application, app
```

---

### confluent flink artifact create | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/artifact/confluent_flink_artifact_create.html

confluent flink artifact create Description Create a Flink UDF artifact. confluent flink artifact create <unique-name> [flags] Flags --artifact-file string REQUIRED: Flink artifact JAR file or ZIP file. --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --region string REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all). --environment string Environment ID. --runtime-language string Specify the Flink artifact runtime language as "python" or "java". (default "java") --description string Specify the Flink artifact description. --documentation-link string Specify the Flink artifact documentation link. --context string CLI context name. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Create Flink artifact “my-flink-artifact”. confluent flink artifact create my-flink-artifact --artifact-file artifact.jar --cloud aws --region us-west-2 --environment env-123456 Create Flink artifact “flink-java-artifact”. confluent flink artifact create my-flink-artifact --artifact-file artifact.jar --cloud aws --region us-west-2 --environment env-123456 --description flinkJavaScalar See Also confluent flink artifact - Manage Flink UDF artifacts.

#### Code Examples

```shell
confluent flink artifact create <unique-name> [flags]
```

```text
    --artifact-file string        REQUIRED: Flink artifact JAR file or ZIP file.
    --cloud string                REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp".
    --region string               REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all).
    --environment string          Environment ID.
    --runtime-language string     Specify the Flink artifact runtime language as "python" or "java". (default "java")
    --description string          Specify the Flink artifact description.
    --documentation-link string   Specify the Flink artifact documentation link.
    --context string              CLI context name.
-o, --output string               Specify the output format as "human", "json", or "yaml". (default "human")
```

```text
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```shell
confluent flink artifact create my-flink-artifact --artifact-file artifact.jar --cloud aws --region us-west-2 --environment env-123456
```

```shell
confluent flink artifact create my-flink-artifact --artifact-file artifact.jar --cloud aws --region us-west-2 --environment env-123456 --description flinkJavaScalar
```
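For automation, the documented examples can be reproduced from Python by building the argument list and checking the enumerated flag values first. This is a sketch rather than an official client: the function name `artifact_create_argv` and the constants `CLOUDS` and `RUNTIME_LANGUAGES` are hypothetical, while the flags and allowed values are taken from the listing above.

```python
from typing import List, Optional

CLOUDS = ("aws", "azure", "gcp")
RUNTIME_LANGUAGES = ("python", "java")


def artifact_create_argv(
    name: str,
    artifact_file: str,
    cloud: str,
    region: str,
    environment: Optional[str] = None,
    runtime_language: Optional[str] = None,
    description: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `confluent flink artifact create`."""
    if cloud not in CLOUDS:
        raise ValueError(f"cloud must be one of {CLOUDS}")
    if runtime_language is not None and runtime_language not in RUNTIME_LANGUAGES:
        raise ValueError(f"runtime_language must be one of {RUNTIME_LANGUAGES}")
    argv = [
        "confluent", "flink", "artifact", "create", name,
        "--artifact-file", artifact_file,
        "--cloud", cloud,
        "--region", region,
    ]
    # Optional flags are added only when the caller supplies a value.
    if environment is not None:
        argv += ["--environment", environment]
    if runtime_language is not None:
        argv += ["--runtime-language", runtime_language]
    if description is not None:
        argv += ["--description", description]
    return argv


if __name__ == "__main__":
    # Same values as the first documented example.
    print(" ".join(artifact_create_argv(
        "my-flink-artifact", "artifact.jar", "aws", "us-west-2",
        environment="env-123456")))
```

Passing the list to `subprocess.run(argv, check=True)` avoids shell quoting issues when a description contains spaces.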

---

### confluent flink artifact delete | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/artifact/confluent_flink_artifact_delete.html

confluent flink artifact delete Description Delete one or more Flink UDF artifacts. confluent flink artifact delete <id-1> [id-2] ... [id-n] [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --region string REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all). --environment string Environment ID. --force Skip the deletion confirmation prompt. --context string CLI context name. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Delete Flink UDF artifact. confluent flink artifact delete --cloud aws --region us-west-2 --environment env-123456 cfa-123456 See Also confluent flink artifact - Manage Flink UDF artifacts.

#### Code Examples

```shell
confluent flink artifact delete <id-1> [id-2] ... [id-n] [flags]
```

```text
--cloud string         REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp".
--region string        REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all).
--environment string   Environment ID.
--force                Skip the deletion confirmation prompt.
--context string       CLI context name.
```

```text
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```shell
confluent flink artifact delete --cloud aws --region us-west-2 --environment env-123456 cfa-123456
```

---

### confluent flink artifact describe | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/artifact/confluent_flink_artifact_describe.html

confluent flink artifact describe Description Describe a Flink UDF artifact. confluent flink artifact describe <id> [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --region string REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all). --environment string Environment ID. --context string CLI context name. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Describe Flink UDF artifact. confluent flink artifact describe --cloud aws --region us-west-2 --environment env-123456 cfa-123456 See Also confluent flink artifact - Manage Flink UDF artifacts.

#### Code Examples

```shell
confluent flink artifact describe <id> [flags]
```

```text
    --cloud string         REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp".
    --region string        REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all).
    --environment string   Environment ID.
    --context string       CLI context name.
-o, --output string        Specify the output format as "human", "json", or "yaml". (default "human")
```

```text
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```shell
confluent flink artifact describe --cloud aws --region us-west-2 --environment env-123456 cfa-123456
```

---

### confluent flink artifact list | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/artifact/confluent_flink_artifact_list.html

confluent flink artifact list Description List Flink UDF artifacts. confluent flink artifact list [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --region string REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all). --environment string Environment ID. --context string CLI context name. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples List Flink UDF artifacts. confluent flink artifact list --cloud aws --region us-west-2 --environment env-123456 See Also confluent flink artifact - Manage Flink UDF artifacts.

#### Code Examples

```shell
confluent flink artifact list [flags]
```

```text
    --cloud string         REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp".
    --region string        REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all).
    --environment string   Environment ID.
    --context string       CLI context name.
-o, --output string        Specify the output format as "human", "json", or "yaml". (default "human")
```

```text
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```shell
confluent flink artifact list --cloud aws --region us-west-2 --environment env-123456
```

---

### confluent flink artifact | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/artifact/index.html

confluent flink artifact Description Manage Flink UDF artifacts. Subcommands Command Description confluent flink artifact create Create a Flink UDF artifact. confluent flink artifact delete Delete one or more Flink UDF artifacts. confluent flink artifact describe Describe a Flink UDF artifact. confluent flink artifact list List Flink UDF artifacts.

---

### confluent flink catalog create | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/catalog/confluent_flink_catalog_create.html

confluent flink catalog create Description Create a Flink catalog in Confluent Platform that provides metadata about tables and other database objects such as views and functions. confluent flink catalog create <resourceFilePath> [flags] Flags --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink catalog - Manage Flink catalogs in Confluent Platform.

#### Code Examples

```sql
confluent flink catalog create <resourceFilePath> [flags]
```

```sql
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
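
The flag descriptions above note that each CMF connection flag has an environment-variable equivalent. A minimal sketch of that pattern follows. The URL, the certificate paths, and the `./catalog.json` resource file are placeholder assumptions, and `DRY_RUN` is a convention of this sketch (not a CLI feature) that prints the command instead of executing it.

```shell
# Sketch: supply CMF connection settings through environment variables
# instead of repeating --url and the mTLS flags on every command.
# All values below are placeholders, not real endpoints or files.
export CONFLUENT_CMF_URL="https://cmf.example.internal:8080"
export CONFLUENT_CMF_CLIENT_KEY_PATH="/etc/cmf/client.key"
export CONFLUENT_CMF_CLIENT_CERT_PATH="/etc/cmf/client.crt"
export CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH="/etc/cmf/ca.pem"

# When DRY_RUN is 1 (the default here), print the command; otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# ./catalog.json is a hypothetical resource file path.
run confluent flink catalog create ./catalog.json --output json
```

Setting `DRY_RUN=0` would execute the command against the CMF instance named in `CONFLUENT_CMF_URL`.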

---

### confluent flink catalog delete | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/catalog/confluent_flink_catalog_delete.html

confluent flink catalog delete Description Delete one or more Flink catalogs in Confluent Platform. confluent flink catalog delete <name-1> [name-2] ... [name-n] [flags] Flags --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. --force Skip the deletion confirmation prompt. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink catalog - Manage Flink catalogs in Confluent Platform.

#### Code Examples

```sql
confluent flink catalog delete <name-1> [name-2] ... [name-n] [flags]
```

```sql
--url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
--client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
--client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
--certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
--force                               Skip the deletion confirmation prompt.
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

---

### confluent flink catalog describe | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/catalog/confluent_flink_catalog_describe.html

confluent flink catalog describe Description Describe a Flink catalog in Confluent Platform. confluent flink catalog describe <name> [flags] Flags --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink catalog - Manage Flink catalogs in Confluent Platform.

#### Code Examples

```sql
confluent flink catalog describe <name> [flags]
```

```sql
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

---

### confluent flink catalog list | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/catalog/confluent_flink_catalog_list.html

confluent flink catalog list Description List Flink catalogs in Confluent Platform. confluent flink catalog list [flags] Flags --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink catalog - Manage Flink catalogs in Confluent Platform.

#### Code Examples

```sql
confluent flink catalog list [flags]
```

```sql
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

---

### confluent flink catalog | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/catalog/index.html

confluent flink catalog Description Manage Flink catalogs in Confluent Platform. Subcommands Command Description confluent flink catalog create Create a Flink catalog. confluent flink catalog delete Delete one or more Flink catalogs in Confluent Platform. confluent flink catalog describe Describe a Flink catalog in Confluent Platform. confluent flink catalog list List Flink catalogs in Confluent Platform.

---

### confluent flink compute-pool create | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_create.html

confluent flink compute-pool create Description Cloud Create a Flink compute pool. confluent flink compute-pool create <name> [flags] On-Premises Create a Flink compute pool in Confluent Platform. confluent flink compute-pool create <resourceFilePath> [flags] Flags Cloud --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --region string REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all). --max-cfu int32 Maximum number of Confluent Flink Units (CFU). (default 5) --environment string Environment ID. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") On-Premises --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Cloud Create Flink compute pool “my-compute-pool” in AWS with 5 CFUs. 
confluent flink compute-pool create my-compute-pool --cloud aws --region us-west-2 --max-cfu 5 On-Premises No examples. See Also confluent flink compute-pool - Manage Flink compute pools.

#### Code Examples

```sql
confluent flink compute-pool create <name> [flags]
```

```sql
confluent flink compute-pool create <resourceFilePath> [flags]
```

```sql
    --cloud string         REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp".
    --region string        REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all).
    --max-cfu int32        Maximum number of Confluent Flink Units (CFU). (default 5)
    --environment string   Environment ID.
-o, --output string        Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
    --environment string                  REQUIRED: Name of the Flink environment.
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink compute-pool create my-compute-pool --cloud aws --region us-west-2 --max-cfu 5
```
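
Because `--cloud` accepts only "aws", "azure", or "gcp", a wrapper script can reject other values before the CLI is called. The sketch below illustrates one way to do that. The `create_pool` function name and the `DRY_RUN` switch are assumptions of this sketch, not part of the Confluent CLI; only the flags and the default of 5 CFUs come from the reference above.

```shell
# Sketch: validate the cloud provider, then build the create command.
create_pool() {
  name=$1
  cloud=$2
  region=$3
  max_cfu=${4:-5}   # 5 mirrors the documented default for --max-cfu

  case "$cloud" in
    aws|azure|gcp) ;;   # values listed in the --cloud flag description
    *) echo "unsupported cloud provider: $cloud" >&2; return 1 ;;
  esac

  set -- confluent flink compute-pool create "$name" \
    --cloud "$cloud" --region "$region" --max-cfu "$max_cfu"

  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"   # print only; nothing is created
  else
    "$@"
  fi
}

create_pool my-compute-pool aws us-west-2
```

With the default dry-run setting, the call prints the same command as the documented Cloud example; an unsupported provider returns a non-zero status without contacting Confluent Cloud.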

---

### confluent flink compute-pool delete | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_delete.html

confluent flink compute-pool delete Description Cloud Delete one or more Flink compute pools. confluent flink compute-pool delete <id-1> [id-2] ... [id-n] [flags] On-Premises Delete one or more Flink compute pools in Confluent Platform, a compute pool can only be deleted if there are no statements associated with it. confluent flink compute-pool delete <name-1> [name-2] ... [name-n] [flags] Flags Cloud --force Skip the deletion confirmation prompt. --environment string Environment ID. --context string CLI context name. On-Premises --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. --force Skip the deletion confirmation prompt. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink compute-pool - Manage Flink compute pools.

#### Code Examples

```sql
confluent flink compute-pool delete <id-1> [id-2] ... [id-n] [flags]
```

```sql
confluent flink compute-pool delete <name-1> [name-2] ... [name-n] [flags]
```

```sql
--force                Skip the deletion confirmation prompt.
--environment string   Environment ID.
--context string       CLI context name.
```

```sql
--environment string                  REQUIRED: Name of the Flink environment.
--url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
--client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
--client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
--certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
--force                               Skip the deletion confirmation prompt.
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
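
Since the Cloud form of the command accepts several IDs, a cleanup script can pass them in one call and add `--force` to skip the prompt. The following is a sketch under assumptions: the pool IDs and the environment ID are placeholders, and `DRY_RUN` is a convention of this sketch that prints the command rather than running it.

```shell
# Sketch: delete several Cloud compute pools in one non-interactive call.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"   # show what would be executed
  else
    "$@"
  fi
}

# Placeholder pool IDs; replace with IDs from "confluent flink compute-pool list".
set -- lfcp-111111 lfcp-222222
echo "Deleting $# compute pool(s)"
run confluent flink compute-pool delete "$@" --force --environment env-123456
```

Keeping the dry-run default in place is a cheap safeguard here, because `--force` removes the confirmation step.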

---

### confluent flink compute-pool describe | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_describe.html

confluent flink compute-pool describe Description Cloud Describe a Flink compute pool. confluent flink compute-pool describe [id] [flags] On-Premises Describe a Flink compute pool in Confluent Platform. confluent flink compute-pool describe <name> [flags] Flags Cloud --environment string Environment ID. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") On-Premises --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink compute-pool - Manage Flink compute pools.

#### Code Examples

```sql
confluent flink compute-pool describe [id] [flags]
```

```sql
confluent flink compute-pool describe <name> [flags]
```

```sql
    --environment string   Environment ID.
-o, --output string        Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
    --environment string                  REQUIRED: Name of the Flink environment.
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

---

### confluent flink compute-pool list | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_list.html

confluent flink compute-pool list Description Cloud List Flink compute pools. confluent flink compute-pool list [flags] On-Premises List Flink compute pools in Confluent Platform. confluent flink compute-pool list [flags] Flags Cloud --region string Cloud region for Flink (use "confluent flink region list" to see all). --environment string Environment ID. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") On-Premises --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink compute-pool - Manage Flink compute pools.

#### Code Examples

```sql
confluent flink compute-pool list [flags]
```

```sql
    --region string        Cloud region for Flink (use "confluent flink region list" to see all).
    --environment string   Environment ID.
-o, --output string        Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
    --environment string                  REQUIRED: Name of the Flink environment.
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

---

### confluent flink compute-pool unset | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_unset.html

confluent flink compute-pool unset Description Unset the current Flink compute pool that was set with the use command. confluent flink compute-pool unset [flags] Flags -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Unset default compute pool: confluent flink compute-pool unset See Also confluent flink compute-pool - Manage Flink compute pools.

#### Code Examples

```sql
confluent flink compute-pool unset [flags]
```

```sql
-o, --output string   Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink compute-pool unset
```

---

### confluent flink compute-pool update | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_update.html

confluent flink compute-pool update Description Update a Flink compute pool. confluent flink compute-pool update [id] [flags] Flags --name string Name of the compute pool. --max-cfu int32 Maximum number of Confluent Flink Units (CFU). --environment string Environment ID. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Update name and CFU count of a Flink compute pool. confluent flink compute-pool update lfcp-123456 --name "new name" --max-cfu 5 See Also confluent flink compute-pool - Manage Flink compute pools.

#### Code Examples

```sql
confluent flink compute-pool update [id] [flags]
```

```sql
    --name string          Name of the compute pool.
    --max-cfu int32        Maximum number of Confluent Flink Units (CFU).
    --environment string   Environment ID.
-o, --output string        Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink compute-pool update lfcp-123456 --name "new name" --max-cfu 5
```

---

### confluent flink compute-pool use | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/confluent_flink_compute-pool_use.html

confluent flink compute-pool use Description Choose a Flink compute pool to be used in subsequent commands which support passing a compute pool with the --compute-pool flag. confluent flink compute-pool use <id> [flags] Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink compute-pool - Manage Flink compute pools.

#### Code Examples

```sql
confluent flink compute-pool use <id> [flags]
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
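
The `use` and `unset` commands bracket a working session: select a pool, run commands that would otherwise need `--compute-pool`, then clear the selection. A minimal sketch of that sequence follows. The pool ID is a placeholder, `DRY_RUN` is a convention of this sketch rather than a CLI feature, and the assumption that `confluent flink shell` picks up the selected pool rests on it accepting the `--compute-pool` flag.

```shell
# Sketch: select a default compute pool, use it, then clear the selection.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"   # print instead of executing
  else
    "$@"
  fi
}

run confluent flink compute-pool use lfcp-123456   # placeholder pool ID

# Commands that accept --compute-pool should now be able to omit it.
run confluent flink shell

run confluent flink compute-pool unset
```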

---

### confluent flink compute-pool | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/compute-pool/index.html

confluent flink compute-pool Description Manage Flink compute pools. Subcommands Cloud Command Description confluent flink compute-pool create Create a Flink compute pool. confluent flink compute-pool delete Delete one or more Flink compute pools. confluent flink compute-pool describe Describe a Flink compute pool. confluent flink compute-pool list List Flink compute pools. confluent flink compute-pool unset Unset the current Flink compute pool. confluent flink compute-pool update Update a Flink compute pool. confluent flink compute-pool use Use a Flink compute pool in subsequent commands. On-Premises Command Description confluent flink compute-pool create Create a Flink compute pool in Confluent Platform. confluent flink compute-pool delete Delete one or more Flink compute pools. confluent flink compute-pool describe Describe a Flink compute pool in Confluent Platform. confluent flink compute-pool list List Flink compute pools in Confluent Platform.

---

### confluent flink shell | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/confluent_flink_shell.html

confluent flink shell Description Start Flink interactive SQL client. confluent flink shell [flags] Flags --compute-pool string Flink compute pool ID. --service-account string Service account ID. --database string The database which will be used as the default database. When using Kafka, this is the cluster ID. --environment string Environment ID. --context string CLI context name. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples For a Quick Start with examples in context, see https://docs.confluent.io/cloud/current/flink/get-started/quick-start-shell.html. See Also confluent flink - Manage Apache Flink.

#### Code Examples

```sql
confluent flink shell [flags]
```

```sql
--compute-pool string      Flink compute pool ID.
--service-account string   Service account ID.
--database string          The database which will be used as the default database. When using Kafka, this is the cluster ID.
--environment string       Environment ID.
--context string           CLI context name.
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
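
The flags listed above can be combined in a single invocation. The following is a minimal sketch that only prints the assembled command for review; the IDs `lfcp-example`, `env-example`, and `lkc-example` and the function name `flink_shell_cmd` are hypothetical placeholders, not values from the reference.

```shell
#!/bin/sh
# Hypothetical IDs for illustration only; substitute your own.
COMPUTE_POOL="lfcp-example"
ENVIRONMENT="env-example"
DATABASE="lkc-example"   # with Kafka, the database is the cluster ID

# Print the assembled command so it can be reviewed before it is run.
flink_shell_cmd() {
  printf 'confluent flink shell --compute-pool %s --environment %s --database %s\n' \
    "$COMPUTE_POOL" "$ENVIRONMENT" "$DATABASE"
}

flink_shell_cmd
```

Copying the printed line into a terminal starts the interactive SQL client with those defaults.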

---

### confluent flink connection create | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/connection/confluent_flink_connection_create.html

confluent flink connection create Description Create a Flink connection. confluent flink connection create <name> [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --region string REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all). --type string REQUIRED: Specify the connection type as "openai", "azureml", "azureopenai", "bedrock", "sagemaker", "googleai", "vertexai", "mongodb", "elastic", "pinecone", "couchbase", "confluent_jdbc", "rest", or "mcp_server". --endpoint string REQUIRED: Specify endpoint for the connection. --api-key string Specify API key for the type: "openai", "azureml", "azureopenai", "googleai", "elastic", "pinecone", or "mcp_server". --aws-access-key string Specify access key for the type: "bedrock" or "sagemaker". --aws-secret-key string Specify secret key for the type: "bedrock" or "sagemaker". --aws-session-token string Specify session token for the type: "bedrock" or "sagemaker". --service-key string Specify service key for the type: "vertexai". --username string Specify username for the type: "mongodb", "couchbase", "confluent_jdbc", or "rest". --password string Specify password for the type: "mongodb", "couchbase", "confluent_jdbc", or "rest". --token string Specify bearer token for the type: "rest" or "mcp_server". --token-endpoint string Specify OAuth2 token endpoint for the type: "rest" or "mcp_server". --client-id string Specify OAuth2 client ID for the type: "rest" or "mcp_server". --client-secret string Specify OAuth2 client secret for the type: "rest" or "mcp_server". --scope string Specify OAuth2 scope for the type: "rest" or "mcp_server". --sse-endpoint string Specify SSE endpoint for the type: "mcp_server". --transport-type string Specify transport type for the type: "mcp_server". Default: SSE. --environment string Environment ID. -o, --output string Specify the output format as "human", "json", or "yaml". 
(default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Create Flink connection “my-connection” in AWS us-west-2 for OpenAI with endpoint and API key. confluent flink connection create my-connection --cloud aws --region us-west-2 --type openai --endpoint https://api.openai.com/v1/chat/completions --api-key 0000000000000000 See Also confluent flink connection - Manage Flink connections.

#### Code Examples

```sql
confluent flink connection create <name> [flags]
```

```sql
    --cloud string               REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp".
    --region string              REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all).
    --type string                REQUIRED: Specify the connection type as "openai", "azureml", "azureopenai", "bedrock", "sagemaker", "googleai", "vertexai", "mongodb", "elastic", "pinecone", "couchbase", "confluent_jdbc", "rest", or "mcp_server".
    --endpoint string            REQUIRED: Specify endpoint for the connection.
    --api-key string             Specify API key for the type: "openai", "azureml", "azureopenai", "googleai", "elastic", "pinecone", or "mcp_server".
    --aws-access-key string      Specify access key for the type: "bedrock" or "sagemaker".
    --aws-secret-key string      Specify secret key for the type: "bedrock" or "sagemaker".
    --aws-session-token string   Specify session token for the type: "bedrock" or "sagemaker".
    --service-key string         Specify service key for the type: "vertexai".
    --username string            Specify username for the type: "mongodb", "couchbase", "confluent_jdbc", or "rest".
    --password string            Specify password for the type: "mongodb", "couchbase", "confluent_jdbc", or "rest".
    --token string               Specify bearer token for the type: "rest" or "mcp_server".
    --token-endpoint string      Specify OAuth2 token endpoint for the type: "rest" or "mcp_server".
    --client-id string           Specify OAuth2 client ID for the type: "rest" or "mcp_server".
    --client-secret string       Specify OAuth2 client secret for the type: "rest" or "mcp_server".
    --scope string               Specify OAuth2 scope for the type: "rest" or "mcp_server".
    --sse-endpoint string        Specify SSE endpoint for the type: "mcp_server".
    --transport-type string      Specify transport type for the type: "mcp_server". Default: SSE.
    --environment string         Environment ID.
-o, --output string              Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink connection create my-connection --cloud aws --region us-west-2 --type openai --endpoint https://api.openai.com/v1/chat/completions --api-key 0000000000000000
```
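
Which optional flags apply depends on the value of `--type`. As a rough aid, the mapping in the flag table above can be written as a lookup; this is a sketch derived only from that table, and the function name `type_specific_flags` is hypothetical.

```shell
#!/bin/sh
# Print the type-specific flags that the flag table lists for a connection type.
type_specific_flags() {
  case "$1" in
    openai|azureml|azureopenai|googleai|elastic|pinecone)
      echo "--api-key" ;;
    bedrock|sagemaker)
      echo "--aws-access-key --aws-secret-key --aws-session-token" ;;
    vertexai)
      echo "--service-key" ;;
    mongodb|couchbase|confluent_jdbc)
      echo "--username --password" ;;
    rest)
      echo "--username --password --token --token-endpoint --client-id --client-secret --scope" ;;
    mcp_server)
      echo "--api-key --token --token-endpoint --client-id --client-secret --scope --sse-endpoint --transport-type" ;;
    *)
      echo "unknown connection type: $1" >&2
      return 1 ;;
  esac
}

type_specific_flags bedrock
```

The lookup says nothing about which of the listed flags are mandatory for a given type; the table marks only `--cloud`, `--region`, `--type`, and `--endpoint` as required.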

---

### confluent flink connection delete | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/connection/confluent_flink_connection_delete.html

confluent flink connection delete Description Delete one or more Flink connections. confluent flink connection delete <name-1> [name-2] ... [name-n] [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --region string REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all). --force Skip the deletion confirmation prompt. --environment string Environment ID. --context string CLI context name. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink connection - Manage Flink connections.

#### Code Examples

```sql
confluent flink connection delete <name-1> [name-2] ... [name-n] [flags]
```

```sql
--cloud string         REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp".
--region string        REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all).
--force                Skip the deletion confirmation prompt.
--environment string   Environment ID.
--context string       CLI context name.
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
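
Because the command accepts several names, a batch delete can be assembled from a list. The sketch below prints the command instead of running it; the connection names, the `aws`/`us-west-2` values, and the function name `delete_connections_cmd` are illustrative assumptions.

```shell
#!/bin/sh
# Print a delete command covering every connection name passed as an argument.
# --force skips the confirmation prompt, so review the output before running it.
delete_connections_cmd() {
  if [ "$#" -lt 1 ]; then
    echo "at least one connection name is required" >&2
    return 1
  fi
  printf 'confluent flink connection delete %s --cloud aws --region us-west-2 --force\n' "$*"
}

delete_connections_cmd my-connection my-other-connection
```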

---

### confluent flink connection describe | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/connection/confluent_flink_connection_describe.html

confluent flink connection describe Description Describe a Flink connection. confluent flink connection describe <name> [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --region string REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all). --environment string Environment ID. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink connection - Manage Flink connections.

#### Code Examples

```sql
confluent flink connection describe <name> [flags]
```

```sql
    --cloud string         REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp".
    --region string        REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all).
    --environment string   Environment ID.
-o, --output string        Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
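
The `--output` flag accepts only `human`, `json`, or `yaml`. A small wrapper can reject other values before the CLI is called; this is a sketch under the assumption that the connection lives in AWS `us-west-2`, and `describe_connection_cmd` is a hypothetical name.

```shell
#!/bin/sh
# Print a describe command, checking the output format against the documented values.
describe_connection_cmd() {
  name="$1"
  format="${2:-human}"   # "human" is the documented default
  case "$format" in
    human|json|yaml) ;;
    *)
      echo "unsupported output format: $format" >&2
      return 1 ;;
  esac
  printf 'confluent flink connection describe %s --cloud aws --region us-west-2 --output %s\n' \
    "$name" "$format"
}

describe_connection_cmd my-connection json
```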

---

### confluent flink connection list | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/connection/confluent_flink_connection_list.html

confluent flink connection list Description List Flink connections. confluent flink connection list [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --region string REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all). --environment string Environment ID. --type string Specify the connection type as "openai", "azureml", "azureopenai", "bedrock", "sagemaker", "googleai", "vertexai", "mongodb", "elastic", "pinecone", "couchbase", "confluent_jdbc", "rest", or "mcp_server". -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink connection - Manage Flink connections.

#### Code Examples

```sql
confluent flink connection list [flags]
```

```sql
    --cloud string         REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp".
    --region string        REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all).
    --environment string   Environment ID.
    --type string          Specify the connection type as "openai", "azureml", "azureopenai", "bedrock", "sagemaker", "googleai", "vertexai", "mongodb", "elastic", "pinecone", "couchbase", "confluent_jdbc", "rest", or "mcp_server".
-o, --output string        Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
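
The optional `--type` filter takes one connection type per invocation, so listing several types means one command each. A sketch that prints those commands, with the cloud, region, and function name `list_by_type_cmds` chosen only for illustration:

```shell
#!/bin/sh
# Print one list command per connection type given as an argument.
list_by_type_cmds() {
  for type in "$@"; do
    printf 'confluent flink connection list --cloud aws --region us-west-2 --type %s --output json\n' "$type"
  done
}

list_by_type_cmds openai bedrock
```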

---

### confluent flink connection update | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/connection/confluent_flink_connection_update.html

confluent flink connection update Description Update a Flink connection. Only secret can be updated. confluent flink connection update <name> [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --region string REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all). --api-key string Specify API key for the type: "openai", "azureml", "azureopenai", "googleai", "elastic", "pinecone", or "mcp_server". --aws-access-key string Specify access key for the type: "bedrock" or "sagemaker". --aws-secret-key string Specify secret key for the type: "bedrock" or "sagemaker". --aws-session-token string Specify session token for the type: "bedrock" or "sagemaker". --service-key string Specify service key for the type: "vertexai". --username string Specify username for the type: "mongodb", "couchbase", "confluent_jdbc", or "rest". --password string Specify password for the type: "mongodb", "couchbase", "confluent_jdbc", or "rest". --token string Specify bearer token for the type: "rest" or "mcp_server". --token-endpoint string Specify OAuth2 token endpoint for the type: "rest" or "mcp_server". --client-id string Specify OAuth2 client ID for the type: "rest" or "mcp_server". --client-secret string Specify OAuth2 client secret for the type: "rest" or "mcp_server". --scope string Specify OAuth2 scope for the type: "rest" or "mcp_server". --sse-endpoint string Specify SSE endpoint for the type: "mcp_server". --transport-type string Specify transport type for the type: "mcp_server". Default: SSE. --environment string Environment ID. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). 
Examples Update API key of Flink connection “my-connection”. confluent flink connection update my-connection --cloud aws --region us-west-2 --api-key new-key See Also confluent flink connection - Manage Flink connections.

#### Code Examples

```sql
confluent flink connection update <name> [flags]
```

```sql
    --cloud string               REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp".
    --region string              REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all).
    --api-key string             Specify API key for the type: "openai", "azureml", "azureopenai", "googleai", "elastic", "pinecone", or "mcp_server".
    --aws-access-key string      Specify access key for the type: "bedrock" or "sagemaker".
    --aws-secret-key string      Specify secret key for the type: "bedrock" or "sagemaker".
    --aws-session-token string   Specify session token for the type: "bedrock" or "sagemaker".
    --service-key string         Specify service key for the type: "vertexai".
    --username string            Specify username for the type: "mongodb", "couchbase", "confluent_jdbc", or "rest".
    --password string            Specify password for the type: "mongodb", "couchbase", "confluent_jdbc", or "rest".
    --token string               Specify bearer token for the type: "rest" or "mcp_server".
    --token-endpoint string      Specify OAuth2 token endpoint for the type: "rest" or "mcp_server".
    --client-id string           Specify OAuth2 client ID for the type: "rest" or "mcp_server".
    --client-secret string       Specify OAuth2 client secret for the type: "rest" or "mcp_server".
    --scope string               Specify OAuth2 scope for the type: "rest" or "mcp_server".
    --sse-endpoint string        Specify SSE endpoint for the type: "mcp_server".
    --transport-type string      Specify transport type for the type: "mcp_server". Default: SSE.
    --environment string         Environment ID.
-o, --output string              Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink connection update my-connection --cloud aws --region us-west-2 --api-key new-key
```
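
When rotating a key, it may be preferable to pass it through a variable rather than typing the literal value. The sketch below assumes a variable named `NEW_API_KEY` (a hypothetical name) and prints the command with the variable reference left unexpanded, so the secret is never echoed.

```shell
#!/bin/sh
# Print an update command that reads the new key from NEW_API_KEY.
# Single quotes keep the variable unexpanded in the printed text.
update_key_cmd() {
  if [ -z "${NEW_API_KEY:-}" ]; then
    echo "NEW_API_KEY is not set" >&2
    return 1
  fi
  printf '%s\n' 'confluent flink connection update my-connection --cloud aws --region us-west-2 --api-key "$NEW_API_KEY"'
}

NEW_API_KEY="example-key"   # placeholder value, not a real key
update_key_cmd
```

The function refuses to print anything while the variable is empty, which guards against sending an empty key by accident.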

---

### confluent flink connection | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/connection/index.html

confluent flink connection Description Manage Flink connections. Subcommands Command Description confluent flink connection create Create a Flink connection. confluent flink connection delete Delete one or more Flink connections. confluent flink connection describe Describe a Flink connection. confluent flink connection list List Flink connections. confluent flink connection update Update a Flink connection. Only secret can be updated.

---

### confluent flink connectivity-type use | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/connectivity-type/confluent_flink_connectivity-type_use.html

confluent flink connectivity-type use Description Select a Flink connectivity type for the current environment as “public” or “private”. If unspecified, the CLI will default to public connectivity type. confluent flink connectivity-type use <region-access> [flags] Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink connectivity-type - Manage Flink connectivity type.

#### Code Examples

```sql
confluent flink connectivity-type use <region-access> [flags]
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
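
The `<region-access>` argument is described as either “public” or “private”, with public as the default. A sketch that applies the same rule before printing the command; the function name `connectivity_cmd` is hypothetical.

```shell
#!/bin/sh
# Print the connectivity-type command, accepting only the two documented values.
connectivity_cmd() {
  access="${1:-public}"   # mirrors the documented default
  case "$access" in
    public|private)
      printf 'confluent flink connectivity-type use %s\n' "$access" ;;
    *)
      echo "connectivity type must be public or private: $access" >&2
      return 1 ;;
  esac
}

connectivity_cmd private
```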

---

### confluent flink connectivity-type | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/connectivity-type/index.html

confluent flink connectivity-type Description Manage Flink connectivity type. Subcommands Command Description confluent flink connectivity-type use Select a Flink connectivity type.

---

### confluent flink endpoint list | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/endpoint/confluent_flink_endpoint_list.html

confluent flink endpoint list Description List Flink endpoint. confluent flink endpoint list [flags] Flags --context string CLI context name. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples List the available Flink endpoints with current cloud provider and region. confluent flink endpoint list See Also confluent flink endpoint - Manage Flink endpoint.

#### Code Examples

```sql
confluent flink endpoint list [flags]
```

```sql
    --context string   CLI context name.
-o, --output string    Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink endpoint list
```

---

### confluent flink endpoint unset | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/endpoint/confluent_flink_endpoint_unset.html

confluent flink endpoint unset Description Unset the current Flink endpoint that was previously set with the use command. confluent flink endpoint unset [flags] Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Unset the current Flink endpoint “https://flink.us-east-1.aws.confluent.cloud”. confluent flink endpoint unset See Also confluent flink endpoint - Manage Flink endpoint.

#### Code Examples

```sql
confluent flink endpoint unset [flags]
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink endpoint unset
```

---

### confluent flink endpoint use | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/endpoint/confluent_flink_endpoint_use.html

confluent flink endpoint use Description Use a Flink endpoint as active endpoint for all subsequent Flink dataplane commands in current environment, such as flink connection, flink statement and flink shell. confluent flink endpoint use [flags] Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Use “https://flink.us-east-1.aws.confluent.cloud” for subsequent Flink dataplane commands. confluent flink endpoint use "https://flink.us-east-1.aws.confluent.cloud" See Also confluent flink endpoint - Manage Flink endpoint.

#### Code Examples

```sql
confluent flink endpoint use [flags]
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink endpoint use "https://flink.us-east-1.aws.confluent.cloud"
```
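
The three endpoint subcommands are typically used together: list what is available, select one, and later clear the selection. A sketch of that sequence, printed rather than executed; the function name `endpoint_workflow` is hypothetical and the URL is the one from the example above.

```shell
#!/bin/sh
# Print the list / use / unset sequence for a given endpoint URL.
endpoint_workflow() {
  endpoint="$1"
  printf 'confluent flink endpoint list\n'
  printf 'confluent flink endpoint use "%s"\n' "$endpoint"
  printf 'confluent flink endpoint unset\n'
}

endpoint_workflow "https://flink.us-east-1.aws.confluent.cloud"
```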

---

### confluent flink endpoint | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/endpoint/index.html

confluent flink endpoint Description Manage Flink endpoint. Subcommands Command Description confluent flink endpoint list List Flink endpoint. confluent flink endpoint unset Unset the current Flink endpoint. confluent flink endpoint use Use a Flink endpoint.

---

### confluent flink environment create | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/environment/confluent_flink_environment_create.html

confluent flink environment create Description Create a Flink environment. confluent flink environment create <name> [flags] Flags --kubernetes-namespace string REQUIRED: Kubernetes namespace to deploy Flink applications to. --defaults string JSON string defining the environment's Flink application defaults, or path to a file to read defaults from (with .yml, .yaml or .json extension). --statement-defaults string JSON string defining the environment's Flink statement defaults, or path to a file to read defaults from (with .yml, .yaml or .json extension). --compute-pool-defaults string JSON string defining the environment's Flink compute pool defaults, or path to a file to read defaults from (with .yml, .yaml or .json extension). --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink environment - Manage Flink environments.

#### Code Examples

```sql
confluent flink environment create <name> [flags]
```

```sql
    --kubernetes-namespace string         REQUIRED: Kubernetes namespace to deploy Flink applications to.
    --defaults string                     JSON string defining the environment's Flink application defaults, or path to a file to read defaults from (with .yml, .yaml or .json extension).
    --statement-defaults string           JSON string defining the environment's Flink statement defaults, or path to a file to read defaults from (with .yml, .yaml or .json extension).
    --compute-pool-defaults string        JSON string defining the environment's Flink compute pool defaults, or path to a file to read defaults from (with .yml, .yaml or .json extension).
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
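
Since `CONFLUENT_CMF_URL` may stand in for `--url`, the create command can stay short once that variable is exported. A minimal sketch, assuming a CMF instance at the hypothetical address `http://localhost:8080` and placeholder environment and namespace names; `env_create_cmd` is also a hypothetical name.

```shell
#!/bin/sh
# Hypothetical CMF address; the variable takes the place of the --url flag.
CONFLUENT_CMF_URL="http://localhost:8080"
export CONFLUENT_CMF_URL

# Print a create command for the given environment name and Kubernetes namespace.
env_create_cmd() {
  printf 'confluent flink environment create %s --kubernetes-namespace %s --output yaml\n' "$1" "$2"
}

env_create_cmd my-environment my-namespace
```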

---

### confluent flink environment delete | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/environment/confluent_flink_environment_delete.html

confluent flink environment delete Description Delete one or more Flink environments. confluent flink environment delete <name-1> [name-2] ... [name-n] [flags] Flags --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. --force Skip the deletion confirmation prompt. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink environment - Manage Flink environments.

#### Code Examples

```sql
confluent flink environment delete <name-1> [name-2] ... [name-n] [flags]
```

```sql
--url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
--client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
--client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
--certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
--force                               Skip the deletion confirmation prompt.
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
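
Each connection flag above has an environment-variable equivalent. Before running an on-premises command without flags, it can help to see which of those variables are unset; the sketch below does only that, and the function name `missing_cmf_vars` is hypothetical. An unset variable is not necessarily an error, since the mTLS-related settings apply only when mTLS is used.

```shell
#!/bin/sh
# Print the name of each documented CMF variable that is unset or empty.
missing_cmf_vars() {
  for var in CONFLUENT_CMF_URL \
             CONFLUENT_CMF_CLIENT_KEY_PATH \
             CONFLUENT_CMF_CLIENT_CERT_PATH \
             CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH; do
    # Indirect lookup of the variable named in $var (names are fixed above).
    eval "value=\${$var:-}"
    [ -n "$value" ] || printf '%s\n' "$var"
  done
}

missing_cmf_vars
```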

---

### confluent flink environment describe | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/environment/confluent_flink_environment_describe.html

confluent flink environment describe Description Describe a Flink environment. confluent flink environment describe <name> [flags] Flags --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink environment - Manage Flink environments.

#### Code Examples

```sql
confluent flink environment describe <name> [flags]
```

```sql
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
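
Each of the four connection flags above names an environment variable that may be set in its place, so a session can set them once and omit the flags afterwards. The following is a minimal sketch, assuming a POSIX shell; the URL, certificate paths, and environment name are hypothetical placeholders, and the command is only printed rather than executed.

```shell
# Hypothetical endpoint and certificate paths; replace with your own.
export CONFLUENT_CMF_URL="https://cmf.example.com"
export CONFLUENT_CMF_CLIENT_KEY_PATH="./certs/client.key"
export CONFLUENT_CMF_CLIENT_CERT_PATH="./certs/client.pem"
export CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH="./certs/ca.pem"

ENV_NAME="my-env"  # hypothetical environment name

# Dry run: print the command. Remove the leading `echo` to execute it.
echo confluent flink environment describe "$ENV_NAME" --output json
```

With the variables exported, the same session can run other `confluent flink environment` subcommands without repeating `--url` or the mTLS flags.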

---

### confluent flink environment list | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/environment/confluent_flink_environment_list.html

confluent flink environment list Description List Flink environments. confluent flink environment list [flags] Flags --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink environment - Manage Flink environments.

#### Code Examples

```sql
confluent flink environment list [flags]
```

```sql
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

---

### confluent flink environment update | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/environment/confluent_flink_environment_update.html

confluent flink environment update Description Update a Flink environment. confluent flink environment update <name> [flags] Flags --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. --defaults string JSON string defining the environment's Flink application defaults, or path to a file to read defaults from (with .yml, .yaml or .json extension). --statement-defaults string JSON string defining the environment's Flink statement defaults, or path to a file to read defaults from (with .yml, .yaml or .json extension). --compute-pool-defaults string JSON string defining the environment's Flink compute pool defaults, or path to a file to read defaults from (with .yml, .yaml or .json extension). -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink environment - Manage Flink environments.

#### Code Examples

```sql
confluent flink environment update <name> [flags]
```

```sql
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
    --defaults string                     JSON string defining the environment's Flink application defaults, or path to a file to read defaults from (with .yml, .yaml or .json extension).
    --statement-defaults string           JSON string defining the environment's Flink statement defaults, or path to a file to read defaults from (with .yml, .yaml or .json extension).
    --compute-pool-defaults string        JSON string defining the environment's Flink compute pool defaults, or path to a file to read defaults from (with .yml, .yaml or .json extension).
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
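
The three `--*defaults` flags accept either a JSON string or a path to a file with a `.yml`, `.yaml`, or `.json` extension. When a script passes a file path, it can check the extension before calling the CLI. The following is a minimal sketch, assuming a POSIX shell; the file name and environment name are hypothetical, the contents of the defaults file are not shown because their schema is not described here, and the command is only printed.

```shell
DEFAULTS_FILE="env-defaults.yaml"  # hypothetical file name

# Accept only the extensions listed in the flag description.
case "$DEFAULTS_FILE" in
  *.yml|*.yaml|*.json) DEFAULTS_OK=yes ;;
  *)                   DEFAULTS_OK=no ;;
esac

if [ "$DEFAULTS_OK" = yes ]; then
  # Dry run: print the command. Remove the leading `echo` to execute it.
  echo confluent flink environment update my-env --defaults "$DEFAULTS_FILE"
else
  echo "unsupported extension: $DEFAULTS_FILE" >&2
fi
```

The check applies only to file paths; an inline JSON string would be passed to the flag directly.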

---

### confluent flink environment | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/environment/index.html

confluent flink environment Aliases environment, env Description Manage Flink environments. Subcommands Command Description confluent flink environment create Create a Flink environment. confluent flink environment delete Delete one or more Flink environments. confluent flink environment describe Describe a Flink environment. confluent flink environment list List Flink environments. confluent flink environment update Update a Flink environment.

#### Code Examples

```sql
environment, env
```

---

### confluent flink | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/index.html

confluent flink Description Manage Apache Flink. Subcommands Cloud Command Description confluent flink artifact Manage Flink UDF artifacts. confluent flink compute-pool Manage Flink compute pools. confluent flink connection Manage Flink connections. confluent flink connectivity-type Manage Flink connectivity type. confluent flink endpoint Manage Flink endpoint. confluent flink region Manage Flink regions. confluent flink shell Start Flink interactive SQL client. confluent flink statement Manage Flink SQL statements. On-Premises Command Description confluent flink application Manage Flink applications. confluent flink catalog Manage Flink catalogs in Confluent Platform. confluent flink compute-pool Manage Flink compute pools. confluent flink environment Manage Flink environments. confluent flink statement Manage Flink SQL statements.

---

### confluent flink region list | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/region/confluent_flink_region_list.html

confluent flink region list Description List Flink regions. confluent flink region list [flags] Flags --cloud string Specify the cloud provider as "aws", "azure", or "gcp". --context string CLI context name. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples List the available Flink AWS regions. confluent flink region list --cloud aws See Also confluent flink region - Manage Flink regions.

#### Code Examples

```sql
confluent flink region list [flags]
```

```sql
    --cloud string     Specify the cloud provider as "aws", "azure", or "gcp".
    --context string   CLI context name.
-o, --output string    Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink region list --cloud aws
```

---

### confluent flink region unset | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/region/confluent_flink_region_unset.html

confluent flink region unset Description Unset the current Flink cloud and region that was set with the use command. confluent flink region unset [flags] Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Unset the current Flink region us-west-1 with cloud provider = AWS. confluent flink region unset See Also confluent flink region - Manage Flink regions.

#### Code Examples

```sql
confluent flink region unset [flags]
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink region unset
```

---

### confluent flink region use | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/region/confluent_flink_region_use.html

confluent flink region use Description Choose a Flink region to be used in subsequent commands which support passing a region with the --region flag. confluent flink region use [flags] Flags --cloud string REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp". --region string REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all). Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Select region “N. Virginia (us-east-1)” for use in subsequent Flink commands. confluent flink region use --cloud aws --region us-east-1 See Also confluent flink region - Manage Flink regions.

#### Code Examples

```sql
confluent flink region use [flags]
```

```sql
--cloud string    REQUIRED: Specify the cloud provider as "aws", "azure", or "gcp".
--region string   REQUIRED: Cloud region for Flink (use "confluent flink region list" to see all).
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink region use --cloud aws --region us-east-1
```
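
The three `confluent flink region` subcommands are typically used together: list the regions for a cloud provider, select one, and later clear the selection. The following is a minimal sketch of that sequence, assuming a POSIX shell and reusing the `aws` and `us-east-1` values from the example above; the commands are only printed rather than executed.

```shell
CLOUD="aws"
REGION="us-east-1"

# Dry run: print each command. Remove the leading `echo` to execute it.
echo confluent flink region list --cloud "$CLOUD"                    # see which regions exist
echo confluent flink region use --cloud "$CLOUD" --region "$REGION"  # select one for later commands
echo confluent flink region unset                                    # clear the selection again
```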

---

### confluent flink region | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/region/index.html

confluent flink region Description Manage Flink regions. Subcommands Command Description confluent flink region list List Flink regions. confluent flink region unset Unset the current Flink cloud and region. confluent flink region use Use a Flink region in subsequent commands.

---

### confluent flink statement create | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_create.html

confluent flink statement create Description Cloud Create a Flink SQL statement. confluent flink statement create [name] [flags] On-Premises Create a Flink SQL statement in Confluent Platform. confluent flink statement create [name] [flags] Flags Cloud --sql string REQUIRED: The Flink SQL statement. --compute-pool string Flink compute pool ID. --service-account string Service account ID. --database string The database which will be used as the default database. When using Kafka, this is the cluster ID. --wait Block until the statement is running or has failed. --property strings A mechanism to pass properties in the form key=value when creating a Flink statement. --environment string Environment ID. --context string CLI context name. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") On-Premises --sql string REQUIRED: The Flink SQL statement. --environment string REQUIRED: Name of the Flink environment. --compute-pool string REQUIRED: The compute pool name to execute the Flink SQL statement. --parallelism uint16 The parallelism the statement, default value is 1. (default 1) --catalog string The name of the default catalog. --database string The name of the default database. --flink-configuration string The file path to hold the Flink configuration for the statement. --wait Boolean flag to block until the statement is running or has failed. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. 
--certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Cloud Create a Flink SQL statement in the current compute pool. confluent flink statement create --sql "SELECT * FROM table;" Create a Flink SQL statement named “my-statement” in compute pool “lfcp-123456” with service account “sa-123456”, using Kafka cluster “my-cluster” as the default database, and with additional properties. confluent flink statement create my-statement --sql "SELECT * FROM my-topic;" --compute-pool lfcp-123456 --service-account sa-123456 --database my-cluster --property property1=value1,property2=value2 On-Premises No examples. See Also confluent flink statement - Manage Flink SQL statements.

#### Code Examples

```sql
confluent flink statement create [name] [flags]
```

```sql
    --sql string               REQUIRED: The Flink SQL statement.
    --compute-pool string      Flink compute pool ID.
    --service-account string   Service account ID.
    --database string          The database which will be used as the default database. When using Kafka, this is the cluster ID.
    --wait                     Block until the statement is running or has failed.
    --property strings         A mechanism to pass properties in the form key=value when creating a Flink statement.
    --environment string       Environment ID.
    --context string           CLI context name.
-o, --output string            Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
    --sql string                          REQUIRED: The Flink SQL statement.
    --environment string                  REQUIRED: Name of the Flink environment.
    --compute-pool string                 REQUIRED: The compute pool name to execute the Flink SQL statement.
    --parallelism uint16                  The parallelism of the statement, default value is 1. (default 1)
    --catalog string                      The name of the default catalog.
    --database string                     The name of the default database.
    --flink-configuration string          The file path to hold the Flink configuration for the statement.
    --wait                                Boolean flag to block until the statement is running or has failed.
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink statement create --sql "SELECT * FROM table;"
```

```sql
confluent flink statement create my-statement --sql "SELECT * FROM my-topic;" --compute-pool lfcp-123456 --service-account sa-123456 --database my-cluster --property property1=value1,property2=value2
```
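
The examples above pass the SQL in double quotes. Flink SQL quotes identifiers with backticks, which a name containing a hyphen (such as a topic called `my-topic`) generally needs, and inside double quotes a POSIX shell would treat those backticks as command substitution. The following is a minimal sketch of one way around that, assuming a POSIX shell: keep the SQL in a single-quoted variable. The statement and topic names are placeholders, and the command is only printed rather than executed.

```shell
# Single quotes stop the shell from interpreting the backticks.
SQL='SELECT * FROM `my-topic`;'
STATEMENT_NAME="my-statement"

# Dry run: print the command with the SQL argument shown in single quotes.
# To execute it, run:
#   confluent flink statement create "$STATEMENT_NAME" --sql "$SQL" --wait
printf "confluent flink statement create %s --sql '%s' --wait\n" "$STATEMENT_NAME" "$SQL"
```

Expanding `"$SQL"` in double quotes is safe at that point because the shell does not re-scan the contents of a variable for command substitution.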

---

### confluent flink statement delete | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_delete.html

confluent flink statement delete Description Cloud Delete one or more Flink SQL statements. confluent flink statement delete <name-1> [name-2] ... [name-n] [flags] On-Premises Delete one or more Flink SQL statements in Confluent Platform. confluent flink statement delete <name-1> [name-2] ... [name-n] [flags] Flags Cloud --cloud string Specify the cloud provider as "aws", "azure", or "gcp". --region string Cloud region for Flink (use "confluent flink region list" to see all). --force Skip the deletion confirmation prompt. --environment string Environment ID. --context string CLI context name. On-Premises --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. --force Skip the deletion confirmation prompt. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink statement - Manage Flink SQL statements.

#### Code Examples

```sql
confluent flink statement delete <name-1> [name-2] ... [name-n] [flags]
```

```sql
--cloud string         Specify the cloud provider as "aws", "azure", or "gcp".
--region string        Cloud region for Flink (use "confluent flink region list" to see all).
--force                Skip the deletion confirmation prompt.
--environment string   Environment ID.
--context string       CLI context name.
```

```sql
--environment string                  REQUIRED: Name of the Flink environment.
--url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
--client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
--client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
--certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
--force                               Skip the deletion confirmation prompt.
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
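
Because the command accepts several statement names and `--force` skips the confirmation prompt, it can be used in a non-interactive cleanup script. The following is a minimal sketch for the Cloud form, assuming a POSIX shell; the statement names are hypothetical and the command is only printed rather than executed.

```shell
# Hypothetical statement names, separated by spaces.
STATEMENTS="stmt-a stmt-b stmt-c"

# $STATEMENTS is left unquoted on purpose so each name becomes its own argument.
# Dry run: print the command. Remove the leading `echo` to execute it.
echo confluent flink statement delete $STATEMENTS --force
```

For the On-Premises form, the required `--environment` flag would have to be added as well.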

---

### confluent flink statement describe | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_describe.html

confluent flink statement describe Description Cloud Describe a Flink SQL statement. confluent flink statement describe <name> [flags] On-Premises Describe a Flink SQL statement in Confluent Platform. confluent flink statement describe [name] [flags] Flags Cloud --cloud string Specify the cloud provider as "aws", "azure", or "gcp". --region string Cloud region for Flink (use "confluent flink region list" to see all). --environment string Environment ID. --context string CLI context name. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") On-Premises --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink statement - Manage Flink SQL statements.

#### Code Examples

```sql
confluent flink statement describe <name> [flags]
```

```sql
confluent flink statement describe [name] [flags]
```

```sql
    --cloud string         Specify the cloud provider as "aws", "azure", or "gcp".
    --region string        Cloud region for Flink (use "confluent flink region list" to see all).
    --environment string   Environment ID.
    --context string       CLI context name.
-o, --output string        Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
    --environment string                  REQUIRED: Name of the Flink environment.
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

---

### confluent flink statement list | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_list.html

confluent flink statement list Description Cloud List Flink SQL statements. confluent flink statement list [flags] On-Premises List Flink SQL statements in Confluent Platform. confluent flink statement list [flags] Flags Cloud --cloud string Specify the cloud provider as "aws", "azure", or "gcp". --region string Cloud region for Flink (use "confluent flink region list" to see all). --compute-pool string Flink compute pool ID. --environment string Environment ID. --context string CLI context name. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") --status string Filter the results by statement status. On-Premises --environment string REQUIRED: Name of the Flink environment. --compute-pool string Optional flag to filter the Flink statements by compute pool ID. --status string Optional flag to filter the Flink statements by statement status. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. 
-v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Cloud List running statements. confluent flink statement list --status running On-Premises No examples. See Also confluent flink statement - Manage Flink SQL statements.

#### Code Examples

```sql
confluent flink statement list [flags]
```

```sql
    --cloud string          Specify the cloud provider as "aws", "azure", or "gcp".
    --region string         Cloud region for Flink (use "confluent flink region list" to see all).
    --compute-pool string   Flink compute pool ID.
    --environment string    Environment ID.
    --context string        CLI context name.
-o, --output string         Specify the output format as "human", "json", or "yaml". (default "human")
    --status string         Filter the results by statement status.
```

```sql
    --environment string                  REQUIRED: Name of the Flink environment.
    --compute-pool string                 Optional flag to filter the Flink statements by compute pool ID.
    --status string                       Optional flag to filter the Flink statements by statement status.
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink statement list --status running
```
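When the listing is driven from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch in Python, assuming only the Cloud flags documented above; the helper name `build_statement_list_cmd` and the environment ID used in the usage note are hypothetical, and the function only builds the argument list without running the CLI.

```python
from typing import List, Optional

# Output formats documented for -o/--output.
VALID_OUTPUTS = ("human", "json", "yaml")


def build_statement_list_cmd(
    status: Optional[str] = None,
    compute_pool: Optional[str] = None,
    environment: Optional[str] = None,
    output: str = "human",
) -> List[str]:
    """Build the argv for `confluent flink statement list` (hypothetical helper)."""
    if output not in VALID_OUTPUTS:
        raise ValueError(f"output must be one of {VALID_OUTPUTS}, got {output!r}")
    cmd = ["confluent", "flink", "statement", "list"]
    if status:
        cmd += ["--status", status]
    if compute_pool:
        cmd += ["--compute-pool", compute_pool]
    if environment:
        cmd += ["--environment", environment]
    if output != "human":  # "human" is the documented default, so it is omitted
        cmd += ["--output", output]
    return cmd
```

For example, `build_statement_list_cmd(status="running", output="json")` produces an argument list that could be passed to `subprocess.run`. The structure of the JSON output is not described here, so the sketch makes no attempt to parse it.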

---

### confluent flink statement rescale | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_rescale.html

confluent flink statement rescale Description Rescale a Flink SQL statement in Confluent Platform. confluent flink statement rescale <statement-name> [flags] Flags --environment string REQUIRED: Name of the Flink environment. --parallelism int32 REQUIRED: New parallelism of the Flink SQL statement. (default 1) --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink statement - Manage Flink SQL statements.

#### Code Examples

```sql
confluent flink statement rescale <statement-name> [flags]
```

```sql
--environment string                  REQUIRED: Name of the Flink environment.
--parallelism int32                   REQUIRED: New parallelism of the Flink SQL statement. (default 1)
--url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
--client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
--client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
--certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
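Because `--parallelism` is typed as `int32`, a wrapper script may want to validate the value before calling the CLI. The following Python sketch illustrates one way to do that. The helper name `build_rescale_cmd` is hypothetical, and the lower bound of 1 is an assumption (the reference above only states the type and the default), so adjust it if your deployment behaves differently.

```python
from typing import List

INT32_MAX = 2**31 - 1  # upper bound of a signed 32-bit integer


def build_rescale_cmd(statement_name: str, environment: str, parallelism: int) -> List[str]:
    """Build the argv for `confluent flink statement rescale` (hypothetical helper)."""
    if not statement_name:
        raise ValueError("statement_name is required")
    if not environment:
        raise ValueError("--environment is documented as REQUIRED")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(parallelism, bool) or not isinstance(parallelism, int):
        raise TypeError("parallelism must be an integer")
    # Assumption: a parallelism below 1 is not meaningful.
    if not 1 <= parallelism <= INT32_MAX:
        raise ValueError(f"parallelism must be between 1 and {INT32_MAX}")
    return [
        "confluent", "flink", "statement", "rescale", statement_name,
        "--environment", environment,
        "--parallelism", str(parallelism),
    ]
```

The statement and environment names passed in are whatever exists in your Confluent Platform installation; the sketch does not check that they exist.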

---

### confluent flink statement resume | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_resume.html

confluent flink statement resume Description Cloud Resume a Flink SQL statement. confluent flink statement resume <name> [flags] On-Premises Resume a Flink SQL statement in Confluent Platform. confluent flink statement resume <statement-name> [flags] Flags Cloud --principal string A user or service account the statement runs as. --compute-pool string Flink compute pool ID. --cloud string Specify the cloud provider as "aws", "azure", or "gcp". --region string Cloud region for Flink (use "confluent flink region list" to see all). --environment string Environment ID. --context string CLI context name. On-Premises --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Cloud Request to resume the currently stopped statement “my-statement” using original principal id and under the original compute pool. 
confluent flink statement resume my-statement Request to resume the currently stopped statement “my-statement” using service account “sa-123456”. confluent flink statement resume my-statement --principal sa-123456 Request to resume the currently stopped statement “my-statement” using user account “u-987654”. confluent flink statement resume my-statement --principal u-987654 Request to resume the currently stopped statement “my-statement” and under a different compute pool “lfcp-123456”. confluent flink statement resume my-statement --compute-pool lfcp-123456 Request to resume the currently stopped statement “my-statement” using service account “sa-123456” and under a different compute pool “lfcp-123456”. confluent flink statement resume my-statement --principal sa-123456 --compute-pool lfcp-123456 On-Premises No examples. See Also confluent flink statement - Manage Flink SQL statements.

#### Code Examples

```sql
confluent flink statement resume <name> [flags]
```

```sql
confluent flink statement resume <statement-name> [flags]
```

```sql
--principal string      A user or service account the statement runs as.
--compute-pool string   Flink compute pool ID.
--cloud string          Specify the cloud provider as "aws", "azure", or "gcp".
--region string         Cloud region for Flink (use "confluent flink region list" to see all).
--environment string    Environment ID.
--context string        CLI context name.
```

```sql
--environment string                  REQUIRED: Name of the Flink environment.
--url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
--client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
--client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
--certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink statement resume my-statement
```

```sql
confluent flink statement resume my-statement --principal sa-123456
```

```sql
confluent flink statement resume my-statement --principal u-987654
```

```sql
confluent flink statement resume my-statement --compute-pool lfcp-123456
```

```sql
confluent flink statement resume my-statement --principal sa-123456 --compute-pool lfcp-123456
```
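The five Cloud examples above differ only in which optional flags are present. A small Python sketch can make that pattern explicit. The helper names are hypothetical, and the `sa-` / `u-` prefix check is an assumption inferred from the example IDs above rather than a documented rule.

```python
from typing import List, Optional


def principal_kind(principal: str) -> str:
    """Classify a principal ID by prefix (assumption based on the examples above)."""
    if principal.startswith("sa-"):
        return "service account"
    if principal.startswith("u-"):
        return "user"
    raise ValueError(f"unrecognized principal ID: {principal!r}")


def build_resume_cmd(
    name: str,
    principal: Optional[str] = None,
    compute_pool: Optional[str] = None,
) -> List[str]:
    """Build the argv for `confluent flink statement resume` (Cloud flags only)."""
    cmd = ["confluent", "flink", "statement", "resume", name]
    if principal:
        # Omitting --principal resumes under the original principal.
        cmd += ["--principal", principal]
    if compute_pool:
        # Omitting --compute-pool resumes under the original compute pool.
        cmd += ["--compute-pool", compute_pool]
    return cmd
```

Calling `build_resume_cmd("my-statement")` reproduces the first example, and passing both optional arguments reproduces the last one.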

---

### confluent flink statement stop | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_stop.html

confluent flink statement stop Description Cloud Stop a Flink SQL statement. confluent flink statement stop <name> [flags] On-Premises Stop a Flink SQL statement in Confluent Platform. confluent flink statement stop <statement-name> [flags] Flags Cloud --cloud string Specify the cloud provider as "aws", "azure", or "gcp". --region string Cloud region for Flink (use "confluent flink region list" to see all). --environment string Environment ID. --context string CLI context name. On-Premises --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Cloud Request to stop the currently running statement “my-statement”. confluent flink statement stop my-statement On-Premises No examples. See Also confluent flink statement - Manage Flink SQL statements.

#### Code Examples

```sql
confluent flink statement stop <name> [flags]
```

```sql
confluent flink statement stop <statement-name> [flags]
```

```sql
--cloud string         Specify the cloud provider as "aws", "azure", or "gcp".
--region string        Cloud region for Flink (use "confluent flink region list" to see all).
--environment string   Environment ID.
--context string       CLI context name.
```

```sql
--environment string                  REQUIRED: Name of the Flink environment.
--url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
--client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
--client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
--certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink statement stop my-statement
```
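Each On-Premises connection flag above can be replaced by an environment variable. The Python sketch below shows one way a wrapper script might decide which flags to pass and which settings are still unset. The helper names are hypothetical, and giving an explicit flag precedence over its environment variable is an assumption, since the reference only says the variable "may be set in place of this flag".

```python
from typing import Dict, List, Mapping

# Flag-to-environment-variable pairs taken from the flag descriptions above.
CMF_ENV_VARS: Dict[str, str] = {
    "--url": "CONFLUENT_CMF_URL",
    "--client-key-path": "CONFLUENT_CMF_CLIENT_KEY_PATH",
    "--client-cert-path": "CONFLUENT_CMF_CLIENT_CERT_PATH",
    "--certificate-authority-path": "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH",
}


def cmf_flag_args(explicit: Mapping[str, str]) -> List[str]:
    """Return argv fragments for the settings that were given explicitly.

    Settings not given explicitly are left to their environment variables.
    """
    args: List[str] = []
    for flag in CMF_ENV_VARS:
        value = explicit.get(flag)
        if value:
            args += [flag, value]
    return args


def unresolved_cmf_settings(explicit: Mapping[str, str], env: Mapping[str, str]) -> List[str]:
    """Return the flags that are set neither explicitly nor through the environment."""
    return [
        flag
        for flag, env_name in CMF_ENV_VARS.items()
        if not explicit.get(flag) and not env.get(env_name)
    ]
```

In a real script, `env` would typically be `os.environ`. Whether an unresolved setting is a problem depends on the deployment; for instance, the mTLS paths only matter when mTLS is in use.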

---

### confluent flink statement update | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_update.html

confluent flink statement update Description Update a Flink SQL statement. confluent flink statement update <name> [flags] Flags --principal string A user or service account the statement runs as. --compute-pool string Flink compute pool ID. --stopped Request to stop or resume the statement. --cloud string Specify the cloud provider as "aws", "azure", or "gcp". --region string Cloud region for Flink (use "confluent flink region list" to see all). --environment string Environment ID. --context string CLI context name. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). Examples Request to resume the currently stopped statement “my-statement” using original principal id and under the original compute pool. confluent flink statement update my-statement --stopped=false Request to resume the currently stopped statement “my-statement” using service account “sa-123456”. confluent flink statement update my-statement --stopped=false --principal sa-123456 Request to resume the currently stopped statement “my-statement” using user account “u-987654”. confluent flink statement update my-statement --stopped=false --principal u-987654 Request to resume the currently stopped statement “my-statement” and under a different compute pool “lfcp-123456”. confluent flink statement update my-statement --stopped=false --compute-pool lfcp-123456 Request to resume the currently stopped statement “my-statement” using service account “sa-123456” and under a different compute pool “lfcp-123456”. confluent flink statement update my-statement --stopped=false --principal sa-123456 --compute-pool lfcp-123456 Request to stop the currently running statement “my-statement”. 
confluent flink statement update my-statement --stopped=true See Also confluent flink statement - Manage Flink SQL statements.

#### Code Examples

```sql
confluent flink statement update <name> [flags]
```

```sql
--principal string      A user or service account the statement runs as.
--compute-pool string   Flink compute pool ID.
--stopped               Request to stop or resume the statement.
--cloud string          Specify the cloud provider as "aws", "azure", or "gcp".
--region string         Cloud region for Flink (use "confluent flink region list" to see all).
--environment string    Environment ID.
--context string        CLI context name.
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

```sql
confluent flink statement update my-statement --stopped=false
```

```sql
confluent flink statement update my-statement --stopped=false --principal sa-123456
```

```sql
confluent flink statement update my-statement --stopped=false --principal u-987654
```

```sql
confluent flink statement update my-statement --stopped=false --compute-pool lfcp-123456
```

```sql
confluent flink statement update my-statement --stopped=false --principal sa-123456 --compute-pool lfcp-123456
```

```sql
confluent flink statement update my-statement --stopped=true
```
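In the examples above, resuming and stopping differ only in the value given to `--stopped`, and every example uses the `--stopped=<value>` form. The Python sketch below mirrors that pattern. The helper name `build_update_cmd` is hypothetical; keeping the `=` form is a deliberate choice, because many CLI parsers treat a space-separated value after a boolean flag as a separate positional argument.

```python
from typing import List, Optional


def build_update_cmd(
    name: str,
    stopped: bool,
    principal: Optional[str] = None,
    compute_pool: Optional[str] = None,
) -> List[str]:
    """Build the argv for `confluent flink statement update` (hypothetical helper).

    stopped=True requests a stop; stopped=False requests a resume.
    """
    # Keep the flag and its value in one token, as in the documented examples.
    cmd = [
        "confluent", "flink", "statement", "update", name,
        "--stopped=" + ("true" if stopped else "false"),
    ]
    if principal:
        cmd += ["--principal", principal]
    if compute_pool:
        cmd += ["--compute-pool", compute_pool]
    return cmd
```

For example, `build_update_cmd("my-statement", stopped=True)` corresponds to the final example above.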

---

### confluent flink statement web-ui-forward | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/confluent_flink_statement_web-ui-forward.html

confluent flink statement web-ui-forward Description Forward the web UI of a Flink statement in Confluent Platform. confluent flink statement web-ui-forward <name> [flags] Flags --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. --port uint16 Port to forward the web UI to. If not provided, a random, OS-assigned port will be used. Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink statement - Manage Flink SQL statements.

#### Code Examples

```sql
confluent flink statement web-ui-forward <name> [flags]
```

```sql
--environment string                  REQUIRED: Name of the Flink environment.
--url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
--client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
--client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
--certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
--port uint16                         Port to forward the web UI to. If not provided, a random, OS-assigned port will be used.
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```
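Since `--port` is typed as `uint16`, a script that picks the port itself may want to range-check it first. The Python sketch below is one way to do that, under the assumption that any value from 0 to 65535 is accepted by the flag's type; the helper name `build_web_ui_forward_cmd` is hypothetical. Passing `None` omits the flag, which according to the description above lets the OS assign a random port.

```python
from typing import List, Optional

UINT16_MAX = 2**16 - 1  # 65535, the largest value a uint16 can hold


def build_web_ui_forward_cmd(
    name: str,
    environment: str,
    port: Optional[int] = None,
) -> List[str]:
    """Build the argv for `confluent flink statement web-ui-forward` (hypothetical helper)."""
    if not environment:
        raise ValueError("--environment is documented as REQUIRED")
    cmd = ["confluent", "flink", "statement", "web-ui-forward", name,
           "--environment", environment]
    if port is not None:
        if isinstance(port, bool) or not isinstance(port, int):
            raise TypeError("port must be an integer")
        if not 0 <= port <= UINT16_MAX:
            raise ValueError(f"port must fit in uint16 (0..{UINT16_MAX})")
        cmd += ["--port", str(port)]
    return cmd
```

The sketch only validates the numeric range. It does not check whether the port is free on the local machine.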

---

### confluent flink statement exception list | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/exception/confluent_flink_statement_exception_list.html

confluent flink statement exception list Description Cloud List exceptions for a Flink SQL statement. confluent flink statement exception list <statement-name> [flags] On-Premises List exceptions for a Flink SQL statement in Confluent Platform. confluent flink statement exception list <statement-name> [flags] Flags Cloud --cloud string Specify the cloud provider as "aws", "azure", or "gcp". --region string Cloud region for Flink (use "confluent flink region list" to see all). --environment string Environment ID. --context string CLI context name. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") On-Premises --environment string REQUIRED: Name of the Flink environment. --url string Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag. --client-key-path string Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag. --client-cert-path string Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") Global Flags -h, --help Show help for this command. --unsafe-trace Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets. -v, --verbose count Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace). See Also confluent flink statement exception - Manage Flink SQL statement exceptions.

#### Code Examples

```sql
confluent flink statement exception list <statement-name> [flags]
```

```sql
    --cloud string         Specify the cloud provider as "aws", "azure", or "gcp".
    --region string        Cloud region for Flink (use "confluent flink region list" to see all).
    --environment string   Environment ID.
    --context string       CLI context name.
-o, --output string        Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
    --environment string                  REQUIRED: Name of the Flink environment.
    --url string                          Base URL of the Confluent Manager for Apache Flink (CMF). Environment variable "CONFLUENT_CMF_URL" may be set in place of this flag.
    --client-key-path string              Path to client private key for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_KEY_PATH" may be set in place of this flag.
    --client-cert-path string             Path to client cert to be verified by Confluent Manager for Apache Flink. Include for mTLS authentication. Environment variable "CONFLUENT_CMF_CLIENT_CERT_PATH" may be set in place of this flag.
    --certificate-authority-path string   Path to a PEM-encoded Certificate Authority to verify the Confluent Manager for Apache Flink connection. Environment variable "CONFLUENT_CMF_CERTIFICATE_AUTHORITY_PATH" may be set in place of this flag.
-o, --output string                       Specify the output format as "human", "json", or "yaml". (default "human")
```

```sql
-h, --help            Show help for this command.
    --unsafe-trace    Equivalent to -vvvv, but also log HTTP requests and responses which might contain plaintext secrets.
-v, --verbose count   Increase verbosity (-v for warn, -vv for info, -vvv for debug, -vvvv for trace).
```

---

### confluent flink statement exception | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/exception/index.html

confluent flink statement exception Description Manage Flink SQL statement exceptions. Subcommands Command Description confluent flink statement exception list List exceptions for a Flink SQL statement.

---

### confluent flink statement | Confluent Documentation
Source: https://docs.confluent.io/confluent-cli/current/command-reference/flink/statement/index.html

confluent flink statement Description Manage Flink SQL statements. Subcommands Cloud Command Description confluent flink statement create Create a Flink SQL statement. confluent flink statement delete Delete one or more Flink SQL statements. confluent flink statement describe Describe a Flink SQL statement. confluent flink statement exception Manage Flink SQL statement exceptions. confluent flink statement list List Flink SQL statements. confluent flink statement resume Resume a Flink SQL statement. confluent flink statement stop Stop a Flink SQL statement. confluent flink statement update Update a Flink SQL statement. On-Premises Command Description confluent flink statement create Create a Flink SQL statement. confluent flink statement delete Delete one or more Flink SQL statements. confluent flink statement describe Describe a Flink SQL statement. confluent flink statement exception Manage Flink SQL statement exceptions. confluent flink statement list List Flink SQL statements in Confluent Platform. confluent flink statement rescale Rescale a Flink SQL statement. confluent flink statement resume Resume a Flink SQL statement. confluent flink statement stop Stop a Flink SQL statement. confluent flink statement web-ui-forward Forward the web UI of a Flink statement.

---

### Manage Confluent Platform for Apache Flink Applications Using Confluent for Kubernetes | Confluent Documentation
Source: https://docs.confluent.io/operator/current/co-manage-flink.html

Manage Flink Applications Using Confluent for Kubernetes

Apache Flink® is a powerful, scalable, and secure stream processing framework for running complex, stateful, low-latency streaming applications on large volumes of data. Offered with Confluent Platform, Confluent Manager for Apache Flink® (CMF) is a self-managed service for Flink that integrates seamlessly with Apache Kafka®. To learn more about CMF, see Overview of Confluent Platform for Apache Flink.

You can use Confluent for Kubernetes (CFK) to manage CMF and Flink applications within the familiar Kubernetes environment and custom resources.

The high-level workflow to manage Flink applications with CFK is:

1. Install the Confluent Platform for Apache Flink Kubernetes operator.
2. Install Confluent Manager for Apache Flink.
3. Install or upgrade CFK with Flink integration enabled: `helm upgrade --install confluent-operator confluentinc/confluent-for-kubernetes`

   To configure CFK to listen to multiple namespaces or a single namespace, see Configure CFK to manage Confluent Platform components in different namespaces. For example, to configure CFK to manage Flink in two namespaces, confluent and default, and only in those namespaces, add the `--set namespaceList` and `--set namespaced=true` flags to the helm upgrade command: `helm upgrade --install confluent-operator confluentinc/confluent-for-kubernetes --set namespaceList="{confluent,default}" --set namespaced=true`

   For more information about CFK installation, see Deploy Confluent for Kubernetes.
4. Create a CMF REST class.
5. Create a Flink environment.
6. Create a Flink application.
7. In the Flink Web UI, verify that the application job you created is running.

An example scenario of using CMF with CFK is available in the CFK Example Repository.
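The `namespaceList` value in the Helm step is a brace-delimited, comma-separated list, which is easy to mistype by hand. The Python sketch below assembles the `helm upgrade` argument list from a list of namespaces. The helper name `build_cfk_helm_cmd` is hypothetical, and it uses only the chart name and `--set` keys shown in the workflow above. Because the result is an argument list rather than a shell string, the quotes around the brace expression are not needed.

```python
from typing import List, Optional, Sequence


def build_cfk_helm_cmd(namespaces: Optional[Sequence[str]] = None) -> List[str]:
    """Build the argv for installing or upgrading CFK (hypothetical helper).

    With no namespaces, the plain install/upgrade command is returned.
    With namespaces, CFK is restricted to exactly those namespaces.
    """
    cmd = [
        "helm", "upgrade", "--install", "confluent-operator",
        "confluentinc/confluent-for-kubernetes",
    ]
    if namespaces:
        for ns in namespaces:
            if not ns or "," in ns or "{" in ns or "}" in ns:
                raise ValueError(f"invalid namespace name: {ns!r}")
        cmd += [
            "--set", "namespaceList={" + ",".join(namespaces) + "}",
            "--set", "namespaced=true",
        ]
    return cmd
```

For example, `build_cfk_helm_cmd(["confluent", "default"])` yields the second command from the workflow, ready to pass to `subprocess.run`.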
Requirements and considerations To manage Flink in CFK, you need the following versions: CMF version V1 CFK 2.10.0 and higher Confluent Platform 7.8.0 and higher Currently, CFK can authenticate to CMF without authentication or using mTLS. Create a CMF REST Class When managing CMF in CFK, the CMF custom resources, namely, FlinkEnvironment and FlinkApplication, communicate with CMF through the CMF REST Class (CMFRestClass). You need to first set up a CMF REST Class custom resource (CR). CMF REST Class is only used by CFK and is not part of CMF. If using mTLS or TLS to connect to the Flink host, create a secret. Certificates with appropriate Subject Alternate Names (SANs) are required for the mTLS setup. mTLS: You need to create a secret with certs and reference it in the CMFRestClass CR in the next step. TLS: The secret is only required if using a self-signed certificate. See Provide TLS keys and certificates in PEM format and Provide TLS keys and certificates in Java KeyStore format for the expected keys in the TLS secret. Create a a CMF REST Class (CMFRestClass CR) with the following spec and deploy the resource using the kubectl apply -f command. apiVersion: platform.confluent.io/v1beta1 kind: CMFRestClass metadata: name: --- [1] namespace: --- [2] spec: cmfRest: --- [3] authentication: type: --- [4] endpoint: --- [5] tls: --- [6] secretRef: --- [7] [1] The name of the REST Class. [2] The namespace of the CMF REST Class. [3] The CMF cluster. [4] To use mTLS authentication, set to mtls and specify the certificates in [7]. [5] The endpoint of the CMF host. [6] Required when you set the authentication type ([4]) is set to mtls. [7] The name of the secret that contains the TLS certificates. 
An example CMFRestClass CR: apiVersion: platform.confluent.io/v1beta1 kind: CMFRestClass metadata: name: default namespace: operator spec: cmfRest: endpoint: https://cmf-service:80 authentication: type: mtls sslClientAuthentication: true tls: secretRef: cmf-day2-tls Check the status: kubectl get CMFRestClass default -n <namespace> -oyaml Create a Flink environment A Flink environment is a set of configurations that Flink applications use. Create a FlinkEnvironment CR using the following spec, and deploy it with the kubectl apply -f command. apiVersion: platform.confluent.io/v1beta1 kind: FlinkEnvironment metadata: name: namespace: spec: kubernetesNamespace: --- [1] flinkApplicationDefaults: --- [2] metadata: --- [3] spec: --- [4] flinkConfiguration: cmfRestClassRef: --- [5] name: namespace: [1] The namespace of the Flink cluster. Typically, you would install the FlinkEnvironment CR in the CFK namespace (metadata.namespace), but the Flink would be in another namespace (spec.kubernetesNamespace), for example, default. [2] Configurations for the Flink cluster to specify the deployment-wide default application settings. [3] Kubernetes API metadata. [4] Spec of the FlinkApplicationSpec type. [5] The reference to the REST Class you created in Create a CMF REST Class. You can install FlinkEnvironment CR and the CMF REST class in different namespaces. If omitted, the CMFRestClass of the name default in the same namespace is used. An example FlinkEnvironment CR: apiVersion: platform.confluent.io/v1beta1 kind: FlinkEnvironment metadata: name: my-env1 namespace: operator spec: kubernetesNamespace: default flinkApplicationDefaults: metadata: labels: "acmecorp.com/owned-by": "analytics-team" spec: flinkConfiguration: taskmanager.numberOfTaskSlots: "2" rest.profiling.enabled": "true" cmfRestClassRef: name: default namespace: operator Check the status. 
kubectl get flinkEnvironment -n <namespace> -oyaml Create a Flink application A Flink application is a user program that creates one or more Flink jobs to process data. To create a Flink application resource in CFK: Create a FlinkApplication CR using the following spec and deploy the resource using the kubectl apply -f command. apiVersion: platform.confluent.io/v1beta1 kind: FlinkApplication metadata: spec: cmfRestClassRef: name: --- [1] namespace: --- [2] image: --- [3] flinkEnvironment: --- [4] image: flinkVersion: flinkConfiguration: --- [5] serviceAccount: --- [6] jobManager: --- [7] taskManager: --- [8] job: --- [9] [1] The reference to the REST Class you created in Create a CMF REST Class. If omitted, the CMFRestClass of the name default in the same namespace is used. [2] The namespace of this FlinkApplication CR. The namespace of the Flink cluster is determined by FlinkEnvironment.spec.kubernetesNamespace. [3] The CMF image. [4] The reference to the FlinkEnvironment CR you created in Create a Flink environment. [5] Flink configurations. [6] The service account that runs Flink. [7] FlinkJobManager [8] FlinkTaskManager [9] FlinkJob An example FlinkApplication CR: apiVersion: platform.confluent.io/v1beta1 kind: FlinkApplication metadata: name: my-app1 namespace: default spec: flinkEnvironment: my-env1 image: confluentinc/cp-flink:1.19.1-cp1 flinkVersion: v1_19 flinkConfiguration: "taskmanager.numberOfTaskSlots": "2" "metrics.reporter.prom.factory.class": "org.apache.flink.metrics.prometheus.PrometheusReporterFactory" "metrics.reporter.prom.port": "9249-9250" "rest.profiling.enabled": "true" serviceAccount: flink jobManager: resource: memory: 1048m cpu: 1 taskManager: resource: memory: 1048m cpu: 1 job: jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar state: running parallelism: 3 upgradeMode: stateless cmfRestClassRef: name: default namespace: operator Check the status. 
`kubectl get flinkApplication -n <namespace> -oyaml`

For status details, see Check the Flink application status.

#### Check the Flink application status

The following are the notable status fields in the CFK-managed FlinkApplication CR. The numbers match the callouts in the status template listed under Code Examples:

- [1] `status.cmfSync`: The status of the sync between CFK and CMF through the CMFRestClass CR.
- [2] `status.cmfSync.errorMessage`: Any error message related to the sync between CFK and CMF ([1]), for example, a connection, authentication, or validation error.
- [3] `status.cmfSync.lastSyncTime`: The time when the latest sync between CFK and CMF happened.
- [4] `status.cmfSync.status`: The sync status. The possible values are CREATED, DELETED, UNKNOWN, and FAILED.
- [5] `status.error`: Asynchronous errors from the Flink deployment. This is only set if `status.cmfSync.errorMessage` ([2]) is empty and `status.cmfSync.status` is `CREATED`.

For details about the following status fields, refer to the CMF documentation.

- [6] `status.clusterInfo`: Information about the Flink cluster when deployed. This section is only set if `status.error` ([5]) is not set.
- [7] `status.jobManagerDeploymentStatus`: Status of the JobManager deployment in Kubernetes.
- [8] `status.jobStatus.state`: Status of the Flink job inside the FlinkApplication's Flink cluster.

The status and error fields in `FlinkApplication.status` form a hierarchy:

- Level 1: The `status.cmfSync` field must be error-free, which indicates that CFK was able to submit the FlinkApplication to the CMF backend.
- Level 2: The CMF backend or the internal Kubernetes Operator might report an error in the `status.error` field.
- Level 3: Once the errors in the fields above are resolved, the rest of the status fields are populated.
The following is an example of an error status, listed in full under Code Examples. The sync with CMF succeeded (`cmfSync.errorMessage` is empty and `cmfSync.status` is `Created`), but `status.error` carries a `ReconciliationException` from the Flink Kubernetes Operator. The JobManager memory was configured as `1`, which Flink reads as 1 byte, so the sum of the JVM Metaspace (256 MB) and the JVM Overhead (192 MB) exceeds the Total Process Memory. As a result, `clusterInfo` is empty, `jobManagerDeploymentStatus` is `MISSING`, `jobStatus.state` is empty, and `lifecycleState` is `UPGRADING`.

The following is an example status of a successful FlinkApplication creation, also listed in full under Code Examples. Here `status.error` is absent, `clusterInfo` is populated (Flink version `1.19.1-cp1`), `jobManagerDeploymentStatus` is `READY`, `jobStatus.state` is `RUNNING`, `lifecycleState` is `STABLE`, and `reconciliationStatus.state` is `DEPLOYED`.

#### Code Examples

```bash
helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes
```

```bash
helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --set namespaceList="{confluent,default}" \
  --set namespaced=true
```

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: CMFRestClass
metadata:
  name:                     --- [1]
  namespace:                --- [2]
spec:
  cmfRest:                  --- [3]
    authentication:
      type:                 --- [4]
    endpoint:               --- [5]
    tls:                    --- [6]
      secretRef:            --- [7]
```

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: CMFRestClass
metadata:
  name: default
  namespace: operator
spec:
  cmfRest:
    endpoint: https://cmf-service:80
    authentication:
      type: mtls
      sslClientAuthentication: true
    tls:
      secretRef: cmf-day2-tls
```

```bash
kubectl get CMFRestClass default -n <namespace> -oyaml
```

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: FlinkEnvironment
metadata:
  name:
  namespace:
spec:
  kubernetesNamespace:      --- [1]
  flinkApplicationDefaults: --- [2]
    metadata:               --- [3]
    spec:                   --- [4]
      flinkConfiguration:
  cmfRestClassRef:          --- [5]
    name:
    namespace:
```

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: FlinkEnvironment
metadata:
  name: my-env1
  namespace: operator
spec:
  kubernetesNamespace: default
  flinkApplicationDefaults:
    metadata:
      labels:
        "acmecorp.com/owned-by": "analytics-team"
    spec:
      flinkConfiguration:
        taskmanager.numberOfTaskSlots: "2"
        "rest.profiling.enabled": "true"
  cmfRestClassRef:
    name: default
    namespace: operator
```
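
The fallback rule for `cmfRestClassRef` described above (when omitted, the CMFRestClass named `default` in the same namespace is used) can be sketched in a few lines of Python. This is an illustration only, not how CFK implements it: the function name `resolve_cmf_rest_class_ref` is hypothetical, the CR is modelled as a plain dictionary, and filling in a missing `name` or `namespace` field individually is an assumption.

```python
from typing import Any, Dict


def resolve_cmf_rest_class_ref(cr: Dict[str, Any]) -> Dict[str, str]:
    """Return the CMFRestClass name and namespace that a CR points at.

    Hypothetical helper: falls back to the name "default" in the CR's own
    namespace when spec.cmfRestClassRef is omitted. Falling back per field
    (only name or only namespace missing) is an assumption of this sketch.
    """
    cr_namespace = cr["metadata"]["namespace"]
    ref = (cr.get("spec") or {}).get("cmfRestClassRef") or {}
    return {
        "name": ref.get("name") or "default",
        "namespace": ref.get("namespace") or cr_namespace,
    }


# A FlinkEnvironment like my-env1, but with cmfRestClassRef left out.
env = {
    "kind": "FlinkEnvironment",
    "metadata": {"name": "my-env1", "namespace": "operator"},
    "spec": {"kubernetesNamespace": "default"},
}
print(resolve_cmf_rest_class_ref(env))
```

For the environment above, the sketch resolves to the CMFRestClass `default` in the `operator` namespace, which matches the explicit reference in the example manifest.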

```bash
kubectl get flinkEnvironment -n <namespace> -oyaml
```

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: FlinkApplication
metadata:
spec:
  cmfRestClassRef:
     name:                  --- [1]
     namespace:             --- [2]
  image:                    --- [3]
  flinkEnvironment:         --- [4]
  image:
  flinkVersion:
  flinkConfiguration:       --- [5]
  serviceAccount:           --- [6]
  jobManager:               --- [7]
  taskManager:              --- [8]
  job:                      --- [9]
```

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: FlinkApplication
metadata:
  name: my-app1
  namespace: default
spec:
  flinkEnvironment: my-env1
  image: confluentinc/cp-flink:1.19.1-cp1
  flinkVersion: v1_19
  flinkConfiguration:
    "taskmanager.numberOfTaskSlots": "2"
    "metrics.reporter.prom.factory.class": "org.apache.flink.metrics.prometheus.PrometheusReporterFactory"
    "metrics.reporter.prom.port": "9249-9250"
    "rest.profiling.enabled": "true"
  serviceAccount: flink
  jobManager:
    resource:
      memory: 1048m
      cpu: 1
  taskManager:
    resource:
      memory: 1048m
      cpu: 1
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
    state: running
    parallelism: 3
    upgradeMode: stateless
  cmfRestClassRef:
    name: default
    namespace: operator
```
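
The `memory` values in `jobManager.resource` and `taskManager.resource` are easy to get wrong: the error status example in this section comes from a memory value of `1`, which is read as 1 byte. The following Python sketch is a rough pre-flight check under stated assumptions. The helper names are hypothetical, the unit table is a simplified subset that treats `m` as mebibytes, and the two JVM constants are the values quoted in that error message rather than values read from a live configuration. Passing this check is necessary but not sufficient for a valid Flink memory setup.

```python
import re

# Simplified unit table (assumption): a bare number means bytes.
_UNITS = {"": 1, "b": 1, "k": 1024, "kb": 1024,
          "m": 1024 ** 2, "mb": 1024 ** 2, "g": 1024 ** 3, "gb": 1024 ** 3}

# Values quoted in the error status example of this section.
JVM_METASPACE_BYTES = 268435456  # 256 MB
JVM_OVERHEAD_BYTES = 201326592   # 192 MB


def parse_memory(text: str) -> int:
    """Convert a memory string such as '1048m' or '1' to bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-zA-Z]*)\s*", text)
    if not match:
        raise ValueError("unrecognized memory value: %r" % text)
    number, unit = match.group(1), match.group(2).lower()
    if unit not in _UNITS:
        raise ValueError("unrecognized memory unit: %r" % unit)
    return int(number) * _UNITS[unit]


def covers_jvm_floor(text: str) -> bool:
    """True if the value is at least JVM Metaspace plus JVM Overhead."""
    return parse_memory(text) >= JVM_METASPACE_BYTES + JVM_OVERHEAD_BYTES


for value in ("1048m", "1"):
    print(value, parse_memory(value), covers_jvm_floor(value))
```

With these assumptions, `1048m` from the example manifest clears the floor, while `1` does not.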

```bash
kubectl get flinkApplication -n <namespace> -oyaml
```

```yaml
status:
  cmfSync:                     --- [1]
    errorMessage:              --- [2]
    lastSyncTime:              --- [3]
    status:                    --- [4]
  error:                       --- [5]
  clusterInfo:                 --- [6]
  jobManagerDeploymentStatus:  --- [7]
  jobStatus:
    state:                     --- [8]
```
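
The three-level hierarchy of status fields can be read as a short decision procedure. The Python sketch below walks a `status` dictionary (for example, parsed from the `kubectl get flinkApplication -oyaml` output) in that order. The function name `triage_status` and the return format are hypothetical, and treating any sync status other than `CREATED` as a level 1 finding is an assumption of this sketch.

```python
from typing import Any, Dict, Tuple


def triage_status(status: Dict[str, Any]) -> Tuple[int, str]:
    """Return (level, detail) for the first level that needs attention."""
    sync = status.get("cmfSync") or {}

    # Level 1: the sync between CFK and CMF must be error-free.
    if sync.get("errorMessage"):
        return 1, "cmfSync error: " + sync["errorMessage"]
    # The field list documents CREATED while the sample output shows
    # "Created", so the comparison is case-insensitive.
    sync_state = str(sync.get("status") or "UNKNOWN").upper()
    if sync_state != "CREATED":
        return 1, "cmfSync status is " + sync_state

    # Level 2: errors from the CMF backend or the Kubernetes Operator.
    if status.get("error"):
        return 2, "deployment error: " + str(status["error"])

    # Level 3: the remaining fields are populated once levels 1 and 2 are clean.
    job_state = (status.get("jobStatus") or {}).get("state") or "unknown"
    return 3, "job state: " + job_state


print(triage_status({
    "cmfSync": {"errorMessage": "", "status": "Created"},
    "jobManagerDeploymentStatus": "READY",
    "jobStatus": {"state": "RUNNING"},
}))
```

Checking the levels in this order avoids chasing an empty `jobStatus` when the real problem is an earlier sync or deployment error.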

```yaml
status:
  cfkInternalState: CREATED
  clusterInfo: {}
  cmfSync:
    errorMessage: ""
    lastSyncTime: "2024-11-05T19:15:09Z"
    status: Created
  error: '{"type":"org.apache.flink.kubernetes.operator.exception.ReconciliationException","message":"org.apache.flink.configuration.IllegalConfigurationException:
    JobManager memory configuration failed: Sum of configured JVM Metaspace (256.000mb
    (268435456 bytes)) and JVM Overhead (192.000mb (201326592 bytes)) exceed configured
    Total Process Memory (1 bytes).","additionalMetadata":{},"throwableList":[{"type":"org.apache.flink.configuration.IllegalConfigurationException","message":"JobManager
    memory configuration failed: Sum of configured JVM Metaspace (256.000mb (268435456
    bytes)) and JVM Overhead (192.000mb (201326592 bytes)) exceed configured Total
    Process Memory (1 bytes).","additionalMetadata":{}},{"type":"org.apache.flink.configuration.IllegalConfigurationException","message":"Sum
    of configured JVM Metaspace (256.000mb (268435456 bytes)) and JVM Overhead (192.000mb
    (201326592 bytes)) exceed configured Total Process Memory (1 bytes).","additionalMetadata":{}}]}'
  jobManagerDeploymentStatus: MISSING
  jobStatus:
    checkpointInfo:
      lastPeriodicCheckpointTimestamp: 0
    jobId: a6251e5a0f3f2e00f56874b56bc0780c
    jobName: ""
    savepointInfo:
      lastPeriodicSavepointTimestamp: 0
      savepointHistory: []
    state: ""
  lifecycleState: UPGRADING
  observedGeneration: 5
  reconciliationStatus:
    lastReconciledSpec: '{"spec":{"job":{"jarURI":"local:///opt/flink/examples/streaming/StateMachineExample.jar","parallelism":1,"entryClass":null,"args":[],"state":"suspended","savepointTriggerNonce":null,"initialSavepointPath":null,"checkpointTriggerNonce":null,"upgradeMode":"stateless","allowNonRestoredState":null,"savepointRedeployNonce":null},"restartNonce":null,"flinkConfiguration":{"rest.profiling.enabled":"true","taskmanager.numberOfTaskSlots":"2"},"image":"confluentinc/cp-flink:1.19.1-cp1","imagePullPolicy":null,"serviceAccount":"flink","flinkVersion":"v1_19","ingress":null,"podTemplate":null,"jobManager":{"resource":{"cpu":1.0,"memory":"1","ephemeralStorage":null},"replicas":1,"podTemplate":{"metadata":{"labels":{"platform.confluent.io/origin":"flink"}}}},"taskManager":{"resource":{"cpu":1.0,"memory":"1","ephemeralStorage":null},"replicas":null,"podTemplate":{"metadata":{"labels":{"platform.confluent.io/origin":"flink"}}}},"logConfiguration":null,"mode":null},"resource_metadata":{"apiVersion":"flink.apache.org/v1beta1","metadata":{"generation":6},"firstDeployment":true}}'
    reconciliationTimestamp: 1730834099726
    state: UPGRADING
  taskManager:
    labelSelector: ""
    replicas: 0
```
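
The `status.error` field in the example above is a JSON document stored as a string, with the exception chain in `throwableList`. A minimal Python sketch for pulling out the most specific message follows. The function name `root_cause` is hypothetical, and taking the last `throwableList` entry as the innermost cause is an assumption based on this one example, where each entry drops a prefix from the previous message.

```python
import json


def root_cause(error_field: str) -> str:
    """Return the innermost message from a status.error JSON string."""
    if not error_field:
        return ""
    payload = json.loads(error_field)
    throwables = payload.get("throwableList") or []
    if throwables:
        # Assumption: the last entry is the innermost cause.
        return throwables[-1].get("message", "")
    return payload.get("message", "")


# Rebuild the example error from the status above.
config_exc = "org.apache.flink.configuration.IllegalConfigurationException"
msg = ("Sum of configured JVM Metaspace (256.000mb (268435456 bytes)) and "
       "JVM Overhead (192.000mb (201326592 bytes)) exceed configured "
       "Total Process Memory (1 bytes).")
wrapped = "JobManager memory configuration failed: " + msg
example = json.dumps({
    "type": "org.apache.flink.kubernetes.operator.exception.ReconciliationException",
    "message": config_exc + ": " + wrapped,
    "additionalMetadata": {},
    "throwableList": [
        {"type": config_exc, "message": wrapped, "additionalMetadata": {}},
        {"type": config_exc, "message": msg, "additionalMetadata": {}},
    ],
})
print(root_cause(example))
```

For the example error, this prints the bare "Sum of configured JVM Metaspace ..." message without the wrapping exception names.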

```yaml
status:
  cfkInternalState: CREATED
  clusterInfo:
    flink-revision: 89d0b8f @ 2024-06-22T13:19:31+02:00
    flink-version: 1.19.1-cp1
    total-cpu: "2.0"
    total-memory: "2516582400"
  cmfSync:
    errorMessage: ""
    lastSyncTime: "2024-11-05T19:19:10Z"
    status: Created
  jobManagerDeploymentStatus: READY
  jobStatus:
    checkpointInfo:
      lastPeriodicCheckpointTimestamp: 0
    jobId: 522d7ff7f15b4e138ffb9ea4053abbd3
    jobName: State machine job
    savepointInfo:
      lastPeriodicSavepointTimestamp: 0
      savepointHistory: []
    startTime: "1730834237948"
    state: RUNNING
    updateTime: "1730834248753"
  lifecycleState: STABLE
  observedGeneration: 6
  reconciliationStatus:
    lastReconciledSpec: '{"spec":{"job":{"jarURI":"local:///opt/flink/examples/streaming/StateMachineExample.jar","parallelism":1,"entryClass":null,"args":[],"state":"running","savepointTriggerNonce":null,"initialSavepointPath":null,"checkpointTriggerNonce":null,"upgradeMode":"stateless","allowNonRestoredState":null,"savepointRedeployNonce":null},"restartNonce":null,"flinkConfiguration":{"rest.profiling.enabled":"true","taskmanager.numberOfTaskSlots":"2"},"image":"confluentinc/cp-flink:1.19.1-cp1","imagePullPolicy":null,"serviceAccount":"flink","flinkVersion":"v1_19","ingress":null,"podTemplate":null,"jobManager":{"resource":{"cpu":1.0,"memory":"1200m","ephemeralStorage":null},"replicas":1,"podTemplate":{"metadata":{"labels":{"platform.confluent.io/origin":"flink"}}}},"taskManager":{"resource":{"cpu":1.0,"memory":"1200m","ephemeralStorage":null},"replicas":null,"podTemplate":{"metadata":{"labels":{"platform.confluent.io/origin":"flink"}}}},"logConfiguration":null,"mode":null},"resource_metadata":{"apiVersion":"flink.apache.org/v1beta1","metadata":{"generation":12},"firstDeployment":true}}'
    lastStableSpec: '{"spec":{"job":{"jarURI":"local:///opt/flink/examples/streaming/StateMachineExample.jar","parallelism":1,"entryClass":null,"args":[],"state":"running","savepointTriggerNonce":null,"initialSavepointPath":null,"checkpointTriggerNonce":null,"upgradeMode":"stateless","allowNonRestoredState":null,"savepointRedeployNonce":null},"restartNonce":null,"flinkConfiguration":{"rest.profiling.enabled":"true","taskmanager.numberOfTaskSlots":"2"},"image":"confluentinc/cp-flink:1.19.1-cp1","imagePullPolicy":null,"serviceAccount":"flink","flinkVersion":"v1_19","ingress":null,"podTemplate":null,"jobManager":{"resource":{"cpu":1.0,"memory":"1200m","ephemeralStorage":null},"replicas":1,"podTemplate":{"metadata":{"labels":{"platform.confluent.io/origin":"flink"}}}},"taskManager":{"resource":{"cpu":1.0,"memory":"1200m","ephemeralStorage":null},"replicas":null,"podTemplate":{"metadata":{"labels":{"platform.confluent.io/origin":"flink"}}}},"logConfiguration":null,"mode":null},"resource_metadata":{"apiVersion":"flink.apache.org/v1beta1","metadata":{"generation":12},"firstDeployment":true}}'
    reconciliationTimestamp: 1730834229475
    state: DEPLOYED
  taskManager:
    labelSelector: component=taskmanager,app=app111
    replicas: 1
```
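
A script that waits for a FlinkApplication to come up could compare the status against the values seen in the successful example above. The Python sketch below does that and also converts `startTime` to a readable timestamp. Both helper names are hypothetical. Treating `READY`, `RUNNING`, and `STABLE` together as "healthy" is an assumption drawn from this one example, and so is reading `startTime` as milliseconds since the Unix epoch.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def is_healthy(status: Dict[str, Any]) -> bool:
    """True if the status matches the values of the successful example."""
    job = status.get("jobStatus") or {}
    return (
        not status.get("error")
        and status.get("jobManagerDeploymentStatus") == "READY"
        and job.get("state") == "RUNNING"
        and status.get("lifecycleState") == "STABLE"
    )


def epoch_millis_to_utc(value: str) -> datetime:
    """Convert a millisecond epoch string to an aware UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Integer milliseconds avoid floating-point rounding.
    return epoch + timedelta(milliseconds=int(value))


status = {
    "jobManagerDeploymentStatus": "READY",
    "jobStatus": {"state": "RUNNING", "startTime": "1730834237948"},
    "lifecycleState": "STABLE",
}
print(is_healthy(status))
print(epoch_millis_to_utc(status["jobStatus"]["startTime"]).isoformat())
```

Under the epoch-milliseconds assumption, the `startTime` in the example falls about two minutes before the `lastSyncTime` of `2024-11-05T19:19:10Z`, which is consistent with the rest of the status.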

---