**Native PostgreSQL, psql**
```
postgresql://crate@localhost:5432/doc
```
**JDBC: PostgreSQL pgJDBC**
```
jdbc:postgresql://crate@localhost:5432/doc
```
**JDBC: CrateDB JDBC, e.g. Apache Flink**
```
jdbc:crate://:@localhost:5432/doc
```
**HTTP: Admin UI, CLI, CrateDB drivers**
```
http://crate@localhost:4200/
```
**SQLAlchemy**
```
crate://crate@localhost:4200/?schema=doc
```
:::
::::
:::::
::::::
```{tip}
- CrateDB's fixed catalog name is `crate`, the default schema name is `doc`.
- CrateDB does not implement the notion of a database,
however tables can be created in different [schemas].
- When asked for a *database name*, you can usually supply a schema name
  (any) or the fixed catalog name `crate`.
- If a database or schema name is omitted when connecting,
  PostgreSQL drivers may default to the username.
- The predefined [superuser] on an unconfigured CrateDB cluster is
called `crate`, defined without a password.
- To authenticate properly, refer to the available
  [authentication] options.
```
## Client Libraries
This section lists drivers and adapters for relevant programming languages,
frameworks, and environments.
### PostgreSQL
The drivers listed in this section all use the [CrateDB PostgreSQL interface].
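For a quick end-to-end check of this interface, an invocation of `psql`
against a local instance might look like the following sketch; host, port,
and the `crate` superuser match the connection string shown above for an
unconfigured local cluster, and the `sys.cluster` query is only a smoke test.
```
psql postgresql://crate@localhost:5432/doc -c "SELECT name FROM sys.cluster"
```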
::::{sd-table}
:widths: 2 3 5 2
:row-class: top-border
:::{sd-row}
```{sd-item}
```
```{sd-item} **Driver/Adapter**
```
```{sd-item} **Description**
```
```{sd-item} **Info**
```
:::
:::{sd-row}
```{sd-item} \-
```
```{sd-item}
[PostgreSQL ODBC](https://odbc.postgresql.org/)
```
```{sd-item}
The official PostgreSQL ODBC Driver.
For connecting to CrateDB from any environment that supports it.
```
```{sd-item}
```
:::
:::{sd-row}
```{sd-item} .NET
```
```{sd-item}
[Npgsql](https://www.npgsql.org/)
```
```{sd-item}
An open source ADO.NET Data Provider for PostgreSQL, for programs written in C#,
Visual Basic, and F#.
```
```{sd-item}
[](https://github.com/npgsql/npgsql)
[](https://github.com/crate/cratedb-examples/tree/main/by-language/csharp-npgsql)
```
:::
:::{sd-row}
```{sd-item} .NET
```
```{sd-item}
[CrateDB Npgsql fork](https://cratedb.com/docs/npgsql/)
```
```{sd-item}
This fork of the official driver was needed prior to CrateDB 4.2.
```
```{sd-item}
[](https://github.com/crate/npgsql)
```
:::
:::{sd-row}
```{sd-item} Golang
```
```{sd-item}
[pgx](https://github.com/jackc/pgx)
```
```{sd-item}
A pure Go driver and toolkit for PostgreSQL.
```
```{sd-item}
[](https://github.com/jackc/pgx)
```
:::
:::{sd-row}
```{sd-item} Java
```
```{sd-item}
[PostgreSQL JDBC](https://jdbc.postgresql.org/)
```
```{sd-item}
The official PostgreSQL JDBC Driver.
For connecting to CrateDB from any environment that supports it.
```
```{sd-item}
[](https://github.com/pgjdbc/pgjdbc)
[](#java)
[](https://github.com/crate/cratedb-examples/tree/main/by-language/java-jdbc)
```
:::
:::{sd-row}
```{sd-item} Java
```
```{sd-item}
[CrateDB PgJDBC fork](https://cratedb.com/docs/jdbc/)
```
```{sd-item}
For connecting to CrateDB with specialized type system support and
other tweaks. Ignores the `ROLLBACK` statement and the `hstore` and
`jsonb` extensions.
```
```{sd-item}
[](https://github.com/crate/pgjdbc)
```
:::
:::{sd-row}
```{sd-item} Node.js
```
```{sd-item}
[node-postgres](https://node-postgres.com/)
```
```{sd-item}
A collection of Node.js modules for interfacing with a PostgreSQL database using
JavaScript or TypeScript.
Has support for callbacks, promises, async/await, connection pooling, prepared
statements, cursors, streaming results, C/C++ bindings, rich type parsing, and more.
```
```{sd-item}
[](https://github.com/brianc/node-postgres)
[](#javascript)
```
:::
:::{sd-row}
```{sd-item} PHP
```
```{sd-item}
[PDO_PGSQL](https://www.php.net/manual/en/ref.pdo-pgsql.php)
```
```{sd-item}
For connecting to CrateDB from PHP, supporting its PDO interface.
```
```{sd-item}
[](https://github.com/php/php-src/tree/master/ext/pdo_pgsql)
[](https://github.com/crate/cratedb-examples/tree/main/by-language/php-pdo)
```
:::
:::{sd-row}
```{sd-item} PHP
```
```{sd-item}
[AMPHP](https://amphp.org/)
```
```{sd-item}
For connecting to CrateDB using AMPHP, an Async PostgreSQL client for PHP.
AMPHP is a collection of high-quality, event-driven libraries for PHP
designed with fibers and concurrency in mind.
```
```{sd-item}
[](https://github.com/amphp/postgres)
[](https://github.com/crate/cratedb-examples/tree/main/by-language/php-amphp)
```
:::
:::{sd-row}
```{sd-item} Python
```
```{sd-item}
[aiopg](https://github.com/aio-libs/aiopg)
```
```{sd-item}
For connecting to CrateDB from Python, supporting Python's `asyncio` (PEP-3156/tulip) framework.
```
```{sd-item}
[](https://github.com/aio-libs/aiopg)
[](#aiopg)
```
:::
:::{sd-row}
```{sd-item} Python
```
```{sd-item}
[asyncpg](https://github.com/MagicStack/asyncpg)
```
```{sd-item}
For connecting to CrateDB from Python, supporting Python's `asyncio`.
```
```{sd-item}
[](https://github.com/MagicStack/asyncpg)
[](#psycopg2)
```
:::
:::{sd-row}
```{sd-item} Python
```
```{sd-item}
[psycopg3](https://www.psycopg.org/psycopg3/docs/)
```
```{sd-item}
For connecting to CrateDB from Python, supporting Python's `asyncio`.
```
```{sd-item}
[](https://github.com/psycopg/psycopg)
[](#psycopg3)
```
:::
::::
### HTTP
The drivers listed in this section all use the [CrateDB HTTP interface].
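For orientation, a plain request against the `_sql` endpoint of a local
instance might look like the following sketch; host, port, and the `crate`
superuser again assume an unconfigured local cluster.
```
curl -sS -H 'Content-Type: application/json' \
  -X POST 'http://crate@localhost:4200/_sql' \
  -d '{"stmt": "SELECT name FROM sys.cluster"}'
```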
::::{sd-table}
:widths: 2 3 5 2
:row-class: top-border
:::{sd-row}
```{sd-item}
```
```{sd-item} **Driver/Adapter**
```
```{sd-item} **Description**
```
```{sd-item} **Info**
```
:::
:::{sd-row}
```{sd-item} MicroPython
```
```{sd-item}
[micropython-cratedb](https://github.com/crate/micropython-cratedb)
```
```{sd-item}
A MicroPython library connecting to the CrateDB HTTP API.
```
```{sd-item}
[](https://github.com/crate/micropython-cratedb)
```
:::
:::{sd-row}
```{sd-item} Node.js
```
```{sd-item}
[node-crate](https://www.npmjs.com/package/node-crate)
```
```{sd-item}
A JavaScript library connecting to the CrateDB HTTP API.
```
```{sd-item}
[](https://github.com/megastef/node-crate)
[](https://github.com/crate/devrel-shipping-forecast-geo-demo)
```
:::
:::{sd-row}
```{sd-item} PHP
```
```{sd-item}
[CrateDB PDO driver](https://cratedb.com/docs/pdo/)
```
```{sd-item}
For connecting to CrateDB from PHP.
```
```{sd-item}
[](https://github.com/crate/crate-pdo)
[](#php)
```
:::
:::{sd-row}
```{sd-item} PHP
```
```{sd-item}
[CrateDB DBAL adapter](https://cratedb.com/docs/dbal/)
```
```{sd-item}
For connecting to CrateDB from PHP, using DBAL and Doctrine.
```
```{sd-item}
[](https://github.com/crate/crate-dbal)
[](#php)
```
:::
:::{sd-row}
```{sd-item} Python
```
```{sd-item}
[CrateDB Python driver](https://cratedb.com/docs/python/)
```
```{sd-item}
For connecting to CrateDB from Python. Has support for [CrateDB BLOBs].
```
```{sd-item}
[](https://github.com/crate/crate-python)
[][python-dbapi-by-example]
[](#crate-python)
```
:::
:::{sd-row}
```{sd-item} Python
```
```{sd-item}
[SQLAlchemy dialect](https://cratedb.com/docs/sqlalchemy-cratedb/)
```
```{sd-item}
For connecting to CrateDB from Python, using SQLAlchemy.
```
```{sd-item}
[](https://github.com/crate/sqlalchemy-cratedb)
[][python-sqlalchemy-by-example]
[](#sqlalchemy-cratedb)
```
:::
:::{sd-row}
```{sd-item} Ruby
```
```{sd-item}
[CrateDB Ruby driver](https://github.com/crate/crate_ruby)
```
```{sd-item}
A Ruby client library for CrateDB.
```
```{sd-item}
[](https://github.com/crate/crate_ruby)
[](#ruby)
[](https://github.com/crate/cratedb-examples/tree/main/by-language/ruby)
```
:::
:::{sd-row}
```{sd-item} Ruby
```
```{sd-item}
[CrateDB ActiveRecord adapter](https://github.com/crate/activerecord-crate-adapter)
```
```{sd-item}
Ruby on Rails ActiveRecord adapter for CrateDB.
```
```{sd-item}
[](https://github.com/crate/activerecord-crate-adapter)
```
:::
::::
```{tip}
Please visit the [](#build-status) page for an overview of the integration
status of the client drivers listed above, and more.
```
```{toctree}
:maxdepth: 1
:hidden:
java
javascript
php
python
ruby
```
[ADBC]: https://arrow.apache.org/docs/format/ADBC.html
[Authentication]: inv:crate-reference:*:label#admin_auth
[CrateDB BLOBs]: inv:crate-reference:*:label#blob_support
[CrateDB bulk operations]: inv:crate-reference:*:label#http-bulk-ops
[CrateDB HTTP interface]: inv:crate-reference:*:label#interface-http
[CrateDB PostgreSQL interface]: inv:crate-reference:*:label#interface-postgresql
[HTTP protocol]: https://en.wikipedia.org/wiki/HTTP
[JDBC]: https://en.wikipedia.org/wiki/Java_Database_Connectivity
[ODBC]: https://en.wikipedia.org/wiki/Open_Database_Connectivity
[PostgreSQL wire protocol]: https://www.postgresql.org/docs/current/protocol.html
[python-dbapi-by-example]: inv:crate-python:*:label#by-example
[python-sqlalchemy-by-example]: inv:sqlalchemy-cratedb:*:label#by-example
[schema]: inv:crate-reference:*:label#ddl-create-table-schemas
[schemas]: inv:crate-reference:*:label#ddl-create-table-schemas
[superuser]: inv:crate-reference:*:label#administration_user_management
.. highlight:: sh
.. _conf-cluster-settings:
=====================
Cluster-wide settings
=====================
All currently applied cluster settings can be read by querying the
:ref:`sys.cluster.settings ` column. Most
cluster settings can be :ref:`changed at runtime
`. This is documented at each setting.
.. rubric:: Table of contents
.. contents::
:local:
.. _applying-cluster-settings:
Non-runtime cluster-wide settings
---------------------------------
Cluster wide settings which cannot be changed at runtime need to be specified
in the configuration of each node in the cluster.
.. CAUTION::
Cluster settings specified via node configurations are required to be
exactly the same on every node in the cluster for proper operation of the
cluster.
.. _conf_collecting_stats:
Collecting stats
----------------
.. _stats.enabled:
**stats.enabled**
| *Default:* ``true``
| *Runtime:* ``yes``
A boolean indicating whether or not to collect statistical information about
the cluster.
.. CAUTION::
The collection of statistical information incurs a slight performance
penalty, as details about every job and operation across the cluster will
cause data to be inserted into the corresponding system tables.
.. _stats.jobs_log_size:
**stats.jobs_log_size**
| *Default:* ``10000``
| *Runtime:* ``yes``
The maximum number of job records to be kept in the :ref:`sys.jobs_log
` table on each node.
A job record corresponds to a single SQL statement to be executed on the
cluster. These records are used for performance analytics. A larger job log
produces more comprehensive stats, but uses more RAM.
Older job records are deleted as newer records are added, once the limit is
reached.
Setting this value to ``0`` disables collecting job information.
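As a runtime setting, it can be adjusted with ``SET GLOBAL``; the value below
is only an illustrative choice::
    SET GLOBAL "stats.jobs_log_size" = 25000;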
.. _stats.jobs_log_expiration:
**stats.jobs_log_expiration**
| *Default:* ``0s`` (disabled)
| *Runtime:* ``yes``
The job record expiry time in seconds.
Job records in the :ref:`sys.jobs_log ` table are periodically
cleared if they are older than the expiry time. This setting overrides
:ref:`stats.jobs_log_size `.
If the value is set to ``0``, time-based log entry eviction is disabled.
.. NOTE::
    If both the :ref:`stats.jobs_log_size <stats.jobs_log_size>` and
    :ref:`stats.jobs_log_expiration <stats.jobs_log_expiration>` settings
    are disabled, jobs will not be recorded.
.. _stats.jobs_log_filter:
**stats.jobs_log_filter**
| *Default:* ``true`` (Include everything)
| *Runtime:* ``yes``
An :ref:`expression ` to determine if a job should be
recorded into ``sys.jobs_log``. The expression must :ref:`evaluate
` to a boolean. If it evaluates to ``true``, the statement
will show up in ``sys.jobs_log`` until it is evicted by one of the other
rules (expiration or size limit reached).
The expression may reference all columns contained in ``sys.jobs_log``. A
common use case is to include only jobs that took a certain amount of time to
execute::
cr> SET GLOBAL "stats.jobs_log_filter" = $$ended - started > '5 minutes'::interval$$;
SET OK, 1 row affected (... sec)
.. _stats.jobs_log_persistent_filter:
**stats.jobs_log_persistent_filter**
| *Default:* ``false`` (Include nothing)
| *Runtime:* ``yes``
An expression to determine if a job should also be recorded to the regular
``CrateDB`` log. Entries that match this filter will be logged under the
``StatementLog`` logger with the ``INFO`` level.
This is similar to ``stats.jobs_log_filter`` except that these entries are
persisted to the log file. This should be used with caution and shouldn't be
set to an expression that matches many queries as the logging operation will
block on IO and can therefore affect performance.
A common use case is to use this for slow query logging.
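For example, a sketch that mirrors the filter expression shown above and
additionally persists statements running longer than five minutes to the log
file::
    SET GLOBAL "stats.jobs_log_persistent_filter" = $$ended - started > '5 minutes'::interval$$;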
.. _stats.operations_log_size:
**stats.operations_log_size**
| *Default:* ``10000``
| *Runtime:* ``yes``
The maximum number of operations records to be kept in the
:ref:`sys.operations_log ` table on each node.
A job consists of one or more individual operations. Operations records are
used for performance analytics. A larger operations log produces more
comprehensive stats, but uses more RAM.
Older operations records are deleted as newer records are added, once the
limit is reached.
Setting this value to ``0`` disables collecting operations information.
.. _stats.operations_log_expiration:
**stats.operations_log_expiration**
| *Default:* ``0s`` (disabled)
| *Runtime:* ``yes``
Entries of :ref:`sys.operations_log ` are cleared by a periodic
job when they are older than the specified expiry time. This setting
overrides :ref:`stats.operations_log_size `. If
the value is set to ``0``, time-based log entry eviction is disabled.
.. NOTE::
    If both the :ref:`stats.operations_log_size <stats.operations_log_size>`
    and :ref:`stats.operations_log_expiration <stats.operations_log_expiration>`
    settings are disabled, no operations information will be collected.
.. _stats.service.interval:
**stats.service.interval**
| *Default:* ``24h``
| *Runtime:* ``yes``
Defines the interval at which the table statistics used to produce
optimal query execution plans are refreshed.
This field expects a time value either as a ``bigint`` or
``double precision`` or alternatively as a string literal with a time suffix
(``ms``, ``s``, ``m``, ``h``, ``d``, ``w``).
If the value provided is ``0`` then the refresh is disabled.
.. CAUTION::
Using a very small value can cause a high load on the cluster.
.. _stats.service.max_bytes_per_sec:
**stats.service.max_bytes_per_sec**
| *Default:* ``40mb``
| *Runtime:* ``yes``
Specifies the maximum number of bytes per second that can be read on data
nodes to collect statistics. If this is set to a positive number, the
underlying I/O operations of the :ref:`ANALYZE ` statement are
throttled.
If the value provided is ``0`` then the throttling is disabled.
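Both settings can be changed at runtime; the values below are merely
illustrative::
    SET GLOBAL "stats.service.interval" = '12h';
    SET GLOBAL "stats.service.max_bytes_per_sec" = '20mb';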
Shard limits
------------
.. _cluster.max_shards_per_node:
**cluster.max_shards_per_node**
| *Default:* 1000
| *Runtime:* ``yes``
The maximum number of open primary and replica shards per node. This setting
is checked on shard creation and doesn't limit shards for individual nodes.
To limit the number of shards for each node, use
:ref:`cluster.routing.allocation.total_shards_per_node
` setting.
The actual limit being checked is ``max_shards_per_node * number of data nodes``.
Any operations that would result in the creation of additional shard copies
that would exceed this limit are rejected.
For example, if you have 999 shards in the current cluster and you try to
create a new table, the create table operation will fail.
Similarly, if a write operation would lead to the creation of a new
partition, the statement will fail.
Each shard on a node requires some memory and increases the size of the
cluster state. Having too many shards per node will impact the cluster's
stability, and it is therefore discouraged to raise the limit above 1000.
.. NOTE::
The maximum number of shards per node setting is also used for the
:ref:`sys-node_checks_max_shards_per_node` check.
.. NOTE::
If a table is created with :ref:`sql-create-table-number-of-replicas`
provided as a range or default ``0-1`` value, the limit check accounts only
for primary shards and not for possible expanded replicas and thus actual
number of all shards can exceed the limit.
.. _conf_usage_data_collector:
Usage data collector
--------------------
The settings of the Usage Data Collector are read-only and cannot be set during
runtime. Please refer to :ref:`usage_data_collector` to get further information
about its usage.
.. _udc.enabled:
**udc.enabled**
| *Default:* ``true``
| *Runtime:* ``no``
``true``: Enables the Usage Data Collector.
``false``: Disables the Usage Data Collector.
.. _udc.initial_delay:
**udc.initial_delay**
| *Default:* ``10m``
| *Runtime:* ``no``
The delay for first ping after start-up.
This field expects a time value either as a ``bigint`` or
``double precision`` or alternatively as a string literal with a time suffix
(``ms``, ``s``, ``m``, ``h``, ``d``, ``w``).
.. _udc.interval:
**udc.interval**
| *Default:* ``24h``
| *Runtime:* ``no``
The interval at which a UDC ping is sent.
This field expects a time value either as a ``bigint`` or
``double precision`` or alternatively as a string literal with a time suffix
(``ms``, ``s``, ``m``, ``h``, ``d``, ``w``).
.. _udc.url:
**udc.url**
| *Default:* ``https://udc.crate.io``
| *Runtime:* ``no``
The URL the ping is sent to.
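As these settings cannot be changed at runtime, they belong in the
configuration file of each node. For example, to opt out of usage data
collection::
    udc.enabled: false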
.. _conf_graceful_stop:
Graceful stop
-------------
By default, when the CrateDB process stops, it simply shuts down, possibly
making some shards unavailable. This leads to a *red* cluster state and lets
queries fail that require the now unavailable shards. In order to
*safely* shut down a CrateDB node, the graceful stop procedure can be used.
The following cluster settings can be used to change the shutdown behaviour of
nodes of the cluster:
.. _cluster.graceful_stop.min_availability:
**cluster.graceful_stop.min_availability**
| *Default:* ``primaries``
| *Runtime:* ``yes``
| *Allowed values:* ``none | primaries | full``
``none``: No minimum data availability is required. The node may shut down
even if records are missing after shutdown.
``primaries``: At least all primary shards need to be available after the node
has shut down. Replicas may be missing.
``full``: All records and all replicas need to be available after the node
has shut down. Data availability is full.
.. NOTE::
This option is ignored if there is only 1 node in a cluster!
.. _cluster.graceful_stop.timeout:
**cluster.graceful_stop.timeout**
| *Default:* ``2h``
| *Runtime:* ``yes``
Defines the maximum waiting time in milliseconds for the :ref:`reallocation
` process to finish. The ``force`` setting will
define the behaviour when the shutdown process runs into this timeout.
The timeout expects a time value either as a ``bigint`` or
``double precision`` or alternatively as a string literal with a time suffix
(``ms``, ``s``, ``m``, ``h``, ``d``, ``w``).
.. _cluster.graceful_stop.force:
**cluster.graceful_stop.force**
| *Default:* ``false``
| *Runtime:* ``yes``
Defines whether ``graceful stop`` should force stopping of the node if it
runs into the timeout which is specified with the
`cluster.graceful_stop.timeout`_ setting.
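All of the above are runtime settings. A sketch that requires full data
availability and shortens the timeout (the values are illustrative) could
look like this::
    SET GLOBAL "cluster.graceful_stop.min_availability" = 'full';
    SET GLOBAL "cluster.graceful_stop.timeout" = '1h';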
.. _conf_bulk_operations:
Bulk operations
---------------
SQL DML Statements involving a huge amount of rows like :ref:`sql-copy-from`,
:ref:`sql-insert` or :ref:`ref-update` can take an enormous amount of time and
resources. The following settings change the behaviour of those queries.
.. _bulk.request_timeout:
**bulk.request_timeout**
| *Default:* ``1m``
| *Runtime:* ``yes``
Defines the timeout of internal shard-based requests involved in the
execution of SQL DML Statements over a huge amount of rows.
.. _conf_discovery:
Discovery
---------
Data sharding and work splitting are at the core of CrateDB. This is how we
manage to execute very fast queries over incredibly large datasets. In order
for multiple CrateDB nodes to work together a cluster needs to be formed. The
process of finding other nodes with which to form a cluster is called
discovery. Discovery runs when a CrateDB node starts or when a node cannot
reach the master node, and continues until a master node is found or a
new master node is elected.
.. _discovery.seed_hosts:
**discovery.seed_hosts**
| *Default:* ``127.0.0.1``
| *Runtime:* ``no``
In order to form a cluster with CrateDB instances running on other nodes a
list of seed master-eligible nodes needs to be provided. This setting should
normally contain the addresses of all the master-eligible nodes in the
cluster. In order to seed the discovery process the nodes listed here must
be live and contactable. This setting contains either an array of hosts or a
comma-delimited string.
By default a node will bind to the available loopback and scan for local
ports between ``4300`` and ``4400`` to try to connect to other nodes running
on the same server. This default behaviour provides local auto clustering
without any configuration.
Each value should be in the form of ``host:port`` or ``host`` (where the port
defaults to the setting ``transport.tcp.port``).
.. NOTE::
IPv6 hosts must be bracketed.
.. _cluster.initial_master_nodes:
**cluster.initial_master_nodes**
| *Default:* ``not set``
| *Runtime:* ``no``
Contains a list of node names, fully qualified hostnames, or IP addresses of
the master-eligible nodes which will vote in the very first election of a
cluster that's bootstrapping for the first time. By default this is not set,
meaning it expects this node to join an already formed cluster.
In development mode, with no discovery settings configured, this step is
performed by the nodes themselves, but this auto-bootstrapping is designed
to aid development and is not safe for production. In production you must
explicitly list the names or IP addresses of the master-eligible nodes whose
votes should be counted in the very first election.
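Since neither setting can be changed at runtime, both go into the
configuration file of each node. A sketch for bootstrapping a three-node
cluster, with placeholder node names and addresses, might look like this::
    discovery.seed_hosts:
      - node-1.example.internal:4300
      - node-2.example.internal:4300
      - node-3.example.internal:4300
    cluster.initial_master_nodes:
      - node-1
      - node-2
      - node-3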
.. _discovery.type:
**discovery.type**
| *Default:* ``zen``
| *Runtime:* ``no``
| *Allowed values:* ``zen | single-node``
Specifies whether CrateDB should form a multiple-node cluster. By default,
CrateDB discovers other nodes when forming a cluster and allows other nodes to
join the cluster later. If ``discovery.type`` is set to ``single-node``,
CrateDB forms a single-node cluster and the node won't join any other
clusters. This can be useful for testing. It is not recommended to use this for
production setups. The ``single-node`` mode also skips `bootstrap checks`_.
.. CAUTION::
If a node is started without any :ref:`initial_master_nodes
` or a :ref:`discovery_type `
set to ``single-node`` (e.g., the default configuration), it will never join
a cluster even if the configuration is subsequently changed.
It is possible to force the node to forget its current cluster state by
using the :ref:`cli-crate-node` CLI tool. However, be aware that this may
result in data loss.
.. _conf_host_discovery:
Unicast host discovery
......................
As described above, CrateDB has built-in support for statically specifying a
list of addresses that will act as the seed nodes in the discovery process
using the `discovery.seed_hosts`_ setting.
CrateDB also supports several other mechanisms of seed node
discovery. Currently there are two other discovery types: via DNS and via the
EC2 API.
When a node starts up with one of these discovery types enabled, it performs a
lookup using the settings for the specified mechanism listed below. The hosts
and ports retrieved from the mechanism will be used to generate a list of
unicast hosts for node discovery.
The same lookup is also performed by all nodes in a cluster whenever the master
is re-elected (see `Cluster Meta Data`).
.. _discovery.seed_providers:
**discovery.seed_providers**
| *Default:* ``not set``
| *Runtime:* ``no``
| *Allowed values:* ``srv``, ``ec2``
See also: `Discovery`_.
.. _conf_dns_discovery:
Discovery via DNS
`````````````````
CrateDB has built-in support for discovery via DNS. To enable DNS discovery the
``discovery.seed_providers`` setting needs to be set to ``srv``.
The order of the unicast hosts is defined by the priority, weight and name of
each host defined in the SRV record. For example::
_crate._srv.example.com. 3600 IN SRV 2 20 4300 crate1.example.com.
_crate._srv.example.com. 3600 IN SRV 1 10 4300 crate2.example.com.
_crate._srv.example.com. 3600 IN SRV 2 10 4300 crate3.example.com.
would result in a list of discovery nodes ordered like::
crate2.example.com:4300, crate3.example.com:4300, crate1.example.com:4300
.. _discovery.srv.query:
**discovery.srv.query**
| *Runtime:* ``no``
The DNS query that is used to look up SRV records, usually in the format
``_service._protocol.fqdn``. If not set, the service discovery will not be
able to look up any SRV records.
.. _discovery.srv.resolver:
**discovery.srv.resolver**
| *Runtime:* ``no``
The hostname or IP of the DNS server used to resolve DNS records. If this is
not set, or the specified hostname/IP is not resolvable, the default (system)
resolver is used.
Optionally a custom port can be specified using the format ``hostname:port``.
.. _conf_ec2_discovery:
Discovery on Amazon EC2
```````````````````````
CrateDB has built-in support for discovery via the EC2 API. To enable EC2
discovery, the ``discovery.seed_providers`` setting needs to be set to
``ec2``.
.. _discovery.ec2.access_key:
**discovery.ec2.access_key**
| *Runtime:* ``no``
The access key ID to identify the API calls.
.. _discovery.ec2.secret_key:
**discovery.ec2.secret_key**
| *Runtime:* ``no``
The secret key to identify the API calls.
The following settings control the discovery:
.. _discovery.ec2.groups:
**discovery.ec2.groups**
| *Runtime:* ``no``
A list of security groups; either by ID or name. Only instances with the
given group will be used for unicast host discovery.
.. _discovery.ec2.any_group:
**discovery.ec2.any_group**
| *Default:* ``true``
| *Runtime:* ``no``
Defines whether all (``false``) or just any (``true``) security group must
be present for the instance to be used for discovery.
.. _discovery.ec2.host_type:
**discovery.ec2.host_type**
| *Default:* ``private_ip``
| *Runtime:* ``no``
| *Allowed values:* ``private_ip``, ``public_ip``, ``private_dns``, ``public_dns``
Defines via which host type to communicate with other instances.
.. _discovery.ec2.availability_zones:
**discovery.ec2.availability_zones**
| *Runtime:* ``no``
A list of availability zones. Only instances within the given availability
zone will be used for unicast host discovery.
.. _discovery.ec2.tag.name:
**discovery.ec2.tag.**
| *Runtime:* ``no``
EC2 instances for discovery can also be filtered by tags using the
``discovery.ec2.tag.`` prefix plus the tag name.
E.g. to filter instances that have the ``environment`` tag with the value
``dev``, your setting will look like: ``discovery.ec2.tag.environment: dev``.
.. _discovery.ec2.endpoint:
**discovery.ec2.endpoint**
| *Runtime:* ``no``
If you have your own compatible implementation of the EC2 API service you can
set the endpoint that should be used.
.. _conf_routing:
Routing allocation
------------------
.. _cluster.routing.allocation.enable:
**cluster.routing.allocation.enable**
| *Default:* ``all``
| *Runtime:* ``yes``
| *Allowed values:* ``all | none | primaries | new_primaries``
``all`` allows all :ref:`shard allocations `, the
cluster can allocate all kinds of shards.
``none`` allows no shard allocations at all. No shard will be moved or
created.
``primaries`` only primaries can be moved or created. This includes existing
primary shards.
``new_primaries`` allows allocations for new primary shards only. This means
that, for example, a newly added node will not allocate any replicas. However,
it is still possible to allocate new primary shards for new indices. Whenever
you want to perform a zero downtime upgrade of your cluster you need to set
this value before gracefully stopping the first node and reset it to ``all``
after starting the last updated node.
.. NOTE::
This allocation setting has no effect on the :ref:`recovery
` of primary shards! Even when
``cluster.routing.allocation.enable`` is set to ``none``, nodes will recover
their unassigned local primary shards immediately after restart.
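For the zero downtime upgrade scenario mentioned above, the setting would be
changed at runtime before gracefully stopping the first node::
    SET GLOBAL "cluster.routing.allocation.enable" = 'new_primaries';
After the last updated node has rejoined the cluster, it would be reset
again::
    SET GLOBAL "cluster.routing.allocation.enable" = 'all';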
.. _cluster.routing.rebalance.enable:
**cluster.routing.rebalance.enable**
| *Default:* ``all``
| *Runtime:* ``yes``
| *Allowed values:* ``all | none | primaries | replicas``
Enables or disables rebalancing for different types of shards:
- ``all`` allows shard rebalancing for all types of shards.
- ``none`` disables shard rebalancing for any types.
- ``primaries`` allows shard rebalancing only for primary shards.
- ``replicas`` allows shard rebalancing only for replica shards.
.. _cluster.routing.allocation.allow_rebalance:
**cluster.routing.allocation.allow_rebalance**
| *Default:* ``indices_all_active``
| *Runtime:* ``yes``
| *Allowed values:* ``always | indices_primary_active | indices_all_active``
Defines when rebalancing will happen, based on the total state of all
index shards in the cluster.
Defaults to ``indices_all_active`` to reduce chatter during initial
:ref:`recovery `.
.. _cluster.routing.allocation.cluster_concurrent_rebalance:
**cluster.routing.allocation.cluster_concurrent_rebalance**
| *Default:* ``2``
| *Runtime:* ``yes``
Defines how many concurrent rebalancing tasks are allowed across all nodes.
.. _cluster.routing.allocation.node_initial_primaries_recoveries:
**cluster.routing.allocation.node_initial_primaries_recoveries**
| *Default:* ``4``
| *Runtime:* ``yes``
Defines how many concurrent primary shard recoveries are allowed on a node.
Since primary recoveries use data that is already on disk (as opposed to
inter-node recoveries), recovery should be fast and so this
setting can be higher than :ref:`node_concurrent_recoveries
`.
.. _cluster.routing.allocation.node_concurrent_recoveries:
**cluster.routing.allocation.node_concurrent_recoveries**
| *Default:* ``2``
| *Runtime:* ``yes``
Defines how many concurrent recoveries are allowed on a node.
.. _conf-routing-allocation-balance:
Shard balancing
...............
You can configure how CrateDB attempts to balance shards across a cluster by
specifying one or more property *weights*. CrateDB will consider a cluster to
be balanced when no further allowed action can bring the weighted properties of
each node closer together.
.. NOTE::
Balancing may be restricted by other settings (e.g., :ref:`attribute-based
` and :ref:`disk-based
` shard allocation).
.. _cluster.routing.allocation.balance.shard:
**cluster.routing.allocation.balance.shard**
| *Default:* ``0.45f``
| *Runtime:* ``yes``
Defines the weight factor for shards :ref:`allocated
` on a node (float). Increasing this value raises the tendency
to equalize the number of shards across all nodes in the cluster.
.. NOTE::
:ref:`cluster.routing.allocation.balance.shard` and
:ref:`cluster.routing.allocation.balance.index` cannot be both set to
``0.0f``.
.. _cluster.routing.allocation.balance.index:
**cluster.routing.allocation.balance.index**
| *Default:* ``0.55f``
| *Runtime:* ``yes``
Defines a weight factor for the number of shards per index :ref:`allocated
` on a specific node (float). Increasing this value
raises the tendency to equalize the number of shards per index across all
nodes in the cluster.
.. NOTE::
:ref:`cluster.routing.allocation.balance.shard` and
:ref:`cluster.routing.allocation.balance.index` cannot be both set to
``0.0f``.
.. _cluster.routing.allocation.balance.threshold:
**cluster.routing.allocation.balance.threshold**
| *Default:* ``1.0f``
| *Runtime:* ``yes``
Minimal optimization value of operations that should be performed
(non-negative float). Increasing this value will cause the cluster to be less
aggressive about optimizing the shard balance.
.. _conf-routing-allocation-attributes:
Attribute-based shard allocation
................................
You can control how shards are allocated to specific nodes by setting
:ref:`custom attributes ` on each node (e.g., server rack
ID or node availability zone). After doing this, you can define
:ref:`cluster-wide attribute awareness ` and
then configure :ref:`cluster-wide attribute filtering
`.
.. SEEALSO::
For an in-depth example of using custom node attributes, check out the
`multi-zone setup how-to guide`_.
.. _conf-routing-allocation-awareness:
Cluster-wide attribute awareness
`````````````````````````````````
To make use of :ref:`custom attributes ` for
:ref:`attribute-based ` :ref:`shard
allocation `, you must configure *cluster-wide
attribute awareness*.
.. _cluster.routing.allocation.awareness.attributes:
**cluster.routing.allocation.awareness.attributes**
| *Runtime:* ``no``
You may define :ref:`custom node attributes ` which can
then be used to do awareness based on the :ref:`allocation
` of a shard and its replicas.
For example, let's say we want to use an attribute named ``rack_id``. We
start two nodes with ``node.attr.rack_id`` set to ``rack_one``. Then we
create a single table with five shards and one replica. The table will be
fully deployed on the current nodes (five shards and one replica each, making
a total of 10 shards).
Now, if we start two more nodes with ``node.attr.rack_id`` set to
``rack_two``, CrateDB will relocate shards to even out the number of shards
across the nodes. However, a shard and its replica will not be allocated to
nodes sharing the same ``rack_id`` value.
The ``awareness.attributes`` setting supports using several values.
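Continuing the ``rack_id`` example, the relevant configuration file entries
on a node in the first rack could look like this sketch; the attribute name
and value are only illustrative::
    node.attr.rack_id: rack_one
    cluster.routing.allocation.awareness.attributes: rack_id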
.. _cluster.routing.allocation.awareness.force.\*.values:
**cluster.routing.allocation.awareness.force.\*.values**
| *Runtime:* ``no``
Attributes on which :ref:`shard allocation ` will be
forced. Here, ``*`` is a placeholder for the awareness attribute, which can
be configured using the :ref:`cluster.routing.allocation.awareness.attributes
` setting.
For example, let's say we configured forced shard allocation for an awareness
attribute named ``zone`` with ``values`` set to ``zone1, zone2``. Start two
nodes with ``node.attr.zone`` set to ``zone1``. Then, create a table with
five shards and one replica. The table will be created, but only five shards
will be allocated (with no replicas). The replicas will only be allocated
when we start one or more nodes with ``node.attr.zone`` set to
``zone2``.
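Expressed as configuration file entries, the ``zone`` example above would
roughly look like this on a node located in ``zone1``::
    node.attr.zone: zone1
    cluster.routing.allocation.awareness.attributes: zone
    cluster.routing.allocation.awareness.force.zone.values: zone1,zone2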
.. _conf-routing-allocation-filtering:
Cluster-wide attribute filtering
````````````````````````````````
To control how CrateDB uses :ref:`custom attributes ` for
:ref:`attribute-based ` :ref:`shard
allocation `, you must configure *cluster-wide
attribute filtering*.
.. NOTE::
CrateDB will retroactively enforce filter definitions. If a new filter
would prevent newly created matching shards from being allocated to a node,
CrateDB would also move any *existing* matching shards away from that node.
.. _cluster.routing.allocation.include.*:
**cluster.routing.allocation.include.***
| *Runtime:* ``yes``
Only :ref:`allocate shards ` on nodes where at least
**one** of the specified values matches the attribute.
For example::
    cluster.routing.allocation.include.zone: "zone1,zone2"
This setting can be overridden for individual tables by the related
:ref:`table setting `.
.. _cluster.routing.allocation.exclude.*:
**cluster.routing.allocation.exclude.***
| *Runtime:* ``yes``
Only :ref:`allocate shards ` on nodes where **none**
of the specified values matches the attribute.
For example::
cluster.routing.allocation.exclude.zone: "zone1"
This setting can be overridden for individual tables by the related
:ref:`table setting `.
Therefore, if a node is excluded from shard allocation by this cluster level
setting, the node can still allocate shards if the table setting allows it.
.. _cluster.routing.allocation.require.*:
**cluster.routing.allocation.require.***
| *Runtime:* ``yes``
Used to specify a number of rules, **all** of which must match for a node
in order to :ref:`allocate a shard ` on it.
This setting can be overridden for individual tables by the related
:ref:`table setting `.
.. _conf-routing-allocation-disk:
Disk-based shard allocation
...........................
.. _cluster.routing.allocation.disk.threshold_enabled:
**cluster.routing.allocation.disk.threshold_enabled**
| *Default:* ``true``
| *Runtime:* ``yes``
Prevents :ref:`shard allocation ` on nodes depending
on their disk usage.
.. _cluster.routing.allocation.disk.watermark.low:
**cluster.routing.allocation.disk.watermark.low**
| *Default:* ``85%``
| *Runtime:* ``yes``
Defines the lower disk threshold limit for :ref:`shard allocations
`. New shards will not be allocated on nodes with
disk usage greater than this value. It can also be set to an absolute byte
value (e.g. ``500mb``) to prevent the cluster from allocating new shards
on nodes with less free disk space than this value.
.. _cluster.routing.allocation.disk.watermark.high:
**cluster.routing.allocation.disk.watermark.high**
| *Default:* ``90%``
| *Runtime:* ``yes``
Defines the higher disk threshold limit for :ref:`shard allocations
`. The cluster will attempt to relocate existing
shards to another node if the disk usage on a node rises above this value. It
can also be set to an absolute byte value (e.g. ``500mb``) to relocate
shards from nodes with less free disk space than this value.
.. _cluster.routing.allocation.disk.watermark.flood_stage:
**cluster.routing.allocation.disk.watermark.flood_stage**
| *Default:* ``95%``
| *Runtime:* ``yes``
Defines the threshold on which CrateDB enforces a read-only block on every
index that has at least one :ref:`shard allocated `
on a node with at least one disk exceeding the flood stage.
.. NOTE::
    The :ref:`sql-create-table-blocks-read-only-allow-delete` setting is
    automatically reset to ``FALSE`` for affected tables once disk space is
    freed and usage falls below the threshold.
``cluster.routing.allocation.disk.watermark`` settings may be defined as
percentages or bytes values. However, it is not possible to mix the value
types.
By default, the cluster will retrieve information about the disk usage of the
nodes every 30 seconds. This can also be changed by setting the
`cluster.info.update.interval`_ setting.
.. NOTE::
The watermark settings are also used for the
:ref:`sys-node_checks_watermark_low` and :ref:`sys-node_checks_watermark_high` node
check.
Setting ``cluster.routing.allocation.disk.threshold_enabled`` to false will
disable the allocation decider, but the node checks will still be active and
warn users about running low on disk space.
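All three watermark settings are runtime settings and must share the same
value type. A sketch using percentages throughout (the figures are
illustrative)::
    SET GLOBAL "cluster.routing.allocation.disk.watermark.low" = '80%';
    SET GLOBAL "cluster.routing.allocation.disk.watermark.high" = '85%';
    SET GLOBAL "cluster.routing.allocation.disk.watermark.flood_stage" = '90%';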
.. _cluster.routing.allocation.total_shards_per_node:
**cluster.routing.allocation.total_shards_per_node**
| *Default*: ``-1``
| *Runtime*: ``yes``
Limits the number of primary and replica shards that can be :ref:`allocated
` per node. A value of ``-1`` means unlimited.
Setting this to ``1000``, for example, will prevent CrateDB from assigning
more than 1000 shards per node. A node with 1000 shards would be excluded
from allocation decisions and CrateDB would attempt to allocate shards to
other nodes, or leave shards unassigned if no suitable node can be found.
.. NOTE::
If a table is created with :ref:`sql-create-table-number-of-replicas`
provided as a range or default ``0-1`` value, the limit check accounts only
for primary shards and not for possible expanded replicas and thus actual
number of all shards can exceed the limit.
.. _indices.recovery:
Recovery
--------
.. _indices.recovery.max_bytes_per_sec:
**indices.recovery.max_bytes_per_sec**
| *Default:* ``40mb``
| *Runtime:* ``yes``
Specifies the maximum number of bytes that can be transferred per second
during :ref:`shard recovery `. Limiting can be
disabled by setting it to ``0``. This setting allows you to control the
network usage of the recovery process. Higher values may result in higher
network utilization, but also a faster recovery process.
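For example, to raise the recovery throughput ceiling at runtime (the value
is only illustrative)::
    SET GLOBAL "indices.recovery.max_bytes_per_sec" = '100mb';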
.. _indices.recovery.retry_delay_state_sync:
**indices.recovery.retry_delay_state_sync**
| *Default:* ``500ms``
| *Runtime:* ``yes``
Defines the time to wait after an issue caused by cluster state syncing
before retrying to :ref:`recover `.
.. _indices.recovery.retry_delay_network:
**indices.recovery.retry_delay_network**
| *Default:* ``5s``
| *Runtime:* ``yes``
Defines the time to wait after an issue caused by the network before retrying
to :ref:`recover `.
.. _indices.recovery.internal_action_timeout:
**indices.recovery.internal_action_timeout**
| *Default:* ``15m``
| *Runtime:* ``yes``
Defines the timeout for internal requests made as part of the :ref:`recovery
`.
.. _indices.recovery.internal_action_long_timeout:
**indices.recovery.internal_action_long_timeout**
| *Default:* ``30m``
| *Runtime:* ``yes``
Defines the timeout for internal requests made as part of the :ref:`recovery
` that are expected to take a long time. Defaults to
twice :ref:`internal_action_timeout
`.
.. _indices.recovery.recovery_activity_timeout:
**indices.recovery.recovery_activity_timeout**
| *Default:* ``30m``
| *Runtime:* ``yes``
:ref:`Recoveries ` that don't show any activity for
more than this interval will fail. Defaults to
:ref:`internal_action_long_timeout
`.
.. _indices.recovery.max_concurrent_file_chunks:
**indices.recovery.max_concurrent_file_chunks**
| *Default:* ``2``
| *Runtime:* ``yes``
Controls the number of file chunk requests that can be sent in parallel per
:ref:`recovery `. As multiple recoveries are already
running in parallel, controlled by
:ref:`cluster.routing.allocation.node_concurrent_recoveries
`, increasing this
expert-level setting might only help in situations where peer recovery of a
single shard is not reaching the total inbound and outbound peer recovery
traffic as configured by :ref:`indices.recovery.max_bytes_per_sec
`, but is CPU-bound instead, typically
when using transport-level security or compression.
Memory management
-----------------
.. _memory.allocation.type:
**memory.allocation.type**
| *Default:* ``on-heap``
| *Runtime:* ``yes``
Supported values are ``on-heap`` and ``off-heap``. This influences whether
memory is preferably allocated in the heap space or in the off-heap/direct
memory region. Setting this to ``off-heap`` doesn't imply that the heap won't
be used anymore. Most allocations will still happen in the heap space, but
some operations will be allowed to utilize off-heap buffers.
.. warning::
Using ``off-heap`` is considered **experimental**.
.. _memory.operation_limit:
**memory.operation_limit**
| *Default:* ``0``
| *Runtime:* ``yes``
Default value for the :ref:`memory.operation_limit
session setting `. Changing the cluster
setting will only affect new sessions, not existing sessions.
Example statement to update the default value to 1 GB, i.e. 1073741824 bytes::
cr> SET GLOBAL "memory.operation_limit" = 1073741824;
SET OK, 1 row affected (... sec)
Operations that hit this memory limit will trigger a ``CircuitBreakingException``
which the application can handle to inform the user that the particular query
consumed too much memory.
Query circuit breaker
---------------------
The query circuit breaker keeps track of the memory used during the
execution of a query. If a query consumes too much memory, or if the cluster is
already near its memory limit, it terminates the query to ensure the cluster
keeps working.
.. _indices.breaker.query.limit:
**indices.breaker.query.limit**
| *Default:* ``60%``
| *Runtime:* ``yes``
Specifies the limit for the query breaker. Provided values can either be
absolute values (interpreted as a number of bytes), byte sizes (like ``1mb``)
or percentage of the heap size (like ``12%``). A value of ``-1`` disables
breaking the circuit while still accounting memory usage.
Request circuit breaker
-----------------------
The request circuit breaker allows an estimation of required heap memory per
request. If a single request exceeds the specified amount of memory, an
exception is raised.
.. _indices.breaker.request.limit:
**indices.breaker.request.limit**
| *Default:* ``60%``
| *Runtime:* ``yes``
Specifies the JVM heap limit for the request circuit breaker.
Accounting circuit breaker
--------------------------
Tracks memory that is held independently of queries, for example the
memory used by Lucene for segments.
.. _indices.breaker.accounting.limit:
**indices.breaker.accounting.limit**
| *Default:* ``100%``
| *Runtime:* ``yes``
Specifies the JVM heap limit for the accounting circuit breaker.
.. CAUTION::
This setting is deprecated and will be removed in a future release.
.. _stats.breaker.log:
Stats circuit breakers
----------------------
Settings that control the behaviour of the stats circuit breaker. There are two
breakers in place, one for the jobs log and one for the operations log. For
each of them, the breaker limit can be set.
.. _stats.breaker.log.jobs.limit:
**stats.breaker.log.jobs.limit**
| *Default:* ``5%``
| *Runtime:* ``yes``
The maximum memory that can be used from :ref:`CRATE_HEAP_SIZE
` for the :ref:`sys.jobs_log ` table on each
node.
When this memory limit is reached, the job log circuit breaker logs an error
message and clears the :ref:`sys.jobs_log ` table completely.
.. _stats.breaker.log.operations.limit:
**stats.breaker.log.operations.limit**
| *Default:* ``5%``
| *Runtime:* ``yes``
The maximum memory that can be used from :ref:`CRATE_HEAP_SIZE
` for the :ref:`sys.operations_log ` table on
each node.
When this memory limit is reached, the operations log circuit breaker logs an
error message and clears the :ref:`sys.operations_log ` table
completely.
Total circuit breaker
---------------------
.. _indices.breaker.total.limit:
**indices.breaker.total.limit**
| *Default:* ``95%``
| *Runtime:* ``yes``
The maximum memory that can be used by all aforementioned circuit breakers
together.
Even if an individual circuit breaker doesn't hit its individual limit,
queries might still get aborted if several circuit breakers together would
hit the memory limit configured in ``indices.breaker.total.limit``.
Thread pools
------------
Every node uses a number of thread pools to schedule operations; each pool is
dedicated to specific operations. The most important pools are:
* ``write``: Used for write operations like index, update or delete. The ``type``
defaults to ``fixed``.
* ``search``: Used for read operations like ``SELECT`` statements. The ``type``
defaults to ``fixed``.
* ``get``: Used for some specific read operations. For example on tables like
``sys.shards`` or ``sys.nodes``. The ``type`` defaults to ``fixed``.
* ``refresh``: Used for :ref:`refresh operations `. The ``type``
defaults to ``scaling``.
* ``generic``: For internal tasks like cluster state management. The ``type``
defaults to ``scaling``.
* ``logical_replication``: For logical replication operations. The ``type``
  defaults to ``fixed``.
In addition to those pools, there are also ``netty`` worker threads which are
used to process network requests and many CPU bound actions like query analysis
and optimization.
The thread pool settings are expert settings which you generally shouldn't need
to touch. They are dynamically sized depending on the number of available CPU
cores. If you're running multiple services on the same machine, you should
instead change the :ref:`processors` setting.
Increasing the number of threads for a pool can result in degraded performance
due to increased context switching and higher memory footprint.
If you observe idle CPU cores, increasing the thread pool size is rarely the
right course of action; instead, it can be a sign that:
- Operations are blocked on disk IO. Increasing the thread pool size could
result in more operations getting queued and blocked on disk IO without
increasing throughput but decreasing it due to more memory pressure and
additional garbage collection activity.
- Individual operations are running single-threaded. Not all tasks required to
process a SQL statement can be further subdivided and processed in parallel,
but many operations default to use one thread per shard. Because of this, you
can consider increasing the number of shards of a table to increase the
parallelism of a single individual statement and increase CPU core
utilization. As an alternative you can try increasing the concurrency on the
client side, to have CrateDB process more SQL statements in parallel.
.. _thread_pool..type:
**thread_pool.<name>.type**
| *Runtime:* ``no``
| *Allowed values:* ``fixed | scaling``
``fixed`` holds a fixed number of threads to handle requests. It also has a
queue for pending requests if no threads are available.
``scaling`` ensures that a thread pool holds a dynamic number of threads that
are proportional to the workload.
Settings for fixed thread pools
...............................
If the type of a thread pool is set to ``fixed`` there are a few optional
settings.
.. _thread_pool..size:
**thread_pool.<name>.size**
| *Runtime:* ``no``
Number of threads. The default size of the different thread pools depends on
the number of available CPU cores.
.. _thread_pool..queue_size:
**thread_pool.<name>.queue_size**
| *Default write:* ``200``
| *Default search:* ``1000``
| *Default get:* ``100``
| *Runtime:* ``no``
Size of the queue for pending requests. A value of ``-1`` sets it to
unbounded.
If you have burst workloads followed by periods of inactivity it can make
sense to increase the ``queue_size`` to allow a node to buffer more queries
before rejecting new operations. But be aware, increasing the queue size if
you have sustained workloads will only increase the system's memory
consumption and likely degrade performance.
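Because these are not runtime settings, they belong in the configuration
file, with the pool name substituted for the placeholder. A sketch for the
``write`` pool, using an illustrative value::
    thread_pool.write.queue_size: 500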
.. _overload_protection:
Overload Protection
-------------------
Overload protection settings control how many resources operations like
``INSERT INTO FROM QUERY`` or ``COPY`` can use.
The values here serve as a starting point for an algorithm that dynamically
adapts the effective concurrency limit based on the round-trip time of
requests. Whenever one of these settings is updated, the previously calculated
effective concurrency is reset.
Changing these settings only affects new operations; operations that are
already running will continue with the previous settings.
.. _overload_protection.dml.initial_concurrency:
**overload_protection.dml.initial_concurrency**
| *Default:* ``5``
| *Runtime:* ``yes``
The initial number of concurrent operations allowed per target node.
.. _overload_protection.dml.min_concurrency:
**overload_protection.dml.min_concurrency**
| *Default:* ``1``
| *Runtime:* ``yes``
The minimum number of concurrent operations allowed per target node.
.. _overload_protection.dml.max_concurrency:
**overload_protection.dml.max_concurrency**
| *Default:* ``100``
| *Runtime:* ``yes``
The maximum number of concurrent operations allowed per target node.
.. _overload_protection.dml.queue_size:
**overload_protection.dml.queue_size**
| *Default:* ``25``
| *Runtime:* ``yes``
How many operations are allowed to queue up.
Metadata
--------
.. _cluster.info.update.interval:
**cluster.info.update.interval**
| *Default:* ``30s``
| *Runtime:* ``yes``
Defines how often the cluster collects metadata information (e.g. disk usage)
if no concrete event is triggered.
.. _metadata_gateway:
Metadata gateway
................
The following settings can be used to configure the behavior of the
:ref:`metadata gateway `.
.. _gateway.expected_nodes:
**gateway.expected_nodes**
| *Default:* ``-1``
| *Runtime:* ``no``
The setting ``gateway.expected_nodes`` defines the total number of nodes
expected in the cluster. It is evaluated together with
``gateway.recover_after_nodes``
to decide if the cluster can start with recovery.
.. CAUTION::
This setting is deprecated and will be removed in a future version.
Use `gateway.expected_data_nodes`_ instead.
.. _gateway.expected_data_nodes:
**gateway.expected_data_nodes**
| *Default:* ``-1``
| *Runtime:* ``no``
The setting ``gateway.expected_data_nodes`` defines the total number of
data nodes expected in the cluster. It is evaluated together with
``gateway.recover_after_data_nodes``
to decide if the cluster can start with recovery.
.. _gateway.recover_after_time:
**gateway.recover_after_time**
| *Default:* ``5m``
| *Runtime:* ``no``
The ``gateway.recover_after_time`` setting defines the time to wait for
the number of nodes set in ``gateway.expected_data_nodes`` (or
``gateway.expected_nodes``) to become available, before starting the
recovery, once the number of nodes defined in
``gateway.recover_after_data_nodes`` (or ``gateway.recover_after_nodes``)
has already been reached.
This setting is ignored if ``gateway.expected_data_nodes`` or
``gateway.expected_nodes`` are set to 0 or 1.
It also has no effect if ``gateway.recover_after_data_nodes`` is set equal
to ``gateway.expected_data_nodes`` (or ``gateway.recover_after_nodes`` is
set equal to ``gateway.expected_nodes``).
The cluster also proceeds to immediate recovery, and the default 5 minutes
waiting time does not apply, if neither this setting nor ``expected_nodes`` and
``expected_data_nodes`` are explicitly set.
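A configuration file sketch for a cluster with three data nodes that starts
recovery once two of them have joined, or after five minutes at the latest
(the values are illustrative)::
    gateway.expected_data_nodes: 3
    gateway.recover_after_data_nodes: 2
    gateway.recover_after_time: 5m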
.. _gateway.recover_after_nodes:
**gateway.recover_after_nodes**
| *Default:* ``-1``
| *Runtime:* ``no``
The ``gateway.recover_after_nodes`` setting defines the number of nodes that
need to join the cluster before the cluster state recovery can start.
If this setting is ``-1`` and ``gateway.expected_nodes`` is set, all nodes
will need to be started before the cluster state recovery can start.
Please note that proceeding with recovery when not all nodes are available
could trigger the promotion of shards and the creation of new replicas,
generating disk and network load, which may be unnecessary. You can use a
combination of this setting with ``gateway.recover_after_time`` to
mitigate this risk.
.. CAUTION::
This setting is deprecated and will be removed in CrateDB 5.0.
Use `gateway.recover_after_data_nodes`_ instead.
.. _gateway.recover_after_data_nodes:
**gateway.recover_after_data_nodes**
| *Default:* ``-1``
| *Runtime:* ``no``
The ``gateway.recover_after_data_nodes`` setting defines the number of data
nodes that need to be started before the cluster state recovery can start.
If this setting is ``-1`` and ``gateway.expected_data_nodes`` is set, all
data nodes will need to be started before the cluster state recovery can
start.
Please note that proceeding with recovery when not all data nodes are
available could trigger the promotion of shards and the creation of new
replicas, generating disk and network load, which may be unnecessary. You
can use a combination of this setting with ``gateway.recover_after_time``
to mitigate this risk.
Logical Replication
-------------------
The replication process can be configured with the following settings. The
settings are dynamic and can be changed at runtime.
.. _replication.logical.ops_batch_size:
**replication.logical.ops_batch_size**
| *Default:* ``50000``
| *Min value:* ``16``
| *Runtime:* ``yes``
Maximum number of operations to replicate from the publisher cluster per poll.
Represents a number to advance a sequence.
.. _replication.logical.reads_poll_duration:
**replication.logical.reads_poll_duration**
| *Default:* ``50``
| *Runtime:* ``yes``
The maximum time (in milliseconds) to wait for changes per poll operation. When
a subscriber makes a request to a publisher, it has
``reads_poll_duration`` milliseconds to harvest changes from the publisher.
.. _replication.logical.recovery.chunk_size:
**replication.logical.recovery.chunk_size**
| *Default:* ``1MB``
| *Min value:* ``1KB``
| *Max value:* ``1GB``
| *Runtime:* ``yes``
Chunk size to transfer files during the initial recovery of a replicating table.
.. _replication.logical.recovery.max_concurrent_file_chunks:
**replication.logical.recovery.max_concurrent_file_chunks**
| *Default:* ``2``
| *Min value:* ``1``
| *Max value:* ``5``
| *Runtime:* ``yes``
Controls the number of file chunk requests that can be sent in parallel between
clusters during the recovery.
.. hide:
cr> RESET GLOBAL "stats.jobs_log_filter"
RESET OK, 1 row affected (... sec)
cr> RESET GLOBAL "memory.operation_limit"
RESET OK, 1 row affected (... sec)
.. _bootstrap checks: https://crate.io/docs/crate/howtos/en/latest/admin/bootstrap-checks.html
.. _multi-zone setup how-to guide: https://crate.io/docs/crate/howtos/en/latest/clustering/multi-zone-setup.html
.. highlight:: sh
.. vale off
.. _conf-node-settings:
======================
Node-specific settings
======================
.. rubric:: Table of contents
.. contents::
:local:
Basics
======
.. _cluster.name:
**cluster.name**
| *Default:* ``crate``
| *Runtime:* ``no``
The name of the CrateDB cluster the node should join.
.. _node.name:
**node.name**
| *Runtime:* ``no``
The name of the node. If no name is configured a random one will be
generated.
.. NOTE::
Node names must be unique in a CrateDB cluster.
.. _node.store_allow_mmap:
**node.store.allow_mmap**
| *Default:* ``true``
| *Runtime:* ``no``
The setting indicates whether or not memory-mapping is allowed.
Node types
==========
CrateDB supports different types of nodes.
The following settings can be used to differentiate nodes upon startup:
.. _node.master:
**node.master**
| *Default:* ``true``
| *Runtime:* ``no``
Whether or not this node is able to get elected as *master* node in the
cluster.
.. _node.data:
**node.data**
| *Default:* ``true``
| *Runtime:* ``no``
Whether or not this node will store data.
Using different combinations of these two settings, you can create four
different types of node, each differentiated by the kinds of load it handles.
Tabulating the truth values for ``node.master`` and ``node.data`` produces a
truth table outlining the four node types:
+---------------+-----------------------------+------------------------------+
|               | **Master**                  | **No master**                |
+---------------+-----------------------------+------------------------------+
| **Data**      | Handles all loads.          | Handles client requests and  |
|               |                             | query execution.             |
+---------------+-----------------------------+------------------------------+
| **No data**   | Handles cluster management. | Handles client requests.     |
+---------------+-----------------------------+------------------------------+
Nodes marked as ``node.master`` will only handle cluster management if they are
elected as the cluster master. All other loads are shared equally.
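For example, a dedicated master-eligible node that stores no data could be
sketched in ``crate.yml`` as follows; a dedicated data node would simply
invert both values:
.. code-block:: yaml

   # Dedicated master-eligible node: may be elected master, stores no data.
   node.master: true
   node.data: false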
General
=======
.. _node.sql.read_only:
**node.sql.read_only**
| *Default:* ``false``
| *Runtime:* ``no``
If set to ``true``, the node will only allow SQL statements which result in
read operations.
.. _statement_timeout:
**statement_timeout**
| *Default:* ``0``
| *Runtime:* ``yes``
The maximum duration of any statement before it gets cancelled.
This value is used as the default value for the :ref:`statement_timeout session
setting `.
If set to ``0``, queries are allowed to run indefinitely and are not cancelled
automatically.
.. NOTE::
Updating this setting won't affect existing sessions, it will only take
effect for new sessions.
Networking
==========
.. _conf_hosts:
Hosts
-----
.. _network.host:
**network.host**
| *Default:* ``_local_``
| *Runtime:* ``no``
The IP address CrateDB will bind itself to. This setting sets both the
`network.bind_host`_ and `network.publish_host`_ values.
.. _network.bind_host:
**network.bind_host**
| *Default:* ``_local_``
| *Runtime:* ``no``
This setting determines the address to which CrateDB should bind itself.
.. _network.publish_host:
**network.publish_host**
| *Default:* ``_local_``
| *Runtime:* ``no``
This setting is used by a CrateDB node to publish its own address to the rest
of the cluster.
.. TIP::
Apart from IPv4 and IPv6 addresses there are some special values that can
be used for all above settings:
========================= =================================================
``_local_``                Any loopback addresses on the system, for example
                           ``127.0.0.1``.
``_site_``                 Any site-local addresses on the system, for
                           example ``192.168.0.1``.
``_global_``               Any globally-scoped addresses on the system, for
                           example ``8.8.8.8``.
``_[INTERFACE]_``          Addresses of a network interface, for example
                           ``_en0_``.
========================= =================================================
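For example, a node could bind to its site-local addresses while publishing
the address of one specific interface (``_en0_`` is just an illustrative
interface name):
.. code-block:: yaml

   network.bind_host: _site_
   network.publish_host: _en0_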
.. _conf_ports:
Ports
-----
.. _http.port:
**http.port**
| *Runtime:* ``no``
This defines the TCP port range to which the CrateDB HTTP service will be
bound. It defaults to ``4200-4300``; the first free port in this range is
used. If this is set to an integer value, it is considered an explicit single
port.
The HTTP protocol is used for the REST endpoint which is used by all clients
except the Java client.
.. _http.publish_port:
**http.publish_port**
| *Runtime:* ``no``
The port HTTP clients should use to communicate with the node. It is
necessary to define this setting if the bound HTTP port (``http.port``) of
the node is not directly reachable from outside, e.g. running it behind a
firewall or inside a Docker container.
.. _transport.tcp.port:
**transport.tcp.port**
| *Runtime:* ``no``
This defines the TCP port range to which the CrateDB transport service will
be bound. It defaults to ``4300-4400``; the first free port in this range is
used. If this is set to an integer value, it is considered an explicit single
port.
The transport protocol is used for internal node-to-node communication.
.. _transport.publish_port:
**transport.publish_port**
| *Runtime:* ``no``
The port that the node publishes to the cluster for its own discovery. It is
necessary to define this setting when the bound transport port
(``transport.tcp.port``) of the node is not directly reachable from outside,
e.g. running it behind a firewall or inside a Docker container.
.. _psql.port:
**psql.port**
| *Runtime:* ``no``
This defines the TCP port range to which the CrateDB PostgreSQL service will
be bound. It defaults to ``5432-5532``; the first free port in this range is
used. If this is set to an integer value, it is considered an explicit single
port.
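If the default port ranges are not desirable, each service can be pinned to a
single explicit port, for example:
.. code-block:: yaml

   http.port: 4200
   psql.port: 5432
   transport.tcp.port: 4300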
Advanced TCP settings
---------------------
Any interface that uses TCP (Postgres wire, HTTP & Transport protocols) shares
the following settings:
.. _network.tcp.no_delay:
**network.tcp.no_delay**
| *Default:* ``true``
| *Runtime:* ``no``
Enable or disable `Nagle's algorithm`_ for buffering TCP packets. Buffering
is disabled by default.
.. _network.tcp.keep_alive:
**network.tcp.keep_alive**
| *Default:* ``true``
| *Runtime:* ``no``
Configures the ``SO_KEEPALIVE`` option for sockets, which determines
whether they send TCP keepalive probes.
.. _network.tcp.reuse_address:
**network.tcp.reuse_address**
| *Default:* ``true`` on non-windows machines and ``false`` otherwise
| *Runtime:* ``no``
Configures the ``SO_REUSEADDR`` option for sockets, which determines
whether they should reuse the address.
.. _network.tcp.send_buffer_size:
**network.tcp.send_buffer_size**
| *Default:* ``-1``
| *Runtime:* ``no``
The size of the TCP send buffer (`SO_SNDBUF`_ socket option).
By default not explicitly set.
.. _network.tcp.receive_buffer_size:
**network.tcp.receive_buffer_size**
| *Default:* ``-1``
| *Runtime:* ``no``
The size of the TCP receive buffer (`SO_RCVBUF`_ socket option).
By default not explicitly set.
.. NOTE::
Each setting in this section has a counterpart for HTTP and transport.
To provide a protocol-specific setting, remove the ``network`` prefix and use
either ``http`` or ``transport`` instead. For example, ``no_delay`` can be
configured as ``http.tcp.no_delay`` and ``transport.tcp.no_delay``. Please
note that the PostgreSQL interface takes its settings from transport.
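For example, a generic default can be combined with a protocol-specific
override (the buffer size below is an example value):
.. code-block:: yaml

   # Applies to all TCP-based protocols.
   network.tcp.keep_alive: true
   # Overrides the send buffer size for the HTTP service only.
   http.tcp.send_buffer_size: 1mb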
Transport settings
------------------
.. _transport.connect_timeout:
**transport.connect_timeout**
| *Default:* ``30s``
| *Runtime:* ``no``
The connect timeout for initiating a new connection.
.. _transport.compress:
**transport.compress**
| *Default:* ``false``
| *Runtime:* ``no``
Set to `true` to enable compression (DEFLATE) between all nodes.
.. _transport.ping_schedule:
**transport.ping_schedule**
| *Default:* ``-1``
| *Runtime:* ``no``
Schedule a regular application-level ping message to ensure that transport
connections between nodes are kept alive. Defaults to `-1` (disabled). It is
preferable to correctly configure TCP keep-alives instead of using this
feature, because TCP keep-alives apply to all kinds of long-lived connections
and not just to transport connections.
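A sketch of these transport options in ``crate.yml``, using illustrative
values:
.. code-block:: yaml

   transport.connect_timeout: 30s
   transport.compress: true
   # Keep disabled (-1) when TCP keep-alives are configured properly.
   transport.ping_schedule: -1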
Paths
=====
.. NOTE::
Relative paths are relative to :ref:`CRATE_HOME `.
Absolute paths override this behavior.
.. _path.conf:
**path.conf**
| *Default:* ``config``
| *Runtime:* ``no``
Filesystem path to the directory containing the configuration files
``crate.yml`` and ``log4j2.properties``.
.. _path.data:
**path.data**
| *Default:* ``data``
| *Runtime:* ``no``
Filesystem path to the directory where this CrateDB node stores its data
(table data and cluster metadata).
Multiple paths can be set by using a comma separated list and each of these
paths will hold full shards (instead of striping data across them). For
example:
.. code-block:: yaml
path.data: /path/to/data1,/path/to/data2
When CrateDB finds striped shards at the provided locations (from CrateDB
<0.55.0), these shards will be migrated automatically on startup.
.. _path.logs:
**path.logs**
| *Default:* ``logs``
| *Runtime:* ``no``
Filesystem path to a directory where log files should be stored.
Can be used as a variable inside ``log4j2.properties``.
For example:
.. code-block:: yaml

   appender:
     file:
       file: ${path.logs}/${cluster.name}.log
.. _path.repo:
**path.repo**
| *Runtime:* ``no``
A list of filesystem or UNC paths where repositories of type
:ref:`sql-create-repo-fs` may be stored.
Without this setting a CrateDB user could write snapshot files to any
directory that is writable by the CrateDB process. To safeguard against this
security issue, the possible paths have to be whitelisted here.
See also :ref:`location ` setting of repository
type ``fs``.
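For example, ``fs`` repositories could be restricted to two dedicated mount
points (hypothetical paths):
.. code-block:: yaml

   path.repo:
     - /mnt/backups
     - /mnt/snapshots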
.. SEEALSO::
:ref:`blobs.path `
Plug-ins
========
.. _plugin.mandatory:
**plugin.mandatory**
| *Runtime:* ``no``
A list of plug-ins that are required for a node to start up.
If any plug-in listed here is missing, the CrateDB node will fail to start.
CPU
===
.. _processors:
**processors**
| *Runtime:* ``no``
The number of processors is used to appropriately size the thread pools that
CrateDB uses. If not set explicitly, CrateDB will infer the number
from the processors available on the system.
In environments where the amount of CPU can be restricted (like Docker) or when
multiple CrateDB instances are running on the same hardware, the inferred
number might be too high. In such a case, it is recommended to set the value
explicitly.
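For example, when CrateDB runs in a container limited to four cores, the value
can be pinned explicitly:
.. code-block:: yaml

   processors: 4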
Memory
======
.. _bootstrap.memory_lock:
**bootstrap.memory_lock**
| *Default:* ``false``
| *Runtime:* ``no``
CrateDB performs poorly when the JVM starts swapping: you should ensure that
it *never* swaps. If set to ``true``, CrateDB will use the ``mlockall``
system call on startup to ensure that the memory pages of the CrateDB process
are locked into RAM.
Garbage collection
==================
CrateDB logs if JVM garbage collection on different memory pools takes too
long. The following settings can be used to adjust these timeouts:
.. _monitor.jvm.gc.collector.young.warn:
**monitor.jvm.gc.collector.young.warn**
| *Default:* ``1000ms``
| *Runtime:* ``no``
CrateDB will log a warning message if it takes more than the configured
timespan to collect the *Eden Space* (heap).
.. _monitor.jvm.gc.collector.young.info:
**monitor.jvm.gc.collector.young.info**
| *Default:* ``700ms``
| *Runtime:* ``no``
CrateDB will log an info message if it takes more than the configured
timespan to collect the *Eden Space* (heap).
.. _monitor.jvm.gc.collector.young.debug:
**monitor.jvm.gc.collector.young.debug**
| *Default:* ``400ms``
| *Runtime:* ``no``
CrateDB will log a debug message if it takes more than the configured
timespan to collect the *Eden Space* (heap).
.. _monitor.jvm.gc.collector.old.warn:
**monitor.jvm.gc.collector.old.warn**
| *Default:* ``10000ms``
| *Runtime:* ``no``
CrateDB will log a warning message if it takes more than the configured
timespan to collect the *Old Gen* / *Tenured Gen* (heap).
.. _monitor.jvm.gc.collector.old.info:
**monitor.jvm.gc.collector.old.info**
| *Default:* ``5000ms``
| *Runtime:* ``no``
CrateDB will log an info message if it takes more than the configured
timespan to collect the *Old Gen* / *Tenured Gen* (heap).
.. _monitor.jvm.gc.collector.old.debug:
**monitor.jvm.gc.collector.old.debug**
| *Default:* ``2000ms``
| *Runtime:* ``no``
CrateDB will log a debug message if it takes more than the configured
timespan to collect the *Old Gen* / *Tenured Gen* (heap).
Authentication
==============
.. _host_based_auth:
Trust authentication
--------------------
.. _auth.trust.http_default_user:
**auth.trust.http_default_user**
| *Default:* ``crate``
| *Runtime:* ``no``
The default user that should be used for authentication when clients connect
to CrateDB via the HTTP protocol and do not specify a user via the
``Authorization`` request header.
.. _auth.trust.http_support_x_real_ip:
**auth.trust.http_support_x_real_ip**
| *Default:* ``false``
| *Runtime:* ``no``
If enabled, the HTTP transport will trust the ``X-Real-IP`` header sent by
the client to determine the client's IP address. This is useful when CrateDB
is running behind a reverse proxy or load-balancer. For improved security,
any ``_local_`` IP address (``127.0.0.1`` and ``::1``) defined in this header
will be ignored.
.. warning::
Enabling this setting can be a security risk, as it allows clients to
impersonate other clients by sending a fake ``X-Real-IP`` header.
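A sketch of both trust-related options; ``webui`` is a hypothetical user name:
.. code-block:: yaml

   auth.trust.http_default_user: webui
   # Only enable behind a trusted reverse proxy (see the warning above).
   auth.trust.http_support_x_real_ip: true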
Host-based authentication
-------------------------
Authentication settings (``auth.host_based.*``) are node settings, which means
that their values apply only to the node where they are applied and different
nodes may have different authentication settings.
.. _auth.host_based.enabled:
**auth.host_based.enabled**
| *Default:* ``false``
| *Runtime:* ``no``
Setting to enable or disable Host Based Authentication (HBA). It is disabled
by default.
.. _jwt_defaults:
JWT Based Authentication
........................
Default global settings for the :ref:`JWT authentication `.
.. _auth.host_based.jwt.iss:
**auth.host_based.jwt.iss**
| *Runtime:* ``no``
Default value for the ``iss`` :ref:`JWT property `.
If ``iss`` is set, user specific JWT properties are ignored.
.. _auth.host_based.jwt.aud:
**auth.host_based.jwt.aud**
| *Runtime:* ``no``
Default value for the ``aud`` :ref:`JWT property `.
If ``aud`` is set but ``iss`` is not, the global configuration is incomplete
and the user-specific JWT properties are used.
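A sketch of the global JWT defaults; the issuer URL and audience value below
are hypothetical:
.. code-block:: yaml

   auth.host_based.jwt.iss: https://auth.example.com/realms/crate
   auth.host_based.jwt.aud: my_cluster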
HBA entries
...........
The ``auth.host_based.config.`` setting is a group setting that can have zero,
one or multiple groups that are defined by their group key (``${order}``) and
their fields (``user``, ``address``, ``method``, ``protocol``, ``ssl``).
.. _$(order):
**${order}:**
| An identifier that is used as a natural order key when looking up the host
| based configuration entries. For example, an order key of ``a`` will be
| looked up before an order key of ``b``. This key guarantees that the entry
| lookup order will remain independent from the insertion order of the
| entries.
The :ref:`admin_hba` setting is a list of predicates that users can specify to
restrict or allow access to CrateDB.
The meaning of the fields of an entry is as follows:
.. _auth.host_based.config.${order}.user:
**auth.host_based.config.${order}.user**
| *Runtime:* ``no``
| Specifies an existing CrateDB username, only ``crate`` user (superuser) is
| available. If no user is specified in the entry, then all existing users
| can have access.
.. _auth.host_based.config.${order}.address:
**auth.host_based.config.${order}.address**
| *Runtime:* ``no``
| The client machine addresses which this entry matches and which are allowed
| to authenticate. This field may contain an IPv4 address, an IPv6 address or
| an IPv4 CIDR mask. For example: ``127.0.0.1`` or ``127.0.0.1/32``. It may
| also contain a hostname or the special ``_local_`` notation, which will match
| both IPv4 and IPv6 connections from localhost. A hostname specification
| that starts with a dot (.) matches a suffix of the actual hostname, so
| ``.crate.io`` would match ``foo.crate.io`` but not just ``crate.io``. If no
| address is specified in the entry, then access to CrateDB is open for all hosts.
.. _auth.host_based.config.${order}.method:
**auth.host_based.config.${order}.method**
| *Runtime:* ``no``
| The authentication method to use when a connection matches this entry.
| Valid values are ``trust``, ``cert``, ``password`` and ``jwt``. If no
| method is specified, the ``trust`` method is used by default.
| See :ref:`auth_trust`, :ref:`auth_cert`, :ref:`auth_password` and
| :ref:`auth_jwt` for more information about these methods.
.. _auth.host_based.config.${order}.protocol:
**auth.host_based.config.${order}.protocol**
| *Runtime:* ``no``
| Specifies the protocol for which the authentication entry should be used.
| If no protocol is specified, then this entry will be valid for all
| protocols that rely on host based authentication (see :ref:`auth_trust`).
.. _auth.host_based.config.${order}.ssl:
**auth.host_based.config.${order}.ssl**
| *Default:* ``optional``
| *Runtime:* ``no``
| Specifies whether the client must use SSL/TLS to connect to the cluster.
| If set to ``on``, the client must be connected through SSL/TLS, otherwise
| it is not authenticated. If set to ``off``, the client must *not* be
| connected via SSL/TLS, otherwise it is not authenticated. Finally,
| ``optional``, which is the value when the option is completely skipped,
| means that the client can be authenticated regardless of whether SSL/TLS
| is used or not.
Example of config groups:
.. code-block:: yaml

   auth.host_based.config:
     entry_a:
       user: crate
       address: 127.16.0.0/16
     entry_b:
       method: trust
     entry_3:
       user: crate
       address: 172.16.0.0/16
       method: trust
       protocol: pg
       ssl: on
.. _ssl_config:
Secured communications (SSL/TLS)
================================
Securing communications via SSL allows you to encrypt traffic between CrateDB
nodes and clients connecting to them. Connections are secured using Transport
Layer Security (TLS).
.. _ssl.http.enabled:
**ssl.http.enabled**
| *Default:* ``false``
| *Runtime:* ``no``
Set this to true to enable secure communication between the CrateDB node
and the client through SSL via the HTTPS protocol.
.. _ssl.psql.enabled:
**ssl.psql.enabled**
| *Default:* ``false``
| *Runtime:* ``no``
Set this to true to enable secure communication between the CrateDB node
and the client through SSL via the PostgreSQL wire protocol.
.. _ssl.transport.mode:
**ssl.transport.mode**
| *Default:* ``legacy``
| *Runtime:* ``no``
For communication between nodes, choose:
``off``
  SSL cannot be used.
``legacy``
  SSL is not used. If HBA is enabled, transport connections won't be verified.
  Any reachable host can establish a connection.
``on``
  SSL must be used.
.. _ssl.keystore_filepath:
**ssl.keystore_filepath**
| *Runtime:* ``no``
The full path to the node keystore file.
.. _ssl.keystore_password:
**ssl.keystore_password**
| *Runtime:* ``no``
The password used to decrypt the keystore file defined with
``ssl.keystore_filepath``.
.. _ssl.keystore_key_password:
**ssl.keystore_key_password**
| *Runtime:* ``no``
The password entered at the end of the ``keytool -genkey`` command.
.. NOTE::
Optionally, trusted CA certificates can be stored separately from the
node's keystore, in a dedicated truststore.
.. _ssl.truststore_filepath:
**ssl.truststore_filepath**
| *Runtime:* ``no``
The full path to the node truststore file. If not defined, then only a
keystore will be used.
.. _ssl.truststore_password:
**ssl.truststore_password**
| *Runtime:* ``no``
The password used to decrypt the truststore file defined with
``ssl.truststore_filepath``.
.. _ssl.resource_poll_interval:
**ssl.resource_poll_interval**
| *Default:* ``5m``
| *Runtime:* ``no``
The frequency at which SSL files such as keystore and truststore are polled
for changes.
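Putting these settings together, a ``crate.yml`` sketch for a node with SSL
enabled on all interfaces could look like this (all paths and passwords are
placeholders):
.. code-block:: yaml

   ssl.http.enabled: true
   ssl.psql.enabled: true
   # Quoted so that YAML does not interpret "on" as a boolean.
   ssl.transport.mode: "on"
   ssl.keystore_filepath: /path/to/keystore.jks
   ssl.keystore_password: changeit
   ssl.keystore_key_password: changeit
   ssl.truststore_filepath: /path/to/truststore.jks
   ssl.truststore_password: changeit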
Cross-origin resource sharing (CORS)
====================================
Many browsers support the `same-origin policy`_ which requires web applications
to explicitly allow requests across origins. The `cross-origin resource
sharing`_ settings in CrateDB allow for configuring these.
.. _http.cors.enabled:
**http.cors.enabled**
| *Default:* ``false``
| *Runtime:* ``no``
Enable or disable `cross-origin resource sharing`_.
.. _http.cors.allow-origin:
**http.cors.allow-origin**
| *Default:* ````
| *Runtime:* ``no``
Define allowed origins of a request. ``*`` allows *any* origin (which can be
a substantial security risk). By prepending a ``/``, the string will be
treated as a :ref:`regular expression `. For
example, ``/https?:\/\/crate.io/`` will allow requests from
``http://crate.io`` and ``https://crate.io``. This setting disallows any
origin by default.
.. _http.cors.max-age:
**http.cors.max-age**
| *Default:* ``1728000`` (20 days)
| *Runtime:* ``no``
Max cache age of a preflight request in seconds.
.. _http.cors.allow-methods:
**http.cors.allow-methods**
| *Default:* ``OPTIONS, HEAD, GET, POST, PUT, DELETE``
| *Runtime:* ``no``
Allowed HTTP methods.
.. _http.cors.allow-headers:
**http.cors.allow-headers**
| *Default:* ``X-Requested-With, Content-Type, Content-Length``
| *Runtime:* ``no``
Allowed HTTP headers.
.. _http.cors.allow-credentials:
**http.cors.allow-credentials**
| *Default:* ``false``
| *Runtime:* ``no``
Add the ``Access-Control-Allow-Credentials`` header to responses.
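For example, to allow credentialed requests from a single hypothetical origin
only:
.. code-block:: yaml

   http.cors.enabled: true
   # Single-quoted so the regular expression is passed through literally.
   http.cors.allow-origin: '/https?:\/\/app\.example\.com/'
   http.cors.allow-credentials: true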
.. _`same-origin policy`: https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy
.. _`cross-origin resource sharing`: https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS
Blobs
=====
.. _blobs.path:
**blobs.path**
| *Runtime:* ``no``
Path to a filesystem directory where blob data allocated for this node is
stored.
By default blobs will be stored under the same path as normal data. A
relative path value is interpreted as relative to ``CRATE_HOME``.
.. _ref-configuration-repositories:
Repositories
============
Repositories are used to :ref:`backup ` a CrateDB cluster.
.. _repositories.url.allowed_urls:
**repositories.url.allowed_urls**
| *Runtime:* ``no``
This setting only applies to repositories of type :ref:`sql-create-repo-url`.
It specifies a list of URLs that are allowed to be used when a repository of
type ``url`` is created.
Wildcards are supported in the host, path, query, and fragment parts.
This setting is a security measure to prevent access to arbitrary resources.
In addition, the supported protocols can be restricted using the
:ref:`repositories.url.supported_protocols
` setting.
.. _repositories.url.supported_protocols:
**repositories.url.supported_protocols**
| *Default:* ``http``, ``https``, ``ftp``, ``file`` and ``jar``
| *Runtime:* ``no``
A list of protocols that are supported by repositories of type
:ref:`sql-create-repo-url`.
The ``jar`` protocol is used to access the contents of jar files. For more
info, see the java `JarURLConnection documentation`_.
See also the :ref:`path.repo ` setting.
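A sketch restricting ``url`` repositories to HTTPS and a single hypothetical
host:
.. code-block:: yaml

   repositories.url.allowed_urls:
     - https://backups.example.com/cratedb/*
   repositories.url.supported_protocols:
     - https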
.. _`JarURLConnection documentation`: https://docs.oracle.com/javase/8/docs/api/java/net/JarURLConnection.html
Queries
=======
.. _indices.query.bool.max_clause_count:
**indices.query.bool.max_clause_count**
| *Default:* ``8192``
| *Runtime:* ``no``
This setting limits the number of boolean clauses that can be generated by
``!= ANY()``, ``LIKE ANY()``, ``ILIKE ANY()``, ``NOT LIKE ANY()`` and
``NOT ILIKE ANY()`` :ref:`operators ` on arrays in order to
prevent users from executing queries that may result in heavy memory
consumption causing nodes to crash with ``OutOfMemory`` exceptions. Throws
``TooManyClauses`` errors when the limit is exceeded.
.. NOTE::
You can avoid ``TooManyClauses`` errors by increasing this setting. The
number of boolean clauses used can be larger than the number of elements in
the array.
Legacy
=======
.. _legacy.table_function_column_naming:
**legacy.table_function_column_naming**
| *Default:* ``false``
| *Runtime:* ``no``
Since CrateDB 5.0.0, if a table function is not aliased and returns a
single column of a base data type, the table function name is used as the
column name. This setting can be enabled in order to use the naming convention
from before 5.0.0.
The following table functions are affected by this setting:
- :ref:`unnest `
- :ref:`regexp_matches `
- :ref:`generate_series `
When the setting is set and a single column is expected to be returned,
the returned column will be named ``col1``, ``groups``, or ``col1``
respectively.
.. NOTE::
Beware that if not all nodes in the cluster are consistently set or unset,
the behaviour will depend on the node handling the query.
.. _conf-node-lang-js:
JavaScript language
===================
.. _lang.js.enabled:
**lang.js.enabled**
| *Default:* ``true``
| *Runtime:* ``no``
Setting to enable or disable :ref:`JavaScript UDF ` support.
.. _conf-fdw:
Foreign Data Wrappers
=====================
.. _fdw.allow_local:
**fdw.allow_local**
| *Default:* ``false``
| *Runtime:* ``no``
Allow access to local addresses via :ref:`Foreign data wrappers
` for all users.
By default, only the ``crate`` superuser is allowed to access foreign servers
that point to ``localhost``.
.. warning::
Changing this to ``true`` can pose a security risk if you do not trust the
users with ``AL`` permissions on the system. They can create foreign servers,
foreign tables and user mappings that allow them to access services running on
the same machine as CrateDB as if connected locally - effectively bypassing
any restrictions set up via :ref:`admin_hba`.
Do **not** change this if you don't understand the implications.
.. _conf-node-attributes:
Custom attributes
=================
The ``node.attr`` namespace is a bag of custom attributes. Custom attributes
can be :ref:`used to control shard allocation
`.
You can create any attribute you want under this namespace, like
``node.attr.key: value``. These attributes use the ``node.attr`` namespace to
distinguish them from core node attributes like ``node.name``.
Custom attributes are not validated by CrateDB, unlike core node attributes.
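For example, zone and rack information could be attached to a node like this
(attribute names and values are arbitrary):
.. code-block:: yaml

   node.attr.zone: us-east-1a
   node.attr.rack: rack-42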
.. vale on
.. _plugins: https://github.com/crate/crate/blob/master/devs/docs/plugins.rst
.. _Nagle's algorithm: https://en.wikipedia.org/wiki/Nagle%27s_algorithm
.. _SO_RCVBUF: https://docs.oracle.com/javase/7/docs/api/java/net/StandardSocketOptions.html#SO_RCVBUF
.. _SO_SNDBUF: https://docs.oracle.com/javase/7/docs/api/java/net/StandardSocketOptions.html#SO_SNDBUF
.. _concept-clustering:
==========
Clustering
==========
The aim of this document is to describe, on a high level, how the distributed
SQL database CrateDB uses a shared nothing architecture to form
high-availability, resilient database clusters with minimal configuration
effort.
It lays out the core concepts of the shared nothing architecture at the
heart of CrateDB. The main difference to a `primary-secondary architecture`_ is
that every node in the CrateDB cluster can perform every operation, hence all
nodes are equal in terms of functionality (see
:ref:`concept-node-components`) and are configured the same.
.. rubric:: Table of contents
.. contents::
:local:
.. _concept-node-components:
Components of a CrateDB Node
============================
To understand how a CrateDB cluster works it makes sense to first take a look
at the components of an individual node of the cluster.
.. _figure_1:
.. figure:: interconnected-crate-nodes.png
:align: center
Figure 1
Multiple interconnected instances of CrateDB form a single database cluster.
The components of each node are equal.
:ref:`figure_1` shows that in CrateDB each node of a cluster contains the same
components that (a) interface with each other, (b) with the same component from
a different node and/or (c) with the outside world. These four major components
are: SQL Handler, Job Execution Service, Cluster State Service, and Data
Storage.
SQL Handler
-----------
The SQL Handler part of a node is responsible for three aspects:
(a) handling incoming client requests,
(b) parsing and analyzing the SQL statement from the request and
(c) creating an execution plan based on the analyzed statement
(`abstract syntax tree`_)
The SQL Handler is the only one of the four components that interfaces with
the "outside world". CrateDB supports three protocols to handle client requests:
(a) HTTP
(b) a Binary Transport Protocol
(c) the PostgreSQL Wire Protocol
A typical request contains a SQL statement and its corresponding arguments.
Job Execution Service
---------------------
The Job Execution Service is responsible for the execution of a plan ("job").
The phases of the job and the resulting operations are already defined in the
execution plan. A job usually consists of multiple operations that are
distributed via the Transport Protocol to the involved nodes, be it the local
node and/or one or multiple remote nodes. Jobs maintain IDs of their individual
operations. This allows CrateDB to "track" (or for example "kill") distributed
queries.
Cluster State Service
---------------------
The three main functions of the Cluster State Service are:
(a) cluster state management,
(b) election of the master node and
(c) node discovery, thus being the main component for cluster building (as
described in section :ref:`concept-clusters`).
It communicates using the Binary Transport Protocol.
Data storage
------------
The data storage component handles operations to store and retrieve data from
disk based on the execution plan.
In CrateDB, the data stored in the tables is sharded, meaning that tables are
divided and (usually) stored across multiple nodes. Each shard is a separate
Lucene index that is stored physically on the filesystem. Reads and writes are
operating on a shard level.
.. _concept-clusters:
Multi-node setup: Clusters
==========================
A CrateDB cluster is a set of two or more CrateDB instances (referred to as
*nodes*) running on different hosts which form a single, distributed database.
For inter-node communication, CrateDB uses a software specific transport
protocol that utilizes byte-serialized Plain Old Java Objects (`POJOs`_) and
operates on a separate port. That so-called "transport port" must be open and
reachable from all nodes in the cluster.
Cluster state management
------------------------
The cluster state is versioned and all nodes in a cluster keep a copy of the
latest cluster state. However, only a single node in the cluster -- the
*master node* -- is allowed to change the state at runtime.
Settings, metadata, and routing
................................
The cluster state contains all necessary meta information to maintain the
cluster and coordinate operations:
* Global cluster settings
* Discovered nodes and their status
* Schemas of tables
* The status and location of primary and replica shards
When the master node updates the cluster state it will publish the new state to all
nodes in the cluster and wait for all nodes to respond before processing
the next update.
.. _concept-master-election:
Master Node Election
--------------------
In a CrateDB cluster there can only be one master node at any single time.
The cluster only becomes available to serve requests once a master has been
elected, and a new election takes place if the current master node becomes
unavailable.
By default, all nodes are master-eligible, but
:ref:`a node setting `
is available to indicate, if desired, that a node must not take on the role
of master.
To elect a master among the eligible nodes, a majority
(``floor(n/2) + 1``), also known as a *quorum*, is required among a subset of
all master-eligible nodes; this subset of nodes is known as the
*voting configuration*.
The *voting configuration* is a list which is persisted as part of the cluster
state. It is maintained automatically in a way that ensures `split-brain`_
scenarios are never possible.
Every time a node joins the cluster, or leaves the cluster, even if it is
for a few seconds, CrateDB re-evaluates the voting configuration.
If the new number of master-eligible nodes in the cluster is odd, CrateDB will
put them all in the voting configuration.
If the number is even, CrateDB will exclude one of the master-eligible nodes
from the voting configuration.
The voting configuration is not shrunk below 3 nodes, meaning that if there
were 3 nodes in the voting configuration and one of them becomes unavailable,
they all stay in the voting configuration and a quorum of 2 nodes is still
required.
A master node rescinds its role if it cannot contact a quorum of nodes from
the latest voting configuration.
.. WARNING::
If you do infrastructure maintenance, please note that as nodes are shut down
or rebooted, they will temporarily leave the voting configuration, and for
the cluster to elect a master a quorum is required among the
nodes that were last in the voting configuration.
For instance, if you
have a 5-nodes cluster, with all nodes master-eligible, and node 1 is
currently the master, and you shutdown node 5, then node 4, then node 3,
the cluster will stay available as the voting configuration will have
adapted to only have nodes 1, 2, and 3 on it.
If you then shut down one more node, the cluster will become unavailable as
a quorum of 2 nodes is now required and not available.
To bring the cluster back online at this point you will require two nodes
among 1, 2, and 3. Bringing back nodes 3, 4, and 5, will not be sufficient.
.. NOTE::
Special `settings and considerations
`_
applied prior to CrateDB version 4.0.0.
.. _concept-discovery:
Discovery
---------
The process of finding, adding and removing nodes is done in the discovery
module.
.. _figure_2:
.. figure:: discovery-process.png
:align: center
Figure 2
Phases of the node discovery process. n1 and n2 already form a cluster where
n1 is the elected master node, n3 joins the cluster. The cluster state
update happens in parallel!
Node discovery happens in multiple steps:
* CrateDB requires a list of potential host addresses for other CrateDB nodes
when it is starting up. That list can either be provided by a static
configuration or can be dynamically generated, for example by fetching DNS
SRV records, querying the Amazon EC2 API, and so on (see the configuration
sketch after this list).
* All potential host addresses are pinged. Nodes which receive the request
respond to it with information about the cluster they belong to, the current
master node, and their own node name.
* Now that the node knows the master node, it sends a join request. The
master verifies the incoming request and adds the new node to the cluster
state, which now contains the complete list of all nodes in the cluster.
* The cluster state is then published across the cluster. This guarantees the
common knowledge of the node addition.
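As a minimal sketch of the static variant mentioned in the first step, using
the ``discovery.seed_hosts`` and ``cluster.initial_master_nodes`` settings
from the node settings reference (host and node names are placeholders):
.. code-block:: yaml

   discovery.seed_hosts:
     - node-1.example.internal:4300
     - node-2.example.internal:4300
     - node-3.example.internal:4300
   cluster.initial_master_nodes:
     - node-1
     - node-2
     - node-3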
.. CAUTION::
If a node is started without any :ref:`initial_master_nodes
` set, or with a :ref:`discovery_type ` set to
``single-node`` (e.g., the default configuration), it will never join
a cluster even if the configuration is subsequently changed.
It is possible to force the node to forget its current cluster state by
using the :ref:`cli-crate-node` CLI tool. However, be aware that this may
result in data loss.
Networking
----------
In a CrateDB cluster all nodes have a direct link to all other nodes; this is
known as a `full mesh`_ topology. For simplicity, every node maintains
a one-way connection to every other node in the network. The network topology
of a 5-node cluster looks like this:
.. _figure_3:
.. figure:: mesh-network-topology.png
:align: center
:width: 50%
Figure 3
Network topology of a 5 node CrateDB cluster. Each line represents a one-way
connection.
The advantages of a fully connected network are that it provides a high degree
of reliability and the paths between nodes are the shortest possible. However,
there are limitations in the size of such networked applications because the
number of connections (c) grows quadratically with the number of nodes (n):
.. code-block:: mathematica
c = n * (n - 1)
Cluster behavior
================
The fact that each CrateDB node in a cluster is equal allows applications and
users to connect to any node and get the same response for the same operations.
As already described in section :ref:`concept-node-components`, the SQL
handler is responsible for handling incoming client SQL requests, either using
the HTTP transport protocol, or the PostgreSQL wire protocol.
The "handler node" that
accepts the client request also returns the response to the client. It neither
redirects nor delegates the request to different nodes. The handler
node parses the incoming request into a syntax tree, analyzes it and creates
an execution plan locally. Then the operations of the plan are executed in a
distributed manner. The upstream of the final phase of the execution is always
the handler node, which then returns the response to the client.
Application use case
====================
In a conventional setup of an application using a primary-secondary database the
deployed stack looks similar to this:
.. _figure_4:
.. figure:: conventional-deployment.png
:align: center
Figure 4
Conventional deployment of an application-database stack.
However, this setup does not scale because all application servers use
the same, single entry point to the database for writes (the application can
still read from secondaries), and if that entry point is unavailable the
complete stack is broken.
Choosing a shared nothing architecture allows DevOps to deploy their
applications in an "elastic" manner without a single point of failure (SPoF).
The idea is to extend the shared nothing architecture from the database to the
application, which in most cases is stateless already.
.. _figure_5:
.. figure:: shared-nothing-deployment.png
:align: center
Figure 5
Elastic deployment making use of the shared nothing architecture.
If you deploy an instance of CrateDB together with every application server you
will be able to dynamically scale up and down your database backend depending
on your needs. The application only needs to communicate with its "bound" CrateDB
instance on localhost. The load balancer tracks the health of the hosts, and if
either the application or the database on a single host fails, the complete host
will be taken out of the load balancing.
.. _primary-secondary architecture: https://en.wikipedia.org/wiki/Master/slave_(technology)
.. _abstract syntax tree: https://en.wikipedia.org/wiki/Abstract_syntax_tree
.. _POJOs: https://en.wikipedia.org/wiki/Plain_Old_Java_Object
.. _full mesh: https://en.wikipedia.org/wiki/Network_topology#Mesh
.. _split-brain: https://en.wikipedia.org/wiki/Split-brain_(computing)
.. _concept-joins:
=====
Joins
=====
:ref:`Joins ` are essential operations in relational databases. They
create a link between rows based on common values and allow the meaningful
combination of these rows. CrateDB supports joins and due to its distributed
nature allows you to work with large amounts of data.
This document presents the following topics: first, an overview of the
supported join types and algorithms; then, a description of how CrateDB
implements them, along with the optimizations that make it possible to work
with huge datasets.
.. rubric:: Table of contents
.. contents::
:local:
.. _join-types:
Join types
==========
A join is a relational operation that merges two data sets based on certain
properties. :ref:`joins_figure_1` shows which elements appear in which join.
.. _joins_figure_1:
.. figure:: joins.png
:align: center
Join Types
From left to right, top to bottom: left join, right join, inner join, outer
join, and cross join of a set L and R.
.. _join-types-cross:
Cross join
----------
A :ref:`cross join ` returns the Cartesian product of two or more
relations. The result of the Cartesian product on the relation *L* and *R*
consists of all possible permutations of each tuple of the relation *L* with
every tuple of the relation *R*.
.. _join-types-inner:
Inner join
----------
An :ref:`inner join ` is a join of two or more relations that
returns only tuples that satisfy the join condition.
.. _join-types-equi:
Equi Join
.........
An *equi join* is a subset of the inner join: a comparison-based join that
uses equality comparisons in the join condition. The equi join of the relation
*L* and *R* combines tuple *l* of relation *L* with a tuple *r* of the relation
*R* if the join attributes of both tuples are identical.
.. _join-types-outer:
Outer join
----------
An :ref:`outer join ` returns a relation consisting of tuples that
satisfy the join condition, plus dangling tuples from one or both of the
relations, depending on the outer join type.
An outer join can be one of the following types:
- **Left** outer join returns tuples of the relation *L* matching tuples of
the relation *R*, and dangling tuples of the relation *L* padded with null
values.
- **Right** outer join returns tuples of the relation *R* matching tuples of
the relation *L*, and dangling tuples from the relation *R* padded with null
values.
- **Full** outer join returns matching tuples of both relations and dangling
tuples produced by left and right outer joins.
.. _join-algos:
Join algorithms
===============
CrateDB supports (a) CROSS JOIN, (b) INNER JOIN, (c) EQUI JOIN, (d) LEFT JOIN,
(e) RIGHT JOIN and (f) FULL JOIN. All of these join types are executed using
the :ref:`nested loop join algorithm ` except for the
:ref:`Equi Joins ` which are executed using the :ref:`hash
join algorithm `. Special optimizations, according to the
specific use cases, are applied to improve execution performance.
.. _join-algos-nested-loop:
Nested loop join
----------------
The **nested loop** join is the simplest join algorithm. One of the relations
is nominated as the inner relation and the other as the outer relation. Each
tuple of the outer relation is compared with each tuple of the inner relation
and if the join condition is satisfied, the tuples of the relation *L* and *R*
are concatenated and added into the returned virtual relation::
    for each tuple l ∈ L do
        for each tuple r ∈ R do
            if l.a Θ r.b
                put tuple(l, r) in Q
*Listing 1. Nested loop join algorithm.*
.. _join-algos-nested-loop-prim:
Primitive nested loop
.....................
For joins on some relations, the nested loop operation can be executed directly
on the handler node. Specifically for queries involving a CROSS JOIN or joins
on `system tables`_ / `information_schema`_, each shard sends its data to the
handler node. Afterwards, this node runs the nested loop, applies limits, etc.
and ultimately returns the results. Similarly, joins can be nested, so instead
of collecting data from shards the rows can be the result of a previous join or
:ref:`table function `.
.. _join-algos-nested-loop-dist:
Distributed nested loop
.......................
Relations are usually distributed across different nodes, which requires the
nested loop to acquire the data before being able to join. After finding the locations
of the required shards (which is done in the planning stage), the smaller data
set (based on the row count) is broadcast amongst all the nodes holding the
shards they are joined with.
After that, each of the receiving nodes can start
running a nested loop on the subset it has just received. Finally, these
intermediate results are pushed to the original (handler) node to merge and
return the results to the requesting client (see :ref:`joins_figure_2`).
.. _joins_figure_2:
.. figure:: nested-loop.png
:align: center
Nodes that are holding the smaller shards broadcast the data to the
processing nodes which then return the results to the requesting node.
Queries can be optimized if they contain (a) ORDER BY, (b) LIMIT, or (c) an
INNER/EQUI JOIN. In any of these cases, the nested loop can be terminated
earlier:
- Ordering allows determining whether there are any matching records left
- Limit states the maximum number of rows that are returned
Consequently, the number of rows is significantly reduced, allowing the
operation to complete much faster.
.. _join-algos-hash:
Hash join
---------
The Hash Join algorithm is used to execute certain types of joins in a more
efficient way than :ref:`Nested Loop `.
.. _join-algos-hash-basic:
Basic algorithm
...............
The operation takes place in one node (the handler node to which the client is
connected). The rows of the left relation of the join are read and a hashing
algorithm is applied on the fields of the relation which participate in the
join condition. The hashing algorithm generates a hash value which is used to
store every row of the left relation in the proper position in a `hash table`_.
Then the rows of the right relation are read one-by-one and the same hashing
algorithm is applied on the fields that participate in the join condition. The
generated hash value is used to make a lookup in the `hash table`_. If no entry
is found, the row is skipped and the processing continues with the next row
from the right relation. If an entry is found, the join condition is validated
(handling hash collisions) and on successful validation the combined tuple of
left and right relation is returned.
.. _joins_figure_3:
.. figure:: hash-join.png
:align: center
Basic hash join algorithm
.. _join-algos-hash-block:
Block hash join
...............
The Hash Join algorithm requires a `hash table`_ containing all the rows of the
left relation to be stored in memory. Therefore, depending on the size of the
relation (number of rows) and the size of each row, the size of this hash table
might exceed the available memory of the node executing the hash join. To
resolve this limitation the rows of the left relation are loaded into the hash
table in blocks.
On every iteration the maximum available size of the `hash table`_ is
calculated, based on the number of rows and size of each row of the table but
also taking into account the available memory for query execution on the node.
Once this block-size is calculated the rows of the left relation are processed
and inserted into the `hash table`_ until the block-size is reached.
The operation then starts reading the rows of the right relation, processes them
one-by-one and performs the lookup and the join condition validation. Once all
rows from the right relation are processed, the `hash table`_ is re-initialized
based on a new calculation of the block size and a new iteration starts until
all rows of the left relation are processed.
With this algorithm the memory limitation is handled at the expense of having
to iterate over the rows of the right table multiple times, and it is the
default algorithm used for Hash Join execution by CrateDB.
.. _join-algos-hash-block-switch:
Switch tables optimization
''''''''''''''''''''''''''
Since the right table can be processed multiple times (number of rows of the
left relation divided by the block-size), the right table should be the smaller
(in number of rows) of the two relations participating in the join. Therefore,
if the right relation is originally larger than the left one, the query planner
switches the two relations to take advantage of this detail and execute the
hash join with better performance.
.. _join-algos-hash-dist:
Distributed block hash join
...........................
Since CrateDB is a distributed database and a standard deployment consists of
at least three nodes, and in most cases many more, the Hash Join algorithm
execution can be further optimized (performance-wise) by executing it in a
distributed manner across the CrateDB cluster.
The idea is to have the hash join operation executing in multiple nodes of the
cluster in parallel and then merge the intermediate results before returning
them to the client.
A hashing algorithm is applied to every row of both the left and right
relations. A modulo by the number of nodes in the cluster is then applied to
the resulting integer hash value, and the result defines the node to which the
row should be sent. As a result, each node of the cluster receives a subset of
the whole data set which is ensured (by the hashing and modulo) to contain all
candidate matching rows.
Each node in turn performs a :ref:`block hash join ` on
this subset and sends its result tuples
to the handler node (where the client issued the query). Finally, the handler
node receives those intermediate results, merges them and applies any pending
``ORDER BY``, ``LIMIT`` and ``OFFSET`` and sends the final result to the
client.
This algorithm is used by CrateDB for most cases of hash join execution except
for joins on complex subqueries that contain ``LIMIT`` and/or ``OFFSET``.
.. _joins_figure_4:
.. figure:: distributed-hash-join.png
:align: center
Distributed hash join algorithm
.. _join-optim:
Join optimizations
==================
.. _join-optim-optim-query-fetch:
Query then fetch
----------------
Join operations on large relations can be extremely slow, especially if the
join is executed with a :ref:`Nested Loop `, which means that
the runtime complexity grows quadratically (O(n*m)). Specifically for
:ref:`cross joins ` this results in large amounts of data sent
over the network and loaded into memory at the handler node. CrateDB reduces
the volume of data transferred by employing "Query Then Fetch": First, filtering
and ordering are applied (if possible where the data is located) to obtain the
required document IDs. Next, as soon as the final data set is ready, CrateDB
fetches the selected fields and returns the data to the client.
.. _join-optim-optim-push-down:
Push-down query optimization
----------------------------
Complex queries such as Listing 2 require the planner to decide when to filter,
sort, and merge in order to efficiently execute the plan. In this case, the
query would be split internally into subqueries before running the join. As
shown in :ref:`joins_figure_5`, first filtering (and ordering) is applied to
relations *L* and *R* on their shards, then the result is directly broadcast to
the nodes running the join. Not only will this behavior reduce the number of
rows to work with, it also distributes the workload among the nodes so that the
(expensive) join operation can run faster.
.. code-block:: SQL
SELECT L.a, R.x
FROM L, R
WHERE L.id = R.id
AND L.b > 100
AND R.y < 10
ORDER BY L.a
*Listing 2. An INNER JOIN on ids (effectively an EQUI JOIN) which can be
optimized.*
.. _joins_figure_5:
.. figure:: push-down.png
:align: center
Figure 5
Complex queries are broken down into subqueries that are run on their shards
before joining.
.. _join-optim-cross-join-elimination:
Cross join elimination
----------------------
The optimizer will try to eliminate cross joins in the query plan by changing
the join-order. Cross join elimination replaces a CROSS JOIN with an INNER JOIN
if query conditions used in the WHERE clause or other join conditions allow
for it. An example:
.. code-block:: SQL
SELECT *
FROM t1 CROSS JOIN t2
INNER JOIN t3
ON t3.z = t1.x AND t3.z = t2.y
The cross join elimination will change the order of the query from t1, t2, t3
to t2, t1, t3 so that each join has a join condition and the CROSS JOIN can be
replaced by an INNER JOIN. When reordering, it will try to preserve the
original join order as much as possible. If a CROSS JOIN cannot be eliminated,
the original join order will be maintained. This optimizer rule can be disabled
with the :ref:`optimizer eliminate cross join session setting
`::
SET optimizer_eliminate_cross_join = false
Note that this setting is experimental, and may change in the future.
.. _hash table: https://en.wikipedia.org/wiki/Hash_table
.. _here: http://www.dcs.ed.ac.uk/home/tz/phd/thesis.pdf
.. _information_schema: https://crate.io/docs/reference/sql/information_schema.html
.. _system tables: https://crate.io/docs/reference/sql/system.html
.. _concept-storage-consistency:
=======================
Storage and consistency
=======================
This document provides an overview on how CrateDB stores and distributes state
across the cluster and what consistency and durability guarantees are provided.
.. NOTE::
Since CrateDB heavily relies on Elasticsearch_ and Lucene_ for storage and
cluster consensus, concepts shown here might look familiar to Elasticsearch_
users, since the implementation is actually reused from the Elasticsearch_
code.
.. rubric:: Table of contents
.. contents::
:local:
.. _concept-data-storage:
Data storage
============
Every table in CrateDB is sharded, which means that tables are divided and
distributed across the nodes of a cluster. Each shard in CrateDB is a Lucene_
index broken down into segments that are stored on the filesystem. Physically,
the files reside under one of the configured data directories of a node.
Lucene only appends data to segment files, which means that data written to
disk will never be mutated. This makes replication and
:ref:`recovery ` easy, since syncing a shard is simply a
matter of fetching data from a specific marker.
An arbitrary number of replica shards can be configured per table. Every
operational replica holds a full synchronized copy of the primary shard.
With read operations, there is no difference between executing the
operation on the primary shard or on any of the replicas. CrateDB
randomly assigns a shard when routing an operation. It is possible to
configure this behavior if required, see our best practice guide on
`multi zone setups `_
for more details.
Write operations are handled differently than reads. Such operations are
synchronous over all active replicas with the following flow:
1. The primary shard and the active replicas are looked up in the cluster state
for the given operation. The primary shard and a quorum of the configured
replicas need to be available for this step to succeed.
2. The operation is routed to the according primary shard for execution.
3. The operation gets executed on the primary shard
4. If the operation succeeds on the primary, the operation gets executed on all
replicas in parallel.
5. After all replica operations finish the operation result gets returned to
the caller.
Should any replica shard fail to write the data or time out in step 5, it is
immediately considered unavailable.
.. _concept-atomicity:
Atomicity at document level
===========================
Each row of a table in CrateDB is a semi structured document which can be
nested arbitrarily deep through the use of object and array types.
Operations on documents are atomic, meaning that a write operation on a
document either succeeds as a whole or has no effect at all. This is always the
case, regardless of the nesting depth or size of the document.
CrateDB does not provide transactions. Since every document in CrateDB has a
version number assigned, which gets increased every time a change occurs,
patterns like `Optimistic Concurrency Control`_ can help to work around that
limitation.
.. _concept-durability:
Durability
==========
Each shard has a WAL_, also known as the translog. It guarantees that operations
on documents are persisted to disk without having to issue a Lucene commit for
every write operation. When the translog gets flushed all data is written to
the persistent index storage of Lucene and the translog gets cleared.
In case of an unclean shutdown of a shard, the transactions in the translog are
getting replayed upon startup to ensure that all executed operations are
permanent.
The translog is also directly transferred when a newly allocated replica
initializes itself from the primary shard. There is no need to flush segments
to disk just for replica :ref:`recovery ` purposes.
.. _concept-addressing-documents:
Addressing documents
====================
Every document has an :ref:`internal identifier
`. By default this identifier is derived
from the primary key. Documents living in tables without a primary key are
assigned a unique auto-generated ID automatically when created.
Each document is :ref:`routed ` to one specific shard
according to the :ref:`routing column `. All rows that
have the same routing column value are stored in the same shard. The
routing column can be specified with the :ref:`CLUSTERED
` clause when creating the table. If a
:ref:`primary key ` has been defined, it will be used
as the default routing column, otherwise the :ref:`internal document ID
` is used.
While transparent to the user, internally there are two ways how CrateDB
accesses documents:
:get:
Direct access by identifier. Only applicable if the routing key and the
identifier can be computed from the given query specification (e.g. the full
primary key is defined in the ``WHERE`` clause).
This is the most efficient way to access a document, since only a single shard
gets accessed and only a simple index lookup on the ``_id`` field has to be
done.
:search:
Query by matching against fields of documents across all candidate shards of
the table.
.. _concept-consistency:
Consistency
===========
CrateDB is eventually consistent for search operations. Search operations are
performed on shared ``IndexReaders`` which, among other functionality, provide
caching and reverse lookup capabilities for shards. An ``IndexReader`` is
always bound to the Lucene_ segment it was started from, which means it has to
be refreshed in order to see new changes; this is done in a time-based manner,
but can also be done manually (see `refresh`_). Therefore a search only sees a
change if the corresponding ``IndexReader`` was refreshed after that change
occurred.
If a query specification results in a ``get`` operation, changes are visible
immediately. This is achieved by looking up the document in the translog first,
which will always have the most recent version of the document. The common
update and fetch use-case is therefore possible. If a client updates a row and
that row is looked up by its primary key after that update the changes will
always be visible, since the information will be retrieved directly from the
translog. There is an exception to that, when the ``WHERE`` clause contains
complex filtering and/or lots of Primary Key values. You can find more details
:ref:`here `.
.. NOTE::
``Dirty reads`` can occur if the primary shard becomes isolated. The primary
will only realize it is isolated once it tries to communicate with its
replicas or the master. At that point, a write operation is already committed
into the primary and can be read by a concurrent read operation. In order to
minimise the window of opportunity for this phenomenon, the CrateDB nodes
communicate with the master every second (by default) and once they realise
no master is known, they will start rejecting write operations.
Every replica shard is updated synchronously with its primary and always
carries the same information. Therefore it does not matter if the primary or
a replica shard is accessed in terms of consistency. Only the refresh of the
``IndexReader`` affects consistency.
.. NOTE::
Due to internal constraints, when the ``WHERE`` clause filters on multiple
columns of a ``PRIMARY KEY``, but one or more of those columns is tested
against lots of values, the query might be executed using a ``Collect``
operator instead of a ``Get``, thus records might be unavailable until a
``REFRESH`` is run. The same situation could occur when the ``WHERE`` clause
contains long complex expressions, e.g.::
    SELECT * FROM t
    WHERE pk1 IN () AND pk2 = 3 AND pk3 = 'foo'

    SELECT * FROM t
    WHERE pk1 = ?
      AND pk2 = ?
      AND pk3 = ?
      OR pk1 = ?
      AND pk2 = ?
      AND pk3 = ?
      OR pk1 = ?
    ...
.. CAUTION::
Some outage conditions can affect these consistency claims. See the
:ref:`resiliency documentation ` for details.
.. _concept-cluster-metadata:
Cluster metadata
================
Cluster metadata is held in the so-called "Cluster State", which contains the
following information:
- Table schemas.
- Primary and replica shard locations. Basically just a mapping from shard
number to the storage node.
- Status of each shard, which tells if a shard is currently ready for use or
has any other state like "initializing", "recovering" or cannot be assigned
at all.
- Information about discovered nodes and their status.
- Configuration information.
Every node has its own copy of the cluster state. However, only one node is
allowed to change the cluster state at runtime. This node is called the
"master" node and is auto-elected. The "master" node requires no special
configuration: all nodes are master-eligible by default, and any
master-eligible node can be elected as the master. There is also an automatic
re-election if the current master node goes down for some reason.
.. NOTE::
To avoid a scenario where two masters could be elected due to network
partitioning, CrateDB automatically defines a quorum of nodes with
which it is possible to elect a master. For details on how this works
and further information see :ref:`concept-master-election`.
To explain the flow of events for any cluster state change, here is an example
flow for an ``ALTER TABLE`` statement which changes the schema of a table:
#. A node in the cluster receives the ``ALTER TABLE`` request.
#. The node sends out a request to the current master node to change the table
definition.
#. The master node applies the changes locally to the cluster state and sends
out a notification to all affected nodes about the change.
#. The nodes apply the change, so that they are now in sync with the master.
#. Every node might take some local action depending on the type of cluster
state change.
.. _Elasticsearch: https://www.elasticsearch.org/
.. _Lucene: https://lucene.apache.org/core/
.. _WAL: https://en.wikipedia.org/wiki/Write-ahead_logging
.. _Optimistic Concurrency Control: https://crate.io/docs/crate/reference/sql/occ.html
.. _refresh: https://crate.io/docs/crate/reference/sql/refresh.html

.. _concept-resiliency:
==========
Resiliency
==========
Distributed systems are tricky. All sorts of things can go wrong that are
beyond your control. The network can go away, disks can fail, hosts can be
terminated unexpectedly. CrateDB tries very hard to cope with these sorts of
issues while maintaining :ref:`availability `,
:ref:`consistency `, and :ref:`durability
`.
However, as with any distributed system, sometimes, *rarely*, things can go
wrong.
Thankfully, for most use-cases, if you follow best practices, you are extremely
unlikely to experience resiliency issues with CrateDB.
.. SEEALSO::
:ref:`Appendix: Resiliency Issues `
.. rubric:: Table of contents
.. contents::
:local:
.. _concept-resiliency-monitoring:
Monitoring cluster status
=========================
.. figure:: resilience-status.png
:alt:
The Admin UI in CrateDB has a status indicator which can be used to determine
the stability and health of a cluster.
A green status indicates that all shards have been replicated, are available,
and are not being relocated. This is the lowest risk status for a cluster. The
status will turn yellow when there is an elevated risk of encountering issues,
due to a network failure or the failure of a node in the cluster.
The status is updated every few seconds (depending on your cluster's `ping
configuration
`_).
.. _concept-resiliency-consistency:
Storage and consistency
=======================
Code that expects the behavior of an `ACID
`_ compliant database like MySQL may not
always work as expected with CrateDB.
CrateDB does not support ACID transactions, but instead has :ref:`atomic
operations ` and :ref:`eventual consistency
` at the row level. See also :ref:`concept-clustering`.
Eventual consistency is the trade-off that CrateDB makes in exchange for
high availability that can tolerate most hardware and network failures. You
may observe data on different cluster nodes briefly falling out of sync with
each other, although over time it will become consistent.
For example, you know a row has been written as soon as you get the ``INSERT
OK`` message. But that row might not be read back by a subsequent ``SELECT`` on
a different node until after a :ref:`table refresh ` (which
typically occurs within one second).
Your applications should be designed to work with this storage and consistency model.
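For example, if an application needs a search-style query to see rows it has
just written, it can request an explicit refresh (a minimal sketch;
``metrics`` is a made-up table name)::

    INSERT INTO metrics (ts, value) VALUES (now(), 42);

    REFRESH TABLE metrics;

    -- the freshly inserted row is now visible to searches on any node
    SELECT count(*) FROM metrics;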
.. _concept-resiliency-deployment:
Deployment strategies
=====================
When deploying CrateDB you should carefully weigh your need for
high-availability and disaster recovery against operational complexity and
expense.
Which strategy you pick is going to depend on the specifics of your situation.
Here are some considerations:
- CrateDB is designed to scale horizontally. Make sure that your machines are
fit for purpose, i.e. use SSDs, increase RAM up to 64 GB, and use multiple
CPU cores when you can. But if you want to dynamically increase (or
decrease) the capacity of your cluster, `add (or remove) nodes
`_.
- If availability is a concern, you can add `nodes across multiple zones
`_
(e.g. different data centers or geographical regions). The more available
your CrateDB cluster is, the more likely it is to withstand external
failures like a zone going down.
- If data durability or read performance is a concern, you can increase the
number of :ref:`table replicas `.
More table replicas means a smaller chance of permanent data loss due to
hardware failures, in exchange for the use of more disk space and more
intra-cluster network traffic.
- If disaster recovery is important, you can :ref:`take regular snapshots
` and store those snapshots in cold storage. This
safeguards data that has already been successfully written and replicated
across the cluster.
- CrateDB works well as part of a `data pipeline
`_, especially if you’re working
with high-volume data. If you have a message queue in front of CrateDB, you
can configure it with backups and replay the data flow for a specific
timeframe. This can be used to recover from issues that affect your data
before it has been successfully written and replicated across the cluster.
Indeed, this is the generally recommended way to recover from any of the
rare consistency or data-loss issues you might encounter when CrateDB
experiences network or hardware failures (see next section).

(services)=
# Services
In the realm of CrateDB Cloud services, understanding your options is crucial
for optimizing both performance and costs. This section of the documentation
provides an in-depth look at the various service plans we offer, catering to a
wide range of use-cases from small-scale applications to enterprise-level
deployments. Our service plans are engineered for scalability, reliability, and
performance.
::::{grid} 1 2 2 3
:margin: 0 0 0 0
:gutter: 01
:::{grid-item-card} Shared
_non-critical workloads_
- Single Node
- Up to 8 shared vCPUs
- Up to 12 GiB RAM
- Up to 1 TiB storage
- Backups (once per day)
- Single-AZ
- Development support
:::
:::{grid-item-card} Dedicated
_production workloads_
- Up to 9 Nodes
- Up to 144 vCPUs
- Up to 495 GiB RAM
- Up to 72 TiB storage
- Backups (once per hour)
- Multi-AZ
- Basic Support
---
- AWS / Azure Private Link
- Uptime SLAs
- Premium support available
:::
:::{grid-item-card} Custom
_large production workloads_
- Any cluster size
- Custom compute options
- Dedicated master nodes
- Unlimited Storage
- Custom Backups
- Premium Support
- AWS / Azure Private Link
- Uptime SLAs
[Contact us for more information](https://cratedb.com/contact)
:::
::::
## Shared
CrateDB Cloud's Shared Plan provides an affordable and easy-to-setup option for
users who require basic database functionalities. The plan is built around the
principle of cost-effectiveness and is particularly well-suited for development,
testing, or non-critical production environments.
**Node sizes**
:::{table}
:width: 700px
:widths: 100, 100, 150, 150, 200
:align: left
| Plan | Size | vCPUs | RAM | Storage |
|----|--------|-----------|----------| ---- |
| Shared | CRFREE* | up to 2 | 2 GiB | 8 GiB |
| Shared | S2 | up to 2 | 2 GiB | 8 GiB to 1 TiB |
| Shared | S4 | up to 3 | 4 GiB | 8 GiB to 1 TiB |
| Shared | S6 | up to 4 | 6 GiB | 8 GiB to 1 TiB |
| Shared | S12 | up to 8 | 12 GiB | 8 GiB to 1 TiB |
:::
**Variable Performance**
Since your cluster will be sharing vCPUs with other clusters, the performance
may vary depending on the overall load on the underlying machine. This
variability makes it less predictable compared to Dedicated Plans, where your
database is running on dedicated resources.
**Fair-Use Principle**
The Shared Plan operates on a fair-use principle. All users are expected to
utilize the shared resources responsibly so that the system remains equitable
and functional for everyone.
:::{note}
__*CRFREE__: This plan is aimed at new users who want to test and evaluate
CrateDB Cloud and is perpetually free to use. Every user can deploy one free
tier cluster in their organization without adding a payment method. Free tier
clusters will be suspended if they are not used for 4 days, and deleted after
10 more days of inactivity. They cannot be scaled or changed.
:::
## Dedicated
CrateDB Cloud's Dedicated Plan is designed to provide robust, scalable, and
high-performance database solutions. Unlike the Shared Plan, the Dedicated Plan
offers dedicated resources, including dedicated vCPUs, to meet the demands of
high-availability and high-throughput environments.
**Node sizes**
:::{table}
:width: 700px
:widths: 200, 100, 100, 100, 200
:align: left
| Plan | Size | vCPUs | RAM | Storage |
|----|--------|-----------|----------| ---- |
| Dedicated | CR1 | 2 | 7 GiB | 32 GiB to 8 TiB |
| Dedicated | CR2 | 4 | 14 GiB | 32 GiB to 8 TiB |
| Dedicated | CR3 | 8 | 28 GiB | 32 GiB to 8 TiB |
| Dedicated | CR4 | 16 | 55 GiB | 32 GiB to 8 TiB |
:::
All Dedicated Plans can be scaled from 1 to 9 nodes. Depending on the number of
nodes, the overall cluster size can be scaled up to the following limits:
**Cluster sizes**
:::{table}
:width: 700px
:widths: 200, 100, 100, 150, 150
:align: left
| Plan | Size | vCPUs | RAM | Storage |
|----|--------|-----------|----------| ---- |
| Dedicated | CR1 | up to 18 | up to 63 GiB | up to 72 TiB |
| Dedicated | CR2 | up to 36 | up to 126 GiB | up to 72 TiB |
| Dedicated | CR3 | up to 72 | up to 252 GiB | up to 72 TiB |
| Dedicated | CR4 | up to 144 | up to 495 GiB | up to 72 TiB |
:::
**High Availability**
While it’s possible to start with just one node, for applications requiring high
availability and fault tolerance, we recommend using at least three nodes. This
ensures data replication and allows the cluster to handle node failures gracefully.
Dedicated nodes are automatically deployed across three availability zones,
and the specific zone for each node cannot be manually configured. A single
dedicated node is placed in one zone, two nodes are distributed across two
zones, and three or more nodes utilize all three availability zones, with nodes
distributed as uniformly as possible. While a node count that is a multiple of
three (e.g., 3, 6, 9, 12, etc.) provides optimal distribution across zones, it
is not strictly required for high availability.
## Custom
For organizations with specialized requirements that go beyond the Shared and
Dedicated Plans, CrateDB Cloud offers custom solutions tailored to your
specific needs. Our sales team and solutions engineers work closely with you to
architect and deploy a custom cluster configuration, ensuring optimal
performance, scalability, and security for your mission-critical applications.
Whether you have stringent compliance mandates, complex integrations, or unique
scalability challenges, our custom solutions provide the flexibility and
expertise to meet your business objectives and technical requirements.(organization-billing)=
# Billing
CrateDB Cloud offers flexible billing options to accommodate various needs and
preferences. We only bill for actual usage of services, meaning there are no
flat fees or minimum payments.
## Billing Information
In the Billing tab under the Organization overview, you can add and edit your
billing information, including your company address, country of residence, VAT
info, invoice email, phone contacts, and more. You need to fill out this
information whenever you use a paid offer on CrateDB Cloud, regardless of the
payment method.
## Subscriptions
After adding your billing information, you can add a subscription (payment
method). We currently support the following payment methods:
- **Cloud Marketplaces**: Available on [AWS](https://aws.amazon.com/marketplace/pp/prodview-l7rqf2xpeaubk),
[Azure](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/crate.cratedbcloud),
and [GCP](https://console.cloud.google.com/marketplace/product/cratedb-public/cratedb-gcp) Marketplaces.
- **Credit Card**: Available worldwide. We use Stripe as our payment provider.
- **Bank Transfer**: Available in the EU. We use Stripe as our payment provider.
- **Custom Contract**: For large individual deployments, [contact](https://cratedb.com/contact) our sales team.
:::{tip}
**Marketplace Committed Spend:** All three marketplace offerings
(AWS, Azure, GCP) can be applied towards any committed spend agreement (e.g., MACC)
you have with the cloud provider. This effectively reduces your committed spend
balance and allows you to use CrateDB as if it were a native service provided by
the cloud provider.
:::
### Setup New Subscription
::::{tab} Cloud Marketplaces
1. Register for an account on the [CrateDB Cloud sign-in page](https://console.cratedb.cloud/).
2. Navigate to the **"Billing"** tab on the right side.
3. Add your billing information.
4. Click on **"Add Payment Method"**.
5. Select your preferred cloud marketplace.
6. Follow the instructions to sign up for **CrateDB Cloud** through the selected marketplace.
7. After completing the subscription in the marketplace, you will be redirected to CrateDB Cloud.
8. Connect one of your organizations to the created marketplace SaaS subscription.
**You are now ready to deploy a cluster.**
When you deploy a cluster, the usage will be reported regularly to the
marketplace as usage amount in USD, and it will appear on your regular cloud
provider's invoice. Depending on your settlement currency, a conversion may be
applied.
::::
::::{tab} Credit Card
1. Register for an account on the [CrateDB Cloud sign-in page](https://console.cratedb.cloud/).
2. Navigate to the **"Billing"** tab on the right side.
3. Add your billing information.
4. Click on **"Add susbscription"**.
5. Click on **"Pay with credit or debit card"**.
6. Fill out the required information and click **"Save"**.
**You are now ready to deploy a cluster.**
The payment cycle is monthly and aligns with the calendar month. You will be charged
for the previous period's usage and will receive an invoice at the email address
you provided. If needed, you can add a new credit card to replace the current one.
::::
::::{tab} Bank Transfer
1. Register for an account on the [CrateDB Cloud sign-in page](https://console.cratedb.cloud/).
2. Navigate to the **"Billing"** tab on the right side.
3. Add your billing information.
4. Click on **"Add Payment Method"**.
5. Click on **"Ask to enable"** next to **"Pay via Bank Transfer"**. Complete
the form, and our team will contact you to process your request.
**After** your request has been approved:
1. Click on **"Add Payment Method"**.
2. Click on **"Pay via Bank Transfer"**.
**You are now ready to deploy a cluster.**
The payment cycle is monthly and aligns with the calendar month. You will be charged
for the previous period's usage, and an invoice will be sent to the email address
you provided. Payment is due within the specified terms.
:::{caution}
**Bank transfer payment is currently available only within the EU.**
These payments are processed by Stripe and invoiced in Euros at a fixed USD/EUR
exchange rate. You can find the current exchange rate within the CrateDB Cloud
Console after setting up the bank transfer payment method.
:::
::::
::::{tab} Custom Contract
Custom contracts are individually tailored to your needs. Please
[contact](https://cratedb.com/contact) our sales team to set up a custom contract
that fits your requirements.
::::
:::{note}
To remove a subscription, please contact support.
:::
## Usage Reporting
Whenever you use a paid offer in CrateDB Cloud, we collect your usage
information, including the cluster compute and storage size and the number of
nodes deployed. You can view this usage in the CrateDB Cloud Console, where
you'll find a usage snapshot for the current calendar month, the current cost
for the deployed service, and any available credits that might be applied to
the current usage period. Be aware that the billing period might deviate from
the shown calendar month usage.
## Credits
Credits are another way to pay for CrateDB Cloud services and can be used together
with other payment methods. Credits are applied to your account and are used up
first before any other payment method is charged. This means that if credits are
available, you will not be charged, nor will any usage be reported to your
payment provider. You can see the remaining credits and their validity date on
the Billing and Usage page.
There is also the option to purchase more credits at a discount by
[contacting](https://cratedb.com/contact) our sales team.
:::{tip}
**Free Trial Credits**: If you just signed up, you have the option to enable
$200 of credits that can be used for any paid cluster. To enable these credits,
you need to provide a valid payment method, which will only be charged if you
have used up your credits.
:::

(organization-api)=
# API
We offer an API to allow programmatic access to CrateDB Cloud. The API can
be accessed by generating a key and secret in your
[account page](https://console.cratedb.cloud/account/settings) (login required).
The API keys are bound to the CrateDB Cloud user account that generates them.
This means that any actions performed using the API keys will be executed as
that user. Consequently, the API keys inherit the same permissions as the user,
allowing the same level of access and control over the organization and its
resources.

Click the *Generate new key* button to create your key. A popup with
your key and secret will appear. Make sure to store your secret safely,
as you cannot access it again.
(api-access)=
## Access
The key and secret can be used as HTTP Basic Auth credentials when
calling the API, e.g.
:::{code} console
sh$ curl -s -u $your_key:$your_secret https://console.cratedb.cloud/api/v2/users/me
:::
This example will return details of the current user:
:::{code} console
{"email":"some@example.com","hmac":"...","is_superuser":false,"name":"Some User","organization_id":"123","status":"active","uid":"uid","username":"some@example.com"}
:::
(api-examples)=
## Examples
The API is documented with
[Swagger](https://console.cratedb.cloud/api/docs) (login required). It
contains endpoints for:
- Organizations
- Regions
- Projects
- Clusters
- Products
- Users
- Roles
- Subscriptions
- Audit logs
It provides example requests with all the required parameters, expected
responses, and all response codes. Access the API documentation
[here](https://console.cratedb.cloud/api/docs) (login required).

(cluster-import)=
# Import
The first thing you see in the "Import" tab is the history of your
imports. For each import you can see whether it came from a URL or a file,
the file name, the table it was imported into, the date, and the status. By
clicking "Show details" you can display details of a particular import.
Clicking the "Import new data" button will bring up the Import page,
where you can select the source of your data.
If you don't have a dataset prepared, we also provide an example in the
URL import section. It's the New York City taxi trip dataset for July
of 2019 (about 6.3M records).
(cluster-import-url)=
## Import from URL
To import data, fill out the URL, name of the table which will be
created and populated with your data, data format, and whether it is
compressed.
If a table with the chosen name doesn't exist, it will be automatically
created.
The following data formats are supported:
- CSV (all variants)
- JSON (JSON-Lines, JSON Arrays and JSON Documents)
- Parquet
Gzip compressed files are also supported.

(cluster-import-s3)=
## Import from private S3 bucket
CrateDB Cloud allows convenient imports directly from S3-compatible
storage. To import a file from a bucket, provide the name of your bucket
and the path to the file. The S3 Access Key ID and S3 Secret Access Key are
also needed. You can also specify the endpoint for non-AWS S3 buckets.
Keep in mind that you may be charged for egress traffic, depending on
your provider. There is also a volume limit of 10 GiB per file for S3
imports. The usual file formats are supported: CSV (all variants), JSON
(JSON-Lines, JSON Arrays, and JSON Documents), and Parquet.

:::{note}
It's important to make sure that you have the right permissions to
access objects in the specified bucket. For AWS S3, your user should
have a policy that allows GetObject access, for example:
:::{code}
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowGetObject",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::EXAMPLE-BUCKET-NAME/*"
        }
    ]
}
:::
:::
(cluster-import-azure)=
## Import from Azure Blob Storage Container
Importing data from private Azure Blob Storage containers is possible
using a stored secret, which includes a secret name and either an Azure
Storage Connection string or an Azure SAS Token URL. An admin user at
the organization level can add this secret.
You can specify a secret, a container, a table, and a path in the form
`/folder/my_file.parquet`.
As with other imports, Parquet, CSV, and JSON files are supported. The
file size limit for imports is 10 GiB per file.

(cluster-import-globbing)=
## Importing multiple files
Importing multiple files, also known as import globbing, is supported for
any S3-compatible blob storage. The steps are the same as when importing
from S3, i.e. bucket name, path to the files, and S3 ID/Secret.
Importing multiple files from Azure Container/Blob Storage is also
supported, e.g. `/folder/*.parquet`.
Files to be imported are specified by using the well-known
[wildcard](https://en.wikipedia.org/wiki/Wildcard_character) notation,
also known as "globbing". In computer programming,
[glob](https://en.wikipedia.org/wiki/Glob_(programming)) patterns
specify sets of filenames with wildcard characters. The following
example would import all the files from the single specified day.
:::{code} console
/somepath/AWSLogs/123456678899/CloudTrail/us-east-1/2023/11/12/*.json.gz
:::

As with other imports, the supported file types are CSV, JSON, and
Parquet.
(cluster-import-file)=
## Import from file
Uploading directly from your computer offers more control over your
data. From the security point of view, you don't have to share the data
on the internet just to be able to import it to your cluster. You also
have more control over who has access to your data. Your files are
temporarily uploaded to a secure location managed by Crate (an S3 bucket
in AWS) which is not publicly accessible. The files are automatically
deleted after 3 days. You may re-import the same file into multiple
tables without having to re-upload it within those 3 days. Up to 5 files
may be uploaded at the same time, with the oldest ones being
automatically deleted if you upload more.

As with other imports, the supported file formats are:
- CSV (all variants)
- JSON (JSON-Lines, JSON Arrays and JSON Documents)
- Parquet
There is also a limit to file size, currently 1 GB.
(overview-cluster-import-schema-evolution)=
## Schema evolution
Schema Evolution, available for all import types, enables automatic
addition of new columns to existing tables during data import,
eliminating the need to pre-define table schemas. This feature is
applicable to both pre-existing tables and those created during the
import process. It can be toggled via the 'Schema Evolution' checkbox
on the import page.
Note that Schema Evolution is limited to adding new columns; it does not
modify existing ones. For instance, if an existing table has an
'OrderID' column of type **INTEGER**, and an import is
attempted with Schema Evolution enabled for data where 'OrderID'
column is of type **STRING**, the import job will fail due to
type mismatch.
## File Format Limitations
**CSV** files:
1. Comma, tab and pipe delimiters are supported.
**JSON** files:
The following formats are supported for JSON:
1. JSON Documents. Will insert as a single row in the table.
:::{code} console
{
    "id": 1,
    "text": "example"
}
:::
2. JSON Arrays. Will insert as a row per array item.
:::{code} console
[
    {
        "id": 1,
        "text": "example"
    },
    {
        "id": 2,
        "text": "example2"
    }
]
:::
3. JSON-Lines. Each line will insert as a row.
:::{code} console
{"id":1, "text": "example"}
{"id":2, "text": "example2"}
:::

(cluster-export)=
# Export
The "Export" section allows users to download specific tables/views. When you
first visit the Export tab, you can specify the name of a table/view,
format (CSV, JSON, or Parquet) and whether you'd like your data to be
gzip compressed (recommended for CSV and JSON files).
:::{important}
- Size limit for exporting is 1 GiB
- Exports are held for 3 days, then automatically deleted
:::
:::{note}
**Limitations with Parquet**:
Parquet is a highly compressed data format for very efficient storage of
tabular data. Please note that for OBJECT and ARRAY columns in CrateDB,
the exported data will be JSON encoded when saving to Parquet
(effectively saving them as strings). This is due to the complexity of
encoding structs and lists in the Parquet format, where determining the
exact schema might not be possible. When re-importing such a Parquet
file, make sure you pre-create the table with the correct schema.
:::
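For example, if an exported table contained an `OBJECT` column, you might
pre-create the target table before re-importing the Parquet file so the
JSON-encoded values end up in a column with the intended type again (a sketch
with made-up names):

:::{code} sql
CREATE TABLE sensor_readings_restored (
    ts TIMESTAMP WITH TIME ZONE,
    payload OBJECT(DYNAMIC)
);
:::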
(cluster-backups)=
# Backups
You can find the Backups page in the detailed view of your cluster; here
you can see and restore all existing backups.
By default, a backup is made every hour. Backups are kept for 14
days. We also keep the last 14 backups indefinitely, no matter the state
of your cluster.
The Backups tab provides a list of all your backups.

You can also control the schedule of your backups by clicking the *Edit
backup schedule* button.

Here you can create a custom schedule by selecting any number of hour
slots. Backups will be created at selected times. At least one backup a
day is mandatory.
To restore a particular backup, click the *Restore* button. A popup
window with a SQL statement will appear. Input this statement to your
Admin UI console either by copy-pasting it, or clicking the *Run query
in Admin UI*. The latter will bring you directly to the Admin UI console
with the statement automatically pre-filled.

You have a choice between restoring the cluster fully, or only specific
tables.
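The statement shown in the popup is based on CrateDB's `RESTORE SNAPSHOT`
command. For orientation only, restoring selected tables has roughly this
shape; the repository, snapshot, and table names below are placeholders, not
values from your cluster:

:::{code} sql
RESTORE SNAPSHOT example_repo.example_snapshot
TABLE doc.my_table
WITH (wait_for_completion = true);
:::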
(cluster-cloning)=
## Cluster Cloning
Cluster cloning is the process of duplicating all the data from a specific
snapshot into a different cluster. Creating the new cluster isn't part
of the cloning process; you need to create the target cluster yourself.
You can clone a cluster from the Backups page.

Choose a snapshot and click the *Clone* button. As with restoring a
backup, you can choose between cloning the whole cluster, or only
specific tables.

:::{note}
Keep in mind that the full cluster clone will include users, views,
privileges and everything else. Cloning also doesn't distinguish
between cluster plans, meaning you can clone from CR2 to CR1 or any
other variation.
:::
(cluster-cloning-fail)=
## Failed cloning
There are circumstances under which cloning can fail or behave
unexpectedly. These are:
- If you already have tables with the same names in the target cluster
as in the source snapshot, the entire clone operation will fail.
- There isn't enough storage left on the target cluster to
accommodate the tables you're trying to clone. In this case, you
might end up with an incomplete clone as the cluster runs out of
storage.
- You're trying to clone an invalid or no longer existing snapshot.
This can happen if you're cloning through
[Croud](https://cratedb.com/docs/cloud/cli/en/latest/). In this case,
the cloning will fail.
- You're trying to restore a table that is not included in the
snapshot. This can happen if you're restoring snapshots through
[Croud](https://cratedb.com/docs/cloud/cli/en/latest/). In this case,
the cloning will fail.
When cloning fails, it is indicated by a banner in the cluster overview
screen.
(integrations-mongo-cdc)=
# MongoDB CDC (Preview)
CrateDB Cloud enables continuous data ingestion from MongoDB using Change Data
Capture (CDC), providing seamless, real-time synchronization of your data.
:::{caution}
This integration is currently in preview and may have restricted availability.
For more information, please [contact us](https://cratedb.com/contact).
:::
## Key Concepts
The MongoDB CDC integration in CrateDB Cloud allows you to keep your data
synchronized between your MongoDB Atlas cluster and your CrateDB Cloud cluster
in real-time.
### How It Works
The integration functions in two main stages:
1. **Initial Sync:**
The integration performs a complete scan of your MongoDB collections,
importing all existing data into your CrateDB Cloud cluster.
2. **Continuous Sync:**
The integration uses MongoDB Change Streams to monitor changes in your
MongoDB collections and syncs these updates to your CrateDB Cloud cluster
in real-time, ensuring that your data remains current.
### Data Consistency and Mode
For continuous sync, CrateDB Cloud uses MongoDB's **full document mode** to
ensure data consistency. This mode guarantees that MongoDB returns the latest
majority-committed version of the updated document.
While receiving partial deltas is more efficient, full document mode provides
robust functionality. Support for partial deltas may be added in the future to
enhance performance and flexibility.
---
## Create a new Integration
A MongoDB integration allows you to sync a single collection from a MongoDB
Atlas cluster. You can reuse an existing connection across multiple integrations
to continuously sync data from multiple MongoDB Atlas collections.
Supported authentication methods:
- MongoDB SCRAM Authentication
- MongoDB X.509 Authentication
### Set Up MongoDB Atlas Authentication
The following steps should be performed in the MongoDB Atlas UI.
#### Step 1: Create a Custom Role
1. **Navigate to Database Access**
Go to **Database Access** in the MongoDB Atlas UI for the cluster you want to
connect to CrateDB Cloud.
2. **Add a Custom Role**
Under **Custom Roles**, click **Add New Custom Role**.
3. **Set Up Read-Only Access**
Assign the following actions or roles to the custom role:
- `find`
- `changeStream`
- `collStats`
Specify the databases and collections for these actions. You can update
access permissions in the MongoDB Atlas UI later if needed.
#### Step 2: Create a User
Depending on whether you plan to use SCRAM or X.509 authentication, create a
database user with one of the following methods:
:::{tab} SCRAM Authentication
1. **Navigate to Database Access**
In the MongoDB Atlas UI, go to **Database Access** and click **Add New
Database User**.
2. **Set Authentication Method**
Choose **Password** as the authentication method and enter a username and
password for the database user.
3. **Assign the Role**
Under **Database User Privileges**, select the custom role created in Step 1.
4. **Copy User Credentials**
Click **Add User**, and make sure to record the username and password. These
credentials will be used later in the CrateDB Cloud Console.
:::
:::{tab} X.509 Authentication
1. **Navigate to Database Access**
In the MongoDB Atlas UI, go to **Database Access** and click **Add New
Database User**.
2. **Set Authentication Method**
Choose **Certificate** as the authentication method.
3. **Assign the Role**
Under **Database User Privileges**, select the custom role created in Step 1.
4. **Save the Certificate**
Click **Add User**, and store the certificate securely. This will be required
later in the CrateDB Cloud Console.
:::
#### Step 3: Configure IP Access
To allow CrateDB Cloud to access your MongoDB Atlas cluster, you must add the
CrateDB Cloud IP addresses to the IP Access List in MongoDB Atlas.
1. **Navigate to Network Access**
In the MongoDB Atlas UI, go to **Network Access** from the left navigation.
2. **Add IP Address**
Click **Add IP Address** and choose an IP address or range to allow access.
For testing purposes, you can select **Allow Access from Anywhere**, but for
production, it is recommended to specify only the required IPs.
:::{note}
The specific IP addresses depend on the region of your CrateDB Cloud cluster.
These IP addresses can also be found in the **Connection Details** section of the
CrateDB Cloud Console, just before you click **Test Connection** during the
setup process.
**Outbound IP Addresses**:
| Cloud Provider | Region | IP Addresses |
|----------------|---------------|---------------------------------|
| Azure | East US 2 | `52.184.241.228/32`, `52.254.31.90/32` |
| Azure | West Europe | `51.105.153.175/32`, `108.142.34.5/32` |
| AWS | EU West 1 | `34.255.75.224` |
| AWS | US East 1 | `54.197.229.58` |
| AWS | US West 2 | `54.189.16.20` |
| GCP | US Central 1 | `34.69.134.49` |
:::
:::{note}
To set up a PrivateLink connection for the Mongo CDC integration, please reach
out to our support team.
:::
#### Step 4: Access Connection String
You'll need to provide the connection string for your MongoDB Atlas cluster so
that CrateDB Cloud can connect to it.
1. **Navigate to Your Cluster**
In the MongoDB Atlas UI, navigate to the cluster you want to connect to CrateDB Cloud.
2. **Click "Connect"**
From the cluster view, click on **Connect**.
3. **Select "Connect Your Application"**
Choose **Connect your application** as the connection method.
4. **Copy the Connection String**
Copy the connection string provided in the MongoDB Atlas UI. It will look like this:
```
mongodb+srv://<username>:<password>@<cluster-url>/?retryWrites=true&w=majority
```
---
:::{note}
If you are using X.509 authentication, the connection string will look slightly
different and will not include a username and password. Instead, it will
reference the certificate file:
```
mongodb+srv://<cluster-url>/?authMechanism=MONGODB-X509&retryWrites=true&w=majority
```
Make sure to upload the X.509 certificate file when configuring the connection
in CrateDB Cloud.
:::
### Set Up Integration in CrateDB Cloud
Follow these steps in the CrateDB Cloud Console to set up the MongoDB CDC integration:
#### Step 1: Create an Integration
1. Navigate to the **Import** section in the CrateDB Cloud Console.
2. Click **Create Integration** and select **MongoDB** as the source type.
#### Step 2: Configure Connection
1. Choose **Create New Connection** or select an existing one.
2. Fill in the following details:
:::{tab} SCRAM Authentication
- **Connection Name**: Provide a unique name for the connection.
- **Connection String**: Paste the connection string from MongoDB Atlas.
- **Username**: Enter the database username (required for SCRAM).
- **Password**: Enter the database password (required for SCRAM).
- **Default Database**: Specify the default database to use for this connection.
:::
:::{tab} X.509 Authentication
- **Connection Name**: Provide a unique name for the connection.
- **Connection String**: Paste the connection string from MongoDB Atlas.
- **Certificate**: Upload the X.509 certificate file.
- **Default Database**: Specify the default database to use for this connection.
:::
#### Step 3: Test the Connection
Click **Test Connection** to verify CrateDB Cloud can connect to your MongoDB
Atlas cluster. Resolve any issues if the test fails.
#### Step 4: Select Collection
Enter the database and collection name from your MongoDB Atlas cluster that you
want to sync with CrateDB Cloud.
#### Step 5: Select Target Table
1. Specify the target table in your CrateDB Cloud cluster where the data will be synced.
2. MongoDB records will be inserted into an object column called `document`.
3. Select the object type for the column:
- **`dynamic`**: Allows indexing and columnar storage for faster querying.
- **`ignored`**: Prevents type conflicts in CrateDB if your source data lacks a strict schema.
:::{note}
If your source data doesn't follow a strict schema, select `ignored` to avoid type conflicts.
However, selecting `dynamic` provides faster query performance by utilizing indexes and columnar storage.
:::
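For illustration, the resulting target table contains a single object column
named `document`; depending on the choice above, it corresponds roughly to one
of the following shapes (a sketch only, the actual table is created by the
integration):

:::{code} sql
-- object column policy when selecting "dynamic"
CREATE TABLE doc.mongo_sync_dynamic (document OBJECT(DYNAMIC));

-- object column policy when selecting "ignored"
CREATE TABLE doc.mongo_sync_ignored (document OBJECT(IGNORED));
:::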
#### Step 6: Configure Integration Settings
1. Enter a name for the integration.
2. Select the integration mode:
- **Full Load Only**: Imports the data once but doesn’t sync changes.
- **Full Load and CDC**: Imports the data and syncs changes in real-time.
- **CDC Only**: Syncs only new changes in real-time without importing existing data.
#### Step 7: Create the Integration
Click **Create Integration** to finalize the setup. CrateDB Cloud will now sync
your MongoDB data based on the selected settings.
---
## Limitations
The MongoDB CDC integration is available as a preview. The feature is stable
enough for broader use but may still have limitations, known issues, or
incomplete features. While suitable for many use cases, it is not yet
recommended for mission-critical workloads.
### Column Name Restrictions
Column or property names containing square brackets (`[]`) are not supported;
the brackets are replaced with `__openbrk__` and `__closebrk__`, respectively.
Likewise, column names containing dots (`.`) are not supported; the dots are
replaced with `__dot__`.
:::{warning}
This behavior may change in future releases.
:::
### Unsupported Data Types
The following MongoDB data types are not supported in the CrateDB Cloud MongoDB
CDC integration:
- **Long Strings** exceeding 32,766 characters are replaced with a placeholder
value.
- **Binary data types**, except for UUIDs, which are converted to `TEXT`, and
**vectors**, which are converted to `ARRAY`s of numbers.
- The `Decimal128` data type is not supported and is converted to a string, as
CrateDB does not support a decimal data type.

.. _user-defined-functions:
======================
User-defined functions
======================
.. rubric:: Table of contents
.. contents::
:local:
.. _udf-create-replace:
``CREATE OR REPLACE``
=====================
CrateDB supports user-defined :ref:`functions `. See
:ref:`ref-create-function` for a full syntax description.
``CREATE FUNCTION`` defines a new function::
cr> CREATE FUNCTION my_subtract_function(integer, integer)
... RETURNS integer
... LANGUAGE JAVASCRIPT
... AS 'function my_subtract_function(a, b) { return a - b; }';
CREATE OK, 1 row affected (... sec)
.. hide:
cr> _wait_for_function('my_subtract_function(1::integer, 1::integer)')
::
cr> SELECT doc.my_subtract_function(3, 1) AS col;
+-----+
| col |
+-----+
| 2 |
+-----+
SELECT 1 row in set (... sec)
``CREATE OR REPLACE FUNCTION`` will either create a new function or replace
an existing function definition::
cr> CREATE OR REPLACE FUNCTION log10(bigint)
... RETURNS double precision
... LANGUAGE JAVASCRIPT
... AS 'function log10(a) {return Math.log(a)/Math.log(10); }';
CREATE OK, 1 row affected (... sec)
.. hide:
cr> _wait_for_function('log10(1::bigint)')
::
cr> SELECT doc.log10(10) AS col;
+-----+
| col |
+-----+
| 1.0 |
+-----+
SELECT 1 row in set (... sec)
It is possible to use named function arguments in the function signature. For
example, the ``calculate_distance`` function signature has two ``geo_point``
arguments named ``start`` and ``end``::
cr> CREATE OR REPLACE FUNCTION calculate_distance("start" geo_point, "end" geo_point)
... RETURNS real
... LANGUAGE JAVASCRIPT
... AS 'function calculate_distance(start, end) {
... return Math.sqrt(
... Math.pow(end[0] - start[0], 2) +
... Math.pow(end[1] - start[1], 2));
... }';
CREATE OK, 1 row affected (... sec)
.. NOTE::
Argument names are used for query documentation purposes only. You cannot
reference arguments by name in the function body.
Optionally, a schema-qualified function name can be defined. If you omit the
schema, the current session schema is used::
cr> CREATE OR REPLACE FUNCTION my_schema.log10(bigint)
... RETURNS double precision
... LANGUAGE JAVASCRIPT
... AS 'function log10(a) { return Math.log(a)/Math.log(10); }';
CREATE OK, 1 row affected (... sec)
.. NOTE::
In order to improve PostgreSQL server compatibility, CrateDB allows the
creation of user-defined functions against the :ref:`postgres-pg_catalog`
schema. However, the creation of user-defined functions against the
read-only :ref:`system-information` and :ref:`information_schema` schemas is
prohibited.
.. _udf-supported-types:
Supported types
===============
Function arguments and return values can be any of the supported :ref:`data
types `. The values passed into a function must strictly
correspond to the specified argument data types.
.. NOTE::
The value returned by the function will be cast to the return type
provided in the definition if required. An exception will be thrown if the
cast is not successful.
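For example, a function may be declared with a ``text`` return type even
though its JavaScript body produces a number; the result is cast to ``text``
on return (a sketch, not a verified doctest)::

    CREATE FUNCTION my_schema.as_text(integer)
    RETURNS text
    LANGUAGE JAVASCRIPT
    AS 'function as_text(a) { return a * 10; }';

    -- the numeric result is cast to the declared return type, yielding '40'
    SELECT my_schema.as_text(4);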
.. _udf-overloading:
Overloading
===========
Within a specific schema, you can overload functions by defining functions
with the same name but a different set of arguments::
cr> CREATE FUNCTION my_schema.my_multiply(integer, integer)
... RETURNS integer
... LANGUAGE JAVASCRIPT
... AS 'function my_multiply(a, b) { return a * b; }';
CREATE OK, 1 row affected (... sec)
This would overload the ``my_multiply`` function with different argument
types::
cr> CREATE FUNCTION my_schema.my_multiply(bigint, bigint)
... RETURNS bigint
... LANGUAGE JAVASCRIPT
... AS 'function my_multiply(a, b) { return a * b; }';
CREATE OK, 1 row affected (... sec)
This would overload the ``my_multiply`` function with more arguments::
cr> CREATE FUNCTION my_schema.my_multiply(bigint, bigint, bigint)
... RETURNS bigint
... LANGUAGE JAVASCRIPT
... AS 'function my_multiply(a, b, c) { return a * b * c; }';
CREATE OK, 1 row affected (... sec)
.. CAUTION::
It is considered bad practice to create functions that have the same name
as the CrateDB built-in functions.
.. NOTE::
If you call a function without a schema name, CrateDB will look it up in
the built-in functions first and only then in the user-defined functions
available in the :ref:`search_path `.
**Therefore a built-in function with the same name as a user-defined
function will hide the latter, even if it contains a different set of
arguments.** However, such functions can still be called if the schema name
is explicitly provided.
.. _udf-determinism:
Determinism
===========
.. CAUTION::
User-defined functions need to be deterministic, meaning that they must
always return the same result value when called with the same argument
values, because CrateDB might cache the returned values and reuse the value
if the function is called multiple times with the same arguments.
.. _udf-drop-function:
``DROP FUNCTION``
=================
Functions can be dropped like this::
cr> DROP FUNCTION doc.log10(bigint);
DROP OK, 1 row affected (... sec)
Adding ``IF EXISTS`` prevents from raising an error if the function doesn't
exist::
cr> DROP FUNCTION IF EXISTS doc.log10(integer);
DROP OK, 1 row affected (... sec)
Optionally, argument names can be specified within the drop statement::
cr> DROP FUNCTION IF EXISTS doc.calculate_distance(start_point geo_point, end_point geo_point);
DROP OK, 1 row affected (... sec)
Optionally, you can provide a schema::
cr> DROP FUNCTION my_schema.log10(bigint);
DROP OK, 1 row affected (... sec)
.. _udf-supported-languages:
Supported languages
===================
Currently, CrateDB only supports JavaScript for user-defined functions.
.. _udf-js:
JavaScript
----------
JavaScript user-defined functions are compatible with the `ECMAScript 2019`_
specification.
CrateDB uses the `GraalVM JavaScript`_ engine as a JavaScript (ECMAScript)
language execution runtime. The `GraalVM JavaScript`_ engine is a Java
application that works on the stock Java Virtual Machines (VMs). The
interoperability between Java code (host language) and JavaScript user-defined
functions (guest language) is guaranteed by the `GraalVM Polyglot API`_.
Please note: CrateDB does not use the GraalVM JIT compiler as an optimizing
compiler. However, the `stock host Java VM JIT compilers`_ can JIT-compile,
optimize, and execute the GraalVM JavaScript codebase to a certain extent.
The execution context for guest JavaScript is created with restricted
privileges to allow for the safe execution of less trusted guest language
code. The guest language application context for each user-defined function
is created with default access modifiers, so any access to managed resources
is denied. The only exception is the host language interoperability
configuration which explicitly allows access to Java lists and arrays. Please
refer to `GraalVM Security Guide`_ for more detailed information.
Also, even though user-defined functions are implemented with ECMA-compliant
JavaScript, objects that are normally accessible in a web browser
(e.g. ``window``, ``console``, and so on) are not available.
.. NOTE::
GraalVM treats objects provided to JavaScript user-defined functions as
closely as possible to their respective counterparts, and therefore by default
only a subset of prototype functions is available in user-defined
functions. For CrateDB 4.6 and earlier, the object prototype was disabled.
Please refer to the `GraalVM JavaScript Compatibility FAQ`_ to learn more
about the compatibility.
.. _udf-js-supported-types:
JavaScript supported types
..........................
JavaScript functions can handle all CrateDB data types. However, for some
return types the function output must correspond to a certain format.
If a function declares ``geo_point`` as its return type, then the JavaScript
function must return a ``double precision`` array of size 2, a ``WKT`` string,
or a ``GeoJson`` object.
Here is an example of a JavaScript function returning a ``double array``::
cr> CREATE FUNCTION rotate_point(point geo_point, angle real)
... RETURNS geo_point
... LANGUAGE JAVASCRIPT
... AS 'function rotate_point(point, angle) {
... var cos = Math.cos(angle);
... var sin = Math.sin(angle);
... var x = cos * point[0] - sin * point[1];
... var y = sin * point[0] + cos * point[1];
... return [x, y];
... }';
CREATE OK, 1 row affected (... sec)
Below is an example of a JavaScript function returning a ``WKT`` string, which
will be cast to ``geo_point``::
cr> CREATE FUNCTION symmetric_point(point geo_point)
... RETURNS geo_point
... LANGUAGE JAVASCRIPT
... AS 'function symmetric_point(point) {
... var x = - point[0],
... y = - point[1];
... return "POINT (" + x + " " + y + ")";
... }';
CREATE OK, 1 row affected (... sec)
Similarly, if the function specifies the ``geo_shape`` return data type, then
the JavaScript function should return a ``GeoJson`` object or ``WKT`` string::
cr> CREATE FUNCTION line("start" array(double precision), "end" array(double precision))
... RETURNS object
... LANGUAGE JAVASCRIPT
... AS 'function line(start, end) {
... return { "type": "LineString", "coordinates" : [start_point, end_point] };
... }';
CREATE OK, 1 row affected (... sec)
.. NOTE::
If the return value of the JavaScript function is ``undefined``, it is
converted to ``NULL``.
.. _udf-js-numbers:
Working with ``NUMBERS``
........................
The JavaScript engine interprets numbers as ``java.lang.Double``,
``java.lang.Long``, or ``java.lang.Integer``, depending on the computation
performed. In most cases, this is not an issue, since the return type of the
JavaScript function will be cast to the return type specified in the ``CREATE
FUNCTION`` statement, although the cast might result in a loss of precision.
However, when you try to cast ``DOUBLE PRECISION`` to
``TIMESTAMP WITH TIME ZONE``, it will be interpreted as UTC seconds and will
result in a wrong value::
cr> CREATE FUNCTION utc(bigint, bigint, bigint)
... RETURNS TIMESTAMP WITH TIME ZONE
... LANGUAGE JAVASCRIPT
... AS 'function utc(year, month, day) {
... return Date.UTC(year, month, day, 0, 0, 0);
... }';
CREATE OK, 1 row affected (... sec)
.. hide:
cr> _wait_for_function('utc(1::bigint, 1::bigint, 1::bigint)')
::
cr> SELECT date_format(utc(2016,04,6)) as epoque;
+------------------------------+
| epoque |
+------------------------------+
| 48314-07-22T00:00:00.000000Z |
+------------------------------+
SELECT 1 row in set (... sec)
.. hide:
cr> DROP FUNCTION utc(bigint, bigint, bigint);
DROP OK, 1 row affected (... sec)
To avoid this behavior, the numeric value should be divided by 1000 before it
is returned::
cr> CREATE FUNCTION utc(bigint, bigint, bigint)
... RETURNS TIMESTAMP WITH TIME ZONE
... LANGUAGE JAVASCRIPT
... AS 'function utc(year, month, day) {
... return Date.UTC(year, month, day, 0, 0, 0)/1000;
... }';
CREATE OK, 1 row affected (... sec)
.. hide:
cr> _wait_for_function('utc(1::bigint, 1::bigint, 1::bigint)')
::
cr> SELECT date_format(utc(2016,04,6)) as epoque;
+-----------------------------+
| epoque |
+-----------------------------+
| 2016-05-06T00:00:00.000000Z |
+-----------------------------+
SELECT 1 row in set (... sec)
.. hide:
cr> DROP FUNCTION my_subtract_function(integer, integer);
DROP OK, 1 row affected (... sec)
cr> DROP FUNCTION my_schema.my_multiply(integer, integer);
DROP OK, 1 row affected (... sec)
cr> DROP FUNCTION my_schema.my_multiply(bigint, bigint, bigint);
DROP OK, 1 row affected (... sec)
cr> DROP FUNCTION my_schema.my_multiply(bigint, bigint);
DROP OK, 1 row affected (... sec)
cr> DROP FUNCTION rotate_point(point geo_point, angle real);
DROP OK, 1 row affected (... sec)
cr> DROP FUNCTION symmetric_point(point geo_point);
DROP OK, 1 row affected (... sec)
cr> DROP FUNCTION line(start_point array(double precision), end_point array(double precision));
DROP OK, 1 row affected (... sec)
cr> DROP FUNCTION utc(bigint, bigint, bigint);
DROP OK, 1 row affected (... sec)
.. _ECMAScript 2019: https://262.ecma-international.org/10.0/index.html
.. _GraalVM JavaScript: https://www.graalvm.org/reference-manual/js/
.. _GraalVM JavaScript Compatibility FAQ: https://www.graalvm.org/latest/reference-manual/js/JavaScriptCompatibility/
.. _GraalVM Polyglot API: https://www.graalvm.org/reference-manual/embed-languages/
.. _GraalVM Security Guide: https://www.graalvm.org/security-guide/
.. _stock host Java VM JIT compilers: https://www.graalvm.org/reference-manual/js/RunOnJDK/(integrate)=
# Integrations
You have a variety of options to connect and integrate 3rd-party
applications, mostly using [CrateDB's PostgreSQL interface].
This documentation section lists applications, frameworks, and libraries,
which can be used together with CrateDB, and outlines how to use them
optimally.
```{toctree}
:maxdepth: 2
ide
orm
df
etl/index
cdc/index
metrics/index
visualize/index
bi/index
lineage/index
testing
```
:::{tip}
Please also visit the [Overview of CrateDB integration tutorials].
:::
[CrateDB's PostgreSQL interface]: inv:crate-reference#interface-postgresql
[Overview of CrateDB integration tutorials]: https://community.cratedb.com/t/overview-of-cratedb-integration-tutorials/1015

We've created many integration-focused tutorials to help you use CrateDB with other awesome tools and libraries. 👇
All tutorials require a working installation of CrateDB.
|**Tool** | **Articles/Tutorials** | C|
|--- | --- | ---|
|[Apache Airflow](https://airflow.apache.org/) / [Astronomer](https://www.astronomer.io/) | - https://community.cratedb.com/t/cratedb-and-apache-airflow-automating-data-export-to-s3/901 - https://community.cratedb.com/t/cratedb-and-apache-airflow-implementation-of-data-retention-policy/913 - https://community.cratedb.com/t/cratedb-and-apache-airflow-building-a-data-ingestion-pipeline/926 - https://community.cratedb.com/t/cratedb-and-apache-airflow-building-a-hot-cold-storage-data-retention-policy/934 | |
|[Apache Arrow](https://arrow.apache.org) | https://community.cratedb.com/t/import-parquet-files-into-cratedb-using-apache-arrow-and-sqlalchemy/1161 | |
|[Apache Kafka](https://kafka.apache.org/) | https://crate.io/docs/crate/howtos/en/latest/integrations/kafka-connect.html | |
|[Apache NiFi](https://nifi.apache.org/) | https://community.cratedb.com/t/connecting-to-cratedb-from-apache-nifi/647 | |
|[Apache Spark](https://spark.apache.org/) | - https://community.cratedb.com/t/getting-started-with-apache-spark-and-cratedb-a-step-by-step-tutorial/1595 - https://community.cratedb.com/t/introduction-to-azure-databricks-with-cratedb/764 - https://github.com/crate/cratedb-examples/tree/main/by-dataframe/spark/scala-http | |
|[Apache Superset](https://github.com/apache/superset) / [Preset](https://preset.io/) | - https://community.cratedb.com/t/set-up-apache-superset-with-cratedb/1716 - https://crate.io/blog/use-cratedb-and-apache-superset-for-open-source-data-warehousing-and-visualization - [Introduction to Time-Series Visualization in CrateDB and Superset](https://crate.io/blog/introduction-to-time-series-visualization-in-cratedb-and-superset) | |
|[Balena](https://www.balena.io/) | https://community.cratedb.com/t/deploying-cratedb-on-balena-io/1067 | |
|[Cluvio](https://www.cluvio.com/) | https://community.cratedb.com/t/data-analysis-with-cluvio-and-cratedb/1571 | |
|[Dapr](https://dapr.io/) | https://community.cratedb.com/t/connecting-to-cratedb-from-dapr/660 | |
|[DataGrip](https://www.jetbrains.com/datagrip/) | https://cratedb.com/docs/guide/integrate/datagrip/ | |
|[Datashader](https://datashader.org/) | [CrateDB Time Series Exploration and Visualization](https://github.com/crate/cratedb-examples/tree/amo/cloud-datashader/topic/timeseries/explore) | |
|[Dask](https://www.dask.org/) | https://community.cratedb.com/t/guide-to-efficient-data-ingestion-to-cratedb-with-pandas-and-dask/1482 | |
|[DBeaver](https://dbeaver.io/about/) | https://crate.io/blog/cratedb-dbeaver | |
|[dbt](https://github.com/dbt-labs/dbt-core) | https://community.cratedb.com/t/using-dbt-with-cratedb/1566 | |
|[Debezium](https://debezium.io/) | https://community.cratedb.com/t/replicating-data-from-other-databases-to-cratedb-with-debezium-and-kafka/1388 | |
|[Explo](https://www.explo.co/) | https://crate.io/blog/introduction-to-time-series-visualization-in-cratedb-and-explo | |
|[JMeter](https://jmeter.apache.org) | https://community.cratedb.com/t/jmeter-jdbc-connection-to-cratedb/1051/2?u=jayeff | |
|[Grafana](https://grafana.com/) | - https://crate.io/blog/visualizing-time-series-data-with-grafana-and-cratedb - https://community.cratedb.com/t/monitoring-an-on-premises-cratedb-cluster-with-prometheus-and-grafana/1236 | |
|[Kestra.io](https://kestra.io/) | https://community.cratedb.com/t/guide-to-cratedb-data-pipelines-with-kestra-io/1400 | |
|[LangChain](https://www.langchain.com/) | https://community.cratedb.com/t/how-to-set-up-langchain-with-cratedb/1576 | |
|[Locust](https://locust.io) | https://community.cratedb.com/t/loadtesting-cratedb-using-locust/1686 | |
|[Meltano](https://meltano.com/) | [Meltano Examples](https://github.com/crate/cratedb-examples/tree/amo/meltano/framework/singer-meltano) | |
|[Metabase](https://www.metabase.com/) | - https://community.cratedb.com/t/visualizing-data-with-metabase/1401 - https://community.cratedb.com/t/demo-of-metabase-and-cratedb-getting-started/1436 | |
|[Node-RED](https://nodered.org/) | https://community.cratedb.com/t/ingesting-mqtt-messages-into-cratedb-using-node-red/803 | |
|[pandas](https://pandas.pydata.org/) | - https://community.cratedb.com/t/from-data-storage-to-data-analysis-tutorial-on-cratedb-and-pandas-2/1440 - https://community.cratedb.com/t/automating-financial-data-collection-and-storage-in-cratedb-with-python-and-pandas/916 - https://community.cratedb.com/t/importing-parquet-files-into-cratedb-using-apache-arrow-and-sqlalchemy/1161 - https://community.cratedb.com/t/guide-to-efficient-data-ingestion-from-pandas-to-cratedb/1541 | |
|[PowerBI](https://powerbi.microsoft.com/en-us/) | https://crate.io/docs/crate/howtos/en/latest/integrations/powerbi-desktop.html https://crate.io/docs/crate/howtos/en/latest/integrations/powerbi-gateway.html | |
|[Prefect](https://www.prefect.io/) | https://community.cratedb.com/t/building-seamless-data-pipelines-made-easy-combining-prefect-and-cratedb/1555 | |
|[Prometheus](https://prometheus.io/) | - https://community.cratedb.com/t/cratedb-and-prometheus-for-long-term-metrics-storage/1012 - https://community.cratedb.com/t/monitoring-an-on-premises-cratedb-cluster-with-prometheus-and-grafana/1236 | |
|[PyCaret](https://pycaret.org/) | [AutoML with PyCaret and CrateDB](https://github.com/crate/cratedb-examples/tree/main/topic/machine-learning/automl) | |
|[R](https://www.r-project.org/) | https://crate.io/docs/crate/howtos/en/latest/integrations/r.html | |
|[Rill](https://www.rilldata.com/) | https://community.cratedb.com/t/introducing-rill-and-bi-as-code-with-cratedb-cloud/1718 | |
|[Rsyslog](https://www.rsyslog.com/) | https://community.cratedb.com/t/storing-server-logs-on-cratedb-for-fast-search-and-aggregations/1562 | |
|[SQLPad](https://crate.io/blog/use-cratedb-with-sqlpad-as-a-self-hosted-query-tool-and-visualizer) | https://crate.io/blog/use-cratedb-with-sqlpad-as-a-self-hosted-query-tool-and-visualizer | |
|[StreamSets](https://crate.io/docs/crate/howtos/en/latest/integrations/streamsets.html) | https://crate.io/docs/crate/howtos/en/latest/integrations/streamsets.html | |
|[Tableau](https://www.tableau.com/) | https://community.cratedb.com/t/using-cratedb-with-tableau/1192 | |
|[Telegraf](https://www.influxdata.com/time-series-platform/telegraf/) | https://crate.io/blog/use-cratedb-with-telegraf-an-agent-for-collecting-reporting-metrics | |
|[TensorFlow](https://www.tensorflow.org/) | https://crate.io/docs/crate/howtos/en/latest/integrations/ml-dist.html | |
|[Terraform](https://www.terraform.io/) | https://community.cratedb.com/t/deploying-cratedb-to-the-cloud-via-terraform/849 | |
|[Trino](https://trino.io/) | https://community.cratedb.com/t/connecting-to-cratedb-using-trino/993 | |