
==================
Information Schema
==================

The Information Schema is a special schema that contains virtual tables which
are read-only and can be queried to get information about the state of the
cluster.

.. note:: currently only the tables ``tables`` and ``columns`` are implemented.

Tables
======

The information schema contains a table called `tables`.

This table can be queried to get a list of all available tables and their
settings like the number of shards or number of replicas::

    cr> select * from information_schema.tables
    ... where table_name not like 'my_table%' order by schema_name asc, table_name asc;
    +--------------------+-------------------+------------------+--------------------+--------------+
    | schema_name        | table_name        | number_of_shards | number_of_replicas | clustered_by |
    +--------------------+-------------------+------------------+--------------------+--------------+
    | blob               | myblobs           | 3                | 1                  | digest       |
    | doc                | documents         | 5                | 1                  | _id          |
    | doc                | locations         | 2                | 0                  | id           |
    | doc                | quotes            | 2                | 0                  | id           |
    | information_schema | columns           | 1                | 0                  | NULL         |
    | information_schema | routines          | 1                | 0                  | NULL         |
    | information_schema | table_constraints | 1                | 0                  | NULL         |
    | information_schema | tables            | 1                | 0                  | NULL         |
    | sys                | cluster           | 1                | 0                  | NULL         |
    | sys                | nodes             | 1                | 0                  | NULL         |
    | sys                | shards            | 1                | 0                  | NULL         |
    +--------------------+-------------------+------------------+--------------------+--------------+
    SELECT 11 rows in set (... sec)


Columns
=======

This table can be queried to get a list of all available columns of all tables and their
definition like data type and ordinal position inside the table::

    cr> select * from information_schema.columns
    ... where schema_name='doc' and table_name not like 'my_table%'
    ... order by table_name asc, column_name asc;
    +-------------+------------+------------------+------------------+--------------+
    | schema_name | table_name | column_name      | ordinal_position | data_type    |
    +-------------+------------+------------------+------------------+--------------+
    | doc         | documents  | body             | 1                | string       |
    | doc         | documents  | title            | 2                | string       |
    | doc         | locations  | date             | 1                | timestamp    |
    | doc         | locations  | description      | 2                | string       |
    | doc         | locations  | id               | 3                | string       |
    | doc         | locations  | kind             | 4                | string       |
    | doc         | locations  | name             | 5                | string       |
    | doc         | locations  | position         | 6                | integer      |
    | doc         | locations  | race             | 7                | object       |
    | doc         | locations  | race.description | 8                | string       |
    | doc         | locations  | race.interests   | 9                | string_array |
    | doc         | locations  | race.name        | 10               | string       |
    | doc         | quotes     | id               | 1                | integer      |
    | doc         | quotes     | quote            | 2                | string       |
    +-------------+------------+------------------+------------------+--------------+
    SELECT 14 rows in set (... sec)


You can even query this tables' own columns (attention: this might lead to infinite recursion of your mind, beware!)::

    cr> select column_name, data_type, ordinal_position from information_schema.columns
    ... where schema_name = 'information_schema' and table_name = 'columns' order by ordinal_position asc;
    +------------------+-----------+------------------+
    | column_name      | data_type | ordinal_position |
    +------------------+-----------+------------------+
    | schema_name      | string    | 1                |
    | table_name       | string    | 2                |
    | column_name      | string    | 3                |
    | ordinal_position | short     | 4                |
    | data_type        | string    | 5                |
    +------------------+-----------+------------------+
    SELECT 5 rows in set (... sec)

.. note::

  Columns at Crate are always sorted alphabetically in ascending order despite in which order
  they were defined on table creation. Thus the ``ordinal_position`` reflects the alphabetical
  position.


Table Constraints
=================

This table can be queries to get a list of all defined table constraints,
their type, name and which table they are defined in.

.. note::

    Currently only ``PRIMARY_KEY`` constraints are supported.

::

    cr> select * from information_schema.table_constraints
    ... where table_name not like 'my_table%'
    ... order by schema_name desc, table_name desc limit 10;
    +--------------------+------------+-------------------------------------------------+-----------------+
    | schema_name        | table_name | constraint_name                                 | constraint_type |
    +--------------------+------------+-------------------------------------------------+-----------------+
    | sys                | shards     | [u'schema_name', u'table_name', u'id']          | PRIMARY_KEY     |
    | sys                | nodes      | [u'id']                                         | PRIMARY_KEY     |
    | information_schema | tables     | [u'schema_name', u'table_name']                 | PRIMARY_KEY     |
    | information_schema | columns    | [u'schema_name', u'table_name', u'column_name'] | PRIMARY_KEY     |
    | doc                | quotes     | [u'id']                                         | PRIMARY_KEY     |
    | doc                | locations  | [u'id']                                         | PRIMARY_KEY     |
    | doc                | documents  | [u'_id']                                        | PRIMARY_KEY     |
    | blob               | myblobs    | [u'digest']                                     | PRIMARY_KEY     |
    +--------------------+------------+-------------------------------------------------+-----------------+
    SELECT 8 rows in set (... sec)


Indices
=======

.. note:: currently not implemented

This table can be queried to get a list of all defined indices of all columns and their
definition like index method, expression list and property list.
Using a :ref:`plain index <sql_ddl_index_plain>` for every column is the default
behaviour at Crate, so almost all columns are listed as an index as well::

    cr> select * from information_schema.indices
    ... where table_name not like 'my_table%' order by table_name asc, index_name asc; #doctest: +SKIP
    +------------+---------------------+----------+---------------------------+------------------+
    | table_name | index_name          | method   | columns                   | properties       |
    +------------+---------------------+----------+---------------------------+------------------+
    | documents  | body                | plain    | [u'body']                 |                  |
    | documents  | title               | plain    | [u'title']                |                  |
    | documents  | title_body_ft       | fulltext | [u'body', u'title']       | analyzer=english |
    | locations  | date                | plain    | [u'date']                 |                  |
    | locations  | description         | plain    | [u'description']          |                  |
    | locations  | id                  | plain    | [u'id']                   |                  |
    | locations  | kind                | plain    | [u'kind']                 |                  |
    | locations  | name                | plain    | [u'name']                 |                  |
    | locations  | name_description_ft | fulltext | [u'description', u'name'] | analyzer=english |
    | locations  | position            | plain    | [u'position']             |                  |
    | locations  | race                | plain    | [u'race']                 |                  |
    | quotes     | id                  | plain    | [u'id']                   |                  |
    | quotes     | quote               | plain    | [u'quote']                |                  |
    +------------+---------------------+----------+---------------------------+------------------+
    SELECT 13 rows in set (... sec)


Routines
========

The routines table contains all custom analyzers, tokenizers, token-filters
and char-filters and all custom analyzers created by ``CREATE ANALYZER``
statements (see :ref:`sql-ddl-custom-analyzer`)::

    cr> select routine_name, routine_type from information_schema.routines
    ... group by routine_name, routine_type order by routine_name asc limit 5;
    +----------------------+--------------+
    | routine_name         | routine_type |
    +----------------------+--------------+
    | arabic               | ANALYZER     |
    | arabic_normalization | TOKEN_FILTER |
    | arabic_stem          | TOKEN_FILTER |
    | armenian             | ANALYZER     |
    | asciifolding         | TOKEN_FILTER |
    +----------------------+--------------+
    SELECT 5 rows in set (... sec)

For example you can use this table to list existing tokenizers like this::

    cr> select routine_name from information_schema.routines
    ... where routine_type='TOKENIZER'
    ... order by routine_name asc limit 10;
    +----------------+
    | routine_name   |
    +----------------+
    | classic        |
    | e2_mypattern   |
    | edgeNGram      |
    | edge_ngram     |
    | keyword        |
    | letter         |
    | lowercase      |
    | nGram          |
    | ngram          |
    | path_hierarchy |
    +----------------+
    SELECT 10 rows in set (... sec)

Or get an overview of how many routines and routine types are available::

    cr> select count(*), routine_type from information_schema.routines
    ... group by routine_type order by routine_type;
    +----------+--------------+
    | count(*) | routine_type |
    +----------+--------------+
    | 45       | ANALYZER     |
    | 5        | CHAR_FILTER  |
    | 14       | TOKENIZER    |
    | 46       | TOKEN_FILTER |
    +----------+--------------+
    SELECT 4 rows in set (... sec)
