80 changes: 56 additions & 24 deletions doc/book/admin/troubleshoot.rst
@@ -1,15 +1,14 @@
.. _admin-troubleshoot:
.. _admin-troubleshooting-guide:

================================================================================
Troubleshooting guide
================================================================================
=====================

.. _admin-troubleshoot-memory-issues:

--------------------------------------------------------------------------------

Problem: INSERT/UPDATE-requests result in ER_MEMORY_ISSUE error
--------------------------------------------------------------------------------
---------------------------------------------------------------

**Possible reasons**

@@ -63,9 +62,8 @@ Try either of the following measures:

.. _admin-troubleshoot-cpu-load:

--------------------------------------------------------------------------------
Problem: Tarantool generates too heavy CPU load
--------------------------------------------------------------------------------
-----------------------------------------------

**Possible reasons**

@@ -108,9 +106,8 @@ If the load is mostly generated by INSERT/UPDATE/DELETE requests, we recommend

.. _admin-troubleshoot-query-timeout:

--------------------------------------------------------------------------------
Problem: Query processing times out
--------------------------------------------------------------------------------
-----------------------------------

**Possible reasons**

@@ -165,9 +162,8 @@ Problem: Query processing times out

.. _admin-troubleshoot-negative-lag-idle:

--------------------------------------------------------------------------------
Problem: Replication "lag" and "idle" contain negative values
--------------------------------------------------------------------------------
-------------------------------------------------------------

This is about ``box.info.replication.(upstream.)lag`` and
``box.info.replication.(upstream.)idle`` values in
@@ -189,9 +185,8 @@ the local instance’s clock.

.. _admin-troubleshoot-idle-grows-no-logs:

--------------------------------------------------------------------------------
Problem: Replication "idle" keeps growing, but no related log messages appear
--------------------------------------------------------------------------------
-----------------------------------------------------------------------------

This is about ``box.info.replication.(upstream.)idle`` value in
:doc:`/reference/reference_lua/box_info/replication` section.
@@ -211,9 +206,8 @@ the same replica UUID'``.

.. _admin-troubleshoot-mr-odd-replication-stats:

--------------------------------------------------------------------------------
Problem: Replication statistics differ on replicas within a replica set
--------------------------------------------------------------------------------
-----------------------------------------------------------------------

This is about a replica set that consists of one master and several replicas.
In a replica set of this type, values in
@@ -231,9 +225,8 @@ Replication is broken.

.. _admin-troubleshoot-mm-replication-stopped:

--------------------------------------------------------------------------------
Problem: Master-master replication is stopped
--------------------------------------------------------------------------------
---------------------------------------------

This is about
:doc:`box.info.replication(.upstream).status </reference/reference_lua/box_info/replication>`
@@ -268,9 +261,8 @@ We also recommend using text primary keys or setting up

.. _admin-troubleshoot-slow-tarantool:

--------------------------------------------------------------------------------
Problem: Tarantool works much slower than before
--------------------------------------------------------------------------------
------------------------------------------------

**Possible reasons**

@@ -308,15 +300,56 @@ recommend to optimize your Tarantool application code).
If the value is greater than 0.01, your application definitely needs thorough
code analysis aimed at optimizing memory usage.

.. _admin-troubleshoot-auth-delay:

Problem: Adding a new replica set to a cluster results in ER_AUTH_DELAY error
-----------------------------------------------------------------------------

Some instances in the cluster cannot connect to another node in the replica set
because they have exceeded the allowed number of authentication attempts.
These instances raise the ``Too many authentication attempts`` error.

**Possible reasons**

1. Incorrect authentication credentials

**Solution**

In the cluster configuration, verify that the credentials the node uses to connect are correct.
To do this, check the :ref:`replication <cfg_replication-replication>` parameter.

2. Network issues

**Solution**

If you encounter network issues, restart the instance or re-add the replica set to the cluster.

3. Several Tarantool instances are running on the same addresses

**Solution**

Identify the instance that other nodes in the replica set cannot connect to.
Then check the number of failed authorization attempts on that instance:

.. code-block:: lua

    box.stat().AUTH

If the number of failed attempts increases every second, check which nodes are trying to authorize on this replica.
A growing number of attempts may indicate that other Tarantool instances, started earlier on the same addresses,
are still running on the machine.
In this case, both the instance with the ``ER_AUTH_DELAY`` error and the old Tarantool nodes are trying to
authorize on the same replica, and the first instance exceeds the authorization time limit on the replica.

To resolve the problem, stop the old Tarantool instances and restart replication.
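
To see whether the counter keeps growing, you can sample it a few times in a row.
The sketch below is only an illustration: ``auth_deltas`` is a hypothetical helper, not part of Tarantool.
It assumes that ``box.stat().AUTH.total`` is a cumulative counter of authentication requests,
so it may include successful attempts as well as failed ones.

.. code-block:: lua

    -- Hypothetical helper: returns how much a counter grew in each interval.
    -- get_total: function returning the current counter value
    -- sleep:     function that pauses for the given number of seconds
    local function auth_deltas(get_total, sleep, count, interval)
        local deltas = {}
        local prev = get_total()
        for i = 1, count do
            sleep(interval)
            local cur = get_total()
            deltas[i] = cur - prev
            prev = cur
        end
        return deltas
    end

    -- The part below runs only inside Tarantool, where the global `box`
    -- and the `fiber` module are available.
    if rawget(_G, 'box') ~= nil then
        local fiber = require('fiber')
        local deltas = auth_deltas(
            function() return box.stat().AUTH.total end, -- assumed cumulative counter
            fiber.sleep,
            5,  -- number of samples
            1   -- seconds between samples
        )
        for i, delta in ipairs(deltas) do
            print(('second %d: +%d authentication requests'):format(i, delta))
        end
    end

If every interval shows a non-zero increase while no new replicas are being added,
some node is probably retrying authentication in a loop.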

.. _admin-troubleshoot-finalizer_yielding:

--------------------------------------------------------------------------------
Problem: Fiber switch is forbidden in '__gc' metamethod
--------------------------------------------------------------------------------
-------------------------------------------------------

~~~~~~~~~~~~~~~~~~~~~~~~
Problem description
~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~

Fiber switch is forbidden in ``__gc`` metamethod since `this change <https://github.com/tarantool/tarantool/issues/4518#issuecomment-704259323>`_
to avoid unexpected Lua OOM.
@@ -325,9 +358,8 @@ for example, to close a socket.

Below are examples of how to implement such a procedure properly.

~~~~~~~~~~~~~~~~
Solution
~~~~~~~~~~~~~~~~
~~~~~~~~

First, here are two simple examples that illustrate the logic of the
solution: