Clever Cloud Status

Incident History

Full history of incidents.

Newest first

January 2025

Fixed · Global

A network maintenance on the Paris region is planned on January 23rd, 2025 between 21:30 UTC and 23:30 UTC. No service interruption or degradation is expected during this maintenance. We will update this incident throughout the operations. This maintenance is the second of three maintenance operations to improve our transit peering.

EDIT 21:37 UTC: The maintenance will start shortly.

EDIT 00:47 UTC: The maintenance is now over. A brief connection loss was observed between 00:17:17 and 00:17:43 for traffic incoming on one of our routers.

Fixed · Infrastructure · Global

A hypervisor rebooted on the MTL region. This hypervisor only hosts some of the FSBucket add-ons as well as some DEV MySQL add-ons. We are investigating the root cause and starting to restore the service.

EDIT 16:45 UTC: The services are available again. The issue most probably comes from an ongoing electrical maintenance. This server was located in room B710: https://network.status-ovhcloud.com/incidents/m9fzvt0nd8jm. We will follow up with OVH to get more information. In the meantime, we'll keep an eye on the maintenance status.

Fixed · Global

A network maintenance on the Paris region is planned on January 16th, 2025 between 21:30 UTC and 23:30 UTC. No service interruption or degradation is expected during this maintenance. We will update this incident throughout the operations.

EDIT 2025-01-16 21:27 UTC: The maintenance is about to start.

EDIT 2025-01-16 23:30 UTC: The maintenance is now over, no impact observed.

Fixed · Global

A maintenance is planned on Grafana, starting at 9:00 CET (Paris time) and expected to last several minutes. We plan to upgrade Grafana to v9.5.5.

EDIT 10:10 CET: The update completed successfully.

Fixed · Console · Global

16:50 UTC: We identified an issue where users can be disconnected when they try to view the add-on information of a MateriaKV add-on. We are working on a fix.

EDIT 17:05 UTC: A fix has been deployed. If any disconnection still happens, please inform support.

Fixed · Reverse Proxies · Global

Load balancers are experiencing connectivity issues with the event bus when retrieving configuration; we are investigating.

EDIT 10:00 UTC: The issue seems to be due to packet loss between the SGP region and the PAR region. We are investigating.

EDIT 11:00 UTC: The incident is related to an AS (autonomous system) outside our network. We have contacted our network providers to mitigate the issue.

Fixed · RabbitMQ shared cluster · Global

The shared RabbitMQ cluster on the Paris region is experiencing intermittent degraded performance. We are investigating.

EDIT 10:27 UTC: The underlying issue has been found: a RabbitMQ client was reconnecting too fast and too often, leading to a global increase of load on the cluster. The situation has been addressed. We continue to monitor it.

EDIT 2025-01-13 09:45 UTC: The situation has been back to normal since 2025-01-10 10:30 UTC. The incident is over.
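
For clients of a shared broker, a common way to avoid this kind of reconnection storm is to wait a capped, randomized, exponentially growing delay between attempts. The sketch below is only an illustration of that general technique and not the fix applied during this incident; the function name `backoff_delays` and its parameters are hypothetical and do not belong to any RabbitMQ client library.

```python
import random


def backoff_delays(base=1.0, cap=60.0, attempts=8, rng=random.random):
    """Yield reconnection delays (in seconds) using capped exponential
    backoff with full jitter.

    All names and default values here are illustrative assumptions.
    """
    for attempt in range(attempts):
        # The upper bound doubles on each attempt, up to `cap`.
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: pick a random delay in [0, ceiling) so that many
        # clients do not all reconnect at the same instant.
        yield rng() * ceiling


if __name__ == "__main__":
    for delay in backoff_delays():
        print(f"would wait {delay:.2f}s before the next reconnection attempt")
```

A client would sleep for each yielded delay before retrying its connection, and restart the sequence once a connection succeeds.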

Fixed · Global

We are currently in the process of upgrading our Cellar Paris service. During this upgrade, customers might experience increased request latency when we restart certain components. No service interruption is expected. The upgrade should take place throughout the week, starting on Monday 2025-01-13 around 09:00 UTC. We will update this maintenance once it is over or if anything comes up.

EDIT 2025-01-15 17:05 UTC: The maintenance is now over. No service interruption occurred.

Fixed · Global

We are currently in the process of upgrading our Cellar North service. During this upgrade, customers might experience increased request latency when we restart certain components. No service interruption is expected. The upgrade should take place between Wednesday 2025-01-08 15:00:00 UTC and Friday 2025-01-10. We will update this maintenance once it is over or if anything comes up.

EDIT 2025-01-09 16:40 UTC: The upgrade is now over. Some queries were slower than usual today around 12:00 UTC but no requests were lost.

Fixed · cleverapps.io domains · Global

Monitoring detected a high number of simultaneous connections, which results in connections being refused by the load balancer. We are investigating the issue and a way to solve it.

EDIT 15:20 UTC: The number of simultaneous connections has returned to normal. We are still investigating the reason behind this increase.

Fixed · Infrastructure · Global

Monitoring has detected that one of our hypervisors is unreachable. We are investigating the issue.

EDIT 18:00 UTC: The hypervisor is back up and running. We are proceeding with the recovery process.

EDIT 18:15 UTC: Services are back up and running. We are investigating the cause of the issue.

December 2024

Fixed · Infrastructure · Global

A hypervisor is unreachable on the MEA region. We are investigating.

EDIT 07:41 UTC: The hypervisor has been back online since 07:27, along with all of its services.

Fixed · Access Logs · Global

The processing components of the access logs pipeline have a bug that prevents us from processing access logs properly. We are currently investigating the issue.

EDIT 14:50 UTC: We found the bug and a patch for it. We have deployed the patch in production and are processing the access log queues.

EDIT 10:15 UTC: The access log queues have been completely consumed. We have been processing on the fly since yesterday 18:45 UTC.

Fixed · Global

(Times in UTC) A hypervisor has crashed on the RBX region. Applications that were running on this hypervisor are currently redeploying. We are investigating the cause of the crash.

  • 00:35: It looks like the kernel went rogue. We are rebooting the server.
  • 00:45: The server was successfully rebooted. We are starting to check that all the services are restarting correctly.
  • 00:55: Everything is back to normal. All databases are up and running.

Fixed · Global

A maintenance is planned on Grafana, starting at 21:00 CET (Paris time) and expected to last 1 hour.

[01:00 CET] Maintenance is now complete.

Fixed · Pulsar · Global

A few nodes of the Pulsar storage layer, known as BookKeeper, crashed, and the failure propagated to the Pulsar cluster. We are restoring the BookKeeper cluster and will then help the Pulsar cluster recover.

EDIT 19:15 UTC: We have deployed a patch to fix the BookKeeper cluster, along with the new configuration, and have rolled it out across the cluster. The Pulsar cluster should be available.

EDIT 20:20 UTC: Some nodes of the BookKeeper cluster are under memory pressure. We are investigating the issue.

EDIT 21:20 UTC: We found the issue and are deploying the patch.

EDIT 21:50 UTC: Situation is back to normal.

Fixed · Metrics · Global

We observed an issue when accessing Grafana Metrics dashboards, which fail with the message "Access denied to this dashboard".

A patch is currently being deployed.

[12:30 CET] All organisations have been patched.

Fixed · Infrastructure · Global

A hypervisor on the Paris region is experiencing degraded I/O operations. We are looking into it.

EDIT 20:25 UTC+1: The hypervisor has been back to normal levels since 20:08 UTC+1. We keep investigating the cause of the slow I/O. Applications on this hypervisor were redeployed elsewhere to avoid any issues.

Fixed · Infrastructure · Global

A hypervisor has crashed on the PAR region. Applications are currently redeploying. We are investigating the cause of the crash, probably an issue with the RAID array.

EDIT 11:50 CET: The hypervisor is up and running. We are still investigating the root cause.

Fixed · Infrastructure · Global

A hypervisor has crashed on the PAR region. Applications are currently redeploying. We are investigating the cause of the crash.

EDIT 16:40 CET: The hypervisor has been restarted and is now running.