Clever Cloud Status

Incidents

Full history of incidents.

January 2024

Fixed · Reverse Proxies · Global

We have removed the IP address 46.252.181.103 from the domain name domain.par.clever-cloud.com. One of our network partners detected an abnormal amount of traffic coming to this IP address and began mitigating it. We are investigating the issue.

EDIT 15:15 UTC: We are still digging into the issue. The abnormal traffic is over and everything seems to be going back to normal.

EDIT 16:30 UTC: We have put the IP address 46.252.181.103 back in the load balancer pool.

December 2023

Fixed · Cellar · Global

Between 16:58 UTC and 17:03 UTC, the Cellar service on the North region timed out on some requests. The faulty component has been decommissioned and further investigations will be done to understand the source of the timeouts. The service is currently up and running.

EDIT 2023-12-30 00:51 UTC: The problem has been identified and resolved. The component is back in the pool and is working as expected. This incident is now over.

Fixed · Access Logs · Global

We are seeing an elevated error rate for metrics queries due to the underlying storage system. The problem has been identified and we are working toward its resolution. This can impact some of the Grafana dashboards or API queries.

EDIT 09:44 UTC: The issue is not fully resolved yet but we are seeing improvements. We continue working on the issue.

EDIT 11:04 UTC: Queries have been working since 10:15 UTC. We continue monitoring to ensure everything is working as intended.

EDIT 15:43 UTC: Everything is back to normal, this incident is now over.

Fixed · Infrastructure · Global

A hypervisor is unreachable. We are investigating.

EDIT 03:17 UTC: No database is affected on this hypervisor, and applications have been redeployed.

EDIT 03:30 UTC: The hypervisor has been rebooted and everything is back to normal.

Fixed · Infrastructure · Global

A hypervisor is unreachable. We are investigating.

EDIT 3:37 UTC: The issue seems to be related to the following OVH incident: https://bare-metal-servers.status-ovhcloud.com/incidents/x135vv46x85l

EDIT 3:45 UTC: Applications on this hypervisor are currently redeploying and there are no add-ons on it. We have also temporarily removed the A record from domain.rbx.clever-cloud.com to solve connection issues.

EDIT 4:00 UTC: Applications have been redeployed. We are waiting for OVH before going further.

EDIT 05:30 UTC: The hypervisor is reachable again. We are starting the recovery process.

EDIT 05:45 UTC: The recovery process is over and everything works normally. The load balancer IP affected by the incident will be put back in the pool later. For the record, the IP is 87.98.177.176 for domain.rbx.clever-cloud.com.

Fixed · Reverse Proxies · Global

We are seeing the number of connections on load balancers rising. We are investigating.

EDIT 10:20 UTC: The investigation is still in progress and we are mitigating the issue by raising the maximum number of connections.

EDIT 11:00 UTC: We are now back to nominal values. We are still watching.

Fixed · Global

We are planning to do various updates on one of our datacenters in the Paris region starting at 14:15 UTC. It will last for a few hours. No issue is to be expected during this maintenance.

We will update this status accordingly.

EDIT 15:10 UTC: Maintenance is over, no impact during the operations.

Fixed · Infrastructure · Global

Our monitoring detected that a hypervisor located in RBX-1 is unreachable. We are investigating.

EDIT 06:07 AM UTC: The hypervisor became unresponsive due to a very high CPU load average. It has been rebooted. Almost all databases are reachable; we are fixing the last ones.

EDIT 06:45 AM UTC: All databases are now up.

API instability
Fixed · API · Global

We have detected a high number of errors towards certain APIs. One of the core databases has been restarted to restore the service.

Fixed · Reverse Proxies · Global

We have detected a configuration issue on our internal load balancer. It has been fixed. You may have experienced issues connecting to api.clever-cloud.com and the console for a few minutes.

Fixed · Matomo add-on · Global

We are investigating an issue that has caused Matomo add-on creation to fail for the past few days.

EDIT 2023-12-21 16:00 UTC+1: We found and fixed the root cause. Matomo add-ons can now be ordered again.

Fixed · Global

We are planning to do various updates on one of our datacenters in the Paris region starting at 15:40 UTC. It will last for a few hours. No issue is to be expected during this maintenance.

We will update this status accordingly.

EDIT 17:30 UTC: Maintenance is over, no impact during the operations.

Fixed · Reverse Proxies · Global

We are observing connection issues on load balancers. We are investigating.

EDIT 16:00 UTC: We have found that one of our customers is under a DDoS attack. We are mitigating the issue.

EDIT 16:30 UTC: The DDoS attack seems to be mitigated. We are watching.

Fixed · Infrastructure · Global

A hypervisor in the Jeddah region has been unreachable since 10:25 UTC. We are investigating.

EDIT 10:55 UTC: The hypervisor went back online at 10:33 UTC. All applications were redeployed to another hypervisor. The incident is now over.

Fixed · Infrastructure · Global

A hypervisor in the SCW region crashed. We restarted it.

Some databases became unavailable. We are checking that they all rebooted correctly.

EDIT 15:51 UTC: All checks have completed. All the services are operational.

EDIT 04/12/2023 11:00 UTC: It seems that the load balancer behind the IP 212.129.27.183 was impacted by the incident. The issue is solved.

Fixed · Reverse Proxies · Global

16:44 UTC: One of the reverse proxies for databases became unresponsive on SCW. We restarted it.

16:47 UTC: The reverse proxy has restarted and is working again.

Consequences: some applications on SCW may have lost connection to their database for a few minutes. They may have crashed and been redeployed by our monitoring.

November 2023

Fixed · API · Global

Our main API is responding slowly. We are investigating to find out why.

EDIT 19:00 UTC: The issue has been solved.

Fixed · Global

We are planning to do various updates on one of our datacenters in the Paris region starting at 13:30 UTC. It will last for a few hours. No issue is to be expected during this window.

We will update this status accordingly.

EDIT 17:30 UTC: All updates are now over. Operations went smoothly and no impact was detected.

Fixed · Global

We are planning to do various updates on one of our datacenters in the Paris region starting at 14:00 UTC. It will last for a few hours. No issue is to be expected during this window.

We will update this status accordingly.

EDIT 23:15 UTC: All updates are now over. Operations went smoothly and no impact was detected.

Fixed · Reverse Proxies · Global

In our efforts to fix the issues listed in this status, we fully moved our traffic from the old LB running sōzu 0.13 to new LBs running sōzu 0.15 at 13:30 UTC.

While performing the move, a network configuration issue arose, impacting only customers using TCP redirections on the PAR region.

As the team was focused on monitoring and fine-tuning the configuration of the new LB, it failed to see the error reports until 14:30 UTC. To prevent such an incident in the future, we have since improved our monitoring and alert tools for TCP redirects.

The issue was fixed by 14:55 UTC.