Incidents
Full history of incidents.
November 2017
Due to a software update, deployments will be disabled for up to 30 minutes starting at 12:30 UTC+1.
EDIT 13:00 UTC+1: The maintenance is over; deployments have been back up for the last 15 minutes.
A network issue affecting a front load balancer on the PAR zone has been identified and fixed
October 2017
Due to a network issue, some FS buckets have been unavailable for a short period of time. All FS buckets are now available.
One hypervisor has experienced a hardware issue and is rebooting. Affected apps are being redeployed; affected addons will be available shortly.
Due to a phishing application deployed on a cleverapps.io domain, the whole domain name has been marked as malicious.
We are working on clearing the alert. In the meantime, we'd like to warn you that cleverapps.io domain names are provided only for test purposes and that they should not be used in production.
September 2017
A node hosting shared redis databases has been restarted after experiencing connectivity issues. Impacted applications will automatically be redeployed once connectivity is restored.
Our main API is currently slower than usual. We are looking into it.
EDIT 12:09 UTC: The API is now performing smoothly. We will keep investigating why it entered that state.
A shared redis cluster went down. It is being restarted.
EDIT 17:30 UTC: All shared redis databases are now available again.
The .cleverapps.io domains have issues resolving through multiple DNS servers. It seems like the top .io TLD DNS servers are the root cause of the problem. Users using this domain may have error messages like "Server not found".
If you need it, here is the IP of the domain: 217.70.184.38
EDIT 19:43 UTC: The incident seems to be resolved, .cleverapps.io domains now resolve correctly
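For incidents like this one, where the domain's IP is published while DNS is broken, one possible stopgap is to pin the IP locally. A minimal sketch, assuming a hypothetical app hostname (`myapp.cleverapps.io` is a placeholder, not a real app):

```shell
# The published IP for the cleverapps.io domain during the outage.
# "myapp.cleverapps.io" is a placeholder hostname for illustration.
HOSTS_ENTRY="217.70.184.38 myapp.cleverapps.io"

# On a real machine you would append this entry to /etc/hosts (needs root),
# so lookups bypass the failing .io TLD servers:
#   echo "$HOSTS_ENTRY" | sudo tee -a /etc/hosts
# Here we only print the entry that would be added.
echo "$HOSTS_ENTRY"
```

Remember to remove the entry once DNS resolution recovers, or the hostname will keep pointing at the pinned IP.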
One of our hypervisors is under heavy load, making it unresponsive.
Impacted applications are being redeployed.
EDIT 10:15 UTC: The server is still under heavy load. Services on it continue to answer correctly in most cases. Applications are still redeploying.
EDIT 10:30 UTC: The server is now reachable and responsive; we are looking into why it came under such heavy load.
One of our physical servers has gone down. Impacted applications are being redeployed and we are investigating the incident.
EDIT 14:24 UTC: The server is still down; we are waiting for more information from our provider.
EDIT 14:37 UTC: One of the server's fans has died and the server won't start.
EDIT 14:43 UTC: Impacted databases will be migrated to another server upon request to support. We will also contact impacted users. Let us know if you want to start a new database from tonight's backup.
EDIT 15:23 UTC: Our provider is replacing the fans, no ETA for now
EDIT 16:55 UTC: Our provider replaced the fans and the server is now back up. Non-migrated databases have been started again and linked applications are being redeployed. We will continue to monitor the situation.
A reverse proxy serving addons traffic has started refusing connections. It has been restarted and is now serving traffic correctly. The affected applications have been automatically restarted.
August 2017
The API and the Console are currently unavailable. We are working on bringing them back. Applications are not impacted
EDIT 17:40: Everything is back; sorry for the interruption.
Deployments are currently delayed; we are working on restoring them.
Update 13:57 UTC: deployments are now back up, we continue to monitor the situation
Update 14:15 UTC: it's all good now
The MySQL shared cluster is experiencing increased load. Dedicated DBs are not affected.
Update 09:29 UTC: the master node has been restarted. We're watching it closely
Update 15:45 UTC: the master has been stable since then.
The Montreal zone has some network issues; we are currently reaching out to our hosting provider.
Update 20:13 UTC: The network seems more stable now. We are still waiting for more information from our provider.
Update 21:17 UTC: Our provider has confirmed the issue is fixed.
A network outage occurred on our add-on reverse proxies. We are currently monitoring the situation to avoid further downtime. Impacted applications will be automatically redeployed.
EDIT 18:00 UTC: All good now.
July 2017
We are migrating all credit card information from one payment gateway to another. During this migration, you will not be able to manage your credit cards. You will still be able to perform payments, though.
EDIT 29/07/17 11:35 UTC: the migration will begin at 12:15 UTC. During the migration and for a few hours after, credit card management might not work.
EDIT 29/07/17 13:10 UTC: the migration is over, we will continue to monitor payments for a few hours
We will be doing a maintenance on the Cellar cluster starting on 2017-07-26 at 08:00 UTC.
This is the 2nd step of the maintenance started on the 20th (https://status.clever-cloud.com/incident/31).
This should not have an impact on availability but may have a slightly bigger impact on performance than the first step (which did not have any noticeable impact).
It should take around 10 hours. This is a very rough estimate though, we will be posting updates along the way.
EDIT 08:01 UTC: Maintenance is starting now.
EDIT 11:55 UTC: Everything is going smoothly. Performance impact is very low.
EDIT 19:35 UTC: Maintenance is still in progress. No significant impact; as with the 1st step, consider this event over.
One hypervisor is unreachable. Affected applications are being redeployed automatically. Affected addons are unreachable.
EDIT 12:05 UTC: All affected applications have finished redeploying; we are awaiting an answer from our provider.
EDIT 12:47 UTC: Our provider is "running tests" on the affected server and has not given any ETA as of now.
EDIT 13:00 UTC: The server is reporting a hardware error, not disk-related. Our provider is working on fixing the issue.
EDIT 13:31 UTC: The server fails to start. Our provider is giving us another server and will put the disks of the old server into the new one.
EDIT 14:30 UTC: The server is ready, the disks are up and running. We are now rebooting the server in operational mode and will make sure everything starts up fine and will then update the network configuration.
EDIT 15:11 UTC: All databases are available again.