Clever Cloud Status

Incidents

Full history of incidents.


July 2017

Fixed · Global

We will be performing maintenance on the Cellar cluster starting on 2017-07-20 at 08:00 UTC.

This is a two-step maintenance; the second step will be scheduled at a later stage.

This should not have an impact on availability but may have a light to moderate impact on upload / download speeds.

No ETA as of now, we will be posting updates along the way.

EDIT 2017-07-20 08:00 UTC: Maintenance is starting now

EDIT 10:00 UTC: We expect the maintenance to end between 21:00 UTC and 2017-07-21 01:00 UTC; we are seeing no significant impact on upload / download speeds as of now

EDIT 14:45 UTC: The maintenance is running fine and still has no significant impact on performance, so we are keeping it as-is. Consider this event over; if something goes wrong, we will create a new event.

Fixed · Global

Maintenance of the logs system will take place at 10:00 UTC. Application logs will be unavailable during this maintenance.

The maintenance should not last more than 1 hour.

EDIT 10:18 UTC: Maintenance started a few minutes ago; log collection will be disabled in a few seconds

EDIT 10:44 UTC: Maintenance has been over for a few minutes; logs are now available

Fixed · API · Global

An issue occurred on the main API. It was mostly unavailable, answering only ~30% of requests at best for close to 10 minutes, until we switched to a backup system.

At this point, most services were available except for logs, events and notifications.

30 minutes after the beginning of this issue, the API is now fully available.

Fixed · Infrastructure · Global

The network is unstable in the Europe zone; we are seeing intermittent unreachability issues on multiple elements of our infrastructure. We are investigating.

EDIT 06:48 UTC: The network seems to work fine now. Deployments are unavailable, we are working on bringing them back up.

EDIT 07:35 UTC: Deployments have been back up since 07:15 UTC; we are still cleaning up the remaining items.

EDIT 07:40 UTC: Everything is cleaned up and functional now. If you have an issue, come ping us.

June 2017

Fixed · Deployments · Global

Deployments are disabled for a short maintenance operation.

EDIT 16:12 UTC: Deployments are back

Fixed · Deployments · Global

We are currently experiencing performance issues on a component of our deployment system. Deployments are delayed by a few minutes.

Fixed · Deployments · Global

We are doing a maintenance operation on a component of our monitoring system. Deployments may be delayed until the end of the operation.

This should last no more than 10 minutes. Deployments should not be delayed by more than a couple of minutes.

The maintenance operation will start at 09:10 UTC.

EDIT 09:19 UTC: Maintenance is over; we are now checking that everything is working fine. Deployments should go back to normal in the next few minutes.

EDIT 09:24 UTC: Deployment delays are back to normal; end of incident

Fixed · Infrastructure · Global

One hypervisor went down; affected applications are being automatically redeployed. Addons on this hypervisor are unreachable (~2% of dedicated addons in the Europe zone).

We are awaiting news from our provider.

EDIT 15:30 UTC: We are still awaiting a manual operation from our provider

EDIT 15:37 UTC: They have rebooted the server manually but "observed an error" and are "analyzing" the issue

EDIT 16:04 UTC: The power supply is out of order and is being replaced

EDIT 16:55 UTC: The operation is over; the server has just rebooted and will now start recovering / cleaning up after the forced reboot. Databases will come back online automatically.

EDIT 17:50 UTC: Most databases have been available since 17:15 UTC. The remaining databases are now available as well

Monitoring issue
Fixed · Deployments · Global

An incident occurred in our monitoring tools. Old instances are unable to stop, which is causing instability in applications.

Deployments are stopped until the monitoring is back up and running.

Fixed · Deployments · Global

We are working on fixing an issue with our application and addon monitoring system in the Europe zone. Deployments have been disabled to allow the monitoring to catch up faster.

Fixed · Infrastructure · Global

The addon gateway has been restarted, some connections have been forcibly closed.

Fixed · Infrastructure · Global

The addon gateway has been restarted, some connections have been forcibly closed.

Fixed · Global

A core component of the deployment infrastructure will be upgraded to improve stability and performance. As a result, deployments will be stopped for up to 60 minutes (hopefully less).

EDIT 11:05 UTC: Maintenance is fully over now, deployments have been available since 10:50 UTC.

Fixed · Deployments · Global

Deployments are taking longer to start due to higher-than-usual activity. We are working on fixing the problem.

EDIT 16:00 UTC: Deployment start times are back to normal

May 2017

Fixed · Deployments · Global

Deployments are taking longer to start due to higher-than-usual activity. We are working on fixing the problem.

Fixed · Deployments · Global

Deployments are disabled following an incident on a component of our deployment system. We are working on bringing it back up.

ETA is about an hour.

Fixed · Deployments · Global

Our monitoring system had a small network split, making it think applications were unreachable. This triggered a lot of redeployments. The applications themselves remain reachable. You might receive some emails with a "Monitoring/Unreachable" deployment reason.

Also, deployments are delayed until we clean up the non-essential redeployments

UPDATE 5:07PM UTC: The incident has been resolved; sorry for those redeployments

April 2017

Fixed · Infrastructure · Global

We are investigating the problem.

UPDATE 12:43PM UTC: The problem has been resolved; we will investigate why it happened and how to prevent it from happening again.

Fixed · Infrastructure · Global

We are investigating a network issue affecting a reverse proxy for the addons.

EDIT: The issue is gone. It looks like it was a temporary network issue on our provider's side.

March 2017

Fixed · Infrastructure · Global

Impacted applications are redeploying.

EDIT: Resolved at 14:10 UTC