Clever Cloud Status

Incident History

Full history of incidents.

Newest first

March 2018

Fixed · Global

A dedicated add-ons reverse proxy is refusing new connections. It is being restarted.

EDIT 11:49 UTC: The incident has been over since 11:45 UTC.

Fixed · Services Logs · Global

Real-time log delivery is affected by an outage on our message broker. Log drains are affected as well. Logs are still archived.

EDIT 17:03 UTC: Real-time delivery has been back since 16:50 UTC.

February 2018

Fixed · Global

Our monitoring system has detected network connectivity issues. The issues were caused by a network configuration inconsistency and are now resolved.

Fixed · Deployments · Global

Node.js applications are failing to deploy because the nomnom module is missing. We are investigating the issue.

EDIT 10:53 UTC: As a temporary workaround, you can create the following environment variable: CC_PRE_RUN_HOOK=npm install nomnom@1.8.1 -g

EDIT 11:33 UTC: A fix has been made and the new image version is now deploying on our servers.

EDIT 12:33 UTC: The new image is now live. All Node.js applications will be redeployed so that none keeps running on the now-broken image.

Metrics
Fixed · Access Logs · Global

The metrics data cluster is under unusual load. Metrics display is currently unavailable, but metrics are still collected.

EDIT 17:35 UTC: Service is back to normal and collected metrics have all been correctly persisted.

Fixed · Global

The proxy is being restarted. Some add-ons may be unreachable until it's done.

EDIT 15:42 UTC: Incident over since 15:40 UTC.

Fixed · Services Logs · Global

The log storage cluster is experiencing network issues. We are working on it. In the meantime, only real-time logs are available.

January 2018

Fixed · Infrastructure · Global

The proxy is being restarted. Some add-ons may be unreachable until it's done.

EDIT 16:41 UTC: The proxy has been successfully restarted. Add-ons should be reachable again. Applications that cannot tolerate the loss of an established connection will be redeployed. We continue to monitor the proxy.

EDIT 17:30 UTC: The incident is now over.

Fixed · Infrastructure · Global

A Redis cluster went down and is restarting.

EDIT 20:17:00 UTC: The cluster has been restarted and impacted applications have been redeployed. The incident is over.

Fixed · Global

PostgreSQL add-on dashboards will be unavailable for about 15 minutes starting on 2018-01-25 at 12:30 UTC.

EDIT: Delayed to 12:50 UTC.

EDIT 12:50 UTC: The maintenance will start in a few seconds.

EDIT 13:07 UTC: Maintenance over. If you encounter an issue, please tell us.

Logs unavailable
Fixed · Services Logs · Global

Logs are currently unavailable. We are working on restoring them. All logs sent in the last 30 minutes won't be stored.

EDIT 03:15 UTC: Logs are available again.

Fixed · Global

The MongoDB shared cluster needs to be upgraded to have more resources.

Performance issues and/or a partial outage are to be expected. We will try to keep them to a minimum.

The maintenance starts at 22:00 UTC.

EDIT 02:00 UTC: The maintenance is now over.

Fixed · Infrastructure · Global

An add-on reverse proxy is restarting. Connections are being dropped and impacted applications will be redeployed.

EDIT 20:45:00 UTC: The reverse proxy took ~1 minute to restart. It is now back up.

EDIT 20:48:00 UTC: Impacted applications were redeployed as expected. The incident is now over and all add-ons are reachable again.

Fixed · Deployments · Global

All deployments from around 15:40 UTC might be shown in a FAILED state, even though they were successful. This is only a display issue: instances that deployed correctly are put into production.

The Activity pane (Console), clever status (CLI) and the API endpoint /applications/<app>/deployments incorrectly report the deployment status.

Notifications (Slack webhooks, emails) correctly report the deployment status (failed or successful) and can be trusted.

EDIT 21:48 UTC: It should now be fixed. Deployments already shown with the "FAILED" state will keep that incorrect state.

Fixed · MySQL shared cluster · Global

Network instability on Online DC2 makes some products unreachable:

  • MySQL shared cluster
  • PostgreSQL shared cluster
  • MongoDB shared cluster
  • One of the cleverapps front proxies

Fixed · MongoDB shared cluster · Global

The shared MongoDB cluster is experiencing issues. We're working on bringing it back up.

Fixed · Services Logs · Global

Due to limited disk space, we need to temporarily reduce the amount of logs we store. Only the last 4 days are kept, instead of the intended 7 days.

EDIT 2018-06-15 UTC: All 7 days are now available again.

December 2017

Fixed · Global

A core component will be upgraded. Deployments will be disabled for an hour starting at 11:30 UTC. This upgrade should fix some deployment delays, among other things.

EDIT 11:31 UTC: Maintenance is starting.

EDIT 12:06 UTC: Deployments are back. We are now cleaning up some old artefacts.

EDIT 13:00 UTC: The maintenance is over.

Fixed · Deployments · Global

Our deployment system is experiencing slowdowns. Some applications may take longer than usual to deploy. We are working on it.

EDIT 19:25 UTC: These slowdowns might require an infrastructure change that will be made next week. Until then, slowdowns should be less frequent and less severe.

EDIT 2017-12-08 12:00 UTC: Deployments take less time after some fixes on our end. The migration will still happen to fully fix the issue. The incident is considered closed because we no longer see any extra deployment time.

Fixed · Infrastructure · Global

We've observed an elevated error rate on two front load balancers newly added to the pool. We're pulling traffic back from these load balancers.