Sites down, no comms

It seems that something was changed or updated and a bit was rolled back (at least I kind of remember that something from last week is missing, maybe). Any idea how far back?

The reasons are a whole different can of worms, but the biggest thumbs down has to go to the lack of any communication on any channel, medium or site.

2 Likes

Well, I contacted support and they explained what was going on. :+1:

2 Likes

Our Forums have been down for the last few days and we apologize for the interruption. We did communicate on this in the Matrix room but that quickly fades away in the backlog.

Here is the full story:

On Saturday the 29th of July, at 05:08 am GMT, we noticed that our main puri.sm website was no longer responding. After investigating, we realized that a huge number of requests were being sent to the server. Our Systems team reported that 10,000 requests from 3,500 different IP addresses had hit the server in 10 minutes. That is 1,000 requests per minute, or around 17 requests per second. Requests were hitting random pages, most of which did not even exist and returned a 404 error. We figured that we could be facing a Distributed Denial-of-Service (DDoS) attack.

After a few minutes the shop was down with the same symptoms, followed by the Gitlab repos, the Forums and the entire PureOS infrastructure, including the pureos.net website, the repos front-end and the repos themselves. At that point, PureOS users were not able to update their systems or install new software from the PureOS repositories.
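For anyone who wants to check the arithmetic, here is a minimal sketch using only the figures quoted above. The function name `request_rates` is made up for illustration. It also shows that each IP address sent fewer than three requests on average in that window, which is one reason a simple per-IP rate limit would not have caught much.

```python
def request_rates(total_requests: int, window_minutes: float, distinct_ips: int) -> tuple[float, float, float]:
    """Return (requests per minute, requests per second, average requests per IP)."""
    per_minute = total_requests / window_minutes
    per_second = per_minute / 60
    per_ip = total_requests / distinct_ips
    return per_minute, per_second, per_ip


# Figures reported by the Systems team for the first 10 minutes.
per_minute, per_second, per_ip = request_rates(10_000, 10, 3_500)
print(f"{per_minute:.0f} requests/minute")
print(f"{per_second:.1f} requests/second")
print(f"{per_ip:.1f} requests per IP address")
```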

After a few hours, we realized that there were hundreds of thousands of different IP addresses involved in the attack and it was pretty difficult to filter out the requests related to the attack from the requests from real visitors.

We are obviously not a big multinational company and our server resources are not unlimited. We decided to deal with each website one by one. We therefore set up some logic on our servers to analyze the requests and work out, as well as we could, what differentiates an attack request from a normal visitor’s request. After almost two days of analyzing, filtering and improving the server configurations, we managed to block most of the attack requests and were able to bring the main website and the shop back online.

What was a first victory then became a non-stop effort, as requests kept coming from ever-changing IP addresses.

By now, the attack has decreased in intensity and our forums, the last piece to restore, are back online.
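As an illustration only, one heuristic of this kind could be sketched as below. This is not the logic that was actually deployed, which is not described here. It assumes the access log has already been parsed into `(ip, status)` pairs, and the function name `suspicious_ips` and both thresholds are hypothetical. It builds on the observation that attack requests mostly hit pages that do not exist and return a 404 error.

```python
from collections import defaultdict


def suspicious_ips(entries, min_requests=5, min_404_ratio=0.8):
    """Flag IPs whose requests are mostly 404s.

    entries: iterable of (ip, http_status) pairs from a parsed access log.
    min_requests and min_404_ratio are illustrative thresholds, not tuned values.
    """
    totals = defaultdict(int)
    not_found = defaultdict(int)
    for ip, status in entries:
        totals[ip] += 1
        if status == 404:
            not_found[ip] += 1
    return {
        ip
        for ip, count in totals.items()
        if count >= min_requests and not_found[ip] / count >= min_404_ratio
    }


# Made-up sample data using documentation-reserved IP ranges.
sample_log = (
    [("203.0.113.5", 404)] * 6
    + [("198.51.100.7", 200)] * 5
    + [("198.51.100.7", 404)]
)
flagged = suspicious_ips(sample_log)
print(flagged)
```

With hundreds of thousands of addresses each sending only a few requests, a per-IP rule like this would miss a lot, so in practice it would have to be combined with other signals.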

We had never experienced DDoS attacks before and those numbers we got hit by were quite impressive for a first try. A huge thanks to our team who managed to brilliantly handle this situation and bring back our online services to the people!

25 Likes

I got the following answer by email from Purism’s support when I asked a few days ago while things were down:

Thank you for your report. Over the previous weekend we had a strong DDoS attack targeting our main website, forums, and PureOS repos. We mostly managed to mitigate and reduce consequences to a minimum, but some of our websites are still very hard to reach. We expect them to be up and running in the following days, so please check back regularly and follow our new blog to stay informed.

5 Likes

We were communicating in real-time on our Matrix channels, as those weren’t affected.

8 Likes

Matrix is like the smallest post-it note possible, stuck in a place the general public doesn’t know to look. For next time: Linux and hacker news sites, the main puri.sm webpage and news section, Mastodon, Reddit…

3 Likes

Was Purism (@purism@librem.one) - Librem Social down, too? I personally would have expected a small post there (just a short note) or on Twitter, as these are, to my mind, the primary places to look for corporate communications.

However, good that it’s back again!

2 Likes

Unfortunately, we had to do some upgrades to the servers and the latest backups of the forums would not run correctly so we had to roll back to a backup from the 28th. Again, we apologize for the inconvenience.

2 Likes

https://social.librem.one/ has been working, at least most of the time, I think.

3 Likes

And seeing what else has been filling the communications void, I’d say - from a risk management perspective - it would be a good idea to get a crisis plan together, which includes a comms plan as well.

3 Likes

I agree we could have done better in terms of communication. We tried to set up a temporary page on the website to explain the situation, but due to the nature of the attack even that page wouldn’t load. Also, our team is not that big, especially during the weekend, and we did put more focus and energy into solving the problem itself than into communicating about it.

That is what we are doing now.

13 Likes

In any case, good job and welcome back! :slightly_smiling_face:

7 Likes

Posting about this on Mastodon or librem.one could have been bad PR, or others might have tried to attack that too. The DDoS was obvious, I think. Thanks for being back. Please do not add Cloudflare at puri.sm. Just have two cheap, different IPv6 addresses for that service as well (just in case) and two more for your IPv6 DNS, or an onion domain to access or read the forums.

In that case it would not hurt much to have the IPv4 forums down.

I just hope it was not nudged by that right-to-repair alphabet influencer. Gosh… it’s hard to make this world a better place if the mainstream is against you. In real life everyone sees that something matters when you draw a crowd. But in the digital world, a crowd takes you offline. It is like in the 2000s, when the mobile phone network broke down because of too many people per square meter.

4 Likes

Yes, please don’t look into cheap DDoS protection services set up to deal with attacks many times the size of this. Instead, why not add more IP addresses to the same server that will struggle with the attack…

I assumed that something had happened to the servers and that, whatever it was, there would be no way to reconnect until it was fixed. So I waited and kept checking whether anything had changed. Once I noticed the repo was back online, I also visited the main page, expecting to find an answer there about when the forums would come back.

It would be nice to see some information on the main page or the blog as soon as they’re stable, while other services are still offline.

Everything else: good job for the whole team.

This was not something on the table. Cloudflare, among many other things, is a man in the middle.

11 Likes

One thing to note: the post above was being prepared already before the forums went online, you folks were just faster to ask than François was to post :wink:

[edit] and I was too fast to post after misreading, sorry for being offtopic :laughing:

2 Likes

I was mocking the post I replied to. (because they said don’t do the cheapest easiest thing that you could do, and suggested something that would be of little to no benefit.)

Cloudflare, Akamai, etc. are middle actors, but they are not MITM attackers.
- at least they are no more of a threat actor than Purism is a threat actor for providing design/assembly services between the pile of components and the assembled phone I’m still waiting for :wink:

You say you don’t have the skills to deal with the attack, you say you’re only a small team, you’ve proven that you don’t have the skills to deal with the attack that came, the sites have been offline for days, and you’ve lost data. You have struggled to deal with the kind of attack those guys are shrugging off every minute or so.

Go find a MITM to protect you and your business.

This is easily the worst suggestion I have ever seen on the Purism forums after 4+ years of lurking here.

4 Likes

In my eyes they dealt with it pretty well for a first encounter with such a problem. Don’t be rude just because it was not handled as perfectly as a fully experienced anti-DDoS team would have handled it.

2 Likes