We apologize for a period of extreme slowness today. The army of AI crawlers just leveled up and hit us very badly.

The good news: We're keeping up with the additional load of new users moving to Codeberg. Welcome aboard, we're happy to have you here. After adjusting the AI crawler protections, performance significantly improved again.

in reply to Codeberg

It seems the AI crawlers have learned how to solve the Anubis challenges. Anubis is a tool hosted on our infrastructure that requires browsers to do some heavy computation before accessing Codeberg. It has saved us tons of nerves over the past months: instead of manually maintaining blocklists, it gave us a working way to tell "real browsers" apart from "AI crawlers".
in reply to Codeberg

However, we can confirm that at least Huawei networks now send the challenge responses, and they do seem to take a few seconds to compute the answers. It looks plausible, so we assume the AI crawlers have leveled up their computing power and now emulate enough real-browser behaviour to bypass the variety of challenges the platform enabled to hold off the bot army.

reshared this

in reply to Codeberg

If some of the attack is coming from Huawei's cloud hosting, it might be worth sending a complaint to their abuse department. IME Chinese companies tend to be scared of breaking rules in international dealings like this.
in reply to Codeberg

We have a list of explicitly blocked IP ranges. However, a configuration oversight on our part applied these blocks only on the "normal" routes; the "Anubis-protected" routes didn't check the blocklist. That wasn't a problem as long as Anubis itself also kept the crawlers out on those routes.

But now that they had managed to break through Anubis, there was nothing stopping these armies.

It took us a while to identify and fix the config issue, but we're safe again (for now).
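For anyone running a similar setup, the lesson generalizes: an IP blocklist has to be checked on every route, including the challenge-protected ones. Below is a minimal Python sketch of such a check using the standard `ipaddress` module; the ranges are documentation-example addresses, not Codeberg's actual blocklist.

```python
import ipaddress

# Hypothetical blocklist -- real ranges are maintained by the operators.
BLOCKED_RANGES = [
    ipaddress.ip_network(r) for r in ("203.0.113.0/24", "198.51.100.0/24")
]

def is_blocked(client_ip: str) -> bool:
    """Return True if the client falls inside any blocked range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BLOCKED_RANGES)

# The fix amounts to running this check on *every* route, including the
# challenge-protected ones, before any other handling.
print(is_blocked("203.0.113.7"))  # True: inside a blocked range
print(is_blocked("192.0.2.1"))    # False: not listed
```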

in reply to Codeberg

So, to clarify: do you have evidence that the bots were solving Anubis challenges, or was it due to the configuration issue? (I think it's inevitably going to happen if Anubis gets traction; I'm just curious whether we're already there.) Thanks for your work and transparency on all this.
in reply to Stefano Zacchiroli

@zacchiro Yes, the crawlers completed the challenges. We tried to verify if they are sharing the same cookie value across machines, but that doesn't seem to be the case.
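For the curious, one way to run that kind of check is to count how many distinct client IPs present each challenge cookie: a cookie seen from many machines would suggest answers are being shared. A hedged sketch, assuming access logs have already been parsed into (IP, cookie) pairs; the records and field layout here are invented:

```python
from collections import defaultdict

# Hypothetical pre-parsed access-log records: (client_ip, challenge_cookie).
records = [
    ("198.51.100.1", "cookie-a"),
    ("198.51.100.2", "cookie-b"),
    ("198.51.100.1", "cookie-a"),
]

ips_per_cookie = defaultdict(set)
for ip, cookie in records:
    ips_per_cookie[cookie].add(ip)

# Cookies presented from more than one IP would indicate shared answers.
shared = {c: ips for c, ips in ips_per_cookie.items() if len(ips) > 1}
print(shared)  # {} -- in this sample, each cookie stays on one machine
```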
in reply to Codeberg

I have a follow up question, though, @Codeberg, re: @zacchiro's question. Is it *possible* that giant human farms of Anubis challenge-solvers actually did it? Or did it all happen so fast that there is no way it could be that?

#Huawei surely could fund such a farm and the routing software needed to get the challenge to the human and back to the bot quickly enough that it might *seem* the bot did it.

in reply to Bradley Kuhn

@bkuhn
Anubis challenges are not solved by humans. It's not like a captcha: it's a challenge that the browser computes, based on the assumption that crawlers don't run real browsers for performance reasons and only implement simpler HTTP clients.

So at least one crawler now seems to emulate enough browser behaviour to pass the Anubis challenge. ~f
@zacchiro
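For readers unfamiliar with the mechanism: challenges of this kind are a proof of work. The browser brute-forces a nonce until a hash of the challenge string plus the nonce meets a difficulty target, which is cheap for one human visitor but expensive at crawler scale. A simplified Python sketch of the idea (not Anubis's actual code; the challenge string and difficulty are illustrative):

```python
import hashlib

def solve_pow(challenge: str, difficulty: int) -> int:
    """Brute-force a nonce so that sha256(challenge + nonce) starts
    with `difficulty` zero hex digits. Illustrative only."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce  # submitted back to the server for verification
        nonce += 1

# Each extra hex digit of difficulty multiplies the expected work by 16;
# the server verifies the answer with a single hash.
nonce = solve_pow("example-challenge", 4)
print(nonce)
```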

in reply to Codeberg

For the load average auction, we offer these numbers from one of our physical servers. Who can offer more?

(It was not the "wildest" moment, but the only one for which we have a screenshot.)

in reply to Codeberg

"AI crawlers learned how to solve the Anubis challenges"

Why does the EU discuss chat control and not AI crawler control again?

in reply to Codeberg

😲🤬 re: what's happened to @Codeberg today.
The AI ballyhoo *is* a real DDoS against one of the few code hosting sites that takes a stand against slurping #FOSS code into LLM training sets — in violation of #copyleft.

Deregulation/lack-of-regulation will bring more of this. ∃ plenty of blame to go around, but #Microsoft & #GitHub deserve the bulk of it; they trailblazed the idea that FOSS code-hosting sites are lucrative targets.

giveupgithub.org

#GiveUpGitHub #FreeSoftware #OpenSource

in reply to Bradley Kuhn

@bkuhn if anyone needs it, there is this gist showing how to pseudo-automate bulk repository deletion.
gist.github.com/mrkpatchaa/637…

and this tool, reporemover.xyz, is also very handy

in reply to serk

IMO, @serk, the better move is not to delete the repository, but to do something like I've done here with my personal “small hacks” repository:

github.com/bkuhn/small-hacks

I'm going to try to make a short video of how to do this, step by step. The main thing is that rather than 404'ing, the repository now spreads the message that we should #GiveUpGitHub!

in reply to Bradley Kuhn

@bkuhn @serk When @librecast moved our repos I wrote a script to wipe the GitHub repo and replace it with the #GiveUpGitHub README:

codeberg.org/librecast/giveupg…
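For anyone who wants the gist of it without reading the script: the idea is to build a fresh one-commit repository containing only the notice README and force-push it over the GitHub copy. A rough Python sketch under those assumptions (the linked librecast script is the real reference; the README text, committer identity, and remote URL here are made up):

```python
import pathlib
import subprocess

README = "# This project has moved!\n\nSee https://giveupgithub.org for why.\n"

def build_notice_repo(path: str) -> None:
    """Create a one-commit git repository containing only the notice README."""
    subprocess.run(["git", "init", "-b", "main", path], check=True)
    pathlib.Path(path, "README.md").write_text(README)
    subprocess.run(["git", "-C", path, "add", "README.md"], check=True)
    subprocess.run(
        ["git", "-C", path,
         "-c", "user.name=notice", "-c", "user.email=notice@example.org",
         "commit", "-m", "This project has moved"],
        check=True,
    )

# Force-pushing `main` from that repository then replaces the GitHub copy:
#   git -C <path> push --force git@github.com:USER/REPO.git main
```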

in reply to Codeberg

Are you going to publish your findings anywhere? It could trigger the bot spike again, but more Forgejo instances will likely be hit by this soon, so it would be good to establish some way to share this among Forgejo instances to prevent abuse.