We’ll give the upgrade a new try tomorrow. I’ve had some good input from admins of other instances, who are also gonna help troubleshoot during/after the upgrade.
Also there are newer RC versions with fixed issues.
Be aware that if we need to roll back again, posts made between the upgrade and the rollback will be lost.
We see a huge rise in new user signups (duh… it’s July 1st) which also stresses the server. Let’s hope the improvements in 0.18.1 will also help with that.
PSA from Admin Team: The update completed roughly two hours ago. Since that time, the Admin team (and other site admins) have been working on the noted performance issues. We believe we have found a solution, but we still need time to test this out. You may still see brief outages and differences in performance as we are testing different configurations. We are trying to prevent rolling back.
While I know this can be frustrating - especially today - please keep in mind we have a team of volunteer techies (from around the globe!) collaborating on this issue. It is an inspiring situation. Also keep in mind that lemmy.world is quite a bit larger (and more active) than any other instance. As such, we are a bit of a ‘test instance’ with regard to high-volume requests. This is just part of the growing pains. We appreciate your understanding.
@ruud@lemmy.world will provide a debrief once we have completed testing.
We are happy to have all of you! Do what is needed to make this place the best. Even reddit sucked in the early days.
502 it went through, 504 try once more
Thanks for the kind words! Yeah, there are definite growing pains, and likely will be for some time (just due to the codebase we are working with, understandably). We have a really solid group heading up lemmy.world though, so we will be just fine ;)
Not sure if this is a good place to ask, but it has been on my mind these days: with the big user boom, and probably most of it being directed at .world, should some of us consider moving to another instance to make room? Would it make sense? Would it help? Or are you ok having so many users under the .world umbrella (possibly causing an even greater flood with people seeing it as the “main” (with biig quotes there) instance)?
Technically speaking, yes, a portion of our issues are due to having the largest user base of any Lemmy instance. So in theory, if half of our users dispersed to other instances, we would likely see some performance improvement here. However, lemmy.world is intended to be an accessible instance for the general population. The server running lemmy.world is spec’d to handle far more than this user load. We are running up against code-level issues that we may or may not be able to get around with our internal configurations. This is just part of developing software in an environment where you go from a few thousand users total to hundreds of thousands in the space of a few weeks. There is no directive to have users create accounts on new instances, though if you are looking for an immediate performance improvement, that may be your best option currently. That is up to you to decide :)
Gotcha. So pretty much if I want to personally avoid the growing pains (or possibly ease them for others), I should consider it, but it doesn’t particularly help you directly. Thanks for clarifying and thanks for all the work you all put into it!
Yup, you nailed it. For additional context, Ruud is running an almost identical server for his Mastodon.world server, which has 160k users. Relatively speaking, these are large, performant, and expensive servers. They can absolutely handle the current user influx we are getting from the Reddit exodus. Our hands are tied by software limitations, unfortunately. I can confidently tell you we are constantly in communication about ways we can improve the user experience with the tools that we do have access to. For instance, this status page was recently spun up, which you can access anytime you think there might be server issues to help confirm that what you are seeing is recognized at a server level. Things like that.
All that being said, for users who are looking for a smoother experience right now, I can recommend lemm.ee as a solid home as well. Their admin Sunaurus has been very active and helpful throughout this process and handles his instance very professionally. He is essentially another Ruud (though Ruud is the best! ;)). Just something to keep in mind going forward as I can’t make any promises about the time frames for these issues being resolved. Hopefully once we get contact back from the Lemmy devs we can start expediting a resolution. They have a lot on their plates right now though, haha, so we will see. Cheers!
Thanks for the additional context, you guys rock!
It’s not just performance issues (which seem to be solved right now?), the login form doesn’t work at all.
Please try to clear the browser cache.
The new version is doing some of the API calls differently, which may cause issues with the old/cached version in your browser.
The login form issue should be related to the overall spiking we were seeing, though I can’t say we had conclusive evidence of that. I have been able to get it to load properly with a few refreshes. Maybe try clearing your cache? I had to do that several times during testing.
deleted by creator
This is being reviewed by the Admin team. Are you logging in via browser or app?
deleted by creator
That is good to hear :) we continue to analyze the config to see where this failure is potentially happening.
deleted by creator
Server feels a lot better now than it did an hour ago. Comments are going through quickly and upvotes are working for me.
Upvotes are still slow. I’m testing the comment right now.
Ok, comments are fine (edits too) 👍
Upvotes are still slow indeed but they are at least appearing now after 15 seconds. 👍
Working well for me too! Hope we can keep this even if there are still a few bugs.
I’ve noticed that voting is slow, joining/leaving communities is slow (or doesn’t happen at all in some cases), but commenting is very fast now.
Performance is extremely uneven. Sometimes loads instantly, sometimes I get a timeout. Upvotes don’t show up until reload. Still a lot better than the last attempt.
The server hamsters seem to be on fire, but I won’t hold it against the service/community/individuals (except spez; fuck you spez). Lemmy is in the unfortunate position of being forced into a development model called “fuck it, we’ll do it live!”
edit: commenting and editing seems to be fine.
Seems to be evening out
Plane lands and everyone is disembarking
Rest in peace Leslie Nielsen you glorious comedic bastard.
Heyo! Small update from someone who is watching the upgrade live: it’s still ongoing. Seems like they are still facing some performance issues. So grab your popcorn and wait! 🍿 (Btw, I am not a sysadmin, just a moderator)
Where do you look at it?
It’s on a private discord server for instance admins and mods, so not for public. Sorry :)
Alright, thank you for the update, hope everything goes well!
I’m a moderator, who should I reach out to to receive an invite to the discord? I want to post a new forum game to one of the communities I mod for, but I’m holding off until I can be sure this update is stable.
It should be good now. The discord server is only for people who help moderate the instance itself, and not community mods. I help with the moderation part. If you have questions then you can ask in the !support@lemmy.world or !moderators@lemmy.world communities. Sorry :)
Understood. Thanks!
Sigh, just like reddit. Private spaces that plebs aren’t invited to / can’t access.
You are free to make your own lemmy instance and make everything public. That’s the beauty of lemmy over reddit.
So, you think a public stream showing all backend infos including credentials to databases and stuff like that, is a good idea? 🤔
TFW you run a small server with an open source social platform for shits and giggles and one month it explodes a thousandfold because of one greedy pig boy.
I’ve spent almost an hour just trying to sign up. Poor servers. This is my first comment here and I sincerely hope this takes off in the best way.
I’ve been on here almost a month and there are usually very few issues and the admin team are quick to troubleshoot when an issue does arise. It should calm down soon!
Servers are being hammered hard right now lol, but it should calm down soon, you’ll see it’s pretty stable outside of reddit hugs of death :D
It’ll calm down as in more resources are provisioned, or as in redditors looking to ditch reddit give up and go back to reddit because of stability issues, thus lowering the user count closer to normal?
Because the latter isn’t ideal 😬
I have business fiber and my own compute. Can nodes be run distributed? I’d be happy to spin up VMs to run a node service. Or does this rely on a single central host?
You can host your own instance, which will be federated with this one by default. Every little bit helps.
deleted by creator
Can you tell me more?
Can we have distributed compute? Is there some official process to be federated with lemmy.world to provide resources? How do upgrades work? I assume some sort of consensus algorithm is in use (ie. Raft)?
It’s not distributed computing. They are different servers, independent from each other, that replicate data among themselves because they use the same communication protocol (ActivityPub). That’s what federation does.
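To make the "same protocol, independent servers" point concrete, here is a rough sketch in Python of what federation boils down to: one server wraps a new comment in an ActivityPub activity and hands a copy to the inbox of every other server that follows it. This is generic ActivityPub (ActivityStreams 2.0 vocabulary), not Lemmy's exact wire format. The URLs, `make_create_activity` and `fan_out` are made-up names for illustration, and a real server would also sign and HTTP POST each delivery.

```python
import json

# ActivityStreams 2.0 context and the special "public" audience address.
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
AS_PUBLIC = AS_CONTEXT + "#Public"


def make_create_activity(actor_url, content, note_id):
    """Wrap a piece of content in a generic ActivityPub Create activity."""
    note = {
        "id": note_id,
        "type": "Note",
        "attributedTo": actor_url,
        "content": content,
        "to": [AS_PUBLIC],
    }
    return {
        "@context": AS_CONTEXT,
        "id": note_id + "/activity",  # hypothetical id scheme
        "type": "Create",
        "actor": actor_url,
        "object": note,
        "to": [AS_PUBLIC],
    }


def fan_out(activity, follower_inboxes):
    """Return (inbox_url, json_body) pairs, one per distinct remote inbox.

    Each remote server stores its own copy of the activity. Nothing here
    shares CPU or database load between servers.
    """
    body = json.dumps(activity)
    return [(inbox, body) for inbox in sorted(set(follower_inboxes))]


if __name__ == "__main__":
    activity = make_create_activity(
        "https://a.example/u/alice",      # hypothetical user
        "Testing a comment after the upgrade",
        "https://a.example/comment/1",    # hypothetical comment URL
    )
    for inbox, body in fan_out(activity, ["https://b.example/inbox"]):
        print(inbox, len(body), "bytes")
```

So spinning up your own instance adds another independent copy of the content you subscribe to. It does not lend compute to lemmy.world, and there is no consensus algorithm involved.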
looking good so far, the instance is back, testing to post!
Good luck.
We’re all counting on you.
Posting and seems fine from Jerboa, can’t log in using wefwef.
Can’t log back in with mobile browser either (browser is cleared properly). I have not logged out from liftoff, that still works fine.
I can’t log into Jerboa with my .world account at all right now. Just doesn’t work.
Edit: works now. Huh, damn, it’s going pretty fast, too.
I am also unable to log in to Jerboa. I’m posting this from the desktop. I am logged into lemm.ee on Jerboa.
Experiencing the same (logged in via Jerboa just fine, unable to log in via desktop web browser).
Just a heads-up. I ended up having to create a new login at lemm.ee because even after the improvements in speed and so on with the upgrade, for all intents and purposes it looks like my account here may have been wiped. Can’t login from anywhere, getting “Record not found” whenever I try to sign in on Connect, “incorrect login” from Jerboa, and a spinning button and then nothing on a PC browser, basically treating my account as if it doesn’t exist. Some folks appear to be able to login though. Is this an ongoing server issue, or did my account actually get wiped? Same username (Grangle1) as on lemm.ee.
Logins are returning “not found” errors. It’s a bug. If you had a previous session that was already logged in, you should be able to access it.
Looks like Lemmy.world got slightly better.
This is a test comment. Commenting took about 5 seconds
EDIT: Test edit took about 3 to 5 seconds
Can I try replying to my comment?
Commenting took about 3 seconds
Editing took about 3 seconds.
Posting/commenting takes slightly longer compared to 0.17.4, which was 1-3 seconds
Finally able to register. Glad to be here.
Thank you for the heads-up. 🤙
This is like that scene in a sci-fi movie where they’re trying to repair the ship mid-battle. Good luck!
Watching lemmy react realtime is hella interesting, good luck y’all!
I used to work with a guy who was a sysadmin for a moderately sized webhost in the mid-2000s when things like containers and cluster orchestrators didn’t exist and high availability/multi-master database systems were only really accessible to banks.
He once described patching the servers “like trying to change all the tires on a car without being allowed to pull over”.
Can’t you disable the post button during the upgrade process? It’s a shame to see posts disappear because of it.
The best would be to redirect to a 502 for the duration of the migration. Better safe than sorry.
I think a static page explaining to users that the server is being upgraded would be best.
Beehaw.org did this, their nginx seems configured to return a nice error message.
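For anyone curious what a maintenance page like that could look like, here is a minimal sketch. It is in Python rather than nginx config, and it is not Beehaw's actual setup, which isn't shown in this thread. It answers every request with 503 Service Unavailable plus a `Retry-After` header, which is the standard way to signal planned downtime (as opposed to the 502 suggested above), and it also answers POSTs so nothing gets silently accepted and then lost. The page text, port and `maintenance_response` helper are all assumptions for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Static page shown to every visitor while the upgrade runs (ASCII only).
MAINTENANCE_HTML = (
    b"<!doctype html><html><head><title>Upgrade in progress</title></head>"
    b"<body><h1>We'll be right back</h1>"
    b"<p>The server is being upgraded. Posting is disabled until it is done.</p>"
    b"</body></html>"
)


def maintenance_response(retry_after_seconds=600):
    """Build the (status, headers, body) triple for a maintenance reply."""
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(MAINTENANCE_HTML)),
        # Hint to clients and crawlers about when to try again.
        "Retry-After": str(retry_after_seconds),
    }
    return 503, headers, MAINTENANCE_HTML


class MaintenanceHandler(BaseHTTPRequestHandler):
    def _reply(self):
        status, headers, body = maintenance_response()
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        if self.command != "HEAD":
            self.wfile.write(body)

    # Same answer for reads and writes, so no post is accepted mid-upgrade.
    do_GET = do_POST = do_PUT = do_DELETE = do_HEAD = _reply


if __name__ == "__main__":
    # Hypothetical local port; a real deployment would sit behind the proxy.
    HTTPServer(("127.0.0.1", 8080), MaintenanceHandler).serve_forever()
```

In practice you'd point the reverse proxy at something like this (or have the proxy serve the static page itself) for the duration of the migration, then switch back once the database is ready.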