KoihimeNakamura
Creativity Murderer
Hm. Well, I suppose I can't offer much, as the only other software I know of is phpBB, which is... not something I'd eagerly recommend.
I think we've just had a volunteer to provide technical assistance. Someone give this guy admin powers, quick!
Invision Power Board is the other "big" commercial forum software out there, and I don't think it's much better or worse than vB with respect to efficiency and scalability.
I assume the low-hanging fruit has been handled already: making sure XCache is installed and configured to run properly, analyzing the MySQL instance and tuning the cache values so that most of the common queries are cached, and making sure there's no memory bottleneck (RAM is comparatively cheap). You can use YSlow for Firebug to check the pages and make sure that the site is using the browser cache effectively. We made some Apache changes based on that to improve the caching performance, which reduced the raw number of hits to the server.
You could also look at using lighttpd (or there's another newer one that's supposed to be even better but the name eludes me) instead of Apache, much lighter and more efficient. We haven't done this one yet.
There also seem to be quite a few add-ons to vBulletin here. Some research into what each one does and how much load it might put on the server might be worthwhile, in case there's one that's causing an inordinate amount of load.
One other thing we've done to reduce the load on our server is use a content delivery network for all the static resources. All the forum images, avatars, CSS and JS files can be fairly easily relocated to a content delivery network and the settings changed in the control panel. This reduced the number of hits to our server by 50% or more, which was a pretty huge difference. It won't help directly if the problem is MySQL, but it will help indirectly by reducing the amount of resources the HTTP server is using. SimpleCDN is the one we used; it's very easy to use if you use their mirroring functionality.
I don't understand why the post count of an individual poster being high would affect anything, but if dropping mine to zero and keeping it there would help, go for it.
If that actually means deleting every post I ever made, let me save the valuable ones........ok, done.. Pull the control rods.
One other thing I forgot to mention is gzip compression: make sure it's enabled somewhere (either in the forum itself, or through Apache with mod_deflate), but make sure it isn't enabled in BOTH places! I had it set up that way, so every page the forum generated was being compressed twice, which of course uses up resources needlessly.
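For anyone who wants to see what "compressed twice" looks like in practice, here is a minimal Python sketch. It is only an illustration: the sample page and the helper names `is_gzipped` and `compress_once` are made up for this post, and it says nothing about how vBulletin or mod_deflate actually implement compression. It shows that a doubly compressed page needs two decompression passes to recover, and that a simple guard can skip the second pass.

```python
import gzip

def is_gzipped(payload: bytes) -> bool:
    """Return True if the payload starts with the gzip magic bytes (1f 8b)."""
    return payload[:2] == b"\x1f\x8b"

def compress_once(payload: bytes) -> bytes:
    """Compress the payload unless it already looks like gzip data.

    Simplification: a payload that merely happens to start with 1f 8b
    would be skipped too. A real server would track this with headers.
    """
    return payload if is_gzipped(payload) else gzip.compress(payload)

# Hypothetical forum page, invented for the example.
page = b"<div class='post'>Hello, forum!</div>\n" * 500

once = gzip.compress(page)    # forum software OR the web server compresses
twice = gzip.compress(once)   # BOTH layers compress: a wasted second pass

# One decompression pass on the doubly compressed page yields gzip data,
# not HTML; a second pass is needed to get the page back.
print(is_gzipped(gzip.decompress(twice)))                    # True
print(gzip.decompress(gzip.decompress(twice)) == page)       # True
print(compress_once(once) == once)                           # True
```

The second pass spends CPU on data that is already compressed, which is the needless resource use described above.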
I suspect that for now it should be turned off completely, unless bandwidth is really restricted.
We could just get rid of all the crazy people...that would free up a lot of bandwidth. Although I think there would only be like..5 people left.
It doesn't affect anything until someone hits "See more posts by Soapy_Sam", at which point a very long-running query is generated which examines every post you've ever made, sorts them by date, and shows the most recent 400.
Would it help to simply disable that feature?
I find this feature useful for researching particular posts that I remember seeing, but not which thread they are in. I'm in two minds: in one sense I would not want to see it disabled; rather, we should all self-regulate and not use it to find posts by people with a large number of posts. But I'm not naive enough to believe that this could possibly work.
vBulletin makes some very inefficient queries, for one thing. In particular, it doesn't deal well with huge threads or members with huge numbers of posts. The other thing is that the server needs more RAM (and ideally a 64-bit kernel, so that a single process can grab a larger amount of RAM too). Almost invariably when the "no DB connections left" messages start, it's because we're thrashing swap. I've tried various things to tune the DB to use VM better, but once one of those giant "runs for several hundred seconds" queries kicks in, we usually end up dropping people.
ETA: if we didn't drop people, it probably wouldn't recover on its own, and it would be hung until rebooted (or until someone restarted Apache and/or the database, at least).
Why can't we copy what the biggest forums do?
Crash a lot?
There's one other silver bullet you can use, and that's a Fusion-io ioDrive. They're not cheap (start at around $3000) but they're about 1000 times faster than a regular disk drive. I'm working on a system that handles about 1.5 million posts a day, and we do all sorts of crazy indexing (full text indexes, geospatial indexing) and one ioDrive handles everything we throw at it without blinking.
Yeah, it's what I call the LAMP death spiral.
Your server is trundling along more or less happily, then you get a query that takes longer than usual for whatever reason. Other queries back up behind it, which means that the Apache/PHP processes involved are sitting around taking up memory while they wait. As more requests come in, Apache starts up more processes to handle them, which takes up memory that would otherwise be used to cache the data from the database.
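To make the feedback loop concrete, here is a toy Python model of it. Every number in it (RAM size, memory per process, arrival and completion rates) is invented for illustration and is not a measurement of this server. The function `simulate` and its parameters are hypothetical. The model assumes the rate at which requests finish scales with how much RAM is left over for the database cache.

```python
def simulate(slow_query_ticks, ticks=40, total_ram_mb=4096, proc_mb=20,
             arrivals_per_tick=20, max_finish_per_tick=30):
    """Return a list of (waiting_processes, cache_mb) pairs, one per tick.

    For the first `slow_query_ticks` ticks a slow query blocks everything
    behind it, so requests pile up and each waiting process holds proc_mb.
    """
    waiting = 0
    cache_mb = total_ram_mb
    history = []
    for tick in range(ticks):
        waiting += arrivals_per_tick
        if tick >= slow_query_ticks:
            # Assumption: throughput shrinks as the DB cache shrinks.
            capacity = int(max_finish_per_tick * cache_mb / total_ram_mb)
            waiting -= min(waiting, capacity)
        # Memory held by waiting processes is taken away from the cache.
        cache_mb = max(total_ram_mb - waiting * proc_mb, 0)
        history.append((waiting, cache_mb))
    return history

print(simulate(slow_query_ticks=2)[-1])    # short stall: backlog drains
print(simulate(slow_query_ticks=10)[-1])   # long stall: cache is gone
```

With these made-up numbers, a two-tick stall drains on its own, while a ten-tick stall leaves no memory for the cache, so nothing finishes and the backlog only grows until something is restarted.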
Not sure I agree that throwing more hardware at the situation is unwarranted.
True, but I like the analogy of trying to drive fast with a flat tyre. You can certainly keep speed up by shoe-horning a turbo-charged dragster engine under the hood. But a far better solution would be to fix the flat first and use your existing engine as specified.
Of course, that's optimization 101. But this is effectively shrinkwrap software. We are in a world of hurt if we start changing vBulletin, and I for one am not up for supporting a fork of a licensed commercial product just because we happen to be able to read the PHP.
Ah, I see. Not a vBulletin guru myself, but I still take your point.
[Re: big tables] They aren't, but you have to look at all the posts in a thread to generate the thread display. Or at least look at an index containing all the posts.
Shouldn't have to do that if the thread is indexed wisely. Even just having post numbers as an index of some sort for a thread allows selecting only the specific posts to be returned for display. You don't need the whole thread returned in one query each time. An added advantage is that most activity will be at the end of most threads, so it's mostly the last few posts that need to be cached. As seen above, the SQL queries can stand to be reviewed!
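A rough sketch of the post-number idea, in Python with an in-memory SQLite database so it runs anywhere. The table and column names (`post`, `thread_id`, `post_num`) and the `fetch_page` helper are hypothetical; this is not vBulletin's actual schema, and the forum itself runs on MySQL. It also assumes post numbers within a thread are dense, which deleted posts would break.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the composite primary key doubles as the index
# that lets a page be fetched by post-number range.
conn.execute("""
    CREATE TABLE post (
        thread_id INTEGER NOT NULL,
        post_num  INTEGER NOT NULL,
        body      TEXT NOT NULL,
        PRIMARY KEY (thread_id, post_num)
    )
""")
conn.executemany(
    "INSERT INTO post VALUES (?, ?, ?)",
    [(1, n, f"post {n}") for n in range(1, 1001)],
)

def fetch_page(conn, thread_id, page, per_page=40):
    """Fetch one page of a thread by post-number range, not the whole thread."""
    first = (page - 1) * per_page + 1
    last = first + per_page - 1
    return conn.execute(
        "SELECT post_num, body FROM post "
        "WHERE thread_id = ? AND post_num BETWEEN ? AND ? "
        "ORDER BY post_num",
        (thread_id, first, last),
    ).fetchall()

last_page = fetch_page(conn, thread_id=1, page=25)
print(len(last_page), last_page[0][0], last_page[-1][0])
```

Only the 40 rows for the requested page are returned, however long the thread is.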
It seems like this is a five minute fix. All you have to do is disable that feature. Can anyone honestly say that they care what Soapy_Sam's first post was? Also, is the problem really that big of a deal as it is right now? So what if I can't make that post at that exact instant because someone is trying to figure out Soapy_Sam's first post?
Nice! You'd need a big array of SSD drives to even get close to what that thing can do.
A big array.
Do you use Postgres? MySQL is nice but its geospatial support isn't fully implemented, which has driven me crazy from time to time (having to load a bunch of records inside a bounding box, then go through them again to find the ones that are actually inside the shape I really want, kind of thing).
We're using MySQL. We do that sort of filtering on some stuff, but 99% of the time the bounding box is good enough for what we do.
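For readers who haven't run into the two-pass filtering described here, a minimal Python sketch follows. It is an assumption-laden illustration, not anyone's production code: the helper names, the triangle, and the sample points are all invented, and the second pass uses a standard ray-casting test in plain Python rather than anything MySQL provides. Points lying exactly on an edge are not handled specially.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)

def in_box(point, box):
    """Cheap first pass: the kind of test an index can answer quickly."""
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y

def in_polygon(point, polygon):
    """Second pass: ray casting, counting edge crossings to the right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical shape and records.
triangle = [(0, 0), (10, 0), (0, 10)]
points = [(1, 1), (9, 9), (20, 20)]

box = bounding_box(triangle)
candidates = [p for p in points if in_box(p, box)]          # [(1, 1), (9, 9)]
matches = [p for p in candidates if in_polygon(p, triangle)]  # [(1, 1)]
print(candidates, matches)
```

The point (9, 9) is the case being complained about: it passes the bounding-box test but is not inside the actual shape, so it only drops out in the second pass.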
I would start by deleting anyone with more than 40,000 posts.
I suggest we clear the slates on all posters with over 17,000 posts. They can start fresh, no one would care.
Holy Haleakala! That's amazing. $3k is a fair bit of cash, but wow. Thing is, we have a web host, so they'd have to support it. Not sure they do, but I'll ask Jeff to look into this.
Bigger and better hardware is the route we'll probably take. I have no desire to prune out posts, and there's no need. Computers tend to get faster with time (give me Moore!), so I'm sure we'll find what we need.
Technically, the server versions of the Core i7 aren't out yet, though there's nothing to stop you building a server out of a desktop CPU. But those configs don't have nearly enough memory.