
Recent server issues

Invision Power Board is the other "big" commercial forum software out there, and I don't think it's much better or worse than vB with respect to efficiency and scalability.

I assume the low-hanging fruit has been handled already: making sure XCache is installed and configured to run properly, analyzing the MySQL instance and tuning the cache values so that most of the common queries are cached, and making sure there's no memory bottleneck (RAM is comparatively cheap). You can use YSlow for Firebug to check the pages and make sure that the site is using the browser cache effectively. We made some Apache changes based on that to improve the caching performance, which reduced the raw # of hits to the server.

You could also look at using lighttpd (or there's another newer one that's supposed to be even better but the name eludes me) instead of Apache, much lighter and more efficient. We haven't done this one yet.

There also seem to be quite a few add-ons to vBulletin here. Some research into what each one does and how much load it might put on the server might be worthwhile, in case there's one that's causing an inordinate amount of load.

One other thing we've done to reduce the load on our server is use a content delivery network for all the static resources. All the forum images, avatars, CSS and JS files can be fairly easily relocated to a content delivery network and the settings changed in the control panel. This reduced the # of hits to our server by 50% or more, it was a pretty huge difference. Won't help directly if the problem is MySQL, but will help indirectly by reducing the amount of resources the http server is using. SimpleCDN is the one we used, very easy to use if you use their mirroring functionality.
 
Hey!! Speak for yourself. We the inmates have taken over the asylum. :D
 
Invision Power Board is the other "big" commercial forum software out there, and I don't think it's much better or worse than vB with respect to efficiency and scalability.

I assume the low-hanging fruit has been handled already: making sure XCache is installed and configured to run properly, analyzing the MySQL instance and tuning the cache values so that most of the common queries are cached, and making sure there's no memory bottleneck (RAM is comparatively cheap). You can use YSlow for Firebug to check the pages and make sure that the site is using the browser cache effectively. We made some Apache changes based on that to improve the caching performance, which reduced the raw # of hits to the server.

You could also look at using lighttpd (or there's another newer one that's supposed to be even better but the name eludes me) instead of Apache, much lighter and more efficient. We haven't done this one yet.

There also seem to be quite a few add-ons to vBulletin here. Some research into what each one does and how much load it might put on the server might be worthwhile, in case there's one that's causing an inordinate amount of load.

One other thing we've done to reduce the load on our server is use a content delivery network for all the static resources. All the forum images, avatars, CSS and JS files can be fairly easily relocated to a content delivery network and the settings changed in the control panel. This reduced the # of hits to our server by 50% or more, it was a pretty huge difference. Won't help directly if the problem is MySQL, but will help indirectly by reducing the amount of resources the http server is using. SimpleCDN is the one we used, very easy to use if you use their mirroring functionality.
I think we've just had a volunteer to provide technical assistance. :D Someone give this guy admin powers, quick!
 
I don't understand why the post count of an individual poster being high would affect anything, but if dropping mine to zero and keeping it there would help, go for it.
If that actually means deleting every post I ever made, let me save the valuable ones........ok, done.. Pull the control rods.
 
Lol, I'm just regurgitating a lot of what our server admin guy and I researched out and decided was good, so I had help. I'm not a linux guru by any means (which I hate, but never seem to have time to remedy).

Though I did compile and install XCache from source, so I was impressed with myself.
 
I don't understand why the post count of an individual poster being high would affect anything, but if dropping mine to zero and keeping it there would help, go for it.
If that actually means deleting every post I ever made, let me save the valuable ones........ok, done.. Pull the control rods.

The actual post count # that's displayed is just a field in the database and won't affect anything.

The number of records can have an impact though, I haven't researched it but I assume there's ways to archive older posts out to improve things; my forum has 1/3 the # of posts here so I haven't had to worry about that yet.

One other thing I forgot to mention is gzip compression; make sure it's enabled somewhere (either in the forum itself, or through apache with mod_deflate), but make sure it isn't enabled in BOTH places! :D I had that so every page the forum generated was being compressed twice, which of course uses up resources needlessly. :)
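To illustrate the double-compression point, here is a minimal Python sketch. The page content is made up (a hypothetical stand-in for a generated forum page), and Python's gzip module stands in for both the forum's own compression and mod_deflate, so treat the sizes as indicative only.

```python
import gzip
import random

# Hypothetical stand-in for a generated forum page (not real forum output).
rng = random.Random(0)
words = ["server", "query", "cache", "forum", "post",
         "thread", "apache", "mysql", "memory", "index"]
posts = [
    "<div class='post'><p>"
    + " ".join(rng.choice(words) for _ in range(40))
    + "</p></div>"
    for _ in range(200)
]
page = "\n".join(posts).encode("utf-8")

once = gzip.compress(page)   # first pass, e.g. done by the forum software
twice = gzip.compress(once)  # second pass, e.g. done again by the web server

print("raw bytes:         ", len(page))
print("compressed once:   ", len(once))
print("compressed twice:  ", len(twice))

# Undoing the double compression takes two decompression passes.
restored = gzip.decompress(gzip.decompress(twice))
```

The first pass shrinks the page substantially; the second pass has almost nothing left to squeeze out, so it only costs CPU time.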
 
The only thing worse than being a paid IT with a knowledgeable boss must be being a volunteer IT with about 60 knowledgeable members.

Why don't y'all quit jostling his elbows? If he had to read all your stuff he'd never be able to find the time to type his own SQL.

<duck and cover>
 
Lol, very true. I was just offering up the things we've done that have worked for us. I'm sure Terry's looked at 90% of what we went through already, though the content delivery network thing is not very common for vBulletin forums yet, so it's not a well-known option.
 
I don't understand why the post count of an individual poster being high would affect anything,

It doesn't affect anything until someone hits "See more posts by Soapy_Sam", at which point a very long-running query is generated which examines every post you've ever made, sorts them by date, and shows the most recent 400.
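As a rough illustration of why that kind of query is expensive, here is a sketch using Python's built-in sqlite3 as a stand-in for MySQL. The table and column names (post, userid, dateline) and the index are assumptions for the example, not taken from the forum's actual schema or queries, and SQLite's planner will not behave exactly like MySQL's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post ("
    "postid INTEGER PRIMARY KEY, userid INTEGER, dateline INTEGER, pagetext TEXT)"
)
# 20 hypothetical members with 1000 posts each.
conn.executemany(
    "INSERT INTO post (userid, dateline, pagetext) VALUES (?, ?, ?)",
    ((i % 20, i, "text") for i in range(20000)),
)

QUERY = (
    "SELECT postid FROM post WHERE userid = ? "
    "ORDER BY dateline DESC LIMIT 400"
)

def plan(connection):
    """Return SQLite's query plan for QUERY as one line of text."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (7,)).fetchall()
    return " | ".join(row[3] for row in rows)

# Without a suitable index: scan the table, then sort the matches.
plan_before = plan(conn)

# With an index on (userid, dateline) the newest 400 rows can be read
# straight off the index, with no separate sort step.
conn.execute("CREATE INDEX idx_post_user_date ON post (userid, dateline)")
plan_after = plan(conn)

recent = conn.execute(QUERY, (7,)).fetchall()
print(plan_before)
print(plan_after)
print(len(recent))
```

Whether an index like this exists or would help on the real server is a separate question; the sketch only shows the difference between "read everything, then sort" and "read the newest 400 directly".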

but if dropping mine to zero and keeping it there would help, go for it.
If that actually means deleting every post I ever made, let me save the valuable ones........ok, done.. Pull the control rods.

I'd rather not start throwing away data. If that does happen, it'll be because JREF has decided it is the appropriate course, and most likely that will be against my recommendation.
 
One other thing I forgot to mention is gzip compression; make sure it's enabled somewhere (either in the forum itself, or through apache with mod_deflate), but make sure it isn't enabled in BOTH places! :D I had that so every page the forum generated was being compressed twice, which of course uses up resources needlessly. :)

I suspect that for now it should be turned off completely, unless bandwidth is really restricted.

And additionally ( ;) ) I suppose that sooner or later it is going to come down to either more hardware or rewriting the bad SQL queries. (But with proper documentation of the changes, it should not be difficult to upgrade later.)
 
I'm just shooting ideas, but if there's a problem with users having many posts, would it be effective to split up really old posts to an archive sock puppet? So if I have 5000 posts older than a year, what about some task makes a sock puppet as close as possible to me and cuts&pastes those posts to the sock puppet (which would be Mitchell314_ or something)? And if I get 5000 more old posts, push them on the old sock puppet? Probably not too practical but I'm just throwing stuff out there. Beats banning the old members. Then again...
 
I suspect that for now it should be turned off completely, unless bandwidth is really restricted.

I guess it depends on what is limited, it's not just bandwidth, it takes apache less time to transfer the gzipped resource, quite a bit less time for raw HTML pages. Typically it's worth the extra CPU to compress them.
 
It doesn't affect anything until someone hits "See more posts by Soapy_Sam", at which point a very long-running query is generated which examines every post you've ever made, sorts them by date, and shows the most recent 400.
Would it help to simply disable that feature?
 
Would it help to simply disable that feature?
I find this feature useful for researching particular posts that I remember seeing, but not which thread they are in. I'm in two minds: in one sense I would not want to see it disabled; rather, we should all self-regulate and not use it to find posts by people with a large number of posts. But I'm not naive enough to believe that this could possibly work.
 
vBulletin makes some very inefficient queries, for one thing. In particular, it doesn't deal well with huge threads or members with huge numbers of posts. The other thing is the server needs more RAM (and ideally a 64-bit kernel, so that a single process can grab a larger amount of RAM too). Almost invariably when the "no DB connections left" messages start, it's because we're thrashing swap. I've tried various things to tune the db to use VM better, but once one of those giant "runs for several hundred seconds" queries kicks in, we usually end up dropping people.

ETA: if we didn't drop people, it probably wouldn't recover on its own, and it would be hung until rebooted (or until someone restarted Apache and/or the database, at least).
Yeah, it's what I call the LAMP death spiral.

Your server is trundling along more or less happily, then you get a query that takes longer than usual for whatever reason. Other queries back up behind it, which means that the Apache/PHP processes involved are sitting around taking up memory while they wait. As more requests come in, Apache starts up more processes to handle them, which takes up memory that would otherwise be used to cache the data from the database.

That means that MySQL has to hit the disk to fetch its data instead of just grabbing it from memory, which means the queries run even slower, and in a matter of minutes you can go from everything running fine to the server being so overloaded that you can't even log in to fix things.

The solution is exactly what you are proposing to do: Throw hardware at it.
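The death spiral described above can be illustrated with a toy model in Python. This is a sketch only: every number in it (memory per Apache process, database size, request rates, the process cap, when the slow query hits) is an assumption picked for illustration, not a measurement of any real server.

```python
def simulate(ram_mb, stall_ticks, ticks=60, proc_mb=20, db_mb=4000,
             arrivals=50, capacity=60, max_procs=400):
    """Toy model: return the number of waiting Apache processes per tick.

    Each waiting request holds one Apache/PHP process of proc_mb. Whatever
    RAM is left over caches the database; the smaller the cached share,
    the fewer requests the database can finish per tick.
    """
    inflight = 0
    history = []
    for tick in range(ticks):
        # New requests arrive; Apache stops spawning at max_procs.
        inflight = min(max_procs, inflight + arrivals)
        cache_mb = max(0, ram_mb - inflight * proc_mb)
        hit_ratio = min(1.0, cache_mb / db_mb)
        # One long-running query blocks everything for stall_ticks ticks,
        # starting (arbitrarily) at tick 10.
        stalled = 10 <= tick < 10 + stall_ticks
        done = 0 if stalled else min(inflight, int(capacity * hit_ratio))
        inflight -= done
        history.append(inflight)
    return history

short_stall = simulate(ram_mb=8000, stall_ticks=2)   # backlog drains again
long_stall = simulate(ram_mb=8000, stall_ticks=5)    # backlog never drains
more_ram = simulate(ram_mb=16000, stall_ticks=5)     # same stall, more RAM

print(short_stall[-1], long_stall[-1], more_ram[-1])
```

In this model a short stall is absorbed, a longer one pushes the process count to the cap and leaves no memory for caching, and doubling the RAM lets the same long stall drain away again.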

64-bit is nice but not critical; the Linux filesystem cache can grow to 64GB (I think) even on a 32-bit system, and while it's not quite as efficient as MySQL's own buffers (assuming you're using InnoDB), it gets the job done.

There's one other silver bullet you can use, and that's a Fusion-io ioDrive. They're not cheap (start at around $3000) but they're about 1000 times faster than a regular disk drive. I'm working on a system that handles about 1.5 million posts a day, and we do all sorts of crazy indexing (full text indexes, geospatial indexing) and one ioDrive handles everything we throw at it without blinking.
 
I googled these last time you mentioned them. Very nice bit of kit. Presumably the price of these will fall in the months to come?
 
Actually, the price has gone up. They're selling them as fast as they can churn them out, and the closest competition is a year away from shipping, so for the moment they have the market all to themselves.

Worth every penny, though.
 
There's one other silver bullet you can use, and that's a Fusion-io ioDrive. They're not cheap (start at around $3000) but they're about 1000 times faster than a regular disk drive. I'm working on a system that handles about 1.5 million posts a day, and we do all sorts of crazy indexing (full text indexes, geospatial indexing) and one ioDrive handles everything we throw at it without blinking.

Nice! You'd need a big array of SSD drives to even get close to what that thing can do.

Do you use Postgres? MySQL is nice but its geospatial isn't fully implemented which has driven me crazy from time to time (having to load a bunch of records inside a bounding box, then go through them again to find the ones that are actually inside the shape I really want kind of thing).
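The two-pass approach mentioned above (bounding box first, then the real shape) looks roughly like this in plain Python. It is only a sketch: the triangle and points are made-up sample data, and the exact test shown is a standard ray-casting point-in-polygon check, not anything MySQL-specific.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)

def in_box(point, box):
    """Cheap first pass: is the point inside the rectangle?"""
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y

def in_polygon(point, polygon):
    """Exact second pass using ray casting.

    Cast a ray to the right of the point and toggle 'inside' each time
    it crosses a polygon edge.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Made-up sample data: a right triangle and four points.
triangle = [(0, 0), (10, 0), (0, 10)]
points = [(1, 1), (9, 9), (4, 4), (20, 20)]

box = bounding_box(triangle)
candidates = [p for p in points if in_box(p, box)]           # coarse filter
matches = [p for p in candidates if in_polygon(p, triangle)]  # exact filter

print(candidates)
print(matches)
```

The point (9, 9) shows why the second pass is needed: it sits inside the bounding box but outside the triangle.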
 
Yeah, it's what I call the LAMP death spiral.

Your server is trundling along more or less happily, then you get a query that takes longer than usual for whatever reason. Other queries back up behind it, which means that the Apache/PHP processes involved are sitting around taking up memory while they wait. As more requests come in, Apache starts up more processes to handle them, which takes up memory that would otherwise be used to cache the data from the database.

Apologies if the following is stupid. Feel free to educate me. I know 0 about databases.
Is there no way to recognise a request that will take too long and simply reject it with a polite apology? I'm not asking for a solution to Turing's halting problem, just a recognition that (eg) If that eedjit has 15000 posts and I'm busy --> then call Copout. Better the originator of the request is frustrated than a dozen innocents are dumped in the cybersoup.
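The "call Copout" idea can be sketched as a simple guard in Python. Everything here is hypothetical: the function name, the 5000-post threshold and the "busy" rule are assumptions for illustration, and vBulletin has no such hook that I know of.

```python
def should_reject(post_count, load_average, cpu_count,
                  max_posts=5000, busy_ratio=1.0):
    """Decide whether to refuse an expensive 'all posts by member' request.

    Hypothetical rule: refuse only when the server is busy (load average
    per CPU above busy_ratio) AND the member has more than max_posts posts.
    """
    busy = load_average / cpu_count > busy_ratio
    return busy and post_count > max_posts

# A 15000-post member on a busy 4-CPU box: politely refuse.
print(should_reject(post_count=15000, load_average=8.0, cpu_count=4))
# Same member while the server is quiet: let the query run.
print(should_reject(post_count=15000, load_average=1.0, cpu_count=4))
# Small post count on a busy server: cheap enough, let it run.
print(should_reject(post_count=100, load_average=8.0, cpu_count=4))
```

On a Unix host the load figure could come from something like os.getloadavg(); the catch, as others point out in this thread, is that wiring a check like this into a commercial product means maintaining a modification to it.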

As for the hardware upgrade debate- we've raised money before for servers and can doubtless do it again. Given some good folk here are in the business, I'd hope we could take advantage of any savings they might be able to offer. Makes sense if we can give Terry what he needs at minimal cost to everyone.
 
Remember that vBulletin is a commercial product, and whilst tweaking around the edges, so to speak, is relatively straightforward, major rewrites of the core product are anything but.
 
Ah. So it's not yours to mess with?
I'm wallowing in my own iggerance here, darat.
 
Sure, they can mess with it.

But then the next version comes out, and they have to throw away all their hard work, or try to figure out how to integrate it into the new version. It'd be a nightmare.
 
I wonder, would there be a way to acquire a second server solely for archived posts, say, those older than two months, or something like that? Can the software shunt search queries to another server, and let the other server handle the query load and feedback?

Or, more to the point, is there a way to do this without each member of the tech staff at JREF going to get Master's Degrees in database administration?
 
Not built-in. Archive management is probably one of vBulletin's weakest features, it really doesn't let you do anything like this.
 
I understand Terry's reluctance to ditch data (and I remember the sobs and gnashing of teeth last time we did so), but it does sound to me like a regular trim would keep things more manageable.
 
Not sure I agree that throwing more hardware at the situation is unwarranted.
True, but I like the analogy of trying to drive fast with a flat tyre. You can certainly keep speed up by shoe-horning a turbo-charged dragster engine under the hood. But a far better solution would be to fix the flat first and use your existing engine as specified.

Of course, that's optimization 101. But this is effectively shrinkwrap software. We are in a world of hurt if we start changing vBulletin, and I for one am not up for supporting a fork of a licensed commercial product just because we happen to be able to read the PHP.
Ah, I see. Not a vBulletin guru myself, but I still take your point.

[Re: big tables] They aren't, but you have to look at all the posts in a thread to generate the thread display. Or at least look at an index containing all the posts.
Shouldn't have to do that if the thread is indexed wisely. Even just having post-numbers as an index of some sort for a thread allows selecting specific posts only to be returned for display. You don't need the whole thread returned in one query each time. Added advantage is that most activity will be at the end of most threads, so only the last few posts will need to be cached the most. As seen above, the SQL queries can stand to be reviewed! :)
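A sketch of that post-number idea, using Python's built-in sqlite3 as a stand-in for MySQL. The per-thread postnum column, the table layout and the 40-posts-per-page figure are all assumptions for the example; this is not how vBulletin is known to store threads.

```python
import sqlite3

PER_PAGE = 40  # hypothetical page size

def fetch_page(conn, threadid, page, per_page=PER_PAGE):
    """Fetch one page of a thread by post number, not the whole thread."""
    first = (page - 1) * per_page + 1
    last = page * per_page
    return conn.execute(
        "SELECT postnum, pagetext FROM post "
        "WHERE threadid = ? AND postnum BETWEEN ? AND ? "
        "ORDER BY postnum",
        (threadid, first, last),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post ("
    "postid INTEGER PRIMARY KEY, threadid INTEGER, postnum INTEGER, pagetext TEXT)"
)
# The index lets the database jump straight to the requested range.
conn.execute("CREATE INDEX idx_post_thread_num ON post (threadid, postnum)")

# One hypothetical 10,000-post thread.
conn.executemany(
    "INSERT INTO post (threadid, postnum, pagetext) VALUES (?, ?, ?)",
    ((1, n, f"post {n}") for n in range(1, 10001)),
)

page_rows = fetch_page(conn, threadid=1, page=3)
print(len(page_rows), page_rows[0][0], page_rows[-1][0])
```

The cost of keeping a gap-free post number per thread (for example when posts are deleted or moved) is not shown here and would need thought.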

But of course, there's always time and money issues to overcome too - appreciate that fully! Keep up the good work!
 
It doesn't affect anything until someone hits "See more posts by Soapy_Sam", at which point a very long-running query is generated which examines every post you've ever made, sorts them by date, and shows the most recent 400.
It seems like this is a five minute fix. All you have to do is disable that feature. Can anyone honestly say that they care what Soapy_Sam's first post was? Also, is the problem really that big of a deal as it is right now? So what if I can't make that post at that exact instant because someone is trying to figure out Soapy_Sam's first post.
 
Nice! You'd need a big array of SSD drives to even get close to what that thing can do.
A big array. :)

Do you use Postgres? MySQL is nice but its geospatial isn't fully implemented which has driven me crazy from time to time (having to load a bunch of records inside a bounding box, then go through them again to find the ones that are actually inside the shape I really want kind of thing).
We're using MySQL. We do that sort of filtering on some stuff, but 99% of the time the bounding box is good enough for what we do.
 
There's one other silver bullet you can use, and that's a Fusion-io ioDrive. They're not cheap (start at around $3000) but they're about 1000 times faster than a regular disk drive. I'm working on a system that handles about 1.5 million posts a day, and we do all sorts of crazy indexing (full text indexes, geospatial indexing) and one ioDrive handles everything we throw at it without blinking.

Holy Haleakala! That's amazing. $3k is a fair bit of cash, but wow. Thing is, we have a web host, so they'd have to support it. Not sure they do, but I'll ask Jeff to look into this.

Bigger and better hardware is the route we'll probably take. I have no desire to prune out posts, and there's no need. Computers tend to get faster with time (give me Moore!), so I'm sure we'll find what we need. :)
 
Holy Haleakala! That's amazing. $3k is a fair bit of cash, but wow. Thing is, we have a web host, so they'd have to support it. Not sure they do, but I'll ask Jeff to look into this.

Bigger and better hardware is the route we'll probably take. I have no desire to prune out posts, and there's no need. Computers tend to get faster with time (give me Moore!), so I'm sure we'll find what we need. :)

Hm. What would the specs of the new server be?

My guess/solution (hypothetical due to its price):
Core i7 965 EE
6GB RAM Corsair TR3X6G1600C8D
RAID 6 of 4 SCSI HDD (300GB each, 10k RPM)
(motherboard not specified)

or
Xeon Processor X7460
6GB RAM Corsair TR3X6G1600C8D
RAID 6 of 4 SCSI HDD (300GB each, 10k RPM)
(motherboard not specified)

Unfortunately I suspect that such a perfect server would be way too expensive... :( (Don't know who would be able to afford such a beast.)
On the other hand, it would last for more than eight years (three cores for Apache, three cores for xSQL; memory split the same way).
 
Technically, the server versions of the Core i7 aren't out yet, though there's nothing to stop you building a server out of a desktop CPU. But those configs don't have nearly enough memory.

I'm not sure how big the database is. If it's in the 4GB range, then the solution is pretty simple: A pair of quad-core servers, each with 8GB, one to run the database, the other to run Apache and PHP. That's the most cost-effective way to run this sort of system.

If the database is approaching 8GB, though, you'll want more than 8GB on that server, and that probably means moving to a dual-processor box, which is obviously going to be more expensive.

I don't know where the forums are hosted, but... Okay, now I do. Reliable, but not cheap.
 
Technically, the server versions of the Core i7 aren't out yet, though there's nothing to stop you building a server out of a desktop CPU. But those configs don't have nearly enough memory.

Depends. I do not know how memory-hungry Apache with PHP is, but I would say that it would be enough so far, since the only limitations for a desktop PC regarding memory are the lower number of modules and current memory density, plus the missing ECC and buffering for the server role.

That's why I included the previous-generation Xeon... ;)

BTW, Core i7 is already partially meant for database servers (at least according to some reviews I read).
 
Yep, Core i7 is designed more as a server chip than a desktop chip; I'm eagerly awaiting the server versions.

Apache/PHP can chew through memory like nobody's business. In the lead up to the election, my web server was running 200-250 Apache/PHP processes, each using 10-20MB of memory. And that's after I did everything possible to reduce memory usage, on a server that doesn't run a lot of complicated PHP anyway; before that, the processes were commonly using up to 50MB each, and sometimes more.
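For a sense of scale, the arithmetic behind those figures can be worked through in a few lines of Python. The process counts and per-process sizes are the ones quoted above; the helper function name is just for this example.

```python
def apache_memory_gb(processes, mb_per_process):
    """Total memory held by Apache/PHP processes, in GB (1 GB = 1024 MB)."""
    return processes * mb_per_process / 1024

# After tuning: 200-250 processes at 10-20 MB each.
low = apache_memory_gb(200, 10)
high = apache_memory_gb(250, 20)

# Before tuning: processes commonly used up to 50 MB each.
untuned = apache_memory_gb(250, 50)

print(f"tuned:   {low:.1f} to {high:.1f} GB")
print(f"untuned: up to {untuned:.1f} GB")
```

So even the tuned setup ties up roughly 2 to 5 GB in web server processes alone, and the untuned one could exceed 12 GB at the same process count.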

That's why it's so valuable to split off the database onto a separate box: Even if Apache gobbles up all the memory on its own box, the database itself is unaffected. Doesn't solve the problem, but it goes a long way to mitigate it.

On the MySQL side: If you want good performance from your database, you want the whole thing in RAM. Particularly if you're dealing with things like full-text indexes, which can arbitrarily haul tens of thousands of pages off disk in response to the simplest query.
 
Well, I'm certainly in the company of giants here. My company runs a small commercial web site (we sell the service and use the web to provide it) that gets an average of 60,000 page views per month. Contrast that with this forum, which gets 2.5 million.

About the only thing I can add is you really do not want to use commodity desktop hardware for even the small website we host, let alone this forum. I have no idea what the JREF is using, but we've been using HP ProLiant servers for our sites. We've been very happy with them. Linux runs really well on these guys (including the RAID array because HP open-sourced the driver code.)

I also don't know what the JREF's budget looks like, or how much priority they give to the forum. Good servers aren't cheap--you're looking at $2,000 to $3,000 for even a low end system. So buying two of them (one for the forum software, one for the database) is certainly an investment. And that's before they end up on Terry's desk with a note from Phil asking how long it will take to get them set up and installed at the data centre. :eek:

If our site ever gets up to 2 million page views a month, I'd sure be looking at splitting the database off to a separate server. We'll be a while getting there, though.
 