Future of the Forum

icerat · Aug 3, 2014

Christian Klippel said:
Are SSDs reliable enough nowadays, given the intrinsic limit on the number of write cycles of the flash memory they use?

Yes. Indeed, with decent brands they may be more reliable than spinning SATA drives.

Or are you planning to use SSD for the OS and slow-changing stuff only, and have the fast-changing stuff on regular (RAID-)drives?

A lot depends on funding of course. Even just upgrading the server without SSD will make a noticeable difference. Anything more is icing.

icerat · Aug 3, 2014

jhunter1163 said:
I have little doubt that we could raise $10,000 in 24 hours if it'd preserve the Forum and solve the performance issues. My $100 is still on the table.

Any preference on a crowd-funding platform to support this?

Donn · Aug 3, 2014

icerat - without meaning to teach you how to suck eggs - please don't forget to cron some kind of regular backup of the db to an external location!

icerat · Aug 3, 2014

idoubtit said:
I suppose it's OK to release details. I'm trying to be conscientious here but I'm not that worried about hackers.

There has not been dedicated staff to deal with the forum or website. OBVIOUSLY.

There WILL be change. To hope that every nook, cranny and post count remains the same might be a pie in the sky. What do you want? Improvement. Sometimes it means resetting the odometer and starting from scratch.

Life goes on.

Thanks. I think we can maintain the site pretty much as is, including post counts etc. Look and feel will eventually change a bit, but we can do things incrementally.

Here's the current basic server configuration. It's a dedicated box installed 5.5 years ago -

Code:

Bare Metal Server Installed 2/19/2009 in Seattle @ Softlayer
CentOS 2.6.18-92.el5
SuperMicro X7QCE Intel Xeon HexCore QuadProc Sata [4Proc]
4x2GB Generic RAM
4x2.13GHz Intel Xeon-Tigerton (7320-Quadcore)
SuperMicro AOC-SIMSO-plus Remote Management Card
2xSuperMicro PWS-1K01-1R Power Supply
SuperMicro BPN-SAS-828TQ Backplane
Adaptec 3405 Drive Controller
Western Digital Raptor 10,000 RPM WD1500ADFD (sdb) for Database
Seagate Cheetah ST373455SS [73GB] (sda) for system

It is currently hosting both the forum and the main website; there are already plans to migrate the website. The main website is filtered through a firewall, which is why it has a different IP.

Here are some network statistics

Code:

In: Average 201.11 Kbps, Max 781.84Kbps
Out: Average 2.6 Mbps, max 25.2 Mbps

I assume from that it's on a 100Mbps network.
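
As a rough sanity check of that assumption, the outbound figures can be expressed as a share of the link. This is only a sketch: the 100 Mbps link speed is the guess made above, not a confirmed value, and the function name is made up for illustration.

```python
# Outbound figures come from the statistics above.
# The link speed is an assumption, not a confirmed value.
ASSUMED_LINK_MBPS = 100.0


def utilisation_percent(rate_mbps, link_mbps=ASSUMED_LINK_MBPS):
    """Return the share of the link in use, as a percentage."""
    return 100.0 * rate_mbps / link_mbps


avg_out = utilisation_percent(2.6)
peak_out = utilisation_percent(25.2)
print(f"average outbound: {avg_out:.1f}% of the assumed link")
print(f"peak outbound:    {peak_out:.1f}% of the assumed link")
```

If the assumption holds, even the peak uses only about a quarter of the link, so bandwidth is unlikely to be the bottleneck.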

Hard drive performance -

Code:

%iostat -d 3
Linux 2.6.18-92.el5 (jref.randi.org)    08/02/2014

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              15.23       226.15       455.27   78704786  158440916
sdb             231.65      6702.42      1794.54 2332545177  624528576

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               1.67         5.33        24.00         16         72
sdb             271.00      3770.67      6261.33      11312      18784

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               4.00         0.00       253.33          0        760
sdb             273.67      3789.33      6064.00      11368      18192

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              18.33         2.67       357.33          8       1072
sdb             265.00      3882.67      5906.67      11648      17720

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               2.33         0.00        58.67          0        176
sdb             279.33      3888.00      7200.00      11664      21600
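
To put the sdb numbers into more familiar units, here is a small sketch that averages the four interval samples above (the first report is the since-boot summary, so it is left out). It relies on iostat counting "blocks" as 512-byte sectors on 2.4 and later kernels; the variable names are only illustrative.

```python
SECTOR_BYTES = 512  # iostat "blocks" are 512-byte sectors on 2.4+ kernels

# (tps, Blk_read/s, Blk_wrtn/s) for sdb, interval samples only
sdb_samples = [
    (271.00, 3770.67, 6261.33),
    (273.67, 3789.33, 6064.00),
    (265.00, 3882.67, 5906.67),
    (279.33, 3888.00, 7200.00),
]


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


avg_tps = mean(s[0] for s in sdb_samples)
avg_read_kib = mean(s[1] for s in sdb_samples) * SECTOR_BYTES / 1024
avg_write_kib = mean(s[2] for s in sdb_samples) * SECTOR_BYTES / 1024
# Average size of one transfer, reads and writes combined
avg_request_kib = (avg_read_kib + avg_write_kib) / avg_tps

print(f"transfers/s:  {avg_tps:.2f}")
print(f"read KiB/s:   {avg_read_kib:.2f}")
print(f"write KiB/s:  {avg_write_kib:.2f}")
print(f"KiB/transfer: {avg_request_kib:.1f}")
```

The throughput is only a few MiB/s in total, made up of many small transfers, which suggests the request rate matters more than raw bandwidth for the database disk.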

File system usage

Code:

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda5             9.7G  586M  8.7G   7% /
/dev/sda8             996M   34M  911M   4% /tmp
/dev/sda7             104G   52G   46G  54% /home
/dev/sda3             9.7G  6.3G  3.0G  68% /usr
/dev/sda2             9.7G  2.2G  7.1G  24% /var
/dev/sda1              99M   12M   83M  13% /boot
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/sdb1              68G   33G   32G  51% /var/lib/mysql
/home/domlogs         104G   52G   46G  54% /usr/local/apache/domlogs

I thought the MySQL db seemed overly large; however, vBulletin 3.7 apparently stores images in the db by default. Assuming this is the current setup, moving them to the file system would probably help performance.

I have not yet got details on the current backup strategy.

System Performance

Code:

%vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 7  0    252  86180  55660 7092544    0    0   219    71    7    7 26  3 62  9  0
 4  0    252  80972  55740 7094984    0    0   912  1376 1625 2253 20  1 76  3  0
 3  0    252  57136  55820 7100208    0    0  1621   692 1396 3459 21  1 74  4  0
 3  1    252  57728  55824 7094740    0    0  1211   616 1423 2835 19  1 76  3  0
 4  5    252  47628  55536 7052016    0    0  3113  1072 1399 11270 21  3 65 11  0
 2  5    252  47948  55572 7050052    0    0  3379   479 1529 2090 10  1 79 10  0

I'm not an expert on interpreting vmstat, but this sample doesn't appear too bad (i.e. it's not from a time when things are lagging badly). Still, a CPU upgrade would provide some immediate benefits, as I suspect would moving images out of the db, if they're not already.

icerat · Aug 3, 2014

Donn said:
icerat - without meaning to teach you how to suck eggs - please don't forget to cron some kind of regular backup of the db to an external location!

yeah, as I note above I don't yet know what the current backup strategy is. Hopefully there is one!

a batch backup of a 32gb database ain't quite so simple ....

Sherman Bay · Aug 3, 2014

This thread is moving too fast for me to keep up with it, so perhaps I'm repeating what has already been said. I've only read the first page, but (1) Abaddon is right, and (2) it seems like JREF is trying to host the hardware itself. Not a good idea. Let the cloud do it and you don't worry about CPUs, RAM, power supplies, or cooling towers -- all you need is to arrange for sufficient bandwidth and adequate storage.

There are many suppliers out there. Although I am no longer running a message board, my domain is hosted by GoDaddy, which has an entire staff dedicated to 24/7 support, and not in India, either. They have been very good to me and I've never paid a penny for additional support, even though I call them every 6 months or so with a concern.

I'm using a shared server, which is adequate for me, and GoDaddy gives me some outrageous amount of storage for video plus "unlimited" email space, and unmeasured data volume, all for about $100 per year for a business account. Moving to a dedicated server would be only a small upgrade.

Not long ago, I was hosting about 500GB of video for public access, and the shared server worked just fine. I daresay that the JREF message board's overall traffic for a few hundred users, being text-based, is not much different than a dozen users downloading video. JREF's needs are not that challenging in today's cloud world.

Last I checked, vBulletin's price was just a few hundred dollars for unlimited use, and your hosting service should assist in installation and migration for little or no cost.

Please, Sharon, look into this, and you'll be glad you did. We all will.

jhunter1163 · Aug 3, 2014

icerat said:
Any preference on a crowd-funding platform to support this?

I think Causes would be best; since JREF is a nonprofit the fee would only be 4.75% instead of the 7-9% on Kickstarter or Gofundme.

a_unique_person · Aug 3, 2014

icerat said:

I'm not an expert on interpreting vmstat, but this sample doesn't appear too bad (i.e. it's not from a time when things are lagging badly). Still, a CPU upgrade would provide some immediate benefits, as I suspect would moving images out of the db, if they're not already.

Really want some stats over time. The snapshot looks fine, but we don't know what it looks like when it is hanging. More RAM to buffer the DB wouldn't hurt.

Klimax · Aug 3, 2014

Interesting details on the server. It seems there haven't been many changes since Darat last published them. (The last public upgrade was some years ago.)

Klimax · Aug 3, 2014

Wait, is anybody else missing avatars of other posters?

Donn · Aug 3, 2014

icerat said:
yeah, as I note above I don't yet know what the current backup strategy is. Hopefully there is one!

a batch backup of a 32gb database ain't quite so simple ....

Ouch. That takes it out of my range of experience.

I googled a bit. This link:
http://dba.stackexchange.com/questions/20/how-can-i-optimize-a-mysqldump-of-a-large-database
.. mentions a 32gb database and shows how to use mysqldump and some bash to perform a backup.

I also saw mention of this free backup tool:
http://www.percona.com/software/percona-xtrabackup
.. which looks pretty good.

Another idea is dd. Since MySQL has its own partition (a wise move) and thus its own mount, you could (after pausing writes somehow):
dd if=/dev/sdb1 of=/some/place/dbbackup_{some rolling date}.bin

http://linuxpoison.blogspot.com/2009/04/creating-backuprestore-images-using-dd.html

After all is done, maybe some kind of raid would be best. I always worry about making an exact copy of what may be a corrupt drive! Not sure how to avoid that.
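
The "{some rolling date}" part of the dd idea could be sketched like this. It is a minimal illustration only: the seven-slot rotation, the function names and the bs=1M block size are assumptions, the device and target directory are the placeholders from the command above, and the script builds the command without running it. Writes would still have to be paused first, as noted.

```python
import datetime


def rolling_backup_name(day, prefix="dbbackup", keep_days=7):
    """Return a name that repeats every keep_days days, so old images get overwritten."""
    slot = day.toordinal() % keep_days
    return f"{prefix}_{slot}.bin"


def dd_command(device, target_dir, day):
    """Build (but do not run) the dd invocation as an argument list."""
    target = f"{target_dir}/{rolling_backup_name(day)}"
    return ["dd", f"if={device}", f"of={target}", "bs=1M"]


today = datetime.date.today()
print(" ".join(dd_command("/dev/sdb1", "/some/place", today)))
```

Because the slot number wraps around, at most seven images exist at any time and the oldest is replaced automatically. Copying to another machine would still be needed for an offsite backup.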

Wowbagger · Aug 3, 2014

idoubtit said:
There WILL be change. To hope that every nook, cranny and post count remains the same might be a pie in the sky. What do you want? Improvement. Sometimes it means resetting the odometer and starting from scratch.

If you must "reset the odometer", I would like to see an archive of all the old posts kept somewhere. You can keep them in a static database, perhaps on a separate URL.

I agree that life moves on. But, we don't destroy the works of great artists while doing so. We shouldn't destroy the work of great posters, either. (Nor the bad ones, for context.)

icerat · Aug 3, 2014

Donn said:
After all is done, maybe some kind of raid would be best. I always worry about making an exact copy of what may be a corrupt drive! Not sure how to avoid that.

What you want is both: a live backup of the current database in order to have a rapid (or even immediate) return to service in case of hardware failure, plus a regular offsite backup in case of some more serious catastrophe.

There are a number of solutions; it comes down to $$$$.

blobru · Aug 3, 2014

Klimax said:
Wait, is anybody else missing avatars of other posters?

I'm seeing everybody's.

About the vmstats:

icerat said:

System Performance

%vmstat 3

Code:

%vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 7  0    252  86180  55660 7092544    0    0   219    71    7    7 26  3 62  9  0
 4  0    252  80972  55740 7094984    0    0   912  1376 1625 2253 20  1 76  3  0
 3  0    252  57136  55820 7100208    0    0  1621   692 1396 3459 21  1 74  4  0
 3  1    252  57728  55824 7094740    0    0  1211   616 1423 2835 19  1 76  3  0
 4  5    252  47628  55536 7052016    0    0  3113  1072 1399 11270 21  3 65 11  0
 2  5    252  47948  55572 7050052    0    0  3379   479 1529 2090 10  1 79 10  0

- if i'm reading them right, the top line gives averages since the last boot for the rate columns (the process and memory columns are instantaneous in every line): so at that moment there were 7 r processes (runnable: running or waiting for cpu time) and 0 b (blocked: in uninterruptible sleep, typically waiting for disk i/o), and since boot there were on average 7 in (interrupts per second, including the clock) and 7 cs (context switches per second).

In the rest of the lines (the recent values, each averaged over a 3-second interval, since the command was vmstat 3), the average for r ready processes is down to 3.2, with 5 b blocked processes in the bottom 2 lines; while interrupts/sec has risen from 7 way up to an average of roughly 1.5K, and cs context switches from 7 to over 4K (note the over 11K context switches once the 5 blocked processes appear, and the jump to over 3K blocks/sec (bi) of disk input into memory).

Which i guess just confirms that even when it's not hanging, the system's a heckuva lot busier than normal. I have no further insight; curious if the numbers suggest anything to anyone with expertise.
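
The averages quoted above can be reproduced with a short script. This is a sketch, assuming the first row is the since-boot summary and the other five rows are the interval samples; the helper names are only illustrative.

```python
VMSTAT_ROWS = """\
 7  0    252  86180  55660 7092544    0    0   219    71    7    7 26  3 62  9  0
 4  0    252  80972  55740 7094984    0    0   912  1376 1625 2253 20  1 76  3  0
 3  0    252  57136  55820 7100208    0    0  1621   692 1396 3459 21  1 74  4  0
 3  1    252  57728  55824 7094740    0    0  1211   616 1423 2835 19  1 76  3  0
 4  5    252  47628  55536 7052016    0    0  3113  1072 1399 11270 21  3 65 11  0
 2  5    252  47948  55572 7050052    0    0  3379   479 1529 2090 10  1 79 10  0
"""
COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()


def parse(text):
    """Turn each non-empty vmstat row into a dict keyed by column name."""
    return [
        dict(zip(COLUMNS, map(int, line.split())))
        for line in text.splitlines()
        if line.strip()
    ]


rows = parse(VMSTAT_ROWS)
since_boot, recent = rows[0], rows[1:]


def average(column):
    """Mean of one column over the interval samples only."""
    return sum(row[column] for row in recent) / len(recent)


avg_r = average("r")
avg_in = average("in")
avg_cs = average("cs")
avg_wa = average("wa")
print(f"r: {avg_r:.1f}  in: {avg_in:.1f}  cs: {avg_cs:.1f}  wa: {avg_wa:.1f}%")
```

Keeping the since-boot row separate from the interval rows avoids mixing long-term averages with the recent samples.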

idoubtit · Aug 3, 2014

Mr. Adams, on the JREF board, is able to get in and do some technical tweaks to the hardware and software.

One insight he has given me is that right now 40% of the queries are from search engine bots, and a good proportion of the traffic is from China. Legitimate traffic?

Sherman Bay · Aug 3, 2014

idoubtit said:
Mr. Adams, on the JREF board, is able to get in and do some technical tweaks to the hardware and software.

One insight he has given me is that right now 40% of the queries are from search engine bots, and a good proportion of the traffic is from China. Legitimate traffic?

I think not. Your figures roughly match mine, back when I was worrying about such things.

Search engines are good to have, but you are probably lumping "good" engines like Yahoo and Google with "bad" ones who are merely trying to aggregate and repackage data and links for resale to anyone who will buy.

Some of that traffic is the bad guys probing for weakness, harvesters looking for emails, and scammers looking for victims. Unless you have an extremely good filter, it's a fact of life that host providers have to accommodate the bad traffic to some degree; to do otherwise risks filtering out the odd but good guys.

Wolfman · Aug 3, 2014

idoubtit said:
One insight he has given me is that right now 40% of the queries are from search engine bots, and a good proportion of the traffic is from China. Legitimate traffic?

Well, I'm in China, as are a few other members...but even with my typically lengthy posts, I don't think we can account for "a good proportion of the traffic". I think we can safely assume that the rest is not the kinda' traffic we want.

Blue Mountain · Aug 3, 2014

icerat said:
yeah, as I note above I don't yet know what the current backup strategy is. Hopefully there is one!

a batch backup of a 32gb database ain't quite so simple ....

Using the Logical Volume Manager makes it a lot easier:

1. Do something to have HTTP stop accepting connections briefly (e.g. shut it down, or block packets on port 80)
2. Stop MySQL, or issue FLUSH TABLES WITH READ LOCK. This gives us a clean and up-to-date copy of the database.
3. Create a snapshot using LVM. This is actually very quick. The snapshot works by setting up a copy-on-write volume: as blocks on the source volume change, a "before" image is written to the CoW volume. Reading from the snapshot will read from either the original volume if the block is unchanged since the snapshot was taken, or from the before image on the CoW volume.
4. Restart MySQL (or issue UNLOCK TABLES) and httpd (unblock port 80). Total downtime: less than 30 seconds.
5. Back up from the snapshot volume at leisure (tar, cp, or even mysqldump if you're feeling masochistic)
6. Delete the snapshot volume when the backup is done, to prevent it from filling up
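
The steps above can be sketched as a dry-run plan. Everything specific here is an assumption for illustration: the volume group and volume names (vg0, mysql, mysql_snap), the snapshot size, the mount point, the archive path and the httpd/mysqld service names are hypothetical, and the current server's /dev/sdb1 appears to be a plain partition rather than an LVM volume. The script only prints the commands; it does not run them. It uses the stop/start variant, since a FLUSH TABLES WITH READ LOCK lock lasts only while the session that took it stays open.

```python
def snapshot_backup_plan(vg="vg0", lv="mysql", snap="mysql_snap",
                         snap_size="5G", mount_point="/mnt/mysql_snap",
                         archive="/backup/mysql.tar.gz"):
    """Return the ordered (step, command) list for an LVM snapshot backup."""
    origin = f"/dev/{vg}/{lv}"
    snapshot = f"/dev/{vg}/{snap}"
    return [
        ("stop httpd", ["service", "httpd", "stop"]),
        ("stop mysql", ["service", "mysqld", "stop"]),
        # Copy-on-write snapshot of the database volume; this is the quick part
        ("create snapshot", ["lvcreate", "--snapshot", "--size", snap_size,
                             "--name", snap, origin]),
        ("start mysql", ["service", "mysqld", "start"]),
        ("start httpd", ["service", "httpd", "start"]),
        # From here on the site is back up; the copy happens at leisure
        ("mount snapshot", ["mount", "-o", "ro", snapshot, mount_point]),
        ("archive", ["tar", "-czf", archive, "-C", mount_point, "."]),
        ("unmount", ["umount", mount_point]),
        # Remove the snapshot so its copy-on-write space cannot fill up
        ("remove snapshot", ["lvremove", "-f", snapshot]),
    ]


for step, command in snapshot_backup_plan():
    print(f"{step:16} {' '.join(command)}")
```

Only the commands between "stop httpd" and "start httpd" count towards downtime; the archive step can take as long as it needs.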

Blue Mountain · Aug 3, 2014

For hardware, there are nice systems like the following:

Hewlett-Packard ProLiant DL320e Gen8 v2 Server.

I'm biased toward HP because that's what we run at work. There are similar offerings from IBM and Dell. These computers are built very well and have excellent SAS RAID controllers with battery-backed write caching. Replacement discs can be bought at a premium from the vendor or at a discount from sellers on eBay.

With, say, four discs set up in RAID 5 mode, a disc can fail and the system will just carry on. If there's a standby spare, the controller can immediately put the spare into service. Because the SAS discs are hot swappable, the replacement disc can be put into service without bringing down the system.

Setting up one of these systems properly also means setting up the monitoring tools so that the sysadmins receive alerts when something goes wrong. It's of no use having the RAID controller switch over to a spare disc if no-one knows it's happened, because the next disc to fail will destroy the array.

zooterkin · Aug 3, 2014

Or RAID 6, which needs another disk, but will allow 2 to fail without losing data (but you still have to monitor them, as you say).
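
The trade-off between the two levels comes down to simple arithmetic, sketched below. The 300 GB disc size is a made-up figure for illustration, and hot spares are not counted.

```python
def raid_summary(level, disks, disk_gb):
    """Usable capacity and tolerated disc failures for RAID 5 or RAID 6."""
    parity_disks = {5: 1, 6: 2}[level]
    minimum = parity_disks + 2  # RAID 5 needs 3 discs, RAID 6 needs 4
    if disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} discs")
    return {
        "usable_gb": (disks - parity_disks) * disk_gb,
        "tolerated_failures": parity_disks,
    }


raid5 = raid_summary(5, disks=4, disk_gb=300)
raid6 = raid_summary(6, disks=5, disk_gb=300)
print("RAID 5, 4 discs:", raid5)
print("RAID 6, 5 discs:", raid6)
```

With one extra disc, RAID 6 gives the same usable space as the four-disc RAID 5 while surviving a second failure.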
