Crawl Rate Tracker plugin missing chart on CentOS 4.x

As part of my WordPress install, I am using the Crawl Rate Tracker plugin. It shows the hits on your blog from various spiders and so on. In its dashboard (in your WordPress admin) there is a nifty chart that visualizes the info. Or, at least, there should be. In my case, I got a text link that just output the URL of the PHP script that should have generated the chart.

By looking into sbtracking-chart-data.php, I found that as it incremented a value to track the date, it hit the number 1225684800 and stuck there. I took that to be the max integer value in PHP 4.3.9 (the version included in CentOS 4.x, with security patches backported), although it is well short of the signed 32-bit limit of 2147483647, so the date handling in that PHP version is the more likely culprit. The value corresponds to 04:00 UTC on November 3, 2008, which is late in the evening of November 2 in US Eastern time. As you well know, we are past that point in history. The script never errored out when incrementing that variable; the value just stayed at 1225684800, and because the script used the date as its means of breaking the loop, it looped infinitely.
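
For anyone who wants to check that number, here is a minimal sketch (in Python, not the plugin's PHP) that converts it to a date and compares it with the signed 32-bit limit. The fixed UTC-5 offset is my assumption for US Eastern standard time, and days_until() with its step callback is a hypothetical stand-in for however the real script advances its date variable, not a copy of it. It only shows how a loop like that can refuse to spin forever.

```python
from datetime import datetime, timedelta, timezone

STUCK = 1225684800       # the value the chart script stopped at (from the post)
INT32_MAX = 2**31 - 1    # largest signed 32-bit integer, 2147483647


def describe(ts, utc_offset_hours=0):
    """Render a Unix timestamp at a fixed UTC offset (no DST database needed)."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(ts, tz=tz).strftime("%Y-%m-%d %H:%M:%S")


def days_until(start_ts, end_ts, step):
    """Count steps from start_ts to end_ts, failing loudly if the date stalls.

    `step` is a hypothetical callback that returns the next timestamp.
    """
    count, current = 0, start_ts
    while current < end_ts:
        nxt = step(current)
        if nxt <= current:
            # Without this guard, a value that stops advancing loops forever.
            raise RuntimeError(f"date stopped advancing at {current}")
        current, count = nxt, count + 1
    return count


if __name__ == "__main__":
    print("UTC:          ", describe(STUCK))
    print("UTC-5 (EST):  ", describe(STUCK, -5))
    print("below int32?  ", STUCK < INT32_MAX)
```

The guard in days_until() is the design point: a loop that exits on a date should also exit (or complain) when the date stops moving.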

To resolve this, I upgraded to PHP 5.1.6, which also required a MySQL upgrade. (The way I did it, anyway, which was the fast & easy, CentOS “semi-supported” way to do it.) I edited the file /etc/yum.repos.d/CentOS-Base.repo to make the centosplus repository available by changing enabled=0 to enabled=1 and then adding this line under the same repository:

includepkgs=php* mysql*

which restricts the upgrades and installs from this repository to the php and mysql packages.

I ran yum update and let it download and install all the necessary packages. This put me at PHP 5.1.6 with some extra security patches and MySQL 5.0.68 with the same.

Upon an Apache restart (apachectl graceful) I tested the Crawl Rate Tracker plugin, and AHA! The chart is there, and as nifty as ever.
Goodnight.

Poor WordPress admin performance

During the journey of installing WordPress, choosing a theme, setting up plugins, and so on, I ran into a problem that oh-so-many out there have hit: terrible performance in the WordPress admin (the downloadable version of WordPress, not the one at wordpress.com).

Well, to make a long story short this time: do as everyone recommends, even if you believe that the problem started before a certain point or isn’t related. Just do it: disable all plugins, which should get your performance back to something more acceptable. Once you are at that point, enable your plugins one at a time until you hit your performance snag. I won’t list the problematic one for me, as yours will probably be different.

Just do it. You’ll be glad you did, even if only because you can then be certain that it isn’t your plugins.

Just do it.
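
If you would rather script the one-at-a-time routine than click through it, the procedure can be sketched roughly like this in Python. Everything here is hypothetical: measure_seconds is a callback you would have to write yourself (activate exactly the listed plugins, load an admin page, return the elapsed seconds), and the 5-second threshold is an arbitrary number I picked for illustration.

```python
def find_slow_plugin(plugins, measure_seconds, threshold=5.0):
    """Enable plugins one at a time and report the first that causes a slowdown.

    Returns (plugin_name, seconds). plugin_name is None when no single
    plugin pushed the load time past the threshold.
    """
    enabled = []
    elapsed = measure_seconds(list(enabled))   # baseline with all plugins off
    if elapsed > threshold:
        # Slow even with nothing enabled: the plugins are not the cause.
        return None, elapsed
    for name in plugins:
        enabled.append(name)
        elapsed = measure_seconds(list(enabled))
        if elapsed > threshold:
            return name, elapsed
    return None, elapsed


if __name__ == "__main__":
    # Fake measurement for demonstration: "beta" is the pretend culprit.
    def fake_measure(enabled):
        return 40.0 if "beta" in enabled else 0.4

    print(find_slow_plugin(["alpha", "beta", "gamma"], fake_measure))
```

Note the early return when the baseline is already slow. That is the useful negative result: it tells you to stop looking at plugins.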

WordPress woes

Well, as my first post, let me touch on some WordPress performance problems, as I just ran into said problems while getting this blog running.

So, I set up this virtual server a couple of weeks ago. On it, I installed a variety of things, including some other PHP-heavy sites. All of them work great; faster than I expected, in fact.

Today, I installed WordPress in a subfolder on an Apache VirtualHost (specifically, david.pryke.us, which, since you are here and reading this, you already know). I performed the basic setup of wp-config.php and then navigated to the site to complete the setup. This took a while to run, but I figured that perhaps the MySQL instance on this server was taxed, and it just took a while to set up the basic database.

Once I clicked on the “Log In” button at the last screen, I realized that something was terribly wrong. It took 40 seconds or so to load the basic login page for me to enter my username and password. Once I entered them, it took another 40 seconds or so to get to the admin. I tried just loading the basic blog, without logging into the admin…40 seconds. I started looking for WordPress performance problems on Google. Many of the results told me to revert to the default “Kubrick” theme and start from there (no problem, it was a default install, so I was already using that theme). After that, they said to disable all the plugins and start enabling them one at a time, taking note of when big slowdowns start to happen.

Well, since this was a fresh, default install, I figured that wasn’t the major problem. While I am a SysAdmin, and have no problem going through things one at a time, I also hit on a post by Paul Spoerry about how to Diagnose slow WordPress performance using FireBug. I found that to be a great idea, and one that I should have thought of by this point, as a lot of my coworkers use FireBug to look at problems during website development.

So, I installed FireBug and inspected my site with it. Wow. 40.532 seconds for the basic index.php page, and then a few hundred milliseconds at worst for the rest of the JPGs and such combined. So, I started looking for WordPress performance diagnosis, and I came across a WordPress forum topic regarding a similar problem. In there, I found a suggestion to insert code like:

<!-- <?php echo get_num_queries(); ?> queries. <?php timer_stop(1); ?> seconds. -->

which, I found out later, was already in the Kubrick theme.  One other thing I found and enabled in my footer.php was this:

You can see how long each query is taking with a few modifications.

In your wp-config.php file, add this line of code to the top:
define('SAVEQUERIES', true);

Then, in the theme’s footer.php file, you can do this to dump all the queries and how long they took:

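if (defined('SAVEQUERIES') && SAVEQUERIES) {
    global $wpdb;
    // Double quotes so that \n is emitted as a real newline.
    echo "<!--\n";
    print_r($wpdb->queries);
    // Close with a well-formed HTML comment terminator.
    echo "\n-->";
}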

Then looking at the source of the page will show all the queries and a number showing how long they took.

After you do this, turn the SAVEQUERIES back off by setting it to false in the wp-config.php file. You don’t want it to spit out the queries all the time.

The key there, which I knew, but some other readers on that forum topic didn’t, was to put “<?php” before the code block, and “?>” after the code block in footer.php.  I looked at the source of the page after I added those pieces of code, and it told me two things.  The first line output this:

<!-- 21 queries. 40.195 seconds. -->

Which told me that it believed the page, database queries included, was taking over 40 seconds to generate. However, the second piece of output (from $wpdb->queries) told me something totally different. This command lists the SQL for each query, as well as how long it took to run that query. Each one was along the lines of 0.0001130104064941 or 0.00027084350585938, which, when added together, was still much less than one second. Something isn’t “adding up” here…
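
To put rough numbers on that mismatch, here is a small worked example in Python. The two per-query timings are the only samples quoted above, so as a deliberately pessimistic assumption I treat every one of the 21 queries as if it took as long as the slower sample. The real total would be smaller still.

```python
PAGE_SECONDS = 40.195    # from the "21 queries. 40.195 seconds." comment
QUERY_COUNT = 21         # from the same comment
SAMPLE_QUERY_SECONDS = [0.0001130104064941, 0.00027084350585938]

# Pessimistic bound: assume every query was as slow as the slowest sample.
worst_case_db_seconds = QUERY_COUNT * max(SAMPLE_QUERY_SECONDS)

# Whatever is left over was spent somewhere other than the database.
unexplained_seconds = PAGE_SECONDS - worst_case_db_seconds

if __name__ == "__main__":
    print(f"database, worst case: {worst_case_db_seconds:.4f} s")
    print(f"unexplained:          {unexplained_seconds:.3f} s")
```

Even with that generous bound, the database accounts for well under a hundredth of a second, so essentially all of the 40 seconds has to come from somewhere else.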

After reading the rest of that forum topic, someone mentioned a problem which went away when he ran the internal “wp-cron.php” script by hand, but came back every time he created a new post.  Well, there are two important pieces of information here.  One is that this internal cron script is scheduled to run again when certain actions are taken, such as creating a new post.  The second thing, and important in my case, is that this is run from within the web server itself…specifically, from within the PHP parser.

Now, a key piece of info for my problem is that I am hosted on a virtual machine that lives on a non-routable RFC 1918 IP, 10.53.22.13, while the public IP of this site is 66.179.100.13 (at least as of this writing, on November 3rd, 2008). The PHP parser tried to connect to david.pryke.us/blog/….. and could not get there, because the firewall & NAT machine “in front” of the server would not redirect traffic back down this network link to the 10.53.22.13 address when one of the machines on the same link asked for the public, routable IP of 66.179.100.13.

To resolve this, and make the long story short, I had to add a line in /etc/hosts that read:

10.53.22.13     david.pryke.us

Which allowed the server to see the “correct” IP for that domain name (david.pryke.us) and voilà! The server loaded the first default post in less than a second!

Problem solved. (This should never have been a problem, as I usually set up the hosts file on these servers right away…but I forgot in this instance. Oops!)
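
A quick way to check for this situation from the server itself is to ask which address the machine resolves its own site name to. The Python sketch below is only an illustration of that check under my assumptions: the function names are made up, and it tests just the three RFC 1918 ranges. On a box behind NAT like mine, you want the site name to resolve to the private address.

```python
import ipaddress
import socket

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(addr):
    """Return True if addr is an IPv4 address inside an RFC 1918 range."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)


def check_self_resolution(hostname, resolve=socket.gethostbyname):
    """Resolve hostname the way local software would (/etc/hosts first,
    on a typical setup) and report whether the answer is a private address."""
    addr = resolve(hostname)
    return addr, is_rfc1918(addr)


if __name__ == "__main__":
    # The two addresses from this post.
    print(is_rfc1918("10.53.22.13"))     # the VM's internal address
    print(is_rfc1918("66.179.100.13"))   # the public address
```

Running check_self_resolution("david.pryke.us") on the server before and after the /etc/hosts change would show the answer flipping from the public address to the private one.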

Virtualization: What is it? or “Virtualization vs. Emulation”

Today I was inspired by an article in the February 27th issue of Infoworld Magazine written by Tom Yager titled, “What virtualization is — and what it isn’t,” which was regarding misuse of the term “virtualization.” He goes on to briefly identify a few ways that the term is used correctly, as well as incorrectly, in some modern software products, namely VMWare software, Apple’s Rosetta binary translator, and Microsoft’s Virtual PC for Mac. I’m going to take a slightly different direction here, but I’ll touch on those products.

Definitions: Virtualization and Emulation

Virtualization is a broad term that refers to the abstraction of resources across many aspects of computing. The context I will be discussing today is with regards to local hardware virtualization. What this means is that multiple operating systems, or multiple copies of the same operating system (OS), can run simultaneously on a single set of hardware, and retain access to all aspects of the hardware. When one or more of the installed operating systems requests access to a piece of hardware, the layer that performs the virtualizing intercepts that call, and if the hardware is currently being used by another instance of an installed OS, it schedules the hardware call to happen as soon as possible. If the hardware is available, or once it becomes available, the call is passed on to the hardware and any responses from hardware are directed right back to the calling OS. This is a very fast process, as there is minimal interaction here, and the installed operating systems run at near full speed. (See my previous post on Xen Virtualization to get a brief look at one way that this can work.)

Emulation is recreating an entire hardware architecture in software, then typically running an OS under that (though it could be used in a much “smaller” way, such as running a single program or even translating a single instruction.) As you can probably imagine, a program that acts like it is an entire piece of hardware is hardly simple and is typically much slower than the real hardware it is emulating.

Which to use when, and why?

Emulation is handy when you want to run an operating system or program from a completely different system on your machine. Think, for example, of playing Super Nintendo games on your computer, or running a Commodore 64 program under Windows or Mac OS. This can also be used for things like developing software that would be encoded on a chip that would be embedded in a consumer product, such as a calculator, a remote control, a television, or even a clothes washer! Emulation is likewise good when you are developing a new hardware product, such as a CPU, but want to get the design right before you manufacture ten, a thousand, or a million of them. You can create the entire hardware architecture in software and then work under that software as though you have the final device right in front of you. If you find bugs in the design, or something you could optimize to work better, you can change the emulation software. Once you have everything designed the way you want it, you can send the design out to be prototyped, tested, and manufactured.

Virtualization, on the other hand, is used when you want to run multiple OSes (or multiple copies of an operating system) on a single machine. One reason you might do this is if you are designing a distributed system (such as a cluster of machines) and you are trying to develop some software (such as communications protocols) that requires testing across many machines at once, but you only have one or two machines with which to test. (This example works best when you have multiprocessor/multicore machines to work with.) Another reason is if you want to run Windows, Linux, and FreeBSD simultaneously on one machine…without “dual-booting” or using emulation! (Examples chosen at random…many other combinations are possible, and I am not endorsing any particular product, nor trying to slight any product.)

A third, and particularly useful, use of this kind of virtualization is when you want to separate out individual parts of a complex system…such as a mail server solution. One way (and there are many ways) you could separate this example is to have the MTA (mail transfer agent, the actual program that receives the mail) running on one virtual machine, run an anti-virus/anti-malware scanner on a second virtual machine, and have a webmail interface running on a third virtual machine. (As you may guess, there are a number of other ways to set up this system, as I have glossed over a few parts of this complex system, such as the mail store, IMAP & POP servers, databases to store virtual addresses, and more.) This would allow you to use just one machine to accomplish this goal, while giving each conceptual part of the system its own dedicated resources…and no single part of the system could bring down another part. (Have any of you ever had an Apache server running a webmail client use all your available memory, causing failure or extreme delays in the MTA that is trying to receive e-mail? It can happen…)

Final thoughts for tonight…

Well, I hope to have brought a little insight to the question of what virtualization and emulation are in one context, and given enough examples to give you an idea of how each works as well as some potential uses for each. It turns out that I didn’t mention the products from the first paragraph again, but there are tons of things to talk about when it comes to virtualization, even when used exactly as it is defined here, and there are plenty of other meanings to the term. So, don’t be surprised if I end up talking about these concepts often…it is a field in which I am interested and one that I use in my everyday life as a systems administrator.

Xen Virtualization Technology (Part 1)

Tonight I’m discussing Xen Virtualization Technology. This technology is the best I’ve ever worked with when it comes to virtualization of hardware on an x86-based machine.

What is Xen? From the Xen FAQ: “Xen is a virtual machine monitor (VMM) for x86-compatible computers. Xen can securely execute multiple virtual machines, each running its own OS, on a single physical system with close-to-native performance.” Xen can be used to run multiple instances of an operating system on separate “domains,” as they are called, each of which appears to be its own physical server to the operating system.

How does it work? Well, the basics are this: A Xen “hypervisor” runs on the native hardware together with a privileged host system (typically Linux 2.6- or 2.4-based, although Xen version 2.0 can also use NetBSD 3.0 as the host), which is considered “domain zero” (dom0). Under dom0, you can create a virtually unlimited number of domains, each of which is a virtual machine. You specify how much RAM to dedicate to each virtual machine, optionally assign a specific processor on which it should run, and optionally assign a specific IP address. You can then install any of Linux 2.4, Linux 2.6, NetBSD 2.0, NetBSD 3.0, FreeBSD 5.x, or Plan 9 as the guest operating system. Aside from a specific kernel with hooks into the Xen hypervisor, the guest system runs in completely unmodified form, which means that you can run any software that runs on any of these systems.

How well does it work? Very well. There is practically zero overhead with the Xen system, meaning that nearly 100% of your processing power and RAM can be dedicated to the virtual machines that are hosted on the system. This virtualization technique works in precisely the way you would expect it to work….If you have two virtual machines (VMs) dedicated to one processor and both are heavily loaded, each utilizes half of the processor….neither will take over the machine and push other VMs out of the way. This software scales nearly endlessly, too – you can have ten VMs on one machine, and if set up properly, when all ten VMs are loaded to the max, each will behave as though it has a processor that is one tenth the power of the primary system. I haven’t gotten through extensive tests to indicate if each VM can use more than (CPU Power) / (# of VMs) horsepower if other VMs are idle, but so far the tests seem to indicate precisely that.

Here is my anecdotal evidence as of today…
Part 1: A co-worker of mine and I installed a basic CentOS 4.2 system on a simple dual-processor box (dual PIII 600 MHz, 1GB RAM) as domain zero, then created two sub-domains (dom1 and dom2). We assigned dom0 to utilize processor 0, then assigned dom1 to processor 0 and dom2 to processor 1. Each subdomain was given 384 MB of RAM. We ran a variety of benchmarks on dom0 without either subdomain running, and it behaved like a 600 MHz PIII, as expected. We then loaded up both subdomains with CentOS 4.2 and ran the same benchmarks on each, both separately and simultaneously. Each came up with results within a percentage point of dom0, indicating that they each behaved as though they were independent 600 MHz machines with 384MB of RAM.
Part 2: We then shut down both dom1 and dom2, set dom1 to also use processor 1, then started them back up and ran the benchmarks again. Simultaneous tests showed that each behaved like a 300 MHz machine with 384MB of RAM. Individual tests (where one domain was running fully loaded and the other was idle) indicated roughly 600 MHz machines…

What this seems to indicate to me is that the software works as expected…load up a machine with a few VMs and get the performance you expect out of them. You could give these subdomain VMs to anyone, trusted or not (supposing you took all security considerations into account), and you wouldn’t need to worry about whether any one person would be monopolizing the physical hardware in such a manner that it would prevent another VM from operating correctly.
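
The arithmetic behind those benchmark numbers is just an even split, which can be written down as a tiny Python sketch. To be clear, this models what our tests suggested, not how the Xen scheduler is actually implemented: it assumes every busy VM pinned to a processor gets an equal share, and that idle VMs cost nothing.

```python
def expected_mhz(cpu_mhz, busy_vms_on_cpu):
    """Effective clock speed each busy VM sees under an assumed even split."""
    if busy_vms_on_cpu < 1:
        raise ValueError("need at least one busy VM on the processor")
    return cpu_mhz / busy_vms_on_cpu


if __name__ == "__main__":
    # Part 1: dom1 and dom2 on separate 600 MHz processors.
    print(expected_mhz(600, 1))
    # Part 2: both domains busy on the same 600 MHz processor.
    print(expected_mhz(600, 2))
    # The ten-VM case mentioned earlier, all fully loaded on one processor.
    print(expected_mhz(600, 10))
```

The second case gives the 300 MHz we measured, and the first gives the 600 MHz seen when only one domain was loaded.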

My thoughts: Although these tests are not exhaustive, nor run more than a few times each, it gives me a good indication of the possibilities available if you need or want to run a few separate machines with modest processing requirements…you can buy one or two dual processor (or quad processor, or dual-core dual-processor (quad-core)) machines, load them up with RAM, and run a good number of virtual machines on them. This technology could be used for separating out web hosts, testing out configurations in a virtual sandbox, separating application hosting from database hosting, and more. This would be perfect for running a VPS (Virtual Private Server) hosting environment. In fact, there is at least one provider that is starting to use Xen technology exclusively for all new VPS hosts – Rimu Hosting.

I haven’t explored all the options nor found all the pitfalls to utilizing Xen Virtualization Technology, but from a first look, I would say that it is pretty hot, and the tech to use if you are thinking of virtualizing any of your Linux or BSD systems. (Not OpenBSD at this time.) And exciting things are on the horizon with Intel and AMD promising hardware virtualization in the near future…this kind of technology would allow Xen software to run unmodified versions of x86-based operating systems, including Windows, OpenBSD, and dare I hint at the possibility…MacOS X for Intel…pending support and authorization from Apple, of course.

Stay tuned for more on Xen in the near future!

Welcome!

Welcome, and hello! My name is Dave, and I am a Systems Administrator in Pennsylvania, USA. I am starting this blog to discuss various sysadmin related topics, including network and systems security, systems configurations, tips for optimizing systems and networks, using open source tools in a corporate environment, and more.

I will be writing about my experiences and various things I know to be true in the IT environment, from a Systems Administrator’s perspective. You may not agree with some things that I write here, especially if you are in IT but are not a sysadmin…this doesn’t mean that I think you are wrong…it means simply that I am looking at the situation from a different perspective. I encourage your comments and will try to respond to them. Feel free to advance this into a forum where we can all learn something that will help us in the IT world, and in our jobs.

As with anyone in this field, I am constantly learning and seeking out new ways to do things, mostly for three reasons: will it provide better security and/or performance?, will it save the company money?, and is it the “Right Way” to do it? Undoubtedly, the answers to each of these questions are often tenuous and hard to justify empirically, but when appropriate, I will make the effort to analyze the largest pros and cons under each category and try to give information that will help you to decide which solution will work best for you.

I may not get the chance to update this blog every day…as I’m sure many of you experience, some days my job will take many more than eight hours of my time, while others, it will barely take two. For the same reason, I may update it more than once on some days. Take it all with a grain of salt, as each Sys Admin needs to take his or her own environment, experience, and politics into account for each solution, but I hope I can shed some light on the various options for those “running the show” in IT environments worldwide.

Thanks for visiting this blog, and know that I encourage your comments!