Whether you are upgrading, decommissioning or putting your current master server into maintenance, here’s a quick howto on promoting a slave server to be the new master for your application writes. These instructions (for my notes and the benefit of researching readers) cover in particular the case where multiple slaves are attached to the master. Let’s assume that your current active master is host A and the host you’d like to promote is host B.

  1. STOP SLAVE; on all slaves including B that are attached to A directly.
  2. Make sure that log_slave_updates and log_bin are ON on B.
  3. If you are not taking A out of rotation and plan to switch back to it later as your master, you can configure it as a slave of B, forming a master-master pair. Simply take the SHOW MASTER STATUS; coordinates from B and use them in CHANGE MASTER TO on A.
  4. Save SHOW MASTER STATUS; output from A.
  5. Start replication again on all the slaves that are attached to A, but only until the coordinates you saved in #4, using the syntax START SLAVE UNTIL MASTER_LOG_FILE=<log_file>, MASTER_LOG_POS=<log_pos>. This ensures that all slaves stop at exactly the same position, so you can safely point them to replicate from B later.
  6. Once all slaves have caught up to the same coordinates, it’s time to reconfigure the rest of the slaves, except B, to replicate from B. Using the SHOW MASTER STATUS; coordinates from B, execute a CHANGE MASTER TO MASTER_HOST=B ...
  7. Now it’s time to fetch the remaining events from A to B. On B, STOP SLAVE; then START SLAVE; again. This time replication will retrieve all events since the coordinates from #4 where we stopped.
  8. Once B and its slaves have caught up on replication, you can point your application to send writes to B.
  9. Lastly, if you are taking A out of rotation but keeping it online, run STOP SLAVE; and then RESET SLAVE; on B once Exec_Master_Log_Pos and Read_Master_Log_Pos on B have stopped at the same position, so that no further writes from A are replicated to B.
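
Steps 5 and 6 are the easiest ones to mistype under pressure, so here is a minimal bash sketch that only prints the two statements for a given set of coordinates. It does not connect to MySQL; the function names, the host name hostB and the binlog coordinates are placeholders of mine, not values from a real server.

```shell
#!/usr/bin/env bash
# Sketch only: prints the SQL for steps 5 and 6, it does not run it.

# Step 5: statement for every slave of A, using A's saved coordinates.
start_slave_until() {
  local log_file=$1 log_pos=$2
  printf "START SLAVE UNTIL MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
    "$log_file" "$log_pos"
}

# Step 6: statement for the other slaves, using B's own coordinates.
change_master_to() {
  local host=$1 log_file=$2 log_pos=$3
  printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
    "$host" "$log_file" "$log_pos"
}

# Placeholder coordinates, for illustration only.
start_slave_until mysql-bin.000042 1337
change_master_to hostB mysql-bin.000007 107
```

Review the printed statements first, then feed them to the mysql client on each slave. A real CHANGE MASTER TO may also need the replication user and password if they differ from what the slave already has.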

Did I miss anything? Of course, not all failovers are the same: this guide is meant as an overview and should always be reviewed against your own situation. Comments welcome! :)

If you are upgrading to Fedora 16 and trying to use software RAID for the /boot partition, at some point Anaconda is bound to complain that the RAID metadata version is invalid. This is a known bug, and a workaround is available from this bug report.

In my case, if I just specify the “updates” boot option, Anaconda tries to configure networking automatically, which is too bad if you do not have DHCP available. However, if you also specify an alternative install method, Anaconda will ask you to configure your network manually, and you will then be able to use the updates image as well. In short, you just have to specify something like this in your boot options:

repo=http://mirror.steadfast.net/fedora/releases/16/Fedora/x86_64/os/ updates=http://dlehman.fedorapeople.org/updates/updates-750480.3.img

The only downside is that, obviously, your install media will be over the network. If you have a slow connection then that’s another story. Do you know any workaround?

January 1, 2012 | In: Linux, PHP

Where’s My PHP Debug Symbols?

While working on a PHP script recently, I stumbled upon a nasty bug that leaves only a “Segmentation fault” while preparing statements with Zend_Db_Adapter_Mysqli. Attempting to gdb the same process only results in gibberish like the output below and nothing more. The problem is that the PHP CLI is a stripped binary, and on CentOS (and Fedora 14 at least, with PHP 5.3.8) there is no debuginfo package to provide these symbols, so you’ll have to compile PHP yourself to debug further.


#19635 0x000000000046b13b in ?? ()
#19636 0x000000000046b231 in ?? ()
#19637 0x000000000046b13b in ?? ()
#19638 0x000000000046b231 in ?? ()
#19639 0x000000000046cb3f in ?? ()
#19640 0x000000000047c76f in php_pcre_exec ()
#19641 0x0000000000480d46 in php_pcre_replace_impl ()
---Type <return> to continue, or q to quit---
#19642 0x0000000000481f0d in ?? ()
#19643 0x0000000000482504 in ?? ()
#19644 0x0000000000482a33 in ?? ()
#19645 0x00000000005e8f19 in ?? ()
#19646 0x00000000005e847b in execute ()
#19647 0x00000000005c1395 in zend_execute_scripts ()
#19648 0x0000000000571418 in php_execute_script ()
#19649 0x000000000064b5a0 in ?? ()
#19650 0x000000314821d994 in __libc_start_main () from /lib64/libc.so.6
#19651 0x00000000004222b9 in _start ()
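
Before firing up gdb, it can save time to check whether the binary is stripped at all. The sketch below classifies the one-line description printed by file(1). The helper name is_stripped is mine, and it assumes file(1) reports ELF binaries as either "stripped" or "not stripped".

```shell
# Classify the one-line description printed by file(1) for a binary.
# Returns 0 if stripped, 1 if symbols are present, 2 if it cannot tell.
is_stripped() {
  case "$1" in
    *"not stripped"*) return 1 ;;
    *stripped*)       return 0 ;;
    *)                return 2 ;;
  esac
}

# Example: inspect whichever php binary is first in PATH, if any.
php_bin=$(command -v php || true)
if [ -n "$php_bin" ]; then
  if is_stripped "$(file -L "$php_bin")"; then
    echo "$php_bin is stripped: gdb will mostly show ?? frames"
  else
    echo "$php_bin has symbols, or file(1) could not tell"
  fi
fi
```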

My appeal to the PHP package builders (CentOS/Fedora mainstream, RPMForge, Atomic): please, please, please include a debuginfo package in your repos :)

When designing your backup strategy around XtraBackup, part of the job is rotating backups so they do not fill the disks and prevent further backups from being taken. However, I’ve sometimes seen people ask how they can do this when XtraBackup creates its own timestamped directory (by default). Well, here are two ways.

1. After the backup, find the latest backup set in the backup directory using bash or whichever scripting language you are using. By default, the resulting directory for a new backup set is named in this sample format: '2010-03-13_02-42-44'. Below is how you can achieve this with bash:

CB=$(ls -1 | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}-[0-9]{2}-[0-9]{2}$' | sort | tail -n 1)

2. Use the --no-timestamp option of innobackupex to control the backup directories.

--no-timestamp
This option prevents creation of a time-stamped subdirectory of the
BACKUP-ROOT-DIR given on the command line. When it is specified, the
backup is done in BACKUP-ROOT-DIR instead.

With this method, you will have to create the unique backup directories from your script, which in turn means you already know the resulting name to use for the prepare step. You can easily generate one based on the current date with the sample bash command below:

CURDATE=$(date +%Y-%m-%d)
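
Whichever way the directories get named, the rotation itself can be sketched as below. This is only an illustration, assuming every backup set sits directly under one root directory and uses the default timestamp format. The function names and the idea of keeping the N newest sets are my own, not something XtraBackup provides.

```shell
# Print the timestamped backup directories under $1 that fall outside the
# newest $2 sets, oldest first. Other entries in the directory are ignored.
old_backups() {
  local root=$1 keep=$2
  ls -1 "$root" \
    | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}-[0-9]{2}-[0-9]{2}$' \
    | sort \
    | awk -v keep="$keep" '{ a[NR] = $0 } END { for (i = 1; i <= NR - keep; i++) print a[i] }'
}

# Delete everything old_backups lists for the given root and keep count.
rotate_backups() {
  local root=$1 keep=$2 dir
  old_backups "$root" "$keep" | while read -r dir; do
    rm -rf -- "${root:?}/$dir"
  done
}
```

Something like rotate_backups /backups/mysql 7 (path and count are placeholders) would then run right after a successful backup. Running old_backups on its own first shows what would be removed.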

This is only one essential part of a good backup procedure; I might blog about more. :)

As mentioned in my previous post, I am moving from a Dell R210 server to new hardware with 4x Western Digital 500G Caviar Blue disks and an Intel 320 Series 120G SSD for my own personal research and testing. The four WD disks are configured in RAID10 with mdadm (aka software RAID) and the SSD is standalone. Raw results can be found on my gists page here and here.

I am migrating to another server for my testing and this blog, but before I leave this server I decided to do a simple sysbench fileio test using 60G total file size. Raw information about the server and results can be found here. See the beautiful graph below.

When using Solr DataImportHandler with MySQL, the JDBC connection treats TINYINT(1) columns as BOOLEAN even if they hold values greater than 1. Most likely, with a properly defined field of int type in your schema.xml, you will get an error while indexing:

SEVERE: java.lang.NumberFormatException: For input string: "true"

This is the default behavior of Connector/J, controlled by the tinyInt1isBit property documented here. To fix this, add tinyInt1isBit=false to your JDBC connection string.

err: Could not retrieve catalog from remote server: certificate verify failed

If you’re new to puppet, this error is a bit tricky to identify. Even after recreating the certificates on the client, the error still persists. If this is the case, the most likely cause is that the time on your master and the client is different. Install and sync NTP; this should be a standard component for new servers anyway.
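
A quick way to confirm the suspicion before touching the certificates again is to compare the two clocks. The sketch below only does the arithmetic. The helper name clock_skew, the hostname puppetmaster, the use of SSH to read the remote clock and the 60 second tolerance are all assumptions for illustration.

```shell
# Absolute difference, in seconds, between two Unix timestamps.
clock_skew() {
  local diff=$(( $1 - $2 ))
  echo "${diff#-}"
}

# Usage sketch: "puppetmaster" is a placeholder hostname and the 60 second
# tolerance is arbitrary.
# skew=$(clock_skew "$(date -u +%s)" "$(ssh puppetmaster date -u +%s)")
# [ "$skew" -le 60 ] || echo "clocks differ by ${skew}s, check NTP"
```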

rdiff-backup is a great tool for mirroring or backing up files locally and between remote servers. It uses the rsync algorithm (through librsync) and works on top of SSH for remote transfers. On this note, I recently encountered one caveat when using the --user-mapping-file and --group-mapping-file options. Because these options are intended to map a source user/group to a destination user/group, they require rdiff-backup to run with elevated privileges, much as the “chown” command would.
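
As a rough illustration, a mapping file holds one old:new pair per line, and the run itself is wrapped in sudo so ownership can be changed on the destination. The user names and paths below are made up for the example.

```shell
# Hypothetical mapping: files owned by "www-data" on the source become
# owned by "backup" on the destination. One old:new pair per line.
map_file=$(mktemp)
printf '%s\n' 'www-data:backup' > "$map_file"

# Changing ownership needs root, so the run is wrapped in sudo.
# Both paths are placeholders.
# sudo rdiff-backup --user-mapping-file "$map_file" /var/www /srv/backups/www
```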

I keep a list of groups, mailing lists and forums I regularly visit for information on the local open source scene and share them here:

Mailing lists:

Forums:

Groups:

Misc:

If you’re in the Philippines, what do you regularly check in to?