Marcel's blog

WordPress Race Condition with MySQL Replication

My employer runs WordPress to power the Healthy and Green Living section of our web site. The blog serves spikes of dozens of pageviews per second. We use HyperDB to send read queries to a few slave databases.

One day, I found that replication was falling further and further behind, mostly because of updates to wp_options. These writes contended with reads for the MyISAM table lock on the slave. Because MySQL prioritizes writes over reads, the contention reduced our read concurrency to roughly one thread.
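One way out of this kind of contention, sketched here as a one-statement DDL fragment rather than the fix we necessarily shipped (the host and schema names are made up), is to move the hot table off MyISAM, since InnoDB takes row locks instead of a single table lock:

```shell
# Illustrative only: convert the contended table to InnoDB on the master
# (the engine change replicates to the slaves). Host and schema are invented.
mysql -h db-master -e 'ALTER TABLE wordpress.wp_options ENGINE=InnoDB;'
```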


Captcha the Dog Exploit

Care2 has an interest in animal-themed captchas, so I evaluated Captcha the Dog.  I think I have found a vulnerability, at least in the image recognition component, which I believe is the meat of the puzzle.


Version Control Comparison for Large Repositories

At Care2, our main repository had 120,000 files and a 2.4 GB CVS checkout. CVS mostly worked, with some hacks to make it run faster on our huge repository. But I wanted more out of version control.

The biggest issue was that merging didn't work well. Sometimes adding or removing files on a branch would have an unexpected result after merging to trunk. And it was difficult to merge to and from trunk multiple times. I know, I know, you can tag branches at just the right place to track what's been merged already... but I'd rather not.

We also relied on file lists to speed up CVS operations.  File lists help by restricting CVS commands to a carefully maintained list of files that were actually touched on a feature branch. But we ran into intermittent problems when the file list database got moved or access was accidentally revoked. The file lists were good at speeding up CVS but greatly increased the complexity and fragility of our development process, so I was eager to leave them behind.
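As a sketch of that workflow (the list name and paths below are invented for illustration), a file list is just a text file of repository-relative paths, and each CVS command is restricted to it with xargs:

```shell
# feature.list holds the files actually touched on the feature branch,
# one repository-relative path per line, for example:
#   htdocs/greenliving/index.php
#   lib/Care2/Petitions.php

# Feeding the list to CVS keeps it from walking the whole checkout:
xargs cvs -q update -r my-feature-branch < feature.list
xargs cvs -q commit -m 'feature work' < feature.list
```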

Our two main reasons for ditching CVS were, in a nutshell:

  • Cumbersome merging
  • Slow performance on our large repository

With those reasons in mind, I set out to find a replacement.


Faster Feature Branching in Large CVS Repositories

At work we have a large CVS repository.  By large, I mean 120k files, 2.5GB checkout.  Most things work fine, and we've evolved some techniques to deal with operations that would otherwise be slow.

Things that work well:

  • Committing a small list of files
  • Updating your whole working copy, since we only expect to do so once daily
  • Updating a small list of files to get someone's recent changes

Things that don't work well:

  • Scanning your working copy for things you forgot to check in
  • Branching, because if you do the naive thing, you have to wait for CVS to branch the whole repository
  • Tagging the naive way, for example to mark a release for deployment, again because CVS has to walk the whole tree

While we never quite addressed the first problem, we do pretty well at making sure CVS never has to walk the whole tree.
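For instance (branch, tag, and list names below are invented), cvs tag accepts an explicit set of files, so a file list keeps branching and tagging from touching the whole tree:

```shell
# Branch only the files the feature will touch, not the whole repository:
xargs cvs tag -b my-feature-branch < feature.list

# Same idea for releases: tag just the files in the deploy list:
xargs cvs tag RELEASE_TAG < deploy.list
```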


Tracking CVS with Git using cvs2git

At work, we use CVS to manage code, but I want something better: Git. The git-cvsimport tool can do efficient incremental updates from CVS into Git, which is just what you need if you want to work in Git while your team's primary VCS is CVS. But git-cvsimport is based on cvsps, and cvsps is a dead project. Worse still, cvsps segfaults on my employer's repository. Enter cvs2git.
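Roughly, the cvs2git workflow looks like this (paths are placeholders; check the cvs2git documentation for the options your version supports): cvs2git writes a blob file and a dump file, which you then feed to git fast-import, blobs first:

```shell
# Generate fast-import data from the CVS repository (paths are placeholders):
cvs2git --blobfile=git-blob.dat --dumpfile=git-dump.dat \
        --username=cvs2git /path/to/cvsrepo/module

# Load it into a fresh Git repository; the blob data must precede the dump:
git init repo && cd repo
cat ../git-blob.dat ../git-dump.dat | git fast-import
```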


Nested Imenu for PHP

I wanted an easy way to navigate a PHP file full of object-oriented class definitions in Emacs. My search for such a tool turned up php-mode's integration with imenu. Imenu lets modes generate menus of the structural elements in a file, where selecting an element jumps to its location in the file.

But php-mode separates the list of functions from the list of classes. The list of functions is often way too long, and it's not clearly organized by class.


Plucene vs. Ferret

Switching from Plucene to Ferret for full-text search yielded huge performance improvements in both memory usage and execution time.

I set up search for an email list a year or two ago. The original search used Plucene, a Perl port of the well-known Apache Lucene search library. Performance was never great, about 15 seconds for the first search results when I set it up, and over time it degraded to more than 60 seconds.


Riding Rails with Typo

I upgraded from my home-brew XSLT-based static blog to Typo with the main goals of getting robust comment features and getting to tinker with a Rails app. Along the way, I also tried to support the original blog's URLs, article IDs, and look-and-feel. And I checked that it would deliver pages reasonably quickly.