
0
Posted on 20th October 2008 by Sameer

I host all my servers with The Planet and a few days back, all at the same time, my MySQL databases started hanging. The process list (“show processlist”) was showing a large number of connections from 192.168.xxx.xxx listed as “unauthenticated user”. MySQL was trying to do a reverse DNS lookup on each connecting IP address and was either stalling or failing on the request. I assume something went wrong with the DNS server.

The workaround is to insert “skip-name-resolve” into the [mysqld] section of your my.cnf file and restart the server; MySQL will then no longer run reverse DNS lookups on connecting IP addresses. To avoid sudden downtime like mine, I would recommend adding that line to your my.cnf now, before you run into the same problem. Of course, if your mysql.user table authenticates any user based on a hostname or domain, then you can’t skip name resolution until those entries are changed to IP addresses.
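Before turning the option on, it is worth checking whether any account actually depends on a hostname. Here is a rough sketch in Python, assuming you have already pulled the Host column out of mysql.user (for example with SELECT Host FROM mysql.user). The function name and the sample values are made up for illustration, and the pattern check is deliberately simple rather than a full parser for MySQL host patterns:

```python
import re

# Host values that can still match without a reverse DNS lookup:
# literal or wildcarded IPv4 addresses, optionally with a netmask.
_IPV4_PATTERN = re.compile(r"^[0-9.%_/]+$")


def hosts_needing_dns(hosts):
    """Return the Host values that look like hostnames rather than IPs."""
    needs_dns = []
    for host in hosts:
        value = host.strip().lower()
        if value in ("localhost", "%", ""):
            continue  # matched without any name lookup
        if _IPV4_PATTERN.match(value) or ":" in value:
            continue  # IPv4 pattern, or something that looks like IPv6
        needs_dns.append(host)
    return needs_dns


# Hypothetical Host values, not taken from a real server.
sample_hosts = [
    "localhost",
    "%",
    "192.168.%",
    "10.0.0.0/255.0.0.0",
    "%.example.com",
    "db1.internal",
]
print(hosts_needing_dns(sample_hosts))
```

If the list comes back empty, no account should be affected by skip-name-resolve. Anything it reports would need to be rewritten as an IP address (or an IP pattern) first.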

8
Posted on 20th October 2008 by Sameer

Elgg is the cream of the crop of open source social network software, a group which includes other products such as Dolphin, PHPizabi, and LovdByLess. It’s also significantly superior to low-cost white-label social network software such as Handshakes, phpFoX, SocialEngine, and so on.

Elgg stands out because:
a) It looks beautiful and has a good feature set out of the box.
b) It encourages the community to contribute to the project with plugins and themes. It aims to be to social networking what Drupal and Joomla are to content management.

However, if scalability is a top or immediate concern, be warned that Elgg may not be suited to you. Read the rest of this entry…

0
Posted on 28th September 2008 by Sameer

I recently released an early beta of RateDesi Hungama, which will be an entertainment/Bollywood extension of RateDesi. For now the site lists all Bollywood soundtracks (in India the most popular albums, by far, are movie soundtracks) with song listings. Each song will pull the corresponding music video from YouTube for viewer enjoyment. Hopefully, the same South Asian users who like to spend free time at RateDesi will also enjoy RateDesi Hungama.

As of now the two sites are totally independent of each other (besides marketing) but in time RateDesi credentials will be used on RateDesi Hungama and users will be able to add content from Hungama to their RateDesi profiles.

I also look forward to greatly expanding on the Hungama feature set.

4
Posted on 29th August 2008 by Sameer

The Zend Framework provides Zend_Cache, which can be plugged into various backends such as SQLite, Memcached, APC, and so on. Separately, it also provides Zend_Registry, which is “a container for storing objects and values in the application space”. Zend_Registry is not a cache, as its contents are created and used only by the currently executing script.

So, why would you want to use the Registry as a cache when it does not cache anything between page loads? The answer is to provide a transition point for caching additional data in other Zend_Cache backends.

For example, every time a Zend_Db_Table is instantiated it runs a DESCRIBE TABLE query, which is a surprisingly expensive query (or at least it was surprising to me). If you are using the MVC pattern, you can end up running this query dozens of times on one page. So to speed things up you should cache the results of the DESCRIBE TABLE query. You will end up improving performance whether you save the results in the Registry or (even better) in an appropriate Zend_Cache backend.

Suppose, however, that you have not yet configured your Memcached daemon, so you decide to use Zend_Registry instead. But Zend_Registry does not follow the same syntax as Zend_Cache. So, when you do finally set up Memcached, you will have to go back, edit your code to follow the Zend_Cache syntax, and then test your cache. It’s better to instead use Zend_Registry as a backend to Zend_Cache, which will make it simple to change the cache backend to Memcached at a later date.
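The idea is easier to see stripped of framework details. The following is a minimal sketch in Python, not the Zend API: RequestLocalBackend, Cache, and describe_table are hypothetical names standing in for a Registry-style backend, the cache front end, and the expensive DESCRIBE TABLE query.

```python
class RequestLocalBackend:
    """Hypothetical backend that lives only as long as the current script."""

    def __init__(self):
        self._data = {}

    def load(self, key):
        # Returns None on a miss, so None itself cannot be cached here.
        return self._data.get(key)

    def save(self, key, value):
        self._data[key] = value


class Cache:
    """Thin front end; calling code never touches the backend directly."""

    def __init__(self, backend):
        self._backend = backend

    def get_or_compute(self, key, compute):
        value = self._backend.load(key)
        if value is None:
            value = compute()
            self._backend.save(key, value)
        return value


query_count = 0


def describe_table():
    """Stand-in for the expensive DESCRIBE TABLE query."""
    global query_count
    query_count += 1
    return ["id", "name", "email"]


cache = Cache(RequestLocalBackend())
for _ in range(12):  # e.g. a dozen table objects created on one page
    columns = cache.get_or_compute("describe:users", describe_table)

print(columns, query_count)
```

Only the first call runs the query. Because the calling code only ever talks to Cache, moving to a shared backend later means writing another class with the same load and save methods and changing the single line that constructs the cache.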

Read the rest of this entry…

1
Posted on 21st August 2008 by Sameer

Now that I have covered how to load balance multiple web servers and how to keep their content synchronized, there is one more major problem to solve: sessions. You need sessions to identify a particular user from request to request (remember, HTTP is stateless). Usually session data is stored on the local filesystem. However, with multiple load balanced web servers, a user can be thrown from one web server to another, meaning that you cannot count on saving session data in the local filesystem.

Most load balancers, including nginx (through the ip_hash directive), do allow you to make your sessions “sticky”, which means that a particular user will be sent to the same web server for the duration of their session. This allows you to again rely on the local filesystem to save your sessions. However, sticky sessions are more likely to produce uneven load distribution. Plus, when a particular web server goes down, all of its users’ sessions will be lost.

It would be better if sessions could be stored in a location that all the web servers could access. If you have a SAN, that would be one option. But what most people already have is their database. So, let’s save our sessions in MySQL. The obvious downside to using your database for sessions is that it is slower than using the local filesystem. However, for most sites (even many large ones), the performance difference will be negligible.
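To give a feel for what a database-backed session store involves, here is a small sketch in Python. It uses the standard library's sqlite3 module purely as a stand-in for MySQL so that it runs anywhere; the class name, the table layout, and the 1440-second lifetime are assumptions for illustration, not the schema from the full post.

```python
import json
import sqlite3
import time


class DbSessionStore:
    """Hypothetical session store backed by a SQL table."""

    def __init__(self, conn, lifetime_seconds=1440):
        self._conn = conn
        self._lifetime = lifetime_seconds
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            "id TEXT PRIMARY KEY, data TEXT NOT NULL, expires_at REAL NOT NULL)"
        )

    def write(self, session_id, data, now=None):
        now = time.time() if now is None else now
        # REPLACE overwrites any existing row with the same session id.
        self._conn.execute(
            "REPLACE INTO sessions (id, data, expires_at) VALUES (?, ?, ?)",
            (session_id, json.dumps(data), now + self._lifetime),
        )
        self._conn.commit()

    def read(self, session_id, now=None):
        now = time.time() if now is None else now
        row = self._conn.execute(
            "SELECT data FROM sessions WHERE id = ? AND expires_at > ?",
            (session_id, now),
        ).fetchone()
        return json.loads(row[0]) if row else {}

    def collect_garbage(self, now=None):
        now = time.time() if now is None else now
        self._conn.execute("DELETE FROM sessions WHERE expires_at <= ?", (now,))
        self._conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the shared database server
www1 = DbSessionStore(conn)  # store as seen from the first web server
www2 = DbSessionStore(conn)  # store as seen from the second web server

www1.write("abc123", {"user_id": 75474}, now=1000)
print(www2.read("abc123", now=1001))  # same data, different web server
print(www2.read("abc123", now=5000))  # past the lifetime, so empty
```

The point is that it no longer matters which web server handles a request, since every server reads and writes the same table. Expired rows have to be cleaned out periodically, which is what collect_garbage is for.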

Read the rest of this entry…

5
Posted on 21st August 2008 by Sameer

After learning how to load balance, you still need to keep your web files consistent between your web servers. My tool of choice for doing so is rsync, which includes smart features such as delta transfers (if it notices a file has changed, it will only send the difference, not the whole file from scratch).

I am assuming that you will have a particular “main” web server which you always update with new content first. The new content can either be “pushed” by the main web server to dependent web servers running rsync daemons, or it can be “pulled” by dependent web servers from the main web server. I suggest running a “pull” environment because your main web server will not need any knowledge of the existence of the dependent web servers.
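As a rough illustration of the “pull” side, here is a small Python helper that builds the rsync command a dependent web server might run on a schedule. The host name, the paths, and the choice to connect over SSH (the single-colon host:path form) are all assumptions made for the example; the flags themselves (-a, -z, --delete) are standard rsync options.

```python
import shlex


def build_pull_command(main_host, remote_dir, local_dir, delete=True):
    """Build the rsync command a dependent server runs to pull from main."""
    command = ["rsync", "-az"]  # -a: archive mode, -z: compress in transit
    if delete:
        command.append("--delete")  # remove local files that are gone on main
    # Trailing slashes: copy the directory's contents, not the directory itself.
    command.append(f"{main_host}:{remote_dir.rstrip('/')}/")
    command.append(f"{local_dir.rstrip('/')}/")
    return command


# Hypothetical main server and web root.
command = build_pull_command("www1.mysite.com", "/var/www/mysite", "/var/www/mysite")
print(shlex.join(command))
```

The resulting list can be handed to subprocess.run from a cron-driven script on each dependent server. Nothing on the main server has to change when another dependent server is added.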

Read the rest of this entry…

18
Posted on 21st August 2008 by Sameer

In a previous post we saw how simple it is to set up nginx in front of Apache, and in this post I’ll show you it’s just as easy to use nginx as a load balancer.

Load balancing can be left to either hardware or software. For most of us, the expensive hardware is out of the question, but cheap (free) software will solve our needs just fine. Here’s a look at how nginx does load balancing:

upstream  mysite  {
   server   www1.mysite.com;
   server   www2.mysite.com;
}

server {
   server_name www.mysite.com;
   location / {
      proxy_pass  http://mysite;
   }
}

The above configuration will send 50% of requests for www.mysite.com to www1.mysite.com and the other 50% to www2.mysite.com. However, if you add a “weight” parameter onto the end of a “server” line you can change the percentages. Other useful options include max_fails and fail_timeout. For sticky sessions, use ip_hash. Refer to the full documentation for further details.
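To see how weights translate into percentages: each server's expected share is its weight divided by the sum of all weights, and a server with no weight given counts as 1. So a line such as “server www1.mysite.com weight=3;” next to an unweighted www2 gives a 75/25 split. The arithmetic, sketched in Python (the function name is just for illustration):

```python
def expected_shares(weights):
    """Map each server to the fraction of requests its weight implies."""
    total = sum(weights.values())
    return {server: weight / total for server, weight in weights.items()}


# No weights given: both servers default to 1.
print(expected_shares({"www1.mysite.com": 1, "www2.mysite.com": 1}))

# www1 given weight=3, www2 left at the default.
print(expected_shares({"www1.mysite.com": 3, "www2.mysite.com": 1}))
```

These are long-run averages. Over a handful of requests the actual split can differ, and servers marked as failed are skipped.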

Now that you know how to load balance, you will need to learn how to sync your files between multiple web servers.

5
Posted on 21st August 2008 by Sameer

I’ve been evaluating nginx, a lightweight web server, for the last week and I am coming away impressed. Over the last year or so nginx seems to have overtaken lighttpd for the crown of lightweight web servers.

In our case nginx is used to serve static files while Apache is used to serve dynamic content (we also use nginx for simple load balancing). A request for http://www.mysite.com/file.extension will first be sent to nginx, which will determine whether to serve the file itself (if it’s static); if not, it will request the URL http://localhost:8080/file.extension from Apache and pass the result back seamlessly to the end user.

Read the rest of this entry…

0
Posted on 21st August 2008 by Sameer

Sites that accept user uploads (photos, documents, music, etc.) will need to determine an appropriate directory structure to house the large number of files they will collect. At first glance you may decide to just prefix all filenames with a userid and stick them all into one directory. Maybe even broken up into something like:

/uploads
   /photos
   /music
   /documents

If user 75474 uploads a photo, it will be named 75474_randomstring.jpg and put in directory “/uploads/photos”. However, over time the photos (and music and documents) directories will become huge. File systems of practically all kinds do poorly with large directories. Things run slower, become more error prone, and batch operations become difficult. You do not want huge directories.
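One common way to keep directories small, sketched here in Python, is to spread files across subdirectories derived from a hash of the user id. This is an assumption about a reasonable approach, not necessarily the scheme the full post settles on; the function name and the two-level layout are made up for the example.

```python
import hashlib
import posixpath


def upload_path(user_id, filename, kind="photos", root="/uploads"):
    """Place a file under two levels of hash-derived subdirectories."""
    digest = hashlib.md5(str(user_id).encode("ascii")).hexdigest()
    # Two hex characters per level gives 256 * 256 = 65,536 buckets.
    return posixpath.join(
        root, kind, digest[:2], digest[2:4], f"{user_id}_{filename}"
    )


path = upload_path(75474, "randomstring.jpg")
print(path)
```

All of a user's files land in the same bucket, so they are still easy to find, and the same user id always maps to the same path without any lookup table.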

Read the rest of this entry…

1
Posted on 18th August 2008 by Sameer

On a recent multi-person project, we’ve used a Subversion client to directly pull the latest project files into the web directory. We do so because it’s a complicated environment that we have not yet created individual sandboxes for. To test a change, the code must be committed to Subversion and then we execute “svn export” on the web server to pull the latest files from our Subversion repository directly to the web directory. The only downside seemed to be that we were going to have a crazy number of revisions. But we’ve also run into one other problem: APC.

Read the rest of this entry…