
Elgg Scalability

Posted on 20th October 2008 by Sameer

Elgg is the cream of the crop of open source social network software, a group which includes other products such as Dolphin, PHPizabi, and LovdByLess. It’s also significantly superior to low-cost white-label social network software such as Handshakes, phpFoX, and SocialEngine.

Elgg stands out because:
a) It looks beautiful and has a good feature set out of the box.
b) It encourages the community to contribute to the project with plugins and themes. It aims to be to social networking what Drupal and Joomla are to content management.

However, if scalability is a top or immediate concern, be warned: Elgg may not suit you. Elgg attempts to make plugin development extremely easy by avoiding the need to work directly with, or even think about, the database. All blog posts, blog comments, wall posts, and forum discussions, plus any data your plugins save, are stored in a single metadata table. Over time that table will become huge and unmanageable.

Further, the database is laid out in a highly normalized form which is not ideal for performance, as it requires joins to be performed over multiple tables. A quick look at the db profiler shows many page loads that require over 100 db queries. 
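
To make the storage concern concrete, here is a minimal sketch of a generic attribute-per-row layout, written in Python with SQLite. The table names (`entities`, `metadata`), the column names, and the helper functions are assumptions for illustration, not Elgg's actual schema. It shows how every field of every object becomes its own row in one shared table, and how reading a single object back already needs a join.

```python
import sqlite3

# Hypothetical schema for illustration only; not Elgg's actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (id INTEGER PRIMARY KEY, type TEXT NOT NULL);
    CREATE TABLE metadata (
        id        INTEGER PRIMARY KEY,
        entity_id INTEGER NOT NULL REFERENCES entities(id),
        name      TEXT NOT NULL,
        value     TEXT NOT NULL
    );
""")


def save_entity(entity_type, fields):
    """Store one object: one row in entities, one metadata row per field."""
    cur = conn.execute("INSERT INTO entities (type) VALUES (?)", (entity_type,))
    entity_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO metadata (entity_id, name, value) VALUES (?, ?, ?)",
        [(entity_id, name, value) for name, value in fields.items()],
    )
    return entity_id


def load_entity(entity_id):
    """Rebuild one object by joining it to its metadata rows."""
    rows = conn.execute(
        "SELECT e.type, m.name, m.value "
        "FROM entities e JOIN metadata m ON m.entity_id = e.id "
        "WHERE e.id = ?",
        (entity_id,),
    ).fetchall()
    if not rows:
        return None
    entity = {"type": rows[0][0]}
    entity.update({name: value for _, name, value in rows})
    return entity


post_id = save_entity("blog_post", {"title": "Hello", "body": "First post"})
comment_id = save_entity("comment", {"post_id": str(post_id), "text": "Nice"})

# Two small objects already produced four rows in the shared metadata table.
metadata_rows = conn.execute("SELECT COUNT(*) FROM metadata").fetchone()[0]
```

In a layout like this the shared table grows with the number of objects multiplied by the number of fields per object, which is why it tends to outgrow every other table on the site.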

Therefore I would recommend that only small scale web sites use Elgg for now unless they are confident in their abilities to optimize the architecture. 

Scaling the Elgg Database Hardware

If you plan to leave the core of the Elgg engine untouched you will likely have to rely on standard database replication to scale the database.

Replication is a process in which a “master” server receives all writes and pushes the queries to its “slave” servers. Reads can be split between the master and the slaves. Unlike with partitioning, replication requires that every database server keep a copy of the full data set. Therefore, you can’t use commodity hardware to split the work up between many small servers. Instead you will have to rely on a small number of superservers to scale Elgg. That’s the only way to keep the metadata table from killing your performance.
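
The read/write split described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not something Elgg provides: the `ReplicatedPool` class and the server names are hypothetical, and the routing rule (anything starting with `SELECT` is a read) is deliberately simplistic.

```python
import itertools


class ReplicatedPool:
    """Route writes to the master and spread reads across all servers."""

    def __init__(self, master, slaves):
        self.master = master
        # Reads rotate over the master and every slave in turn.
        self._readers = itertools.cycle([master, *slaves])

    def server_for(self, sql):
        """Return the name of the server that should run this statement."""
        words = sql.split(None, 1)
        if words and words[0].upper() == "SELECT":
            return next(self._readers)
        # INSERT, UPDATE, DELETE and anything unrecognized go to the master.
        return self.master


# Hypothetical server names.
pool = ReplicatedPool("db-master", ["db-slave-1", "db-slave-2"])
write_target = pool.server_for("UPDATE users SET name = 'bob' WHERE id = 1")
read_targets = [pool.server_for("SELECT * FROM users") for _ in range(4)]
```

Note that every server in this arrangement still stores the full data set and applies the full stream of writes, so adding slaves adds read capacity only.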

Introducing a Cache

Elgg does not use persistent caches. If you are willing to modify the engine of Elgg (which will make upgrading much more difficult), using a cache will likely improve performance greatly. I prefer to cache granular data; for example I would cache the user row from the database instead of the user profile page. However, with Elgg’s use of join queries, it may be difficult to properly invalidate and update the cache when data changes.
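
As a rough sketch of what caching granular data means here, the Python example below caches a single user row and drops it from the cache whenever that row is updated. Everything in it is an assumption for illustration: the `users` table, the `get_user` and `rename_user` helpers, and the plain dictionary standing in for a real cache server such as memcached. It is not Elgg code.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")

cache = {}              # stand-in for a memcached client
stats = {"db_reads": 0}  # counts how often we actually hit the database


def get_user(user_id):
    """Return the user row, reading the database only on a cache miss."""
    key = f"user:{user_id}"
    if key in cache:
        return cache[key]
    stats["db_reads"] += 1
    row = db.execute(
        "SELECT id, name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is not None:
        cache[key] = row
    return row


def rename_user(user_id, new_name):
    """Update the row, then invalidate the one cache entry that depends on it."""
    db.execute("UPDATE users SET name = ? WHERE id = ?", (new_name, user_id))
    cache.pop(f"user:{user_id}", None)


first = get_user(1)       # miss: goes to the database
second = get_user(1)      # hit: served from the cache
rename_user(1, "bob")     # write: cache entry is dropped
third = get_user(1)       # miss again: fresh row is loaded
```

Invalidation stays this simple only because the cached value depends on exactly one row. A cached result of a join depends on rows in several tables, and any write to any of them would have to find and drop it.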

You will have to selectively search for the appropriate places to introduce a cache (I suggest memcached). Start with simple queries that are repeated on every page load but rarely change, such as loading the site settings. Then look for other queries that work on single tables (no joins). If your social network is not a walled garden, there are probably many pages that you can cache on a full-page basis using either memcached or Squid. For example, Wikipedia has customized user pages, but for the most part it can cache full pages.
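
For data that is read on every page load but rarely changes, such as site settings, a time-based expiry is often enough. The sketch below is a minimal in-process version in Python; the `TTLCache` class, the `load_site_settings` function, and the 60-second lifetime are all hypothetical. With memcached the expiry time would be passed to the cache server instead of being tracked by hand.

```python
import time


class TTLCache:
    """Cache values for a fixed number of seconds, then recompute them."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so expiry can be tested without sleeping
        self._store = {}     # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: skip the expensive call
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value


def load_site_settings():
    """Hypothetical stand-in for the query that runs on every page load."""
    return {"site_name": "My Network", "theme": "default"}


settings_cache = TTLCache(ttl_seconds=60)
settings = settings_cache.get_or_compute("site_settings", load_site_settings)
```

The same pattern can hold a whole rendered page for anonymous visitors, keyed by URL. The trade-off is that a change may take up to the full lifetime to become visible.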

Conclusion

Elgg looks like it is shaping up to be the premier open source social network tool by far. It is in active development and has a growing community supporting it. Plugin development is simple and allows for rapid deployment. Use Elgg if you need to get your site live ASAP and are focusing on a niche area with under 100,000 users; beyond that, use it only if you are prepared to do some heavy optimization. But if you want to be the next Facebook or MySpace or even something like Hi5, there is no chance Elgg will be able to scale in its current form. On the bright side, the Elgg developers promise significant speed improvements in the 1.1 version, which is scheduled to be released shortly.



8 Responses to Elgg Scalability

1
alaa halasa posted on November 5th 2008

Hello man, can you please clarify what you mean by 100,000 users? Do you mean 100,000 registered users, or 100,000 users working on the site at the same time?



2
alaa halasa posted on November 5th 2008

Hello man, in your experience, can Elgg in its current release, 1.1, which was released on 30 October, handle 5 million messages per month?



3
Sameer posted on November 12th 2008

Alaa, I meant 100,000 registered users. As for 1.1, I haven’t taken a look yet. But if you mean 5 million private messages per month, that works out to roughly 7,000 private messages per hour, which, with the corresponding traffic on other pages, would probably be too much for Elgg to handle. The database table, in its current structure, would have a tough time coping.

Like I said, I haven’t taken a look at 1.1. I know it has speed improvements over 1.0 but probably not enough to deal with what you are hoping for.



4
joycekim posted on November 13th 2008

Sameer - Thanks for the great post. I would love to hear your thoughts on 1.1 now that it is out. Can you post your thoughts here once you have a chance to look at 1.1? I am looking at Elgg for a bigger site as well and am concerned about scalability…



5
asymptote posted on March 14th 2009

Year 2009. Elgg 1.5 got released, any thoughts on scalability?



6
Sameer posted on April 8th 2009

Sorry, I have not looked into the new release yet. Once I do, I will come up with a follow-up post.



7

Hi Sameer:
There’s an Elgg boot camp at Harvard next month (http://www.elggcampboston09.com/). One of the topics is scaling Elgg. I wonder if you’d be interested in sharing your experiences with the group at the event?



8

Many off-the-shelf social networks (and scripts in general) are built for convenience and ease of customization rather than optimization, so there is always going to be a trade-off between speed and flexibility. One of the downsides of open source software like Elgg is that it’s maintained by a team of enthusiasts, not a business. This is evident in the long development cycles, the missed deadlines, and the lack of urgency in fixing bugs.


