Written by niraj on June 4, 2009 – 2:42 pm
For most people building a social network, it makes sense to get a first version built quickly and test it with the target user group or market. Recently, however, we have worked on social networks where the focus was on an architecture that could scale to tens of millions of users economically and effectively. We analysed the options available to us, and the following is a quick summary of our findings and recommendations.
Goal
To select a platform/framework which will allow a social application to scale to tens of millions of users.
The core issues here are database scalability and database flexibility. We'll touch upon each in turn:
Database Scalability: As it turns out, perfectly normalized, flat databases with absolutely no duplication of data are not necessarily right for applications that need to scale to a huge user base. Such a structure requires a lot of table joins to answer even common queries, and these become computationally very expensive as the user base grows.
Database Flexibility: Complete control over the database structure is required to design an application that must scale to a huge user base, because over time different clustering and partitioning schemes may be needed to help the application scale horizontally.
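To make the scalability point concrete, here is a minimal sketch of the trade-off between a normalized and a denormalized layout. It uses Python with an in-memory SQLite database purely for illustration (not the stack discussed in this post), and every table, column and function name in it is hypothetical.

```python
import sqlite3

# In-memory database, used here only to illustrate the idea.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized layout: names live only in users.
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE friendships (user_id INTEGER, friend_id INTEGER);
    -- Denormalized layout: friend_name is a duplicate of users.name.
    CREATE TABLE friend_list (user_id INTEGER, friend_id INTEGER, friend_name TEXT);
""")

conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "asha"), (2, "ben"), (3, "chen")])
conn.executemany("INSERT INTO friendships VALUES (?, ?)",
                 [(1, 2), (1, 3), (2, 3)])

# Fill the denormalized table once, paying the join cost at write time.
conn.execute("""
    INSERT INTO friend_list
    SELECT f.user_id, f.friend_id, u.name
    FROM friendships f JOIN users u ON u.id = f.friend_id
""")


def friends_normalized(conn, user_id):
    """Read path that needs a join on every request."""
    rows = conn.execute(
        "SELECT u.name FROM friendships f "
        "JOIN users u ON u.id = f.friend_id "
        "WHERE f.user_id = ? ORDER BY u.name",
        (user_id,),
    )
    return [row[0] for row in rows]


def friends_denormalized(conn, user_id):
    """Read path that touches a single table, with no join."""
    rows = conn.execute(
        "SELECT friend_name FROM friend_list "
        "WHERE user_id = ? ORDER BY friend_name",
        (user_id,),
    )
    return [row[0] for row in rows]
```

Both functions return the same list of friends, but the second one never joins. The price is duplicated data: if a user changes their name, every copy in friend_list has to be updated as well.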
Platform Assessment
The following is an assessment of Elgg, Drupal and Symfony on the above parameters:
Elgg: Because of its very flat and normalized database structure, the only way to scale horizontally with Elgg is to copy the database to multiple servers using MySQL replication. The drawback is that every server holds a copy of the complete database, so each one has to be a very powerful machine, which makes the solution expensive. (MySQL clustering is a standard way of doing this.)
Drupal: Drupal has similar problems to Elgg, but its database is not as flat and normalized. Scaling up with MySQL clusters in the manner described for Elgg would probably be cheaper with Drupal.
Symfony: Allows us to use a custom database structure. We can design the database and replicate it as we like. Symfony also uses its own query/object caching mechanism, which is efficient. As an example, the Yahoo bookmarks site supports 20 million users on Symfony.
Proposed Solutions
Solution 1:
Get deep into the Elgg code and customize the database interaction layer to support our own database, which can be designed and partitioned as we wish. The problem with this approach is that subsequent Elgg updates/releases will not be directly usable by us, and we will have to merge them manually into our ‘custom Elgg’. The advantage is that Elgg plugins will work on our ‘custom Elgg’ with little or no change.
Solution 2:
Build the solution from the ground up with Symfony, with our own custom database design that gives us the flexibility to partition it as we like.
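One simple way to partition a custom database is to route each user's data to a shard chosen by hashing the user ID. The sketch below is written in Python rather than PHP/Symfony, purely for illustration. The shard names and the shard_for_user function are hypothetical, and this is only one possible scheme, not a description of any particular production setup.

```python
import hashlib

# Hypothetical shard names; in practice these would map to database servers.
SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]


def shard_for_user(user_id, shards=SHARDS):
    """Pick a shard for a user by hashing the user ID.

    MD5 is used only to spread IDs evenly across shards, not for security.
    The same user ID always maps to the same shard for a given shard list.
    """
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def group_by_shard(user_ids, shards=SHARDS):
    """Group user IDs by shard, e.g. to issue one query per shard."""
    groups = {name: [] for name in shards}
    for user_id in user_ids:
        groups[shard_for_user(user_id, shards)].append(user_id)
    return groups
```

With this scheme each server stores only a fraction of the users, instead of a full copy of the database. Note that plain modulo hashing moves most users to a different shard when the number of shards changes; schemes such as consistent hashing or a lookup table reduce that cost.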
Recommendation
If you are looking to scale up to tens of millions of users, our recommendation would be to write everything from scratch (and not use a ‘platform’ like Elgg/Drupal), which allows you to customize and tweak anything and everything. Symfony seems to have a good reputation for enabling the creation of very large websites, so we would recommend using Symfony as the framework.