Amazon DynamoDB Looks Like A Game Changer

Today Amazon announced the availability of DynamoDB, a NoSQL database service running on SSDs in the cloud. It’s highly scalable, highly reliable, and delivers the performance level that you request from it (via “provisioned throughput,” where you use a dial in their console to select how many reads/writes per second you need).

NoSQL itself is not a game changer. MongoDB has been a pretty awesome NoSQL database, and there’s also Cassandra and many others. But the largest headache that comes with databases is managing reliability and performance as your database grows. Even with MongoDB, to get it right you need to start by choosing the right RAID scheme, then create a backup strategy, and eventually implement a replication setup. DynamoDB throws all of that out the window: Amazon manages the tedious background work for you. Of course, it’s a new product, so we will have to see whether it delivers what has been promised. But AWS generally has a very good reputation, and their pre-launch clients have reported good success with DynamoDB.

One concern users may have is whether they will be “stuck” on DynamoDB if they choose to use it. It doesn’t seem to have many unique features, so lock-in shouldn’t be a big problem. It’s basically a key/value store, so you won’t have much trouble exporting your data to another NoSQL database if you decide to move on. Of course, migrating your database from an RDBMS to DynamoDB (or to any NoSQL DB) will require many changes to your application.

5 Reasons Why There Is No E-commerce Bubble in India

For months we’ve been hearing whispers of an e-commerce bubble in India. But with the failure of VC-backed Taggle in December, those whispers exploded into a roar. The failure of that single startup somehow provided evidence to critics that the whole segment was in serious danger. One year earlier, e-commerce was considered the most promising online category by the tech community, investors, and the media alike. Sentiment has flipped so dramatically that the new conventional wisdom is that 2012 will be a year of reckoning and major consolidation for e-commerce merchants.

But the race to jump off the e-commerce bandwagon is unnecessary. Here are 5 overlooked reasons why we are not in an e-commerce bubble.

REASON #1: A Few Failures Do Not Make a Bubble

It is absolutely normal for there to be a few failures in new markets. VCs don’t have or expect a 100% success rate. There will be at least a few spectacular e-commerce failures, but as long as they are balanced out by an even greater number of successes, we’ll continue to see valuations near where they are now. Yes, each failure will naturally send valuations downward, but it would take a fundamental weakness in the market to justify the conclusion that valuations are at a bubble level.

A bubble implies that the underlying financial assumptions of the industry as a whole, both present and future, are unreasonable. Taggle raised money with the assumption that becoming a leader in the Daily Deal space would make for a highly valuable company. That assumption was correct but Taggle failed due to bad execution. So let’s not make Taggle out to be Bear Stearns or Lehman Brothers.

REASON #2: Daily Deals Is Not E-commerce

Many have rightly identified Daily Deals as a saturated market that is dominated by the existing 2-3 big players such as Snapdeal and Mydala with no room for new entrants. But, the Daily Deal segment is not real e-commerce. Let’s not extrapolate the saturation of a sub-segment of e-commerce to come to conclusions about e-commerce as a whole. Daily Deals has declined in popularity because of audience fatigue. Who can really put up with offers for facials and yoga day after day? It was a novel concept but one with a short shelf life.

But selling products to consumers is a time-tested business. Users have not tired of buying online. They have tired of buying useless coupons online. Taggle’s failure was indicative of the saturation of the daily deal market ONLY. Taggle was unable to create any type of differentiation or scale compared to their competitors. Yes, they pivoted to selling products but that was a case of too little, too late. Once again, don’t read too much into Taggle’s failure. They executed poorly within a declining sub-segment of e-commerce.

To that, some of you might say “but even the retail e-commerce merchants are not profitable.” However…

REASON #3: We’ve Learnt From the Dot Com Crash

The doubters compare the current e-commerce market in India to the dot com boom era in the USA in the late 1990s. During the boom, a whole new set of metrics, such as “eyeballs” or “mindshare,” was created to evaluate the worth of a business. Profitability in the near or even midterm was ignored because investors and executives believed the Internet would grow so substantially it would make up for astonishing levels of spending. This thought process created an environment of epic failures such as eToys.com, which raised $220 million, went public, reached a market cap of $11 billion, and then went bankrupt two years later.

So what makes India different in 2012? Well, for one, we have the above as an example to learn from. Most investors and founders are wary of their startups spending cash too quickly because of lessons learnt from the dot com era. Yes, founders are looking for money to accelerate their growth and improve their tech, logistics, and supply chain. But the scale of these investments is nowhere near what it was in the USA in the late 1990s.

The examples of Amazon.com, NewEgg.com, Drugstore.com, and BlueNile.com are proof of success from the same period. They go to show that if a merchant follows a reasonable financial model, it can survive for years without being profitable. As long as revenue is following a similar trajectory to spending and the company is making a healthy gross profit (revenue minus cost of goods), it will be in a position for a safe landing if the market goes south. Management can reduce its investments and overhead to turn the company profitable or at least conserve cash (albeit at the cost of future growth during the down period). The four merchants mentioned have vastly different stories, but all show that putting growth before profitability in e-commerce is viable when done in an intelligent fashion.

REASON #4: Women Have Yet To Begin Shopping Online

The main reason e-commerce startups are valued at high multiples is the huge growth that is expected. If that growth does not materialize, then it certainly would be a bubble. To estimate the future e-commerce market, most look at indicators such as the growth in online penetration, growth in income, improvements in reliability and coverage of couriers, easier payment methods, etc. These are all important metrics and will contribute greatly to the e-commerce story. But there is one important angle that no one is talking about:

Women love to shop (sorry for stereotyping!). Are shopping malls full of men or are they full of women? It’s the ladies who drive the offline shopping market. So then why is it that the online shopping world is dominated by male shoppers to the point that our internal statistics indicate that online shoppers are at least 80% men? That discrepancy is because women tend to lag behind men when it comes to adopting new technology. Online shopping is still at a nascent stage in India, so it’s to be expected that not many women currently shop online.

But it’s only a matter of time until Indian women start to become comfortable with shopping online. As that process begins, a whole new wave of growth in the e-commerce market will begin. It will effectively double the reach of e-commerce merchants within a few years. In the USA, more women shop online than men do. India may not reach that point for years, but there’s absolutely no reason why women will not start to shop online in large numbers in the near future, which will be a boon for the e-commerce market.

REASON #5: Organic Growth Makes Up For Any Bubble “Burst”

Let’s pretend that six months or a year from now it turns out the e-commerce bubble theorists were right. Valuations were too high, and many more e-commerce merchants failed because they burnt through their money and couldn’t convince VCs to invest in their subsequent down-round.

Okay, now what? Well, if it’s a bubble, you won’t see further investment in the space for years because of the scars of losing money. But what will happen, even in this worst case scenario, is that the failure of those startups will leave a void. Demand for online shopping from consumers will vastly outgrow the number of quality e-commerce startups in a very short time. Entrepreneurs will continue to jump into the e-commerce space. Investors will be unable to ignore the staggering amount of growth in e-commerce and will resume making investments.

You just don’t have real bubbles in categories with extreme growth. You have a few failures and a correction of valuations. The bad startups and unwise investors are weeded out, but then investment activity will continue in that same space to service the amazing growth you’ll see. It’s just not a bubble if we can safely predict we’re going to see investments and growth in that sector for the next ten years. If some of the me-too investors in that category get burnt, then call it a slowdown or correction if you must, but it’s no bubble.

A real bubble is one in which there is no impetus to restart growth. For example, once the US housing market bubble burst, the only thing that would have saved the future of the market was sudden new demand for housing. And of course that did not materialize because there was no population growth or wealth growth in the US. But in this case, if e-commerce valuations do crash, its effects will be minimal. The increasing demand Indians have to shop online is in no way tied to the valuations of these companies and in no way would be affected by a drop in valuation. The continued growth in e-commerce would revitalize the market and spur new VC investments into e-commerce very shortly after any crash. Let’s not call it a bubble if the “bursting” of that bubble isn’t even going to cause lasting damage.

CONCLUSION

E-commerce in India is healthy. Taggle failed, but it was not indicative of a systemic problem. A few other VC-backed e-commerce startups may even fail, but we are seeing and will continue to see many success stories. As long as those startups continue to spend money within reason, the amazing growth in online shopping will make up for moderate short term losses. If anything, we may see a correction, but that will be short-lived as demand from consumers for quality online merchants will push money back into the e-commerce space.

That’s My Desi Life (TMDL)

I recently opened a new website, ThatsMyDesiLife.com, which parodies the “funny” or “different” (depending on your perspective) elements of desi culture via user-submitted short stories. This is basically a spin-off of FML, similar to what others have done by using the fmyscript software.

However, after purchasing that script I added an API on top and have now (in conjunction with another developer) created an iPhone application for the website. I’m still in the process of learning the Cocoa framework and Objective-C, but it’s an interesting change from years of PHP. It’s nice to have a stateful process that can keep track of asynchronous requests and events, unlike the stateless HTTP request cycle. And the development tools (Xcode and Interface Builder) are very nice to use. It was a bit of a pain getting used to a Mac, though.

Anyway, take a look at screenshots from the app:

   

Tamper Data Extension for Firefox

Web developers use Firefox as their browser of choice for many reasons, but perhaps the most significant is the excellent set of extensions available to make development quicker, easier, and more effective. The two extensions that pretty much every developer is already aware of are Firebug and the aptly named Web Developer extension. However, an extension I use almost as often as those two is Tamper Data.

In its most basic form, Tamper Data allows you to view the headers for every request and response your browser handles. With that, you are able to examine the POST requests that your browser sends to a server.

But as the name Tamper Data suggests, it lets you do more than just examine the data being passed. It allows you to trap a request and alter the headers and POST data. Why might that be useful? Here are two of many possible use cases.

Use Cases

  • Form Tampering – Imagine you have a nifty registration form with fancy JavaScript that stops a user from entering invalid information and tells him what went wrong. You still need to make sure your backend validates the data without relying on that JavaScript. One way to do so is to use Tamper Data.

    In your browser, begin by completing the form correctly, but before you hit submit open Tamper Data and press “Start Tamper”. Then return to your browser and submit the form. Tamper Data will pop up asking whether you would like to tamper with the request being sent. Select tamper, modify the POST values to be invalid, and hit okay. Tamper Data will submit the modified version of the form with the invalid data. You can then return to your browser window and verify your backend handled the submitted data as intended.

  • Investigating Session Problems – Sessions are identified via cookies. A server provides a cookie to a user upon its initial response. The user provides that cookie back to the site on each successive request allowing the site to identify future requests made by that same user. This concept allows a developer to keep a user “logged in” between requests.

    Several times I’ve had issues where sessions did not seem to persist. The best first step in identifying the issue is to determine if cookies are being handled properly. Is the server sending a cookie with the proper domain and settings to the user? Is the user sending that cookie in subsequent requests? That’s where Tamper Data comes in. Use it to verify the cookie data being sent in the headers.
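
The backend check that the form tampering test exercises might look something like the following. This is a minimal sketch rather than code from a real project: the function name validate_registration, the field names email and age, and the rules themselves are all assumptions made for illustration.

```php
<?php
// Hypothetical server-side validator for a registration form.
// The field names ("email", "age") and the rules are assumptions.
function validate_registration(array $post): array
{
    $errors = [];

    // filter_var() returns false when the value is not a well-formed address.
    $email = isset($post['email']) ? $post['email'] : '';
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        $errors['email'] = 'Please enter a valid email address.';
    }

    // Accept only whole numbers between 13 and 120.
    $age = isset($post['age']) ? $post['age'] : '';
    $range = ['options' => ['min_range' => 13, 'max_range' => 120]];
    if (filter_var($age, FILTER_VALIDATE_INT, $range) === false) {
        $errors['age'] = 'Age must be a whole number between 13 and 120.';
    }

    return $errors;
}

// A tampered request: values the JavaScript would never have let through.
$tampered = ['email' => 'not-an-email', 'age' => '-5'];
print_r(validate_registration($tampered));
```

If the page accepts the tampered values anyway, the backend is leaning on the JavaScript and needs a check of this kind.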

Installing Tokyo Cabinet and Tokyo Tyrant

As mentioned in a recent post, Tokyo Cabinet is a high-performance key/value store. For simple key lookups it leaves MySQL and other RDBMSs in the dust because it drops their overhead in favor of highly optimized data structures such as hash tables. Please check that earlier post for more detail. If you are looking to get more performance out of your system, Tokyo Cabinet is worth trying out.

The process for installing Tokyo Cabinet is very simple. Here is what I did on CentOS 5, starting with a few required libraries:

[code]
# Tokyo Cabinet links against zlib and bzip2, so their development headers are needed
yum install gzip bzip2 zlib-devel bzip2-devel
[/code]

I then proceeded to download and install the underlying Tokyo Cabinet library:

[code]
wget http://voxel.dl.sourceforge.net/sourceforge/tokyocabinet/tokyocabinet-1.4.20.tar.gz
tar zxf tokyocabinet-1.4.20.tar.gz
cd tokyocabinet-1.4.20
./configure
make
make install
[/code]

On top of Tokyo Cabinet lies Tokyo Tyrant:

[code]
wget http://voxel.dl.sourceforge.net/sourceforge/tokyocabinet/tokyotyrant-1.1.26.tar.gz
tar zxf tokyotyrant-1.1.26.tar.gz
cd tokyotyrant-1.1.26
./configure
make
make install
[/code]

Tokyo Cabinet supports four types of databases: hash, B+ tree, fixed-length, and table. Each type has its own command-line tool for creating, updating, and reading a database: tchmgr, tcbmgr, tcfmgr, and tctmgr respectively.

The following shows how to create a hash database and manipulate it from the command line:

[code]
[root]# tchmgr create db.tch
[root]# tchmgr put db.tch key1 value1
[root]# tchmgr put db.tch key2 value2
[root]# tchmgr list db.tch
key1
key2
[root]# tchmgr get db.tch key1
value1
[/code]

Keep in mind Tokyo Cabinet’s database file extensions must match the type of the database.

  • .tch – Hash
  • .tcb – B+ tree
  • .tcf – Fixed-length
  • .tct – Table
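
The pairing of extensions, database types, and manager commands listed in this post can be captured in a small lookup helper. This is only a convenience sketch; the function name tc_manager_for is hypothetical and not part of Tokyo Cabinet.

```php
<?php
// Map a Tokyo Cabinet file extension to its database type and to the
// command-line manager named in this post. tc_manager_for() is a made-up
// helper, not something Tokyo Cabinet ships.
function tc_manager_for($path)
{
    $tools = [
        'tch' => ['type' => 'hash',         'manager' => 'tchmgr'],
        'tcb' => ['type' => 'B+ tree',      'manager' => 'tcbmgr'],
        'tcf' => ['type' => 'fixed-length', 'manager' => 'tcfmgr'],
        'tct' => ['type' => 'table',        'manager' => 'tctmgr'],
    ];

    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    // null signals an extension that matches no database type.
    return isset($tools[$ext]) ? $tools[$ext] : null;
}

print_r(tc_manager_for('db.tch'));
```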

In my next post, I will show how to connect to your Tokyo Cabinet database using PHP through Tokyo Tyrant.

Using Salts for Extra Security

Typically passwords are saved in databases using a one-way hash function such as md5. In other words, if my password is “hello”, the database stores it as “5d41402abc4b2a76b9719d911017c592”. Each time a user attempts to log in, the md5 algorithm is applied to the provided password, and if the result matches the hash stored in the database, access is granted to the user, as in the following:

[code lang="php"]
// hash_equals() compares the two hashes in constant time (PHP 5.6+)
if (hash_equals($saved_hash, md5($_POST['pwd']))) {
    // user is logged in
} else {
    // user password was incorrect
}
[/code]

Saving this hashed password is more secure than saving plain text passwords because if a database is temporarily compromised, at least the attacker will not have direct access to users’ passwords. However, despite not being able to reverse the hash (remember, it is one-way), the intruder might still be able to crack many of your users’ passwords through precomputation.

An attacker could go through the dictionary (or any set of possible passwords), precomputing the md5 hash of each entry. So, if the attacker were to see that my hash was “5d41402abc4b2a76b9719d911017c592”, he would just look it up in his reverse database and see that this hash maps to “hello”. There are many such reverse lookup databases on the web, and at least one of them successfully cracks the password above.
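
To make the precomputation attack concrete, here is a minimal sketch of such a reverse database. The four-word list is a stand-in for a real dictionary, and the function name build_lookup_table is made up for this example.

```php
<?php
// Build a reverse lookup table: md5 hash => original password.
// The word list is a tiny stand-in for a real dictionary.
function build_lookup_table(array $words)
{
    $table = [];
    foreach ($words as $word) {
        $table[md5($word)] = $word;
    }
    return $table;
}

$table = build_lookup_table(['password', 'letmein', 'hello', 'qwerty']);

// A hash taken from a compromised database.
$stolen_hash = '5d41402abc4b2a76b9719d911017c592';

// Cracking is now a single array lookup rather than a fresh computation.
echo isset($table[$stolen_hash]) ? $table[$stolen_hash] : 'not found'; // prints "hello"
```

The expensive part, hashing every candidate password, is done once and can be reused against every unsalted hash the attacker ever steals.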

Adding Salts

The use of salts greatly decreases the effectiveness of a precomputation attack. A salt is a random string appended to the password before hashing. Typically each user would receive a unique salt.

[code lang="php"]
$saved_hash = md5($pwd . $salt);
[/code]
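
As a rough sketch of how a unique salt per user could be produced and applied: the function names make_salt and salted_md5 and the 16-byte salt length are assumptions made for this example, and md5 is kept only to match the rest of the post.

```php
<?php
// Generate a random salt for a new user. random_bytes() needs PHP 7 or later;
// the 16-byte length is an arbitrary choice for this sketch.
function make_salt()
{
    return bin2hex(random_bytes(16)); // 32 hex characters
}

// Hash the password with the salt appended.
function salted_md5($pwd, $salt)
{
    return md5($pwd . $salt);
}

$salt_a = make_salt();
$salt_b = make_salt();

// The same password produces a different hash under each salt, so a table of
// plain md5 values matches neither of them.
var_dump(salted_md5('hello', $salt_a) === salted_md5('hello', $salt_b));
var_dump(salted_md5('hello', $salt_a) === md5('hello'));
```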

Let’s examine the implications when the salt is public (stored in the compromised database) as opposed to when the salt is private:

  • Public Salt – The attacker’s precomputed reverse lookup table (rainbow tables are a well-known, space-efficient variant) will no longer be useful. He will need to generate a new table for each specific user’s salt. While this is still very possible, the attacker will need to repeat the work for every user, which makes cracking a large number of passwords very challenging.
  • Private Salt – For the attacker to actually compromise a password, he would need to compute the md5 of each possible password appended to each possible salt. If your salts were 32 bits long, there would be about 4.3 billion possible salts, so covering an English dictionary of roughly 200,000 words would take 800 trillion hashes or so. This raises the cost of an attack enormously.
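
The arithmetic behind that estimate can be checked quickly. The dictionary size of 200,000 words is an assumption chosen for this sketch (the post does not give one); with it, the result lands in the same ballpark as the figure quoted above.

```php
<?php
// Assumptions for illustration: a 32-bit salt and a dictionary of roughly
// 200,000 words. The post does not state the dictionary size.
$possible_salts   = 2 ** 32;   // 4,294,967,296
$dictionary_words = 200000;

// Private salt: every word must be tried with every possible salt.
$private_salt_hashes = $possible_salts * $dictionary_words;

// Public salt: the salt is known, so one pass over the dictionary per user.
$public_salt_hashes = $dictionary_words;

echo number_format($private_salt_hashes), "\n"; // 858,993,459,200,000
echo number_format($public_salt_hashes), "\n";  // 200,000
```

Hiding the salt multiplies the attacker’s work per user by the number of possible salts, which is where the gap between the two cases comes from.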

Therefore, public salts are better than no salts, but private salts are much better than public salts. So, how does one keep their salt private? You can’t store it in your database because all this assumes your database was compromised. My suggestion is to create a salt based on the md5 of immutable data related to that user (and be very careful to not delete/modify that piece of data). For example, the user’s registration timestamp could be used. As long as your attacker was unable to also steal your application code the salt would be safe. This works out as the following:

[code lang="php"]
$salt = md5($registration_timestamp);
$saved_hash = md5($password . $salt);
[/code]
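
Checking a login under this scheme means deriving the salt again from the same immutable data. The sketch below assumes the registration timestamp is stored as a string; the function names derive_salt, hash_with_derived_salt, and check_login, and the sample timestamp, are all made up for illustration.

```php
<?php
// Derive the salt from immutable user data instead of storing it.
function derive_salt($registration_timestamp)
{
    return md5($registration_timestamp);
}

function hash_with_derived_salt($password, $registration_timestamp)
{
    return md5($password . derive_salt($registration_timestamp));
}

function check_login($password, $registration_timestamp, $saved_hash)
{
    // hash_equals() compares in constant time (PHP 5.6+).
    $candidate = hash_with_derived_salt($password, $registration_timestamp);
    return hash_equals($saved_hash, $candidate);
}

// At registration: store only the hash and the timestamp, never the salt.
$registered_at = '2009-04-20 14:32:07'; // made-up value
$saved_hash = hash_with_derived_salt('hello', $registered_at);

var_dump(check_login('hello', $registered_at, $saved_hash)); // bool(true)
var_dump(check_login('hullo', $registered_at, $saved_hash)); // bool(false)
```

If the timestamp were ever edited, derive_salt() would return a different value and every login for that user would fail, which is why the data has to stay immutable.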

BarCamp Boston 4

This past weekend I attended my first BarCamp Boston, and I must say it was quite good. BarCamp is a series of “unConferences” that are organized on the fly by attendees, without any formal registration fee. So, of course, the quality of the talks is not quite up to the standard of formal conferences, but you still learn a lot without having to fly across the country (usually to Silicon Valley) or pay $1000+ to attend.

Some of my favorite sessions from the weekend included:

  • iPhone – Development, Marketing, Best Practices, & App Store Ideas
  • Twitter for Business
  • Web App Design for Developers

BarCamp Boston happens only once a year, but there are other groups and events of similar quality that you can participate in throughout the year in Boston.

Secure Communication Over An API With Request Signatures

It’s a very common task for a web application to uniquely identify a visitor by a combination of username and password. However, not as trivial is identifying a third party attempting to use an API to access your web service on behalf of end users of their third party service. You often don’t want to force the end user to create a relationship with your service (such as would be required with OpenID) but instead allow the third party to use your API transparently (such as with Amazon). So, the task at hand is how to uniquely identify the third party making use of your API while preventing forgery and without requiring any sort of login system.

The solution starts with first providing each third party service with a unique public key (in effect an identifier for that service). The public key is used to determine which third party service the request is claiming to be from. As expected, each public key has an associated private key, a secret shared only between you and that service. The private key is used to turn the request message into a signature, here a keyed hash (HMAC). The API user then sends that signature along with the request. If the signature sent by the third party service matches the expected signature, then it’s safe to allow the request.

This method works because only you (the owner of the API) and the third party service have access to the private key. The third party signs its message using the private key and then sends the signature along WITH the original message. The API owner then takes the message and signs it with the same private key (which it looked up based on the public key provided in the request). If the signature generated by the API owner and the signature sent in the request match, it can be trusted that the request came from the owner of the public key.

Here is some PHP code for the third party side of things. Basically the message is the URL with an action of “friends.get”. The message is signed, and the signature is appended to the URL along with the public key. A request is then made to that URL. The API owner will then process the request by verifying the identity of the requester (as mentioned above) and send back an appropriate response.

[php]
// your assigned public key which will be included in the api request
$public_key = "abcdefghijklmnopqrstuvxyz";
// your assigned private key which will always be hidden
$private_key = "zyxvutsrqponmlkjihgfedcba";

// url of the api request which is essentially the message
$url = "http://www.apisite.com/api.php?action=friends.get";

// create a signature based on the api request using the private key
$signature = hash_hmac("sha512", $url, $private_key);

// the final api url with the public key and signature appended
$api_url = $url . "&public_key=" . $public_key . "&signature=" . $signature;

// fetch the url
$api_request_data = file_get_contents($api_url);
[/php]
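
The API owner’s side of the exchange could be sketched along these lines. The function name verify_request and the in-memory $private_keys array are assumptions made for illustration; in practice the lookup would presumably hit a database.

```php
<?php
// Hypothetical key store: public key => private key.
$private_keys = [
    'abcdefghijklmnopqrstuvxyz' => 'zyxvutsrqponmlkjihgfedcba',
];

// Returns true when the supplied signature matches the one the API owner
// computes for the same message with the same private key.
function verify_request($url, $public_key, $signature, array $private_keys)
{
    if (!is_string($signature) || !isset($private_keys[$public_key])) {
        return false; // malformed signature or unknown third party
    }

    $expected = hash_hmac('sha512', $url, $private_keys[$public_key]);

    // hash_equals() compares in constant time (PHP 5.6+).
    return hash_equals($expected, $signature);
}

// Simulate a request signed the way the third party signs it.
$url = 'http://www.apisite.com/api.php?action=friends.get';
$signature = hash_hmac('sha512', $url, 'zyxvutsrqponmlkjihgfedcba');

var_dump(verify_request($url, 'abcdefghijklmnopqrstuvxyz', $signature, $private_keys)); // bool(true)

// Changing the message without re-signing it is rejected.
$forged_url = 'http://www.apisite.com/api.php?action=friends.delete';
var_dump(verify_request($forged_url, 'abcdefghijklmnopqrstuvxyz', $signature, $private_keys)); // bool(false)
```

A real handler would first have to rebuild $url from the incoming request with the public_key and signature parameters stripped off, so that it signs exactly the same string the third party signed.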