Update 2: Yurii responds with the Top 10 Reasons to Avoid Document Databases FUD. Update: Top 10 Reasons to Avoid the SimpleDB Hype by Ryan Park provides a well written counter take. Am I really that fawning? If so, doesn't that make me a dear? All your life you've used a relational database. At the tender age of five you banged out your first SQL query to track your allowance. Your RDBMS allegiance was just assumed, like your politics or religion would have been assumed 100 years ago. They now say--you know them--that relations won't scale and we have to do things differently. New databases like SimpleDB and BigTable are what's different. As a long time RDBMS user what can you expect of SimpleDB? That's what Alex Tolley of MyMeemz.com set out to discover. Like many brave explorers before him, Alex gave a report of his adventures to the Royal Society of the AWS Meetup. Alex told a wild almost unbelievable tale of cultures and practices so different from our own you almost could not believe him. But Alex brought back proof. Using a relational database is a no-brainer when you have a big organization behind you. Someone else worries about the scaling, the indexing, backups, and so on. When you are out on your own there's no one to hear you scream when your site goes down. In these circumstances you just want a database that works and that you never have to worry about again. That's what attracted Alex to SimpleDB. It's trivial to setup and use, no schema required, insert data on the fly with no upfront preparation, and it will scale with no work on your part. You become free from DIAS (Database Induced Anxiety Syndrome). You don't have to think about or babysit your database anymore. It will just work. And from a business perspective your database becomes a variable cost rather than a high fixed cost, which is excellent for the angel food funding. Those are very nice features in a database. But for those with a relational database background there are some major differences that take getting used to. No schema. You don't have to define a schema before you use the database. SimpleDB is an attribute-value store and you can use any you like any time you like. It doesn't care. Very different from Victorian world of the RDBMS. No joins. In relational theory the goal is to minimize update and deletion anomolies by normaling your data into seperate tables related by keys. You then join those tables together when you need the data back. In SimpleDB there are no joins. For many-to-1 relationships this works out great. In SimpleDB attribute values can have multiple values so there's no need to do a join to recover all the values. They are stored together. For many-to-many to relationships life is not so simple. You must code them by hand in your program. This is a common theme in SimpleDB. What the RDBMS does for you automatically must generally be coded by hand with SimpleDB. The wages of scale are more work for the programmer. What a surprise. Two step query process. In a RDBMS you can select which columns are returned in a query. Not so in SimpleDB. In a query SimpleDB just returns back a record ID, not the values of the record. You need to make another trip to the database to get the record contents. So to minimize your latency you would need to spawn off multiple threads. See, more work for the programmer. No sorting. Records are not returned in a sorted order. Values for multi-value attribute fields are not returned in sorted order. That means if you want sorted results you must do the sorting. And it also means you must get all the results back before you can do the sorting. More work for the programmer. Broken cursor. SimpleDB only returns back 250 results at a time. When there are more results you cursor through the result set using a token mechanism. The kicker is you must iterate through the result set sequentially. So iterating through a large result set will take a while. And you can't use your secret EC2 weapon of massive cheap CPU to parallelize the process. More work for the programmer because you have to move logic to the write part of the process instead of the read part because you'll never be able to read fast enough to perform your calculations in a low latency environment. The promise of scaling is fulfilled. Alex tested retrieving 10 record ids from 3 different database sizes. Using a 1K record database it took an average of 141 msecs to retrieve the 10 record ids. For a 100K record database it took 266 msecs on average. For a 1000K record database it took an average of 433 msecs to retrieve the 10 record ids. It's not fast, but it is relatively consistent. That seems to be a theme with these databases. BigTable isn't exactly a speed demon either. One could conclude that for certain needs at least, SimpleDB scales sufficiently well that you can feel comfortable that your database won't bottleneck your system or cause it to crash under load. If you have a complex OLAP style database SimpleDB is not for you. But, if you have a simple structure, you want ease of use, and you want it to scale without your ever lifting a finger ever again, then SimpleDB makes sense. The cost is everything you currently know about using databases is useless and all the cool things we take for granted that a database does, SimpleDB does not do. SimpleDB shifts work out of the database and onto programmers which is why the SimpleDB programming model sucks: it requires a lot more programming to do simple things. I'll argue however that this is the kind of suckiness programmers like. Programmers like problems they can solve with more programming. We don't even care how twisted and inelegant the code is because we can make it work. And as long as we can make it work we are happy. What programmers can't do is make the database scalable through more programming. Making a database scalable is not a solvable problem through more programming. So for programmers the right trade off was made. A scalable database you don't have to worry about for more programming work you already know how to do. How does that sound?
Recently, Google announced Google App Engine, another announcement in the rapidly growing world of cloud computing. This brings up some very serious questions: 1. If we want to take advantage of one of the clouds, are we doomed to be locked-in for life? 2. Must we re-write our existing applications to use the cloud? 3. Do we need to learn a brand new technology or language for the cloud? This post presents a pattern that will enable us to abstract our application code from the underlying cloud provider infrastructure. This will enable us to easily migrate our EXISTING applications to cloud based environment thus avoiding the need for a complete re-write.
Hello everybody! I am a developer of a website with a lot of traffic. Right now we are managing the whole website using perl + postgresql + fastcgi + memcached + mogileFS + lighttpd + roundrobin DNS distributed over 5 servers and I must say it works like a charm, load is stable and everything works very fast and we are recording about 8 million pageviews per day. The only problem is with postgres database since we have it installed only on one server and if this server goes down, the whole "cluster" goes down. That's why we have a master2slave replication so we still have a backup database except that when the master goes down, all inserts/updates are disabled so the whole website is just read only. But this is not a problem since this configuration is working for us and we don't have any problems with it. Right now we are planning to build our own analytics service that would be customized for our needs. We tried various different software packages but were not satisfied with any of them. We want to build something like Google Analytics so it would allow us to create reports in real-time with "drill-down" possibility to make interactive reports. We don't need real-time data to be included in report - we just need a possibility to make different reports very fast. Data can be pre-processed. For example right now we are logging requests into plain text log files in the following format: date | hour | user_id | site_id | action_id | some_other_attributes.. There are about 8 - 9 million requests per day and we want to make real-time reports for example: - number of hits per day (the simplest) - number of hits by unique users per day - number of hits by unique users on specific site per day - number of distinct actions by users on specific site during defined period (e.g. one month, period of X months...) etc. You can display any type of report by combining different columns as well as counting all or only distinct occurrences of certain attributes. I know how to parse these log files and calculate any type of report I want, but it takes time. There are about 9 million rows in each daily log file and if I want to calculate monthly reports I need to parse all daily log files for one month - meaning I have to parse almost 300 million of lines, count what I want and then display the summary. This can take for hours and sometimes it has to be done in more than one step (e.g. calculating a number of users that have been on site_id=1 but not on site_id=2 - in this case I have to export users on site 1, export users on site 2 and then compare results and count the differences). If you take a look at Google Analytics it calculates any type of similar report in real-time. How do they do it? How can someone form a database that could do something like that? If I put 300 million of rows (requests per month) into the Postgres/MySQL table, selects are even slower than parsing plain text log files using Perl... I am aware that they have a huge amount of servers but I am also aware that they have even bigger amount of hits per day. I have a possibility to store and process this kind of analytics on multiple servers at the same time but I don't have enough knowledge how to construct a software and database that would be able to do a job like this. Does somebody have any suggestion? A simple example would be great! We already managed to make some sort of a database for site_id+action_id drilldown but the problem is with "unique users" which is THE information that we need all the time. To calculate unique users during certain period you have to count all the distinct user_ids during that time period. E.g.: select count(distinct user_id) from ... where date>='2008-04-10' and date <='2008-04-18' - with a 9million rows per day this statement would take about two minutes to complete and we are not satisfied with it. Thank you for any hint!
The 2008 MySQL Conference & Expo has now closed, but what is still open for viewing is all the MySQL scaling knowledge that was shared. Planet MySQL is a great source of the goings on:
Hi, I am an owner of an large community website and currently we are having problems with our database architecture. We are using 2 database servers and spread tables across them to divide read/writes. We have about 90% reads and 10% writes. We use Memcached on all our webservers to cache as much as we can, traffic is load balanced between webservers. We have 2 extra servers ready to put to use! We have looked into a couple of solution so far: Continuent Uni/Cluster aka Sequoia -> Commercial version way too expensive and Java isn't as fast as it suppose to be. MySQL Proxy -> We couldn't find any good example on how to create a master - master with failover scenario. MySQL Clustering -> Seems to be not mature enough, had a lot of performance issues when we tried to go online with it. MySQL DRDB HA -> Only good for failover, cannot be scaled! MySQL Replication -> Well don't get me started ;) So now I turn to you guys to help me out, I am with my hands in my hair and see the site still growning and performance slowly getting to its limit. Really need your help!! HELP!
I haven't developed an AppEngine application yet, I'm just taking a look around their documentation and seeing what stands out for me. It's not the much speculated super cluster VM. AppEngine is solidly grounded in code and structure. It reminds me a little of the guy who ran a website out of S3 with a splash of Heroku thrown in as a chaser. The idea is clearly to take advantage of our massive multi-core future by creating a shared nothing infrastructure based firmly on a core set of infinitely scalable database, storage and CPU services. Don't forget Google also has a few other services to leverage: email, login, blogs, video, search, ads, metrics, and apps. A shared nothing request is a simple beast. By its very nature shared nothing architectures must be composed of services which are themselves already scalable and Google is signing up to supply that scalable infrastructure. Google has been busy creating a platform of out-of-the-box scalable services to build on. Now they have their scripting engine to bind it all together. Everything that could have tied you to a machine is tossed. No disk access, no threads, no sockets, no root, no system calls, no nothing but service based access. Services are king because they are easily made scalable by load balancing and other tricks of the trade that are easily turned behind the scenes, without any application awareness or involvement. Using the CGI interface was not a mistake. CGI is the perfect metaphor for our brave new app container world: get a request, process the request, die, repeat. Using AppEngine you have no choice but to write an app that can be splayed across a pointy well sharpened CPU grid. CGI was devalued because a new process had to be started for every request. It was too slow, too resource intensive. Ironic that in the cloud that's exactly what you want because that's exactly how you cause yourself fewer problems and buy yourself more flexibility. The model is pure abstraction. The implementation is pure pragmatism. Your application exists in the cloud and is in no way tied to any single machine or cluster of machines. CPUs run parallel through your application like a swarm of busy bees while wizards safely hidden in a pocket of space-time can bend reality as much as they desire without the muggles taking notice. Yet the abstraction is implemented in a very specific dynamic language that they already have experience with and have confidence they can make work. It's a pretty smart approach. No surprise I guess. One might ask: is LAMP dead? Certainly not in the way Microsoft was hoping. AppEngine is so much easier to use than the AWS environment of EC2, S3, SQS, and SDB. Creating an app in AWS takes real expertise. That's why I made the comparison of AppEngine to Heroku. Heroku is a load and go approach for RoR whereas AppEngine uses Python. You basically make a Python app using services and it scales. Simple. So simple you can't do much beyond making a web app. Nobody is going to make a super scalable transcoding service out of AppEngine. You simply can't load the needed software because you don't have your own servers. This is where Amazon wins big. But AppEngine does hit a sweet spot in the market: website builders who might have previously went with LAMP. What isn't scalable about AppEngine is the scalability of the complexity of the applications you can build. It's a simple request response system. I didn't notice a cron service, for example. Since you can't write your own services a cron service would give you an opportunity to get a little CPU time of your own to do work. To extend this notion a bit what I would like to see as an event driven state machine service that could drive web services. If email needs to be sent every hour, for example, who will invoke your service every hour so you can get the CPU to send the email? If you have a long running seven step asynchronous event driven algorithm to follow, how will you get the CPU to implement the steps? This may be Google's intent. Or somewhere in the development cycle we may get more features of this sort. But for now it's a serious weakness. Here's are a quick tour of a few interesting points. Please note I'm copying large chunks of their documentation in this post as that seems the quickest way to the finish line...
import wsgiref.handlers from google.appengine.ext import webapp class MainPage(webapp.RequestHandler): def get(self): self.response.headers['Content-Type'] = 'text/plain' self.response.out.write('Hello, webapp World!') def main(): application = webapp.WSGIApplication( [('/', MainPage)], debug=True) wsgiref.handlers.CGIHandler().run(application) if __name__ == "__main__": main()This code defines one request handler, MainPage, mapped to the root URL (/). When webapp receives an HTTP GET request to the URL /, it instantiates the MainPage class and calls the instance's get method. Inside the method, information about the request is available using self.request. Typically, the method sets properties on self.response to prepare the response, then exits. webapp sends a response based on the final state of the MainPage instance. The application itself is represented by a webapp.WSGIApplication instance. The parameter debug=true passed to its constructor tells webapp to print stack traces to the browser output if a handler encounters an error or raises an uncaught exception. You may wish to remove this option from the final version of your application.
Example of creation: from google.appengine.ext import db from google.appengine.api import users class Pet(db.Model): name = db.StringProperty(required=True) type = db.StringProperty(required=True, choices=set("cat", "dog", "bird")) birthdate = db.DateProperty() weight_in_pounds = db.IntegerProperty() spayed_or_neutered = db.BooleanProperty() owner = db.UserProperty() pet = Pet(name="Fluffy", type="cat", owner=users.get_current_user()) pet.weight_in_pounds = 24 pet.put() Example of get, modify, save: if users.get_current_user(): user_pets = db.GqlQuery("SELECT * FROM Pet WHERE pet.owner = :1", users.get_current_user()) for pet in user_pets: pet.spayed_or_neutered = True db.put(user_pets)Looks like your normal overly complex data access. Me, I appreciate the simplicity of a string based property interface.
Scalr is a fully redundant, self-curing and self-scaling hosting environment utilizing Amazon's EC2. It has been recently open sourced on Google Code. Scalr allows you to create server farms through a web-based interface using prebuilt AMI's for load balancers (pound or nginx), app servers (apache, others), databases (mysql master-slave, others), and a generic AMI to build on top of. Scalr promises automatic high-availability and scaling for developers by health and load monitoring. The health of the farm is continuously monitored and maintained. When the Load Average on a type of node goes above a configurable threshold a new node is inserted into the farm to spread the load and the cluster is reconfigured. When a node crashes a new machine of that type is inserted into the farm to replace it. 4 AMI's are provided for load balancers, mysql databases, application servers, and a generic base image to customize. Scalr allows you to further customize each image, bundle the image and use that for future nodes that are inserted into the farm. You can make changes to one machine and use that for a specific type of node. New machines of this type will be brought online to meet current levels and the old machines are terminated one by one. The open source scalr platform with the combination of the static EC2 IP addresses makes elastic computing easier to implement. Check out the blog announcement by Intridea for more info. As AWS conquers the scalable web application hosting space it is time to check out the new Programming Amazon Web Services: S3, EC2, SQS, FPS, and SimpleDB (Programming) book on amazon.com. What do you think of the opportunities of using scalr for automatic scalability?
It is fairly obvious that web site performance can be increased by making the code run faster and optimising the response time. But that only scales up to a point. To really take our web sites to the next level, we need to look at the performance problem from a different angle.
Skype uses PostgreSQL as their backend database. PostgreSQL doesn't get enough run in the database world so I was excited to see how PostgreSQL is used "as the main DB for most of [Skype's] business needs." Their approach is to use a traditional stored procedure interface for accessing data and on top of that layer proxy servers which hash SQL requests to a set of database servers that actually carry out queries. The result is a horizontally partitioned system that they think will scale to handle 1 billion users.
First, code to insert a user in a database: CREATE OR REPLACE FUNCTION insert_user(i_username text) RETURNS text AS $$ BEGIN PERFORM 1 FROM users WHERE username = i_username; IF NOT FOUND THEN INSERT INTO users (username) VALUES (i_username); RETURN 'user created'; ELSE RETURN 'user already exists'; END IF; END; $$ LANGUAGE plpgsql SECURITY DEFINER; Heres the proxy code to distribute the user insert to the correct partition: queries=# CREATE OR REPLACE FUNCTION insert_user(i_username text) RETURNS TEXT AS $$ CLUSTER 'queries'; RUN ON hashtext(i_username); $$ LANGUAGE plproxy; Your SQL query looks normal: SELECT insert_user("username");- The result of a query is exactly that same as if was executed on the remote database. - Currently they can route 1000-2000 requests/sec on Dual Opteron servers to a 16 parition cluster.