Data is getting bigger and more complex by the day, and so are your choices in handling it. From traditional RDBMS to newer NoSQL approaches, Seven Databases in Seven Weeks takes you on a tour of some of the hottest open source databases today. In the tradition of Bruce A. Tate’s Seven Languages in Seven Weeks, this book goes beyond your basic tutorial to explore the essential concepts at the core of each technology.

Buy the eBook and get these DRM-free formats delivered immediately:
  • epub (for iPhone/iPad, Android, eReaders)
  • mobi (for Kindle)
  • PDF

About this Book

  • 354 pages
  • Release: P2.0 (2013-01-28)
  • ISBN: 978-1-93435-692-0

Redis, Neo4j, CouchDB, MongoDB, HBase, Riak, and PostgreSQL: with each database, you’ll tackle a real-world data problem that highlights the concepts and features that make it shine. You’ll explore the five data models employed by these databases: relational, key/value, columnar, document, and graph. See which kinds of problems are best suited to each, and when to use them.
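
As a rough illustration of how the five data models differ, the sketch below stores the same fact ("Alice lives in Portland") in the shape each model encourages, using plain Python structures. It is not taken from the book and does not use any of the seven databases' APIs; all names and sample data here are hypothetical.

```python
import json

# Hypothetical sample data: the same fact ("Alice lives in Portland")
# stored in the shape each data model encourages.

# Relational: fixed-column rows in separate tables, linked by keys.
people = [(1, "Alice", 2)]      # (id, name, city_id)
cities = [(2, "Portland")]      # (id, name)

# Key/value: one opaque value per key; the store does not look inside it.
key_value = {"person:1": json.dumps({"name": "Alice", "city": "Portland"})}

# Columnar: row key -> column family -> column -> value.
columnar = {"person:1": {"info": {"name": "Alice"}, "geo": {"city": "Portland"}}}

# Document: a self-contained, nested record.
documents = [{"_id": 1, "name": "Alice", "address": {"city": "Portland"}}]

# Graph: nodes plus labeled, directed edges between them.
nodes = {1: {"name": "Alice"}, 2: {"name": "Portland"}}
edges = [(1, "LIVES_IN", 2)]


def city_relational(person_id):
    # Join people to cities on city_id.
    city_id = next(c for (pid, _, c) in people if pid == person_id)
    return next(name for (cid, name) in cities if cid == city_id)


def city_key_value(person_id):
    # The client fetches the whole value and decodes it itself.
    return json.loads(key_value[f"person:{person_id}"])["city"]


def city_columnar(person_id):
    # Address a single cell by row key, column family, and column.
    return columnar[f"person:{person_id}"]["geo"]["city"]


def city_document(person_id):
    # Find the document, then read a nested field.
    doc = next(d for d in documents if d["_id"] == person_id)
    return doc["address"]["city"]


def city_graph(person_id):
    # Follow the LIVES_IN edge from the person node to the city node.
    target = next(dst for (src, label, dst) in edges
                  if src == person_id and label == "LIVES_IN")
    return nodes[target]["name"]
```

Each lookup returns the same city, but the work lands in different places: a join in the relational case, client-side decoding for key/value, cell addressing for columnar, nested field access for documents, and edge traversal for graphs.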

You’ll learn how MongoDB and CouchDB are strikingly different, and discover the Dynamo heritage at the heart of Riak. Make your applications faster with Redis and more connected with Neo4j. Use MapReduce to solve Big Data problems. Build clusters of servers using scalable services like Amazon’s Elastic Compute Cloud (EC2).

Understand the tradeoffs between consistency and availability, and when you can use them to your advantage. Use multiple databases in concert to create a platform that’s more than the sum of its parts, or find one that meets all your needs at once.

Seven Databases in Seven Weeks will take you on a deep dive into each of the databases, their strengths and weaknesses, and how to choose the ones that fit your needs.

What You Need:

You’ll need a *nix shell (Mac OS X or Linux preferred; Windows users will need Cygwin), Java 6 (or greater), and Ruby 1.8.7 (or greater). Each chapter will list the downloads required for that database.

Q&A with authors Eric Redmond and Jim Wilson:

1. How did you pick the seven databases?

Eric:

We did have some criteria, even if they weren’t explicit. The databases had to be open source, because we didn’t want to cover any databases that would tie readers to a company. We wanted at least one implementation for each of the five database genres (relational, key/value, columnar, document, graph). Then we chose databases that exemplified some general concepts we wanted to cover, like the CAP theorem or MapReduce. Finally, we chose databases that were good counterpoints to each other: MongoDB and CouchDB as different ways of implementing document stores, and Riak as a Dynamo (Amazon’s database) implementation to compare with HBase as a BigTable (Google’s database) implementation.

Jim:

Our goal with the book was principally to introduce readers to the field of choices they now have. Our selections were largely in service of that goal. Even so, it was a pretty long and iterative process. We knew that no matter which ones we picked there’d be people asking why we did or didn’t include their favorite. It came down to choosing the genres we wanted to discuss and then picking databases that had the right combination of (A) representing their genre and (B) relative popularity.

For example, we picked PostgreSQL since it sticks very closely to the SQL standard and is less well known than other OSS competitors like MySQL. Similarly, even though both HBase and Cassandra are column-oriented databases, we went with HBase because Cassandra is more of a hybrid that incorporates elements from both the BigTable paper and the Dynamo paper.

2. Databases are rapidly changing. What do you wish you’d included now?

Eric:

There are hundreds of database options, but I’m glad to see that our choices are still going strong a year later. However, if I had to do it over again, I would have liked to add a triplestore (like Mulgara), since the semantic web is slowly popularizing this method of data storage. I also would have liked to spend more time on Neo4j’s Cypher language, or to cover Hadoop in a bit of detail, since analytics is a huge part of data storage.

Jim:

Yes, databases are rapidly changing, in two senses. First, the field of available data storage technology has been seeing an explosion in recent years. More and more different sorts of databases are cropping up to fill in various niche needs. In the other sense, the databases themselves are rapidly evolving. Even between minor version releases, modern NoSQL databases incorporate more and more features in order to claim more of the market and remain competitive. In that regard, there’s a bit of convergence happening and it makes choosing one even harder as there are more that can meet your needs all the time.

I still think the five genres and seven databases we chose satisfy the criteria that we set out to achieve. But there are others I’d like to write about as well. These include some old favorites like SQLite and some databases you might not think of as such, like OpenLDAP and Solr (an inverted index/search engine).

3. Why did you decide to write this book?

Eric:

Jim and I discussed writing a book like this for quite some time. About a year and a half ago he sent me an email with no body—the subject was “Seven Databases in Seven Weeks?” The title sold me. We both loved Bruce’s “Seven Languages” book, and this seemed the perfect format to explore this emerging field.

Jim:

As early as March of 2010, Eric and I brainstormed about writing a NoSQL book of some kind. At the time there was a lot of buzz around the term, but also a lot of confusion. We thought we could bring some structure to the discussion and educate people who might not be up to speed yet on all the latest developments.

After reading Bruce A. Tate’s Seven Languages in Seven Weeks I thought, “What about Seven Databases?” Eric submitted a proposal and a few weeks later we were off to the races.

4. What do you see as up and coming databases?

Eric:

I’ve become a big fan of Neo4j. It’s one we covered in the book, but in all honesty we chose it because we wanted to explore an open source graph database. But over the past year it’s really come into its own. I really do believe this is the year we’ll see wider adoption of graph databases.

As for ones we did not cover, I think ElasticSearch is clearly gaining traction. OrientDB is also interesting, as it can act as a relational, key-value, document, or graph database. I think you’ll see more of this multi-genre behavior in the future. And as I hinted at before, triplestores are gaining a bit of traction, too, though their problem set greatly overlaps with that of general graph databases.

Jim:

There are many, of course, but there are at least two that I personally look forward to exploring in more detail: ElasticSearch and doozer.

ElasticSearch is a distributed, peer-based, REST/JSON powered document search engine. Using a distributed Lucene index at its core, ElasticSearch allows REST clients to find documents based on fuzzy criteria. Everyone needs a search engine, and ElasticSearch makes it easy.

doozer is a fast, headless consensus engine. It’s written in the Go programming language by the smart folks at Heroku. doozer is great for storing small blobs of important information that absolutely must be consistent (like cluster configuration metadata), but without a single point of failure.

About the Authors

Eric Redmond has been in the software industry for more than 15 years, working with Fortune 500 companies, governments, and many startups. He is a coder, illustrator, international speaker, and active organizer of several technology groups.

Jim R. Wilson started hacking at the age of 13 and never looked back. He began tinkering with NoSQL databases in 2007 and has contributed code to large-scale open source projects such as MediaWiki and HBase.

Upcoming Author Events

  • 2012-12-05: Bruce Tate
    Mary Poppins Meets the Matrix. A look at the languages in Seven Languages in Seven Weeks. What was going on when these languages were written, and what are the design principles that distinguish each one? Fans use discount code tate250. (London)
  • 2013-04-09: Jim R. Wilson
    A two-day conference for technical writers, documentarians, and all those who write the docs. (Write The Docs, Portland OR)

Comments and Reviews

  • Help Net Security said:

    This book gives great and structured overview of modern databases, and doesn’t delve too deep. Nor should it, as it currently gives all the knowledge you need to choose one database to suit your needs.

  • MatthewHelmke.net said:

    If you have any reason to use or consider using anything other than a more traditional relational database, and aren’t sure which one to try out of the exploding number of new options, this book will help you make sense of the field and better evaluate your options against your current needs. I recommend it.

  • Reading this book was like going on “Mr. Toad’s Wild Ride” at Disney Land. There are turns and twists, you never know what’s around the next corner, but it is a lot of fun.

    —Jeffrey Newman
  • The flow is perfect. On Friday, you’ll be up and running with a new database. On Saturday, you’ll see what it’s like under daily use. By Sunday, you’ll have learned a few tricks that might even surprise the experts! And next week, you’ll vault to another database and have fun all over again.

    —Ian Dees Coauthor, "Using JRuby"
  • Provides a great overview of several key databases that will multiply your data modeling options and skills. Read if you want database envy seven times in a row.

    —Sean Copenhaver Lead Code Commodore backgroundchecks.com
  • This is by far the best substantive overview of modern databases. Unlike the host of tutorials, blog posts, and documentation I have read, this book taught me why I would want to use each type of database and the ways in which I can use them in a way that made me easily understand and retain the information. It was a pleasure to read.

    —Loren Sands-Ramshaw Software Engineer U.S. Department of Defense
  • This is one of the best CouchDB introductions I have seen.

    —Jan Lehnardt Apache CouchDB Developer and Author
  • In an ideal world, the book cover would have been big enough to call this book “Everything you never thought you wanted to know about databases that you can’t possibly live without.” To be fair, “Seven Databases in Seven Weeks” will probably sell better.

    —Dr Nic Williams VP of Technology Engine Yard
  • ‘Seven Databases in Seven Weeks’ is an excellent introduction to all aspects of modern database design and implementation. Even spending a day in each chapter will broaden understanding at all skill levels, from novice to expert— there’s something there for everyone.

    —Jerry Sievert Director of Engineering Daily Insight Group