Published: April 2018
Data is getting bigger and more complex by the day, and so are your choices in handling it. Explore some of the most cutting-edge databases available, from a traditional relational database to newer NoSQL approaches, and make informed decisions about challenging data storage problems. This is the only comprehensive guide to the world of NoSQL databases, with in-depth practical and conceptual introductions to seven different technologies: Redis, Neo4j, CouchDB, MongoDB, HBase, Postgres, and DynamoDB. This second edition includes a new chapter on DynamoDB and updated content for each chapter.
Join Luc Perkins in this episode of the Test & Code podcast at https://testandcode.com/53.
While relational databases such as MySQL remain as relevant as ever, the alternative NoSQL paradigm has opened up new horizons in performance and scalability and changed the way we approach data-centric problems. This book presents the essential concepts behind each database alongside hands-on examples that make each technology come alive.
With each database, tackle a real-world problem that highlights the concepts and features that make it shine. Along the way, explore five database models (relational, key/value, columnar, document, and graph) from the perspective of challenges faced by real applications. Learn how MongoDB and CouchDB are strikingly different, make your applications faster with Redis and more connected with Neo4j, build a cluster of HBase servers using cloud services such as Amazon’s Elastic MapReduce, and more. This new edition brings a brand-new chapter on DynamoDB, updated code samples and exercises, and a more up-to-date account of each database’s feature set.
Whether you’re a programmer building the next big thing, a data scientist seeking solutions to thorny problems, or a technology enthusiast venturing into new territory, you will find something to inspire you in this book.
Q&A with Author Luc Perkins
Q: Why did you choose to work on the second edition of Seven Databases in Seven Weeks?
A: Well, I first became intimately acquainted with the NoSQL space about five years ago, when I took on the role of technical writer at Basho Technologies, the company behind the NoSQL database Riak. From the very beginning I found the space endlessly interesting, so full of promise and inspiring technology yet also very tricky to navigate.
Relational databases are quite interesting to me as well, but they tend to be very structurally similar. NoSQL databases, on the other hand, tend to be much more individualistic, you could say. Each has its own special strengths, weaknesses, and quirks, and presents you with a set of trade-offs you’ve probably never encountered in another database. So the book was an opportunity to take my more localized knowledge of the space and really stretch my knowledge and my thinking outward.
Q: What was the hardest part about working on the book?
A: In general, I’d say making the book up to date. Unsurprisingly, a ton has changed since the original edition. The NoSQL space is notoriously fast moving and it’s hard enough to keep up with one database, let alone seven. That means that I had to check every single code snippet and CLI command and claim and diagram in the book to make sure that it still worked, presented accurate information, etc. Then I had to make sure that newer features are mentioned or showcased when necessary. For a book that’s really seven books in one, this was quite a task, though an extremely rewarding one.
Q: What are some of the main differences between the first and second edition?
A: First, and most importantly, everything in the book works now. We’re all used to bit rot in code but it happens in books, too. Database systems change a lot over time. Many of the CLI commands and code snippets from the first edition eventually started throwing cryptic errors or flat-out not working at all.
But there are some other, more specific changes. The chapter on Riak was removed and replaced with a chapter on Amazon’s DynamoDB. Riak is a fascinating database but its future is very uncertain. DynamoDB is also a fascinating database but it feels like a living, breathing project. Furthermore, the querying language for Neo4j was updated to Cypher (instead of the original and now largely defunct Gremlin).
Q: What’s your favorite database in the book?
A: Oh gosh, that’s very tricky, because I have a special fondness and a place in my heart reserved for each of them. But if I had to pick I’d say Redis. It has a pretty small surface area for such a widely used system and a very well-defined domain of problems that it seeks to address. If I had to build a new application that used all seven databases in the book, the Redis portion of the application would be the one I’d be most eager to work on.
Q: Do you have any general advice for readers? Databases are complex and it may not be readily apparent how even an extremely technically savvy reader should proceed.
A: I’d say take it nice and slow. The content is spread across “days” for a reason. You don’t have to follow the schema we present, of course, but this is not single-sitting material. Take a minute to really absorb the diagrams and technical definitions. Try to understand each database’s “worldview,” so to speak, and use that as a thinking cap for each chapter’s material. Try to imagine times when each database would be indispensable. And if you and a database just aren’t getting along, skip to the next one and come back later. You may come back with fresh insight and a new slate of questions.
Luc Perkins is a customer success engineer at Reflect Technologies, a data reporting and visualization startup in Portland, OR. In the past, he has worked as a technical writer for companies such as Twitter and Basho, and is actively involved in the Write the Docs community of technical writers.
Eric Redmond has been in the software industry for more than 20 years, working with Fortune 500 companies, governments, and many startups. He is a coder, illustrator, international speaker, and active organizer of several technology groups.
Jim R. Wilson is a software engineer at Google creating machine learning visualizations on the Big Picture team. He’s contributed to TensorFlow’s visualization suite, TensorBoard, and other open source projects.