Best Book on Mahout: Mahout in Action

Manning's Mahout in Action Covers Recommendation, Clustering and Classification

Mahout in Action, a new book published by Manning, is a how-to resource for computer programming using the data mining software Mahout for very large data projects.

DISCOUNT: Get 37% off any format of Mahout in Action at the Manning website. Use this discount code in the coupon box: mahout37

Manning website for Mahout in Action

Mahout in Action gives direct how-to advice to help developers put Mahout software to use on their own large data projects. It also provides resources to help readers actively participate in the Mahout community through online discussion groups. Why do I think Mahout in Action is the best book on Mahout? Partly because I'm one of the authors and naturally biased. Here are three other reasons Mahout in Action is a good book on Mahout:

1. It's gotten excellent comments from outside reviewers and early users.

2. Mahout in Action combines fundamental concepts and specific how-to information with concrete examples using real world data.

3. It's essentially the only comprehensive, how-to book on Mahout currently available.

Mine is not an unbiased review, but I hope it will help you decide whether Mahout in Action is a good choice for you. In addition, I've included links to what others are saying about the book.

Get Mahout in Action Now

Mahout in Action is an in-depth book about how to use Mahout software, published by Manning Publications Co.

The book is now available as an eBook and in print. You no longer have to rely on the MEAP ("Manning Early Access Program"): the finished book is ready.

Is there a free download of the eBook for Mahout in Action? Not legally, unless you buy a copy of the print book or eBook. You can choose to get both, or just the eBook alone for a reduced price. The eBook is available from the publisher, Manning, with or without the print version.
WARNING: piracy sites often contain malware

Buy the book from Manning's official site and save 37% with this SPECIAL DISCOUNT CODE: mahout37
Manning's website for Mahout in Action

NEW WITH eBOOK: The authors provide eBook enhancements in the form of audio and video segments that expand on key ideas. For example, Chapter 13 from the Classification section of Mahout in Action has a video segment on online learning algorithms. In addition, you get access to an online author forum.

If you later decide you want the print version, the full price of the eBook you've already bought is deducted from the print price. Or if you start with the print book, you get the eBook automatically at no additional cost.

Apache Mahout: Open Source Software

Apache Mahout is an open source software project that provides scalable approaches for machine learning. This data mining library is an excellent resource for the advanced computer programmer. Mahout is particularly useful with very large data sets. It excels in projects with a million training examples, and at 10 million or more training examples, Mahout is pretty much the only game in town.

UPDATE: New slides and video presentations from the first Mahout Meet Up held in San Francisco in November 2011 are available online. Presenters were Mahout co-founder and committer Grant Ingersoll and Mahout committer and Mahout in Action author Ted Dunning. Grant talks about using clustering and classification for data mining with email. Ted talks about how Mahout uses random projections to improve performance in machine learning while maintaining quality.

CLICK LINK for free slides and video from the presentations:

Slides and Video Mahout Meet-up November 2011 San Francisco

Who Should Use This Book?

Mahout developer hard at work, © E. Friedman 2011, all rights reserved

Mahout in Action is not a textbook on machine learning. Rather, it is a practical approach to large data machine learning techniques, with complete examples and guidance. The goal is to enable the reader to apply Mahout to solve her or his own problems.

Wondering how to deal with huge requirements for speed and scale in your project? This book is for you.

Doing research in machine learning? This book is for you.

Leading a product team or start-up that needs cost-effective ways to scale solutions? This book is for you.

Planning to become an active part of the Mahout community? This book is for you.

Is Mahout in Action a Reliable Source?

Mahout in Action, published by Manning 2011

Three of the four authors of Mahout in Action are Apache Mahout committers who helped develop the Mahout project. Sean Owen wrote the first third of the book, which focuses on recommendation. Robin Anil wrote the middle section, which explains clustering applications using Apache Mahout. Ted Dunning, also a Mahout committer, co-wrote the final section of the book, which explores the use of Mahout for classification projects with very large data sets. I am the other co-author of the classification section (Ellen Friedman). I'm a scientist and technical writer, but unlike the other three Mahout in Action authors, I am not a Mahout committer.

External Reviewers

One of the founders and core committers of Mahout, Grant Ingersoll, reviewed Mahout in Action at this link:

Mahout in Action Review

An in-depth review of Mahout in Action in terms of how it covers data mining and how it explains Mahout in particular:

Tips and Tricks

For reviews of Mahout in Action by readers on Amazon, go to this link:

Amazon Reviews of Mahout in Action

Print Book Also Available From Amazon

Print copies are available from Amazon and are now in stock and ready to ship. Go to this link to read additional reviews of the book and to see how many users have given it a "thumbs up".

Mahout Logo

See the little person sitting atop the elephant in the Mahout logo? What does it mean?

Mahout is an open source software library that uses Apache Hadoop among other techniques for scalable data mining. Hadoop's logo is a little yellow elephant.

So who drives an elephant to go further? The Indian term for the elephant's driver is the Mahout.

To visit the Apache Mahout home page, use this link: Apache Mahout

More From the Authors: Sean Owen

Mahout in Action co-author Sean Owen

Sean was a Software Engineer with Google's Mobile Web search. He is a Mahout committer and an Apache Foundation V.P. Sean authored the recommendation section of Mahout in Action.

More From the Authors: Robin Anil

Mahout in Action co-author Robin Anil

Robin works for Google, is a Mahout committer and authored the clustering section of the book Mahout in Action. You may enjoy following his blog:

Robin Anil's Blog

Update: Robin presented a workshop with Ted Dunning at the OSCON meeting in Portland, Oregon on July 25-29, 2011. See the slides at Hands On Mahout OSCON 2011 Slides

More From the Authors: Ted Dunning

Ted Dunning

Ted is Chief Application Architect at MapR Technologies. He is a committer and PMC member for the Apache Mahout project. Ted co-authored the classification section of Mahout in Action.

Ted's Blog is
Surprise and Coincidence: Musings from the Long Tail

Ted presented a workshop at OSCON in Portland, Oregon on July 27-28, 2011 with Robin Anil. See the slides at Hands On Mahout OSCON 2011 Slides

UPDATE: New slides and video presentations from the first Mahout Meet Up held in San Francisco in November 2011 are available online. Presenters were Mahout co-founder and committer Grant Ingersoll and Mahout in Action author Ted Dunning. CLICK LINK to watch video or view slides:
Slides and Video Mahout Meet-up November 2011 San Francisco

Or meet Ted at one of the Bay Area Hadoop User Group (HUG) meetings or at a meeting of the ACM, where he is a frequent participant or speaker.

More from the Authors: Ellen Friedman

Ellen Friedman, co-author Mahout in Action, photo copyright E. Friedman

I wrote this not-unbiased review of Mahout in Action, and as mentioned above, I am one of the four authors. I used to do laboratory research in molecular biology and biochemistry, and now I work on a variety of education, scientific and communications projects, including writing textbooks and evaluating medical education.

Mahout Mailing List

Want to be active in the Mahout community? Do you have questions about Mahout? Why not join the discussion by jumping in?

Mahout Mailing List

Mahout uses Hadoop

There's a connection between the logos for Hadoop and Mahout because of the connection between the projects: Mahout uses Hadoop for many of its applications. For more about Hadoop (and its yellow elephant), see an overview of Hadoop.

Mahout Workshop: Hands-on Mahout

JULY 2011: Two of the authors of Mahout in Action presented a hands-on workshop in Portland, Oregon on July 27, 2011 at the OSCON conference.

Ted Dunning and Robin Anil delivered a 90-minute workshop titled "Hands-on Mahout". For slides from the presentation,
CLICK THIS LINK: Hands On Mahout OSCON 2011 Slides

Stay tuned for information about video from this session.

Update: Berlin BuzzWords conference June 2011

Top of Bundestag, Berlin © E. Friedman 2011

One of the authors of Mahout in Action was a keynote speaker at the open source software conference Berlin BuzzWords, 6-7 June 2011. The interactive keynote address was presented by Ted Dunning to an audience of about 400 key software developers involved with open source software projects such as those from the Apache Foundation.

Another keynote talk was given by Hadoop, Lucene and Avro developer Doug Cutting. Both addresses were enthusiastically received. The conference was a well-organized success.

For more on Hadoop, other Apache Foundation software projects and the keynote address, check Ted's blog:
Surprise and Coincidence: Musings from the Long Tail

More on Mahout

See more from Mahout committers such as Isabel Drost at this link.

If you are in the Bay Area...

The Bay Area is a focus for computing, so if you are in the Bay Area, you may enjoy this restaurant recommendation. Remember: life is more than work! Happy eating.

Your comments or questions are welcome.

  • aesta1 Dec 6, 2011 @ 3:10 am
    This isa bit foreign to me but I enjoyed it as this is my way of keeping up with what is there.
  • chiakisato Dec 5, 2011 @ 4:09 am
    Very good descriptions!!!
  • rasisonia Dec 3, 2011 @ 9:39 am
    very interesting..
  • gottaloveit Dec 2, 2011 @ 12:00 pm
    I've always wanted to know more about computer programming. Sounds like a great resource book if I ever get around to it! P.S, I hope I'm not posting this comment a thousand times - the security word won't take!
  • Tipi Dec 1, 2011 @ 2:43 pm
    Sopping back to love this once again.

Food for Thought

Drinking green or oolong tea can help you concentrate. Try these suggestions.

by

efriedman

My favorite time of the day: now.

My interests: the fine art of knowledge from sciences to painting.

My favorite place: outdoors, preferably mou...

Related Topic: Hadoop 

Hadoop: The Definitive Guide

Amazon Price: $30.10 (as of 06/02/2012)

This book should not only provide you with a good guide to using Hadoop, it also gives you a good drawing of an elephant.

Mahout in Action from Amazon 

Mahout in Action

Amazon Price: $25.94 (as of 06/02/2012)

This Mahout how-to book is getting an enthusiastic response from Amazon customers. Check the site to see reviews.

Amazon Spotlight Personal Review 

Lucene in Action, Second Edition: Covers Apache Lucene 3.0

Amazon Price: $23.99 (as of 06/02/2012)

Lucene is another Apache Foundation open source project, and Lucene in Action as a how-to book has received excellent reviews.