Tuesday, December 18, 2012

Apache Mahout - Scalable machine learning and data mining

This looks like something worth learning more about...

Don

What is Apache Mahout?

The goal of the Apache Mahout™ project is to build scalable machine learning libraries.

Mahout currently has

  • Collaborative Filtering
  • User and Item based recommenders
  • K-Means, Fuzzy K-Means clustering
  • Mean Shift clustering
  • Dirichlet process clustering
  • Latent Dirichlet Allocation
  • Singular value decomposition
  • Parallel Frequent Pattern mining
  • Complementary Naive Bayes classifier
  • Random forest decision tree based classifier
  • High-performance Java collections (previously Colt collections)
  • A vibrant community
  • and much more cool stuff to come this summer, thanks to Google Summer of Code

By scalable we mean:

Scalable to reasonably large data sets. Our core algorithms for clustering, classification, and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized so that non-distributed algorithms also perform well.
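To make the map/reduce paradigm a little more concrete, here is a minimal single-process sketch in plain Python. It is not Mahout or Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `mean_rating_per_item`) and the `(user, item, rating)` record layout are assumptions made up for illustration. It only shows the general shape of the paradigm: a mapper emits key/value pairs, the pairs are grouped by key, and a reducer aggregates each group.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in grouped.items()}


def rating_mapper(record):
    """Hypothetical mapper: emit (item, rating) for one (user, item, rating) record."""
    _user, item, rating = record
    return [(item, rating)]


def mean_reducer(_item, ratings):
    """Hypothetical reducer: average all ratings seen for one item."""
    return sum(ratings) / len(ratings)


def mean_rating_per_item(records):
    """Run the three phases in sequence on an in-memory list of records."""
    return reduce_phase(shuffle(map_phase(records, rating_mapper)), mean_reducer)


if __name__ == "__main__":
    ratings = [("alice", "book", 4.0), ("bob", "book", 2.0), ("alice", "film", 5.0)]
    print(mean_rating_per_item(ratings))
```

On a real cluster the map and reduce steps would run in parallel on different machines over different slices of the data; here everything runs in one process so the data flow is easy to follow.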

Scalable to support your business case. Mahout is distributed under the commercially friendly Apache Software License.

Scalable community. The goal of Mahout is to build a vibrant, responsive, diverse community to facilitate discussions not only on the project itself but also on potential use cases. Come to the mailing lists to find out more.

Currently Mahout supports mainly four use cases:

  • Recommendation mining takes users' behavior and from that tries to find items users might like.
  • Clustering takes, for example, text documents and groups them into sets of topically related documents.
  • Classification learns from existing categorized documents what documents of a specific category look like and is able to assign unlabelled documents to the (hopefully) correct category.
  • Frequent itemset mining takes a set of item groups (terms in a query session, shopping cart content) and identifies which individual items usually appear together.
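As a rough illustration of the frequent itemset idea, the sketch below counts how often pairs of items show up in the same shopping cart and keeps the pairs that reach a minimum count. It is a brute-force pair counter in plain Python, not Mahout's Parallel Frequent Pattern mining implementation; the function name `frequent_pairs`, the `min_support` parameter, and the sample baskets are all assumptions chosen for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return item pairs that appear together in at least `min_support` baskets."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair is counted once per basket
        # and always appears in the same (alphabetical) order.
        items = sorted(set(basket))
        counts.update(combinations(items, 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    baskets = [
        ["bread", "milk"],
        ["bread", "milk", "eggs"],
        ["milk", "eggs"],
        ["bread", "butter"],
    ]
    for pair, n in sorted(frequent_pairs(baskets, min_support=2).items()):
        print(pair, n)
```

Counting every pair this way grows quadratically with basket size, which is one reason a library like Mahout uses more careful, distributed algorithms for large data sets.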

Interested in helping? See the wiki or join the mailing lists.

Read More...
http://mahout.apache.org/

Apache Mahout machine learning library's goal is to build scalable machine learning libraries
Linux Today - Mahout, There It Is! Open Source Algorithms Remake Overstock.com
Mahout, There It Is! Open Source Algorithms Remake Overstock.com | Wired Enterprise | Wired.com
MyCurrent
Apache Mahout: Scalable machine learning and data mining
Downloads - Apache Mahout - Apache Software Foundation
Mahout Wiki - Apache Mahout - Apache Software Foundation
Quickstart - Apache Mahout - Apache Software Foundation
Mahout - ASF JIRA
Mailing Lists, IRC and Archives - Apache Mahout - Apache Software Foundation
