Data Mining Resources
Below are lists of relevant data mining resources and tools.
Introductory Data Mining Resources
- From Data Mining to Knowledge Discovery in Databases (1996)
- Data Mining Methodology: The Virtuous Cycle Revisited, In Mastering Data Mining, Michael J.A. Berry and Gordon S. Linoff, Wiley, 2000, Chapter 3.
- CRISP-DM 1.0: Step-by-step Data Mining Guide, SPSS Inc. (Consortium Effort).
- Data Mining Algorithm Overview, Two Crows
- Challenging Problems in Data Mining (2006)
- ROC Graphs: Notes and Practical Considerations for Researchers
- Surveying the Field: Current Data Mining Applications, Analytic Tools, and Practical Challenges
- Success Stories in Data Mining
- Data Mining Presentation - brief overview covering the CRISP-DM model and the KDD process.
- Introduction to Weka
- Connecting Rapid Miner to an external database
Tools
- Weka - open source machine learning software
- R - open source software environment for statistical computing and graphics. See also a tutorial for programmers and two reference cards.
- Social Network Graphing Tools - list of tools useful for displaying and analyzing social networks
- Record Linkage Tools - tools specific to entity resolution and record linkage
- LingPipe - suite of Java libraries for the linguistic analysis of human language
- MALLET - Java-based package for statistical natural language processing (NLP), document classification, clustering, topic modeling, information extraction, and other machine learning applications to text
- KeplerWeka - integration of the full functionality of the WEKA Machine Learning Workbench into the open-source scientific workflow system Kepler
- MLOSS - Machine Learning Open Source Software
