R-Rattle Training Video

Today we are introducing a very powerful data mining tool called Rattle. An interesting feature of Rattle is that it is a GUI that sits on top of R: it gives users a point-and-click interface for building data mining projects, predictive models, and so on, without writing a single line of R code.

In the featured video we build various predictive models on a credit scoring dataset and compare their performance against each other using ROC curves. The models built are:

  • Decision Trees
  • Random Forests
  • Adaptive Boosting
  • Support Vector Machines
  • Logistic Regression
  • Neural Networks

This was done without writing any R code (except to launch Rattle). The total video length is about 17 minutes; it takes you through data import in Rattle, variable exploration, model building, and model evaluation using ROC curves.
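
For readers curious about what an ROC-based comparison measures behind the GUI, here is a minimal sketch in base R. It is not the code Rattle runs and it does not use the credit scoring dataset: the labels and the two score vectors (`model_a`, `model_b`) are invented toy values, and `auc` is a hypothetical helper that computes the area under the ROC curve through the rank-sum identity.

```r
# Area under the ROC curve via the rank-sum (Mann-Whitney) identity.
# labels: 0/1 vector (1 = positive class); scores: predicted scores.
# 'auc' is a hypothetical helper written for this sketch, not a Rattle function.
auc <- function(labels, scores) {
  stopifnot(length(labels) == length(scores), all(labels %in% c(0, 1)))
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(scores)  # average ranks are used for tied scores
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data, invented for illustration only (not the dataset from the video).
labels  <- c(0, 0, 1, 1, 0, 1, 0, 1)
model_a <- c(0.10, 0.30, 0.35, 0.80, 0.20, 0.70, 0.40, 0.90)
model_b <- c(0.50, 0.20, 0.30, 0.60, 0.70, 0.40, 0.10, 0.80)

auc_a <- auc(labels, model_a)
auc_b <- auc(labels, model_b)
cat("AUC model_a:", auc_a, "\n")
cat("AUC model_b:", auc_b, "\n")
```

On these toy values `model_a` ranks the positive cases above the negative ones more often than `model_b` does, so it has the larger area under the curve. That is the same reading you apply when one model's ROC curve sits above another's in Rattle's evaluation output.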

This video is aimed at people with an advanced analytics background: we have not explained much of the methodology behind the techniques, only how to apply them in Rattle. If you understand the methodology and are not working in the analytics industry, you should jump ship immediately; greener pastures await. (Seriously, if you understand even 40% of this, you cannot be unemployed!)

For those who want to understand and learn the material shown in the video, check out our website, www.learnanalytics.in. We specialize in analytics training for students worldwide and provide SAS, R, and advanced analytics training.

For questions, queries, or batch timings, drop a mail to info@learnanalytics.in

  1. Click here to download R
  2. Click here to download Rattle
  3. Click here to download the dataset discussed in the video

To install Rattle, simply follow the instructions on the website linked above. If you have problems installing, drop us a mail and we will be glad to help you out. We will follow up with a detailed post on R and Rattle installation with troubleshooting.
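
Once R itself is installed, launching Rattle from the R console typically looks like the sketch below. It assumes the rattle package can be installed from CRAN on your platform and that the GUI libraries it depends on are already set up; the instructions on the linked website remain the authoritative route.

```r
# Install the rattle package once if it is not already present.
# Assumption: a CRAN mirror is reachable and the package installs on your platform.
if (!requireNamespace("rattle", quietly = TRUE)) {
  install.packages("rattle")
}

library(rattle)  # load the package into the current R session
rattle()         # open the Rattle point-and-click interface
```

This is the only R code needed to follow the video; everything after `rattle()` happens in the GUI.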

Drop in comments to give us feedback!

Learn Analytics Team

Comments


23 thoughts on “R-Rattle Training Video”

  1. Karan and team ,

    Agyani from California here. I was learning R/Rattle to analyze stock data; thank you for the detailed video. My questions are simple: I need a little clarity on your workflow for the following prerequisites for predictive modelling.

    1. Workflow for creating the different versions of the Excel CSV file for analysis,
    i.e. Raw data -> Massaging and cleaning -> Normalization -> Creating the Train CSV data file -> Running the model (glm, etc.) -> Getting the ROC (for some reason stock data does not give an ROC graph) -> Saving the Rattle model file (you call it the logistic file in your telecom data) -> Running this model on the “Validation” CSV file

    2. Can you please explain why you need two CSV files, one “Train” and one “Validation”? (I just want to understand this naming process and when each file is created.)

    Thanks in advance
    Agyani

  2. Hi ,

    Excellent video. Thanks for sharing.

    However, I have followed all the steps, but my charts are not as colorful as in the video. Is there a package I missed installing? How can I get colorful charts like those in the video? Thanks.

  3. Thank you very much. If the training and testing datasets are given separately, as on Kaggle, how do we validate the test dataset in Rattle?

    • It will also work on character variables automatically; otherwise, you can use the Transform tab to recode continuous variables as categoric.

  4. I absolutely loved this training video, many thanks. It was fast, but very informative and covered a lot of what I’m trying to do and understand.

  5. Hi Karan, greetings from Silicon Valley! I was in the process of installing and trying out ‘rattle’ on my Mac, ran into issues in installing ‘RGtk2’, and got linked to your site from TogaWare. Came in to find an answer for a mechanistic question, instead found a gem of fantastic content in your introductory video on ‘rattle’. You exhibit a great ability to ‘teach’, not talking down at students, pointing to exploratory topics beyond the scope of this talk, and looking to convey your overall enthusiasm for the subject. I am tempted to linger / come back for more 🙂

  6. Hi @ all,

    I can tell you why the NN wasn’t created.
    The values have to be scaled to [0,1]. (I guess that’s why this option exists.) If you then set the nodes to 4 (rule of thumb from sqrt(20)), you get the best result of all.

    Very Nice tutorial – didn’t know rattle before,
    Claus

  7. I am facing a problem with the data in your CSV file. I followed the same steps, but when I try the box plot distribution for either duration or amount of credit, it shows only two boxes with no color. As I understand it, it should show three boxes: one for all the data and the other two for 0 and 1. Please help me understand why I am facing this problem and whether I am missing something.

    Note: In the data, the respective min, median, mean, and max all come out the same as in the demonstration.

      • I am not getting any error message in the console. When I start R, I get the following message about the version in the console.

        Copyright (C) 2012 The R Foundation for Statistical Computing
        ISBN 3-900051-07-0
        Platform: x86_64-w64-mingw32/x64 (64-bit)

        R is free software and comes with ABSOLUTELY NO WARRANTY.
        You are welcome to redistribute it under certain conditions.
        Type ‘license()’ or ‘licence()’ for distribution details.

        Natural language support but running in an English locale

        R is a collaborative project with many contributors.
        Type ‘contributors()’ for more information and
        ‘citation()’ on how to cite R or R packages in publications.

        Type ‘demo()’ for some demos, ‘help()’ for on-line help, or
        ‘help.start()’ for an HTML browser interface to help.
        Type ‘q()’ to quit R.

        Attempting to load the environment ‘package:rattle’
        Rattle: A free graphical interface for data mining with R.
        Version 2.6.21 Copyright (c) 2006-2012 Togaware Pty Ltd.
        Type ‘rattle()’ to shake, rattle, and roll your data.
        [Previously saved workspace restored]

        > rattle()
        >

        It does not generate any console message when producing the box plot distribution of duration.

        Everything else looks the same except the graphical presentation.

        • Hi

          I am facing the same issue that Sudipto faced. Karan, please help us understand why this is happening.

          Thanks,
          Vinayak

  8. Pingback: Installing Rattle and R | Analytics Training
