Data Analysis ToolPak – Karl Pearson Correlation Matrix

In this video segment, I talk about enabling the Data Analysis ToolPak – Addin in excel. This is a powerful and rarely explored feature in MS Excel which can do a lot of stuff. In this series of video demonstration, I will be exploring these features.

The first of which is creating a Correlation matrix in Excel. The file used can be downloaded  here –> car_sales.


SAS Programming – Day 10

Arrays in SAS

  • SAS arrays are another way to temporarily group and refer to SAS variables. A SAS array provides a different name to reference a group of variables
  • Array statement begins with keyword ARRAY followed by array name and N – number of elements within array
  • _temporay_  option is used to create a temporary array

Base SAS Programming – Day 9

Looping in SAS

  1. Functions in SAS:  Continued
    • Text function:
      •  Compress – Returns a character string with specified characters removed from the original string
      • Index – Returns the position of the specific character in a string
    • Use of upcase, lowcase and propcase functions in string comparison
    • Math / Stat functions: Like Int, Round, Sum, Mean etc
    • Difference between Mean value (or any aggregate function) of Proc Means / Summary and Mean function
  2. Loops in SAS : Do Loops
    • Loops are used to iterate through every observation for specified number of times to obtain a desired result
    • Types of Loops:
      • Do Loop
      • Do While
      • Do Until
    • Default increments by 1
    • Can use BY to increment by any value other than 1

Base SAS Programming – Day 8

Functions in SAS (Continued)

  1. Text Functions
    • Catx – to concatenate characters / strings with any delimiter. Cat is also a function used to concatenate characters / strings
    • Trim – to remove trailing blanks in a string
    • Tranwrd – Replaces all occurrences of a substring in a character string
    • Translate – Replaces specific characters in a character expression.
    • Do check out Compress and other Text functions like Upcase, Lowcase and Propcase
  2. Date and Time Functions
    • Day – Returns the Day from a SAS date value
    • Month – Returns the Month from a SAS date value
    • Year – Returns the year from a SAS date value
    • Week – Returns the week number from a SAS date value (Try weekday function)
    • Mdy – Concatenates Month, Day and year into a date value
    • Today – Returns current system date
    • Datdif – Difference between 2 dates in days
    • Yeardif – Difference between 2 dates in years
    • INTCK – Returns the number of interval boundaries of a given kind that lie between 2 dates.
    • INTNX – Increments a date value by a given time interval, and returns a date

Base SAS Programming – Day 1

The BASE SAS video series begins with the assumption that the student viewer has no background in SAS programming and in fact very limited to no prior exposure to any kind of programming at all. Base SAS video series comprises of 9 video lectures (Average of hour and a half each), plus additional videos covering Advanced topics like PROC SQL and SAS Macros.

Day 1 topics as below —

  1. Intro to Libraries, Data step and Proc step in SAS
  2. Data step example:
    • Creating a sample dataset using Datalines / Cards statement
    • Understanding data types in SAS
    • Informat and Format
    • Label
  3.  Proc step example: Overview
    • Proc Contents – to know the dataset structure, with list of variables and number of observations
      • Varnum option – to list the variables of a dataset in creation order (otherwise the list is in alphabetic order)
    • Proc Print – to view the data in a dataset.
      • Label option and Var statement
    • Proc Means – to extract basic statistics of a numerical variables in a dataset like N, Mean, Standard Deviation, Minimum and Maximum (default)

Time Series using Holt’s Linear Exponential Smoothing (Seasonal Variation)

In this video , we explain how to implement Exponential Smoothing on Excel itself to generate a forecast.

We begin by explaining the decomposition of time series into 4 components

  • Trend (Long Term Progression of the Series)
  • Seasonality
  • Cyclic
  • Irregular/Noise

We then demonstrate the use of Moving averages and single exponential smoothing to extract the trend from the series. By subtracting trend from the original signal we can extract the seasonal variation around the trend.

Further we demonstrate the Holt’s technique for double exponential smoothing in a linear upwards trend and how we can use it for forecasting. Furthermore, by using the length of the season, we average out the seasonal fluctuation around the trend (thereby try to eliminate the irregular component) and then combine the forecasted trend and seasonal fluctuation to get an integrated forecast.

All of the above has been demonstrated using MS Excel and simple formulae, and then we proceed to demonstrate the use of IBM SPSS to do the same.

The worksheet with the implementation can be downloaded from here.

Online Batch on SAS Programming (Base and Advanced)



Learn-Analytics is starting an online batch on SAS Programming (Base and Advanced) on Saturday, Jan 14th. Classes are scheduled at 2000 IST (1430 GMT, 0930 Eastern), 3 hours a day. For those wishing to register for the training, the first two classes (6 hours of training) will be free to attend and enabling participants to evaluate the trainer as well as the delivery mechanism.

Medium of training will be through Webex, the instructor will take the participants through hands on sessions using datasets and case studies with exercises at the end of each session. Recordings of the session will be made available to all participants post the training for a period of 3 months.

For the detailed modules design and topics covered, click here. Interested candidates can drop us an email at or fill in the contact form here,  we will forward the webex invitation link for the free evaluation.


Demo on Time Series using Exponential Smoothing (IBM SPSS and Excel)

Following up on our last week’s Webex session on Logistic Regression for credit scoring (you can catch it here), this Sunday we will demonstrate the technique of exponential smoothing in time series forecasting.

More specifically, during the webinar we will take you through the basic decomposition of a time series into its components:

  • Trend
  • Seasonality
  • Cyclic
  • Error

We will concentrate on extracting and forecasting the trend and seasonality of a series. Trend component can be extracted using Exponential smoothing (Single, double and triple depending on the slope and pattern) and building a seasonal index to forecast the seasonal variation.

The webinar will demonstrate how this can be done using simple formulas on an excel sheet itself and then introduce the time series function in IBM SPSS. A basic introduction to Box-Jenkins (ARIMA) modelling will also be covered.

A trial version of IBM SPSS version 20 can be downloaded from here. (You will need to fill in some details). Those interested in attending the seminar can drop an email at or fill in the form at  and we will mail you the webex invite. The webinar is scheduled on Sunday, 15th Jan, 0930 IST – 0400 GMT.

The webinar is free to attend.

To receive regular updates, please join our linkedin group Learn Analytics.