Sorting and De-duping datasets in SAS
- Proc Sort
- The By statement in Proc Sort names the variable(s) by which the observations of a dataset are sorted
- By default, observations are sorted in ascending order of the specified variable(s)
- The Descending keyword is placed in the ‘By’ statement immediately before a variable to sort by that variable in descending order
- Only the variable that immediately follows ‘descending’ is sorted in descending order; any other By variables are still sorted in ascending order
- Noduprecs option: removes duplicate observations, i.e. observations whose values match on every variable, keeping one copy. Only adjacent observations are compared after the sort, so the By variables should bring identical observations together
- Nodupkey option: keeps only the first observation for each unique combination of values of the variable(s) specified in the By statement
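The Proc Sort options above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the dataset work.sales and its variables (region, rep, amount) are hypothetical and exist only for this example.

```sas
/* Hypothetical sample data; dataset and variable names are illustrative */
data work.sales;
    input region $ rep $ amount;
    datalines;
East Ann 100
West Bob 250
East Ann 100
East Cid 300
West Bob 175
;
run;

/* Ascending by region, then descending by amount within each region */
proc sort data=work.sales out=work.sales_sorted;
    by region descending amount;
run;

/* NODUPKEY: keep the first observation for each region/rep combination */
proc sort data=work.sales out=work.sales_nodupkey nodupkey;
    by region rep;
run;

/* NODUPRECS: drop observations that are identical on every variable.
   Sorting BY _ALL_ places identical observations next to each other. */
proc sort data=work.sales out=work.sales_noduprecs noduprecs;
    by _all_;
run;
```

Writing the result to a separate dataset with OUT= leaves the original dataset untouched, which is usually the safer choice when de-duplicating.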
- Proc Print
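- By and Sum statements in Proc Print: the Sum statement prints totals for the listed numeric variables, and when it is combined with a By statement it also prints a subtotal for each By group. The dataset must already be sorted by the By variable(s)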
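A short sketch of By and Sum in Proc Print, assuming the sample dataset sashelp.class that ships with SAS (variables name, sex, age, height, weight). The output dataset name work.class_sorted is made up for this example.

```sas
/* Proc Print with a By statement expects sorted input */
proc sort data=sashelp.class out=work.class_sorted;
    by sex;
run;

/* Prints one block per value of sex, with subtotals of height and
   weight for each block and a grand total at the end */
proc print data=work.class_sorted noobs;
    by sex;
    var name age height weight;
    sum height weight;
run;
```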
- Proc Append
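- To append SAS datasets is to stack the observations of one dataset below those of another. It is a vertical concatenation (similar to a UNION ALL in SQL), not a horizontal join.
- The datasets being appended should have the same variables (same variable name and data type)
- Force option: forces the append when the Data= dataset has variables that are not in the Base= dataset, or variables whose type or length differs. Variables missing from the Base= dataset are dropped, and longer values may be truncated
- Datasets can also be concatenated with a Data step and a Set statement that lists both datasets. This creates a new dataset containing the variables of all the input datasets, so it behaves more like a union of the datasets.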
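Both ways of stacking datasets can be sketched as below. The datasets work.base and work.extra and their variables are hypothetical; work.extra deliberately carries an additional variable (score) to show why Force is needed.

```sas
/* Hypothetical datasets; work.extra has one variable more than work.base */
data work.base;
    input id name $;
    datalines;
1 Ann
2 Bob
;
run;

data work.extra;
    input id name $ score;
    datalines;
3 Cid 88
4 Dee 92
;
run;

/* Data step concatenation: builds a new dataset with the variables of
   both inputs; score is missing for the rows that come from work.base */
data work.stacked;
    set work.base work.extra;
run;

/* Proc Append adds the observations of DATA= to the end of BASE=.
   FORCE is required here because score does not exist in work.base;
   score is dropped and SAS writes a warning to the log. */
proc append base=work.base data=work.extra force;
run;
```

The Data step is run first in this sketch because Proc Append modifies work.base in place. Proc Append does not re-read the observations already in the Base= dataset, which tends to matter when the base dataset is large.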