1
Paper SA-11-2016
Accessing Teradata through SAS, common pitfalls, solutions and tips
Kiran Venna, Experis Business Analytics Practice
Abstract
There are some common pitfalls while accessing Teradata from SAS® and Vice Versa. SAS Options and
SAS Macro's efficiently handle these pitfalls. Owing to its unique architecture, Teradata primary index has
to be designed properly for both space and efficiency of accessing data in Teradata. Data Set option
dbcreate_table_opts handles creation of primary index, when Teradata tables are created through SAS.
Inefficient data types are created, when Teradata tables are created using SAS. Data Set option dbtype
helps in creating efficient data types. SAS Macro can help to automate dbcreate_table_opts and dbtype
Data Set options, when SAS is used to create Teradata tables. Case Specificity of Teradata and also
right truncation of string data can cause major concern, when Explicit SQL Pass-Through is used. By
issuing appropriate mode in connect statement these concerns can be resolved. While creating a SAS
Data Set in Explicit SQL Pass-Through with row_number function can make query fail in Teradata 15 with
the same query running fine in Teradata 14. Bulk compression on large Data Sets in Teradata can be
done in Explicit SQL Pass-Through.
Keywords
SAS, SAS/ACCESS®, Teradata, dbcreate_table_opts, dbtype, mode=option, ANSI vs Teradata mode,
case specificity, right truncation of string data, block compression
Introduction
Teradata is very efficient in handling large amounts of data, owing to its parallel architecture. SAS is
excellent in Extract Transform and Load (ETL) capabilities and in its analytical power. SAS/ACCESS
provides a way to interact with Teradata either through SQL Pass-Through facility or by using libname
statement. The interaction of SAS and Teradata to handle large amounts of data is often necessary for
doing ETL, analytics and reporting. This interaction usually involves creating tables in Teradata through
SAS and vice versa. There are notable differences in architecture and data types of SAS and Teradata,
which if not considered carefully, can cause common pitfalls.
This paper covers following topics.
1. Issue of Primary index and dbcreate_table_opts= Data Set option.
2. Data type issues and dbtype= Data Set option.
3. Best practice for creating appropriate primary index and data types.
4. Automating dbcreate_table_opts and dbtype options by using SAS Macros.
5. Issue with case specificity in Explicit SQL Pass-Through and mode= option.
6. Issue with right truncation of string data in Explicit SQL Pass-Through and mode =option.
7. Issues with creating SAS Data Set with row_number in Teradata 14 vs Teradata 15.
8. Bulk compression Teradata tables in Explicit SQL Pass-Through.
Issue of Primary index and dbcreate_table_opts = Data Set option
In Teradata, table rows are distributed on Access Module Processor (AMP). Row distribution is
dependent on uniqueness of defined primary index column. More unique the primary index column is
better the data distribution and vice versa. Improper distribution of table rows in AMP’s will results in
skewed data. Data skew causes space wastage and also weakens the parallel processing capabilities of
Teradata. To create primary index in Teradata a column is explicitly defined as shown.