CDISC Builder™ - DIFFTEST

DIFFTEST

Overview
Variable and data attributes can be standardized so that they are consistent across datasets. A common deviation from standards occurs when the same variable name exists in multiple datasets but its length or label differs between them. The DIFFTEST tool generates a report that lets you easily spot discrepancies such as this one. It can be applied to data before transformation to identify the areas that need to be standardized, and again after transformation to confirm that the standards were applied. DIFFTEST can be accessed as a graphical user interface from the main menu of CDISC Builder.
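The kind of discrepancy described above can be illustrated with a minimal Python sketch. This is not how DIFFTEST itself is implemented. The dataset names, variable names, lengths, and labels below are hypothetical sample metadata, and the function name `find_attribute_discrepancies` is an assumption made for illustration only.

```python
from collections import defaultdict

# Hypothetical metadata: dataset name -> {variable name: (length, label)}.
datasets = {
    "DM": {"USUBJID": (20, "Unique Subject Identifier"), "AGE": (8, "Age")},
    "AE": {"USUBJID": (30, "Unique Subject Identifier"), "AETERM": (200, "Reported Term")},
    "VS": {"USUBJID": (20, "Unique Subject ID"), "VSTESTCD": (8, "Test Short Name")},
}


def find_attribute_discrepancies(datasets):
    """Return variables that occur in several datasets with differing length or label."""
    seen = defaultdict(dict)  # variable name -> {dataset name: (length, label)}
    for ds_name, variables in datasets.items():
        for var_name, attrs in variables.items():
            seen[var_name][ds_name] = attrs
    # Keep only variables shared by two or more datasets whose attributes disagree.
    return {
        var: by_ds
        for var, by_ds in seen.items()
        if len(by_ds) > 1 and len(set(by_ds.values())) > 1
    }


discrepancies = find_attribute_discrepancies(datasets)
for var, by_ds in discrepancies.items():
    for ds_name, (length, label) in by_ds.items():
        print(f"{var} in {ds_name}: length={length}, label={label!r}")
```

In this sample, only USUBJID is reported, because it is the only variable shared across datasets and its length and label are not identical everywhere.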

DIFFTEST Options
The DIFFTEST screen allows you to select the data and associated options for the DIFFTEST. The data flow for this tool is shown here.

Source Data -> CDISC Builder DIFFTEST -> HTML Reports for DIFFTEST

The options available on the DIFFTEST screen are:

Source Data - The location of the data whose attribute differences will be reported.
Preview - Displays a preview of the first 100 observations of the data.
Select All - Selects all datasets in the selected source library path.
CDISC - Selects all datasets whose names match the prescribed CDISC domain datasets.
Format Catalog Path - The path to the format catalog associated with the selected datasets.
Report Title - The title of the generated HTML report containing the attribute comparisons.
Save Code - The code saved to this location contains a macro call that captures all the selected options. It can therefore be executed as a SAS program, independently of this interface.
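The name matching behind the CDISC option can be pictured with a short Python sketch. It is illustrative only. The domain list is a small subset of SDTM domain codes chosen for the example, not the list CDISC Builder actually uses. The case-insensitive comparison and the function name `select_cdisc_datasets` are likewise assumptions.

```python
# Illustrative subset of SDTM domain codes (assumption: the tool's real list is not shown here).
SDTM_DOMAINS = {"DM", "AE", "CM", "EX", "LB", "MH", "VS"}


def select_cdisc_datasets(dataset_names):
    """Keep the dataset names that match a known domain code, ignoring case."""
    return [name for name in dataset_names if name.upper() in SDTM_DOMAINS]


# Hypothetical contents of a source library.
library = ["dm", "ae", "labs_raw", "VS", "visits"]
selected = select_cdisc_datasets(library)
print(selected)
```

Datasets such as labs_raw and visits are left out because their names do not correspond to a domain code in the list.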

DIFFTEST performs seven distinct tests that compare attributes among the selected datasets and/or format catalogs. The details of these seven comparisons are documented in the %difftest reference documentation.

Although a single dataset can be selected, it is recommended that more than one dataset be used for this report. The more datasets selected, the more deviations from standards can be captured.