Managing the Data Object

 

What is the Data Object

The data object is a collection of information, managed by the Trialex System, that organizes SAS datasets within a clinical data warehouse.  The data object does not store the contents of a SAS dataset; instead, it captures metadata about the SAS datasets within a clinical study.  This information is useful for study management and documentation.  The information gathered for the data object includes:

  • Dataset name
  • Description
  • Date and time of last modification
  • Created date and time
  • Order of data
  • Data source path locations
  • Status of data

This information is stored as a SAS dataset named Data.  This dataset is stored in the study directory and is managed by the Trialex System.  It may be viewed, but it should not be altered outside of the Trialex System.  The purposes of centrally managing the metadata of SAS datasets include:

  • Proper refresh or execution of the data warehouse
  • Documentation
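
As a rough illustration of what one record of this metadata might hold, the sketch below models the attributes listed earlier as a Python dataclass.  The class name, field names, and types are assumptions made for illustration; the actual layout of the Data dataset is not described here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DataObjectRecord:
    """Hypothetical model of the metadata kept for one SAS dataset."""
    name: str            # dataset name, e.g. "a_ae"
    description: str     # free-text description
    created: datetime    # created date and time
    modified: datetime   # date and time of last modification
    order: int           # position of the dataset in the study's ordering
    source_path: str     # data source path location
    status: str          # status of the data (allowed values not specified)

    def changed_since_creation(self) -> bool:
        """True if the dataset was modified after it was created."""
        return self.modified > self.created
```

Only metadata appears in the record; the observations themselves stay in the SAS dataset that the record describes.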

The data object can be found in the Trialex System study conduct area.  

    [Icon: Data]

Browsing the Data Object

The data object enables quick review of the attributes of any dataset within a specific clinical study.  This metadata supports the management of the clinical study.  The main object browser screen is shown here:

[Screenshot: main object browser screen]

The view shown is sorted by the order number.  This means that selecting the Next button moves to the view of the next dataset in order-number sequence.  The list of datasets can also be sorted alphabetically by clicking the alphabetize hyperlink.
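
The two orderings and the Next navigation can be sketched as follows.  This is a minimal sketch, assuming each dataset is represented as a dictionary with hypothetical `name` and `order` keys; it does not reflect how the Trialex System implements the browser.

```python
def sort_by_order(datasets):
    """Return the datasets sorted by their order attribute."""
    return sorted(datasets, key=lambda d: d["order"])


def alphabetize(datasets):
    """Return the datasets sorted alphabetically by name, ignoring case."""
    return sorted(datasets, key=lambda d: d["name"].lower())


def next_dataset(sorted_datasets, current_name):
    """Return the dataset that follows current_name, or None at the end."""
    names = [d["name"] for d in sorted_datasets]
    position = names.index(current_name)
    if position + 1 < len(sorted_datasets):
        return sorted_datasets[position + 1]
    return None
```

Because `next_dataset` works on whichever sorted list it is given, the same function serves both the order-number view and the alphabetical view.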

In addition to the metadata, there is an option to view the actual contents of the data.  The View button displays the contents of the SAS dataset, twenty observations per screen.  The Data Viewer tool provides a quick view of the data and associated variables.

Viewing Data with Data Viewer

The eData tool is a SAS viewer designed for viewing SAS datasets through a web browser.  Its features, which are similar to those of the SAS System Data Viewer, include:

  • View formatted and unformatted values from SAS datasets
  • View SAS variables and their attributes
  • Search and subset dataset
  • View data by sorted variables
  • View Frequency Summary
  • Export view to Excel

The main Data Viewer view is shown here:

[Screenshot: main Data Viewer view]

The default view displays unformatted values.  To switch to formatted values, select the View By pull-down menu.  The last item on the same menu displays all the SAS variables and their attributes.  An example view of the adverse event dataset is shown here.

[Screenshot: example view of the adverse event dataset]

There are two methods for navigating to different observations within the dataset.  The first method is to click the next-page hyperlink that appears at the bottom left of the view.  This displays the next 20 observations after the current view.  The second method is to select from the pull-down menu named obs, which allows quick navigation to a specific observation further into the dataset.
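
Both navigation methods amount to choosing the first observation of a 20-row window.  The sketch below illustrates that idea; the function names are hypothetical, and the assumption that the obs menu offers one entry per 20-observation page is ours, not something stated in the text.

```python
PAGE_SIZE = 20  # observations shown per screen, as described in the text


def page(rows, first_obs):
    """Return up to PAGE_SIZE rows starting at the 1-based observation first_obs."""
    start = first_obs - 1
    return rows[start:start + PAGE_SIZE]


def next_page_start(first_obs, n_obs):
    """First observation of the next page, or None if the current page is the last."""
    candidate = first_obs + PAGE_SIZE
    return candidate if candidate <= n_obs else None


def obs_menu(n_obs):
    """Assumed content of the obs pull-down: the first observation of every page."""
    return list(range(1, n_obs + 1, PAGE_SIZE))
```

Only the rows of the requested window are returned, which is what keeps each screen small enough to deliver over a web browser.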

Searching the dataset is a useful way of getting to exact values within the dataset.  The search button will display the following dialog box.

[Screenshot: search dialog box]

In this example, the search finds any occurrence of the text "head" in the variable ae.  The search can optionally be made case sensitive.
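
A "contains" search of this kind can be sketched as below.  The function name, the dictionary-per-row representation, and the choice of a case-insensitive default are assumptions for illustration only.

```python
def search(rows, variable, text, case_sensitive=False):
    """Return the rows whose value for `variable` contains `text`."""
    needle = text if case_sensitive else text.lower()
    hits = []
    for row in rows:
        value = str(row.get(variable, ""))
        if not case_sensitive:
            value = value.lower()
        if needle in value:
            hits.append(row)
    return hits
```

For example, searching the variable `ae` for "head" would match both "HEADACHE" and "forehead rash" when case is ignored, but only "forehead rash" when the search is case sensitive.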

Data Viewer also supports sorted views.  The view can be sorted by up to three variables.  This is accomplished through the Sort button as shown here.

[Screenshot: sort options]

The sort and the search criteria can be combined in one view.
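
Combining the two criteria can be thought of as filtering first and then sorting the remaining rows.  The following is a minimal sketch under that assumption; the function name and the use of a predicate function to stand in for the search criteria are hypothetical.

```python
MAX_SORT_VARIABLES = 3  # limit stated in the text


def sorted_view(rows, sort_vars, predicate=None):
    """Filter rows with an optional predicate, then sort by up to three variables."""
    if len(sort_vars) > MAX_SORT_VARIABLES:
        raise ValueError("the view can be sorted by at most three variables")
    subset = [row for row in rows if predicate is None or predicate(row)]
    # Sort on a tuple so that earlier variables take precedence over later ones.
    return sorted(subset, key=lambda row: tuple(row[v] for v in sort_vars))
```

A call such as `sorted_view(rows, ["PTID", "AENUM"], lambda r: "head" in r["AE"].lower())` would show only the matching rows, ordered by subject and then by AE number.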

Data Viewer shows only twenty observations at a time so that each view can be delivered efficiently through a web browser.

Frequency Summary

Besides viewing the values and the variable attributes of the selected data, it is also possible to get a frequency summary or a mean summary of a selected variable.  This is accomplished through the Freq/Mean button.  The selection lists all the variables from the dataset, with categorical variables highlighted in green and continuous variables in blue.  Categorical variables are either character variables or numeric variables with user-defined formats.  Continuous variables are numeric variables without a user-defined format.

Number of Observations:   3458
Summarize By:

  #   Variable   Type       Length  Format  Informat  Label
  1   PTID       Character  4       $4.               Subject ID
  2   INV        Numeric    8       2.                Investigator Number
  3   PROTOCOL   Character  8       $8.               Protocol Number
  4   _DOCID     Numeric    8       5.                Document ID Number
  5   _RECID     Numeric    8       7.                Record ID Number
  6   AE         Character  100     $100.             Adverse Experience
  7   AENUM      Numeric    8       2.                AE Number
  8   BODYSYS    Character  4       $4.               Body System
  9   PREFTERM   Character  21      $21.              Preferred Term
  10  PT         Numeric    8       2.                Subject Number
  11  STATUS     Character  5       $5.               COSTART Dictionary and Glossary Status

KEY:
Green - Categorical Variables
Blue - Continuous Variables
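
The categorical/continuous rule can be sketched as a small classifier.  How the Trialex System recognizes a user-defined format is not described, so this sketch makes a loud assumption: a numeric format counts as user-defined when its name (the format with the width and decimal suffix removed) is not in a short, deliberately incomplete list of SAS-supplied format names.  The list and the function names are illustrative only.

```python
import re

# Assumed and incomplete set of SAS-supplied numeric format names.
# The empty name covers plain width formats such as "2." or "8.2".
BUILTIN_NUMERIC_FORMATS = {"", "BEST", "COMMA", "DATE", "DATETIME", "TIME", "Z"}


def format_name(fmt):
    """Strip the width/decimal suffix from a format: 'DATE9.' -> 'DATE', '2.' -> ''."""
    return re.sub(r"\d*\.\d*$", "", fmt.strip()).upper()


def classify(var_type, fmt=""):
    """Return 'categorical' or 'continuous' following the rule in the text."""
    if var_type.lower() == "character":
        return "categorical"
    if format_name(fmt) not in BUILTIN_NUMERIC_FORMATS:
        return "categorical"  # numeric variable with a (presumed) user-defined format
    return "continuous"
```

Under this assumption a numeric variable such as AENUM with format `2.` is continuous, while a numeric variable carrying a hypothetical user-defined format such as `SEXF.` would be categorical.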

Clicking a categorical variable drills down to a frequency summary of that variable.  The following example summarizes the body system variable.

Data: a_ae
Variable: BODYSYS

  #   Value    Frequency  Percent
  1   (blank)  193        5.6
  2   BODY     806        23.3
  3   CV       168        4.9
  4   DIG      451        13
  5   ENDO     12         0.3
  6   HAL      101        2.9
  7   MAN      214        6.2
  8   MS       86         2.5
  9   NER      481        13.9
  10  RES      225        6.5
  11  SKIN     439        12.7
  12  SS       122        3.5
  13  UG       160        4.6
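
A frequency summary of this shape is a count per distinct value together with each count's share of the total.  The sketch below shows one way to compute it in Python; the function name is hypothetical, and rounding the percentage to one decimal place is an assumption based on the values displayed in the example.

```python
from collections import Counter


def frequency_summary(values):
    """Return (value, frequency, percent) tuples, sorted by value."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (value, n, round(100 * n / total, 1))
        for value, n in sorted(counts.items())
    ]
```

Blank values are counted as their own category here, which matches the first row of the example, where 193 observations have no body system value.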

Selecting a continuous (blue) variable produces a mean summary such as the one below:

Data: a_ae
Variable: AENUM

  Mean   Minimum  Maximum  Count  Standard Deviation  Standard Error of Mean
  3.162  1        12       2408   1.771               0.036
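
The statistics in the mean summary can be sketched with the Python standard library.  Two assumptions are made here: missing values are excluded from every statistic (the count of 2408 is smaller than the 3458 observations in the dataset, which suggests this), and the standard deviation is the sample standard deviation.  The function name and the use of `None` for a missing value are illustrative.

```python
import math
import statistics


def mean_summary(values):
    """Summarize a numeric variable, ignoring missing values (None)."""
    data = [v for v in values if v is not None]
    n = len(data)
    if n == 0:
        raise ValueError("no non-missing values to summarize")
    # The sample standard deviation needs at least two values.
    std = statistics.stdev(data) if n > 1 else float("nan")
    return {
        "mean": statistics.mean(data),
        "minimum": min(data),
        "maximum": max(data),
        "count": n,
        "std": std,
        "stderr": std / math.sqrt(n),  # standard error of the mean
    }
```

The standard error is the standard deviation divided by the square root of the count, which is consistent with the example: 1.771 / sqrt(2408) is approximately 0.036.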

The summary can optionally be computed separately for each value of a specified variable.  The "by" variable can be selected from the variable screen as shown here.

[Screenshot: selecting the summarize-by variable]

Only categorical variables appear in the list of summarize-by variables.  Once a summarize-by variable has been selected, you can drill down to the frequency or mean statistics in the same way, by clicking on the variable of interest.  Note that you cannot drill down to the statistics of the variable that is currently selected as the summarize-by variable.

In this example, the body system variable is selected with the patient ID as the by variable.

Data: a_ae
Variable: BODYSYS
By Variable (PTID):

  #   Value  Frequency  Percent
  1   BODY   3          23.1
  2   CV     1          7.7
  3   DIG    3          23.1
  4   HAL    1          7.7
  5   MAN    2          15.4
  6   SKIN   3          23.1

The default value of the by variable is the first value that appears in the dataset.  The pull-down menu of patient values lets you navigate to the body system statistics for other patients.  Summarize-by variables can be applied to continuous variables in the same way.
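
A by-variable summary can be sketched as one frequency table per value of the by variable.  The function names and the dictionary-per-row representation below are assumptions for illustration; the subject ID used in the usage note is hypothetical.

```python
from collections import Counter, defaultdict


def frequency_by(rows, variable, by_variable):
    """Return {by_value: [(value, frequency, percent), ...]} for each by value."""
    if variable == by_variable:
        raise ValueError("cannot summarize a variable by itself")
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[by_variable]][row[variable]] += 1
    result = {}
    for by_value, counts in groups.items():
        total = sum(counts.values())
        result[by_value] = [
            (value, n, round(100 * n / total, 1))
            for value, n in sorted(counts.items())
        ]
    return result


def default_by_value(rows, by_variable):
    """The default by value: the first value that appears in the data."""
    return rows[0][by_variable]
```

Percentages are computed within each by group, so a subject with 13 adverse events of which 3 are in the BODY system shows 23.1 percent for BODY, as in the example.  Choosing another entry from the pull-down corresponds to looking up a different key in the returned dictionary.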

Similar to the data view, frequency and mean summary views can be incorporated into an e-note.  In that case, the summary will be attached to the email associated with the e-note.  This enables more effective discussions pertaining to the frequency counts and mean summary statistics.

Export View to Excel

The current data view can be exported to Excel format through the Excel button.  Search and sort conditions placed upon the data are applied before the Excel export file is created.  This is useful when you have narrowed your view to the data points of interest and want to investigate them further in a spreadsheet.  The Excel spreadsheet is delivered via email.
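
The order of operations (filter, then sort, then export) can be sketched as follows.  To stay within the Python standard library, this sketch writes CSV text, which Excel can open, rather than a native Excel file; that substitution, the function name, and the parameters are all assumptions, and email delivery is not modeled.

```python
import csv
import io


def export_view(rows, columns, predicate=None, sort_vars=()):
    """Apply the search and sort conditions, then return the view as CSV text."""
    view = [row for row in rows if predicate is None or predicate(row)]
    if sort_vars:
        view.sort(key=lambda row: tuple(row[v] for v in sort_vars))
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(view)
    return buffer.getvalue()
```

Because the conditions are applied before anything is written, the exported file contains exactly the rows of the narrowed view, in the order shown on screen.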

Updating Object Attributes

The Trialex System has tools which automate the capturing and updating of metadata pertaining to SAS datasets.  The first of these tools is the import wizard, which is accessible through the Import button.  This wizard imports metadata from datasets available to the current clinical study and stores this information within the Trialex System.  The first dialog box of the import wizard asks for the location of the data.

[Screenshot: import wizard, data location dialog box]

Click on the next button to proceed with selecting the specific datasets.

[Screenshot: import wizard, dataset selection]

Multiple datasets can be selected.  If a dataset already exists in the Trialex System, its attributes are updated accordingly.  Click on the next button to proceed with the import.  The status of each imported dataset is then shown.

In the event that the data are no longer part of the study, the delete button can be used to delete the current data object that is being viewed.  Note that the _template_ is permanent and cannot be deleted.

   [Button image: Refresh]

The refresh button updates the attributes of all datasets that have already been imported.  It is designed for datasets which may have changed since they were last imported.  Unlike the import feature, the refresh applies updates to all existing datasets, so no additional selections are necessary.
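
The difference between import, delete, and refresh can be sketched with a dictionary standing in for the stored metadata.  Everything here is hypothetical: the function names, the "added"/"updated" status strings, the `read_attributes` callback that returns the current attributes of a dataset, and the choice to leave the `_template_` entry untouched during a refresh.

```python
TEMPLATE = "_template_"  # permanent entry named in the text


def import_datasets(catalog, selected, read_attributes):
    """Import the selected datasets; entries that already exist are updated."""
    status = {}
    for name in selected:
        status[name] = "updated" if name in catalog else "added"
        catalog[name] = read_attributes(name)
    return status


def delete_dataset(catalog, name):
    """Remove a dataset's metadata; the template entry cannot be deleted."""
    if name == TEMPLATE:
        raise ValueError("the _template_ entry cannot be deleted")
    del catalog[name]


def refresh(catalog, read_attributes):
    """Re-read attributes for every imported dataset; no selection is needed."""
    for name in list(catalog):
        if name != TEMPLATE:  # assumption: the template is not refreshed
            catalog[name] = read_attributes(name)
```

The import takes an explicit selection and reports a status per dataset, whereas the refresh simply walks every entry that is already present.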

 
     Meta-Xceed Inc.© 2007