NJSUG 2010 Third Quarter Meeting

The meeting was held in the morning (9:00am - noon) on Friday, Oct 1st at:


Rutgers Labor Education Center
50 Labor Center Way
Rutgers, The State University of New Jersey
New Brunswick, NJ 08901

Agenda

09:00-09:20 Registration and Continental Breakfast
09:20-10:20 Condensed and Sparse Indexes for Sorted SAS Datasets (Mark Keintz)
10:20-10:40 Break
10:40-11:40 What's New with JMP 9 (Valerie Hyde)
11:40-noon Closing Remarks and Door Prize Drawing

Condensed and Sparse Indexes for Sorted SAS Datasets

Downloads

PowerPoint presentation file

Abstract

The standard SAS index normally can speed up data retrieval of subsets, but can be suboptimal for datasets sorted on the index variable. For large sorted groups the normal index wastes disk space by creating a "pointer" to every record in the group. This paper demonstrates a condensed index with pointers only to the first and last record in each group. Dramatic reductions in elapsed time, input/output, and disk space are realized.
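
As a rough illustration of the condensed-index idea (not the author's SAS implementation, which is in the presentation), the following Python sketch stores only the first and last row number for each group of a sorted key column. The function names and the sample data are hypothetical and chosen only for this example.

```python
def build_condensed_index(sorted_keys):
    """Map each key to (first_row, last_row) for a column already sorted by key.

    Illustrative only: one index entry per group instead of one per record.
    """
    index = {}
    for row, key in enumerate(sorted_keys):
        if key in index:
            first, _ = index[key]
            index[key] = (first, row)   # extend the group to the current row
        else:
            index[key] = (row, row)     # first record seen for this key
    return index


def fetch_group(records, index, key):
    """Return all records for one key using a single contiguous slice."""
    if key not in index:
        return []
    first, last = index[key]
    return records[first:last + 1]


# Hypothetical sorted data: six records in three groups.
keys = ["A", "A", "A", "B", "C", "C"]
records = list(zip(keys, [10, 11, 12, 20, 30, 31]))
condensed = build_condensed_index(keys)
group_a = fetch_group(records, condensed, "A")
```

Here the index holds three entries for six records. For large sorted groups the saving grows with group size, which is the effect the abstract describes.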

The paper will also show results for a "sparse" index for sorted data in which groups are relatively small but the index range is large (e.g. every second over a span of days). When retrieving a subset of cases between a pair of values of the sort variable, an index that points to only a few selected records can also save disk space and elapsed time.
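
The sparse-index idea can be sketched in the same spirit. The Python below is a minimal illustration under assumed details, not the method from the paper: it samples every `step`-th row of a sorted key column, uses a binary search over the samples to find a safe starting row, and then scans forward until the key exceeds the upper bound. The names `build_sparse_index`, `range_query`, and `step` are hypothetical.

```python
from bisect import bisect_left


def build_sparse_index(sorted_keys, step):
    """Keep the key and row number of every step-th record only."""
    rows = list(range(0, len(sorted_keys), step))
    return [sorted_keys[r] for r in rows], rows


def range_query(sorted_keys, sparse_keys, sparse_rows, lo, hi):
    """Return row numbers whose key lies in [lo, hi]."""
    # Start at the last sampled row whose key is strictly below lo,
    # so that no matching record before a sample is skipped.
    i = bisect_left(sparse_keys, lo) - 1
    start = sparse_rows[i] if i >= 0 else 0
    matches = []
    for row in range(start, len(sorted_keys)):
        if sorted_keys[row] > hi:
            break                      # data is sorted, nothing further matches
        if sorted_keys[row] >= lo:
            matches.append(row)
    return matches


# Hypothetical sorted key column (for example, timestamps in seconds).
keys = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
sparse_keys, sparse_rows = build_sparse_index(keys, step=3)
hits = range_query(keys, sparse_keys, sparse_rows, lo=6, hi=20)
```

The index here holds four entries for ten records. A larger `step` shrinks the index further at the cost of a longer forward scan from the starting row.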

Author

Mark Keintz is a Senior Data Analyst at the Wharton Research Data Services group. His current interests are in developing and supporting financial research applications, and managing large datasets. Mark has been using SAS since it was documented in one book.

What's New with JMP® 9

Abstract

JMP® 9, to be released on October 12, has many new and exciting features. One enhancement is the new mapping capability, such as being able to put a map behind your data in the graph builder and the bubble plot. This enables data overlays on any kind of geospatial data. Data mining techniques have also been stepped up. It will be even easier to "train, validate, and test" your data, resulting in more robust models that produce better predictions. The partition platform, which is JMP's decision tree, now also includes bootstrap forests and boosted trees.

JMP is fully integrated with SAS. It will be shown how to run SAS code from within JMP, as well as SAS and JMP code together. This integration enables the scalability of SAS to be harnessed to a highly graphical client via JMP. This is truly the best of both worlds.

Author

Valerie Hyde is a Systems Engineer for JMP, a division of SAS. She is passionate about data visualization and using analytics to help companies make better business decisions. Before joining SAS, Valerie worked for Accenture in the Marketing Science group and at AT&T Research on a project to model the effectiveness of viral marketing on a direct mail campaign. She has also performed econometric modeling for the US Justice Department's Antitrust Division, the US Census Bureau, and MiCRA Inc., a boutique consulting company.

Valerie earned a PhD in applied mathematics and statistics from the University of Maryland. Her research, which has been published in The American Statistician and Computational Statistics and Data Analysis, focused on mining online auction data for patterns in the price process. She also has a bachelor's degree in mathematics and economics from Binghamton University.