
Locating Data Sets & Raw Data

Resources for locating data sets, raw data, and statistics.

Statistical Software Comparison

SPSS

Typically used in the social sciences, health sciences, marketing, and academia. The interface is user friendly and resembles Excel.

Imports: .xls, .xlsx, .csv, .txt, .dat, .sas7bdat, .dta files.

Exports: .xls, .xlsx, .csv, .dat, .sas7bdat, .dta files. 

JMP

Typically used in engineering, biology, and healthcare. It is well suited to interactive graphics.

Imports: .xls, .xlsx, .csv, .txt, .dat, .sas7bdat, .dta, .sav files.

Exports: .xls, .xlsx, .csv, .dat, .sas7bdat files. 

Stata

Typically used in economics, sociology, political science, public health, medicine, and data science. It works particularly well with panel, survey, and time-series data.

Imports: .xls, .xlsx, .txt, .csv, .dat, .XPT, .XML, and various ODBC data sources.

Exports: .xls, .xlsx, .txt, .csv, .dat, .XPT, .XML, and various ODBC data sources.

SAS

Typically used in financial services, government, manufacturing, health, and life sciences. It is well suited to handling extremely large datasets.

Imports: .xls, .xlsx, .txt, .dat, .csv, .sav, .dta, .jmp, .xml files. 

Exports: .xls, .xlsx, .txt, .dat, .csv, .sav, .dta, .jmp, .xml files. 

R

Typically used in private business, data science, finance, economics, bioinformatics, sociology, and marketing. R is free and open source, with a large online community.

Imports: .xls, .xlsx, .txt, .dat, .csv, .sav, .dta, .sas7bdat, .xml, .json files.

Exports: .xlsx, .txt, .csv, .sav, .dta, .json files.

MATLAB

Typically used in education and engineering.

Imports: .xls, .xlsx, .txt, .dat, .csv, .xml, .json files. 

Exports: .xls, .xlsx, .txt, .dat, .csv, .xml, .json files. 
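
The import and export lists above can be used to work out which file formats move data between two packages. The following Python sketch is one way to do that lookup, not part of the original guide. It assumes the six descriptions correspond, in order, to SPSS, JMP, Stata, SAS, R, and MATLAB (the order of the features table on this page). The names `FORMATS`, `exchange_formats`, and `common_formats` are hypothetical and chosen for illustration. Extensions are lowercased, and the ODBC data sources listed for the third package are left out because they are not file formats.

```python
# Import/export formats transcribed from the lists above (lowercased).
# ASSUMPTION: package names are matched to the descriptions by their order
# in the features table (SPSS, JMP, Stata, SAS, R, MATLAB).
SHARED_STATA = {".xls", ".xlsx", ".txt", ".csv", ".dat", ".xpt", ".xml"}
SHARED_SAS = {".xls", ".xlsx", ".txt", ".dat", ".csv", ".sav", ".dta", ".jmp", ".xml"}
SHARED_MATLAB = {".xls", ".xlsx", ".txt", ".dat", ".csv", ".xml", ".json"}

FORMATS = {
    "SPSS": {
        "imports": {".xls", ".xlsx", ".csv", ".txt", ".dat", ".sas7bdat", ".dta"},
        "exports": {".xls", ".xlsx", ".csv", ".dat", ".sas7bdat", ".dta"},
    },
    "JMP": {
        "imports": {".xls", ".xlsx", ".csv", ".txt", ".dat", ".sas7bdat", ".dta", ".sav"},
        "exports": {".xls", ".xlsx", ".csv", ".dat", ".sas7bdat"},
    },
    # ODBC data sources are omitted here: they are connections, not file types.
    "Stata": {"imports": set(SHARED_STATA), "exports": set(SHARED_STATA)},
    "SAS": {"imports": set(SHARED_SAS), "exports": set(SHARED_SAS)},
    "R": {
        "imports": {".xls", ".xlsx", ".txt", ".dat", ".csv", ".sav", ".dta",
                    ".sas7bdat", ".xml", ".json"},
        "exports": {".xlsx", ".txt", ".csv", ".sav", ".dta", ".json"},
    },
    "MATLAB": {"imports": set(SHARED_MATLAB), "exports": set(SHARED_MATLAB)},
}


def exchange_formats(source, target):
    """Return the formats that `source` can export and `target` can import."""
    return sorted(FORMATS[source]["exports"] & FORMATS[target]["imports"])


def common_formats():
    """Return the formats every listed package can both import and export."""
    round_trip = [p["imports"] & p["exports"] for p in FORMATS.values()]
    return sorted(set.intersection(*round_trip))


if __name__ == "__main__":
    print("R -> SPSS:", exchange_formats("R", "SPSS"))
    print("Readable and writable everywhere:", common_formats())
```

Going by the lists as given, only .csv and .xlsx can be both read and written by all six packages, which makes them the safest choices when the receiving software is not known in advance.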

Software Features and Capabilities

| Software | Interface | Learning Curve | Data Manipulation | Statistical Analysis | Graphics | Specialties |
|---|---|---|---|---|---|---|
| SPSS | Menus and Syntax | Gradual | Moderate | Moderate Scope, Low Versatility | Good | Custom Tables, ANOVA, Multivariate Analysis |
| JMP | Menus and Syntax | Gradual | Strong | Moderate Scope, Medium Versatility | Great | Design of Experiments, Quality Control, Model Fit |
| Stata | Menus and Syntax | Moderate | Strong | Broad Scope, Medium Versatility | Good | Panel Data, Mixed Models, Survey Data Analysis |
| SAS | Syntax | Steep | Very Strong | Very Broad Scope, High Versatility | Very Good | Large Datasets, Reporting, Password Encryption, Components for Specific Fields |
| R | Syntax | Steep | Very Strong | Very Broad Scope, High Versatility | Excellent | Graphic Packages, Machine Learning, Predictive Modeling |
| MATLAB | Syntax | Steep | Very Strong | Limited Scope, High Versatility | Excellent | Simulations, Multidimensional Data, Image and Signal Processing |

Table from NYU Libraries, 2020

 
