This Java library provides a reader for SAS datasets.
It also provides a command-line tool and a desktop application.
Key points:
- lightweight library designed as a facade
- Java 8 minimum requirement
- has a module-info that makes it compatible with JPMS
Features:
- reads meta and data from SAS datasets (*.sas7bdat)
- browses data with 3 types of cursor: forward-only, scrollable and splittable
- is compatible with Java Stream API
- provides a simple facade that allows to plug in any implementation at deployment time
- implies the addition of a single mandatory dependency
Sasquatch is instantiated by a factory:
Sasquatch sasquatch = Sasquatch.ofServiceLoader();
It provides 3 ways of browsing the data:
- forward-only: row by row from the first to the last
- scrollable: any row by its position
- splittable: rows as a (parallel) stream
Path file = ...;
// forward-only cursor
try (SasForwardCursor cursor = sasquatch.readForward(file)) {
while (cursor.next()) {
}
}
// scrollable cursor
try (SasScrollableCursor cursor = sasquatch.readScrollable(file)) {
for (int i = 0; i < cursor.getRowCount(); i++) {
cursor.moveTo(i);
}
}
// splittable cursor
try (SasSplittableCursor cursor = sasquatch.readSplittable(file)) {
Stream<SasRow> stream = StreamSupport.stream(cursor.getSpliterator(), false);
}
Some shortcuts are also available:
// sample factory that extracts the first field as a string
SasRow.Factory<String> factory = cursor -> row -> row.getString(0);
// stream shortcut
try (Stream<String> stream = sasquatch.rows(file, factory)) {
}
// list shortcut
List<String> rows = sasquatch.getAllRows(file, factory);
Metadata can be retrieved directly or through a cursor:
// direct
SasMetaData meta = sasquatch.readMetaData(file);
// through a cursor
try (SasCursor cursor = sasquatch.read...(file)) {
cursor.getMetaData();
}
At least one implementation must be available at runtime (on classpath or modulepath) in order to read datasets. No implementation triggers an IOException
on read operations.
Sasquatch supports the following implementations:
artifactId | description | support |
---|---|---|
sasquatch-ri |
native reference implementation | advanced |
sasquatch-parso |
wrapper around parso library | advanced |
sasquatch-sassy |
wrapper around sassy library | basic |
sasquatch-biostatmatt |
java version of biostatmatt r code | basic |
Feature matrix:
ri |
parso |
sassy |
biostatmatt |
|
---|---|---|---|---|
BIG_ENDIAN_32 |
x | x | - | - |
LITTLE_ENDIAN_32 |
x | x | x | x |
BIG_ENDIAN_64 |
x | x | - | - |
LITTLE_ENDIAN_64 |
x | x | - | x |
ATTRIBUTES |
x | x | - | x |
LABEL_META |
x | x | - | - |
FIELD_ENCODING |
x | x | - | - |
COLUMN_ENCODING |
x | x | - | - |
CHAR_COMP |
x | x | - | - |
BIN_COMP |
x | x | - | - |
DATE_TYPE |
x | x | - | - |
DATE_TIME_TYPE |
x | x | - | - |
TIME_TYPE |
x | x | - | - |
CUSTOM_NUMERIC |
x | x | x | - |
COLUMN_FORMAT |
x | x | - | - |
<dependencies>
<dependency>
<groupId>com.github.nbbrd.sasquatch</groupId>
<artifactId>sasquatch-api</artifactId>
<version>LATEST_VERSION</version>
</dependency>
<dependency>
<groupId>com.github.nbbrd.sasquatch</groupId>
<artifactId>sasquatch-ri</artifactId>
<version>LATEST_VERSION</version>
<scope>runtime</scope>
</dependency>
</dependencies>
The command-line tool (sasquatch
in sasquatch-cli
project) allows to export a SAS dataset to a CSV or SQL file.
$ sasquath csv somedata.sas7bdat -o somedata.csv
$ sasquath sql somedata.sas7bdat -o somedata.sql
The desktop application (sasquatchw
in sasquatch-desktop
project) is a basic dataset viewer.