snappy-java

snappy-java is a Java port of snappy, a fast C++ compressor/decompressor developed by Google.

Features

Fast compression/decompression around 200~400MB/sec.
Low memory usage. SnappyOutputStream uses only 32KB+ by default.
JNI-based implementation to achieve comparable performance to the native C++ version.
- Although snappy-java uses JNI, it can be used safely with multiple class loaders (e.g. Tomcat, etc.).
Compression/decompression of Java primitive arrays (float[], double[], int[], short[], long[], etc.)
- To improve the compression ratios of these arrays, you can use a fast data-rearrangement implementation (BitShuffle) before compression
Portable across various operating systems. snappy-java contains native libraries built for Windows/Mac/Linux, etc., and loads one of them according to your machine environment (it looks at the system properties os.name and os.arch).
Simple usage. Add the snappy-java-(version).jar file to your classpath. Then call compression/decompression methods in org.xerial.snappy.Snappy.
Framing-format support (since version 1.1.0)
OSGi support
Apache License Version 2.0. Free for both commercial and non-commercial use.
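
The portability item above says the native library is chosen from the os.name and os.arch system properties. A minimal sketch of that kind of lookup is shown here, using only the JDK. The class name PlatformProbe and the normalised names it returns are hypothetical and chosen for illustration; the mapping that snappy-java actually applies may differ.

```java
import java.util.Locale;

public class PlatformProbe {
    // Hypothetical normalisation of os.name; not snappy-java's actual mapping.
    static String normalizeOs(String osName) {
        String n = osName.toLowerCase(Locale.ROOT);
        if (n.contains("windows")) return "Windows";
        if (n.contains("mac") || n.contains("darwin")) return "Mac";
        if (n.contains("linux")) return "Linux";
        return osName.replaceAll("\\W", "");
    }

    // Hypothetical normalisation of os.arch; common aliases are folded together.
    static String normalizeArch(String osArch) {
        String a = osArch.toLowerCase(Locale.ROOT);
        if (a.equals("amd64") || a.equals("x86_64")) return "x86_64";
        if (a.equals("arm64") || a.equals("aarch64")) return "aarch64";
        return a;
    }

    public static void main(String[] args) {
        // Read the two system properties the README mentions.
        String os = normalizeOs(System.getProperty("os.name"));
        String arch = normalizeArch(System.getProperty("os.arch"));
        System.out.println("Detected platform: " + os + "/" + arch);
    }
}
```

Running this on the target machine is a quick way to see which values the JVM reports before deciding whether a bundled native library is likely to match.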

Performance

Snappy's main target is very high-speed compression/decompression with a reasonable compression ratio. The compression ratio of snappy-java is therefore modest, about the same as LZF (ranging 20%-100% according to the dataset).
Here are some benchmark results comparing snappy-java with the other compressors LZO-java/LZF/QuickLZ/Gzip/Bzip2. Thanks to Tatu Saloranta (@cotowncoder) for providing the benchmark suite.
The benchmark result indicates snappy-java is the fastest compressor/decompressor in Java: https://ning.github.io/jvm-compressor-benchmark/results/canterbury-roundtrip-2011-07-28/index.html
The decompression speed is twice as fast as the others: https://ning.github.io/jvm-compressor-benchmark/results/canterbury-uncompress-2011-07-28/index.html

Download

Release Notes

The current stable version is available from here:

Release version: https://repo1.maven.org/maven2/org/xerial/snappy/snappy-java/
Snapshot version (the latest beta version): https://oss.sonatype.org/content/repositories/snapshots/org/xerial/snappy/snappy-java/

Using with Maven

Snappy-java is available from Maven's central repository. Add the following dependency to your pom.xml:

<dependency>
  <groupId>org.xerial.snappy</groupId>
  <artifactId>snappy-java</artifactId>
  <version>(version)</version>
  <type>jar</type>
  <scope>compile</scope>
</dependency>

Using with Gradle

implementation("org.xerial.snappy:snappy-java:(version)")

Using with sbt

libraryDependencies += "org.xerial.snappy" % "snappy-java" % "(version)"

JDK 24+ Native Access Requirements

Starting with JDK 24, Java has introduced restrictions on native method access through JEP 472: Prepare to Restrict the Use of JNI. Since snappy-java uses JNI to load native libraries for high-performance compression, applications running on JDK 24 or later must enable native access.

Required JVM Flag

When running on JDK 24+, add the following JVM flag to your application:

--enable-native-access=ALL-UNNAMED

Examples

Running a JAR:

java --enable-native-access=ALL-UNNAMED -jar your-application.jar

Maven:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <argLine>--enable-native-access=ALL-UNNAMED</argLine>
  </configuration>
</plugin>

Gradle:

tasks.withType(Test) {
    jvmArgs '--enable-native-access=ALL-UNNAMED'
}

sbt:

javaOptions += "--enable-native-access=ALL-UNNAMED"

Why is this needed?

Per JEP 472's policy of "integrity by default," it is the application developer's responsibility (not the library's) to explicitly enable native access. This change improves security by making native operations visible and controlled at the application level.

Without this flag on JDK 24+, you will see warnings like:

WARNING: A restricted method in java.lang.System has been called
WARNING: java.lang.System::load has been called by org.xerial.snappy.SnappyLoader
WARNING: Use --enable-native-access=ALL-UNNAMED to avoid a warning for callers in this module

These warnings will become errors in future JDK releases.

Note: This requirement only applies to JDK 24 and later. Earlier JDK versions (8-23) do not require this flag.
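
For applications that run on a range of JDKs, a startup check can remind operators when the flag is relevant. The sketch below uses only the JDK and reads java.specification.version so that it also compiles on Java 8. The class and method names are hypothetical, and the check only inspects the Java version; it does not detect whether the flag was actually passed.

```java
public class NativeAccessCheck {
    // Converts "1.8" to 8 and "11", "17", "24" to the same number.
    static int featureVersion(String specVersion) {
        String v = specVersion.startsWith("1.") ? specVersion.substring(2) : specVersion;
        return Integer.parseInt(v);
    }

    // Per the note above, the flag matters on JDK 24 and later only.
    static boolean needsNativeAccessFlag(int featureVersion) {
        return featureVersion >= 24;
    }

    public static void main(String[] args) {
        int feature = featureVersion(System.getProperty("java.specification.version"));
        if (needsNativeAccessFlag(feature)) {
            System.out.println("JDK " + feature
                + ": start the JVM with --enable-native-access=ALL-UNNAMED");
        } else {
            System.out.println("JDK " + feature + ": no native-access flag required");
        }
    }
}
```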

Usage

First, import org.xerial.snappy.Snappy in your Java code:

import org.xerial.snappy.Snappy;

Then use Snappy.compress(byte[]) and Snappy.uncompress(byte[]):

String input = "Hello snappy-java! Snappy-java is a JNI-based wrapper of "
     + "Snappy, a fast compresser/decompresser.";
byte[] compressed = Snappy.compress(input.getBytes("UTF-8"));
byte[] uncompressed = Snappy.uncompress(compressed);

String result = new String(uncompressed, "UTF-8");
System.out.println(result);

In addition, high-level methods (Snappy.compress(String), Snappy.compress(float[] ..), etc.) and low-level ones (e.g. Snappy.rawCompress(..), Snappy.rawUncompress(..), etc.), which minimize memory copies, are available.

Stream-based API

Stream-based compressor/decompressor SnappyOutputStream/SnappyInputStream are also available for reading/writing large data sets. SnappyFramedOutputStream/SnappyFramedInputStream can be used for the framing format.

See also Javadoc API

Compatibility Notes

The original Snappy format definition did not define a file format. A "framing" format was later added to define one, but by that point major software was already using an industry standard instead, represented in this library by the SnappyOutputStream and SnappyInputStream classes.

For interoperability with other libraries, check that compatible formats are used. Note that not all libraries support all variants.

SnappyOutputStream and SnappyInputStream use [magic header:16 bytes]([block size:int32][compressed data:byte array])* format. You can read the result of Snappy.compress with SnappyInputStream, but you cannot read the compressed data generated by SnappyOutputStream with Snappy.uncompress.
SnappyHadoopCompatibleOutputStream does not emit a file header but writes out the current block size as a preamble to each block.
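
To make the [magic header:16 bytes]([block size:int32][compressed data:byte array])* layout concrete, here is a minimal sketch that walks such a stream and lists the block sizes, using only the JDK. It assumes the int32 block size is stored big-endian, which is not stated above, and the class name SnappyStreamLayout is hypothetical. The demo input is a synthetic byte array with a placeholder header, not real Snappy output, and the sketch does not decompress anything.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class SnappyStreamLayout {
    static final int HEADER_SIZE = 16;

    // Returns the size of each compressed block that follows the 16-byte header.
    static List<Integer> blockSizes(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        in.readFully(new byte[HEADER_SIZE]);           // skip the header
        List<Integer> sizes = new ArrayList<>();
        while (true) {
            int first = in.read();
            if (first < 0) break;                      // clean end of stream
            // Assumption: block size is a big-endian int32.
            int size = (first << 24)
                     | (in.readUnsignedByte() << 16)
                     | (in.readUnsignedByte() << 8)
                     | in.readUnsignedByte();
            if (size < 0) throw new IOException("negative block size: " + size);
            in.readFully(new byte[size]);              // skip the compressed payload
            sizes.add(size);
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.write(new byte[HEADER_SIZE]);              // placeholder header bytes
        out.writeInt(3); out.write(new byte[3]);       // first synthetic block
        out.writeInt(5); out.write(new byte[5]);       // second synthetic block
        System.out.println(blockSizes(new ByteArrayInputStream(buf.toByteArray())));
        // prints [3, 5]
    }
}
```

The per-block size prefix is the reason Snappy.uncompress cannot read SnappyOutputStream output directly: the raw compressed bytes are wrapped in the header and size fields.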

Data format compatibility matrix:

Write\Read	`Snappy.uncompress`	`SnappyInputStream`	`SnappyFramedInputStream`	`org.apache.hadoop.io.compress.SnappyCodec`
`Snappy.compress`	ok	ok	x	x
`SnappyOutputStream`	x	ok	x	x
`SnappyFramedOutputStream`	x	x	ok	x
`SnappyHadoopCompatibleOutputStream`	x	x	x	ok

BitShuffle API (Since 1.1.3-M2)

BitShuffle is an algorithm that reorders data bits (shuffle) for efficient compression (e.g., a sequence of integers, float values, etc.). To use BitShuffle routines, import org.xerial.snappy.BitShuffle:

import java.util.Arrays;

import org.xerial.snappy.BitShuffle;
import org.xerial.snappy.Snappy;

int[] data = new int[] {1, 3, 34, 43, 34};
byte[] shuffledByteArray = BitShuffle.shuffle(data);
byte[] compressed = Snappy.compress(shuffledByteArray);
byte[] uncompressed = Snappy.uncompress(compressed);
int[] result = BitShuffle.unshuffleIntArray(uncompressed);

// Print the array contents rather than the array reference
System.out.println(Arrays.toString(result));

Shuffling and unshuffling of primitive arrays (e.g., short[], long[], float[], double[], etc.) are supported. See Javadoc for the details.
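
To illustrate why reordering bits helps, the following is a naive pure-Java bit transposition for int[] values. It is a conceptual sketch only: it is not the algorithm, block layout, or byte order used by org.xerial.snappy.BitShuffle, and the class name NaiveBitShuffle is hypothetical. The output stores bit 0 of every element first, then bit 1 of every element, and so on.

```java
import java.util.Arrays;

public class NaiveBitShuffle {
    // Gathers bit 0 of all values, then bit 1 of all values, ... up to bit 31.
    static byte[] shuffle(int[] values) {
        int n = values.length;
        byte[] out = new byte[n * 4];                  // 32 bits per value
        int outBit = 0;
        for (int bit = 0; bit < 32; bit++) {
            for (int i = 0; i < n; i++, outBit++) {
                if (((values[i] >>> bit) & 1) != 0) {
                    out[outBit >>> 3] |= (byte) (1 << (outBit & 7));
                }
            }
        }
        return out;
    }

    // Inverse of shuffle: scatters the bit planes back into int values.
    static int[] unshuffle(byte[] shuffled) {
        int n = shuffled.length / 4;
        int[] out = new int[n];
        int inBit = 0;
        for (int bit = 0; bit < 32; bit++) {
            for (int i = 0; i < n; i++, inBit++) {
                if (((shuffled[inBit >>> 3] >>> (inBit & 7)) & 1) != 0) {
                    out[i] |= 1 << bit;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] data = {1, 3, 34, 43, 34};
        byte[] shuffled = shuffle(data);
        // Small values leave the high bit planes empty, so the tail is all zeros.
        System.out.println(Arrays.toString(shuffled));
        System.out.println(Arrays.toString(unshuffle(shuffled)));
    }
}
```

Because the values in this example need at most six bits each, everything after the first few bytes of the shuffled array is zero, and long runs of identical bytes are what a compressor handles well.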

Setting classpath

If you have snappy-java-(VERSION).jar in the current directory, use the -classpath option as follows:

$ javac -classpath ".;snappy-java-(VERSION).jar" Sample.java  # in Windows
or
$ javac -classpath ".:snappy-java-(VERSION).jar" Sample.java  # in Mac or Linux

Public discussion group

Post bug reports or feature requests to the Issue Tracker: https://github.com/xerial/snappy-java/issues

The public discussion forum is the Xerial Public Discussion Group.

For developers

snappy-java uses sbt (simple build tool for Scala) as its build tool. Here is a summary of common commands:

$ ./sbt            # enter sbt console
> ~test            # run tests upon source code change
> ~testOnly        # run tests that match a given name pattern
> publishM2        # publish jar to $HOME/.m2/repository
> package          # create jar file
> findbugs         # Produce findbugs report in target/findbugs
> jacoco:cover     # Report the code coverage of tests to target/jacoco folder

If you need to see detailed debug messages, launch sbt with the -Dloglevel=debug option:

$ ./sbt -Dloglevel=debug

For the details of sbt usage, see my blog post: Building Java Projects with sbt

Building from the source code

See the build instructions. Building from the source code is an option when your OS platform and CPU architecture are not supported. To build snappy-java, you need Git, JDK (1.6 or higher), a g++ compiler (mingw on Windows), etc.

$ git clone https://github.com/xerial/snappy-java.git
$ cd snappy-java
$ make

When building on Solaris, use gmake:

$ gmake

The resulting file target/snappy-java-$(version).jar additionally contains the native library built for your platform.

Creating a new release

The GitHub Action [https://github.com/xerial/snappy-java/blob/master/.github/workflows/release.yml] publishes a new release to Maven Central (Sonatype) when a new tag vX.Y.Z is pushed.

Miscellaneous Notes

Using snappy-java with Tomcat 6 (or higher) Web Server

Simply put the snappy-java jar in the WEB-INF/lib folder of your web application. The usual JNI-library-specific problem no longer exists, since snappy-java version 1.0.3 or higher can be loaded by multiple class loaders.

Configure snappy-java using property file

Prepare an org-xerial-snappy.properties file (under the root path of your library) in Java's property file format. Here is a list of the available properties:

org.xerial.snappy.lib.path (directory containing the snappyjava native library)
org.xerial.snappy.lib.name (library file name)
org.xerial.snappy.tempdir (temporary directory to extract a native library bundled in snappy-java)
org.xerial.snappy.use.systemlib (if this value is true, use the system-installed libsnappyjava.so, looked up in the path specified by java.library.path)
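
As a rough illustration of the property file format, the sketch below builds the text of a small org-xerial-snappy.properties file and parses it with java.util.Properties. The property names come from the list above; the values (such as /var/tmp/snappy) and the class name SnappyPropertiesSketch are hypothetical examples. How snappy-java itself locates and applies the file is not shown here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class SnappyPropertiesSketch {
    // Parses text written in Java's property file format.
    static Properties parse(String text) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(text));
        return p;
    }

    public static void main(String[] args) throws IOException {
        // Example file content; the values are placeholders for illustration.
        String text = String.join("\n",
            "# org-xerial-snappy.properties",
            "org.xerial.snappy.tempdir=/var/tmp/snappy",
            "org.xerial.snappy.use.systemlib=false");

        Properties p = parse(text);
        System.out.println("tempdir = " + p.getProperty("org.xerial.snappy.tempdir"));
        System.out.println("use.systemlib = " + p.getProperty("org.xerial.snappy.use.systemlib"));
    }
}
```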

Snappy-java is developed by Taro L. Saito. Twitter @taroleo
