The Lightning Fast JSON Parser Library For Java
Written by Nikos Vaggalis   
Thursday, 24 August 2023

simdjson-java is the Java version of simdjson, the JSON parser that uses SIMD instructions. How fast can it go?

We had a look at simdjson at its first version, 1.0, back in 2021. In plain terms, simdjson is a C++ library that can parse JSON documents very fast:

Does parsing 3 gigabytes of JSON per second sound fast enough?

This library achieves it. In last year's benchmark against the fastest standard-compliant C++ JSON parsers, RapidJSON and sajson, simdjson outperformed both by a wide margin. It can parse 4x faster than RapidJSON and 25x faster than JSON for Modern C++.

This efficiency comes mainly from the library's use of SIMD instructions under the hood. These excel at data-level parallelism: a single instruction applies the same operation to many data elements at once, even on a single core.
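To make the idea of data-level parallelism concrete, here is a minimal sketch in plain Java. It does not use real SIMD instructions and it is not how simdjson works internally; it uses the related "SIMD within a register" (SWAR) trick, treating one 64-bit long as eight byte lanes so that eight input bytes are compared against the quote character per loop iteration. The class name SwarQuoteCount and its method are hypothetical names chosen for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of simdjson or simdjson-java.
final class SwarQuoteCount {
    private static final long ONES = 0x0101010101010101L; // 0x01 in every byte lane
    private static final long LOW7 = 0x7F7F7F7F7F7F7F7FL; // low 7 bits of every byte lane

    /** Counts '"' bytes, handling 8 input bytes per loop iteration. */
    static int countQuotes(byte[] input) {
        ByteBuffer buf = ByteBuffer.wrap(input).order(ByteOrder.LITTLE_ENDIAN);
        long pattern = ONES * '"';                 // '"' replicated into all 8 lanes
        int count = 0;
        int i = 0;
        for (; i + 8 <= input.length; i += 8) {
            long x = buf.getLong(i) ^ pattern;     // lanes equal to '"' become 0x00
            long t = (x & LOW7) + LOW7;            // high bit set if the low 7 bits are non-zero
            long zeroLanes = ~(t | x | LOW7);      // 0x80 exactly in the lanes that were 0x00
            count += Long.bitCount(zeroLanes);     // one set bit per matching byte
        }
        for (; i < input.length; i++) {            // scalar loop for the last < 8 bytes
            if (input[i] == '"') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] json = "{\"a\":\"b\",\"c\":[1,2,3]}".getBytes(StandardCharsets.UTF_8);
        System.out.println(countQuotes(json)); // prints 6
    }
}
```

Real SIMD registers are 128, 256 or 512 bits wide, so the same kind of comparison covers 16 to 64 bytes per instruction rather than 8.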

You might think that, since it is a C++ library, only developers writing in C++ benefit. This is not true, as there were already bindings for other languages such as Go, Ruby, Python and more. There's even a port for PostgreSQL, pg_simdjson. Well, now there's one for Java too.

With simdjson-java you can now leverage the power of the parser from Java, as easily as:

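import java.util.Iterator;
import org.simdjson.JsonValue;
import org.simdjson.SimdJsonParser;

// loadTwitterJson() is assumed to return the raw JSON document as bytes
byte[] json = loadTwitterJson();

SimdJsonParser parser = new SimdJsonParser();
JsonValue jsonValue = parser.parse(json, json.length);

// Print the screen name of every user who still has the default profile
Iterator<JsonValue> tweets = jsonValue.get("statuses").arrayIterator();
while (tweets.hasNext()) {
    JsonValue tweet = tweets.next();
    JsonValue user = tweet.get("user");
    if (user.get("default_profile").asBoolean()) {
        System.out.println(user.get("screen_name").asString());
    }
}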

While the library is still not feature complete, it outperformed the other Java JSON libraries by a wide margin when benchmarked on a machine with the following specs:

  • CPU: Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz
  • OS: Ubuntu 23.04, kernel 6.2.0-23-generic
  • Java: OpenJDK 64-Bit Server VM Temurin-20.0.1+9

The benchmark showed simdjson-java achieving 1450.951 ops/sec, while the rest (jackson, fastjson2, jsoniter) performed in the range of 500 ops/sec.
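To put those figures in perspective, the sketch below shows how an ops/sec number is derived and what the quoted results imply as a ratio. It is a naive, hand-rolled calculation and not the harness behind the published numbers; the class Throughput and its methods are hypothetical names, and the 500.0 baseline is just the rounded figure quoted above.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of any benchmark suite.
final class Throughput {
    /** Converts an operation count and elapsed nanoseconds into operations per second. */
    static double opsPerSecond(long operations, long elapsedNanos) {
        return operations * 1_000_000_000.0 / elapsedNanos;
    }

    /** How many times faster the candidate is than the baseline. */
    static double speedup(double candidateOpsPerSec, double baselineOpsPerSec) {
        return candidateOpsPerSec / baselineOpsPerSec;
    }

    /** Naive timing loop: no warm-up and no protection against JIT dead-code elimination. */
    static double measure(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        long elapsed = Math.max(1, System.nanoTime() - start); // avoid division by zero
        return opsPerSecond(iterations, elapsed);
    }

    public static void main(String[] args) {
        // Ratio implied by the quoted results, using the rounded 500 ops/sec baseline
        System.out.println(String.format(Locale.ROOT, "%.2fx", speedup(1450.951, 500.0))); // prints 2.90x
    }
}
```

In other words, the quoted numbers put simdjson-java at roughly 2.9 times the throughput of the other parsers on that machine. For real measurements a dedicated benchmarking harness is a better choice than the measure method above, since it takes care of warm-up and JIT effects.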

With that said, what's missing from the library at this early stage?

  • Support for Unicode characters
  • UTF-8 validation
  • Full support for parsing floats
  • Support for 512-bit vectors

These features are, however, on the project's roadmap, and upon their completion the library will be in an even better position.
That said, you can start using it in your own code now to enjoy the performance benefits, provided of course that you take the missing functionality into consideration.
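One way to take the missing UTF-8 validation into consideration is to check untrusted input yourself before handing it to the parser. The following is a minimal sketch using only the JDK's java.nio.charset classes; the class Utf8Check and its method are hypothetical names, and the extra pass over the data costs time and memory, so it gives back part of the speed advantage.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of simdjson-java.
final class Utf8Check {
    /** Returns true if the bytes form well-formed UTF-8, false otherwise. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)      // fail instead of substituting U+FFFD
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));                 // decodes into a temporary buffer
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] good = "{\"name\":\"Zoë\"}".getBytes(StandardCharsets.UTF_8);
        byte[] bad = {(byte) 0xC3, (byte) 0x28}; // lead byte followed by a non-continuation byte
        System.out.println(isValidUtf8(good)); // prints true
        System.out.println(isValidUtf8(bad));  // prints false
    }
}
```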

 

More Information

simdjson Java

simdjson

Related Articles

A Lightning Fast JSON Parser Library 

 
