Initial commit
Kalrish Bäakjen committed Dec 9, 2017
0 parents commit 8080d86
Showing 95 changed files with 3,759 additions and 0 deletions.
68 changes: 68 additions & 0 deletions README.md
@@ -0,0 +1,68 @@
# json2lua

This program converts JSON to Lua efficiently. Instead of loading and parsing the entire JSON document at once, which could require a lot of memory, it reads and processes the input in chunks.
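
As a rough illustration of chunked processing, the sketch below reads a file in fixed-size pieces and hands each piece to a callback, so that memory use is bounded by the chunk size rather than by the input size. `for_each_chunk`, `chunk_size` and `consume` are hypothetical names chosen for this example; they are not part of `libjson2lua`, and the actual program may organize its reads differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper (not part of libjson2lua): feeds `input` to `consume`
// in pieces of at most `chunk_size` bytes and returns the total byte count.
template <typename Consumer>
std::size_t for_each_chunk(std::FILE* input, std::size_t chunk_size, Consumer consume)
{
    assert(input != nullptr && chunk_size > 0);

    std::vector<unsigned char> chunk(chunk_size);
    std::size_t total = 0;

    for (;;)
    {
        const std::size_t got = std::fread(chunk.data(), 1, chunk.size(), input);
        if (got == 0)
            break; // end of file or read error; the caller can check std::ferror

        consume(chunk.data(), got); // only `got` bytes of the chunk are valid
        total += got;
    }

    return total;
}
```

A caller would pass a lambda that parses or converts each piece; only one chunk is held in memory at a time.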

Currently, the project comprises two components:

- `libjson2lua`, a C++ library implementing a flexible, pull-style JSON-to-Lua parser and converter
- `json2lua`, a C++ program that exposes the library's functionality on the command line

The library is kept in the project because it is developed for the program, although it might also be useful elsewhere.

In the future, the program might be expanded to support more encodings and, perhaps, to write bytecode directly.


## Build

The Tup build system manages the process, which comprises three steps: configuration, build and testing.

### Configuration
Parameters are specified in the usual way, in a `tup.config` file. Tup variants may also be useful, for instance to keep several configurations side by side.

The steps are as follows:

1. Choose an appropriate toolchain (`CONFIG_TOOLCHAIN`). Toolchains are defined in [src/toolchains.tup](src/toolchains.tup).
2. Set the config items required or supported by the chosen toolchain (e.g. `CONFIG_CXXFLAGS` for the GNU toolchain).
3. Choose appropriate values for toolchain-independent variables (like `CONFIG_BUFFER_SIZE`).

[The provided tup.config](tup.config) documents the toolchain-independent variables.

Example configurations, which may or may not suit your build environment, are provided in [configs.tup](configs.tup). To use one of them, add your settings for the toolchain-independent parameters and proceed to the build step.

### Build
The build step follows the standard procedure:

$ tup

#### Requirements

- the [Tup build system](http://gittup.org/tup/)
- a C++ toolchain
- [guardcheader](https://github.com/kalrish/guardcheader)
- the standard Lua interpreter

### Testing
Tests are part of the build as well, and thus managed by Tup.

Tests have been implemented for the command-line program. Because they run the compiled executable, they can be carried out only if the architecture for which the program was built matches the one on which it was built. In that case they run automatically, and Tup reports an error if any of them fails.


## Usage

The accepted command-line syntax depends on the version.

All versions support the GNU-standard `--version` option, which reports it:

$ json2lua --version

This prints `json2lua `_`something VERSION`_, where _VERSION_ is the last token and represents the version.
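
A script or wrapper that needs the version could extract the last whitespace-separated token of that line. The following is a minimal sketch of that step; `last_token` is a hypothetical helper written for this example and is not shipped with the project.

```cpp
#include <string_view>

// Hypothetical helper: returns the last whitespace-separated token of `line`,
// or an empty view if the line is empty or contains only blanks.
std::string_view last_token(std::string_view line)
{
    constexpr std::string_view blanks = " \t\r\n";

    // Position of the last non-blank character, i.e. the end of the token.
    const auto last = line.find_last_not_of(blanks);
    if (last == std::string_view::npos)
        return {};

    // Position of the blank that precedes the token, if there is one.
    const auto before = line.find_last_of(blanks, last);
    const auto first = (before == std::string_view::npos) ? 0 : before + 1;

    return line.substr(first, last - first + 1);
}
```

Trailing newlines are skipped, so the line can be passed exactly as it was read from the program's output.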

### Current version

The current version handles UTF-8-encoded JSON and outputs ASCII-encoded Lua 5.
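
Because the input is read in chunks, a chunk boundary can fall in the middle of a multi-byte UTF-8 sequence. The sketch below shows how the lead byte of a sequence tells how many continuation bytes must follow it, which is the information needed to complete a sequence cut off at the end of a chunk. `utf8_trailing_bytes` is a hypothetical name used only for this illustration.

```cpp
#include <cstddef>

// Hypothetical helper: number of continuation bytes expected after `byte`
// when it is a UTF-8 lead byte; 0 for ASCII, continuation and invalid bytes.
constexpr std::size_t utf8_trailing_bytes(unsigned char byte)
{
    if ((byte & 0b11100000u) == 0b11000000u) return 1; // 110xxxxx
    if ((byte & 0b11110000u) == 0b11100000u) return 2; // 1110xxxx
    if ((byte & 0b11111000u) == 0b11110000u) return 3; // 11110xxx
    return 0;
}
```

Each test masks one bit more than the pattern it looks for, so that a four-byte lead such as `0xF0` is not mistaken for a two-byte lead.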

The program shall be invoked as follows:

$ json2lua INPUT OUTPUT

_INPUT_ and _OUTPUT_ are the names of the files that shall be read and written, respectively.
Empty file added Tupfile.ini
Empty file.
1 change: 1 addition & 0 deletions Tuprules.lua
@@ -0,0 +1 @@
tup.creategitignore()
1 change: 1 addition & 0 deletions Tuprules.tup
@@ -0,0 +1 @@
.gitignore
6 changes: 6 additions & 0 deletions configs.tup/cl.config
@@ -0,0 +1,6 @@
CONFIG_TOOLCHAIN=msvc
CONFIG_CL=cl.exe
#CONFIG_CLFLAGS=/nologo /Od /Gs /GF /RTCsu /MT /EHsc
CONFIG_CLFLAGS=/nologo /std:c++17 /DNDEBUG /Ox /Gs- /GF /GL /Gw /Gy /MT /EHsc
#CONFIG_LINKFLAGS=/nologo /Od /Gs /GF /RTCcsu /MT
CONFIG_LINKFLAGS=/nologo /Ox /Gs- /GF /GL /Gw /Gy /MT /link /LTCG /OPT:REF,ICF
4 changes: 4 additions & 0 deletions configs.tup/g++-dev.config
@@ -0,0 +1,4 @@
CONFIG_TOOLCHAIN=gnu
CONFIG_CXX=g++
CONFIG_CXXFLAGS=-Wall -Wextra -Wpedantic -Wno-unused-variable -Wno-unused-const-variable -Wno-unused-parameter -Wno-unused-function -Werror -std=c++17 -fno-common -march=native
CONFIG_LDFLAGS=-fno-common -march=native
4 changes: 4 additions & 0 deletions configs.tup/g++-performance.config
@@ -0,0 +1,4 @@
CONFIG_TOOLCHAIN=gnu
CONFIG_CXX=g++
CONFIG_CXXFLAGS=-Wall -std=c++17 -DNDEBUG -flto -fomit-frame-pointer -fno-common -fdata-sections -ffunction-sections -O3
CONFIG_LDFLAGS=-flto -fuse-linker-plugin -fomit-frame-pointer -fno-common -O3 -march=native -static -s -Xlinker --as-needed -Xlinker -O1 -Xlinker --gc-sections -Xlinker --sort-common -Xlinker --strip-all -Xlinker --relax
11 changes: 11 additions & 0 deletions src/Tuprules.tup
@@ -0,0 +1,11 @@
include toolchains.tup/@(TOOLCHAIN).tup

ifdef TARGET
TARGET=@(TARGET)
else
TARGET=@(TUP_PLATFORM)
endif

ifeq ($(TARGET),win32)
PROGRAM_SUFFIX=.exe
endif
5 changes: 5 additions & 0 deletions src/cmd/Tupfile
@@ -0,0 +1,5 @@
include_rules

: foreach cbc.cpp libinst.cpp json2lua.cpp | ../lib/<headers> config/buffer_size.hpp |> !cxx |> {objs}
: foreach log.cpp main.cpp utf8.cpp | ../lib/<headers> |> !cxx |> {objs}
: {objs} |> !ld |> json2lua$(PROGRAM_SUFFIX) | <cmd>
162 changes: 162 additions & 0 deletions src/cmd/cbc.cpp
@@ -0,0 +1,162 @@
#include "config.hpp"

#include "cbc.hpp"

#include "log.hpp"
#include "utf8.hpp"


json2lua::pointer_pair<char32_t>
cbc::read
( )
{
unsigned char buffer[buffer_size];

const auto input = this->input;
const auto code_points = this->buffer;

auto read = std::fread(buffer, 1, sizeof(buffer)-3, input);

if ( std::ferror(input) == 0 )
{
if ( read > 0 )
{
const auto last_byte = buffer[read-1];

std::size_t additional = 0;

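/* classify the lead byte; each mask covers one bit more than the pattern, so longer leads are not mistaken for shorter ones */
if ( (last_byte & 0b11100000u) == 0b11000000u )
additional = 1; /* 110xxxxx: two-byte sequence */
else if ( (last_byte & 0b11110000u) == 0b11100000u )
additional = 2; /* 1110xxxx: three-byte sequence */
else if ( (last_byte & 0b11111000u) == 0b11110000u )
additional = 3; /* 11110xxx: four-byte sequence */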

if ( additional > 0 )
{
log_debug("must read more");

const auto second_read = std::fread(buffer+read, 1, additional, input);
if ( std::ferror(input) == 0 )
{
if ( second_read == additional )
{
read += second_read;
}
else
{
err("incomplete UTF-8 sequence");

throw 4;
}
}
else
{
err("couldn't read input file");

throw 2;
}
}
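else if ( (last_byte & 0b11000000u) == 0b10000000u )
{
/* it's a continuation byte (10xxxxxx); read the remaining continuation bytes of its sequence */
auto buffer_p = buffer+read;

read_another_character:
const auto c = std::fgetc(input);
if ( c != EOF )
{
if ( (c & 0b11000000) == 0b10000000 )
{
/* never write past the three spare bytes left by the first read */
if ( read == sizeof(buffer) )
{
err("invalid UTF-8 sequence");

throw 4;
}

*buffer_p = static_cast<unsigned char>(c);
++buffer_p;
++read;

goto read_another_character;
}
else
{
/* first byte of the next character; leave it for the next read */
std::ungetc(c, input);
}
}
else
{
if ( std::feof(input) == 0 )
{
err("couldn't read input file");
}
else
{
err("incomplete UTF-8 sequence");
}

throw 2;
}
}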

json2lua::pointer_pair<char32_t> rv;
rv.beg = code_points;
rv.end = decode_utf8(buffer, read, code_points);

return rv;
}
else
{
if ( std::feof(input) == 0 )
{
err("couldn't read anything");
}
else
{
err("incomplete JSON");
}

throw 3;
}
}
else
{
err("couldn't read input file");

throw 2;
}
}

void
cbc::write
(
const unsigned char c
)
{
if ( std::fputc(c, this->output) == EOF )
{
err("couldn't write to the output file");

throw 1;
}
}

void
cbc::write
(
const unsigned char * const buffer,
std::size_t size
)
{
if ( std::fwrite(buffer, 1, size, this->output) != size )
{
err("couldn't write to the output file");

throw 1;
}
}

json2lua::pointer_pair<unsigned char>
cbc::encode_string_code_point
(
const char32_t code_point
)
{
json2lua::pointer_pair<unsigned char> rv;

rv.beg = this->string_code_units;
rv.end = rv.beg + encode_utf8(code_point, this->string_code_units);

return rv;
}
39 changes: 39 additions & 0 deletions src/cmd/cbc.hpp
@@ -0,0 +1,39 @@
#include <cstdio>

#include "config/buffer_size.hpp"
#include "../lib/pointer_pair.hpp"


class cbc final
{
private:
char32_t buffer[buffer_size];
unsigned char string_code_units[4];

public:
std::FILE * input;
std::FILE * output;

json2lua::pointer_pair<char32_t>
read
( );

void
write
(
const unsigned char
);

void
write
(
const unsigned char * const,
const std::size_t
);

json2lua::pointer_pair<unsigned char>
encode_string_code_point
(
const char32_t
);
};
3 changes: 3 additions & 0 deletions src/cmd/config.hpp
@@ -0,0 +1,3 @@
#if __STDC_HOSTED__ != 1
#error C++ implementation is not hosted
#endif
3 changes: 3 additions & 0 deletions src/cmd/config/Tupfile
@@ -0,0 +1,3 @@
include_rules

: buffer_size.hpp.in |> tup varsed %f %o |> %B
1 change: 1 addition & 0 deletions src/cmd/config/buffer_size.hpp.in
@@ -0,0 +1 @@
constexpr decltype(sizeof(int)) buffer_size = @BUFFER_SIZE@;
1 change: 1 addition & 0 deletions src/cmd/config/debug.h
@@ -0,0 +1 @@
#undef DEBUG