Key Features • About • Use Cases • How To Use • Roadmap • Documentation • License
- Full in-memory handling
- Concurrency safe and optimized for multithreading applications
- Easy integration as a standalone library
- No third party dependencies
- Simple builder based query language (json compatible)
- Option to create/map nested structures at once (json compatible)
- Native directed graph structure support
- Full access via direct storage api
- Supports multiple parallel storages (factory)
- Global accessibility of storage index
GITS is designed to simplify the management of complex data structures in Go applications, eliminating the need for manual concurrency handling. By storing and operating on data entirely in memory, GITS enables rapid processing of large datasets and intricate structures. The library is inherently thread-safe, making it suitable for multithreaded environments.
While offering a straightforward query interface for most use cases, GITS also exposes an underlying storage API for advanced optimization. This flexibility allows GITS to accommodate a broad spectrum of applications, from high-performance in-memory object storage to intricate information network mapping.
The following use cases are example either applications in which i used GITS or ideas that came to my mind in which using GITS could be beneficial. Apart from that, GITS can be used in any environment that can benefit from the features it provides.
- Corona Dashboard
- Back in the main phase of corona i created my own dashboard hosting and showing the current infection/etc statistics for countries/states/ and other subnational units. A crawler collected information from official sources at an hourly rate and mapped them into the storage. This data was fully kept in memory till i hit my servers limits (20gb ~).
- Webcrawler
- Due to GITS offering the mapping of data in graph structure, you can create complex networks - in this case websites/pages and the interlinking in between those. Due to the index being in memory - lookups for already existing entries and such are very fast.
- File importer
- You got a very large XML export with different nested sections and ID relations in between those. You can easily use GITS to map the data and than simply retrieve the complete structures for exporting/processing instead of having to use multiple lookups every time.
- NFT trader
- GITS was used to keep track of NFT trading transactions and evaluate possible arbitrary trades based on the data kept in memory. Since such arbitrary trading is very time sensitive, an in memory storage was optimal.
- go-cyberbrain
- A longterm project of myself (which i initially created SlingshotDB for). It's a processing/computing framework which allows for self supervising/automated processing of data.
Since GITS is a library, the setup is really simple and only consists of requiring GITS into your existing go project
go get github.com/voodooEntity/gits@0.9.1
The following examples provide a sneak peak into the usage of GITS. For a more detailed overview please visit the documentation.
To use GITS first we need to create a new instance
myGitsInstance := gits.NewInstance("main")
myGistInstance is a *gits.Gits than can be used to map new data, query the storage or use the direct storage api.
GITS also keeps track of all storages. The first created storage will always be automatically set as "default" and globally retrievable using gits.GetDefault().
You are free to create as many instances of GITS as you like, in the following examples we are going to focus on having one storage to work with.
A dataset in GITS is considered an entity, or in graph speech a "node/vertice". When interacting with the data mapping function or query results, the type of structure used is the transport.TransportEntity
type TransportEntity struct {
Type string
ID int
Value string
Context string
Version int
Properties map[string]string
ChildRelations []TransportRelation
ParentRelations []TransportRelation
}
Each dataset in GITS will be primarily defined by its Type (string), ID (auto increment per type).
While providing a Value:string field to store data, it also holds a Context:string field and the possibility to store further data using the map[string]string Properties field. The Version field is used to keep track of versions when using GITS in a multithreaded environment and should usually not be tempered with. ChildRelations and ParentRelations are pretty self explaining and can hold any number (amount and depth) of further transport.TransportEntities.
This format is mainly used for input/output purposes and the actual stored dataset (types.StorageEntity) is a lot smaller. You will only encounter instances of said type when directly interacting with the storage API.
Also, since GITS is a graph structured storage, it by default has includes the ability to link any dataset with what we call a "relation" or in graph speech an "edge". Such a relation is structured as following
type TransportRelation struct {
Context string
Properties map[string]string
Target TransportEntity
SourceType string
SourceID int
TargetType string
TargetID int
Version int
}
To make creating new data as easy as possible, GITS provides the "MapData" method which only needs to be provided with an instance of transport.TransportEntity and will take care of the rest.
rootIntID := myGitsInstance.MapData(transport.TransportEntity{
ID: storage.MAP_FORCE_CREATE,
Type: "Alpha",
Value: "Something",
Context: "Example",
ChildRelations: []transport.TransportRelation{
{
Target: transport.TransportEntity{
ID: storage.MAP_FORCE_CREATE,
Type: "Beta"
Value: "Else"
Context: "Example",
},
},
},
})
As you can see in the example, we are passing a nested structure to the MapTransport function. You can map a single dataset or any amount of nested datasets. Since we are using the MAP_FORCE_CREATE constant both datasets will definitely be created. The return of MapTransport is always the ID of the root dataset passed to the method.
For more detailed information on the capabilities of MapTransport please check "MapTransport Examples & Details" addlink
The most simple way to access data in a GITS storage is to use the inbuilt query language builder. While the GITS custom query builder options may be limited, they include the most important options to cover a wide range of use cases.
An example of how to use it to read the previously mapped data
// Retrieve a query adapter instance. while its not required to store the adapter
// (you may also just always use myGitsInstance.Query() before calling query methods)
// it might be handy to reduce the amount of typing
qryAdapter := myGitsInstance.Query()
// now we create a query to read the desired data
finalQuery := qryAdapter.New().Read("Alpha").Match("Value","==","Something").To(
qryAdapter.New().Read("Beta").Match("Value","==","Else"),
)
// finally we execute the query to retrieve the results. ExecuteQuery
// is called on the previously created instance of *gits.Gits
result := qryAdapter.Execute(finalQuery)
For more information please refer to "Query Language Reference and Examples"
While it is recommended to primary use the query builder when accessing the storage, you are always able to directly interact with the storage.
To access the storage you simply call
storageInstance := myGitsInstance.Storage()
and afterwards you can interact with the storage directly
allEntityTypes := storageInstance.GetEntityTypes()
For more information please refer to "Storage API Reference and Examples"
While GITS is designed for optimal performance through in-memory handling and optimized indexing, understanding the underlying storage architecture can further enhance its usage. For a detailed explanation of this architecture, please refer to the following documentation: Storage Architecture
The following list contains topics that will be the focus of future major updates. This list is not ordered.
- Enhance query capabilities
- Right now the query builder/language has certain limitations especially in context of methods like "Link","Unlink" and the possibilities to create complex Match(Filter) conditions. This should be enhanced by reworking the query parser and enhancing the query.Query functionality.
- The current return format for linked data is nested struct instances. While this is very comfortable to work with, it also has certain limitations. Therefor it should also be possible to get the return in a flat format.
- Enhance test coverage
- As for now, only the query builder/parser is fully tested. This should be extended to cover as much code as (usefully) possible.
- Massive restructuring of gits in order to make it more accessible and easier to get into for newcomers. This changes are leading towards the first stabled release and include:
- Enable multiple storages due to factory pattern
- Add a global tracking of storages and easy accessibility
- Move the storage to storage.go package and make gits.go the main interface
- Add simple interface in gits.go to enable query'ing and storage access
- Remove old persistence implementation
- Rework README.md and add full Documentation
- Bugfixes
- Fixing an issue on which a array out of index error could occure on nested Reduce() queries
- Adjusting Match() and OrMatch() when used with "Properties." to require the results to have the searched Property name.
- Bugfix addresslist memory exception
Full Changelog - Latest Release
GNU General Public License v3.0
laughingman.dev · GitHub @voodooEntity