goodcleanfun

Fun little NLP building blocks for the public good, free from capitalist interests, clean as in small/focused and low climate impact

Popular repositories

  1. tokenizer Public

    Jinja

  2. vector_ops Public

    Generic vector functions for numeric types in C using OpenMP if available

    C

  3. tokens Public

    Arrays of tokens stored as string offsets and lengths, plus a tokenized string that stores the tokens as a single contiguous array of NUL-terminated strings using cstring_array

    C

  4. token_types Public

    Global enum of token types and associated grouping functions.

    C

  5. utf8 Public

    Convert UTF-8 strings to Unicode code points using utf8proc in C

    C

  6. khash Public

    Header-only clib package for khash.h
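
The offsets-and-lengths idea in the tokens description can be illustrated with a minimal C sketch. The `token_span` type and `split_whitespace` function are hypothetical names invented for this example; they are not taken from the tokens repository, and splitting on ASCII whitespace is only a stand-in for real tokenization.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token record: a span into the original string, no copy made. */
typedef struct {
    size_t offset;  /* byte index where the token starts */
    size_t length;  /* token length in bytes */
} token_span;

/* Split on ASCII whitespace (an assumption made for this sketch).
   Writes at most max_tokens spans into out and returns how many were written. */
static size_t split_whitespace(const char *s, token_span *out, size_t max_tokens) {
    assert(s != NULL && out != NULL);
    size_t n = 0, i = 0;
    while (s[i] != '\0' && n < max_tokens) {
        /* Skip any run of whitespace before the next token. */
        while (s[i] != '\0' && isspace((unsigned char)s[i])) i++;
        if (s[i] == '\0') break;
        size_t start = i;
        /* Advance to the end of the token. */
        while (s[i] != '\0' && !isspace((unsigned char)s[i])) i++;
        out[n].offset = start;
        out[n].length = i - start;
        n++;
    }
    return n;
}
```

Because each span only refers back to the source buffer, a token can be printed with `printf("%.*s", (int)span.length, s + span.offset)` without allocating or copying.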

Repositories

Showing 10 of 69 repositories
  • max_interval_tree Public

    A dynamic, generic interval tree which allows querying the maximum weight of an interval containing the query and the interval at which that value was reached

    C, 0 stars, MIT license, 0 forks, 0 open issues, 0 pull requests, updated Feb 7, 2025
  • weighted_sum_interval_tree Public

    A dynamic, balanced search tree that keeps track of a sum of weighted intervals

    C, 0 stars, MIT license, 0 forks, 0 open issues, 0 pull requests, updated Feb 7, 2025
  • memory_pool Public

    Generic memory pool for fixed types using block allocation and a free list stored within the blocks themselves

    C, 0 stars, MIT license, 0 forks, 0 open issues, 0 pull requests, updated Feb 7, 2025
  • sartorial Public

    Pydantic model base classes and custom type handling, JSON schema generation, etc. covering a variety of common scenarios without much config

    Python, 0 stars, MIT license, 0 forks, 0 open issues, 2 pull requests, updated Feb 7, 2025
  • concurrent_array Public

    High-performance concurrent/thread-safe, generic, dynamic (push-only) array using read-write locks (write lock only held for resizing) and C11 atomics to ensure unique indices for each push.

    C, 0 stars, MIT license, 0 forks, 0 open issues, 0 pull requests, updated Feb 2, 2025
  • threading Public

    A simple cross-platform threads.h implementation

    C, 0 stars, 0 forks, 0 open issues, 0 pull requests, updated Feb 2, 2025
  • array Public

    Generic, dynamic arrays in C using simple includes and defines instead of macros

    C, 0 stars, MIT license, 0 forks, 0 open issues, 0 pull requests, updated Feb 2, 2025
  • byte_struct Public

    A contiguous-memory runtime struct using byte offsets, for constructing e.g. multidimensional keys, tuples with more than one type

    C, 0 stars, MIT license, 0 forks, 0 open issues, 0 pull requests, updated Feb 2, 2025
  • lex_order Public

    Convert numeric types into binary-comparable representations for lexicographic sorting and trie storage

    C, 0 stars, MIT license, 0 forks, 0 open issues, 0 pull requests, updated Feb 2, 2025
  • byte_order Public

    Cross-platform endian buffer I/O

    C, 0 stars, MIT license, 0 forks, 0 open issues, 0 pull requests, updated Feb 1, 2025
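
The lex_order and byte_order descriptions both rest on one idea: write a number as bytes so that plain `memcmp` ordering agrees with numeric ordering. A minimal sketch of that idea in C follows. The function names `int32_to_sortable` and `double_to_sortable` are hypothetical and are not taken from either repository. The double version assumes IEEE 754 binary64 and does not define an ordering for NaN.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: encode a signed 32-bit integer as 4 bytes whose
   memcmp order matches numeric order. */
static void int32_to_sortable(int32_t value, uint8_t out[4]) {
    assert(out != NULL);
    /* Flipping the sign bit maps INT32_MIN..INT32_MAX onto 0..UINT32_MAX in order. */
    uint32_t u = (uint32_t)value ^ UINT32_C(0x80000000);
    /* Big-endian layout: most significant byte first, so byte-wise comparison works. */
    out[0] = (uint8_t)(u >> 24);
    out[1] = (uint8_t)(u >> 16);
    out[2] = (uint8_t)(u >> 8);
    out[3] = (uint8_t)u;
}

/* Hypothetical helper: same idea for doubles, assuming IEEE 754 binary64.
   NaN ordering is left unspecified; -0.0 sorts just before +0.0. */
static void double_to_sortable(double value, uint8_t out[8]) {
    assert(out != NULL);
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);  /* reinterpret without aliasing issues */
    if (bits >> 63) {
        bits = ~bits;                    /* negative: invert all bits so larger magnitudes sort first */
    } else {
        bits ^= UINT64_C(1) << 63;       /* non-negative: set the top bit so it sorts after negatives */
    }
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(bits >> (56 - 8 * i));  /* big-endian byte order */
    }
}
```

Writing the bytes with explicit shifts rather than copying the integer's memory keeps the output identical on little-endian and big-endian hosts, which is what lets the encoded keys be compared or stored in a trie byte by byte.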
