# Tasks

Top Priority

  • docs

    • architecture.md
      • complete: flows, component breakdown
  • docs

    • tutorial.md
    • README
    • generate code docs (where should these be placed? perhaps src_docs)
  • bad ux:

    • "select tbl_name from catalog"
      • invalid column name - perhaps valid columns can be surfaced
    • "drop table table_doesnot_exist"
      • weird message
  • tests

    • add more lang_tests
    • no_source tests?
  • is the usage of constants::NULLPTR valid?

  • increase constants::INTERNAL_NODE_MAX_CELLS

  • constants::LEAF_NODE_MAX_CELLS

  • validate ordering

    • quicksort: does this equality need to handle floats in a special way?
  • put admin tasks somewhere (Make, python doit, shell)

    • run all tests
    • run black
  • release

    • add config file (controls output filepath, etc)

    • track bugs/gotchas

    • ungrouped source impl
  • for lang_tests, I should assert on contents; right now they only check whether the parse is successful

  • cleanup learndb.py; ensure all devloops are encoded in some test cases

  • how to best structure E2E tests?

    • how should they be named?

Testing

  • btree: test all permutations of small test cases
  • add stress tests (stress-tests.txt)
  • use pytest fixtures
    • seems it would be cleaner to define fixtures, i.e. pre-inited dbs with different schemas
    • right now, I have a lot of boilerplate
  • improve coverage and robustness of test suite
    • robustness: try randomized inputs
    • coverage: auto generate new inputs
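
The "test all permutations of small test cases" item could be sketched roughly as follows. `SortedStore` is a hypothetical stand-in, not the learndb btree; it only marks where the real insert and in-order scan would be called. The check assumes that any insertion order of the same keys must produce the same sorted scan.

```python
import bisect
import itertools


class SortedStore:
    """Hypothetical stand-in for the btree under test."""

    def __init__(self):
        self._keys = []

    def insert(self, key):
        # The real test would call the btree insert here.
        bisect.insort(self._keys, key)

    def scan(self):
        # The real test would walk the btree leaves in key order.
        return list(self._keys)


def check_all_permutations(keys):
    """Insert every ordering of `keys` into a fresh store.

    Returns the number of orderings checked; raises AssertionError
    on the first ordering whose scan is not sorted.
    """
    expected = sorted(keys)
    checked = 0
    for ordering in itertools.permutations(keys):
        store = SortedStore()
        for key in ordering:
            store.insert(key)
        assert store.scan() == expected, f"ordering {ordering} broke sort order"
        checked += 1
    return checked
```

The number of orderings grows factorially, so this only stays practical for small key sets; randomized inputs would have to cover larger ones.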

User API

  • metaops to list tables, show table schema
  • add config to control:
    • how to pass config: update entry point method
    • stop_execution_on_error
    • output data file
  • in addition to LearnDB do I want to support:
    • cursor?
  • records should be immutable, since they're shallow copied; or final records returned to user should be separate
  • repl should have help message at beginning
    • have an additional/secondary command to output sql example/primer
  • run learndb with input file
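
One possible shape for the config item above, as a sketch only. `Config`, `run_statements`, the field name `output_data_file`, and both defaults are assumptions for illustration; only `stop_execution_on_error` is named in the list. `execute` stands in for whatever the entry point uses to run a single statement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Config:
    # Defaults are assumptions, not decided behavior.
    stop_execution_on_error: bool = True
    output_data_file: Optional[str] = None


def run_statements(statements, execute, config):
    """Run statements in order.

    Returns a list of (statement, error) pairs, where error is None
    on success. Stops after the first failure if the config says so.
    """
    results = []
    for stmt in statements:
        try:
            execute(stmt)
            results.append((stmt, None))
        except Exception as exc:
            results.append((stmt, exc))
            if config.stop_execution_on_error:
                break
    return results
```

Passing the config object into the entry point keeps the behavior explicit at the call site, which is one answer to the "how to pass config" question.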

Documentation/Refactoring

  • complete architecture.md
  • document datatypes (type and valid ranges)
    • int is 4 byte int
    • real is a floating point number, but I'm handling it with much simpler rules than IEEE754
  • complete future-work.md
    • should contain high-level roadmap and interesting areas of future work
  • add tutorial
  • complete btree-structural-ops.txt
  • add note: btree types (pages/nodes) are bytearray (mutable), not bytes (immutable)
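
For the datatype docs, a small example along these lines might help. It assumes the 4-byte int is signed and stored little-endian; neither is stated above, and `encode_int` is a hypothetical helper. It also shows why pages need to be `bytearray`: the slice assignment would fail on `bytes`.

```python
import struct

# Range of a signed 4-byte int (signedness is an assumption).
INT_MIN = -(2 ** 31)
INT_MAX = 2 ** 31 - 1


def encode_int(value):
    """Pack an int into 4 bytes; little-endian is an assumption."""
    if not INT_MIN <= value <= INT_MAX:
        raise ValueError(f"{value} does not fit in a 4-byte int")
    return struct.pack("<i", value)


# A page is a mutable bytearray, so cells can be written in place.
page = bytearray(16)
page[0:4] = encode_int(-7)

# Read the value back from offset 0 of the same page.
decoded = struct.unpack_from("<i", page, 0)[0]
```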

Lark

  • document how parse tree -> AST is working
  • pretty print transformed tree
  • to_ast
    • sql_handler should return cleaned up ast
  • write tests for lark
  • validations: when the parse tree is being turned into an AST, assert things like, e.g. no ON clause on a cross join
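
The cross-join validation could look something like this. `Join` and `validate_join` are hypothetical names, not the actual learndb AST classes; the sketch only shows the check running at the point where an AST node is built.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Join:
    """Hypothetical AST node for a join between two sources."""
    join_type: str                   # e.g. "inner" or "cross"
    left: str
    right: str
    on_clause: Optional[str] = None


def validate_join(node):
    """Reject a cross join that carries an ON clause; return the node otherwise."""
    if node.join_type == "cross" and node.on_clause is not None:
        raise ValueError("cross join must not have an ON clause")
    return node
```

Calling `validate_join` from the transformer step that builds join nodes would surface the problem at parse time rather than during execution.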

Optimization

  • If a function invocation, e.g. count(col_a_), is used in both select and having, the expr value should be cached
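
A minimal sketch of that caching idea, assuming one cache per group and a key made of the function name and its arguments. `ExprCache` is a hypothetical name, and the `evaluations` counter exists only to make the saving visible.

```python
class ExprCache:
    """Memoize function-call results, keyed by (function name, args)."""

    def __init__(self):
        self._values = {}
        self.evaluations = 0  # how many times compute() actually ran

    def evaluate(self, func_name, args, compute):
        key = (func_name, tuple(args))
        if key not in self._values:
            self._values[key] = compute()
            self.evaluations += 1
        return self._values[key]


rows = [{"col_a": 1}, {"col_a": 2}, {"col_a": 3}]
cache = ExprCache()


def count_col_a():
    # count(col_a) over the rows of one group, skipping NULLs.
    return sum(1 for row in rows if row["col_a"] is not None)


# The same expression appears in select and in having; it runs once.
select_value = cache.evaluate("count", ["col_a"], count_col_a)
having_value = cache.evaluate("count", ["col_a"], count_col_a)
```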

Parser

  • ensure rules make sense with expression symbol
    • this (expression) should wrap or_clause
    • func_call from grammar/parser side only supports positional args; extend this to allow named args, since function objects support named args

Storage (btree)

  • support deletion/free-list
    • support defragmentation of unallocated space
    • when allocating from the free list, allocate the whole block, i.e. don't bother chunking blocks; the diff between block size and data size has to be accounted for as padding anyway, since free-list blocks have a min size
  • allocating on fragmented node (with enough space) should trigger single-node-in-place compaction
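
The whole-block allocation idea might be sketched as below. The free list is modeled as a plain Python list of `(offset, block_size)` pairs and the search is first-fit; both are assumptions for illustration, not the on-disk layout.

```python
def allocate(free_list, data_size):
    """Take the first free block that fits and hand out all of it.

    Returns (offset, block_size, padding), or None if no block fits.
    The block is removed from free_list and is never split; the
    unused tail is reported as padding.
    """
    for i, (offset, block_size) in enumerate(free_list):
        if block_size >= data_size:
            del free_list[i]
            return offset, block_size, block_size - data_size
    return None
```

Not splitting blocks means no remainder smaller than the minimum block size is ever created, at the cost of some wasted space per allocation.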

VM

  • flow
    • if a statement fails, should exec stop? how is this behavior controlled?
  • select statement
    • in addition to cursor iteration; select will have conditions and an optimizer
    • what is the interface for select?
      • user executes select and is returned a pipe object

Cleanliness

  • move all code into /learndb ?
  • run black
  • run mypy

Bugs

  • e2e_test.py::join_test should fail
  • duplicate key not erroring (this might be working now)
  • create table bar (col1 integer primary key, col2 text), i.e. num in colname
  • create table def requires a space between the final column and the final ')'
  • select count(*) from countries group by country_name

Release Requirements

  • complete docs
  • complete tutorial
  • document supported features