Ideas #9

Open
gnusupport opened this issue Jan 19, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

gnusupport commented Jan 19, 2023

In my opinion, the package `ekg' is not opinionated but rather human
friendly. Personally, I can't see it as a substitute for other
applications; I see it as a unique program.

So I propose changing your introduction: "The ekg module is a simple but opinionated note taking application. It is a substitute for such other emacs applications such as org-roam or denote. ekg stands for emacs knowledge graph."

There are a few core ideas driving the design of ekg. The first is
that a title and a tag are the same thing.

If I may say, I deal with many tags. I consider a tag to be a class of
attributes or properties of an object.

I have elementary objects. They have their types and subtypes, types
can be actionable or not, there may be additional properties, there
are relations, etc. All those are properties in some sense.

Tags are properties that cut across everything and are important in
themselves. I use the notion of tag types as well:

 1          Default
 2          Skill
 3          Topic
 4          Language
 5          Action
 6          Place
 7          Sales
 8          Computer

And the notion that the user should be able to enter tag types.

What we like in databases is the use of intersection. To me it implies
that the more distinct sets of properties there are, the better we can
pinpoint various destinations.

Practically it means we can better find the note we are looking for.

We can more easily find relationships between objects.

Tags as properties separate from the object name are useful to create
more intersections. Eliminating tags eliminates that usefulness.

Back to searching, I have mentioned intersections. That is the basic
principle by which various database-based searches work. There are
different tables, different columns, types, classes defined in the
databases.

When searching we can then design functions such as:

  1. searching by tags only
  2. searching by tags and including name words as tags
  3. etc.
  4. etc. various combinations

As I understand your idea, you wish to say that searching through
tags and names of objects is helpful. Sure it is.

But it is not the only way of making intersections; there are number 3
and number 4, and X different ways of making intersections.
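
As a hedged illustration of the intersection idea above, the following
Python sketch uses the standard sqlite3 module with an in-memory
database. The tables notes, tags and tagging, their columns, and the
function names are hypothetical; they are not taken from ekg or from
any schema in this thread. The query returns only the notes that carry
every requested tag.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database; all names here are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE notes   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE tags    (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE tagging (note_id INTEGER NOT NULL REFERENCES notes(id),
                              tag_id  INTEGER NOT NULL REFERENCES tags(id));
    """)
    conn.executemany("INSERT INTO notes VALUES (?, ?)",
                     [(1, "ekg introduction"),
                      (2, "Emacs buffers"),
                      (3, "PostgreSQL search")])
    conn.executemany("INSERT INTO tags VALUES (?, ?)",
                     [(1, "Emacs"), (2, "ekg"), (3, "Database")])
    conn.executemany("INSERT INTO tagging VALUES (?, ?)",
                     [(1, 1), (1, 2), (2, 1), (3, 3)])
    return conn


def notes_with_all_tags(conn, tag_names):
    """Return the names of notes that carry every tag in tag_names."""
    wanted = sorted(set(tag_names))
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    rows = conn.execute(f"""
        SELECT notes.name
        FROM notes
        JOIN tagging ON tagging.note_id = notes.id
        JOIN tags    ON tags.id = tagging.tag_id
        WHERE tags.name IN ({placeholders})
        GROUP BY notes.id
        HAVING COUNT(DISTINCT tags.id) = ?   -- every wanted tag is present
        ORDER BY notes.name
    """, (*wanted, len(wanted))).fetchall()
    return [name for (name,) in rows]
```

With this sample data, asking for "Emacs" alone returns both Emacs
notes, while asking for "Emacs" and "ekg" together narrows the result
to a single note.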

If we think of merging various properties into "one" to provide better
search results, then I can recommend full text search; one reference
is here:

Hyperscope full text search with PostgreSQL:
https://hyperscope.link/3/6/7/6/8/Hyperscope-full-text-search-with-PostgreSQL-36768.html

With it, I can update the tokens to include the names of the tags, the
name of the object, description, text, language, country, related
currency, names of related people, etc.

I can't say that excluding tags alone is beneficial, but I can say
that including tags is beneficial in various search functions.

I have given you an example in PostgreSQL. In SQLite, full text
search exists only as an extension (FTS5), and building on SQLite is
less productive for the future, as it is a single-user database.

Build products that are multi-user and collaboration based.

This isn’t unique to ekg, other tools such as Logseq also consider
tags to be equivalent to pages of the same name, although this
functionality is limited since tags can only be just one word.

In my work I have the table tags:

                                              Table "public.tags"
┌───────────────────┬──────────────────────────┬───────────┬──────────┬───────────────────────────────────────┐
│      Column       │           Type           │ Collation │ Nullable │                Default                │
├───────────────────┼──────────────────────────┼───────────┼──────────┼───────────────────────────────────────┤
│ tags_id           │ integer                  │           │ not null │ nextval('tags_tags_id_seq'::regclass) │
│ tags_datecreated  │ timestamp with time zone │           │ not null │ CURRENT_TIMESTAMP                     │
│ tags_datemodified │ timestamp with time zone │           │          │                                       │
│ tags_usercreated  │ text                     │           │ not null │ CURRENT_USER                          │
│ tags_usermodified │ text                     │           │ not null │ CURRENT_USER                          │
│ tags_name         │ text                     │           │ not null │                                       │
│ tags_description  │ text                     │           │          │                                       │
│ tags_languages    │ integer                  │           │ not null │ 1                                     │
│ tags_tag1         │ integer                  │           │          │                                       │
│ tags_tag2         │ integer                  │           │          │                                       │
│ tags_tag3         │ integer                  │           │          │                                       │
│ tags_tagtypes     │ integer                  │           │ not null │ 1                                     │
│ tags_hidden       │ boolean                  │           │ not null │ false                                 │
│ tags_people       │ integer                  │           │          │                                       │
│ tags_rank         │ integer                  │           │ not null │ 0                                     │
└───────────────────┴──────────────────────────┴───────────┴──────────┴───────────────────────────────────────┘

You may notice that a tag has its own properties. It has its name, and
because a tag is addressed by its unique ID, the name can have
spaces. It may be long. It can contain any characters. It can have a
description. It may be in a different language.

It may be tagged by three other tags. Why is that useful? Tag any
object once, and the computer will update it with any additional tags.
If I tag something with "Video", the computer will tag it with "Media"
and "Video", as maybe that is what the user wants. I do not have to
keep adding relevant synonyms or related tags; I can add just one of
them. If that feature becomes very useful, I would simply provide
tagging of tags.
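
The "Video" implies "Media" behaviour can be sketched roughly as
follows. This is only an assumption about how such a feature might
work, written in Python with sqlite3; the columns tag1, tag2 and tag3
loosely mirror tags_tag1 to tags_tag3 from the table above, while the
simplified tagging table and the function names are hypothetical.

```python
import sqlite3


def make_db():
    """In-memory database where the tag 'Video' implies the tag 'Media'."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tags (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE,
            tag1 INTEGER REFERENCES tags(id),  -- up to three implied tags
            tag2 INTEGER REFERENCES tags(id),
            tag3 INTEGER REFERENCES tags(id)
        );
        CREATE TABLE tagging (
            object_id INTEGER NOT NULL,
            tag_id    INTEGER NOT NULL REFERENCES tags(id),
            UNIQUE (object_id, tag_id)
        );
    """)
    conn.execute("INSERT INTO tags (id, name) VALUES (1, 'Media')")
    conn.execute("INSERT INTO tags (id, name, tag1) VALUES (2, 'Video', 1)")
    return conn


def tag_object(conn, object_id, tag_name):
    """Attach tag_name to the object, plus the tags it directly implies."""
    row = conn.execute(
        "SELECT id, tag1, tag2, tag3 FROM tags WHERE name = ?", (tag_name,)
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown tag: {tag_name}")
    for tag_id in row:
        if tag_id is not None:
            # OR IGNORE keeps repeated tagging harmless.
            conn.execute(
                "INSERT OR IGNORE INTO tagging (object_id, tag_id) VALUES (?, ?)",
                (object_id, tag_id),
            )


def tags_of(conn, object_id):
    """Return the sorted tag names attached to an object."""
    rows = conn.execute("""
        SELECT tags.name FROM tagging
        JOIN tags ON tags.id = tagging.tag_id
        WHERE tagging.object_id = ?
        ORDER BY tags.name
    """, (object_id,)).fetchall()
    return [name for (name,) in rows]
```

The sketch follows implied tags one level deep only; following chains
of implied tags would be the "tagging of tags" generalisation.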

Because tags are addressed by ID as integer and not by name, I can
rename tags on the object without losing the tag from the object. All
objects then appear with the renamed tag!
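
A minimal Python sketch of why addressing tags by ID makes renaming
safe, assuming a simplified two-table layout (the table, column and
function names are hypothetical, not the schema shown above): the
tagging rows store only the integer ID, so changing the name touches a
single row in tags.

```python
import sqlite3


def make_db():
    """Two objects (100 and 101) tagged with the tag whose ID is 1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tags    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE tagging (object_id INTEGER NOT NULL,
                              tag_id    INTEGER NOT NULL REFERENCES tags(id));
        INSERT INTO tags VALUES (1, 'emacs');
        INSERT INTO tagging VALUES (100, 1);
        INSERT INTO tagging VALUES (101, 1);
    """)
    return conn


def rename_tag(conn, tag_id, new_name):
    """Rename a tag; no tagging row needs to change."""
    conn.execute("UPDATE tags SET name = ? WHERE id = ?", (new_name, tag_id))


def objects_tagged(conn, tag_name):
    """Return the sorted IDs of objects carrying the tag named tag_name."""
    rows = conn.execute("""
        SELECT tagging.object_id FROM tagging
        JOIN tags ON tags.id = tagging.tag_id
        WHERE tags.name = ?
        ORDER BY tagging.object_id
    """, (tag_name,)).fetchall()
    return [object_id for (object_id,) in rows]
```

After rename_tag(conn, 1, "GNU Emacs"), both objects are found under
the new name and none under the old one.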

Then there are tag types. Very useful.

Maybe there is an elementary object, like a note, speaking about a
"Technical School", but that the school provides the skill of
"Optician" cannot be derived from the name alone. That is one example
among many of why tags are useful.

It is, however, useful to concatenate tags with the name and search
among them; that is one of the intersections that are helpful to a
human.

In org-roam, a tag is just a tag, so you can have a note called
“emacs” and a tag called “emacs”, but these are not related.

Okay

ekg takes the idea a step further: there are (mostly) no titles,
only tags. So, instead of writing text in a note called “emacs”,
just write a note and tag it with “emacs”. There is no “title”, only
tags.

In my opinion that design is backwards, it does not help.

If it is searching by name, or splitting the name into words to search
by those words, then it is not a "tag".

A tag is a property separate from the object. It may have a different
name than anything in the object. I can have "USD" for US dollar as a
tag.

Tags are properties of the object that have no special group or class.

It is useful to have tags in existence which are not related to any
object. This makes the work easier for a human. Say you have tags for
currencies such as "USD", "EUR", "GBP", and you wish to tag an object
with one of them, but at that moment no object carries "GBP" yet.
Instead of writing it out every single time, you can prepare the tag
list so the user can select it easily.

Elementary Objects:
https://www.dougengelbart.org/content/view/110/460/#2a1a

My elementary objects have the "Currency" property directly attached,
and whatever is more important becomes directly attached to the
object. Tags are there as detached properties that may belong to any
object.

Separate functions can be made to automatically relate objects to
tags, such as making tags out of the name, as that seems to be what
you designed.

An example is the object named "NonGNU Emacs Lisp Package Archive",
which may be automatically tagged with tags such as "Emacs" and
"Archive".

If you write another note about emacs, also tag it “emacs”, and
maybe something else too. Or tag it something more involved, like an
idea: “emacs’s power derives from putting all data in buffers, and
making all commands deal with buffers.” That’s a perfectly fine tag,
and if you notice a connecting idea, you can tag it with this as
well.

That is right and good. My tags may also be of arbitrary
length.

Though tags are useful because they are simple ideas, not complex
ones.

Their usefulness is derived from their combination, not from their
quality.

Their meaning shall be elementary meaning, not complex meaning.

Their purpose shall be generation of intersections.

A tag like "Emacs" is less useful, as it would find anything about
Emacs. Then "Emacs" combined with "ekg" is more useful. Isn't it?

Using name of object to generate tags is useful. But excluding tags as
such is not.
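
Generating tags from an object's name, as in the "NonGNU Emacs Lisp
Package Archive" example earlier in this comment, could look roughly
like the Python sketch below. The function suggest_tags is
hypothetical, and matching whole words case-insensitively against a
list of already known tags is an assumption, not a description of ekg.

```python
import re


def suggest_tags(name, known_tags):
    """Return the known tags that occur as whole words in name.

    Matching ignores case; results keep the order in which the words
    appear in name, and each tag is suggested at most once.
    """
    by_folded = {tag.casefold(): tag for tag in known_tags}
    found = []
    for word in re.findall(r"\w+", name):
        tag = by_folded.get(word.casefold())
        if tag is not None and tag not in found:
            found.append(tag)
    return found
```

Because the tags exist independently of the name, the suggestions can
be combined with tags the name never mentions, which is what keeps the
extra intersections available.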

The advantage of this method is that it solves something that has
bothered me for a while about the recent suite of tools like
org-roam: backlinks are non-symmetrical. If you enter a note in your
org-roam daily about emacs, and link it to the emacs note, then when
you go to the emacs note, you have to explicitly enable the
backlinks buffer to see the daily entry where you first entered
it. Systems such as Logseq and the original Roam have backlinks
alongside normal content, but this doesn’t seem possible in emacs,
where a buffer of a file is expected show the file, and tricks with
overlays can’t solve the issue. Even if it could, I want a system in
which it doesn’t matter where you enter the data, it shows up in the
original place the same as everywhere else it is linked to, not as a
backlink, but just as part of the content. Having notes with no
title, only tags, makes this possible, because there is no longer a
difference between linking and writing in the context in, both are
denoted by tags.

In my opinion your solution is only one of many solutions. It does
not have to be so; it can be implemented in various ways.

As a consequence of this design, notes can be small, because to add
another note to a subject, you don’t need to append to an existing
note, you can create another note.

That is right, multiple notes shall be available for any possible
relation.

Additionally, ekg has another key difference: it uses sqlite instead
of the filesystem. When notes are small and do not have titles,
files don’t make a lot of sense anymore.

The file system is one way of "tagging" and sorting files. It has
directories, filenames, a hierarchical structure and access by a
variety of means.

It is up to the user to "sort" his stuff and relate items to each
other. It is not as flexible as a database.

Additionally, the filesystem is limited. Even in org-roam, which
uses it, it needs to be augmented with sqlite anyway to enable fast
querying of tags and other operations. The sqlite-only approach also
means it is much easier to make certain kinds of changes, since they
only involve changing the database and not the text as well. In
general, text and data are separated as much as possible here, so
there’s no need or desire for the text to have to store data as
well, we leave that completely to the database.

In my opinion your introduction is difficult to comprehend. It would
be good if you made a video or screenshots.

Prefixed tags

Another concept, loosely applied in ekg is that of tags with
standard prefixes. By default, date tags are prefixed with
“date/”. This is a way to distinguish date tags from other kinds of
tags. Most tags shouldn’t need it, but it often is useful to have
prefixes to group tags in some way. For instance, perhaps all idea
tags should be prefixed with “idea/”. In my ekg repository I use in
my company, I have “person/” as a tag prefix for my coworker’s
username.

We speak here of tag types. Think of implementing them. If you
implement a "Default" tag type and let the user add any other tag
type, you will give users a useful system. Users can then decide that
a tag belongs to tag type "Idea", or "Skill", or maybe "Time".
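
One possible shape for user-defined tag types, sketched in Python with
sqlite3. The tables tag_types and tags and all function names are
hypothetical; the only parts taken from this comment are the built-in
"Default" type and the idea that users add further types themselves.

```python
import sqlite3


def make_db():
    """In-memory database that starts with only the 'Default' tag type."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tag_types (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE tags (
            id      INTEGER PRIMARY KEY,
            name    TEXT NOT NULL,
            type_id INTEGER NOT NULL DEFAULT 1 REFERENCES tag_types(id)
        );
        INSERT INTO tag_types (id, name) VALUES (1, 'Default');
    """)
    return conn


def add_tag_type(conn, name):
    """Let the user define a new tag type; returns its ID."""
    cur = conn.execute("INSERT INTO tag_types (name) VALUES (?)", (name,))
    return cur.lastrowid


def add_tag(conn, name, type_name="Default"):
    """Create a tag of the given type; the tag name may contain any characters."""
    row = conn.execute(
        "SELECT id FROM tag_types WHERE name = ?", (type_name,)
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown tag type: {type_name}")
    cur = conn.execute(
        "INSERT INTO tags (name, type_id) VALUES (?, ?)", (name, row[0])
    )
    return cur.lastrowid


def tags_of_type(conn, type_name):
    """Return the sorted names of all tags of one type."""
    rows = conn.execute("""
        SELECT tags.name FROM tags
        JOIN tag_types ON tag_types.id = tags.type_id
        WHERE tag_types.name = ?
        ORDER BY tags.name
    """, (type_name,)).fetchall()
    return [name for (name,) in rows]
```

Since the type lives in its own column, a slash inside a tag name has
no special meaning, and narrowing to one type is a plain join rather
than a string prefix match.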

The benefit of this is that it’s now possible to narrow in on just
tags of a certain type if necessary.

We speak here basically of tags for tags. Or properties of the
tags. And I find it a very good notion for the future.

There are a few other types of prefixes commonly used for tags. One
is that titled resources have default tags that are prefixed with
“doc/”, followed by the name of the document. Removed tags are
prefixed with “trash/”, but these are normally invisible to the
user. There’s a section on these trash tags below which goes into
more detail.

I would not do it that way, as it invites human mistakes when a human
has to add a string with a slash; I would rather use tag types.

I have the table tagging. If a tag is removed from an object, it is
removed from the table tagging; the table relates a tag to an
elementary object or to people. But there is use for the tag to
remain in the database, for future selection of tags.

In this case a tag can be related to a people object or a document
object. There are hundreds of other tables, but it is for those two
that tags are most useful.

                                                    Table "public.tagging"
┌──────────────────────┬──────────────────────────┬───────────┬──────────┬───────────────────────────────────────────────────┐
│        Column        │           Type           │ Collation │ Nullable │                      Default                      │
├──────────────────────┼──────────────────────────┼───────────┼──────────┼───────────────────────────────────────────────────┤
│ tagging_id           │ integer                  │           │ not null │ nextval('peopletags_peopletags_id_seq'::regclass) │
│ tagging_datecreated  │ timestamp with time zone │           │ not null │ CURRENT_TIMESTAMP                                 │
│ tagging_datemodified │ timestamp with time zone │           │          │                                                   │
│ tagging_usercreated  │ text                     │           │ not null │ CURRENT_USER                                      │
│ tagging_usermodified │ text                     │           │ not null │ CURRENT_USER                                      │
│ tagging_tags         │ integer                  │           │ not null │                                                   │
│ tagging_people       │ integer                  │           │          │                                                   │
│ tagging_hyobjects    │ integer                  │           │          │                                                   │
│ tagging_description  │ text                     │           │          │                                                   │
└──────────────────────┴──────────────────────────┴───────────┴──────────┴───────────────────────────────────────────────────┘

My tagging is such that I can tag with existing tags or newly created
tags. Tag searching uses the function completing-read-multiple to let
me find multiple tags.

Owner

ahyatt commented Jan 19, 2023

Thank you very much for all these excellent notes.

Let me respond to some of this:

First, I agree that we really should have a full-text search. That would be the responsibility of the triples package, which I also maintain. I need to try this out, I hope that the emacs 29 sqlite has the necessary capabilities to do this. Anyway, I completely agree that it would contain tags and text.

About tags with tags, yes, right now ekg is very close to being able to do that. It just needs the ability to have tags have notes, which is very natural IMHO and fits in with how at least I want to organize things. Since the UI is note-driven, once we have notes, the user can tag the tag with whatever tags they want. Whether this has any interesting effects is something I'll have to think about, your experience on this is interesting though.

I also do have a video I've recorded, but I don't want to do any sort of advertisement of this package until it is in a repository. I'll see if I can make the text a bit more understandable, though.

Author

gnusupport commented Jan 20, 2023

In PostgreSQL it is easy to provide full text search. I strongly suggest switching to PostgreSQL and going away from single user model, single computer model. Collaboration with multiple users is so much more useful.

I have handled searches by using a function like the following:

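(defun rcd-sql-search-snippet-for-and-column (column query &optional operator logic)
  "Return an SQL snippet matching COLUMN against each word of QUERY.
OPERATOR defaults to \"~*\"; LOGIC, which joins the conditions,
defaults to \"AND\"."
  (let* ((words (split-string query nil t (rx (any whitespace))))
         (operator (or operator "~*"))
         (logic (or logic "AND")))
    ;; One condition per word.  `sql-escape-string' is not part of
    ;; Emacs and is assumed to be defined elsewhere.
    (mapconcat
     (lambda (e)
       (concat " " column " " operator " "
               (sql-escape-string e) " "))
     words logic)))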

What it does is create an SQL query snippet with AND or OR logic for all words:

(rcd-sql-search-snippet-for-and-column "my_db_column" "query words I am searching for") ➜ " my_db_column ~* E'query' AND my_db_column ~* E'words' AND my_db_column ~* E'I' AND my_db_column ~* E'am' AND my_db_column ~* E'searching' AND my_db_column ~* E'for' "

and

(rcd-sql-search-snippet-for-and-column "my_db_column" "query words I am searching for" "LIKE" "OR") ➜ " my_db_column LIKE E'query' OR my_db_column LIKE E'words' OR my_db_column LIKE E'I' OR my_db_column LIKE E'am' OR my_db_column LIKE E'searching' OR my_db_column LIKE E'for' "

That helps in constructing SQL queries to search this and that. On my side this is the most used function.
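
For comparison, here is a rough Python sketch of the same word-by-word idea that binds the words as parameters instead of escaping them into the string. It targets SQLite through the standard sqlite3 module, so it uses LIKE with % wildcards rather than PostgreSQL's ~* operator; the function names, the notes table and the whitelists are hypothetical.

```python
import sqlite3

ALLOWED_OPERATORS = {"LIKE", "GLOB", "="}
ALLOWED_LOGIC = {"AND", "OR"}


def search_snippet(column, query, operator="LIKE", logic="AND"):
    """Return (sql_fragment, params) with one condition per word of query.

    column, operator and logic end up in the SQL text, so they are
    checked against whitelists; the words travel only as parameters.
    """
    if operator not in ALLOWED_OPERATORS or logic not in ALLOWED_LOGIC:
        raise ValueError("unsupported operator or logic")
    if not column.isidentifier():
        raise ValueError("column must be a plain identifier")
    words = query.split()
    if not words:
        raise ValueError("query contains no words")
    fragment = f" {logic} ".join(f"{column} {operator} ?" for _ in words)
    # LIKE needs wildcards to behave like a substring match.
    params = [f"%{w}%" if operator == "LIKE" else w for w in words]
    return fragment, params


def find_notes(conn, query, logic="AND"):
    """Search the hypothetical notes.name column for the words of query."""
    where, params = search_snippet("name", query, logic=logic)
    rows = conn.execute(
        f"SELECT name FROM notes WHERE {where} ORDER BY name", params
    ).fetchall()
    return [name for (name,) in rows]
```

Words that themselves contain % or _ still act as LIKE wildcards in this sketch; handling that would need an ESCAPE clause.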

Author

gnusupport commented Jan 20, 2023

For PostgreSQL:

Mastering PostgreSQL Tools: Full-Text Search and Phrase Search - Compose Articles:
https://compose.com/articles/mastering-postgresql-tools-full-text-search-and-phrase-search/

PostgreSQL: Documentation: 15: Chapter 12. Full Text Search:
https://www.postgresql.org/docs/15/textsearch.html

For SQLite:

SQLite FTS5 Extension:
https://www.sqlite.org/fts5.html
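
A small Python sketch of how tags and text might share one FTS5 index, as discussed earlier in this thread. It assumes the SQLite library behind the sqlite3 module was compiled with FTS5, which is common but not guaranteed; the virtual table notes_fts and the function names are hypothetical.

```python
import sqlite3


def build_index(notes):
    """Index notes given as (name, body, tag_names) tuples.

    Tag names are joined into one text column so that a single MATCH
    query searches the name, the body and the tags together.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE notes_fts USING fts5(name, body, tags)")
    conn.executemany(
        "INSERT INTO notes_fts (name, body, tags) VALUES (?, ?, ?)",
        [(name, body, " ".join(tags)) for name, body, tags in notes],
    )
    return conn


def search(conn, query):
    """Return names of matching notes, best match first.

    query uses FTS5 syntax: words separated by spaces must all match,
    in any of the indexed columns.
    """
    rows = conn.execute(
        "SELECT name FROM notes_fts WHERE notes_fts MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()
    return [name for (name,) in rows]
```

With a note named "Buffers" that is tagged "Emacs", the query "emacs buffers" finds it even though the word Emacs never appears in its text. Raw user input containing FTS5 operators or punctuation can raise a syntax error and would need quoting first.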

@gnusupport
Author

I suggest looking into following design:

TECHNOLOGY TEMPLATE PROJECT OHS Framework:
https://www.dougengelbart.org/content/view/110/460/

Objects are basic content packets of an arbitrary, user and developer
extensible nature. Types of elementary objects could contain:

  • text
  • graphics
  • equations
  • tables
  • spreadsheets
  • canned-images
  • video
  • sound
  • code elements, etc.

I strongly suggest being able to record any kind of object. It does not mean saving the video into the database. If the video is on the file system, it would mean indexing the video from the file system, or putting it into a file system structure devised by the program, with the note having the type video.

I use that approach. See more at that link about the Open Hyperdocument System, as following that design makes programs most useful.

ahyatt added the enhancement label Jul 21, 2024