sources.browser: switch to use HPI + seanbreckenridge/browserexport #375

Merged 4 commits on Feb 10, 2023
102 changes: 56 additions & 46 deletions doc/SOURCES.org
@@ -10,9 +10,11 @@ import setup
for (name, description), vals in setup.DEPS_SOURCES.items():
# fuck org-ruby. promnesia[name] should be in quotes, but then it doesn't render as code. ugh.
# https://github.com/wallyqs/org-ruby/issues/45
vals = [v.split('>')[0] for v in vals]
if len(vals) == 0:
continue
print(f"- ~pip3 install --user promnesia[{name}]~")
print(f' ')
vals = [v.split('>')[0] for v in vals]
print(f' {description}: {", ".join(vals)}')
#+end_src

@@ -35,9 +37,6 @@ for (name, description), vals in setup.DEPS_SOURCES.items():
- ~pip3 install --user promnesia[org]~

dependencies for sources.org: orgparse
- ~pip3 install --user promnesia[telegram]~

dependencies for sources.telegram: dataset
:end:

Alternatively, you can just install all of them in bulk: ~pip3 install --user promnesia[all]~.
@@ -47,44 +46,41 @@ Alternatively, you can just install all of them in bulk: ~pip3 install --user pr

These are included with the current Promnesia distribution:

#+begin_src python :python "with_my python3" :dir ../src :exports output :results output drawer
#+begin_src python :dir ../src :exports output :results output drawer
print('\n') # fix github org-mode issue with drawers

import ast
from pathlib import Path
import pkgutil
import importlib
import inspect
import os

indent = lambda s: ''.join(' ' + l for l in s.splitlines(keepends=True))

git_root = Path('.').absolute().parent

from promnesia.common import Results

import promnesia.sources as pkg
for importer, name, ispkg in sorted(pkgutil.walk_packages(
path=pkg.__path__,
prefix=pkg.__name__+'.'
), key=lambda x: x[1]):
if name in {
# TODO damn, these modules need dependencies...
'promnesia.sources.browser',
'promnesia.sources.markdown',
'promnesia.sources.org',
'promnesia.sources.plaintext',
src = git_root / 'src'

for f in sorted((src / 'promnesia/sources').rglob('*.py')):
mp = f.relative_to(src)
module_name = str(mp.with_suffix('')).replace(os.sep, '.')
if module_name in {
'promnesia.sources.browser_old', # deprecated
'promnesia.sources.takeout_legacy', # deprecated
'promnesia.sources.guess',
'promnesia.sources.demo',
}:
continue
m = importlib.import_module(name)
public = [(k, v) for k, v in inspect.getmembers(m) if not k.startswith('_')]
indexers = [(k, v) for k, v in public if getattr(v, '__annotations__', {}).get('return') == Results]
assert len(indexers) > 0, name
for k, i in indexers:
# print(inspect.signature(i))
link = '../' + str(Path(m.__file__).relative_to(git_root))
print(f'- [[file:{link}][{name}]]')
d = m.__doc__
if d is not None:
print(indent(d))
a: ast.Module = ast.parse(f.read_text())
has_index = False
for x in a.body:
if isinstance(x, ast.FunctionDef) and x.name == 'index':
has_index = True
if not has_index:
continue
link = '../' + str(f.relative_to(git_root))
print(f'- [[file:{link}][{module_name}]]')
doc = ast.get_docstring(a, clean=False)
if doc is not None:
print(indent(doc))
#+end_src

#+RESULTS:
@@ -98,7 +94,11 @@ for importer, name, ispkg in sorted(pkgutil.walk_packages(
- can index most of plaintext files, including source code!
- autodetects Obsidian vault and adds `obsidian://` app protocol support [[file:../src/promnesia/sources/obsidian.py][promnesia.sources.obsidian]]
- autodetects Logseq graph and adds `logseq://` app protocol support [[file:../src/promnesia/sources/logseq.py][promnesia.sources.logseq]]


- [[file:../src/promnesia/sources/browser.py][promnesia.sources.browser]]

Uses [[https://github.com/karlicoss/HPI][HPI]] for visits from web browsers.

- [[file:../src/promnesia/sources/fbmessenger.py][promnesia.sources.fbmessenger]]

Uses [[https://github.com/karlicoss/HPI][HPI]] for the messages data.
@@ -107,10 +107,9 @@ for importer, name, ispkg in sorted(pkgutil.walk_packages(

Uses [[https://github.com/karlicoss/HPI][HPI]] github module

- [[file:../src/promnesia/sources/guess.py][promnesia.sources.guess]]
- [[file:../src/promnesia/sources/html.py][promnesia.sources.html]]
- [[file:../src/promnesia/sources/hackernews.py][promnesia.sources.hackernews]]

Extracts links from HTML files
Uses [[https://github.com/karlicoss/HPI][HPI]] dogsheep module to import HackerNews items.

- [[file:../src/promnesia/sources/hypothesis.py][promnesia.sources.hypothesis]]

@@ -133,12 +132,25 @@ for importer, name, ispkg in sorted(pkgutil.walk_packages(
Uses [[https://github.com/karlicoss/HPI][HPI]] for Roam Research data

- [[file:../src/promnesia/sources/rss.py][promnesia.sources.rss]]

Uses [[https://github.com/karlicoss/HPI][HPI]] for RSS data.

- [[file:../src/promnesia/sources/shellcmd.py][promnesia.sources.shellcmd]]

Greps out URLs from an arbitrary shell command results.

- [[file:../src/promnesia/sources/signal.py][promnesia.sources.signal]]

Collects visits from Signal Desktop's encrypted SQLite db(s).

- [[file:../src/promnesia/sources/smscalls.py][promnesia.sources.smscalls]]

Uses [[https://github.com/karlicoss/HPI][HPI]] smscalls module

- [[file:../src/promnesia/sources/stackexchange.py][promnesia.sources.stackexchange]]

Uses [[https://github.com/karlicoss/HPI][HPI]] for Stackexchange data.

- [[file:../src/promnesia/sources/takeout.py][promnesia.sources.takeout]]

Uses HPI [[https://github.com/karlicoss/HPI/blob/master/doc/MODULES.org#mygoogletakeoutpaths][google.takeout]] module
@@ -147,16 +159,6 @@ for importer, name, ispkg in sorted(pkgutil.walk_packages(

Uses [[https://github.com/fabianonline/telegram_backup#readme][telegram_backup]] database for messages data

- [[file:../src/promnesia/sources/viber.py][promnesia.sources.viber]]

Uses all local SQLite files found in your Viber Desktop configurations:
usually in =~/.ViberPC/**/viber.db= (one directory for each telephone number).

- [[file:../src/promnesia/sources/signal.py][promnesia.sources.signal]]

When path(s) given, uses the SQLite inside Signal-Desktop's configuration directory
(see the sources for more parameters & location of the db-file for each platform)

- [[file:../src/promnesia/sources/twitter.py][promnesia.sources.twitter]]

Uses [[https://github.com/karlicoss/HPI][HPI]] for Twitter data.
@@ -165,10 +167,18 @@ for importer, name, ispkg in sorted(pkgutil.walk_packages(

Clones & indexes Git repositories (via sources.auto)

- [[file:../src/promnesia/sources/viber.py][promnesia.sources.viber]]

Collects visits from Viber desktop app (e.g. `~/.ViberPC/XYZ123/viber.db`)

- [[file:../src/promnesia/sources/website.py][promnesia.sources.website]]

Clones a website with wget and indexes via sources.auto

- [[file:../src/promnesia/sources/zulip.py][promnesia.sources.zulip]]

Uses [[https://github.com/karlicoss/HPI][HPI]] for Zulip data.

:end:
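
The regenerated block in this file no longer imports each source module. It parses the file with `ast` and lists the module only if it defines a top-level `index` function, so a source whose dependencies are not installed can still be documented. A minimal standalone sketch of that check follows; the helper names `defines_index` and `module_docstring` are illustrative and do not appear in the PR.

```python
import ast
from typing import Optional


def defines_index(source: str) -> bool:
    """Return True if the module source has a top-level function named 'index'."""
    tree = ast.parse(source)
    # Only the module body is inspected, so methods and nested functions
    # called 'index' are deliberately not counted.
    return any(
        isinstance(node, ast.FunctionDef) and node.name == 'index'
        for node in tree.body
    )


def module_docstring(source: str) -> Optional[str]:
    """Return the raw module docstring, or None if the module has none."""
    # clean=False keeps the original indentation, as in the block above.
    return ast.get_docstring(ast.parse(source), clean=False)
```

Because nothing is imported, a syntax error in a source file would still raise from `ast.parse`, but a missing third-party dependency would not.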


8 changes: 8 additions & 0 deletions src/promnesia/common.py
@@ -586,3 +586,11 @@ def measure(tag: str='', *, logger, unit: str='ms'):
mult = {'s': 1, 'ms': 10**3, 'us': 10**6}[unit]
xx = secs * mult
logger.debug(f'[{tag}]: {xx:.1f}{unit} elapsed')


def is_sqlite_db(x: Path) -> bool:
return x.is_file() and mime(x) in {
'application/x-sqlite3',
'application/vnd.sqlite3',
# TODO this mime can also match wal files/journals, not sure
}
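
The TODO above notes that the MIME check may also match WAL or journal files. One possible complement, which is not part of this PR, is to compare the first 16 bytes of the file against the header string that every SQLite 3 database file starts with. The sketch below assumes a hypothetical helper name, `has_sqlite_header`, and does not use promnesia's `mime` function.

```python
from pathlib import Path

# Every SQLite 3 database file begins with this 16-byte header string.
SQLITE_MAGIC = b'SQLite format 3\x00'


def has_sqlite_header(path: Path) -> bool:
    """Return True if 'path' is a regular file starting with the SQLite 3 header."""
    if not path.is_file():
        return False
    with path.open('rb') as f:
        return f.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
```

A freshly created database with no tables can be a zero-byte file, so this check would reject it; whether that matters depends on how the caller uses the result.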
3 changes: 3 additions & 0 deletions src/promnesia/sources/auto.py
@@ -1,6 +1,9 @@
"""
- discovers files recursively
- guesses the format (orgmode/markdown/json/etc) by the extension/MIME type
- can index most of plaintext files, including source code!
- autodetects Obsidian vault and adds `obsidian://` app protocol support [[file:../src/promnesia/sources/obsidian.py][promnesia.sources.obsidian]]
- autodetects Logseq graph and adds `logseq://` app protocol support [[file:../src/promnesia/sources/logseq.py][promnesia.sources.logseq]]
"""

import csv