Improve support for users that want to deploy their own backend #25

Closed
4 tasks
jacksonh opened this issue Feb 12, 2022 · 81 comments
Comments

@jacksonh
Contributor

Currently Omnivore relies on a few GCP services to run, but open source users will likely want to deploy the api, web, and content fetching (puppeteer-parse) service to another platform. We need to come up with a list of target platforms and supported deployment configurations that are realistic for users wanting to deploy a minimalistic configuration.

Some of the services we currently rely on:

  • Pub/Sub & Cloud Tasks -- the task manager is used to schedule jobs such as content fetching
  • Cloud Storage -- we use Cloud Storage as our main blob storage for PDF files
  • Cloud Functions -- we deploy the puppeteer-parse service to a Google Cloud Function

Other services we are using:

  • SendGrid Inbound Parse -- to receive incoming email and invoke a webhook (running on a GCP Cloud Function)
@Limezy

Limezy commented Apr 27, 2022

Hi, thanks for trying to make Omnivore self-hostable!
For the Cloud Storage part, could it be replaced by a MinIO instance with minor modifications on your side?
For SendGrid, I guess a simple SMTP connector would be sufficient?
For the auth process, if you make your app compatible with passport.js, it would be very easy for the community to add connectors as per their needs. I would recommend OIDC as the primary choice.

@trashhalo

Google built this Go library that abstracts GCP services and has plugins for AWS, Azure, etc. https://gocloud.dev/

I wonder if a similar library exists for nodejs. 🤔

@menelic

menelic commented Jul 15, 2022

Thanks for a great tool - making this a Nextcloud app would make it easy to deploy for many non-technical users who have access to Nextcloud instances. The app fits well with the open Nextcloud ecosystem and meets a need not addressed by any Nextcloud app.

@Nevarro

Nevarro commented May 1, 2023

Could we get a progress update? Omnivore is exactly what I am looking for – only production-level self-hosting is missing.

@Coo-ops

Coo-ops commented May 9, 2023

Please update to fully open source. Thank you.

@jacksonh
Contributor Author

jacksonh commented May 9, 2023

Please update to fully open source. Thank you.

Hi @Coo-ops, the only piece that isn't open source is a PDF viewer library that we license. In the future we will try to replace this with pdf.js.

@obvionaoe

any new progress on this issue?

@r0bbie

r0bbie commented Nov 18, 2023

I really want to go all-in on Omnivore. Coming across from Wallabag (and having tried out a load of other options), the UX seems great, and I'm happy to see it is open source! So I've been keeping an eye on this. Right now the lack of ability to self-host (or at least to do so easily!), paired with the lack of any data export function (locking you in), makes me really hesitant.

If data export existed I'd at least be more willing to go with your hosted version for now, knowing it'll either be easy to migrate over when self-hosting is properly available (or to migrate my data to another solution altogether in the event self-hosting failed to materialise).

Wondered if there are any updates on this at present?

@stanthewizzard

Wouldn't deployment with Docker do the trick?
I want Omnivore in house.

@jerryzhang721

Is there a detailed tutorial for docker self-hosting?

@r0bbie

r0bbie commented Nov 27, 2023

@jerryzhang721 All I've been able to find are the extremely basic instructions in the readme (https://github.com/omnivore-app/omnivore#how-to-setup-local-development-computer), but I was simply unable to get this working when using a custom domain rather than a local IP/port. And I'm still extremely unclear on whether all the external cloud dependencies have been refactored out yet to allow proper self-hosting.

@axelson

axelson commented Dec 1, 2023

I'm very much hoping to self-host Omnivore as well!

I didn't see these docs posted in this issue so I'll post them here:

@grapemix

grapemix commented Dec 5, 2023

For the record, @lawrencegripper did contribute on k8s setup in #2966. Unfortunately, it is in WIP and he is unavailable.

@se-jaeger
Contributor

se-jaeger commented Dec 6, 2023

For the record, @lawrencegripper did contribute on k8s setup in #2966. Unfortunately, it is in WIP and he is unavailable.

Thanks for this pointer! I'm currently thinking about/planning to work on a Helm chart. Will probably start over the Christmas days.

Feel free to ping me or connect if you would like to support.

@grapemix

grapemix commented Dec 13, 2023

For the record, @lawrencegripper did contribute on k8s setup in #2966. Unfortunately, it is in WIP and he is unavailable.

Thanks for this pointer! I'm currently thinking about/planning to work on a Helm chart. Will probably start over the Christmas days.

Feel free to ping me or connect if you would like to support.

FYI, not sure if you have heard about https://bjw-s.github.io/helm-charts/docs/app-template/. It is pretty popular and can probably save you quite a lot of time in this case. Also, lots of homelab users have already deployed cloudnative-pg or something equivalent, so a script to bootstrap PG is likely not needed for them.

@se-jaeger
Contributor

se-jaeger commented Jan 7, 2024

For the record, @lawrencegripper did contribute on k8s setup in #2966. Unfortunately, it is in WIP and he is unavailable.

Thanks for this pointer! I'm currently thinking about/planning to work on a Helm chart. Will probably start over the Christmas days.

Feel free to ping me or connect if you would like to support.

Hi, sorry for the late response; there were some hurdles I had to overcome.

As @grapemix suggested, I used the bjw-s helm chart to set up a functioning instance (Web, API, content-fetch). You can find the current (WIP) version here: https://github.com/se-jaeger/omnivore

There are some things I want to improve. However, in the meantime, I'd love to get feedback from you:

  • documentation
  • health checks
  • RSS

One more remark. I built and pushed the images to my Docker Hub account: https://hub.docker.com/u/sejaeger
Working on #3177 would definitely improve the chart.

@se-jaeger
Contributor

FYI: already merged #3385

@Nevarro

Nevarro commented Jan 23, 2024

Any news on docker self-hosting?

@mbhkoay

mbhkoay commented Feb 4, 2024

Sharing my experience attempting to self-host this. I'm not a coder at all, so fixing some things is beyond my expertise.
I self-host stuff on my Unraid machine, so your mileage may vary.

  1. Git clone
    git clone https://github.com/omnivore-app/omnivore

  2. Change directory
    cd omnivore

  3. Adjust docker-compose
    a. Added lines to create a custom docker network.
    b. Replaced all required secrets and environment variables; refer to the docker compose file below.
    c. I think I messed some things up while updating the postgres password, so I ended up not changing them.

  4. Reverse proxy for port 3000 & 4000 (or the changed port number)

  5. Start docker
    docker compose up --detach

  6. Outstanding items & observations
    a. The readme says that to save pages, puppeteer-parse must run outside of docker. Not sure if this can be dockerised.
    b. by default the login is demo@omnivore.app, password: demo_password. I have not found any setting to change the password from within the app. I guess you can click "forgot password" and reset it via email, but I haven't tried and think it likely won't work.
    c. No way to disable the signup button yet
    d. content-fetch is not working, it throws an error about redisURL not supplied. Attempted to throw in a redis container to see if it works. Apparently not. You can see the additional section in the docker compose.
    e. se-jaeger's contribution using helm chart is something that I have yet to explore. (No experience at all with helm/Kubernetes etc.) He mentions the requirement of elastic-search, as well as an alternative to handle RSS Subscriptions.

version: '3'
services:
  postgres:
    image: "ankane/pgvector:v0.5.1"
    container_name: "omnivore-postgres"
    environment:
      - POSTGRES_USER=postgres #nochange?
      - POSTGRES_PASSWORD=postgres #nochange?
      - POSTGRES_DB=omnivore
      - PG_POOL_MAX=20
    healthcheck:
      test: "exit 0"
      interval: 2s
      timeout: 12s
      retries: 3
    expose:
      - 5432 #change-likely-not-required from 5432
    networks: #custom docker network
      - omnivore #custom docker network

  migrate:
    build:
      context: .
      dockerfile: ./packages/db/Dockerfile
    container_name: "omnivore-migrate"
    command: '/bin/sh ./packages/db/setup.sh' # Also create a demo user with email: demo@omnivore.app, password: demo_password
    environment:
      - PGPASSWORD=postgres #nochange?
      - POSTGRES_USER=postgres #nochange?
      - PG_HOST=postgres
      - PG_PASSWORD=app_pass #changeme-postgres-app-pass
      - PG_DB=omnivore
    depends_on:
      postgres:
        condition: service_healthy
    networks: #custom docker network
      - omnivore #custom docker network

  api:
    build:
      context: .
      dockerfile: ./packages/api/Dockerfile
    container_name: "omnivore-api"
    ports:
      - "4000:8080"
    healthcheck:
      test: ["CMD-SHELL", "nc -z 0.0.0.0 8080 || exit 1"]
      interval: 15s
      timeout: 90s
    environment:
      - API_ENV=local
      - PG_HOST=postgres
      - PG_USER=app_user #changeme-postgres-app-user
      - PG_PASSWORD=app_pass #changeme-postgres-app-pass
      - PG_DB=omnivore
      - PG_PORT=5432 #change-likely-not-required from 5432
      - PG_POOL_MAX=20
      - JAEGER_HOST=jaeger
      - IMAGE_PROXY_SECRET=aaaaaaaaaaaaaaaaaaa #changeme
      - JWT_SECRET=bbbbbbbbbbbbbbbbbbbbbbbbbbb #changemejwt
      - SSO_JWT_SECRET=ccccccccccccccccccccccc #changeme
      - CLIENT_URL=https://omnivore.some.domain #change-port-3000-if-required
      - GATEWAY_URL=https://api.omnivore.some.domain/api #not sure if need to change? originally http://localhost:8080/api
      - CONTENT_FETCH_URL=http://content-fetch:8080/?token=dddddddddddddddddddddddd #changemetoken
    depends_on:
      migrate:
        condition: service_completed_successfully
    networks: #custom docker network
      - omnivore #custom docker network

  web:
    build:
      context: .
      dockerfile: ./packages/web/Dockerfile
      args:
        - APP_ENV=prod
        - BASE_URL=https://omnivore.some.domain #changeme-domain-url e.g. https://omnivore.domain.com
        - SERVER_BASE_URL=https://api.omnivore.some.domain #changeme-api-server-domain-url e.g. https://api.omnivore.domain.com
        - HIGHLIGHTS_BASE_URL=https://omnivore.some.domain #changeme-domain-url e.g. https://omnivore.domain.com
    container_name: "omnivore-web"
    ports:
      - "3001:8080" #change-port-3000-if-required
    environment:
      - NEXT_PUBLIC_APP_ENV=prod
      - NEXT_PUBLIC_BASE_URL=https://omnivore.some.domain #changeme-domain-url e.g. https://omnivore.domain.com
      - NEXT_PUBLIC_SERVER_BASE_URL=https://api.omnivore.some.domain #changeme-api-server-domain-url e.g. https://api.omnivore.domain.com
      - NEXT_PUBLIC_HIGHLIGHTS_BASE_URL=https://omnivore.some.domain #changeme-domain-url e.g. https://omnivore.domain.com
    depends_on:
      api:
        condition: service_healthy
    networks: #custom docker network
      - omnivore #custom docker network

  content-fetch:
    build:
      context: .
      dockerfile: ./packages/content-fetch/Dockerfile
    container_name: "omnivore-content-fetch"
    ports:
      - "9090:8080"
    environment:
      - JWT_SECRET=bbbbbbbbbbbbbbbbbbbbbbbbbbb #changemejwt
      - VERIFICATION_TOKEN=dddddddddddddddddddddddd #changemetoken
      - REST_BACKEND_ENDPOINT=https://api.omnivore.some.domain/api #not sure if need to change? originally http://api:8080/api
#      - REDISURL=redis://omnivore-redis:6379 #redis
    depends_on:
      api:
        condition: service_healthy
    networks: #custom docker network
      - omnivore #custom docker network

#################   redis   ###############
#  omnivore-redis:
#    image: redis:latest
#    container_name: omnivore-redis
#    environment:
#      - TZ=Asia/Kuala_Lumpur
#    restart: always
#    networks:
#      - omnivore
#################   redis   ###############

networks: #custom docker network
  omnivore: #custom docker network
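
The `#changeme` placeholders in the file above (`IMAGE_PROXY_SECRET`, `JWT_SECRET`, `SSO_JWT_SECRET` and the content-fetch token) need real values before the instance is exposed through a reverse proxy. A minimal sketch of one way to generate them, assuming a POSIX-like shell with `/dev/urandom`; the helper `gen_secret` and the output file name `omnivore-secrets.env` are illustrative names and not part of Omnivore:

```shell
#!/bin/sh
# Sketch: generate random values for the secrets marked #changeme above.
# gen_secret and omnivore-secrets.env are hypothetical names, not Omnivore's.
set -eu

# Print 32 random bytes as 64 lowercase hex characters.
gen_secret() {
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

umask 077  # keep the generated file readable by the current user only
{
  printf 'IMAGE_PROXY_SECRET=%s\n' "$(gen_secret)"
  printf 'JWT_SECRET=%s\n' "$(gen_secret)"
  printf 'SSO_JWT_SECRET=%s\n' "$(gen_secret)"
  printf 'VERIFICATION_TOKEN=%s\n' "$(gen_secret)"
} > omnivore-secrets.env
```

In the compose file as posted, `JWT_SECRET` carries the same value in both `api` and `content-fetch`, and `VERIFICATION_TOKEN` carries the same value as the `token=` query parameter in `CONTENT_FETCH_URL`, so each generated value would presumably be pasted in both places.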

@se-jaeger
Contributor

se-jaeger commented Feb 4, 2024

Hi @mbhkoay,

thanks for this write up! Here are some pointers that may help.

3.c. I think I messed some things up while updating the postgres password, so I ended up not changing them.

In the https://github.com/omnivore-app/omnivore/blob/main/self-hosting/helm/values.yaml file, I added some hard-coded credentials (PG_DB, PG_USER) that are also hard-coded in the code base, which is why you can't change them easily.

6.b. by default the login is demo@omnivore.app, password: demo_password.

I added an environment variable that lets you turn off the creation of this default user: NO_DEMO_USER=1. However, if you register a new one, make sure to follow the steps documented here to verify it. (Normally, you would get an email that asks you to click a link.)

6.d. content-fetch is not working, it throws an error about redisURL not supplied. Attempted to throw in a redis container to see if it works. Apparently not. You can see the additional section in the docker compose.

I also stumbled across this. If you roll back to this commit (e44616b01), which is the latest one before Redis became required for content-fetch, it should be possible to run it.

I plan to dive into these changes and propose a solution for self-hosted instances.

Hope it helps. Cheers.
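
The rollback suggested above could be sketched as follows. This assumes the repository was cloned into `./omnivore` as in the earlier steps and that the abbreviated hash `e44616b01` resolves in that clone; `pin_to_commit` is a hypothetical helper name, not something the repository provides:

```shell
#!/bin/sh
# Sketch: check out a specific commit in an existing clone (detached HEAD).
# pin_to_commit is an illustrative helper, not part of the Omnivore repo.
set -eu

# usage: pin_to_commit <repo_dir> <commit>
pin_to_commit() {
  repo_dir=$1
  commit=$2
  git -C "$repo_dir" checkout --quiet --detach "$commit"
  # Print the commit that is now checked out, for confirmation.
  git -C "$repo_dir" rev-parse --short HEAD
}

# Only act if the clone from the earlier steps exists in the current directory.
if [ -d omnivore/.git ]; then
  pin_to_commit omnivore e44616b01
fi
```

After that, running `docker compose up --detach --build` from inside the clone should rebuild the images from the pinned source instead of reusing previously built ones.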

@mariusrugan

Hi @mbhkoay and @se-jaeger,
thanks both for your contributions,

It is still unclear to me whether Elastic is needed.
Looking at the docker-compose file in the project's root, I just see pgvector (postgres+pgvector).

thanks in advance!

@jacksonh
Contributor Author

Hey @mariusrugan, we actually just dropped the Elastic requirement recently. We're also in the middle of pulling out most of the GCP requirements and getting things down to two images: backend, which will both process async jobs and run the API, and content-fetch, which is the standalone service for fetching page content.

It's in a bit of flux right now, though, as we wrap up this work.

@jacksonh
Contributor Author

For the record, @lawrencegripper did contribute on k8s setup in #2966. Unfortunately, it is in WIP and he is unavailable.

Thanks for this pointer! I'm currently thinking about/planning to work on a Helm chart. Will probably start over the Christmas days.
Feel free to ping me or connect if you would like to support.

Hi, sorry for the late response; there were some hurdles I had to overcome.

As @grapemix suggested, I used the bjw-s helm chart to set up a functioning instance (Web, API, content-fetch). You can find the current (WIP) version here: https://github.com/se-jaeger/omnivore

There are some things I want to improve. However, in the meantime, I'd love to get feedback from you:

  • documentation
  • health checks
  • RSS

One more remark. I built and pushed the images to my Docker Hub account: https://hub.docker.com/u/sejaeger Working on #3177 would definitely improve the chart.

I think a lot of this is improved with our move to bullmq jobs instead of cloud functions. The backend service has health checks for both the api server and the queue-processor server that also handle graceful shutdown via SIGTERM. We've started running both in k8s for our services as well.
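
From the operator's side, health checks like these are typically consumed by polling until the service answers. A minimal sketch, assuming the API is published on host port 4000 as in the compose file earlier in this thread and probed with `nc -z`, the same check that file's healthcheck uses; `wait_until` is a hypothetical helper, not something Omnivore ships:

```shell
#!/bin/sh
# Sketch: retry a probe command about once per second until it succeeds
# or the given number of attempts is used up.
set -eu

# usage: wait_until <max_attempts> <command> [args...]
wait_until() {
  remaining=$1
  shift
  until "$@" >/dev/null 2>&1; do
    remaining=$((remaining - 1))
    if [ "$remaining" -le 0 ]; then
      return 1
    fi
    sleep 1
  done
}

# Example (commented out so the sketch has no side effects when run as-is):
# wait_until 90 nc -z localhost 4000 && echo "api is up"
```

On the shutdown side, `docker compose stop` sends SIGTERM and waits for a grace period before killing the container, which is the signal the backend is described as handling above.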

@stanthewizzard

When using Docker, is the Chrome extension able to connect to it? Thanks

@jacksonh
Contributor Author

You'd have to build the extension yourself. For security, the extension includes a content security policy that specifies the domains it can connect to.

@stanthewizzard

It would be awesome to have a setting to override that default :)
Thanks

@ClariNerd617

Well, so much for that. Guess we can always fork it and figure it out on our own, because I for one am not paying for Readwise Reader again.

@stanthewizzard

Linkwarden for a moment on my side

@thiswillbeyourgithub

Linkwarden for a moment on my side

Does it support highlighting?

@ghost

ghost commented Oct 31, 2024

Linkwarden for a moment on my side

Does it support highlighting?

Try readeck! They just added omnivore Import yesterday

@stanthewizzard

Linkwarden for a moment on my side

Does it support highlighting?

IDK
But very active

@stanthewizzard

Linkwarden for a moment on my side

Does it support highlighting?

Try readeck! They just added omnivore Import yesterday

Deleted my docker last week for linkwarden
It was a good service

@asandikci

Linkwarden vs Readeck vs Wallabag?
Which one do you suggest? I need a highlight feature.

@Gandalf-the-Blue

Wouldn't recommend Wallabag. Go for Linkwarden or Linkding.

@asandikci

Why not wallabag?

@Limezy

Limezy commented Oct 31, 2024

I have been using Wallabag successfully for almost 10 years. A few pages are still not scraped well, but it works well. I didn't try the others though.

@philipp-koch

Try readeck! They just added omnivore Import yesterday

I have also decided to switch to Readeck. It's a great piece of software — free, self-hosted, and “it just works”! It comes with browser extensions for Firefox and Chrome. And for those like me who want to self-host on their Synology NAS, here's a really easy how-to.

I find it remarkable that Readeck's author added an Omnivore import option in just one day! 🤯 You can use said import functionality by clicking the “three dots”-button next to the “add new link” bar and choose “import bookmarks”:
[screenshot]

You can then choose Omnivore import:
[screenshot]

You need an API key (which can be added via Omnivore settings > API keys). The process worked flawlessly for me and imported all my content, including all tags, without any errors whatsoever.

Overall, highly recommended!

@nicoska84

Try readeck! They just added omnivore Import yesterday

I have also decided to switch to Readeck. It's a great piece of software — free, self-hosted, and “it just works”! It comes with browser extensions for Firefox and Chrome. And for those like me who want to self-host on their Synology NAS, here's a really easy how-to.

I find it remarkable that Readeck's author added an Omnivore import option in just one day! 🤯 You can use said import functionality by clicking the “three dots”-button next to the “add new link” bar and choose “import bookmarks”:

[screenshot]

You can then choose Omnivore import:

[screenshot]

You need an API key (which can be added via Omnivore settings > API keys). The process worked flawlessly for me and imported all my content, including all tags, without any errors whatsoever.

Overall, highly recommended!

Do you know if there is a way to save from iPhone? Maybe a shortcut?
I could give a try

@philipp-koch

Do you know if there is a way to save from iPhone? Maybe a shortcut?
I could give a try

I'm not an Apple user. As far as I know, there's no dedicated mobile app (neither for Android nor iPhone), but Readeck itself is a mobile friendly web app, and as I said, there are extensions for Firefox and Chrome, which are usable from the iPhone as well, right? So, you could just do that: Use the browser extension or the web app itself, or install it as PWA.

@ghost

ghost commented Oct 31, 2024

There is one at https://shareshortcuts.com/download/2696-send-page-to-readeck.html

@Mikilio

Mikilio commented Nov 2, 2024

I am subscribed to this issue to receive news on the state of the actual issue in the title.
I would ask people who want to suggest alternatives to Omnivore because of the current controversy to express their frustrations in the issues dedicated to that.

For anyone looking to self-host Omnivore: At the moment, the PR #4465 looks promising and would close this issue.

linear bot closed this as not planned on Nov 18, 2024
@Mikilio

Mikilio commented Nov 19, 2024

bump

@Limezy

Limezy commented Nov 19, 2024

You should forget about Omnivore. I guess the linear bot gave us a clear answer on how much the new owner cares about self-hosting. Did you have a look at Wallabag?

@maa-x
Contributor

maa-x commented Nov 19, 2024

Seems like a shame to dump it when I've got it running self-hosted. There's a PR which improves that (though I have yet to test it).

It might be smart however to fork the repo for good?

@driversti

driversti commented Nov 19, 2024

Hey folks. Have you read the article on their blog? Omnivore is shutting down at the end of November.

@thiswillbeyourgithub

There does not seem to be an alternative that allows both webpages AND PDFs and highlighting. And even then, I think most import features of the other services don't include PDFs or highlights.

I think I could get by just fine for a lot of years if I could just docker compose up an Omnivore instance. But this is not yet doable, right?

@adlpz

adlpz commented Nov 19, 2024

@maa-x I agree it'd be a shame to have all the work done in the last few weeks be wasted. A fork that the community can get behind would be best.

I have no clue however how such a migration could be organized. Maybe someone more experienced in open source governance could help.

@thiswillbeyourgithub I have just downloaded @Podginator's PR branch (#4465), built the images and self-hosted it. It appears to work as expected, though I haven't done much beyond navigating around, manually adding some links and making some highlights.

But surely appears to be something within reach.
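
One way to reproduce this is to fetch the pull request's head ref, which GitHub exposes as `refs/pull/<number>/head`. A sketch under the assumptions that `origin` points at the upstream GitHub repository and that the clone lives in `./omnivore`; `fetch_pr` and the local branch name `pr-<number>` are illustrative choices:

```shell
#!/bin/sh
# Sketch: fetch a GitHub pull request into a local branch and check it out.
# fetch_pr and the pr-<number> branch naming are hypothetical, not from the PR.
set -eu

# usage: fetch_pr <repo_dir> <pr_number>
fetch_pr() {
  repo_dir=$1
  pr=$2
  # Copy the remote ref refs/pull/<pr>/head into the local branch pr-<pr>.
  git -C "$repo_dir" fetch --quiet origin "pull/$pr/head:pr-$pr"
  git -C "$repo_dir" checkout --quiet "pr-$pr"
}

# Only act if a clone exists in the current directory.
if [ -d omnivore/.git ]; then
  fetch_pr omnivore 4465
fi
```

The images would then be built from that branch. Which compose file and settings the PR expects is not stated in this thread, so the PR description is the place to check.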

@thiswillbeyourgithub

@thiswillbeyourgithub I have just downloaded @Podginator's PR branch (#4465), built the images and self-hosted it. It appears to work as expected, though I haven't done much beyond navigating around, manually adding some links and making some highlights.
But surely appears to be something within reach.

Great to hear. Thanks.

I know I would be open to paying a monthly donation to anyone who's improving a "main" fork. I'm sure I'm not the only one. That could incite forkers to join efforts.

@nileshtrivedi

There does not seem to be alternative that allow both webpage AND pdfs and highlighting.

Doesn't Zotero do that? It even has free sync storage of 300MB.

@thiswillbeyourgithub

There does not seem to be alternative that allow both webpage AND pdfs and highlighting.

Doesn't Zotero do that? It even has free sync storage of 300MB.

Thanks a lot, I had not considered using Zotero. I thought it allowed highlighting PDFs but not webpages. Has that changed?

What I'm after is saving pdfs, webpages (and ideally .docx etc but I can manage), then reading, highlighting them, and accessing my highlights. Mobile support for all that too.

Can zotero do all that?

@thelazyoxymoron

Except for mobile support, Zotero gives you everything you're looking for. They recently released v7 which brought support for webpage annotations. They also have an android app in alpha, which you can either build yourself or get a nightly build from here.

@thiswillbeyourgithub

Except for mobile support, Zotero gives you everything you're looking for. They recently released v7 which brought support for webpage annotations. They also have an android app in alpha, which you can either build yourself or get a nightly build from here.

Very tempting. Thanks a lot!! I'll take a look someday

@menelic

menelic commented Nov 21, 2024

@thiswillbeyourgithub @thelazyoxymoron Zotero actually has a mobile app for iOS and Android that fully synchronizes with the desktop and web app. It has been in closed beta for a year, is well designed and fully functional.

It is possible to highlight and annotate archived webpages. The only feature still lacking on the mobile app is freehand annotations in HTML. They do work great on PDF though, to the point that recent color eInk tablets such as the Boox series of devices allow for a great, cross-device workflow that truly matches and surpasses paper when it comes to syncing, searching, sharing, backup, etc.

The reason why I don't want to use Zotero as a read-it-later tool for web content is that I use it as a citation manager for my own and shared projects, so I don't want my Zotero library to contain thousands of articles I will never cite.

@thiswillbeyourgithub

Thanks a lot, it's very helpful.

The only feature still lacking on the mobile app is freehand annotations in html

Can you just tell me if we can annotate webpages using the webapp on mobile?
