Maintainer Request #103

Open
jfahne opened this issue Oct 3, 2023 · 17 comments

Comments

@jfahne

jfahne commented Oct 3, 2023

Hi @mwatts15, I am interested in taking this project to a v1 release. Would you be open to letting me help maintain the project?

@mwatts15
Collaborator

mwatts15 commented Dec 1, 2023

Hi, @jfahne. Sorry, I totally missed this back in October. Certainly, the help would be much appreciated if you're still interested.

@amotl
Contributor

amotl commented Dec 18, 2023

Hi there. I also would like to offer my support on this matter. It is great that @richardscholtens already submitted a patch to update to SQLAlchemy 2, see GH-104. I've also just submitted a little fix, see GH-105.

@amotl
Contributor

amotl commented Jan 15, 2024

Dear @mwatts15,

first of all, I would like to wish you a happy new year. On this matter, it would be so sweet if GH-104 by @richardscholtens could be driven forward in one way or another.

More by accident than not, we successfully picked up maintenance of a few Python projects in the past, some of which had not even been on PyPI before, and received a few kudos for it. In this spirit, we would like to offer support on the maintenance of rdflib-sqlalchemy, and it would be so sweet if @jfahne could also join the crew if they are still interested.

With kind regards,
Andreas.

@mwatts15
Collaborator

mwatts15 commented Jan 16, 2024 via email

@amotl
Contributor

amotl commented Jan 16, 2024

Hi Mark,

thanks for your reply. The "we" in this case is not a specific entity. It was rather meant to refer to the "collective we", maybe including @jfahne or other future maintainers. Also, I made it a habit to sometimes write "we" and "they" instead of "me" and "she/he". Apologies for the confusion.

> I would like to hand over maintenance to someone who can commit to maintaining it for a while (and better than I have).

I am sure you did an excellent job here, like your predecessors @adamhadani, Graham, and @ymph, so this topic is more about me being humble about and thankful for all your work, which would not have been possible within the scope of support I am offering here.

It is merely about keeping the lights on. So, I can't promise to give the project 100% of my attention at all times, or to make any big changes, but I am hoping to occasionally have a look at it for regular maintenance reasons, maybe modernize a few infrastructure/sandbox details like I did with other projects, for example gradually migrating from setup.py to pyproject.toml, and take care of user requests and contributions if I find the capacity. That said, I will be more than happy to share maintenance with others who raise their voice.

> I have less than zero interest in this package now.

Of course, I would appreciate it if you were still around for a while and did not abandon the conversation here completely. However, if it is decided already, I would still be up for it. C'est la vie ;].

Cheers,
Andreas.

@richardscholtens
Contributor

richardscholtens commented Jan 17, 2024

@amotl, I am stuck on a certain test for GH-104, so any help is appreciated. GH-104 shows which test I am referring to. A sparring partner would be a nice help in fixing the tests.

I also have a suggestion: adding a docker-compose file that can be used to simulate the different sorts of databases being used. While debugging the patch I already set up a docker-compose file for a PostgreSQL database, but I believe adding additional databases will lower the contribution threshold. This will also allow a developer to easily switch Python versions:

---
version: '3.8'
services:
  db:
    image: postgres:14.1-alpine
    restart: always
    environment:
      - POSTGRES_USER=${POSTGRES_USERNAME}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_HOST=${POSTGRES_HOST}
      - POSTGRES_DB=${POSTGRES_DBNAME}
    ports:
      # Host port maps to the container port; the postgres image listens on 5432 by default.
      - "${POSTGRES_PORT}:5432"
    volumes:
      - db:/var/lib/postgresql/data
volumes:
  db:
    driver: local
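
To connect a test run to the database started by the compose file above, the same environment variables could be turned into a SQLAlchemy-style connection URL. This is only a minimal sketch: the function name `postgres_url`, the `psycopg2` driver, and the fallback values for host and port are assumptions for illustration and are not taken from the repository.

```python
import os
from urllib.parse import quote


def postgres_url(env=os.environ, driver="psycopg2"):
    """Build a PostgreSQL connection URL from the compose file's variables.

    Hypothetical helper: the variable names mirror the compose file,
    everything else (driver, defaults) is an assumption.
    """
    # Percent-encode credentials so characters such as "@" or "/" do not
    # break the URL structure.
    user = quote(env["POSTGRES_USERNAME"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    # Assumed defaults for a database published on the local machine.
    host = env.get("POSTGRES_HOST", "localhost")
    port = env.get("POSTGRES_PORT", "5432")
    dbname = env["POSTGRES_DBNAME"]
    return f"postgresql+{driver}://{user}:{password}@{host}:{port}/{dbname}"
```

The resulting string is the kind of URL that SQLAlchemy's `create_engine` accepts; whether the test suite reads its database location this way is not confirmed by this thread.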

Also, @amotl, I agree that a pyproject.toml should be introduced. This would also allow the use of Poetry and would make the rdflib-sqlalchemy library more consistent with the rdflib library. For reference, see the RDFLib developers guide.

@mwatts15
Collaborator

mwatts15 commented Jan 18, 2024 via email

@jfahne
Author

jfahne commented Jan 19, 2024

Hey all, sorry for only just now replying. I work on GitLab and Gitea servers more often. I do have capacity to actively maintain this project. I agree with the sentiments around using Poetry, and I think a path to a v1 release should begin by stabilizing the project in its current state. Once we get the code base to a point where it is easier to contribute, we can identify what is needed for the tool to be not only stable, but feature-rich and performant enough for production use cases. I'll take a crack at switching us over to Poetry and see if I can help @richardscholtens resolve his test case.

@jfahne
Author

jfahne commented Jan 19, 2024

@richardscholtens I have poetry working! I want to containerize the dev environment so that we do not have to worry about latent factors while working on debugging your changes. I think I can get an MR put together in a day or two for @mwatts15 to review. My plan is to build a DinD container and update the tests to use testcontainers. I'm tracking some of my brainstorming in a private project board. If @mwatts15 is open to letting me on as a maintainer after a couple MRs, I will port the project to a public one in this repo.

@richardscholtens
Contributor

richardscholtens commented Jan 19, 2024

@jfahne, I have sent you a collaboration invite for the forked repository I am working on.

Check out the branch:

Forked repository
Branch: bugfix/issue-100-update-sqlalchemy-to-version-2.0.23

@jfahne
Author

jfahne commented Jan 22, 2024

@richardscholtens @mwatts15 I found where the issue was coming from in the upgrade MR. I am going to write up separate MRs for replacing setup.py with Poetry and for using a DinD container. @richardscholtens, I think it would be better to swap out setup.py and have the Docker setup merged in first, then have you rebase onto those commits and merge your changes in. Let me know what you think.

@amotl
Contributor

amotl commented Jan 24, 2024

Hi. Thank you for taking the initiative here, @jfahne. On Poetry, I don't think it will be needed because this package is a library, and the main feature of Poetry, its lock file, is not applicable in this scenario anyway. I think a vanilla setuptools-based pyproject.toml file, also absorbing the configuration snippets, and swapping in Ruff instead of flake8/black/isort, will be sufficient. Of course, that's just my humble two cents to the topic of infrastructure modernization.

@jfahne
Author

jfahne commented Jan 26, 2024

I understand that setuptools is plenty for what this repo implements currently. My interest in using Poetry is not in its capacity as a packaging tool. I want to make local development simpler to configure. The standard I have seen in projects lately is Poetry, but I am by no means in love with that tool. My experience is that setuptools is more difficult for the average developer to use when configuring a local development environment.

@amotl
Contributor

amotl commented Jan 27, 2024

All right, thank you for elaborating, I hear you. Without needing (to remember) to create a virtualenv manually each time, and fiddle with it, I certainly see the convenience aspects, especially if you are maintaining multiple repositories of the same kind. Sorry for the noise.

@jfahne
Author

jfahne commented Jan 27, 2024

Not noise at all. It is right to see setuptools as sufficient. I would like to make the repo accessible to more contributors because I feel that RDFLib is a tool which could be useful to a wide variety of projects. This SQLAlchemy plugin feels like a good entry point to target for its ability to integrate with web backends using SQLAlchemy's ORM. With an upgrade to SQLAlchemy v2, the plugin could make use of psycopg v3 and async SQL engines in general. I think asyncio and async SQL engines could unlock performance for this tool that justifies using it for production systems. A personal focus for me is building out this plugin in service of ongoing work in semantic-free warehousing, semantic layers, and ontology engineering, which are slowly gaining more traction in general for data-intensive operations.

@namedgraph

Hi. Not sure what's the best place to ask, so I'll try here:

  • is rdflib-sqlalchemy stable?
  • how does it scale in terms of data volumes and perform in terms of SPARQL queries? Anyone got any numbers/benchmarks?
  • which RDFLib version does it depend on?

I'd like to try ingesting a few million triples of SKOS data.

@richardscholtens
Contributor

richardscholtens commented Mar 15, 2024

  1. As far as I know, it is quite stable. It is properly tested on multiple databases.
  2. What kind of database do you want to use for this? Every database can be optimized if handled properly.
  3. You can check setup.py for the dependencies.

rdflib>=6,<8
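
To check whether a given RDFLib version falls inside that declared range, a small helper along the following lines could be used. It is a simplified sketch: the function name `rdflib_version_supported` is hypothetical, and it only looks at the major version number, so it is meant for final releases and does not implement the full version-specifier rules (pre-releases are not handled).

```python
def rdflib_version_supported(version: str) -> bool:
    """Return True if `version` lies in the range rdflib>=6,<8.

    Hypothetical helper for illustration; it compares only the major
    version, which is enough for final releases such as "6.3.2" or "7.0.0".
    """
    major = int(version.split(".")[0])
    return 6 <= major < 8
```

The version string for an installed copy can be obtained with the standard library's `importlib.metadata.version("rdflib")`, which raises `PackageNotFoundError` when the package is not installed.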

5 participants