Conversation

OskarStark
Contributor

@OskarStark OskarStark commented Sep 5, 2025

| Q             | A
| ------------- | ---
| Bug fix?      | no
| New feature?  | yes
| Docs?         | no
| Issues        | Fixes #429
| License       | MIT

Needs

./runner indexer

bin/console app:blog:embed (removed)

bin/console app:blog:query

bin/console ai:store:index blog

@carsonbot carsonbot changed the title Feature/document indexing pipeline Feature/document indexing pipeline Sep 5, 2025
@OskarStark OskarStark force-pushed the feature/document-indexing-pipeline branch from 643b560 to d3a3a15 Compare September 5, 2025 10:40
@OskarStark OskarStark added Store Issues & PRs about the AI Store component Examples Issues & PRs about the example scripts labels Sep 5, 2025
@OskarStark OskarStark marked this pull request as draft September 5, 2025 10:40
@carsonbot carsonbot changed the title Feature/document indexing pipeline [Examples][Store] Feature/document indexing pipeline Sep 5, 2025
@OskarStark OskarStark changed the title [Examples][Store] Feature/document indexing pipeline [Examples][Store] Implement indexing pipeline Sep 5, 2025
@OskarStark OskarStark force-pushed the feature/document-indexing-pipeline branch from 0827740 to bbc3145 Compare September 5, 2025 13:48
OskarStark added a commit that referenced this pull request Sep 5, 2025
…ark)

This PR was squashed before being merged into the main branch.

Discussion
----------

[AI Bundle][Demo] Make vectorizers configurable

| Q             | A
| ------------- | ---
| Bug fix?      | no
| New feature?  | yes
| Docs?         | no
| Issues        | Refs #465
| License       | MIT

Add support for configuring vectorizers via ai.yaml configuration, allowing reuse across multiple indexers and centralized vectorizer management.

Commits
-------

7acf871 [AI Bundle][Demo] Make vectorizers configurable
@OskarStark OskarStark force-pushed the feature/document-indexing-pipeline branch from 53486dc to a6b0ace Compare September 5, 2025 15:37
@OskarStark OskarStark added the BC Break Breaking the Backwards Compatibility Promise label Sep 5, 2025
@OskarStark OskarStark force-pushed the feature/document-indexing-pipeline branch from a6b0ace to 242501e Compare September 5, 2025 15:56
@chr-hertel

This comment was marked as outdated.

@OskarStark

This comment was marked as outdated.

@OskarStark OskarStark changed the title [Examples][Store] Implement indexing pipeline [Store] Implement indexing pipeline Sep 6, 2025
@OskarStark

This comment was marked as outdated.

@OskarStark OskarStark force-pushed the feature/document-indexing-pipeline branch from 0b5dc7e to 15236b3 Compare September 6, 2025 16:01
OskarStark added a commit that referenced this pull request Sep 8, 2025
…arStark)

This PR was merged into the main branch.

Discussion
----------

[Store] Add `TextDocument::withContent` method

| Q             | A
| ------------- | ---
| Bug fix?      | no
| New feature?  | yes
| Docs?         | no
| Issues        | Helpful for #465
| License       | MIT

Commits
-------

4a3014f [Store] Add withContent method to TextDocument with test
@OskarStark OskarStark force-pushed the feature/document-indexing-pipeline branch 5 times, most recently from 6d2dcf7 to a49521a Compare September 8, 2025 10:56
@OskarStark OskarStark force-pushed the feature/document-indexing-pipeline branch from a49521a to 6bfb1ed Compare September 8, 2025 11:25
@OskarStark OskarStark force-pushed the feature/document-indexing-pipeline branch from 382151b to b668771 Compare September 8, 2025 13:03
@@ -32,7 +34,7 @@ protected function execute(InputInterface $input, OutputInterface $output): int
         $io = new SymfonyStyle($input, $output);
         $io->title('Loading RSS of Symfony blog as embeddings into ChromaDB');
 
-        $this->embedder->embedBlog();
+        $this->indexer->index('https://feeds.feedburner.com/symfony/blog');
Member

This would be the source, as a next step.

      */
-    public function index(TextDocument|iterable $documents, int $chunkSize = 50): void;
+    public function index(null|string|array $source = null, array $options = []): void;
Member

I would also argue for removing `null` here.

-        if ($documents instanceof TextDocument) {
-            $documents = [$documents];
+        // Prevent conflicting sources
+        if (null !== $source && null !== $this->source) {
Contributor Author

This way we have both: we can set a default source at the config level, but it cannot be overwritten afterwards, because some users would want to overwrite it while others would want to merge. You can now configure an indexer with `source = null`; in that case you provide the source via the method.
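
A minimal sketch of the behavior described in this comment, for illustration only. `SketchIndexer` is a hypothetical stand-in and not the class from this PR; the exception type, the messages, and the handling of the case where no source is set at all are assumptions. Only the `index()` signature and the "prevent conflicting sources" check come from the diff above.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the indexer discussed above; not the actual class from this PR.
final class SketchIndexer
{
    /** @var list<string> Sources that were "indexed", recorded for illustration only. */
    public array $indexed = [];

    /**
     * @param string|list<string>|null $source Default source set at config level, or null.
     */
    public function __construct(
        private readonly string|array|null $source = null,
    ) {
    }

    /**
     * @param string|list<string>|null $source  Source passed at call time, or null.
     * @param array<string, mixed>     $options Unused in this sketch.
     */
    public function index(null|string|array $source = null, array $options = []): void
    {
        // Prevent conflicting sources: overwrite vs. merge is ambiguous, so reject the call.
        if (null !== $source && null !== $this->source) {
            throw new \InvalidArgumentException('A source is already configured; do not pass another one.');
        }

        // Assumption: with no source anywhere there is nothing to index, so fail loudly.
        $effective = $source ?? $this->source;
        if (null === $effective) {
            throw new \InvalidArgumentException('No source configured or provided.');
        }

        // A real indexer would load, transform, vectorize and store documents here.
        foreach ((array) $effective as $item) {
            $this->indexed[] = $item;
        }
    }
}
```

With this shape, `(new SketchIndexer('https://feeds.feedburner.com/symfony/blog'))->index()` uses the configured default, `(new SketchIndexer())->index('https://feeds.feedburner.com/symfony/blog')` takes the source from the method, and passing a source to an indexer that already has one is rejected instead of silently overwriting or merging.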

@OskarStark OskarStark marked this pull request as ready for review September 8, 2025 15:14
@OskarStark OskarStark force-pushed the feature/document-indexing-pipeline branch 2 times, most recently from ef78629 to 659778d Compare September 8, 2025 15:28
@carsonbot carsonbot changed the title [Store] Implement indexing pipeline [Examples][Store] Implement indexing pipeline Sep 8, 2025
@chr-hertel chr-hertel force-pushed the feature/document-indexing-pipeline branch from d77a852 to 29349c1 Compare September 8, 2025 22:02
@chr-hertel
Member

Thank you @OskarStark.

@chr-hertel chr-hertel merged commit f2a5327 into symfony:main Sep 8, 2025
7 checks passed
@OskarStark OskarStark deleted the feature/document-indexing-pipeline branch September 8, 2025 22:02
Labels
BC Break (Breaking the Backwards Compatibility Promise), Examples (Issues & PRs about the example scripts), Feature (New feature), Status: Reviewed, Store (Issues & PRs about the AI Store component)
Development

Successfully merging this pull request may close these issues.

[AI Bundle][Store] Document Indexing Pipeline
3 participants