
How to apply SparseRandomProjector to large Image dataset? #40

Open
PeterKim1 opened this issue Mar 30, 2022 · 0 comments
Hello.

I want to apply this model to a large image dataset (over 10,000 images), but I run into a RAM issue.

https://github.com/hcw-00/PatchCore_anomaly_detection/blob/main/sampling_methods/kcenter_greedy.py#L95

self.features = model.transform(self.X)

I think this line loads the embeddings for the entire dataset into RAM and applies the SparseRandomProjector to all of them at once, which puts a lot of pressure on memory. (I'm just a novice, so this may be wrong.)

Does anyone know how to solve this problem?
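One possible way around the memory pressure is to fit the projector once with a fixed `n_components`, then apply `transform` in chunks, so only one chunk of embeddings needs to be in RAM at a time. A minimal sketch, with hypothetical shapes (10,000 embeddings of dimension 1024, a chosen target of 128; in practice each chunk would be loaded from disk rather than sliced from one big array):

```python
import numpy as np
from sklearn.random_projection import SparseRandomProjection

# Hypothetical data: 10,000 embeddings of dimension 1024.
rng = np.random.RandomState(0)
X = rng.rand(10_000, 1024).astype(np.float32)

# Fix n_components up front so the projection matrix is fitted once
# and shared across all chunks (fit only needs the feature dimension).
projector = SparseRandomProjection(n_components=128, random_state=0)
projector.fit(X[:100])

# Transform in chunks of 1,000 rows; the projection is a linear map,
# so transforming chunk by chunk gives the same result as one big call.
chunks = []
for start in range(0, X.shape[0], 1_000):
    chunks.append(projector.transform(X[start:start + 1_000]))
reduced = np.vstack(chunks)
print(reduced.shape)  # (10000, 128)
```

Because the projection is just a (sparse) matrix multiply, the chunked output is identical to transforming everything at once; only the peak memory differs.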

One idea I have is to split the data in half and apply the SparseRandomProjector to each half separately, but I think that might cause problems, because SparseRandomProjector determines the dimensionality of the embeddings based on the Johnson-Lindenstrauss lemma.
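The concern seems valid for a second reason as well: each call to `fit` draws its own random projection matrix, so two projectors fitted on the two halves would map them into incompatible coordinate systems. A small sketch with made-up data:

```python
import numpy as np
from sklearn.random_projection import SparseRandomProjection

# Made-up data split into two halves.
rng = np.random.RandomState(0)
X = rng.rand(200, 512).astype(np.float32)
first_half, second_half = X[:100], X[100:]

# Fitting a separate projector on each half draws a different random
# matrix, so the resulting embeddings are not directly comparable.
p1 = SparseRandomProjection(n_components=64, random_state=1).fit(first_half)
p2 = SparseRandomProjection(n_components=64, random_state=2).fit(second_half)
same = np.allclose(p1.components_.toarray(), p2.components_.toarray())
print(same)  # False
```

Fitting once and reusing the same projector (or at least the same `n_components` and `random_state`) for both halves avoids this.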

According to the sklearn documentation (https://scikit-learn.org/stable/modules/generated/sklearn.random_projection.SparseRandomProjection.html), n_components can be adjusted automatically based on the number of samples in the dataset.
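That automatic adjustment is exactly why splitting the data could change the output dimensionality: sklearn exposes the underlying bound as `johnson_lindenstrauss_min_dim`, and it grows with the number of samples. A quick check (eps=0.1 is an assumed distortion tolerance, the sklearn default):

```python
from sklearn.random_projection import johnson_lindenstrauss_min_dim

# The minimum safe n_components under the JL lemma depends on n_samples,
# so n_components='auto' would pick a different target dimensionality
# for half the dataset than for the full dataset.
dim_full = johnson_lindenstrauss_min_dim(n_samples=10_000, eps=0.1)
dim_half = johnson_lindenstrauss_min_dim(n_samples=5_000, eps=0.1)
print(dim_full, dim_half)
```

So with `n_components='auto'`, each half would be projected to a smaller (and mutually inconsistent) dimensionality than the full set; fixing `n_components` explicitly sidesteps this.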
