Issues: IBM/data-prep-kit
Issues list
[Bug] cannot import name 'FdedupRayTransformConfiguration' from 'fdedup_transform_ray'
bug
Something isn't working
#898
opened Dec 20, 2024 by
MFahadShahid
1 of 2 tasks
Added Focus on Spark-based scaling of transforms (medium priority)
enhancement
New feature or request
#884
opened Dec 16, 2024 by
shahrokhDaijavad
1 of 2 tasks
Memory Consumption and Batch Processing in DPK (Medium Priority)
enhancement
New feature or request
#883
opened Dec 16, 2024 by
shahrokhDaijavad
1 of 2 tasks
[Feature] Web2Parquet should expose all options supported by dpk_connector
enhancement
New feature or request
#876
opened Dec 13, 2024 by
sujee
2 tasks done
[Bug] pip install data-prep-toolkit-transforms[all]==0.2.2 gets error
bug
Something isn't working
#873
opened Dec 12, 2024 by
daw3rd
1 of 2 tasks
[Feature] a transform to perform file level de-dupe (exact)
enhancement
New feature or request
#870
opened Dec 12, 2024 by
sujee
2 tasks done
[Bug] ededup removes all samples if the document_id is an int
bug
Something isn't working
#868
opened Dec 10, 2024 by
burn2l
2 tasks done
Making sure that we run the Jupyter lab inside venv, when running notebooks locally
enhancement
New feature or request
#856
opened Dec 4, 2024 by
shahrokhDaijavad
2 tasks done
Link to the three example notebooks of fdedup in the README file
enhancement
New feature or request
#848
opened Dec 2, 2024 by
shahrokhDaijavad
1 of 2 tasks
Add the first Google Colab Compatible Notebook as a template for all transforms
enhancement
New feature or request
#844
opened Nov 30, 2024 by
shahrokhDaijavad
2 tasks done
[Feature] HAP example for kickstart
enhancement
New feature or request
#843
opened Nov 29, 2024 by
AishaDarga
2 tasks done
[Feature] PII example for kickstart
enhancement
New feature or request
#842
opened Nov 29, 2024 by
PoojaHolkar
2 tasks done
[Discussion] Could someone kindly help to answer the question in the discussion area
enhancement
New feature or request
#841
opened Nov 29, 2024 by
vincent-pli
1 of 2 tasks
[Bug] Fix link to language modules listed in pypi
bug
Something isn't working
#827
opened Nov 25, 2024 by
dtsuzuku-ibm
1 of 2 tasks
[Feature] Add discord link to the front page (README.md) and add Fuzzy Dedup python only to the table
enhancement
New feature or request
#820
opened Nov 20, 2024 by
sujee
2 tasks done
[Feature] Some subdirs not cleaning up venv on make clean
enhancement
New feature or request
#819
opened Nov 20, 2024 by
daw3rd
2 tasks done
[Bug] pdf2parquet: identical PDF files have different contents
bug
Something isn't working
#812
opened Nov 19, 2024 by
sujee
1 of 2 tasks
[Bug] Cannot run KFP pipeline for fuzzy dedup with more than 100 actors
bug
Something isn't working
#803
opened Nov 16, 2024 by
cmadam
2 tasks done
[Feature] Create a 'User Feedback' section in discussions
enhancement
New feature or request
#802
opened Nov 14, 2024 by
sujee
1 of 2 tasks
[Feature] RAG: when saving DPK processed data into vector database, optionally save it in llama-index format
enhancement
New feature or request
#795
opened Nov 12, 2024 by
sujee
2 tasks done
[Feature] Modify pdf2parquet to accept a parquet file with the payload in the content column
enhancement
New feature or request
#792
opened Nov 11, 2024 by
touma-I
1 of 2 tasks
[Feature] add an example of html2pq in the documentation
documentation
Improvements or additions to documentation
#788
opened Nov 8, 2024 by
sujee
2 tasks done