I understand that checks were added to see what has been uploaded before: the prepdocs script now writes an .md5 file containing the MD5 hash of each file it uploads. Whenever prepdocs is re-run, that hash is compared against the file's current hash, and the file is skipped if it hasn't changed.
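The sidecar-hash mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the actual prepdocs code; the function names (`file_md5`, `should_upload`) and the 8 KB chunk size are assumptions for the example:

```python
import hashlib
import os

def file_md5(path: str) -> str:
    """Compute the MD5 hex digest of a file's contents, reading in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            md5.update(chunk)
    return md5.hexdigest()

def should_upload(path: str) -> bool:
    """Skip the file if an .md5 sidecar exists and matches the current hash;
    otherwise record the new hash and signal that an upload is needed."""
    sidecar = path + ".md5"
    current = file_md5(path)
    if os.path.exists(sidecar):
        with open(sidecar) as f:
            if f.read().strip() == current:
                return False  # unchanged since the last run, skip it
    with open(sidecar, "w") as f:
        f.write(current)  # remember this hash for the next run
    return True
```

The limitation raised in this issue follows from the sketch: the `.md5` sidecar lives next to the local source file, so it only helps when the same local folder is re-processed.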
However, when I tried the ACL solution, the .md5 files were not created as expected, unlike in the solution without ACLs.
Correct me if I am wrong:
Without the ACL solution: files are uploaded from a local folder, the MD5 is generated in the local folder, and the files are uploaded to Blob Storage and the AI Search index.
With the ACL solution: files are uploaded from a local folder to the Data Lake, then from the Data Lake to AI Search.
These solutions differ in how files are processed...
This issue is for a: (mark with an x)
- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)
Minimal steps to reproduce
Any log messages given by the failure
Expected/desired behavior
With the ACL solution: upload files from a local folder to the Data Lake, generate the MD5 in the Data Lake, then go from the Data Lake to Blob Storage and AI Search.
OS and Version?
Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)
azd version?
run azd version and copy paste here.
Versions
Mention any other details that might be useful
Thanks! We'll be in touch soon.
So is your goal to be able to repeatedly re-run prepdocs to pick up new files in ADLS2, without having to re-index existing files? I think we'd probably want to implement #942 for both normal Blob storage and ADLS2, which would mean the MD5 would be stored in the blob itself, and we'd check against that.
@pamelafox this could be a great feature, since in production the docs sit somewhere on the lake that other systems may be able to drop files into.
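The metadata-based approach suggested above (storing the MD5 on the blob itself rather than in a local sidecar) could be sketched as below. This is a hypothetical helper, not code from the repo; the `md5` metadata key and the `needs_reindex` name are assumptions, and the client is duck-typed so that in real use you would pass an `azure.storage.blob` `BlobClient` (whose `get_blob_properties()` returns properties with a `metadata` dict):

```python
import hashlib

def needs_reindex(blob_client, local_bytes: bytes) -> bool:
    """Compare a local file's MD5 against an 'md5' entry stored in the
    blob's own metadata, so re-runs can skip unchanged files no matter
    which machine or folder they were originally uploaded from.

    `blob_client` is assumed to behave like azure.storage.blob.BlobClient:
    get_blob_properties() returns an object with a `metadata` dict, and
    raises if the blob does not exist yet."""
    current = hashlib.md5(local_bytes).hexdigest()
    try:
        stored = blob_client.get_blob_properties().metadata.get("md5")
    except Exception:
        return True  # blob not uploaded yet, so it must be indexed
    return stored != current  # re-index only if the content changed
```

Because the hash travels with the blob, this same check would work for both normal Blob Storage and ADLS2, which is what makes it a natural fit for the #942 direction.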