-
Notifications
You must be signed in to change notification settings - Fork 152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Asynchronous Import #330
base: master
Are you sure you want to change the base?
Asynchronous Import #330
Conversation
naydav
commented
Oct 23, 2019
•
edited
Loading
edited
- Intro
- REST API
- Modularity (API / Extension points)
-- Intro -- REST API -- Modularity (API / Extension points)
"format': { | ||
"escape": "|" | ||
"enclosure" : "|" | ||
"delimiter" => "|" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls. replace => with :
|
||
``` | ||
{ | ||
"UUID": "uuid_string" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"UUID" as Lowercase?
- Import Data Exchanging; | ||
- Import configuration; | ||
- Product import; | ||
- Stock status import; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused with stock status. You mean Stock QTY and not only enabled/disabled?
Also how about differentiate Single Stock and MSI source/stock?
After digging a little bit into this issue, the following assesment has come out: It turns out that the class RetrieveSourceData contains the source validator actor that wraps all the possible validator to a specific source data retriever(strategy), this means that each strategy must comply with the validations defined by di.xml: This forces me to assume a sort of data contract where a new strategy(json's in this case), must fit such a data contract. Not empty: sourceType, sourceDefinition, sourceDataFormat; and also sourceDefinition must come in valid base64 format. This way, coming down to Json strategy, i understand that sourceDefinition field is a json encoded in base64, if this is ok then i assume that it is actually a csv translated into json, so the data strategy retriever takes this chunk and translate it into array iterator as if a magento standard csv was. |
- Advanced pricing import; | ||
- Documentation; | ||
|
||
## TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets change a Priorities here
- Design of
Import configuration
; - Design of
Get Import status
; - Design of
Import processing - Sync / Async
- Design of
Restart failed operations
;
* | ||
* @api | ||
*/ | ||
interface CsvFormatInterface extends ExtensibleDataInterface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this a duplicate, you already defined this interface above?
* | ||
* @api | ||
*/ | ||
interface SourceValidatorInterface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How & where we will going to use this extension point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be used:
https://github.com/magento/architecture/pull/330/files#diff-610ea2fd7eb8081598349a79459e9313R40
On a step we are retrieving data to validate them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it injected through DI configuration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right
|
||
- [REST API](rest-api.md) | ||
|
||
## Modularity (API / Extension points) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1- It will be helpful to add a diagram to show the flow of interaction between the modules!
2- A simple class / interface UML diagram will make it really easy to grasp the concepts
*/ | ||
interface SourceDataInterface extends \IteratorAggregate | ||
{ | ||
public const ITERATOR = 'iterator'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how we use this?
* | ||
* @return string|null | ||
*/ | ||
public function getUuid(): ?string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to generalize UUID? can't we just say ID. The problem with UUID is that they are big in size and doesn't index well. When your data grows in the table, the index size also grows with it, which leads to query performance hit.
https://www.callicoder.com/distributed-unique-id-sequence-number-generator/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not expect really big data flow where, even if you are doing daily imports with 100+ files daily (which is not common case), we would need to have some cleaning functionality for such table.
to use UUID was provided in Asynchronous operations and as I know idea of Arch team was to go into UUID direction and discussion is still open, or?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there some other PRs currently opened to replace DB Sequence for IDs with some other strategy, when the ID is not externally provided; where we are having some discussions!
|
||
/** | ||
* Describes how to change data before import | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add some example use or bit more explanation of this
@@ -0,0 +1,146 @@ | |||
# Data converting before import |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also would like to recommend that please don't implement your own ETL code, there are many good PHP libraries available that can do all this for you. Please have a look into those
https://php-etl.readthedocs.io/en/v1/
https://blog.panoply.io/6-of-the-best-php-etl-tools
and If you are designing you own interface, use the standard mature libraries to manipulate the data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for info, I will check it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tariqjawed83, I expect that its will take a lot of time to confirm from Architecture team of usage some library in Magento Core?
* | ||
* @return string | ||
*/ | ||
public function getImportBehaviour(): string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this behavior? and why it is string... trying to understand, if there could be more better way to represent it rather than String.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Behaviour here is an "add", "update", "delete", "replace" ....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So on import process we can add new objects, update existed, delete some or replace.
We will start with "add" and "delete" with MVP
Related to #330 (comment) Its a high level diagram, hope will help a bit |