MC-21651: Proposal to refactor customer import #323
All four concepts need a migration from the `customer_entity` table to the `sequence` table, along the lines of
`migrateSequneceColumnData(customer_entity,entity_id)`. I think Concept 3 would bring the most benefit
because it sends fewer queries to the database.
All concepts can be implemented in a backward-compatible way because we only touch constructors and protected functions.
Ideally yes, but in reality many extensions and customizations depend on the DB structure. As a result, https://devdocs.magento.com/guides/v2.3/extension-dev-guide/versioning/codebase-changes.html classifies DB schema changes as MINOR or MAJOR changes. This should be taken into account.
For distributed systems, this kind of solution will not scale, because ID generation relies on the database itself; we need to control ID generation at the application level. Twitter used a similar MySQL sequence based on AUTO_INCREMENT, but realized that the approach could not scale.
I would suggest looking into Twitter Snowflake:
https://github.com/twitter-archive/snowflake/tree/snowflake-2010
Another article on this subject:
https://www.callicoder.com/distributed-unique-id-sequence-number-generator/
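To make the suggestion concrete, here is a minimal sketch of a Snowflake-style generator. It assumes the commonly described layout of 41 timestamp bits, 10 worker bits, and 12 sequence bits; the class name, the `worker_id` parameter, and the injectable `clock` are hypothetical and not part of Magento or of the linked projects.

```python
import threading
import time

EPOCH_MS = 1288834974657  # custom epoch used by Twitter's Snowflake (assumed here)
WORKER_BITS = 10          # up to 1024 generator nodes
SEQUENCE_BITS = 12        # up to 4096 IDs per node per millisecond
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    """Hypothetical ID generator: timestamp | worker id | per-ms sequence."""

    def __init__(self, worker_id, clock=None):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        # The clock is injectable so the generator can be tested deterministically.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

No database round trip is needed per ID; uniqueness across nodes depends only on each node having a distinct `worker_id`. The resulting values need a 64-bit column, which is relevant to the column-size question in this thread.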
Although many Magento customers currently run on a single-node database, in the cloud we would need to support large-scale bulk operations in a distributed style, with multiple threads pushing data. An optimized service that can handle this scale without hitting the database for IDs would be a better approach.
Snowflake uses 8-byte (64-bit) integers, whereas Magento currently uses 4-byte unsigned integers. I am not sure whether this would be a BIC; it seems like a PATCH according to https://devdocs.magento.com/guides/v2.3/extension-dev-guide/versioning/codebase-changes.html, but that needs confirming.
The idea, though, is to centralize ID generation and move it out of the database; the service should scale like other services to support the load.
This service can provide IDs in bulk as well. Many implementations I have seen use Redis or a similar cache to store a batch of IDs, to avoid unnecessary calls to the ID generation service. We may still be able to use Magento's SequenceInterface while calling this new distributed ID service underneath.
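The bulk-ID idea can be sketched as follows. This is only an illustration under stated assumptions: `InMemoryIdService` is a hypothetical stand-in for the central ID service (or a Redis-backed counter), and `BlockSequence` is a hypothetical client-side wrapper whose `get_next_value` method is modeled loosely on a sequence interface; neither is existing Magento code.

```python
import threading


class InMemoryIdService:
    """Hypothetical stand-in for a central ID service that hands out ranges."""

    def __init__(self, start=1):
        self._next = start
        self._lock = threading.Lock()
        self.calls = 0  # number of round trips made to the service

    def reserve(self, count):
        # Atomically reserve `count` consecutive IDs and return them as a range.
        with self._lock:
            first = self._next
            self._next += count
            self.calls += 1
            return range(first, first + count)


class BlockSequence:
    """Hypothetical sequence that serves IDs from a locally cached block."""

    def __init__(self, service, block_size=100):
        self._service = service
        self._block_size = block_size
        self._block = iter(())  # empty until the first request
        self._lock = threading.Lock()

    def get_next_value(self):
        with self._lock:
            try:
                return next(self._block)
            except StopIteration:
                # Local block exhausted: fetch a new one in a single call.
                self._block = iter(self._service.reserve(self._block_size))
                return next(self._block)
```

With a block size of 100, importing 150 customers through one `BlockSequence` costs two calls to the service instead of 150. The trade-off is that IDs left in an unused block are lost when a process exits, so the resulting sequence can contain gaps.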
@larsroettig please review my comment above and let me know your thoughts.