You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The database's design as Key and Value in Mysql has performance issue when the database is growing. Basically, the issue is in order to pull the report of Ark ID with metatdata, we have to do mysql JOINs, but they take so long to get the results.
SELECT arks.*
FROM `dsu_ark`
AS arks
JOIN (
SELECT bound.id,
COALESCE(redirected._value, 0)
AS _value
FROM (
SELECT DISTINCT REGEXP_SUBSTR(_key, '^([^\\s]+)') AS id
FROM `dsu_ark`
WHERE _key LIKE '61220/utsc%' AND _key NOT REGEXP '(\\s:\/c|\\sREDIRECT|\\sPID|\\sLOCAL_ID|\\sCOLLECTION)$' AND (_key LIKE '%%' OR _value LIKE '%%')
) AS bound
LEFT JOIN (
SELECT REGEXP_SUBSTR(_key, '^([^\\s]+)') AS id, _value
FROM `dsu_ark`
WHERE _key LIKE '61220/utsc%' AND _key REGEXP '\\sREDIRECT$'
) AS redirected ON bound.id = redirected.id
ORDER BY _value ASC
LIMIT 10
OFFSET 0
) AS subquery
ON arks._key LIKE CONCAT(subquery.id, '%')
AND arks._key NOT LIKE '%:\/c'
ORDER BY arks._key ASC;
SELECT arks.*
FROM `dsu_ark`
AS arks
JOIN (
SELECT * FROM (
SELECT DISTINCT REGEXP_SUBSTR(_key, '^([^\\s]+)') AS id
FROM `dsu_ark`
WHERE _key LIKE '61220/utsc%' AND _key NOT REGEXP '(\\s:\/c|\\sREDIRECT|\\sPID|\\sLOCAL_ID|\\sCOLLECTION)$'
INTERSECT
SELECT DISTINCT REGEXP_SUBSTR(_key, '^([^\\s]+)') AS id
FROM `dsu_ark`
WHERE _key LIKE '61220/utsc%' AND _key NOT REGEXP '\\s:\/c' AND (_key LIKE '%%' OR _value LIKE '%%')
) AS target
ORDER BY id ASC
LIMIT 10
OFFSET 0
) AS subquery
ON arks._key LIKE CONCAT(subquery.id, '%')
AND arks._key NOT LIKE '%:\/c'
ORDER BY arks._key ASC;
These query is querying the ARKs IDs with metadata which is bound to it. As LIMIT 10 which is querying 10 at a time, but it's take at least 15 secs to load which is quite slow for just 10 items.
Goals
Develop MySQL query which can pull Arks IDs and their metadata more sufficiently in a acceptable loading time
as 10, 25, 50 at a time (from the UI)
All items (exporting to CSV feature)
Stretch Goals
If the Goal above can't be achieved, we may look into restructure the database and look for alternative database platform to use
Scope the migration and development, and testing performance in the new database platform, for example:
About
How to reproduce
LIMIT 10
which is querying 10 at a time, but it's take at least 15 secs to load which is quite slow for just 10 items.Goals
Stretch Goals
Related tickets
The text was updated successfully, but these errors were encountered: