Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connection/jobs table update: track server connid, add user/id to jobs table #276

Closed
wants to merge 2 commits into from

Conversation

ixcat
Copy link
Contributor

@ixcat ixcat commented Aug 18, 2020

to fix issue #87, #275 (add user/server connection to jobs table).

ideally, connection ID would be directly available via MyM; for the moment,
hack in a query to retrieve connection ID on connection and use this for
jobs table logging.

…s table

to fix issue #87, #275 (add user/server connection to jobs table).

ideally, connection ID would be directly available via MyM; for the moment,
hack in a query to retrieve connection ID on connection and use this for
jobs table logging.
@ixcat ixcat marked this pull request as draft August 19, 2020 03:01
@ixcat ixcat marked this pull request as ready for review October 9, 2020 17:25
Copy link
Collaborator

@guzman-raphael guzman-raphael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you fix merge conflicts? Also, I would prefer if we added a test to validate that new attributes user/connection_id are set properly in Jobs table.

@@ -168,6 +169,9 @@ function reload(self)
if ~isempty(self.initQuery)
self.query(self.initQuery);
end

self.serverId = self.query('SELECT CONNECTION_ID() as id').id;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see two problems here:

  1. Have you tested this? Would it not create 'infinite' recursion or mutation of query for id?
  2. I don't think you can evaluate as is. Should verify but thinking it will fail as MATLAB does not allow you to index (or chain for that matter) into the result of a function. You would need to explicitly assign the result of query(...) to a variable and then index into it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good points - has been a while, will double check everything, though I do seem to recall the .id doing.

Copy link
Contributor Author

@ixcat ixcat Oct 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

R.e: infinite (point 1): not clear on thinking why this would recurse,etc.
R.e: works (point 2): This works on 2020a:

mysql> select * from `~jobs` \G;
*************************** 1. row ***************************
   table_name: mattut.First
     key_hash: jSWi-Y9tCiH5y_OFLC0kpx1vCxJFeQVL
       status: reserved
          key: mYm S                        top_id )       A                                   @
error_message: 
  error_stack: NULL
         user: cat
         host: atv01
          pid: 33430
connection_id: 8
    timestamp: 2020-10-13 10:06:54
1 row in set (0.00 sec)

ERROR: 
No query specified

mysql> show full processlist;
+----+------+-----------------+-------------+---------+------+----------+-----------------------+-----------+---------------+
| Id | User | Host            | db          | Command | Time | State    | Info                  | Rows_sent | Rows_examined |
+----+------+-----------------+-------------+---------+------+----------+-----------------------+-----------+---------------+
|  8 | cat  | localhost:54028 | NULL        | Sleep   |   44 |          | NULL                  |         0 |             0 |
|  9 | cat  | localhost       | test_mattut | Query   |    0 | starting | show full processlist |         0 |             0 |
+----+------+-----------------+-------------+---------+------+----------+-----------------------+-----------+---------------+
2 rows in set (0.00 sec)

mysql> ^Z
[1]+  Stopped                 mysql -ucat -p
$ ps auwx|grep 33430
cat        33430  0.3  4.8 6968968 784680 pts/7  Sl+  Oct12   3:23 /opt/MATLAB/R2020a/bin/glnxa64/MATLAB -nodesktop -prefersoftwareopengl
cat        53331  0.0  0.0  18668   732 pts/11   S+   10:08   0:00 grep 33430
$

apparently this only works for some limited cases:
https://www.mathworks.com/help/matlab/matlab_prog/indexing-into-function-call-results.html
not sure if this is new functionality or not.

Copy link
Contributor Author

@ixcat ixcat Oct 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

R.e. testing: could write, but noting we have literally no unit tests covering populate/parpopulate, basic joins, etc. so this is a bit of an undertaking & might be best to discuss approach / code architecture / etc. before proceeding to save unnecessary/redundant rework later during PR review

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the external-storage branch we have some tests related to populate within the external storage tests. If you want to take a look, they are in TestExternalFile.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r.e. point 1:
My thinking on why this might fall into a recursive loop is consider the following definition for query in your PR:

        function ret = query(self, queryStr, varargin)
            % dj.Connection/query - query(connection, queryStr, varargin) issue an
            % SQL query and return the result if any.
            % The same connection is re-used by all DataJoint objects.
            if ~self.isConnected
                self.connId=mym(-1, 'open', self.host, self.user, self.password, self.use_tls);
                if ~isempty(self.initQuery)
                    self.query(self.initQuery);
                end

                self.serverId = self.query('SELECT CONNECTION_ID() as id').id;

Based on the above, on first connection, it would go into if block, connect and then hit the line self.serverId and enter the query block once more likely infinitely or until hit a recursive limit, no? We might be fine if there is special handling on get of isConnected but don't recall off-hand.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r.e. point2:
Hmmm, I haven't yet come across a use-case where this has worked in my dev. Could be related that I dev in R2016b to ensure backward-compatibility but generally, it is best to evaluate functions into a variable and then index into it (as wasteful as it is...).

Copy link
Contributor Author

@ixcat ixcat Oct 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notes from 2020-10-16 dev:
r.e. recurse: adjust to direct mym calls
r.e. indexing: feature too new, save ref and index that
r.e. unit test: base on external storage which does have them

@ixcat ixcat changed the base branch from stage to external-storage October 30, 2020 20:52
@ixcat ixcat changed the base branch from external-storage to stage October 30, 2020 20:52
@ixcat
Copy link
Contributor Author

ixcat commented Oct 30, 2020

superseded by: #308

@ixcat ixcat closed this Oct 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants