TableParser parsing data incorrectly #62

Open
falconscript opened this issue Jul 18, 2017 · 5 comments

@falconscript

falconscript commented Jul 18, 2017

ps-node version: "0.1.6"
table-parser version: "0.1.3"

I've been running a Node process that calls ps-node every few minutes, and it worked for a while without problems. I also run a few other CPU-intensive processes on the same machine.

One day it stopped working, possibly because a PID, memory usage, or total CPU time value got too large. I didn't step through TableParser to find where it goes wrong. Below is some code with my output (some process names and arguments redacted) that reproduces the problem.

Maybe it's time to switch from Unix ps to using something like:
https://www.npmjs.com/package/procps

// Output copied from calling console.log() within ps-node/index.js's parseGrid function 
var psoutput = `F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4  1000  3002     1  20   0  45248     0 ep_pol Ss   ?          0:00 /lib/systemd/systemd --user
5  1000  3006  3002  20   0 163688     4 -      S    ?          0:00 (sd-pam)
0  1000  3101     1  20   0 9657524 6355656 poll_s Sl ?       1440:57 mysqld --datadir /aa/aa/aa
0  1000  3178     1  20   0 808752 247892 hrtime Sl  ?        11234:03 ./aa -aa /aa/aa/aa/aa.aa -aa /aa/aa/aa/aa/aa/aa/aa -aa /aa/aa/aa/aa/aa/bb/bb.bb.bb.bb -t 4 -bb /cc/cc/cc/cc/cc/cc/cc.cc -cc /a/a/a/a/a/a/a.a
0  1000  3198     1  20   0 639312     0 hrtime Sl   ?          3:54 ./h -d /a/a/a -p 3
0  1000  9215     1  20   0 1879292 95352 ep_pol Sl  ?         47:06 node ./d/d
0  1000 12547     1  20   0 1931424 21016 ep_pol Sl  ?          0:14 node app.js --prod
0  1000 17244     1  20   0 1450676 190724 ep_pol Sl ?        226:10 node app.js --prod
1  1000 17789     1  20   0  79940 44376 -      S    ?         17:43 tor --runasdaemon 1
5  1000 21352 21325  20   0 113860  1236 -      S    ?          0:01 sshd: user@pts/8
0  1000 21353 21352  20   0  22676  3804 wait_w Ss+  pts/8      0:00 -bash
5  1000 21675 21647  20   0 113868  1232 -      S    ?          0:00 sshd: user@pts/9
0  1000 21676 21675  20   0  22788  4748 wait   Ss   pts/9      0:00 -bash
0  1000 21973 21676  20   0 920496 28816 ep_pol Sl+  pts/9      0:00 node
0  1000 21987 21973  20   0  28916  1500 -      R+   pts/9      0:00 ps lx`;

// Try parsing and view the output
var TableParser = require('table-parser');
var garbledTable = TableParser.parse(psoutput);
console.log(garbledTable);
@falconscript
Author

falconscript commented Jul 18, 2017

An option I've found is to modify psargs as passed to ps-node to get the correct output:

var ps = require('ps-node');

// Find the running tor process, asking ps for only the fields listed in psargs
ps.lookup({ command: 'tor', psargs: 'awwxo pid,comm,args,ppid' }, (err, resultList) => {
  console.log(err, resultList);
});

However, for some of the processes, the first argument is the process name itself. This isn't a problem for me, but it might be for some users.

@neekey
Owner

neekey commented Jul 18, 2017

@falconscript thanks for reporting, I will take a look later!

@neekey
Owner

neekey commented Jul 22, 2017

According to table-parser's current algorithm:

 1, determine the edges (begin and end) of every title field
 2, parse all the lines except the title line to find all the connected domains
 3, group the connected domains that overlap vertically
 4, assign a domain group to a title field if the two overlap vertically
 5, calculate all the edge info from the relations between domain groups and title fields

So the problem is caused by this overlapping:

image

Hmm, very annoying. I might have to improve the algorithm, but it might not be soon. Any suggestions would be appreciated.
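
To make the overlap condition from the steps above concrete, here is a minimal sketch. It is not table-parser's actual code: the function names are made up for illustration, and treating each run of non-whitespace characters as one "domain" is an assumption. It only checks which title fields a value overlaps and skips the grouping step entirely.

```javascript
// Return every whitespace-delimited token in a line with its column span.
function tokenSpans(line) {
  var spans = [];
  var re = /\S+/g;
  var m;
  while ((m = re.exec(line)) !== null) {
    spans.push({ text: m[0], begin: m.index, end: m.index + m[0].length - 1 });
  }
  return spans;
}

// Two spans overlap vertically when their column ranges intersect.
function overlaps(a, b) {
  return a.begin <= b.end && b.begin <= a.end;
}

// For each token in a data row, list the header titles it overlaps.
function titlesTouched(headerLine, rowLine) {
  var titles = tokenSpans(headerLine);
  return tokenSpans(rowLine).map(function (tok) {
    return {
      text: tok.text,
      titles: titles
        .filter(function (t) { return overlaps(t, tok); })
        .map(function (t) { return t.text; })
    };
  });
}

// Hypothetical two-column table: the wide value under A spills under B.
var header = 'A   B';
var row = '12345 x';
console.log(titlesTouched(header, row));
```

In this made-up table the value `12345` touches both `A` and `B`, which is the kind of ambiguity a wide VSZ or RSS value can cause in real `ps lx` output.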

@falconscript
Author

I'm more surprised that the table can be split consistently at all. I wouldn't expect whitespace splitting to work for many cases.

As I mentioned above, changing the default arguments sent to ps from 'lx' to 'awwxo pid,comm,args,ppid' specifies only the desired fields and their order, since ps-node doesn't use all the columns anyway.

This works fine for me, and probably would work overall, but should get a little testing on Mac/Linux variants.
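
Once the fields and their order are fixed, the output could also be parsed without guessing column edges at all. Here is a rough sketch of that idea, not something ps-node does today. It assumes a different field order from the one above (pid, ppid, then args last, e.g. `ps -o pid=,ppid=,args=`) so the only free-form field is at the end of the line. The function name and sample lines are hypothetical.

```javascript
// Parse output of a ps invocation that prints exactly: pid, ppid, args.
// Assumes the first two fields are numeric and contain no spaces;
// everything after them is the command line and is kept verbatim.
function parsePsLines(output) {
  return output.split('\n')
    .map(function (line) { return /^\s*(\d+)\s+(\d+)\s+(.*)$/.exec(line); })
    .filter(function (m) { return m !== null; }) // drops headers and blank lines
    .map(function (m) {
      return { pid: Number(m[1]), ppid: Number(m[2]), command: m[3] };
    });
}

// Hypothetical sample, shaped like the output of such a ps call.
var sample = [
  '17789     1 tor --runasdaemon 1',
  ' 3101     1 mysqld --datadir /aa/aa/aa'
].join('\n');
console.log(parsePsLines(sample));
```

Because the numeric fields come first and the regex is anchored at the start of the line, spaces inside the command don't matter. The trade-off is that a name-only field like comm can't sit in the middle, since it may itself contain spaces.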

@dustingraham

The awwxo suggestion truncates the command name on CentOS 7. Using -ef instead worked well in my test.
