Incorrect Characterization #1
This is async I/O, the same thing as in Google's Dart, which I always have problems with. I know there are areas where async I/O is better, especially in server-side applications, but there are also areas where it is cumbersome. Just try to implement a Unix `paste` or `join`: it is trivial with Perl/Python, but much harder with async I/O. Your line reading also adds noticeable overhead. For a naive implementation of `wc -l` on a ~9GB file without long lines, node.js takes 56s using 54MB RSS, k8 23s/8.1MB and perl 23s/1.6MB. At the end of the day, is it that hard for node.js and Dart to provide a proper readline()? I am not demanding much.
@lh3 just curious, are your benchmarks for …
@maxogden If you are taking this more seriously, here is a slightly better evaluation. This time, I am implementing a program that reports the length of the longest input line.
Implementations:

```perl
# perl
my $max = 0;
$max = $max > length - 1? $max : length - 1 while (<>);
print("$max\n");
```

```js
// node-1, modified from @jbenet's examples
var split = require('split'), max = 0;
process.stdin.pipe(split()).on('data', function(line) {
  max = max > line.length? max : line.length;
});
process.stdin.on('end', function() { console.log(max); });
```

```js
// k8-1; this is not quite fair as b is a byte array but we more frequently use a string
var f = new File(), b = new Bytes(), max = 0;
while (f.readline(b) >= 0) max = max > b.length? max : b.length;
print(max);
b.destroy(); f.close();
```

```js
// k8-2; similar to k8-1 except that the byte array is converted to a string
var f = new File(), b = new Bytes(), max = 0;
while (f.readline(b) >= 0) {
  var line = b.toString();
  max = max > line.length? max : line.length;
}
print(max);
b.destroy(); f.close();
```

```c
// c-1
#include <stdio.h>
#include <string.h>
int main()
{
    char buf[0x10000];
    int max = 0;
    while (fgets(buf, 0x10000, stdin) != 0) {
        int l = strlen(buf) - 1;
        max = max > l? max : l;
    }
    printf("%d\n", max);
    return 0;
}
```

```c
// c-3; c-2 is similar
#include <stdio.h>
#include "kseq.h"
KSTREAM_INIT(int, read, 0x10000)
int main()
{
    kstream_t *ks;
    int max = 0, dret;
    kstring_t str = {0,0,0};
    ks = ks_init(fileno(stdin));
    while (ks_getuntil(ks, 2, &str, &dret) >= 0) max = max > str.l? max : str.l;
    ks_destroy(ks);
    printf("%d\n", max);
    return 0;
}
```
this may be a more fair entry, written using a more modern node style:

```js
var split = require('binary-split')
var through = require('through2')

var max = 0
var counter = through(function(line, enc, next) {
  max = max > line.length ? max : line.length
  next()
}, function(next) {
  console.log(max)
  next()
})

process.stdin.pipe(split()).pipe(counter)
```
LOL
@lh3 try me! try me!

```js
var fs = require('fs')

var buf = new Buffer(65536)
var max = 0
var line = 0
var prevByte = 0

var ondata = function(err, read) {
  if (err) throw err
  if (!read) return console.log(max)
  for (var i = 0; i < read; i++) {
    if (buf[i] === 10) {
      if (prevByte === 13) line--
      if (max < line) max = line
      line = 0
    } else {
      line++
    }
    prevByte = buf[i]
  }
  fs.read(0, buf, 0, buf.length, null, ondata)
}

fs.read(0, buf, 0, buf.length, null, ondata)
```
@mafintosh added.
ah, the versions using …
I'm with @lh3 on this one, I wish …

```js
var readline = require('readline');

var rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
  terminal: false
});

var top = 0;
rl.on('line', function(cmd) {
  top = Math.max(top, cmd.length);
}).on('close', function() {
  console.log(top);
});
```
@paulfitz added. I gave up Dart for scripting also due to the lack of C/perl/python/ruby-like synchronous I/O. I know async I/O is important, but it does not conveniently replace other types of I/O routines in all scenarios.
Just came across this thing. You can make @mafintosh's solution a bit faster by scheduling multiple reads in parallel:

```js
var fs = require('fs')

var BUFFER_SIZE = (1<<20)
var front = new Buffer(BUFFER_SIZE)
var back = new Buffer(BUFFER_SIZE)
var max = 0
var last = 0
var prev = 0

function process(buffer, size) {
  var p = prev
  for (var i = 0; i < size; ++i) {
    var b = buffer[i]
    if (b === 10) {
      if (p === 13) {
        last += 1
      }
      max = Math.max(max, i - last)|0
      last = i
    }
    p = b
  }
  last -= size
  prev = p
}

function ondata(err, read) {
  if (err) {
    throw err
  }
  if (!read) {
    console.log(max)
    return
  }
  fs.read(0, back, 0, BUFFER_SIZE, null, ondata)
  process(front, read)
  var tmp = front
  front = back
  back = tmp
}

fs.read(0, front, 0, BUFFER_SIZE, null, ondata)
```
Outdated. Close.
Hello!

Cool work! Hacking with v8 is great :) And I'm glad you're making cool stuff. Though I couldn't help but cringe when I read:

(bold mine)

This is an incorrect characterization of Node.js. It is simply not true. Please check out node Streams; this is how IO is done in node.

Let's take your example, line splitting. Here's how you do it:
And it works beyond shells: you can use `process` to handle stdio, and you can use `through` to reverse them.

Put that in a `foo.js` and run it. I recommend you learn more about node Streams (they're almost magical!); a great place to start is https://github.com/substack/stream-handbook -- streams in node are much closer to UNIX pipes than C is. Yep, I'm well aware C was built for UNIX. Node's philosophy is really, really close to UNIX.
Oh, I almost forgot: you'll definitely have to install `split`, as it's not in core. How? You search npm for "stream line split" and find many modules, including `split`, which has a ton of downloads. You install it with `npm install split`.

I think maybe your real beef with node is that when you run `node` it doesn't come pre-loaded with lots of shiny things like `split` and so on. This is really a tradeoff between users: the idea is to keep a minimal core in Node and leave almost everything up to the modules. This is the core philosophy of how node and npm do their magic. Having programmed in many environments for a while, I have to say, this is actually amazingly productive. It's The Right Thing To Do.

I agree that having to search for and install a particular package is a bit hard to wrap your head around the first time you use node. It's a bit of friction when you first get started. Perhaps this will be fixed by better education: thankfully, core node folks have made nodeschool.io to help introduce the core concepts. Or by providing custom startup imports (i.e. a set of blessed modules you want with you on every shell execution). Or when we no longer have to install packages at all.
Cheers!
:)