support for reading csvs from a stream #141

Open
willm opened this issue Nov 23, 2024 · 2 comments

willm commented Nov 23, 2024

The CLI supports reading CSV data from stdin. Is there similar support for reading from a Node.js stream, so that the CSV doesn't have to be written to disk first?

willm commented Nov 23, 2024

I'm assuming you'd have to do something similar to the httpfs extension and override the OpenFile method?

willm commented Nov 24, 2024

Another approach, which avoids writing more C++, would be to use the httpfs extension: start a local HTTP server for the duration of the import and pipe your CSV stream into the response stream. This works, but feels a bit hacky. See this example using the fast-csv library.

const duckdb = require("duckdb");
const {format} = require("@fast-csv/format");
const {createServer} = require("http");

const db = new duckdb.Database(":memory:");
const con = db.connect();

con.run(`CREATE TABLE product (name VARCHAR);`);

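// Temporary local server that hands the CSV stream to DuckDB over HTTP.
const server = createServer((req, res) => {
  // The URL may be probed with a HEAD request first; answer it without a body.
  if (req.method === "HEAD") {
    res.writeHead(200, {});
    res.end();
    return;
  }
  if (req.url === "/csv") {
    // Create the CSV stream only for the request that actually consumes it.
    const csvStream = format({
      delimiter: ",",
      headers: ["name"],
    });
    res.writeHead(200, {"Content-Type": "text/csv"});
    // Piping ends the response when the CSV stream ends.
    // (The request body is not piped back: it would end the response early.)
    csvStream.pipe(res);
    csvStream.write(["test"]);
    csvStream.end();
    return;
  }
  // Anything else is not served.
  res.writeHead(404);
  res.end();
});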
server.listen(4444);

con.run(
  `
INSERT INTO product (name)
SELECT  name
FROM read_csv('http://localhost:4444/csv', header = true, delim = ',', columns = {
  'name': 'VARCHAR'
})
`,
  (err) => {
    // Stop the temporary server whether or not the import succeeded.
    server.close();
    if (err) {
      throw err;
    }
  }
);
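
The same workaround can be wrapped so it works with any readable stream and doesn't hard-code the port. This is only a rough sketch of the idea: `serveStreamOnce` is a hypothetical helper name, not something duckdb or fast-csv provides, and it assumes the stream is read by a single GET request. If DuckDB fetches the URL more than once, the stream would need to be buffered or recreated per request.

```javascript
const { createServer } = require("http");

// Hypothetical helper (not part of duckdb or fast-csv): serves one readable
// stream at /csv on a free loopback port and resolves with the URL to query.
function serveStreamOnce(readable) {
  return new Promise((resolve, reject) => {
    const server = createServer((req, res) => {
      if (req.url !== "/csv") {
        res.writeHead(404);
        res.end();
        return;
      }
      res.writeHead(200, { "Content-Type": "text/csv" });
      if (req.method === "HEAD") {
        // A HEAD probe must not consume the stream.
        res.end();
        return;
      }
      // pipe() ends the response once the source stream ends.
      readable.pipe(res);
    });
    server.once("error", reject);
    // Port 0 asks the OS for any free port.
    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address();
      resolve({
        url: `http://127.0.0.1:${port}/csv`,
        close: () => server.close(),
      });
    });
  });
}

// Intended use with the example above (assumption, not run here):
//   const { url, close } = await serveStreamOnce(csvStream);
//   con.run(`INSERT INTO product SELECT name FROM read_csv('${url}', header = true)`,
//           (err) => { close(); if (err) throw err; });
```

Binding to 127.0.0.1 keeps the temporary endpoint off the network, and letting the OS choose the port avoids clashing with whatever else might be listening on 4444.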
