fix(cli): import data errors with strings, nulls #290
base: main
Conversation
@@ -61,6 +64,13 @@ export const handler = async (
    signer,
    aliases,
  });
  const baseUrl = SdkHelpers.getBaseUrl(31337);
  const val = new Validator({ baseUrl });
  // get the table schema to help map values to their type
since we can't assume the type (e.g., a JS number could be a TEXT column), we have to map the schema col types to the CSV headers (see the update in utils).
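The mapping described above can be sketched as follows. This is a minimal illustration, not the CLI's actual helper: the `SchemaColumn` shape and function name are assumptions, since the real SDK schema object may differ.

```typescript
// Hypothetical shape; the real Tableland schema object may differ.
interface SchemaColumn {
  name: string;
  type: string; // e.g. "integer", "int", "text"
}

// Map each CSV header to its schema column type so row values can later
// be formatted according to the SQL column type, not the JS value type.
function mapHeadersToTypes(
  headers: string[],
  columns: SchemaColumn[]
): Array<{ name: string; type: string; idx: number }> {
  return headers.map((header, idx) => {
    const col = columns.find((c) => c.name === header);
    if (col === undefined) {
      throw new Error(`CSV header '${header}' not found in table schema`);
    }
    return { name: col.name, type: col.type, idx };
  });
}
```

With this in hand, a CSV header like `val` resolves to its `text` type even when the parsed value looks numeric.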
  rowCount === 1 ? "" : "s"
} into ${definition}
const rec = await result.meta.txn?.wait();
if (rec?.errorEventIdx !== undefined) {
before, we were logging that the tx was successful. in reality, data was being inserted like INSERT INTO my_table(id,val) VALUES(1,my_value), i.e., the my_value was erroring out since it wasn't wrapped in '
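The success-vs-failure check above can be sketched in isolation. This is a hedged illustration based only on the `errorEventIdx` field visible in the diff; the `error` field and the receipt shape are assumptions, not the SDK's confirmed API.

```typescript
// Assumed receipt shape, mirroring the `rec?.errorEventIdx` check in the
// diff; the actual SDK receipt type may carry different fields.
interface TxnReceipt {
  errorEventIdx?: number;
  error?: string; // assumed error message field
}

// Surface on-chain errors instead of unconditionally logging success.
function describeReceipt(rec: TxnReceipt | undefined): string {
  if (rec?.errorEventIdx !== undefined) {
    return `transaction failed: ${rec.error ?? "unknown error"}`;
  }
  return "transaction successful";
}
```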
  table: string,
) {
  let rowCount = 0;
  const batches = [];
  while (rows.length) {
    let statement = `INSERT INTO ${table}(${headers.join(",")})VALUES`;
    // map headers to schema columns to add in the header index
    const schemaWithIdx = headers.map((header, idx) => {
basically, take the schema columns and add the header/col index wrt the CSV file to help with the logic below
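The surrounding loop batches rows into INSERT statements. A simplified, self-contained sketch of that structure (the batch size constant and function name are illustrative, not the CLI's actual helpers, and row values here are assumed to be pre-formatted SQL literals):

```typescript
// Hypothetical batch size; the real CLI may size batches differently.
const MAX_ROWS_PER_STATEMENT = 100;

// Consume rows in batches and emit one multi-row INSERT per batch.
// `rows` is assumed to already hold SQL-ready literal values.
function buildInsertStatements(
  table: string,
  headers: string[],
  rows: string[][]
): string[] {
  const statements: string[] = [];
  const pending = [...rows]; // avoid mutating the caller's array
  while (pending.length) {
    const batch = pending.splice(0, MAX_ROWS_PER_STATEMENT);
    const values = batch.map((row) => `(${row.join(",")})`).join(",");
    statements.push(`INSERT INTO ${table}(${headers.join(",")})VALUES${values}`);
  }
  return statements;
}
```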
  schemaWithIdx: Array<Record<string, any>>,
) {
  // wrap values in single quotes if the `type` is not `int` nor `integer`
  const rowFormatted = row.map((val, idx) => {
there are probs more performant ways to achieve this setup, but this will get the type for the row value based on the header index. if the CSV row value gets parsed as an empty string, assume a NULL value; else, use the SQL type to dictate single quote wrapping.
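The formatting rule described here (empty string becomes NULL, integer types stay bare, everything else gets single quotes) can be sketched standalone. The function name and the flat `types` array are illustrative assumptions, not the CLI's actual shape:

```typescript
// Format a CSV row's values using the SQL type of each column.
// `types[idx]` is the SQL type for the column at CSV header index `idx`.
function formatRow(row: string[], types: string[]): string[] {
  return row.map((val, idx) => {
    if (val === "") return "NULL"; // empty CSV field → SQL NULL
    const type = types[idx].toLowerCase();
    if (type === "int" || type === "integer") return val; // numeric, unquoted
    return `'${val}'`; // text & other types need single-quote wrapping
  });
}
```

Note this sketch shares the limitation discussed below: a value containing a literal single quote would still break the statement.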
thinking about this a bit more...the downside here is that it doesn't sanitize the data. so, if you have a CSV file with single quotes in the actual value, that could pose a syntax issue. i did consider refactoring this to use the bind() method when initially creating the statements, which would handle this issue. instead, i ended up with this approach to keep things more aligned with the setup as-is.
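If parameter binding is off the table, SQLite's standard convention is to escape a literal single quote inside a string by doubling it (`''`). A minimal sanitizer sketch (the function name is illustrative):

```typescript
// Escape embedded single quotes by doubling them (SQLite convention),
// then wrap the whole value in single quotes.
function escapeSqlText(val: string): string {
  return `'${val.replace(/'/g, "''")}'`;
}
```

Binding (`bind()`) would still be the more robust option since it sidesteps string interpolation entirely; this sketch only addresses the quote-escaping case raised above.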
before(async function () {
  const batch = await db.batch([
    // use no backticks vs. including them to emulate a non-Studio vs. Studio
i noticed a comment about the Studio table columns having backticks around them, and the user might not know this: here. e.g., a col name is `id`, but in the UI, a user defines it as id.
i can't remember: is this still the approach? it didn't look to have an impact here, though, because i used a table that was created outside of the Studio and without the backticked col names.
@dtbuchholz I investigated this idea a little when I first built this command. At the time I decided that requiring the user to handle formatting was the most straightforward, but if people want the command to handle formatting it seems reasonable.
What's the plan with this PR @dtbuchholz and @joewagner?
I was looking at this today. If we do want to go this route, we should make sure quotes are being escaped correctly and row name escape characters work as expected.
test.only("can import with quotes, escape chars, and multi line", async function () {
  /* eslint-disable no-useless-escape */
  const csvStr = `id,val
1,"I'll work"
@joewagner up to you! i've been trying to mess with the csv parser to get things to work with these different test cases. lmk if this is what you were thinking.
(I haven't been able to get csv-parse to work properly tbh, and I'll probably need to add a SQLite-specific parser on top of it)
Summary
Fixes import-data command wrt strings and empty CSV values.
Edit: maybe the original setup and "requiring" the user to make the CSV SQL-compatible is the right approach? I initially made these changes to try and debug the issue, which turned out to be due to an outdated Studio CLI release. This PR is basically a more flexible way to send CSV data vs. the current setup.
Details
A user reported that importing data into a table was not working when they had an empty value in the CSV file. This led to further digging, uncovering that strings weren't being inserted correctly (they were not wrapped in single quotes) in addition to NULL values not being inserted in case a row had an empty value for a column.
How it was tested
Added a new test file for import-data flows.