-
Notifications
You must be signed in to change notification settings - Fork 30
connect Tableau to Elasticsearch #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
add attributes queried by Tableau past connect
driver/connect.c
Outdated
if (2 < i) | ||
return FALSE; | ||
return TRUE; | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hunk slipped through by mistake and will be removed.
Add SAVEFILE and FILEDSN. These aren't use by the driver (only by DM), but they are passed to the driver and the driver should pipe them through to the output connection string.
Add a 'setup' option to invoke MSVC's VsDevCmd.bat, which sets the needed build env variables and paths (including MSBuild.exe and cl.exe). Also reshuffled code around, groupping it in "functions", for easier maintenance.
This will facilitate string comparisions, since ODBC API specifies the string lenght in most of the function, but this can be given as a negative number for 0-terminated strings.
allow upper/lower/mixed-case parameters
Syntax changed in the plugin, implement the new format.
Add more statement attributes cases.
- add conversion to DATE - add conversion to SBIGINT
add basic support, always returning empty set (ES doesn't support keys).
"Server" config option is wider spread and recognized by apps.
convert to signed short and long.
s/t/d, since the value is not a pointer.
hunk slipped through by mistake
First testing with Tableau emphasized a couple of issues that need to be next addressed in the driver (multi-threading and logging), but pushing the testing further depends now on improving the SQL plugin in ES. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few comments.
It looks like most of this PR is about allowing the driver to accept and ignore features ES doesn't have. I guess the key thing here is making sure the docs clearly say which features are accepted (for compatibility with 3rd party products) but achieve nothing.
build.bat
Outdated
echo useful with 'setup' and 'reg*' arguments only; | ||
echo x86 and x64 platforms supported only. | ||
echo setup : invoke MSVC's build environment setup script before | ||
echo building (requires 2017 version of later^). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: of -> or
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
build.bat
Outdated
REM SETUP function: set-up the build environment | ||
:SETUP | ||
REM TODO: add some logic to detect available versions (Enterprise, Professional, Community) | ||
REM TODO: add support for vcvarsall.bat (2015) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it's a good idea to support lots of compiler versions. It's the opposite of what Elasticsearch does - they require a specific version of Java to build each branch. The ML C++ also requires a specific version of Visual Studio on each branch. I realise it will be easier for C than C++, but still could soak up time if there are unforeseen compatibility problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
Indeed, as far as MSVC is concerned, the code should stay as "standard" as possible and likely usable with all available versions. ANSI-wise, I might use some of the C11 features eventually. So fairly conservative.
In any case however, you're generally right, I would stick to one version, especially concerting the Editions. I was only thinking that if we need to support some older versions of Windows, it might simply be easier using an older version of the build environment all-together. But haven't invested any time into this yet (and it's not a high prio just yet).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was only thinking that if we need to support some older versions of Windows, it might simply be easier using an older version of the build environment all-together
I think if you want a compiler that defaults to building for Windows 7/Windows 2008r2 you'd need Visual Studio 2013.
In terms of binary compatibility with the OS libraries there's really no need to use an old compiler: simply add -DNTDDI_VERSION=0x06010000 -D_WIN32_WINNT=0x0601
to your compile flags and the Windows headers will prevent you using any functionality that would be incompatible. (Note: Windows 7/Windows 2008r2 = Windows NT 6.1)
The only reason I can think of for using Visual Studio 2013 would be that the programs it builds do not use Microsoft's Universal C Runtime (UCRT). The UCRT is an integral part of more modern versions of Windows, and has been distributed via Windows update for older versions, including Windows 7, for the last 3 years. So most people will have it. The only people who wouldn't have it are people running an old version of Windows who don't install Windows updates. For ML we've taken the view that such people will be very rare and if we ever come across one we'll just point them at Microsoft's docs on how to install the UCRT: https://support.microsoft.com/en-gb/help/2999226/update-for-universal-c-runtime-in-windows
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is great!
build.bat
Outdated
REM TODO: add some logic to detect available versions (Enterprise, Professional, Community) | ||
REM TODO: add support for vcvarsall.bat (2015) | ||
set RELEASE=2017 | ||
set EDITION=Community |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's not allowed to build commercial software with the community edition, so I think the default here should be professional. The professional edition is also what's installed on the VM used by release manager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point!
I'll need to sync with Infra on this, since they just pull packages through Chocolatey and (only?) Community is available through it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chocolatey isn't used any more for the release manager Windows VM, as it caused too many intermittent build failures.
Since elastic/infra#3313 the Windows VM used by release manager now has VS2017 Professional included in the base VM image rather than being added by Vagrant - in particular this section does the install when the base image is built: https://github.com/elastic/infra/pull/3313/files#diff-559c38c9cad0976bf4839e2b4e217f98R54
You can get the professional edition through Chocolatey as well though if you want to use Chocolatey for your local development. I seem to remember that Martijn allocated one of Elastic's MSDN subscriptions to you? If so you're entitled to use it. I don't think Chocolatey actually makes things any easier for Visual Studio now that the installer dynamically downloads the packages it needs. You'd have to get both https://chocolatey.org/packages/visualstudio2017professional and https://chocolatey.org/packages/visualstudio2017-workload-nativedesktop. Personally I found it easier to just get Microsoft's installer from the MSDN downloads site and use that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's great, thanks! I'll forward this to Infra (this https://github.com/elastic/infra/pull/4208/commits/775267ce1cb74a01fc0652408910f24196213c22 should probably be corrected then). Once I get their OK, I'll update the .bat.
Myself, yes, I have the Pro edition locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
driver/util.c
Outdated
@@ -109,6 +109,20 @@ int wmemncasecmp(const wchar_t *a, const wchar_t *b, size_t len) | |||
return diff; | |||
} | |||
|
|||
int wszmemcmp(const wchar_t *a, const wchar_t *b, long count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're using wchar_t
and SQLWCHAR
interchangably.
I think that will be fine on Windows, but on Linux does SQLWCHAR
stay as a signed 16 bit number? If so then this function will no longer work as wchar_t
is signed 32 bit on Linux.
It would be more consistent to use SQLWCHAR
throughout for UTF-16 strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're using wchar_t and SQLWCHAR interchangably.
For the compilers with wchar_t support (past C95), ODBC defines SQLWCHAR as wchar_t (for the others, it's an unsigned short -- problematic with current code).
Generally, the library functions will either use char or wchar_t types, of course (SQLWCHAR being an ODBC specific type), so the driver will need to make the conversion explicitly.
But for Unicode drivers and reasonably new compilers I believe SQLWCHAR will always be wchar_t on our target platforms (Win, *nix).
I think that will be fine on Windows, but on Linux does SQLWCHAR stay as a signed 16 bit number?
True. I hope I've only been using sizeof()
to compute wchar_t's size. Otherwise it's clearly a bug.
If so then this function will no longer work as wchar_t is signed 32 bit on Linux.
The function itself should hopefully still be correct, but the pointer conversions short <-> int will definitely be wrong, but also compiler flagged. And this, I believe would only occur on setups with no wchar_t support.
It would be more consistent to use SQLWCHAR throughout for UTF-16 strings.
I was having the task of having an ANSI-only driver version cover the evaluation of this assumption of SQLWCHAR as always wchar_t. We can also discuss this further during the next meeting though, if you think this should be dealt with sooner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I've just discovered a bug in int ansi_w2c(const SQLWCHAR *src, char *dst, size_t chars)
, which I've just noticed that it's using SQLWCHAR type as the wide character source. This is indeed not clean and I'd change it to wchar_t
to stay consistent.
driver/util.h
Outdated
* | ||
* This is useful in comparing SQL strings which the API allows to be passed | ||
* either as 0-terminated or not (SQL_NTS). | ||
* The function does a single pass (no lenght evaluation of the strings). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: lenght -> length
Same two lines below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. (ght instead of gth seems to be my unconscious favourite typo)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the coding style is inconsistent in places - e.g. the treatment of single line if statements. Perhaps consider the use of a tool such as clang-format to tidy the code up?
Other than that (& the points already made by Dave) the changes look good to me.
driver/util.c
Outdated
|
||
for (; *a && *b && count; a ++, b ++, count --) { | ||
diff = *a - *b; | ||
if (diff) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd strongly prefer that the bodies of all conditional statements were surrounded by braces, for reasons of code maintainability and consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry, was trying to incorporate always bracing the conditionals into my style, I missed this one. I'll update (and look into clang-format'ing).
And thanks!
Brace the block even for one-liners.
- This will be inline with ODBC-W API. In practice, the driver will only work with compilers/platforms where SQLWCHAR is a typedef or define of wchar_t. For that: - Added a sentinel to assert that this is the case.
Build script will iterate on Ent, Pro or Community, to find an installation to prepare the environment with. If Community is used, it'll warn about its license.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
connect Tableau to Elasticsearch
Tableau is a sophisticated BI tool, able to connect to most relevant data sources out there.
It also has the ability to use ODBC for connecting to databases.
The goal of this PR is to learn what are Tableau's requirements and develop the driver (potentially the SQL plugin too) to make the integration smooth.
Some further PRs might be created to complete this one.