Skip to content

connect Tableau to Elasticsearch #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Apr 20, 2018
Merged

connect Tableau to Elasticsearch #3

merged 17 commits into from
Apr 20, 2018

Conversation

bpintea
Copy link
Collaborator

@bpintea bpintea commented Apr 4, 2018

Tableau is a sophisticated BI tool, able to connect to most relevant data sources out there.
It also has the ability to use ODBC for connecting to databases.
The goal of this PR is to learn what are Tableau's requirements and develop the driver (potentially the SQL plugin too) to make the integration smooth.

Some further PRs might be created to complete this one.

add attributes queried by Tableau past connect
driver/connect.c Outdated
if (2 < i)
return FALSE;
return TRUE;
#endif
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hunk slipped through by mistake and will be removed.

@bpintea bpintea requested a review from costin April 4, 2018 22:09
@bpintea bpintea mentioned this pull request Apr 5, 2018
Closed
bpintea added 11 commits April 6, 2018 18:57
Add SAVEFILE and FILEDSN. These aren't use by the driver (only by DM),
but they are passed to the driver and the driver should pipe them
through to the output connection string.
Add a 'setup' option to invoke MSVC's VsDevCmd.bat, which sets the
needed build env variables and paths (including MSBuild.exe and cl.exe).

Also reshuffled code around, groupping it in "functions", for easier
maintenance.
This will facilitate string comparisions, since ODBC API specifies the
string lenght in most of the function, but this can be given as a
negative number for 0-terminated strings.
allow upper/lower/mixed-case parameters
Syntax changed in the plugin, implement the new format.
Add more statement attributes cases.
- add conversion to DATE
- add conversion to SBIGINT
add basic support, always returning empty set (ES doesn't support keys).
"Server" config option is wider spread and recognized by apps.
convert to signed short and long.
s/t/d, since the value is not a pointer.
@bpintea bpintea requested review from droberts195 and edsavage April 11, 2018 17:37
hunk slipped through by mistake
@bpintea
Copy link
Collaborator Author

bpintea commented Apr 11, 2018

First testing with Tableau emphasized a couple of issues that need to be next addressed in the driver (multi-threading and logging), but pushing the testing further depends now on improving the SQL plugin in ES.
So I'd rather conclude this PR here, safe for the any changes required by reviewers.

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments.

It looks like most of this PR is about allowing the driver to accept and ignore features ES doesn't have. I guess the key thing here is making sure the docs clearly say which features are accepted (for compatibility with 3rd party products) but achieve nothing.

build.bat Outdated
echo useful with 'setup' and 'reg*' arguments only;
echo x86 and x64 platforms supported only.
echo setup : invoke MSVC's build environment setup script before
echo building (requires 2017 version of later^).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: of -> or

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

build.bat Outdated
REM SETUP function: set-up the build environment
:SETUP
REM TODO: add some logic to detect available versions (Enterprise, Professional, Community)
REM TODO: add support for vcvarsall.bat (2015)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's a good idea to support lots of compiler versions. It's the opposite of what Elasticsearch does - they require a specific version of Java to build each branch. The ML C++ also requires a specific version of Visual Studio on each branch. I realise it will be easier for C than C++, but still could soak up time if there are unforeseen compatibility problems.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.
Indeed, as far as MSVC is concerned, the code should stay as "standard" as possible and likely usable with all available versions. ANSI-wise, I might use some of the C11 features eventually. So fairly conservative.
In any case however, you're generally right, I would stick to one version, especially concerting the Editions. I was only thinking that if we need to support some older versions of Windows, it might simply be easier using an older version of the build environment all-together. But haven't invested any time into this yet (and it's not a high prio just yet).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was only thinking that if we need to support some older versions of Windows, it might simply be easier using an older version of the build environment all-together

I think if you want a compiler that defaults to building for Windows 7/Windows 2008r2 you'd need Visual Studio 2013.

In terms of binary compatibility with the OS libraries there's really no need to use an old compiler: simply add -DNTDDI_VERSION=0x06010000 -D_WIN32_WINNT=0x0601 to your compile flags and the Windows headers will prevent you using any functionality that would be incompatible. (Note: Windows 7/Windows 2008r2 = Windows NT 6.1)

The only reason I can think of for using Visual Studio 2013 would be that the programs it builds do not use Microsoft's Universal C Runtime (UCRT). The UCRT is an integral part of more modern versions of Windows, and has been distributed via Windows update for older versions, including Windows 7, for the last 3 years. So most people will have it. The only people who wouldn't have it are people running an old version of Windows who don't install Windows updates. For ML we've taken the view that such people will be very rare and if we ever come across one we'll just point them at Microsoft's docs on how to install the UCRT: https://support.microsoft.com/en-gb/help/2999226/update-for-universal-c-runtime-in-windows

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is great!

build.bat Outdated
REM TODO: add some logic to detect available versions (Enterprise, Professional, Community)
REM TODO: add support for vcvarsall.bat (2015)
set RELEASE=2017
set EDITION=Community
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's not allowed to build commercial software with the community edition, so I think the default here should be professional. The professional edition is also what's installed on the VM used by release manager.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point!
I'll need to sync with Infra on this, since they just pull packages through Chocolatey and (only?) Community is available through it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chocolatey isn't used any more for the release manager Windows VM, as it caused too many intermittent build failures.

Since elastic/infra#3313 the Windows VM used by release manager now has VS2017 Professional included in the base VM image rather than being added by Vagrant - in particular this section does the install when the base image is built: https://github.com/elastic/infra/pull/3313/files#diff-559c38c9cad0976bf4839e2b4e217f98R54

You can get the professional edition through Chocolatey as well though if you want to use Chocolatey for your local development. I seem to remember that Martijn allocated one of Elastic's MSDN subscriptions to you? If so you're entitled to use it. I don't think Chocolatey actually makes things any easier for Visual Studio now that the installer dynamically downloads the packages it needs. You'd have to get both https://chocolatey.org/packages/visualstudio2017professional and https://chocolatey.org/packages/visualstudio2017-workload-nativedesktop. Personally I found it easier to just get Microsoft's installer from the MSDN downloads site and use that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great, thanks! I'll forward this to Infra (this https://github.com/elastic/infra/pull/4208/commits/775267ce1cb74a01fc0652408910f24196213c22 should probably be corrected then). Once I get their OK, I'll update the .bat.
Myself, yes, I have the Pro edition locally.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

driver/util.c Outdated
@@ -109,6 +109,20 @@ int wmemncasecmp(const wchar_t *a, const wchar_t *b, size_t len)
return diff;
}

int wszmemcmp(const wchar_t *a, const wchar_t *b, long count)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're using wchar_t and SQLWCHAR interchangably.

I think that will be fine on Windows, but on Linux does SQLWCHAR stay as a signed 16 bit number? If so then this function will no longer work as wchar_t is signed 32 bit on Linux.

It would be more consistent to use SQLWCHAR throughout for UTF-16 strings.

Copy link
Collaborator Author

@bpintea bpintea Apr 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're using wchar_t and SQLWCHAR interchangably.

For the compilers with wchar_t support (past C95), ODBC defines SQLWCHAR as wchar_t (for the others, it's an unsigned short -- problematic with current code).
Generally, the library functions will either use char or wchar_t types, of course (SQLWCHAR being an ODBC specific type), so the driver will need to make the conversion explicitly.
But for Unicode drivers and reasonably new compilers I believe SQLWCHAR will always be wchar_t on our target platforms (Win, *nix).

I think that will be fine on Windows, but on Linux does SQLWCHAR stay as a signed 16 bit number?

True. I hope I've only been using sizeof() to compute wchar_t's size. Otherwise it's clearly a bug.

If so then this function will no longer work as wchar_t is signed 32 bit on Linux.

The function itself should hopefully still be correct, but the pointer conversions short <-> int will definitely be wrong, but also compiler flagged. And this, I believe would only occur on setups with no wchar_t support.

It would be more consistent to use SQLWCHAR throughout for UTF-16 strings.

I was having the task of having an ANSI-only driver version cover the evaluation of this assumption of SQLWCHAR as always wchar_t. We can also discuss this further during the next meeting though, if you think this should be dealt with sooner.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I've just discovered a bug in int ansi_w2c(const SQLWCHAR *src, char *dst, size_t chars), which I've just noticed that it's using SQLWCHAR type as the wide character source. This is indeed not clean and I'd change it to wchar_t to stay consistent.

driver/util.h Outdated
*
* This is useful in comparing SQL strings which the API allows to be passed
* either as 0-terminated or not (SQL_NTS).
* The function does a single pass (no lenght evaluation of the strings).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: lenght -> length

Same two lines below.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. (ght instead of gth seems to be my unconscious favourite typo)

Copy link
Collaborator

@edsavage edsavage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the coding style is inconsistent in places - e.g. the treatment of single line if statements. Perhaps consider the use of a tool such as clang-format to tidy the code up?

Other than that (& the points already made by Dave) the changes look good to me.

driver/util.c Outdated

for (; *a && *b && count; a ++, b ++, count --) {
diff = *a - *b;
if (diff)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd strongly prefer that the bodies of all conditional statements were surrounded by braces, for reasons of code maintainability and consistency.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, was trying to incorporate always bracing the conditionals into my style, I missed this one. I'll update (and look into clang-format'ing).
And thanks!

bpintea added 4 commits April 16, 2018 13:02
Brace the block even for one-liners.
- This will be inline with ODBC-W API.
  In practice, the driver will only work with compilers/platforms where
  SQLWCHAR is a typedef or define of wchar_t. For that:
- Added a sentinel to assert that this is the case.
Build script will iterate on Ent, Pro or Community, to find an
installation to prepare the environment with. If Community is used,
it'll warn about its license.
Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bpintea bpintea merged commit 3de5728 into elastic:master Apr 20, 2018
@bpintea bpintea deleted the feature/tableau_connect branch April 20, 2018 16:01
bpintea added a commit that referenced this pull request Jun 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants