Distributors API (TME Prices from API) #295

Closed
piotrkochan opened this issue Jul 24, 2018 · 28 comments

@piotrkochan

Feature / Enhancement request

TME Prices should be received via TME API https://developers.tme.eu/en

@hildogjr hildogjr added the discussion Discution about implementation and new features. label Jul 24, 2018
@hildogjr
Owner

This is my opinion, and maybe also @xesscorp's (we have already had this discussion in the past).
Some distributors allow the use of APIs, and I think this could be implemented in KiCost, using the GUI with an additional tab for your login and password (there is the first problem). But:

  1. A way to keep the password saved and safe has to be implemented (more programmers / collaborators are needed to code it);
  2. We have to keep the conventional scraping, because it allows all users to get all the information about a part (even when not logged in);
  3. Display "extra" component data by the distributor #4 and Specify spreadsheet currency #65 have to be supported (also Automatic specify generic components #17 in the future?);
  4. The big problem (I haven't read the full documentation): most distributors that offer an API allow only a limited number of queries on a free account (so far, no KiCad/KiCost user wants to pay for an account and then pay again for the PCB components).

@piotrkochan
Author

There is no need to use a login/password. You need to create an account at https://developers.tme.eu/en/ and then generate an application token. To query the TME API you need this token and a secret. The secret is used to calculate the request signature.

Python example is there: https://github.com/tme-dev/TME-API/blob/master/Python/call.py
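
To illustrate the token-plus-secret idea, here is a minimal sketch of signing a request. It assumes an OAuth-1.0-style scheme (HMAC-SHA1 over the method, the URL and the sorted parameters, base64-encoded); the exact rules are defined by the linked call.py, not by this snippet. The `sign_request` name, the placeholder URL and the parameter names are hypothetical.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote, urlencode


def sign_request(url, params, secret):
    """Return a base64 HMAC-SHA1 signature for a POST request (assumed scheme)."""
    # Sort the parameters so the signature does not depend on dict ordering.
    encoded = urlencode(sorted(params.items()), quote_via=quote)
    # Signature base: method, URL and parameters, each percent-encoded.
    base = "POST&" + quote(url, safe="") + "&" + quote(encoded, safe="")
    digest = hmac.new(secret.encode(), base.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


# Placeholder endpoint and parameter names, for illustration only.
params = {"Token": "my-token", "Language": "EN", "SymbolList[0]": "1N4007"}
signature = sign_request("https://api.example.invalid/prices.json", params, "my-secret")
```

The signature would then be sent along with the other parameters; only the token and the signature travel over the wire, never the secret itself.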

@hildogjr
Owner

hildogjr commented Jul 24, 2018

If there is no limit on searches/responses, this could be interesting to the user. But I think the "good user experience" way to implement it is to use the GUI and the library that wxPython provides for saving the software's settings.
The token must not be kept in the *.py file, but in the computer's registry or a settings file.
Do you already use it (the GUI)?

@piotrkochan
Author

I haven't used KiCost yet, I just found it today.

@hildogjr
Owner

So, have a look at:
Quick start: https://www.youtube.com/watch?v=AeccxROpDfY
Installation: https://github.com/xesscorp/KiCost/blob/master/docs/installation.rst
If you are not a skilled terminal/Python user, just type kicost in the terminal to open the graphical user interface (GUI, a new feature).

@xesscorp
Collaborator

My opinion about web scraping has started to change given what @mmmaisel told us about Mouser using Distil (#282). It's becoming increasingly difficult to scrape because of things like that, and also there's the continual modifications to the web scrapers as the distributors change the HTML coding of their web pages.

I looked at the Mouser API and it's not as bad as it was. It is giving more price/quantity info (up to four price breaks whereas it used to be two). It allows up to 1,000 queries per day and 50 results per query. It does require a key for each user. I haven't looked at the TME API, but I would be surprised if they don't require each individual to have their own key.

I've always considered the need for an individual key to be an inconvenience for the user. Maybe there is a way to automate getting the keys from each distributor.

In any event, it looks like using distributor APIs may become the only way to access their data as we move forward.

@hildogjr hildogjr changed the title TME Prices from API Distributors API (TME Prices from API) Jul 24, 2018
@hildogjr
Owner

I think that, before any step forward, this has to be discussed further as "the future of KiCost", to create some standardization for this implementation (if we decide to support APIs) and to do it in a way that supports future migration to APIs for the other distributors.

@xesscorp
Collaborator

xesscorp commented Aug 1, 2018

Any thoughts about how you want to move forward on this @hildogjr? Losing Mouser is a problem because they're the other major US distributor comparable to Digikey. In many cases, they have the best price. With Mouser out of the picture, KiCost is limited to being a tool for ordering from Digikey (in the US). And if Digikey decides to pull a similar trick...

I see four alternatives:

  1. Use the Mouser API and accept whatever limitations it has.
  2. Use some type of proxy service that will process the Javascript and return the HTML to KiCost.
  3. Use that v8 engine found by @mmmaisel to process the Javascript.
  4. Just say "fuck it" and put a banner under the Mouser section in the spreadsheet that says "Mouser doesn't want business from KiCad users."

With options 2 & 3, we still have to fake-out the Distil servers to make them believe we're valid users.

Do you see any other alternatives?

@hildogjr
Owner

hildogjr commented Aug 2, 2018

  1. I am reading the terms of use to understand the limitations.
    GOOD: better stability and speed for KiCost.
    BAD: needs the user to create the KEY.
    WORKING: I can easily store it in the GUI, which already provides the memory functionality, and pass it through kicost(foo, ..., KEYS). Maybe it is more complicated on the terminal? Try to read it from a %USERDATA%/KiCost directory?

  2. I don't have much experience with this but, despite keeping most of the KiCost code, it could create an additional module that is complicated to maintain and heavily dependent on this additional server/service. It could also require some additional login (if so, I prefer (1) over (2), because once the KEY is saved, no more work is needed);

  3. Interesting solution. Will this create more dependencies to install with KiCost? I saw some documentation and it appeared quite complicated and OS dependent; please, @mmmaisel, correct me if I am wrong.

  4. Yep, but it would be so bad to do that. :-/ Digikey and Mouser have the best cost-benefit for Brazil.

I see no other path to follow, but as far as I have "studied the points", my opinion is between (1) and (3).

@mmmaisel
Contributor

mmmaisel commented Aug 4, 2018

  1. Using the Mouser API may be a good temporary solution for users who already
    have an API key. Those APIs usually use JSON data for requests and responses and
    should be easy to implement in a new module (e.g. a distributor_api base class).
    Since APIs are designed for use with scripts, there is no need for anything like
    fake_browser or cookie handling. Just send a JSON object with the search keyword
    and API key and you will get another JSON object with the results.
    However, API keys should be treated like passwords, so don't store them anywhere
    unencrypted. This could be solved by using some secure password storage functions,
    but this leads directly into OS-dependent code hell.
    On the terminal, don't pass the key via the command line; instead, ask for it via stdin.
    Otherwise it would be stored in the terminal history and be visible in a process
    monitor.

  2. I can't recommend this.

  3. I don't think that this will create platform-dependent code. The main problem
    here is that we need to implement all the JavaScript APIs that are included in a
    browser ourselves (just faking the expected return values should be enough).
    In V8 this is done with the global->Set(name, function_ptr) method, which is
    not yet exposed in pyminiracer.
    This is my recommended long-term solution.

  4. Sounds nice as it saves a lot of work, but it is no solution for users.
    Users who depend on Mouser should at least be able to use the API (see 1).
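
Point 1 could be sketched roughly as follows. This is only an illustration under assumptions: the `DistributorApi` class, its JSON field names and the `read_api_key` helper are hypothetical and do not correspond to any real distributor API or to existing KiCost code. The transport is injected as a plain callable so the sketch runs without network access.

```python
import getpass
import json


class DistributorApi:
    """Hypothetical base class for JSON-based distributor APIs."""

    def __init__(self, api_key, transport):
        self.api_key = api_key
        # transport: any callable taking a JSON string and returning a JSON
        # string, e.g. a thin wrapper around an HTTP POST.
        self.transport = transport

    def build_request(self, keyword):
        # Field names are placeholders; each distributor defines its own.
        return {"api_key": self.api_key, "keyword": keyword}

    def search(self, keyword):
        reply = self.transport(json.dumps(self.build_request(keyword)))
        return json.loads(reply)


def read_api_key(prompt=getpass.getpass):
    """Ask for the key interactively so it never appears in argv or shell history."""
    return prompt("Distributor API key: ").strip()


def fake_transport(request_json):
    # Stand-in for a real HTTP call: echoes the keyword back as one result.
    request = json.loads(request_json)
    return json.dumps({"results": [{"part": request["keyword"], "price": 0.1}]})


api = DistributorApi("secret-key", fake_transport)
result = api.search("LM358")
```

In real use the key would come from `read_api_key()` instead of a literal, and a subclass per distributor would override `build_request` with that distributor's actual field names.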

@hildogjr
Owner

hildogjr commented Aug 4, 2018

Could we agree to dismiss (2) and (4), then, and discuss (1) and (3)?

(1) It will be interesting and quicker, but we will have to add at least one additional Python package (I think I already saw something for dealing with encrypted information). But really, if this is important (I have already seen bad programs that don't encrypt API keys), we have to discuss how to do it and how it interacts with the existing code.

(3) From the code I have studied so far, it is possible to use the engine of your installed browser (which creates OS-dependent code). I don't know how to proceed down this path (yet).

In both cases, I would like to improve #65 and #4.

@anderwm

anderwm commented Aug 21, 2018

Also, it may be worth considering Octopart again. If they are only rate limiting and not hard limiting requests, I think 3/sec might be OK.
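
A client-side throttle is one way to stay under such a limit. The sketch below is a generic illustration, not KiCost code: the `RateLimiter` class is hypothetical, and the figure of 3 requests per second is simply the number quoted above.

```python
import time


class RateLimiter:
    """Space calls at least 1/max_per_second seconds apart."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock    # injectable for testing
        self.sleep = sleep    # injectable for testing
        self.last_call = None

    def wait(self):
        """Block until the next request is allowed, then record its time."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


# Call limiter.wait() immediately before each API request.
limiter = RateLimiter(max_per_second=3)
```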

@xesscorp
Collaborator

I agree that doing some exploratory work with the Octopart API is the best option right now. Trying to fake out a browser to get around things like what Mouser is doing with Distil is just going to be too much work and we'll forever be patching it.

There appear to be several Python packages that implement an interface to Octopart. Those might be the easiest way to try it out and see how much Octopart will support the current KiCost functions.

As for the need of an API key, maybe we can implement something that makes getting and installing a key easier for the user. And if it's just Octopart (rather than every distributor), then it won't be so bad.

Of course, we could get screwed if Octopart limits part searches or closes up entirely, but we're getting screwed right now by Mouser and others making scraping impossible.

@anderwm

anderwm commented Aug 21, 2018

To me, a lot of the cool stuff you have done is the build-up of the spreadsheet and everything after you have the data (I guess you'd call it the user interface). If the interface between getting the part data and processing/displaying the part data were a little cleaner (it may be clean and I just haven't tried hard enough to understand it), you wouldn't be too screwed. It would just be like supporting multiple back ends (part data capture).
This tool is too useful to be lost to robot-detecting robots.

@hildogjr
Owner

@anderwm, since last year (when I joined the team) @devbisme and others have put a big effort into splitting the code, leaving all the scrape-based routines inside the distributors folder (and also refactoring the scrapers every time the sites changed). So, all the changes will be there.

I agree that the incredible feature of KiCost is the way it displays the prices and information and allows the users to check them (what is missing here is finishing the implementation of #4), and that the most difficult part to implement is the scraping itself (because of site changes and robot detection).

About Octopart, I came up with some observations. Since it returns results from a lot of real distributors, will it create more than one column in the spreadsheet? Will each distributor scrape module (and future distributor API) take precedence over Octopart (since they will have more information)?

Is #65 supported by this API?

@xesscorp
Collaborator

If Octopart can deliver the data we need (and a preliminary scan of their current API docs suggests that it can), then it should probably replace all the web scrapers currently in use and feed its data into the various columns of the spreadsheet. Then we will need a way for the user to select which distributors they want to use and display only the data for them. Otherwise, the spreadsheet will have hundreds of columns with distributors that nobody uses.
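
The distributor selection could look something like the sketch below. The offer layout (a flat dict with `seller` and `price` keys) is an assumption made for illustration and is not the actual Octopart response format; `select_distributors` is a hypothetical helper.

```python
def select_distributors(offers, wanted):
    """Group offers by seller name, keeping only the sellers the user asked for."""
    wanted_lower = {name.lower() for name in wanted}
    columns = {}
    for offer in offers:
        seller = offer["seller"]
        # Case-insensitive match, so "digi-key" selects "Digi-Key".
        if seller.lower() in wanted_lower:
            columns.setdefault(seller, []).append(offer)
    return columns


# Made-up offers for one part; a real response would carry many more sellers.
offers = [
    {"seller": "Digi-Key", "price": 0.52},
    {"seller": "Mouser", "price": 0.49},
    {"seller": "SomeOtherShop", "price": 0.60},
]
columns = select_distributors(offers, ["digi-key", "mouser"])
```

Each key of `columns` would then map to one distributor section of the spreadsheet, and everything else is dropped before the sheet is built.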

I don't know if the API takes the current location of the user into account when gathering distributor data, or if it handles currency conversion.

@hildogjr
Owner

There is a list on the main page, https://octopart.com/.
In octopart/__init__.py, a dict structure could be created with this information, plus colors and titles.
To start this development, it is a good idea to create a branch.

@mmmaisel
Contributor

I tried out Octopart and came to the conclusion that all the important data is available.
It also doesn't look like they have scraper protection (the whole page works without JavaScript), so I think the best idea would be to simply scrape the data from Octopart, as this avoids all the hassle with API keys.

Octopart API keys are definitely designed to be included in closed-source applications, as you have to register the application itself. However, I think it is a bad idea to create an API key for KiCost and include it publicly in the code.

@xesscorp
Collaborator

Is there a reason why we would want to keep the API code secret? I mean, some other application or person might start using our code, but would that cause a problem for us? The rate limiting is in effect regardless.

I would like to avoid scraping Octopart since that means we would have to keep up with how they format the HTML of their pages. That's one of the hassles with maintaining KiCost now.

@anderwm

anderwm commented Aug 25, 2018

Seems like a bad idea to go back to scraping unless you absolutely have to. Once you get enough users doing it, Octopart will start cracking down on it too and you will be back where you started.
I guess the problem with having a "kicost" Octopart token in the clear is that anyone could impersonate KiCost and violate Octopart's terms of service. To me, it doesn't seem too bad to make a user sign up and paste their own API key somewhere. Otherwise you would have to run a server somewhere that KiCost hits to get the key. But then how do you know it's really KiCost, and so on...

@hildogjr
Owner

I agree with @anderwm about having the user sign up and about how Octopart might behave.
In addition, we could keep the scrape modules as a secondary engine and warn the users about it.
I even think that, after some success with the Octopart API, this could be expanded to other APIs.

Like @xesscorp, I am not sure about encrypting the tokens; I have several applications installed here that don't do it.

If we decide on this path (starting with the Octopart API):

  1. The tokens could be kept in a MY_TOKENS.txt file edited by the user (we provide a template file with the installation):

    digikey_id=#Kjkjkjkjk##JKjl
    digikey_token=&232uiuuiousas8
    ... other API modules that KiCost could have

  2. It could be read by kicost .... --token MY_TOKENS.txt in __main__.py;

  3. kicost() in kicost.py will have an additional input as a dictionary:

    {'digikey': {'id': 'jjjjj', 'token': 'jkjkjk'}, 'octopart': {}, ...}

  4. This is also easy to implement in the GUI, since it already has a memory system.
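
Steps 1 to 3 could be connected by a small parser like the one sketched here. It assumes the `distributor_field=value` line format proposed above; `parse_tokens` is a hypothetical helper, not an existing KiCost function, and the sample values are invented.

```python
def parse_tokens(text):
    """Turn 'distributor_field=value' lines into {'distributor': {'field': value}}."""
    tokens = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not key=value
        # Split on the first '=' only, since values may themselves contain '='.
        key, value = line.split("=", 1)
        distributor, _, field = key.strip().partition("_")
        if not field:
            continue  # key does not have the 'distributor_field' shape
        tokens.setdefault(distributor, {})[field] = value.strip()
    return tokens


sample = """
digikey_id=example-id
digikey_token=example-token
octopart_token=another-token
"""
tokens = parse_tokens(sample)
```

The resulting dictionary has the shape described in step 3 and could be handed to kicost() directly, whether it was loaded from the file or from the GUI's stored settings.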

@xesscorp
Collaborator

I also checked Octopart and it looks like it will provide the information we need. You can do a single request with all the part numbers and it will return all the pricing and availability information for all the distributors in less than 200 msec. It doesn't look like the rate limit of 3 requests per second will be an issue. We might be able to replace all the distributor web-scraping code with a single module.

The pricing comes back with a field indicating the monetary unit. We might be able to do country-specific pricing with a simple currency-conversion web service and then apply it to the price tables. (That is, if Octopart doesn't already have this built into their API, somewhere.)
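
As a rough sketch of that conversion step: given price breaks in one currency and a table of exchange rates, the price table can be rescaled before it goes into the spreadsheet. The data layout, the `convert_price_breaks` helper and the exchange rate below are all invented for illustration; a real rate would come from whatever currency-conversion service is chosen.

```python
def convert_price_breaks(price_breaks, currency, target, rates):
    """Convert [(quantity, unit_price), ...] from `currency` into `target`."""
    if currency == target:
        return list(price_breaks)
    # rates maps (from_currency, to_currency) to a multiplier.
    rate = rates[(currency, target)]
    return [(qty, round(price * rate, 4)) for qty, price in price_breaks]


rates = {("USD", "BRL"): 4.0}  # made-up rate, for illustration only
usd_breaks = [(1, 0.50), (10, 0.45), (100, 0.40)]
brl_breaks = convert_price_breaks(usd_breaks, "USD", "BRL", rates)
```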

@hildogjr
Owner

Maybe they have; at least on the web page the currency is always converted to my own (which is not available at Digikey).

@xesscorp
Collaborator

I tried setting my VPN to places like Spain and Mexico, but the prices through the API still came back in US dollars. Maybe it's keyed to something else in the environment. I've attached the little example program I'm using. You might try running it where you are and see if the prices come back in your local currency. (You'll have to run it in Python 2.7.)

test.zip

@hildogjr
Owner

Yes, it returned USD (the web page here returns BRL). Could there be some difference if it were run through fake_browser?

I am curious now:

  1. Could Octopart also provide "not recommended for new designs"? I was thinking of using this for highlighting in the spreadsheet (also the package and other information, along the path of implementing Display "extra" component data by the distributor #4 and some validation checking).
  2. The API returned a link (that redirects to the distributor's page), but I didn't see the distributor-specific stock code (that we use to create the purchase list). Checking the full JSON, it appears to be present in the 'sku' field.

@hildogjr
Owner

Using the generated link directly in my browser also returned USD. So, fake_browser will do the same.

@xesscorp
Collaborator

I haven't seen an "obsolete" field in the JSON.

Yes, the distributor's ordering code is in the sku.

@hildogjr
Owner

@piotrkochan, discussion merged into #314.
