feat: trending queries #301

Merged
grantdfoster merged 40 commits into main from feat--trending-queries
Nov 21, 2024

grantdfoster commented Oct 31, 2024

Description

This PR fixes #297, #298

  1. Refactoring and Code Organization:

    • Added TODO comments suggesting the refactoring of certain classes to a synapses directory for better organization, as they are used by both validator and miner components.
  2. Enhancements in masa/base/miner.py:

    • Introduced a new method forward_tweets_synapse in BaseMinerNeuron to handle the forwarding of recent tweets using the RecentTweetsSynapse class.
    • Updated the axon attachment to use the new forward_tweets_synapse method.
  3. Updates in masa/miner/masa_protocol_request.py:

    • Added an optional timeout parameter to the get and post methods to allow customizable request timeouts.
  4. Improvements in masa/miner/twitter/tweets.py:

    • Modified the RecentTweetsSynapse class to include an optional timeout attribute.
    • Updated the forward_recent_tweets function to accept a max parameter, which is used to limit the number of tweets fetched.
    • Adjusted the TwitterTweetsRequest class to use the max_tweets parameter from the miner configuration.
  5. Configuration Changes in masa/utils/config.py:

    • Added a new command-line argument --twitter.max_tweets_per_request to define the maximum number of tweets to scrape per request.
  6. Enhancements in masa/validator/forwarder.py:

    • Replaced the fetch_twitter_config method with fetch_twitter_queries to fetch trending queries instead of a static configuration.
    • Updated the logic to check tweets since a specific date rather than just the current day.
    • Introduced a TweetValidator instance for validating tweets.
  7. Dependency Update in pyproject.toml:

    • Updated the masa-ai dependency version from 0.2.3 to 0.2.5.
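As an illustration of item 5, here is a minimal sketch of how a dotted command-line argument such as --twitter.max_tweets_per_request can be declared and read with Python's argparse. The default of 100, the help text, and the function names are assumptions for the example, not values taken from masa/utils/config.py.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the flag named in the PR description."""
    parser = argparse.ArgumentParser(description="Miner configuration (sketch)")
    parser.add_argument(
        "--twitter.max_tweets_per_request",
        type=int,
        default=100,  # assumed default, not confirmed by the PR
        help="Maximum number of tweets to scrape per request.",
    )
    return parser


def get_max_tweets(argv: Optional[List[str]] = None) -> int:
    """Parse argv and return the configured tweet limit."""
    args = build_parser().parse_args(argv)
    # argparse keeps the dot in the destination name, so use getattr.
    return getattr(args, "twitter.max_tweets_per_request")
```

Because the destination name contains a dot, the value is not reachable as a plain attribute, which is why the sketch reads it with getattr.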

These changes aim to improve the flexibility, organization, and functionality of the codebase, particularly in handling Twitter data and configuration management.
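To make the "since a specific date" change in item 6 concrete, the sketch below builds a search query limited to tweets since the previous day. The function name, the quoting of the keyword, and the use of the `since:YYYY-MM-DD` search operator are assumptions; the actual query format in masa/validator/forwarder.py may differ.

```python
from datetime import datetime, timedelta, timezone


def build_since_query(keyword: str, now: datetime, days_back: int = 1) -> str:
    """Return a query for `keyword` restricted to tweets since `days_back` days ago."""
    since = (now - timedelta(days=days_back)).strftime("%Y-%m-%d")
    # Quote the keyword so multi-word trends are matched as a phrase (assumed behaviour).
    return f'"{keyword}" since:{since}'


if __name__ == "__main__":
    print(build_since_query("bitcoin", datetime.now(timezone.utc)))
```

Passing `now` explicitly keeps the function deterministic, which makes it easier to test than one that reads the clock internally.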

Notes for Reviewers

None

Signed commits

  • Yes, I signed my commits.

grantdfoster self-assigned this Oct 31, 2024
grantdfoster linked an issue Oct 31, 2024 that may be closed by this pull request
2 tasks
grantdfoster linked an issue Oct 31, 2024 that may be closed by this pull request
2 tasks
grantdfoster marked this pull request as ready for review October 31, 2024 20:13
* chore: preps makefile for testing

* feat: randomness reduced on miner selection and query

* fix: quotes on keyword

* fix: cleanup makefile

* fix: centralize volume testing window param

* feat: tweets by UID (miner) (#303)

* fix: similarity score and prep makefile for testing

* fix: improvement on unique tweet ids

* fix: stores unique tweets by uid

* fix: removes unique tweets for uid when they are de-registered

* fix: simpler tweet indexing

* fix: adds endpoint to expose tweets by uid as well

* fix: function name change

* fix: cleanup endpoint logic

* fix: cleanup notes

* fix: remove note

* fix: revert makefile

* fix: makefile

codecov bot commented Nov 2, 2024

Codecov Report

Attention: Patch coverage is 74.89362% with 59 lines in your changes missing coverage. Please review.

Project coverage is 65.47%. Comparing base (59f0f18) to head (14fa81e).
Report is 1 commits behind head on main.

Files with missing lines        Patch %   Lines
masa/validator/forwarder.py     59.37%    26 Missing ⚠️
masa/miner/twitter/tweets.py    29.41%    12 Missing ⚠️
masa/api/server.py              22.22%    7 Missing ⚠️
masa/base/validator.py          85.29%    5 Missing ⚠️
masa/base/miner.py              72.72%    3 Missing ⚠️
masa/utils/uids.py              85.00%    3 Missing ⚠️
masa/synapses/__init__.py       92.30%    2 Missing ⚠️
masa/validator/scorer.py        50.00%    1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #301       +/-   ##
===========================================
+ Coverage   53.18%   65.47%   +12.28%     
===========================================
  Files          23       23               
  Lines        1286     1318       +32     
===========================================
+ Hits          684      863      +179     
+ Misses        602      455      -147     


hide-on-bush-x previously approved these changes Nov 5, 2024

hide-on-bush-x left a comment

LGTM. Tested, and it seems to be working as expected (only on testnet).

{
  "mainnet": {
    "organic": {
      "sample_size": 3,
Contributor

Would it make sense for the organic sample size to be defined by the requester instead of the miner?

Contributor Author

We can definitely make a ticket for that. My concern is that there would need to be a hard-coded limit on the organic sample size; otherwise someone could spam the entire network and request 240 miners per call.
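A minimal sketch of the hard-coded limit discussed above, assuming a requester-supplied sample size were ever allowed. The constant MAX_ORGANIC_SAMPLE_SIZE, its value of 10, and the function name are hypothetical; only the default of 3 comes from the config excerpt under review.

```python
from typing import Optional

# Hypothetical hard limit; the PR does not specify a value.
MAX_ORGANIC_SAMPLE_SIZE = 10


def resolve_sample_size(
    requested: Optional[int],
    default: int = 3,
    limit: int = MAX_ORGANIC_SAMPLE_SIZE,
) -> int:
    """Return the number of miners to query, never exceeding `limit`."""
    if requested is None:
        # No value supplied by the requester: fall back to the configured default.
        return default
    if requested < 1:
        raise ValueError("sample size must be at least 1")
    # Clamp oversized requests instead of rejecting them.
    return min(requested, limit)
```

With this shape, a request for 240 miners would be reduced to the limit rather than fanned out to the whole network.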

grantdfoster commented Nov 6, 2024

Waiting for tests to pass and the Codecov report, and for community input on mip2, before merging!

hide-on-bush-x previously approved these changes Nov 20, 2024

hide-on-bush-x left a comment

LGTM

One comment: I am seeing some weird errors when getting a 429 or similar from the protocol (they are not clear enough, I guess):

2024-11-20 10:30:07.382 |      ERROR       | bittensor:loggingmachine.py:457 |  - ConnectionError#af67eeea-81d9-429e-88a6-1c44fbee9f1b: HTTPConnectionPool(host='x.x.x.x', port=8080): Max retries exceeded with url: /api/v1/data/twitter/tweets/recent (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x10f5ca210>: Failed to establish a new connection: [Errno 61] Connection refused')) -

Maybe we could improve this error handling and expose something like "4XX error from protocol" to make that easier to understand.

grantdfoster commented

@hide-on-bush-x I added improved error handling to the recent tweets request!
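For readers following the thread, here is a rough sketch of the kind of message mapping suggested above. It is not the code merged in this PR: the function name and message wording are assumptions, and it uses only the standard library. Note that the log quoted above is a refused connection rather than an HTTP status, so the sketch covers both cases.

```python
from http import HTTPStatus
from typing import Optional


def describe_protocol_failure(
    status_code: Optional[int] = None,
    connection_failed: bool = False,
) -> str:
    """Turn a failed protocol request into a short, readable message."""
    if connection_failed or status_code is None:
        # No HTTP response at all, e.g. connection refused.
        return "Could not reach the protocol node (connection failed)"
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        # Non-standard status codes have no registered phrase.
        phrase = "Unknown"
    if 400 <= status_code < 500:
        return f"4XX error from protocol: {status_code} {phrase}"
    if 500 <= status_code < 600:
        return f"5XX error from protocol: {status_code} {phrase}"
    return f"Unexpected response from protocol: {status_code} {phrase}"
```

A caller could log the returned string in place of the raw exception text, keeping the full traceback at debug level.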

hide-on-bush-x mentioned this pull request Nov 21, 2024
1 task
grantdfoster merged commit a58f54a into main Nov 21, 2024
6 checks passed
grantdfoster deleted the feat--trending-queries branch November 21, 2024 17:59
5u6r054 added a commit that referenced this pull request Jan 27, 2025
* chore: installs sdk 0.2.5 locally, preps branch

* feat: incorporates new sdk, removes count and twitter config, updates query to since yesterday

* fix: timedelta

* fix: validation function name

* feat: miners define max count

* fix: cleans up timeout and counts

* feat: reduce randomness (#302)

* chore: preps makefile for testing

* feat: randomness reduced on miner selection and query

* fix: quotes on keyword

* fix: cleanup makefile

* fix: centralize volume testing window param

* feat: tweets by UID (miner) (#303)

* fix: similarity score and prep makefile for testing

* fix: improvement on unique tweet ids

* fix: stores unique tweets by uid

* fix: removes unique tweets for uid when they are de-registered

* fix: simpler tweet indexing

* fix: adds endpoint to expose tweets by uid as well

* fix: function name change

* fix: cleanup endpoint logic

* fix: cleanup notes

* fix: remove note

* fix: revert makefile

* fix: makefile

* fix: increased test coverage

* fix: cleans up tests

* fix: makefile

* fix: pushing config

* fix: backup fetch

* fix: naming

* fix: config and mapping

* fix: removes mock

* fix: config for network

* chore: ports entire config

* fix: adds tests

* fix: reverts makefile

* fix: refactor forwarder

* fix: since

* fix: update config

* fix: tempo blocktime

* fix: test error string

* fix: bump version

* fix: more tests

* fix: scoring test

* fix: cleanup unused files like docker and scrips

* fix: remove miner test for protocol

* feat: adds max tweet count and refactors synapses

* fix: healthcheck import

* fix: import

* fix: reverts makefile for deploy

* fix: Fix tests by adding registered miner and validator wallets as secrets, among other things (#306)

* setup hotkey of test miner

* add async library, fix warnings

* add async library, fix warnings

* add async library, fix warnings

* fix validator tests.

---------

Co-authored-by: JD <john@masa.ai>

* fix: vali test

* fix: twitter limit and makefile logging

* fix: state loading guards

* fix: improved error handling in recent tweets

* fix: adds requests library for better error handling

---------

Co-authored-by: J2D3 <156010594+5u6r054@users.noreply.github.com>
Co-authored-by: JD <john@masa.ai>
Development

Successfully merging this pull request may close these issues.

[ISSUE]: Increase Tweet Count and Timeout
[ISSUE]: Validator Integration of Masa SDK Trending Tweets
3 participants