
Some fixes for OpenSea, Seaport, LooksRare and X2Y2 #1536

Merged
merged 14 commits into from
Oct 24, 2022

Conversation

hildobby
Collaborator

@hildobby hildobby commented Sep 8, 2022

Brief comments on the purpose of your changes:

Buyer is sometimes the address of an aggregator; fixed that for OpenSea, Seaport, LooksRare & X2Y2 (this query should ideally always return nothing: https://dune.com/queries/1249102).
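As a sketch, the buyer fix amounts to swapping in the NFT transfer recipient whenever the recorded buyer is a known aggregator contract (table and column names below follow the repo's conventions but are illustrative, not the exact patch):

```sql
-- Sketch only: resolve the real buyer when the recorded buyer is an aggregator.
SELECT
  CASE
    WHEN t.buyer = agg.contract_address AND erc721.`to` IS NOT NULL THEN erc721.`to`
    ELSE t.buyer
  END AS buyer
FROM trades t
LEFT JOIN nft.aggregators agg
  ON agg.contract_address = t.buyer
LEFT JOIN erc721_ethereum.evt_transfer erc721
  ON erc721.evt_tx_hash = t.tx_hash
 AND erc721.`from` = t.buyer
```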

Also for LooksRare: fixed the sometimes-missing token_standard, changed NULLs to 0 for royalty/marketplace fees, and simplified number_of_items.

FYI: this probably needs a full rerun of all those models to apply the fixes retroactively :)

For Dune Engine V2
I've checked that:
General checks:

  • I tested the query on dune.com after compiling the model with dbt compile (compiled queries are written to the target directory)
  • I used "refs" to reference other models in this repo and "sources" to reference raw or decoded tables
  • if adding a new model, I added a test
  • the filename is unique and ends with .sql
  • each sql file is a select statement and has only one view, table or function defined
  • column names are lowercase_snake_cased
  • if adding a new model, I edited the dbt project YAML file with new directory path for both models and seeds (if applicable)
  • if adding a new model, I edited the alter table macro to display new database object (table or view) in UI explorer
  • if adding a new materialized table, I edited the optimize table macro

Join logic:

  • if joining to base table (i.e. ethereum transactions or traces), I looked to make it an inner join if possible

Incremental logic:

  • I used is_incremental & not is_incremental jinja block filters on both base tables and decoded tables
    • where block_time >= date_trunc("day", now() - interval '1 week')
  • if joining to base table (i.e. ethereum transactions or traces), I applied join condition where block_time >= date_trunc("day", now() - interval '1 week')
  • if joining to prices view, I applied join condition where minute >= date_trunc("day", now() - interval '1 week')
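Taken together, the incremental items above boil down to a jinja pattern roughly like this (a sketch of the repo's convention, not an actual model):

```sql
SELECT *
FROM {{ source('ethereum', 'transactions') }} tx
{% if is_incremental() %}
WHERE tx.block_time >= date_trunc("day", now() - interval '1 week')
{% endif %}
```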

@masquot
Contributor

masquot commented Sep 9, 2022

Thanks so much for these great updates @hildobby !

Your proposal introduces expensive joins in every individual model. Even if we run these incrementally, it might still require a lot of compute. cc @aalan3 @dot2dotseurat

I wonder if we should instead use an UPDATE statement that fixes the buyer values for all marketplaces in one run, separate from the initial and incremental inserts: https://docs.databricks.com/sql/language-manual/delta-update.html. Ideally it would run right after the insert or merge statements complete.
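For illustration, such a post-insert correction on Databricks could be written as a MERGE against a staging view (the `resolved_buyers` view here is hypothetical):

```sql
-- Sketch: run after the initial/incremental inserts complete.
MERGE INTO nft.trades t
USING resolved_buyers r  -- hypothetical view mapping tx_hash to the real buyer
ON t.tx_hash = r.tx_hash
AND t.buyer = r.aggregator_address
WHEN MATCHED THEN UPDATE SET t.buyer = r.real_buyer
```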

@dsalv can you please look into this?

@masquot masquot added the question Further information is requested label Sep 9, 2022
soispoke pushed a commit that referenced this pull request Sep 11, 2022
Removed join on erc20 transfers for now as it results in duplication of buy events and duplicate `unique_trade_id` in some cases.

fyi @hildobby an example of a tx that results in duplication is [0x04f019f90a5afea8bdc72dddccdf5f7fb5ec5cdce69877497ad35bdabe07f9c5](https://etherscan.io/tx/0x04f019f90a5afea8bdc72dddccdf5f7fb5ec5cdce69877497ad35bdabe07f9c5#eventlog)

#1536 may very well be affected by this issue also.
@soispoke
Contributor

@dsalv @hildobby @masquot Did we end up making a call on these changes?

@hildobby
Collaborator Author

@dsalv @hildobby @masquot Did we end up making a call on these changes?

not that I know of, waiting on @dsalv @masquot for now

@jeff-dude jeff-dude added WIP work in progress and removed v2_engine labels Oct 3, 2022
@jeff-dude
Member

@masquot looks like you had open concerns here. if you'd like me to look into it, please assign me to the PR & share a few updated thoughts on where you'd like to see this go

@jeff-dude jeff-dude self-assigned this Oct 6, 2022
@jeff-dude jeff-dude added ready-for-review this PR development is complete, please review and removed question Further information is requested WIP work in progress labels Oct 6, 2022
@jeff-dude
Member

@hildobby please merge the dune main branch into this branch. i tried, but i'm getting merge conflicts that conflict with other recent merge changes & i'm not 100% sure the direction needed

@jeff-dude jeff-dude added WIP work in progress and removed ready-for-review this PR development is complete, please review labels Oct 6, 2022
@hildobby
Collaborator Author

hildobby commented Oct 7, 2022

@hildobby please merge the dune main branch into this branch. i tried, but i'm getting merge conflicts that conflict with other recent merge changes & i'm not 100% sure the direction needed

Done @jeff-dude

@jeff-dude
Member

please see the dbt slim ci job, it's failing due to issues with columns in select statement in opensea

@jeff-dude
Member

@hildobby the job is still failing. i highly recommend running the dbt commands locally and finalizing. run a dbt clean then dbt deps, dbt compile, grab the query and run on dune and ensure it's working as expected

@hildobby
Collaborator Author

@hildobby the job is still failing. i highly recommend running the dbt commands locally and finalizing. run a dbt clean then dbt deps, dbt compile, grab the query and run on dune and ensure it's working as expected

I run dbt compile before every commit, make sure it passes, and run the compiled query on dune.com as well. I wasn't aware of dbt clean and dbt deps, but I'll also run those regularly. Unfortunately, compiling and running on dune.com often works, yet I still have to wait after every push to see if the tests pass. Since it takes a while for GitHub's CI to run, I usually don't have time to wait and just check the results whenever I next look at it. If we could run the tests locally, that would be a game changer for sure.

Also, I just reran pipenv install for the first time in a few weeks and realised you updated the dbt_utils version. You may want to add some kind of warning whenever the pipenv file changes so that we all rerun the install.

@jeff-dude
Member

@hildobby the job is still failing. i highly recommend running the dbt commands locally and finalizing. run a dbt clean then dbt deps, dbt compile, grab the query and run on dune and ensure it's working as expected

I run dbt compile before every commit, make sure it passes, and run the compiled query on dune.com as well. I wasn't aware of dbt clean and dbt deps, but I'll also run those regularly. Unfortunately, compiling and running on dune.com often works, yet I still have to wait after every push to see if the tests pass. Since it takes a while for GitHub's CI to run, I usually don't have time to wait and just check the results whenever I next look at it. If we could run the tests locally, that would be a game changer for sure.

Also, I just reran pipenv install for the first time in a few weeks and realised you updated the dbt_utils version. You may want to add some kind of warning whenever the pipenv file changes so that we all rerun the install.

definitely understand the pain points on compile/run/testing at the moment, but we will keep trying to enhance it. this PR involves multiple models, so we have to ensure we check them all. for instance, i cloned locally, ran dbt compile, then grabbed the opensea v1 query and pasted it in dune. here is the output:

WITH
  wyvern_call_data as (
    SELECT
      call_tx_hash,
      call_block_time,
      CASE
        WHEN contains('0x68f0bcaa', substring(calldataBuy, 1, 4)) THEN 'Bundle Trade'
        ELSE 'Single Item Trade'
      END AS trade_type,
      CASE
        WHEN contains('0xfb16a595', substring(calldataBuy, 1, 4)) THEN 'erc721'
        WHEN contains('0x23b872dd', substring(calldataBuy, 1, 4)) THEN 'erc721'
        WHEN contains('0x96809f90', substring(calldataBuy, 1, 4)) THEN 'erc1155'
        WHEN contains('0xf242432a', substring(calldataBuy, 1, 4)) THEN 'erc1155'
      END AS token_standard,
      addrs [0] as project_contract_address,
      CASE
        WHEN contains('0xfb16a595', substring(calldataBuy, 1, 4)) THEN '0x' || substr(calldataBuy, 163, 40)
        WHEN contains('0x96809f90', substring(calldataBuy, 1, 4)) THEN '0x' || substr(calldataBuy, 163, 40)
        WHEN contains('0x23b872dd', substring(calldataBuy, 1, 4)) THEN addrs [4]
        WHEN contains('0xf242432a', substring(calldataBuy, 1, 4)) THEN addrs [4]
      END AS nft_contract_address,
      CASE
        -- Replace `ETH` with `WETH` for ERC20 lookup later
        WHEN addrs [6] = '0x0000000000000000000000000000000000000000' THEN '0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2'
        ELSE addrs [6]
      END AS currency_contract,
      uints [4] as amount_original,
      addrs[4] as shared_storefront_address,
      addrs [1] as buyer,
      addrs [8] AS seller,
      -- Temporary fix for token ID until we implement a UDF equivalent for bytea2numeric that works for numbers higher than 64 bits
      CASE
        WHEN contains('0xfb16a595', substring(calldataBuy, 1, 4))
        AND conv(substr(calldataBuy, 203, 64), 16, 10):: string = '18446744073709551615' THEN 'Token ID is larger than 64 bits and can not be displayed'
        WHEN contains('0x96809f90', substring(calldataBuy, 1, 4))
        AND conv(substr(calldataBuy, 203, 64), 16, 10):: string = '18446744073709551615' THEN 'Token ID is larger than 64 bits and can not be displayed'
        WHEN contains('0x23b872dd', substring(calldataBuy, 1, 4))
        AND conv(substr(calldataBuy, 139, 64), 16, 10):: string = '18446744073709551615' THEN 'Token ID is larger than 64 bits and can not be displayed'
        WHEN contains('0xf242432a', substring(calldataBuy, 1, 4))
        AND conv(substr(calldataBuy, 139, 64), 16, 10):: string = '18446744073709551615' THEN 'Token ID is larger than 64 bits and can not be displayed'
        WHEN contains('0xfb16a595', substring(calldataBuy, 1, 4)) THEN conv(substr(calldataBuy, 203, 64), 16, 10):: string
        WHEN contains('0x96809f90', substring(calldataBuy, 1, 4)) THEN conv(substr(calldataBuy, 203, 64), 16, 10):: string
        WHEN contains('0x23b872dd', substring(calldataBuy, 1, 4)) THEN conv(substr(calldataBuy, 139, 64), 16, 10):: string
        WHEN contains('0xf242432a', substring(calldataBuy, 1, 4)) THEN conv(substr(calldataBuy, 139, 64), 16, 10):: string
      END AS token_id,
      CASE
        WHEN size(call_trace_address) = 0 then array(3:: bigint) -- for bundle join
        ELSE call_trace_address
      END as call_trace_address,
      addrs [6] AS currency_contract_original
    FROM
      opensea_ethereum.wyvernexchange_call_atomicmatch_ wc
    WHERE
      (
        addrs[3] = '0x5b3256965e7c3cf26e11fcaf296dfc8807c01073'
        OR addrs [10] = '0x5b3256965e7c3cf26e11fcaf296dfc8807c01073'
      )
      AND call_success = true
      AND call_block_time >= date_trunc("day", now() - interval '1 week')
  ),
  wyvern_all as (
    SELECT
      call_tx_hash,
      call_block_time,
      trade_type,
      token_standard,
      project_contract_address,
      nft_contract_address,
      currency_contract,
      amount_original,
      shared_storefront_address,
      buyer,
      seller,
      token_id,
      call_trace_address,
      currency_contract_original,
      fees,
      fees.to as fee_receive_address,
      fees.fee_currency_symbol,
      call_trace_address
    FROM
      wyvern_call_data wc
      LEFT JOIN opensea_v1_ethereum.fees fees ON fees.tx_hash = wc.call_tx_hash
      AND fees.trace_address = wc.call_trace_address
      AND fees.block_time >= date_trunc("day", now() - interval '1 week')
    WHERE
      wc.call_block_time >= date_trunc("day", now() - interval '1 week')
  ),
  erc_transfers as (
    SELECT
      evt_tx_hash,
      CASE
        WHEN length(id:: string) > 64 THEN 'Token ID is larger than 64 bits and can not be displayed'
        ELSE id:: string
      END as token_id_erc,
      cardinality(collect_list(value)) as count_erc,
      value as value_unique,
      CASE
        WHEN erc1155.from = '0x0000000000000000000000000000000000000000' THEN 'Mint'
        WHEN erc1155.to = '0x0000000000000000000000000000000000000000'
        OR erc1155.to = '0x000000000000000000000000000000000000dead' THEN 'Burn'
        ELSE 'Trade'
      END AS evt_type,
      evt_index
    FROM
      erc1155_ethereum.evt_transfersingle erc1155
    GROUP BY
      evt_tx_hash,
      value,
      id,
      evt_index,
      erc1155.from,
      erc1155.to
    UNION ALL
    SELECT
      evt_tx_hash,
      CASE
        WHEN length(tokenId:: string) > 64 THEN 'Token ID is larger than 64 bits and can not be displayed'
        ELSE tokenId:: string
      END as token_id_erc,
      COUNT(tokenId) as count_erc,
      NULL as value_unique,
      CASE
        WHEN erc721.from = '0x0000000000000000000000000000000000000000' THEN 'Mint'
        WHEN erc721.to = '0x0000000000000000000000000000000000000000'
        OR erc721.to = '0x000000000000000000000000000000000000dead' THEN 'Burn'
        ELSE 'Trade'
      END AS evt_type,
      evt_index
    FROM
      erc721_ethereum.evt_transfer erc721
    GROUP BY
      evt_tx_hash,
      tokenId,
      evt_index,
      erc721.from,
      erc721.to
  )
SELECT
  DISTINCT 'ethereum' as blockchain,
  'opensea' as project,
  'v1' as version,
  TRY_CAST(date_trunc('DAY', wa.call_block_time) AS date) AS block_date,
  tx.block_time,
  coalesce(token_id_erc, wa.token_id) as token_id,
  tokens_nft.name AS collection,
  wa.amount_original / power(10, erc20.decimals) * p.price AS amount_usd,
  CASE
    WHEN erc_transfers.value_unique >= 1 THEN 'erc1155'
    WHEN erc_transfers.value_unique is null THEN 'erc721'
    ELSE wa.token_standard
  END AS token_standard,
  CASE
    WHEN agg.name is NULL
    AND erc_transfers.value_unique = 1
    OR erc_transfers.count_erc = 1 THEN 'Single Item Trade'
    WHEN agg.name is NULL
    AND erc_transfers.value_unique > 1
    OR erc_transfers.count_erc > 1 THEN 'Bundle Trade'
    ELSE wa.trade_type
  END AS trade_type,
  -- Count number of items traded for different trade types and erc standards
  CASE
    WHEN agg.name is NULL
    AND erc_transfers.value_unique > 1 THEN erc_transfers.value_unique
    WHEN agg.name is NULL
    AND erc_transfers.value_unique is NULL
    AND erc_transfers.count_erc > 1 THEN erc_transfers.count_erc
    WHEN wa.trade_type = 'Single Item Trade' THEN cast(1 as bigint)
    WHEN wa.token_standard = 'erc1155' THEN erc_transfers.value_unique
    WHEN wa.token_standard = 'erc721' THEN erc_transfers.count_erc
    ELSE (
      SELECT
        count(1):: bigint cnt
      FROM
        erc721_ethereum.evt_transfer erc721
      WHERE
        erc721.evt_tx_hash = wa.call_tx_hash
    ) + (
      SELECT
        count(1):: bigint cnt
      FROM
        erc1155_ethereum.evt_transfersingle erc1155
      WHERE
        erc1155.evt_tx_hash = wa.call_tx_hash
    )
  END AS number_of_items,
  'Buy' AS trade_category,
  wa.seller AS seller,
  CASE
    WHEN buyer = agg.contract_address
    AND erct2.to IS NOT NULL THEN erct2.to
    WHEN buyer = agg.contract_address
    AND erct3.to IS NOT NULL THEN erct3.to
    ELSE buyer
  END AS buyer,
  CASE
    WHEN shared_storefront_address = '0x495f947276749ce646f68ac8c248420045cb7b5e' THEN 'Mint'
    WHEN evt_type is not NULL THEN evt_type
    ELSE 'Trade'
  END as evt_type,
  wa.amount_original / power(10, erc20.decimals) AS amount_original,
  wa.amount_original AS amount_raw,
  CASE
    WHEN wa.currency_contract_original = '0x0000000000000000000000000000000000000000' THEN 'ETH'
    ELSE erc20.symbol
  END AS currency_symbol,
  wa.currency_contract,
  wa.nft_contract_address AS nft_contract_address,
  wa.project_contract_address,
  agg.name as aggregator_name,
  agg.contract_address as aggregator_address,
  tx.block_number,
  wa.call_tx_hash AS tx_hash,
  tx.from as tx_from,
  tx.to as tx_to,
  ROUND((2.5 * (wa.amount_original) / 100), 7) AS platform_fee_amount_raw,
  ROUND(
    (
      2.5 * (wa.amount_original / power(10, erc20.decimals)) / 100
    ),
    7
  ) AS platform_fee_amount,
  ROUND(
    (
      2.5 * (
        wa.amount_original / power(10, erc20.decimals) * p.price
      ) / 100
    ),
    7
  ) AS platform_fee_amount_usd,
  '2.5' AS platform_fee_percentage,
  wa.fees AS royalty_fee_amount_raw,
  wa.fees / power(10, erc20.decimals) AS royalty_fee_amount,
  wa.fees / power(10, erc20.decimals) * p.price AS royalty_fee_amount_usd,
  (wa.fees / wa.amount_original * 100):: string AS royalty_fee_percentage,
  wa.fee_receive_address as royalty_fee_receive_address,
  wa.fee_currency_symbol as royalty_fee_currency_symbol,
  'opensea' || '-' || wa.call_tx_hash || '-' || coalesce(wa.token_id, token_id_erc, '') || '-' || wa.seller || '-' || coalesce(evt_index:: string, '') || '-' || coalesce(wa.call_trace_address:: string, '') as unique_trade_id
FROM
  wyvern_all wa
  INNER JOIN ethereum.transactions tx ON wa.call_tx_hash = tx.hash
  and tx.block_time >= date_trunc("day", now() - interval '1 week')
  LEFT JOIN erc_transfers ON erc_transfers.evt_tx_hash = wa.call_tx_hash
  AND (
    wa.token_id = erc_transfers.token_id_erc
    OR wa.token_id = null
  )
  LEFT JOIN tokens.nft tokens_nft ON tokens_nft.contract_address = wa.nft_contract_address
  and tokens_nft.blockchain = 'ethereum'
  LEFT JOIN nft.aggregators agg ON agg.contract_address = tx.to
  AND agg.blockchain = 'ethereum'
  LEFT JOIN erc721_ethereum.evt_transfer erct2 ON erct2.evt_block_time = tx.block_time
  AND wa.nft_contract_address = erct2.contract_address
  AND erct2.evt_tx_hash = wa.call_tx_hash
  AND erct2.tokenId = coalesce(token_id_erc, wa.token_id)
  AND erct2.from = buyer
  and erct2.evt_block_time >= date_trunc("day", now() - interval '1 week')
  LEFT JOIN erc1155_ethereum.evt_transfersingle erct3 ON erct3.evt_block_time = tx.block_time
  AND wa.nft_contract_addresss = erct3.contract_address
  AND erct3.evt_tx_hash = wa.call_tx_hash
  AND erct3.id = coalesce(token_id_erc, wa.token_id)
  AND erct3.from = buyer
  and erct3.evt_block_time >= date_trunc("day", now() - interval '1 week')
  LEFT JOIN prices.usd p ON p.minute = date_trunc('minute', tx.block_time)
  AND p.contract_address = wa.currency_contract
  AND p.blockchain = 'ethereum'
  AND p.minute >= date_trunc("day", now() - interval '1 week')
  LEFT JOIN tokens.erc20 erc20 ON erc20.contract_address = wa.currency_contract
  and erc20.blockchain = 'ethereum'
WHERE
  wa.call_tx_hash not in (
    SELECT
      *
    FROM
      opensea_ethereum.excluded_txns
  )

you can see it error out on a typo in a column name. this is what the gh action is spitting out for us

@hildobby hildobby added ready-for-review this PR development is complete, please review and removed WIP work in progress labels Oct 18, 2022
@hildobby hildobby requested a review from jeff-dude October 18, 2022 22:19
@hildobby
Collaborator Author

I see that the tests are failing on opensea v1, but I am pretty sure this isn't caused by my PR and is instead a problem in the original abstraction; please do lmk if my assumption is incorrect!

@jeff-dude
Member

i'm seeing a few things:

  1. the opensea v1 model had its code written twice, maybe a copy/paste error? i pushed a commit to remove second portion
  2. the opensea v1 model still fails after removing that extra code, as evt_index is now ambiguous after adding more joins to tables which contain the column: https://dune.com/queries/1434358?d=1
  3. looksrare model has duplicates, causing failures: https://dune.com/queries/1434322

@jeff-dude jeff-dude added WIP work in progress and removed ready-for-review this PR development is complete, please review labels Oct 20, 2022
@hildobby
Collaborator Author

hildobby commented Oct 20, 2022

i'm seeing a few things:

  1. the opensea v1 model had its code written twice, maybe a copy/paste error? i pushed a commit to remove second portion

Ah awesome ty!

  2. the opensea v1 model still fails after removing that extra code, as evt_index is now ambiguous after adding more joins to tables which contain the column: https://dune.com/queries/1434358?d=1

Issued a new PR to fix it and I also checked on dune.com, runs fine now!

  3. looksrare model has duplicates, causing failures: https://dune.com/queries/1434322

That doesn't seem to be caused by this PR, I'm already seeing the duplicate in current nft.trades table: https://dune.com/queries/1434604?d=1
Update: Actually looking into the tx it's not a duplicate but 3 distinct transactions, going to issue a fix in a PR that changes unique_trade_id
Here you can see the tx having all 3 of those trades: https://etherscan.io/tx/0x7fa82bbe1dbca851e153f03237ebc6e2e8e0aeb3dffc90c9f02cde62ba4c2c11
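One way to keep those three trades distinct is to fold the event index into unique_trade_id, e.g. (a sketch only; see the follow-up PR for the actual change):

```sql
-- Sketch: include evt_index so multiple trades within one tx stay unique.
SELECT
  'looksrare' || '-' || evt_tx_hash || '-' || token_id || '-' || seller
    || '-' || cast(evt_index as string) AS unique_trade_id
FROM trades
```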

@hildobby hildobby added ready-for-review this PR development is complete, please review and removed WIP work in progress labels Oct 20, 2022
@jeff-dude
Member

i'm seeing a few things:

  1. the opensea v1 model had its code written twice, maybe a copy/paste error? i pushed a commit to remove second portion

Ah awesome ty!

  2. the opensea v1 model still fails after removing that extra code, as evt_index is now ambiguous after adding more joins to tables which contain the column: https://dune.com/queries/1434358?d=1

Issued a new PR to fix it and I also checked on dune.com, runs fine now!

  3. looksrare model has duplicates, causing failures: https://dune.com/queries/1434322

That doesn't seem to be caused by this PR, I'm already seeing the duplicate in current nft.trades table: https://dune.com/queries/1434604?d=1 Update: Actually looking into the tx it's not a duplicate but 3 distinct transactions, going to issue a fix in a PR that changes unique_trade_id Here you can see the tx having all 3 of those trades: https://etherscan.io/tx/0x7fa82bbe1dbca851e153f03237ebc6e2e8e0aeb3dffc90c9f02cde62ba4c2c11

you're right -- we currently exclude looksrare in production as it's been failing for this issue. once you submit new PR to fix unique_trade_id, merge and verify, then we can merge main back into this branch and hopefully finalize it

@hildobby
Collaborator Author

Awesome, here is the fix PR for LooksRare @jeff-dude: #1824

@hildobby hildobby mentioned this pull request Oct 20, 2022
13 tasks
@jeff-dude jeff-dude added ready-for-merging and removed ready-for-review this PR development is complete, please review labels Oct 24, 2022

@jeff-dude jeff-dude left a comment


i think we're set here -- we'll look to merge and deploy

@jeff-dude jeff-dude merged commit 98cb56d into duneanalytics:main Oct 24, 2022