Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] [federation] honk (tedu) #2527

Closed
mirabilos opened this issue Jan 14, 2024 · 28 comments
Closed

[bug] [federation] honk (tedu) #2527

mirabilos opened this issue Jan 14, 2024 · 28 comments
Labels
bug Something isn't working

Comments

@mirabilos
Copy link
Contributor

Describe the bug with a clear and concise description of what the bug is.

After benjojo, I now have federation issues with tedu’s honk instance.

Searching for https://honk.tedunangst.com/u/tedu/h/189775FqjZ7tN7qz6F gives “The data
couldn’t be read because it isn’t in the correct format.” in FediText, and is probably connected to the following log line:

timestamp="2024-01-14T19:37:47.528Z" func=server.glob..func1.Logger.func13.1 level=ERROR stacktrace="()\n\t/usr/local/go/src/runtime/panic.go:261\nruntime.sigpanic()\n\t/usr/local/go/src/runtime/signal_unix.go:861\ntypeutils.(*Converter).AccountToAPIAccountPublic()\n\t/drone/src/internal/typeutils/internaltofrontend.go:218\nsearch.(*Processor).packageAccounts()\n\t/drone/src/internal/processing/search/util.go:79\nsearch.(*Processor).packageSearchResult()\n\t/drone/src/internal/processing/search/util.go:184\nsearch.(*Processor).Get()\n\t/drone/src/internal/processing/search/get.go:238\nsearch.(*Module).SearchGETHandler()\n\t/drone/src/internal/api/client/search/searchget.go:232\n()\n\t/drone/src/vendor/github.com/gin-gonic/gin/context.go:174\ngzip.(*gzipHandler).Handle()\n\t/drone/src/vendor/github.com/gin-contrib/gzip/handler.go:60\n" requestID=0vpqm2cd04001f77cg60 msg="recovered panic: runtime error: invalid memory address or nil pointer dereference"

Searching for https://honk.tedunangst.com/u/tedu gives me an… empty profile with one toot? Screenshot follows, it better describes this issue.

Screenshot_20240114_204252

Of course, I cannot follow him, etc. from here.

What's your GoToSocial Version?

0.13.1 git-ccecf5a

GoToSocial Arch

amd64 binary

What happened?

No response

What you expected to happen?

I was under the impression that it was possible to interact with those honks as Fedi instances, and I searched the bugtracker and found no open bug mentioning honk, so I thought it would be possible.

If it is out of scope, that’s okay as well.

How to reproduce it?

No response

Anything else we need to know?

No response

@mirabilos mirabilos added the bug Something isn't working label Jan 14, 2024
@tsmethurst
Copy link
Contributor

Weird, this bug looks just like #2514

Just like in that issue, I can't recreate the bug from my instance(s). Lookup works fine. But clearly there really is a bug there.

For the record, I don't believe this is a federation bug per se, it looks more like a bug in the typeutils, though when and how we managed to introduce it is a mystery to me. I'll investigate. To help me narrow it down, would you mind temporarily redeploying v0.13.0 and trying the lookup? There are no database migrations or config file changes between that and v0.13.0, so it's safe to step down to it temporarily.

@mirabilos
Copy link
Contributor Author

Sure, will have a look in a bit

@mirabilos
Copy link
Contributor Author

timestamp="2024-01-14T22:41:45.131Z" func=server.glob..func1.Logger.func13.1 level=ERROR stacktrace="()\n\t/usr/local/go/src/runtime/panic.go:261\nruntime.sigpanic()\n\t/usr/local/go/src/runtime/signal_unix.go:861\ntypeutils.(*Converter).AccountToAPIAccountPublic()\n\t/drone/src/internal/typeutils/internaltofrontend.go:218\nsearch.(*Processor).packageAccounts()\n\t/drone/src/internal/processing/search/util.go:79\nsearch.(*Processor).packageSearchResult()\n\t/drone/src/internal/processing/search/util.go:184\nsearch.(*Processor).Get()\n\t/drone/src/internal/processing/search/get.go:238\nsearch.(*Module).SearchGETHandler()\n\t/drone/src/internal/api/client/search/searchget.go:232\n()\n\t/drone/src/vendor/github.com/gin-gonic/gin/context.go:174\ngzip.(*gzipHandler).Handle()\n\t/drone/src/vendor/github.com/gin-contrib/gzip/handler.go:60\n" requestID=n1c262md04001zqwj840 msg="recovered panic: runtime error: invalid memory address or nil pointer dereference"
timestamp="2024-01-14T22:41:45.131Z" func=server.glob..func1.Logger.func13.1 level=INFO latency="2.657329ms" userAgent="Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0" method=GET statusCode=200 path=/api/v2/search clientIP=… errors="Error #01: recovered panic: runtime error: invalid memory address or nil pointer dereference\n" requestID=n1c262md04001zqwj840 msg="OK: wrote 23B"

GoToSocial 0.13.0 git-f4fcffc

This was when searching for the status.

However, searching for the user yields:

Screenshot_20240114_234308

But:

timestamp="2024-01-14T22:43:17.752Z" func=server.glob..func1.Logger.func13.1 level=INFO latency="579.927µs" userAgent="Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0" method=POST statusCode=404 path=/api/v1/accounts/01P3C447JEC64Y6SK7K0N6T3Z9/follow clientIP=… errors="Error #01: target account not found\n" requestID=ez1282md04001bpg7rp0 msg="Not Found: wrote 69B"

@tsmethurst
Copy link
Contributor

OK, so more or less the same result then. Thanks!

@tsmethurst
Copy link
Contributor

OK, I have a guess what's going on here. Could you help narrow it down by doing:

SELECT username FROM accounts WHERE locked IS NULL;

Ideally there should be no entries. My suspicion is that there will be entries, which will help me identify the bug.

@mirabilos
Copy link
Contributor Author

gotosocial=# SELECT username FROM accounts WHERE locked IS NULL;
 username 
----------
(0 rows)

@tsmethurst
Copy link
Contributor

Curiouser and curiouser. Alright, thanks! I'll keep looking then.

@tsmethurst
Copy link
Contributor

tsmethurst commented Jan 19, 2024

I'm still trying to hunt this one down. @mirabilos could you try searching for the statuses / accounts mentioned again, restarting your instance, and then doing the searches once again, and see if you get the same result? I'd like to know if it's a caching issue.

@NyaaaWhatsUpDoc
Copy link
Member

@mirabilos if this issue does persist, could you let us know what values of created_at updated_at and fetched_at you have for the tedu account?

@mirabilos
Copy link
Contributor Author

could you try searching for the statuses / accounts mentioned again, restarting your instance, and then doing the searches once again, and see if you get the same result?

I basically did that when I was doing the downgrade / upgrade dance, and it didn’t change anything.

if this issue does persist, could you let us know what values of created_at updated_at and fetched_at you have for the tedu account?

Sure. Have that info and a bit more:

gotosocial=# SELECT id,created_at,updated_at,fetched_at,username,domain,uri,url FROM accounts WHERE domain='honk.tedunangst.com';
             id             |          created_at           |          updated_at           |          fetched_at           | username |       domain        |                uri                 |                url                 
----------------------------+-------------------------------+-------------------------------+-------------------------------+----------+---------------------+------------------------------------+------------------------------------
 01P3C447JEC64Y6SK7K0N6T3Z9 | 2022-11-19 02:35:28.663157+00 | 2024-01-14 19:38:39.431505+00 | 2024-01-14 19:38:39.431503+00 | tedu     | honk.tedunangst.com | https://honk.tedunangst.com/u/tedu | https://honk.tedunangst.com/u/tedu
(1 row)

@tsmethurst
Copy link
Contributor

Could you please try searching for the user https://social.9grid.net/u/john (also a Honk instance) and see if that works? Could you also try searching for https://catodon.social/users/9ms2ccihepv4shhy (a user on a Catodon instance)?

@mirabilos
Copy link
Contributor Author

mirabilos commented Jan 21, 2024

Both show up with the usual “not yet known profile” properties: name, profile picture, etc. but no posts yet. No errors either.

Logs (grepping for the instance name):

timestamp="2024-01-21T18:14:21.882Z" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://social.9grid.net/u/john requestID=p87kpbcd040004ksjkzg pubKeyID=https://toot.mirbsd.org/users/mirabilos/main-key msg="performing request"
timestamp="2024-01-21T18:14:22.438Z" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://social.9grid.net/.well-known/webfinger?resource=acct%3Ajohn%40social.9grid.net requestID=p87kpbcd040004ksjkzg pubKeyID=https://toot.mirbsd.org/users/mirabilos/main-key msg="performing request"
timestamp="2024-01-21T18:14:22.613Z" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://social.9grid.net/meme/john.jpg requestID=p87kpbcd040004ksjkzg pubKeyID=https://toot.mirbsd.org/users/mirabilos/main-key msg="performing request"

And:

timestamp="2024-01-21T18:14:37.725Z" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://catodon.social/users/9ms2ccihepv4shhy requestID=jx6kpbcd04001dv00y90 pubKeyID=https://toot.mirbsd.org/users/mirabilos/main-key msg="performing request"
timestamp="2024-01-21T18:14:37.943Z" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://catodon.social/.well-known/webfinger?resource=acct%3Aadmin%40catodon.social requestID=jx6kpbcd04001dv00y90 pubKeyID=https://toot.mirbsd.org/users/mirabilos/main-key msg="performing request"
timestamp="2024-01-21T18:14:37.993Z" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://storage.catodon.social/uploads/49212c3b-9fa7-4603-be7e-2b0b58b0bfe8.png requestID=jx6kpbcd04001dv00y90 pubKeyID=https://toot.mirbsd.org/users/mirabilos/main-key msg="performing request"
timestamp="2024-01-21T18:14:38.072Z" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://catodon.social/users/9ms2ccihepv4shhy/collections/featured pubKeyID=https://toot.mirbsd.org/users/mirabilos/main-key msg="performing request"

I have to say I totally don’t envy you, given the Catodon request unexpectedly did not error out for me, debugging this must be hell.

@tsmethurst
Copy link
Contributor

Thanks, this helps! So they both work, basically, right?

@mirabilos
Copy link
Contributor Author

I’d say so, yes.

Things I could still try:

  • remove instance keys
  • block then unblock the instance
  • remove all traces of it from the DB and retry

What do you think?

@tsmethurst
Copy link
Contributor

Blocking and unblocking the instance could be interesting. I don't think it will fix it but you never know. I already have an inkling of what the issue might be though, still going through it.

@mirabilos
Copy link
Contributor Author

Blocking and unblocking did not fix it.

DELETEing the row from accounts, restarting GtS and doing the search again yields the backtrace:

timestamp="2024-01-21T19:58:17.307Z" func=server.glob..func1.Logger.func13.1 level=ERROR stacktrace="()\n\t/usr/local/go/src/runtime/panic.go:261\nruntime.sigpanic()\n\t/usr/local/go/src/runtime/signal_unix.go:861\ntypeutils.(*Converter).AccountToAPIAccountPublic()\n\t/drone/src/internal/typeutils/internaltofrontend.go:218\nsearch.(*Processor).packageAccounts()\n\t/drone/src/internal/processing/search/util.go:79\nsearch.(*Processor).packageSearchResult()\n\t/drone/src/internal/processing/search/util.go:184\nsearch.(*Processor).Get()\n\t/drone/src/internal/processing/search/get.go:238\nsearch.(*Module).SearchGETHandler()\n\t/drone/src/internal/api/client/search/searchget.go:232\n()\n\t/drone/src/vendor/github.com/gin-gonic/gin/context.go:174\ngzip.(*gzipHandler).Handle()\n\t/drone/src/vendor/github.com/gin-contrib/gzip/handler.go:60\n" requestID=twt9mbcd040001mj7x3g msg="recovered panic: runtime error: invalid memory address or nil pointer dereference"

@tsmethurst
Copy link
Contributor

Alright, all expected. Though please stop manually hacking around in the database like that.

Would you be willing to run a debug build of GoToSocial and make a few API calls? My current line of thinking is that honk.tedunangst.com is returning a 4xx status code to dereference requests coming from your instance, and GoToSocial is not handling it very gracefully. If you run with a debug build, we have an endpoint that lets you make signed GET requests with your account, and returns the raw results of the GET request to you (code and all), so we could use that to see what's going on.

@mirabilos
Copy link
Contributor Author

If I can go back from the debug build to the last release, or the debug build is known to not introduce any regressions (does not contain experimental/WIP code), then yes I’m willing.

@tsmethurst
Copy link
Contributor

tsmethurst commented Jan 21, 2024

OK great, thank you! I uploaded a bunch of builds here: https://minio.s3.superseriousbusiness.org/browser/gotosocial-snapshots/MC4xMy4xLURFQlVHLw==

Despite the name, these are exactly the same as the 0.13.1 release, only built with the DEBUG flag set to 1 / true. So it's perfectly fine to use this and then switch back.

Once you've deployed the appropriate build for your architecture, you can use the debug apurl endpoint like so:

curl -H 'Authorization: Bearer INSERT_BEARER_TOKEN_HERE' 'https://your_instance.tld/api/v1/admin/debug/apurl?url=https://honk.tedunangst.com/u/tedu'

For INSERT_BEARER_TOKEN_HERE you can use any admin-level token (obtained either from your database directly, or by signing into semaphore/pinafore and grabbing the token from the network requests). And ofc replace your_instance.tld with your instance's host.

Once you've done that, could you paste the output here, minus any sensitive information? Thank you!

@mirabilos
Copy link
Contributor Author

Hmmhmm…

timestamp="2024-01-21T21:12:54.546Z" func=server.glob..func1.Logger.func13.1 level=INFO latency="31.201µs" userAgent=curl/7.88.1 method=GET statusCode=404 path=/api/v1/admin/debug/apurl errors="Error #01: Not Found\n" requestID=263dwbcd04001wn3a41g msg="Not Found: wrote 21B"

@tsmethurst
Copy link
Contributor

Oh, I'm sorry, I forgot to mention, you have to also run GoToSocial with the environment variable DEBUG=1, so DEBUG=1 ./gotosocial --config-path config.yaml server start, or whatever equivalent command is on your machine.

@mirabilos
Copy link
Contributor Author

Same result õÕ

@mirabilos
Copy link
Contributor Author

and yes, /proc/$pid/{environ,exe} are correct…

@tsmethurst
Copy link
Contributor

tsmethurst commented Jan 21, 2024

Ah right sorry, my bad, I checked out the 0.13.1 tag to do those builds, but I forgot we only added code to parse the DEBUG=1 flag into goreleaser after 0.13.1 was already out.

I uploaded DEBUG=1 0.13.1 builds to the same minio folder again, could you try again please with one of those? Apologies for the mix-up.

@mirabilos
Copy link
Contributor Author

{"request_url":"https://honk.tedunangst.com/u/tedu",
"request_headers":{
  "Accept":["application/ld+json; profile=\"https://www.w3.org/ns/activitystreams\",application/activity+json"],
  "Accept-Charset":["utf-8"],
  "Date":["Sun, 21 Jan 2024 22:33:14 GMT"],
  "Host":["honk.tedunangst.com"],
  "Signature":[
    "keyId=\"https://toot.mirbsd.org/users/mirabilos/main-key\",
    algorithm=\"hs2019\",
    headers=\"(request-target) host date\",
    signature=\"R+ […] +A==\""
  ],
  "User-Agent":["gotosocial (+https://toot.mirbsd.org) gotosocial/0.13.2-SNAPSHOT git-ccecf5a"]
},
"response_headers":{
  "Content-Length":["19"],
  "Content-Security-Policy":["default-src 'none'; script-src 'self'; connect-src 'self'; style-src 'self'; img-src 'self'; media-src 'self'"],
  "Content-Type":["text/plain; charset=utf-8"],
  "Date":["Sun, 21 Jan 2024 22:33:14 GMT"],
  "Strict-Transport-Security":["max-age=7776000"],
  "X-Content-Type-Options":["nosniff"]
},
"response_code":404,
"response_body":"404 page not found\n"
}

@tsmethurst
Copy link
Contributor

tsmethurst commented Jan 21, 2024

Aha! Well, that confirms my thinking that the instance was returning a 4xx code in response to dereference requests. I can do the very same request and get a 200 ActivityPub JSON response though:

{
  "request_url": "https://honk.tedunangst.com/u/tedu",
  "request_headers": {
    "Accept": [
      "application/ld+json; profile=\"https://www.w3.org/ns/activitystreams\",application/activity+json"
    ],
    "Accept-Charset": [
      "utf-8"
    ],
    "Date": [
      "Sun, 21 Jan 2024 22:41:00 GMT"
    ],
    "Host": [
      "honk.tedunangst.com"
    ],
    "Signature": [
      "keyId=\"https://goblin.technology/users/tobi/main-key\",algorithm=\"hs2019\",headers=\"(request-target) host date\",signature=\".......\""
    ],
    "User-Agent": [
      "gotosocial (+https://goblin.technology) gotosocial/0.13.1-SNAPSHOT git-238cc19"
    ]
  },
  "response_headers": {
    "Content-Security-Policy": [
      "default-src 'none'; script-src 'self'; connect-src 'self'; style-src 'self'; img-src 'self'; media-src 'self'"
    ],
    "Content-Type": [
      "application/ld+json; profile=\"https://www.w3.org/ns/activitystreams\""
    ],
    "Date": [
      "Sun, 21 Jan 2024 22:41:01 GMT"
    ],
    "Strict-Transport-Security": [
      "max-age=7776000"
    ]
  },
  "response_code": 200,
  "response_body": "{\n  \"@context\": \"https://www.w3.org/ns/activitystreams\",\n  \"chatKeyV0\": \"vIT5wj9bJqGkvwBxhmaz4Lh4eZIeOKnsSIQifShmJUY=\",\n  \"followers\": \"https://honk.tedunangst.com/u/tedu/followers\",\n  \"following\": \"https://honk.tedunangst.com/u/tedu/following\",\n  \"icon\": {\n    \"mediaType\": \"image/png\",\n    \"type\": \"Image\",\n    \"url\": \"https://honk.tedunangst.com/a?a=https%3A%2F%2Fhonk.tedunangst.com%2Fu%2Ftedu\"\n  },\n  \"id\": \"https://honk.tedunangst.com/u/tedu\",\n  \"inbox\": \"https://honk.tedunangst.com/u/tedu/inbox\",\n  \"name\": \"tedu\",\n  \"outbox\": \"https://honk.tedunangst.com/u/tedu/outbox\",\n  \"preferredUsername\": \"tedu\",\n  \"publicKey\": {\n    \"id\": \"https://honk.tedunangst.com/u/tedu#key\",\n    \"owner\": \"https://honk.tedunangst.com/u/tedu\",\n    \"publicKeyPem\": \"-----BEGIN PUBLIC KEY-----\\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA593GZ9TYrvWgMaMKQ6k6\\ngkItUapUgNnNXzU9J63GRtYZ7CE/Zi39Kgpsxu77hHBj34vwjr1Oc9AMrVDIMfu9\\nEirW1RWxPvrjThBU56VgkpkAXVsieaffJo80BA00QzV4x69Jgat6OT7ox/HMvMxR\\nyZ6CXNCPKQALYqQF6v1fX1kO9lhIA+mPd0JN/qMKvZfd1NXABEk9nORUneH7Audt\\nIHNdJzKMHC6wPSQWC7SmXT0/nq6o5mR2SgvwTI/JUx6T5r8NDrwSaqB69e+EMJqR\\nxKOh9N4A1ba/AQOQZbO/YkFyYY2VE4HWbvS9XpYL74yT9D6Fp4cUovJiXC+ziam0\\nNwIDAQAB\\n-----END PUBLIC KEY-----\\n\"\n  },\n  \"summary\": \"<p>Purveyor of modest software.\",\n  \"type\": \"Person\",\n  \"url\": \"https://honk.tedunangst.com/u/tedu\"\n}\n"
}

If you were blocked/defederated from that instance, I would expect it to return a code 403, so 404 is quite surprising, but there you go. So I think the resolution here is probably to just update the GtS dereferencing logic to handle those 4xx codes a bit better (which I already have on a branch right now).

@mirabilos
Copy link
Contributor Author

Oh hmm… that’s annoying. Thanks for the debugging!

@tsmethurst
Copy link
Contributor

Thanks for the debugging!

No problem, thanks for running the debug build!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants