Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sendpay custom onion #1715

Closed

Conversation

bitonic-cjp
Copy link
Contributor

For more explanation and discussion, see pull request #1686 .

This is the first part of a replacement of #1686 : this contains the additional functionality in sendpay. The rest will come in a later pull request, and will depend on this functionality.

Copy link
Collaborator

@jb55 jb55 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks cleaner! I'll defer a more detailed analysis to those who understand this part of the code, but from what I can tell it's mostly a refactor?

I just have a few style suggestions.

lightningd/pay.c Outdated
@@ -624,54 +634,76 @@ bool wait_payment(const tal_t *cxt,
return cb_not_called;
}

static struct custom_route_data route_to_custom_route_data(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's fairly unconventional to return structs as values from my experience, usually you would just pass in a pointer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we do pass back quite a few pointers to structs if we pass in a context we can allocate off.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right that would work too

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, pass in a context, allocate off it and return (and allocate members off of the thing you're returning).

Generally, returning something allocated off tmpctx is a layering violation which can get someone in trouble.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the point you make about the layering violation, and it still isn't resolved in my later commits. Another thing I don't like about my approach is that this data gets tal_stolen when it gets passed to send_payment_custom_route_data. So I'm open for suggestions on how to improve this.

However, I'm not sure I fully understand what you suggest. Maybe it's easiest if you just write the code improvement yourself?

Side note: I was wondering how well tal combines with data allocated on the stack. I guess that when a stack-allocated object is used as a context, the child objects are not automatically going to be deallocated when the context object on the stack goes out of scope, right? This isn't C++ after all.

lightningd/pay.c Outdated
@@ -624,54 +634,76 @@ bool wait_payment(const tal_t *cxt,
return cb_not_called;
}

static struct custom_route_data route_to_custom_route_data(
struct lightningd* ld,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

μ-nit: struct lightning *ld

lightningd/pay.c Outdated
struct lightningd* ld,
const struct sha256 *rhash,
const struct route_hop *route,
struct custom_route_data route_data,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass a pointer instead of a value here

lightningd/pay.c Outdated
@@ -30,6 +31,15 @@ struct sendpay_command {
void *cbarg;
};

/* data consumed by send_payment_custom_route_data */
struct custom_route_data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: perhaps payment_route_data, custom is a bit ambiguous.

lightningd/pay.c Outdated
@@ -1006,6 +1054,46 @@ static void json_sendpay(struct command *cmd,
return;
}

/* Default (realm = 0) realm / hop_data, and other route data */
custom_route_data = route_to_custom_route_data(cmd->ld, route);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass custom_route_data as a pointer to route_to_custom_route_data here instead of returning.

@bitonic-cjp
Copy link
Contributor Author

Yes, the first couple of commits is just refactoring, in preparation for the "feaure" commits:

aa216e4 adds the new feature
9d72d63 fixes an issue that would otherwise partially nullify the feature addition
c233849 adds a test that uses the new feature

@bitonic-cjp
Copy link
Contributor Author

I added a couple of commits that address @jb55 's suggestions. Cherry-pick what you like, or just merge it all.

@bitonic-cjp
Copy link
Contributor Author

bitonic-cjp commented Jul 23, 2018

@cdecker @rustyrussell is there any way I can speed up the review process of this pull request?

I have another change ready to be pulled (the second part of #1686), and it depends on this one. Right now, its Git history starts at the end of this code (since it depends on this code), but I'm not sure how clean it will end up in Git if I ask it to be pulled in its current form - therefore I'm delaying the creation of a pull request for that second part until after this part gets pulled.

@bitonic-cjp
Copy link
Contributor Author

bitonic-cjp commented Jul 23, 2018

Another thing: apparently a merge conflict has appeared. Should I merge master into this pull request and resolve conflicts in the merge commit, rebase & force-push the pull request, or something else?

@jb55
Copy link
Collaborator

jb55 commented Jul 23, 2018

Should I merge master into this pull request and resolve conflicts in the merge commit, rebase & force-push the pull request

yup!


/* Returns false if cb was called, true if cb not yet called. */
static bool
send_payment_custom_route_data(const tal_t *ctx,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is a bit misleading imho, it should be the common entrypoint for injected and locally originating payments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice also that this function does quite a lot of book-keeping that should really only be done for payments that originated here. If we later inject an onion from outside we likely don't want it to show up in the listpayments output. These should be extracted in a wrapper instead, that is bypassed when injecting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, for now, we can see this as a locally originating payment. The extended functionality is not related to an app-specific -> standard-lightning bridge on this node, but to enable a standard-lightning -> app-specific bridge on another node, downstream the route.

The app-specific -> standard-lightning bridge idea is an interesting line of thought. Maybe it should have its own RPC command, or maybe an option in sendpay to consider this routing of someone else's transaction instead of considering this our own transaction. For simplicity, let's not include that in this pull request.

@cdecker
Copy link
Member

cdecker commented Jul 24, 2018

Seems ok to me, I'll let @rustyrussell do a code-style check.

@bitonic-cjp
Copy link
Contributor Author

I added a merge commit that resolves a conflict with recent changes in lightningd/pay.c in the master branch.

@cdecker
Copy link
Member

cdecker commented Jul 26, 2018

I added a merge commit that resolves a conflict with recent changes in lightningd/pay.c in the master branch.

Generally speaking rebases are to be preferred since they avoid pulling in unrelated stuff from master

u64 *amt_forward,
u32 *outgoing_cltv
);

Copy link
Collaborator

@wythe wythe Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in other places the style should be a little more like:

void deserialize_per_hop_data(const u8 *per_hop_data,
                              struct short_channel_id *channel_id,
                              u64 *amt_forward,
                              u32 *outgoing_cltv);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should just have a clang-format linting step in travis...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm running clang-format on patches that I create, but it's slow progress and having travis check it is really painful if the code mostly does its own thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to create a new commit to fix this, but then I saw the surrounding, old code was inconsistent w.r.t. this coding style. For instance, the process_onionpacket declaration does not conform to your style suggestion.

Instead of only fixing my new code, it might be better to make one commit that fixes all code. Including such a commit in this pull request would mess up the readability of this pull request, since you'd have a mix of functional and non-functional changes, obscuring the view on where the functional changes are.

So I agree it should be done, but I don't think it should be done right here right now. Let's make a separate pull request that makes the coding style consistent across the entire code base.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we maybe commit a .clang-format config to format all future code?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could just copy what the linux kernel does:

https://www.kernel.org/doc/html/v4.17/process/clang-format.html

And here's their config file:

https://github.com/torvalds/linux/blob/master/.clang-format

I currently just use indent -kr -i8 -l80 and then just touch-up the result by hand.

lightningd/pay.c Outdated
@@ -624,54 +634,76 @@ bool wait_payment(const tal_t *cxt,
return cb_not_called;
}

static struct custom_route_data route_to_custom_route_data(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, pass in a context, allocate off it and return (and allocate members off of the thing you're returning).

Generally, returning something allocated off tmpctx is a layering violation which can get someone in trouble.

}
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think i prefer expanding struct route_hop to be a union, and rename the current struct route_hop to route_hop_bitcoin or something?

ie.

struct route_hop {
    u8 realm;
    union {
        /* Realm 0 */
        struct route_hop_bitcoin b;
        /* Other */
        struct route_hop_generic generic;
};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use a union, then how does the code that reads the data know which of the two to use? If it always uses the generic one, then you implicitly depend on the way how the C compiler structures the route_hop_bitcoin struct for its serialization. I guess you might get away with that if you are very careful with struct alignment, endianness and so on, but do you really want to go that way?

In the rest of the code, I see all serialization being done explicitly, without depending on how the C compiler deals with structs/unions. To me, that seems like the more sensible / portable way of doing serialization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to add that I allow the caller to change realm number and data independently of each other, so the user might change the realm number to signal something, but still want to use the the realm 0 format, or keep realm value 0, but use a different data format. The second might not be very useful, except maybe for our own testing purposes (it allows us to inject completely invalid realm 0 data, and see how a node responds).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not suggesting using layout assumptions: the realm byte is the descriminator. You check that. You can call 'generic' 'unknown' if you prefer...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like you to consider the case where the user does both of the following things for a certain hop in the route:

  • set the data
  • set the realm number to 0, or keep it unchanged (the default is 0)

If you use the realm number as discriminator for the union, this breaks: the user-supplied data gets written in the route_hop_generic part of the union, and later the route_hop_bitcoin part gets used (because the realm number is 0). Worse: it might actually work on some systems, and break on other systems, depending on layout, if the struct order+size accidentally happens to be the same as the realm 0 serialization.

The consequence of breaking is that different data arrives at the node than was intended. I'm not sure if this "different data" can actually theoretically include uninitialized memory, but I don't want to break my head over it. If it's difficult to analyze, the design is probably wrong.

One possible approach is to say "the user should never do this". I don't like that approach: it adds another thing for the (RPC) user to know, and it removes a bit of orthogonality between setting the realm number and setting the data. Maybe there are actually valid use cases for setting the data and setting/keeping the realm number at 0.

So I prefer not to use the realm number as discriminator. The union approach might still be possible though, if, next to the realm number, we also add a boolean to struct route_hop that indicates which union part should be used. Would you like that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sending a realm byte of 0 while having a hop with a different format would be confusing to BOLT-conforming implementations, so I think realm as discriminator, is quite fine.

Copy link
Contributor Author

@bitonic-cjp bitonic-cjp Sep 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ZmnSCPxj : If the caller passes BOLT-conforming realm 0 data, it still fails. You are really eliminating possibilities from the user; even if you don't think those possibilities will ever be useful, someone else will think they are useful to him, and try to use the API that way. It will cause unexpected failures, and maybe even a security issue for the user doing this.

I'd really like the function to have consistent behavior for every possible combination of input values, even for combinations that seem useless.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is BOLT-conforming realm 0 data, then the struct route_hop_bitcoin will work and is perfectly appropriate. Probably if realm is 0 then sendpay should check the route data in detail and see that it conforms, then build the struct route_hop_bitcoin correctly.

Copy link
Contributor Author

@bitonic-cjp bitonic-cjp Sep 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That adds way too much complexity. Why do that? To be able to use the union? Why not just do what the user asks for? The user explicitly overrides the default data formation, so apparently the user knows what he wants, and it's different from what we'd normally do.

Let the receiving node complain if it does not understand the message.

Besides: sending "wrong" realm 0 data is useful in security / bug testing. Having this as a feature adds to "swiss army knife" usability of C-Lightning.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I understand.

@bitonic-cjp
Copy link
Contributor Author

Without putting any pressure on you: :-P

if we can finish this pull request in the first half of this week, I can create the next pull request before the start of my holiday (which will take me off-line for 2 weeks). Then you can at least discuss the merits of my next pull request during my absence.

@bitonic-cjp
Copy link
Contributor Author

Here you have the long-awaited rebase. These should be the only changes compared to before the rebase:

  • Adjusted to the new code in the master branch
  • Merged some "fix" commits into the commits they fix, to simplify the code history

This way, the per_hop data stays serialized for longer, so it can be passed in its original (but decrypted) form to a custom transaction router.
This way, we can later use the rest of the function with other route data.
This allows the caller to specify non-default realm and per-hop data in the route.
The realm value is provided by the calling code (which will zero it where needed).
@bitonic-cjp
Copy link
Contributor Author

Yet another rebase.

@bitonic-cjp
Copy link
Contributor Author

@cdecker maybe you can remove the needs-rebase label? At least for now, it seems it doesn't need a rebase. We'll keep playing the add-label rebase remove-label game until this gets merged.

Trust me: this costs me a lot more effort than you. Some of these rebase changes are really non-trivial, and require me to understand how the surrounding code has been refactored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants