Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Messages property of the DBC class to Dictionary #14

Open
hakwes opened this issue Sep 29, 2022 · 17 comments
Open

Update Messages property of the DBC class to Dictionary #14

hakwes opened this issue Sep 29, 2022 · 17 comments
Labels
enhancement New feature or request public API This request involves user API

Comments

@hakwes
Copy link
Contributor

hakwes commented Sep 29, 2022

I would like to see the Messages property of the DBC class to be set as a Dictionary to improve lookup.
Another option would be to add a Get method that would use an internal dictionary.
Another suggestions is to add the ValueTable as a dictionary as a complement to the string. That would simplify mapping values to descriptions.

@EFeru
Copy link
Owner

EFeru commented Sep 30, 2022

Sounds interesting. I din't use Dictionary in C# yet, so if you are willing to help I would be more than happy

@taylorjdlee
Copy link
Contributor

taylorjdlee commented Sep 30, 2022

I guess this also somewhat ties into issue #15 making message a dictionary would help in solving this issue as well.

@hakwes
Copy link
Contributor Author

hakwes commented Oct 3, 2022

@EFeru For the messages and the nodes you can just change Dbc properties
public IEnumerable<Node> Nodes {get;} public IEnumerable<Message> Messages {get;}

to Dictionaries since the builder already uses a Dictionary to populate them.
That would be a breaking change though. Hence, you might opt for adding a Get method instead.

For the ValueTable lookup for a signal I've made a workaround in https://github.com/hakwes/DbcParser/blob/currentinternalrelease/DbcParserLib/DbcBuilder.cs#:~:text=%7D-,public%20void%20LinkTableValuesToSignal(uint%20messageId%2C%20string%20signalName%2C%20string,%7D,-private%20bool%20CheckExtID

This fix also solves #13 . I attempted to put that in the ValueTableLineParser but it broke to many tests so i opted for the quickfix above.

@EFeru EFeru changed the title Improved lookup Update Messages property of the DBC class to Dictionary Oct 28, 2022
@EFeru
Copy link
Owner

EFeru commented Oct 28, 2022

@hakwes Are you willing to support on converting to Dictionary? Especially the Value Tables would be really nice to have them as Dictionary.
I can help you updating the Tests when you have the PR ready.
For now I did a quick fix on Value table because it could not parse properly the spaces see #13 .

@hakwes
Copy link
Contributor Author

hakwes commented Nov 1, 2022

Sure, I'll have a go at it @EFeru

@Adhara3
Copy link
Collaborator

Adhara3 commented Jan 14, 2023

Hi,

Why I would not change Node and Message container type to IDictionary

  • The c# language provides easy ways to make a list a dictionary using what the user likes as a key.
  • forcing a unique key would not work with the composite dbc as in Support for multiple .dbc files #20 because multiple files may contain the same ID.
  • the keys we use during parsing depend on how the file links lines, which may not be what the user expects. It is dangerous to expose and internal choice outside, that would make things impossible to change

I think the goal of this library is to parse a text file and put stuff into usable objects. Keep it simple.
If you need extra features do not force them into the parsing process, instead let's add validators or helpers around the parsed objects to provide the requested functionalities.
Indexing is useful but has nothing to do with parsing strictly speaking. It has more to do with optimization.
So we could provide some helper classes like

public class IndexedDbc
{
  public static IndexedDbc CreateFrom(Dbc dbc)
  {
    // Here we take the dbc and convert structures in an optimal way (i.e. indexing)
  }

  private IndexedDbc()
  {
  }

  public bool TryGetMessage(int id, out Message message)
  {
    // Search by index
  }
}

This has the advantage of extending with usability. Having the TryGet method instead of directly exposing the IDictionary keeps the object read only by design

Why I would, instead

Node name should be unique by dbc spec. Same for message IDs and signal names inside a message. So exposing IDictionary directly makes sense because it somehow reflects the specs.
If in the future we plan to write DBCs starting from objects, then we must allow people to build object so the read only approach (e.g. the TryGet method instead) would not allow this. But write objects may be different ones or we could even build the file using an ad hoc builder class.

As you can see I need to think a bit more about this. We can break the API, but if we do it should be for a bigger plan, not just for a specific need we have today.

My 2 cents
A

@Adhara3
Copy link
Collaborator

Adhara3 commented Jan 14, 2023

I guess this also somewhat ties into issue #15 making message a dictionary would help in solving this issue as well.

I fear the two things are unrelated

A

@EFeru
Copy link
Owner

EFeru commented Jan 14, 2023

Hi @Adhara3 , thanks for your detailed answer. Is very good to have both advantages and disadvantages in our overview.

@Adhara3
Copy link
Collaborator

Adhara3 commented Jan 14, 2023

So just to recap:

  • dictionary vs try get, still under investigation. For proper API the same should apply to Messages (by Id) Nodes (by name) and Signals (by name) in a message especially if we want to reflect the spec
  • for ValueTable, yes. It is a string because we left the pure parsed value but should become a dictionary or similar.

Cheers
A

@Adhara3 Adhara3 added enhancement New feature or request public API This request involves user API labels Jan 15, 2023
@Adhara3 Adhara3 added this to the Improve API for fast item browsing milestone Jan 15, 2023
@hakwes
Copy link
Contributor Author

hakwes commented Jan 25, 2023

@Adhara3 I don't really understand your concerns about exposing the Message property as a Dictionary.
Internally, messages are added to a Dictionary first and then the Dictionary is converted to an Array.
A Dbc file can be very, very large and adding an index method where you would parse the file again seems excessive.

The TryGet method is fine but I don't think you need to indexDbc class to implement that.

@Adhara3 Adhara3 modified the milestones: Improve API for fast item browsing, Further improvements Feb 14, 2023
@Uight
Copy link
Contributor

Uight commented Jul 29, 2024

@EFeru, @Adhara3
I would have a question on this. What are your usecases for this nuget package. For me it is reading in the dbc file once and then using it to interpret incoming can messages. We can have 500-10000 Can messages per second and for every receive message you must find the message and then read the signals. The only useable way to make this lookup fast enough is to have a dictionary.
As this is a parser and not an editor i think having a fixed (readonly) dictionary could be a good way. I would therefor suggest making messages a IReadOnlyDictionary.

but i would also be interested in other usecases to see if there might be a better solution.

@Uight
Copy link
Contributor

Uight commented Jul 29, 2024

I was just going through the code to check out how using a IReadonlydictionary for messages would change the code. (pretty much only the tests)
In the DbcBuild.cs i found this comment: //TODO: uncomment once Immutable classes are used
Is it the plan to make everything in the Dbc class immutable? If yes i would really suggest using a IReadOnlyDictionary for the Messages and i could do some of the immutable stuff to when im at it. I just think its impossible to make everything immutable in one swoop but it could atleast start it for messages and nodes. I would use a IReadOnlyCollection on the top level for nodes then. Whats your thoughts on that?

Uight added a commit to Uight/DbcParser that referenced this issue Jul 29, 2024
Uight added a commit to Uight/DbcParser that referenced this issue Jul 29, 2024
…r for code cleanup; Add IsMultiplexed to signal
Uight added a commit to Uight/DbcParser that referenced this issue Aug 1, 2024
@Uight
Copy link
Contributor

Uight commented Aug 1, 2024

@Adhara3 @EFeru,

i know changed to code in a way that the dbc Class is the same as before but you can now convert to a immutable dbc class that works with the immutabel objects and dictionaries. The immutable classes support some more functions as they cant be changed, which solves #37 and is a part of solving #68.
I changed the dbc builder test to test the immutable classes while i kept the parser test working on the old dbc class.
this could be a way to keep open for the future. If you want to edit the dbcs in code you would use the old classes and if you dont and want fast access you just use the immutable classes.
However im not sure if this is really needed as i have a feeling that this is only done because of fear of a breaking change. And this introduces some other problems with classes like the Packer which currently work on signals while i would want to work them on immutableSignals. You could just use another wrap there to make both possible but it just increases the work when changing stuff.

@Adhara3
Copy link
Collaborator

Adhara3 commented Aug 26, 2024

@EFeru, @Adhara3 I would have a question on this. What are your usecases for this nuget package. For me it is reading in the dbc file once and then using it to interpret incoming can messages.

I think someone in the past asket to add messages but the usage requested was not clear since no .save() method was requested.

At the moment the lib has been designed primarily with this exact scenario in mind.
Our design concern has been to avoid having objects that can do everything. The DBC class should only be an in memory "browsable" representation of a dbc file with probably no optimizations in it. The reason is that if it is optimized for read, then it would not work well for write (e.g. IReadOnly is fine for a non editable object only).

The Packer part has been kept separated because if is another kind of functionality, not directly related with the dbc in memory representation . So we could think of a method of some kind that takes the generic Dbc object and .ToOptimizedRead() to get a new optimized version of the DBC for reading and that could even embed the packer part.

The overall design can be discussed, for now the lib is designed to work best on the scenario you described, I would not close the door to future developments to support other scenarios this is why I would not force stuff on the DBC class itself.

Hope this clarifies the intent

Cheers
A

@EFeru
Copy link
Owner

EFeru commented Aug 26, 2024

Hi @Adhara3,
My use case is the same as @Uight parse the dbc once and then unpack/pack incoming/outgoing mesages. Is your use case different?
If not, then I would also go for an implementation that maximizes the throughput. So, if Dictionaries is the way to go for speed in getting the Message by id, I would sugest we do that, or we provide helpers that do that. In essence we need something to get the messages by ID quickly, :

dbc.Messages.TryGetValue(myID, out Messsage myMessage)

I think Dictionay complexity O(1), is faster than IEnumerable complexity O(n):

Messsage myMessage = dbc.Messages.Where(m => m.ID == myID);

Regarding loading multiple dbc files and clash between IDs, please correct if I am wrong, I think they should be treated as different dbc objects dbc1, dbc2, ... dbcN.

@Uight
Copy link
Contributor

Uight commented Aug 26, 2024

I think the change to dictionary is kind of obvios here as the dictionary is also used internally. the way i did is that internally a dictionary is used and externaly this dictionary is readonly.
this shouldnt realy affect speed in both cases. It only means that the dbc is more or less immutable from extern. But you could always add dedicated methods for changing objects which would in my opinion be teh better way anyway as it allows for validation in the methods and not just resetting/overwriting properties.
I would also argue that a normal dictionary will be good enough speed wise in 95% of the cases. We are talking of c# anyway ;)
One thing could be the frozen stuff added to .net8 but that wouldnt work in .net framework.

The thing i changed two was the properties because you could easily get null ref exceptions there because only one of the properties was set at any given time (either string, enum, double....). I think this change could be discussed

@Adhara3
Copy link
Collaborator

Adhara3 commented Aug 27, 2024

I think Dictionay complexity O(1), is faster than IEnumerable complexity O(n):

Sure it is, the point was that if a user requires fast access and he knows that no duplications are in place, then he could always call the available:

var optimizedMessages = dbc.Messages.ToDictionary(m => m.ID, m => m);

It's one line.
Then if we are sure that message IDs are unique (and if not so we need to fail parsing or to provide a duplicate message ID handling strategy) and we assume that we won't be supporting a composite dbc, then ok, let's expose a I(ReadOnly)Dictionary.

As I stated above, the fact the we use a dictionary internally (i.e. for parsing) may have different reasons than the public API. Please keep the two concepts separated.

Cheers
A

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request public API This request involves user API
Projects
None yet
5 participants