
RFC: Proposed reorganization for automatic signal formatting and implementation of Sounds API #44

Closed
wants to merge 2 commits into from

Conversation

haberdashPI
Contributor

@haberdashPI haberdashPI commented Sep 8, 2018

Whew! Okay... this one is large. There's quite a lot here, and it should
eventually be broken up, but it's at a point where I wanted to actually discuss
what I've been thinking through.

If this is what SampledSignals ends up looking like I'll have some clean-up,
more testing, and probably some fixes to make, but I'm pretty convinced
the basic approach is doable and practical at this point. And
all of the original tests that make sense for the new approach (all but
a few) pass locally. This needs a few small updates to LibSndFile,
so the tests won't pass on CI yet.

So the main question is whether this is the right way to go. It would
involve some fairly major changes:

  1. I implemented mapsignals, which applies a function to each sample of
    multiple signals but automatically pads the end of signals and can operate
    over streams, buffers, numbers and any abstract array, automatically
    coercing the objects as necessary. This can be used to implement most of
    the methods in Sounds. This is really the motivating force behind many of
    the changes below, and I think the goal should be to have something that
    makes this work well. You can think of the rest of the changes as a
    proposal for how to make mapsignals work.
  2. It uses generic, trait-based dispatch to determine the format of a signal.
    The functions promote_signals(xs...) and promote_signal(x,by=y) inspect
    their objects and call tosamplerate and tochannels to format the signals
    appropriately. Signals define the latter two functions along with the
    SignalFormat trait to participate in the generic interface for signal
    promotion.
  3. The signal formatting is represented by a unitful sample
    rate and bit rate. A spectral signal has a sample rate in time, a temporal
    signal a sample rate in frequency. The bit rate is represented separately
    because it needs to be handled somewhat differently for buffers and streams.
    I tried combining them, but it was a bit of a mess and there were lots of
    exceptions.
  4. SampleBuf can now be used to represent any AbstractArray as a signal: this
    comes in handy when trying to operate over ranges, for instance. It makes
    the generic code to operate over signals with mapsignals easier to write.
  5. SampleStream has a simplified interface. Rather than specifying the start
    and length it simply takes an abstract array. One can limit the range by
    using a view over a larger array. The fact that it's abstract also means
    it can handle channel mixing lazily because I've defined an object that
    mixes the channels on demand.
  6. SampleStream doesn't convert the bit rate. Most functions in Julia convert
    bit rates automatically, so the code for the same-bit-rate and
    different-bit-rate cases is mostly identical, and converting the data
    before passing it is slow, unnecessary work. If necessary, it's easy
    enough to convert the bit rate within an implementation that can't be
    more generic.
  7. There are some miscellaneous smaller changes I made, largely to simplify
    and improve my understanding of the code.
  8. I added some additional functionality (just to get a sense of what it
    would take) to create signals from functions. As a consequence one no
    longer really needs SinSource, as it can be defined as, e.g.,
    signal(sin,1kHz). A missing behavior that would need to be implemented
    is combining signals into multiple channels, e.g. something like
    channels(signal(sin,1kHz,phase=0),signal(sin,1kHz,phase=pi)).

NOTE: I'm not sure mapsignals can easily be separated into another
package. We were originally thinking the Sounds functions for
manipulating sounds would live in another package. However, mapsignals has to
somehow generate a signal, which, to be non-specific to this package, would
require recreating the broadcast machinery (à la broadcast_similar and
friends) and defining an interface for reading and writing signals that looks
a lot like the interface for SampleSource and SampleSink... so
why not just use those? Other packages that want to implement
signals that work with this machinery can simply ensure their objects are
coercible to sources or sinks. I have already defined tosource to
coerce to sources, and it would be easy to add tosink as well.

…tting.

This is a draft proposal to really mix things up. All the existing
tests pass locally, but I have yet to write any new tests.

It defines the basis of a new interface which could be used to
build the functionality present in the `Sounds` package.

The formatting is largely handled through a new trait called SignalFormat.
This is the basis of a function, `promote_signals`, which can be used to ensure
all signals passed are in an appropriate format.
@haberdashPI
Contributor Author

One thing I realized I forgot to mention about SampleBuf's new design:

My thinking was that, in this approach, to design a new kind of sample-buffer-like object with different behaviors you simply wrap some AbstractArray object in a SampleBuf using signal(myobject,sr). If you need to specialize the behavior of this new type you can specialize dispatch over SampleBuf{<:MyType}. There is never a need to create a child type of SampleBuf or worry about what the other type parameters of SampleBuf are.
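
A hypothetical sketch of this wrapping idea (the `SampleBuf` definition and the `describe` function here are illustrative, not SampledSignals' actual code): the buffer is parameterized on the array it wraps, so new behaviors come from dispatching on `SampleBuf{<:SomeArrayType}` rather than from subtyping `SampleBuf`.

```julia
# The buffer holds any AbstractArray; no child types are ever needed.
struct SampleBuf{A<:AbstractArray}
    data::A
    samplerate::Float64
end
signal(x::AbstractArray, sr) = SampleBuf(x, Float64(sr))

# Specialize behavior by dispatching on the wrapped array's type.
describe(::SampleBuf) = "generic buffer"
describe(::SampleBuf{<:AbstractRange}) = "lazy, range-backed buffer"

describe(signal(rand(10), 44100.0))     # "generic buffer"
describe(signal(0.0:0.1:1.0, 44100.0))  # "lazy, range-backed buffer"
```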

@haberdashPI
Contributor Author

haberdashPI commented Sep 13, 2018

Another point that has occurred to me about this is that the resulting SampleBuf looks a lot more like the format for AxisArray: it might well make sense to just use AxisArray types and eliminate SampleBuf entirely, à la #38, only defining the SampleSink and SampleSource objects.

That said, having thought about the operators more, I kind of like having + and * defined to mean the same thing as mix and amplify, and this would only make sense if there was a specific SampleBuf type, and would not make so much sense for AxisArray objects. (And AxisArrays could easily participate in mapsignals: it would not be hard to define SignalFormat(x::AxisArray) and friends).

@haberdashPI
Contributor Author

Hey there, just wanted to check in on this. If you are simply busy and need time to look at this, that's cool. However it would be nice to know if you would no longer like to integrate this functionality, or something like it, into SampledSignals. That's totally fine, of course; I can figure out a way to write the mapsignals functionality and friends into a separate package that SampledSignals would be able to participate in at some future date, if you'd like. The main redundancy would be that said package would also define an interface for working with audio streams, since it was easier to write a generic interface for mapsignals that works with both buffers and streams when it handles streams (that buffers can be converted to).

@ssfrr
Collaborator

ssfrr commented Oct 24, 2018

Hello. Big apologies for ghosting on this.

I haven't been able to go through in detail, but here're some initial thoughts:

I implemented mapsignals which applies a function to each sample of multiple signals

Is this equivalent to map(fn, zip(sig1, sig2))? If so I wonder if the padding/resampling/channel mixing behavior could be added to some new methods of zip rather than introducing a new function. I guess one issue would be threading the samplerate info through, because fn would be operating on individual samples and by default map would just collect them into a Vector, but maybe zip would return some kind of SignalIterator or something that would provide the samples of the signals, and then a map(::Function, ::SignalIterator) method would apply the right samplerate to the result signal.

kind of like having + and * defined to mean the same thing as mix and amplify

I don't really like * as amplify, because SampleBufs should still behave as Vectors, and vector-vector multiplication doesn't work. The operation should be .* to pointwise multiply signals, but because that is handled by the broadcast machinery we can't just define a .* method (I think?). One idea would be to hook into broadcasting to do the automatic length extension. I haven't thought enough about exactly how that would work, but it seems like it might let us take advantage of more AbstractArray fallback handling where all the broadcasted operations work as expected but with the additional length extension as necessary.
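
The distinction being drawn here is visible in stock Julia, where vector-vector `*` is undefined but `.*` broadcasts elementwise:

```julia
a = [1.0, 2.0]
b = [3.0, 4.0]
# a * b    # MethodError: `*` means matrix multiplication, undefined for two vectors
a .* b     # elementwise product via broadcasting: [3.0, 8.0]
```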

SampleStream has a simplified interface. Rather than specifying the start and length it simply takes an abstract array

Yeah, this is something I've actually been meaning to do as well. Originally I did it this way to make it possible to define zero-allocating unsafe_read! methods, but it's much less convenient and not worth it.

A spectral signal has a sample rate in time, a temporal signal a sample rate in frequency

I had it this way in an earlier iteration rather than separate types, but I think that dealing with dispatching on the type parameters was a hassle. That might not apply anymore given the other architectural changes you've made, so I'm definitely open to this.

the resulting SampleBuf looks a lot more like the format for AxisArray: it might well make sense to just use AxisArray

I think that there's value to having these types more focused on time and frequency-domain signals, and not have to worry about generalized units. It lets us make more assumptions about use-cases.

to design a new kind of sample-buffer-like object with different behaviors you simply wrap some AbstractArray object in a SampleBuf using signal(myobject,sr)

I'm a little worried that signal is already used for the FRP-inspired libraries like Reactive and Signals, and it wouldn't be surprising for a user to be using both of them (e.g. creating a bunch of sliders that you map to audio parameters). That said, it's a good name, and maybe it's OK if users have to qualify the name when they coexist. If we do go that direction, it's worth thinking about whether we can go all-in and read and write Signals from SignalSources and SignalSinks. Especially given the time-frequency unification Signal makes sense, and is definitely snappier than SampleBuf. And it's still in the package name. :)

That's all I can get into for now, but I'll think on this a little more and try to find some time to get more familiar with the code. Thanks for putting all the work into this! We'll get it sorted eventually...

@ssfrr
Collaborator

ssfrr commented Oct 24, 2018

thinking about it a little more, I think your mapsignals is sometimes called lift (e.g. in the Elm language, but it seems they've moved away from that terminology).

@haberdashPI
Contributor Author

haberdashPI commented Oct 24, 2018

Great! I totally understand taking some time to get to this: it's a lot of material.

Here are my responses to your initial thoughts:

Is this equivalent to map(fn, zip(sig1, sig2))? If so I wonder if the padding/resampling/channel mixing behavior could be added to some new methods of zip

I think the closest analogy in terms of the call signature would be map(fn,sig1,sig2), since the signals are passed as individual arguments, rather than as a tuple to fn. However, the closer analogy in behavior is broadcast, since, if we were to specialize some function in Base, we would need to specialize over the arguments sig1 and sig2 in the same way that the broadcast machinery does (selecting the output container on the basis of some combination of all argument types). See my comments about broadcast below.

One idea would be to hook into broadcasting to do the automatic length extension.

I've thought about this as well; it is definitely very appealing. I ran into two problems in thinking through this possibility:

  1. At an implementation level, signal padding breaks a very basic contract that the broadcast machinery assumes: that you can call axes(x) on a broadcastable object x to determine its dimensions. Padding means you determine the dimensions of each of the arguments on the basis of all arguments. The axes(x) assumption is pretty baked into the way the broadcasting machinery in Base is written, from what I have gleaned from it so far.

  2. A more fundamental, semantic, issue is that mapsignals treats the time dimension specially, at least as implemented, because that's the dimension that gets padded. One could imagine padding across multiple dimensions, but that starts looking like the Images package. In any case, the point is that not all dimensions are treated the same when you pad, and you don't necessarily want them to be treated equally. On the other hand, broadcast treats all dimensions in the same way. So ultimately I think it is a conceptually different operation.
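
Point 1 can be demonstrated in stock Julia: broadcasting determines the output shape per argument via `axes`, so only singleton dimensions extend, and mismatched signal lengths are an error rather than an occasion for padding:

```julia
a = [1.0, 2.0, 3.0]
a .+ [10.0]              # length 1 extends: [11.0, 12.0, 13.0]
try
    a .+ [10.0, 20.0]    # lengths 3 and 2 do not broadcast
catch err
    println(typeof(err)) # DimensionMismatch
end
```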

I don't really like * as amplify, because SampleBufs should still behave as Vectors, and vector-vector multiplication doesn't work.

I see your point here. I definitely go back and forth on whether it makes sense. I think in either case the interface should be consistent: either + and * correspond to mix and amplify, respectively, or neither does. Really, I think the question is whether this behavior would be confusing to users. I don't think of the SampleBuf (or whatever its name becomes) as a Vector, since there are many extra things it does, so I don't personally find it confusing, but that's just one data point. Certainly the safest thing would be to not implement this equivalence.

Yeah, this is something I've actually been meaning to do as well.

Cool, sounds like we're on the same page here.

I think that there's value to having these types more focused on time and frequency-domain signals, and not have to worry about generalized units.

I think we're also in sync here. And certainly it would be easy to make mapsignals work well with AxisArrays in a number of cases anyhow, with SampleBuf having some added usefulness. Both types can coexist, so people can use what they want.

If we do go that direction, it's worth thinking about whether we can go all-in and read and write Signals from SignalSources and SignalSinks.

I kind of like that idea, it has a nice consistency to it. Seems like a reasonable approach.

I'm a little worried that signal is already used for the FRP-inspired libraries like Reactive and Signals, and it wouldn't be surprising for a user to be using both of them

But, yeah, it might be worth thinking about another reasonable name that doesn't conflict with these libraries.

@haberdashPI
Contributor Author

More coming... just wanted to post my thoughts so far.

@haberdashPI
Contributor Author

haberdashPI commented Oct 24, 2018

Oh! I just realized, I think there is a way to implement mapsignals as a specialization of broadcast. I've been learning more about the ways to customize this machinery with another package I'm working on to create a generic MetaArray that maintains the broadcast behavior of the array it wraps. I think if we override copyto!, copy and instantiate we can select the shape of the array based on all the arguments using instantiate, and then call the default broadcast machinery on each of the subparts of the array.
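
A rough sketch of that idea, assuming a hypothetical `PaddedSignal` wrapper (none of these names are from SampledSignals): overriding `instantiate` skips the usual per-argument shape check, and `copy` pads the time axis to the longest argument before delegating to ordinary broadcasting.

```julia
using Base.Broadcast: Broadcasted, BroadcastStyle, DefaultArrayStyle

struct PaddedSignal{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(x::PaddedSignal) = size(x.data)
Base.getindex(x::PaddedSignal, i::Int) = x.data[i]

# A custom broadcast style selected whenever a PaddedSignal is an argument.
struct SignalStyle <: BroadcastStyle end
Base.BroadcastStyle(::Type{<:PaddedSignal}) = SignalStyle()
Base.BroadcastStyle(s::SignalStyle, ::DefaultArrayStyle) = s

# Skip the default shape check, which would reject mismatched lengths.
Base.Broadcast.instantiate(bc::Broadcasted{SignalStyle}) = bc

padlen(bc::Broadcasted) = maximum(padlen, bc.args)
padlen(x::AbstractVector) = length(x)
padlen(x) = 0

pad(x::PaddedSignal, n) = vcat(x.data, zeros(eltype(x.data), n - length(x)))
pad(x::AbstractVector, n) = vcat(x, zeros(eltype(x), n - length(x)))
pad(x, n) = x

# Pad every argument to the longest length, then use ordinary broadcasting.
function Base.copy(bc::Broadcasted{SignalStyle})
    n = padlen(bc)
    PaddedSignal(broadcast(bc.f, map(a -> pad(a, n), bc.args)...))
end

PaddedSignal([1.0, 2.0, 3.0]) .+ PaddedSignal([10.0])  # elements [11.0, 2.0, 3.0]
```

This only handles the single-operation case; fused expressions nest `Broadcasted` objects in `bc.args`, which a real implementation would need to recurse into.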

This still leaves the issue of whether that's the right approach given that some axes are padded (time) but others aren't (channel), but maybe that is okay, because it is a specific type, where padding is well defined for this specific dimension.

@haberdashPI
Contributor Author

haberdashPI commented Oct 24, 2018

So I think there are a few orthogonal issues, which maybe can be separated into separate pull requests. Now that I've worked on all of the parts I think I have a better idea of how to best separate these out.

  1. The changes in the implementation of SampleBuf and SpectrumBuf to one type and the changes to their interface: I think this is in some ways the most disruptive change, and the one that is the least settled. Thus, it might be worth hashing out its merit separately from the other changes, and since a lot of how I got mapsignals working was to start with these changes, it seems like the place to start. This pull request would define the Signal trait (or whatever its name becomes... SignalTrait, SignalFormat?).

  2. The changes and simplifications in the interface and implementation of SampleSink and SampleSource. Seems like we're mostly on the same page here, barring the formatting information being stored in the type, which we'll hopefully have figured out, one way or another, in the pull request for change 1.

  3. The implementation of mapsignals or the specialization of broadcast to support padding. I think if specializing broadcast works, the issue of mapping mix and amplify to + and * is pretty moot.

  4. Name changes, and the introduction of signal method (or whatever it becomes). Once we've mostly worked out what the changes in 1 and 2 look like, then maybe we can hash out the names, once it's clear what there is to be named.

This would mean losing some functionality along the way, as we figure out how each part works, but it seems like a good way of breaking down what needs to be worked out into smaller parts.

Edit: I just realized another problem with using broadcast. This may be quite challenging to do for streams, where the time dimension is essentially an unknown, arbitrary value, and yet the interface for buffers and streams should probably be the same. Perhaps the lift name you suggest would be good.

@haberdashPI
Contributor Author

haberdashPI commented Oct 24, 2018

Ha! Sorry, I keep thinking of new things. I think it is worth trying to extend broadcast, to see if it works: once the other pieces are in place. One can define two new broadcast styles: one for buffers alone and another that includes at least one stream. Then, by overriding copyto! and copy for streams, these can call the broadcasting machinery, lazily, on each chunk that's read out.
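
The two-style idea can be expressed with Base's style-combination rules, so that the presence of a single stream argument selects the lazy stream path (all names here are hypothetical):

```julia
using Base.Broadcast: BroadcastStyle

struct BufferStyle <: BroadcastStyle end
struct StreamStyle <: BroadcastStyle end

# When buffer and stream arguments are mixed, the combined style is the
# stream style, so the stream's lazy, chunked `copy`/`copyto!` methods win.
Base.BroadcastStyle(::StreamStyle, ::BufferStyle) = StreamStyle()

Base.Broadcast.result_style(StreamStyle(), BufferStyle())  # StreamStyle()
```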

@haberdashPI
Contributor Author

Hey there, just checking in on this again. I'm just reflecting on this, and I definitely see some confusing pieces that could do with some cleaning up. I think it might be good for me to clean this up more, and try out the broadcasting idea. I think there's a big question of whether broadcasting will work well: if it does I think that addresses some of the naming issues, and will probably help clarify the code somewhat. Sound good?

@haberdashPI
Contributor Author

Hi @ssfrr, just checking in on this again. I haven't committed any time to this because it wasn't clear whether this was the direction you wanted to see SampledSignals take or not.

@ssfrr
Collaborator

ssfrr commented Mar 18, 2019

Yeah, that's probably the smart move. I'm trying to limit how much time I spend on things that are not shortest-path to finishing my dissertation, and this big reorganization isn't something that I have much time for.

As an aside, I've been doing a little reworking of Ogg.jl and Opus.jl to handle streaming, and have had pretty good results with an iterate-based API that gives Tuples of samples. Performance seems quite good, so whenever I get around to digging into this more I'm strongly considering ditching the read/write streaming for iterating over frames. Unfortunately that probably complicates rather than simplifies this discussion, but that's where my head is at right now.

@haberdashPI
Contributor Author

haberdashPI commented Mar 18, 2019

Aye, yes, by all means, focus on the dissertation! Best wishes!

I've wondered about an iterate interface. One thing I like about the read/write option is that it allows more flexibility when retrieving the block, although maybe I misunderstand your point: are you saying iterate operates sample-by-sample?

I think what I'll do for now, then, is refine an approach for the buffer case, and not worry too much about streams at the moment, since it isn't what I need myself. I'll probably publish that as part of a public Sounds.jl library. (Obviously I'll give you credit, since I'll be building on what I made here based on my integration of Sounds and SampledSignals.) Perhaps at some point later on, we can unify the interfaces.

@ssfrr
Collaborator

ssfrr commented Mar 18, 2019

yes, iterate operates sample-by-sample (or frame-by-frame for multichannel). What sort of flexibility does read/write give that you find useful?

@haberdashPI
Contributor Author

If it's sample-by-sample my point about flexibility is moot. I was thinking you iterated block-by-block (in which case, the read/write option allows you to specify how large each block is).

@haberdashPI
Contributor Author

One point that occurs to me is that the two interfaces are inter-operable: you should be able to implement the one with the other. So that might mean that if I work on the general functions I'd like to have (e.g. ramp etc...) using the read/write interface as an assumption, that could be implemented by a sample source implemented using the iterate interface. So it might not be too hard to integrate our work, when/if that makes sense to do.
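
That interoperability can be sketched directly: a `read!`-style block read built on top of any Julia iterator (`read_block!` is a hypothetical name, not an existing API). The caller controls the block size through the buffer it passes, while the underlying source only ever yields one frame at a time:

```julia
# Fill `buf` from the iterator, resuming from `state` if given. Returns the
# number of frames read plus the state to resume from on the next call.
function read_block!(itr, buf::AbstractVector, state=nothing)
    n = 0
    next = state === nothing ? iterate(itr) : iterate(itr, state)
    while next !== nothing && n < length(buf)
        frame, state = next
        n += 1
        buf[n] = frame
        n < length(buf) || break   # buffer full; don't consume another frame
        next = iterate(itr, state)
    end
    n, state
end

buf = zeros(Int, 4)
n, st = read_block!(1:10, buf)   # buf == [1, 2, 3, 4], n == 4
```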

@haberdashPI
Contributor Author

Just an FYI: I've implemented my various operations as proposed in this PR in the package SignalOperators, which should interoperate fairly well with both SampleBuf objects and AxisArrays objects (and soon with arrays from the DimensionalData package). @rob-luke, I know you were interested in this. I would welcome any feedback.
