
Interested in helping out... #29

Closed · haberdashPI opened this issue Mar 24, 2018 · 9 comments
@haberdashPI (Contributor)

TL;DR: I'm wondering whether there's any interest in my porting some of the methods/features of my Sounds package over to SampledSignals.

Sounds is a very small package I use for manipulating sounds and generating simple stimuli for psychoacoustics experiments. (The name will probably change; I suspect it's too generic right now.) I started the package using SampleBuf objects, but over time I found a few reasons it was a pain to use for my purposes, so I made my own Sound type.

I've been thinking it's in a state where it might be helpful to others if I register it with METADATA.jl. However, it seems it would be a lot better if I could eschew my Sound type and use SampleBuf objects instead; then I could release the sound-manipulation functions as a Psychoacoustics package. SampledSignals is a more fully featured package; I really just implemented what I needed/wanted to use and nothing more. It would be cool to have one package that combines all the work that's been put into SampledSignals with the parts of Sounds I have found convenient.

The README on the GitHub page I linked to gives some usage examples and describes the key differences between Sounds and SampledSignals.

Let me know if there's interest; if so, maybe we could discuss which features from that package it would make sense to move over here, and I could then create a pull request with those features.

@ssfrr (Collaborator) commented Mar 25, 2018

Yeah, I'd certainly love to see Sounds.jl and the JuliaAudio packages work together, and your pipeline API is quite nice. I think your use-cases are really well-aligned with the things I want to make easy, so I'd definitely like to hear more about why the SampledSignals stuff didn't work well for you. Also you've built some nice stuff that I'd love to take advantage of, so I'm 100% supportive of joining forces.

Here are the main parts I initially see that would need to be unified (you probably have a more detailed view, so please feel free to flesh out the list). Maybe take a look and comment and then we can figure out a more concrete plan for how to move forward.

SIUnits.jl vs. Unitful.jl

I've been meaning to make this transition for a while and haven't gotten around to it, but it should be pretty easy and straightforward, and I can take a look at it shortly.

using IntervalSets.jl

This package didn't exist when I was starting SampledSignals, but I've been meaning to switch to it for a while.

Promotion mechanism

In an earlier version of SampledSignals I put the sample rate and channel count in type parameters, but after working with it for a while that seemed like more trouble than it was worth. In general with my Julia code I've found that type parameters are great when you need them for type stability, but trying to move too much logic into the type system gets messy, and regular run-time branches on field values are simpler.

It's been a while since I made the switch, but some issues I remember were:

  1. Worrying about the type of your type parameter: e.g. Foo{48000} is not the same as Foo{48000.0}, so you either normalize to a single type, allow multiple types to be equivalent in your comparisons, or add a check in your inner constructor that throws an error for the wrong type (see the sketch after this list).

  2. I ended up with a ton of type parameters, particularly in SampledSignals where I have SampleBufs, SampleSinks, and SampleSources (which are abstract), and then an ecosystem of concrete source and sink types. Then whenever I wanted to change how something worked, it rippled through wherever those types were used and created a lot of tight coupling.
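
To make item 1 concrete, here's a minimal sketch (Foo and Bar are throwaway names, not anything from SampledSignals):

struct Foo{SR}  # sample rate baked into a type parameter
    data::Vector{Float64}
end

Foo{48000}(zeros(4)) isa Foo{48000.0}  # false: Int and Float64 parameters differ

# one fix: an inner constructor that rejects (or normalizes) the parameter
struct Bar{SR}
    data::Vector{Float64}
    function Bar{SR}(data) where {SR}
        SR isa Int || throw(ArgumentError("sample-rate parameter must be an Int"))
        new{SR}(data)
    end
end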

That said, there could definitely be a way to improve how I'm doing things right now, with all my conversion happening in the source/sink domain. In particular, precompile times are pretty long with SampledSignals, which I suspect might have to do with my recursive conversion machinery. Having another set of eyeballs and other use cases would be helpful in iterating on the design.

More convenient playback

My plan has been to revive my old AudioIO package (just reusing the name) as a metapackage that exports play and record functions and auto-chooses the audio backend (usually PortAudio, but WAV.jl has a native PulseAudio backend with no binary dependency that I'd like to pull out as a separate package). That also might be a good place for most of the Sounds.jl functions to live; most of them don't seem particularly psychoacoustics-specific, and they're nice general-purpose audio processing.

Unit usage

I've struggled a bit to figure out where units are really useful. At first I made SampledSignals pretty unit-full, e.g. samplerate(buf) would return the sample rate in Hz. The issue that came up was that my other computations often weren't unitful, so I was constantly stripping and adding units as I got things in and out of SampledSignals. Where I've landed now is using units as a convenience UI, so you can do things like indexing a buffer in seconds (or Hz for frequency-domain buffers), but the API functions generally return unitless floats, and you don't need to supply unitful arguments. I'd be curious to hear where units have worked well in your experience. (amplify is definitely a case where it's great to be able to use either amplify(0.5) or amplify(-6dB), though it might be nice to just add a * method.)
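
For what it's worth, here's a minimal sketch of that dual-signature idea; the dB conversion is written out by hand rather than relying on any particular units package, and amplify_db is an invented name:

db_to_ratio(db) = 10^(db / 20)  # amplitude (root-power) ratio from dB

amplify(ratio::Real) = x -> x .* ratio           # amplify(0.5)
amplify_db(db::Real) = amplify(db_to_ratio(db))  # amplify_db(-6)

amplify_db(-6)(ones(3))  # ≈ [0.501, 0.501, 0.501]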

Streaming audio

One thing that's important to me (and possibly a source of extra complexity that doesn't matter for your use case) is that things work for on-line streaming audio. Fortunately the DSP.jl filtering works well on-line, and I think most of your functions would work well on-line as well, e.g. one could put together a quick on-line filtering pipeline:

record() |> lowpass(2kHz) |> amplify(20dB) |> play()

Or if you have a gigantic (larger than memory) audio file that you wanted to filter you could do:

loadstreaming("source.wav") do src
    loadstreaming("dest.wav") do dest
        src |> lowpass(2kHz) |> amplify(20dB) |> rampon(500ms) |> rampoff(500ms) |> dest
    end
end

Hopefully loadstreaming and savestreaming will make it into FileIO.jl pretty soon. We'd need to define a method to make dest callable to enable this API, or else use a different operator than |>. Doing rampoff on a stream would have to add latency as long as the fade time (so it can start applying the fade when its input stream ends), but that's not a deal-breaker, and it's a really nice API.
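
One hedged sketch of the sink-as-terminator option (SampleSource, SampleSink, and write(sink, src) are existing SampledSignals names; whether |> is the right spelling is exactly the open question):

# let a pipeline end in a sink by forwarding |> to the existing
# streaming write method
Base.:(|>)(src::SampleSource, sink::SampleSink) = write(sink, src)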

@ghost commented Mar 25, 2018 via email

@haberdashPI (Contributor, Author)

Great! Here are my comments:

Unitful and IntervalSets

Cool. This was a minor barrier in my use case, because I was already using Unitful and IntervalSets elsewhere in my application of these sound-manipulation routines. It sounds like we're on the same page about ultimately moving to these newer packages.

Promotion mechanism

My current approach is very convenient for my purposes, but I haven't thought much about whether it would scale to your use cases. Maybe it would help if I explained my reasoning; then I'll address your specific points.

For my purposes, the problem I'm trying to solve is making the following operations easy and straightforward:

  1. sound mixing (i.e. addition)
  2. envelope application (i.e. multiplication)
  3. concatenation

I want those to work even when sounds differ in channel count, bit rate, sample rate, or length. For different lengths, I automatically pad the shorter sound with the additive identity for mixing and the multiplicative identity for envelopes (à la broadcast); a naive sketch follows.
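
Here's what I mean for the mixing case (mix is the Sounds name, but this implementation is just illustrative):

# mix two signals of different lengths by padding the shorter with the
# additive identity (zero); envelope application would pad with one
function mix(a::AbstractVector, b::AbstractVector)
    n = max(length(a), length(b))
    pad(x) = vcat(x, zeros(eltype(x), n - length(x)))
    pad(a) .+ pad(b)
end

mix([1.0, 1.0, 1.0], [1.0])  # [2.0, 1.0, 1.0]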

For the remaining formatting differences, the first obvious, simple solution to me was to encode that information (everything but length) in the type of a Sound object and use type promotion. I imagine the same goals could be accomplished with struct fields; it didn't seem any easier or harder to implement in my case, and type parameters might lead to faster runtimes. However, I haven't benchmarked anything, because I normally generate sounds offline and it seemed fast enough for my purposes.

The functionality of these three operations (and a few others) was the other big barrier to my using SampledSignals; I would have had to commit type piracy or even overwrite some methods, for example overwriting Base.vcat for SampleBuf. I was worried this would break in not-so-obvious ways if the methods defined for SampleBuf in SampledSignals changed, or if I didn't load my package last.

Thinking about the specific problems with putting the sample rate in the type vs. in a field of the struct:

  1. Regarding the type of the sample rate itself: yes, my inner constructor ensures the sample rate is an integer. (I could always use a Float64 instead, though; are there applications with a fractional sampling rate???)

  2. Concerning the proliferation of type parameters, with the caveat that I'm not completely sold on type parameters (I don't love how unreadable the errors can get): I wonder if the solution for a larger ecosystem of sampled objects is to have one type that carries all the formatting information, possibly with a type hierarchy where the formatting information differs. Then every sampled signal can have a single type parameter indicating its format (see the sketch below). That might avoid the proliferation of parameters, while keeping the advantages of type parameters: type promotion can be readily leveraged, and it might lead to faster runtimes.
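
Here's a hedged sketch of that single-format-parameter idea (all names invented for illustration):

# one format type carries sample rate, element type, and channel count;
# each signal type then takes a single format parameter
struct SignalFormat{R,T,C} end

struct Buf{F<:SignalFormat}
    data::Matrix{Float64}
end

struct FileSource{F<:SignalFormat}
    io::IO
end

# promotion is defined once, over formats, rather than per signal type
promote_format(::Type{SignalFormat{R1,T1,C1}}, ::Type{SignalFormat{R2,T2,C2}}) where {R1,T1,C1,R2,T2,C2} =
    SignalFormat{max(R1, R2), promote_type(T1, T2), max(C1, C2)}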

Playback

I definitely see my current solution for this as a band-aid rather than a good general-purpose solution. I like the interface of just calling play or record, but yeah, ultimately it makes more sense to be able to switch out backends, with an implied default backend.

I like your point about the sound manipulation functions potentially being more general purpose. And yeah, AudioIO sounds like a reasonable place for those functions to live.

Unit usage

The main advantage I've found in using units is the readability of the API: e.g. tone(200Hz, 500ms) is a lot clearer than tone(200, 500). Sometimes I want to compute a number of samples, and units are pretty handy there too, e.g. floor(Int, 50ms * samplerate(sound)). This seems safer and not much more verbose than the unitless version. These are my most common use cases, so I decided to return a unitful value from samplerate.

An older version of the API assumed a particular unit, added it automatically, and threw a warning: e.g. tone(200, 500) would warn that it was assuming a 500-second-long tone at 200 Hz was the intended output. However, I never found myself needing the functions without units, and it's really easy to add units, so it seems simple enough to make these errors instead of warnings. It would be easy to change them back to warnings (it just requires changing the definitions of my inHz, inseconds, and inframes methods).

For more general use, maybe it makes sense to have unitless and unitful versions of samplerate (samplerate and samplerate_Hz??), or maybe it's easy enough to just write samplerate(x)*Hz when you need the units. Not sure about that.

There is one place where I found units annoying, and it's the only place in my current API where I don't employ them: defining a sound in terms of a function, like so:

sound = Sound(t -> 1000t .% 1, 2s) |> normpower

A unitful version would look something like the following:

sound = Sound(t -> uconvert(unit(1), 1kHz * t) .% 1, 2s) |> normpower

That version is a lot less readable, and there's little value added.

However, I really think this is a problem with the implementation of Unitful. In this particular case, Unitful should be able to know that the resulting number is unitless and convert the result to an appropriate Real type. Maybe that's an issue I should submit, or even make a pull request for...
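
For what it's worth, the collapse can already be requested explicitly; my gripe is just that it isn't automatic:

using Unitful

uconvert(Unitful.NoUnits, 1u"kHz" * 2u"s")  # 2000, a plain number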

Regarding amplify: I also implemented * and / for Real and Sound inputs, but in a chain of operations I find |> amplify(10dB) more readable than |> x -> x*10dB. In a one-off manipulation you can still do x = 20dB * x. Perhaps I could eschew amplify if JuliaLang/julia#24990 gets merged, since |> 20dB * _ is fairly readable.

Streaming audio

I actually implemented these methods for streaming audio a while back. I ended up unhappy with my implementation of streams: they were buggy and hard to troubleshoot, and I eventually found a simpler solution to the particular problem I was facing at the time, one that didn't require streams.

However, given a well-defined API for low-level stream operations that isn't buggy, I could easily resurrect the versions of the sound-manipulation functions I wrote to work with streams; they were pretty similar to the static sound versions.

I like your idea of working with streams via pipes; that makes sense to me. Here are a few adjustments to it:

loadstreaming("source.wav") do src
    loadstreaming("dest.wav") do dest
        src |> lowpass(2kHz) |> amplify(20dB) |> ramp(500ms) |> write(dest)
    end
end

That is, rampon and rampoff can be collapsed into a single ramp, and at the end dest need not itself be a function: you can have a function that returns a function that writes to that stream. Maybe it isn't called write; maybe it's named something else. A one-line sketch follows.
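
For concreteness, the curried terminator could be as small as this (writeto is a hypothetical name, per the caveat above):

# return a function that consumes a source and writes it to dest, so a
# pipeline can end with |> writeto(dest)
writeto(dest) = src -> write(dest, src)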

Whew! That was a lot. Those are my thoughts!

I haven't thought through what the next steps should be, but there is already a lot here, so I'll post this now and come back to this issue later when I have some more time.

@ghost commented Mar 25, 2018 via email

@haberdashPI (Contributor, Author)

Okay, so I've been thinking about what the changes would actually look like, and here's what I see as the outstanding questions.

Category 1 - How to concatenate sounds

Bottom line - I could override Base.vcat, define a new method sequence, or do both.

Both mix and envelope seem readily definable for SampleBuf (minus the questions about how to handle promotion I raise below). However, right now I redefine Base.vcat in Sounds to handle concatenation of sounds with differing numbers of channels. I could do this for SampledSignals as well.

An alternative is to define a new method (sequence??) that can concatenate sounds with different channel counts, and leave Base.vcat untouched. In favor of vcat: (1) it is an obvious way to concatenate array-like objects, (2) it has a built-in syntax that is visually comprehensible, and (3) I don't currently foresee problems with vcat behaving a little differently for sounds.

An argument against vcat is that it would not be a natural or obvious interface for sound streams. Another potential concern is that automatic promotion across channel counts may not make sense for all types of sampled signals: EEG signals should probably not just duplicate or mix channels (there'd need to be some sort of spatial interpolation... really, though, that's an entirely different set of design constraints). Maybe the formatting object I propose below could carry a flag indicating how channels should be increased or decreased.

It might even be that both vcat and sequence are defined, so users can leverage what they already know about vcat, but use sequence for streams and sounds just as easily.

All of that should be relatively straightforward to implement, so it's mostly a question of design. I'm inclined to define both, making vcat and sequence equivalent for SampleBuf objects but defining only sequence for streaming objects. A naive sketch of the channel promotion follows.
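
As a naive sketch of what channel-count promotion during concatenation might look like (duplicating channels is just the simplest possible promotion rule):

# concatenate sample matrices (frames × channels), promoting narrower
# inputs up to the widest channel count by duplication
function sequence(xs::AbstractMatrix...)
    nch = maximum(size(x, 2) for x in xs)
    widen(x) = size(x, 2) == nch ? x : repeat(x, 1, nch ÷ size(x, 2))
    vcat(map(widen, xs)...)
end

sequence(ones(2, 1), zeros(2, 2))  # 4×2 matrix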

Thoughts?

Category 2 - When to use units

Bottom line - in the absence of any further impetus, I say the rule should be: unitful inputs, unitless outputs.

I think there's agreement that units are useful as input to functions, but it sounds like unitful outputs have been problematic in your experience.

One argument against avoiding units is that internal calls, as opposed to the public API, can avoid unitful values in general. In Sounds I regularize all inputs using inHz, inseconds, and insamples as appropriate, and then all internal functions have well-defined interfaces in terms of these three quantities, so the internals of Sounds never have to worry about stripping units. For instance, right now I have a unitless version of samplerate, which I somewhat confusingly call rtype (since it is the rate type parameter, typically denoted R, of the Sound object).

However, I'm currently slightly inclined to take the conservative approach and change my design in Sounds to use a unitless samplerate, avoiding the problems you note. The rule would be: unitful values for input, unitless values for output. It would be a very consistent approach to document, and it's not much longer to type samplerate(x)*Hz than samplerate(x) when necessary.

The one somewhat awkward thing about this approach is that a user might want to pass output from samplerate, for instance, as input to another function in the API, and it seems strange that they'd have to specify the units manually when we already know what they are. What I'd really like is for Unitful to know how to strip units away when they cancel out. I've submitted an issue for that here: PainterQubits/Unitful.jl#128.

However, that's a hypothetical set of use cases. The only real use case I can think of where units are more convenient in the output is using samplerate to compute a sample count from a duration (also specified as a unitful quantity), for which I already have the insamples method (sketched below).
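
To make the convention concrete, here's a minimal sketch of those boundary functions (inseconds and insamples are the Sounds names; the implementations here are assumptions, not the actual code):

using Unitful
using Unitful: s

inseconds(t::Unitful.Time) = ustrip(uconvert(s, t))  # unitful duration -> plain seconds
insamples(t, rate::Real) = floor(Int, inseconds(t) * rate)

insamples(500u"ms", 44100)  # 22050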

Category 3 - How to promote

Bottom line - I'd like to try implementing a type-parameter version of SampleBuf and friends, then share the results with you and discuss.

How to do this seems like an 'empirical' question. I really like the way type parameters worked out for Sounds, and I'd be willing to take a shot at this approach for the more general SampledSignals design, based on the idea from my last post: a formatting object stored as a single type parameter that's shared across the different signal types.

Alternatively, it should also be possible to change my design for Sounds to use a parallel promotion mechanism (promote_sounds) that does basically the same thing my code does now, but using fields of the structs (or a simple interface defined over all kinds of sampled signals). This could probably leverage the "lower-level" methods in SampledSignals for reformatting and resampling via sinks. That said, I really like how simple the promotion approach was to implement. But maybe it won't scale??? A rough sketch of the field-based alternative follows.
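
Here's roughly what I mean (promote_sounds is the name from above; convertformat is a hypothetical stand-in for whatever low-level reformatting SampledSignals provides):

# promote two signals to a common run-time format by comparing fields
function promote_sounds(a, b)
    rate = max(samplerate(a), samplerate(b))
    T = promote_type(eltype(a), eltype(b))
    ch = max(nchannels(a), nchannels(b))
    convertformat(a, T, rate, ch), convertformat(b, T, rate, ch)
end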

Either way, I could start making changes, see which approach seems better/easier to do, and then we can discuss the results.

Category 4 - Incorporating streaming audio

Bottom line - I think this should be pretty straightforward to do.

I took a brief look at what streaming involves in a little more detail, and it does seem like it would be pretty straightforward to implement the Sounds operations over streams without much trouble. This change could also happen at a different pace: i.e., the operations could be defined over SampleBuf first and eventually extended to streams once a design has been settled on.

Promotion seems like it would work pretty naturally here: if the stream returns a SampleBuf of the wrong type (assuming we use the type to indicate format), then operations would first promote the chunk of audio being read and then manipulate the sound. A sketch follows.
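
Something like this chunkwise flow is what I have in mind (promoteformat and the target format F are hypothetical; read and nframes come from the existing stream interface):

# apply op chunk by chunk, promoting each chunk to the target format F
function process(src, op, F; blocksize = 4096)
    chunks = []
    while true
        chunk = read(src, blocksize)   # assumed to return a SampleBuf
        nframes(chunk) == 0 && break   # an empty read signals end of stream
        push!(chunks, op(promoteformat(F, chunk)))
    end
    vcat(chunks...)
end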

@haberdashPI (Contributor, Author)

@kino-tech, I think it would be valuable to hear any feedback you have about the interface we're considering, and whether you find it simple, as you recommend. I've certainly aimed to make something intuitive and easy to use with the piping interface (see the README.md for Sounds).

I think it would be really cool to have an interface kids could use for 'turtle-like' exercises. It seems like a lot of tooling needs to happen in Julia first (e.g. an interactive debugger) before that's a realistic aim. Hopefully that'll start happening once v1.0 is done!

@haberdashPI (Contributor, Author)

Oh, one thing I forgot to discuss above: play and record. That's mostly because I think my approach to that interface is less a solution than a hack to make my life easier, so I don't have any deep insight into it, nor is there much in Sounds to "port" over. It seems pretty orthogonal to the other design issues. I'm happy to help there as well, though my most immediate interest is working on the changes I listed above.

@ssfrr (Collaborator) commented Apr 4, 2018

Phew! I just went through and tried to split out the main themes from this thread into separate issues to keep the discussion manageable. Thanks for all your thoughtful feedback and ideas. I'm going to read through the issues tonight and tomorrow and comment further.

ssfrr closed this as completed Apr 4, 2018
@haberdashPI (Contributor, Author)

Sounds great! I should be able to take a more detailed look later in the week (and it sounds like that'll give you time to add any more thoughts you have anyways).
