-
Notifications
You must be signed in to change notification settings - Fork 2
TiledSolutions
Gijs Molenaar edited this page Feb 13, 2014
·
4 revisions
Some important conclusions from Operation 343:
- Antenna phase needs to be solved for on small (1 timeslot) timescales;
- Node overhead becomes the major limitation, given small (1 timeslot by 10-100 channels) domains;
- Other parameters need to be solved for on larger timescales to maximize S/N; things like intrinsic flux are, ideally, solved for the entire MS as a whole, if RAM allows this.
- There could be advantages to simultaneously solving for slowly and quickly varying parameters, this is still at the conjecture stage but does promise to be a very useful option. A scheme to address these points is proposed:
Consider, e.g., a phase parameter that is represented by 1-timeslot-wide polcs. If phase over a larger domain is requested, MeqParm can currently do a "tiled predict", where a larger domain is perfectly tiled by smaller polc domains. [[!table header="no" class="mointable" data=""" | fq1 | fq2 | fq3 time1 | polc 1 (c00) ||| time2 | polc 2 (c00) ||| time3 | polc 3 (c00) ||| time4 | polc 4 (c00) |||
Now, imagine trying a "tiled solve". We solve for all four c00's at the same time, over the "larger" domain of time 1--4. Under the current scheme, [[MeqParm|MeqParm]] would produce a main value and four perturbed values, but because the individual c00s are completely orthogonal, each perturbed value would differ from the main value in only one row of the matrix. Passing such a [[VellSet|VellSet]] through the tree would result in many redundant computations.
Now, imagine we were to produce only **one** perturbed value, with each row of it corresponding to a different c00. Such a [[VellSet|VellSet]] could be processed by the rest of the tree without violating correctness.
[[MeqSolver|MeqSolver]] would need to be made aware of such cases, so as to decompose a "tiled" perturbed value into different equations for different unknowns. However, this change is limited to [[MeqParm|MeqParm]] and [[MeqSolver|MeqSolver]], while the rest of the tree can be happily oblivious to it.
Advantages of this scheme:
* Speed! The larger the domain, the faster we can process it. This is true regardless of any node overhead, as even the most optimized tree can still profit from computing data in larger chunks (CPU cache and pipelining come into play here.)
* Simultaneously solve for long- and short-term variable parameters (e.g. source flux and antenna phases). There is some controversy about the benefits of this, but then it seems no-one's really done it before anyway, so let's experiment.
Cons:
* Extra complexity in [[MeqParm|MeqParm]].
* Some extra information needs to be placed into [[VellSet|VellSet]], to identify "tiled" perturbed values.
* Extra complexity in [[MeqSolver|MeqSolver]].
* Solve matrices quickly get very big (e.g. WSRT case: 26 phases over 1000 timeslots is already a 26000*26000 matrix.) It remains to be seen whether this will really be a problem; in any case these matrices will be sparse, of a very specific (box-banded sparse) type, so the SVD solver could eventually be optimized for this.
## VellSet representation
Changes to [[VellSet|VellSet]] are no longer needed, now that I've implemented [[SpidDiscovery|SpidDiscovery]]. The [[VellSet|VellSet]] still contains just one spid and one perturbed value. [[MeqSolver|MeqSolver]] discovers in advance which spids are tiled, and can reserve space for _N_ equations for each tiled spids.
Tiling information is passed in a [[SpidDefinitionRecord|SpidDefinitionRecord]] during [[SpidDiscovery|SpidDiscovery]].
## Updates
The Update.Values command passed up from the solver currently contains a vector of N update values for the N spids associated with each Parm. In the tiled case (M tiles per spid), it will still pass updates up as a flat vector: M updates for spid 0, M updates for spid 1, ... If tiling is multi-dimensional, then tiles are arranged in C order.
## Implementation plan
* Maaijke will modify [[MeqParm|MeqParm]] to produce "tiled" pertvals. Target date: 09/06/05, although tiling along only the time axis is required for now. MM: I suggest to use the already defined class "[[ComposedPolc|ComposedPolc]]" and specify via class like the [[PolcLog|PolcLog]]. Optional arguments, like varying axis and number of tiles could be given, per default 1 Polc for every Cell in time direction? OS: I'm not sure this belongs in Polc to begin with... what if we wanted to tile [[PolcLogs|PolcLogs]], for example? I think tiling should happen and should be specified on the [[MeqParm|MeqParm]] level. At the parm level, I suggest specifying tiling as follows: {tiling={time=1}}. Axis names can be mapped to numbers via the Meq::Axis routines (see MEQ/Axis.h).
MM: Right. Implemented it now such that if 'tiling' (eg. =[freq=3,time=1]) is specified in the state record, automatically the [[ComposedPolc|ComposedPolc]] is created...only freq and time available at this moment.
* Oleg will modify [[MeqSolver|MeqSolver]] to produce tiles solutions. Target date: 09/06/05
* For now, forget about sparse matrix optimizations, and just build one big matrix. Wim says that a matrix can be accumulated piecewise anyway, so this should be easy. If and when naive inversion becomes a bottleneck, we'll deal with it then.
* If inversion is slow, one speedup for the common case of phase-only solutions is to produce a separate solver per each timeslot. This effectively decomposes the matrix into orthogonal 26x26 blocks, thrown into different solvers. But let's not worry about this until we actually do see a slow inversion...