\section{Introduction}
\subsection{The Vera C. Rubin Observatory}
% Note: paragraph lifted from Zeljko's overview paper
The Rubin Observatory will host the LSST Camera as part of a large, wide-field ground-based optical telescope system
designed to obtain multiple images covering the sky visible from Cerro Pach\'{o}n in northern Chile. The current baseline design, with an 8.4m (6.7m effective) primary mirror, a 9.6 deg$^2$ field of view, and a 3.2 Gigapixel camera, will allow about 10,000 square degrees of sky to be covered every night using pairs of 15-second exposures, with typical 5$\sigma$ depth for point sources of $r\sim24.5$ (AB). The system is designed to yield high image quality as well as superb astrometric and photometric accuracy. The total survey area will include $\sim$30,000 deg$^2$ with $\delta<+34.5^\circ$, and will be imaged multiple times in six bands, $ugrizy$, covering the wavelength range 320--1050 nm. This is referred to as the Legacy Survey of Space and Time (LSST).
For a more detailed, but still concise,
summary of Rubin, please see the overview paper \citep{2008arXiv0805.2366I}\footnote{\url{http://ls.st/2m9}}.
The project is scheduled to begin regular survey operations at the start of the next decade. About 90\% of the observing time will be devoted to a deep-wide-fast survey mode which will uniformly observe an 18,000 deg$^2$ region about 1000 times (summed over all six bands) during the anticipated 10 years of operations, and yield a coadded map to $r\sim27.5$. These data will result in catalogs including over $38$ billion stars and galaxies that will serve the majority of the primary science programs. The remaining 10\% of the observing time will be allocated to special projects such as a Very Deep and Fast time domain survey\footnote{Informally known as ``Deep Drilling Fields''.}.
Rubin will be operated in fully automated survey mode. The images acquired by the LSST Camera will be processed by LSST Data Management software to a) detect and characterize imaged astrophysical sources and b) detect and characterize temporal changes in the LSST-observed universe. The results of that processing will be reduced images, catalogs of detected objects and the measurements of their properties, and prompt alerts to ``events'' -- changes in astrophysical scenery discovered by differencing incoming images against older, deeper, images of the sky in the same direction (\emph{templates}, see \S \ref{sec:templates}). Measurements will be internally and absolutely calibrated.
The \emph{broad}, \emph{high-level}, requirements for LSST Data Products are given by the \emph{LSST Science Requirements Document} (\SRD; \citeds{LPM-17}). This document lays out the \emph{specifics} of what the data products will comprise, how those data will be generated, and when. It serves to inform the flow-down from the LSST \SRD through the \emph{LSST System Requirements Document} (the \LSR; \citeds{LSE-29}) and the \emph{LSST Observatory System Specifications} (\OSS; \citeds{LSE-30}), to the \emph{LSST Data Management System Requirements} (\DMSR; \citeds{LSE-61}), the UML model (\appsUMLdomain), and the database schema (\citeds{LDM-153}).
The \OSS explicitly ties this document to the requirements flow down through requirements OSS-REQ-0126 Prompt, OSS-REQ-0133 Data Release, OSS-REQ-0139 User Generated, OSS-REQ-0391 (general conventions), and OSS-REQ-0392 (special programs).
Throughout this document margin notes are used to provide linkage to formal LSST requirements and parameters associated with the nearby text.
\subsection{General Image Processing Concepts for LSST}
A raw image delivered by the LSST Camera (baselined as a pair of successive 15-second exposures, called \emph{snaps})
is processed by the Single Frame Processing Pipeline to produce a single-visit image with, at least
conceptually, counts proportional to the photon flux entering the
telescope pupil (in reality, there are many additional optical, pixel, and
bandpass effects, including random counting noise and various subtle
systematic errors, that are treated during subsequent processing).
This single-visit image is called a ``Processed Visit Image''; its main data structures include counts, their variance, and
various masks, all defined on a per-pixel basis.
These single-visit images are used downstream to produce coadded and difference
images. The rest of the processing is essentially a model-based interpretation
of imaging observations that includes numerous astrophysical and other
assumptions.
The basic interpretation model assumes a sum of discrete (but possibly overlapping)
sources and a relatively smooth background. The background has a different
spectral energy distribution than discrete sources, and it can display both
spatial gradients and temporal changes. Discrete sources can vary
in brightness and position. The motion can be slow or fast (less or more motion between two successive
observations than roughly the size of the seeing disk); this distinction naturally separates stars
with proper motions and trigonometric parallaxes from moving objects in the Solar System.
Some objects that vary in brightness can be detectable for only a short period of time
(e.g., supernovae and other cosmic explosions).
The image interpretation model separates the time-independent model
component from the temporally changing component (``DC'' and ``AC'',
respectively). Discrete DC sources are \textit{neither} operationally nor astrophysically
associated with discrete AC sources, even when they are spatially coincident.
Images of discrete objects are
modeled using two models (\S\ref{sec:objchar}). A two-component galaxy model includes a linear
combination of bulge and disk, with their radial intensity variation described using
S\'ersic profiles.
% \textbf{XXX this may change:} With 8 model parameters for each of the
% six LSST bandpasses, this model includes 48 free parameters.
Stars are modeled using a moving point source model with its parallax motion
superposed on a linear proper motion. This model shares motion parameters across
the six bandpasses and assumes constant flux in each band, and thus includes
11 free parameters. Both galaxy and stellar models are fit to all objects, except
for fast-moving objects (the Solar System objects), which are treated separately.
Discrete objects detected in \emph{difference} images will be modeled using three models:
a point source model, a trailed point source model, and a point source dipole model.
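The 11-parameter moving point source model described above can be sketched in a few lines (our own illustration, not pipeline code; the parameter names are hypothetical, and the parallactic displacement is reduced to externally supplied parallax factors):

```python
# Hypothetical parameter vector for the moving point source model:
# 2 reference-epoch coordinates + 2 proper motions + 1 parallax
# + 6 per-band fluxes = 11 free parameters, as stated in the text.
params = {
    "ra0": 150.0,        # deg, position at the reference epoch
    "dec0": 2.2,         # deg
    "muRa": 0.05,        # arcsec/yr, proper motion
    "muDec": -0.01,      # arcsec/yr
    "parallax": 0.02,    # arcsec
    "flux": {b: 1000.0 for b in "ugrizy"},  # nJy, constant in each band
}

def predicted_position(p, t_yr, pf_ra, pf_dec):
    """Position at time t (years past the reference epoch): parallactic
    motion superposed on linear proper motion. The parallax factors
    (pf_ra, pf_dec) encode the Earth's orbital geometry at the epoch of
    observation and are supplied externally."""
    arcsec2deg = 1.0 / 3600.0
    ra = p["ra0"] + (p["muRa"] * t_yr + p["parallax"] * pf_ra) * arcsec2deg
    dec = p["dec0"] + (p["muDec"] * t_yr + p["parallax"] * pf_dec) * arcsec2deg
    return ra, dec
```

The five motion parameters are shared across the six bandpasses, while each band carries one constant flux, which is how the total of 11 free parameters arises.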
\subsection{Classes of LSST Data Products}
The main LSST data products are illustrated in Figure~\ref{fig:Detail0} (see Appendix for
a conceptual design of pipelines which will produce these data products).
LSST Data Management will perform image analysis on two different timescales, with two corresponding categories of output data products\footnote{Note that prior to 2018 data products were referred to as Level 1, Level 2, and Level 3; this nomenclature was updated in 2018 to Prompt Products, Data Release Products and User Generated Products respectively \citedsp{LPM-231}.}:
\begin{figure}[!t]
\centering
\vskip -0.7in
\includegraphics[scale=0.515, angle=270]{gliffy/LSSTimageProcessingDetail0}
\vskip -0.7in
\caption{Overview of data products produced by LSST Imaging Processing Science Pipelines.\label{fig:Detail0}}
\end{figure}
\begin{enumerate}
\item \textbf{Prompt Processing}, which is performed on a nightly or daily basis and results in \textbf{Prompt Data Products}. The goal of this processing is the detection and characterization of astrophysical phenomena that are revealed by their time-dependent nature. The detection of supernovae superimposed on bright extended galaxies is an example of this analysis. Prompt products will include single visit images, difference images, catalogs of sources detected in difference images (\DIASources), astrophysical objects\footnote{The LSST has adopted the nomenclature by which single-epoch detections of astrophysical \emph{objects} are called \emph{sources}. The reader is cautioned that this nomenclature is not universal: some surveys call \emph{detections} what LSST calls \emph{sources}, and use the term \emph{sources} for what LSST calls \emph{objects}.} these are associated to (\DIAObjects), and Solar System objects (\SSObjects\footnote{\SSObjects used to be called ``Moving Objects'' in previous versions of the LSST Data Products baseline. The name is potentially confusing as high-proper motion stars are moving objects as well. A more accurate distinction is the one between objects \emph{inside} and \emph{outside} of the Solar System.}). The catalogs will be entered into the \textbf{\PPDB} and made available in near real time. Notifications (``alerts'') about new \DIASources will be issued using community-accepted
standards\footnote{For example, VOEvent, see \url{http://ls.st/4tt}} within 60 seconds of observation. Prompt data products are discussed in \S~\ref{sec:level1}.
\item \textbf{Data Release Processing}, which is performed annually\footnote{Except for the first two data releases, which will be created six months apart.} and produces \textbf{Data Release} data products. These processing campaigns use all data collected by the survey to date in order to perform a comprehensive analysis, including measurement of both faint static objects through the construction of deep coadds, and light curve characterization through image differencing across all available epochs.
The data products produced will include the calibrated single-epoch images, deep coadds, catalogs of characterized \Objects (detected on deep coadds as well as individual visits\footnote{The LSST takes two exposures per pointing, nominally 15 seconds in duration each, called \emph{snaps}. For the purpose of data processing, that pair of exposures will typically be coadded and treated as a single exposure, called a \emph{visit}.}), \Sources\footnote{When written in bold monospace type (i.e., \texttt{\textbackslash{}tt}), \Objects and \Sources refer to objects and sources detected and measured as a part of Data Release processing.} (detections and measurements on individual visits), and \ForcedSources (constrained measurement of flux on individual visits). It will also include difference image analysis tables similar to those produced during Prompt Processing (see \S \ref{sec:l1dbreproc}). In contrast to the \PPDB, which is updated continuously during observing, the \DR is static and will not change after release. Data Release data products are discussed in \S~\ref{sec:level2}.
\end{enumerate}
The two processing timescales are driven by differing scientific requirements. Changes in flux or position of objects may need to be immediately followed up, lest interesting information be lost. Thus the primary results of analysis of difference images -- discovered and characterized \DIASources{} -- generally need to be broadcast as \emph{event alerts} within 60 seconds\reqparam{OTT1} of end of visit acquisition.\dmreq{0004} The analysis of science (direct) images is less time sensitive, and will be done as a part of annual data release process.
Recognizing the diversity of astronomical community needs, and the need for specialized processing not part of the automatically generated Prompt and Data Release products, LSST plans to devote 10\% of its data management system capabilities to enabling the creation, use, and federation of \textbf{User Generated} data products. The \textbf{Science Platform} will enable science cases that greatly benefit from co-location of user processing and/or data within the LSST Archive Center. The high-level requirement for this is established in \S~3.5 of the LSST \SRD. Some details are discussed in \S~\ref{sec:level3} of this document.
Finally, LSST Survey Specifications (\S~3.4 of LSST \SRD) prescribe that 90\% of LSST observing time be spent in the so-called ``universal cadence'' mode of surveying the sky. These observations will result in Prompt and Data Release data products discussed above. The remaining 10\% of observing time will be devoted to \textbf{special programs}, designed to obtain improved coverage of interesting regions of observational parameter space. Examples include very deep ($r\sim26$, per exposure) observations, observations with very short revisit times ($\sim$1 minute), and observations of ``special'' regions such as the Ecliptic, Galactic plane, and the Large and Small Magellanic Clouds. The data products for these programs will be generated using the same processing software and hardware and possess the general characteristics of Prompt and Data Release data products, but may be performed on a somewhat different cadence. They will be discussed in \S~\ref{sec:specialProgs}.
\section{General Considerations}
Most LSST data products will consist of images and/or catalogs. The catalogs will be stored and offered to the users as \emph{relational databases} which they will be able to query. This approach was shown to work well by prior large surveys, for example the Sloan Digital Sky Survey (SDSS).
Catalogs may be stored in different databases to meet the operational requirements particular to each dataset. The \PPDB will hold catalogs from Prompt Processing, to enable rapid access to recent measurements.
Data Release ``universal cadence'' catalogs will be stored in a \DR. The products for special programs may be stored in many different databases, depending on the nature of the program.
Nevertheless, all these databases will follow certain naming and other conventions. We discuss these in the subsections to follow.
\subsection{Estimator and Naming Conventions}
For all catalog data, we will employ a convention whereby estimates of standard errors have the suffix \texttt{Err}, while estimates of the inherent widths of distributions (or of functions in general) have the suffix \texttt{Sigma}\footnote{Given $N$ measurements, standard errors scale as $N^{-1/2}$, while widths remain constant.}. The latter are defined as the square root of the second moment about the quoted value of the quantity at hand.
\dmreq{0333}
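The \texttt{Err}/\texttt{Sigma} distinction can be demonstrated numerically (a toy example of ours, not project code): the sample width (\texttt{Sigma}) stays roughly constant as $N$ grows, while the standard error of the mean (\texttt{Err}) shrinks as $N^{-1/2}$:

```python
import random
import statistics

random.seed(42)  # fixed seed so the demonstration is reproducible

def width_and_err(n, sigma=1.0):
    """Draw n Gaussian samples; return the sample width (the 'Sigma'
    convention: square root of the second moment about the mean) and the
    standard error of the mean (the 'Err' convention)."""
    xs = [random.gauss(0.0, sigma) for _ in range(n)]
    width = statistics.pstdev(xs)   # inherent width: ~sigma, independent of n
    err = width / n ** 0.5          # standard error: shrinks as n grows
    return width, err

w1, e1 = width_and_err(100)
w2, e2 = width_and_err(10000)
# w1 and w2 both stay near 1.0, while e2 is roughly 10x smaller than e1
```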
Unless noted otherwise, maximum likelihood values (called likelihood for simplicity) will be quoted for all fitted parameters (measurements). Together with covariances, these let the end-user apply whatever prior they deem appropriate when computing posteriors\footnote{There's a tacit assumption that a Gaussian is a reasonably good description of the likelihood surface around the ML peak.}. Where appropriate, multiple independent samples from the likelihood may be provided to characterize departures from Gaussianity.
We will provide values of log likelihood, the $\chi^2$ for the fitted parameters, and the number of data points used in the fit. \dmreq{0331}Database functions (or precomputed columns) will be provided for frequently used combinations of these quantities (e.g., $\chi^2/dof$). These can be used to assess the model fit quality. Note that, \textit{if the errors of measured quantities are normally distributed,} the likelihood is related to the $\chi^2$ as:
%
\begin{equation}
L = \left(\prod_{k}\frac{1}{\sigma_k \sqrt{2 \pi}}\right) \exp \left[- \frac{\chi^2}{2}\right]
\end{equation}
%
where the index $k$ runs over all data points included in the fit.
For completeness, $\chi^2$ is defined as:
%
\begin{equation}
\chi^2 = \sum_{k} \left( \frac{x_k-\bar{x}}{\sigma_k}\right)^2,
\end{equation}
%
where $\bar{x}$ is the mean value of $x_k$.
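Under the Gaussian-error assumption, these quantities relate as in the following sketch (illustrative only; here the fitted model is simply a weighted mean of a constant flux, and the stored-column combinations are placeholders):

```python
import math

def chi2(values, errors, model):
    """Chi-squared of the data points against a fitted model value,
    following the definition in the text."""
    return sum(((x - model) / s) ** 2 for x, s in zip(values, errors))

def log_likelihood(values, errors, model):
    """Gaussian log-likelihood: -chi2/2 minus the normalization term
    sum_k log(sigma_k * sqrt(2*pi)), i.e. the log of the equation above."""
    norm = sum(math.log(s * math.sqrt(2.0 * math.pi)) for s in errors)
    return -0.5 * chi2(values, errors, model) - norm

# Example: three measurements of a constant flux; the ML fit for a
# constant model with Gaussian errors is the inverse-variance weighted mean.
values, errors = [10.0, 11.0, 9.5], [0.5, 0.5, 1.0]
weights = [1.0 / s ** 2 for s in errors]
fit = sum(w * x for w, x in zip(weights, values)) / sum(weights)
dof = len(values) - 1            # number of data points minus fitted parameters
reduced_chi2 = chi2(values, errors, fit) / dof
```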
For fluxes, we recognize that a substantial fraction of astronomers will just want the posteriors marginalized over all other parameters, trusting the LSST experts to select an appropriate prior\footnote{It's likely that most cases will require just the expectation value alone.}. For example, this is nearly always the case when constructing color-color or color-magnitude diagrams. We will support these use cases by \dmreq{0331}providing additional pre-computed columns, taking care to name them appropriately so as to minimize accidental incorrect usage. For example, a column named \texttt{gFlux} may be the expectation value of the g-band flux, while \texttt{gFluxML} may represent the maximum likelihood value.
\subsection{Image Characterization Data}
Raw images will be processed to \dmreq{0069}remove instrumental signature and characterize their properties, \dmreq{0327}including backgrounds (both due to night sky and astrophysical), \dmreq{0070}the point spread function and its variation, \dmreq{0029}photometric zero-point model, and the \dmreq{0030}world coordinate system (WCS).
That characterization is crucial for deriving LSST catalogs and understanding the images. It will be kept and made available to the users. The exact format used to store these (meta)data will depend on the final adopted algorithm in consultation with the scientific community to ensure the formats in which these data are served are maximally useful.\dmreq{0328}
\dmreq{0072}Each processed image\footnote{It is also frequently referred to as \emph{calibrated exposure}, from the Butler product type of \texttt{calexp}.}, including the coadds, will record information on pixel variance (the ``variance plane''), as well as per-pixel masks (the ``mask plane''). These will allow the users to determine the validity and usefulness of each pixel in estimating the flux density recorded in that area of the sky.
This information will be per-pixel, and potentially unwieldy to use for certain science cases. We plan to investigate approximate schemes for storing masks based on geometry (e.g., similar to Mangle or STOMP), \emph{in addition} to storing them on a per pixel basis.\dmreq{0326}
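As an illustration of how such per-pixel mask planes are typically consumed (the bit names and positions below are hypothetical; the actual mask-plane dictionary is defined by the pipelines):

```python
# Hypothetical mask-plane bit assignments; the real names and positions
# will come from the pipeline's mask-plane dictionary.
MASK_BITS = {"BAD": 0, "SAT": 1, "CR": 2, "EDGE": 3}

def is_usable(mask_value, reject=("BAD", "SAT", "CR")):
    """True if none of the rejected mask planes are set for this pixel,
    i.e. the pixel may be used when estimating flux in that area of sky."""
    return all(not (mask_value >> MASK_BITS[name]) & 1 for name in reject)
```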
\subsection{Fluxes and Magnitudes}
\label{sec:fluxes}
\dmreq{0043}
Because flux measurements on difference images (Prompt data products; \S~\ref{sec:level1}) are performed against a template
and thus represent a flux difference, the measured flux of a source on the difference image can be negative. The flux can also go negative for faint sources in the presence of noise. Negative fluxes cannot be stored as (Pogson) magnitudes; $\log$ of a negative number is undefined. We therefore prefer to store fluxes, rather than magnitudes, in database tables\footnote{This is a good idea in general. E.g., given multi-epoch observations, one should always be averaging fluxes, rather than magnitudes.}.\dmreq{0347}
LSST fluxes are quoted in units of nanojansky\footnote{For details, please see LSST \citeds{Document-27758}.} (1 nJy = $10^{-35}$ W m$^{-2}$ Hz$^{-1}$).
We will provide columns with flux and flux errors as well as estimates of the relative and absolute photometric calibration errors, and the normalized system response (for details see Section 3.3.4 in the LSST Science Requirements Document).
We acknowledge that the large majority of users will want to work with magnitudes. For convenience, we plan
in addition
to provide columns with (Pogson) magnitudes and magnitude errors\footnote{These will most likely be implemented as ``virtual'' or ``computed'' columns.}, where values with negative flux will evaluate to \code{NULL}.
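For reference, an AB magnitude follows from a flux in nanojansky via $m_{\rm AB} = -2.5\log_{10}(f_\nu/3631\,{\rm Jy})$; the sketch below (our own, with the \code{NULL} behavior represented as \texttt{None}) illustrates the computed-column behavior described above:

```python
import math

# 3631 Jy = 3631e9 nJy defines the AB zero point, so the nJy zero point
# constant is 2.5*log10(3631e9) ~ 31.4.
AB_ZP_NJY = 2.5 * math.log10(3631e9)

def ab_magnitude(flux_njy):
    """AB magnitude for a flux in nJy; None (i.e. NULL) for non-positive
    fluxes, since the log of a non-positive number is undefined."""
    if flux_njy is None or flux_njy <= 0:
        return None
    return -2.5 * math.log10(flux_njy) + AB_ZP_NJY

def ab_magnitude_err(flux_njy, flux_err_njy):
    """First-order propagated magnitude error: (2.5/ln 10) * sigma_f/f."""
    if flux_njy is None or flux_njy <= 0:
        return None
    return 2.5 / math.log(10.0) * (flux_err_njy / flux_njy)
```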
\subsection{Uniqueness of IDs across database versions}
\dmreq{0292}
To reduce the likelihood of confusion, all IDs shall be unique across databases and database versions, other than those corresponding to uniquely identifiable entities (e.g., IDs of exposures).
For example, the DR4 and DR5 releases (or any other pair) will share no identical \Object, \Source, \DIAObject, or \DIASource IDs (see \S~\ref{sec:level1}~and~\ref{sec:level2} for the definitions of \Objects, \DIAObjects, etc.).
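One simple way to guarantee such cross-release uniqueness (purely illustrative; the actual ID scheme is left to the Data Management database team) is to reserve a few high-order bits of a 64-bit integer ID for the release number:

```python
RELEASE_BITS = 8  # hypothetical choice: supports up to 256 data releases

def make_object_id(release, counter):
    """Pack a release number into the high bits of a 64-bit object ID, so
    IDs minted for different releases can never collide."""
    assert 0 <= release < (1 << RELEASE_BITS)
    assert 0 <= counter < (1 << (64 - RELEASE_BITS))
    return (release << (64 - RELEASE_BITS)) | counter

def release_of(object_id):
    """Recover the release number from a packed ID."""
    return object_id >> (64 - RELEASE_BITS)
```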
\subsection{Repeatability of Queries}
\label{sec:repeatability}
\dmreq{0291}
We require that queries executed at a known point in time against any LSST-delivered database be repeatable at a later date. This promotes the reproducibility of science derived from LSST data. It is of special importance for Prompt product catalogs (\S~\ref{sec:level1}) that will change on a nightly basis as new time domain data is being processed and added to the catalogs.
The exact implementation of this requirement is left to the LSST Data Management database team. One possibility may be to make the key tables (nearly) append-only, with each row having two timestamps -- \texttt{createdTai} and \texttt{deletedTai}, so that queries may be limited by a \code{WHERE} clause:
\begin{lstlisting}[language=SQL,basicstyle=\ttfamily]
SELECT * FROM DIASource WHERE 'YYYY-MM-DD-HH-mm-SS'
BETWEEN createdTai AND deletedTai
\end{lstlisting}
%
or, more generally:
%
\begin{lstlisting}[language=SQL,basicstyle=\ttfamily,showstringspaces=false]
SELECT * FROM DIASource WHERE 'data is valid as of YYYY-MM-DD'
\end{lstlisting}
%
A perhaps less error-prone alternative, if technically feasible, may be to provide multiple virtual databases that the user would access as:
%
\begin{lstlisting}[language=SQL,basicstyle=\ttfamily]
CONNECT lsst-dr5-yyyy-mm-dd SELECT * FROM DIASource
\end{lstlisting}
%
The latter method would probably be limited to nightly granularity, unless there's a mechanism to create virtual databases/views on-demand.
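The append-only scheme behind the first SQL example above can be modeled in a few lines (a toy illustration of the \texttt{createdTai}/\texttt{deletedTai} mechanism, not the actual implementation):

```python
# Toy append-only table: rows are never updated in place; a "deletion"
# merely closes the row's validity interval, so a query pinned to any
# past timestamp always returns the same rows.
INFINITY = float("inf")

class AppendOnlyTable:
    def __init__(self):
        self.rows = []  # each row: [payload, createdTai, deletedTai]

    def insert(self, payload, t):
        self.rows.append([payload, t, INFINITY])

    def delete(self, payload, t):
        for row in self.rows:
            if row[0] == payload and row[2] == INFINITY:
                row[2] = t  # close the validity interval; keep the row

    def query_as_of(self, t):
        """Rows valid at time t -- the WHERE clause in the SQL example."""
        return [r[0] for r in self.rows if r[1] <= t < r[2]]
```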
\clearpage
\section{Prompt Data Products}
\label{sec:level1}
\subsection{Overview}
Prompt data products are the result of nightly processing. These data products are single visit images, difference images, and the results of difference image analysis (DIA; \S \ref{sec:dia}). DIA outputs consist of the sources detected in difference images (\DIASources), the astrophysical objects they are associated with (\DIAObjects), characterizations of previously identified Solar System objects (\SSObjects), and discoveries of new Solar System objects. Various metadata products are also produced during nightly processing and made available to users.
\DIASources are sources detected on difference images (those with the signal-to-noise ratio $S/N>transSNR$ after correlation with the PSF profile, with $transSNR$\reqparam{transSNR} defined in the \SRD and presently set to \transSNR). They represent changes in flux with respect to a deep template. Detections with high probability of being instrumental non-astrophysical artifacts may be excluded. Physically, a \DIASource may be an observation of a new astrophysical object that was not present at that position in the template image (for example, an asteroid), or an observation of a flux change in an existing source (for example, a variable star). Their flux can be negative (e.g., if a source present in the template image reduced its brightness, or moved away). Their shape can be complex (e.g., trailed, for a source with proper motion approaching $\sim$deg/day, or ``dipole-like'', if an object's observed position exhibits an offset -- true or apparent -- compared to its position on the template).
Some \DIASources will be caused by background fluctuations; with $transSNR = \transSNR$,
we expect about one such false positive per CCD (on the order of 200,000 per typical night). The expected number of false
positives due to background fluctuations is a very strong function of adopted $transSNR$: a change of $transSNR$ by 0.5
results in a variation of an order of magnitude, and a change of $transSNR$ by unity changes the number of false
positives by about two orders of magnitude.
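The quoted scaling follows directly from the Gaussian tail probability, as this back-of-the-envelope sketch shows (assuming purely Gaussian background noise; the threshold value of 5.5 below is illustrative, not the adopted $transSNR$):

```python
import math

def gaussian_tail(snr):
    """One-sided probability that pure Gaussian noise exceeds snr sigma."""
    return 0.5 * math.erfc(snr / math.sqrt(2.0))

# Ratio of false-positive rates when the detection threshold is lowered
# by 0.5 and by 1.0, around an illustrative nominal threshold of 5.5:
snr = 5.5
ratio_half = gaussian_tail(snr - 0.5) / gaussian_tail(snr)
ratio_one = gaussian_tail(snr - 1.0) / gaussian_tail(snr)
# ratio_half is of order 10 and ratio_one of order 100, matching the
# sensitivity to transSNR described in the text
```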
Clusters of \DIASources detected on visits taken at different times are associated with either a \DIAObject or an \SSObject,\dmreq{0285} to represent the underlying astrophysical phenomenon. The association can be made in two different ways: by assuming the underlying phenomenon is an object within the Solar System moving on an orbit around the Sun\footnote{We don't plan to fit for motion around other Solar System bodies; e.g., identifying new satellites of Jupiter is left to the community.}, or by assuming it to be distant enough to only exhibit small parallactic and proper motion\footnote{Where ``small'' is small enough to unambiguously positionally associate together individual apparitions of the object.}. The latter type of association is performed during difference image analysis right after the image has been acquired. \dmreq{0089}The former is done during daytime by the object linking element of the Solar System Processing Pipeline, unless the \DIASource is an apparition of an already known \SSObject. In that case, it will be flagged as such during difference image analysis.
\dmreq{0002}
At the end of the difference image analysis of each visit, we will issue time domain event alerts for all newly %discovered
detected \DIASources\footnote{For observations on the Ecliptic near the opposition Solar System objects will dominate the \DIASource counts and (until they're recognized as such) overwhelm the explosive transient signal. It will therefore be advantageous to quickly identify the majority of Solar System objects early in the survey.}.
\subsection{Prompt Data Processing}
\subsubsection{Difference Image Analysis}
\label{sec:dia}
The following is a high-level description of steps which will occur during regular \emph{nightly} difference image analysis (see Figures~\ref{fig:Pipe1} and \ref{fig:Pipes567}):
\begin{enumerate}
\item A visit is acquired and reduced to a single \emph{Processed Visit Image} (cosmic ray rejection, instrumental signature removal\footnote{E.g., subtraction of bias and dark frames, flat fielding, bad pixel/column interpolation, etc.}, combining of snaps, etc.).\dmreq{0069}
\item The Processed Visit Image is differenced against the appropriate template and \DIASources are detected.
If necessary, deblending will be performed at this stage. Both the parent blend and the deblended children will be measured and stored as \DIASources (see next item), but only the children will be matched against \DIAObjects and alerted on. Deblended objects will be flagged as such.\dmreq{0010}
\item The flux and shape\footnote{The ``shape'' in this context consists of weighted 2$^{\rm{nd}}$ moments, as well as fits to a trailed source model and a dipole model.} of the \DIASource are measured on the difference image.
PSF photometry is performed on the Processed Visit Image at the position of the \DIASource to obtain a measure of the total flux.\dmreq{0269}
\item The \PPDB (see \S \ref{sec:level1db}) is searched for a \DIAObject or an \SSObject with which to positionally associate the newly discovered \DIASource\footnote{The association algorithm will guarantee that a \DIASource is associated with not more than one existing \DIAObject or \SSObject. The algorithm will take into account the parallax and proper (or Keplerian) motions, as well as the errors in estimated positions of \DIAObject, \SSObject, and \DIASource, to find the maximally likely match. Multiple \DIASources in the same visit will not be matched to the same \DIAObject.}. If no match is found, a new \DIAObject is created and the observed \DIASource is associated to it.\dmreq{0273}\dmreq{0271}\dmreq{0285}
\item If the \DIASource has been associated with an \SSObject (a known Solar System object), it will be flagged as such and an alert will be issued. Further processing will occur in daytime (see section \ref{sec:ssProcessing}).\dmreq{0274}
\item Otherwise, the associated \DIAObject measurements will be updated with new data
collected during the previous 12 months. Hence, the computed parameters for \DIAObjects have a ``memory'' of past data that does not extend beyond this cutoff\footnote{This restriction is removed when Prompt processing is
rerun during Data Release production, see \S~\ref{sec:l1dbreproc}.}. All affected columns will be recomputed, including proper motions, centroids, light curves, etc.\dmreq{0319}\reqparam{diaCharacterizationCutoff}
\item To aid in rapid prioritization of new \DIAObjects in the latency interval before the precovery forced photometry is run\reqparam{L1PublicT} (see below), a table with the median noise in the difference image per visit is queried. For each visit in the last twelve months at the position of the \DIAObject, if there are no \DIASource or \DIAForcedSource records for that visit, the time of that visit, the filter, and the difference image noise are included in the alert packet. These data allow computation of appropriate upper limits in the difference image light curve.
\item The \DR\footnote{\DR is a database resulting from annual data release processing. See \S~\ref{sec:level2} for details.} is searched for the three nearest stars and three nearest galaxies to the \DIAObject in \Objects\reqparam{diaNearbyObjMaxStar}\reqparam{diaNearbyObjMaxGalaxy} out to some maximum radius.\reqparam{diaNearbyObjRadius} The IDs of these nearest-neighbor \Objects are recorded in the \DIAObject record and provided in the complete \DIAObject record issued with the issued event alert (see below).\dmreq{0271}\dmreq{0002}
\item An alert is issued that includes: the timestamp of when this database has been queried to issue this alert, the \DIASource ID, the complete \SSObject or \DIAObject record\footnote{We guarantee that a receiver will always be able to regenerate the alert contents at any later date using the included timestamps and metadata.}, \emph{including all} \DIASources and \DIAForcedSources from the last 12 months\reqparam{diaCharacterizationCutoff} that are linked with the \SSObject or \DIAObject. The science content associated with the \DR \Objects will not be included. See Section \ref{sec:voEventContents} for a more complete enumeration.
\dmreq{0274}
\item For all \DIAObjects overlapping the field of view, including those that have an associated
new \DIASource from this visit, forced photometry will be performed on the difference image (point source photometry only). Those measurements will be stored as \DIAForcedSources. No alerts will be issued for these \DIAForcedSources, but the \DIAForcedSource measurements will be included in any future alerts triggered by a new \DIASource at that location.\dmreq{0317}
\item Within 24 hours of discovery\reqparam{L1PublicT}, \emph{precovery} PSF forced photometry will be performed on any difference image overlapping the position of new \DIAObjects taken within the past 30 days\reqparam{precoveryWindow}, and added to the \DIAForcedSource table. Alerts will not be issued with precovery photometry information but the resulting \DIAForcedSource measurements will be included in future alerts from this \DIAObject.\dmreq{0287}
\end{enumerate}
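The per-visit difference-image noise entries included in the alert packet allow a receiver to compute upper limits for visits with no detection. Below is a minimal sketch, in Python, of how a broker might do this; the field names (\texttt{ccdVisitId}, \texttt{midPointTai}, \texttt{diffImNoise}) and the $5\sigma$ threshold are illustrative assumptions, not part of the baseline schema.

```python
import math

def upper_limit_mag(noise_njy, n_sigma=5.0):
    # AB magnitude of an n_sigma flux in nJy: m_AB = 31.4 - 2.5 log10(f / 1 nJy)
    return 31.4 - 2.5 * math.log10(n_sigma * noise_njy)

def missing_visit_limits(visits, dia_sources, dia_forced_sources):
    """Return (time, filter, upper limit) for each visit at this position
    that has neither a DIASource nor a DIAForcedSource record."""
    measured = ({s["ccdVisitId"] for s in dia_sources}
                | {f["ccdVisitId"] for f in dia_forced_sources})
    return [(v["midPointTai"], v["filterName"], upper_limit_mag(v["diffImNoise"]))
            for v in visits if v["ccdVisitId"] not in measured]
```

A receiver would apply this to the twelve months of visit metadata shipped with the alert to fill in the non-detection epochs of the difference-image light curve.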
In addition to the processing described above, a smaller sample of sources detected on difference images \emph{below} the nominal $transSNR = \transSNR$ \reqparam{transSNR} threshold will be measured and stored, in order to enable monitoring of difference image analysis quality.\dmreq{0270}
Also, the system will have the ability to measure and alert on a limited\footnote{It will be sized for no less than $\sim$10\% of the average \DIASource rate per visit.} number of sources detected below the nominal threshold for which additional criteria are satisfied. For example, a $transSNR$ = 3 source detection near a gravitational keyhole\footnote{
A gravitational keyhole is a region of space where Earth's gravity would modify the orbit of a passing asteroid
such that the asteroid would collide with the Earth in the future.}
may be highly significant in assessing the danger posed by a potentially hazardous asteroid.
The initial set of criteria will be defined by the start of LSST operations.
\subsubsection{Solar System Object Processing}
\label{sec:ssProcessing}
The Solar System Processing described in this section occurs in daytime, after a night of observing \dmreq{0004}\dmreq{0089}. Its goals are to link (identify) previously unknown \SSObjects given the additional night of observing and report the discoveries to the Minor Planet Center, as well as to compute physical properties (e.g., absolute magnitudes) and other auxiliary quantities (e.g., predicted apparent magnitudes and coordinates in various coordinate systems) for known Solar System objects and their LSST observations. The process is graphically illustrated in Figure~\ref{fig:Pipe8}.
The pipeline consists of the following conceptual steps:
\begin{enumerate}
\item {\bf Linking:} All \DIASources detected on the previous night that have \emph{not} been matched at a high confidence level to a known \Object,
\DIAObject, \SSObject, or an artifact, are analyzed for potential pairs, forming \emph{tracklets}. The collection of tracklets gathered over no fewer than the past 14 days\footnote{The exact time window is largely computationally limited; longer windows increase the discovery rates and are preferable.} is searched for subsets forming \emph{tracks} consistent with being on the same Keplerian orbit around the Sun.
\item {\bf Reporting:} These newly discovered Solar System objects are reported to the MPC, using the standard data-exchange protocols (e.g., the ADES format). The measurements of all \DIASources detected on the previous night that have been matched at a high level of confidence to a known \SSObject are also submitted to the MPC.
\item {\bf Catalog Update:} An updated orbit catalog, incorporating previously submitted LSST discoveries as well as discoveries by other contemporaneous programs, is downloaded from the MPC and ingested into the Prompt Products database. \DIASource records are updated to point to the relevant \SSObject records. \DIAObjects ``orphaned'' by this unlinking are deleted.\footnote{Some \DIAObjects may only be left with forced photometry measurements at their location (since all \DIAObjects are force-photometered on previous and subsequent visits); these will be kept but flagged as such.}
\item {\bf Physical Characterization:} The physical properties of all known \SSObjects, as defined by the orbit catalog, are recomputed. \dmreq{0288} Updated data are entered into the relevant tables.\dmreq{0273}
\item {\bf Precovery:} Precovery linking is attempted for all \SSObjects whose orbits were updated in this process (or are new).\dmreq{0286} Where successful, newly discovered observations are queued up for submission to the MPC.
\end{enumerate}
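To make the linking step concrete, the sketch below forms candidate tracklets by pairing detections whose implied angular rate falls below a cap. This is an illustrative simplification, not the production algorithm: it uses a flat-sky small-angle distance and an assumed 1 deg/day default rate cap, and it does not handle the subsequent tracklet-to-track search.

```python
import itertools
import math

def form_tracklets(detections, max_rate_deg_per_day=1.0):
    """Pair detections whose implied sky motion is below a rate cap.

    detections: (id, mjd, ra_deg, dec_deg) tuples.  The flat-sky
    distance and the default cap are illustrative simplifications.
    """
    tracklets = []
    for a, b in itertools.combinations(detections, 2):
        dt = abs(b[1] - a[1])
        if dt == 0:
            continue  # simultaneous detections cannot define a motion
        # Approximate great-circle offset for small separations.
        dra = (b[2] - a[2]) * math.cos(math.radians(0.5 * (a[3] + b[3])))
        ddec = b[3] - a[3]
        if math.hypot(dra, ddec) / dt <= max_rate_deg_per_day:
            tracklets.append((a[0], b[0]))
    return tracklets
```

In production, the resulting tracklets would then be searched for subsets consistent with a common Keplerian orbit; that step is far more involved and is omitted here.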
\subsection{Prompt Catalogs}
\label{sec:level1db}
The alert processing design relies on the \PPDB that contains the objects and sources detected on difference images. At the very least\footnote{It will also contain exposure and visit metadata, SSP-specific tables, etc. These are either standard/uncontroversial, implementation-dependent, or less directly relevant for science and therefore not discussed in this document.}, this database will have tables of \DIASources,\dmreq{0269} \DIAObjects,\dmreq{0271} and \SSObjects,\dmreq{0273} populated in the course of nightly and daily difference image and Solar System object processing. As these get updated and added to, their updated contents become visible (query-able) to users within 24 hours\reqparam{L1PublicT} of the corresponding observation times. In the case of \SSObjects, this means 24 hours after orbits have been determined or updated.\dmreq{0312}
This database is \emph{only loosely coupled to the \DR}. All of the coupling is through positional matches between the \DIAObject entries in the \PPDB and the \Objects in the \DR. There is no direct \DIASource-to-\Object match:
in general, a time-domain object is not necessarily the same astrophysical object as a static-sky object, even if the two are
positionally coincident (e.g., an asteroid overlapping a galaxy).
Therefore, the adopted data model emphasizes that \emph{having a \DIASource be positionally coincident with an \Object does not imply it is physically related to it}. Absent other information, the least presumptuous data model relationship is one of \emph{positional association}, not \emph{physical identity}.
This may seem odd at first: for example, in a simple case of a variable star, matching individual \DIASources to \Objects is exactly what an astronomer would want. That approach, however, fails in the following scenarios:
\begin{itemize}
\item \emph{A supernova in a galaxy.} The matched object in the \Object table will be the galaxy, which is a distinct astrophysical object. We want to keep the information related to the supernova (e.g., colors, the light curve) separate from the measurements of the galaxy.
\item \emph{An asteroid occulting a star.} If associated with the star on first apparition, the association would need to be dissolved when the source is recognized as an asteroid (perhaps even as early as a day later).
\item \emph{A supernova on top of a pair of blended galaxies.} It is not clear in general to which galaxy this \DIASource would ``belong''. That in itself is a research question.
\end{itemize}
\DIASource-to-\Object matches can still be emulated via a two-step relation (\DIASource-\DIAObject-\Object). For ease of use, views or pre-built tables with these matches will be offered to the end-users.\dmreq{0324}
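The two-step relation can be emulated with a single join. The sketch below uses an in-memory SQLite database with a deliberately reduced schema; the table and column names follow the conceptual schema, but collapsing the \Object match to a single \texttt{nearbyObj} column is a simplification for illustration.

```python
import sqlite3

# Emulate the two-step DIASource -> DIAObject -> Object relation with a join.
# Reducing the Object match to one nearbyObj column is illustrative only.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE DIASource (diaSourceId INTEGER, diaObjectId INTEGER);
    CREATE TABLE DIAObject (diaObjectId INTEGER, nearbyObj INTEGER);
    INSERT INTO DIASource VALUES (101, 1), (102, 1), (103, 2);
    INSERT INTO DIAObject VALUES (1, 9001), (2, NULL);
""")
rows = con.execute("""
    SELECT s.diaSourceId, o.nearbyObj
    FROM DIASource AS s
    JOIN DIAObject AS o USING (diaObjectId)
    WHERE o.nearbyObj IS NOT NULL
    ORDER BY s.diaSourceId
""").fetchall()
# rows -> [(101, 9001), (102, 9001)]
```

A pre-built view wrapping a query of this shape is one way the promised \DIASource-to-\Object convenience tables could be realized.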
% There are three ``core'' tables in the \PPDB: the \DIASource table, with information about detected and/or measured \DIASources, \DIAObject table, with summary information about \DIAObjects derived from the associated \DIASources, and the \SSObject table (short for \textbf{Solar System Object}\footnote{This is what we used to call a ``Moving Object''. This name is potentially confusing, as high-proper motion stars are moving objects as well. A more accurate distinction is the one between objects in an out of the Solar System.}) holding derived orbits and associated Solar System Object-specific information.
In the sections to follow, we present the \emph{conceptual schemas} for the most important \PPDB tables. These convey \emph{what} data will be recorded in each table, rather than the details of \emph{how}. For example, columns whose type is an array (e.g., \texttt{radec}) may be expanded to one table column per element of the array (e.g., \texttt{ra}, \texttt{decl}) once this schema is translated to SQL\footnote{The SQL realization of this schema is defined in the \texttt{cat} package, documented in \citeds{LDM-153}, and can be browsed at \url{http://ls.st/8g4}}. Secondly, the tables to be presented are largely normalized (i.e., they contain no redundant information). For example, since the band of observation can be found by joining a \DIASource table to the table with exposure metadata, there is no column named \texttt{band} in the \DIASource table. In the as-built database, the views presented to the users will be appropriately denormalized for ease of use.
\subsubsection{\DIASource Table}
This is a table of sources detected at $transSNR \geq \transSNR$ \reqparam{transSNR} on difference images\footnote{This requirement is specified in the LSST \SRD.}\lsrreq{0101} (\DIASources).
On average, the LSST \SRD expects
up to
$\sim$10,000 astrophysical \DIASources per visit ($\sim$10M per night; 100,000 per deg$^2$
of the sky per hour).
\reqparam{transN}
Some $transSNR \geq \transSNR$ sources will not be caused by observed astrophysical phenomena, but by artifacts (bad columns, diffraction spikes, etc.). The difference image analysis software will attempt to identify and flag these as such.
Unless noted otherwise, all \DIASource quantities (fluxes, centroids, etc.) are measured on the difference image.
\dmreq{0269}
\begin{schema}{\DIASource Table}{\DIASource Table}{tbl:diasourceTable}
diaSourceId & uint64 & ~ & Unique source identifier \\
ccdVisitId & uint64 & ~ & ID of CCD and visit where this source was measured \\
diaObjectId & uint64 & ~ & ID of the \DIAObject this source was associated with, if any. \\
ssObjectId & uint64 & ~ & ID of the \SSObject this source has been linked to, if any. \\
parentDiaSourceId & uint64 & ~ & ID of the parent \DIASource this object has been deblended from, if any. \\
midPointTai & double & time & Time of mid-exposure for this DIASource\footnote{The visit mid-exposure
time generally depends on the position of the source relative to the shutter blade motion trajectory.}. \\
radec & double[2] & degrees & Centroid, $(\alpha, \delta)$\footnote{The astrometric reference frame will be chosen closer to start of operations.}. \\
radecCov & float[3] & various & \texttt{radec} covariance matrix. \\
xy & float[2] & pixels & Column and row of the centroid. \\
xyCov & float[3] & various & Centroid covariance matrix. \\
apFlux & float & nJy\footnote{LSST fluxes are reported in nanojansky (1 nJy = $10^{-35}$\,W\,m$^{-2}$\,Hz$^{-1}$). See \S \ref{sec:fluxes} and LSST Document-27758 for details.} & Calibrated aperture flux. Note that this actually measures the flux \emph{difference} between the template and the visit image. \\
apFluxErr & float & nJy & Estimated uncertainty of \texttt{apFlux}. \\
SNR & float & ~ & The signal-to-noise ratio at which this source was detected in the difference image.\footnote{This is not necessarily the same as apFlux/apFluxErr, as the flux measurement algorithm may be more accurate than the detection algorithm.} \\
psFlux & float & nJy & Calibrated flux for point source model. Note this actually measures the flux \emph{difference} between the template and the visit image. \\
psFluxErr & float & nJy & Estimated uncertainty of \texttt{psFlux}. \\
psRadec & double[2] & degrees & Centroid for point source model. \\
psCov & float[6] & various & Covariance matrix for point source model parameters. \\
psLnL & float & ~ & Natural $\log$ likelihood of the observed data given the point source model. \\
psChi2 & float & ~ & $\chi^2$ statistic of the model fit. \\
psNdata & int & ~ & The number of data points (pixels) used to fit the model. \\
trailFlux & float & nJy & Calibrated flux for a trailed source model\footnote{A \emph{Trailed Source Model} attempts to fit a (PSF-convolved) model of a point source that was trailed by a certain amount in some direction (taking into account the two-snap nature of the visit, which may lead to a dip in flux around the mid-point of the trail). Roughly, it's a fit to a PSF-convolved line. The primary use case is to characterize fast-moving Solar System objects.}$^,$\footnote{This model does not fit for the \emph{direction} of motion; to recover it, we would need to fit the model separately to the individual snaps of a visit. This adds to system complexity, and is not clearly justified by increased Solar System object linking performance given the added information.}. Note this actually measures the flux \emph{difference} between the template and the visit image. \\
trailRadec & double[2] & degrees & Centroid for trailed source model. \\
trailLength & float & arcsec & Maximum likelihood fit of trail length\footnote{Note that we'll likely measure trailRow and trailCol, and transform to trailLength/trailAngle (or trailRa/trailDec) for storage in the database. A stretch goal is to retain both.}$^,$\footnote{TBD: Do we need a separate trailCentroid? It's unlikely that we do, but one may wish to prove it.}. \\
trailAngle & float & degrees & Maximum likelihood fit of the angle between the meridian through the centroid and the trail direction (bearing, direction of motion). \\
trailCov & float[15] & various & Covariance matrix of trailed source model parameters. \\
trailLnL & float & ~ & Natural $\log$ likelihood of the observed data given the trailed source model. \\
trailChi2 & float & ~ & $\chi^2$ statistic of the model fit. \\
trailNdata & int & ~ & The number of data points (pixels) used to fit the model. \\
dipMeanFlux & float & nJy & Maximum likelihood value for the mean absolute flux of the two lobes for a dipole model\footnote{A \emph{Dipole Model} attempts to fit a (PSF-convolved) model of two point sources, with fluxes
of opposite signs, separated by a certain amount in some direction. The primary use case is to characterize moving stars and problems with image differencing (e.g., due to astrometric offsets).}.
\\
dipFluxDiff & float & nJy & Maximum likelihood value for the difference of absolute fluxes of the two lobes for a dipole model.
\\
dipRadec & double[2] & degrees & Centroid for dipole model. \\
dipLength & float & arcsec & Maximum likelihood value for the lobe separation in dipole model. \\
dipAngle & float & degrees & Maximum likelihood fit of the angle between the meridian through the centroid and the dipole direction (bearing, from negative to positive lobe). \\
dipCov & float[21] & various & Covariance matrix of dipole model parameters. \\
dipLnL & float & ~ & Natural $\log$ likelihood of the observed data given the dipole source model. \\
dipChi2 & float & ~ & $\chi^2$ statistic of the model fit. \\
dipNdata & int & ~ & The number of data points (pixels) used to fit the model. \\
totFlux & float & nJy & Calibrated flux for point source model measured on the visit image centered at the centroid measured on the difference image (forced photometry flux) \\
totFluxErr & float & nJy & Estimated uncertainty of \texttt{totFlux}. \\
diffFlux & float & nJy & Calibrated flux for point source model centered on \texttt{radec} but measured on the difference of snaps comprising this visit\footnote{This flux can be used to detect sources changing on timescales comparable to snap exposure length ($\sim$15\,sec).}. \\
diffFluxErr & float & nJy & Estimated uncertainty of \texttt{diffFlux}. \\
fpBkgd & float & nJy/arcsec$^{2}$ & Estimated background at the position (centroid) of the object in
the template image. \\
fpBkgdErr & float & nJy/arcsec$^{2}$ & Estimated uncertainty of \texttt{fpBkgd}. \\
%grayExtinction & float & nJy & Applied photometric extinction correction (gray component) \\
%nonGrayExtinction & float & nJy & Applied photometric extinction correction (color-dependent component) \\
Ixx & float & arcsec$^{2}$ & Adaptive second moment of the source intensity. See \citet{2002AJ....123..583B} for detailed discussion of all adaptive-moment related quantities\footnote{Or \url{http://ls.st/5f4} for a brief summary.}. \\
Iyy & float & arcsec$^{2}$ & Adaptive second moment of the source intensity. \\
Ixy & float & arcsec$^{2}$ & Adaptive second moment of the source intensity. \\
Icov & float[6] & arcsec$^{4}$ & \texttt{Ixx}, \texttt{Iyy}, \texttt{Ixy} covariance matrix. \\
IxxPSF & float & arcsec$^{2}$ & Adaptive second moment for the PSF. \\
IyyPSF & float & arcsec$^{2}$ & Adaptive second moment for the PSF. \\
IxyPSF & float & arcsec$^{2}$ & Adaptive second moment for the PSF. \\
extendedness & float & ~ & A measure of extendedness, computed using a combination of available moments, or from a likelihood ratio of point/trailed source models (exact algorithm TBD). $extendedness=1$ implies a high degree of confidence that the source is extended. $extendedness=0$ implies a high degree of confidence that the source is point-like. \\
spuriousness & float & ~ & A measure of spuriousness, computed using information\footnote{The computation
of spuriousness will be ``prior free'' to the extent possible and will not use any information about the astrophysical neighborhood of the source, whether it has been previously observed or not, etc. The intent is to avoid introducing
a bias against unusual sources or sources discovered in unusual environments.}
from the source and image characterization, as well as the information on the Telescope and Camera system
(e.g., ghost maps, defect maps, etc.).
\\
flags & bit[64] & bit & Various useful flags. \\
\end{schema}
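All fluxes in these tables are calibrated in nJy. For users who prefer magnitudes, a conversion sketch is given below; it assumes the standard AB zero point (3631 Jy, so $m_{AB} = 31.4 - 2.5\log_{10}(f/1\,\mathrm{nJy})$) and first-order error propagation, and the function names are our own.

```python
import math

AB_ZP_NJY = 31.4  # 1 nJy corresponds to m_AB = 31.4 (AB zero point of 3631 Jy)

def njy_to_abmag(flux_njy):
    """AB magnitude of a positive calibrated flux in nJy.  Difference
    fluxes (psFlux, apFlux) can be negative for fading sources; no
    magnitude is defined there, so None is returned."""
    if flux_njy <= 0:
        return None
    return AB_ZP_NJY - 2.5 * math.log10(flux_njy)

def njy_err_to_magerr(flux_njy, flux_err_njy):
    """First-order propagation of a flux uncertainty to magnitudes."""
    return 2.5 / math.log(10) * flux_err_njy / flux_njy
```

Working in flux rather than magnitude is deliberate: difference fluxes are well defined (and Gaussian-distributed) even when negative or near zero, where magnitudes are not.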
\begin{schema}{\DIAForcedSource Table}{\DIAForcedSource Table}{tbl:diaforcedsourceTable}
diaForcedSourceId & uint64 & ~ & Unique source identifier \\
ccdVisitId & uint64 & ~ & ID of CCD and visit where this source was measured \\
diaObjectId & uint64 & ~ & ID of the \DIAObject this forced photometry was seeded by. \\
midPointTai & double & time & Time of mid-exposure for this \DIAForcedSource. \\
psFlux & float & nJy & Calibrated flux for point source model. Note this actually measures the flux \emph{difference} between the template and the visit image. \\
psFluxErr & float & nJy & Estimated uncertainty of \texttt{psFlux}. \\
totFlux & float & nJy & Calibrated flux for point source model measured on the visit image centered at the \DIAObject centroid. \\
totFluxErr & float & nJy & Estimated uncertainty of \texttt{totFlux}. \\
\end{schema}
Some fast-moving, trailed sources may be due to passages of nearby asteroids. Their trails may exhibit significant curvature.
While we do not measure the curvature directly, it can be inferred by examining the length of the trail, the trailed model covariance matrices, and the adaptive shape measures. Once curvature is suspected, the users may fit curved trail models to the cutout provided with the alert.
\begin{changelog}
Notes about changes with respect to the previous baseline:
\begin{itemize}
\item I removed the \texttt{astromRefr*} columns. These will depend on the SED (color) of the object, and the color won't be known when the object is discovered. It may be better to provide a UDF to compute the refraction given a \DIAObject record.
\item Removed ``small galaxy'' model fits. We don't plan to do galaxy model fits on difference images.
\item Removed ``canonical small galaxy'' model fits. See above.
\item Removed galExtinction: this should be a UDF using extinction maps
\item I removed the aperture correction column.
\item gray/nonGray extinction columns removed. May be implemented as an UDF.
\item TODO: See what other fields SDSS has. Also see what fields PanSTARRS has. Collect input from SCs.
\end{itemize}
\end{changelog}
\subsubsection{\DIAObject Table}
\dmreq{0271}\dmreq{0272}\reqparam{diaNearbyMaxObj}
\begin{schema}{\DIAObject Table}{\DIAObject Table}{tbl:diaobjectTable}
diaObjectId & uint64 & ~ & Unique identifier. \\
radec & double[2] & degrees & $(\alpha, \delta)$ position of the object at time \texttt{radecTai}. \\
radecCov & float[3] & various & \texttt{radec} covariance matrix. \\
radecTai & double & time & Time at which the object was at a position \texttt{radec}. \\
pm & float[2] & mas/yr & Proper motion vector\footnote{High proper-motion or parallax objects will appear as ``dipoles'' in difference images. Great care will have to be taken not to misidentify these as subtraction artifacts.}. \\
parallax & float & mas & Trigonometric parallax. \\
pmParallaxCov & float[6] & various & Proper motion - parallax covariances. \\
pmParallaxLnL & float & ~ & Natural log of the likelihood of the linear proper motion-parallax fit\footnote{\texttt{radec}, \texttt{pm}, and \texttt{parallax} will all be simultaneously fitted for.}. \\
pmParallaxChi2 & float & ~ & $\chi^2$ statistic of the model fit. \\
pmParallaxNdata & int & ~ & The number of data points used to fit the model. \\
psFluxMean & float[ugrizy] & nJy & Weighted mean of point-source model flux, \texttt{psFlux}. \\
psFluxMeanErr & float[ugrizy] & nJy & Standard error of \texttt{psFluxMean}. \\
psFluxSigma & float[ugrizy] & nJy & Standard deviation of the distribution of \texttt{psFlux}. \\
psFluxChi2 & float[ugrizy] & ~ & $\chi^2$ statistic for the scatter of \texttt{psFlux} around \texttt{psFluxMean}. \\
psFluxNdata & int[ugrizy] & ~ & The number of data points used to compute \texttt{psFluxChi2}. \\
totFluxMean & float[ugrizy] & nJy & Weighted mean of forced photometry flux, \texttt{totFlux}.\\
totFluxMeanErr & float[ugrizy] & nJy & Standard error of \texttt{totFluxMean}. \\
totFluxSigma & float[ugrizy] & nJy & Standard deviation of the distribution of \texttt{totFlux}. \\
% lsPeriod & float[ugrizy] & day & Period (the coordinate of the highest peak in Lomb-Scargle periodogram) \\
% lsSigma & float[ugrizy] & day & Width of the peak at \texttt{lsPeriod}. \\
% lsPower & float[ugrizy] & ~ & Power associated with \texttt{lsPeriod} peak. \\
% lcChar & float[$6\times{}M$] & ~ & Light-curve characterization summary statistics (eg., 2nd moments, etc.). The exact contents, and an appropriate value of M, are to be determined in consultation with time-domain experts. \\
lcPeriodic & float[6~\x~32] & ~ & Periodic features extracted from \DIASource light-curves using generalized Lomb-Scargle periodogram \citep[Table~4,][]{2011ApJ...733...10R}\footnote{The exact features in use when LSST begins operations are likely to be different compared to the baseline described here. This is to be expected given the rapid pace of research in time domain astronomy. However, the \emph{number} of computed features is unlikely to grow beyond the present estimate.}. \\
lcNonPeriodic & float[6~\x~20] & ~ & Non-periodic features extracted from \DIASource light-curves \citep[Table~5,][]{2011ApJ...733...10R}. \\
nearbyObj & uint64[6] & ~ & Closest \Objects\ (3 stars and 3 galaxies) in \DR.\\
nearbyObjDist & float[6] & arcsec & Distances to \texttt{nearbyObj}. \\
nearbyObjLnP & float[6] & ~ & Natural $\log$ of the probability that the observed \DIAObject is the same as the nearby \Object\footnote{This quantity will be computed by marginalizing over the product of position and proper motion error ellipses of the \Object and \DIAObject, assuming an appropriate prior.}. \\
nearbyExtObj & uint64[3] & ~ & Three extended Objects with the lowest separations in the Data Release database\footnote{Separations should be calculated with respect to the transient location using the second moments of each Object's luminosity profile, as described in DMTN-151.}. \\
nearbyExtObjSep & float[3] & ~ & Separations of nearbyExtObj. \\
nearbyLowzGal & str[1] & ~ & External catalog name of the nearest low-redshift galaxy\footnote{External catalog will be, e.g., the NGC/IC, unless the community provides one that they deem to be more scientifically useful, as described in DMTN-151.}. \\
nearbyLowzGalSep & float[1] & ~ & Separation of nearbyLowzGal\footnote{Separations will be radial offset in arcseconds unless a community-provided catalog includes galaxy characteristics that would enable an alternative, as described in DMTN-151.}. \\
flags & bit[64] & bit & Various useful flags. \\
\end{schema}
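The per-band flux summary columns (\texttt{psFluxMean}, \texttt{psFluxMeanErr}, \texttt{psFluxSigma}, \texttt{psFluxChi2}) can be sketched as below. This is one plausible realization using an inverse-variance weighted mean; the baseline algorithm may differ in detail (e.g., in outlier handling).

```python
import math

def flux_summary(fluxes, errs):
    """Inverse-variance weighted mean of the fluxes, its standard error,
    the sample standard deviation, and the chi^2 of the scatter about
    the mean -- one band's worth of DIAObject summary statistics."""
    weights = [1.0 / (e * e) for e in errs]
    wsum = sum(weights)
    mean = sum(w * f for w, f in zip(weights, fluxes)) / wsum
    mean_err = math.sqrt(1.0 / wsum)
    n = len(fluxes)
    sigma = (math.sqrt(sum((f - mean) ** 2 for f in fluxes) / (n - 1))
             if n > 1 else float("nan"))
    chi2 = sum(((f - mean) / e) ** 2 for f, e in zip(fluxes, errs))
    return mean, mean_err, sigma, chi2
```

A large \texttt{psFluxChi2} relative to the number of data points flags genuine variability beyond the measurement errors, which is why both the scatter and the chi-square are carried in the table.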
\subsubsection{{\tt MPCORB}, \SSObject and \SSSource Tables}
The three tables presented below capture the information that will be available in the database for each Solar System Object (\SSObject) and for each observation of a Solar System Object (\SSSource). {\bf The details of the schema will evolve as the implementation progresses, maintaining the high-level content}. The implementation will be maintained in the catalog repository at \url{https://github.com/lsst/cat}, as well as linked at \url{http://ls.st/ssp}.
\dmreq{0273}
\input{schemas/MPCORB.tex}
The {\tt MPCORB} table will be ingested, at least daily, from the Minor Planet Center. Its exact contents will mirror the columns available in MPCORB at the time LSST begins operations. While MPCORB does not contain orbital element covariances at present, it is expected these will become available in time for LSST operations.
\input{schemas/SSObject.tex}
The $G_{12}$ parameter for a large fraction of asteroids may not be well constrained until later in the survey. We may decide not to fit for it at all over the first few DRs and add it later in Operations. Alternatively, we may fit it using strong priors on slopes poorly constrained by the data. The design of the data management system is insensitive to this decision, making it possible to postpone it to Commissioning to ensure it follows the standard community practice at that time.
{\bf Per-observation quantities:} The LSST database will provide an auxiliary table (\SSSource) or equivalent functions to compute the phase (Sun-Asteroid-Earth) angle $\alpha$ for every observation, the reduced, $H(\alpha)$, and absolute, $H$, asteroid magnitudes in LSST bands, as well as any other quantities defined in the realized schema.\dmreq{0323}
\input{schemas/SSSource.tex}
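The geometric part of these per-observation quantities can be sketched directly. Below, the phase angle follows from the law of cosines in the Sun-asteroid-Earth distance triangle, and the reduced magnitude is the standard $H(\alpha) = m - 5\log_{10}(r\Delta)$; fixing the Sun-Earth distance at 1 au and omitting the phase function $\Phi(\alpha)$ needed to reach the absolute magnitude $H$ are simplifications for illustration.

```python
import math

def phase_angle_deg(r_au, delta_au, r_earth_au=1.0):
    """Sun-asteroid-Earth phase angle from the distance triangle (law of
    cosines); r_au is the heliocentric, delta_au the geocentric distance."""
    cos_a = (r_au ** 2 + delta_au ** 2 - r_earth_au ** 2) / (2.0 * r_au * delta_au)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))

def reduced_mag(apparent_mag, r_au, delta_au):
    """Reduced magnitude H(alpha): apparent magnitude scaled to unit
    heliocentric and geocentric distances."""
    return apparent_mag - 5.0 * math.log10(r_au * delta_au)
```

Fitting a phase function to \texttt{reduced\_mag} values over a range of phase angles is then what yields the per-band absolute magnitudes stored in the \SSObject table.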
\subsubsection{Precovery Measurements}
When a new \DIASource is detected, it's useful to perform forced PSF photometry at the location of the new source on images taken prior to discovery. These are colloquially known as \emph{precovery measurements}\footnote{When Solar System objects are concerned, precovery has a slightly different meaning: predicting the positions of newly identified \SSObjects on previously acquired visits, and associating with them the \DIASources consistent with these predictions.}.\dmreq{0287}\dmreq{0286} Performing precovery in real time over all previously acquired visits is too I/O intensive to be feasible. We therefore plan the following:
\begin{enumerate}
\item For all newly discovered objects, perform precovery PSF photometry on visits taken over the previous 30 days\reqparam{precoveryWindow}\footnote{We will be maintaining a cache of $30$ days of processed images to support this feature.}. These measurements will be stored in the \DIAForcedSource table.
\item Make available a ``precovery service'' to request precovery for a limited number of \DIASources across all previous visits, and make it available within 24 hours of the request. Web interface and machine-accessible APIs will be provided.\dmreq{0341}
\end{enumerate}
The former should satisfy the most common use cases (eg., SNe), while the latter will provide an opportunity for more extensive yet timely precovery of targets of special interest.
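The automatic precovery selection amounts to a simple window query over the image cache. A minimal sketch, assuming MJD timestamps and a dictionary-per-visit representation (both illustrative; \texttt{window\_days} stands in for the \texttt{precoveryWindow} parameter):

```python
def precovery_visits(cached_visits, discovery_mjd, window_days=30.0):
    """Visits eligible for automatic precovery forced photometry: those
    falling in the window_days immediately preceding discovery."""
    return [v for v in cached_visits
            if discovery_mjd - window_days <= v["midPointTai"] < discovery_mjd]
```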
\subsubsection{Image differencing during Data Release Production}
\label{sec:l1dbreproc}
In what we've described so far, the \PPDB is continually being added to as new images are taken and \DIASources identified.\dmreq{0312} Every time a new \DIASource is associated to an existing \DIAObject, the \DIAObject record is updated to incorporate new information brought in by the \DIASource. Once discovered and measured, the \DIASources would never be re-discovered and re-measured at the pixel level.
This would be far from optimal. The instrument will be better understood with time. Newer versions of LSST pipelines will improve detection and measurements on older data. Also, precovery photometry should optimally be performed for \emph{all} objects, and not just a select few.
The annual Data Release productions will therefore also include a full reanalysis of the time-domain information in the collected dataset, by performing image differencing on all images collected by the survey \dmreq{0313}\dmreq{0325}. The images will be processed using a single version of the image differencing and measurement software, resulting in a consistent set of \DIASources, \DIAObjects, \ForcedSources, and \SSObjects as part of a data release.
There will be three main advantages of time-domain data produced during Data Release processing, compared to
the \PPDB:
i) even the oldest data will be processed with the latest software,
ii) astrometric and photometric calibration will be better, and
iii) there will be no 12-month limit on the width of the data window used to compute associated \DIAObject
measurements (proper motions, centroids, light curves, etc.). \reqparam{diaCharacterizationCutoff}
\subsection{Prompt Image Products}
\subsubsection{Visit Images}
\dmreq{0069}
Raw and Processed Visit Images will be made available for download no later than 24 hours\reqparam{L1PublicT} from the end of visit acquisition.
The images will remain accessible with low-latency (seconds from request to start of download) for at least 30 days\reqparam{l1CacheLifetime}, with slower access afterwards (minutes to hours).
\subsubsection{Difference Images}
\label{sec:diffims}
\dmreq{0010}
Complete difference images will be made available for download no later than 24 hours\reqparam{L1PublicT} from the end of visit acquisition.
The images will remain accessible with low-latency (seconds from request to start of download) for at least 30 days\reqparam{l1CacheLifetime}, with slower access afterwards (minutes to hours).
\subsubsection{Image Differencing Templates}
\label{sec:templates}
\dmreq{0280}
Coadded images will be used as templates for difference image analysis.
The coaddition process will take care to remove transient or fast-moving objects (e.g., asteroids) from the templates.
The time range of the epochs included in a template is TBD, and will be chosen to balance minimizing false positives due to high proper-motion stars (favoring shorter ranges) against the need to correct for DCR and maximize depth (favoring longer ranges).
The numbers and types of coadds used for templates are also TBD.
\subsection{Alerts to \DIASources}
\label{sec:voEventContents}
\subsubsection{Information Contained in Each Alert}
For each detected \DIASource, LSST will emit an ``Event Alert'' within 60 seconds\reqparam{OTT1} of the end of visit (defined as the end of image readout from the LSST Camera). These alerts will be issued in \VOEvent format\footnote{Or some other format that is broadly accepted and used by the community at the start of LSST commissioning.}, and should be readable by \VOEvent-compliant clients.\dmreq{0002}
Each alert (a \VOEvent packet) will at least include the following:
\begin{itemize}
\item \emph{alertID}: An ID uniquely identifying this alert. It can also be used to execute a query against the \PPDB as it existed when this alert was issued.\dmreq{0274}
\item Science Data:
\begin{itemize}
\item The \DIASource record that triggered the alert, as well as the \texttt{filterName} and \texttt{programId} of the corresponding \texttt{Visit}
\item The entire \DIAObject (or \SSObject) record. \DIAObject records include matching \Object IDs from the latest Data Release, if they exist.
\item Any \DIASource and \DIAForcedSource records that exist, and difference image noise estimates where they do not, taken from the previous 12 months. \reqparam{diaCharacterizationCutoff}
\end{itemize}
% \item Flags (isSolarSystemObject, isArtefact, etc.)
\item Cut-out of the difference image centered on the \DIASource (10 bytes/pixel, FITS MEF)
\item Cut-out of the template image centered on the \DIASource (10 bytes/pixel, FITS MEF)
\end{itemize}
The variable-size cutouts will be sized so as to encompass the entire footprint of the detected source, but will be no smaller than $30 \times 30$ pixels. The provided images will comprise flux (32-bit float), variance (32-bit float), and mask (at least 16-bit flags) planes, and include the metadata necessary for further processing (e.g., WCS, zero-point, PSF, etc.).
The items above are meant to represent the \emph{information} transmitted with each alert; the content of the alert packet itself will be formatted to conform to the \VOEvent (or other relevant) standard. Where the existing standard is inadequate for LSST needs, LSST will propose extensions and work with the community to reach a common solution.
With each alert, we attempt to include as much information known to LSST about the \DIASource as possible, to minimize the need for follow-up database queries. This speeds up classification and decision making at the user end, and relaxes the requirements on the database on the Project end.
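As an illustration only (the function name and interface are hypothetical, not an LSST API), the cutout sizing rule above can be sketched as:
\begin{lstlisting}[language=python,commentstyle=\bfseries\color{green!40!black}]
# Hypothetical sketch: size an alert cutout so it covers the
# detection footprint, with a 30x30 pixel minimum.
def cutout_shape(footprint_width, footprint_height, min_size=30):
    return (max(min_size, footprint_height),   # rows
            max(min_size, footprint_width))    # columns
\end{lstlisting}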
\subsubsection{Receiving and Filtering the Alerts}
\label{sec:eventbrokers}
Alerts will be transmitted in \VOEvent format, using standard IVOA protocols\dmreq{0002} (e.g., VOEvent Transport Protocol; VTP\footnote{VOEvent Transport Protocol is currently an IVOA Note, but we understand work is under way to finalize and bring it up to full IVOA Recommendation status.}). As a very high rate of alerts is expected, approaching $\sim$10 million per night, we plan for public VOEvent Event Brokers\footnote{These brokers are envisioned to be operated as a public service by third parties who will have signed MOUs with LSST.} to be the primary end-points of LSST's event streams. End-users will use these brokers to classify and filter events for subsets fitting their science goals. End-users will \emph{not} be able to subscribe to full, unfiltered, alert streams coming directly from LSST\footnote{This is due to finite network bandwidth available: for example, 100 end-users subscribing to a $\sim$100\,Mbps stream (the peak full stream data rate at end of the first year of operations) would require 10Gbps WAN connection from the archive center, just to serve the alerts.}.
To directly serve the end-users, LSST will provide a basic, limited capacity, alert filtering service.\dmreq{0342} This service will run at the LSST U.S. Archive Center (at NCSA). It will let astronomers create simple filters that limit what alerts are ultimately forwarded to them\footnote{More specifically, to their VTP clients. Typically, a user will use the Science User Interface (the web portal to LSST Archive Center) to set up the filters, and use their VTP client to receive the filtered \VOEvent stream.}. These \emph{user defined filters} will be possible to specify using an SQL-like declarative language, or short snippets of (likely Python) code. For example, here's what a filter may look like:
\begin{lstlisting}[language=python,commentstyle=\bfseries\color{green!40!black}]
# Keep only never-before-seen events within two
# effective radii of a galaxy. This is for illustration
# only; the exact methods/members/APIs may change.
def filter(alert):
    if len(alert.sources) > 1:
        return False
    nn = alert.diaobject.nearest_neighbors[0]
    if not nn.flags.GALAXY:
        return False
    return nn.dist < 2. * nn.Re
\end{lstlisting}
We emphasize that this LSST-provided capability will be limited, and is \emph{not} intended to satisfy the wide variety of use cases that a full-fledged public Event Broker could.
For example, we do not plan to provide any \emph{exclusive} classification to a unique category of object.
Following the \SRD specification, however, we will provide a limited number of pre-defined filters for a small number of object types of common interest.\dmreq{0348}
These will answer non-exclusive questions such as ``is the light curve consistent with an RR Lyrae?'', and will have potentially highly overlapping selections, designed to provide good completeness but perhaps only very modest purity.
No information beyond what is contained in the \VOEvent packet will be available to the pre-defined or user-defined filters (e.g., no cross-matches to other catalogs).
The complexity and run time of user defined filters will be limited by available resources.
Execution latency will not be guaranteed.
The number of \VOEvents transmitted to each user per visit will be limited as well (e.g., the equivalent of 20 full-size alert packets per visit per user\reqparam{numBrokerAlerts}\dmreq{0343}, dynamically throttled depending on load).
Finally, the total number of simultaneous subscribers is likely to be limited\reqparam{numBrokerUsers} -- in case of overwhelming interest, a TAC-like proposal process may be instituted.
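A minimal sketch of the per-user, per-visit throttling described above (the class and its interface are purely illustrative; the limit of 20 stands in for the \texttt{numBrokerAlerts} parameter):
\begin{lstlisting}[language=python,commentstyle=\bfseries\color{green!40!black}]
# Illustrative quota: forward at most max_alerts alerts
# per user per visit, dropping the rest.
class AlertQuota:
    def __init__(self, max_alerts=20):
        self.max_alerts = max_alerts
        self.sent = {}                    # (user, visit) -> count

    def try_send(self, user, visit):
        n = self.sent.get((user, visit), 0)
        if n >= self.max_alerts:
            return False                  # throttled
        self.sent[(user, visit)] = n + 1
        return True
\end{lstlisting}
A real implementation would also throttle dynamically with system load, as noted above.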
\begin{openissues}
\subsection{Open Issues}
What follows is a (non-exhaustive) list of issues, technical and scientific, that are still being discussed and where changes are possible. The estimate of the time by which a decision should be made is noted in parentheses.
\begin{itemize}
\item \emph{Should we measure on individual snaps (or their difference)?} Is there a demonstrable science case requiring immediate followup that would be triggered by the flux change over a $\sim$15 second period? Is it technically feasible? (FDR)
\item \emph{Is the \PPDB required to be relational?} A no-SQL solution may be more appropriate given the followup-driven use cases. Even if it is relational, the Prompt database will \emph{not} be sized or architected to perform well on large or complex queries (e.g., complex joins, full table scans, etc.). (FDR)
\item \emph{Should we associate alerts with external catalogs?} LSST will have a copy of many external catalogs (or parts thereof), and could in principle provide limited positional association. Right now, it is up to the downstream brokers to perform such associations. (CONSTRUCTION)
\item \emph{Can we, should we, and how will we measure proper motions on difference images?} This is a non-trivial task (need to distinguish between dipoles that are artifacts, and those due to proper motions), without a clear science driver (since high proper motion stars will be discoverable using Data Release catalogs). (CONSTRUCTION)
\item \emph{What light-curve metric should we compute and provide with alerts?} We strive to compute general purpose metrics which will facilitate classification. We have currently baselined the \citet{2011ApJ...733...10R} feature set. (COMMISSIONING)
\item \emph{Should we choose \texttt{nearbyObj}s differently?} One proposal is to find the brightest \Object within $XX$ arcsec (with $XX \sim$10\,arcsec), and the total number of \Objects within $XX$ arcsec. (COMMISSIONING)
\item \emph{Should the postage stamps provided with the alerts be binned, and by what factor?} The thought is that a human user may have an easier time performing a final by-eye check with larger postage stamps. (COMMISSIONING)
\item \emph{When should we (if ever) stop performing forced photometry on positions of \DIAObjects?} Depending on the rate of false positives, unidentified artifacts, or unrecognized Solar System objects, the number of forced measurements may dramatically grow over time. (OPERATIONS)
\end{itemize}
\end{openissues}
\clearpage
\section{Data Release Data Products}
\label{sec:level2}
\subsection{Overview}
Data Release data products result from direct image\footnote{As opposed to \emph{difference image}, for Prompt products.} analysis. They are designed to enable \emph{static sky} science (e.g., studies of galaxy evolution, or weak lensing), and time-domain science that is not time sensitive (e.g., statistical investigations of variability). They include image products (reduced single-epoch exposures, called \emph{Processed Visit Images} or, sometimes, \emph{calibrated exposures},\dmreq{0069} and coadds\dmreq{0279}\dmreq{0281}), and catalog products (tables of objects\dmreq{0275}, sources\dmreq{0267}, their measured properties\dmreq{0276}, and related metadata).
Similarly to Prompt catalogs of \DIAObjects and \DIASources, \Objects in the Data Release catalog\dmreq{0275} represent the astrophysical phenomena (stars, galaxies, quasars, etc.), while \Sources represent their single-epoch observations. \Sources are independently detected and measured in single epoch exposures and recorded in the \Source table.\dmreq{0267}
The master list of \Objects in the Data Release will be generated by associating and deblending the list of single-epoch \DIASource detections and the lists of sources detected on coadds (\Coadds).\dmreq{0275} We plan to build coadds designed to maximize depth (\emph{``deep coadds''}), although this may still not include some visits with extremely poor seeing\dmreq{0279}.
Additional coadds may be used but may not be retained (see \S~\ref{sec:coadds} for details). We will provide a facility to generate any coadds we use internally (as well as others we may not use internally) as pipeline tasks in the Science Platform (\S~\ref{sec:level3}).\dmreq{0311}
\dmreq{0275}
The deblender will be run simultaneously on the catalog of peaks\footnote{The source detection algorithm we plan to employ finds regions of connected pixels above the nominal $S/N$ threshold in the \emph{PSF-likelihood image} of the visit (or coadd). These regions are called \emph{footprints}. Each footprint may have one or more \emph{peaks}, and it is these peaks that the deblender will use to infer the number and positions of objects blended in each footprint.} detected in the coadds, the \DIAObject catalog from the \PPDB, and one or more external catalogs. It will use the knowledge of peak positions, bands, time, time variability (from Prompt products and the single-epoch \Source detections), inferred motion, Galactic longitude and latitude, and other available information to produce a master list of deblended \Objects. Metadata on why and how a particular \Object was deblended will be kept.
\dmreq{0276}
The properties of \Objects, including their exact positions, motions, parallaxes, photometry, and shapes, will be characterized via a broad suite of algorithms (see \S~\ref{sec:objchar}), mostly (but not entirely) performed on coadds.
Finally, to enable studies of variability, the fluxes of all \Objects will be measured on individual visits (using both
direct and difference images), with their shape parameters and deblending resolutions kept constant. This process is known as \emph{forced photometry} (see \S~\ref{sec:forcedPhotL2}), and the flux measurements will be stored in the \ForcedSource table.\dmreq{0268}
\subsection{Data Release Data Processing}
\label{sec:level2dp}
% commented out by ZI: we have now new figures
%\begin{figure}[!htbp]
% \centering
% \includegraphics[scale=0.5]{Level_2_Processing_Flowchart}
% \caption{Level 2 Data Processing Overview\label{fig:level2dp}}
%\end{figure}
Figures~\ref{fig:Pipe1} and \ref{fig:Pipes234}
present a high-level overview of the Data Release data processing workflow\footnote{Note that some LSST documents refer to \emph{Data Release Processing}, which includes both Prompt reprocessing (see \S~\ref{sec:l1dbreproc}), and the Data Release processing described here.}. Logically\footnote{The actual implementation may parallelize these steps as much as possible.}, the processing begins with single-visit image reduction and source measurement, followed by global astrometric and photometric calibration, coadd creation, detection on coadds, association and deblending, object characterization, and forced photometry measurements.
The following is a high-level description of steps which will occur during regular Data Release data processing
(bullets 1 and 2 below map to pipeline 1, \emph{Single Visit Processing}, in Figure~\ref{fig:Pipe1}, bullet 3 is
pipeline 2, \emph{Image Coaddition}, bullets 4-6 map to pipeline 3, \emph{Coadded Image Analysis}, and
bullets 7-8 map to pipeline 4, \emph{Multi-epoch Object Characterization}):
\begin{enumerate}
\item \emph{Single Visit Processing}: Raw exposures are reduced to Processed Visit Images, and \Sources are independently detected, deblended, and measured on all visits. Their measurements (instrumental fluxes and shapes) are stored in the \Source table.\dmreq{0267}
\item \emph{Relative calibration}: The survey is internally calibrated, both photometrically and astrometrically. Relative zero-point and astrometric corrections are computed for every visit.\dmreq{0029} \dmreq{0030} Sufficient data is kept to reconstruct the normalized system response function $\phi_b(\lambda)$ (see Eq.~5, \SRD) at every position in the focal plane at the time of each visit as required by \S~3.3.4 of the \SRD.
\item \emph{Coadd creation}: Deep, seeing optimized, \dmreq{0279} and short-period per-band coadds\dmreq{0337} are created in $ugrizy$ bands, as well as deeper, multi-color, \dmreq{0281} coadds\footnote{We'll denote the ``band'' of the multi-color coadd as 'M'.}. Transient sources (including Solar System objects, explosive transients, etc.) will be rejected from the coadds. See \S~\ref{sec:coadds} for details.
\item \emph{Source detection}. Sources will be detected on all coadds generated in the previous step. The source detection algorithm will detect regions of connected pixels, known as \emph{footprints}, above the nominal $S/N$ threshold in the \emph{PSF-likelihood image} of the coadd. An appropriate algorithm will also be run to detect extended low surface brightness objects
(e.g., the binned detection algorithm from SDSS).\dmreq{0349} Each footprint may have one or more \emph{peaks}, and the collection of these peaks (and their membership in the footprints) is the output of this stage.
\item \emph{Association and deblending}. \dmreq{0275} The next stage in the pipeline, which we will for simplicity just call \emph{the deblender}, will synthesize a list of unique objects. In doing so it will consider the list of sources detected on \Coadds, catalogs of \DIASources, \DIAObjects and \SSObjects detected on difference images, and objects from external
catalogs\footnote{Note that \Sources are not considered when generating the \Object list (given the large
number of visits in each band, the false positives close to the faint end would increase the complexity of
association and deblending algorithms). It is possible for intermittent sources that are detected just above
the faint detection limit of single visits to be undetected in coadds, and thus to not have a matching \Object.
To enable easy identification of such \Sources, the nearest \Object associated with each \Source, if any, will be recorded,
and some \Object measurements (e.g. stellar motion parameters) \emph{may} utilize positions from securely-matched \Sources.}.\dmreq{0034}
The deblender will make use of all information available at this stage, including the knowledge of peak positions, bands, time, time variability (from Prompt products), Galactic longitude and latitude, etc. The output of this stage is a list of uncharacterized \Objects\footnote{Depending on the exact implementation of the deblender, this stage may also attach significant metadata (e.g., deblended footprints and pixel-weight maps) to each deblended \Object record.}.
\item \emph{Coadd object characterization}. The vast majority of columns in the \Object table will be measured on one or more coadds. These include positions, shapes, and photometry. For the most part, these are static-sky measurements, but short-period coadds may be used (especially in later data releases) for some measurements with long but finite time scales, such as proper motions. \dmreq{0276} \dmreq{0275}
\item \emph{Forced Photometry}. Source fluxes will be measured on every visit, with the position and motion derived from previous steps held fixed, producing our best estimates of the light-curve for each object in the survey. The fluxes will be stored in the \ForcedSource table.\dmreq{0268}
Measurements will be performed on both direct images and difference images.
\item \emph{Object Postprocessing}. Measurements from all previous steps are combined in catalog-space algorithms. This includes fitting proper motion and parallax, and characterizing variability. These are also used to populate the \Object table.
\end{enumerate}
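The logical ordering of these stages can be sketched as a simple driver; every function below is a trivial placeholder standing in for a full pipeline, not actual LSST code:
\begin{lstlisting}[language=python,commentstyle=\bfseries\color{green!40!black}]
# Placeholder stages illustrating data flow only.
def single_visit_processing(raw):        # bullets 1-2: PVIs + Sources
    return ["pvi:" + r for r in raw]
def relative_calibration(visits):        # per-visit zero-points
    return {v: 0.0 for v in visits}
def build_coadds(visits, calib):         # bullet 3
    return ["deep", "short-period", "multi-color"]
def detect_and_deblend(coadds):          # bullets 4-6: the Object list
    return ["obj1", "obj2"]
def forced_photometry(objects, visits):  # bullets 7-8: ForcedSources
    return [(o, v) for o in objects for v in visits]

def data_release_processing(raw_visits):
    visits = single_visit_processing(raw_visits)
    calib = relative_calibration(visits)
    coadds = build_coadds(visits, calib)
    objects = detect_and_deblend(coadds)
    return forced_photometry(objects, visits)
\end{lstlisting}
Note that forced photometry yields one measurement per object per visit, which is what drives the size of the \ForcedSource table.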
Data Release data processing will include the computation of photometric redshifts for all detected \Objects. \dmreq{0046}
We are still investigating the best approach to take; the DM Science team is working with the community to decide on the most appropriate algorithm and the format for the results.
When an algorithm and data product format has been chosen, the exact manner in which it will be incorporated into the Data Release data processing will be described here.
\subsubsection{Object Characterization Measures}
\label{sec:objchar}
Properties of detected objects will be measured as a part of the object characterization step described in the previous section and stored in the \Object table. These measurements are designed to enable LSST ``static sky'' science. This section discusses at a high level which properties will be measured and how those measurements will be performed. For a detailed list of quantities being fit/measured, see the table in \S~\ref{sec:objectTable}.
All measurements discussed in this section deal with properties of \emph{objects}, and will be performed primarily on a suite of multi-epoch coadds. Measurements of sources in individual visits, independent of all others, are described in \S~\ref{sec:sourceMeas}, though in rare cases those \Source measurements and/or the forced measurements described in \S~\ref{sec:forcedPhotL2} may be used in addition to coadds for \Object measurements.
The measurements performed on \Objects include:
%
\begin{itemize}
\item \emph{Point source model fit}. The observed object is modeled as a point source with finite proper motion and parallax and constant flux (allowed to be different in each band). \dmreq{0276} This model is a good description for non-variable stars and other unresolved sources. Its 11 parameters will be constrained using some combination of matched \Source positions, centroids measured on short-period coadds, and dipole moments measured during difference-image forced photometry\footnote{The fitting procedure will account for differential chromatic refraction.}.
\item \emph{Bulge-disk model fit}. \dmreq{0276} The object is modeled as a sum of a de Vaucouleurs (Sersic $n=4$) and an exponential (Sersic $n=1$) component, but with some parameters constrained across the two components (the model described in Table~\ref{tbl:objectTable}, for example, fixes the ellipticity to be the same and the radii to have a constant ratio, while allowing the fluxes to vary independently; it should be emphasized that this is just one of several possible models).
The object is assumed not to move, and we will probably use the general-purpose centroid for the position rather than include the position as a model parameter. The model will be fit first to the coadds for all bands simultaneously (with only fluxes allowed to vary) and then to each band independently.
In all cases the fitting is done via forward modeling -- the model is convolved with the PSF before being compared to the image.
Our primary goal for this model is to provide robust photometry of most galaxies detected by LSST, with shapes and other morphology information (particularly for bright and/or well-resolved galaxies) a secondary goal, and we will tune the model parametrization accordingly.
\item \emph{Standard colors}. \dmreq{0276} Colors of the object in ``standard seeing'' (for example, the third quartile expected survey seeing in the $i$ band, $\sim$0.9\,arcsec) will be measured. These colors are guaranteed to be seeing-insensitive, suitable for estimation of photometric redshifts\footnote{The problem of optimal determination of photometric redshift is the subject of intense research. The approach we're taking here is conservative, following contemporary practices. As new insights develop, we will revisit the issue.}.
\item \emph{Centroids}. Centroids will be computed independently for each band using an algorithm similar to that employed by SDSS. Information from all\footnote{Whenever we say \emph{all}, it should be understood that this does not preclude reasonable data quality cuts to exclude data that would otherwise degrade the measurement.} epochs will be used to derive the estimate. These centroids will be used for adaptive moment, Petrosian, Kron, standard color, and aperture measurements. \dmreq{0276}
\item \emph{Adaptive moments}. Adaptive moments will be computed using information from all epochs, independently for each band. The moments of the PSF realized at the position of the object will be provided as well. \dmreq{0276}
\item \emph{Petrosian and Kron fluxes}. Petrosian and Kron radii and fluxes will be measured in standard seeing using self-similar elliptical apertures computed from adaptive moments. The apertures will be PSF-corrected and \emph{homogenized}, convolved to a canonical circular PSF\footnote{This is an attempt to derive a definition of elliptical apertures that does not depend on seeing. For example, for a large galaxy, the correction to standard seeing will introduce little change to measured ellipticity. Corrected apertures for small galaxies will tend to be circular (due to smearing by the PSF). In the intermediate regime, this method results in derived apertures that are relatively seeing-independent. Note that this is only the case for \emph{apertures}; the measured flux will still be seeing dependent and it is up to the user to take this into account.}. The radii will be computed independently for each band. Fluxes will be computed in each band, by integrating the light within some multiple of \emph{the radius measured in the canonical band}\footnote{The shape of the aperture in all bands will be set by the profile of the galaxy in the canonical band alone. This procedure ensures that the color measured by comparing the flux in different bands is measured through a consistent aperture. See \url{http://www.sdss.org/dr7/algorithms/photometry.html} for details.} (most likely the $i$ band). Radii enclosing 50\% and 90\% of light will be provided. \dmreq{0276}
\item \emph{Aperture surface brightness}. Aperture surface brightness will be computed in a variable number\footnote{The number will depend on the size of the source.} of concentric, logarithmically spaced, PSF-homogenized, elliptical apertures, in standard seeing. \dmreq{0276}
\item \emph{Variability characterization}. Parameters will be provided, designed to characterize periodic and aperiodic variability features \citep{2011ApJ...733...10R}, in each bandpass.
We caution that the exact features in use when LSST begins operations are likely to be different compared to the baseline described here; this is to be expected given the rapid pace of research in time domain astronomy. However, their \emph{number} is unlikely to grow beyond the present estimate. \dmreq{0276}
\end{itemize}
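As an illustration of the point source model fit described above (the exact parametrization used in production may differ), the source position can be modeled as a linear motion plus parallactic displacement:
\begin{equation}
\alpha(t) = \alpha_0 + \mu_{\alpha}\,(t - t_0) + \varpi\, P_{\alpha}(t), \qquad
\delta(t) = \delta_0 + \mu_{\delta}\,(t - t_0) + \varpi\, P_{\delta}(t),
\end{equation}
where $P_{\alpha}(t)$ and $P_{\delta}(t)$ are the known parallactic displacement factors at the time of observation. Together with one constant flux per band, the parameter vector $(\alpha_0, \delta_0, \mu_{\alpha}, \mu_{\delta}, \varpi, F_u, \ldots, F_y)$ has the 11 entries referred to above.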
\subsubsection{Supporting Science Cases Requiring Full Posteriors}
Science cases sensitive to systematics, departures of likelihood from Gaussianity, or requiring user-specified priors, demand knowledge of the shape of the likelihood function beyond a simple Gaussian approximation around the ML value. Photometric redshift estimation is the primary example where knowledge of the full posterior is likely to be needed for LSST science cases.\dmreq{0046}\dmreq{0276}
We currently plan to provide this information by providing parametric estimates of the likelihood function (for the photometric redshifts). As will be shown in Table~\ref{tbl:objectTable}, the current allocation is $\sim$100 parameters for describing the photo-z (photometric redshift) likelihood distributions, per object.
\dmreq{0333}
The methods of storing likelihood functions (or samples thereof) will continue to be developed and optimized throughout Construction and Commissioning. The key limitation, on the amount of data needed to be stored, can be overcome by compression techniques. For example, simply noticing that not more than $\sim$0.5\% accuracy is needed for sample values allows one to increase the number of samples by a factor of $4$. More advanced techniques, such as PCA analysis of the likelihoods across the entire catalog, may allow us to store even more, providing a better estimate of the shape of the likelihood function. In that sense, what is presented in Table~\ref{tbl:objectTable} should be thought of as a \textbf{\emph{conservative estimate}}, which we plan to improve upon as development continues in Construction.
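As a concrete illustration of the compression argument above (a minimal sketch, not project code): storing 8-bit quantized samples, with error below $\sim$0.2\% of the peak value, uses a quarter of the space of 32-bit floats, allowing four times as many samples in the same allocation.
\begin{lstlisting}[language=python,commentstyle=\bfseries\color{green!40!black}]
import numpy as np

# Hypothetical sketch of 8-bit quantization of photo-z PDF samples.
def quantize_pdf(samples):
    samples = np.asarray(samples, dtype=np.float64)
    scale = samples.max() / 255.0          # one float per object
    codes = np.round(samples / scale).astype(np.uint8)
    return codes, scale                    # 1 byte/sample + scale

def dequantize_pdf(codes, scale):
    return codes.astype(np.float64) * scale
\end{lstlisting}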
\subsubsection{Source Characterization}
\label{sec:sourceMeas}
Sources will be detected on individual visits as well as the coadds. Sources detected on coadds will primarily serve as inputs to the construction of the master object list as described in \S~\ref{sec:level2dp}, and may support other LSST science cases as seen fit by the users (for example, searches for objects whose shapes vary over time). \dmreq{0267}
The following \Source properties are planned to be measured:
%
\begin{itemize}
\item \emph{Centroids}. Centroids are currently planned to be computed using an algorithm similar to that employed by SDSS, which uses an approximation to the PSF model as a weight function. These centroids will be used for PSF photometry, adaptive moments, and aperture magnitude measurements. \dmreq{0267}
\item \emph{Point source photometry}. Fluxes are measured using the PSF model as a weight function, with aperture corrections applied to account for flux beyond the PSF model's extent and to put this measurement on the same system as other flux measurements.\dmreq{0267}
\item \emph{Adaptive moments}. Adaptive moments will be computed. The moments of the PSF realized at the position of the object will be provided as well. \dmreq{0267}
% \item \emph{Petrosian and Kron fluxes}. Petrosian and Kron radii and fluxes will be measured using elliptical apertures computed from adaptive moments. The apertures will be PSF-corrected and convolved to a canonical circular PSF. Fluxes will be computed in each band, by integrating the light within some multiple of \emph{the radius measured in the same band}\footnote{Note that this is different}. Radii enclosing 50\% and 100\% of light will be provided.
\item \emph{Aperture surface brightness}. Aperture surface brightness will be computed in a variable number\footnote{The number will depend on the size of the source.} of concentric, logarithmically spaced, PSF-homogenized, elliptical apertures. \dmreq{0267}
% \item \emph{Bulge-disk model fit}. The object is modeled as a sum of a de Vaucouleurs (Sersic $n=4$) and an exponential (Sersic $n=1$) component. This model is intended to be a reasonable description of galaxies. The object is assumed not to move. The components share the same ellipticity and center. There are a total of 8 free parameters ($\alpha$, $\delta$, $e_1$, $e_2$, $I_{0B}$, $I_{0D}$, $R_{eB}$, $R_{eD}$). Where there's insufficient data to constrain the likelihood (eg., small, poorly resolved, galaxies, or very few epochs), priors will be adopted. Only maximum likelihood values and the covariance matrix will be stored.
\end{itemize}
Note that we do \emph{not} plan to fit extended source Bulge+Disk models to individual \Sources, nor measure per-visit Petrosian or Kron fluxes. These are object properties that are not expected to vary in time\footnote{Objects that \emph{do} change shape with time would, obviously, be of particular interest. Aperture fluxes provided in the \Source table should suffice to detect these. Further per-visit shape characterization can be performed using the Science Platform.}, and will be better characterized by coadd-based measurements (in the \Object table).
%including the so-called star-galaxy separation (estimation of the probability that a source is resolved, given the PSF).
For example, although a simple extendedness characterization is present in the \Source table, star-galaxy separation (an estimate of the probability that a source is resolved, given the PSF) will be better characterized by a combination of \Object measurements.
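As a schematic, one-dimensional illustration of the adaptive-moments idea mentioned above (iteratively matching a Gaussian weight function to the object; a toy, not the production algorithm):
\begin{lstlisting}[language=python,commentstyle=\bfseries\color{green!40!black}]
import numpy as np

# Toy 1-D adaptive moments: the Gaussian weight is iterated
# until it matches the measured size of the object.
def adaptive_moments(profile, n_iter=20):
    x = np.arange(profile.size, dtype=float)
    x0 = np.sum(x * profile) / np.sum(profile)
    sigma2 = 4.0                              # initial weight variance
    for _ in range(n_iter):
        w = np.exp(-0.5 * (x - x0) ** 2 / sigma2)
        wp = w * profile
        x0 = np.sum(x * wp) / np.sum(wp)
        m2 = np.sum((x - x0) ** 2 * wp) / np.sum(wp)
        sigma2 = 2.0 * m2     # for a Gaussian, weighted m2 -> sigma2/2
    return x0, sigma2         # centroid and converged variance
\end{lstlisting}
For a Gaussian profile, the iteration converges to the true centroid and variance.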
\subsubsection{Forced Photometry}
\label{sec:forcedPhotL2}
\dmreq{0268}\dmreq{0287}
\emph{Forced Photometry} is the measurement of flux in individual visits, given an object's fixed position, shape, and deblending parameters. It enables the study of time variability of an object's flux, irrespective of whether the flux in any given individual visit is above or below the single-visit detection threshold.
Forced photometry will be performed on all visits, for all \Objects, using both direct images and difference images.
The measured fluxes will be stored in the \ForcedSources table. Due to space constraints, we only plan to measure the PSF flux.
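Since only the flux is free, forced PSF photometry reduces to a linear least-squares problem with a closed-form solution; the following is a toy sketch under that assumption, not LSST code:
\begin{lstlisting}[language=python,commentstyle=\bfseries\color{green!40!black}]
import numpy as np

# Toy forced PSF photometry: minimize
#   sum((image - F * psf)**2 / variance) over the scalar flux F.
def forced_psf_flux(image, variance, psf):
    w = psf / variance
    flux = np.sum(w * image) / np.sum(w * psf)
    flux_err = 1.0 / np.sqrt(np.sum(psf ** 2 / variance))
    return flux, flux_err
\end{lstlisting}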
\subsubsection{Crowded Field Photometry}
A fraction of LSST imaging will cover areas of high object (mostly stellar) density. These include the Galactic plane, the Large and Small Magellanic Clouds, and a number of globular clusters (among others).
% LSST does \emph{not} plan to build or deploy algorithms specifically designed for crowded field photometry. \textbf{\emph{Processing these areas using specialist crowded field photometry codes is left to the users as a Level 3 task (see \S~\ref{sec:level3})}}.
LSST image processing and measurement software, although primarily designed to operate in non-crowded regions, is expected to perform well in areas of crowding. The current LSST applications development plan envisions making the deblender aware of Galactic longitude and latitude, and permitting it to use that information as a prior when deciding how to deblend objects. While not guaranteed to reach the accuracy or completeness of purpose-built crowded field photometry codes, we expect this approach will yield acceptable results even in areas of moderately high crowding.
Note that this discussion only pertains to processing of \emph{direct images}. Crowding is not expected to significantly impact the quality of data products derived from \emph{difference images} (i.e., Prompt products).
\subsubsection{Shear Measurement}
\label{sec:shearMeasurement}
Cosmology is one of the major pillars of LSST science, and a substantial fraction of the survey's cosmological constraining power will come from weak gravitational lensing, which involves estimating the shear field applied to a population of galaxies by foreground matter.
For our pixel-level measurements of galaxy shapes to be usable for shear estimation, the responses of the algorithms we apply to shear must be carefully controlled and/or measured.
While the state of the art for these algorithms continues to evolve, our plan for at least DR1 and DR2 is to use the \texttt{Metadetection} algorithm of \citet{2023arXiv230303947S,2020ApJ...902..138S}.
This is a variant of the \texttt{Metacalibration} \citep{2017arXiv170202600H,2017ApJ...841...24S} approach, in which the response to shear is measured by applying the same algorithms to sheared versions of the original (coadd) images and their PSF models, and then using finite differences to compute the derivative of those measurements with respect to shear.
Additional algorithms may be run as well, as long as their processing time and catalog storage costs are negligible (which is actually the case for some promising but still unproven methods, e.g., \citealt{2016MNRAS.459.4467B}).
\texttt{Metadetection} requires the algorithms run on these images to include detection and deblending, and like all variants of \texttt{Metacalibration}, it requires all algorithms run on the modified images to be simple enough for finite differences to work well.
In practice this means measurements must be as close as possible to linear in the pixel values.
This is incompatible with our usual desire for algorithms to produce catalogs that map as well as possible onto astrophysical reality, especially since the way in which the coadd images are modified represents a slight degradation in signal-to-noise ratio and image quality.
As a result, the measurements from \texttt{Metadetection} are best structured as a separate \texttt{ShearObject} table representing a completely distinct set of detections and measurements, with many fewer columns than the main \Object table and $3-5\times$ as many rows (for the modified versions of the original coadds).
A match table from the main \Object table to \texttt{ShearObject} will not be provided, as actually using any such matches would void the carefully measured responses in \texttt{ShearObject}, rendering it useless.
\texttt{Metadetection} measurements will be performed on small coadd cells, over which the PSF can be considered spatially constant and the set of input visits is constant.
This will also be true of some regular \Object measurements, but the cell-based nature of those measurements will be less important; some key \texttt{Metadetection} measurements are actually reported per cell rather than per Object.
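The finite-difference response at the heart of \texttt{Metacalibration}-style methods can be illustrated with a toy sketch. Everything below is illustrative, not the pipeline's actual measurement code: a Gaussian stands in for a galaxy, the shear is applied as a simple axis rescaling, and ellipticity is measured from unweighted second moments.

```python
import numpy as np

def render_galaxy(g1, sigma=3.0, n=64):
    """Render a toy Gaussian 'galaxy' with a small shear g1 applied."""
    y, x = np.mgrid[:n, :n] - (n - 1) / 2.0
    # Small-shear coordinate transform along the x/y axes.
    xs = x * (1.0 - g1)
    ys = y * (1.0 + g1)
    return np.exp(-(xs**2 + ys**2) / (2.0 * sigma**2))

def measure_e1(img):
    """Unweighted second moments -> ellipticity (distortion) e1."""
    n = img.shape[0]
    y, x = np.mgrid[:n, :n] - (n - 1) / 2.0
    flux = img.sum()
    ixx = (img * x**2).sum() / flux
    iyy = (img * y**2).sum() / flux
    return (ixx - iyy) / (ixx + iyy)

# Metacalibration-style response: remeasure on images sheared by +/- dg
# and take a central finite difference.
dg = 0.01
response = (measure_e1(render_galaxy(+dg))
            - measure_e1(render_galaxy(-dg))) / (2.0 * dg)
# For this distortion-style e1 of a Gaussian, the response is close to 2.
```

In the real algorithm the *observed* (PSF-convolved, noisy) images are deconvolved, sheared, and reconvolved, and in \texttt{Metadetection} the detection and deblending steps are rerun on each sheared image as well; the finite-difference bookkeeping, however, is exactly as above.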
% TODO: Citation for tests of metadetection on LSST-like sims should be
% available very soon and should be included; the paper has just completed its
% DESC collaboration-wide review. DES Y6 shear catalog paper is also worth
% referencing when it is available (not sure of the timescale).
\subsection{The Data Release Catalogs}
This section presents the contents of key Data Release catalog tables. As was the case for Prompt Products (see \S~\ref{sec:level1db}), here we present the \emph{conceptual schemas} for the most important Data Release tables (the \Object, \Source, and \ForcedSource tables).
The tables themselves are defined in \citeds{LDM-153}\footnote{The SQL definition itself can be found in the \texttt{cat} package.}.
These convey \emph{what} data will be recorded in each table, rather than the details of \emph{how}. For example, columns whose type is an array (e.g., \texttt{radec}) may be expanded to one table column per element of the array (e.g., \texttt{ra}, \texttt{decl}) once this schema is translated to SQL. In addition, the tables presented here are normalized (i.e., contain no redundant information). For example, since the band of observation can be found by joining a \Source table to the table with exposure metadata, there is no column named \texttt{band} in the \Source table. In the as-built database, the views presented to the users will be appropriately denormalized for ease of use.\dmreq{0332}
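The normalization described above can be sketched with a toy join. The table and column names here are purely illustrative (they are not the as-built schema); the point is only that a denormalized user-facing view recovers the band by joining against exposure metadata.

```python
import pandas as pd

# Illustrative, not the real schema: a Source table stores only a pointer
# to its exposure; the band lives in the exposure-metadata table.
source = pd.DataFrame({"sourceId": [1, 2, 3],
                       "exposureId": [10, 10, 11]})
exposure = pd.DataFrame({"exposureId": [10, 11],
                         "band": ["r", "i"]})

# A denormalized view adds `band` back via the join, so users never
# have to write the join themselves.
view = source.merge(exposure, on="exposureId")
```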
\subsubsection{The \Object Table}
\label{sec:objectTable}
\dmreq{0275}
\begin{schema}{\Object Table}{Data Release Catalog \Object Table}{tbl:objectTable}
objectId & uint64 & ~ & Unique object identifier \\
parentObjectId & uint64 & ~ & ID of the parent \Object this object has been deblended from, if any. \\
radec & double[6][2] & degrees & Position of the object (centroid), computed independently in each band.
The centroid will be computed using an algorithm similar to that employed by SDSS.\\
radecErr & double[6][2] & arcsec & Uncertainty of \texttt{radec}. \\
psRadecTai & double & time & Point source model: Time at which the object was at position \texttt{psRadec}. \\
psRadec & double[2] & degrees & Point source model: $(\alpha, \delta)$ position of the object at time \texttt{psRadecTai}. \\
psPm & float[2] & mas/yr & Point source model: Proper motion vector.\\
psParallax & float & mas & Point source model: Parallax. \\
psFlux & float[ugrizy] & nJy & Point source model fluxes\footnote{Point source model assumes that fluxes are constant in each band. If the object is variable, \texttt{psFlux} will effectively be some estimate of the average flux.}.\\
psCov & float[66] & various & Point-source model covariance matrix\footnote{Not all elements of the covariance matrix need to be stored with same precision. While the variances will be stored as 32 bit floats ($\sim$ seven significant digits), the covariances may be stored to $\sim$ three significant digits ($\sim$1\% ).}. \\
psLnL & float & ~ & Natural logarithm of the likelihood of the observed data given the point source model. \\
psChi2 & float & ~ & $\chi^2$ statistic of the model fit. \\
psNdata & int & ~ & The number of data points (pixels) used to fit the model. \\
bdEllip & float[2][ugrizyM] & ~ & B+D model: Ellipticity $(e_1, e_2)$ of the object. \\
bdFluxB & float[2][ugrizy] & nJy & B+D model\footnote{Though we refer to this model as ``Bulge plus Disk'', we caution the reader that the decomposition, while physically motivated, should not be taken too literally.}: Integrated flux of the de Vaucouleurs component, in both the multi-band fit and the per-band fit. \\
bdFluxD & float[2][ugrizy] & nJy & B+D model: Integrated flux of the exponential component, in both the multi-band fit and the per-band fit. \\
bdReB & float[ugrizyM] & arcsec & B+D model: Effective radius of the de Vaucouleurs profile component. \\
bdReD & float[ugrizyM] & arcsec & B+D model: Effective radius of the exponential profile component. \\
bdCovM & float[136] & ~ & B+D model covariance matrix for the multi-band fit\footnote{See \texttt{psCov} for notes on precision of variances/covariances.}. \\
bdCov & float[21][ugrizy] & ~ & B+D model covariance matrices for the independent per-band fits. \\
bdLnL & float[ugrizyM] & ~ & Natural logarithm of the likelihood of the observed data given the bulge+disk model. \\
bdChi2 & float[ugrizyM] & ~ & $\chi^2$ statistic of the model fit. \\
bdNdata & int[ugrizyM] & ~ & The number of data points (pixels) used to fit the model. \\
stdColor & float[5] & mag & Color of the object measured in ``standard seeing''. While the exact algorithm is yet to be determined, this color is guaranteed to be seeing-independent and suitable for photo-z determinations.\\
stdColorErr & float[5] & mag & Uncertainty of \texttt{stdColor}. \\
Ixx & float & arcsec$^{2}$ & Adaptive second moment of the source intensity. See \citet{2002AJ....123..583B} for detailed discussion of all adaptive-moment related quantities\footnote{Or \url{http://ls.st/5f4} for a brief summary.}. \\
Iyy & float & arcsec$^{2}$ & Adaptive second moment of the source intensity. \\
Ixy & float & arcsec$^{2}$ & Adaptive second moment of the source intensity. \\
Icov & float[6] & arcsec$^{4}$ & \texttt{Ixx}, \texttt{Iyy}, \texttt{Ixy} covariance matrix. \\
IxxPSF & float & arcsec$^{2}$ & Adaptive second moment for the PSF. \\
IyyPSF & float & arcsec$^{2}$ & Adaptive second moment for the PSF. \\
IxyPSF & float & arcsec$^{2}$ & Adaptive second moment for the PSF. \\
m4 & float[ugrizy] & ~ & Fourth order adaptive moment. \\
petroRad & float[ugrizy] & arcsec & Petrosian radius, computed using elliptical apertures defined by the adaptive moments. \\
petroRadErr & float[ugrizy] & arcsec & Uncertainty of \texttt{petroRad} \\
petroBand & int8 & ~ & The band of the canonical \texttt{petroRad} \\
petroFlux & float[ugrizy] & nJy & Petrosian flux within a defined multiple of the canonical \texttt{petroRad} \\
petroFluxErr & float[ugrizy] & nJy & Uncertainty in \texttt{petroFlux} \\
petroRad50 & float[ugrizy] & arcsec & Radius containing 50\% of Petrosian flux. \\
petroRad50Err & float[ugrizy] & arcsec & Uncertainty of \texttt{petroRad50}. \\
petroRad90 & float[ugrizy] & arcsec & Radius containing 90\% of Petrosian flux. \\
petroRad90Err & float[ugrizy] & arcsec & Uncertainty of \texttt{petroRad90}. \\
kronRad & float[ugrizy] & arcsec & Kron radius (computed using elliptical apertures defined by the adaptive moments) \\
kronRadErr & float[ugrizy] & arcsec & Uncertainty of \texttt{kronRad} \\
kronBand & int8 & ~ & The band of the canonical \texttt{kronRad} \\
kronFlux & float[ugrizy] & nJy & Kron flux within a defined multiple of the canonical \texttt{kronRad} \\
kronFluxErr & float[ugrizy] & nJy & Uncertainty in \texttt{kronFlux} \\
kronRad50 & float[ugrizy] & arcsec & Radius containing 50\% of Kron flux. \\
kronRad50Err & float[ugrizy] & arcsec & Uncertainty of \texttt{kronRad50}. \\
kronRad90 & float[ugrizy] & arcsec & Radius containing 90\% of Kron flux. \\
kronRad90Err & float[ugrizy] & arcsec & Uncertainty of \texttt{kronRad90}. \\
apNann & int8 & ~ & Number of elliptical annuli (see below). \\
apMeanSb & float[6][\texttt{apNann}] & nJy/arcsec$^2$ & Mean surface brightness within an annulus\footnote{A database function will be provided to compute the area of each annulus, to enable the computation of aperture flux.}. \\
apMeanSbSigma & float[6][\texttt{apNann}] & nJy/arcsec$^2$ & Standard deviation of \texttt{apMeanSb}. \\
extendedness & float & ~ & A measure of extendedness, computed using a combination of available moments, or from a likelihood ratio of point/B+D source models (exact algorithm TBD). $extendedness=1$ implies a high degree of confidence that the source is extended. $extendedness=0$ implies a high degree of confidence that the source is point-like. \\
lcPeriodic & float[6~\x~32] & ~ & Periodic features extracted from difference-image-based light curves using the generalized Lomb-Scargle periodogram \citep[Table~4,][]{2011ApJ...733...10R}.\\
lcNonPeriodic & float[6~\x~20] & ~ & Non-periodic features extracted from difference-image-based light curves \citep[Table~5,][]{2011ApJ...733...10R}. \\
photoZ & float[2~\x~95] & ~ & Photometric redshift likelihood samples -- pairs of ($z$, $\log L$) -- computed using a published and widely accepted algorithm, to be selected at the time of LSST Commissioning. \\
photoZ\_pest & float[10] & ~ & Point estimates for photometric redshift\footnote{TBD, but likely candidates
are the mode, mean, standard deviation, skewness, kurtosis, and the 1\%, 5\%, 25\%, 50\%, 75\%, and 99\% points
of the cumulative distribution.}, computed using \texttt{photoZ}. \\
flags & bit[128] & bit & Various useful flags. \\
\end{schema}
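The Petrosian quantities above (\texttt{petroRad}, \texttt{petroFlux}, etc.) follow the standard definition: the Petrosian radius is where the local surface brightness falls to a fixed fraction $\eta$ of the mean surface brightness interior to that radius. The sketch below illustrates this definition on a circular exponential profile; the actual pipeline uses elliptical apertures defined by the adaptive moments, and the value of $\eta$ here (0.2) is illustrative.

```python
import numpy as np

def petrosian_radius(radii, profile, eta=0.2):
    """Radius where local surface brightness drops below eta times the
    mean surface brightness interior to that radius."""
    integrand = 2.0 * np.pi * radii * profile
    # Cumulative flux inside each radius (trapezoidal rule).
    cumflux = np.concatenate(
        ([0.0],
         np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(radii)))
    )
    mean_sb = cumflux[1:] / (np.pi * radii[1:] ** 2)
    below = np.nonzero(profile[1:] < eta * mean_sb)[0]
    return radii[1:][below[0]] if below.size else None

radii = np.linspace(0.0, 20.0, 2001)   # arcsec
profile = np.exp(-radii / 2.0)         # exponential, 2 arcsec scale length
r_petro = petrosian_radius(radii, profile)
```

For an exponential profile and $\eta = 0.2$ the Petrosian radius comes out near 3.6 scale lengths, so \texttt{petroFlux} (measured within a multiple of \texttt{petroRad}) captures nearly all of the light of disk-like galaxies.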
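Turning the \texttt{photoZ} likelihood samples into \texttt{photoZ\_pest}-style point estimates is straightforward; the sketch below uses a toy Gaussian likelihood on an illustrative redshift grid (the number and placement of samples, and the final list of point estimates, are TBD as noted above).

```python
import numpy as np

# Toy (z, lnL) samples, as in the photoZ column: a Gaussian likelihood
# peaked at z = 1.2, sampled on 95 grid points.
z = np.linspace(0.0, 3.0, 95)
lnl = -0.5 * ((z - 1.2) / 0.15) ** 2

p = np.exp(lnl - lnl.max())   # subtract the max to avoid underflow
p /= p.sum()                  # normalize on the sample grid

mode = z[np.argmax(p)]
mean = np.sum(z * p)
cdf = np.cumsum(p)
median = np.interp(0.5, cdf, z)              # 50% point of the CDF
z05, z95 = np.interp([0.05, 0.95], cdf, z)   # tail quantiles
```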
\subsubsection{\Source Table}
\label{sec:sourceTable}
\Source measurements are performed independently on individual visits. They are designed to enable relative astrometric and photometric calibration, variability studies of high signal-to-noise-ratio (SNR) objects, and studies of high-SNR objects that vary in position and/or shape (e.g., comets).\dmreq{0267}