| public:user_software:documentation:ndppp [2017-12-07 08:28]  – Document H5ParmPredict Tammo Jan Dijkema | public:user_software:documentation:ndppp [2021-02-26 14:18] (current)  – [DPPP]  Tammo Jan Dijkema | 
|---|
| ===== DPPP ===== | ===== DPPP ===== | 
|  |  | 
|  | ==== Important ==== | 
|  |  | 
|  | A newer version of this documentation is available at https://www.astron.nl/citt/DP3 | 
|  |  | 
|  | ==== Old documentation ==== | 
|  |  | 
| DPPP (the Default Preprocessing Pipeline, previously NDPPP for New Preprocessing Pipeline) is the LOFAR data pipelined processing program. It can be used to do all kinds of operations on the data in a pipelined way, so the data are read and written only once. | DPPP (the Default Preprocessing Pipeline, previously NDPPP for New Preprocessing Pipeline) is the LOFAR data pipelined processing program. It can be used to do all kinds of operations on the data in a pipelined way, so the data are read and written only once. | 
|  |  | 
| * **[[#DDECal]]** to calibrate direction dependent gains. | * **[[#DDECal]]** to calibrate direction dependent gains. | 
| * **[[#Predict]]** to predict the visibilities of a given sky model. | * **[[#Predict]]** to predict the visibilities of a given sky model. | 
| * **[[#H5ParmPredict]]** to predict visibilities corrupted by an instrument model (in H5Parm) | * **[[#H5ParmPredict]]** to subtract multiple directions of visibilities corrupted by an instrument model (in H5Parm) generated by DDECal. | 
| * **[[#ApplyBeam]]** to apply the LOFAR beam model, or the inverse of it. | * **[[#ApplyBeam]]** to apply the LOFAR beam model, or the inverse of it. | 
|  | * **[[#SetBeam]]** to set the beam keywords after prediction. | 
| * **[[#ScaleData]]** to scale the data with a polynomial in frequency (based on SEFD of LOFAR stations). | * **[[#ScaleData]]** to scale the data with a polynomial in frequency (based on SEFD of LOFAR stations). | 
|  | * **[[#Upsample]]** to upsample visibilities in time | 
|  | * **[[#Intermediate_output_step|Out]]** to add intermediate output steps | 
|  | * **[[#Interpolate]]** for improving the accuracy of data averaging. | 
| * **[[#User defined]]** steps provide a plugin mechanism for arbitrary steps implemented in C++. | * **[[#User defined]]** steps provide a plugin mechanism for arbitrary steps implemented in C++. | 
| * **[[#Python defined]]** steps provide a plugin mechanism for arbitrary steps implemented in Python. | * **[[#Python defined]]** steps provide a plugin mechanism for arbitrary steps implemented in Python. | 
| </code> | </code> | 
| where WGHT is the weight put in by RTCP (number of samples used / total number of samples). | where WGHT is the weight put in by RTCP (number of samples used / total number of samples). | 
| \\ {{ DPPP_weights.pdf | This note}} discusses weighting in some more detail. | \\ {{:public:user_software:documentation:ndppp_weights.pdf|This note}} discusses weighting in some more detail. | 
|  |  | 
| === Flagging === | === Flagging === | 
| * [[#PhaseShift|Data can be shifted]] to another phase center. | * [[#PhaseShift|Data can be shifted]] to another phase center. | 
| * A shift step can shift back to the original phase center (by giving an empty center). If that is done by the last shift step, no new MS needs to be created. | * A shift step can shift back to the original phase center (by giving an empty center). If that is done by the last shift step, no new MS needs to be created. | 
|  |  | 
|  | === Upsample === | 
|  | * [[#Upsample|Upsampling]] data can be useful for at least one use case. Consider data that has been integrated for two seconds, by a correlator (the AARTFAAC correlator) that sometimes misses one second of data. The times of the visibilities will then look like [0, 2, 4, 7, 9, 12], each having integration time 2 seconds. DPPP will automatically fill missing time slots, which will lead to times [0, 2, 4, 6, 7, 9, 11, 12]. This is still a nonuniform time coverage, which is not desirable. Calling the upsample step with ''timestep=2'' on this data will create times [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13] (it will remove the inserted dummy time slots that overlap, i.e. at 7 and 12). This data is then useful for further processing, e.g. averaging to 10 seconds. | 
|  |  | 
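|  | A minimal parset sketch of the use case above, assuming 2-second input data: the upsample step splits every time slot in two, after which averaging 10 of the resulting 1-second slots gives 10-second data. The MS names are hypothetical, and the ''average'' step is named as in the other examples on this page. | 
|  | <code> | 
|  | # Hypothetical input with 2-second integrations and missing seconds | 
|  | msin=aartfaac.MS | 
|  |  | 
|  | steps=[upsample,average] | 
|  |  | 
|  | # Expand every time slot into 2 slots, giving a uniform 1-second grid | 
|  | upsample.type=upsample | 
|  | upsample.timestep=2 | 
|  |  | 
|  | # Average 10 slots of 1 second to 10 seconds | 
|  | average.timestep=10 | 
|  |  | 
|  | # Hypothetical output name | 
|  | msout=aartfaac-10s.MS | 
|  | </code> | 
|  |  | 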
| === Station summation === | === Station summation === | 
| * The ''plotflags'' function in the Python module ''lofar.dppp'' can be used to plot those tables. It can plot multiple subbands by giving it a list of table names. The flags per station will be averaged for those subbands. | * The ''plotflags'' function in the Python module ''lofar.dppp'' can be used to plot those tables. It can plot multiple subbands by giving it a list of table names. The flags per station will be averaged for those subbands. | 
|  |  | 
|  | === Intermediate output step === | 
|  | The step ''out'' can write data to disk at an intermediate stage. It takes the same arguments as the [[#Output|'msout' step]]. As an example, the following reduction will flag, save flagged data at high resolution, then average and save the result in another measurement set. On the averaged data, it will also apply a calibration table and save that in the ''CORRECTED_DATA'' column. | 
|  | <code> | 
|  | msin=L123.MS | 
|  |  | 
|  | steps=[aoflag,out1,average,out2,applycal] | 
|  |  | 
|  | # Write out flagged data at full resolution | 
|  | out1.type=out | 
|  | out1.name=L123-flagged.MS | 
|  |  | 
|  | average.timestep=4 | 
|  |  | 
|  | # Write out averaged data | 
|  | out2.type=out | 
|  | out2.name=L123-averaged.MS | 
|  | out2.datacolumn=DATA | 
|  |  | 
|  | applycal.parmdb=instrument.parmdb | 
|  |  | 
|  | # Write the corrected data to CORRECTED_DATA | 
|  | msout=L123-averaged.MS | 
|  | msout.datacolumn=CORRECTED_DATA | 
|  | </code> | 
|  |  | 
| === User defined step === | === User defined step === | 
|  |  | 
| ==== Input ==== | ==== Input ==== | 
| | msin \\ msin.name | string | | Name of the input MeasurementSets. If a single name is given, it can be a glob-pattern (like L23456_SAP000_SB*) meaning that all MSs matching the pattern will be used. A glob-pattern can contain *, ?, [], and {} pattern characters (as used in bash). \\ If multiple MSs are to be used, their data are concatenated in frequency, thus multiple subbands are combined to a single band. In principle all MSs should exist, but if 'missingdata=true' and 'orderms=false' flagged zero data will be inserted for missing MS(s) and their frequency info will be deduced from the other MSs. | |  | 
| | msin.sort | bool | false | Does the MS need to be sorted in TIME order? | | |msin \\ msin.name|string| |Name of the input MeasurementSets. If a single name is given, it can be a glob-pattern (like L23456_SAP000_SB*) meaning that all MSs matching the pattern will be used. A glob-pattern can contain *, ?, [], and {} pattern characters (as used in bash). \\ If multiple MSs are to be used, their data are concatenated in frequency, thus multiple subbands are combined to a single band. In principle all MSs should exist, but if 'missingdata=true' and 'orderms=false' flagged zero data will be inserted for missing MS(s) and their frequency info will be deduced from the other MSs.| | 
| | msin.orderms | bool | true | Do the MSs need to be ordered on frequency? If true, all MSs must exist, otherwise they cannot be ordered. If false, the MSs must be given in order of frequency. | | |msin.sort|bool|false|Does the MS need to be sorted in TIME order?| | 
| | msin.missingdata | bool | false | true = it is allowed that a data column in an MS does not exist. In that case its data will be 0 and flagged. It can be useful if the CORRECTED_DATA of subbands are combined, but a BBS run for one of them failed. \\ If 'orderms=false', it also makes it possible that a MS is specified but does not exist.  In such a case flagged data will be used instead. The missing frequency info will be deduced from the other MSs where all MSs have to have the same number of channels and must be defined in order of frequency. | | |msin.orderms|bool|true|Do the MSs need to be ordered on frequency? If true, all MSs must exist, otherwise they cannot be ordered. If false, the MSs must be given in order of frequency.| | 
| | msin.baseline | string | | Baselines to be selected (default is all baselines). See [[#Description of baseline selection parameters]]. Only the CASA baseline selection syntax as described in {{msselection.pdf | this note}} can be used. | | |msin.missingdata|bool|false|true = it is allowed that a data column in an MS does not exist. In that case its data will be 0 and flagged. It can be useful if the CORRECTED_DATA of subbands are combined, but a BBS run for one of them failed. \\ If 'orderms=false', it also makes it possible that a MS is specified but does not exist. In such a case flagged data will be used instead. The missing frequency info will be deduced from the other MSs where all MSs have to have the same number of channels and must be defined in order of frequency.| | 
| | msin.band | integer | -1 | Band (spectral window) to select (<0 is no selection). This is mainly useful for WSRT data. | | |msin.baseline|string| |Baselines to be selected (default is all baselines). See [[#description_of_baseline_selection_parameters|Description of baseline selection parameters]]. Only the CASA baseline selection syntax as described in {{:public:user_software:documentation:msselection.pdf| this note}}  can be used.| | 
| | msin.startchan | integer | 0 | First channel to use from the input MS (channel numbers start counting at 0). Note that skipped channels will not be written into the output MS. It can be an expression with `nchan` (nr of input channels) as parameter. E.g. \\ ''  nchan/32'' \\ will be fine for LOFAR observations with 64 and 256 channels. | | |msin.band|integer|-1|Band (spectral window) to select (<0 is no selection). This is mainly useful for WSRT data.| | 
| | msin.nchan | integer | 0 | Number of channels to use from the input MS (0 means till the end). It can be an expression with `nchan` (nr of input channels) as parameter. E.g. \\ ''15*nchan/16'' | | |msin.startchan|integer|0|First channel to use from the input MS (channel numbers start counting at 0). Note that skipped channels will not be written into the output MS. It can be an expression with `nchan` (nr of input channels) as parameter. E.g. \\  ''nchan/32'' \\ will be fine for LOFAR observations with 64 and 256 channels.| | 
| | msin.starttime | string | first time in MS | Center of first time slot to use; if < first time in MS, dummy time slots are inserted. A date/time must be specified in the casacore MVTime format, e.g. 19Feb2010/14:01:23.817 | | |msin.nchan|integer|0|Number of channels to use from the input MS (0 means till the end). It can be an expression with `nchan` (nr of input channels) as parameter. E.g. \\  ''15*nchan/16'' | | 
| | msin.endtime | string | last time in MS | Center of last time slot to use; if > last time in MS, dummy time slots are inserted. | | |msin.starttime|string|first time in MS|Center of first time slot to use; if < first time in MS, dummy time slots are inserted. A date/time must be specified in the casacore MVTime format, e.g. 19Feb2010/14:01:23.817| | 
| | msin.ntimes | integer | 0 | Number of time slots to use (0 means till the end). | | |msin.starttimeslot|int|0|Starting time slot. This can be negative to insert flagged time slots before the beginning of the MS.| | 
| msin.useflag | bool | true | Use the current flags in the MS? If false, all flags in the MS are ignored and the data (except NaN and infinite values) are assumed to be good and will be used in later steps. | | |msin.endtime|string|last time in MS|Center of last time slot to use; if > last time in MS, dummy time slots are inserted.| | 
| | msin.datacolumn | string | DATA | Data column to use. | | |msin.ntimes|integer|0|Number of time slots to use (0 means till the end).| | 
| msin.weightcolumn | string | WEIGHT_SPECTRUM or WEIGHT | Weight column to use. Defaults to WEIGHT_SPECTRUM if this exists, otherwise the WEIGHT column is used. | | |msin.useflag|bool|true|Use the current flags in the MS? If false, all flags in the MS are ignored and the data (except NaN and infinite values) are assumed to be good and will be used in later steps.| | 
| | msin.modelcolumn | string | MODEL_DATA | Model data column. Currently only used in gaincal | | |msin.datacolumn|string|DATA|Data column to use, i.e. the name of the column in which the visibilities are written.| | 
| | msin.autoweight | bool | false | Calculate weights using the auto-correlation data? It is meant for setting the proper weights for a raw LOFAR MeasurementSet. | | |msin.weightcolumn|string|WEIGHT_SPECTRUM or WEIGHT|Weight column to use. Defaults to WEIGHT_SPECTRUM if this exists, otherwise the WEIGHT column is used.| | 
| msin.forceautoweight | bool | false | In principle the calculation of the weights should only be done for the raw LOFAR data. It appeared that sometimes the ''autoweight'' switch was accidentally set in a DPPP run on already dppp-ed data. To make it harder to make such mistakes, the ''forceautoweight'' flag has to be set as well for MSs containing dppp-ed data. | | |msin.modelcolumn|string|MODEL_DATA|Model data column. Currently only used in gaincal and ddecal.| | 
|  | |msin.autoweight|bool|false|Calculate weights using the auto-correlation data? It is meant for setting the proper weights for a raw LOFAR MeasurementSet.| | 
|  | |msin.forceautoweight|bool|false|In principle the calculation of the weights should only be done for the raw LOFAR data. It appeared that sometimes the ''autoweight'' switch was accidentally set in a DPPP run on already dppp-ed data. To make it harder to make such mistakes, the ''forceautoweight'' flag has to be set as well for MSs containing dppp-ed data.| | 
|  |  | 
|  | \\ | 
|  |  | 
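|  | As an illustration of the channel selection expressions in the table above, the following sketch trims the band edges of a single MS. The MS name is hypothetical. For an input with 64 channels, ''nchan/32'' evaluates to 2 and ''15*nchan/16'' to 60, so channels 2 up to and including 61 are read. | 
|  | <code> | 
|  | # Hypothetical input MS | 
|  | msin=L123.MS | 
|  | msin.datacolumn=DATA | 
|  |  | 
|  | # Skip the first 1/32 of the channels | 
|  | msin.startchan=nchan/32 | 
|  |  | 
|  | # Use 15/16 of the input channels from there on | 
|  | msin.nchan=15*nchan/16 | 
|  | </code> | 
|  |  | 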
| ==== Output ==== | ==== Output ==== | 
| | msout.clusterdesc | string | "" | If not empty, create the VDS file using this ClusterDesc file. | | | msout.clusterdesc | string | "" | If not empty, create the VDS file using this ClusterDesc file. | | 
| | msout.vdsdir | string | "" | Directory where to put the VDS file; if empty, the MS directory is used. | | | msout.vdsdir | string | "" | Directory where to put the VDS file; if empty, the MS directory is used. | | 
| | msout.storagemanager | string | "" | What storage manager to use. When empty (default), the data will be stored uncompressed. When set to "dysco", the data will be compressed. Settings below will set the compression settings; see [[https://github.com/aroffringa/dysco/wiki|the Dysco wiki]] and [[https://arxiv.org/abs/1609.02019|the paper]] for more info. The default settings are reasonably conservative and safe. | | | msout.storagemanager \\ msout.storagemanager.name| string | "" | What storage manager to use. When empty (default), the data will be stored uncompressed. When set to "dysco", the data will be compressed. Settings below will set the compression settings; see [[https://github.com/aroffringa/dysco/wiki|the Dysco wiki]] and [[https://arxiv.org/abs/1609.02019|the paper]] for more info. The default settings are reasonably conservative and safe. | | 
| | msout.storagemanager.databitrate | integer | 10 | Number of bits per float used for columns containing visibilities. Can be set to zero to compress weights only. | | | msout.storagemanager.databitrate | integer | 10 | Number of bits per float used for columns containing visibilities. Can be set to zero to compress weights only. | | 
| | msout.storagemanager.weightbitrate | integer | 12 | Number of bits per float used for WEIGHT_SPECTRUM column. | | | msout.storagemanager.weightbitrate | integer | 12 | Number of bits per float used for WEIGHT_SPECTRUM column. Can be set to zero to compress data only. Note that compressing weights will set all polarizations to the same weight (determined by the minimum weight over the polarizations). | | 
| | msout.storagemanager.distribution | string | "TruncatedGaussian" | Assumed distribution for compression; "Uniform", "TruncatedGaussian", "Gaussian" or "StudentsT".| | | msout.storagemanager.distribution | string | "TruncatedGaussian" | Assumed distribution for compression; "Uniform", "TruncatedGaussian", "Gaussian" or "StudentsT".| | 
| | msout.storagemanager.disttruncation | double | 2.5 | Truncation level for compression with the Truncated Gaussian distribution.| | | msout.storagemanager.disttruncation | double | 2.5 | Truncation level for compression with the Truncated Gaussian distribution.| | 
| | <step>.corrtype | string | "" | Correlation type to match? Must be auto, cross, or an empty string. | | | <step>.corrtype | string | "" | Correlation type to match? Must be auto, cross, or an empty string. | | 
| | <step>.remove | bool | false | If true, the stations not used in any baseline will be removed from the ANTENNA subtable and the antenna ids in the main table will be renumbered accordingly. To have a consistent output MeasurementSet, other subtables (FEED, POINTING, SYSCAL, LOFAR_ANTENNA_FIELD, LOFAR_ELEMENT_FAILURE, and QUALITY_BASELINE_STATISTIC) will also be updated. \\ Note that stations filtered previously (e.g. using msselect) will also be removed, even if no baseline selection is done in the filter step. | | | <step>.remove | bool | false | If true, the stations not used in any baseline will be removed from the ANTENNA subtable and the antenna ids in the main table will be renumbered accordingly. To have a consistent output MeasurementSet, other subtables (FEED, POINTING, SYSCAL, LOFAR_ANTENNA_FIELD, LOFAR_ELEMENT_FAILURE, and QUALITY_BASELINE_STATISTIC) will also be updated. \\ Note that stations filtered previously (e.g. using msselect) will also be removed, even if no baseline selection is done in the filter step. | | 
|  |  | 
|  | ==== Upsample ==== | 
|  | | <step>.type | string | | Case-insensitive step type; must be 'upsample'| | 
|  | | <step>.timestep | integer |  | Number of times into which each timestep will be expanded | | 
|  |  | 
| ==== AOFlagger ==== | ==== AOFlagger ==== | 
| | <step>.type | string | | Case-insensitive step type; must be 'demixer' (or 'demix'). | | | <step>.type | string | | Case-insensitive step type; must be 'demixer' (or 'demix'). | | 
| | <step>.baseline | string | "" | Baselines to demix. See [[#Description of baseline selection parameters]]. | | | <step>.baseline | string | "" | Baselines to demix. See [[#Description of baseline selection parameters]]. | | 
| | <step>.blrange | double vector | "" | Baselines to demix. See [[#Description of baseline selection parameters]]. | | | <step>.blrange | double vector | [] | Baselines to demix. See [[#Description of baseline selection parameters]]. | | 
| | <step>.corrtype | string | cross | Baselines to demix. Correlation type to match? Must be auto, cross, or an empty string. | | | <step>.corrtype | string | cross | Baselines to demix. Correlation type to match? Must be auto, cross, or an empty string. | | 
| | <step>.timestep | integer | 1 | Number of time slots to average when subtracting. It is truncated if exceeding the actual number of times. Note that the data itself will also be averaged by this amount. | | | <step>.timestep | integer | 1 | Number of time slots to average when subtracting. It is truncated if exceeding the actual number of times. Note that the data itself will also be averaged by this amount. | | 
| | <step>.type | string | | Case-insensitive step type; must be 'applycal' (or 'correct'). | | | <step>.type | string | | Case-insensitive step type; must be 'applycal' (or 'correct'). | | 
| | <step>.parmdb | string | | Path of parmdb in which the parameters are stored. This can also be an H5Parm file, in that case the filename has to end in '.h5' | | | <step>.parmdb | string | | Path of parmdb in which the parameters are stored. This can also be an H5Parm file, in that case the filename has to end in '.h5' | | 
| <step>.correction | string | gain | Type of correction to perform, can be one of 'gain', 'tec', 'clock', 'commonrotationangle', 'commonscalarphase', 'commonscalaramplitude' or 'rotationmeasure' (create multiple ApplyCal steps for multiple corrections). When using H5Parm, specify the name of the soltab here; the type will be deduced from the metadata in that soltab. | | | <step>.solset | string | "" | In case of applying an H5Parm file: the name of the solset to be used. If empty, defaults to the name of one solset present in the H5Parm (if more solsets are present in an H5Parm and solset is left empty, an error will be thrown) | | 
|  | | <step>.correction | string | gain | Type of correction to perform, can be one of 'gain', 'tec', 'clock', '(common)rotationangle' / 'rotation', '(common)scalarphase', '(common)scalaramplitude' or 'rotationmeasure' (create multiple ApplyCal steps for multiple corrections). When using H5Parm, this is for now the name of the soltab; the type will be deduced from the metadata in that soltab, except for full Jones, in which case correction should be 'fulljones'.  | | 
|  | | <step>.soltab | string vector | from correction | The name or names of the H5 soltab. Currently only used when correction=fulljones, in which case soltab should list two names (amplitude and phase soltab). | | 
| | <step>.direction | string | "" | If using H5Parm, the direction of the solution to use | | | <step>.direction | string | "" | If using H5Parm, the direction of the solution to use | | 
| <step>.updateweights | bool | false | Update the weights column, in a way consistent with the weights being inversely proportional to the autocorrelations (e.g. if 'autoweight' was used before). | | | <step>.updateweights | bool | false | Update the weights column, in a way consistent with the weights being inversely proportional to the autocorrelations (e.g. if 'autoweight' was used before). | | 
|  | | <step>.interpolation | string | nearest | If using H5Parm, the type of interpolation (in time and frequency) to use, can be one of 'nearest' or 'linear'. | | 
| | <step>.invert | bool | true | Invert the corrections, to correct the data. Default is true. If you want to corrupt the data, set it to 'false' | | | <step>.invert | bool | true | Invert the corrections, to correct the data. Default is true. If you want to corrupt the data, set it to 'false' | | 
| | <step>.timeslotsperparmupdate | int | 100 | Number of time slots to handle after one read of the parameter file. Optimization to prevent spurious reading from the parmdb. | | | <step>.timeslotsperparmupdate | int | 100 | Number of time slots to handle after one read of the parameter file. Optimization to prevent spurious reading from the parmdb. | | 
| | <step>.parmdb | string | | Path of parmdb in which the computed parameters are to be stored. If the parmdb already exists, it will be overwritten. **Note**: You cannot use this parmdb in an applycal step in the same run of DPPP. To apply the solutions of the gaincal directly, use 'gaincal.applysolution' (see below).  **New in LOFAR 3.1:** if the parmdb name ends in ''.h5'' , an H5Parm will be written.| | | <step>.parmdb | string | | Path of parmdb in which the computed parameters are to be stored. If the parmdb already exists, it will be overwritten. **Note**: You cannot use this parmdb in an applycal step in the same run of DPPP. To apply the solutions of the gaincal directly, use 'gaincal.applysolution' (see below).  **New in LOFAR 3.1:** if the parmdb name ends in ''.h5'' , an H5Parm will be written.| | 
| | <step>.blrange | vector | | Vector of baseline lengths to use for calibration. See [[#Description of baseline selection parameters]]. New in version 2.20 | | | <step>.blrange | vector | | Vector of baseline lengths to use for calibration. See [[#Description of baseline selection parameters]]. New in version 2.20 | | 
|  | | <step>.uvlambdamin | double | 0 | Ignore baselines / channels with UV < uvlambdamin wavelengths. **Note**: also all other variants of uv flagging described in [[#UVWFlagger]] (uvmmin, uvmrange, uvlambdarange, etc) are supported (New in 3.1)| | 
| | <step>.baseline | string | | Baseline selection filter for calibration. See [[#Description of baseline selection parameters]]. New in version 2.20 | | | <step>.baseline | string | | Baseline selection filter for calibration. See [[#Description of baseline selection parameters]]. New in version 2.20 | | 
| | <step>.applysolution | bool | false | Apply the calibration solution to the visibilities. Note that you should always also inspect the parmdb afterwards to check that the solutions look reasonable. | | | <step>.applysolution | bool | false | Apply the calibration solution to the visibilities. Note that you should always also inspect the parmdb afterwards to check that the solutions look reasonable. | | 
| | <step>.sources | | | Same as in **Predict** step | | | <step>.sources | | | Same as in **Predict** step | | 
| | <step>.usebeammodel | | | Same as in **Predict** step | | | <step>.usebeammodel | | | Same as in **Predict** step | | 
| | <step>.operation | | | Same as in **Predict** step | |  | 
| | <step>.applycal.* | | | ApplyCal sub-step, same as in **Predict** step | | | <step>.applycal.* | | | ApplyCal sub-step, same as in **Predict** step | | 
| | <step>.onebeamperpatch | | | Same as in **ApplyBeam** step | | | <step>.onebeamperpatch | | | Same as in **ApplyBeam** step | | 
|  |  | 
| ==== DDECal ==== | ==== DDECal ==== | 
| | <step>.type | string | | Case-insensitive step type; must be 'ddecal' | |  | 
| | <step>.sourcedb | string | | Sourcedb (created with `makesourcedb`) with the sky model to calibrate on | | |<step>.type|string| |Case-insensitive step type; must be 'ddecal'.| | 
| <step>.directions | list | [] | List of directions to calibrate on. Every element of this list should be a list of facets. Default: every facet is a direction | | |<step>.sourcedb|string| |Sourcedb (created with `makesourcedb`) with the sky model to calibrate on.| | 
| <step>.maxiter | int | 50 | Maximum number of iterations | | |<step>.directions|list|[]|List of directions to calibrate on. Every element of this list should be a list of facets. Default: every facet is a direction.| | 
| | <step>.stepsize | double | 0.5 | stepsize between iterations | | |<step>.usemodelcolumn|bool|false|Use model data from the measurement set. This implies solving for one direction, namely the pointing of the measurement set. If you specify usemodelcolumn to be true, directions and sourcedb are not required| | 
| | <step>.h5parm | string | | Filename of output H5Parm (to be read by e.g. losoto). If empty, defaults to ''instrument.h5'' within the measurement  set | | |<step>.maxiter|int|50|Maximum number of iterations.| | 
| | <step>.solint | int | 1 | Solution interval in timesteps | | |<step>.detectstalling|bool|true|Stop iterating when no improvement is measured anymore (after a minimum of 30 iterations).| | 
| | <step>.usebeammodel | bool | false | use the beam model. All beam-related options of the Predict step are also valid | | |<step>.stepsize|double|0.2|stepsize between iterations.| | 
| | <step>.mode | string | complexgain | Type of constraint to apply. Options are scalarcomplexgain, scalarphase, scalaramplitude, tec, tecandphase. Modes in development are fulljones, complexgain, phaseonly and amplitudeonly. | | |<step>.h5parm|string| |Filename of output H5Parm (to be read by e.g. losoto). If empty, defaults to ''instrument.h5'' within the measurement set.| | 
| | <step>.propagatesolutions | bool | false | Initialize solver with the solutions of the previous time slot | | |<step>.solint|int|1|Solution interval in timesteps.| | 
| | <step>.approximatetec | bool | false | Uses an approximation stage in which the phases are constrained with the piece-wise fitter, to solve local minima problems. Only effective when mode=tec or mode=tecandphase. | | |<step>.usebeammodel|bool|false|use the beam model. All beam-related options of the Predict step are also valid.| | 
| | <step>.approxchunksize | int | 0 | Size of fitted chunksize during approximation stage in nr of channels. With approxchunksize=1 the constraint is disabled during the approx stage (so channels are solved for independently). Once converged, the solutions are constrained and more iterations are performed until that has converged too. The default is approxchunksize=0, which calculates the chunksize from the bandwidth (resulting in 10 chunks per octave of bandwidth). | | |<step>.mode|string|diagonal|Type of constraint to apply. Options are scalarcomplexgain, scalarphase, scalaramplitude, tec, tecandphase. Modes in development are fulljones, diagonal, phaseonly, amplitudeonly, rotation, rotation+diagonal.| | 
| <step>.nchan | int | 1 | Number of channels in each channel block, for which the solution is assumed to be constant. The default is 1, meaning one solution per channel (or in the case of constraints, fitting the constraint over all channels individually). 0 means one solution for the whole channel range. If the total number of channels is not divisible by nchan, some channelblocks will become slightly larger. | | |<step>.tolerance|double|1e-5|Controls the accuracy to be reached: when the normalized solutions move less than this value, the solutions are considered to be converged and the algorithm finishes. Lower values will cause more iterations to be performed.| | 
|  | |<step>.minvisratio|double|0|Minimum fraction of unflagged visibilities within a solution interval, e.g. 0.6 for at least 60% unflagged visibilities. Intervals with a lower fraction will be flagged.| | 
|  | |<step>.propagatesolutions|bool|false|Initialize solver with the solutions of the previous time slot.| | 
|  | |<step>.propagateconvergedonly|bool|false|Propagate solutions of the previous time slot only if the solve converged. Only effective when propagatesolutions=true.| | 
|  | |<step>.flagunconverged|bool|false|Flag unconverged solutions (i.e., those from solves that did not converge within maxiter iterations).| | 
|  | |<step>.flagdivergedonly|bool|false|Flag only the unconverged solutions for which divergence was detected. At the moment, this option is effective only for rotation+diagonal solves, where divergence is detected when the amplitudes of any station are found to be more than a factor of 5 from the mean amplitude over all stations. If divergence for any one station is detected, all stations are flagged for that solution interval. Only effective when flagunconverged=true and mode=rotation+diagonal.| | 
|  | |<step>.approximatetec|bool|false|Uses an approximation stage in which the phases are constrained with the piece-wise fitter, to solve local minima problems. Only effective when mode=tec or mode=tecandphase.| | 
|  | |<step>.maxapproxiter|int|maxiter/2|Maximum number of iterations during approximating stage.| | 
|  | |<step>.approxchunksize|int|0|Size of fitted chunksize during approximation stage in nr of channels. With approxchunksize=1 the constraint is disabled during the approx stage (so channels are solved for independently). Once converged, the solutions are constrained and more iterations are performed until that has converged too. The default is approxchunksize=0, which calculates the chunksize from the bandwidth (resulting in 10 chunks per octave of bandwidth).| | 
|  | |<step>.approxtolerance|double|tolerance*10|Tolerance at which the approximating first stage is considered to be converged and the second full-constraining stage is started. The second stage converges when the tolerance set by the 'tolerance' keyword is reached. Setting approxtolerance to lower values will cause more approximating iterations. Since tolerance is by default 1e-5, approxtolerance is by default 1e-4.| | 
|  | |<step>.nchan|int|1|Number of channels in each channel block, for which the solution is assumed to be constant. The default is 1, meaning one solution per channel (or in the case of constraints, fitting the constraint over all channels individually). 0 means one solution for the whole channel range. If the total number of channels is not divisible by nchan, some channelblocks will become slightly larger.| | 
|  | |<step>.coreconstraint|double|0|Distance in meters. When unequal to 0, all stations within the given distance from the reference station (0) will be constrained to have the same solution.| | 
|  | |<step>.antennaconstraint|list|[]|A list of lists specifying groups of antennas that are to be constrained to have the same solution. Example: "[ [CS002HBA0,CS002HBA1],[CS003HBA0,CS003HBA1] ]" will keep the solutions of CS002HBA0 and 1 the same, and the same for CS003.| | 
|  | |<step>.smoothnessconstraint|double|0|Kernel size in Hz. When unequal to 0, will constrain the solutions to be smooth over frequency by convolving the solutions with a kernel of the given size (bandwidth). The default kernel is a Gaussian kernel, and the kernel size parameter is the 3 sigma point where the kernel is cut off.| | 
|  | |<step>.statfilename|string| |File to write the step-sizes to. Form of the file is: "<iterationnr> <normalized-stepsize> <unnormalized-stepsize>", and all solution intervals are concatenated. File is not written when this parameter is empty.| | 
|  | |<step>.uvlambdamin|double|0|Ignore baselines / channels with UV < uvlambdamin wavelengths. **Note**: all other variants of uv flagging described in [[#uvwflagger|UVWFlagger]] (uvmmin, uvmrange, uvlambdarange, etc.) are also supported (new in 3.1).| | 
|  | |<step>.subtract|bool|false|Subtracts the corrected model from the data. **NOTE** This may not work when you apply a uv-cut.| | 
|  | |<step>.useidg|bool|false|Do image-based prediction using IDG.| | 
|  | |<step>.idg.images|list|[]|Filenames of the ''.fits'' model images, one per frequency term. The terms are defined as for a polynomial source spectrum (not logarithmic), e.g. see [[https://sourceforge.net/p/wsclean/wiki/ComponentList/|this WSClean page]]. The frequency in the metadata of the fits files is used as nu<sub>0</sub> in the polynomial evaluation.| | 
|  | |<step>.idg.regions|string|""|DS9 regions file describing the facets for IDG prediction.| | 
|  | |<step>.idg.buffersize|int|Based on memory|Number of timesteps to use for each IDG buffer.| | 
|  | |<step>.savefacets|bool|false|Write out each facet as a fits file (named facet<N>.fits). Only useful when useidg=true.| | 
|  | |<step>.onlypredict|bool|false|Output the predicted visibilities instead of solving. This is useful for testing; when doing faceted prediction with IDG, it might also be fast for certain cases.| | 
|  | |<step>.applycal.*| | |ApplyCal sub-step, same as in Predict step. One can pass an h5parm with as many directions as set in "directions" and each direction model is corrupted accordingly.| | 
|  |  | 
|  | \\ | 
|  |  | 
| ==== Predict ==== | ==== Predict ==== | 
| | <step>.type | string | | Case-insensitive step type; must be 'predict' | | | <step>.type | string | | Case-insensitive step type; must be 'predict' | | 
| | <step>.type | string | | Case-insensitive step type; must be 'h5parmpredict' | | | <step>.type | string | | Case-insensitive step type; must be 'h5parmpredict' | | 
| | <step>.sourcedb | string | | Path of sourcedb in which a sky model is stored (the output of makesourcedb)| | | <step>.sourcedb | string | | Path of sourcedb in which a sky model is stored (the output of makesourcedb)| | 
| | <step>.parmdb | string | | Path of the h5parm in which the corruptions are stored | | | <step>.applycal.parmdb | string | | Path of the h5parm in which the corruptions are stored | | 
| | <step>.applycal.correction | string | | SolTab which contains the directions to be predicted. The names of the directions need to look like ''[dir1,dir2]'', where ''dir1'' and ''dir2'' are patches in the sourcedb. | | | <step>.applycal.correction | string | | SolTab which contains the directions to be predicted, or "fulljones".| | 
| | <step>.directions | string vector | [] | List of directions to include. Each of those directions needs to be in the h5parm soltab. If empty, all directions in the soltab are predicted. || | | <step>.directions | string vector | [] | List of directions to include. Each of those directions needs to be in the h5parm soltab. If empty (the default), the full list of directions is taken from the H5Parm and all of them are predicted. The names of the directions need to look like ''[dir1,dir2]'', where ''dir1'' and ''dir2'' are patches in the sourcedb; this is the convention DDECal uses for naming directions in the H5Parm. This parameter can be used to predict / subtract a subset of the directions.|| | 
| | <step>.usebeammodel | bool | false | Use the LOFAR beam in the predict part of the calibration | | | <step>.usebeammodel | bool | false | Use the LOFAR beam in the predict part of the calibration | | 
| | <step>.operation | string | replace | Should the predicted visibilities replace those being processed (''replace'', default), should they be subtracted from those being processed (''subtract'') or added to them (''add'') | | | <step>.operation | string | replace | Should the predicted visibilities replace those being processed (''replace'', default), should they be subtracted from those being processed (''subtract'') or added to them (''add'') | | 
| ==== ApplyBeam ==== | ==== ApplyBeam ==== | 
| | <step>.type | string | | Case-insensitive step type; must be 'applybeam' | | | <step>.type | string | | Case-insensitive step type; must be 'applybeam' | | 
| | <step>.onebeamperpatch | bool | true | Compute the beam only for the center of each patch (saves computation time, but you should set this to false for large patches). This option is only useful if the beam is applied as part of a [[#predict]] step. | | | <step>.direction | string vector | [] | An RA/Dec value specifying in what direction to correct the beam. See phaseshift.phasecenter for syntax. If empty, the beam is corrected in the direction of the current phase center. | | 
|  | | <step>.onebeamperpatch | bool | false | Compute the beam only for the center of each patch (saves computation time, but you should set this to false for large patches). In the ApplyBeam step, this setting does not make sense (but it does if the applybeam is part of predict, ddecal, gaincal, h5parmpredict, etc.). Generally, FALSE is the right setting for this option. The default has changed to false in a recent (Nov 2018) version. | | 
| | <step>.usechannelfreq | bool | **true** | Compute the beam for each channel of the measurement set separately. This is useful for merged / concatenated measurement sets. For raw LOFAR data you should set it to false, so that the beam will be formed as in the station hardware. Also, setting it to false is faster. | | | <step>.usechannelfreq | bool | **true** | Compute the beam for each channel of the measurement set separately. This is useful for merged / concatenated measurement sets. For raw LOFAR data you should set it to false, so that the beam will be formed as in the station hardware. Also, setting it to false is faster. | | 
| | <step>.updateweights | bool | false | Update the weights column, in a way consistent with the weights being inversely proportional to the autocorrelations (e.g. if 'autoweights' was used before). | | | <step>.updateweights | bool | false | Update the weights column, in a way consistent with the weights being inversely proportional to the autocorrelations (e.g. if 'autoweights' was used before). | | 
| | <step>.invert | bool | **true** | Invert the beam. When applying the beam to transfer calibration solutions, this should be true. In other words: ''invert=true'' means correcting for the beam, ''invert=false'' means corrupting with the beam. When using the beam in a predict (or gaincal) step, this option defaults to ''false'' (so it will corrupt for the beam). | | | <step>.invert | bool | **true** | Invert the beam. When applying the beam to transfer calibration solutions, this should be true. In other words: ''invert=true'' means correcting for the beam, ''invert=false'' means corrupting with the beam. When using the beam in a predict (or gaincal) step, this option defaults to ''false'' (so it will corrupt for the beam). | | 
| | <step>.beammode | string | "default" | Beam mode to apply, can be "array_factor", "element" or "default". Default is to apply both the element beam and the array factor. | | | <step>.beammode | string | "default" | Beam mode to apply, can be "array_factor", "element" or "default". Default is to apply both the element beam and the array factor. | | 
|  |  | 
|  | ==== SetBeam ==== | 
|  | SetBeam is an expert option and should only be used in rare cases. It allows direct manipulation of the beam keywords for a column in a measurement set. Normally, DP3 registers whether the visibilities in a column are corrected for a beam or not, and if so, in what direction the beam was corrected for. This avoids incorrect corrections / scaling by the beam. However, certain actions can change the scaling of the visibilities without the beam keywords being changed, in particular predicting (either with DP3 or with another tool). When predicting a single source and not applying the beam, the visibilities are 'corrected' for the beam in the direction of the source. Under those circumstances, SetBeam can be used to modify the beam keywords. In that case, set ''direction'' to the source direction and ''beammode'' to default. | 
|  | | <step>.type | string | | Case-insensitive step type; must be 'setbeam' | | 
|  | | <step>.direction | string vector | [] | An RA/Dec value specifying in what direction the beam is corrected. | | 
|  | | <step>.beammode | string | "default" | Beam mode to apply, can be "array_factor", "element" or "default". Default means that sources in the given direction have corrected (intrinsic) flux values, i.e. they are corrected for the full beam. | | 
|  |  | 
| ==== UVWFlagger ==== | ==== UVWFlagger ==== | 
| | <step>.type | string | | Case-insensitive step type; must be 'uvwflagger' or 'uvwflag'. | |  | 
| | <step>.count.save | bool | false | If true, the flag percentages per frequency are saved to a table with extension ''.flagfreq'' and percentages per station to a table with extension ''.flagstat''. The basename of the table is the MS name (without extension) followed by the stepname and extension. | | |<step>.type|string| |Case-insensitive step type; must be 'uvwflagger' or 'uvwflag'.| | 
| | <step>.count.path | string | "" | The directory where to create the flag percentages table. If empty, the path of the input MS is used. | | |<step>.count.save|bool|false|If true, the flag percentages per frequency are saved to a table with extension ''.flagfreq'' and percentages per station to a table with extension ''.flagstat''. The basename of the table is the MS name (without extension) followed by the stepname and extension.| | 
| | <step>.uvmrange | string vector | [] | Flag baselines with UV within one of the given ranges (in meters). Delimiters .. and +- can be used to specify a range. E.g., ''uvmrange = [20..30, 40+-5]'' flags baselines with UV in range 20-30 meter and 35-45 meter. | | |<step>.count.path|string|""|The directory where to create the flag percentages table. If empty, the path of the input MS is used.| | 
| | <step>.uvmmin | double | 0 | Flag baselines with UV < uvmmin meter. | | |<step>.uvmrange|string vector|[]|Flag baselines with UV within one of the given ranges (in meters). Delimiters .. and +- can be used to specify a range. E.g., ''uvmrange = [20..30, 40+-5]'' flags baselines with UV in range 20-30 meter and 35-45 meter.| | 
| | <step>.uvmmax | double | 1e15 | Flag baselines with UV > uvmmax meter. | | |<step>.uvmmin|double|0|Flag baselines with UV < uvmmin meter.| | 
| | <step>.umrange | string vector | [] | Flag baselines with U within one of the given ranges (in meters). | | |<step>.uvmmax|double|1e15|Flag baselines with UV > uvmmax meter.| | 
| | <step>.ummin | double | 0 | Flag baselines with U < ummin meter. | | |<step>.umrange|string vector|[]|Flag baselines with U within one of the given ranges (in meters).| | 
| | <step>.ummax | double | 1e15 | Flag baselines with U > ummax meter. | | |<step>.ummin|double|0|Flag baselines with U < ummin meter.| | 
| | <step>.vmrange | string vector | [] | Flag baselines with V within one of the given ranges (in meters). | | |<step>.ummax|double|1e15|Flag baselines with U > ummax meter.| | 
| | <step>.vmmin | double | 0 | Flag baselines with V < vmmin meter. | | |<step>.vmrange|string vector|[]|Flag baselines with V within one of the given ranges (in meters).| | 
| | <step>.vmmax | double | 1e15 | Flag baselines with V > vmmax meter. | | |<step>.vmmin|double|0|Flag baselines with V < vmmin meter.| | 
| | <step>.wmrange | string vector | [] | Flag baselines with W within one of the given ranges (in meters). | | |<step>.vmmax|double|1e15|Flag baselines with V > vmmax meter.| | 
| | <step>.wmmin | double | 0 | Flag baselines with W < wmmin meter. | | |<step>.wmrange|string vector|[]|Flag baselines with W within one of the given ranges (in meters).| | 
| | <step>.wmmax | double | 1e15 | Flag baselines with W > wmmax meter. | | |<step>.wmmin|double|0|Flag baselines with W < wmmin meter.| | 
| | <step>.uvlambdarange | string vector | [] | Flag baselines/channels with UV within one of the given ranges (in wavelengths). Delimiters .. and +- can be used to specify a range. E.g., ''uvlambdarange = [20..30, 40+-5]'' flags baselines/channels with UV in range 20-30 wavelengths and 35-45 wavelengths. | | |<step>.wmmax|double|1e15|Flag baselines with W > wmmax meter.| | 
| | <step>.uvlambdamin | double | 0 | Flag baselines/channels with UV < uvlambdamin wavelengths | | |<step>.uvlambdarange|string vector|[]|Flag baselines/channels with UV within one of the given ranges (in wavelengths). Delimiters .. and +- can be used to specify a range. E.g., ''uvlambdarange = [20..30, 40+-5]'' flags baselines/channels with UV in range 20-30 wavelengths and 35-45 wavelengths.| | 
| | <step>.uvlambdamax | double | 1e15 | Flag baselines/channels with UV > uvlambdamax wavelengths | | |<step>.uvlambdamin|double|0|Flag baselines/channels with UV < uvlambdamin wavelengths| | 
| | <step>.ulambdarange | string vector | [] | Flag baselines/channels with U within one of the given ranges (in wavelengths). | | |<step>.uvlambdamax|double|1e15|Flag baselines/channels with UV > uvlambdamax wavelengths| | 
| | <step>.ulambdamin | double | 0 | Flag baselines/channels with U < ulambdamin wavelengths | | |<step>.ulambdarange|string vector|[]|Flag baselines/channels with U within one of the given ranges (in wavelengths).| | 
| | <step>.ulambdamax | double | 1e15 | Flag baselines/channels with U > ulambdamax wavelengths | | |<step>.ulambdamin|double|0|Flag baselines/channels with U < ulambdamin wavelengths| | 
| | <step>.vlambdarange | string vector | [] | Flag baselines/channels with V within one of the given ranges (in wavelengths). | | |<step>.ulambdamax|double|1e15|Flag baselines/channels with U > ulambdamax wavelengths| | 
| | <step>.vlambdamin | double | 0 | Flag baselines/channels with V < vlambdamin wavelengths | | |<step>.vlambdarange|string vector|[]|Flag baselines/channels with V within one of the given ranges (in wavelengths).| | 
| | <step>.vlambdamax | double | 1e15 | Flag baselines/channels with V > vlambdamax wavelengths | | |<step>.vlambdamin|double|0|Flag baselines/channels with V < vlambdamin wavelengths| | 
| | <step>.wlambdarange | string vector | [] | Flag baselines/channels with W within one of the given ranges (in wavelengths). | | |<step>.vlambdamax|double|1e15|Flag baselines/channels with V > vlambdamax wavelengths| | 
| | <step>.wlambdamin | double | 0 | Flag baselines/channels with W < wlambdamin wavelengths | | |<step>.wlambdarange|string vector|[]|Flag baselines/channels with W within one of the given ranges (in wavelengths).| | 
| | <step>.wlambdamax | double | 1e15 | Flag baselines/channels with W > wlambdamax wavelengths | | |<step>.wlambdamin|double|0|Flag baselines/channels with W < wlambdamin wavelengths| | 
| | <step>.phasecenter | string vector | [] | If given, use this phase center to calculate the UVW coordinates to flag on. The vector can consist of 1, 2, or 3 values. If one value is given, it must be the name of a moving source (e.g. SUN or JUPITER). Otherwise the first two values must contain a source position that can be given in sexagesimal format or as a value followed by a unit. The third value can contain the direction type; it defaults to J2000. Possible types are GALACTIC, ECLIPTIC, SUPERGAL, J2000, B1950 (as defined in the casacore ''Measures'' system). | | |<step>.wlambdamax|double|1e15|Flag baselines/channels with W > wlambdamax wavelengths| | 
|  | |<step>.phasecenter|string vector|[]|If given, use this phase center to calculate the UVW coordinates to flag on. The vector can consist of 1, 2, or 3 values. If one value is given, it must be the name of a moving source (e.g. SUN or JUPITER). Otherwise the first two values must contain a source position that can be given in sexagesimal format or as a value followed by a unit. The third value can contain the direction type; it defaults to J2000. Possible types are GALACTIC, ECLIPTIC, SUPERGAL, J2000, B1950 (as defined in the casacore ''Measures'' system).| | 
|  |  | 
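The range syntax used by the ''*range'' keys above (''..'' for an explicit interval, ''+-'' for a center and half-width) can be illustrated with a small Python sketch. This is not DPPP's own parser: the function names are hypothetical, and treating the interval bounds as inclusive is an assumption made for illustration.

```python
from typing import List, Tuple


def parse_range(spec: str) -> Tuple[float, float]:
    """Parse one range such as '20..30' or '40+-5' into (low, high)."""
    spec = spec.strip()
    if ".." in spec:
        low, high = spec.split("..", 1)
        return float(low), float(high)
    if "+-" in spec:
        center, half_width = spec.split("+-", 1)
        return float(center) - float(half_width), float(center) + float(half_width)
    raise ValueError("unrecognised range: %r" % spec)


def parse_ranges(value: str) -> List[Tuple[float, float]]:
    """Parse a vector such as '[20..30, 40+-5]' into a list of (low, high)."""
    inner = value.strip().strip("[]")
    return [parse_range(part) for part in inner.split(",") if part.strip()]


def in_any_range(x: float, ranges: List[Tuple[float, float]]) -> bool:
    """True if x lies in one of the ranges (inclusive bounds are assumed)."""
    return any(low <= x <= high for low, high in ranges)


ranges = parse_ranges("[20..30, 40+-5]")
print(ranges)
print(in_any_range(25.0, ranges), in_any_range(32.0, ranges))
```

With the example from the table, ''[20..30, 40+-5]'' yields the intervals 20-30 and 35-45, so a baseline with UV = 25 would be flagged and one with UV = 32 would not.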
|  | ==== Split ==== | 
|  |  | 
|  | |<step>.type|string| |Case-insensitive step type; must be 'split' or 'explode'| | 
|  | |<step>.steps|string vector|[]|List of next steps; each step will run after this step. E.g. ''[average, msout]'' | | 
|  | |<step>.replaceparms|string vector|[]|The substep keys that should be different for each of the next steps. Instead of a single value of their usual type, each of these keys must now be given as a list of such values. E.g. ''[average.timestep, msout.name]'' | | 
|  |  | 
|  | \\ | 
|  |  | 
|  |  | 
|  | ==== Interpolate ==== | 
|  | The interpolate step replaces flagged values by interpolating them using "neighbouring" samples (samples close in time and frequency). It calculates the Gaussian-weighted sum over non-flagged samples, with a sigma parameter of one timestep/one channel. The flags are removed after interpolation. This is particularly useful in combination with averaging; by replacing flagged values before averaging, the output visibilities will more accurately represent the true sky. This step was designed to remove the frequency structure introduced by flagging/averaging for the EoR experiment, but it might be useful in other cases as a more accurate averaging step. Details are published in [[https://arxiv.org/abs/1901.04752|Offringa, Mertens and Koopmans (2018)]]. | 
|  | | <step>.type | string | | Case-insensitive step type; must be 'interpolate'. | | 
|  | | <step>.windowsize | int | 15 | Size of the window over which a value is interpolated. Should be odd. | | 
|  |  | 
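The Gaussian-weighted replacement described for the interpolate step can be sketched in plain Python as follows. This is only an illustration of the idea, not the DPPP implementation: the function name, the use of the same window size along both time and frequency, and leaving a sample untouched when its window contains no unflagged neighbours are all assumptions.

```python
import math


def interpolate_flagged(data, flags, window=15, sigma=1.0):
    """Replace flagged samples by a Gaussian-weighted mean of unflagged neighbours.

    data  -- 2D list indexed as data[time][channel]
    flags -- 2D list of bools with the same shape; True means flagged
    Returns a new 2D list; the input is left unmodified.
    """
    half = window // 2
    n_time, n_chan = len(data), len(data[0])
    out = [row[:] for row in data]
    for t in range(n_time):
        for c in range(n_chan):
            if not flags[t][c]:
                continue
            weight_sum = 0.0
            value_sum = 0.0
            for tt in range(max(0, t - half), min(n_time, t + half + 1)):
                for cc in range(max(0, c - half), min(n_chan, c + half + 1)):
                    if flags[tt][cc]:
                        continue  # only unflagged samples contribute
                    dist2 = (tt - t) ** 2 + (cc - c) ** 2
                    w = math.exp(-dist2 / (2.0 * sigma ** 2))
                    weight_sum += w
                    value_sum += w * data[tt][cc]
            if weight_sum > 0.0:
                out[t][c] = value_sum / weight_sum
    return out


# A 3x3 block of constant visibilities with one flagged outlier in the middle.
data = [[2.0, 2.0, 2.0], [2.0, 100.0, 2.0], [2.0, 2.0, 2.0]]
flags = [[False, False, False], [False, True, False], [False, False, False]]
print(interpolate_flagged(data, flags, window=3)[1][1])
```

In this toy case the flagged outlier is replaced by the value of its unflagged neighbours, which is what makes a subsequent averaging step less sensitive to the flagging pattern.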
| ==== Description of baseline selection parameters ==== | ==== Description of baseline selection parameters ==== |