Differences

This shows you the differences between two versions of the page.

--- public:documents:raw_olap_data_formats [2011-11-02 09:50] – Jan David Mol
+++ public:documents:raw_olap_data_formats [2017-03-08 15:27] (current) – external edit 127.0.0.1
@@ Line 1: / Line 1: @@
-===== Raw OLAP data formats ====
+===== Raw OLAP data formats (obsolete) ====
-OLAP produces several data formats, which are intended to be replaced by their final format, such as HDF5. The formats below are not officially supported and subject to change without notice.
+OLAP produces several data formats, which are intended to be replaced by a final format such as HDF5.
+===== After 2011-10-24 =====
+Files adhere to the following naming scheme: ''Liiiii_SAPsssss_Bbbb_Sz_bf.raw'', with:
+  - ''iiiii'' = SAS observation ID
+  - ''sssss'' = Station beam number (SAP)
+  - ''bbb'' = Tied-array beam number (TAB)
+  - ''z'' = Stokes number
+The Stokes numbers are to be interpreted as follows:
+  - Complex Voltages:
+     - z = 0 -> Xr (X polarisation, real part)
+     - z = 1 -> Xi (X polarisation, imaginary part)
+     - z = 2 -> Yr (Y polarisation, real part)
+     - z = 3 -> Yi (Y polarisation, imaginary part)
+  - Coherent/incoherent Stokes:
+     - z = 0 -> I
+     - z = 1 -> Q
+     - z = 2 -> U
+     - z = 3 -> V
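The naming scheme and Stokes numbering above can be sketched as a small parser. This is a minimal illustration, not part of OLAP: the function name `parse_raw_name`, the returned dictionary keys, and the `complex_voltages` flag are hypothetical, and the numeric fields are matched as "one or more digits" rather than with fixed widths.

```python
import re

# Naming scheme: Liiiii_SAPsssss_Bbbb_Sz_bf.raw
# Assumption: field widths are not enforced; each field is one or more digits,
# except the Stokes number, which is a single digit.
RAW_NAME = re.compile(r"^L(\d+)_SAP(\d+)_B(\d+)_S(\d)_bf\.raw$")

COMPLEX_VOLTAGE_NAMES = ("Xr", "Xi", "Yr", "Yi")
STOKES_NAMES = ("I", "Q", "U", "V")


def parse_raw_name(filename, complex_voltages=False):
    """Split a .raw file name into its numeric fields (hypothetical helper)."""
    match = RAW_NAME.match(filename)
    if match is None:
        raise ValueError("not a Liiiii_SAPsssss_Bbbb_Sz_bf.raw name: %r" % filename)
    obs_id, sap, tab, z = (int(group) for group in match.groups())
    if z > 3:
        raise ValueError("Stokes number must be 0..3, got %d" % z)
    # The meaning of z depends on whether the file holds complex voltages.
    names = COMPLEX_VOLTAGE_NAMES if complex_voltages else STOKES_NAMES
    return {"obs_id": obs_id, "sap": sap, "tab": tab, "stokes": z, "component": names[z]}
```

The file name alone does not say whether the observation recorded complex voltages or Stokes data, which is why the caller has to pass that in.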
+The data is encoded as follows. Each .raw file consists of a whole number of repetitions of the following structure. All data is written as big-endian 32-bit IEEE floats.
+<code>
+struct block {
+  float sample[SUBBANDS][CHANNELS];
+};
+</code>
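As a rough sketch of how such a file could be read, the generator below unpacks one `block` at a time using only the Python standard library. The name `read_blocks` is hypothetical, and `subbands` and `channels` are assumed to have been obtained from the parset beforehand.

```python
import struct


def read_blocks(stream, subbands, channels):
    """Yield each block as a list of `subbands` rows holding `channels` floats."""
    values_per_block = subbands * channels
    block_bytes = 4 * values_per_block      # 32-bit floats
    fmt = ">%df" % values_per_block         # ">" selects big-endian
    while True:
        raw = stream.read(block_bytes)
        if not raw:
            return                          # clean end of file
        if len(raw) != block_bytes:
            raise ValueError("file size is not a multiple of the block size")
        flat = struct.unpack(fmt, raw)
        # Row-major layout, as in float sample[SUBBANDS][CHANNELS].
        yield [list(flat[s * channels:(s + 1) * channels]) for s in range(subbands)]
```

A file would be opened in binary mode, for example `open(name, "rb")`, and passed in as `stream`.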
+The constants used can be derived from the parset:
+<code>
+  SUBBANDS = len(parset["Observation.subbandList"])
+  if complex_voltages or coherent_stokes:
+    CHANNELS = parset["OLAP.CNProc_CoherentStokes.channelsPerSubband"]
+  elif incoherent_stokes:
+    CHANNELS = parset["OLAP.CNProc_IncoherentStokes.channelsPerSubband"]
+  # 0 falls back to the observation-wide setting
+  if CHANNELS == 0:
+    CHANNELS = parset["Observation.channelsPerSubband"]
+</code>
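The derivation of the constants can be written out as a small function. This sketch assumes the parset has already been loaded into a dictionary whose `Observation.subbandList` entry is a list; the function name `block_dimensions` and the `mode` labels are hypothetical.

```python
def block_dimensions(parset, mode):
    """Return (SUBBANDS, CHANNELS) for a parset that was parsed into a dict.

    `mode` is one of "complex_voltages", "coherent_stokes" or
    "incoherent_stokes" (labels chosen for this example only).
    """
    subbands = len(parset["Observation.subbandList"])
    if mode in ("complex_voltages", "coherent_stokes"):
        key = "OLAP.CNProc_CoherentStokes.channelsPerSubband"
    elif mode == "incoherent_stokes":
        key = "OLAP.CNProc_IncoherentStokes.channelsPerSubband"
    else:
        raise ValueError("unknown mode: %r" % mode)
    channels = int(parset[key])
    if channels == 0:
        # 0 falls back to the observation-wide number of channels.
        channels = int(parset["Observation.channelsPerSubband"])
    return subbands, channels
```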
+The sampling rate can be derived as follows:
+<code>
+  # clock frequency (e.g. 200 MHz)
+  clock_hz = parset["Observation.sampleClock"] * 1.0e6
+  # subband frequency (e.g. 195 kHz)
+  base_subband_hz = clock_hz / 1024
+  # channel frequency (e.g. 763 Hz)
+  base_nrchannels = parset["Observation.channelsPerSubband"]
+  base_channel_hz = base_subband_hz / base_nrchannels
+  if complex_voltages or coherent_stokes:
+    cs_temporalintegration = parset["OLAP.CNProc_CoherentStokes.timeIntegrationFactor"]
+    sample_hz = base_channel_hz / cs_temporalintegration
+  elif incoherent_stokes:
+    is_temporalintegration = parset["OLAP.CNProc_IncoherentStokes.timeIntegrationFactor"]
+    sample_hz = base_channel_hz / is_temporalintegration
+</code>
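The sampling rate derivation can likewise be sketched as a function with a worked example. As before, the parset is assumed to be a dictionary, and `sample_rate_hz` and the `mode` labels are hypothetical names used only for illustration.

```python
def sample_rate_hz(parset, mode):
    """Return the sampling rate in Hz for a parset that was parsed into a dict."""
    clock_hz = float(parset["Observation.sampleClock"]) * 1.0e6
    base_subband_hz = clock_hz / 1024
    base_channel_hz = base_subband_hz / int(parset["Observation.channelsPerSubband"])
    if mode in ("complex_voltages", "coherent_stokes"):
        key = "OLAP.CNProc_CoherentStokes.timeIntegrationFactor"
    elif mode == "incoherent_stokes":
        key = "OLAP.CNProc_IncoherentStokes.timeIntegrationFactor"
    else:
        raise ValueError("unknown mode: %r" % mode)
    # Temporal integration reduces the rate by the integration factor.
    return base_channel_hz / int(parset[key])
```

For a 200 MHz clock and 256 channels per subband, the channel rate is 200e6 / 1024 / 256, about 763 Hz; an integration factor of 4 then gives roughly 191 samples per second.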
+===== Before 2011-10-24 =====
 Data can be recorded as either complex voltages (yielding X and Y polarisations) or one or more Stokes parameters. In either case, a sequence of blocks will be stored, each of which consists of a header and data. The header is defined as:
@@ Line 9: / Line 86: @@
   char padding[508];
 };
-/*
-// Proposed: no header. Missing data is replaced by zeros.
-struct header {
-};
-*/
 </code>
 in which sequence_number starts at 0, and is increased by 1 for every block. Missing sequence numbers imply missing data. The padding can have any value and is to be ignored.
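The sequence-number rule can be used to locate gaps. The sketch below works on sequence numbers that have already been extracted from the block headers, in file order; it does not parse the header itself, and `missing_sequence_numbers` is a hypothetical helper name.

```python
def missing_sequence_numbers(seen):
    """Return the sequence numbers absent from `seen`, which must be increasing."""
    missing = []
    expected = 0                    # sequence numbers start at 0
    for seq in seen:
        if seq < expected:
            raise ValueError("sequence numbers must be strictly increasing")
        # Every number skipped between two stored blocks marks missing data.
        missing.extend(range(expected, seq))
        expected = seq + 1
    return missing
```

Blocks lost after the last stored block cannot be detected this way, since nothing in the file marks the intended end of the observation.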
@@ Line 49: / Line 120: @@
   // 2010-06-29 release and earlier stored data per subband instead of per beam:
   fcomplex voltages[BEAMS][CHANNELS][SAMPLES|2][POLARIZATIONS];
-  */
-  /*
-  // Proposed:
-  float voltages[SAMPLES][SUBBANDS][CHANNELS];
-  // Note that because the header will also be empty, the file is essentially
-  // a seamless stream of
-  //   float voltages[SUBBANDS][CHANNELS];
-  // If the subbands are chosen seamless as well, the data reduces to a stream of
-  //   float voltages[CHANNELS];
-  // with simply a larger number of channels.
   */
 };
@@ Line 101: / Line 161: @@
   fcomplex voltages[BEAMS][CHANNELS][SAMPLES|2][STOKES];
   */
-  /*
-  // Proposed:
-  float stokes[SAMPLES][SUBBANDS][CHANNELS];
-  // Note that this format is exactly the same as for complex voltages
-  */
 };
 </code>
@@ Line 120: / Line 174: @@
 |Lxxxxx_SByyy_bf.incoherentstokes|Stokes of subband yyy of observation xxxxx|
-Proposed is the following scheme:
-|Lxxxxx_Byyy_S0_bf.incoherentstokes|Stokes I of incoherent beam yyy of observation xxxxx|
-|Lxxxxx_Byyy_S1_bf.incoherentstokes|Stokes Q of incoherent beam yyy of observation xxxxx|
-|Lxxxxx_Byyy_S2_bf.incoherentstokes|Stokes U of incoherent beam yyy of observation xxxxx|
-|Lxxxxx_Byyy_S3_bf.incoherentstokes|Stokes V of incoherent beam yyy of observation xxxxx|
-Multiple incoherent beams can be formed if it is coherently dedispersed using multiple DMs.
 Each file is a sequence of blocks of the following structure:
@@ Line 150: / Line 195: @@
   // 2010-06-29 release and earlier:
   float stokes[CHANNELS][SAMPLES|2][STOKES];
-  */
-  /*
-  // Proposed:
-  float stokes[SAMPLES][SUBBANDS][CHANNELS];
-  // Note that this requires a transpose, making the
-  // format equal to coherent stokes
   */
 };
@@ Line 175: / Line 213: @@
 A BFRaw file starts with a file header containing the configuration:
-<code>
+<code C>
 struct file_header
 {
@@ Line 205: / Line 243: @@
 After the file header, there is a series of blocks until the end of file, configured using values from the file header:
-<code>
+<code C>
 struct block
   // 0x2913D852
@@ Line 242: / Line 280: @@
 To convert a TimeStamp-compatible int64_t to a C-readable timestamp, use
-<code>
+<code C>
 /* clockspeed is in Hz */
 int64 nanoseconds = (int64) (timestamp * 1024 * 1e9 / clockspeed);