~~NOTOC~~

====== User Software :: CR-Tools :: DataReader ======

  * [[#Overview]]
  * [[#Fields in the Header Record]]
  * [[#Filestream positions]]
  * [[#Data flow]]
  * [[#Performance]]
  * [[#Development]]
  * [[#Usage with C/C++ code]]

==== Overview ====

The ''DataReader'' class implements a processing framework that is applied to the data before they enter further processing.

=== Fields in the Header Record ===

This is the list of fields in the header record (as accessed with ''dr.header()'') of a DataReader object; mandatory fields are marked in the M* column. The mandatory fields have to be set by all child classes of the DataReader in order to be usable by the (upcoming) standard tools.

^ Field Name ^ M* ^ Data Type ^ Description ^
| Date | {{wiki:check_tick.gif}} | uInt | Date of the observation. Standard Unix time, i.e. seconds (GMT) since 1.1.1970. |
| AntennaIDs | {{wiki:check_tick.gif}} | Vector | The IDs of the antennas, i.e. a unique number for each channel. |
| Observatory | {{wiki:check_tick.gif}} | String | Name of the observatory, e.g. LOFAR, LOPES, LORUN, ITS, etc. |
| Filesize | {{wiki:check_tick.gif}} | Int | Size (number of samples) of the file(s). |
| dDate | {{wiki:check_tick.gif}} | Double | Like "Date" but with sub-second precision. Either the time when the first sample was taken, or the time when the sample with ''delay''==0 was taken. |
| presync | {{wiki:check.gif}} | Int | Number of samples taken before the trigger (currently LOPES and LORUN only). |
| TL | {{wiki:check.gif}} | Int | LOPES/KASCADE style time label (number of 5 MHz ticks inside the second). |
| LTL | {{wiki:check.gif}} | Int | LOPES time label (number of 40 MHz ticks inside the second). |
| EventClass | {{wiki:check.gif}} | Int | Class of the event (1=cosmic ray event, 2=simulation event, 3=test, 4=solar). |
| SampleFreq | {{wiki:check.gif}} | uchar | Sample frequency in MHz. |
| StartSample | {{wiki:check.gif}} | uInt | Number of the first sample in this second. |

(* Mandatory)

The field names are case sensitive and should be put into the record exactly as they are written here.

=== Filestream positions ===

The [[lopesdoxy>classDataReader.html|DataReader]] handles progression through the data volume via a set of [[DataIterator]] objects, providing //N// positions for //N// data streams. These stream and position pointers allow a variety of access schemes:

  * Access to the same segment in multiple streams, e.g. when reading raw data recorded with the //LOFAR ITS//. {{DataReader stream navigation 01.png?300|DataReader stream navigation, example 1}}
  * Access to different segments in multiple streams. {{DataReader stream navigation 02.png?300|DataReader stream navigation, example 2}}
  * Access to different segments within a single stream, e.g. when reading data from a [[lopesdoxy>classLopesEvent.html|LopesEvent]] file. {{DataReader stream navigation 03.png?300|DataReader stream navigation, example 3}}

=== Data flow ===

The figure below illustrates the data flow inside the DataReader:

{{DataReader data flow.png?550|Data flow inside the DataReader}}

There is the option to insert a [[.Hanning Filter|Hanning Filter]] step before performing the Fourier transform. This can be used to reduce the sidelobes in the frequency domain that originate from cutting out a block of data (which is equivalent to multiplying the data with a box function).

=== Performance ===

A clear trend can be seen when going towards smaller block sizes, i.e. smaller blocks in which the data are read from disk.
One possible approach for tuning the performance would be to read multiple blocks from disk at once and then dispatch them one after another to the requesting routine; this of course requires some intelligence to be built into the data reading code in order to do the bookkeeping.

===== Development =====

==== Adding a new data format ====

The DataReader framework has been set up such that adding the capability to read data from a new data format is kept as simple as possible:

  * ''DataReader'' works as the base class from which all data type specific classes are derived; in this way the internal data processing framework is kept.
  * Only the function performing the actual input from the data file needs to be reimplemented; it returns a standard product to the internal pipeline.

At the present time, the following classes are part of the data input framework:

{{ http://www.astron.nl/~bahren/coding/lopestools/html/classDataReader.png?550 }}

=== Example ===

  - In the //header file// of the new class (here: ''ITSBeam.h'') define a private variable ''datatype_p'' of the type in which the data are stored in the data file.

<code cpp>
class ITSBeam : public DataReader {

  //! Information contained in experiment.meta are stored in their own object
  ITSMetadata metadata_p;
  //! Type as which the data are stored in the data file
  float datatype_p;

 public:

  //! Get the raw time series after ADC conversion
  Matrix fx ();

 protected:

  //! Connect the data streams used for reading in the data
  Bool setStreams ();
};
</code>

The two methods ''fx()'' and ''setStreams()'' are reimplemented from the ''DataReader'' class; a detailed description is given below.
  - In the //implementation file// (here: ''ITSBeam.cc'') we need to reimplement two functions, which are already declared as virtual functions in the ''DataReader'' class:
    - ''setStreams()'' -- connect the data streams used for reading in the data from disk

<code cpp>
Bool ITSBeam::setStreams ()
{
  bool status (true);

  uint blocksize (blocksize_p);
  Vector antennas (metadata_p.antennas());
  Vector adc2voltage (DataReader::adc2voltage());
  Matrix fft2calfft (DataReader::fft2calfft());
  Vector filenames (metadata_p.datafiles(true));
  DataIterator *iterator;

  /*
    Configure the DataIterator objects: for ITSBeam data, the values are
    stored as short integer without any header information preceding the
    data within the data file.
  */
  uint nofStreams (filenames.nelements());
  iterator = new DataIterator[nofStreams];

  for (uint file (0); file [...]
</code>

Even though the setup of the ''DataIterator'' can be quite different for your specific data format, the single most important instruction, which needs to be issued before calling ''DataReader::init'', is

<code cpp>
iterator[file].setStepWidth(sizeof(datatype_p));
</code>

which adjusts the width of the steps taken when moving through the data volume. Once all parameters have been set up correctly, they are passed to the base class.

    - ''fx()'' -- read in the data and format them as one of the standard products within the processing chain internal to the ''DataReader''. Keep in mind that your data may be something other than the raw time series after ADC conversion, in which case you will need to reimplement another method (e.g. ''fft()'').

==== Internal data initialization ====

{{datareader_init_sequence.png?550}}

==== Usage with C/C++ code ====

  - Creation of a new DataReader object:

<code cpp>
#include [...]
#include [...]

DataReader *dr;
LopesEvent *le = new LopesEvent (eventfile,     // location of the LopesEvent file
                                 blocksize,     // nof. samples per block of data
                                 adc2voltage,   // conversion weights [optional]
                                 fft2calfft);   // calibration weights [optional]
dr = le;
</code>

Since the conversion arrays are optional, you can also use the simpler constructor call:

<code cpp>
LopesEvent *le = new LopesEvent (eventfile,     // location of the LopesEvent file
                                 blocksize);    // nof. samples per block of data
</code>

  - Reading in data for processing:

<code cpp>
for (int block(0); block [...] fft();
}
</code>

  - Data selection: selection of frequency channels and antennas can be performed directly within the DataReader, e.g.

<code cpp>
dr->setSelectedChannels (selection);
</code>

where ''selection'' is an array of booleans of length ''fftLength''.

{{datareader_-_channel_selection.png?550|Selection of frequency channels in the FFT}}
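The effect of a channel selection as described above can be sketched in a few lines of standalone C++. The sketch does not use the DataReader itself: ''fftLengthFor'', ''makeChannelSelection'' and ''applyChannelSelection'' are hypothetical helpers introduced only for illustration, and the relation ''fftLength = blocksize/2 + 1'' is an assumption (it holds for a real-to-complex FFT) rather than something taken from the DataReader code.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: number of frequency channels for a block of
// `blocksize` real samples (assumption: fftLength = blocksize/2 + 1).
std::size_t fftLengthFor(std::size_t blocksize) {
    return blocksize / 2 + 1;
}

// Hypothetical helper: build a boolean mask of length `fftLength` that
// selects the channels in the inclusive range [first, last].
// Channels beyond the end of the spectrum are silently ignored.
std::vector<bool> makeChannelSelection(std::size_t fftLength,
                                       std::size_t first,
                                       std::size_t last) {
    assert(first <= last);
    std::vector<bool> selection(fftLength, false);
    for (std::size_t ch = first; ch <= last && ch < fftLength; ++ch) {
        selection[ch] = true;
    }
    return selection;
}

// Hypothetical helper: keep only the channels flagged in `selection`,
// preserving their order. The mask must have one entry per channel.
std::vector<std::complex<double>> applyChannelSelection(
        const std::vector<std::complex<double>>& spectrum,
        const std::vector<bool>& selection) {
    if (selection.size() != spectrum.size()) {
        throw std::invalid_argument("selection must have one entry per channel");
    }
    std::vector<std::complex<double>> selected;
    for (std::size_t ch = 0; ch < spectrum.size(); ++ch) {
        if (selection[ch]) {
            selected.push_back(spectrum[ch]);
        }
    }
    return selected;
}
```

Under the stated assumption, a mask built with ''makeChannelSelection(fftLengthFor(blocksize), first, last)'' has the length required for ''selection''; whether it can be passed to ''setSelectedChannels()'' directly depends on the array type that method expects.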