I have a scientific application that generates a huge amount of time-series data, say one sample every 10 milliseconds. I save the data to a database table; let's call it RAW_DATA_TABLE.
The RAW_DATA_TABLE has a data resolution of 10 milliseconds.
I need to do some statistical analysis to generate 5-minute aggregates of the raw data for the past 48 hours. That is, the 48 hours of raw data need to be grouped into 5-minute bins. This process generates new data, which will be saved in a new table, DATA_5MIN_TABLE.
The bin size could also be 15 minutes, so it should really be a user input.
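To make the requirement concrete, here is a rough sketch (in Python, purely illustrative, not how I currently compute it) of the binning I have in mind; `rows` stands in for RAW_DATA_TABLE as (timestamp, value) pairs, and the statistics shown are just examples:

```python
from datetime import datetime, timedelta
from statistics import mean

def aggregate(rows, bin_minutes=5):
    """Group (timestamp, value) rows into fixed-size time bins and
    compute per-bin statistics. bin_minutes is the user-chosen bin
    size (5, 15, ...) and should divide 60 evenly."""
    bins = {}
    for ts, value in rows:
        # Truncate the timestamp down to the start of its bin.
        bin_start = ts - timedelta(minutes=ts.minute % bin_minutes,
                                   seconds=ts.second,
                                   microseconds=ts.microsecond)
        bins.setdefault(bin_start, []).append(value)
    # One output row per bin; these become rows of DATA_5MIN_TABLE.
    return {
        start: {"count": len(vs), "mean": mean(vs),
                "min": min(vs), "max": max(vs)}
        for start, vs in bins.items()
    }
```

For 48 hours of 10 ms data that is roughly 17 million input rows reduced to 576 output rows at a 5-minute bin size, which is why I'd rather not do it per query.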
I would like to move the statistical computation into a middle tier, leaving the backend database holding only the raw data.
The problem is that the computation itself is time-consuming if performed on the fly.
Is this something DataAbstract does well? Can a DataAbstract server "prefetch", i.e. do the computation silently in the background, so that when a user queries, the processed data for the immediately past 48 hours is always ready, rather than the aggregation being done on demand?
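Independent of DataAbstract specifics, the pattern I'm imagining in the middle tier is something like this sketch (names and the timer interval are my own assumptions): a background worker periodically rebuilds the rolling 48-hour aggregate, and user queries only ever read the latest precomputed result.

```python
import threading

class Prefetcher:
    """Recompute an aggregate on a timer in a background thread,
    so reads always return a precomputed result instantly."""

    def __init__(self, compute, interval_seconds=300.0):
        self._compute = compute          # callable that rebuilds the aggregate
        self._interval = interval_seconds
        self._latest = None
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        while not self._stop.is_set():
            # e.g. aggregate the last 48 h of RAW_DATA into DATA_5MIN rows
            result = self._compute()
            with self._lock:
                self._latest = result
            self._stop.wait(self._interval)

    def latest(self):
        # Served immediately; never triggers an on-the-fly aggregation.
        with self._lock:
            return self._latest
```

What I don't know is whether DataAbstract has a built-in mechanism for this kind of scheduled server-side precomputation, or whether I'd have to host a worker like this alongside it myself.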
What would be the best practice?