I have some scientific measurement data which should be permanently stored in a data store of some sort.
- What is your role, and at which academic organization?
- Which kinds of data stores are typically used at your academic organization for such data?
I am looking for a way to store measurements from 100 000 sensors with measurement data accumulating over years to around 1 000 000 measurements per sensor. Each sensor produces a reading once every minute or less frequently. Thus the data flow is not very large (around 200 measurements per second in the complete system). The sensors are not synchronized.
- Which kind of mathematics do you use?
- How do you ensure that the other 88000 sensors do not report measurements if your sensors are not synchronized?
- How and where is the timestamp generated?
- Which level of time accuracy and time synchronization do you need and achieve in your experiment?
If I divide 100000 sensors by 60 seconds (one measurement per sensor per minute), I get about 1667 measurements per second on average, not the 200 you claim. Conversely, 200 measurements per second multiplied by 60 seconds corresponds to only about 12000 sensors reporting once per minute, so you would have to ensure that the remaining 88000 sensors take no measurements during your experiment.
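The arithmetic above can be checked with a short Python sketch. The variable names are illustrative assumptions. The last value is an addition that is not in the post: it is the average reporting interval per sensor at which all 100000 sensors together would produce the stated 200 measurements per second.

```python
# Figures taken from the question; names are illustrative only.
SENSORS = 100_000
CLAIMED_RATE = 200   # measurements per second, as stated in the question
INTERVAL_S = 60      # one reading per sensor per minute

# Average system-wide rate if every sensor reports once per minute.
rate_once_per_minute = SENSORS / INTERVAL_S

# Number of once-per-minute sensors that the claimed rate can account for.
sensors_at_claimed_rate = CLAIMED_RATE * INTERVAL_S
remaining_sensors = SENSORS - sensors_at_claimed_rate

# Average per-sensor interval at which all sensors yield the claimed rate.
implied_interval_s = SENSORS / CLAIMED_RATE

print(round(rate_once_per_minute))   # 1667
print(sensors_at_claimed_rate)       # 12000
print(remaining_sensors)             # 88000
print(implied_interval_s)            # 500.0
```

The value of 500 seconds is the average interval between readings of a single sensor that would correspond to 200 measurements per second across the whole system.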
You wrote that the sensors are not synchronized. But if the measurements are timestamped at the sensors, then time has to be synchronized among the sensors. I did not understand which time accuracy you need.
The data itself comes as a stream of triplets: [timestamp] [sensor #] [value], where everything can be represented as a 32-bit value.
As Brian has already mentioned, this sounds like a fit for scientific databases. Some of them are built around time series, which seem to suit your situation well.
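To make the size of the triplet stream from the question concrete, here is a minimal Python sketch that packs one reading into three 32-bit fields. The field types (unsigned integer timestamp, unsigned integer sensor number, 32-bit float value) and the little-endian byte order are assumptions, since the question only says that everything fits into 32 bits. The function names are hypothetical. The total uses the sensor and measurement counts quoted from the question.

```python
import struct

# Assumed layout: little-endian, uint32 timestamp, uint32 sensor number, float32 value.
RECORD = struct.Struct("<IIf")

def pack_reading(timestamp: int, sensor: int, value: float) -> bytes:
    """Pack one [timestamp] [sensor #] [value] triplet into 12 bytes."""
    return RECORD.pack(timestamp, sensor, value)

def unpack_reading(raw: bytes) -> tuple[int, int, float]:
    """Inverse of pack_reading."""
    return RECORD.unpack(raw)

raw = pack_reading(1_700_000_000, 42, 21.5)
print(len(raw))              # 12
print(unpack_reading(raw))   # (1700000000, 42, 21.5)

# Raw payload size for 100 000 sensors with 1 000 000 measurements each.
total_bytes = 100_000 * 1_000_000 * RECORD.size
print(total_bytes / 1e12)    # 1.2 (terabytes, decimal)
```

This counts only the raw payload; any index, header, or replication overhead of a particular data store would come on top of it.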
I want to know which source allows efficient data storage features between file system and dbms in a large project.
- And why do you mention file system in this context?
- Which relation do you create between file system and DBMS and why?
The data store of a DBMS may optionally use a file system, but it could just as well use raw storage. I don't understand your question, as I can't yet see the relation to file systems.
I have asked this query on Quora and according to this
Your selected source does not comply with scientific standards. Academic societies exist worldwide. Different sciences need scientific databases and use them. Don't expect the same needs outside of science. At CERN, physicists run large experiments which produce large sets of data over a short period of time. Their data is more complex than yours. The web and the first web browser were created there too. So if you can't find helpful answers at your academic institution, you may find them in a relevant academic society. And your academic institution probably has access to such academic societies.
And those physicists at CERN are among those working to reform the definition of time. They and some others need accurate time, but the current definition is not accurate enough for them. And the larger the geographic distribution of an experiment with such high accuracy needs, the more relevant it is that such a better time definition becomes an international standard too.
Then the query would be
- Did you understand the non-scientific article which you referenced?
- Why do you expect that your scientific DBMS would be relational, given that your triplet does not include any relation?
- Why do you emphasize relational DBMS although your referenced non-scientific article mentions relational DBMS only as one type among many presented types?
Even this article presents several types of DBMS, and its list of types is not exhaustive. And only a relational DBMS would use such a query, and then for querying, not for storing.
- And why do you not expect to need to store other kinds of data for your experiment and its measurements in addition to those you mentioned?