I would like to build a model for the following data structure, but I cannot find the right approach (or even the right keywords) to search for further information. Maybe you can help?
Structure of the data
I have about 100 test specimens with different properties (size, hardness, weight, …), so each specimen is described by a set of values. There is also one result variable that I would like to model, e.g. with a GLM.
In addition to this data, I have a load-collective with about 10,000 measurements for each specimen, where the specimen itself is described by the set of fixed per-specimen properties above. An example would be the measurement of a force acting on each specimen. The data is not a time series, but it could be measured in an order.
I don’t have individual measurements for every test specimen, but I do know how often each one has been exposed to forces from the load-collective. The distribution of this load-collective is very well known, but it is multimodal and cannot be represented by a basic distribution function.
The problem I have now:
- I would like to start simple and use only the load distribution in a first step, and later see how much the model improves when the single measurements are added.
- I don’t know how to feed the load distribution into the model. Binning the loads (0-10, 10-20, …) and using the percentage of occurrences per bin as predictors doesn’t seem right, and the bin resolution might also affect the results.
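To make the binning idea concrete, here is a minimal sketch of what I mean. Everything in it is an assumption for illustration only: the helper name `load_bin_fractions`, the bimodal toy data, and the load range of 0 to 60 are made up and are not my real load-collective.

```python
import numpy as np


def load_bin_fractions(loads, bin_width, max_load):
    """Return the fraction of load measurements that fall into each bin."""
    # Bin edges 0, bin_width, 2*bin_width, ..., max_load
    edges = np.arange(0.0, max_load + bin_width, bin_width)
    counts, _ = np.histogram(loads, bins=edges)
    return counts / counts.sum()


rng = np.random.default_rng(0)

# Hypothetical multimodal load-collective for ONE specimen (made-up numbers)
loads = np.concatenate([
    rng.normal(15.0, 3.0, 6000),
    rng.normal(40.0, 5.0, 4000),
])
loads = np.clip(loads, 0.0, 59.999)  # keep every value inside the binned range

# The same loads yield a different feature vector for each bin width
coarse = load_bin_fractions(loads, bin_width=10, max_load=60)  # 6 features
fine = load_bin_fractions(loads, bin_width=5, max_load=60)     # 12 features

print(coarse.round(3))
print(fine.round(3))
```

Each specimen would get one such vector of fractions as additional predictors. The number of predictors and their values depend directly on the chosen bin width, which is exactly the part I am unsure about.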
What I am looking for:
- An approach / starting point on how to model this data structure?
- Ideas for keywords / search terms to find more information?
- Is “time series” the right search term to look for further information, even though the data might not be autocorrelated?
- Do you know any vignette along which I can find my way?
- Am I worrying too much, and should I just use the data with one line per observation, repeating the same properties over and over again with only the force column differing?
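For clarity, this is a small sketch of the two table layouts I am comparing. It is only an illustration under assumptions: the sizes are tiny stand-ins for about 100 specimens and about 10,000 loads, all values are random, and the choice of three quantiles as load summaries is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n_specimens, n_loads = 3, 4  # tiny stand-ins for ~100 and ~10,000

# Hypothetical per-specimen data: three properties and one result each
properties = rng.uniform(1.0, 2.0, size=(n_specimens, 3))  # size, hardness, weight
result = rng.uniform(size=n_specimens)
loads = rng.uniform(0.0, 60.0, size=(n_specimens, n_loads))

# Long layout: one row per load measurement.
# Properties and result are repeated; only the force column differs.
long_table = np.column_stack([
    np.repeat(properties, n_loads, axis=0),
    np.repeat(result, n_loads),
    loads.ravel(),
])

# Wide layout: one row per specimen, loads condensed into summary columns
# (here the 10 %, 50 % and 90 % quantiles, an arbitrary choice).
load_quantiles = np.quantile(loads, [0.1, 0.5, 0.9], axis=1).T
wide_table = np.column_stack([properties, load_quantiles, result])

print(long_table.shape)  # rows = n_specimens * n_loads
print(wide_table.shape)  # rows = n_specimens
```

In the long layout the result value is also repeated on every row, so the table has many more rows than there are independent results. That is the main reason I hesitate to simply use it as is.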
Thanks for any suggestions and ideas!