Update: Arjen links to the video Supporting Scalable Online Statistical Processing, which shows the approach:
"rather than doing complete aggregates, use statistical sampling to provide a reasonable estimate (unbiased guess) of the result."
When you have a lot of data, sampling lets you draw conclusions from a much smaller subset. That's why sampling is a scalability solution: if you don't have to process all your data to get the information you need, you've made the problem smaller, so you need fewer resources and get more timely results.
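To make the idea concrete, here is a minimal sketch in Python of estimating an average from a simple random sample instead of scanning everything. It is not taken from the video; the `estimate_mean` helper, the synthetic `latencies_ms` data, and the sample size are all assumptions chosen for illustration.

```python
import random
import statistics


def estimate_mean(values, sample_size, seed=0):
    """Estimate the mean of `values` from a simple random sample.

    Returns (estimate, standard_error). The standard error ignores the
    finite-population correction, which is negligible when the sample
    is a small fraction of the data.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sampling without replacement
    estimate = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / sample_size ** 0.5
    return estimate, std_error


# Hypothetical data set: 200,000 request latencies in milliseconds.
data_rng = random.Random(42)
latencies_ms = [data_rng.uniform(0, 200) for _ in range(200_000)]

# Touch only 5% of the rows to get an estimate with an error bar.
estimate, std_error = estimate_mean(latencies_ms, sample_size=10_000)

# Full aggregate, computed here only to compare against the estimate.
exact = statistics.fmean(latencies_ms)

print(f"sampled estimate: {estimate:.2f} ms (standard error {std_error:.2f})")
print(f"exact mean:       {exact:.2f} ms")
```

The sample mean is an unbiased estimate of the full mean, and the standard error shrinks with the square root of the sample size, so the cost of a given precision depends on the sample size rather than on how much data you have in total.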