[sdnog] cacti and old data values

Frank Habicht geier at geier.ne.tz
Mon May 9 09:43:19 SAST 2016


An RRD (round-robin database) is meant to not grow in size over time.
We know new data keeps getting into the database,
so it has to remove some (detail about) older data.

These parameters are configurable. Say we take samples every 5 minutes
and keep 2 days' worth of these samples:
2 days * 24 h * 12 samples/h = 576 records in there.

we also want to keep older data, up to 2 weeks, but only one value per
30-minute interval:
2 weeks * 7 days * 24 h * 2 samples/h = 672 records in there.

and we want to keep still older data, up to (your choice, btw) 3 months'
worth, at a granularity of 2-hour intervals:
3 months * ~30 days * 12 samples/day = ~1080 records in there.
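The row counts above follow directly from the arithmetic; a quick sketch
(the archive names are just labels I picked for illustration):

```python
# Row counts for the three round-robin archives described above.
# Each archive keeps a fixed number of rows, which is why the total
# size of the database never grows.

def rows(keep_hours, interval_minutes):
    """Records needed to keep `keep_hours` of history at one
    consolidated value per `interval_minutes`."""
    return keep_hours * 60 // interval_minutes

fine = rows(2 * 24, 5)          # 2 days at 5-minute samples
medium = rows(2 * 7 * 24, 30)   # 2 weeks at 30-minute values
coarse = rows(3 * 30 * 24, 120) # ~3 months at 2-hour values

print(fine, medium, coarse)  # 576 672 1080
```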

You see that the size of the database will not increase.

But every 30 minutes we have to "forget" / remove all 6 samples that
fell in the 30-minute interval 2 days ago.
How do we make one value out of the 6?
rrdtool defines "consolidation functions" and lets you choose among
them: AVERAGE, MIN, MAX and LAST, plus some more complicated ones.
Most often used is AVERAGE. (that removes your spike)
For network operations MAX should also be interesting. (that would
keep your spike)
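To see the difference, here is what consolidating one 30-minute interval
looks like with the two common functions (the traffic values are made up
for illustration):

```python
# Six 5-minute samples from one 30-minute interval, with a spike.
samples = [10.0, 12.0, 11.0, 95.0, 13.0, 12.0]  # e.g. Mbit/s

consolidated_avg = sum(samples) / len(samples)  # AVERAGE flattens the spike
consolidated_max = max(samples)                 # MAX preserves the spike

print(consolidated_avg)  # 25.5
print(consolidated_max)  # 95.0
```

This is exactly why a spike visible in the 2-day graph can disappear
once you look at a longer, average-consolidated time range.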

Maybe `man rrdtool` is an even better source; search for 'Round Robin'.
`man rrddump` tells you how to extract all these details from an
RRD file.


On 5/8/2016 2:00 PM, Nishal Goburdhan wrote:
> On 8 May 2016, at 9:44, Samir S. Omer wrote:
>> Hi all
>> can someone explain to me how does cacti/rrds work exactly in term of
>> storing values ?
>> I noticed that the values get inaccurate when time is defined for old
>> times
>> for example when I show the graph for last two days it shows a spike
>> in traffic. However, if I set the time for the last week I see the
>> graph is gone and graph is normal.
> cacti uses rrdtool as a base;  so, yes, after a while, the law of
> averages means that you’ll lose granular reporting.
> are you asking about the algorithms used for calculating these “averages”?
> i’ve never really dug that deep into it, but you can probably find more
> information here:  http://oss.oetiker.ch/rrdtool/
> you might also find that your cacti installation has additional plugins
> to remove traffic spikes.
> from one of doc pages: 
> http://oss.oetiker.ch/rrdtool/pub/contrib/spikekill-1.1-1.txt
> “The spike killing methodologies supported in this utility are as follows:
>   - Remove via Standard Deviation
>   - Remove via Variance from Average
> The Standard Deviation method first calculates the average of all Data
> Sources and
> RRA's and determines the Standard Deviation of each combination.  It
> will then
> remove any sample over X Standard Deviations from the Average.  The
> utility has the
> option of replacing the spike by the average, or by a NaN at your
> discretion.
> The Variance Method first calculates the average of all Data Sources and
> RRA's.
> Before doing that it removes the top X and bottom X samples from each
> sample set
> treating them as outliers.  Then, based upon the revised average will
> remove
> all spikes Y percent above or below the revised averages.  Again, like the
> Standard Deviation method, you have the option of replacing the spike
> with the
> revised average or a NaN.”
> hth,
> —n.
> _______________________________________________
> sdnog mailing list
> sdnog at sdnog.sd
> http://lists.sdnog.sd/mailman/listinfo/sdnog
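
The standard-deviation method quoted above can be sketched roughly like
this (a simplified illustration of the idea only, not the actual
spikekill code; the threshold and data are made up):

```python
import math

def kill_spikes(samples, x=2.0, replace_with_nan=False):
    """Replace any sample more than x standard deviations from the
    average with the average (or with NaN), as in the quoted method."""
    avg = sum(samples) / len(samples)
    std = math.sqrt(sum((s - avg) ** 2 for s in samples) / len(samples))
    repl = float("nan") if replace_with_nan else avg
    return [repl if abs(s - avg) > x * std else s for s in samples]

data = [10, 11, 12, 500, 11, 10, 12, 11]
print(kill_spikes(data))  # the 500 spike is replaced by the average
```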
