Bug 66477 - Data Analysis Toolkit
Summary: Data Analysis Toolkit
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: ux-advise
Version: 4.0.4.2 release (earliest affected)
Hardware: All All
Importance: high enhancement
Assignee: Tomaz Vajngerl
URL:
Whiteboard: target:4.2.0
Keywords:
Depends on:
Blocks:
 
Reported: 2013-07-01 23:04 UTC by kgizdov
Modified: 2014-01-27 13:27 UTC
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
Screenshots (761.02 KB, application/zip)
2013-07-01 23:04 UTC, kgizdov

Description kgizdov 2013-07-01 23:04:31 UTC
Created attachment 81837
Screenshots

As a student and a person involved in academia, I am sure I speak not only for myself but for all students, academics and people who work with data in general when I say that the Data Analysis toolkit in Excel has no alternative in the open source software community. This feature saves a great deal of time and is essential to anybody who handles data in spreadsheets. It is a systematic way to get statistical information about data in a very organized and speedy fashion. In Microsoft Excel it features the following automated techniques:
- Anova (single factor, two factor with/without replication)
- Correlation
- Covariance
- Descriptive Statistics
- Exponential Smoothing
- F-Test
- Fourier Analysis (Very powerful and I think essential)
- Histogram
- Moving Average
- Random Number Generation (For comparing with data to establish meaningfulness)
- Rank and Percentile
- Regression
- Sampling
- t-Test (all variations)
- z-Test
All of these can be applied directly to sets of data and are straightforward to use. I am sure this would greatly benefit all users of LibreOffice! Thank you!
Comment 1 bfoman (inactive) 2013-07-02 13:19:53 UTC
This seems to be a good read for you - http://www.comfsm.fm/~dleeling/statistics/text.html
Comment 2 kgizdov 2013-07-02 13:49:23 UTC
That's very kind of you, but I think you assumed I do not already know about that page and the fact that LibreOffice supports the individual functions. The problem is not that LibreOffice lacks the formulas, but that if you need to analyze several data sets in a day, you have to do it all manually. Instead of spending that time on meaningful analysis, you end up manually calling, arranging and combining functions. This is the definition of wasting time. The Excel toolkit not only saves an enormous amount of time every day, but also enables people without the full statistics skill set (such as biologists, geologists and others in less maths-intensive sciences) to get the job done in no time. That is the main point, and a lot of people have searched the Internet for an alternative only to be disappointed. That's why I chose to file this feature request. ;-)
(In reply to comment #1)
> This seems to be a good read for you -
> http://www.comfsm.fm/~dleeling/statistics/text.html
Comment 3 Gerry 2013-07-07 20:46:18 UTC
Have you tried the R4Calc extension? If it still works (the last release was in 2008), it should solve your problems by using the R statistical package.

Of course, it is not an inbuilt data analysis toolkit.

http://extensions.services.openoffice.org/project/R4Calc
Comment 4 Commit Notification 2013-07-14 20:35:06 UTC
Tomaž Vajngerl committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=5c05b1cabcf7f6a7f490ae6fc4bc145e75229752

fdo#66477 Random Number Generation added to menu>fill.



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 5 Commit Notification 2013-07-19 15:10:37 UTC
Tomaž Vajngerl committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=f2c9aa43666101c6970ea33f50fb4e780b99b97c

fdo#66477 Add sampling feature to calc



Comment 6 Commit Notification 2013-07-19 15:11:02 UTC
Tomaž Vajngerl committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=fa20e0dd67c1da8fe8653f163e0fc6743934e7ae

fdo#66477 Add descriptive statistics calculation to Calc.
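For readers unfamiliar with what a descriptive statistics tool reports, here is a minimal C++ sketch of the usual summary values for one column of data. It is not the code from the commit above; the `Summary` struct and the `describe` function are hypothetical names chosen for illustration, and the sample (n - 1) variance is an assumption about which variant is wanted.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary record; the field names are illustrative and are not
// taken from the LibreOffice sources.
struct Summary {
    std::size_t count;
    double mean;
    double sampleVariance;  // sum of squared deviations divided by (n - 1)
    double sampleStdDev;    // square root of sampleVariance
    double minimum;
    double maximum;
    double range;           // maximum - minimum
};

// Computes basic descriptive statistics for one column of data.
// Requires at least two values so that the sample variance is defined.
Summary describe(const std::vector<double>& data) {
    assert(data.size() >= 2);
    const double n = static_cast<double>(data.size());

    double sum = 0.0;
    for (double x : data) sum += x;
    const double mean = sum / n;

    double squaredDeviations = 0.0;
    for (double x : data) squaredDeviations += (x - mean) * (x - mean);
    const double variance = squaredDeviations / (n - 1.0);

    const auto extremes = std::minmax_element(data.begin(), data.end());
    const double lo = *extremes.first;
    const double hi = *extremes.second;

    return Summary{data.size(), mean, variance, std::sqrt(variance), lo, hi, hi - lo};
}
```

Calling `describe` once per column gives every value in one pass over the data, which is the time saving the original request is about.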



Comment 7 Commit Notification 2013-07-28 09:59:30 UTC
Tomaž Vajngerl committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=8e4f4fb541277e35aca7d8f210307635f7a81443

fdo#66477 Add correlation and covariance to Calc.
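As a rough illustration of the two quantities named in this commit, the following C++ sketch computes the covariance and the Pearson correlation of two equally long columns. It is not the committed implementation; the function names are hypothetical, and dividing by n - 1 (sample covariance) is an assumption, since the thread does not say whether the tool divides by n or n - 1. The correlation coefficient is the same either way.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Arithmetic mean of a non-empty column.
double meanOf(const std::vector<double>& v) {
    assert(!v.empty());
    double sum = 0.0;
    for (double x : v) sum += x;
    return sum / static_cast<double>(v.size());
}

// Sample covariance of two columns of equal length (divides by n - 1).
double sampleCovariance(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() >= 2);
    const double mx = meanOf(x);
    const double my = meanOf(y);
    double acc = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) acc += (x[i] - mx) * (y[i] - my);
    return acc / static_cast<double>(x.size() - 1);
}

// Pearson correlation: covariance scaled by both standard deviations.
// Both columns must have non-zero variance.
double pearsonCorrelation(const std::vector<double>& x, const std::vector<double>& y) {
    const double varX = sampleCovariance(x, x);
    const double varY = sampleCovariance(y, y);
    assert(varX > 0.0 && varY > 0.0);
    return sampleCovariance(x, y) / std::sqrt(varX * varY);
}
```

A full tool would apply these functions to every pair of columns and lay the results out as a matrix.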



Comment 8 Stephan van den Akker 2013-08-29 12:50:06 UTC
This addition needs boost >= 1.47 to build.

If --with-system-boost is used to build LO, the boost version should be checked.

AFAIK this check happens in core/m4/ax_boost_base.m4. 
ATM it checks for boost >= 1.20
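
To make the version comparison concrete, here is a small C++ sketch of a check against the 1.47 minimum mentioned above. It relies on the encoding of Boost's BOOST_VERSION macro (major * 100000 + minor * 100 + patch); the function and constant names are hypothetical, and this is not the m4 check itself, only an illustration of the comparison it would need to make.

```cpp
// Encodes a version number the way Boost's BOOST_VERSION macro does:
// major * 100000 + minor * 100 + patch, so 1.47.0 becomes 104700.
constexpr long encodeBoostVersion(int major, int minor, int patch) {
    return major * 100000L + minor * 100L + patch;
}

// Minimum version named in this comment (hypothetical constant name).
constexpr long kRequiredBoost = encodeBoostVersion(1, 47, 0);

// True when the detected version satisfies the minimum.
constexpr bool isBoostNewEnough(long detected) {
    return detected >= kRequiredBoost;
}

// In a source file that includes <boost/version.hpp>, the same rule could be
// enforced at compile time with:
//   static_assert(BOOST_VERSION >= 104700, "boost >= 1.47 is required");
```

With this encoding, the 1.20 minimum that the check currently uses corresponds to 102000, which is why a system boost older than 1.47 can slip through configure and only fail later in the build.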
Comment 9 severoraz 2013-09-07 22:22:40 UTC
I am in favor of this bug being solved.
Comment 10 Johnny_M 2013-10-15 06:34:25 UTC
(In reply to comment #4)
> Tomaž Vajngerl committed a patch related to this issue.
> It has been pushed to "master":
> 
> http://cgit.freedesktop.org/libreoffice/core/commit/?id=5c05b1cabcf7f6a7f490ae6fc4bc145e75229752
> 
> fdo#66477 Random Number Generation added to menu>fill.

I'm not an expert, just for info: Please consider what was done to the RAND() function of Calc on bug 33365.
Comment 11 Tomaz Vajngerl 2013-10-15 06:53:33 UTC
> I'm not an expert, just for info: Please consider what was done to the
> RAND() function of Calc on bug 33365.

It already uses the same mt19937 seed from the boost library, as RAND() does.
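
For context, mt19937 is the Mersenne Twister generator. The sketch below fills a range with uniformly distributed values using std::mt19937, the standard-library counterpart of boost::mt19937; swapping in the std version is a choice made here so the example needs no Boost headers. The function name, its parameters and the explicit seed argument are hypothetical, and this is not the code Calc uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Returns `count` pseudo-random values drawn uniformly from [low, high).
// The same seed always reproduces the same sequence on a given platform.
std::vector<double> uniformFill(std::uint32_t seed, std::size_t count,
                                double low, double high) {
    assert(low < high);
    std::mt19937 engine(seed);  // Mersenne Twister, 32-bit variant
    std::uniform_real_distribution<double> dist(low, high);

    std::vector<double> values;
    values.reserve(count);
    for (std::size_t i = 0; i < count; ++i) values.push_back(dist(engine));
    return values;
}
```

Passing the seed explicitly makes a fill reproducible, which helps when generated numbers are compared against real data.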
Comment 12 Saaz Rai 2013-12-15 03:21:06 UTC
+1 for this feature in Calc
Comment 13 Eike Rathke 2014-01-21 20:27:15 UTC
@Tomaz:
With the above commits that are also in 4.2.0, is something missing you're going to implement or why isn't this set to RESOLVED FIXED?

In any case, I reassign this to you (as you implemented it) and set resolved fixed. If there's something missing then let's open a new RFE as this one with its commits is target:4.2.0 now.
Comment 14 Tomaz Vajngerl 2014-01-27 13:27:54 UTC
(In reply to comment #13)
> @Tomaz:
> With the above commits that are also in 4.2.0, is something missing you're
> going to implement or why isn't this set to RESOLVED FIXED?
Not all of the functionality from the "Data Analysis Toolkit" is implemented yet.

> In any case, I reassign this to you (as you implemented it) and set resolved
> fixed. If there's something missing then let's open a new RFE as this one
> with its commits is target:4.2.0 now.
No problem. I will open a new RFE for the missing functionality.