GSoC/2019/StatusReports/DevanshuAgarwal
Project Overview
Project Name: Statistical Analysis in LabPlot
Abstract: We aimed to add statistical analysis features to LabPlot. These features should report the correlation between data points and perform various hypothesis tests along with assumption checks. Our target audience includes both scientists and engineers, so we aimed to present results in a form that is detailed enough for users without a statistics background, yet unobtrusive for someone who is only interested in the numbers.
Proposal
You can find my GSoC proposal here: https://docs.google.com/document/d/1aoibrQXcpJwP8tGdaNrDwoP2LiTqkj9HwJ3gAqA361U/edit
List of Added Features
I have added the following features for the first evaluation:
- TTest
  - Two-Sample Independent
  - Two-Sample Paired
  - One-Sample
- ZTest
  - Two-Sample Independent
- ANOVA
  - One-Way ANOVA
  - Two-Way ANOVA
- Levene Test: checks the assumption of homogeneity of variance between populations
- Correlation Coefficient
  - Pearson's R
  - Kendall's Tau
  - Spearman Rank
- Chi-Square Test for Independence
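To give a feel for what two of the listed features compute, here is a minimal sketch in Python using only the standard library. It is not LabPlot's implementation (LabPlot is written in C++); the function names `pearson_r` and `independent_t_statistic` are hypothetical, and the t statistic shown is the pooled-variance form, which assumes equal population variances.

```python
import math
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("samples must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    # Sums of cross-products and squared deviations from the sample means.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def independent_t_statistic(x, y):
    """Pooled-variance t statistic and degrees of freedom for two
    independent samples (assumes equal population variances)."""
    n1, n2 = len(x), len(y)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least 2 values")
    df = n1 + n2 - 2
    # statistics.variance() is the unbiased sample variance (divides by n - 1).
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


if __name__ == "__main__":
    print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
    print(independent_t_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
```

Turning the t statistic into a p-value additionally requires the cumulative distribution function of Student's t distribution with `df` degrees of freedom, which is left out of this sketch.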
Status Reports
First Evaluation:
https://docs.google.com/document/d/1JxA569fFTcrDUTHdInvKJPz9rXmVYM7DuYT54f7C38U/edit?usp=sharing
Second Evaluation:
https://docs.google.com/document/d/1qgss0AssIb3HJIDeAYIos2ig37tk_8UWqDsn4OwDPrQ/edit?usp=sharing
Final Report:
I have included all my work with screenshots and demos in the final post of my blog.
Here is the link: https://agdeva8labplot.blogspot.com/2019/08/final-days-of-gsoc-2019.html
TODO
- Add more tooltips to Result View
- Check for assumptions using various tests (like Levene's Test).
- Reimplement the above features for the case where the data source type is Database.
- Integrate the various tests into one workbook to show the user a summary in a few clicks.
- All other minor TODOs are already written as comments in the source code itself.
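As an illustration of the assumption checks mentioned in this list, the following is a rough Python sketch of the Levene test statistic W in its mean-centred form. It is an assumption-laden example rather than LabPlot's code: the function name `levene_w` is hypothetical, and only the statistic is computed, not the p-value.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's W statistic (mean-centred variant) for k >= 2 groups."""
    k = len(groups)
    if k < 2 or any(len(g) < 2 for g in groups):
        raise ValueError("need at least 2 groups with at least 2 values each")
    n_total = sum(len(g) for g in groups)

    # Absolute deviations of every value from its own group mean.
    z = []
    for g in groups:
        g_mean = mean(g)
        z.append([abs(v - g_mean) for v in g])

    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total

    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    if within == 0:
        raise ValueError("within-group variation of the deviations is zero")

    return (n_total - k) / (k - 1) * between / within


if __name__ == "__main__":
    print(levene_w([1, 2, 3], [2, 4, 6]))
```

Under the null hypothesis of equal variances, W is compared against an F distribution with k - 1 and N - k degrees of freedom, where N is the total number of observations. A large W suggests that the homogeneity-of-variance assumption does not hold.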
Future Goals
We aim to generate a single self-contained report for the data the user is currently analysing. This report will show the statistical analysis summary and graphs in one place, with a single click, without requiring the user to explicitly select or configure anything unless they want to. The idea is to make data analysis easy for users and give them the freedom to experiment with the data while keeping track of how the different statistical parameters change.
Commits
My Commits: https://cgit.kde.org/labplot.git/log/?h=gsoc2019_stats&qt=author&q=Devanshu+Agarwal
These commits were reviewed on Phabricator by my mentors Stefan Gerlach and Alexander Semke.
Review Requests: https://phabricator.kde.org/p/devanshuagarwal/
My Blog
https://agdeva8labplot.blogspot.com/
About Me
- Name: Devanshu Agarwal
- Mentors: Stefan Gerlach, Alexander Semke
- Email: [email protected], [email protected]
- GitHub ID: https://github.com/agdeva8
- IRC nickname: agdeva8