Thursday, April 7, 2011

Excel Spreadsheets and Additional

PopTools -- Windows DLL for Excel 97 and 2000 (PC's only). Facilitates analysis of matrix population models & simulation of stochastic processes. Adds a new menu item and installs many powerful functions: matrix decompositions (Cholesky, QR, singular values, LU), eigenanalysis (eigenvalues and eigenvectors of square matrices) and formulas for generation of random variables (Normal, binomial, gamma, exponential, Poisson, logNormal). Also has routines for iterating spreadsheets to run Monte Carlo simulations, conduct randomisation tests (including the Mantel test) and calculate bootstrap statistics. Some facilities for maximum-likelihood parameter estimation, and some other generally useful functions. Free download from website, which also has documentation, examples, and related links.

SimulAr -- Provides a very elegant point-and-click graphical interface that makes it easy to generate random variables (correlated or uncorrelated) from twenty different distributions, run Monte-Carlo simulations, and generate extensive tabulations and elegant graphical displays of the results.

EZAnalyze -- enhances Excel (Mac and PC) by adding "point and click" functionality for analyzing data and creating graphs (no formula entry required). Does all basic "descriptive statistics" (mean, median, standard deviation, and range), and "disaggregates" data (breaks it down by categories), with results shown as tables or disaggregation graphs". Advanced features: correlation; one-sample, independent samples, and paired samples t-tests; chi square; and single factor ANOVA.

Update Available! The latest version can create z-scores, percentile ranks, and random numbers as new variables; has repeated-measures ANOVA; does simple post hoc tests for single factor and repeated-measures ANOVA; can graph multiple variables on a single graph, and can add error bars for +/- 2 SD’s; adds the sum function to the disaggregate and descriptive statistics functions, and the mode function to descriptive stats; adds delete sheets; adds English & Spanish language options, and works better in international environments; incorporates various bug fixes; and contains an updated user manual.

EZ-R Stats -- supports a variety of analytical techniques, such as: Benford's law, univariate stats, cross-tabs, histograms. Also supports databases such as mySQL, SQLite, MS-Access, MS-SQL. Simplifies the analysis of large volumes of data, enhances audit planning by better characterizing data, identifies potential audit exceptions and facilitates reporting and analysis. This language is a Computer Assisted Audit Technique (CAAT) in support of COSO, SAS 78, SAS 99 and analysis required by Sarbanes-Oxley.

SSC-Stat -- an Excel add-in designed to strengthen those areas where the spreadsheet package is already strong, principally in the areas of data management, graphics and descriptive statistics. SSC-Stat is especially useful for datasets in which there are columns indicating different groups. Menu features within SSC-Stat can:
  • help users manipulate their data (stacking, unstacking columns, 2-way unstacking, lookups, generating factors, etc.);
  • generate good graphs (X-Y Scatter Plot, Category-Value Plot, Boxplot, Normal Probability Plot, Density Estimate), that can be edited and polished like any other Excel graph ;
  • provide basic statistical analysis (descriptive statistics, summary statistics, 1- and 2-sample t tests, 1- and 2-sample tests of proportion).
22 Distribution Functions -- There is one spreadsheet for each of the following distribution functions: Beta, Binomial, Chi-Square, Discrete Uniform, Gamma, Geometric, Hypergeometric, Multivariate Hypergeometric, Laplace, Logistic, Multinomial, Negative Binomial, Normal, Bivariate Normal, Log-normal, Pareto, Poisson, Rectangular, Snedecor F, Student-t, Triangular, and Weibull. Each spreadsheet gives a graph of the distribution, along with the value of various parameters, for whatever shape and scale parameters you specify. You can also download a ZIP file containing all 22 spreadsheets.

Sample-size calculator for cluster randomized controlled trials, which are used when the outcomes are not completely independent of each other. This independence assumption is violated in cluster randomized trials because subjects within any one cluster are more likely to respond in a similar manner. A measure of this similarity is known as the intra-correlation coefficient (ICC). Because of the lack of independence, sample sizes have to be increased. This web site contains two tools to aid the design of cluster trials – a database of ICCs and a sample size calculator (along with instruction manuals).


DAG_Stat -- calculates an enormous number of quantities from a 2 -by-2 table:
  • for diagnostic tests: sensitivity, sensitivity of a random test given the observed prevalence and test level., sensitivity quality index, specificity, specificity of a random test, specificity quality index, efficiency (the correct classification rate), efficiency of a random test, quality index, Youden's index, the predictive value of positive test, predictive value of a positive random test, predictive value of negative test, predictive. value of a negative random test, likelihood ratio of a positive and negative tests, the odds ratio, false positive and false negative rates, prevalence observed in the sample and test level (proportion of subjects classified as 'positive.'
  • for interrater agreement: Cohen's Kappa, observed agreement, chance agreement, agreement about positive and negative cases, Byrt's bias index, Byrt's prevalence asymmetry index, bias adjusted Kappa, prevalence & bias adjusted Kappa. DAG_Stat also calculates Dice's index, Yule's Q (Gamma), Phi, Scott's agreement index, the tetrachoric correlation coefficient, Goodman & Kruskal's tau, Lambda, the Uncertainty Coefficient, Pearson's Chi Square (with and without Yates' correction), the likelihood ratio Chi Square, McNemar's Test, (with and without Yates' correction).
MIX (Meta-analysis with Interactive eXplanations) -- a statistical add-in for Excel 2000 or later (Windows only). Ideal for learning meta-analysis (reproduces the data, calculations, and graphs of virtually all data sets from the most authoritative meta-analysis books, and lets you analyze your own data "by the book"). Handles datasets with dichotomous & continuous outcomes; calculates Risk Diff, RR, OR, Mean Diff, Hedges's g, Cohen's d; performs standard & cumulative meta-analysis with CI ,z & p; fixed and random effects modeling; Cochran's Q with p-value; Higgins's I2 and H with CI; and publication bias tests: Rank correlation (tau-b) test with z & p, Egger's and Macaskill's regression tests with CI, and Trim-and-Fill. Generates numerous plots: tandard and cumulative forest, p-value function, four funnel types, several funnel regression types, exclusion sensitivity, Galbraith, L'Abbe, Baujat, modeling sensitivity, and Trim-and-Fill.

OZGRID -- contains over 4000 pages (and growing) of information on Excel and VBA for Excel. Many add-on's are for sale, but there is also an enormous amount of totally free content: downloads, a free 24/7 question and answer support forum for MS Office, a free Excel monthly newsletter full of detailed tips, tricks, hacks and more for Excel and VBA.

Spreadsheet123 -- a collection of over 70 free Excel spreadsheets. (These will also run under an almost-free Excel-like program, Spreadsheet Software Developer.) Spreadsheets include: capital budgeting, acquisition/buyout, company valuation, risk analysis, FCFE and FCFF, lease or buy a car, NPV & IRR, cash flow, capital structure, stock & bond valuation, financial projections, risk analysis, foreign market exchange, income statement what-if analysis, historical & pro-forma financial statements, template for assessing risk of information technology and data warehousing, IPO timeline, Malcolm Baldrige quality model, and risk return optimization, among many others.

Very-high-precision Statistical Probability Functions -- Provides double-precision (16 significant figures) mass , density, cumulative, inverse probability distributions, critical values, and confidence bounds for the geometric, negative binomial, binomial, Poisson, hypergeometric, negative hypergeometric, exponential, normal, chi-square, gamma, Student t, Fisher F and beta; non-central gamma, chi-square, beta, t and F; and the mixed Gamma-Poisson, Beta-Binomial, and Beta-Negative-binomial distributions. The routines are programmed in VBA, embedded within an Excel spreadsheet that illustrates the usage of each of them.

DE Histograms -- an Excel add-in that provides comprehensive descriptives stats, histograms, outlier detection, normality testing, and much more.

Exact confidence intervals for samples from the Binomial and Poisson distributions -- an Excel spreadsheet with several built-in functions for calculating probabilities and confidence intervals. (42k long).

BiPlot -- by Ilya Lipkovich and Eric P. Smith, of Virginia Tech. A user-friendly add-in for Excel to draw a biplot display (a graph of row and column markers from data that forms a two-way table) based on results from principal components analysis, correspondence analysis, canonical discriminant analysis, metric multidimensional scaling, redundancy analysis, canonical correlation analysis or canonical correspondence analysis. Allows for a variety of transformations of the data prior to the singular value decomposition and scaling of the markers following the decomposition.

Statistical Process Control (SPC) and Reliability spreadsheets from John Zorich's web site -- designed to simplify activities in Production and R&D. Formally validated to be "GMP" and "Part 11" compliant . Free spreadsheets include:
  • Self-made Sampling Plans -- Examine the OC curves for your own custom sampling plans. Use either binomial or hypergeometric calculations. Now be able to explain the "valid statistical rationale" of the sampling plans you already use.
  • Sequential Sampling Plans -- Provides an analysis and planning tool for sample sizes in situations where lots undergo sequential inspections (e.g., 1st by Manufacturing, 2nd by QC, and finally by QA).
Lifetable -- does a full abridged current life table analysis to obtain the life expectancy of a population. Furthermore, one can calculate Potential Gains in Life Expectancy (PGLE) after removing cause k, considering competing causes of death; the (Premature) Years of Potential Life Lost (YPLL), this is the number of person years added to the total number of person years lived in a population if cause of death k would be removed; the Standardized Mortality Ratio (SMR), standardized numbers per 100,000 and the Comparative Mortality Figure (CMF) can also be calculated. From the Downloads section of the QuantitativeSkills web site.

Intracorrelation -- does intra correlation calculations for dichotomous or binary yes/no type outcome variables according to two different methods proposed for the single cluster one by Fleiss and another one by Bennett et.al. A third spreadsheet concerns a method for two clusters by Donner and Klar. You will have to insert your own data by overwriting the tables in the second (total number of positive responses) and third (total number of negative responses) or fourth column (total number). From the Downloads section of the QuantitativeSkills web site.

Weighted Least Squares Linear Fits -- an Excel add-in from Philip Kromer (Univ. of Texas)

XLMathematics -- A set of Excel (Ver 5+) for mathematical computations: graphing , calculus (computing limits, computing and graphing derivatives and/or tangent lines, evaluating integrals using various techniques), Linear algebra (Gauss-Jordan elimination, allowing step-by-step views).

Analyse-it -- includes over 30 parametric & non-parametric statistical functions, including multiple linear regression analysis, ANOVA, & chi-square statistics. A separate specialized package for clinical method evaluation provides NCCLS and IFCC procedures for accuracy & imprecision.

Statistical Process Control (SPC) and Reliability spreadsheets from John Zorich's web site -- designed to simplify activities in Production and R&D. Formally validated to be "GMP" and "Part 11" compliant . Demo's of spreadsheets include:
  • Variables Data SPC -- XbarR, XbarS, XmR, histograms, capability indices, preformatted customizable printable report. Automaticly identify out-of-control points.
  • Count Data SPC  -- P and U SPC charts, pareto chart, preformatted customizable printable report. Sutomaticly identify out-of-control points.
  • Reliability Statistics Basics -- component reliability using K-factors, stress/strength analysis, failure analyses, for "normally distributed" and unknown distributions. Stress / strength formula has been modified to allow input of a "confidence" level, if desired.
  • Reliability Plotting -- Component reliability using "Reliability plotting" ("probability plotting", "rectification", etc.). Can confirm normality, or can identify normalizing transformation.
  • Power Curves for t-Tests  -- Power vs. Sample Size, Power vs. Hypothesized Difference, Power vs. Alpha, and Power vs. Population SD.
  • Statistical Analysis of Gages -- for quantifying measurement uncertainty. Methods include Gage R&R (up to 3 persons, 3 gages, 3 replicates, and 10 parts), Gage Correlation (up to 3 gages), Gage Bias, Gage Linearity, Spec/Inaccuracy Ratios, and Guardbanding..
  • C = 0 Sampling Plans -- two types of OC curves, and AOQL for chosen plan. Calculates the exact absolute smallest sample size that gives the desired protection level for a given exact size lot (up to 1000).
XLStatistics -- a set of Excel (ver 5+) workbooks for statistical analysis of data. A step-by-step guide to data analysis with separate workbooks for handling data with different numbers and types of variables. Contains most standard analyses, analyses using only summary data, power / sample size , nonparametrics, curve fitting , non-linear regression, analysis for 2x2 tables. XLStatistics is not an Excel add-in and all the working and code is visible. A free version for analysis of 1- and 2-variable data is available.

2 comments: