この節の作者: Ravi Selker, Jonathon Love, Damian Dropmann
Descriptives (descriptives
)¶
Description¶
Descriptives are an assortment of summarising statistics, and visualizations which allow exploring the shape and distribution of data. It is good practice to explore your data with descriptives before proceeding to more formal tests.
Usage¶
descriptives(
data,
vars,
splitBy = NULL,
freq = FALSE,
hist = FALSE,
dens = FALSE,
bar = FALSE,
barCounts = FALSE,
box = FALSE,
violin = FALSE,
dot = FALSE,
dotType = "jitter",
qq = FALSE,
n = TRUE,
missing = TRUE,
mean = TRUE,
median = TRUE,
mode = FALSE,
sum = FALSE,
sd = TRUE,
variance = FALSE,
range = FALSE,
min = TRUE,
max = TRUE,
se = FALSE,
skew = FALSE,
kurt = FALSE,
sw = FALSE,
quart = FALSE,
pcEqGr = FALSE,
pcNEqGr = 4,
formula
)
Arguments¶
data |
the data as a data frame |
vars |
a vector of strings naming the variables of interest in data |
splitBy |
a vector of strings naming the variables used to split vars |
freq |
TRUE or FALSE (default), provide frequency tables (nominal, ordinal variables only) |
hist |
TRUE or FALSE (default), provide histograms (continuous variables only) |
dens |
TRUE or FALSE (default), provide density plots (continuous variables only) |
bar |
TRUE or FALSE (default), provide bar plots (nominal, ordinal variables only) |
barCounts |
TRUE or FALSE (default), add counts to the bar plots |
box |
TRUE or FALSE (default), provide box plots (continuous variables only) |
violin |
TRUE or FALSE (default), provide violin plots (continuous variables only) |
dot |
TRUE or FALSE (default), provide dot plots (continuous variables only) |
dotType |
. |
qq |
TRUE or FALSE (default), provide Q-Q plots (continuous variables only) |
n |
TRUE (default) or FALSE , provide the sample size |
missing |
TRUE (default) or FALSE , provide the number of missing values |
mean |
TRUE (default) or FALSE , provide the mean |
median |
TRUE (default) or FALSE , provide the median |
mode |
TRUE or FALSE (default), provide the mode |
sum |
TRUE or FALSE (default), provide the sum |
sd |
TRUE (default) or FALSE , provide the standard deviation |
variance |
TRUE or FALSE (default), provide the variance |
range |
TRUE or FALSE (default), provide the range |
min |
TRUE or FALSE (default), provide the minimum |
max |
TRUE or FALSE (default), provide the maximum |
se |
TRUE or FALSE (default), provide the standard error |
skew |
TRUE or FALSE (default), provide the skewness |
kurt |
TRUE or FALSE (default), provide the kurtosis |
sw |
TRUE or FALSE (default), provide Shapiro-Wilk p-value |
quart |
TRUE or FALSE (default), provide quartiles |
pcEqGr |
TRUE or FALSE (default), provide quantiles |
pcNEqGr |
an integer (default: 4) specifying the number of equal groups |
formula |
(optional) the formula to use, see the examples |
Output¶
A results object containing:
results$descriptives |
a table of the descriptive statistics |
results$frequencies |
an array of frequency tables |
Tables can be converted to data frames with asDF
or
as.data.frame()
. For example:
results$descriptives$asDF
as.data.frame(results$descriptives)
Examples¶
data('mtcars')
dat <- mtcars
# frequency tables can be provided for factors
dat$gear <- as.factor(dat$gear)
descriptives(dat, vars = vars(mpg, cyl, disp, gear), freq = TRUE)
#
# DESCRIPTIVES
#
# Descriptives
# -------------------------------------------
# mpg cyl disp gear
# -------------------------------------------
# N 32 32 32 32
# Missing 0 0 0 0
# Mean 20.1 6.19 231 3.69
# Median 19.2 6.00 196 4.00
# Minimum 10.4 4.00 71.1 3
# Maximum 33.9 8.00 472 5
# -------------------------------------------
#
#
# FREQUENCIES
#
# Frequencies of gear
# --------------------
# Levels Counts
# --------------------
# 3 15
# 4 12
# 5 5
# --------------------
#
# spliting by a variable
descriptives(formula = disp + mpg ~ cyl, dat,
median=F, min=F, max=F, n=F, missing=F)
# providing histograms
descriptives(formula = mpg ~ cyl, dat, hist=T,
median=F, min=F, max=F, n=F, missing=F)
# splitting by multiple variables
descriptives(formula = mpg ~ cyl:gear, dat,
median=F, min=F, max=F, missing=F)