Rle-class {IRanges}    R Documentation
The Rle class is a general container for an atomic vector stored in
run-length encoded form. It is based on the rle function from the base
package.
Rle(values): This constructor creates an Rle instance out of an atomic
vector values.
Rle(values, lengths): This constructor creates an Rle instance out of an
atomic vector or factor object values and an integer or numeric vector
lengths whose elements are all positive and give the number of times each
value is repeated. The two vectors must have the same length.
as(from, "Rle"): This constructor creates an Rle instance out of an atomic
vector from.
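As a brief illustration of the three constructors, the sketch below builds the same Rle three ways and compares the result with base::rle. It assumes the IRanges package is installed and attached (in recent Bioconductor releases the Rle class itself is provided by S4Vectors, which IRanges loads); the variable names are illustrative only.

```r
library(IRanges)  # assumption: installed; makes Rle() and its accessors available

v <- c(1, 1, 1, 2, 2, 3)

x1 <- Rle(v)                           # runs are detected from the full vector
x2 <- Rle(c(1, 2, 3), c(3L, 2L, 1L))   # explicit run values and run lengths
x3 <- as(v, "Rle")                     # coercion from an atomic vector

# All three expand back to the original vector.
stopifnot(identical(as.vector(x1), v),
          identical(as.vector(x2), v),
          identical(as.vector(x3), v))

# base::rle() finds the same runs.
r <- rle(v)
stopifnot(all(runValue(x1) == r$values),
          all(runLength(x1) == r$lengths))
```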
In the code snippets below, x is an Rle object:
runLength(x): Returns the run lengths for x.
runValue(x): Returns the run values for x.
nrun(x): Returns the number of runs in x.
start(x): Returns the starts of the runs for x.
end(x): Returns the ends of the runs for x.
width(x): Same as runLength(x).
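The accessors are related by simple arithmetic: the run ends are the cumulative sum of the run lengths, and each start is the end minus the width plus one. A minimal sketch, assuming the IRanges package is attached and using an illustrative three-run object:

```r
library(IRanges)  # assumption: installed

x <- Rle(c("a", "b", "c"), c(2L, 3L, 1L))   # expands to a a b b b c

runLength(x)   # lengths of the three runs
runValue(x)    # one value per run
nrun(x)        # number of runs

# Run ends are cumulative run lengths; starts follow from ends and widths.
stopifnot(all(end(x) == cumsum(runLength(x))),
          all(start(x) == end(x) - width(x) + 1L),
          all(width(x) == runLength(x)))
```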
In the code snippets below, x is an Rle object:
runLength(x) <- value: Replaces x with a new Rle object using run values
runValue(x) and run lengths value.
runValue(x) <- value: Replaces x with a new Rle object using run values
value and run lengths runLength(x).
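A short sketch of the two replacement methods, assuming the IRanges package is attached; the values are arbitrary and chosen only so the expanded vectors are easy to read:

```r
library(IRanges)  # assumption: installed

x <- Rle(c(1, 2, 3), c(2L, 3L, 1L))   # expands to 1 1 2 2 2 3

# Keep the run lengths, swap in new run values.
runValue(x) <- c(10, 20, 30)
stopifnot(all(as.vector(x) == c(10, 10, 20, 20, 20, 30)))

# Keep the run values, change how often each one is repeated.
runLength(x) <- c(1L, 1L, 2L)
stopifnot(all(as.vector(x) == c(10, 20, 30, 30)))
```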
In the code snippets below, x and from are Rle objects:
as.vector(x, mode="any"), as(from, "vector"): Creates an atomic vector
based on the values contained in x. The vector will be coerced to the
requested mode, unless mode is "any", in which case the most appropriate
type is chosen.
as.vectorORfactor(x): Creates an atomic vector or factor, based on the
type of values contained in x. This is the most general way to decompress
the Rle to a native R data structure.
as.logical(x), as(from, "logical"): Creates a logical vector based on the
values contained in x.
as.integer(x), as(from, "integer"): Creates an integer vector based on the
values contained in x.
as.numeric(x), as(from, "numeric"): Creates a numeric vector based on the
values contained in x.
as.complex(x), as(from, "complex"): Creates a complex vector based on the
values contained in x.
as.character(x), as(from, "character"): Creates a character vector based
on the values contained in x.
as.raw(x), as(from, "raw"): Creates a raw vector based on the values
contained in x.
as.factor(x), as(from, "factor"): Creates a factor object based on the
values contained in x.
as.data.frame(x), as(from, "data.frame"): Creates a data.frame with a
single column holding the result of as.vector(x).
as(from, "IRanges"): Creates an IRanges instance from a logical Rle. Note
that this instance is guaranteed to be normal.
as(from, "NormalIRanges"): Creates a NormalIRanges instance from a logical
Rle.
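The following sketch shows a few of these coercions, including the logical-Rle-to-IRanges case, where each run of TRUE values becomes one range. It assumes the IRanges package is attached; the objects are illustrative.

```r
library(IRanges)  # assumption: installed

x <- Rle(c(1L, 2L, 3L), c(2L, 1L, 2L))   # expands to 1 1 2 3 3

as.vector(x)      # decompressed integer vector
as.numeric(x)     # same values as doubles
as.character(x)   # same values as strings

# Each run of TRUE in a logical Rle becomes one range.
lg <- Rle(c(TRUE, FALSE, TRUE), c(2L, 3L, 1L))   # T T F F F T
ir <- as(lg, "IRanges")
stopifnot(all(start(ir) == c(1, 6)),
          all(end(ir) == c(2, 6)))
```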
Rle objects have support for S4 group generic functionality:
Arith: "+", "-", "*", "^", "%%", "%/%", "/"
Compare: "==", ">", "<", "!=", "<=", ">="
Logic: "&", "|"
Ops: "Arith", "Compare", "Logic"
Math: "abs", "sign", "sqrt", "ceiling", "floor", "trunc", "cummax",
"cummin", "cumprod", "cumsum", "log", "log10", "log2", "log1p", "acos",
"acosh", "asin", "asinh", "atan", "atanh", "exp", "expm1", "cos", "cosh",
"sin", "sinh", "tan", "tanh", "gamma", "lgamma", "digamma", "trigamma"
Math2: "round", "signif"
Summary: "max", "min", "range", "prod", "sum", "any", "all"
Complex: "Arg", "Conj", "Im", "Mod", "Re"
See S4groupGeneric for more details.
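As a small illustration of the group generics, the sketch below applies Arith, Compare, Math and Summary operations to an Rle and checks the results against the expanded vector. It assumes the IRanges package is attached; the data are arbitrary.

```r
library(IRanges)  # assumption: installed

x <- Rle(c(1, 2), c(3L, 2L))   # expands to 1 1 1 2 2

# Arith: the result is again an Rle.
y <- x^2 + 2 * x + 1
stopifnot(is(y, "Rle"),
          all(as.vector(y) == c(4, 4, 4, 9, 9)))

# Compare: yields a logical Rle.
gt <- x > 1
stopifnot(all(as.vector(gt) == c(FALSE, FALSE, FALSE, TRUE, TRUE)))

# Math and Summary.
stopifnot(all(as.vector(sqrt(y)) == c(2, 2, 2, 3, 3)),
          sum(x) == 7,
          max(x) == 2)
```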
In the code snippets below, x is an Rle object:
x[i, drop=getOption("dropRle", default=FALSE)]: Subsets x by index i,
where i can be positive integers, negative integers, a logical vector of
the same length as x, an Rle object of the same length as x containing
logical values, or an IRanges object. When drop=FALSE, returns an Rle
object. When drop=TRUE, returns an atomic vector.
x[i] <- value: Replaces elements in x specified by i with corresponding
elements in value. Supports the same types for i as x[i].
x %in% table: Returns a logical Rle representing set membership in table.
aggregate(x, by, FUN, start = NULL, end = NULL, width = NULL,
frequency = NULL, delta = NULL, ..., simplify = TRUE): Generates summaries
on the specified windows and returns the result in a convenient form:
  by: An object with start, end, and width methods.
  FUN: The function, found via match.fun, to be applied to each window of
  x.
  start, end, width: The start, end, or width of the window. If by is
  missing, two of the three must be supplied.
  frequency, delta: Optional arguments that specify the sampling frequency
  and increment within the window.
  ...: Further arguments for FUN.
  simplify: A logical value specifying whether or not the result should be
  simplified to a vector or matrix if possible.
append(x, values, after = length(x)): Inserts one Rle into another Rle.
  values: The Rle to insert.
  after: The subscript in x after which the values are to be inserted.
c(x, ...): Combines a set of Rle objects.
findRange(x, vec): Returns an IRanges object representing the ranges in
Rle vec that are referenced by the indices in the integer vector x.
findRun(x, vec): Returns an integer vector indicating the run indices in
Rle vec that are referenced by the indices in the integer vector x.
head(x, n = 6L): If n is non-negative, returns the first n elements of x.
If n is negative, returns all but the last abs(n) elements of x.
is.na(x): Returns a logical Rle indicating which values are NA.
is.unsorted(x, na.rm = FALSE, strictly = FALSE): Returns a logical value
specifying if x is unsorted.
  na.rm: Remove missing values from the check.
  strictly: Check for strictly increasing values.
length(x): Returns the underlying vector length of x.
match(x, table, nomatch = NA_integer_, incomparables = NULL): Matches the
values in x to table:
  table: The values to be matched against.
  nomatch: The value to be returned in the case when no match is found.
  incomparables: A vector of values that cannot be matched. Any value in x
  matching a value in this vector is assigned the nomatch value.
rep(x, times, length.out, each), rep.int(x, times): Repeats the values in
x through one of the following conventions:
  times: Vector giving the number of times to repeat each element if of
  length length(x), or to repeat the whole vector if of length 1.
  length.out: Non-negative integer. The desired length of the output
  vector.
  each: Non-negative integer. Each element of x is repeated each times.
rev(x): Reverses the order of the values in x.
shiftApply(SHIFT, X, Y, FUN, ..., OFFSET = 0L, simplify = TRUE, verbose = FALSE):
Let i be the indices in SHIFT,
X_i = window(X, 1 + OFFSET, length(X) - SHIFT[i]), and
Y_i = window(Y, 1 + SHIFT[i], length(Y) - OFFSET). Calculates the set of
FUN(X_i, Y_i, ...) values and returns the results in a convenient form:
  SHIFT: A non-negative integer vector of shift values.
  X, Y: The Rle objects to shift.
  FUN: The function, found via match.fun, to be applied to each set of
  shifted vectors.
  ...: Further arguments for FUN.
  OFFSET: A non-negative integer offset to maintain throughout the shift
  operations.
  simplify: A logical value specifying whether or not the result should be
  simplified to a vector or matrix if possible.
  verbose: A logical value specifying whether or not to print the i
  indices to track the iterations.
show(object): Prints out the Rle object in a user-friendly way.
order(..., na.last = TRUE, decreasing = FALSE): Returns a permutation
which rearranges its first argument into ascending or descending order,
breaking ties by further arguments. See order.
sort(x, decreasing = FALSE, na.last = NA): Sorts the values in x.
  decreasing: If TRUE, sort values in decreasing order. If FALSE, sort
  values in increasing order.
  na.last: If TRUE, missing values are placed last. If FALSE, they are
  placed first. If NA, they are removed.
split(x, f, drop=FALSE): Splits x according to f to create a
CompressedRleList object. If f is a list-like object then drop is ignored
and f is treated as if it was rep(seq_len(length(f)), sapply(f, length)),
so the returned object has the same shape as f (it also receives the names
of f). Otherwise, if f is not a list-like object, empty list elements are
removed from the returned object if drop is TRUE.
splitRanges(x): Returns a CompressedIRangesList object that contains the
ranges for each of the unique run values.
subset(x, subset): Returns a new Rle object made of the subset using
logical vector subset.
summary(object, ..., digits = max(3, getOption("digits") - 3)): Summarizes
the Rle object using an atomic vector convention. The digits argument is
used for number formatting with signif().
table(...): Returns a table containing the counts of the unique values.
Supported arguments include useNA with values of ‘no’ and ‘ifany’.
Multiple Rle's must be combined with c() before calling table.
tail(x, n = 6L): If n is non-negative, returns the last n elements of x.
If n is negative, returns all but the first abs(n) elements of x.
unique(x, incomparables = FALSE, ...): Returns the unique run values. The
incomparables argument takes a vector of values that cannot be compared,
with FALSE being a special value that means that all values can be
compared.
window(x, start=NA, end=NA, width=NA, frequency=NULL, delta=NULL, ...):
Extracts the subsequence window from x specified by:
  start, end, width: The start, end, or width of the window. Two of the
  three are required.
  frequency, delta: Optional arguments that specify the sampling frequency
  and increment within the window.
window(x, start=NA, end=NA, width=NA) <- value: Replaces the subsequence
window specified on the left (i.e. the subsequence in x specified by
start, end and width) by value. value must either be of class Rle, belong
to a subclass of Rle, or be coercible to Rle or a subclass of Rle. The
elements of value are repeated to create an Rle with the same number of
elements as the width of the subsequence window it is replacing.
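To make the window semantics concrete, the sketch below extracts a window from an Rle and then uses window() to spell out the X_i and Y_i definitions given for shiftApply. It assumes the IRanges package is attached. The helper naive_shift_apply is hypothetical, written only to mirror those definitions; it is not the package's implementation and assumes X and Y have equal length and every shift is smaller than that length.

```r
library(IRanges)  # assumption: installed

x <- Rle(10:1, 1:10)                 # 10, 9 9, 8 8 8, ... (55 elements)
w <- window(x, start = 4, end = 14)  # elements 4 through 14
stopifnot(all(as.vector(w) == c(8, 8, 8, 7, 7, 7, 7, 6, 6, 6, 6)))

# Hypothetical helper mirroring the X_i / Y_i definitions for shiftApply.
naive_shift_apply <- function(SHIFT, X, Y, FUN, OFFSET = 0L) {
  sapply(SHIFT, function(s) {
    Xi <- window(X, start = 1L + OFFSET, end = length(X) - s)
    Yi <- window(Y, start = 1L + s, end = length(Y) - OFFSET)
    FUN(Xi, Yi)
  })
}

X <- Rle(c(1, 2, 3, 4, 5))
Y <- Rle(c(5, 4, 3, 2, 1))
res <- naive_shift_apply(0:2, X, Y, function(a, b) sum(a * b))
stopifnot(all(res == c(35, 20, 10)))
```

Each shift drops elements from the end of X and from the beginning of Y, so the two windows always have the same length and can be combined element-wise.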
In the code snippets below, x is an Rle object:
!x: Returns logical negation (NOT) of x.
which(x): Returns an integer vector representing the TRUE indices of x.
ifelse(x, yes, no): For each element of x, returns the corresponding
element in yes if TRUE, otherwise the element in no. yes and no may be Rle
objects or anything else coercible to a vector.
In the code snippets below, x is an Rle object:
diff(x, lag = 1, differences = 1): Returns suitably lagged and iterated
differences of x.
  lag: An integer indicating which lag to use.
  differences: An integer indicating the order of the difference.
pmax(..., na.rm = FALSE), pmax.int(..., na.rm = FALSE): Parallel maxima of
the Rle input values. Removes NAs when na.rm = TRUE.
pmin(..., na.rm = FALSE), pmin.int(..., na.rm = FALSE): Parallel minima of
the Rle input values. Removes NAs when na.rm = TRUE.
which.max(x): Returns the index of the first element matching the maximum
value of x.
mean(x, na.rm = FALSE): Calculates the mean of x. Removes NAs when
na.rm = TRUE.
var(x, y = NULL, na.rm = FALSE): Calculates the variance of x or
covariance of x and y if both are supplied. Removes NAs when na.rm = TRUE.
cov(x, y, use = "everything"), cor(x, y, use = "everything"): Calculates
the covariance and correlation respectively of Rle objects x and y. The
use argument is an optional character string giving a method for computing
covariances in the presence of missing values. This must be (an
abbreviation of) one of the strings "everything", "all.obs",
"complete.obs", "na.or.complete", or "pairwise.complete.obs".
sd(x, na.rm = FALSE): Calculates the standard deviation of x. Removes NAs
when na.rm = TRUE.
median(x, na.rm = FALSE): Calculates the median of x. Removes NAs when
na.rm = TRUE.
quantile(x, probs = seq(0, 1, 0.25), na.rm = FALSE, names = TRUE, type = 7, ...):
Calculates the specified quantiles of x.
  probs: A numeric vector of probabilities with values in [0,1].
  na.rm: If TRUE, removes NAs from x before the quantiles are computed.
  names: If TRUE, the result has names describing the quantiles.
  type: An integer between 1 and 9 selecting one of the nine quantile
  algorithms detailed in quantile.
  ...: Further arguments passed to or from other methods.
mad(x, center = median(x), constant = 1.4826, na.rm = FALSE, low = FALSE, high = FALSE):
Calculates the median absolute deviation of x.
  center: The center to calculate the deviation from.
  constant: The scale factor.
  na.rm: If TRUE, removes NAs from x before the mad is computed.
  low: If TRUE, compute the 'lo-median'.
  high: If TRUE, compute the 'hi-median'.
IQR(x, na.rm = FALSE): Calculates the interquartile range of x.
  na.rm: If TRUE, removes NAs from x before the IQR is computed.
smoothEnds(y, k = 3): Smooths the end points of an Rle y using
subsequently smaller medians and Tukey's end point rule at the very end.
  k: An integer indicating the width of the largest median window; must be
  odd.
runmean(x, k, endrule = c("drop", "constant"), na.rm = FALSE): Calculates
the means for fixed width running windows across x.
  k: An integer indicating the fixed width of the running window. Must be
  odd when endrule == "constant".
  endrule: A character string indicating how the values at the beginning
  and the end (of the data) should be treated.
    "drop": do not extend the running statistics to be the same length as
    the underlying vectors;
    "constant": copies the running statistic to the first values and
    analogously for the last ones, making the smoothed ends constant;
  na.rm: A logical indicating if NA and NaN values should be removed.
runmed(x, k, endrule = c("median", "keep", "drop", "constant")):
Calculates the medians for fixed width running windows across x.
  k: An integer indicating the fixed width of the running window. Must be
  odd when endrule != "drop".
  endrule: A character string indicating how the values at the beginning
  and the end (of the data) should be treated.
    "keep": keeps the first and last k2 values at both ends, where k2 is
    the half-bandwidth k2 = k %/% 2, i.e., y[j] = x[j] for
    j = 1, ..., k2 and (n-k2+1), ..., n;
    "constant": copies the running statistic to the first values and
    analogously for the last ones, making the smoothed ends constant;
    "median": the default, smooths the ends by using symmetrical medians
    of subsequently smaller bandwidth, but for the very first and last
    value where Tukey's robust end-point rule is applied, see smoothEnds.
runsum(x, k, endrule = c("drop", "constant"), na.rm = FALSE): Calculates
the sums for fixed width running windows across x.
  k: An integer indicating the fixed width of the running window. Must be
  odd when endrule == "constant".
  endrule: A character string indicating how the values at the beginning
  and the end (of the data) should be treated.
    "drop": do not extend the running statistics to be the same length as
    the underlying vectors;
    "constant": copies the running statistic to the first values and
    analogously for the last ones, making the smoothed ends constant;
  na.rm: A logical indicating if NA and NaN values should be removed.
runwtsum(x, k, wt, endrule = c("drop", "constant"), na.rm = FALSE):
Calculates the weighted sums for fixed width running windows across x.
  k: An integer indicating the fixed width of the running window. Must be
  odd when endrule == "constant".
  wt: A numeric vector of length k that provides the weights to use.
  endrule: A character string indicating how the values at the beginning
  and the end (of the data) should be treated.
    "drop": do not extend the running statistics to be the same length as
    the underlying vectors;
    "constant": copies the running statistic to the first values and
    analogously for the last ones, making the smoothed ends constant;
  na.rm: A logical indicating if NA and NaN values should be removed.
runq(x, k, i, endrule = c("drop", "constant"), na.rm = FALSE): Calculates
the order statistic for fixed width running windows across x.
  k: An integer indicating the fixed width of the running window. Must be
  odd when endrule == "constant".
  i: An integer indicating which order statistic to calculate.
  endrule: A character string indicating how the values at the beginning
  and the end (of the data) should be treated.
    "drop": do not extend the running statistics to be the same length as
    the underlying vectors;
    "constant": copies the running statistic to the first values and
    analogously for the last ones, making the smoothed ends constant;
  na.rm: A logical indicating if NA and NaN values should be removed.
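The running-window functions can be checked against a plain base R loop. The sketch below compares runsum and runmean with a naive sapply over every window of width k, and shows that endrule = "drop" shortens the result to length(x) - k + 1 while "constant" keeps the original length. It assumes the IRanges package is attached; naive_sum is an illustrative name.

```r
library(IRanges)  # assumption: installed

x0 <- c(1, 2, 3, 4, 5)
x  <- Rle(x0)
k  <- 3L

# Naive reference: sum of each width-k window of the plain vector.
naive_sum <- sapply(0:(length(x0) - k),
                    function(offset) sum(x0[1:k + offset]))

# Default endrule = "drop": one value per complete window.
rs <- runsum(x, k)
stopifnot(length(rs) == length(x) - k + 1L,
          all(as.vector(rs) == naive_sum))

# The running mean is the running sum divided by k.
rm3 <- runmean(x, k)
stopifnot(isTRUE(all.equal(as.vector(rm3), naive_sum / k)))

# endrule = "constant" pads the ends back to the original length.
rc <- runsum(x, k, endrule = "constant")
stopifnot(length(rc) == length(x))
```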
In the code snippets below, x is an Rle object:
nchar(x, type = "chars", allowNA = FALSE): Returns an integer Rle
representing the number of characters in the corresponding values of x.
  type: One of c("bytes", "chars", "width").
  allowNA: Should NA be returned for invalid multibyte strings rather than
  throwing an error?
substr(x, start, stop), substring(text, first, last = 1000000L): Returns a
character or factor Rle containing the specified substrings beginning at
start/first and ending at stop/last.
chartr(old, new, x): Returns a character or factor Rle containing a
translated version of x.
  old: A character string specifying the characters to be translated.
  new: A character string specifying the translations.
tolower(x): Returns a character or factor Rle containing a lower case
version of x.
toupper(x): Returns a character or factor Rle containing an upper case
version of x.
sub(pattern, replacement, x, ignore.case = FALSE,
perl = FALSE, fixed = FALSE, useBytes = FALSE): Returns a character or
factor Rle containing replacements based on matches determined by regular
expression matching. See sub for a description of the arguments.
gsub(pattern, replacement, x, ignore.case = FALSE,
perl = FALSE, fixed = FALSE, useBytes = FALSE): Returns a character or
factor Rle containing replacements based on matches determined by regular
expression matching. See gsub for a description of the arguments.
paste(..., sep = " ", collapse = NULL): Returns a character or factor Rle
containing a concatenation of the values in ....
In the code snippets below, x is an Rle object:
levels(x), levels(x) <- value: Gets and sets the factor levels,
respectively.
nlevels(x): Returns the number of factor levels.
In the code snippets below, x and y are Rle objects or some other
vector-like objects:
setdiff(x, y): Returns the unique elements in x that are not in y.
union(x, y): Returns the unique elements in either x or y.
intersect(x, y): Returns the unique elements in both x and y.
P. Aboyoun
rle, Vector-class, S4groupGeneric, IRanges-class
x <- Rle(10:1, 1:10)
x

runLength(x)
runValue(x)
nrun(x)

diff(x)
unique(x)
sort(x)
sqrt(x)
x^2 + 2 * x + 1
x[c(1,3,5,7,9)]
window(x, 4, 14)
range(x)
sum(x)
mean(x)
x > 4
aggregate(x, x > 4, mean)
aggregate(x, FUN = mean, start = 1:(length(x) - 50), end = 51:length(x))

x2 <- Rle(LETTERS[c(21:26, 25:26)], 8:1)
table(x2)

y <- Rle(c(TRUE,TRUE,FALSE,FALSE,TRUE,FALSE,TRUE,TRUE,TRUE))
y
as.vector(y)
rep(y, 10)
c(y, x > 5)

z <- c("the", "quick", "red", "fox", "jumps", "over", "the", "lazy", "brown", "dog")
z <- Rle(z, seq_len(length(z)))
chartr("a", "@", z)
toupper(z)

## ---------------------------------------------------------------------
## runsum, runmean, runwtsum, and runq functions
## ---------------------------------------------------------------------

## The .naive_runsum() function demonstrates the semantics of
## runsum(). This test ensures the behavior is consistent with
## base::sum().
.naive_runsum <- function(x, k, na.rm=FALSE)
    sapply(0:(length(x)-k),
           function(offset) sum(x[1:k + offset], na.rm=na.rm))

x0 <- c(1, Inf, 3, 4, 5, NA)
x <- Rle(x0)
target1 <- .naive_runsum(x0, 3, na.rm = TRUE)
target2 <- .naive_runsum(x, 3, na.rm = TRUE)
stopifnot(target1 == target2)
current <- as.vector(runsum(x, 3, na.rm = TRUE))
stopifnot(target1 == current)

## runmean() and runwtsum() :
x <- Rle(c(2, 1, NA, 0, 1, -Inf))
runmean(x, k = 3)
runmean(x, k = 3, na.rm = TRUE)
runwtsum(x, k = 3, wt = c(0.25, 0.50, 0.25))
runwtsum(x, k = 3, wt = c(0.25, 0.50, 0.25), na.rm = TRUE)

## runq() :
runq(x, k = 3, i = 1, na.rm = TRUE)   ## smallest value in window
runq(x, k = 3, i = 3, na.rm = TRUE)   ## largest value in window

## When na.rm = TRUE, it is possible the number of non-NA
## values in the window will be less than the 'i' specified.
## Here we request the 4th smallest value in the window,
## which translates to the value at the 4/5 (0.8) percentile.
x <- Rle(c(1, 2, 3, 4, 5))
runq(x, k=length(x), i=4, na.rm=TRUE)

## The same request on a Rle with two missing values
## finds the value at the 0.8 percentile of the vector
## at the new length of 3 after the NA's have been removed.
## This translates to round((0.8) * 3).
x <- Rle(c(1, 2, 3, NA, NA))
runq(x, k=length(x), i=4, na.rm=TRUE)