Title: | Convenience Functions for Routine Data Exploration |
---|---|
Description: | A series of shortcuts for routine tasks originally developed by Rafael A. Irizarry to facilitate data exploration. |
Authors: | Rafael A. Irizarry and Michael I. Love |
Maintainer: | Rafael A. Irizarry <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.0.0 |
Built: | 2025-02-20 04:14:03 UTC |
Source: | https://github.com/cran/rafalib |
Converts a vector of characters into factors and then converts these into numeric.
as.fumeric(x, levels = unique(x))
x |
a character vector |
levels |
the levels to be used in the call to factor |
Rafael A. Irizarry
group = c("a","a","b","b")
plot(seq_along(group),col=as.fumeric(group))
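The conversion described above can be sketched in base R. This is a minimal illustration rather than the package's source, and `to_codes` is a hypothetical helper name introduced only for this example.

```r
# Base-R sketch of the "character -> factor -> numeric" conversion.
# `to_codes` is a hypothetical name, not a function from the package.
to_codes <- function(x, levels = unique(x)) {
  # Levels follow order of first appearance, so the first distinct
  # value maps to 1, the second to 2, and so on.
  as.numeric(factor(x, levels = levels))
}

group <- c("a", "a", "b", "b")
codes <- to_codes(group)
print(codes)
```

Because the codes are small positive integers, they can be passed directly as a `col` argument, which is what the example above does with `as.fumeric`.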
Plot the overlap of three groups with a barplot
bartab(x, y, z, names, skipNone = FALSE, ...)
x |
logical |
y |
logical |
z |
logical |
names |
a character vector of length 3 |
skipNone |
remove the "none" group |
... |
further arguments passed on to |
Michael I. Love
Produces an image of a matrix which matches the natural orientation.
imagemat(x, col = colorRampPalette(c("white", "black"))(9), las = 1, xlab = "", ylab = "", ...)
x |
the matrix |
col |
the colors |
las |
as in par |
xlab |
x-axis title |
ylab |
y-axis title |
... |
arguments passed to image |
Michael I. Love
x <- matrix(c(1,0,0,0,1,
              1,1,0,1,1,
              1,0,1,0,1,
              1,0,0,0,1,
              1,0,0,0,1), ncol=5, byrow=TRUE)
imagemat(x)
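For context, base `image()` draws the first row of a matrix along the bottom-left and treats rows as the x-axis, so a matrix appears rotated. One way to obtain the "natural" orientation is to reverse the rows and transpose before plotting. The sketch below shows that transformation under the assumption that this is the intended effect; it is not taken from the package's source.

```r
# Sketch: reorient a matrix so that image() shows row 1 at the top
# and column 1 at the left (an assumption about the intended layout).
x <- matrix(c(1,0,0,0,1,
              1,1,0,1,1,
              1,0,1,0,1,
              1,0,0,0,1,
              1,0,0,0,1), ncol = 5, byrow = TRUE)

# Reverse the row order, then transpose: z[j, k] == x[nrow(x) - k + 1, j]
z <- t(x[nrow(x):1, ])

# Only draw when running interactively
if (interactive()) {
  image(z, col = colorRampPalette(c("white", "black"))(9), axes = FALSE)
}
```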
The rows are sorted such that the first column has 2 blocks, the second column has 4 blocks, etc. See example("imagesort").
imagesort(x, col = c("white", "black"), ...)
x |
a matrix of 0s and 1s |
col |
the colors of 0 and 1 |
... |
arguments to heatmap |
Michael I. Love
x <- replicate(4,sample(0:1,40,TRUE))
imagesort(x)
This function is simply a wrapper for biocLite. It first sources the code from the Bioconductor site, then calls biocLite.
install_bioc(...)
... |
arguments passed on to |
Note that once you run this function in a session, you no longer need to call it again, since you can call biocLite directly.
Rafael A. Irizarry
This function looks through all the objects in the global environment and lists the n largest.
largeobj(n = 5, units = "Mb")
n |
the number of objects to return |
units |
units to display, see |
a named character vector giving the sizes of the 'n' largest objects
Michael I. Love
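The idea behind this function can be illustrated with base R's `object.size`. The sketch below is not the package's implementation. It uses a scratch environment (`env`, a name chosen for this example) instead of the global environment so that it runs the same way in any session.

```r
# Sketch: find the largest objects in an environment and format their sizes.
env <- new.env()
assign("small",  1:10,          envir = env)
assign("medium", numeric(1e4),  envir = env)
assign("large",  numeric(1e6),  envir = env)

# Size in bytes of every object, named by object
sizes <- sapply(ls(env), function(nm) object.size(get(nm, envir = env)))

# Keep the two largest
top <- sort(sizes, decreasing = TRUE)[1:2]

# Format each size in megabytes, giving a named character vector
formatted <- sapply(top, function(s) {
  format(structure(s, class = "object_size"), units = "Mb")
})
print(formatted)
```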
Takes two vectors x and y and plots M=y-x versus A=(x+y)/2. If the vectors are longer than n, the data are sampled down to size n. A smooth curve is added to show trends.
maplot(x, y, n = 10000, subset = NULL, xlab = NULL, ylab = NULL, curve.add = TRUE, curve.col = 2, curve.span = 1/2, curve.lwd = 2, curve.n = 2000, ...)
x |
a numeric vector |
y |
a numeric vector |
n |
a numeric value. If |
subset |
index of the points to be plotted |
xlab |
a title for the x axis |
ylab |
a title for the y axis |
curve.add |
if |
curve.col |
a numeric value that determines the color of the smooth curve |
curve.span |
is passed on to |
curve.lwd |
the line width for the smooth curve |
curve.n |
a numeric value that determines the sample size used to fit the curve. This makes fitting the curve faster with large datasets |
... |
further arguments passed to |
Rafael A. Irizarry
n <- 10000
signal <- runif(n,4,15)
bias <- (signal/5 - 2)^2
x <- signal + rnorm(n)
y <- signal + bias + rnorm(n)
maplot(x,y)
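The quantities in an MA plot can be computed by hand, which may help when interpreting the figure. The sketch below reuses the simulated data from the example and fits the trend with `loess` on a random subsample. Using `loess` as the smoother is an assumption made for illustration, and the variable names are chosen for this example only.

```r
# Sketch: compute M and A by hand and fit a smooth trend on a subsample.
set.seed(1)
n <- 10000
signal <- runif(n, 4, 15)
bias <- (signal / 5 - 2)^2
x <- signal + rnorm(n)
y <- signal + bias + rnorm(n)

M <- y - x          # difference
A <- (x + y) / 2    # average

# Fit the curve on 2000 random points to keep it fast (mirrors curve.n)
idx <- sample(length(A), 2000)
fit <- loess(M ~ A, data = data.frame(M = M[idx], A = A[idx]), span = 1/2)

# Evaluate the trend on a grid inside the fitted range
grid <- seq(min(A[idx]), max(A[idx]), length.out = 100)
trend <- predict(fit, newdata = data.frame(A = grid))

# Only draw when running interactively
if (interactive()) {
  plot(A, M)
  lines(grid, trend, col = 2, lwd = 2)
}
```

With this simulation the trend is curved rather than flat at zero, reflecting the quadratic bias added to y.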
Called without arguments, this function optimizes graphical parameters for the RStudio plot window. bigpar uses big fonts which are good for presentations.
mypar(a = 1, b = 1, brewer.n = 8, brewer.name = "Dark2", cex.lab = 1, cex.main = 1.2, cex.axis = 1, mar = c(2.5, 2.5, 1.6, 1.1), mgp = c(1.5, 0.5, 0), ...)
a |
the first entry of the vector passed to |
b |
the second entry of the vector passed to |
brewer.n |
parameter |
brewer.name |
parameters |
cex.lab |
passed on to |
cex.main |
passed on to |
cex.axis |
passed on to |
mar |
passed on to |
mgp |
passed on to |
... |
other parameters passed on to |
Rafael A. Irizarry
mypar()
plot(cars)
bigpar()
plot(cars)
Modification of plclust for plotting hclust objects *in colour*!
myplclust(hclust, labels = hclust$labels, lab.col = rep(1, length(hclust$labels)), hang = 0.1, xlab = "", sub = "", ...)
hclust |
hclust object |
labels |
a character vector of labels of the leaves of the tree |
lab.col |
colour for the labels; NA=default device foreground colour |
hang |
|
xlab |
title for x-axis (defaults to no title) |
sub |
subtitle (defaults to no subtitle) |
... |
further arguments passed to |
Eva KF Chan
Make a plot with nothing in it
nullplot(x1 = 0, x2 = 1, y1 = 0, y2 = 1, xlab = "", ylab = "", ...)
x1 |
lowest x-axis value |
x2 |
largest x-axis value |
y1 |
lowest y-axis value |
y2 |
largest y-axis value |
xlab |
x-axis title, defaults to no title |
ylab |
y-axis title, defaults to no title |
... |
further arguments passed on to plot |
Returns a character vector which shows the top n lines of a file
peek(x, n = 2)
x |
a filename |
n |
the number of lines to return |
Michael I. Love
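A comparable result can be obtained in base R with `readLines` and its `n` argument. The sketch below writes a small temporary file (its contents are made up for this example) and reads back the first two lines. It illustrates the behavior described above and is not the package's source.

```r
# Sketch: read only the first n lines of a file with base R.
path <- tempfile(fileext = ".txt")
writeLines(c("id,value", "1,3.2", "2,4.8", "3,5.1"), path)

top <- readLines(path, n = 2)   # a character vector, one element per line
print(top)

unlink(path)                    # clean up the temporary file
```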
Returns the population standard deviation. Note that sd returns the square root of the unbiased sample estimate of the population variance. This function simply multiplies the result of var by (n-1) / n, with n the population size, and takes the square root.
popsd(x, na.rm = FALSE)
x |
a numeric vector or an R object which is coercible to one by |
na.rm |
logical. Should missing values be removed? |
Returns the population variance. Note that var returns the unbiased sample estimate of the population variance. This function simply multiplies the result of var by (n-1) / n, with n the population size.
popvar(x, ...)
x |
a numeric vector, matrix or data frame. |
... |
further arguments passed along to |
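The (n-1) / n rescaling used for the population variance and standard deviation can be checked by hand in base R. The data below are made up for this example, and the variable names are illustrative only.

```r
# Sketch: population variance and standard deviation from var().
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

pop_var <- var(x) * (n - 1) / n   # rescale the unbiased sample variance
pop_sd  <- sqrt(pop_var)          # population standard deviation

# Same quantity computed directly as the mean squared deviation
direct <- mean((x - mean(x))^2)

print(c(pop_var = pop_var, pop_sd = pop_sd, direct = direct))
```

For these eight values the mean is 5 and the squared deviations sum to 32, so the population variance is 4 and the population standard deviation is 2, whereas `var(x)` gives 32/7.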
Draws points or boxes depending on sample size
sboxplot(x, ...)
x |
a named list of numeric vectors |
... |
further arguments passed on to |
sboxplot(list(a=rnorm(15),b=rnorm(75),c=rnorm(10000)))
A smooth histogram with unit indicator (we're simply scaling the kernel density estimate). The advantage of this plot is its interpretability, since the height of the curve represents the frequency of an interval of size unit around the point in question. Another advantage is that if z is a matrix, curves are plotted together.
shist(z, unit, bw = "nrd0", n, from, to, plotHist = FALSE, add = FALSE, xlab, ylab = "Frequency", xlim, ylim, main, ...)
z |
the data |
unit |
the unit which determines the y axis scaling and is drawn |
bw |
arguments to density |
n |
arguments to density |
from |
arguments to density |
to |
arguments to density |
plotHist |
a logical: should an actual histogram be drawn under curve? |
add |
a logical: should the curve be added to an existing plot? |
xlab |
x-axis title, defaults to no title |
ylab |
y-axis title, defaults to "Frequency" |
xlim |
range of the x-axis |
ylim |
range of the y-axis |
main |
an overall title for the plot: see |
... |
arguments to lines |
set.seed(1)
x = rnorm(50)
par(mfrow=c(2,1))
hist(x, breaks=-5:5)
shist(x, unit=1, xlim=c(-5,5))
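The scaling mentioned in the description can be sketched with base R's `density`. A kernel density estimate integrates to about 1, so multiplying it by the number of observations and by the interval width gives a curve whose height approximates the count expected in an interval of that width. Treating the scale factor as exactly `length(x) * unit` is an assumption made for this illustration.

```r
# Sketch: rescale a kernel density estimate to a frequency scale.
set.seed(1)
x <- rnorm(50)
unit <- 1

d <- density(x, bw = "nrd0")
freq <- d$y * length(x) * unit   # expected count per interval of width `unit`

# Riemann-sum check: the area under the scaled curve is about N * unit
area <- sum(freq) * diff(d$x[1:2])
print(area)

# Only draw when running interactively
if (interactive()) {
  plot(d$x, freq, type = "l", xlab = "", ylab = "Frequency")
}
```

Read this way, the curve is directly comparable to the bar heights of `hist(x, breaks=-5:5)`, whose bins also have width 1.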
Creates a list of indexes for each unique entry of x
splitit(x)
x |
a vector |
x <- c("a","a","b","a","b","c","b","b")
splitit(x)
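A similar index list can be built in base R by splitting the positions of a vector by its values. This sketch shows the idea and is not the package's implementation. Note that `split` orders the groups by sorted value.

```r
# Sketch: positions of each distinct value, using base R only.
x <- c("a", "a", "b", "a", "b", "c", "b", "b")

idx <- split(seq_along(x), x)   # named list of integer index vectors
print(idx)

# The indexes can then be used to summarise another vector by group
values <- c(10, 20, 30, 40, 50, 60, 70, 80)
group_means <- sapply(idx, function(i) mean(values[i]))
```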
If there are more than n points (10,000 by default), a random subset of n points is plotted. You can also specify a specific subset to plot. If the length of subset is larger than n, a random sample is still used to reduce data size.
splot(x, y, n = 10000, subset = NULL, xlab = NULL, ylab = NULL, ...)
x |
the x data |
y |
the y data |
n |
the number to subset |
subset |
explicit subset index (optional). |
xlab |
title for the x-axis |
ylab |
title for the y-axis |
... |
further parameters passed on to |
x <- rnorm(1e5)
y <- rnorm(1e5)
splot(x,y,pch=16,col=rgb(0,0,0,.25))
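The subsampling step can be written out in base R as follows. This is a sketch of the general technique (sample indexes without replacement, then plot only those points), not the package's code, and `keep` is a name chosen for this example.

```r
# Sketch: plot a random subset when the data exceed n points.
set.seed(1)
x <- rnorm(1e5)
y <- rnorm(1e5)
n <- 10000

# Sample n distinct positions if there are more than n points
keep <- if (length(x) > n) sample(length(x), n) else seq_along(x)

# Only draw when running interactively
if (interactive()) {
  plot(x[keep], y[keep], pch = 16, col = rgb(0, 0, 0, .25))
}
```

Sampling the same positions for x and y keeps the pairs aligned, so the scatter keeps its overall shape while far fewer points are drawn.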