Package 'bdvis' reference manual

Title:	Biodiversity Data Visualizations
Description:	Provides a set of functions to create basic visualizations to quickly preview different aspects of biodiversity information such as inventory completeness, extent of coverage (taxonomic, temporal and geographic), gaps and biases. Barve & Otegui (2016) <DOI:10.1093/bioinformatics/btw333>.
Authors:	Vijay Barve [aut, cre], Javier Otegui [aut]
Maintainer:	Vijay Barve
License:	GPL-3
Version:	0.2.37
Built:	2024-11-02 03:43:30 UTC
Source:	https://github.com/vijaybarve/bdvis

Calendar heat map of biodiversity data

Description

Produces a heat map (https://en.wikipedia.org/wiki/Heat_map) representing the distribution of records in time.

Usage

bdcalendarheat(indf = NA, title = NA)

Arguments

`indf`	input data frame containing biodiversity data set
`title`	custom title for the plot

Details

The calendar heat map is a matrix-like plot in which each cell represents a unique date, and the color of the cell shows the number of records with that date. Rows are weekdays and columns are week numbers; each year has its own panel.
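
As a rough illustration of the layout described above, the following base R sketch derives the year (panel), week number (column) and weekday (row) for each date and counts the records per unique date. The sample dates and the `calendar_cells` helper are hypothetical, and bdcalendarheat may compute these values differently internally.

```r
# Hypothetical sample of collection dates (not taken from any real data set).
dates <- as.Date(c("2020-01-06", "2020-01-06", "2020-01-07", "2021-03-15"))

# Count records per unique date and derive the calendar position of each cell.
calendar_cells <- function(dates) {
  counts <- as.data.frame(table(date = format(dates, "%Y-%m-%d")),
                          stringsAsFactors = FALSE)
  d <- as.Date(counts$date)
  data.frame(
    date    = d,
    year    = as.integer(format(d, "%Y")),  # one panel per year
    week    = as.integer(format(d, "%W")),  # column: week of year, Monday first
    weekday = as.integer(format(d, "%u")),  # row: 1 = Monday ... 7 = Sunday
    records = counts$Freq                   # value mapped to the cell color
  )
}

cells <- calendar_cells(dates)
print(cells)
```

Each row of `cells` corresponds to one colored cell of the heat map; dates without records simply have no row.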

Value

No return value; called for its side effect of plotting the heat map

Examples

## Not run: 
bdcalendarheat(inat)

## End(Not run)

Computes completeness values of the dataset

Description

Computes completeness values for each cell. Currently returns the Chao2 index of species richness.

Usage

bdcomplete(indf, recs = 50, gridscale = 1)

Arguments

`indf`	input data frame containing biodiversity data set
`recs`	minimum number of records per grid cell required to make the calculations. Default is 50. If there are too few records, the function throws an error.
`gridscale`	size of the grid cells, in degrees, used for the calculations. Default is 1.

Details

After dividing the extent of the data set into cells (via the getcellid function), the function calculates the Chao2 estimator of species richness for each cell. Given the nature of the calculations, each cell needs a minimum number of records for the index to be computed properly. If the cells contain too few records, the function cannot finish and throws an error.
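
For orientation, the sketch below computes a Chao2 estimate and a completeness ratio for a single cell in base R. It uses the textbook incidence-based formula with the small-sample correction factor, falling back to the bias-corrected form when there are no duplicates. The `chao2` helper, the example matrix and the choice of sampling unit are assumptions; the exact variant that bdcomplete implements is not specified here.

```r
# Chao2 richness estimate from a species-by-sampling-unit incidence matrix.
# `incidence` is a hypothetical input: rows are species, columns are sampling
# units (for example, sampling days within one cell), entries are 0/1.
chao2 <- function(incidence) {
  incidence <- incidence > 0
  m    <- ncol(incidence)            # number of sampling units
  occ  <- rowSums(incidence)         # units in which each species occurs
  sobs <- sum(occ > 0)               # observed species
  q1   <- sum(occ == 1)              # uniques: species seen in one unit
  q2   <- sum(occ == 2)              # duplicates: species seen in two units
  corr <- (m - 1) / m                # small-sample correction factor
  sest <- if (q2 > 0) {
    sobs + corr * q1^2 / (2 * q2)    # classic form
  } else {
    sobs + corr * q1 * (q1 - 1) / 2  # bias-corrected form when Q2 = 0
  }
  c(Sobs = sobs, Sest = sest, c = sobs / sest)
}

# Four species observed across four sampling units (made-up values).
inc <- matrix(c(1, 1, 1, 0,
                1, 0, 0, 0,
                0, 1, 0, 0,
                0, 1, 1, 0), nrow = 4, byrow = TRUE)
res <- chao2(inc)
print(res)
```

The three values returned mirror the "Sobs", "Sest" and "c" columns described in the Value section.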

This function also plots the number of species against the completeness index to give an overview of the output. The returned data frame can be used to visualize the completeness of the data with the mapgrid function, setting ptype to "complete".

Value

data.frame with the columns

"Cell_id" - id of the cell
"nrec" - Number of records in the cell
"Sobs" - Number of Observed species
"Sest" - Estimated number of species
"c" - Completeness ratio of the cell

As a side effect, plots the number of species against completeness

Examples

## Not run: 
bdcomplete(inat)

## End(Not run)

Provides summary of biodiversity data

Description

Calculates some general indicators of the volume, spatial, temporal and taxonomic aspects of the provided data set.

Usage

bdsummary(indf)

Arguments

`indf`	input data frame containing biodiversity data set

Details

The function returns information on the volume of the data set (number of records), temporal coverage (minimum and maximum dates), taxonomic coverage (brief breakdown of the records by taxonomic levels) and spatial coverage (coordinates of the edges of the bounding box containing all records and division of covered area in degree cells) of the records.
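
The indicators listed above can be approximated in a few lines of base R. The sketch below is only an illustration: the `summarise_records` helper and the sample records are hypothetical, and the column names are assumed to be the four key fields (Latitude, Longitude, Date_collected, Scientific_name) used elsewhere in this manual.

```r
# Hypothetical records using the four key field names.
recs <- data.frame(
  Latitude        = c(18.52, 19.07, 12.97),
  Longitude       = c(73.85, 72.88, 77.59),
  Date_collected  = as.Date(c("2012-05-01", "2015-08-20", "2013-01-10")),
  Scientific_name = c("Naja naja", "Ptyas mucosa", "Naja naja")
)

# Volume, temporal, taxonomic and spatial indicators in one list.
summarise_records <- function(df) {
  list(
    n_records  = nrow(df),
    date_range = range(df$Date_collected, na.rm = TRUE),
    n_species  = length(unique(df$Scientific_name)),
    bbox       = c(xmin = min(df$Longitude, na.rm = TRUE),
                   xmax = max(df$Longitude, na.rm = TRUE),
                   ymin = min(df$Latitude,  na.rm = TRUE),
                   ymax = max(df$Latitude,  na.rm = TRUE)),
    # Number of distinct 1-degree cells that contain at least one record.
    n_degree_cells = nrow(unique(data.frame(floor(df$Latitude),
                                            floor(df$Longitude))))
  )
}

s <- summarise_records(recs)
str(s)
```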

To add spatial grid data to the data set, run the format_bdvis or getcellid function before calling bdsummary.

Value

No return value, just displays the summary in console

Examples

## Not run: 
 if (requireNamespace("rinat", quietly=TRUE)) {
  inat <- rinat::get_inat_obs_project("reptileindia")
  inat <- format_bdvis(inat, source="rinat")
  bdsummary(inat)
 }

## End(Not run)

bdvis: Biodiversity Data Visualizations

Description

Biodiversity data visualizations in R that help to assess the completeness of a biodiversity inventory, the extent of its geographic, taxonomic and temporal coverage, and the gaps and biases in the data.

Citation

Barve, V., & Otegui, J. (2016). bdvis: Biodiversity data visualizations (R package V 0.2). Retrieved from https://cran.r-project.org/web/packages/bdvis/index.html

(Deprecated) Interactive web page based map of records

Description

(Deprecated) Interactive web page based map of records

Usage

bdwebmap()

Value

No return value. NULL

Draws a chronohorogram of records

Description

Draws a detailed temporal representation (also known as chronohorogram) of the dates in the provided data set. For more information on the chronohorogram, please see the References section.

Usage

chronohorogram(
  indf = NA,
  title = "Chronohorogram",
  startyear = 1980,
  endyear = NA,
  colors = c("red", "blue"),
  ptsize = 1
)

Arguments

`indf`	input data frame containing biodiversity data set
`title`	title of the plot. Default is "Chronohorogram"
`startyear`	starting year for the plot. Default is 1980
`endyear`	end year for the plot. Default is current year
`colors`	Pair of colors to build color gradient, in the form of a character vector. Default is blue (less) - red (more) gradient `c("red", "blue")`
`ptsize`	point size adjustment factor. Default is 1

Value

No return value, called for plotting the graph

References

Arino, A. H., & Otegui, J. (2008). Sampling biodiversity sampling. In Proceedings of TDWG (pp. 77-78). Retrieved from http://www.tdwg.org/fileadmin/2008conference/documents/Proceedings2008.pdf#page=77

Examples

## Not run: 
chronohorogram(inat)

## End(Not run)

Distribution graphs

Description

Build plots displaying distribution of biodiversity records among user-defined features.

Usage

distrigraph(indf, ptype = NA, cumulative = FALSE, ...)

Arguments

`indf`	input data frame containing biodiversity data set
`ptype`	Feature to represent. Accepted values are "species", "cell", "efforts" and "effortspecies" (year)
`cumulative`	if TRUE and ptype is "efforts", plots a cumulative records graph. Default is FALSE
`...`	any additional parameters for the `plot` function.

Details

The main use of this function is to create record histograms according to different features of the data set. For example, one might want to see the evolution of records by year, or by species. This function enables easy access to such plots.
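
To make the records-by-year case concrete, here is a small base R sketch that counts records per year and accumulates them, which is roughly what a cumulative effort graph displays. The sample dates are made up, and the exact aggregation that distrigraph performs for ptype="efforts" is an assumption.

```r
# Hypothetical collection dates.
dates <- as.Date(c("2010-03-01", "2010-07-15", "2011-01-20",
                   "2013-06-05", "2013-06-06", "2013-09-09"))
years <- as.integer(format(dates, "%Y"))

# Keep years without records as explicit zero counts so that gaps stay visible.
year_levels <- factor(years, levels = seq(min(years), max(years)))
per_year    <- table(year_levels)

# Running total of records, as in a cumulative graph.
cumulative <- cumsum(per_year)

print(per_year)
print(cumulative)
```

Passing `per_year` or `cumulative` to the base `plot` function, for example with type="s", would give a step-style graph comparable to the last example in the Examples section.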

Value

No return value, called for plotting the graphs

Examples

## Not run: 
 distrigraph(inat,ptype="cell",col="tomato")
 distrigraph(inat,ptype="species",ylab="Species")
 distrigraph(inat,ptype="efforts",col="red")
 distrigraph(inat,ptype="efforts",col="red",type="s")

## End(Not run)

Prepare data frame for flagging functions

Description

format_bdvis renames certain fields in the data frame so that the other package functions know how to use them. This step is highly recommended for the functions to work properly.

Usage

format_bdvis(
  indf,
  source = NULL,
  config = NULL,
  quiet = FALSE,
  gettaxo = FALSE,
  ...
)

Arguments

`indf`	Required. The data.frame on which to operate.
`source`	Optional. Indicates the package that was used to retrieve the data. Currently accepted values are "rvertnet", "rgbif", "bdsns" or "rinat". Either `source`, `config` or individual parameters must be present (see details).
`config`	Optional. Configuration object indicating mapping of field names from the data.frame to the DarwinCore standard. Useful when importing data multiple times from a source not available via the `source` argument. Either `source`, `config` or individual parameters must be present (see details).
`quiet`	Optional. Don't show any logging message at all. Defaults to FALSE.
`gettaxo`	Optional. Call function gettaxo to build higher level taxonomy. Defaults to FALSE.
`...`	Optional. If none of the previous is present, the four key arguments (`Latitude`, `Longitude`, `Date_collected`, `Scientific_name`) can be put here. See examples.

Details

There are three ways of telling the function how to transform the data.frame: using the source parameter, providing a config object with the field mapping, or passing individual values to the mapping function. This is also the order in which the function parses its arguments: source overrides config, which overrides the other mapping arguments.

source refers to the package that was used to retrieve the data. Currently, four values are supported for this argument: "rgbif", "rvertnet", "bdsns" and "rinat", but many more are on their way. A caution with "bdsns" data is that the scientific name has to be in the field "searchText".

config asks for a configuration object holding the mapping of the field names. This option is basically a shortcut for users with custom-formatted data.frames who will use the same mapping many times, so that they do not have to type it each time. In practice, this object is a named list with the following four fields: Latitude, Longitude, Date_collected and Scientific_name. Each element must be a string indicating the name of the column in the data.frame that holds the values for that element. If the data.frame doesn't have one or more of these fields, put NA in that element; otherwise, the function will throw an error. See the examples section.

If neither source nor config is provided, the function expects the user to provide the mapping by passing the individual column names associated with the right term. See the examples section.
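
The configuration object is an ordinary named list, so its elements must be created with `=` inside `list()`. The base R sketch below builds such a list and applies it with a stand-in renaming helper; `rename_fields` and the sample column names are hypothetical and only mimic the renaming step, they are not part of bdvis.

```r
# Hypothetical raw data with non-standard column names.
raw <- data.frame(lat        = c(18.52, 12.97),
                  lng        = c(73.85, 77.59),
                  ObservedOn = c("2012-05-01", "2013-01-10"),
                  sciname    = c("Naja naja", "Ptyas mucosa"))

# Named list: element names are the target fields, values the source columns.
# An element set to NA marks a field that the data do not contain.
conf <- list(Latitude        = "lat",
             Longitude       = "lng",
             Date_collected  = "ObservedOn",
             Scientific_name = "sciname")

# Stand-in for the renaming step (hypothetical helper, not a bdvis function).
rename_fields <- function(df, mapping) {
  for (target in names(mapping)) {
    src <- mapping[[target]]
    if (is.na(src)) next                     # field absent from the data
    if (!src %in% names(df)) stop("column not found: ", src)
    names(df)[names(df) == src] <- target
  }
  df
}

renamed <- rename_fields(raw, conf)
print(names(renamed))
```

With bdvis loaded, the same list would be passed as `format_bdvis(raw, config = conf)`.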

Value

The provided data frame, with field names changed to suit the other visualization functions.

Examples

## Not run: 
# Using the rinat package and the source argument
if (requireNamespace("rinat", quietly=TRUE)) {
 d <- rinat::get_inat_obs_project("reptileindia")
 d <- format_bdvis(d, source="rinat")

 # Using a configuration object, matches 'rinat' schema
 # (use '=' inside list() so that the elements are named)
 conf <- list(Latitude = "latitude",
              Longitude = "longitude",
              Date_collected = "Observed.on",
              Scientific_name = "Scientific.name")
 d <- format_bdvis(d, config=conf)

 # Passing individual parameters, all optional
 # (named arguments, again with '=')
 d <- format_bdvis(d,
                Latitude = "lat",
                Longitude = "lng",
                Date_collected = "ObservedOn",
                Scientific_name = "sciname")
}

## End(Not run)


Assign GBIF style degree cell ids and generate custom grid cell ids

Description

Calculate and assign a GBIF-style degree cell id and centi-degree (0.1 degrees, dividing a 1 degree cell into 100 centi-degree cells) cell id to each record. This function also creates a custom grid scale if the parameter gridscale is supplied. Running it is a prerequisite for some functions, such as mapgrid.
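
As a sketch of what such ids look like, the base R functions below use the commonly cited GBIF convention for 1-degree cells (rows counted from latitude -90, columns from longitude -180) and number the 100 centi-degree cells inside a degree cell from 0 to 99. Both helper names and the centi-cell numbering scheme are assumptions; getcellid may differ in detail.

```r
# 1-degree cell id: 360 cells per row, rows counted upwards from the south pole.
degree_cell_id <- function(lat, lon) {
  (floor(lat) + 90) * 360 + (floor(lon) + 180)
}

# Centi-degree cell id within a 1-degree cell: tenths of latitude give the row
# (0-9), tenths of longitude the column (0-9), for a value between 0 and 99.
centi_cell_id <- function(lat, lon) {
  lat_part <- floor((lat - floor(lat)) * 10)
  lon_part <- floor((lon - floor(lon)) * 10)
  lat_part * 10 + lon_part
}

# Hypothetical coordinate pair.
cell  <- degree_cell_id(18.52, 73.85)
centi <- centi_cell_id(18.52, 73.85)
print(c(cell = cell, centi = centi))
```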

Usage

getcellid(indf, gridscale = 0)

Arguments

`indf`	input data frame containing biodiversity data set
`gridscale`	generate custom grid scale column for mapping. Default is 0.

Value

data frame with two columns for cell_id added

Examples

## Not run: 
getcellid(inat)

## End(Not run)

Get higher taxonomy data

Description

This function is slated for deprecation in the next version. Please use the function taxotools::list_higher_taxo instead.

Usage

gettaxo(indf, genus = FALSE, verbose = FALSE, progress = TRUE)

Arguments

`indf`	input data frame containing biodiversity data set
`genus`	If TRUE, use only genus level data to get taxonomy
`verbose`	If TRUE, displays each name string for which the higher taxonomy is sought
`progress`	If TRUE, prints a progress bar and messages on the console.

Details

Retrieve higher taxonomy information (like Family and Order) for each record from the "Encyclopedia of Life" web API.

This function makes use of certain functions in the taxize package. It scans and retrieves the taxonomic hierarchy for each scientific name (or just genus name) in the data set. When new data are retrieved, they are stored in a local sqlite database, taxo.db, for faster further access.

Value

indf with added / updated columns

"Kingdom" - Kingdom of the Scientific name
"Phylum" - Phylum of the Scientific name
"Order_" - Order of the Scientific name
"Family" - Family of the Scientific name
"Genus" - Genus of the Scientific name

The function also saves a local copy of the downloaded taxonomy in the taxo.db SQLite file for future use.

Examples

## Not run: 
inat <- gettaxo(inat)

## End(Not run)

Maps the data points on the map in grid format

Description

Customizable grid-based spatial representation of the coordinates of the records in the data set.

Usage

mapgrid(
  indf = NULL,
  comp = NULL,
  ptype = "records",
  title = "",
  bbox = NA,
  legscale = 0,
  collow = "blue",
  colhigh = "red",
  mapdatabase = NULL,
  region = NULL,
  shp = NA,
  gridscale = 1,
  customize = NULL
)

Arguments

`indf`	input data frame containing biodiversity data set
`comp`	Completeness matrix generated by function `bdcomplete`
`ptype`	Type of map on the grid. Accepted values are "presence" for presence/absence maps, "records" for record-density map, "species" for species-density map and "complete" for completeness map
`title`	title for the map. There is no default title
`bbox`	bounding box for the map in format c(xmin,xmax,ymin,ymax)
`legscale`	Set legend scale to a higher value than the max value in the data
`collow`	Color for lower range in the color ramp of the grid
`colhigh`	Color for higher range in the color ramp of the grid
`mapdatabase`	Parameter is deprecated
`region`	Parameter is deprecated. Please use shape files.
`shp`	path to shapefile to load as basemap (default NA)
`gridscale`	plot the map grids at the scale specified. The scale needs to be specified in decimal degrees. Default is 1 degree, which is roughly 111 km at the equator.
`customize`	additional customization string to customize the map output using ggplot2 parameters

Details

This function builds a grid map colored according to the density of records in each cell. Grid cells are 1 degree by default and are built with the getcellid function. Currently, four types of maps can be rendered. Presence maps show only whether a cell is populated, regardless of how many records or species are present. Record-density maps apply a color gradient according to the number of records in the cell, regardless of the number of species they represent. Species-density maps apply a color gradient according to the number of different species in the cell, regardless of how many records there are for each of them. Completeness maps apply a color gradient according to the completeness index, from 0 (incomplete) to 1 (complete).
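
The difference between the presence, records and species map types comes down to which per-cell value is mapped to color. The base R sketch below computes all three for a toy data set; the sample records are hypothetical, and the `Cell_id` column name is borrowed from the bdcomplete output as an assumption about how cells are identified.

```r
# Hypothetical records that already carry a cell id.
recs <- data.frame(
  Cell_id         = c(39133, 39133, 39133, 39492, 39492),
  Scientific_name = c("Naja naja", "Naja naja", "Ptyas mucosa",
                      "Naja naja", "Naja naja")
)

# One row per cell; tapply orders its groups like sort(unique(Cell_id)).
per_cell <- data.frame(
  Cell_id = sort(unique(recs$Cell_id)),
  records = as.vector(tapply(recs$Scientific_name, recs$Cell_id, length)),
  species = as.vector(tapply(recs$Scientific_name, recs$Cell_id,
                             function(x) length(unique(x))))
)

# Presence only distinguishes populated cells from empty ones.
per_cell$presence <- as.integer(per_cell$records > 0)

print(per_cell)
```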

See parameter descriptions for ways of customizing the map.

Value

No return value, called for plotting the graph

Examples

## Not run: 
mapgrid(inat,ptype="records", region="India")

## End(Not run)

Treemap based on taxonomic hierarchy of records

Description

Draws a treemap (https://en.wikipedia.org/wiki/Treemapping) based on the taxonomic information of the records.

Usage

taxotree(
  indf,
  n = 30,
  title = NA,
  legend = NA,
  sum1 = "Family",
  sum2 = "Genus"
)

Arguments

`indf`	input data frame containing biodiversity data set
`n`	maximum number of rectangles to be plotted in the treemap. Default is 30
`title`	title for the tree. Default is "Records per <sum1>"
`legend`	legend title. Default is "Number of <sum2>"
`sum1`	Taxonomic level whose density will be represented with different cell sizes
`sum2`	Taxonomic level whose density will be represented with a color gradient

Details

This function builds a treemap of the taxonomic information present in the data set. It represents this information at two levels (with the arguments sum1 and sum2). The first level (sum1) will be represented with cell sizes and is a reflection of the number of records in that group. If, for example, "Family" is selected as value for sum1, the size of the cells in the treemap will be directly proportional to the number of records for that taxonomic family. The second level (sum2) will be represented by color and is a reflection of the number of sub-groups in a particular cell. If, for example, "Genus" is selected as value for sum2, the color of the cell will depend on the number of different genera for that particular cell.
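
The two quantities behind the treemap can be sketched in base R as follows, using the defaults sum1 = "Family" and sum2 = "Genus". The sample records are made up, and keeping only the `n` largest groups is an assumption about how the rectangle limit is applied.

```r
# Hypothetical records with higher taxonomy already attached.
recs <- data.frame(
  Family = c("Elapidae", "Elapidae", "Elapidae",
             "Colubridae", "Colubridae", "Viperidae"),
  Genus  = c("Naja", "Naja", "Bungarus", "Ptyas", "Ptyas", "Daboia")
)

# Cell size: number of records per family (sum1).
size <- tapply(recs$Genus, recs$Family, length)

# Cell color: number of distinct genera per family (sum2).
genera <- tapply(recs$Genus, recs$Family, function(g) length(unique(g)))

tree <- data.frame(Family  = names(size),
                   records = as.vector(size),
                   genera  = as.vector(genera))

# Largest groups first, limited to at most n rectangles.
n    <- 30
tree <- head(tree[order(-tree$records), ], n)

print(tree)
```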

Value

No return value, called for plotting the graph

References

Otegui, J., Arino, A. H., Encinas, M. A., & Pando, F. (2013). Assessing the Primary Data Hosted by the Spanish Node of the Global Biodiversity Information Facility (GBIF). PLoS ONE, 8(1), e55144. doi:10.1371/journal.pone.0055144

Examples

## Not run: 
 taxotree(inat)

## End(Not run)

Polar plot of temporal data

Description

Representation in polar axis of the distribution of dates in the provided data set.

Usage

tempolar(
  indf = NA,
  timescale = NA,
  title = NA,
  color = NA,
  plottype = NA,
  avg = FALSE
)

Arguments

`indf`	input data frame containing biodiversity data set
`timescale`	Temporal scale of the graph, that is, how dates are aggregated. Accepted values are: d (daily, each feature in the plot represents a day), w (weekly, each feature in the plot represents a week) and m (monthly, each feature in the plot represents a month). Default is d (daily).
`title`	Title for the graph. Default is "Temporal coverage".
`color`	color of the graph plot. Default is "red".
`plottype`	Type of feature. Accepted values are: r (lines), p (polygon) and s (symbols). Default is p (polygon).
`avg`	If TRUE plots a graph of the average records rather than total numbers. Default is FALSE.

Details

This function draws a plot representing the temporal distribution of records in the data set. Dates are placed around a radial axis, and the distance from the center is the number of records for that particular date. Several arguments control the type of representation; see the Arguments section for the full list.
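
The radial layout can be illustrated with a monthly aggregation in base R: each month gets an angle around the circle, and the record count becomes the distance from the center. The sample dates are hypothetical, and placing January at the top with months running clockwise is an assumption about the orientation.

```r
# Hypothetical collection dates spanning several years.
dates <- as.Date(c("2012-01-15", "2013-01-20", "2012-04-02",
                   "2014-07-30", "2014-07-31", "2015-07-01"))

# Monthly aggregation (timescale "m"): records per calendar month.
month  <- as.integer(format(dates, "%m"))
counts <- tabulate(month, nbins = 12)

# Angle of each month, starting at the top and moving clockwise.
theta <- 2 * pi * (seq_len(12) - 1) / 12

# Polar to Cartesian: radius is the number of records in that month.
polar <- data.frame(month   = month.abb,
                    records = counts,
                    x       = counts * sin(theta),
                    y       = counts * cos(theta))

print(polar)
```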

Value

No return value, called for plotting the graph

Examples

## Not run: 
tempolar(inat)

## End(Not run)
