Loads one or more MultiQC reports into a data frame
A string vector of filepaths to multiqc_data.json files
A string vector, each element of which is the ID of a plot you
want to include in the output. Use list_plots() to look up
the plot IDs available in a report.
A single function that is called with a sample name and the parsed JSON for the entire report, and that returns a named list of metadata fields for that sample. Refer to the vignette for an example.
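As a rough illustration of such a function, the sketch below splits a sample ID on underscores. The helper name extract_metadata, the field names batch and sample_number, and the "batch_sample" naming convention are all assumptions made for this example, not part of the package.

```r
# Hypothetical metadata extractor. The "<batch>_<sample>" naming scheme is an
# assumption; adapt the parsing to however your own sample IDs are structured.
extract_metadata <- function(sample, parsed) {
  # `parsed` (the whole report as a nested list) is unused in this sketch
  parts <- strsplit(sample, "_", fixed = TRUE)[[1]]
  list(
    batch = parts[[1]],
    sample_number = if (length(parts) >= 2) parts[[2]] else NA_character_
  )
}

# The function can be tried on its own before handing it to load_multiqc()
meta <- extract_metadata("P4107_1003", parsed = list())
str(meta)

# Only runs when TidyMultiqc is installed
if (requireNamespace("TidyMultiqc", quietly = TRUE)) {
  report <- system.file("extdata", "wgs/multiqc_data.json", package = "TidyMultiqc")
  qc <- TidyMultiqc::load_multiqc(report, find_metadata = extract_metadata)
  print(names(qc)[startsWith(names(qc), "metadata.")])
}
```

Each name in the returned list should then show up as an additional metadata.X column.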
Advanced. A named list of custom parser functions. The names of the list should correspond to plotly plot types, such as "xy_line", and the values should be functions that return a named list of named lists. For the return value, the outer list is named by the sample ID, and the inner list is named by the name of the column. Refer to the source code for some examples.
A string vector of zero or more sections to include in the output. Each section can be:
'plot'
Parse plot data. Note that you should also provide a list of plots via the plots argument.
'general'
Parse the general stats section.
'raw'
Parse the raw data section.
This defaults to 'general', which tends to contain the most useful statistics.
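A minimal sketch of requesting more than one section follows. It assumes the argument is named sections and that the section names match the column prefixes described in the return value ("general" and "raw" here); treat both as assumptions and check the function signature before relying on them.

```r
# Only runs when TidyMultiqc is installed
if (requireNamespace("TidyMultiqc", quietly = TRUE)) {
  report <- system.file("extdata", "wgs/multiqc_data.json", package = "TidyMultiqc")

  # Assumed argument name: `sections`
  qc <- TidyMultiqc::load_multiqc(report, sections = c("general", "raw"))

  # Count how many columns each section contributed, using the prefix
  # that precedes the first dot in each column name
  print(table(sub("\\..*$", "", names(qc))))
}
```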
A tibble (data.frame subclass) with QC data and metadata as columns, and samples as rows. Columns are named according to the respective section they belong to, and will always be listed in the following order:
metadata.X
This column contains metadata for this sample. By default this is only the sample ID, but if you have provided the find_metadata argument, there may be more columns.
general.X
This column contains a generally useful summary statistic for each sample
plot.X
This column contains a data frame of plot data for each sample. Refer to the plot parsers documentation (i.e. the parse_X functions) for more information on the output format.
raw.X
This column contains a raw summary statistic or value relating to each sample
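Because every column carries its section as a prefix, a section can be pulled out with ordinary name matching. The sketch below uses a small hand-built data frame rather than a real report: the values and the metadata.example_id and raw.example_value columns are made up for illustration.

```r
# Toy stand-in for the tibble returned by load_multiqc(); all values are invented
qc <- data.frame(
  metadata.example_id = c("sample_a", "sample_b"),
  general.10_x_pc = c(90.1, 91.5),
  general.genome = c("example", "example"),
  raw.example_value = c(1L, 2L),
  check.names = FALSE
)

# Section prefix of each column: everything before the first dot
section_of <- sub("\\..*$", "", names(qc))

# Keep only the columns that came from the general statistics section
general_only <- qc[, startsWith(names(qc), "general."), drop = FALSE]
print(names(general_only))
```

The same startsWith() test works for the metadata., plot. and raw. prefixes.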
load_multiqc(system.file("extdata", "wgs/multiqc_data.json", package = "TidyMultiqc"))
#> # A tibble: 6 × 165
#> metadata.sam…¹ gener…² gener…³ gener…⁴ gener…⁵ gener…⁶ gener…⁷ gener…⁸ gener…⁹
#> <chr> <dbl> <dbl> <dbl> <int> <int> <dbl> <dbl> <dbl>
#> 1 P4107_1003 8.68e8 8.48e8 97.6 40 365 41.4 92.3 92.2
#> 2 P4107_1004 1.00e9 9.85e8 98.2 46 363 41.0 92.3 92.2
#> 3 P4107_1005 9.75e8 9.56e8 98.0 45 368 41.2 92.3 92.2
#> 4 P4107_1002 8.66e8 8.47e8 97.8 40 367 41.3 92.3 92.2
#> 5 P4107_1006 9.12e8 8.95e8 98.1 43 362 41.3 92.3 92.2
#> 6 P4107_1001 7.72e8 7.51e8 97.3 36 358 41.4 92.3 92.2
#> # … with 156 more variables: general.10_x_pc <dbl>, general.30_x_pc <dbl>,
#> # general.50_x_pc <dbl>, general.genome <chr>,
#> # general.number_of_variants_before_filter <dbl>,
#> # general.number_of_known_variants_brie_non_empty_id <dbl>,
#> # general.number_of_known_variants_brie_non_empty_id_percent <dbl>,
#> # general.number_of_effects <dbl>, general.genome_total_length <dbl>,
#> # general.genome_effective_length <dbl>, general.change_rate <dbl>, …