NAME
    Bencher - A benchmark framework

VERSION
    This document describes version 1.062.4 of Bencher (from Perl
    distribution Bencher), released on 2024-02-19.

SYNOPSIS
    See bencher CLI.

DESCRIPTION
    Bencher is a benchmark framework. Its main feature is permuting a list
    of Perl codes with a list of arguments into benchmark items, and then
    benchmarking them. You can run only some of the items, as well as
    filter which codes and arguments to use. You can also permute multiple
    perls and multiple module versions.

TERMINOLOGY
    Scenario. A hash data structure that lists *participants*, *datasets*,
    and other things. The bencher CLI can accept a scenario from a module
    (under "Bencher::Scenario::*" namespace or "Acme::CPANModules::*"
    namespace), a script, or from command-line option. See "SCENARIO".

    Participant. What to run or benchmark. Usually a piece of Perl code or
    a code template, or a command or command template. See "participants".

    Dataset. Arguments or parameters to permute with a participant. See
    "datasets".

    (Benchmark) item. A participant that has been permuted with a dataset
    into code that is ready to run. Usually a scenario does not contain
    items directly, but only participants and datasets, and lets Bencher
    permute these into items.

SCENARIO
    The core data structure that you need to prepare is the scenario. It is
    a DefHash (i.e. just a regular Perl hash). The two most important keys
    of this hash are: participants and datasets.

    An example scenario (from "Bencher::Scenario::Example"):

     package Bencher::Scenario::Example;
     our $scenario = {
         participants => [
             {fcall_template => q[Text::Wrap::wrap('', '', <text>)]},
         ],
         datasets => [
             { name=>"foobar x100",   args => {text=>"foobar " x 100} },
             { name=>"foobar x1000",  args => {text=>"foobar " x 1000} },
             { name=>"foobar x10000", args => {text=>"foobar " x 10000} },
         ],
     };
     1;
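
    Assuming this scenario is packaged as the module
    "Bencher::Scenario::Example" and the bencher CLI is installed, it can
    typically be run with something like:

     % bencher -m Example

    bencher will then permute the participant with the three datasets into
    three benchmark items and benchmark them.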

  participants
    participants (array) lists Perl codes (or external commands) that we
    want to benchmark.

   Specifying participant's code
    There are several kinds of code you can specify (a combined example
    follows this list):

    *   module

        First, you can just specify "module" (str, a Perl module name). This
        is useful when running the scenario in module startup mode (see
        "Running benchmark in module startup mode"), and also useful to
        instruct Bencher to load the module. When not in module startup
        mode, there is no code in this participant to run. In addition to
        "module" you can also specify "import_args", which can be a string
        or an arrayref.

    *   modules

        You can also specify "modules" (an array of Perl module names) if
        you want to benchmark several modules together. Similarly, this is
        only useful for running in module startup mode. To specify import
        arguments for each of these modules, use "import_args_array".

    *   code

        You can specify "code" (a coderef) which contains the code to
        benchmark. However, the point of Bencher is to use "fcall_template"
        or at least "code_template" to be able to easily permute the code
        with datasets (see below). So you should only specify "code" when
        you cannot use "fcall_template" or "code_template".

        Caveat: if you run bencher in multiperl or multi-module-version
        mode, the current implementation will fill the code templates, dump
        the scenario into a temporary script, and run that script with
        separate perl processes. The dumped coderef will lack any 'use'
        statements or code in BEGIN blocks, so avoid relying on those in
        your code.

    *   fcall_template

        You can specify "fcall_template", and this is the recommended way
        whenever possible. It is a string containing a function call code,
        in the form of:

         MODULENAME::FUNCTIONNAME(ARG, ...)

        or

         CLASSNAME->METHODNAME(ARG, ...)

        For example:

         Text::Levenshtein::fastdistance(<word1>, <word2>)

        Another example:

         Module::CoreList->is_core(<module>)

        It can be used to benchmark a function call or a method call. From
        this format, Bencher can easily extract the module name, so the
        user can also run the scenario in module startup mode.

        By using a template, Bencher can generate actual code from this
        template by combining it with datasets. The words enclosed in
        "<...>" will be replaced with actual arguments specified in
        "datasets". The arguments are automatically encoded as Perl values,
        so it's safe to use arrayrefs or complex structures as argument
        values (however, you can use "<...:raw>" to avoid this automatic
        encoding).

    *   code_template

        Aside from "fcall_template", you can also use "code_template" (a
        string containing arbitrary code), in the cases where the code you
        want to benchmark cannot be expressed as a simple function/method
        call, for example (taken from Bencher::Scenario::ComparisonOps):

         participants => [
             {name=>'1k-numeq'      , code_template=>'my $val =     1; for (1..1000) { if ($val ==     1) {} if ($val ==     2) {} }'},
             {name=>'1k-streq-len1' , code_template=>'my $val = "a"  ; for (1..1000) { if ($val eq "a"  ) {} if ($val eq "b"  ) {} }'},
             {name=>'1k-streq-len3' , code_template=>'my $val = "foo"; for (1..1000) { if ($val eq "foo") {} if ($val eq "bar") {} }'},
             {name=>'1k-streq-len10', code_template=>'my $val = "abcdefghij"; for (1..1000) { if ($val eq "abcdefghij") {} if ($val eq "klmnopqrst") {} }'},
         ],

        Like in "fcall_template", words enclosed in "<...>" will be replaced
        with actual data. When generating actual code, Bencher will enclose
        the code template with "sub { .. }".

        Caveat: if you run bencher in multiperl or multi-module-version
        mode, the current implementation will fill the code templates, dump
        the scenario into a temporary script, and run that script with
        separate perl processes. The dumped coderef will lack any 'use'
        statements or code in BEGIN blocks, so avoid relying on those in
        your code.

    *   cmdline, cmdline_template, perl_cmdline, perl_cmdline_template

        Or, if you are benchmarking external commands, you specify "cmdline"
        (array of strings, or a string) or "cmdline_template" (array/str) or
        "perl_cmdline" or "perl_cmdline_template" instead. An array cmdline
        will not use the shell, while the string version will. The
        "perl_cmdline*" variants are the same as "cmdline*" except that the
        first implicit argument/prefix is perl. When the cmdline template is
        filled with the arguments, the values will automatically be
        shell-escaped (unless you use the "<...:raw>" syntax).

        When using a code template, the code will be generated and eval-ed
        in the "main" package.

   Specifying participant's name
    By default, Bencher will attempt to figure out a name for a
    participant (a sequence number starting from 1, a module name, a
    module name followed by a function name, etc). You can also specify a
    name for a participant explicitly so you can refer to it more easily
    later, e.g.:

     participants => [
         {name=>'pp', fcall_template=>'List::MoreUtils::PP::uniq(@{<array>})'},
         {name=>'xs', fcall_template=>'List::MoreUtils::XS::uniq(@{<array>})'},
     ],

   List of properties for a participant
    This is a reference section.

    *   name (str)

        From DefHash.

    *   summary (str)

        From DefHash.

    *   description (str)

        From DefHash.

    *   tags (array of str)

        From DefHash. Define tag(s) for this participant. Can be used to
        include/exclude groups of participants having the same tags.

    *   module (str)

    *   modules (array of str)

        These are modules that are to be benchmarked. When running the
        benchmark scenario, these modules will be require'd first, so you
        don't have to require them again manually in your code/code
        template.

    *   helper_modules (array of str)

        These are helper modules that are required when running the
        participant code, but are not the main subject to be benchmarked.
        They will also be require'd when running the benchmark scenario so
        you don't have to require them manually in your code/code template.

        The difference from the "modules" property is that modules specified
        in "modules" should be declared as phase=x_benchmarks,
        rel=x_benchmarks prerequisites, while modules specified in
        "helper_modules" should not. All of them, however, can be declared
        as phase=x_benchmarks, rel=requires prerequisites.

    *   function (str)

    *   fcall_template (str)

    *   code_template (str)

    *   code (code)

    *   cmdline (str|array of str)

    *   cmdline_template (str|array of str)

    *   perl_cmdline (str|array of str)

    *   perl_cmdline_template (str|array of str)

    *   result_is_list (bool, default 0)

        This is useful when dumping an item's code, so Bencher will use list
        context when receiving the result.

    *   include_by_default (bool, default 1)

        Can be set to false if you want to exclude the participant by
        default when running the benchmark, unless the participant is
        explicitly included, e.g. using the "--include-participant"
        command-line option (see the example after this list).
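
    As an example of "tags" and "include_by_default" (the names and code
    templates here are hypothetical):

     participants => [
         {name => 'safe', tags => ['safe'],
          code_template => 'substr(<str>, 0, 10)'},
         {name => 'experimental', tags => ['unsafe'],
          include_by_default => 0,
          code_template => 'unpack("A10", <str>)'},
     ],

    The second participant is only benchmarked when explicitly included,
    e.g. via the "--include-participant" command-line option.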

  datasets
    datasets (array) lists the function inputs (or command-line arguments).
    You can "name" each dataset too, to be able to refer to it more easily.

    Other properties you can add to a dataset include "include_by_default"
    (bool, default true; can be set to false if you want to exclude the
    dataset by default when running the benchmark, unless it is explicitly
    included).

    *   name (str)

        From DefHash.

    *   summary (str)

        From DefHash.

    *   description (str)

        From DefHash.

    *   tags (array of str)

        From DefHash. Define tag(s) for this dataset. Can be used to
        include/exclude groups of datasets having the same tags.

    *   args (hash)

        Example:

         {filename=>"ujang.txt", size=>10}

        You can supply multiple argument values by adding a "@" suffix to
        the argument name, then supplying an array of values, for example:

         {filename=>"ujang.txt", 'size@'=>[10, 100, 1000]}

        This means, for each participant mentioning "size", three benchmark
        items will be generated, one for each value of "size".

        Aside from an array, you can also use a hash for the multiple
        values. This has the nice effect of showing nicer names (the hash
        keys) for the argument values, e.g.:

         {filename=>"ujang.txt", 'size@'=>{"1M"=>1024**2, "1G"=>1024**3, "1T"=>1024**4}}

    *   argv (array)

    *   include_by_default (bool, default 1)

    *   include_participant_tags (array of str)

        Only include participants having one of these tags. For example:

         ['a', 'b']

        will include all participants having either "a" or "b" in their
        tags. To only include participants which have all of "a" and "b" in
        their tags, use:

         ['a & b']

    *   exclude_participant_tags (array of str)

        Exclude participants having any of these tags. For example:

         ['a', 'b']

        will exclude all participants having either "a" or "b" in their
        tags. To only exclude participants which have all of "a" and "b" in
        their tags, use:

         ['a & b']

    *   result (any)

        Specify the result that every participant must return. This can be
        used to verify that participant code is correct (returns the
        desired result) before benchmarking. If a participant fails to
        return this result, benchmarking will be aborted (see the example
        after this list).
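
    Here is an illustrative dataset that uses some of the properties
    above; the participant tag "string-ops" and the argument name "str"
    are hypothetical:

     datasets => [
         {
             name => "empty string",
             args => {str => ""},
             include_participant_tags => ['string-ops'],
             result => 0,   # every matching participant must return this
         },
     ],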

  Other properties
    Other known scenario properties (keys):

    *   name

        From DefHash, scenario name (usually short and one word).

    *   summary

        From DefHash, a one-line plaintext summary.

    *   description (str)

        From DefHash, longer description in Markdown.

    *   test (bool, default 1)

        By default, participant code is run once first for testing (e.g.
        whether it dies or returns the correct result) before benchmarking.
        If your code runs for many seconds, you might want to skip this
        test by setting this property to 0.

    *   module_startup (bool)

    *   code_startup (bool)

    *   precision (float)

        Precision to pass to Benchmark::Dumb. Default is 0. Can be
        overridden via "--precision" (CLI). Takes precedence over
        "default_precision".

    *   default_precision (float)

        (DEPRECATED)

        Precision to pass to Benchmark::Dumb. Default is 0. Can be
        overridden via "--precision" (CLI).

    *   module_startup_precision (float)

        Precision to pass to Benchmark::Dumb when running the benchmark in
        module_startup mode. Default is from "precision" or
        "default_precision". Can be overridden with "--precision" (CLI).

    *   result (any)

        Like the per-dataset "result" property, which you should normally
        use instead. This property can be used when a scenario does not
        have any datasets.

    *   with_args_size (bool)

        Show the size of the item code's arguments. Size is measured using
        Devel::Size. The measurement is done once per item when it is
        testing the code. See also: "with_result_size".

    *   with_result_size (bool)

        Show the size of the item code's return value. Size is measured
        using Devel::Size. The measurement is done once per item when it is
        testing the code. See also: "with_args_size".

    *   with_process_size (bool)

        Include some memory statistics in each item's result. This currently
        only works on Linux because the measurement is done by reading
        "/proc/PID/smaps". Also, since this is per-process information, to
        get it each item's code will be run by dumping the code (using
        B::Deparse) into a temporary file, then running the file (once per
        item, after the item's code is completed) using a new perl
        interpreter process. This is done to get a measurement on a clean
        process that does not load Bencher itself or the other items. This
        also means that not all code will work: all the caveats in "MULTIPLE
        PERLS AND MULTIPLE MODULE VERSIONS" apply. In short, no outside data
        will be available to the code.

        Also, this information normally does not make sense for external
        command participants, because what is measured are the memory
        statistics of the perl process itself, not those of the external
        command's processes.

    *   capture_stdout (bool)

        Useful for silencing command/code that outputs stuffs to stdout.
        Note that output capturing might affect timings if your benchmark
        code outputs a lot of stuffs. See also: "capture_stderr".

    *   capture_stderr (bool)

        Useful for silencing command/code that outputs stuffs to stderr.
        Note that output capturing might affect timings if your benchmark
        code outputs a lot of stuffs. See also: "capture_stdout".

    *   extra_modules (array of str)

        You can specify extra modules to load here before benchmarking. The
        modules and their versions will be listed in the result metadata
        under "func.module_versions", for extra information. Examples of
        what to put here are modules that contain/produce the datasets
        being benchmarked, because the data might differ from version to
        version.

    *   env_hashes (array of hash)

        With this property, you can permute multiple sets of environment
        variables. Suppose you want to benchmark each participant when
        running under environment variables FOO=0, FOO=1, and FOO=2. You can
        specify:

         env_hashes => [
             {FOO=>0},
             {FOO=>1},
             {FOO=>2},
         ]

    *   runner (str)

        Set which runner to run the benchmark with. Either "Benchmark::Dumb"
        (the default), "Benchmark" ("Benchmark.pm", to get
        Benchmark.pm-style output), or "Benchmark::Dumb::SimpleTime" (to be
        able to run code with only 1 or 2 iterations without a warning,
        suitable if your code runs for at least a few seconds and you don't
        want to wait too long).

    *   on_failure (str, "skip"|"die")

        For a command participant, failure means a non-zero exit code. For
        a Perl-code participant, failure means the Perl code dies or (if an
        expected result is specified) the result is not equal to the
        expected result.

        The default is "die". When set to "skip", Bencher will first run
        each item's code before benchmarking, trap command failures/Perl
        exceptions, and if that happens, skip the item.

        Can be overridden in the CLI with the "--on-failure" option.

    *   on_result_failure (str, "skip"|"die"|"warn")

        This is like "on_failure" except that it specifically refers to the
        failure of an item's result not being equal to the expected result.

        The default is the value of "on_failure".

        There is an extra choice of "warn" for this type of failure, which
        is to print a warning to STDERR and continue.

        Can be overridden in the CLI with the "--on-result-failure" option.

    *   before_parse_scenario (code)

        If specified, then this code will be called before parsing the
        scenario. The code will be given a hash argument with the following
        keys: "hook_name" (str, set to "before_parse_scenario"), "scenario"
        (hash, unparsed scenario), "stash" (hash, which you can use to pass
        data between hooks).

    *   before_parse_participants (code)

        If specified, then this code will be called before
        parsing/normalizing participants from the scenario. The code will
        be given a hash argument with the following keys: "hook_name",
        "scenario", "stash".

        You can use this hook to, e.g.: generate participants dynamically.

    *   before_parse_datasets (code)

        If specified, then this code will be called before
        parsing/normalizing datasets from the scenario. The code will be
        given a hash argument with the following keys: "hook_name",
        "scenario", "stash".

        You can use this hook to, e.g.: generate datasets dynamically (a
        combined hook sketch follows this list).

    *   after_parse_scenario (code)

        If specified, then this code will be called after parsing the
        scenario. The code will be given a hash argument with the following
        keys: "hook_name", "scenario" (hash, parsed scenario), "stash".

    *   before_gen_items (code)

        If specified, then this code will be called before generating
        items. The code will be given a hash argument with the following
        keys: "hook_name", "scenario", "stash".

        You can use this hook to, e.g.: modify datasets/participants before
        they are permuted into items.

    *   before_bench (code)

        If specified, then this code will be called before starting the
        benchmark. The code will be given a hash argument with the
        following keys: "hook_name", "scenario", "stash".

    *   before_test_item (code)

        If specified, then this code will be called before testing each
        item's code. The code will be given a hash argument with the
        following keys: "hook_name", "scenario", "stash", "item" (itself a
        hash containing the following keys: "_name", "seq", "permute").

    *   after_test_item (code)

        If specified, then this code will be called after testing each
        item's code. The code will be given a hash argument with the
        following keys: "hook_name", "scenario", "stash", "item" (itself a
        hash containing the following keys: "_name", "seq", "permute",
        "_result").

        The item's code result can be accessed as "_result" in the "item"
        hash.

        The hook is expected to return false (0 or '') or a string, which
        will be interpreted as an error message. Thus, this hook can be
        used to force an error during an item's test, or to check that the
        item's code has executed correctly by examining side effects other
        than the return value in "_result".

    *   after_bench (code)

        If specified, then this code will be called after completing the
        benchmark. The code will be given a hash argument with the
        following keys: "hook_name", "scenario", "stash", "result" (array,
        enveloped result).

        You can use this hook to, e.g.: do some custom
        formatting/modification of the result.

    *   before_return (code)

        If specified, then this code will be called before
        displaying/returning the result. The code will be given a hash
        argument with the following keys: "hook_name", "scenario", "stash",
        "result".

        You can use this hook to, e.g.: modify the result in some way.
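
    To illustrate a couple of the hooks above, here is a sketch. The
    participant, the "nums" argument, and the exact effect of modifying
    the scenario hash in place are assumptions for illustration only:

     our $scenario = {
         participants => [
             {code_template => 'my @r = map { $_ * 2 } @{<nums>}; scalar @r'},
         ],

         # generate datasets dynamically instead of listing them by hand
         before_parse_datasets => sub {
             my %args = @_;
             $args{scenario}{datasets} = [
                 map { +{name => "n=$_", args => {nums => [1 .. $_]}} }
                     (10, 100, 1000)
             ];
         },

         # return false for success, or a string to force an error
         after_test_item => sub {
             my %args = @_;
             my $res = $args{item}{_result};
             return "unexpected result" unless defined $res && $res > 0;
             return 0;
         },
     };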

USING THE BENCHER COMMAND-LINE TOOL
  Running benchmark
  Running benchmark in code startup mode
    Code startup mode can be activated either by specifying the
    "--code-startup" option on the command line, or by setting the
    "code_startup" property to true in the scenario.

    In this mode, instead of running the participant's Perl code, Bencher
    runs a perl command:

     perl -e 'YOUR_PERL_CODE'

    or if there are modules to be loaded:

     perl -MModule1 -MModule2 ... -e 'YOUR_PERL_CODE'

    and then compares the time with the baseline:

     perl -e1

    to measure the startup overhead of each code.

    See also "module_startup" mode.

  Running benchmark in module startup mode
    Module startup mode can be activated either by specifying the
    "--module-startup" option on the command line, or by setting the
    "module_startup" property to true in the scenario.

    In this mode, instead of running each participant's code, the module
    name will be extracted from each participant and this will be
    benchmarked instead:

     perl -MModule1 -e1
     perl -MModule2 -e1
     ...
     perl -e1 ;# the baseline, for comparison

    Basically, this mode tries to measure the startup overhead of each
    module in isolation.

    A module name can be extracted from a participant if the participant
    specifies "module", "modules", or "fcall_template". When a participant
    does not contain any module name, it will be skipped.
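
    For example, assuming a scenario module named "Example" whose
    participants mention Text::Wrap, something like this would benchmark
    the module's startup overhead:

     % bencher -m Example --module-startup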

MULTIPLE PERLS AND MULTIPLE MODULE VERSIONS
    Bencher can be instructed to run benchmark items against multiple perl
    installations, as well as multiple versions of a module.

    Bencher uses perlbrew to get the list of available perl installations,
    so you need to install perlbrew and brew some perls first.

    To run against multiple versions of a module, specify the module name
    with "--multimodver", then add one or more library include paths using
    "-I". The include paths need to contain different versions of the
    module.
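
    For example (the scenario name and library paths here are
    hypothetical):

     % bencher -m Example --multimodver Text::Wrap \
         -I archive/Text-Wrap-old/lib -I archive/Text-Wrap-new/lib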

    Caveats. Here is how benchmarking against multiple perls and module
    versions currently works. Bencher first prepares a new scenario based on
    the input scenario. The new scenario contains benchmark items that
    have been permuted and whose code templates have been converted into
    actual Perl code (coderefs). The new scenario along with the Perl codes
    in it will be dumped using Data::Dmp (which can deparse code) into a
    temporary file. A new Bencher process is then started using the
    appropriate perl interpreter, runs the scenario, and returns the result
    as JSON. The original Bencher process then collects and combines the
    per-interpreter results into the final result.

    Due to the above way of working, there are some caveats. First, code
    that contains closures won't work properly because the original
    variables that the code can see are no longer available in the new
    process. Also, some scenarios prepare data in a hook such as
    "before_bench" or "before_gen_items". This also won't work, because
    the new scenario that gets dumped into the temporary file currently
    has all the hooks stripped first.

    So in principle, to enable a benchmark item to be run against multiple
    perls or module versions, make the code self-sufficient. Do not depend
    on an outside variable. Instead, only depend on the variables in the
    dataset.
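
    As an illustration (hypothetical code), the first participant below
    would break in multiperl/multimodver mode because it closes over an
    outside variable, while the second keeps the data in the dataset:

     # problematic: @data will not exist in the separate perl process
     my @data = (1 .. 1000);
     participants => [
         {code => sub { my $sum = 0; $sum += $_ for @data; $sum }},
     ],

     # better: keep the data in the dataset and use a code template
     participants => [
         {code_template => 'my $sum = 0; $sum += $_ for @{<nums>}; $sum'},
     ],
     datasets => [
         {name => '1k', args => {nums => [1 .. 1000]}},
     ],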

HOMEPAGE
    Please visit the project's homepage at
    <https://metacpan.org/release/Bencher>.

SOURCE
    Source repository is at <https://github.com/perlancar/perl-Bencher>.

SEE ALSO
    bencher

    "Bencher::Manual::*"

    BenchmarkAnything. There is a lot of overlap in goals between Bencher
    and this project. I hope to reuse or interoperate with parts of
    BenchmarkAnything, e.g. storing Bencher results in a BenchmarkAnything
    storage backend, sending Bencher results to a BenchmarkAnything HTTP
    server, and so on.

    Benchmark, Benchmark::Dumb (Dumbbench)

    "Bencher::Scenario::*" for examples of scenarios.

    An article about Bencher written in 2016,
    <https://perladvent.org/2016/2016-12-03.html>

AUTHOR
    perlancar <perlancar@cpan.org>

CONTRIBUTING
    To contribute, you can send patches by email/via RT, or send pull
    requests on GitHub.

    Most of the time, you don't need to build the distribution yourself. You
    can simply modify the code, then test via:

     % prove -l

    If you want to build the distribution (e.g. to try to install it locally
    on your system), you can install Dist::Zilla,
    Dist::Zilla::PluginBundle::Author::PERLANCAR,
    Pod::Weaver::PluginBundle::Author::PERLANCAR, and sometimes one or two
    other Dist::Zilla- and/or Pod::Weaver plugins. Any additional steps
    required beyond that are considered a bug and can be reported to me.

COPYRIGHT AND LICENSE
    This software is copyright (c) 2024 by perlancar <perlancar@cpan.org>.

    This is free software; you can redistribute it and/or modify it under
    the same terms as the Perl 5 programming language system itself.

BUGS
    Please report any bugs or feature requests on the bugtracker website
    <https://rt.cpan.org/Public/Dist/Display.html?Name=Bencher>

    When submitting a bug or request, please include a test-file or a patch
    to an existing test-file that illustrates the bug or desired feature.