Bash: Passing variables by reference

Problem

How can I pass variables by reference to a bash function?

I see this technique of 'passing by reference' proliferating:

f() { local b; g b; echo $b; }
g() { eval $1=bar; }  #  WRONG, although it
f                     #+ looks ok: b=bar

But this deceptively simple solution has some serious drawbacks.

For one, the eval introduces a security threat:

g() { eval $1=bar; }  # WRONG
g 'ls /; true'        # Oops, executes command
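
To make the threat concrete without running anything harmful, here is a minimal sketch. It uses a harmless `echo' as a stand-in for a malicious command, then adds a name check before the eval. The names `g_unsafe', `g_checked' and `is_valid_name' are hypothetical and chosen for this illustration only:

```shell
# The unsafe pattern from above: $1 is evaluated as shell code.
g_unsafe() { eval $1=bar; }

# Harmless stand-in for a malicious argument: the embedded echo runs.
injected=$(g_unsafe 'x=1; echo INJECTED; y')
echo "captured output: $injected"

# Hypothetical guard: accept only plain identifiers before calling eval.
is_valid_name() {
    case $1 in
        '' | [!a-zA-Z_]* | *[!a-zA-Z0-9_]*) return 1 ;;
    esac
    return 0
}

g_checked() {
    is_valid_name "$1" || {
        echo "g_checked: \`$1': invalid name" 1>&2
        return 1
    }
    eval $1=bar
}

# The same argument is now refused and nothing is executed.
g_checked 'x=1; echo INJECTED; y' 2>/dev/null && rejected=no || rejected=yes
# A plain identifier is still assigned.
g_checked result
echo "rejected=$rejected result=$result"
```

Such a check only closes the injection hole. It does not help with the naming conflict described next.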

Second, the eval doesn't work if `g()' declares a local variable with the same name as the return variable:

g() { local b; eval $1=bar; }  # WRONG
g b                            # Conflicts with `local b'
echo $b                        # b is unexpectedly empty

The conflict remains even if the local `b' is unset:

g() { local b; unset b; eval $1=bar; }  # WRONG
g b                                     # Still conflicts with `local b'
echo $b                                 # b is unexpectedly empty
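
The following sketch shows where the assigned value ends up. It wraps the call in a hypothetical function `caller' and prints `b' from both scopes: the eval assigns to the local `b' of `g', which disappears when `g' returns.

```shell
# g declares its own b, so the eval below assigns to g's b.
g() { local b; eval $1=bar; echo "inside g: b=$b"; }

# caller (hypothetical name) expects g to fill in its own local b.
caller() { local b; g b; echo "in caller: b=${b:-<empty>}"; }

caller
```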

Why bother?

Why bother passing variables by reference, if one can return values in bash by using a subshell:

a=$(func)

Well, subshells do an expensive fork operation - costing time, resources, and energy. Consider this time comparison, using either a subshell or eval when doing 1,000 single value assignments:

                                           time
           ---------------------------------------------------------------------
                         real [%]                  real [s]   user [s]   sys [s]
           -------------------------------------   --------   --------   -------
           0                                 100
           -------------------------------------
subshell   #####################################      4.046      0.968     3.076

eval       ###                                        0.367      0.360     0.008
           -------------------------------------   --------   --------   -------

Table 1: Time consumed doing 1,000 assignments, using either a subshell or eval. See #Appendix B: time_eval.sh for the code used.


As table 1 shows, the eval method is about eleven times faster than the subshell method in real time (0.367 s versus 4.046 s), so finding a safe way to use eval is worth the effort.

Also, the subshell method returns a single value only; returning multiple values through a subshell is cumbersome at best. Passing by reference makes returning multiple values much easier.
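
As an illustration of how cumbersome this gets, here is one possible way to return two values through a subshell. This is a sketch, assuming the values are sent as separate lines and therefore never contain newlines; `get_pair' is a hypothetical function name:

```shell
# Hypothetical function "returning" two values as two lines on stdout.
get_pair() { printf '%s\n' "first value" "second value"; }

# The caller must know the output format and split the output again.
pair=$(get_pair)
{ IFS= read -r x; IFS= read -r y; } <<< "$pair"

echo "x=$x"
echo "y=$y"
```

The caller still pays for the fork, and the scheme breaks as soon as a value contains the separator.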

Solutions

Both solutions upvar & upvars have been tested successfully on bash versions: 2.05b, 3.0.0, 3.2.39, 4.0.33 and 4.1.7.

See #Test suite for the code used.

Solution 1: Upvar: Assign single variable by reference

Use upvar in a function, returning a single value, like this:

local "$1" && upvar $1 "value(s)"

Example:

f() { local b; g b; echo $b; }                 # (1)
g() { local "$1" && upvar $1 bar; }            # (2)
f  # Ok: b=bar

First, explicitly declare the variable local in the function wishing to return a value (2). Calling upvar (2) then unsets that local variable and assigns the value, so that the variable appears one level down the call-stack, in the caller (1).

Returning an array by reference goes like this:

f() { local b; g b; declare -p b; }         # (1)
# @param $1: Name of variable to return value into
g() {                                       # (2)
    # Declare array containing three elements:
    # - foo
    # - bar "cee      # Including double quote (")
    # - dus\n  enk    # Including newline (\n) followed by two spaces
    local a=(foo "bar \"cee" $'dus\n  enk')
    # Return array
    local "$1" && upvar $1 "${a[@]}"
}
f  # Ok: declare -a b=([0]="foo" [1]="bar \"cee" [2]=$'dus\n  enk')

The `upvar' code makes use of some surprising behaviour of unset which is capable of traversing down the call-stack and unsetting variables repeatedly. For more information, see Bash: Unset.
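
The unset behaviour can be observed in isolation with the sketch below. All function names are hypothetical. It assumes default shell options (in particular, that the `localvar_unset' option of bash 5 is off):

```shell
v=global

# Neither function has a local v, so unset acts on a caller's v.
peel_once()  { unset -v v; }
peel_twice() { unset -v v; unset -v v; }

inner_once()  { local v=inner; peel_once;  echo "$v"; }
inner_twice() { local v=inner; peel_twice; echo "$v"; }

# outer declares its own v and then runs the function named in $1
outer() { local v=outer; "$1"; }

once=$(outer inner_once)     # inner's v removed: outer's v shows through
twice=$(outer inner_twice)   # outer's v removed as well: global v shows
echo "once=$once twice=$twice"
```

Each unset called from a function that has no local `v' of its own removes one layer, which is what upvar relies on to reach the caller's variable.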

The name "upvar" is borrowed from Tcl's upvar command.

For the upvar code, see #Appendix A: upvars.sh.

Caveat: Subsequent 'upvar' calls may conflict

Consider this example:

f() { local b a; g b a; echo $b $a; }
g() {
    local a=A b=B
    # ...
    if local "$1" "$2"; then
        upvar $1 $a           # (1a)
        upvar $2 $b           # (1b)
    fi
}
f  # Error; got "A A", expected: "A B"

The problem is that in the first call to upvar (1a), `f.b' gets assigned the value "A". Unfortunately, upvar also unsets the local variable `g.b', so that in the second call to upvar (1b), `$b' no longer refers to `g.b' but to `f.b', which now is "A". As a result, `f.a' gets assigned "A" as well.

The solution is to pass all variables and values in one call, using upvars, see solution 2.
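
Why a single call helps: all values are expanded into positional parameters before any variable is unset. The sketch below shows this with `upvars_min', a hypothetical, stripped-down stand-in that only handles `-v name value' triples. It is not the upvars function from Appendix A:

```shell
# Hypothetical minimal stand-in for upvars: "-v name value" triples only
upvars_min() {
    while (( $# >= 3 )); do
        [[ $1 == -v ]] && unset -v "$2" && eval $2=\"\$3\" || return 1
        shift 3
    done
}

f() { local b a; g b a; echo "$b $a"; }
g() {
    local a=A b=B
    # $a and $b are expanded here, before upvars_min unsets anything
    local "$1" "$2" && upvars_min -v "$1" "$a" -v "$2" "$b"
}

result=$(f)
echo "$result"
```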

Solution 2: Upvars: Assign multiple variables by reference

Use upvars like this:

local varname [varname ...] && 
    upvars [-v varname value] | [-aN varname [value ...]] ...

Available OPTIONS:
    -aN  Assign next N values to varname as array
    -v   Assign single value to varname

Example:

f() { local a b; g a b; declare -p a b; }                      # (1)
g() {
    local c=( foo bar )
    local "$1" "$2" && upvars -v $1 A -a${#c[@]} $2 "${c[@]}"  # (2)
}
f  # Ok: a=A, b=(foo bar)

First, explicitly declare the variables local in the function wishing to return values (2). This function can then return the variables using upvars (2), so that they appear one level down the call-stack, in the caller (1).

For the upvars code, see #Appendix A: upvars.sh.

Time comparison

                                           time
           ---------------------------------------------------------------------
                         real [%]                  real [s]   user [s]   sys [s]
           -------------------------------------   --------   --------   -------
           0                                 100
           -------------------------------------
subshell   #####################################      4.046      0.968     3.076

eval       ###                                        0.367      0.360     0.008

upvar      #######                                    0.759      0.748     0.008

upvars     ###########                                1.175      1.148     0.024
           -------------------------------------   --------   --------   -------

Table 2: Time consumed doing 1,000 single value assignments. See #Appendix C: time_upvar.sh for the code used.

Examples

Function returning array by reference

# Param $1  Name of variable to return array to
return_array() {
    local r=(e1 e2 "e3  e4" $'e5\ne6')
    local "$1" && upvars -a${#r[@]} $1 "${r[@]}"
}

Function returning optional variables by reference

bash >= 3.1.0

# Params $*  (optional) names of variables to return values to.
#            Supported variable names are:
#            - A1:  Return array 1
#            - A2:  Return array 2
#            - V1:  Return value 1
#            - V2:  Return value 2
return_optional_vars() {
    local a1=(bar "cee  dee") a2=() upargs=() upvars=() v1=foo v2 var
    for var; do
        case $var in
            A1) upargs+=(-a${#a1[@]} $var "${a1[@]}") ;;
            A2) upargs+=(-a${#a2[@]} $var "${a2[@]}") ;;
            V1) upargs+=(-v $var "$v1") ;;
            V2) upargs+=(-v $var "$v2") ;;
            *) echo "bash: ${FUNCNAME[0]}: \`$var': unknown variable"
               return 1 ;;
        esac
        upvars+=("$var")
    done
    (( ${#upvars[@]} )) && local "${upvars[@]}" && upvars "${upargs[@]}"
}

bash >= 2.05b

# Params $*  (optional) names of variables to return values to.
#            Supported variable names are:
#            - A1:  Return array 1
#            - A2:  Return array 2
#            - V1:  Return value 1
#            - V2:  Return value 2
return_optional_vars() {
    local a1 a2 upargs upvars v1=foo v2 var
    a1=(bar "cee  dee") a2=() upargs=() upvars=()
    for var; do
        case $var in
            A1) upargs=("${upargs[@]}" -a${#a1[@]} $var "${a1[@]}") ;;
            A2) upargs=("${upargs[@]}" -a${#a2[@]} $var "${a2[@]}") ;;
            V1) upargs=("${upargs[@]}" -v $var "$v1") ;;
            V2) upargs=("${upargs[@]}" -v $var "$v2") ;;
            *) echo "bash: ${FUNCNAME[0]}: \`$var': unknown variable"
               return 1 ;;
        esac
        upvars=("${upvars[@]}" "$var")
    done
    (( ${#upvars[@]} )) && local "${upvars[@]}" && upvars "${upargs[@]}"
}

Test suite

The test suite uses the bash-completion test suite, which is written on top of the DejaGnu testing framework. DejaGnu is written in Expect, which in turn uses Tcl -- Tool command language.

Install

Git

git clone git@github.com:fvue/BashByRef.git  # BashByRef
cd BashByRef && git submodule update --init  # bash-completion

Dependencies

Debian/Ubuntu

On Debian/Ubuntu you can use `apt-get`:

sudo apt-get install dejagnu tcllib

This should also install the necessary `expect` and `tcl` packages.

Fedora/RHEL/CentOS

On Fedora and RHEL/CentOS (with EPEL) you can use `yum`:

sudo yum install dejagnu tcllib

This should also install the necessary `expect` and `tcl` packages.

Running the tests

The tests are run by calling runUnit:

cd test
./runUnit

Example output:

Test Run By me on Sun May 30 08:51:40 2010
Native configuration is i686-pc-linux-gnu

        === unit tests ===

Schedule of variations:
    unix

Running target unix
Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using ./config/default.exp as tool-and-target-specific interface file.
Running ./unit/upvar.exp ...
Running ./unit/upvars.exp ...

        === unit Summary ===

# of expected passes		22
# of expected failures		1
/tmp/BashByRef/test, bash-4.0.33(7)-release

See also

Passing variables by reference conflicts with local
Me questioning the problem on the bug-bash mailing list

Appendixes

Appendix A: upvars.sh

# Bash: Passing variables by reference
# Copyright (C) 2010 Freddy Vulto
# Version: upvars-0.9.dev
# See: http://fvue.nl/wiki/Bash:_Passing_variables_by_reference
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


# Assign variable one scope above the caller
# Usage: local "$1" && upvar $1 "value(s)"
# Param: $1  Variable name to assign value to
# Param: $*  Value(s) to assign.  If multiple values, an array is
#            assigned, otherwise a single value is assigned.
# NOTE: For assigning multiple variables, use 'upvars'.  Do NOT
#       use multiple 'upvar' calls, since one 'upvar' call might
#       reassign a variable to be used by another 'upvar' call.
# Example: 
#
#    f() { local b; g b; echo $b; }
#    g() { local "$1" && upvar $1 bar; }
#    f  # Ok: b=bar
#
upvar() {
    if unset -v "$1"; then           # Unset & validate varname
        if (( $# == 2 )); then
            eval $1=\"\$2\"          # Return single value
        else
            eval $1=\(\"\${@:2}\"\)  # Return array
        fi
    fi
}


# Assign variables one scope above the caller
# Usage: local varname [varname ...] && 
#        upvars [-v varname value] | [-aN varname [value ...]] ...
# Available OPTIONS:
#     -aN  Assign next N values to varname as array
#     -v   Assign single value to varname
# Return: 1 if error occurs
# Example:
#
#    f() { local a b; g a b; declare -p a b; }
#    g() {
#        local c=( foo bar )
#        local "$1" "$2" && upvars -v $1 A -a${#c[@]} $2 "${c[@]}"
#    }
#    f  # Ok: a=A, b=(foo bar)
#
upvars() {
    if ! (( $# )); then
        echo "${FUNCNAME[0]}: usage: ${FUNCNAME[0]} [-v varname"\
            "value] | [-aN varname [value ...]] ..." 1>&2
        return 2
    fi
    while (( $# )); do
        case $1 in
            -a*)
                # Error checking
                [[ ${1#-a} ]] || { echo "bash: ${FUNCNAME[0]}: \`$1': missing"\
                    "number specifier" 1>&2; return 1; }
                printf %d "${1#-a}" &> /dev/null || { echo "bash:"\
                    "${FUNCNAME[0]}: \`$1': invalid number specifier" 1>&2
                    return 1; }
                # Assign array of -aN elements
                [[ "$2" ]] && unset -v "$2" && eval $2=\(\"\${@:3:${1#-a}}\"\) && 
                shift $((${1#-a} + 2)) || { echo "bash: ${FUNCNAME[0]}:"\
                    "\`$1${2+ }$2': missing argument(s)" 1>&2; return 1; }
                ;;
            -v)
                # Assign single value
                [[ "$2" ]] && unset -v "$2" && eval $2=\"\$3\" &&
                shift 3 || { echo "bash: ${FUNCNAME[0]}: $1: missing"\
                "argument(s)" 1>&2; return 1; }
                ;;
            --help) echo "\
Usage: local varname [varname ...] &&
   ${FUNCNAME[0]} [-v varname value] | [-aN varname [value ...]] ...
Available OPTIONS:
-aN VARNAME [value ...]   assign next N values to varname as array
-v VARNAME value          assign single value to varname
--help                    display this help and exit
--version                 output version information and exit"
                return 0 ;;
            --version) echo "\
${FUNCNAME[0]}-0.9.dev
Copyright (C) 2010 Freddy Vulto
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law."
                return 0 ;;
            *)
                echo "bash: ${FUNCNAME[0]}: $1: invalid option" 1>&2
                return 1 ;;
        esac
    done
}

Appendix B: time_eval.sh

#--- time_eval.sh -------------------------------------------
# Compare times doing 1,000 single value assignments:
# - subshell
# - eval

echo bash-$BASH_VERSION
echo
echo subshell:

f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        a=$(g)
    done
}
g() {
    local b=foo
    echo $b
}
time f

echo
echo eval:

f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        g a
    done
}
g() {
    local b=foo
    eval $1=\$b
}
time f

Example run:

$ . time_eval.sh 
bash-3.2.39(1)-release

subshell:

real    0m5.089s
user    0m1.056s
sys     0m4.032s

eval:

real    0m0.374s
user    0m0.372s
sys     0m0.000s

Appendix C: time_upvar.sh

#--- time_upvar.sh ------------------------------------------
# Compare times doing 1,000 single value assignments:
# - subshell
# - eval
# - upvar
# - upvars

# Assign variable one scope above the caller
# Usage: local "$1" && upvar $1 "value(s)"
# Param: $1  Variable name to assign value to
# Param: $*  Value(s) to assign.  If multiple values, an array is
#            assigned, otherwise a single value is assigned.
# NOTE: For assigning multiple variables, use 'upvars'.  Do NOT
#       use multiple 'upvar' calls, since one 'upvar' call might
#       reassign a variable to be used by another 'upvar' call.
# See: http://fvue.nl/wiki/Bash:_Passing_variables_by_reference
upvar() {
    if unset -v "$1"; then           # Unset & validate varname
        if (( $# == 2 )); then
            eval $1=\"\$2\"          # Return single value
        else
            eval $1=\(\"\${@:2}\"\)  # Return array
        fi
    fi
}

# Assign variables one scope above the caller
# Usage: local varname [varname ...] && 
#        upvars [-v varname value] | [-aN varname [value ...]] ...
# Available OPTIONS:
#     -aN  Assign next N values to varname as array
#     -v   Assign single value to varname
# Return: 1 if error occurs
# See: http://fvue.nl/wiki/Bash:_Passing_variables_by_reference
upvars() {
    if ! (( $# )); then
        echo "${FUNCNAME[0]}: usage: ${FUNCNAME[0]} [-v varname"\
            "value] | [-aN varname [value ...]] ..." 1>&2
        return 2
    fi
    while (( $# )); do
        case $1 in
            -a*)
                # Error checking
                [[ ${1#-a} ]] || { echo "bash: ${FUNCNAME[0]}: \`$1': missing"\
                    "number specifier" 1>&2; return 1; }
                printf %d "${1#-a}" &> /dev/null || { echo "bash:"\
                    "${FUNCNAME[0]}: \`$1': invalid number specifier" 1>&2
                    return 1; }
                # Assign array of -aN elements
                [[ "$2" ]] && unset -v "$2" && eval $2=\(\"\${@:3:${1#-a}}\"\) && 
                shift $((${1#-a} + 2)) || { echo "bash: ${FUNCNAME[0]}:"\
                    "\`$1${2+ }$2': missing argument(s)" 1>&2; return 1; }
                ;;
            -v)
                # Assign single value
                [[ "$2" ]] && unset -v "$2" && eval $2=\"\$3\" &&
                shift 3 || { echo "bash: ${FUNCNAME[0]}: $1: missing"\
                "argument(s)" 1>&2; return 1; }
                ;;
            --help) echo "\
Usage: local varname [varname ...] &&
   ${FUNCNAME[0]} [-v varname value] | [-aN varname [value ...]] ...
Available OPTIONS:
-aN VARNAME [value ...]   assign next N values to varname as array
-v VARNAME value          assign single value to varname
--help                    display this help and exit
--version                 output version information and exit"
                return 0 ;;
            --version) echo "\
${FUNCNAME[0]}-0.9.dev
Copyright (C) 2010 Freddy Vulto
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law."
                return 0 ;;
            *)
                echo "bash: ${FUNCNAME[0]}: $1: invalid option" 1>&2
                return 1 ;;
        esac
    done
}

echo bash-$BASH_VERSION
echo
echo subshell:

f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        a=$(g)
    done
}
g() {
    local b=foo
    echo $b
}
time f

echo
echo eval:

f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        g a
    done
}
g() {
    local b=foo
    eval $1=\$b
}
time f

echo
echo upvar:

f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        g a
    done
}
g() {
    local b=foo
    local "$1" && upvar $1 $b
}
time f

echo
echo upvars:

f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        g a
    done
}
g() {
    local b=foo
    local "$1" && upvars -v $1 $b
}
time f

Example run:

$ . time_upvar.sh
bash-3.2.39(1)-release

subshell:

real	0m4.233s
user	0m0.972s
sys	0m3.260s

eval:

real	0m0.372s
user	0m0.368s
sys	0m0.004s

upvar:

real	0m0.757s
user	0m0.740s
sys	0m0.016s

upvars:

real	0m1.175s
user	0m1.148s
sys	0m0.024s
