Bash: Passing variables by reference
Problem
How can I pass variables by reference to a bash function?
I see this technique of 'passing by reference' proliferating:
f() { local b; g b; echo $b; }
g() { eval $1=bar; }  # WRONG, although it
f                     #+ looks ok: b=bar
But this deceptively simple solution has some serious drawbacks.
For one, the eval introduces a security threat:
g() { eval $1=bar; }  # WRONG
g 'ls /; true'        # Oops, executes command
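The injection can be observed harmlessly by passing an echo instead of `ls /'. The sketch below also shows one possible mitigation, which is an assumption of this example and not part of the technique described on this page: refusing anything that is not a plain variable name before it reaches eval. The names g_unsafe, g_checked and is_valid_name are hypothetical.

```bash
# Unsafe: whatever is passed in $1 becomes part of the evaluated command.
g_unsafe() { eval $1=bar; }

# Hypothetical guard: accept only names made of letters, digits and
# underscores that do not start with a digit.
is_valid_name() {
    case $1 in
        ''|[0-9]*|*[!a-zA-Z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

g_checked() {
    is_valid_name "$1" || {
        echo "g_checked: \`$1': invalid variable name" 1>&2
        return 1
    }
    eval $1=bar
}

g_unsafe 'echo injected; true'                      # Runs the embedded echo
g_checked 'echo injected; true' || echo "rejected"  # Refused before eval
g_checked result && echo "result=$result"
```

Such a check only addresses the injection. The conflict with local variables remains.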
Second, the eval doesn't work if `g()' itself has a local variable with the same name as the return variable:
g() { local b; eval $1=bar; }  # WRONG
g b                            # Conflicts with `local b'
echo $b                        # b is unexpectedly empty
The conflict remains even if the local `b' is unset:
g() { local b; unset b; eval $1=bar; }  # WRONG
g b                                     # Still conflicts with `local b'
echo $b                                 # b is unexpectedly empty
Why bother?
Why bother passing variables by reference, if one can return values in bash by using a subshell:
a=$(func)
Well, subshells do an expensive fork operation, costing time, resources, and energy. Consider this time comparison, using either a subshell or eval when doing 1,000 single value assignments:
                                        time
          -------------------------------------------------------------------
          real [%]                                real [s]  user [s]  sys [s]
          -------------------------------------   --------  --------  -------
          0                                 100
          -------------------------------------
subshell  #####################################      4.046     0.968    3.076
eval      ###                                        0.367     0.360    0.008
          -------------------------------------   --------  --------  -------
Table 1: Time consumed doing 1,000 assignments, using either a subshell or eval. See #Appendix B: time_eval.sh for the code used.
As table 1 shows, eval takes about one eleventh of the real time of a subshell (0.367 s versus 4.046 s), so trying to find a safe variant of the eval method is worth the effort.
Also, the subshell method supports returning one value only, or at least makes returning multiple values very cumbersome. Passing by reference makes returning multiple values much easier.
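To illustrate why returning several values through a subshell is awkward, the sketch below packs two values into one output stream and unpacks them again with read. The function name get_pair and the choice of a newline as separator are assumptions of this example. A value that itself contains a newline would break it.

```bash
# Hypothetical function "returning" two values on stdout, one per line.
get_pair() {
    echo "first value"
    echo "second value"
}

# The caller has to split the captured output again, line by line.
{ IFS= read -r a; IFS= read -r b; } <<EOF
$(get_pair)
EOF

echo "a=$a"
echo "b=$b"
```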
Solutions
Both solutions, upvar and upvars, have been tested successfully on bash versions 2.05b, 3.0.0, 3.2.39, 4.0.33 and 4.1.7.
See #Test suite for the code used.
Solution 1: Upvar: Assign single variable by reference
Use upvar in a function, returning a single value, like this:
local "$1" && upvar $1 "value(s)"
Example:
f() { local b; g b; echo $b; }         # (1)
g() { local "$1" && upvar $1 bar; }    # (2)
f  # Ok: b=bar
Explicitly make the variable local in the function wishing to return a value (2). This function then calls upvar (2), which unsets that local variable and assigns the value to the variable of the same name one scope further down the call-stack, effectively making the value appear in the caller (1).
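A self-contained version of this example is sketched below, with the upvar function copied from Appendix A so that it runs on its own. As an addition for illustration, g here also uses a local variable named `b' for its own purposes (the value "scratch" is made up), which is exactly the situation where the plain eval approach fails.

```bash
# upvar as listed in Appendix A
upvar() {
    if unset -v "$1"; then           # Unset & validate varname
        if (( $# == 2 )); then
            eval $1=\"\$2\"          # Return single value
        else
            eval $1=\(\"\${@:2}\"\)  # Return array
        fi
    fi
}

# g has a local variable that shares the name of the caller's variable.
g() {
    local b=scratch
    local "$1" && upvar $1 bar
}

f() { local b; g b; echo "$b"; }

f  # Prints: bar
```

Declaring `local "$1"' first appears to be what makes the subsequent unset safe: upvar then always removes a variable belonging to g, never the one belonging to the caller.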
Returning an array by reference goes like this:
f() { local b; g b; declare -p b; }    # (1)
# @param $1:  Name of variable to return value into
g() {                                  # (2)
    # Declare array containing three elements:
    #     - foo
    #     - bar \"cee      # Including double quote (")
    #     - dus \n  enk    # Including newline (\n) followed by two spaces
    local a=(foo "bar \"cee" $'dus\n  enk')
    # Return array
    local "$1" && upvar $1 "${a[@]}"
}
f  # Ok: declare -a b=([0]="foo" [1]="bar \"cee" [2]=$'dus\n  enk')
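The `upvar' code makes use of some surprising behaviour of unset, which is capable of traversing down the call-stack and unsetting variables repeatedly: when called from another function, each unset removes the innermost variable of that name and makes the one in the calling scope visible again. For more information, see Bash: Unset.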
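The behaviour of unset can be seen in isolation with the minimal sketch below. The function names show and unset_v are hypothetical, and default shell options are assumed (in bash 5.0 and later, the `localvar_unset' shell option changes this behaviour).

```bash
v=global

# Unsets whichever variable named "v" is currently visible.
unset_v() { unset -v v; }

show() {
    local v=local
    echo "before: $v"
    unset_v             # Called from another function: removes show's v ...
    echo "after: $v"    # ... so the global v becomes visible again
}

show
```

With default options this is expected to print "before: local" followed by "after: global".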
The name "upvar" is borrowed from Tcl's upvar command.
For the upvar code, see #Appendix A: upvars.sh.
Caveat: Subsequent 'upvar' calls may conflict
Consider this example:
f() { local b a; g b a; echo $b $a; }
g() {
    local a=A b=B
    # ...
    if local "$1" "$2"; then
        upvar $1 $a    # (1a)
        upvar $2 $b    # (1b)
    fi
}
f  # Error; got "A A", expected: "A B"
The problem is that the first call to upvar (1a) assigns the value "A" to `f.b'. Unfortunately, upvar also unsets the local variable `g.b', so in the second call to upvar (1b), $b no longer expands to `g.b' but to `f.b', which is now "A". As a result, `f.a' is assigned "A" instead of "B".
The solution is to pass all variables and values in one call, using upvars, see solution 2.
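The conflict can be reproduced with the self-contained sketch below, which copies upvar from Appendix A. The second variant, g_snapshot, is a hypothetical illustration and not the upvars implementation: it expands both values before the first upvar call, which is essentially what passing everything to a single upvars call achieves.

```bash
# upvar as listed in Appendix A
upvar() {
    if unset -v "$1"; then           # Unset & validate varname
        if (( $# == 2 )); then
            eval $1=\"\$2\"          # Return single value
        else
            eval $1=\(\"\${@:2}\"\)  # Return array
        fi
    fi
}

# $1 is the name of the function under test.
f() { local b a; "$1" b a; echo "$b $a"; }

# Two upvar calls: the second one expands $b after g's local b is gone.
g_conflict() {
    local a=A b=B
    if local "$1" "$2"; then
        upvar $1 $a
        upvar $2 $b
    fi
}

# Hypothetical variant: expand both values before the first upvar call.
g_snapshot() {
    local a=A b=B
    if local "$1" "$2"; then
        set -- "$1" "$a" "$2" "$b"   # Values are now fixed in $2 and $4
        upvar $1 "$2"
        upvar $3 "$4"
    fi
}

f g_conflict  # Prints: A A
f g_snapshot  # Prints: A B
```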
Solution 2: Upvars: Assign multiple variables by reference
Use upvars like this:
local varname [varname ...] &&
    upvars [-v varname value] | [-aN varname [value ...]] ...

Available OPTIONS:
    -aN  Assign next N values to varname as array
    -v   Assign single value to varname
Example:
f() { local a b; g a b; declare -p a b; }    # (1)
g() {
    local c=( foo bar )
    local "$1" "$2" && upvars -v $1 A -a${#c[@]} $2 "${c[@]}"    # (2)
}
f  # Ok: a=A, b=(foo bar)
Explicitly declare the variables local in the function wishing to return values (2). This function can then return the variables using upvars (2), effectively making the variables appear down the call-stack to the caller (1).
For the upvars code, see #Appendix A: upvars.sh.
Download
- Single file: upvars.sh
Time comparison
                                        time
          -------------------------------------------------------------------
          real [%]                                real [s]  user [s]  sys [s]
          -------------------------------------   --------  --------  -------
          0                                 100
          -------------------------------------
subshell  #####################################      4.046     0.968    3.076
eval      ###                                        0.367     0.360    0.008
upvar     #######                                    0.759     0.748    0.008
upvars    ###########                                1.175     1.148    0.024
          -------------------------------------   --------  --------  -------
Table 2: Time consumed doing 1,000 single value assignments. See #Appendix C: time_upvar.sh for the code used.
Examples
Function returning array by reference
# Param $1  Name of variable to return array to
return_array() {
    local r=(e1 e2 "e3 e4" $'e5\ne6')
    local "$1" && upvars -a${#r[@]} $1 "${r[@]}"
}
Function returning optional variables by reference
bash >= 3.1.0
# Params $*  (optional) names of variables to return values to.
#            Supported variable names are:
#            - A1:  Return array 1
#            - A2:  Return array 2
#            - V1:  Return value 1
#            - V2:  Return value 2
return_optional_vars() {
    local a1=(bar "cee dee") a2=() upargs=() upvars=() v1=foo v2 var
    for var; do
        case $var in
            A1) upargs+=(-a${#a1[@]} $var "${a1[@]}") ;;
            A2) upargs+=(-a${#a2[@]} $var "${a2[@]}") ;;
            V1) upargs+=(-v $var "$v1") ;;
            V2) upargs+=(-v $var "$v2") ;;
            *) echo "bash: ${FUNCNAME[0]}: \`$var': unknown variable"
               return 1 ;;
        esac
        upvars+=("$var")
    done
    (( ${#upvars[@]} )) && local "${upvars[@]}" && upvars "${upargs[@]}"
}
bash >= 2.05b
# Params $*  (optional) names of variables to return values to.
#            Supported variable names are:
#            - A1:  Return array 1
#            - A2:  Return array 2
#            - V1:  Return value 1
#            - V2:  Return value 2
return_optional_vars() {
    local a1 a2 upargs upvars v1=foo v2 var
    a1=(bar "cee dee")
    a2=()
    upargs=()
    upvars=()
    for var; do
        case $var in
            A1) upargs=("${upargs[@]}" -a${#a1[@]} $var "${a1[@]}") ;;
            A2) upargs=("${upargs[@]}" -a${#a2[@]} $var "${a2[@]}") ;;
            V1) upargs=("${upargs[@]}" -v $var "$v1") ;;
            V2) upargs=("${upargs[@]}" -v $var "$v2") ;;
            *) echo "bash: ${FUNCNAME[0]}: \`$var': unknown variable"
               return 1 ;;
        esac
        upvars=("${upvars[@]}" "$var")
    done
    (( ${#upvars[@]} )) && local "${upvars[@]}" && upvars "${upargs[@]}"
}
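The most visible difference between the two variants is how elements are appended to the argument arrays: the `+=' operator needs bash 3.1 or later, while re-assigning the expanded array also works on older versions. A minimal sketch of both forms follows. The array name args and its contents are made up for illustration.

```bash
args=()

# bash >= 3.1: append with +=
args+=(-v name "some value")

# bash >= 2.05b: re-assign the expanded array plus the new elements
args=("${args[@]}" -a2 arr x "y z")

echo "${#args[@]}"  # Prints: 7
```

Quoting "${args[@]}" keeps elements that contain spaces intact in both forms.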
Test suite
The test suite uses the bash-completion test suite, which is written on top of the DejaGnu testing framework. DejaGnu is written in Expect, which in turn uses Tcl -- Tool command language.
Install
Git
git clone git@github.com:fvue/BashByRef.git  # BashByRef
cd BashByRef && git submodule update --init  # bash-completion
Dependencies
Debian/Ubuntu
On Debian/Ubuntu you can use `apt-get`:
sudo apt-get install dejagnu tcllib
This should also install the necessary `expect` and `tcl` packages.
Fedora/RHEL/CentOS
On Fedora and RHEL/CentOS (with EPEL) you can use `yum`:
sudo yum install dejagnu tcllib
This should also install the necessary `expect` and `tcl` packages.
Running the tests
The tests are run by calling runUnit:
cd test
./runUnit
Example output:
Test Run By me on Sun May 30 08:51:40 2010
Native configuration is i686-pc-linux-gnu

        === unit tests ===

Schedule of variations:
    unix

Running target unix
Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using ./config/default.exp as tool-and-target-specific interface file.
Running ./unit/upvar.exp ...
Running ./unit/upvars.exp ...

        === unit Summary ===

# of expected passes        22
# of expected failures      1
/tmp/BashByRef/test, bash-4.0.33(7)-release
See also
- Passing variables by reference conflicts with local
- Me questioning the problem on the bug-bash mailing list
Appendixes
Appendix A: upvars.sh
# Bash: Passing variables by reference
# Copyright (C) 2010 Freddy Vulto
# Version: upvars-0.9.dev
# See: http://fvue.nl/wiki/Bash:_Passing_variables_by_reference
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


# Assign variable one scope above the caller
# Usage: local "$1" && upvar $1 "value(s)"
# Param: $1  Variable name to assign value to
# Param: $*  Value(s) to assign.  If multiple values, an array is
#            assigned, otherwise a single value is assigned.
# NOTE: For assigning multiple variables, use 'upvars'.  Do NOT
#       use multiple 'upvar' calls, since one 'upvar' call might
#       reassign a variable to be used by another 'upvar' call.
# Example:
#
#    f() { local b; g b; echo $b; }
#    g() { local "$1" && upvar $1 bar; }
#    f  # Ok: b=bar
#
upvar() {
    if unset -v "$1"; then           # Unset & validate varname
        if (( $# == 2 )); then
            eval $1=\"\$2\"          # Return single value
        else
            eval $1=\(\"\${@:2}\"\)  # Return array
        fi
    fi
}


# Assign variables one scope above the caller
# Usage: local varname [varname ...] &&
#        upvars [-v varname value] | [-aN varname [value ...]] ...
# Available OPTIONS:
#     -aN  Assign next N values to varname as array
#     -v   Assign single value to varname
# Return: 1 if error occurs
# Example:
#
#    f() { local a b; g a b; declare -p a b; }
#    g() {
#        local c=( foo bar )
#        local "$1" "$2" && upvars -v $1 A -a${#c[@]} $2 "${c[@]}"
#    }
#    f  # Ok: a=A, b=(foo bar)
#
upvars() {
    if ! (( $# )); then
        echo "${FUNCNAME[0]}: usage: ${FUNCNAME[0]} [-v varname"\
            "value] | [-aN varname [value ...]] ..." 1>&2
        return 2
    fi
    while (( $# )); do
        case $1 in
            -a*)
                # Error checking
                [[ ${1#-a} ]] || { echo "bash: ${FUNCNAME[0]}: \`$1': missing"\
                    "number specifier" 1>&2; return 1; }
                printf %d "${1#-a}" &> /dev/null || { echo "bash:"\
                    "${FUNCNAME[0]}: \`$1': invalid number specifier" 1>&2
                    return 1; }
                # Assign array of -aN elements
                [[ "$2" ]] && unset -v "$2" && eval $2=\(\"\${@:3:${1#-a}}\"\) &&
                shift $((${1#-a} + 2)) || { echo "bash: ${FUNCNAME[0]}:"\
                    "\`$1${2+ }$2': missing argument(s)" 1>&2; return 1; }
                ;;
            -v)
                # Assign single value
                [[ "$2" ]] && unset -v "$2" && eval $2=\"\$3\" &&
                shift 3 || { echo "bash: ${FUNCNAME[0]}: $1: missing"\
                    "argument(s)" 1>&2; return 1; }
                ;;
            --help) echo "\
Usage: local varname [varname ...] &&
   ${FUNCNAME[0]} [-v varname value] | [-aN varname [value ...]] ...
Available OPTIONS:
-aN VARNAME [value ...]   assign next N values to varname as array
-v VARNAME value          assign single value to varname
--help                    display this help and exit
--version                 output version information and exit"
                return 0 ;;
            --version) echo "\
${FUNCNAME[0]}-0.9.dev
Copyright (C) 2010 Freddy Vulto
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law."
                return 0 ;;
            *)
                echo "bash: ${FUNCNAME[0]}: $1: invalid option" 1>&2
                return 1 ;;
        esac
    done
}
Appendix B: time_eval.sh
#--- time_eval.sh -------------------------------------------
# Compare times doing 1,000 single value assignments:
#    - subshell
#    - eval

echo bash-$BASH_VERSION
echo
echo subshell:

f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        a=$(g)
    done
}
g() {
    local b=foo
    echo $b
}
time f

echo
echo eval:

f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        g a
    done
}
g() {
    local b=foo
    eval $1=\$b
}
time f
Example run:
$ . time_eval.sh
bash-3.2.39(1)-release

subshell:

real    0m5.089s
user    0m1.056s
sys     0m4.032s

eval:

real    0m0.374s
user    0m0.372s
sys     0m0.000s
Appendix C: time_upvar.sh
#--- time_upvar.sh ------------------------------------------
# Compare times doing 1,000 single value assignments:
#    - subshell
#    - eval
#    - upvar
#    - upvars


# Assign variable one scope above the caller
# Usage: local "$1" && upvar $1 "value(s)"
# Param: $1  Variable name to assign value to
# Param: $*  Value(s) to assign.  If multiple values, an array is
#            assigned, otherwise a single value is assigned.
# NOTE: For assigning multiple variables, use 'upvars'.  Do NOT
#       use multiple 'upvar' calls, since one 'upvar' call might
#       reassign a variable to be used by another 'upvar' call.
# See: http://fvue.nl/wiki/Bash:_Passing_variables_by_reference
upvar() {
    if unset -v "$1"; then           # Unset & validate varname
        if (( $# == 2 )); then
            eval $1=\"\$2\"          # Return single value
        else
            eval $1=\(\"\${@:2}\"\)  # Return array
        fi
    fi
}


# Assign variables one scope above the caller
# Usage: local varname [varname ...] &&
#        upvars [-v varname value] | [-aN varname [value ...]] ...
# Available OPTIONS:
#     -aN  Assign next N values to varname as array
#     -v   Assign single value to varname
# Return: 1 if error occurs
# See: http://fvue.nl/wiki/Bash:_Passing_variables_by_reference
upvars() {
    if ! (( $# )); then
        echo "${FUNCNAME[0]}: usage: ${FUNCNAME[0]} [-v varname"\
            "value] | [-aN varname [value ...]] ..." 1>&2
        return 2
    fi
    while (( $# )); do
        case $1 in
            -a*)
                # Error checking
                [[ ${1#-a} ]] || { echo "bash: ${FUNCNAME[0]}: \`$1': missing"\
                    "number specifier" 1>&2; return 1; }
                printf %d "${1#-a}" &> /dev/null || { echo "bash:"\
                    "${FUNCNAME[0]}: \`$1': invalid number specifier" 1>&2
                    return 1; }
                # Assign array of -aN elements
                [[ "$2" ]] && unset -v "$2" && eval $2=\(\"\${@:3:${1#-a}}\"\) &&
                shift $((${1#-a} + 2)) || { echo "bash: ${FUNCNAME[0]}:"\
                    "\`$1${2+ }$2': missing argument(s)" 1>&2; return 1; }
                ;;
            -v)
                # Assign single value
                [[ "$2" ]] && unset -v "$2" && eval $2=\"\$3\" &&
                shift 3 || { echo "bash: ${FUNCNAME[0]}: $1: missing"\
                    "argument(s)" 1>&2; return 1; }
                ;;
            --help) echo "\
Usage: local varname [varname ...] &&
   ${FUNCNAME[0]} [-v varname value] | [-aN varname [value ...]] ...
Available OPTIONS:
-aN VARNAME [value ...]   assign next N values to varname as array
-v VARNAME value          assign single value to varname
--help                    display this help and exit
--version                 output version information and exit"
                return 0 ;;
            --version) echo "\
${FUNCNAME[0]}-0.9.dev
Copyright (C) 2010 Freddy Vulto
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law."
                return 0 ;;
            *)
                echo "bash: ${FUNCNAME[0]}: $1: invalid option" 1>&2
                return 1 ;;
        esac
    done
}


echo bash-$BASH_VERSION
echo
echo subshell:

f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        a=$(g)
    done
}
g() {
    local b=foo
    echo $b
}
time f

echo
echo eval:

f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        g a
    done
}
g() {
    local b=foo
    eval $1=\$b
}
time f

echo
echo upvar:

f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        g a
    done
}
g() {
    local b=foo
    local "$1" && upvar $1 $b
}
time f

echo
echo upvars:

f() {
    local a i
    for (( i=1000; i > 0; i-- )); do
        g a
    done
}
g() {
    local b=foo
    local "$1" && upvars -v $1 $b
}
time f
Example run:
$ . time_upvar.sh
bash-3.2.39(1)-release

subshell:

real    0m4.233s
user    0m0.972s
sys     0m3.260s

eval:

real    0m0.372s
user    0m0.368s
sys     0m0.004s

upvar:

real    0m0.757s
user    0m0.740s
sys     0m0.016s

upvars:

real    0m1.175s
user    0m1.148s
sys     0m0.024s