I'm trying to get my head around a core concept of functional langauges:
"A central concept in functional languages is that the result of a function is determined by its input, and only by its input. There are no side-effects!"
http://www.haskell.org/haskellwiki/Why_Haskell_matters#Functions_and_side-effects_in_functional_languages
My question is, if a function makes changes only within its local environment, and returns the result, how can it interact with a database or a file system? By definition, wouldn't that be accessing what is in effect a global variable or global state?
What is the most common pattern used to get around or address this?
The functional language Haskell eliminates side effects such as I/O and other stateful computations by replacing them with monadic actions. Functional languages such as Standard ML, Scheme and Scala do not restrict side effects, but it is customary for programmers to avoid them.
To deal with "side effects" in purely functional programming, you (programmers) write pure functions from the input to the output, and the system causes the side effects by applying those pure functions to the "real world".
In functional programs, since functions are pure (i.e. no side effects), they must return a value, otherwise, they are pretty much useless. In addition, determinism is a key aspect of functional programming. We cannot allow a function to trigger a side effect just by calling it.
Side Effects & Pure Functions A side effect is when a function relies on, or modifies, something outside its parameters to do something. For example, a function which reads or writes from a variable outside its own arguments, a database, a file, or the console can be described as having side effects.
Just because a functional language is functional (Maybe even completely pure like Haskell!), it doesn't mean that programs written in that language must be pure when ran.
Haskell's approach, for example, when dealing with side-effects, can be explained rather simply: Let the whole program itself be pure (meaning that functions always return the same values for the same arguments and don't have any side effect), but let the return value of the main function be an action that can be ran.
Trying to explain this with pseudocode, here is some program in an imperative, non-functional language:
main:
  read contents of abc.txt into mystring
  write contents of mystring to def.txt
The main procedure above is just that: a series of steps describing how to perform a series of actions.
Compare this to a purely functional language like Haskell. In functional languages, everything is an expression, including the main function. One can thus read the equivalent of the above program like this:
main = the reading of abc.txt into mystring followed by
       the writing of mystring to def.txt
So, main is an expression that, when evaluated, will return an action describing what to do in order to execute the program. The actual execution of this action happens outside of the programmers world. And this is really how it works; the following is an actual Haskell program that can be compiled and ran:
main = readFile "abc.txt" >>= \ mystring ->
       writeFile "def.txt" mystring
a >>= b can be said to mean "the action a followed by the result of a given to action b" in this situation, and the result of the operator is the combined actions a and b. The above program is of course not idiomatic Haskell; one can rewrite it as follows (removing the superfluous variable):
main = readFile "abc.txt" >>=
       writeFile "def.txt"
...or, using syntactic sugar and the do-notation:
main = do
  mystring <- readFile "abc.txt"
  writeFile "def.txt" mystring
All of the above programs are not only equivalent, but they are identical as far as the compiler is concerned.
This is how files, database systems, and web servers can be written as purely functional programs: by threading action values through the program so that they are combined, and finally end up in the main function. This gives the programmer enormous control over the program, and is why purely functional programming languages are so appealing in some situations.
The most common pattern for dealing with side-effects and impurity in functional languages is:
Examples:
set!
var
Haskell cheats a little bit -- its solution is that for functions that access the file system, or the database, the state of the entire universe at that instant, including the state of the filesystem/db, will be passed in to the function.(1) Thus, if you can replicate the state of the entire universe at that instant, then you can get the same results twice from such a function. Of course, you can't replicate the state of the entire universe at that instant, and so the functions return different values ...
But Haskell's solution, IMHO, is not the most common.
(1) Not sure of the specifics here. Thanks to CAMcCann for pointing out that this metaphor is overused and maybe not all that accurate.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With