In order to test how church-encoded lists perform against user-defiend lists and native lists, I've prepared 3 benchmarks:
data List a = Cons a (List a) | Nil deriving Show
lenumTil n        = go n Nil where
    go 0 result   = result
    go n result   = go (n-1) (Cons (n-1) result)
lsum Nil          = 0
lsum (Cons h t)   = h + (lsum t)
main = print (lsum (lenumTil (100000000 :: Int)))
main = print $ sum ([0..100000000-1] :: [Int])
fsum   = (\ a -> (a (+) 0))
fenumTil n cons nil = go n nil where
    go 0 result    = result
    go n result    = go (n-1) (cons (n-1) result)
main = print $ (fsum (fenumTil (100000000 :: Int)) :: Int)
The benchmark results are unexpected:
-- 4999999950000000
-- real 0m22.520s
-- user 0m59.815s
-- sys  0m20.327s
-- 4999999950000000
-- real 0m0.999s
-- user 0m1.357s
-- sys  0m0.252s
-- 4999999950000000
-- real 0m0.010s
-- user 0m0.002s
-- sys  0m0.003s
One would expect that, with the huge amount of specific optimizations targeted to native lists, they would perform the best. Yet, church lists outperform them by a 100x factor, and outperform user-defined ADTs by a 2250x factor. I've compiled all programs with GHC -O2. I've tried replacing sum by foldl', same result. I've attempted adding user-inputs to make sure the church-list version wasn't optimized to a constant. arkeet pointed out on #haskell that, by inspecting Core, the native version has an intermediate lists, but why? Forcing allocation with an additional reverse, all 3 perform roughly the same. 
GHC 7.10 has call arity analysis, which lets us define foldl in terms of foldr and thus let left folds, including sum, participate in fusion. GHC 7.8 also defines sum with foldl but it can't fuse the lists away. Thus GHC 7.10 performs optimally and identically to the Church version.
The Church version is child's play to optimize in either GHC versions. We just have to inline (+) and 0 into fenumTil, and then we have a patently tail-recursive go which can be readily unboxed and then turned into a loop by the code generator. 
The user-defined version is not tail-recursive and it works in linear space, which wrecks performance, of course.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With