I'm relatively inexperienced in Haskell and I wanted to improve, so for a learning project of mine I have the following requirements:
Find all .md files under a directory, skipping hidden directories (e.g. toplevel/.excluded) and hidden files (e.g. .filename.md.swp).
I searched all over SO. Here's what I have so far:
import qualified System.FilePath.Find as SFF
import qualified Filesystem.Path.CurrentOS as FP
srcFolderName = "src"
outFolderName = "output"
resFolderName = "res"
ffNotHidden :: SFF.FindClause Bool
ffNotHidden = SFF.fileName SFF./~? ".?*"
ffIsMD :: SFF.FindClause Bool
ffIsMD = SFF.extension SFF.==? ".md" SFF.&&? SFF.fileName SFF./~? ".?*"
findMarkdownSources :: FilePath -> IO [FilePath]
findMarkdownSources filePath = do
    paths <- SFF.find ffNotHidden ffIsMD filePath
    return paths
This doesn't work. Using printf-style debugging in findMarkdownSources, I can verify that filePath is correct, e.g. "/home/user/testdata" (the printed value includes the quotes, in case that tells you something). The list paths is always empty. I'm absolutely certain there are markdown files in the directory I have specified (find /path/to/dir -name "*.md" finds them).
I therefore have some specific questions.
There are a number of ways to do this in Haskell. It seems there are at least six packages (fileman, system.directory, system.filepath.find) dedicated to this. Here are some questions where something like this is answered:
Each one has about three unique ways to achieve what I want to achieve, so, we're nearly at 10 ways to do it...
If it helps, I'm reasonably comfortable with basic Haskell, but you'll need to slow down if we start getting too heavy with monads and applicative functors (I don't use Haskell enough for this to stay in my head). I find the Haskell docs on Hackage incomprehensible, though.
> so, we're nearly at 10 ways to do it...
Here's yet another way to do it, using functions from the directory, filepath and extra packages, but not too much monad wizardry:
import Control.Monad (foldM)
import System.Directory (doesDirectoryExist, listDirectory) -- from "directory"
import System.FilePath ((</>), FilePath) -- from "filepath"
import Control.Monad.Extra (partitionM) -- from the "extra" package
traverseDir :: (FilePath -> Bool) -> (b -> FilePath -> IO b) -> b -> FilePath -> IO b
traverseDir validDir transition =
    let go state dirPath =
            do names <- listDirectory dirPath
               let paths = map (dirPath </>) names
               (dirPaths, filePaths) <- partitionM doesDirectoryExist paths
               state' <- foldM transition state filePaths -- process current dir
               foldM go state' (filter validDir dirPaths) -- process subdirs
     in go
The idea is that the user passes a FilePath -> Bool function to filter unwanted directories; also an initial state b and a transition function b -> FilePath -> IO b that processes file names, updates the b state and possibly has some side effects. Notice that the type of the state is chosen by the caller, who might put useful things there.
If we only want to print file names as they are produced, we can do something like this:
traverseDir (\_ -> True) (\() path -> print path) () "/tmp/somedir"
We are using () as a dummy state because we don't really need it here.
If we want to accumulate the files into a list, we can do it like this:
traverseDir (\_ -> True) (\fs f -> pure (f : fs)) [] "/tmp/somedir" 
And what if we want to filter some files? We would need to tweak the transition function we pass to traverseDir so that it ignores them.
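As a minimal sketch of that tweak, the following keeps only visible .md files and also supplies a directory filter that skips hidden directories. The names isHidden, visibleDir and collectMarkdown are hypothetical helpers introduced here for illustration; only the filepath package and base are assumed.

```haskell
import Data.List (isPrefixOf)
import System.FilePath (takeExtension, takeFileName)

-- | True when the last path component starts with a dot.
-- (Hypothetical helper, not part of any library.)
isHidden :: FilePath -> Bool
isHidden = isPrefixOf "." . takeFileName

-- | Directory filter for traverseDir: do not descend into hidden directories.
visibleDir :: FilePath -> Bool
visibleDir = not . isHidden

-- | Transition function for traverseDir: keep visible .md files,
-- leave the accumulated list unchanged for everything else.
collectMarkdown :: [FilePath] -> FilePath -> IO [FilePath]
collectMarkdown acc path
  | isHidden path               = pure acc
  | takeExtension path == ".md" = pure (path : acc)
  | otherwise                   = pure acc
```

With traverseDir in scope, this would be called as traverseDir visibleDir collectMarkdown [] "/tmp/somedir". Because each match is consed onto the front, the resulting list is in reverse order of discovery.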
I tested your code on my machine, and it seems to work fine. Here is some example data:
$ find test/data
test/data
test/data/look-a-md-file.md
test/data/another-dir
test/data/another-dir/shown.md
test/data/.not-shown.md
test/data/also-not-shown.md.bkp
test/data/.hidden
test/data/some-dir
test/data/some-dir/shown.md
test/data/some-dir/.ahother-hidden
test/data/some-dir/.ahother-hidden/im-hidden.md
Running your function will result in:
ghci> findMarkdownSources "test"
["test/data/another-dir/shown.md","test/data/look-a-md-file.md","test/data/some-dir/shown.md"]
I've tested this with an absolute path, and it also works. Are you sure you have passed a valid path? You'll get an empty list if the path is not valid (although you also get a warning).
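One way to rule out a bad path is to check it before searching. This is only a sketch, assuming the directory package; checkedRoot is a hypothetical helper name, not something from filemanip.

```haskell
import System.Directory (doesDirectoryExist)

-- | Return the path unchanged if it names an existing directory,
-- otherwise return a short error message.
-- (Hypothetical helper for illustration.)
checkedRoot :: FilePath -> IO (Either String FilePath)
checkedRoot path = do
  exists <- doesDirectoryExist path
  pure $
    if exists
      then Right path
      else Left ("not a directory: " ++ path)
```

Calling checkedRoot on the search root first and only running the search on a Right result makes an empty result easier to interpret.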
Note that your code could be simplified as follows:
module Traversals.FileManip where
import           System.FilePath.Find (extension, fileName, find, (&&?),
                                       (/~?), (==?))
findMdSources :: FilePath -> IO [FilePath]
findMdSources fp = find isVisible (isMdFile &&? isVisible) fp
    where
      isMdFile = extension ==? ".md"
      isVisible = fileName /~? ".?*"
And you can even remove the fp parameter, but I'm leaving it here for the sake of clarity.
I prefer to import explicitly so that I know where each function comes from (since I don't know of any Haskell IDE with advanced symbol navigation).
However, note that this solution uses unsafeInterleaveIO, which is not recommended.
So regarding your questions 2 and 3, I would recommend a streaming solution, like pipes or conduits. Sticking to this kind of solution will reduce your options (just like sticking to pure functional programming languages reduced my options for programming languages ;)). Here you have an example of how pipes can be used to walk a directory.
Here is the code in case you want to try this out.
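For readers without the linked code at hand, here is a rough sketch of what a pipes-based walk could look like. It is not the code referred to above: walkVisible, onlyMarkdown and markdownFiles are hypothetical names, and the sketch assumes the pipes, directory and filepath packages.

```haskell
import Control.Monad (forM_)
import Data.List (isPrefixOf)
import Pipes (Pipe, Producer, lift, yield, (>->))
import qualified Pipes.Prelude as P
import System.Directory (doesDirectoryExist, listDirectory)
import System.FilePath (takeExtension, (</>))

-- | Stream every file path below a directory, skipping hidden entries
-- (names starting with a dot). Paths are yielded as they are discovered,
-- so nothing is accumulated in memory.
walkVisible :: FilePath -> Producer FilePath IO ()
walkVisible dir = do
  names <- lift (listDirectory dir)
  forM_ (filter (not . isPrefixOf ".") names) $ \name -> do
    let path = dir </> name
    isDir <- lift (doesDirectoryExist path)
    if isDir
      then walkVisible path -- recurse into visible subdirectories
      else yield path       -- emit visible files downstream

-- | Pass through only paths with the .md extension.
onlyMarkdown :: Monad m => Pipe FilePath FilePath m r
onlyMarkdown = P.filter ((== ".md") . takeExtension)

-- | Visible Markdown files below the given root.
markdownFiles :: FilePath -> Producer FilePath IO ()
markdownFiles root = walkVisible root >-> onlyMarkdown
```

To print the paths as they are found, one could run runEffect (markdownFiles "test/data" >-> P.print), with runEffect imported from Pipes; P.toListM (markdownFiles "test/data") would collect them into a list instead.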