constants.py
import os
BASE_PATH = os.path.abspath(os.path.dirname(__file__))
INPUT_PATH = os.path.join(BASE_PATH, 'input')
FILE_INPUT1_PATH = os.path.join(INPUT_PATH, 'input1.csv')
FILE_INPUT2_PATH = os.path.join(INPUT_PATH, 'input2.csv')
PROCESSED_PATH = os.path.join(BASE_PATH, 'processed')
FILE_PROC1_PATH = os.path.join(PROCESSED_PATH, 'processed1.pkl')
FILE_PROC2_PATH = os.path.join(PROCESSED_PATH, 'processed2.pkl')
Directory structure:
root
|__ constants.py
|__ input
|   |__ input1.csv
|   |__ input2.csv
|__ processed
    |__ processed1.pkl
    |__ processed2.pkl
data_handling.py
from constants import FILE_INPUT1_PATH, FILE_INPUT2_PATH, FILE_PROC1_PATH, FILE_PROC2_PATH
def foo(*args):
    file = FILE_INPUT1_PATH
    # Here it does stuff
    # Finally I write data into FILE_PROC1_PATH


def bar(*args):
    file = FILE_INPUT2_PATH
    # Here it does stuff
    # Finally I write data into FILE_PROC2_PATH
I'm currently trying to use pytest to test foo() and bar(), but I don't know how to proceed because the input files and processed files are too big, and the test process mustn't overwrite the processed files.
One approach is to change the definition of bar() to bar(path) and then call bar(FILE_INPUT2_PATH), but that doesn't make sense in the code because bar always needs to read FILE_INPUT2_PATH and it is called in many places.
A unit test for foo() and bar() would check whether the processed files were created, because that depends on *args.
So the question is: how can I solve this? Does a pattern or good practice exist for this case? What should I change in my code?
"input files and processed files are too big and test process mustn't override processed files"
Yes, and tests are perfectly suited to that kind of job. The generic approach is to create test data (which can be a subset of the original data with edge cases included) and place it somewhere near your tests, for example:
├───tests
│ │ test_bar.py
│ │ test_foo.py
│ │
│ └───data
│ input_1.dat
│ input_2.dat
│ expected_1.pkl
│ expected_2.pkl
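One way to produce such a subset is sketched below. It assumes the input is a line-oriented CSV with a single header line and no quoted fields spanning several lines. The helper name make_sample and its parameters are hypothetical, not part of the original code. Edge cases that do not occur in the first rows would still have to be added by hand.

```python
import itertools


def make_sample(src, dst, n_rows=100):
    """Copy the header line plus the first n_rows data lines from src to dst."""
    # newline="" keeps the original line endings untouched.
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        # islice stops early, so the large source file is never fully read.
        fout.writelines(itertools.islice(fin, n_rows + 1))
```

A call such as make_sample(FILE_INPUT1_PATH, 'tests/data/input_1.dat', n_rows=50) would then be run once, and the resulting file committed next to the tests.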
Then, if the functions under test take their input from a constant rather than a parameter, use unittest.mock.patch to change the constant during the test run (see this excellent answer for a quick reference). For storing the output, either a regular or a temporary file can be used.
import tempfile
from pathlib import Path
from unittest.mock import patch

import foo_module

TEST_DATA_DIR = Path(__file__).resolve().parent / 'data'


# Patch the names where foo_module looks them up, not where they are defined.
@patch('foo_module.FILE_INPUT1_PATH', TEST_DATA_DIR / 'input_1.dat')
@patch('foo_module.FILE_PROC1_PATH', tempfile.mktemp())
def test_foo():
    """Process input and check result."""
    foo_module.foo()
    # Compare the produced file with the expected one byte for byte.
    result = Path(foo_module.FILE_PROC1_PATH).read_bytes()
    expected = (TEST_DATA_DIR / 'expected_1.pkl').read_bytes()
    assert result == expected
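The patch target is the module that uses the constant, not constants itself, because from constants import ... copies the name into the importing module. The following self-contained sketch illustrates this with in-memory stand-ins for the two modules. The stand-ins and the variable names are hypothetical and exist only for the demonstration.

```python
import types
from unittest.mock import patch

# Hypothetical in-memory stand-ins for constants.py and data_handling.py.
constants = types.ModuleType("constants")
constants.FILE_INPUT1_PATH = "input/input1.csv"

data_handling = types.ModuleType("data_handling")
# Same effect as `from constants import FILE_INPUT1_PATH` in data_handling.py:
# the name is copied into data_handling's own namespace.
data_handling.FILE_INPUT1_PATH = constants.FILE_INPUT1_PATH
# Define foo() so that its globals are data_handling's namespace.
exec("def foo():\n    return FILE_INPUT1_PATH\n", data_handling.__dict__)

# Patching the defining module does not change what foo() sees.
with patch.object(constants, "FILE_INPUT1_PATH", "tests/data/input_1.dat"):
    seen_when_patching_constants = data_handling.foo()

# Patching the module that uses the name does.
with patch.object(data_handling, "FILE_INPUT1_PATH", "tests/data/input_1.dat"):
    seen_when_patching_user = data_handling.foo()
```

Here seen_when_patching_constants still holds the original path, while seen_when_patching_user holds the test path.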
NOTE: tempfile.mktemp() is deprecated because the file is not created by the mktemp() call, so another process may create a file with the same name before you use it. Feel free to suggest an alternative approach.
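One possible alternative is pytest's tmp_path fixture, which gives each test its own temporary directory as a pathlib.Path, so no file name has to be guessed in advance. The sketch below is only an illustration: data_handling is a hypothetical in-memory stand-in for the real module, and the body of foo() is invented (it pickles the input lines), since the real processing is not shown in the question.

```python
import pickle
import tempfile
import types
from pathlib import Path
from unittest.mock import patch

# Hypothetical stand-in for the real module; the real foo() does other work.
data_handling = types.SimpleNamespace(
    FILE_INPUT1_PATH="input/input1.csv",
    FILE_PROC1_PATH="processed/processed1.pkl",
)


def foo(*args):
    """Invented processing step: pickle the lines of the input file."""
    lines = Path(data_handling.FILE_INPUT1_PATH).read_text().splitlines()
    with open(data_handling.FILE_PROC1_PATH, "wb") as fh:
        pickle.dump(lines, fh)


data_handling.foo = foo


def test_foo(tmp_path):
    """Run foo() against small files that live only in tmp_path."""
    input_file = tmp_path / "input_1.csv"
    input_file.write_text("a,b\n1,2\n")
    output_file = tmp_path / "processed_1.pkl"
    # Both constants are restored when the with blocks exit.
    with patch.object(data_handling, "FILE_INPUT1_PATH", input_file):
        with patch.object(data_handling, "FILE_PROC1_PATH", output_file):
            data_handling.foo()
    assert pickle.loads(output_file.read_bytes()) == ["a,b", "1,2"]


if __name__ == "__main__":
    # Outside pytest, a TemporaryDirectory can play the role of tmp_path.
    with tempfile.TemporaryDirectory() as tmp:
        test_foo(Path(tmp))
```

Because the output path lives inside the per-test directory, the real processed files are never touched and nothing needs to be cleaned up by hand.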